Although ‘in-the-wild’ technology testing provides an important opportunity to collect evidence about the performance of new technologies in real-world deployment environments, such tests may themselves cause harm and wrongfully interfere with the rights of others. This paper critically examines real-world AI testing, focusing on live facial recognition technology (FRT) trials by European law enforcement agencies (in London, Wales, Berlin, and Nice) undertaken between 2016 and 2020, which serve as a set of comparative case studies. We argue that there is an urgent need for a clear framework of principles to govern real-world AI testing, which is currently a largely ungoverned ‘wild west’ without adequate safeguards or oversight. We propose a principled framework to ensure that these tests are undertaken in an epistemically, ethically, and legally responsible manner, so that such tests generate sound, reliable evidence while safeguarding the human rights and other vital interests of others. Although the case studies of FRT testing were undertaken prior to the passage of the EU’s AI Act, we suggest that these three kinds of responsibility should provide the foundational anchor points to inform the design and conduct of real-world testing of high-risk AI systems pursuant to Article 60 of the AI Act.
The hard-core model has as its configurations the independent sets of some graph instance $G$. The probability distribution on independent sets is controlled by a ‘fugacity’ $\lambda > 0$, with higher $\lambda$ leading to denser configurations. We investigate the mixing time of Glauber (single-site) dynamics for the hard-core model on restricted classes of bounded-degree graphs in which a particular graph $H$ is excluded as an induced subgraph. If $H$ is a subdivided claw then, for all $\lambda$, the mixing time is $O(n\log n)$, where $n$ is the order of $G$. This extends a result of Chen and Gu for claw-free graphs. When $H$ is a path, the set of possible instances is finite. For all other $H$, the mixing time is exponential in $n$ for sufficiently large $\lambda$, depending on $H$ and the maximum degree of $G$.
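For concreteness, the model assigns each independent set $I$ of $G$ the probability given by the standard Gibbs formulation (stated here for the reader's convenience):

$$\pi_\lambda(I) = \frac{\lambda^{|I|}}{Z_G(\lambda)}, \qquad Z_G(\lambda) = \sum_{J \in \mathcal{I}(G)} \lambda^{|J|},$$

where $\mathcal{I}(G)$ denotes the set of independent sets of $G$; larger $\lambda$ thus tilts the distribution toward denser configurations.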
This appendix delves into the mathematical foundations of network representation techniques, focusing on two key areas: maximum likelihood estimation (MLE) and spectral embedding theory. It begins by exploring MLE for Erdős-Rényi (ER) and stochastic block model (SBM) networks, demonstrating the unbiasedness and consistency of estimators. The limitations of MLE for more complex models are discussed, leading to the introduction of spectral methods. The appendix then presents theoretical considerations for spectral embeddings, including the adjacency spectral embedding (ASE) and its statistical properties. It explores the concepts of consistency and asymptotic normality in the context of random dot product graphs (RDPGs). Finally, we extend these insights to multiple network models, covering graph matching for correlated networks and joint spectral embeddings like the omnibus embedding and multiple adjacency spectral embedding (MASE).
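As a concrete instance of the estimators discussed, the maximum likelihood estimator of the Erdős-Rényi edge probability $p$ is the observed edge density

$$\hat{p} = \frac{1}{\binom{n}{2}} \sum_{i < j} a_{ij},$$

which is unbiased, $\mathbb{E}[\hat{p}] = p$, and has variance $p(1-p)/\binom{n}{2} \to 0$ as $n \to \infty$, giving consistency.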
This chapter presents a unified framework for analyzing complex networks through statistical models. Starting with the Inhomogeneous Erdős-Rényi model’s concept of independent edge probabilities, we progress through increasingly sophisticated representations, including the Erdős-Rényi, Stochastic Block Model, and Random Dot Product Graph (RDPG) models. We explore how each model generalizes its predecessors, with the RDPG encompassing many earlier models under certain conditions. The crucial role of positive semidefiniteness in connecting block models to RDPGs is examined, providing insight into model interrelationships. We also introduce models addressing specific network characteristics, such as heterogeneous node degrees and edge-based clustering. The chapter extends to multiple and correlated network models, demonstrating how concepts from simpler models inform more complex scenarios. A hierarchical framework is presented, unifying these models and illustrating their relative generality, thus laying the groundwork for advanced network analysis techniques.
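The positive semidefiniteness condition can be made explicit: if a $K \times K$ block matrix $B$ is positive semidefinite with rank $d$, it factors as $B = V V^{\top}$ for some $V \in \mathbb{R}^{K \times d}$ (for instance via its eigendecomposition, $V = U D^{1/2}$), and assigning each node $i$ in community $z_i$ the latent position $x_i = v_{z_i}$ reproduces the SBM edge probabilities as inner products,

$$\Pr(a_{ij} = 1) = x_i^{\top} x_j = B_{z_i z_j},$$

exhibiting the positive semidefinite SBM as a special case of the RDPG.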
This chapter explores practical applications of network representation learning techniques for analyzing individual networks. It begins by addressing the community detection problem, demonstrating how to estimate community labels using network embeddings. The chapter then discusses the challenges posed by network sparsity and introduces efficient storage methods for sparse networks. The text proceeds to examine testing for differences between groups of edges, applying hypothesis testing to stochastic block models and structured independent edge models. It also covers model selection techniques for stochastic block models, helping readers choose appropriate levels of model complexity. The chapter introduces the vertex nomination problem, which aims to identify nodes similar to a set of known ‘seed’ nodes. It presents spectral vertex nomination techniques and explores extensions to related problems. Finally, the chapter addresses out-of-sample embedding, providing efficient strategies for embedding new nodes into existing network representations. This approach is particularly valuable for large-scale, dynamic networks where frequent re-embedding would be computationally prohibitive.
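To make the embed-then-cluster pipeline concrete, here is a minimal sketch of spectral community detection, assuming the open-source graspologic and scikit-learn libraries; the simulated two-block network and all parameter choices are illustrative, not taken from the chapter.

```python
# Minimal sketch of spectral community detection, assuming graspologic.
import numpy as np
from graspologic.simulations import sbm
from graspologic.embed import AdjacencySpectralEmbed
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

# Simulate a two-block SBM: 100 nodes, stronger within-block connectivity.
A = sbm(n=[50, 50], p=[[0.5, 0.1], [0.1, 0.5]])
true_labels = np.repeat([0, 1], 50)

# Embed the adjacency matrix into 2 dimensions, then cluster the rows.
X = AdjacencySpectralEmbed(n_components=2).fit_transform(A)
pred_labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)

# Chance-adjusted agreement between estimated and true communities.
print("ARI:", adjusted_rand_score(true_labels, pred_labels))
```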
This chapter explores techniques for analyzing and comparing pairs of networks, building on previously introduced statistical models and representation learning methods. It focuses on two-sample testing for networks, introducing methods to determine whether two network observations are sampled from the same or different random networks. The chapter covers latent position and distribution testing, addressing nonidentifiability issues in network comparisons. It then explores specialized techniques for comparing stochastic block models (SBMs), leveraging their community structure and discussing methods for testing differences in block matrices, including density adjustment approaches. A significant portion is devoted to the graph matching problem, addressing the challenge of identifying node correspondences between networks. This section introduces permutation matrices and explores optimization-based methods, including gradient descent approaches, for both exact and inexact matching scenarios. Throughout, the chapter emphasizes practical implementations with code examples, bridging the gap between theoretical concepts and real-world applications in network analysis. These techniques provide a comprehensive toolkit for comparing networks, essential for understanding evolving networks, analyzing differences across domains, and integrating multisource network data.
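The following is a minimal illustration of inexact graph matching in the spirit described above, assuming SciPy's scipy.optimize.quadratic_assignment (whose 'faq' method is a Frank-Wolfe-style gradient descent on the relaxed problem); it is a sketch under those assumptions, not the chapter's code.

```python
# Sketch of inexact graph matching via the quadratic assignment problem.
import numpy as np
from scipy.optimize import quadratic_assignment

rng = np.random.default_rng(1)
n = 20
A = np.triu((rng.random((n, n)) < 0.3).astype(float), 1)
A = A + A.T                          # symmetric adjacency matrix

perm = rng.permutation(n)            # hidden node correspondence
B = A[np.ix_(perm, perm)]            # the same graph with shuffled labels

# Graph matching maximizes trace(A P B P^T) over permutation matrices P,
# equivalent to minimizing ||A - P B P^T||_F^2 for symmetric matrices.
res = quadratic_assignment(A, B, method="faq", options={"maximize": True})

# FAQ is a local relaxation-based method, so perfect recovery of the
# hidden permutation is typical on small examples but not guaranteed.
recovered = np.mean(perm[res.col_ind] == np.arange(n))
print(f"fraction of nodes correctly matched: {recovered:.2f}")
```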
This appendix provides a comprehensive overview of statistical network models, building from fundamental concepts to advanced frameworks. It begins with essential mathematical background and probability theory, then introduces the foundations of random network models. The appendix covers a range of models, including Erdős-Rényi, stochastic block models (both a priori and a posteriori), random dot product graphs, and their generalizations. Each model is presented with its parameters, generative process, probability calculations, and equivalence classes. The appendix also explores degree-corrected variants and the Inhomogeneous Erdős-Rényi model. Throughout, we emphasize the relationships between models and their increasing complexity, providing a solid theoretical foundation for understanding network structures and dynamics.
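For example, in standard notation an Erdős-Rényi network with $n$ nodes and edge probability $p$ assigns a realized adjacency matrix $a$ the probability

$$\Pr(A = a) = \prod_{i<j} p^{a_{ij}} (1 - p)^{1 - a_{ij}},$$

while the degree-corrected stochastic block model replaces the constant $p$ with $\Pr(a_{ij} = 1) = \theta_i \theta_j b_{z_i z_j}$ (subject to the usual normalization constraints), where $\theta_i$ is a node-level degree-correction factor, $z_i$ the community of node $i$, and $b$ the block matrix.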
In this paper, we provide a new property of value at risk (VaR), which is a standard risk measure that is widely used in quantitative financial risk management. We show that the subadditivity of VaR for given loss random variables holds at every confidence level if and only if those random variables are comonotonic. This result also gives a new equivalent condition for the comonotonicity of random vectors.
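In one standard convention, $\mathrm{VaR}_\alpha(X) = \inf\{x \in \mathbb{R} : \Pr(X \le x) \ge \alpha\}$ for a confidence level $\alpha \in (0,1)$, and comonotonicity of $(X_1, \dots, X_d)$ means the components can be written as non-decreasing functions of a single common random variable. Comonotonic additivity,

$$\mathrm{VaR}_\alpha(X_1 + \cdots + X_d) = \sum_{i=1}^{d} \mathrm{VaR}_\alpha(X_i),$$

is classical; the result above states that requiring subadditivity, $\mathrm{VaR}_\alpha(X_1 + \cdots + X_d) \le \sum_{i} \mathrm{VaR}_\alpha(X_i)$, at every level $\alpha$ already forces this comonotonic structure.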
This chapter presents a comprehensive workflow for applying network machine learning to functional MRI connectomes. We demonstrate data preprocessing, edge weight transformations, and spectral embedding techniques to analyze multiple brain networks simultaneously. Using multiple adjacency spectral embedding (MASE) and unsupervised clustering, we identify functionally similar brain regions across subjects. Results are visualized through abstract representations and brain-space projections, and compared with established brain parcellations. Our findings reveal that MASE-derived communities often align with known functional and spatial organization of the brain, particularly in occipital and parietal areas, while also identifying regions where functional similarity does not imply spatial proximity. We illustrate how network machine learning can uncover meaningful patterns in complex neuroimaging data, emphasizing the importance of combining algorithmic approaches with domain expertise, thereby motivating the remainder of the book.
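A compressed sketch of this workflow follows, again assuming graspologic; the simulated networks stand in for real connectomes, and pass_to_ranks stands in for the edge-weight transformation step.

```python
# Minimal sketch of the MASE-plus-clustering workflow, assuming graspologic.
from graspologic.simulations import sbm
from graspologic.utils import pass_to_ranks
from graspologic.embed import MultipleASE
from sklearn.cluster import KMeans

# Stand-in for per-subject connectomes: several noisy two-block networks.
graphs = [sbm(n=[40, 40], p=[[0.6, 0.2], [0.2, 0.6]]) for _ in range(5)]

# Edge-weight transformation (pass-to-ranks tempers heavy-tailed weights).
graphs = [pass_to_ranks(g) for g in graphs]

# Joint embedding: one shared set of latent positions across all subjects.
V = MultipleASE(n_components=2).fit_transform(graphs)

# Cluster brain regions (nodes) by their shared latent positions.
communities = KMeans(n_clusters=2, n_init=10).fit_predict(V)
print(communities)
```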
This chapter introduces the network machine learning landscape, bridging traditional machine learning with network-specific approaches. It defines networks, contrasts them with tabular data structures, and explains their ubiquity in various domains. The chapter outlines different types of network learning systems, including single vs. multiple network, attributed vs. non-attributed, and model-based vs. non-model-based approaches. It also discusses the scope of network analysis, from individual edges to entire networks. The chapter concludes by addressing key challenges in network machine learning, such as imperfect observations, partial network visibility, and sample limitations. Throughout, it emphasizes the importance of statistical learning in generalizing findings from network samples to broader populations, setting the stage for more advanced concepts in subsequent chapters.
This chapter presents a framework for learning useful representations, or embeddings, of networks. Building on the statistical models from Chapter 4, we explore techniques to transform complex network data into vector representations suitable for traditional machine learning algorithms. We begin with maximum likelihood estimation for simple network models, then motivate the need for network embeddings by contrasting network dependencies with typical machine learning independence assumptions. We progress through spectral embedding methods, introducing adjacency spectral embedding (ASE) for learning latent position representations from adjacency matrices, and Laplacian spectral embedding (LSE) as an alternative approach effective for networks with degree heterogeneities. The chapter then extends to multiple network representations, exploring parallel techniques like omnibus embedding (OMNI) and fused methods such as multiple adjacency spectral embedding (MASE). We conclude by addressing the estimation of appropriate latent dimensions for embeddings. Throughout, we emphasize practical applications with code examples and visualizations. This unified framework for network embedding enables the application of various machine learning algorithms to network analysis tasks, bridging complex network structures and traditional data analysis techniques.
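The sketch below shows, assuming the graspologic library is available, how the LSE and OMNI embeddings named above are typically invoked; networks, shapes, and parameters are illustrative.

```python
# Sketch of Laplacian spectral embedding and omnibus embedding, assuming
# graspologic; two simulated networks stand in for real data.
from graspologic.simulations import sbm
from graspologic.embed import LaplacianSpectralEmbed, OmnibusEmbed

A1 = sbm(n=[50, 50], p=[[0.5, 0.1], [0.1, 0.5]])
A2 = sbm(n=[50, 50], p=[[0.5, 0.1], [0.1, 0.5]])

# LSE embeds a normalized Laplacian; often preferable under degree
# heterogeneity, where raw adjacency embeddings conflate degree and block.
X_lse = LaplacianSpectralEmbed(n_components=2).fit_transform(A1)

# OMNI embeds several networks into one shared space so that a node's
# positions are comparable across networks; output is
# (n_networks, n_nodes, n_components).
X_omni = OmnibusEmbed(n_components=2).fit_transform([A1, A2])
print(X_lse.shape, X_omni.shape)
```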
This appendix provides a concise introduction to key machine learning techniques employed throughout the book. It focuses on two main areas: unsupervised learning and Bayesian classification. The appendix begins with an exploration of K-means clustering, a fundamental unsupervised learning algorithm, demonstrating its application to network community detection. It then discusses methods for evaluating unsupervised learning techniques, including confusion matrices and the adjusted Rand index. The silhouette score is introduced as a metric for assessing clustering quality across different numbers of clusters. The appendix concludes with an explanation of the Bayes plugin classifier, a simple yet effective tool for network classification tasks.
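As a brief illustration, the evaluation tools discussed are all available in scikit-learn; the toy data below is illustrative, standing in for embedded network nodes.

```python
# Sketch of K-means plus the clustering evaluation metrics discussed above.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import confusion_matrix, adjusted_rand_score, silhouette_score

rng = np.random.default_rng(0)
# Two well-separated point clouds standing in for embedded network nodes.
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
true = np.repeat([0, 1], 50)

pred = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(confusion_matrix(true, pred))       # per-label error counts
print(adjusted_rand_score(true, pred))    # chance-adjusted agreement
print(silhouette_score(X, pred))          # cluster cohesion vs. separation
```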