In this paper, we provide a new property of value at risk (VaR), a standard risk measure widely used in quantitative financial risk management. We show that the subadditivity of VaR for given loss random variables holds for any confidence level if and only if those random variables are comonotonic. This result also gives a new equivalent condition for the comonotonicity of random vectors.
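The comonotonic case described above can be illustrated numerically (a minimal sketch with hypothetical loss distributions, not the paper's proof): making two losses increasing functions of the same uniform random variable yields a comonotonic pair, for which the empirical VaR of the sum matches the sum of the individual VaRs up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(0)

def var(losses, alpha):
    # Empirical value at risk: the alpha-quantile of the loss sample.
    return np.quantile(losses, alpha)

# Comonotonic pair: both losses are increasing transforms of the same uniform U.
# (The transforms below are illustrative choices, not from the paper.)
u = rng.uniform(size=200_000)
x = np.exp(u)            # increasing in u
y = 3.0 * u + u**2       # also increasing in u on [0, 1]

alpha = 0.99
lhs = var(x + y, alpha)
rhs = var(x, alpha) + var(y, alpha)
# For comonotonic losses, VaR is additive: subadditivity holds with equality.
```

For non-comonotonic losses, the inequality can fail in either direction at some confidence level, which is what makes the equivalence above informative.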
This chapter presents a comprehensive workflow for applying network machine learning to functional MRI connectomes. We demonstrate data preprocessing, edge weight transformations, and spectral embedding techniques to analyze multiple brain networks simultaneously. Using multiple adjacency spectral embedding (MASE) and unsupervised clustering, we identify functionally similar brain regions across subjects. Results are visualized through abstract representations and brain-space projections, and compared with established brain parcellations. Our findings reveal that MASE-derived communities often align with known functional and spatial organization of the brain, particularly in occipital and parietal areas, while also identifying regions where functional similarity does not imply spatial proximity. We illustrate how network machine learning can uncover meaningful patterns in complex neuroimaging data, emphasizing the importance of combining algorithmic approaches with domain expertise to motivate the remainder of the book.
This chapter introduces the network machine learning landscape, bridging traditional machine learning with network-specific approaches. It defines networks, contrasts them with tabular data structures, and explains their ubiquity in various domains. The chapter outlines different types of network learning systems, including single vs. multiple network, attributed vs. non-attributed, and model-based vs. non-model-based approaches. It also discusses the scope of network analysis, from individual edges to entire networks. The chapter concludes by addressing key challenges in network machine learning, such as imperfect observations, partial network visibility, and sample limitations. Throughout, it emphasizes the importance of statistical learning in generalizing findings from network samples to broader populations, setting the stage for more advanced concepts in subsequent chapters.
This chapter presents a framework for learning useful representations, or embeddings, of networks. Building on the statistical models from Chapter 4, we explore techniques to transform complex network data into vector representations suitable for traditional machine learning algorithms. We begin with maximum likelihood estimation for simple network models, then motivate the need for network embeddings by contrasting network dependencies with typical machine learning independence assumptions. We progress through spectral embedding methods, introducing adjacency spectral embedding (ASE) for learning latent position representations from adjacency matrices, and Laplacian spectral embedding (LSE) as an alternative approach effective for networks with degree heterogeneities. The chapter then extends to multiple network representations, exploring parallel techniques like omnibus embedding (OMNI) and fused methods such as multiple adjacency spectral embedding (MASE). We conclude by addressing the estimation of appropriate latent dimensions for embeddings. Throughout, we emphasize practical applications with code examples and visualizations. This unified framework for network embedding enables the application of various machine learning algorithms to network analysis tasks, bridging complex network structures and traditional data analysis techniques.
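The adjacency spectral embedding (ASE) mentioned above can be sketched in a few lines (an illustrative toy example with made-up block-model parameters, not the book's implementation): take the top-d eigenpairs of the adjacency matrix and scale the eigenvectors by the square roots of the eigenvalue magnitudes to obtain estimated latent positions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-block stochastic block model (hypothetical parameters).
n = 100
z = np.repeat([0, 1], n // 2)               # community labels
B = np.array([[0.5, 0.1],
              [0.1, 0.4]])                  # block connection probabilities
P = B[z][:, z]                              # n x n edge-probability matrix
A = (rng.uniform(size=(n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                 # undirected, no self-loops

# ASE: top-d eigenpairs of A, eigenvectors scaled by sqrt(|eigenvalue|).
d = 2
vals, vecs = np.linalg.eigh(A)
order = np.argsort(np.abs(vals))[::-1][:d]
X_hat = vecs[:, order] * np.sqrt(np.abs(vals[order]))  # n x d latent positions
```

The rows of `X_hat` are point estimates of latent positions; nodes in the same block land close together, which is what makes downstream clustering on the embedding work.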
This appendix provides a concise introduction to key machine learning techniques employed throughout the book. It focuses on two main areas: unsupervised learning and Bayesian classification. The appendix begins with an exploration of K-means clustering, a fundamental unsupervised learning algorithm, demonstrating its application to network community detection. It then discusses methods for evaluating unsupervised learning techniques, including confusion matrices and the adjusted Rand index. The silhouette score is introduced as a metric for assessing clustering quality across different numbers of clusters. The appendix concludes with an explanation of the Bayes plugin classifier, a simple yet effective tool for network classification tasks.
This chapter establishes the foundation for network machine learning. We begin with network fundamentals: adjacency matrices, edge directionality, node loops, and edge weights. We then explore node-specific properties such as degree and path length, followed by network-wide metrics including density, clustering coefficients, and average path lengths. The chapter progresses to advanced matrix representations, notably degree matrices and various Laplacian forms, which are crucial for spectral analysis methods. We examine subnetworks and connected components, tools for focusing on relevant network structures. The latter half of the chapter delves into preprocessing techniques. We cover node pruning methods to manage outliers and low-degree nodes. Edge regularization techniques, including thresholding and sparsification, address issues in weighted and dense networks. Finally, we explore edge-weight rescaling methods such as z-score standardization and ranking-based approaches. Throughout, we emphasize practical applications, illustrating concepts with examples and code snippets. These preprocessing steps are vital for addressing noise, sparsity, and computational challenges in network data. By mastering these concepts and techniques, readers will be well-equipped to prepare network data for sophisticated machine learning tasks, setting the stage for the advanced methods presented in subsequent chapters.
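The matrix representations above can be sketched concretely (a minimal example on a hypothetical 4-node network, not the chapter's code): from an adjacency matrix we derive the degree matrix, the combinatorial Laplacian, the symmetric normalized Laplacian, and the network density.

```python
import numpy as np

# Hypothetical simple undirected network on 4 nodes.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

degrees = A.sum(axis=1)                  # node degrees
D = np.diag(degrees)                     # degree matrix

L = D - A                                # combinatorial Laplacian

# Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2} (guard zero degrees).
d_inv_sqrt = np.zeros_like(degrees)
nz = degrees > 0
d_inv_sqrt[nz] = degrees[nz] ** -0.5
L_sym = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

# Density of a simple undirected network: edges present / edges possible.
density = A.sum() / (len(A) * (len(A) - 1))
```

The Laplacian rows sum to zero by construction, and the normalized Laplacian's eigenvalues lie in [0, 2], properties that the spectral methods of later chapters rely on.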
This chapter explores deep learning methods for network analysis, focusing on graph neural networks (GNNs) and diffusion-based approaches. We introduce GNNs through a drug discovery case study, demonstrating how molecular structures can be analyzed as networks. The chapter covers GNN architecture, training processes, and their ability to learn complex network representations without explicit feature engineering. We then examine diffusion-based methods, which use random walks to develop network embeddings. These techniques are compared and contrasted with earlier spectral approaches, highlighting their capacity to capture nonlinear relationships and local network structures. Practical implementations using frameworks such as PyTorch Geometric illustrate the application of these methods to large-scale network datasets, showcasing their power in addressing complex network problems across various domains.
This chapter explores advanced applications of network machine learning for multiple networks. We introduce anomaly detection in time series of networks, identifying significant structural changes over time. The chapter then focuses on signal subnetwork estimation for network classification tasks. We present both incoherent and coherent approaches, with incoherent methods identifying edges that best differentiate between network classes, and coherent methods leveraging additional network structure to improve classification accuracy. Practical applications, such as classifying brain networks, are emphasized throughout. These techniques apply to collections of networks, providing a toolkit for analyzing and classifying complex, multinetwork datasets. By integrating previous concepts with new methodologies, we offer a framework for extracting insights and making predictions from diverse network structures with associated attributes.
In this paper, we study a two-period optimal insurance problem for a policyholder with mean-variance preferences who purchases proportional insurance at the beginning of each period. The insurance premium is calculated by a variance premium principle with a risk loading that depends on the policyholder’s claim history. We derive the time-consistent optimal insurance strategy in closed form and the optimal constant precommitment strategy in semi-closed form. For the optimal general precommitment strategy, we obtain the solution for the second period semi-explicitly, and then the solution for the first period numerically via an efficient algorithm. Furthermore, we compare the three types of optimal strategies, highlighting their differences, and we examine the impact of the key model parameters on the optimal strategies and value functions.
As social media continues to grow, understanding the impact of storytelling on stakeholder engagement becomes increasingly important for policymakers and organizations who wish to influence policymaking. While prior research has explored narrative strategies in advertising and branding, researchers have paid scant attention to the specific influence of stories on social media stakeholder engagement. This study addresses this gap by employing Narrative Transportation Theory (NTT) and leveraging Natural Language Processing (NLP) to analyze the intricate textual data generated by social media platforms. The analysis of 85,075 Facebook publications from leading Canadian manufacturing companies, using Spearman’s rank correlation coefficient, underscores that individual storytelling components—character, sequence of events, and setting—along with the composite narrative structure significantly enhance stakeholder engagement. This research contributes to a deeper understanding of storytelling dynamics in social media, emphasizing the importance of crafting compelling stories to drive meaningful stakeholder engagement in the digital realm. The results of our research can prove useful for those who wish to influence policymakers or for policymakers who want to promote new policies.
A predictive nomogram was developed to assess the risk of primary liver cancer (PLC) in hepatitis B patients. Data from 107 PLC patients and 107 controls were used as the training set, with 92 patients as the validation set. An additional 446 patients from other hospitals, including 15 with PLC, formed the external validation group. Multivariate logistic regression identified gender, BMI, alcohol consumption, diabetes, family history of liver cancer, cirrhosis, and HBV DNA load as independent risk factors. The model showed strong discrimination with AUCs of 0.882 and 0.859 in the training and validation sets, respectively, and good calibration (Hosmer–Lemeshow χ² = 2.648, P = 0.954; χ² = 4.117, P = 0.846). Decision curve analysis (DCA) confirmed clinical benefit within a risk threshold of 0.07–0.95. In the external validation group, the model maintained discrimination (AUC = 0.863) and calibration (Hosmer–Lemeshow χ² = 7.999, P = 0.434), with DCA showing net benefit across 0.14–0.95. These results indicate the nomogram is a reliable tool for PLC risk prediction in hepatitis B patients.
We introduce a novel class of bivariate common-shock discrete phase-type (CDPH) distributions to describe dependencies in loss modeling, with an emphasis on those induced by common shocks. By constructing two jointly evolving terminating Markov chains that share a common evolution up to a random time corresponding to the common shock component, and then proceed independently, we capture the essential features of risk events influenced by shared and individual-specific factors. We derive explicit expressions for the joint distribution of the termination times and prove various class and distributional properties, facilitating tractable analysis of the risks. Extending this framework, we model random sums where aggregate claims are sums of continuous phase-type random variables with counts determined by these termination times and show that their joint distribution belongs to the multivariate phase-type or matrix-exponential class. We develop estimation procedures for the CDPH distributions using the expectation-maximization algorithm and demonstrate the applicability of our models through simulation studies and an application to bivariate insurance claim frequency data. In particular, the distribution of the latent common shock component present in correlated count data can be estimated as well.
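The common-shock construction above can be illustrated with a toy simulation (a minimal sketch using single-phase, i.e. geometric, stages with made-up parameters, not the paper's CDPH estimator): two termination times share a common geometric stage and then accrue independent individual stages, which induces positive dependence between the counts.

```python
import numpy as np

rng = np.random.default_rng(7)
m = 100_000

# Hypothetical one-phase (geometric) stages; the paper's CDPH class allows
# general sub-stochastic transition matrices for each terminating chain.
shared = rng.geometric(p=0.3, size=m)   # common-shock stage, shared by both chains
extra1 = rng.geometric(p=0.5, size=m)   # chain-1 individual stage
extra2 = rng.geometric(p=0.4, size=m)   # chain-2 individual stage

n1 = shared + extra1                    # termination time of chain 1
n2 = shared + extra2                    # termination time of chain 2

# Positive correlation arises purely from the shared stage.
corr = np.corrcoef(n1, n2)[0, 1]
```

Conditionally on the shared stage the two counts are independent, so all dependence is attributable to the common shock, mirroring the decomposition the paper estimates from correlated claim-frequency data.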
Despite significant advances in Building Information Modeling (BIM) and increased adoption, numerous challenges remain. Discipline-specific BIM software tools with file storage have unresolved interoperability issues and do not capture or express interdisciplinary design intent. This hobbles machines’ ability to process design information. The lack of suitable data representation hinders the application of machine learning and other data-centric applications in building design. We propose Building Information Graphs (BIGs) as an alternative modeling method. In BIGs, discipline-specific design models are compiled as subgraphs in which nodes and edges model objects and their relationships. Additional nodes and edges in a meta-graph link the building objects across subgraphs. Capturing both intradisciplinary and interdisciplinary relationships, BIGs provide a dimension of contextual data for capturing design intent and constraints. BIGs are designed for computation and applications. The explicit relationships enable advanced graph functionalities, such as across-domain change propagation and object-level version control. BIGs preserve multimodal design data (geometry, attributes, and topology) in a graph structure that can be embedded into high-dimensional vectors, in which learning algorithms can detect statistical patterns and support a wide range of downstream tasks, such as link prediction and graph generation. In this position article, we highlight three key challenges: encapsulating and formalizing object relationships, particularly design intent and constraints; designing graph learning techniques; and developing innovative domain applications that leverage graph structures and learning. BIGs represent a paradigm shift in design technologies that bridge artificial intelligence and building design to enable intelligent and generative design tools for architects, engineers, and contractors.
Over 193 countries have signed at least one of more than 500 multilateral treaties addressing critical global issues, such as human rights, environmental protection, and trade. Ratifying a treaty obligates a country, as a “State Party,” to report to the United Nations on its progress toward implementing the treaty’s provisions. These reports and their associated review processes generate a wealth of textual data. Effectively monitoring, reviewing, and assessing national, regional, and global progress toward these treaty commitments is crucial for ensuring compliance and realizing the benefits of international cooperation. The UN Convention on the Rights of Persons with Disabilities (CRPD), which has been ratified by 191 countries, exemplifies this challenge. With over 1.3 billion people worldwide living with disabilities, the CRPD aims to promote a shift from a charity-based “medical model” that views disability as an individual deficiency, to a rights-based “social justice model” that emphasizes societal barriers and inclusivity. Each State Party submits periodic reports to the Committee on the Rights of Persons with Disabilities detailing their implementation efforts. This study analyzed all available CRPD State Reports (N = 170) using text mining, Natural Language Processing, and generative AI tools to assess global progress, identify regional variations, and explore the factors influencing successful implementation. The findings reveal evidence of widespread CRPD implementation, growing support for social justice and economic inclusion, and the importance of civil society engagement. This study’s hybrid data analysis approach offers a promising framework for harnessing the power of textual data to advance the realization of treaty commitments worldwide.