This chapter establishes the foundation for network machine learning. We begin with network fundamentals: adjacency matrices, edge directionality, node loops, and edge weights. We then explore node-specific properties such as degree and path length, followed by network-wide metrics including density, clustering coefficients, and average path lengths. The chapter progresses to advanced matrix representations, notably degree matrices and various Laplacian forms, which are crucial for spectral analysis methods. We examine subnetworks and connected components, tools for focusing on relevant network structures. The latter half of the chapter delves into preprocessing techniques. We cover node pruning methods to manage outliers and low-degree nodes. Edge regularization techniques, including thresholding and sparsification, address issues in weighted and dense networks. Finally, we explore edge-weight rescaling methods such as z-score standardization and ranking-based approaches. Throughout, we emphasize practical applications, illustrating concepts with examples and code snippets. These preprocessing steps are vital for addressing noise, sparsity, and computational challenges in network data. By mastering these concepts and techniques, readers will be well-equipped to prepare network data for sophisticated machine learning tasks, setting the stage for the advanced methods presented in subsequent chapters.
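To make the matrix representations above concrete, here is a minimal sketch, assuming only numpy (it is not the chapter's own code): it builds the degree matrix and combinatorial Laplacian from a toy weighted adjacency matrix, then applies the edge-thresholding step described above.

```python
# Minimal sketch: degree matrix, Laplacian, and edge thresholding.
# The network and the threshold value are illustrative, not from the text.
import numpy as np

# Toy weighted, undirected network on 4 nodes (symmetric adjacency matrix).
A = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.8, 0.2],
              [0.1, 0.8, 0.0, 0.7],
              [0.0, 0.2, 0.7, 0.0]])

D = np.diag(A.sum(axis=1))   # degree matrix: weighted degrees on the diagonal
L = D - A                    # combinatorial (unnormalized) Laplacian

# Edge thresholding: zero out edges whose weight does not exceed a cutoff.
tau = 0.5
A_thresholded = np.where(A > tau, A, 0.0)
```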
This chapter explores deep learning methods for network analysis, focusing on graph neural networks (GNNs) and diffusion-based approaches. We introduce GNNs through a drug discovery case study, demonstrating how molecular structures can be analyzed as networks. The chapter covers GNN architecture, training processes, and their ability to learn complex network representations without explicit feature engineering. We then examine diffusion-based methods, which use random walks to develop network embeddings. These techniques are compared and contrasted with earlier spectral approaches, highlighting their capacity to capture nonlinear relationships and local network structures. Practical implementations using frameworks such as PyTorch Geometric illustrate the application of these methods to large-scale network datasets, showcasing their power in addressing complex network problems across various domains.
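As a concrete illustration of the kind of architecture discussed in this chapter, here is a minimal two-layer GNN sketch in PyTorch Geometric; the layer sizes and the absence of a task-specific head are assumptions for brevity, not the chapter's own model.

```python
# Minimal two-layer graph convolutional network (GCN) sketch.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class TinyGNN(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_dim)

    def forward(self, x, edge_index):
        # Each GCNConv layer aggregates features over a node's neighbors,
        # so two stacked layers mix information along 2-hop paths.
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)
```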
This chapter explores advanced applications of network machine learning for multiple networks. We introduce anomaly detection in time series of networks, identifying significant structural changes over time. The chapter then focuses on signal subnetwork estimation for network classification tasks. We present both incoherent and coherent approaches, with incoherent methods identifying edges that best differentiate between network classes, and coherent methods leveraging additional network structure to improve classification accuracy. Practical applications, such as classifying brain networks, are emphasized throughout. These techniques apply to collections of networks, providing a toolkit for analyzing and classifying complex, multinetwork datasets. By integrating previous concepts with new methodologies, we offer a framework for extracting insights and making predictions from diverse network structures with associated attributes.
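For intuition about the incoherent approach, here is a minimal numpy sketch, assuming two stacks of adjacency matrices, one per class; the difference-in-means statistic used here is an illustrative stand-in for whatever edge-wise test statistic the chapter actually employs.

```python
# Minimal sketch of incoherent signal-edge selection between two classes.
import numpy as np

def incoherent_signal_edges(nets_a, nets_b, k):
    """nets_a, nets_b: arrays of shape (m, n, n) holding adjacency matrices."""
    # Edge-wise separation statistic: absolute difference of class means.
    stat = np.abs(nets_a.mean(axis=0) - nets_b.mean(axis=0))
    iu = np.triu_indices(stat.shape[0], k=1)   # undirected: upper triangle only
    top = np.argsort(stat[iu])[::-1][:k]       # k most class-separating edges
    return [(int(iu[0][i]), int(iu[1][i])) for i in top]
```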
In this paper, we study a two-period optimal insurance problem for a policyholder with mean-variance preferences who purchases proportional insurance at the beginning of each period. The insurance premium is calculated by a variance premium principle with a risk loading that depends on the policyholder’s claim history. We derive the time-consistent optimal insurance strategy in closed form and the optimal constant precommitment strategy in semiclosed form. For the optimal general precommitment strategy, we obtain the solution for the second period semi-explicitly and then the solution for the first period numerically via an efficient algorithm. Furthermore, we compare the three types of optimal strategies, highlighting their differences, and examine the impact of the key model parameters on the optimal strategies and value functions.
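For readers who want the standard building blocks behind this setup (the notation below is generic, not necessarily the paper's own), the mean-variance criterion and the variance premium principle take the following form; in this paper, the risk loading θ is additionally tied to the policyholder's claim history:

```latex
% Mean-variance criterion (risk aversion \gamma > 0) and
% variance premium principle (risk loading \theta > 0).
J(W) = \mathbb{E}[W] - \frac{\gamma}{2}\,\operatorname{Var}(W),
\qquad
\pi(X) = \mathbb{E}[X] + \theta\,\operatorname{Var}(X).
```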
As social media continues to grow, understanding the impact of storytelling on stakeholder engagement becomes increasingly important for policymakers and organizations that wish to influence policymaking. While prior research has explored narrative strategies in advertising and branding, researchers have paid scant attention to the specific influence of stories on social media stakeholder engagement. This study addresses this gap by employing Narrative Transportation Theory (NTT) and leveraging Natural Language Processing (NLP) to analyze the intricate textual data generated by social media platforms. Our analysis of 85,075 Facebook posts from leading Canadian manufacturing companies, using Spearman’s rank correlation coefficient, shows that individual storytelling components (character, sequence of events, and setting), along with the composite narrative structure, significantly enhance stakeholder engagement. This research contributes to a deeper understanding of storytelling dynamics in social media, emphasizing the importance of crafting compelling stories to drive meaningful stakeholder engagement in the digital realm. Our results can prove useful for those who wish to influence policymakers or for policymakers who want to promote new policies.
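The correlation step reported above is simple to reproduce in outline. The following sketch uses scipy on hypothetical placeholder data; the variable names and values are assumptions, not the study's coding scheme.

```python
# Minimal sketch: Spearman's rank correlation between a storytelling
# score and an engagement measure (both hypothetical).
from scipy.stats import spearmanr

narrative_score = [0.2, 0.5, 0.1, 0.9, 0.7]   # e.g., strength of narrative elements
engagement = [12, 40, 8, 95, 61]              # e.g., likes + comments + shares

rho, p_value = spearmanr(narrative_score, engagement)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
```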
We aimed to determine the prevalence of antimicrobial resistance, carriage of Panton-Valentine leucocidin (PVL), and the clonal structure of MRSA isolates collected from skin and soft tissue infections at a tertiary care hospital in Pakistan. Between August 2021 and May 2022, 154 non-repetitive MRSA isolates were consecutively collected and characterized by antimicrobial susceptibility testing, SCCmec typing, spa typing, and detection of PVL by PCR. MLST clonal complexes (CCs) were inferred from spa type using the Based Upon Repeat Pattern (BURP) algorithm. High levels of resistance were observed to ciprofloxacin (85.7%), erythromycin (76.0%), sulfamethoxazole (68.8%), gentamicin (68.8%), fusidic acid (57.8%), tetracycline (55.8%), and clindamycin (42.2%). Clonal analysis revealed 16 lineages, with the most frequent being CC8-MRSA-IV (27.3%), PVL-positive “Bengal Bay” CC1/ST772-MRSA-V (26.0%), and CC1-MRSA-IV (16.2%). PVL was detected in 45.5% of isolates across multiple lineages. Our findings highlight the coexistence of high antimicrobial resistance and frequent PVL carriage among MRSA in Pakistan. Given the association of PVL with severe infections and the limited treatment options for multidrug-resistant strains, these data underscore a significant public health concern and the need for systematic surveillance and prudent antibiotic use.
A predictive nomogram was developed to assess the risk of primary liver cancer (PLC) in hepatitis B patients. Data from 107 PLC patients and 107 controls were used as the training set, with 92 patients as the validation set. An additional 446 patients from other hospitals, including 15 with PLC, formed the external validation group. Multivariate logistic regression identified gender, BMI, alcohol consumption, diabetes, family history of liver cancer, cirrhosis, and HBV DNA load as independent risk factors. The model showed strong discrimination, with AUCs of 0.882 and 0.859 in the training and validation sets, respectively, and good calibration (Hosmer–Lemeshow χ² = 2.648, P = 0.954; χ² = 4.117, P = 0.846). Decision curve analysis (DCA) confirmed clinical benefit within a risk threshold of 0.07–0.95. In the external validation group, the model maintained discrimination (AUC = 0.863) and calibration (Hosmer–Lemeshow χ² = 7.999, P = 0.434), with DCA showing net benefit across 0.14–0.95. These results indicate that the nomogram is a reliable tool for PLC risk prediction in hepatitis B patients.
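For orientation, the discrimination check described above can be sketched with scikit-learn on synthetic data; the seven generated features merely echo the number of risk factors in the model and are not the study's variables.

```python
# Minimal sketch: logistic regression discrimination measured by AUC.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with 7 predictors (the study's own data differ).
X, y = make_classification(n_samples=400, n_features=7, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```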
We introduce a novel class of bivariate common-shock discrete phase-type (CDPH) distributions to describe dependencies in loss modeling, with an emphasis on those induced by common shocks. By constructing two jointly evolving terminating Markov chains that share a common evolution up to a random time corresponding to the common shock component, and then proceed independently, we capture the essential features of risk events influenced by shared and individual-specific factors. We derive explicit expressions for the joint distribution of the termination times and prove various class and distributional properties, facilitating tractable analysis of the risks. Extending this framework, we model random sums where aggregate claims are sums of continuous phase-type random variables with counts determined by these termination times and show that their joint distribution belongs to the multivariate phase-type or matrix-exponential class. We develop estimation procedures for the CDPH distributions using the expectation-maximization algorithm and demonstrate the applicability of our models through simulation studies and an application to bivariate insurance claim frequency data. In particular, the distribution of the latent common shock component present in correlated count data can be estimated as well.
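As background (a standard fact rather than the paper's new contribution), a discrete phase-type random variable τ with initial distribution α and sub-transition matrix T has the probability mass function below; the CDPH construction couples two such termination times by letting the underlying chains evolve together until the common shock time.

```latex
% Discrete phase-type pmf; t collects the one-step absorption probabilities.
\mathbb{P}(\tau = n) = \boldsymbol{\alpha}\, T^{\,n-1}\, \boldsymbol{t},
\qquad n \ge 1,
\qquad \boldsymbol{t} = (I - T)\mathbf{1}.
```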
Despite significant advances in Building Information Modeling (BIM) and increased adoption, numerous challenges remain. Discipline-specific BIM software tools with file storage have unresolved interoperability issues and do not capture or express interdisciplinary design intent. This hobbles machines’ ability to process design information. The lack of suitable data representation hinders the application of machine learning and other data-centric applications in building design. We propose Building Information Graphs (BIGs) as an alternative modeling method. In BIGs, discipline-specific design models are compiled as subgraphs in which nodes and edges model objects and their relationships. Additional nodes and edges in a meta-graph link the building objects across subgraphs. Capturing both intradisciplinary and interdisciplinary relationships, BIGs provide a dimension of contextual data for capturing design intent and constraints. BIGs are designed for computation and applications. The explicit relationships enable advanced graph functionalities, such as across-domain change propagation and object-level version control. BIGs preserve multimodal design data (geometry, attributes, and topology) in a graph structure that can be embedded into high-dimensional vectors, in which learning algorithms can detect statistical patterns and support a wide range of downstream tasks, such as link prediction and graph generation. In this position article, we highlight three key challenges: encapsulating and formalizing object relationships, particularly design intent and constraints; designing graph learning techniques; and developing innovative domain applications that leverage graph structures and learning. BIGs represent a paradigm shift in design technologies that bridge artificial intelligence and building design to enable intelligent and generative design tools for architects, engineers, and contractors.
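A minimal sketch of the data structure, assumed here for illustration with networkx (the article's own schema is richer), shows discipline subgraphs holding intradisciplinary relationships while meta-graph edges link objects across disciplines and carry design intent.

```python
# Minimal sketch of a Building Information Graph: discipline subgraphs
# plus meta-graph edges linking objects across disciplines.
import networkx as nx

big = nx.MultiDiGraph()

# Architectural subgraph: a wall hosts a door (illustrative object names).
big.add_edge("arch:wall_01", "arch:door_01", relation="hosts", discipline="arch")

# Structural subgraph: a column supports a beam.
big.add_edge("str:column_01", "str:beam_01", relation="supports", discipline="str")

# Meta-graph edge across subgraphs, annotated with design intent.
big.add_edge("arch:wall_01", "str:column_01",
             relation="aligned_with", scope="meta",
             intent="wall centerline fixed to structural grid")
```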
Over 193 countries have signed at least one of more than 500 multilateral treaties addressing critical global issues, such as human rights, environmental protection, and trade. Ratifying a treaty obligates a country, as a “State Party,” to report to the United Nations on its progress toward implementing the treaty’s provisions. These reports and their associated review processes generate a wealth of textual data. Effectively monitoring, reviewing, and assessing national, regional, and global progress toward these treaty commitments is crucial for ensuring compliance and realizing the benefits of international cooperation. The UN Convention on the Rights of Persons with Disabilities (CRPD), which has been ratified by 191 countries, exemplifies this challenge. With over 1.3 billion people worldwide living with disabilities, the CRPD aims to promote a shift from a charity-based “medical model” that views disability as an individual deficiency to a rights-based “social justice model” that emphasizes societal barriers and inclusivity. Each State Party submits periodic reports to the Committee on the Rights of Persons with Disabilities detailing its implementation efforts. This study analyzed all available CRPD State Reports (N = 170) using text mining, Natural Language Processing, and generative AI tools to assess global progress, identify regional variations, and explore the factors influencing successful implementation. The findings reveal evidence of widespread CRPD implementation, growing support for social justice and economic inclusion, and the importance of civil society engagement. The hybrid data analysis approach of this study offers a promising framework for harnessing the power of textual data to advance the realization of treaty commitments worldwide.
The population-based structural health monitoring (PBSHM) paradigm has recently emerged as a promising approach to enhance data-driven assessment of engineering structures by facilitating transfer learning between structures with some degree of similarity. In this work, we apply this concept to the automated modal identification of structural systems. We introduce a graph neural network (GNN)-based deep learning scheme to identify modal properties, including natural frequencies, damping ratios, and mode shapes of engineering structures, based on the power spectral density of spatially sparse vibration measurements. Systematic numerical experiments are conducted to evaluate the proposed model, employing two distinct truss populations that possess similar topological characteristics but varying geometric (size and shape) and material (stiffness) properties. The results demonstrate that, once trained, the proposed GNN-based model can identify modal properties of unseen structures within the same structural population with good efficiency and acceptable accuracy, even in the presence of measurement noise and sparse measurement locations. The GNN-based model exhibits advantages over the classic frequency domain decomposition method in terms of identification speed, as well as over an alternative multilayer perceptron architecture in terms of identification accuracy, rendering it a promising tool for PBSHM purposes.
This article investigates global patterns of facilitation and interference among identities—socially recognizable categories that shape individuals’ sense of who they are and carry cultural expectations (e.g., mother, worker). While identity theory suggests that identities interact in structured ways, existing research often examines identities in isolation or conventional roles, limiting the ability to observe broader patterns. This study adopts a relational approach to explore how identities facilitate or interfere with each other. By drawing on sociological identity theory, I formulate hypotheses about these interactions. Using original survey data, I construct identity networks where nodes represent identities and ties indicate the prevalence of facilitation or interference. Blockmodeling techniques are then employed to characterize the global structure of these networks. The findings reveal distinct positions within the network, largely aligning with theoretical expectations.
Industrial mobile robots as service units will increasingly be used in factories with Industry 4.0 production cells arranged in an island-like manner. The differences between the mobile robots available on the market make it necessary to support the optimal selection and use of these robots. In this article, we present a concept that places the mobile robot at the center of the investigation of the manufacturing system. This approach helps to find the optimal solution when selecting robots. With the parameters that can be included, the robot can be characterized in its manufacturing-system environment, making it much easier to express and compute capacity, performance, and efficiency characteristics than with previous models. We also present a case study based on the outlined method, which investigates robot utilization as a function of battery capacity and the number of packages to be transported.
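A hypothetical back-of-the-envelope version of the case-study quantity might look as follows; every parameter and the simple charging model are illustrative assumptions, not the article's model.

```python
# Hypothetical sketch: robot utilization as a function of battery capacity
# and the number of packages to be transported.
def utilization(n_packages, trip_time_h, battery_capacity_h, charge_time_h):
    transport_time = n_packages * trip_time_h
    # Assume one charging break per full battery discharge.
    n_charges = int(transport_time // battery_capacity_h)
    total_time = transport_time + n_charges * charge_time_h
    return transport_time / total_time

# Example: 120 packages, 15-minute trips, 6 h battery, 1.5 h recharge.
print(utilization(120, 0.25, 6.0, 1.5))   # -> 0.8
```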
Cyber breaches pose a significant threat to both enterprises and society. Analyzing cyber breach data is essential for improving cyber risk management and developing effective cyber insurance policies. However, modeling cyber risk is challenging due to its inherent characteristics, including sparsity, heterogeneity, heavy tails, and dependence. This work introduces a cluster-based dependence model that captures both temporal and cross-group dependencies, providing a more accurate representation of multivariate cyber breach risks. The proposed framework employs a cluster-based kernel approach to model breach severity, effectively handling heterogeneity and extreme values, while a copula-based method is used to capture multivariate dependence. Our findings, validated through both empirical and synthetic studies, demonstrate that the proposed model effectively captures the statistical characteristics of multivariate cyber breach risks and outperforms commonly used models in predictive accuracy. Furthermore, we show that our approach can enhance cyber insurance pricing by generating more profitable insurance contracts.
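To illustrate the copula step in outline, here is a minimal sketch using a Gaussian copula as a stand-in (the paper's copula family and estimation procedure may differ): each margin is converted to pseudo-observations by ranking, mapped to normal scores, and the dependence is read off their correlation.

```python
# Minimal Gaussian-copula sketch: estimate dependence from normal scores.
import numpy as np
from scipy.stats import norm, rankdata

def gaussian_copula_corr(x, y):
    u = rankdata(x) / (len(x) + 1)   # pseudo-observations in (0, 1)
    v = rankdata(y) / (len(y) + 1)
    z = norm.ppf(np.column_stack([u, v]))   # map margins to normal scores
    return np.corrcoef(z, rowvar=False)[0, 1]
```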
Climate conditions are known to modulate infectious disease transmission, yet their impact on measles transmission remains underexplored. In this study, we investigate the extent to which climate conditions modulate measles transmission, using measles incidence data from China for 2005–2008. Three climate-forced models were employed: a sinusoidal function, an absolute humidity (AH)-forced model, and an AH and temperature (AH/T)-forced model. These models were integrated into an inference framework consisting of a susceptible–exposed–infectious–recovered (SEIR) model and an iterated filter (IF2) to estimate epidemiological characteristics and assess climate influences on measles transmission. During the study period, measles epidemics peaked in spring in northern China and were more diverse in the south. Our analyses showed that the AH/T model better captured measles epidemic dynamics in northern China, suggesting a combined impact of humidity and temperature on measles transmission. Furthermore, we preliminarily examined the impact of other factors and found that population susceptibility and incidence rate were both positively correlated with migrant worker influx, suggesting that higher susceptibility among migrant workers may sustain measles transmission. Taken together, our study supports a role for humidity and temperature in modulating measles transmission and identifies additional factors shaping measles epidemic dynamics in China.
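For reference, the deterministic skeleton of the SEIR model named above can be sketched as follows; the parameter values are illustrative, and in the study the transmission rate is forced by humidity and temperature rather than held constant.

```python
# Minimal SEIR sketch (constant transmission rate; the paper's models
# instead force beta with absolute humidity and temperature).
import numpy as np
from scipy.integrate import odeint

def seir(state, t, beta, sigma, gamma):
    S, E, I, R = state
    N = S + E + I + R
    dS = -beta * S * I / N              # new infections leave S
    dE = beta * S * I / N - sigma * E   # exposed become infectious at rate sigma
    dI = sigma * E - gamma * I          # infectious recover at rate gamma
    dR = gamma * I
    return [dS, dE, dI, dR]

t = np.linspace(0, 365, 366)
sol = odeint(seir, [1e6 - 10, 0, 10, 0], t, args=(0.6, 1/8, 1/5))
```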
We study the local limit in distribution of Bienaymé–Galton–Watson trees conditioned on having large sub-populations. Assuming a generic and aperiodic condition on the offspring distribution, we prove the existence of a limit, given by Kesten’s tree associated with a certain critical offspring distribution.