The expanding application of advanced analytics in insurance has generated numerous opportunities, such as more accurate predictive modeling powered by machine learning and artificial intelligence (AI) methods, the utilization of novel and unstructured datasets, and the automation of key operations. Significant advances in these areas are being made through novel applications and adaptations of predictive modeling techniques for insurance purposes, while, concurrently, rapid advances in machine learning methods are being made outside of the insurance sector. However, these innovations also bring substantial challenges, particularly around the transparency, explanation, and fairness of complex algorithmic models, and the economic and societal impacts of their adoption in decision-making. As insurance is a highly regulated industry, regulators may require models to be explainable, to enable analysis of the basis for decision-making. Given the societal importance of insurance, significant attention is being paid to ensuring that insurance models do not discriminate unfairly. In this special issue, we feature papers that explore key issues in insurance analytics, focusing on prediction, explainability, and fairness.
This article examines the National Health Data Network (RNDS), the platform launched by the Ministry of Health in Brazil as the primary tool for its Digital Health Strategy 2020–2028, including its innovation aspects. The analysis proceeds through two distinct frameworks: the right to health and personal data protection in Brazil. The first approach is rooted in the legal framework shaped by Brazil’s trajectory on health since 1988, marked by the formal acknowledgment of the right to health and the establishment of the Unified Health System, Brazil’s universal access health system, encompassing public healthcare and public health actions. The second approach stems from the repercussions of the General Data Protection Law, enacted in 2018, and the inclusion of the right to personal data protection in the Brazilian Constitution. This legislation, akin to the EU’s General Data Protection Regulation, addressed the gap in personal data protection in Brazil and established principles and rules for data processing. The article begins by explaining the two approaches, then provides a brief history of health informatics policies in Brazil, leading to the current Digital Health Strategy and the RNDS. Subsequently, it analyzes the RNDS through the lenses of the two aforementioned approaches. In the final discussion sections, the article draws lessons from the analyses, particularly in light of ongoing debates such as the secondary use of data for innovation in the context of differing interpretations of innovation policy.
The embedding problem of Markov chains examines whether a stochastic matrix $\mathbf{P}$ can arise as the transition matrix from time 0 to time 1 of a continuous-time Markov chain. When the chain is homogeneous, it checks if $\mathbf{P}=\exp(\mathbf{Q})$ for a rate matrix $\mathbf{Q}$ with zero row sums and non-negative off-diagonal elements, called a Markov generator. It is known that a Markov generator may not always exist or be unique. This paper addresses finding $\mathbf{Q}$, assuming that the process has at most one jump per unit time interval, and focuses on the problem of aligning the conditional one-jump transition matrix from time 0 to time 1 with $\mathbf{P}$. We derive a formula for this matrix in terms of $\mathbf{Q}$ and establish that for any $\mathbf{P}$ with non-zero diagonal entries, a unique $\mathbf{Q}$, called the ${\unicode{x1D7D9}}$-generator, exists. We compare the ${\unicode{x1D7D9}}$-generator with the one-jump rate matrix from Jarrow, Lando, and Turnbull (1997), showing which is a better approximate Markov generator of $\mathbf{P}$ in some practical cases.
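As an illustration of the classical homogeneous check $\mathbf{P}=\exp(\mathbf{Q})$ described above (not the paper’s ${\unicode{x1D7D9}}$-generator construction), the sketch below, assuming NumPy/SciPy and an invented example matrix `Q_true`, takes the principal matrix logarithm of $\mathbf{P}$ and tests whether it qualifies as a Markov generator:

```python
import numpy as np
from scipy.linalg import expm, logm

def candidate_generator(P, tol=1e-9):
    """Try to recover a Markov generator Q with P = expm(Q).

    Returns (Q, is_generator): Q is the principal matrix logarithm of P,
    and is_generator indicates whether Q has (approximately) zero row sums
    and non-negative off-diagonal entries.
    """
    Q = logm(P).real
    row_sums_ok = np.allclose(Q.sum(axis=1), 0.0, atol=tol)
    off_diag = Q - np.diag(np.diag(Q))
    off_diag_ok = (off_diag >= -tol).all()
    return Q, bool(row_sums_ok and off_diag_ok)

# Invented example: start from a valid rate matrix with small norm, so the
# principal logarithm of P = expm(Q_true) recovers Q_true itself.
Q_true = np.array([[-0.3, 0.2, 0.1],
                   [0.1, -0.4, 0.3],
                   [0.2, 0.2, -0.4]])
P = expm(Q_true)
Q_hat, ok = candidate_generator(P)
```

Note that the principal logarithm is only one branch; as the abstract stresses, a Markov generator may fail to exist or to be unique for a general $\mathbf{P}$.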
Turbulent flows are chaotic and multi-scale dynamical systems with large numbers of degrees of freedom. Turbulent flows can, however, be modeled with a smaller number of degrees of freedom when an appropriate coordinate system is used, which is the goal of dimensionality reduction via nonlinear autoencoders. Autoencoders are expressive tools, but they are difficult to interpret. This article proposes a method to aid the interpretability of autoencoders. First, we introduce the decoder decomposition, a post-processing method that connects the latent variables to the coherent structures of flows. Second, we apply the decoder decomposition to analyze the latent space of synthetic data of a two-dimensional unsteady wake past a cylinder. We find that the dimension of the latent space has a significant impact on the interpretability of autoencoders, and we identify the physical and spurious latent variables. Third, we apply the decoder decomposition to the latent space of wind-tunnel experimental data of a three-dimensional turbulent wake past a bluff body. We show that the reconstruction error is a function of both the latent space dimension and the decoder size, which are correlated. Finally, we apply the decoder decomposition to rank and select latent variables based on the coherent structures that they represent. This is useful for filtering unwanted or spurious latent variables and for pinpointing specific coherent structures of interest. The ability to rank and select latent variables will help users design and interpret nonlinear autoencoders.
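The idea of ranking latent variables by the flow structures they generate can be sketched on a toy linear analogue (this is not the article’s decoder decomposition; all names, sizes, and scales below are invented): each latent variable contributes one reconstruction field, and the fields are ranked by their energy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "decoder": x ≈ W @ z, with 3 latent variables and 50 sensors.
n_latent, n_out, n_snapshots = 3, 50, 200
W = rng.normal(size=(n_out, n_latent))
# Latent time series with deliberately different amplitudes (3 > 1 > 0.1),
# so one variable dominates and one is nearly spurious.
Z = rng.normal(size=(n_latent, n_snapshots)) * np.array([[3.0], [1.0], [0.1]])

# Decompose the reconstruction into one field per latent variable and rank
# the variables by the energy (variance) of the field each one generates.
fields = [np.outer(W[:, i], Z[i]) for i in range(n_latent)]
energy = np.array([f.var() for f in fields])
ranking = np.argsort(energy)[::-1]  # most energetic latent variable first
```

For a nonlinear decoder, the per-variable contributions are no longer simple outer products, which is precisely where a dedicated decomposition method is needed.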
Usage data on research outputs such as books and journals is well established in the scholarly community. Yet, as research impact is derived from a broader set of scholarly outputs, such as data, code, and multimedia, more holistic usage and impact metrics could inform national innovation and research policy. While usage data reporting standards, such as Project COUNTER, provide the basis for shared statistics reporting practice, mandated access to publicly funded research has increased the demand for impact metrics and analytics. In this context, stakeholders are exploring how to scaffold and strengthen shared infrastructure to better support the trusted, multistakeholder exchange of usage data across a variety of outputs. In April 2023, a workshop on Exploring National Infrastructure for Public Access and Impact Reporting supported by the United States (US) National Science Foundation (NSF) explored these issues. This paper contextualizes the resources shared and recommendations generated in the workshop.
In Chung–Lu random graphs, a classic model for real-world networks, each vertex is equipped with a weight drawn from a power-law distribution, and two vertices form an edge independently with probability proportional to the product of their weights. Chung–Lu graphs have average distance $O(\log\log n)$ and thus reproduce the small-world phenomenon, a key property of real-world networks. Modern, more realistic variants of this model also equip each vertex with a random position in a specific underlying geometry. The edge probability of two vertices then depends, say, inversely polynomially on their distance.
In this paper we study a generic augmented version of Chung–Lu random graphs. We analyze a model where the edge probability of two vertices can depend arbitrarily on their positions, as long as the marginal probability of forming an edge (for two vertices with fixed weights, one fixed position, and one random position) is as in a Chung–Lu random graph. The resulting class contains Chung–Lu random graphs, hyperbolic random graphs, and geometric inhomogeneous random graphs as special cases.
Our main result is that every random graph model in this general class has the same average distance as a Chung–Lu random graph, up to a factor of $1+o(1)$. This shows in particular that specific choices, such as taking the underlying geometry to be Euclidean, do not significantly influence the average distance. The proof also shows that every random graph model in our class has a giant component and polylogarithmic diameter with high probability.
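As a rough illustration of the baseline model discussed above, the sketch below (assuming the networkx library; the graph size and weight distribution are illustrative choices, not from the paper) samples a Chung–Lu graph with power-law weights and measures the average distance within its giant component:

```python
import networkx as nx
import numpy as np

rng = np.random.default_rng(1)
n = 2000
# Classical Pareto weights with tail exponent 1.5, i.e. a degree power law
# with exponent ~2.5, scaled so the mean expected degree is around 6.
weights = (rng.pareto(1.5, size=n) + 1.0) * 2.0

# networkx's expected_degree_graph implements the Chung-Lu model:
# vertices i and j are joined independently with probability
# ~ w_i * w_j / sum(w), capped at 1.
G = nx.expected_degree_graph(weights, selfloops=False)

# Average distance within the giant component.
giant = G.subgraph(max(nx.connected_components(G), key=len))
avg_dist = nx.average_shortest_path_length(giant)
```

With the power-law exponent in $(2,3)$, the average distance stays small even as $n$ grows, which is the $O(\log\log n)$ behaviour the abstract refers to; at this modest $n$ the measurement only hints at that scaling.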
We consider the performance of Glauber dynamics for the random cluster model with real parameter $q>1$ and temperature $\beta>0$. Recent work by Helmuth, Jenssen, and Perkins detailed the ordered/disordered transition of the model on random $\Delta$-regular graphs for all sufficiently large $q$ and obtained an efficient sampling algorithm for all temperatures $\beta$ using cluster expansion methods. Despite this major progress, the performance of natural Markov chains, including Glauber dynamics, is not yet well understood on the random regular graph, partly because of the non-local nature of the model (especially at low temperatures) and partly because of severe bottleneck phenomena that emerge in a window around the ordered/disordered transition. Nevertheless, it is widely conjectured that the bottleneck phenomena that impede mixing from worst-case starting configurations can be avoided by initialising the chain more judiciously. Our main result establishes this conjecture for all sufficiently large $q$ (with respect to $\Delta$). Specifically, we consider the mixing time of Glauber dynamics initialised from the two extreme configurations, the all-in and the all-out, and obtain a pair of fast mixing bounds which cover all temperatures $\beta$, including in particular the bottleneck window. Our result is inspired by the recent approach of Gheissari and Sinclair for the Ising model, who obtained a similarly flavoured mixing-time bound on the random regular graph for sufficiently low temperatures. To cover all temperatures in the random cluster model, we refine appropriately the structural results of Helmuth, Jenssen, and Perkins about the ordered/disordered transition and show spatial mixing properties ‘within the phase’, which are then related to the evolution of the chain.
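The single-edge heat-bath update underlying Glauber dynamics for the random cluster model can be sketched as follows (assuming networkx; the graph, `p`, and `q` are illustrative, with `p` playing the role of $1-e^{-\beta}$, and this toy makes no claim about the mixing-time questions the paper studies):

```python
import random
import networkx as nx

def rc_glauber_step(G, state, p, q, rng):
    """One heat-bath (Glauber) update for the random cluster model.

    state maps each edge to open (True) / closed (False). If the chosen
    edge's endpoints are already joined by other open edges, opening it
    leaves the cluster count unchanged and happens with probability p;
    otherwise opening it merges two clusters, so the q^{k(A)} weight makes
    the conditional probability p / (p + q * (1 - p)).
    """
    e = rng.choice(list(G.edges()))
    u, v = e
    # Subgraph of open edges other than e, keeping isolated vertices.
    H = nx.Graph()
    H.add_nodes_from(G.nodes())
    H.add_edges_from(f for f, is_open in state.items() if is_open and f != e)
    prob = p if nx.has_path(H, u, v) else p / (p + q * (1.0 - p))
    state[e] = rng.random() < prob

# Run the chain from the all-in start on a small random regular graph.
rng = random.Random(0)
G = nx.random_regular_graph(3, 20, seed=0)
state = {e: True for e in G.edges()}  # the "all-in" extreme configuration
for _ in range(500):
    rc_glauber_step(G, state, p=0.6, q=2.0, rng=rng)
```

The connectivity check is what makes the model non-local: a single edge update depends on the global open-edge configuration, which is one reason the dynamics is hard to analyse.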
For coherent systems whose components and active redundancies have heterogeneous and dependent lifetimes, we prove that the lifetime of the system with redundancy at the component level is stochastically larger than that with redundancy at the system level. In particular, in the setting of homogeneous components and redundancy lifetimes linked by an Archimedean survival copula, we develop sufficient conditions for the reversed hazard rate order, the hazard rate order, and the likelihood ratio order between the two system lifetimes, respectively. The present results substantially generalize some related results in the literature. Several numerical examples are presented to illustrate the findings.
Can trust norms within the African moral system support data gathering for Generative AI (GenAI) development in African society? Recent developments in large language models and GenAI, including models like ChatGPT and Midjourney, have surfaced a common issue known as “AI hallucination,” the presentation of misinformation as fact, with the attendant risk of fostering public distrust in AI performance. In the African context, this paper frames unsupportive data-gathering norms as a contributory factor to issues such as AI hallucination and investigates the following claims. First, it explores the claim that knowledge in the African context exists in both esoteric and exoteric forms; incorporating such diverse knowledge as data could imply that a GenAI tailored for Africa may have unlimited accessibility across all contexts. Second, it acknowledges the formidable challenge of amassing the substantial volume of data, including esoteric information, required to develop a GenAI model, positing that a foundational framework for data collection, rooted in culturally resonant trust norms, has the potential to engender trust dynamics between data providers and collectors. Lastly, it recommends that trust norms in the African context be recalibrated to align with contemporary social progress, while preserving their core values, so as to accommodate innovative data-gathering methodologies for a GenAI tailored to the African setting. This paper contributes to understanding how trust culture within the African context, particularly in the domain of GenAI for African society, can propel the development of Afro-AI technologies.
Nontyphoidal Salmonella enterica infections are a leading cause of enteric disease in Canada, most commonly associated with foodborne exposures. Raw frozen breaded chicken products (FBCP) were implicated in 16 Salmonella outbreaks between 2017 and 2019. This study quantified the impact of the 1 April 2019 requirement by the Canadian Food Inspection Agency (CFIA) for manufacturers to reduce Salmonella in raw FBCP. A pre–post intervention design with a comparison group was used to: (1) estimate the reduction in FBCP Salmonella prevalence using retail meat FoodNet Canada data; (2) estimate the reduction in the human salmonellosis incidence rate using data from the Canadian National Enteric Surveillance Program; and (3) estimate the proportion of reported cases attributable to FBCP if human exposure to Salmonella through FBCP were completely eliminated. The FBCP Salmonella prevalence decreased from 28% before 1 April 2019 to 2.9% after the requirement was implemented. The CFIA requirement was estimated to have reduced the human salmonellosis incidence rate by 23%. An estimated 26% of cases during the pre-intervention period can be attributed to FBCP. The CFIA requirement was successful at significantly reducing Salmonella prevalence in retail FBCP and at reducing the salmonellosis burden.
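The pre–post-with-comparison-group logic can be illustrated with a difference-in-differences calculation on purely invented rates (none of these numbers come from the study):

```python
# Hypothetical incidence rates (cases per 100,000) for illustration only.
pre_intervention_rate = 20.0    # exposed group, before the requirement
post_intervention_rate = 14.0   # exposed group, after the requirement
pre_comparison_rate = 10.0      # comparison group, before
post_comparison_rate = 9.5      # comparison group, after

# Change attributable to the intervention, net of the background trend
# captured by the comparison group.
did = ((post_intervention_rate - pre_intervention_rate)
       - (post_comparison_rate - pre_comparison_rate))
relative_reduction = -did / pre_intervention_rate  # fraction of the pre rate
```

Here the comparison group absorbs any secular trend, so the net effect is a 5.5-per-100,000 drop, or a 27.5% relative reduction under these made-up numbers.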
Several African countries are developing artificial intelligence (AI) strategies and ethics frameworks with the goal of accelerating responsible AI development and adoption. However, many of these governance actions are emerging without consideration for their suitability to local contexts, including whether the proposed policies are feasible to implement and what their impact may be on regulatory outcomes. In response, we suggest that there is a need for more explicit policy learning, by looking at existing governance capabilities and experiences related to algorithms, automation, data, and digital technology in other countries and in adjacent sectors. From such learning, it will be possible to identify where existing capabilities may be adapted or strengthened to address current AI-related opportunities and risks. This paper explores the potential for learning by analysing existing policy and legislation in twelve African countries across three main areas: strategy and multi-stakeholder engagement, human dignity and autonomy, and sector-specific governance. The findings point to a variety of existing capabilities that could be relevant to responsible AI; from existing model management procedures used in banking and air quality assessment to efforts aimed at enhancing public sector skills and transparency around public–private partnerships, and the way in which existing electronic transactions legislation addresses accountability and human oversight. All of these point to the benefit of wider engagement on how existing governance mechanisms are working, and on where AI-specific adjustments or new instruments may be needed.
To enhance the capacity for early and effective management of genital tract infections at the primary and secondary levels of the healthcare system, we developed an internally validated prediction model to help predict individual risk of self-reported genital tract infections (sGTIs) at the community level in Ghana. The study involved 32,973 men and women aged 15–49 years from three rounds of the Ghana Demographic and Health Survey, from 2003 to 2014. The outcomes were sGTIs. We applied least absolute shrinkage and selection operator (LASSO) penalized regression with 10-fold cross-validation to 11 predictors chosen from a prior review of the literature. A bootstrapping technique was also employed as a sensitivity analysis to produce a robust model. We further employed discrimination and calibration analyses to evaluate the performance of the model. Statistical significance was set at P-value <0.05. The mean±standard deviation age was 29.1±9.7 years, with a female preponderance (60.7%). The prevalence of sGTIs within the period was 11.2% (95% CI = 4.5–17.8), ranging from 5.4% (95% CI = 4.8–5.86) in 2003 to 17.5% (95% CI = 16.4–18.7) in 2014. The LASSO regression model retained all 11 predictors. The model’s ability to discriminate between those with and without sGTIs was approximately 73.50% (95% CI = 72.50–74.26) from the area under the curve with the bootstrapping technique. There was no evidence of miscalibration from the calibration belt plot with bootstrapping (test statistic = 17.30; P-value = 0.060). The model performance was judged to be good and acceptable. In the absence of clinical measurement, this prediction tool can be used to identify individuals aged 15–49 years at high risk of sGTIs at the community level in Ghana. Frontline healthcare staff can use this tool for screening and early detection. We therefore propose external validation of the model to confirm its generalizability and reliability in different populations.
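A minimal sketch of the modeling step, LASSO-penalized logistic regression with 10-fold cross-validation, on synthetic data (assuming scikit-learn; the data and settings are invented stand-ins, not the survey data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the survey data: 11 predictors, binary outcome.
n, k = 2000, 11
X = rng.normal(size=(n, k))
logits = X @ rng.normal(scale=0.5, size=k) - 2.0
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logits))

# L1-penalised (LASSO) logistic regression; the penalty strength is chosen
# by 10-fold cross-validation, scored by AUC as in the abstract.
model = LogisticRegressionCV(Cs=10, cv=10, penalty="l1", solver="liblinear",
                             scoring="roc_auc").fit(X, y)
auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
```

The L1 penalty can zero out uninformative predictors; in the study all 11 predictors survived the shrinkage, which is what "the LASSO regression model retained all 11 predictors" reports.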
Residual blood specimens collected at health facilities may be a source of samples for serosurveys of adults, a population often neglected in community-based serosurveys. Anonymized residual blood specimens were collected from individuals 15–49 years of age attending two sub-district hospitals in Palghar District, Maharashtra, from November 2018 to March 2019. Specimens also were collected from women 15–49 years of age enrolled in a cross-sectional, community-based serosurvey representative at the district level that was conducted 2–7 months after the residual specimen collection. Specimens were tested for IgG antibodies to measles and rubella viruses. Measles and rubella seroprevalence estimates using facility-based specimens were 99% and 92%, respectively, with men having significantly lower rubella seropositivity than women. Age-specific measles and rubella seroprevalence estimates were similar between the two specimen sources. Although measles seropositivity was slightly higher among adults attending the facilities, both facility and community measles seroprevalence estimates were 95% or higher. The similarity in measles and rubella seroprevalence estimates between the community-based and facility serosurveys highlights the potential value of residual specimens to approximate community seroprevalence.
Our study aimed to develop and validate a nomogram to assess talaromycosis risk in hospitalized HIV-positive patients. Prediction models were built using data from a multicentre retrospective cohort study in China. Applying the inclusion and exclusion criteria, we collected data from 1,564 hospitalized HIV-positive patients in four hospitals from 2010 to 2019. Inpatients were randomly assigned to the training or validation group at a 7:3 ratio. To identify potential risk factors for talaromycosis in HIV-infected patients, univariate and multivariate logistic regression analyses were conducted. Through multivariate logistic regression, we identified ten variables as independent risk factors for talaromycosis in HIV-infected individuals. A nomogram was developed from the findings of the multivariate logistic regression analysis. For user convenience, a web-based nomogram calculator was also created. The nomogram demonstrated excellent discrimination in both the training and validation groups [area under the ROC curve (AUC) = 0.883 vs. 0.889] and good calibration. The results of the clinical impact curve (CIC) analysis and decision curve analysis (DCA) confirmed the clinical utility of the model. Clinicians will benefit from this simple, practical, and quantitative strategy for predicting talaromycosis risk in HIV-infected patients and can implement appropriate interventions accordingly.
Serogroup epidemiology of invasive meningococcal disease (IMD) is constantly evolving, varying by time and location. Surveillance reports have indicated a rise in meningococcal serogroup Y (MenY) disease in some regions in recent years. This systematic literature review explores the evolving epidemiology of MenY IMD globally, based on a review of articles and national surveillance reports published between 1 January 2010 and 25 March 2021. Generally, MenY incidence was low (<0.2/100,000) across all ages in most countries. Reported incidence was higher among infants, adolescents, and those aged ≥65 years. In some locations and time periods, MenY accounted for more than 10% of all IMD cases. Vaccination programs were implemented over the same period in which the proportion of MenY IMD rose. Cases decreased in countries with quadrivalent vaccine programs (e.g., the United Kingdom, the Netherlands, the United States, and Australia), whereas the MenY burden increased and made up a large proportion of cases in areas without vaccine programs. Continuous monitoring of epidemiologic changes in IMD is essential to establish the MenY burden and to implement prevention strategies.
This cohort study evaluated the associations of the C-reactive protein–neutrophil-to-lymphocyte ratio (C-NLR) and the lymphocyte-to-CRP ratio (LCR) with refractory Mycoplasma pneumoniae pneumonia (RMPP), and the predictive values of C-NLR and LCR for RMPP and prolonged fever, in 389 children with M. pneumoniae pneumonia (MPP). The associations of NLR, C-NLR, and LCR with RMPP and prolonged fever were evaluated by logistic regression analysis. C-NLR was associated with an increased risk of RMPP in children [odds ratio (OR) = 3.459, 95% confidence interval (CI): 1.598–7.491]. A higher risk of RMPP was identified in the C-NLR > 29.9 group (OR = 2.885, 95% CI: 1.599–5.203). LCR > 1584.2 was associated with a decreased risk of RMPP (OR = 0.500, 95% CI: 0.282–0.887). An increased risk of prolonged fever in children was identified with increasing C-NLR (OR = 5.913, 95% CI: 2.335–14.972) or NLR (OR = 2.413, 95% CI: 1.689–3.446). The AUCs of C-NLR, LCR, and NLR for predicting RMPP were 0.630, 0.623, and 0.608, respectively. In conclusion, C-NLR was associated with increased RMPP risk in children and had good value for predicting RMPP and prolonged fever.
Between February and April 2018, Salmonella Typhimurium within a unique 5-single nucleotide polymorphism (SNP) address was isolated from 28 cases with links to a small rural area of Northeast England, with five cases prospectively identified by whole genome sequencing (WGS). Infections had a severe clinical picture, with ten cases hospitalized (36%), two cases with invasive disease, and two deaths reported. Interviews determined that 24 cases (86%) had been exposed to a local independent butcher’s shop (Butcher A).
A case–control study using controls recruited by systematic digit dialling established that cases were 68 times more likely to have consumed cooked meat from Butcher A (adjusted OR 68.1; 95% CI: 1.9–2387.6; P = 0.02). Salmonella Typhimurium genetically highly related to the 28 outbreak cases was also isolated from a sample of cooked meat on sale in the premises.
Epidemiological and microbiological investigations suggest this outbreak was likely associated with the consumption of ready-to-eat foods supplied by the implicated butcher. A relatively large number of cases were involved despite the rurality of the food business, with cases resident across the Northeast and Yorkshire identified using WGS, demonstrating the benefit of timely sequencing information to community outbreak investigations.
This paper questions how the drive toward introducing artificial intelligence (AI) into all facets of life might endanger certain African ethical values. It argues that two primary values prized in nearly all versions of sub-Saharan African ethics available in the literature may sit in direct opposition to the fundamental motivation of corporate adoption of AI: Afro-communitarianism grounded in relationality, and human dignity grounded in a normative conception of personhood. The paper offers a unique perspective on AI ethics from the African place, as there is little to no material in the literature discussing the implications of AI for African ethical values. It is divided into two broad sections, focused on (i) describing the values at risk from AI and (ii) showing how the current use of AI undermines these values. In conclusion, I suggest how to prioritize these values in working toward the establishment of an African AI ethics framework.