As the world has become more digitally dependent, questions of data governance, such as ethics, institutional arrangements, and statistical protection measures, have increased in significance. Understanding the economic contribution of investments in data sharing and data governance is highly problematic: outputs and outcomes are often widely dispersed and hard to measure, and the value of those investments is very context-dependent. The “Five Safes” is a popular data governance framework. It is used to design and critique data management strategies across the world and has also been used as a performance framework to measure the effectiveness of data access operations. We report on a novel application of the Five Safes framework to structure the economic evaluation of data governance. The Five Safes was designed to allow structured investigation into data governance. Combining this with more traditional logic models can provide an evaluation methodology that is practical, reproducible, and comparable. We illustrate this by considering the application of the combined logic model-Five Safes framework to data governance for agronomy investments in Ethiopia. We demonstrate how the Five Safes was used to generate the necessary context for a more traditional quantitative study, and consider lessons learned for the wider evaluation of data and data governance investments.
Unstructured data are a promising new source of information that insurance companies may use to understand their risk portfolio better and improve the customer experience. However, these novel data sources are difficult to incorporate into existing ratemaking frameworks due to the size and format of the unstructured data. This paper proposes a framework to use street view imagery within a generalized linear model. To do so, we use representation learning to extract an embedding vector containing useful information from the image. This embedding is dense and low dimensional, making it appropriate to use within existing ratemaking models. We find that there is useful information included in street view imagery to predict the frequency of claims for certain types of perils. This model can be used in a ratemaking framework, but it also opens the door to future empirical research on which characteristics within the image lead to increased or decreased predicted claim frequencies. Throughout, we discuss the practical difficulties (technical and social) of using this type of data for insurance pricing.
Autochthonous hepatitis E virus (HEV) infection is increasingly reported in industrialized countries and is mostly associated with zoonotic HEV genotype 3 (HEV-3). In this study, we examined the molecular epidemiology of 63 human clinical HEV-3 isolates in Canada between 2014 and 2022. Fifty-five samples were IgM positive, 45 samples were IgG positive, and 44 were both IgM and IgG positive. The majority of the isolates belong to the subtypes 3a, 3b, and 3j, with high sequence homology to Canadian swine and pork isolates. There were a few isolates that clustered with subtypes 3c, 3e, 3f, 3h, and 3g, and an isolate from chronic infection with a rabbit strain (3ra). Previous studies have demonstrated that the isolates from pork products and swine from Canada belong to subtypes 3a and 3b. Domestic swine HEV is therefore likely responsible for the majority of clinical HEV cases in Canada, which further supports the hypothesis that swine serve as the main reservoir for HEV-3 infections. Understanding the associated risk of zoonotic HEV infection requires the establishment of sustainable surveillance strategies at the interface between humans, animals, and the environment within a One-Health framework.
There has been a lack of information on vaccine acceptance among Finnish adults. We conducted a secondary analysis of cross-sectional data collected through the Finnish Medicines Agency Medicine Barometer 2021 survey (response rate: 20.6%). We described and explained vaccine acceptance by investigating the associations between socio-demographic factors and statements using logistic regression and conducted a factor analysis. The majority of respondents (n = 2081) considered vaccines to be safe (93%), effective (97%), and important (95%). However, 20% and 14% felt they did not have enough information about vaccines and vaccine-preventable diseases (VPDs), respectively. Respondents aged 18–39 were 2.8 times more likely to disagree that they had enough information about VPDs compared to respondents aged 60–79 (p < 0.001), while respondents with poorer self-perceived health were 1.8 times more likely to declare not having enough information about vaccines (p < 0.001). We generated three factor dimensions from the eight statements. They were related to ‘Confidence and attitudes towards vaccines’, ‘Access to information on vaccines and VPDs’, and ‘Debate on vaccine issues’, which may reflect the underlying thinking patterns. Access to and understanding of information about vaccines and VPDs need to be improved for Finnish adults to increase vaccine acceptance and uptake, thus preventing the spread of VPDs.
People in Africa rely extensively on online social networks (OSNs), which has attracted the attention of cyber attackers for various nefarious actions. This global trend has not spared African online communities, where the proliferation of OSNs has provided new opportunities and challenges. In Africa, as in many other regions, a burgeoning black-market industry has emerged, specializing in the creation and sale of fake accounts to serve various purposes, both malicious and deceptive. This paper aims to build a set of machine-learning models through feature selection algorithms to predict fake accounts, increase performance, and reduce costs. The suggested approach is based on input data made up of features that describe the profiles being investigated. Our findings offer a thorough comparison of various algorithms. Furthermore, compared to machine learning without feature selection and with Boruta, machine learning employing the suggested genetic algorithm-based feature selection offers a clear runtime advantage. The final prediction model achieves AUC values between 90% and 99.6%. The findings showed that the model based on the features chosen by the genetic algorithm (GA) provides a reasonable prediction quality with a small number of input variables, less than 31% of the entire feature space, and therefore permits the accurate separation of fake from real users. Our results demonstrate exceptional predictive accuracy with a significant reduction in input variables using the genetic algorithm, reaffirming the effectiveness of our approach.
This book provides statistics instructors and students with complete classroom material for a one- or two-semester course on applied regression and causal inference. It is built around 52 stories, 52 class-participation activities, 52 hands-on computer demonstrations, 52 drills, and 52 discussion problems that allow instructors and students to explore in a fun way the real-world complexity of the subject. The book fosters an engaging “flipped classroom” environment with a focus on visualization and understanding.
The book provides instructors with frameworks for self-study or for structuring the course, along with tips for maintaining student engagement at all levels and practice exam questions to help guide learning.
Designed to accompany the authors’ previous textbook Regression and Other Stories, its modular nature and wealth of material allow this book to be adapted to different courses and texts or to be used by learners as a hands-on workbook.
The authors are experienced researchers who have published articles in hundreds of different scientific journals in fields including statistics, computer science, policy, public health, political science, economics, sociology, and engineering. They have also published articles in the Washington Post, the New York Times, Slate, and other public venues. Their previous books include Bayesian Data Analysis, Teaching Statistics: A Bag of Tricks, and Regression and Other Stories.
The effective reproduction number $ R $ was widely accepted as a key indicator during the early stages of the COVID-19 pandemic. In the UK, the $ R $ value published on the UK Government Dashboard has been generated as a combined value from an ensemble of epidemiological models via a collaborative initiative between academia and government. In this paper, we outline this collaborative modelling approach and illustrate how, by using an established combination method, a combined $ R $ estimate can be generated from an ensemble of epidemiological models. We analyse the $ R $ values calculated for the period between April 2021 and December 2021, to show that this $ R $ is robust to different model weighting methods and ensemble sizes and that using heterogeneous data sources for validation increases its robustness and reduces the biases and limitations associated with a single source of data. We discuss how $ R $ can be generated from different data sources and show that it is a good summary indicator of the current dynamics in an epidemic.
With recent epidemics such as COVID-19, H1N1 and SARS causing devastating financial loss to the economy, it is important that insurance companies plan for the financial costs of epidemics. This article proposes a new methodology for epidemic and insurance modelling by combining the existing deterministic compartmental models and the Markov multiple state models to facilitate actuarial computations to design new health insurance plans that cover epidemics. Our method is inspired by the seminal paper of Feng and Garrido ((2011) North American Actuarial Journal, 15, 112–136) and complements the work of Hillairet and Lopez ((2021) Scandinavian Actuarial Journal, 2021(8), 671–694) and Hillairet et al. ((2022) Insurance: Mathematics and Economics, 107, 88–101). In this work, we use the deterministic SIR model and the Eyam epidemic data set to provide numerical illustrations for our method.
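For readers unfamiliar with the deterministic SIR model mentioned in this abstract, the following is a minimal sketch of it, not the article's own implementation. It integrates dS/dt = -βSI, dI/dt = βSI - γI, dR/dt = γI with a fixed-step Runge-Kutta scheme. The parameter values, the function names, and the use of population fractions are illustrative assumptions; they are not taken from the article or from the Eyam data set.

```python
def sir_derivatives(state, beta, gamma):
    """Right-hand side of the SIR equations for state = (S, I, R)."""
    s, i, _ = state
    infections = beta * s * i   # flow from susceptible to infected
    recoveries = gamma * i      # flow from infected to removed
    return (-infections, infections - recoveries, recoveries)


def rk4_step(state, beta, gamma, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, scale):
        return tuple(x + scale * dx for x, dx in zip(state, k))

    k1 = sir_derivatives(state, beta, gamma)
    k2 = sir_derivatives(shifted(k1, 0.5 * dt), beta, gamma)
    k3 = sir_derivatives(shifted(k2, 0.5 * dt), beta, gamma)
    k4 = sir_derivatives(shifted(k3, dt), beta, gamma)
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(s0, i0, r0, beta, gamma, t_end, dt=0.01):
    """Return the list of (S, I, R) states from time 0 to t_end."""
    path = [(s0, i0, r0)]
    for _ in range(int(round(t_end / dt))):
        path.append(rk4_step(path[-1], beta, gamma, dt))
    return path


# Hypothetical parameters, expressed as fractions of the population.
path = simulate_sir(s0=0.99, i0=0.01, r0=0.0, beta=0.5, gamma=0.2, t_end=60.0)
final_s, final_i, final_r = path[-1]
peak_infected = max(i for _, i, _ in path)
```

The three compartments always sum to the initial population, which is a convenient sanity check on any implementation. In an insurance setting, trajectories such as `path` are the kind of input from which premium and benefit cash flows could be computed, although how that is done is specific to the article.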
Offshore wind turbines are set to take a rapidly growing share in the electric mix. The design, installation, and exploitation of these industrial assets are regulated by international standards, providing generic guidelines. New projects constantly reach unexploited wind resources, pushing back installation limits. Therefore, turbines are increasingly subject to uncertain environmental conditions, making long-term investment decisions riskier (at the design or end-of-life stage). Fortunately, numerical models of wind turbines make it possible to perform accurate multi-physics simulations of such systems when interacting with their environment. The challenge is then to propagate the input environmental uncertainties through these models and to analyze the distribution of output variables of interest. Since each call of such a numerical model can be costly, the estimation of statistical output quantities of interest (e.g., the mean value, the variance) has to be done with a restricted number of simulations. To do so, the present paper uses the kernel herding method as a sampling technique to perform Bayesian quadrature and estimate the fatigue damage. It is known from the literature that this method guarantees fast and accurate convergence together with providing relevant properties regarding subsampling and parallelization. Here, we support this claim numerically by applying the method to a real use case of an offshore wind turbine operating in Teesside, UK. Numerical comparison with crude and quasi-Monte Carlo sampling demonstrates the benefits one can expect from such a method. Finally, a new Python package has been developed and documented to provide quick open access to this uncertainty propagation method.
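The sampling idea behind kernel herding can be illustrated with a short sketch. This is a generic greedy version under stated assumptions, not the Python package announced in the abstract: it uses a one-dimensional candidate set, a Gaussian kernel with an arbitrary bandwidth, and equal weights for the final mean estimate (the Bayesian quadrature weights used in the paper are not reproduced). All function and variable names are hypothetical.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def kernel_herding(candidates, n_points, bandwidth=0.2):
    """Greedily pick n_points from candidates so that their empirical
    distribution tracks that of the whole candidate set."""
    m = len(candidates)
    # Empirical kernel mean embedding of the candidate set at each candidate.
    potential = [
        sum(gaussian_kernel(x, y, bandwidth) for y in candidates) / m
        for x in candidates
    ]
    # Running sum of kernel values against the points selected so far.
    repulsion = [0.0] * m
    selected = []
    for n in range(n_points):
        # Herding criterion: stay close to the target, away from past picks.
        scores = [potential[j] - repulsion[j] / (n + 1) for j in range(m)]
        best = max(range(m), key=scores.__getitem__)
        selected.append(candidates[best])
        for j in range(m):
            repulsion[j] += gaussian_kernel(
                candidates[j], candidates[best], bandwidth
            )
    return selected


# Candidate set standing in for samples of an uncertain input on [0, 1].
candidates = [i / 200 for i in range(201)]
design = kernel_herding(candidates, n_points=10)

# Equal-weight estimate of the mean of a toy "costly" model, here x squared.
estimate = sum(x ** 2 for x in design) / len(design)
```

Because points are added one at a time, any prefix of `design` is itself a usable smaller design, which is one reason sequential methods of this kind suit settings where each model call is expensive.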