Effective management of uncertainty can lead to better, more informed decisions. However, many decision makers and their advisers do not always face up to uncertainty, in part because there is little constructive guidance and few tools available to help. This paper outlines six Uncertainty Principles to manage uncertainty.
Face up to uncertainty
Deconstruct the problem
Don’t be fooled (un/intentional biases)
Models can be helpful, but also dangerous
Think about adaptability and resilience
Bring people with you
These were arrived at following extensive discussions and literature reviews over a 5-year period. While this is an important topic for actuaries, the intended audience is any decision maker or adviser in any sector (public or private).
A novel coronavirus disease, designated COVID-19, has become a worldwide pandemic. This study aims to estimate the incubation period and serial interval of COVID-19. We collected contact tracing data in a municipality in Hubei province during a full outbreak period. The date of infection and infector–infectee pairs were inferred from the history of travel in Wuhan or exposure to confirmed cases. The incubation periods and serial intervals were estimated using parametric accelerated failure time models, accounting for interval censoring of the exposures. Our estimated median incubation period of COVID-19 is 5.4 days (bootstrapped 95% confidence interval (CI) 4.8–6.0), and the 2.5th and 97.5th percentiles are 1 and 15 days, respectively; the estimated serial interval of COVID-19 falls within the range of −4 to 13 days with 95% confidence and has a median of 4.6 days (95% CI 3.7–5.5). Ninety-five per cent of symptomatic cases showed symptoms by 13.7 days (95% CI 12.5–14.9). The incubation periods and serial intervals did not differ significantly between males and females or among age groups. Our results suggest that a considerable proportion of secondary transmission occurred prior to symptom onset. They also indicate that the current practice of a 14-day quarantine period in many regions is reasonable.
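The interval-censored estimation described in this abstract can be illustrated with a minimal sketch. It assumes a log-normal incubation distribution (an intercept-only accelerated failure time model), a coarse grid search in place of a proper optimiser, and purely synthetic exposure windows; none of these choices is taken from the study, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def lognormal_cdf(t, mu, sigma):
    """CDF of a log-normal incubation time; zero for non-positive t."""
    if t <= 0:
        return 0.0
    return STD_NORMAL.cdf((math.log(t) - mu) / sigma)


def log_likelihood(intervals, mu, sigma):
    """Each case contributes P(lower < T <= upper), the interval-censored term."""
    total = 0.0
    for lower, upper in intervals:
        p = lognormal_cdf(upper, mu, sigma) - lognormal_cdf(lower, mu, sigma)
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total


def fit_lognormal(intervals):
    """Coarse grid search for the maximum-likelihood (mu, sigma)."""
    best_ll, best_mu, best_sigma = -math.inf, None, None
    for i in range(71):  # mu from 1.0 to 2.4 in steps of 0.02
        mu = 1.0 + 0.02 * i
        for j in range(41):  # sigma from 0.2 to 1.0 in steps of 0.02
            sigma = 0.2 + 0.02 * j
            ll = log_likelihood(intervals, mu, sigma)
            if ll > best_ll:
                best_ll, best_mu, best_sigma = ll, mu, sigma
    return best_mu, best_sigma


# Synthetic data only (assumption): the true incubation time is hidden and
# only a window of possible values is observed for each case.
rng = random.Random(0)
intervals = []
for _ in range(200):
    true_t = rng.lognormvariate(math.log(5.4), 0.5)
    lower = max(0.0, true_t - rng.uniform(0.0, 2.0))
    upper = true_t + rng.uniform(0.0, 2.0)
    intervals.append((lower, upper))

mu_hat, sigma_hat = fit_lognormal(intervals)
median_estimate = math.exp(mu_hat)
p95_estimate = math.exp(mu_hat + STD_NORMAL.inv_cdf(0.95) * sigma_hat)
print(f"median: {median_estimate:.1f} days, 95th percentile: {p95_estimate:.1f} days")
```

The key point is the likelihood term: because the infection date is known only to lie within an exposure window, each case contributes the probability mass of the interval rather than a density at a single point.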
Even though the impact of COVID-19 in metropolitan areas has been extensively studied, the geographic spread to smaller cities is also of great concern. We conducted an ecological study aimed at identifying predictors of early introduction, incidence rates of COVID-19 and mortality (up to 8 May 2020) among 604 municipalities in inner São Paulo State, Brazil. Socio-demographic indexes, road distance to the state capital and a classification of regional relevance were included in predictive models for time to COVID-19 introduction (Cox regression), incidence and mortality rates (zero-inflated negative binomial regression). In multivariable analyses, greater demographic density and higher classification of regional relevance were associated with both early introduction and increased rates of COVID-19 incidence and mortality. Other predictive factors varied, but distance from the State Capital (São Paulo City) was negatively associated with time-to-introduction and with incidence rates of COVID-19. Our results reinforce the hypothesis of two patterns of geographical spread of SARS-CoV-2 infection: one that is spatial (from the metropolitan area into the inner state) and another that is hierarchical (from urban centres of regional relevance to smaller and less connected municipalities). These findings may apply to other settings, especially in developing and highly heterogeneous countries, and point to a potential benefit from strengthening non-pharmaceutical control strategies in areas of greater risk.
Since the beginning of the COVID-19 epidemic, there has been ongoing debate and research regarding the possible routes of virus transmission. We conducted an epidemiological investigation which revealed a cluster of five COVID-19 cases linked to playing squash at a sports venue in Maribor, Slovenia. The data acquired raise the possibility that transmission occurred indirectly through contaminated objects in the changing room or squash hall, or via aerosolisation in the squash hall.
Growing in a saline environment causes changes in important physiological processes that are directly related to plant growth and development. In this study we evaluated the effect of salinity on transpiration of sorghum plants in semi-arid conditions and found that the highest rates of transpiration were observed in the hottest hours of the day, between 10 a.m. and 3 p.m., with plants subjected to the saline environment having their transpiration reduced by up to 70% when compared to the non-saline environment. This behavior can be reflected in reductions in plant growth and development due to reduced water absorption by the roots, consequently causing an imbalance of nutrients in the plant due to low absorption rate and competition between nutrients and salts in the preferred routes of absorption in the roots.
In this discussion paper, we outline the motivations and the main principles of the Trusted Smart Statistics (TSS) concept that is under development in the European Statistical System. TSS represents the evolution of official statistics in response to the challenges posed by the new datafied society. Taking stock of the availability of new digital data sources, new technologies, and new behaviors, statistical offices are now called on to rethink the way they operate in order to reassert their role in modern democratic society. The issue at stake is considerably broader and deeper than merely adapting existing processes to embrace so-called Big Data. In several respects, such evolution entails a fundamental paradigm shift with respect to the legacy model of official statistics production based on traditional data sources, for example, in the relation between data and computation, between data collection and analysis, between methodological development and statistical production, and of course in the roles of the various stakeholders and their mutual relationships. Such complex evolution must be guided by a comprehensive system-level view based on clearly spelled-out design principles. In this paper, we aim to provide a general account of the TSS concept reflecting the current state of the discussion within the European Statistical System.
This paper considers specification testing for regression models with errors-in-variables and proposes a test statistic comparing the distance between the parametric and nonparametric fits based on deconvolution techniques. In contrast to the methods proposed by Hall and Ma (2007, Annals of Statistics, 35, 2620–2638) and Song (2008, Journal of Multivariate Analysis, 99, 2406–2443), our test allows general nonlinear regression models and possesses complementary local power properties. We establish the asymptotic properties of our test statistic for the ordinary and supersmooth measurement error densities. Simulation results endorse our theoretical findings: our test has advantages in detecting high-frequency alternatives and dominates the existing tests under certain specifications.
This paper introduces a set of principles that articulate a shared vision for increasing access to data in the engineering and related sectors. The principles are intended to help guide progress toward a data ecosystem that provides sustainable access to data, in ways that will help a variety of stakeholders in maximizing its value while mitigating potential harms. In addition to being a manifesto for change, the principles can also be viewed as a means for understanding the alignment, overlaps and gaps between a range of existing research programs, policy initiatives, and related work on data governance and sharing. After providing background on the growing data economy and relevant recent policy initiatives in the United Kingdom and European Union, we then introduce the nine key principles of the manifesto. For each principle, we provide some additional rationale and links to related work. We invite feedback on the manifesto and endorsements from a range of stakeholders.
The COVID-19 pandemic is exerting major pressures on society, health and social care services and science. Understanding the progression and current impact of the pandemic is fundamental to planning, management and mitigation of future impact on the population. Surveillance is the core function of any public health system, and a multi-component surveillance system for COVID-19 is essential to understand the burden across the different strata of any health system and the population. Many countries and public health bodies utilise ‘syndromic surveillance’ (using real-time, often non-specific symptom/preliminary diagnosis information collected during routine healthcare provision) to supplement public health surveillance programmes. The current COVID-19 pandemic has revealed a series of unprecedented challenges to syndromic surveillance including: the impact of media reporting during early stages of the pandemic; changes in healthcare-seeking behaviour resulting from government guidance on social distancing and accessing healthcare services; and changes in clinical coding and patient management systems. These have impacted on the presentation of syndromic outputs, with changes in denominators creating challenges for the interpretation of surveillance data. Monitoring changes in healthcare utilisation is key to interpreting COVID-19 surveillance data, which can then be used to better understand the impact of the pandemic on the population. Syndromic surveillance systems have had to adapt to encompass these changes, whilst also innovating by taking opportunities to work with data providers to establish new data feeds and develop new COVID-19 indicators. These developments are supporting the current public health response to COVID-19, and will also be instrumental in the continued and future fight against the disease.
Conveyor belt wear is an important consideration in the bulk materials handling industry. We define four belt wear rate metrics and develop a model to predict wear rates of new conveyor configurations using an industry dataset that includes ultrasonic thickness measurements, conveyor attributes, and conveyor throughput. All variables are expected to contribute in some way to explaining wear rate and are included in modeling. One specific metric, the maximum throughput-based wear rate, is selected as the prediction target, and cross-validation is used to evaluate the out-of-sample performance of random forest and linear regression algorithms. The random forest approach achieves a lower error of 0.152 mm/megatons (standard deviation [SD] = 0.0648). Permutation importance and partial dependence plots are computed to provide insights into the relationship between conveyor parameters and wear rate. This work demonstrates how belt wear rate can be quantified from imprecise thickness testing methods and provides a transparent modeling framework applicable to other supervised learning problems in risk and reliability.
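As a rough illustration of how a throughput-based wear rate might be derived from repeated thickness measurements, the sketch below fits a least-squares slope of thickness against cumulative throughput at each measurement position and takes the largest rate. The abstract does not define the metric, so this reading, the function names and the numbers are all assumptions.

```python
def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def max_throughput_wear_rate(throughput_mt, thickness_by_position):
    """Largest wear rate (mm per megaton) over all measurement positions.

    Thickness falls as throughput accumulates, so the wear rate is the
    negated slope of thickness against cumulative throughput.
    """
    rates = [
        -least_squares_slope(throughput_mt, series)
        for series in thickness_by_position.values()
    ]
    return max(rates)


# Illustrative numbers only (assumption), not from the industry dataset.
throughput_mt = [0.0, 10.0, 20.0, 30.0]      # cumulative megatons conveyed
thickness_mm = {
    "centre": [20.0, 18.0, 16.0, 14.0],      # wears at 0.2 mm/megaton
    "edge": [20.0, 19.0, 18.0, 17.0],        # wears at 0.1 mm/megaton
}
print(max_throughput_wear_rate(throughput_mt, thickness_mm))
```

Fitting a slope through several measurements, rather than differencing two readings, is one way to dampen the effect of imprecise individual thickness tests.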
Foot and mouth disease (FMD) is a highly contagious viral disease that affects domestic and wild artiodactyl animals and causes considerable economic losses related to outbreak management, production losses and trade impacts. In Tunisia, the last FMD outbreak took place in 2018–2019. The effectiveness of control measures implemented to control FMD depends, in particular, on the human resources used to implement them. Tunisia has the ultimate objective of obtaining OIE status as ‘FMD-free with vaccination’. The aim of this study was to determine and compare the necessary and available human resources to control FMD outbreaks in Tunisia using emergency vaccination and to assess the gaps that would play a role in the implementation of the strategy. We developed a resources-requirement grid of necessary human resources for the management of the emergency vaccination campaign launched after the identification of an FMD-infected premises in Tunisia. Field surveys, conducted in the 24 governorates of Tunisia, allowed us to quantify the available human resources for several categories of skills considered in the resources-requirement grid. For each governorate, we then compared available and necessary human resources to implement vaccination according to eight scenarios mixing generalised or cattle-targeted vaccination and different levels of human resources. The resources-requirement grid included 11 tasks in three groups: management of FMD-infected premises, organisational tasks and vaccination implementation. The available human resources for vaccination-related tasks included veterinarians and technicians from the public sector and appointed private veterinarians. The comparison of available and necessary human resources showed vaccination-related tasks to be the most time-consuming in terms of managing an FMD outbreak. Increasing the available human resources using appointed private veterinarians allowed the emergency vaccination of animals in the governorate to be performed in due time, especially if vaccination was targeted at cattle. The overall approach was validated by comparing the predicted and observed durations of a vaccination campaign conducted under the same conditions as during the 2014 Tunisian outbreak. This study could provide support to the Tunisian Veterinary Services or to other countries to optimise the management of an FMD outbreak.
Another large outbreak of mumps occurred in Lothian from October 2017, coinciding with the start of the higher education term. During this period 324 cases were notified, most of whom were aged 18–22 years. Although previous outbreaks had a focus in student populations, 43% of cases in this outbreak reported that they were not students. There have been increases in private student housing where students from all universities live, which may have contributed to the wide spread of the outbreak and complicated outbreak control. Information on vaccination status was available for 244 cases (75%), of whom the majority (75.8%) reported having had two MMR doses. To investigate potential waning of vaccine immunity, the mean length of time since the last mumps-containing vaccine was calculated as 14.3 years. The outbreak was declared over in May 2018 after case numbers returned to background levels. This outbreak highlighted that mumps outbreaks occur cyclically, coinciding with new cohorts of susceptible students entering the Lothian population. The lessons from this outbreak are to encourage students to have two MMR doses and to be prepared for mumps outbreaks in the near future. In future outbreaks the utility of a third MMR dose for outbreak control could be examined.
We apply deep kernel learning (DKL), which can be viewed as a combination of a Gaussian process (GP) and a deep neural network (DNN), to compression ignition engine emissions and compare its performance to a selection of other surrogate models on the same dataset. Surrogate models are a class of computationally cheaper alternatives to physics-based models. High-dimensional model representation (HDMR) is also briefly discussed and acts as a benchmark model for comparison. We apply the considered methods to a dataset, which was obtained from a compression ignition engine and includes as outputs soot and NOx emissions as functions of 14 engine operating condition variables. We combine a quasi-random global search with a conventional grid-optimization method in order to identify suitable values for several DKL hyperparameters, which include network architecture, kernel, and learning parameters. The performance of DKL, HDMR, plain GPs, and plain DNNs is compared in terms of the root mean squared error (RMSE) of the predictions as well as computational expense of training and evaluation. It is shown that DKL performs best in terms of RMSE in the predictions whilst maintaining the computational cost at a reasonable level, and DKL predictions are in good agreement with the experimental emissions data.
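The comparison criterion named in this abstract, the root mean squared error of predictions, can be sketched as follows. The model names mirror the abstract, but the prediction values are invented for illustration and do not reflect the reported results.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    if len(observed) != len(predicted):
        raise ValueError("inputs must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical held-out emissions and surrogate predictions (assumption).
observed = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "surrogate_a": [1.1, 1.9, 3.2, 3.8],
    "surrogate_b": [1.5, 2.5, 2.5, 4.5],
}
scores = {name: rmse(observed, preds) for name, preds in predictions.items()}
best_model = min(scores, key=scores.get)
print(scores, best_model)
```

Because RMSE squares each error before averaging, a surrogate that is occasionally far off is penalised more heavily than one with uniformly small errors.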
Data-Centric Engineering is an emerging branch of science that will certainly take on a leading role in data-driven research. We live in the Big Data era, with huge amounts of available data and unprecedented computing power, and therefore a skilful combination of Statistics (or, in more modern terms, Data Science), Computer Science and Engineering is required to filter out the most important information, master the ever more difficult challenges of a changing world and open new paths. In this paper, we will highlight some of these aspects from the combined perspective of a statistician, an engineer and a software developer. In particular, we will focus on sound data handling and analysis, computational science in Structural Engineering, data care, security and monitoring, and conclude with an outlook on future developments.
In this paper, we explore the use of an extensive list of Archimedean copulas in general and life insurance modelling. We consider not only the usual choices like the Clayton, Gumbel–Hougaard, and Frank copulas but also several others which have not drawn much attention in previous applications. First, we apply different copula functions to two general insurance data sets, co-modelling losses and allocated loss adjustment expenses, and also losses to building and contents. Second, we adopt these copulas for modelling the mortality trends of two neighbouring countries and calculate the market price of a mortality bond. Our results clearly show that the diversity of Archimedean copula structures gives much flexibility for modelling different kinds of data sets and that the copula and tail dependence assumption can have a significant impact on pricing and valuation. Moreover, we conduct a large simulation exercise to investigate further the caveats in copula selection. Finally, we examine a number of other estimation methods which have not been tested in previous insurance applications.
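To make the Archimedean construction concrete, the sketch below builds a bivariate copula from a generator and its inverse, C(u, v) = inverse(generator(u) + generator(v)), for the Clayton and Gumbel–Hougaard families, and checks tail behaviour numerically. The parameter values and function names are illustrative assumptions; the formulas are the standard textbook ones rather than anything specific to the paper.

```python
import math


def clayton_generator(t, theta):
    return (t ** -theta - 1.0) / theta


def clayton_inverse(s, theta):
    return (1.0 + theta * s) ** (-1.0 / theta)


def gumbel_generator(t, theta):
    return (-math.log(t)) ** theta


def gumbel_inverse(s, theta):
    return math.exp(-(s ** (1.0 / theta)))


def archimedean_copula(u, v, theta, generator, inverse):
    """C(u, v) = inverse(generator(u) + generator(v)) for u, v in (0, 1]."""
    return inverse(generator(u, theta) + generator(v, theta), theta)


def lower_tail_ratio(theta, generator, inverse, q=1e-6):
    """C(q, q) / q for small q, a numerical proxy for lower tail dependence."""
    return archimedean_copula(q, q, theta, generator, inverse) / q


theta = 2.0  # illustrative dependence parameter (assumption)
clayton_tail = lower_tail_ratio(theta, clayton_generator, clayton_inverse)
gumbel_tail = lower_tail_ratio(theta, gumbel_generator, gumbel_inverse)
print(clayton_tail, 2.0 ** (-1.0 / theta))  # Clayton: close to 2^(-1/theta)
print(gumbel_tail)                          # Gumbel: tends to zero
```

With the same parameter value, the Clayton copula retains substantial joint mass in the lower tail while the Gumbel–Hougaard copula does not, which is the kind of difference that can matter when joint extremes drive a price or valuation.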
In this paper, we propose a multivariate Hawkes framework for modelling and predicting the frequency of cyber attacks. The inference is based on a public data set containing features of data breaches targeting US industry. As the main output of this paper, we demonstrate the ability of Hawkes models to capture self-excitation and interactions of data breaches depending on their type and targets. In this setting, we detail prediction results providing the full joint distribution of the future occurrence times of cyber attacks. In addition, we show that a non-instantaneous excitation in the multivariate Hawkes model, which departs from the classical framework of the exponential kernel, fits our data better. In an insurance framework, this study makes it possible to determine quantiles for the number of attacks, useful for an internal model, as well as the frequency component for a data breach guarantee.
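Self-excitation can be illustrated with a minimal simulation of a univariate Hawkes process by Ogata-style thinning. Note the simplifications: this sketch uses a single event type and the classical exponential kernel, whereas the paper works with a multivariate model and reports that a non-instantaneous excitation fits better. The parameter values and function name are assumptions chosen for illustration.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, rng):
    """Simulate event times on [0, horizon] for the intensity
    lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i)).
    """
    events = []
    t = 0.0
    excitation = 0.0  # current value of the summed kernel terms
    while True:
        # Intensity only decays until the next event, so its current
        # value is a valid upper bound for the thinning step.
        upper_bound = mu + excitation
        wait = rng.expovariate(upper_bound)
        t += wait
        if t > horizon:
            break
        excitation *= math.exp(-beta * wait)  # decay over the waiting time
        if rng.random() * upper_bound <= mu + excitation:
            events.append(t)      # candidate accepted as an event
            excitation += alpha   # each event raises future intensity
    return events


# Illustrative parameters (assumption); alpha / beta < 1 keeps the process stable.
rng = random.Random(1)
events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.6, horizon=2000.0, rng=rng)
print(len(events), "events; baseline alone would give about", 0.5 * 2000.0)
```

With these values each event triggers on average alpha / beta = 0.5 further events, so the long-run event rate is roughly mu / (1 - 0.5), about double the baseline. Repeating the simulation many times gives an empirical distribution of attack counts from which quantiles could be read.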
Non-invasive prenatal testing (NIPT) using cell-free foetal DNA has been widely accepted in recent years for detecting common foetal chromosome aneuploidies, such as trisomies 13, 18 and 21, and sex chromosome aneuploidies. In this study, the practical clinical performance of our foetal DNA testing was evaluated for analysing all chromosome aberrations among 7113 pregnancies in Italy.
Methods
This study was a retrospective analysis of collected NIPT data from the Ion S5 next-generation sequencing platform obtained from Altamedica Medical Centre in Rome, Italy.
Results
In this study, NIPT showed 100% sensitivity and 99.9% specificity for trisomies 13, 18 and 21. Out of the 7113 samples analysed, 74 cases (1%) were positive by NIPT testing; foetal karyotyping and follow-up results validated 2 trisomy 13 cases, 5 trisomy 18 cases, 58 trisomy 21 cases and 10 sex chromosome aneuploidy cases. There were no false-negative results.
Conclusion
In our hands, NIPT had high sensitivity and specificity for common chromosomal aneuploidies such as trisomies 13, 18 and 21.
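For reference, sensitivity and specificity follow directly from a confusion table of screening results against confirmed outcomes. The sketch below uses invented counts, not the study's data, since the abstract does not report the full table; the function names are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Share of affected pregnancies that the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of unaffected pregnancies that the test reports as negative."""
    return true_neg / (true_neg + false_pos)


# Illustrative counts only (assumption), chosen to show the arithmetic.
true_pos, false_neg = 50, 0
true_neg, false_pos = 4945, 5

sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

A sensitivity of 100% corresponds to zero false negatives, while specificity falls below 100% as soon as any unaffected pregnancy screens positive.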
We propose a new nonparametric test of stochastic monotonicity which adapts to the unknown smoothness of the conditional distribution of interest, possesses desirable asymptotic properties, is conceptually easy to implement, and computationally attractive. In particular, we show that the test asymptotically controls size at a polynomial rate, is nonconservative, and detects certain smooth local alternatives that converge to the null with the fastest possible rate. Our test is based on a data-driven bandwidth value and the critical value for the test takes this randomness into account. Monte Carlo simulations indicate that the test performs well in finite samples. In particular, the simulations show that the test controls size and, under some alternatives, is significantly more powerful than existing procedures.