We show that for any $\varepsilon \gt 0$ and $\Delta \in \mathbb{N}$, there exists $\alpha \gt 0$ such that for sufficiently large $n$, every $n$-vertex graph $G$ satisfying $\delta (G)\geq \varepsilon n$ and $e(X, Y)\gt 0$ for every pair of disjoint vertex sets $X, Y\subseteq V(G)$ of size $\alpha n$ contains all spanning trees with maximum degree at most $\Delta$. This strengthens a result of Böttcher, Han, Kohayakawa, Montgomery, Parczyk, and Person.
It is proven that a conjecture of Tao (2010) holds true for log-concave random variables on the integers: for every $n \geq 1$, if $X_1,\ldots,X_n$ are i.i.d. integer-valued, log-concave random variables, then
$$H(X_1+\cdots+X_n) \geq H(X_1) + \tfrac{1}{2}\log n - o(1)$$
as $H(X_1) \to \infty$, where $H(X_1)$ denotes the (discrete) Shannon entropy. The problem is reduced to the continuous setting by showing that if $U_1,\ldots,U_n$ are independent continuous uniforms on $(0,1)$, then
$$h(X_1+\cdots+X_n+U_1+\cdots+U_n) = H(X_1+\cdots+X_n) + o(1)$$
as $H(X_1) \to \infty$, where $h$ denotes the differential entropy.
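As a numerical illustration of the discrete statement, the sketch below computes $H(S_n) - H(X_1)$ for i.i.d. sums by repeated convolution and compares the gap with $\tfrac{1}{2}\log n$. The truncated geometric distribution used here is log-concave on the integers; the truncation point and parameter are illustrative choices, not taken from the paper.

```python
import math

def convolve(p, q):
    # Discrete convolution of two pmfs given as lists indexed from 0.
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def entropy(p):
    # Shannon entropy in nats, ignoring zero-probability terms.
    return -sum(x * math.log(x) for x in p if x > 0)

# Geometric pmf truncated at K; geometric laws are log-concave on the integers.
K, s = 200, 0.3
p = [(1 - s) * s ** k for k in range(K)]
total = sum(p)
p = [x / total for x in p]

H1 = entropy(p)
pn = p
for n in range(2, 6):
    pn = convolve(pn, p)          # pmf of S_n = X_1 + ... + X_n
    gap = entropy(pn) - H1
    print(f"n={n}: H(S_n) - H(X_1) = {gap:.3f},  (1/2) log n = {0.5 * math.log(n):.3f}")
```

Since $H(X_1)$ is small here, the gap need not exceed $\tfrac{1}{2}\log n$; the theorem's bound holds in the limit $H(X_1)\to\infty$.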
A local COVID-19 outbreak with two community clusters occurred in Shaoxing, a large industrial city in China, in December 2021, after serial interventions had been imposed. We aimed to understand why by analysing the characteristics of the outbreak and evaluating the effects of phase-adjusted interventions. Publicly available data from 7 December 2021 to 25 January 2022 were collected to analyse the epidemiological characteristics of this outbreak. The incubation period was estimated using the Hamiltonian Monte Carlo method. A well-fitted extended susceptible-exposed-infectious-recovered (SEIR) model was used to simulate the impact of different interventions under various combinations of scenarios. There were 387 SARS-CoV-2-infected cases identified, and 8.3% of them were initially diagnosed as asymptomatic. The estimated incubation period was 5.4 (95% CI 5.2–5.7) days for all patients. Strengthened measures of comprehensive, tracing-based quarantine led to fewer infections and a shorter epidemic. For the same incubation period, comprehensive quarantine was more effective in containing transmission than the other interventions. Our findings reveal the important role of tracing and comprehensive quarantine in blocking community spread when a cluster occurs. Regions with scarce resources can adopt home quarantine as a relatively affordable and low-impact intervention compared with centralized quarantine.
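A minimal sketch of the kind of compartmental model the study extends: a basic SEIR system advanced by forward Euler steps. All parameters below are illustrative placeholders, not the fitted values from the paper; only the incubation period echoes the study's ~5.4-day estimate via $1/\sigma$.

```python
# Minimal SEIR sketch (forward Euler). Parameters are illustrative, not the
# study's fitted values; an "extended" SEIR would add quarantine/tracing
# compartments on top of this skeleton.
def seir(beta, sigma, gamma, N, I0=1, days=365, dt=0.1):
    S, E, I, R = N - I0, 0.0, float(I0), 0.0
    peak_I = I
    for _ in range(int(days / dt)):
        new_exposed    = beta * S * I / N * dt   # S -> E
        new_infectious = sigma * E * dt          # E -> I
        new_recovered  = gamma * I * dt          # I -> R
        S -= new_exposed
        E += new_exposed - new_infectious
        I += new_infectious - new_recovered
        R += new_recovered
        peak_I = max(peak_I, I)
    return S, E, I, R, peak_I

# 1/sigma = mean incubation period (~5.4 days), 1/gamma = infectious period.
final = seir(beta=0.4, sigma=1 / 5.4, gamma=1 / 7, N=500_000)
print(f"susceptible remaining: {final[0]:.0f}, epidemic peak: {final[4]:.0f}")
```

Interventions can then be represented by reducing `beta` (contact reduction) or moving exposed/infectious individuals out of circulation (quarantine), which is how scenario comparisons of the kind described above are typically set up.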
Under the European Union’s Solvency II regulations, insurance firms are required to use a one-year VaR (Value at Risk) approach. This involves a one-year projection of the balance sheet and requires sufficient capital to be solvent in 99.5% of outcomes. The Solvency II Internal Model risk calibrations require annual changes in market indices/term structure/transitions for the estimation of the risk distribution for each of the Internal Model risk drivers.
Transition and default risk are typically modelled using transition matrices. Modelling this risk requires a model of transition matrices and of how they can change from year to year. In this paper, four such models have been investigated and compared to the raw data to which they are calibrated. The models investigated are:
A bootstrapping approach – sampling from an historical data set with replacement.
The Vašíček model – calibrated using the Belkin approach.
The K-means model – a new non-parametric model produced using the K-means clustering algorithm.
A two-factor model – a new parametric model, using two factors (instead of the single factor of the Vašíček model) to represent each matrix.
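The first model in the list, the bootstrapping approach, can be sketched in a few lines: resample annual transition matrices from the historical set, with replacement. The two-state matrices below are hypothetical stand-ins, not a real rating-agency calibration.

```python
import random

# Sketch of the bootstrapping approach: sample annual transition matrices
# from a historical data set with replacement. The matrices here are
# hypothetical two-state (e.g. investment grade / sub-investment grade)
# examples for illustration only.
history = [
    [[0.95, 0.05], [0.10, 0.90]],
    [[0.92, 0.08], [0.15, 0.85]],
    [[0.97, 0.03], [0.07, 0.93]],
]

def bootstrap_paths(history, years, n_sims, seed=0):
    rng = random.Random(seed)
    # Each simulated path is a sequence of annual matrices drawn i.i.d.
    return [[rng.choice(history) for _ in range(years)] for _ in range(n_sims)]

paths = bootstrap_paths(history, years=5, n_sims=1000)
```

A known limitation, relevant to the backtesting comparison below: every sampled matrix is one of the observed ones, so bootstrapping can never produce a year more extreme than the worst year in the historical data set.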
The models are compared in several ways:
1. A principal components analysis (PCA) approach that compares how closely the models move compared to the raw data.
2. A backtesting approach that assesses how each model’s extreme percentiles compare to regulatory backtesting requirements.
3. A commentary on the amount of expert judgement in each model.
4. A commentary on model simplicity and breadth of uses.
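The PCA comparison (method 1) can be sketched as follows: flatten each annual matrix into a vector and extract the leading principal component of the raw data, against which a model's simulated matrices can then be projected. The data below are synthetic, generated from a single hypothetical "credit-cycle" factor purely to illustrate the mechanics.

```python
import random

# PCA sketch: flatten transition matrices to vectors, then find the leading
# principal component by power iteration on the sample covariance matrix.
def mean_vec(vs):
    n = len(vs)
    return [sum(v[i] for v in vs) / n for i in range(len(vs[0]))]

def leading_pc(vs, iters=200):
    mu = mean_vec(vs)
    centred = [[x - m for x, m in zip(v, mu)] for v in vs]
    d = len(mu)
    w = [1.0, 2.0, 3.0, 4.0][:d]          # arbitrary non-degenerate start
    for _ in range(iters):
        # w <- C w, where C = (1/n) X^T X is the covariance of the centred data
        cw = [0.0] * d
        for row in centred:
            dot = sum(r * wi for r, wi in zip(row, w))
            for i in range(d):
                cw[i] += dot * row[i]
        norm = sum(x * x for x in cw) ** 0.5
        w = [x / norm for x in cw]
    return w

# Synthetic flattened 2x2 matrices driven by one latent factor z plus noise.
rng = random.Random(1)
raw = []
for _ in range(200):
    z = rng.gauss(0, 1)
    e = [rng.gauss(0, 0.002) for _ in range(4)]
    raw.append([0.90 + 0.05 * z + e[0], 0.10 - 0.05 * z + e[1],
                0.15 + 0.03 * z + e[2], 0.85 - 0.03 * z + e[3]])
pc = leading_pc(raw)
print("leading PC direction:", [round(x, 3) for x in pc])
```

With a single dominant factor, the leading component recovers the factor loadings (up to sign), which is exactly the kind of structure the comparison checks each model reproduces.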
People who inject drugs are at risk of acute bacterial and fungal injecting-related infections. There is evidence that the incidence of hospitalizations for injecting-related infections is increasing in several countries, but little is known at the individual level. We aimed to examine injecting-related infections in a linked longitudinal cohort of people who inject drugs in Melbourne, Australia. A retrospective descriptive analysis was conducted to estimate the prevalence and incidence of injecting-related infections using administrative emergency department and hospital separation datasets linked to the SuperMIX cohort, from 2008 to 2018. Over the study period, 33% (95%CI: 31–36%) of participants presented to an emergency department with an injecting-related infection and 27% (95%CI: 25–30%) were admitted to hospital. Of 1,044 emergency department presentations and 740 hospital separations, skin and soft tissue infections were the most common (88% and 76%, respectively). From 2008 to 2018, there was a substantial increase in emergency department presentations and hospital separations with any injecting-related infection, from 48 to 135 per 1,000 person-years and from 18 to 102 per 1,000 person-years, respectively. The results emphasize that injecting-related infections are increasing, and that new models of care are needed to prevent, and to facilitate early detection of, superficial infections before they progress to potentially life-threatening severe infections.
We examined the association between face masks and risk of infection with SARS-CoV-2 using cross-sectional data from 3,209 participants in a randomized trial exploring the effectiveness of glasses in reducing the risk of SARS-CoV-2 infection. Face mask use was based on participants’ response to the end-of-follow-up survey. We found that the incidence of self-reported COVID-19 was 33% (aRR 1.33; 95% CI 1.03–1.72) higher in those wearing face masks often or sometimes, and 40% (aRR 1.40; 95% CI 1.08–1.82) higher in those wearing face masks almost always or always, compared to participants who reported wearing face masks never or almost never. We believe the observed increase in the incidence of infection associated with wearing a face mask is likely due to unobservable and hence nonadjustable differences between those wearing and not wearing a mask. Observational studies reporting on the relationship between face mask use and risk of respiratory infections should be interpreted cautiously, and more randomized trials are needed.
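For readers unfamiliar with the risk-ratio arithmetic in the abstract, the sketch below computes an unadjusted relative risk with a Wald confidence interval on the log scale. The 2x2 counts are hypothetical; the abstract's aRRs are covariate-adjusted, which a raw table cannot reproduce.

```python
import math

# Unadjusted relative risk from a 2x2 table, with a 95% Wald CI on log(RR).
# Counts are hypothetical illustrations, not the trial's data.
def relative_risk(a, n1, b, n0, z=1.96):
    # a/n1: infected / total in the exposed group (e.g. mask wearers);
    # b/n0: infected / total in the reference group.
    rr = (a / n1) / (b / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)  # SE of log(RR)
    lo, hi = rr * math.exp(-z * se), rr * math.exp(z * se)
    return rr, lo, hi

rr, lo, hi = relative_risk(120, 1000, 90, 1000)
print(f"RR = {rr:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An interval whose lower bound sits just above 1, as in the abstract's estimates, is exactly the borderline situation where unmeasured confounding (here, between mask wearers and non-wearers) most plausibly explains the association.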
This paper establishes bounds on the performance of empirical risk minimization for large-dimensional linear regression. We generalize existing results by allowing the data to be dependent and heavy-tailed. The analysis covers both the cases of identically and heterogeneously distributed observations. Our analysis is nonparametric in the sense that the relationship between the regressand and the regressors is not specified. The main results of this paper show that the empirical risk minimizer achieves the optimal performance (up to a logarithmic factor) in a dependent data setting.
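As a concrete instance of the object the paper studies: with squared loss, the empirical risk minimizer for linear regression is ordinary least squares. The sketch below fits one regressor on synthetic data with heavy-tailed noise (an illustrative nod to the paper's setting, not its construction) and verifies the minimizing property.

```python
import random

# Empirical risk minimisation for linear regression with squared loss:
# minimising (1/n) * sum (y_i - a - b*x_i)^2 gives ordinary least squares.
def erm_ols(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def empirical_risk(a, b, xs, ys):
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(500)]
# Cubing Gaussian noise gives heavy-ish tails (illustrative only).
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) ** 3 for x in xs]
a, b = erm_ols(xs, ys)
print(f"fitted intercept {a:.2f}, slope {b:.2f}")
```

The paper's contribution is not the estimator itself but performance bounds for it when the observations are dependent and heavy-tailed.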
A collection of graphs is nearly disjoint if every pair of them intersects in at most one vertex. We prove that if $G_1, \dots, G_m$ are nearly disjoint graphs of maximum degree at most $D$, then the following holds. For every fixed $C$, if each vertex $v \in \bigcup _{i=1}^m V(G_i)$ is contained in at most $C$ of the graphs $G_1, \dots, G_m$, then the (list) chromatic number of $\bigcup _{i=1}^m G_i$ is at most $D + o(D)$. This result confirms a special case of a conjecture of Vu and generalizes Kahn’s bound on the list chromatic index of linear uniform hypergraphs of bounded maximum degree. In fact, this result holds for the correspondence (or DP) chromatic number and thus implies a recent result of Molloy and Postle, and we derive this result from a more general list colouring result in the setting of ‘colour degrees’ that also implies a result of Reed and Sudakov.
A comprehensive overview of essential statistical concepts, useful statistical methods, data visualization, and modern computing tools for the climate sciences and related fields such as geography and environmental engineering. It is an invaluable reference for students and researchers in climatology and its connected fields who wish to learn data science, statistics, and R and Python programming. The examples and exercises in the book empower readers to work with real climate data from station observations, remote sensing, and simulated results. For example, students can use R or Python code to read and plot global warming data and global precipitation data in netCDF, csv, txt, or JSON formats, and to compute and interpret empirical orthogonal functions. The book's computer code and real-world data allow readers to take full advantage of modern computing technology and updated datasets. Online supplementary resources include R code and Python code, data files, figure files, tutorials, slides, and sample syllabi.
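A small example in the spirit of the exercises described: read a csv of annual global-mean temperature anomalies and fit a linear warming trend by least squares. The numbers below are made up for illustration; the book's exercises use real station, remote-sensing, and model data.

```python
import csv
import io

# Read a tiny, made-up csv of annual temperature anomalies and fit a trend.
data = io.StringIO(
    "year,anomaly_C\n"
    "1990,0.45\n1995,0.46\n2000,0.42\n2005,0.68\n"
    "2010,0.72\n2015,0.90\n2020,1.01\n"
)
rows = [(int(r["year"]), float(r["anomaly_C"])) for r in csv.DictReader(data)]
xs, ys = zip(*rows)

# Least-squares slope: cov(year, anomaly) / var(year).
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
print(f"warming trend ~ {slope * 10:.2f} C per decade")
```

Reading real data would only change the first step: replace the in-memory `io.StringIO` with `open("file.csv")`, or use a netCDF reader for gridded data.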
Benford's Law is a probability distribution for the leading digits of the numbers in a dataset. This book seeks to improve and systematize the use of Benford's Law in the social sciences for assessing the validity of self-reported data. The authors first introduce a new measure of conformity to the Benford distribution, created using permutation statistical methods and employing the concept of statistical agreement. In a departure from typical Benford applications, the book moves from using Benford's Law to test whether data conform to the Benford distribution to using it to draw conclusions about the validity of the data. The concept of 'Benford validity' is developed, which indicates whether a dataset is valid based on comparisons with the Benford distribution, and, in relation to this, a diagnostic procedure is devised that assesses the impact of a lack of Benford validity on data analysis.
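To make the distribution concrete: under Benford's Law the leading digit $d$ occurs with probability $\log_{10}(1 + 1/d)$, so 1 leads about 30.1% of the time and 9 only 4.6%. The sketch below uses a simple chi-square-style distance as the conformity measure; the book's permutation-based measure is more involved, and this stands in for it only as an illustration.

```python
import math
from collections import Counter

# Benford probabilities: P(leading digit = d) = log10(1 + 1/d), d = 1..9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(x):
    # Strip sign, leading zeros, and the decimal point (e.g. 0.045 -> 4).
    return int(str(abs(x)).lstrip("0.")[0])

def benford_distance(values):
    # Chi-square-style distance between observed leading-digit frequencies
    # and the Benford distribution (0 means perfect conformity).
    counts = Counter(leading_digit(v) for v in values)
    n = len(values)
    return sum((counts.get(d, 0) / n - p) ** 2 / p for d, p in BENFORD.items())

powers = [2 ** k for k in range(1, 200)]   # classic Benford-conforming sequence
uniformish = list(range(100, 300))         # clearly non-Benford
print(benford_distance(powers), benford_distance(uniformish))
```

The first dataset conforms closely and the second does not, which is the kind of contrast a Benford-validity assessment is built to detect.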
Organic data have the potential to enable innovative measurements and research designs by virtue of capturing human behavior and interactions in social, educational, and organizational processes. Yet what makes organic data valuable also raises privacy concerns for those individuals whose personal information is being collected and analyzed. This chapter discusses the potential privacy threats posed by organic datasets and the technical tools available to ameliorate such threats. Also noted is the importance for educators and research scientists to participate in interdisciplinary research that addresses the privacy challenges arising from the collection and use of organic data.
Despite promising early evidence for the validity of well-designed game-based assessments (GBAs) for employee selection, the interaction between the complexity of games and their use in international and cross-cultural contexts is unknown. To address this, this paper presents a descriptive, qualitative study examining the perspectives of both GBA vendors and organizational stakeholders on cross-cultural issues unique to GBAs concerning 1) privacy, 2) legality, and 3) applicant reactions. Overall, privacy and legality concerns appeared similar for GBAs and for other assessment methods, although certain common characteristics of GBAs amplify those concerns. Applicant reactions to GBAs appeared more positive across national borders and cultures than reactions to traditional assessments, although some international differences were reported. Other cross-cultural topics raised included international differences in the conflation of GBAs with artificial intelligence, in the importance of mobile-first design, and in the ability of GBAs to provide a more language-agnostic experience than other assessment types.
Social media is an ever-increasing aspect of internet presence and daily life. Despite certain challenges in defining the construct, researchers have recognized that social media can allow for the measurement and assessment of a wide variety of variables. Across the ever-growing number of social media sites and apps spanning countries and languages, there is an abundance of formats that researchers can utilize, such as photo, text, location, and video. In this book chapter, we conducted a literature search and identified the four constructs most frequently studied using social media (i.e., personality, emotion/affect/mood, life satisfaction, and political views). We then summarized a list of studies that use social media to investigate these four constructs. Additionally, social media offers unique opportunities for researchers to collect cross-cultural data, which can present its own challenges. We also provide examples of the potential opportunities and challenges, as well as ethical and technical considerations for researchers to keep in mind.
Technology, that is, the output of human innovation, has always been central to human progress worldwide. Early on, the ancients developed the wheel, concrete, calculus, and paper, which led to advances in transportation, construction, and communication. Today, the incarnation of technology falls in the realm of the digital and computational, and its progress has been rapid, even arguably exponential. In his chapter, “The Law of Accelerating Returns,” Ray Kurzweil writes, “An analysis of the history of technology shows that technological change is exponential, contrary to the common-sense ‘intuitive linear’ view. So we won’t experience 100 years of progress in the 21st century – it will be more like 20,000 years of progress (at today’s rate)” (Kurzweil, 2004, p. 381).
The curse of dimensionality confounds the comprehensive evaluation of computational structural mechanics problems. Adequately capturing complex material behavior and interacting physics phenomena in models can lead to long run times and large memory requirements, so substantial computational resources may be needed to analyze one scenario for a single set of input parameters. These requirements are then compounded by the number and range of input parameters (spanning material properties, loading, boundary conditions, and model geometry) that must be evaluated to characterize behavior, identify dominant parameters, perform uncertainty quantification, and optimize performance. To reduce model dimensionality, global sensitivity analysis (GSA) enables the identification of the input parameters that dominate a specific structural performance output. However, many distinct types of GSA methods are available, presenting a challenge when selecting the optimal approach for a specific problem. While substantial documentation in the literature details the methodology and derivation of GSA methods, application-based case studies focus on fields such as finance, chemistry, and environmental science. To inform the selection and implementation of a GSA method for structural mechanics problems by a nonexpert user, this article investigates five of the most widespread GSA methods with commonly used structural mechanics methods and models of varying dimensionality and complexity. It is concluded that all methods can identify the most dominant parameters, although with significantly different computational costs and quantitative capabilities. Method selection therefore depends on computational resources, the information required from the GSA, and the available data.
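One of the variance-based GSA families such surveys typically cover can be sketched compactly: a pick-freeze Monte Carlo estimator of first-order Sobol indices. The toy linear model below stands in for an expensive structural mechanics solver; the function, coefficients, and sample size are all illustrative assumptions.

```python
import random

# Pick-freeze Monte Carlo estimation of first-order Sobol sensitivity
# indices. The toy model stands in for an expensive structural solver;
# inputs are i.i.d. uniform on (0, 1).
def model(x):
    return x[0] + 2.0 * x[1] + 0.1 * x[2]

def sobol_first_order(model, dim, n=20000, seed=0):
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(dim)] for _ in range(n)]
    B = [[rng.random() for _ in range(dim)] for _ in range(n)]
    yA = [model(a) for a in A]
    mA = sum(yA) / n
    var = sum((y - mA) ** 2 for y in yA) / n
    indices = []
    for i in range(dim):
        # Freeze input i at its A-value, redraw every other input from B;
        # the covariance of the two outputs isolates input i's main effect.
        yF = [model(b[:i] + [a[i]] + b[i + 1:]) for a, b in zip(A, B)]
        mF = sum(yF) / n
        cov = sum((ya - mA) * (yf - mF) for ya, yf in zip(yA, yF)) / n
        indices.append(cov / var)
    return indices

s = sobol_first_order(model, dim=3)
print([round(x, 3) for x in s])  # x2 dominates; analytically S = (1, 4, 0.01)/5.01
```

Each index estimates the fraction of output variance attributable to one input alone, which is how "dominant parameters" are ranked; the cost here, $(d+1)\times n$ model runs, is exactly the computational burden the article weighs against simpler screening methods.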