Signal-to-interference-plus-noise ratio (SINR) percolation is an infinite-range dependent variant of continuum percolation modeling connections in a telecommunication network. Unlike in earlier works, in the present paper the transmitted signal powers of the devices of the network are assumed random, independent and identically distributed, and possibly unbounded. Additionally, we assume that the devices form a stationary Cox point process, i.e., a Poisson point process with stationary random intensity measure, in two or more dimensions. We present the following main results. First, under suitable moment conditions on the signal powers and the intensity measure, there is percolation in the SINR graph given that the device density is high and interferences are sufficiently reduced, but not vanishing. Second, if the interference cancellation factor $\gamma$ and the SINR threshold $\tau$ satisfy $\gamma \geq 1/(2\tau)$, then there is no percolation for any intensity parameter. Third, in the case of a Poisson point process with constant powers, for any intensity parameter that is supercritical for the underlying Gilbert graph, the SINR graph also percolates with some small but positive interference cancellation factor.
This paper studies the joint tail asymptotics of extrema of the multi-dimensional Gaussian process over random intervals defined as $P(u)\;:\!=\; \mathbb{P}\{\cap_{i=1}^n (\sup_{t\in[0,\mathcal{T}_i]} ( X_{i}(t) +c_i t )>a_i u )\}$, $u\rightarrow\infty$, where $X_i(t)$, $t\ge0$, $i=1,2,\ldots,n$, are independent centered Gaussian processes with stationary increments, $\boldsymbol{\mathcal{T}}=(\mathcal{T}_1, \ldots, \mathcal{T}_n)$ is a regularly varying random vector with positive components, which is independent of the Gaussian processes, and $c_i\in \mathbb{R}$, $a_i>0$, $i=1,2,\ldots,n$. Our result shows that the structure of the asymptotics of $P(u)$ is determined by the signs of the drifts $c_i$. We also discuss a relevant multi-dimensional regenerative model and derive the corresponding ruin probability.
Rubella is a highly contagious but mild viral illness and a leading cause of congenital rubella syndrome (CRS). Routine rubella surveillance data do not exist in Ethiopia; however, laboratory-based confirmation of rubella cases from measles-negative samples was carried out within the measles surveillance system. The current study analysed the epidemiological distribution of rubella cases among measles-suspected cases in Ethiopia from 2011 to 2015. A national secondary data analysis of rubella identified through measles-based surveillance was carried out. Measles-suspected cases were investigated using a case investigation form, and a serum sample was collected and sent to the Ethiopian national laboratory for confirmation. Samples tested for measles immunoglobulin M (IgM) were also tested for rubella IgM. The investigation results were entered into an electronic database and analysed using SPSS version 25. Of 11 749 samples tested for rubella IgM from 2011 to 2015, 2295 (19.5%) were positive, and 51% of rubella-positive cases were female. Five per cent of all cases were females aged between 15 and 49 years. Cases were confirmed in all regions and the two administrative towns, and seasonal variation was observed, with peaks in the first and fourth seasonal periods of the year. Given the risk of congenital abnormalities (CRS), the Ethiopian government should focus on rubella and CRS surveillance with the aim of introducing a rubella vaccine.
After the introduction of the 13-valent pneumococcal conjugate vaccine (PCV13), serotype replacement occurred in Japan, and serotype 24 has become the most common serotype in paediatric invasive pneumococcal disease (IPD). To understand the characteristics of serotype 24-IPD in Japanese children in the post-PCV13 era, we conducted a retrospective study in children aged ≤15 years from 2010 to 2020 using a database of paediatric IPD surveillance in Chiba prefecture, Japan. We identified a total of 357 IPD cases and collected clinical information on 225 cases (serotype 24: 32 cases, non-serotype 24: 193 cases). In the multivariate regression analysis, serotype 24-IPD was independently associated with age <2 years [odds ratio (OR) 3.91, 95% confidence interval (CI) 1.47–10.44; P = 0.0064] and bacteraemia (OR 2.28, 95% CI 1.01–5.13; P = 0.0475) compared with non-serotype 24-IPD. We also conducted a bacterial analysis: the serotype 24-IPD isolates tended to be PCG-susceptible (24: 100.0%, non-24: 61.3%; P < 0.0001) and macrolide-resistant (24: 100.0%, non-24: 87.3%; P = 0.0490). Multilocus sequence typing mostly identified ST2572 and its variants, which are unique to Japan. This tendency might be a result of the progress of the Japanese PCV13 immunisation programme.
In this editorial, Guest Editors Richard Benjamins (Telefónica), Jeanine Vos (GSMA), and Stefaan Verhulst (Data & Policy Editor-in-Chief) draw insights from a set of peer-reviewed, open access articles in a Data & Policy special collection dedicated to the use of Telco Big Data Analytics for COVID-19.
We revisit the so-called cat-and-mouse Markov chain, studied earlier by Litvak and Robert (2012). This is a two-dimensional Markov chain on the lattice $\mathbb{Z}^2$, where the first component (the cat) is a simple random walk and the second component (the mouse) changes when the components meet. We obtain new results for two generalisations of the model. First, in the two-dimensional case we consider far more general jump distributions for the components and obtain a scaling limit for the second component. When we let the first component be a simple random walk again, we further generalise the jump distribution of the second component. Second, we consider chains of three and more dimensions, where we investigate structural properties of the model and find a limiting law for the last component.
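The basic dynamics can be illustrated with a toy simulation. The sketch below assumes, as one plausible reading of the meeting rule (the paper's actual jump mechanism may differ), that the cat performs a simple random walk on $\mathbb{Z}$ and the mouse takes a single simple-random-walk step whenever the cat lands on its site:

```python
import random

def cat_and_mouse(steps, seed=0):
    """Toy cat-and-mouse chain on the integer lattice Z.

    The cat performs a simple random walk; the mouse stays put except
    when the cat occupies its site, in which case the mouse itself takes
    one simple-random-walk step. Returns the list of (cat, mouse) states.
    """
    rng = random.Random(seed)
    cat, mouse = 0, 0
    path = [(cat, mouse)]
    for _ in range(steps):
        cat += rng.choice((-1, 1))
        if cat == mouse:
            mouse += rng.choice((-1, 1))
        path.append((cat, mouse))
    return path

trajectory = cat_and_mouse(10_000)
```

In this reading the mouse moves only at meeting times, which is what makes its motion a slow, cat-driven process amenable to a scaling limit.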
Veterinary healthcare workers are in close contact with many different animals and might be at an increased risk of acquiring Clostridioides difficile. In this cross-sectional study, we assessed the prevalence and risk factors of C. difficile carriage in Dutch veterinary healthcare workers. Participants provided a faecal sample and filled out a questionnaire covering potential risk factors for C. difficile carriage. C. difficile culture positive isolates were polymerase chain reaction (PCR) ribotyped and the presence of toxin genes tcdA, tcdB and cdtA/cdtB was determined. Eleven of 482 [2.3%; 95% confidence interval (CI) 1.3–4.0] veterinary healthcare workers were carriers of C. difficile. Three persons carried C. difficile ribotype 078 (0.6%; 95% CI 0.2–1.8). Risk factors for carriage were health/medication and hygiene related, including poor hand hygiene after patient (animal) contact, and did not include occupational contact with certain animal species. In conclusion, the prevalence of C. difficile carriage in veterinary healthcare workers was low and no indications were found that working in veterinary care is a risk for C. difficile carriage.
For a one-locus haploid infinite population with discrete generations, the celebrated model of Kingman describes the evolution of fitness distributions under the competition of selection and mutation, with a constant mutation probability. This paper generalises Kingman’s model by using independent and identically distributed random mutation probabilities, to reflect the influence of a random environment. The weak convergence of fitness distributions to the globally stable equilibrium is proved. Condensation occurs when almost surely a positive proportion of the population travels to and condenses at the largest fitness value. Condensation may occur when selection is favoured over mutation. A criterion for the occurrence of condensation is given.
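The selection–mutation iteration can be sketched in a discretised form. In the sketch below the fitness grid, the mutant distribution and the range of the i.i.d. mutation probabilities are all invented for illustration; they merely stand in for the random environment described above:

```python
import numpy as np

def kingman_step(p, q, beta, w):
    """One generation of a discretised Kingman-type model: selection
    reweights the current fitness distribution p by the fitness values w,
    then mutation replaces a fraction beta of the population by the
    mutant fitness distribution q."""
    selected = w * p / np.dot(w, p)
    return (1 - beta) * selected + beta * q

# Hypothetical setup: fitness values on a grid in [0, 1], a uniform
# mutant distribution, and i.i.d. uniform mutation probabilities.
rng = np.random.default_rng(0)
w = np.linspace(0.0, 1.0, 101)      # fitness values
q = np.full(101, 1.0 / 101)         # mutant fitness distribution
p = q.copy()                        # initial population
for beta in rng.uniform(0.05, 0.2, size=500):
    p = kingman_step(p, q, beta, w)
```

Iterating shifts mass toward high fitness values; in the continuum model, condensation corresponds to a positive fraction of mass accumulating at the largest fitness value.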
Outbreaks caused by Chlamydia psittaci and other chlamydial species have recently been reported in poultry farms worldwide, causing considerable economic losses. The objective of this study was to determine the presence of chlamydial species in poultry in Costa Rica. One hundred and fifty pools of lung tissue samples from industrial poultry with respiratory problems and 112 pools of tracheal swabs from asymptomatic backyard poultry were analysed by real-time quantitative polymerase chain reaction (qPCR), end-point PCR and sequencing. A total of 16.8% (44/262) of samples were positive for Chlamydia spp., most of them detected in asymptomatic backyard poultry (28.6%, 32/112) and fewer in industrial poultry (8%, 12/150). Of these positive samples, 45.5% (20/44) were determined to be C. psittaci. For the first time, C. psittaci genotype A is reported in poultry in Latin America. In addition, the presence of Chlamydia gallinacea in backyard poultry and of Chlamydia muridarum in industrial and backyard poultry is reported for the first time in Central America. In 40.9% (18/44) of the positive samples, it was not possible to identify the infecting chlamydial species. These findings reveal a zoonotic risk, particularly for poultry farm and slaughterhouse workers having direct contact with these birds.
Preterm infants show postnatal deficits of long-chain polyunsaturated fatty acids (LCPUFAs) which are essential for adequate growth and neurodevelopment. Human milk is a primary source of fatty acids (FAs) for the preterm infant, and therefore, knowledge about milk FA levels is required to design appropriate supplementation strategies. Here, we expanded on our previous study (Nilsson et al., 2018, Acta Paediatrica, 107, 1020–1027) determining FA composition in milk obtained from mothers of extremely low gestational age (<28 weeks) infants on three occasions during lactation. There was a clear difference in FA composition in milk collected at Day 7 and milk collected at postmenstrual weeks (PMW) 32 or PMW 40. Notably, the proportion of LCPUFAs was low and declined significantly during milk maturation. These results strengthen previous data that the content of FAs required by the preterm infant is not supplied in sufficient amounts when the mother’s own milk is the sole source of these essential nutrients.
This chapter gives an introduction to extreme value theory. Unlike most statistical analyses, which are concerned with the typical properties of a random variable, extreme value theory is concerned with rare events that occur in the tail of the distribution. The cornerstone of extreme value theory is the Extremal Types Theorem. This theorem states that the maximum of N independent and identically distributed random variables can converge, after suitable normalization, only to a single distribution in the limit of large N. This limiting distribution is called the Generalized Extreme Value (GEV) distribution. This theorem is analogous to the central limit theorem, except that the focus is on the maximum rather than the sum of random variables. The GEV provides the basis for estimating the probability of extremes that are more extreme than those that occurred in a sample. The GEV is characterized by three parameters, called the location, scale, and shape. A procedure called the maximum likelihood method can be used to estimate these parameters, quantify their uncertainty, and account for dependencies on time or external environmental conditions.
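A minimal sketch of this machinery, using `scipy.stats.genextreme` on synthetic block maxima (the Gaussian "daily" data and block length are invented for illustration, not taken from the chapter; note that SciPy's shape parameter `c` is the negative of the shape parameter in the usual climate convention):

```python
import numpy as np
from scipy import stats

# Block maxima: the maximum of N = 365 standard normal "daily" values
# in each of 100 "years". By the Extremal Types Theorem, these maxima
# are approximately GEV distributed.
rng = np.random.default_rng(42)
block_maxima = rng.standard_normal((100, 365)).max(axis=1)

# Maximum likelihood estimates of the three GEV parameters
# (shape, location, scale).
c, loc, scale = stats.genextreme.fit(block_maxima)

# Estimated probability of exceeding a level larger than any maximum
# observed in the sample, i.e. an extreme beyond the data.
level = block_maxima.max() + 0.5
p_exceed = stats.genextreme.sf(level, c, loc=loc, scale=scale)
```

The survival-function evaluation at `level` is exactly the kind of extrapolation the GEV enables: estimating probabilities of events more extreme than any in the sample.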
The correlation test is a standard procedure for deciding if two variables are linearly related. This chapter discusses a test for independence that avoids the linearity assumption. The basic idea is the following. If two variables are dependent, then changing the value of one of them, say c, changes the distribution of the other. Therefore, if samples are collected for a fixed value of c, and additional samples are collected for a different value of c, and so on for different values of c, then a dependence implies that the distributions for different c's should differ. It follows that deciding that some aspect of the distributions depends on c is equivalent to deciding that the variables are dependent. A special case of this approach is the t-test, which tests if two populations have identical means. Generalizing this test to more than two populations leads to Analysis of Variance (ANOVA), which is the topic of this chapter. ANOVA is a method for testing if two or more populations have the same means. In weather and climate studies, ANOVA is used most often to quantify the predictability of an ensemble forecast, hence this framing is discussed extensively in this chapter.
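A sketch of a one-way ANOVA with `scipy.stats.f_oneway`, using synthetic groups that play the role of samples collected at three fixed values of c (the group sizes and means are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Samples of the response collected at three fixed values of c.
# If the response depends on c, the group means should differ.
group_a = rng.normal(loc=0.0, scale=1.0, size=50)
group_b = rng.normal(loc=0.0, scale=1.0, size=50)
group_c = rng.normal(loc=1.0, scale=1.0, size=50)   # shifted mean

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
reject_equal_means = p_value < 0.05
```

With only two groups, the ANOVA F statistic reduces to the square of the two-sample t statistic, which is the sense in which ANOVA generalizes the t-test.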
We investigate random minimal factorizations of the n-cycle, that is, factorizations of the permutation $(1 \, 2 \cdots n)$ into a product of cycles $\tau_1, \ldots, \tau_k$ whose lengths $\ell(\tau_1), \ldots, \ell(\tau_k)$ satisfy the minimality condition $\sum_{i=1}^k(\ell(\tau_i)-1)=n-1$. By associating to a cycle of the factorization a black polygon inscribed in the unit disk, and reading the cycles one after another, we code a minimal factorization by a process of colored laminations of the disk. These new objects are compact subsets made of red noncrossing chords delimiting faces that are either black or white. Our main result is the convergence of this process as $n \rightarrow \infty$, when the factorization is randomly chosen according to Boltzmann weights in the domain of attraction of an $\alpha$-stable law, for some $\alpha \in (1,2]$. The limiting process interpolates between the unit circle and a colored version of Kortchemski’s $\alpha$-stable lamination. Our principal tool in the study of this process is a new bijection between minimal factorizations and a model of size-conditioned labeled random trees whose vertices are colored black or white, as well as the investigation of the asymptotic properties of these trees.
The previous chapter considered the following problem: given a distribution, deduce the characteristics of samples drawn from that distribution. This chapter goes in the opposite direction: given a random sample, infer the distribution from which the sample was drawn. It is impossible to infer the distribution exactly from a finite sample. Our strategy is more limited: we propose a hypothesis about the distribution, then decide whether or not to accept the hypothesis based on the sample. Such procedures are called hypothesis tests. In each test, a decision rule for deciding whether to accept or reject the hypothesis is formulated. The probability that the rule gives the wrong decision when the hypothesis is true leads to the concept of a significance level. In climate studies, the most common questions addressed by hypothesis tests are whether two random variables (1) have the same mean, (2) have the same variance, or (3) are independent. This chapter discusses the corresponding tests for normal distributions, called the (1) t-test (or difference-in-means test), (2) F-test (or difference-in-variance test), and (3) correlation test.
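The three tests can be sketched on synthetic normal samples (the sample sizes and population parameters below are invented for illustration; SciPy has no built-in variance-ratio F-test, so it is assembled directly from the F distribution):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=1.0, size=200)
y = rng.normal(loc=0.3, scale=1.5, size=200)

# (1) Difference in means: two-sample t-test (Welch's version, which
# does not assume equal variances).
t_stat, p_mean = stats.ttest_ind(x, y, equal_var=False)

# (2) Difference in variances: F-test via the ratio of sample
# variances, with a two-sided p-value from the F distribution.
f_stat = np.var(y, ddof=1) / np.var(x, ddof=1)
p_var = 2 * min(stats.f.sf(f_stat, 199, 199),
                stats.f.cdf(f_stat, 199, 199))

# (3) Independence (under joint normality): correlation test.
r, p_corr = stats.pearsonr(x, y)
```

Each p-value is then compared with the chosen significance level to decide whether to reject the corresponding hypothesis.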
This chapter reviews some essential concepts of probability and statistics, including: line plots, histograms, scatter plots, mean, median, quantiles, variance, random variables, probability density function, expectation of a random variable, covariance and correlation, independence, the normal distribution (also known as the Gaussian distribution), and the chi-square distribution. These concepts provide the foundation for the statistical methods discussed in the rest of this book.
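Most of these summary statistics are one-liners in NumPy; a minimal sketch on synthetic Gaussian data (the parameters and the linear relation between the two variables are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=10_000)   # Gaussian sample
y = 0.5 * x + rng.normal(size=10_000)             # linearly related variable

mean, median = x.mean(), np.median(x)
q25, q75 = np.quantile(x, [0.25, 0.75])           # quartiles
variance = x.var(ddof=1)                          # unbiased sample variance
correlation = np.corrcoef(x, y)[0, 1]             # sample correlation
```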
Field significance is concerned with testing a large number of hypotheses simultaneously. Previous chapters have discussed methods for testing one hypothesis, such as whether one variable is correlated with one other variable. Field significance is concerned with whether one variable is related to a random vector. In climate applications, a characteristic feature of field significance problems is that the variables in the random vector correspond to quantities at different geographic locations. As such, neighboring variables are correlated and therefore exhibit spatial dependence. This spatial dependence needs to be taken into account when testing hypotheses. This chapter introduces the concept of field significance and explains three hypothesis test procedures: a Monte Carlo method proposed by Livezey and Chen (1983) and an associated permutation test, a regression method proposed by DelSole and Yang (2011), and a procedure to control the false discovery rate, proposed in a general context by Benjamini and Hochberg (1995) and applied to field significance problems by Ventura et al. (2004) and Wilks (2006).
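The false-discovery-rate procedure of Benjamini and Hochberg is simple enough to sketch directly; in the toy example below the split into 95 "null" grid points and 5 with signal, and the p-values themselves, are invented for illustration:

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: return a boolean array
    marking which hypotheses are rejected while controlling the false
    discovery rate at level alpha (valid for independent tests)."""
    p = np.asarray(p_values)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Largest i with p_(i) <= alpha * i / m; reject all smaller p's.
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject

# Toy field-significance example: 95 "null" grid points and 5 with
# strong signal (very small p-values).
p_vals = np.concatenate([np.full(5, 1e-4), np.linspace(0.2, 1.0, 95)])
rejected = benjamini_hochberg(p_vals, alpha=0.05)
```

Unlike a naive per-point test at level alpha, the step-up thresholds adapt to how many small p-values occur across the whole field.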
The previous chapter discussed Analysis of Variance (ANOVA), a procedure for deciding if populations have identical scalar means. This chapter discusses the generalization of this test to vector means, which is called Multivariate Analysis of Variance, or MANOVA. MANOVA can detect predictability of random vectors and decompose a random vector into a sum of components ordered such that the first maximizes predictability, the second maximizes predictability subject to being uncorrelated with the first, and so on. This decomposition is called Predictable Component Analysis (PrCA) or signal-to-noise maximizing EOF analysis. A slight modification of this procedure can decompose forecast skill. The connection between PrCA, Canonical Correlation Analysis, and Multivariate Regression is reviewed. In typical climate studies, the dimension of the random vector exceeds the number of samples, leading to an ill-posed problem. The standard approach to this problem is to apply PrCA on a small number of principal components. The problem of selecting the number of principal components can be framed as a model selection problem in regression.
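In the usual formulation, the predictable components solve a generalized eigenvalue problem between a signal covariance matrix and a noise covariance matrix. A sketch on synthetic ensemble-forecast data (the dimensions, signal variances, and covariance estimators below are invented for illustration, with the number of variables kept well below the sample size so the problem is well posed):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)

# Toy ensemble data: forecast cases x ensemble members x grid points.
n_starts, n_members, n_vars = 200, 10, 5
signal = rng.standard_normal((n_starts, 1, n_vars)) \
    * np.array([2.0, 1.0, 0.5, 0.2, 0.1])           # predictable part
data = signal + rng.standard_normal((n_starts, n_members, n_vars))

ens_mean = data.mean(axis=1)
# Signal covariance: covariance of ensemble means. Noise covariance:
# covariance of member deviations about each ensemble mean.
S = np.cov(ens_mean, rowvar=False)
anom = (data - ens_mean[:, None, :]).reshape(-1, n_vars)
N = anom.T @ anom / (n_starts * (n_members - 1))

# Predictable components: generalized eigenvectors of (S, N),
# reordered by decreasing signal-to-noise ratio.
snr, components = eigh(S, N)
snr, components = snr[::-1], components[:, ::-1]
```

The leading column of `components` is the pattern maximizing the signal-to-noise ratio; subsequent columns maximize it subject to being uncorrelated with the earlier ones.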
Many scientific questions lead to hypotheses about random vectors. For instance, the question of whether global warming has occurred over a geographic region is a question about whether temperature has changed at each spatial location within the region. One approach to addressing such a question is to apply a univariate test to each location separately and then use the results collectively to make a decision. This approach is called multiple testing or multiple comparisons and is common in genomics for analyzing gene expressions. The disadvantage of this approach is that it does not fully account for correlation between variables. Multivariate techniques provide a framework for hypothesis testing that takes into account correlations between variables. Although multivariate tests are more comprehensive, they require estimating more parameters and therefore have low power when the number of variables is large. Multivariate statistical analysis draws heavily on linear algebra and includes a generalization of the normal distribution, called the multivariate normal distribution, whose population parameters are the mean vector and the covariance matrix.
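A minimal sketch of the multivariate normal distribution and its two population parameters, using an invented covariance matrix with the kind of positive correlations that nearby locations would exhibit:

```python
import numpy as np

rng = np.random.default_rng(0)

# A multivariate normal mimicking temperatures at three nearby
# locations: its population parameters are the mean vector and the
# covariance matrix, whose off-diagonal entries encode the spatial
# correlation that univariate tests ignore.
mean = np.array([15.0, 14.0, 16.0])
cov = np.array([[1.0, 0.8, 0.5],
                [0.8, 1.0, 0.8],
                [0.5, 0.8, 1.0]])
samples = rng.multivariate_normal(mean, cov, size=2000)

# Sample estimates of the population parameters.
mean_hat = samples.mean(axis=0)
cov_hat = np.cov(samples, rowvar=False)
```

A multivariate test works with `mean_hat` and `cov_hat` jointly, whereas the multiple-testing approach examines each component of `mean_hat` separately.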
Climate data are correlated over short spatial and temporal scales. For instance, today’s weather tends to be correlated with tomorrow’s weather, and weather in one city tends to be correlated with weather in a neighboring city. Such correlations imply that weather events are not independent. This chapter discusses an approach to accounting for spatial and temporal dependencies based on stochastic processes. A stochastic process is a collection of random variables indexed by a parameter, such as time or space. A stochastic process is described by the moments at a single time (e.g., mean and variance), and also by the degree of dependence between two times, often measured by the autocorrelation function. This chapter presents these concepts and discusses common mathematical models for generating stochastic processes, especially autoregressive models. The focus of this chapter is on developing the language for describing stochastic processes. Challenges in estimating parameters and testing hypotheses about stochastic processes are discussed.
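The workhorse autoregressive model can be sketched in a few lines; the coefficient and series length below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) process x[t] = phi * x[t-1] + eps[t], a standard
# stochastic-process model for temporally correlated climate data.
phi, n = 0.7, 50_000
eps = rng.standard_normal(n)
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

# Sample lag-1 autocorrelation; for an AR(1) process the
# autocorrelation function decays geometrically as phi**k at lag k.
lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]
```

The sample lag-1 autocorrelation estimates `phi`, illustrating how the autocorrelation function summarizes the dependence between two times.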