We study a stochastic differential equation with an unbounded drift and general Hölder continuous noise of order $\lambda \in (0,1)$. The equation turns out to have a unique solution that, depending on the particular shape of the drift, either stays above some continuous function or has continuous upper and lower bounds. Under some mild assumptions on the noise, we prove that the solution has moments of all orders. In addition, we provide its connection to the solution of some Skorokhod reflection problem. As an illustration of our results and motivation for applications, we also suggest two stochastic volatility models that we regard as generalizations of the CIR and CEV processes. We complete the study by providing a numerical scheme for the solution.
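The paper's own numerical scheme is not reproduced here. As a rough, hedged illustration of the kind of dynamics involved, the sketch below simulates a CIR-type equation $dX_t = (a/X_t - bX_t)\,dt + dB^H_t$ driven by fractional Brownian motion with Hurst index $H = \lambda$ using a naive Euler step; the drift form, the choice of fBm as the Hölder noise, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np

def fbm_increments(n, H, T=1.0, seed=0):
    """Increments of fractional Brownian motion on [0, T] via the Cholesky method."""
    rng = np.random.default_rng(seed)
    t = np.linspace(T / n, T, n)
    # Covariance of fBm: 0.5 * (s^{2H} + t^{2H} - |t - s|^{2H})
    cov = 0.5 * (t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
                 - np.abs(t[:, None] - t[None, :]) ** (2 * H))
    path = np.linalg.cholesky(cov) @ rng.standard_normal(n)
    return np.diff(np.concatenate(([0.0], path)))

def euler_cir_like(x0, a, b, H, n=1000, T=1.0, eps=1e-6, seed=0):
    """Naive Euler scheme for dX_t = (a/X_t - b X_t) dt + dB^H_t (illustrative drift)."""
    dt = T / n
    dz = fbm_increments(n, H, T, seed)
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        xk = max(x[k], eps)            # keep the singular drift well defined
        x[k + 1] = xk + (a / xk - b * xk) * dt + dz[k]
    return x

path = euler_cir_like(x0=1.0, a=0.5, b=1.0, H=0.7)
print(path[-1])
```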
This paper deals with ergodic theorems for particular time-inhomogeneous Markov processes, whose time-inhomogeneity is asymptotically periodic. Under a Lyapunov/minorization condition, it is shown that, for any measurable bounded function f, the time average $\frac{1}{t} \int_0^t f(X_s)ds$ converges in $\mathbb{L}^2$ towards a limiting distribution, starting from any initial distribution for the process $(X_t)_{t \geq 0}$. This convergence can be improved to an almost sure convergence under an additional assumption on the initial measure. This result is then applied to show the existence of a quasi-ergodic distribution for processes absorbed by an asymptotically periodic moving boundary, satisfying a conditional Doeblin condition.
To describe the trend in the cumulative incidence of coronavirus disease 2019 (COVID-19) and undiagnosed cases over the course of the pandemic, through the emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants, among healthcare workers in Tokyo, we analysed data from repeated serological surveys and an in-house COVID-19 registry among the staff of the National Center for Global Health and Medicine. Participants were asked to donate venous blood and complete a questionnaire about COVID-19 diagnosis and vaccination. Positive serology was defined as a positive result on the Roche or Abbott assay against SARS-CoV-2 nucleocapsid protein, and cumulative infection was defined as either being seropositive or having a history of COVID-19. Cumulative infection increased from 2.0% in June 2021 (pre-Delta) to 5.3% in December 2021 (post-Delta). After the emergence of the Omicron variant, it increased substantially during 2022 (16.9% in June and 39.0% in December). As of December 2022, 30% of those who had been infected were unaware of their infection. These results indicate that SARS-CoV-2 infection expanded rapidly during the Omicron-variant epidemic among healthcare workers in Tokyo and that a sizable proportion of infections went undiagnosed.
The two-part framework and the Tweedie generalized linear model (GLM) have traditionally been used to model loss costs for short-term insurance contracts. For most portfolios of insurance claims, there is typically a large proportion of zero claims, leading to an imbalance that lowers the prediction accuracy of these traditional approaches. In this article, we propose tree-based methods with a hybrid structure involving a two-step algorithm as an alternative approach. The first step is the construction of a classification tree to build the probability model for claim frequency. The second step is the application of elastic net regression models at each terminal node of the classification tree to build the distribution models for claim severity. This hybrid structure captures the benefits of tuning hyperparameters at each step of the algorithm, which allows for improved prediction accuracy, and tuning can be performed to meet specific business objectives. Another major advantage of this hybrid structure is improved model interpretability. We examine and compare the predictive performance of this hybrid structure relative to the traditional Tweedie GLM using both simulated and real datasets. Our empirical results show that these hybrid tree-based methods produce more accurate and informative predictions.
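A minimal sketch of the two-step hybrid structure described above, assuming scikit-learn and numpy-array inputs; the function names and hyperparameters (max_depth, alpha, l1_ratio) are illustrative choices, not the authors' implementation. A classification tree models the probability of a claim, an elastic net fitted within each terminal node models severity, and the expected loss cost is the product of the two.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import ElasticNet

def fit_hybrid(X, claim_amount, max_depth=3, alpha=0.1, l1_ratio=0.5):
    """Step 1: classification tree for P(claim > 0); step 2: elastic net per leaf for severity."""
    has_claim = (claim_amount > 0).astype(int)
    tree = DecisionTreeClassifier(max_depth=max_depth).fit(X, has_claim)
    leaves = tree.apply(X)
    severity_models = {}
    for leaf in np.unique(leaves):
        mask = (leaves == leaf) & (has_claim == 1)
        if mask.sum() >= 10:  # fit only where enough positive claims exist
            m = ElasticNet(alpha=alpha, l1_ratio=l1_ratio).fit(X[mask], claim_amount[mask])
            severity_models[leaf] = m
    return tree, severity_models

def predict_loss_cost(tree, severity_models, X):
    """Expected loss cost = P(claim) * predicted severity in the observation's leaf."""
    p = tree.predict_proba(X)[:, 1]
    leaves = tree.apply(X)
    sev = np.zeros(len(X))
    for i, leaf in enumerate(leaves):
        if leaf in severity_models:
            sev[i] = severity_models[leaf].predict(X[i:i + 1])[0]
    return p * sev
```

Tuning max_depth in the first step and the elastic net penalties in the second step separately is what the hybrid structure's step-wise hyperparameter tuning refers to.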
We present an efficient algorithm to generate a discrete uniform distribution on a set of p elements, for prime p, using a biased random source. The algorithm generalizes Von Neumann’s method and improves the computational efficiency of Dijkstra’s method. In addition, the algorithm is extended to generate a discrete uniform distribution on any finite set based on the prime factorization of integers. The average running time of the proposed algorithm is overall sublinear: $\operatorname{O}\!(n/\log n)$.
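For context, here is a sketch of the classical Von Neumann trick that the algorithm generalizes (the paper's p-element construction and the improvement over Dijkstra's method are not reproduced here): biased bits are drawn in pairs and only the unequal pairs are kept, which yields unbiased bits because the two unequal outcomes are equally likely whatever the bias.

```python
import random

def von_neumann_fair_bit(biased_bit):
    """Extract a fair bit from a biased 0/1 source (the baseline the paper generalizes)."""
    while True:
        a, b = biased_bit(), biased_bit()
        if a != b:
            return a        # P(0,1) == P(1,0), so the output is unbiased

# Example: a source emitting 1 with probability 0.9.
biased = lambda: 1 if random.random() < 0.9 else 0
print(sum(von_neumann_fair_bit(biased) for _ in range(10_000)) / 10_000)  # ~0.5
```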
Multilayer networks are a focus of current research on complex networks. In such networks, multiple types of links may exist, as well as many attributes for nodes. To make full use of multilayer (and other types of complex) networks in applications, merging various data with topological information yields a powerful analysis. First, we suggest a simple way of representing network data in a data matrix where rows correspond to the nodes and columns correspond to the data items. The number of columns is allowed to be arbitrary, so the data matrix can be easily expanded by adding columns. The data matrix can be chosen according to the targets of the analysis and may vary a lot from case to case. Next, we partition the rows of the data matrix into communities using a method that allows maximal compression of the data matrix. For compressing a data matrix, we suggest extending the so-called regular decomposition method to non-square matrices. We illustrate our method for several types of data matrices, in particular distance matrices and matrices obtained by augmenting a distance matrix with a column of node degrees, or by concatenating several distance matrices corresponding to layers of a multilayer network. We illustrate our method with synthetic power-law graphs and two real networks: an Internet autonomous systems graph and a world airline graph. We compare the outputs of different community recovery methods on these graphs and discuss how incorporating node degrees as a separate column in the data matrix leads our method to identify community structures well aligned with the tiered hierarchical structures commonly encountered in complex scale-free networks.
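As an illustration of the kind of data matrix described above (the regular decomposition step itself is not reproduced), the sketch below builds an $n \times (n+1)$ matrix whose rows are nodes and whose columns are shortest-path distances plus a node-degree column, assuming NetworkX; the graph and the helper's name are illustrative.

```python
import numpy as np
import networkx as nx

def distance_degree_matrix(G):
    """Rows = nodes; columns = shortest-path distances to every node, plus a degree column."""
    nodes = list(G.nodes())
    n = len(nodes)
    index = {v: i for i, v in enumerate(nodes)}
    D = np.full((n, n), np.inf)          # inf marks unreachable pairs
    for src, lengths in nx.all_pairs_shortest_path_length(G):
        for dst, d in lengths.items():
            D[index[src], index[dst]] = d
    deg = np.array([G.degree(v) for v in nodes], dtype=float).reshape(-1, 1)
    return np.hstack([D, deg]), nodes    # n x (n + 1) data matrix

G = nx.barabasi_albert_graph(200, 2, seed=1)   # synthetic power-law graph
M, nodes = distance_degree_matrix(G)
print(M.shape)   # (200, 201)
```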
We study a general model of recursive trees where vertices are equipped with independent weights and, at each time-step, a vertex is sampled with probability proportional to its fitness function, which is a function of its weight and degree, and connects to $\ell$ new-coming vertices. Under a certain technical assumption, applying the theory of Crump–Mode–Jagers branching processes, we derive formulas for the limiting distributions of the proportion of vertices with a given degree and weight, and of the proportion of edges whose endpoint has a certain weight. As an application, we rigorously prove observations of Bianconi related to the evolving Cayley tree (Phys. Rev. E 66, 036116, 2002). We also study the process in depth in a particular case where the technical condition can fail, namely when the fitness function is affine, a model we call ‘generalised preferential attachment with fitness’. We show that this model can exhibit condensation, where a positive proportion of edges accumulates around vertices with maximal weight, or, more drastically, can have a degenerate limiting degree distribution, where the entire proportion of edges accumulates around these vertices. Finally, we prove stochastic convergence for the degree distribution under a different assumption, namely a strong law of large numbers for the partition function associated with the process.
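A minimal simulation sketch of the model in an affine-fitness case of the kind the abstract calls ‘generalised preferential attachment with fitness’; the specific fitness form $f(w,d) = w(d + \beta)$ and the uniform weight distribution are assumptions chosen purely for illustration.

```python
import random

def simulate_gpaf(steps, ell=2, beta=1.0, weight=lambda: random.random()):
    """Generalised preferential attachment with fitness f(w, d) = w * (d + beta) (illustrative)."""
    weights = [weight()]     # vertex 0
    degrees = [0]
    edges = []
    for _ in range(steps):
        # Sample an existing vertex with probability proportional to its fitness.
        fitness = [w * (d + beta) for w, d in zip(weights, degrees)]
        parent = random.choices(range(len(weights)), weights=fitness, k=1)[0]
        # The sampled vertex connects to ell new-coming vertices.
        for _ in range(ell):
            child = len(weights)
            weights.append(weight())
            degrees.append(1)
            degrees[parent] += 1
            edges.append((parent, child))
    return weights, degrees, edges

w, d, e = simulate_gpaf(1000)
print(max(d), len(e))
```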
The epidemiology of invasive meningococcal disease (IMD) is unpredictable, varies by region and age group and continuously evolves. This review aimed to describe trends in the incidence of IMD and serogroup distribution by age group and global region over time. Data were extracted from 90 subnational, national and multinational grey literature surveillance reports and 22 published articles related to the burden of IMD from 2010 to 2019 in 77 countries. The global incidence of IMD was generally low, with substantial variability between regions in circulating disease-causing serogroups. The highest incidence was usually observed in infants, generally followed by young children and adolescents/young adults, as well as older adults in some countries. Globally, serogroup B was a predominant cause of IMD in most countries. Additionally, there was a notable increase in the number of IMD cases caused by serogroups W and Y from 2010 to 2019 in several regions, highlighting the unpredictable and dynamic nature of the disease. Overall, serogroups A, B, C, W and Y were responsible for the vast majority of IMD cases, despite the availability of vaccines to prevent disease due to these serogroups.
The 'data revolution' offers many new opportunities for research in the social sciences. Increasingly, social and political interactions can be recorded digitally, leading to vast amounts of new data available for research. This poses new challenges for organizing and processing research data. This comprehensive introduction covers the entire range of data management techniques, from flat files to database management systems. It demonstrates how established techniques and technologies from computer science can be applied in social science projects, drawing on a wide range of different applied examples. This book covers simple tools such as spreadsheets and file-based data storage and processing, as well as more powerful data management software like relational databases. It goes on to address advanced topics such as spatial data, text as data, and network data. This book is one of the first to discuss questions of practical data management specifically for social science projects. This title is also available as Open Access on Cambridge Core.
The Defining Issues Test (DIT) has been widely used in psychological experiments to assess one’s developmental level of moral reasoning in terms of postconventional reasoning. However, there have been concerns regarding whether the tool is biased across people with different genders and political and religious views. To address these concerns, in the present study, I tested the validity of the brief version of the test, the behavioral DIT, in terms of measurement invariance and differential item functioning (DIF). I found no significant non-invariance at the test level and no item demonstrating practically significant DIF at the item level. The findings indicate that neither the test nor any of its items showed a significant bias toward any particular group. As a result, the collected validity evidence supports the use of the test scores across different groups, enabling researchers who intend to examine participants’ moral reasoning development across heterogeneous groups to draw conclusions based on the scores.
We determine the distributions of some random variables related to a simple model of an epidemic with contact tracing and cluster isolation. This enables us to apply general limit theorems for super-critical Crump–Mode–Jagers branching processes. Notably, we compute explicitly the asymptotic proportion of isolated clusters with a given size amongst all isolated clusters, conditionally on survival of the epidemic. Somewhat surprisingly, the latter differs from the distribution of the size of a typical cluster at the time of its detection, and we explain the reasons behind this seeming paradox.
Seven varieties of forage oats from China were evaluated in the temperate environment of Bhutan for morphological traits, dry matter production, and forage quality. The oat variety Qingyin No. 1 had the greatest plant height (61 cm) and the largest number of tillers per plant (five). The leaf-stem ratio (LSR) was highest for Longyan No. 2 (0.73). At the late-winter harvest, Longyan No. 2 had the greatest plant height (64 cm) and the highest number of tillers per plant (seven), followed by Qingyin No. 1. The top three varieties, with high LSRs of 1.49, 1.31, and 1.35, were Longyan No. 1, 2, and 3, respectively. In both summer and winter, Longyan No. 2 had the highest forage yields, of around 5.00 and 4.00 DM t/ha, respectively. Qingyin No. 1 was the second-largest forage producer, with under 5.00 DM t/ha in summer and under 3.00 DM t/ha in winter. For forage quality, Longyan No. 2 and Longyan No. 3 had the highest crude protein levels (15%) in summer, whereas during late winter the Linna variety had the highest crude protein content (13%). The overall results of the field experiments suggest that Longyan No. 2 and Qingyin No. 1 are promising new oat varieties for winter fodder production in the temperate environments of Bhutan.
Financial models are an inescapable feature of modern financial markets. Yet over-reliance on these models, and the failure to test them properly, is now widely recognized as one of the main causes of the financial crisis of 2007–2011. Since this crisis, there has been an increase in the amount of scrutiny and testing applied to such models, and validation has become an essential part of model risk management at financial institutions. The book covers all of the major risk areas that a financial institution is exposed to and uses models for, including market risk, interest rate risk, retail credit risk, wholesale credit risk, compliance risk, and investment management. The book discusses current practices and pitfalls that model risk users need to be aware of and identifies areas where validation can be advanced in the future. It provides the first unified framework for validating risk management models.
While the Poisson distribution is a classical statistical model for count data, the distributional model hinges on the constraining property that its mean equal its variance. This text instead introduces the Conway-Maxwell-Poisson distribution and motivates its use in developing flexible statistical methods based on its distributional form. This two-parameter model not only contains the Poisson distribution as a special case but, in its ability to account for data over- or under-dispersion, encompasses both the geometric and Bernoulli distributions. The resulting statistical methods serve in a multitude of ways, from an exploratory data analysis tool, to a flexible modeling impetus for varied statistical methods involving count data. The first comprehensive reference on the subject, this text contains numerous illustrative examples demonstrating R code and output. It is essential reading for academics in statistics and data science, as well as quantitative researchers and data analysts in economics, biostatistics and other applied disciplines.
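For reference (a standard formulation of the distribution, not taken from the book itself): the Conway-Maxwell-Poisson probability mass function is $P(X = x) = \lambda^{x} / \big((x!)^{\nu} Z(\lambda, \nu)\big)$ for $x = 0, 1, 2, \ldots$, with normalizing constant $Z(\lambda, \nu) = \sum_{j=0}^{\infty} \lambda^{j}/(j!)^{\nu}$. Setting $\nu = 1$ recovers the Poisson distribution, $\nu = 0$ (with $\lambda < 1$) the geometric, and the limit $\nu \to \infty$ the Bernoulli with success probability $\lambda/(1+\lambda)$; $\nu < 1$ captures over-dispersion and $\nu > 1$ under-dispersion relative to the Poisson.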
Around 0.4% of pregnant women in England have chronic hepatitis B virus (HBV) infection and need services to prevent vertical transmission. In this national audit, sociodemographic, clinical and laboratory information was requested from all maternity units in England for hepatitis B surface antigen-positive women initiating antenatal care in 2014. We describe these women's characteristics and indicators of access to/uptake of healthcare. Of 2542 pregnancies in 2538 women, median maternal age was 31 [IQR 27, 35] years, 94% (1986/2109) were non-UK born (25% (228/923) having arrived into the UK <2 years previously) and 32% (794/2473) had ⩾2 previous live births. In 39%, English levels were basic/less than basic. Antenatal care was initiated at median 11.3 [IQR 9.6, 14] gestation weeks, and ‘late’ (⩾20 weeks) in 10% (251/2491). In 70% (1783/2533) of pregnancies, HBV had been previously diagnosed and 11.8% (288/2450) had ⩾1 marker of higher infectivity. Missed specialist appointments were reported in 18% (426/2339). Late antenatal care and/or missed specialist appointments were more common in pregnancies among women lacking basic English, arriving in the UK ⩽2 years previously, newly HBV diagnosed, aged <25 years and/or with ⩾2 previous live births. We show overlapping groups of pregnant women with chronic HBV vulnerable to delayed or incomplete care.
We study a sceptical rumour model on the non-negative integer line. The model starts with two spreaders at sites 0, 1 and sceptical ignorants at all other natural numbers. Then each sceptic transmits the rumour, independently, to the individuals within a random distance on its right after s/he receives the rumour from at least two different sources. We say that the process survives if the size of the set of vertices which heard the rumour in this fashion is infinite. We calculate the probability of survival exactly, and obtain some bounds for the tail distribution of the final range of the rumour among sceptics. We also prove that the rumour dies out among non-sceptics and sceptics, under the same condition.
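A minimal simulation sketch of the dynamics described above, assuming a small bounded uniform transmission radius purely for illustration (the paper works with a general radius distribution and exact survival probabilities): sites 0 and 1 spread unconditionally, and a sceptic at site $i \geq 2$ spreads only after hearing the rumour from at least two distinct sources to its left.

```python
import random

def sceptic_rumour(n_sites=10_000, radius=lambda: random.randint(1, 3)):
    """Rightmost informed site in one run of the sceptical rumour process (illustrative radius)."""
    heard_from = [0] * n_sites
    informed = [False] * n_sites

    def spread(i):
        informed[i] = True
        for j in range(i + 1, min(i + 1 + radius(), n_sites)):
            heard_from[j] += 1

    spread(0)
    spread(1)
    for i in range(2, n_sites):      # transmission only goes right, so left-to-right order suffices
        if heard_from[i] >= 2:       # sceptics need two independent sources
            spread(i)
    return max(i for i in range(n_sites) if informed[i])

print(sceptic_rumour())
```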
Donor organizations and multilaterals require ways to measure progress toward the goal of creating an open internet and to condition assistance on recipient governments maintaining access to information online. Because the internet is increasingly a leading tool for exchanging information, authoritarian governments around the world often seek methods to restrict citizens’ access. Two of the most common methods for restricting the internet are shutting down internet access entirely and filtering specific content. We conduct a systematic literature review of articles on the measurement of internet censorship and find that little work has been done comparing the tradeoffs of different methods for measuring censorship on a global scale. We compare the tradeoffs between measuring these phenomena using expert analysis (as measured by Freedom House and V-Dem) and remote measurement with manual oversight (as measured by Access Now and the OpenNet Initiative [ONI]) for donor organizations that want to incentivize and measure good internet governance. We find that remote measurement with manual oversight is less likely to include false positives, and therefore may be preferable for donor organizations that value verifiability. We also find that expert analysis is less likely to include false negatives, particularly for very repressive regimes in the Middle East and Central Asia; these data may therefore be preferable for advocacy organizations that want to ensure very repressive regimes cannot avoid accountability, or for organizations working primarily in those regions.
There is an increasing gap between the speed of the policy cycle and that of technological and social change. This gap is becoming broader and more prominent in robotics, that is, movable machines that perform tasks either automatically or with a degree of autonomy. This is because current legislation was unprepared for machine learning and autonomous agents. As a result, the law often lags behind and does not adequately frame robot technologies. This state of affairs inevitably increases legal uncertainty. It is unclear what regulatory frameworks developers have to follow to comply, often resulting in technology that does not perform well in the wild, is unsafe, and can exacerbate biases and lead to discrimination. This paper explores these issues and considers the background, key findings, and lessons learned of the LIAISON project, which stands for “Liaising robot development and policymaking” and aims to devise an alignment model for the legal appraisal of robots, channelling robot policy development from a hybrid top-down/bottom-up perspective, in order to resolve this mismatch. As such, LIAISON seeks to uncover to what extent compliance tools could be used as data generators for robot policy purposes, so as to arrive at an optimal regulatory framing for existing and emerging robot technologies.