We propose a one-to-many matching estimator of the average treatment effect based on propensity scores estimated by isotonic regression. This approach is predicated on the assumption of monotonicity in the propensity score function, a condition that can be justified in many economic applications. We show that the nature of the isotonic estimator helps to resolve many problems of existing matching methods, including efficiency, the choice of the number of matches, the choice of tuning parameters, robustness to propensity score misspecification, and bootstrap validity. As a by-product, a uniformly consistent isotonic estimator is developed for our proposed matching method.
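To make the approach concrete, here is a minimal sketch assuming a scalar covariate, a monotone true propensity score, and plain nearest-neighbour matching with a fixed number of matches $M$; the paper's estimator, matching scheme, and tuning choices may differ:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0, 1, n)                      # scalar covariate
d = rng.binomial(1, 0.2 + 0.6 * x)            # treatment with a monotone propensity
y = 1.0 * d + x + rng.normal(0, 1, n)         # outcome, true ATE = 1

# Monotone propensity score: isotonic regression of D on X.
iso = IsotonicRegression(y_min=1e-3, y_max=1 - 1e-3, out_of_bounds="clip")
p_hat = iso.fit_transform(x, d)

def matching_ate(p, y, d, M=5):
    """One-to-many matching on the estimated score: impute each unit's
    missing potential outcome by the mean of its M nearest neighbours
    in the opposite treatment arm."""
    idx_t, idx_c = np.flatnonzero(d == 1), np.flatnonzero(d == 0)
    imputed = np.empty_like(y)
    for i in range(len(y)):
        pool = idx_c if d[i] == 1 else idx_t
        nearest = pool[np.argsort(np.abs(p[pool] - p[i]))[:M]]
        imputed[i] = y[nearest].mean()
    return np.where(d == 1, y - imputed, imputed - y).mean()

print(matching_ate(p_hat, y, d))              # should be close to 1
```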
We consider an inhomogeneous Erdős–Rényi random graph ensemble with exponentially decaying random disconnection probabilities determined by an independent and identically distributed field of variables with heavy tails and infinite mean associated with the vertices of the graph. This model was recently investigated in the physics literature (Garuccio, Lalli, and Garlaschelli 2023) as a scale-invariant random graph within the context of network renormalization. From a mathematical perspective, the model fits in the class of scale-free inhomogeneous random graphs whose asymptotic geometrical features have recently attracted interest. While for this type of graph several results are known when the underlying vertex variables have finite mean and variance, here instead we consider the case of one-sided stable variables with necessarily infinite mean. To simplify our analysis, we assume that the variables are sampled from a Pareto distribution with parameter $\alpha\in(0,1)$. We start by characterizing the asymptotic distributions of the typical degrees and some related observables. In particular, we show that the degree of a vertex converges in distribution, after proper scaling, to a mixed Poisson law. We then show that correlations among degrees of different vertices are asymptotically non-vanishing, but at the same time a form of asymptotic tail independence is found when looking at the behavior of the joint Laplace transform around zero. Moreover, we present some findings concerning the asymptotic density of wedges and triangles, and show a cross-over for the existence of dust (i.e. disconnected vertices).
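A simulation sketch of the ensemble; the connection kernel and scaling used below ($p_{ij} = 1 - e^{-x_i x_j / n^{1/\alpha}}$) are assumptions made for illustration and need not match the paper's normalization:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, n = 0.5, 2000
x = rng.pareto(alpha, n) + 1.0        # Pareto(alpha) weights, infinite mean for alpha < 1

# Exponentially decaying disconnection probabilities (assumed kernel and scaling).
p = 1.0 - np.exp(-np.outer(x, x) / n ** (1.0 / alpha))
adj = np.triu(rng.random((n, n)) < p, 1)      # sample each unordered pair once
degrees = adj.sum(0) + adj.sum(1)

print("mean degree:", degrees.mean())
print("dust (isolated vertices):", int((degrees == 0).sum()))
```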
The use of large language models (LLMs) has exploded since November 2022, but there is sparse evidence regarding LLM use in health, medical, and research contexts. We aimed to summarise the current uses of and attitudes towards LLMs across our campus’ clinical, research, and teaching sites. We administered a survey about LLM uses and attitudes. We conducted summary quantitative analysis and inductive qualitative analysis of free text responses. In August–September 2023, we circulated the survey amongst all staff and students across our three campus sites (approximately n = 7500), comprising a paediatric academic hospital, research institute, and paediatric university department. We received 281 anonymous survey responses. We asked about participants’ knowledge of LLMs, their current use of LLMs in professional or learning contexts, and perspectives on possible future uses, opportunities, and risks of LLM use. Over 90% of respondents have heard of LLM tools and about two-thirds have used them in their work on our campus. Respondents reported using LLMs for various uses, including generating or editing text and exploring ideas. Many, but not necessarily all, respondents seem aware of the limitations and potential risks of LLMs, including privacy and security risks. Various respondents expressed enthusiasm about the opportunities of LLM use, including increased efficiency. Our findings show LLM tools are already widely used on our campus. Guidelines and governance are needed to keep up with practice. Insights from this survey were used to develop recommendations for the use of LLMs on our campus.
Inference and prediction under partial knowledge of a physical system is challenging, particularly when multiple confounding sources influence the measured response. Explicitly accounting for these influences in physics-based models is often infeasible due to epistemic uncertainty, cost, or time constraints, resulting in models that fail to accurately describe the behavior of the system. On the other hand, data-driven machine learning models such as variational autoencoders are not guaranteed to identify a parsimonious representation. As a result, they can suffer from poor generalization performance and reconstruction accuracy in the regime of limited and noisy data. We propose a physics-informed variational autoencoder architecture that combines the interpretability of physics-based models with the flexibility of data-driven models. To promote disentanglement of the known physics and confounding influences, the latent space is partitioned into physically meaningful variables that parametrize a physics-based model, and data-driven variables that capture variability in the domain and class of the physical system. The encoder is coupled with a decoder that integrates physics-based and data-driven components, and constrained by an adversarial training objective that prevents the data-driven components from overriding the known physics, ensuring that the physics-grounded latent variables remain interpretable. We demonstrate that the model is able to disentangle features of the input signal and separate the known physics from confounding influences using supervision in the form of class and domain observables. The model is evaluated on a series of synthetic case studies relevant to engineering structures, demonstrating the feasibility of the proposed approach.
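A minimal PyTorch sketch of the latent-space partitioning: the physics decoder below (a damped sinusoid) and all dimensions are placeholders, and the adversarial training objective is omitted:

```python
import torch
import torch.nn as nn

class PhysicsVAE(nn.Module):
    def __init__(self, sig_len=128, n_phys=2, n_data=4):
        super().__init__()
        self.n_phys = n_phys
        self.encoder = nn.Sequential(
            nn.Linear(sig_len, 64), nn.ReLU(),
            nn.Linear(64, 2 * (n_phys + n_data)),   # means and log-variances
        )
        self.correction = nn.Sequential(             # data-driven decoder part
            nn.Linear(n_data, 64), nn.ReLU(), nn.Linear(64, sig_len),
        )
        self.register_buffer("t", torch.linspace(0, 1, sig_len))

    def physics_model(self, z_phys):
        # Placeholder physics decoder: damped sinusoid with latent frequency
        # and damping; replace with the known model of the system.
        freq, damp = z_phys[:, :1], z_phys[:, 1:2]
        return torch.exp(-damp.abs() * self.t) * torch.sin(2 * torch.pi * freq * self.t)

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparametrization
        z_phys, z_data = z[:, :self.n_phys], z[:, self.n_phys:]
        x_hat = self.physics_model(z_phys) + self.correction(z_data)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return x_hat, kl

model = PhysicsVAE()
x = torch.randn(8, 128)
x_hat, kl = model(x)
loss = (x_hat - x).pow(2).mean() + 1e-3 * kl   # ELBO-style objective
loss.backward()
```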
A meta-conjecture of Coulson, Keevash, Perarnau, and Yepremyan [12] states that above the extremal threshold for a given spanning structure in a (hyper-)graph, one can find a rainbow version of that spanning structure in any suitably bounded colouring of the host (hyper-)graph. We solve one of the most pertinent outstanding cases of this conjecture by showing that for any $1\leq j\leq k-1$, if $G$ is a $k$-uniform hypergraph above the $j$-degree threshold for a loose Hamilton cycle, then any globally bounded colouring of $G$ contains a rainbow loose Hamilton cycle.
Political polarization is a group phenomenon in which opposing factions, often of unequal size, exhibit asymmetrical influence and behavioral patterns. Within these groups, elites and masses operate under different motivations and levels of influence, challenging simplistic views of polarization. Yet, existing methods for measuring polarization in social networks typically reduce it to a single value, assuming homogeneity in polarization across the entire system. While such approaches confirm the rise of political polarization in many social contexts, they overlook structural complexities that could explain its underlying mechanisms. We propose a method that decomposes existing polarization and alignment measures into distinct components. These components separately capture polarization processes involving elites and masses from opposing groups. Applying this method to Twitter discussions surrounding the 2019 and 2023 Finnish parliamentary elections, we find that (1) opposing groups rarely have a balanced contribution to observed polarization, and (2) while elites strongly contribute to structural polarization and consistently display greater alignment across various topics, the masses, too, have recently experienced a surge in alignment. Our method provides an improved analytical lens through which to view polarization, explicitly recognizing the complexity of elite-mass dynamics in polarized environments and the need to account for them.
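As an illustration of the decomposition idea, the following sketch splits a simple polarization score (the share of cross-group edges, an EI-index-style quantity used here as a stand-in for the paper's measures) into elite-elite, elite-mass, and mass-mass components:

```python
import networkx as nx
from collections import Counter

def decomposed_polarization(G, group, role):
    """Share of cross-group edges, split by the role pair of the endpoints:
    ('elite','elite'), ('elite','mass'), ('mass','mass')."""
    cross, total = Counter(), Counter()
    for u, v in G.edges():
        pair = tuple(sorted((role[u], role[v])))
        total[pair] += 1
        cross[pair] += group[u] != group[v]
    return {pair: cross[pair] / total[pair] for pair in total}

# Toy example: Zachary's karate club, with the two clubs as opposing
# groups and high-degree nodes standing in for elites.
G = nx.karate_club_graph()
group = {v: G.nodes[v]["club"] for v in G}
role = {v: "elite" if G.degree(v) >= 10 else "mass" for v in G}
print(decomposed_polarization(G, group, role))
```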
We show that the Potts model on a graph can be approximated by a sequence of independent and identically distributed spins in terms of Wasserstein distance at high temperatures. We prove a similar result for the Curie–Weiss–Potts model on the complete graph, conditioned on being close enough to any of its equilibrium macrostates, in the low-temperature regime. Our proof technique is based on Stein’s method for comparing the stationary distributions of two Glauber dynamics with similar updates, one of which is rapid mixing and contracting on a subset of the state space. Along the way, we prove a new upper bound on the mixing time of the Glauber dynamics for the conditional measure of the Curie–Weiss–Potts model near an equilibrium macrostate.
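For concreteness, a sketch of the Glauber dynamics for the $q$-state Curie–Weiss–Potts model (illustrative of the dynamics being compared, not of the proof itself):

```python
import numpy as np

def glauber_step(sigma, beta, q, rng):
    """Resample a uniformly chosen spin from its conditional law given
    the rest (mean-field Potts: weights exp(beta * colour count / n))."""
    n = len(sigma)
    i = rng.integers(n)
    counts = np.bincount(np.delete(sigma, i), minlength=q)
    weights = np.exp(beta * counts / n)
    sigma[i] = rng.choice(q, p=weights / weights.sum())

rng = np.random.default_rng(2)
n, q, beta = 500, 3, 0.5                      # small beta: high temperature
sigma = rng.integers(q, size=n)
for _ in range(20 * n):
    glauber_step(sigma, beta, q, rng)
print(np.bincount(sigma, minlength=q) / n)    # close to uniform (1/3, 1/3, 1/3)
```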
In this paper we investigate large-scale linear systems driven by a fractional Brownian motion (fBm) with Hurst parameter $H\in [1/2, 1)$. We interpret these equations either in the sense of Young ($H>1/2$) or Stratonovich ($H=1/2$). In particular, fractional Young differential equations are well suited to modeling real-world phenomena as they capture memory effects, unlike other frameworks. Although such equations are very expensive to solve in high dimensions, model reduction schemes for Young or Stratonovich settings have not yet received much attention. To address this gap, we analyze important features of the fundamental solutions associated with the underlying systems. We prove a weak type of semigroup property, which is the foundation for studying system Gramians, and we show how a dominant subspace can be identified from the Gramians introduced. The difficulty for fractional drivers with $H>1/2$ is that the corresponding Gramians are not linked to algebraic equations, which makes their computation expensive. We therefore propose empirical Gramians that can be learned from simulation data. Subsequently, we introduce projection-based reduced-order models using the dominant subspace information. We point out that such projections are not always optimal for Stratonovich equations, as stability might not be preserved and the error might be larger than expected; an improved reduced-order model is therefore proposed for $H=1/2$. We validate our techniques by conducting numerical experiments on large-scale stochastic differential equations driven by fBm resulting from spatial discretizations of fractional stochastic PDEs. Overall, our study provides useful insights into the applicability and effectiveness of reduced-order methods for stochastic systems with fractional noise, which can potentially aid in the development of more efficient computational strategies for practical applications.
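A sketch of the empirical-Gramian recipe for the $H=1/2$ case, using standard Brownian increments and an Itô Euler–Maruyama discretization; the system matrices below are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, steps, dt, n_samples = 200, 10, 500, 1e-2, 20
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n)) / np.sqrt(n)
N = 0.1 * rng.standard_normal((n, n)) / np.sqrt(n)

# Empirical Gramian: average of x(t) x(t)^T over simulated trajectories
# of dx = A x dt + N x dW with standard Brownian noise (H = 1/2).
G = np.zeros((n, n))
for _ in range(n_samples):
    x = rng.standard_normal(n)
    for _ in range(steps):
        x = x + dt * A @ x + np.sqrt(dt) * rng.standard_normal() * N @ x
        G += np.outer(x, x)
G /= n_samples * steps

# Dominant subspace from the leading eigenvectors; Galerkin projection.
eigvals, V = np.linalg.eigh(G)
Vr = V[:, -r:]                            # basis of the dominant r-dim subspace
Ar, Nr = Vr.T @ A @ Vr, Vr.T @ N @ Vr     # reduced-order coefficients
print("captured energy:", eigvals[-r:].sum() / eigvals.sum())
```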
This article introduces a blockchain-based insurance scheme that integrates parametric and collaborative elements. A pool of investors, referred to as surplus providers, locks funds in a smart contract, enabling blockchain users to underwrite parametric insurance contracts. These contracts automatically trigger compensation when predefined conditions are met. The collaborative aspect is embodied in the generation of tokens, which are distributed to surplus providers. These tokens represent each participant’s share of the surplus and grant voting rights for management decisions. The smart contract is developed in Solidity, a high-level programming language for the Ethereum blockchain, and deployed on the Sepolia testnet, with data processing and analysis conducted using Python. In addition, open-source code is provided and main research challenges are identified, so that further research can be carried out to overcome limitations of this first proof of concept.
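A toy Python model of the pool mechanics described above (the actual implementation in the paper is a Solidity smart contract deployed on Sepolia); names and numbers are illustrative:

```python
class ParametricPool:
    def __init__(self):
        self.surplus = 0.0
        self.tokens = {}          # surplus-provider address -> token balance

    def deposit(self, provider, amount):
        """Lock funds; mint tokens pro rata to the provider's share."""
        supply = sum(self.tokens.values())
        minted = amount if self.surplus == 0 else amount * supply / self.surplus
        self.tokens[provider] = self.tokens.get(provider, 0.0) + minted
        self.surplus += amount

    def underwrite(self, premium, payout, index_value, trigger):
        """Parametric contract: pay out automatically if the observed
        index crosses the predefined trigger, else keep the premium."""
        self.surplus += premium
        if index_value >= trigger:
            assert payout <= self.surplus, "pool must stay solvent"
            self.surplus -= payout
            return payout
        return 0.0

pool = ParametricPool()
pool.deposit("alice", 100.0)
pool.deposit("bob", 50.0)
paid = pool.underwrite(premium=5.0, payout=40.0, index_value=120.0, trigger=100.0)
print(paid, pool.surplus, pool.tokens)
```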
Detecting multiple structural breaks at unknown dates is a central challenge in time-series econometrics. Step-indicator saturation (SIS) addresses this challenge during model selection, and we develop its asymptotic theory for tuning parameter choice. We study its frequency gauge—the false detection rate—and show it is consistent and asymptotically normal. Simulations suggest that a smaller gauge minimizes bias in post-selection regression estimates. For the small gauge situation, we develop a complementary Poisson theory. We compare the local power of SIS to detect shifts with that of Andrews’ break test. We find that SIS excels when breaks are near the sample end or closely spaced. An application to U.K. labor productivity reveals a growth slowdown after the 2008 financial crisis.
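A simplified illustration of the SIS idea: saturate the regression with step indicators $1\{t \ge j\}$, select block by block at level $\alpha$, then re-estimate jointly; production implementations (e.g. gets/Autometrics) are considerably more elaborate:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
T = 200
y = rng.standard_normal(T)
y[120:] += 5.0                                 # one large step shift at t = 120

def sis_select(y, alpha=0.01, n_blocks=2):
    T = len(y)
    steps = (np.arange(T)[:, None] >= np.arange(1, T)[None, :]).astype(float)
    retained = []
    for block in np.array_split(np.arange(T - 1), n_blocks):
        res = sm.OLS(y, sm.add_constant(steps[:, block])).fit()
        retained += [block[i] + 1 for i in range(len(block))
                     if res.pvalues[i + 1] < alpha]      # skip the constant
    # Joint re-estimation with the union of retained indicators; drop
    # those that lose significance.
    res = sm.OLS(y, sm.add_constant(steps[:, [j - 1 for j in retained]])).fit()
    return [j for i, j in enumerate(retained) if res.pvalues[i + 1] < alpha]

print("retained step dates:", sis_select(y))   # expect a date near 120
```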
Measures of uncertainty in the past lifetime distribution play an important role in information theory, forensic science, and other related fields. In the present work, we propose a non-parametric kernel-type estimator for the generalized past entropy function, introduced by Gupta and Nanda [9], based on an $\alpha$-mixing sample. The resulting estimator is shown to be weakly and strongly consistent and asymptotically normally distributed under certain regularity conditions. The performance of the estimator is validated through a simulation study and a real data set.
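A plug-in sketch of a kernel-type estimator: estimate the density with a Gaussian KDE and the distribution function empirically, then plug into a generalized past entropy. The Rényi-type functional used below is an assumption for illustration; the exact form of Gupta and Nanda [9] should be taken from the paper:

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.integrate import trapezoid

rng = np.random.default_rng(5)
sample = rng.exponential(1.0, 500)     # i.i.d. for illustration (not alpha-mixing)

def past_entropy(sample, t, alpha=0.8, grid_size=400):
    """Plug-in estimate of an assumed Renyi-type generalized past entropy
    (1 - alpha)^{-1} * log int_0^t (f(x)/F(t))^alpha dx."""
    f_hat = gaussian_kde(sample)
    F_t = (sample <= t).mean()         # empirical CDF at t
    xs = np.linspace(0, t, grid_size)
    return np.log(trapezoid((f_hat(xs) / F_t) ** alpha, xs)) / (1.0 - alpha)

print(past_entropy(sample, t=1.5))
```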
We study backward stochastic difference equations (BS$\Delta$Es) driven by a $d$-dimensional stochastic process on a lattice, whose increments take only $d+1$ possible values that generate the lattice. Interpreting the driving process as a $d$-dimensional asset price process, we provide applications to an optimal investment problem and to a market equilibrium analysis, where utility functionals are defined via BS$\Delta$Es.
We consider the problem of detecting whether a power-law inhomogeneous random graph contains a geometric community, and we frame this as a hypothesis-testing problem. More precisely, we assume that we are given a sample from an unknown distribution on the space of graphs on $n$ vertices. Under the null hypothesis, the sample originates from the inhomogeneous random graph with a heavy-tailed degree sequence. Under the alternative hypothesis, $k=o(n)$ vertices are given spatial locations and connect following the geometric inhomogeneous random graph connection rule. The remaining $n-k$ vertices follow the inhomogeneous random graph connection rule. We propose a simple and efficient test based on counting normalized triangles to differentiate between the two hypotheses. We prove that our test correctly detects the presence of the community with high probability as $n\to\infty$, and identifies large-degree vertices of the community with high probability.
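A sketch of a triangle-count test; the toy below uses an Erdős–Rényi null and a planted clique as stand-ins for the inhomogeneous null model and the geometric community, and an empirical null quantile in place of the paper's analytic threshold:

```python
import networkx as nx
import numpy as np

def triangle_count(G):
    return sum(nx.triangles(G).values()) / 3       # each triangle counted thrice

def test_geometry(G_obs, null_sampler, n_null=50, level=0.95):
    """Reject the null when the observed triangle count exceeds the
    empirical level-quantile of the count under the null model."""
    null_counts = [triangle_count(null_sampler()) for _ in range(n_null)]
    return triangle_count(G_obs) > np.quantile(null_counts, level)

rng = np.random.default_rng(6)
n, p, k = 300, 0.03, 15
null_sampler = lambda: nx.fast_gnp_random_graph(n, p, seed=int(rng.integers(10**9)))
G_alt = null_sampler()                             # plant a dense community
G_alt.add_edges_from((i, j) for i in range(k) for j in range(i + 1, k))
print(test_geometry(G_alt, null_sampler))          # expected: True
```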
Course-prerequisite networks (CPNs) are directed acyclic graphs that model complex academic curricula by representing courses as nodes and dependencies between them as directed links. These networks are indispensable tools for visualizing, studying, and understanding curricula. For example, CPNs can be used to detect important courses, improve advising, guide curriculum design, analyze graduation time distributions, and quantify the strength of knowledge flow between different university departments. However, most CPN analyses to date have focused only on micro- and meso-scale properties. To fill this gap, we define and study three new global CPN measures: breadth, depth, and flux. All three measures are invariant under transitive reduction and are based on the concept of topological stratification, which generalizes topological ordering in directed acyclic graphs. These measures can be used for macro-scale comparison of different CPNs. We illustrate the new measures numerically by applying them to three real and synthetic CPNs from three universities: the Cyprus University of Technology, the California Institute of Technology, and Johns Hopkins University. The CPN data analyzed in this paper are publicly available in a GitHub repository.
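A sketch of topological stratification on a toy CPN; reading depth as the number of strata and breadth as the size of the largest stratum is a plausible rendering of the definitions (both invariant under transitive reduction), and the flux measure is not reproduced here:

```python
import networkx as nx
from collections import Counter

def stratify(dag):
    """Longest-path stratification: stratum of v = length of the longest
    directed path ending at v (invariant under transitive reduction)."""
    level = {}
    for v in nx.topological_sort(dag):
        level[v] = 1 + max((level[u] for u in dag.predecessors(v)), default=-1)
    return level

# Toy CPN with four courses.
cpn = nx.DiGraph([("Calc I", "Calc II"), ("Calc II", "ODE"),
                  ("Calc I", "Lin Alg"), ("Lin Alg", "ODE")])
level = stratify(cpn)
depth = 1 + max(level.values())                   # number of strata
breadth = max(Counter(level.values()).values())   # size of the largest stratum
print(level, depth, breadth)
```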
The main goal of this paper is to introduce a new model of belief evolution on networks. It generalizes the DeGroot model and describes the iterative process of establishing consensus in isolated social networks in the case of nonlinear aggregation functions. Our main tools come from mean theory and graph theory. The case in which the root set of the network (influencers, news agencies, etc.) is ergodic is fully discussed. The other case, in which the root contains more than one component, is partially discussed and offers a motivation for further research.
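A sketch of the iteration with a quasi-arithmetic-mean aggregation $x_i(t+1) = g^{-1}\big(\sum_j w_{ij}\, g(x_j(t))\big)$, which reduces to the classical DeGroot model when $g$ is the identity; whether this matches the paper's aggregation functions is an assumption:

```python
import numpy as np

def iterate_beliefs(W, x0, g, g_inv, n_iter=200):
    """Repeated nonlinear aggregation of beliefs along the network W."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = g_inv(W @ g(x))
    return x

# Row-stochastic influence matrix on 4 agents.
W = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.2, 0.6, 0.2, 0.0],
              [0.0, 0.3, 0.4, 0.3],
              [0.0, 0.0, 0.5, 0.5]])
x0 = [1.0, 2.0, 4.0, 8.0]

linear = iterate_beliefs(W, x0, lambda x: x, lambda x: x)   # classical DeGroot
geometric = iterate_beliefs(W, x0, np.log, np.exp)          # nonlinear aggregation
print(linear, geometric)   # both reach consensus, generally at different values
```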
Quick and accurate forecasts of incidence and mortality trends for the near future are particularly useful for the immediate allocation of available public health resources, as well as for understanding the long-term course of the pandemic. The surveillance data used for predictions, however, may come with reporting delays. Consequently, auxiliary data sources that are available immediately can provide valuable additional information for recent time periods for which surveillance data have not yet become fully available. In this work, a set of Google search queries by individual users related to COVID-19 incidence and mortality is collected and analyzed, with the aim of improving quick forecasts. Initially, the identified search query keywords were ranked according to their predictive ability for reported incidence and mortality. After that, the ARIMA, Prophet, and XGBoost models were fitted to generate forecasts using only the available reported incidence and mortality (baseline model) or together with combinations of searched keywords selected for their predictive ability (predictors model). In summary, the inclusion of top-ranked keywords as predictors significantly enhanced prediction accuracy in the majority of scenarios (between 50% and 90% of scenarios, depending on the model) and is recommended for future use. The inclusion of low-ranked keywords did not provide such an improvement. In general, the ranking of predictors and the corresponding forecast improvements were more pronounced for incidence than for mortality.
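A sketch of the baseline-versus-predictors comparison for one of the models (ARIMA via statsmodels' SARIMAX, with a keyword series as an exogenous regressor); all series below are synthetic stand-ins for the reported incidence and query volumes:

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(7)
T, split, horizon = 150, 120, 30
query = np.cumsum(rng.standard_normal(T))                      # synthetic keyword series
incidence = 0.8 * np.roll(query, 3) + rng.standard_normal(T)   # query leads by 3 steps

baseline = SARIMAX(incidence[:split], order=(1, 1, 1)).fit(disp=False)
with_query = SARIMAX(incidence[:split], exog=query[:split, None],
                     order=(1, 1, 1)).fit(disp=False)

fc_base = baseline.forecast(horizon)
fc_query = with_query.forecast(horizon, exog=query[split:, None])
truth = incidence[split:]
print("MAE, baseline:   ", np.abs(fc_base - truth).mean())
print("MAE, + keywords: ", np.abs(fc_query - truth).mean())
```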
A finite point set in $\mathbb{R}^d$ is in general position if no $d + 1$ points lie on a common hyperplane. Let $\alpha _d(N)$ be the largest integer such that any set of $N$ points in $\mathbb{R}^d$, with no $d + 2$ members on a common hyperplane, contains a subset of size $\alpha _d(N)$ in general position. Using the method of hypergraph containers, Balogh and Solymosi showed that $\alpha _2(N) \lt N^{5/6 + o(1)}$. In this paper, we also use the container method to obtain new upper bounds for $\alpha _d(N)$ when $d \geq 3$. More precisely, we show that if $d$ is odd, then $\alpha _d(N) \lt N^{\frac {1}{2} + \frac {1}{2d} + o(1)}$, and if $d$ is even, we have $\alpha _d(N) \lt N^{\frac {1}{2} + \frac {1}{d-1} + o(1)}$. We also study the classical problem of determining $a(d,k,n)$, the maximum number of points selected from the grid $[n]^d$ such that no $k + 2$ members lie on a $k$-flat, and improve the previously best known bound for $a(d,k,n)$, due to Lefmann in 2008, by a polynomial factor when $k \equiv 2$ or $3 \pmod{4}$.
This research presents the design, pricing, and consumer testing results of a potential private financial product that integrates retirement savings with social care funding through contributions to a supplemental defined contribution pension scheme. With this product, some contributions will be earmarked specifically to cover social care expenses if needed post-retirement. Our research indicates that offering benefits that address both retirement income supplementation and social care funding in a combined approach is appealing to consumers and could help overcome behavioural barriers to planning for social care. As with established defined contribution schemes, this product is designed for distribution in the workplace. Employees can contribute a portion of their earnings to their pension accounts. Employers may partially or fully match these contributions, further incentivising participation. In addition to financial support, participants will gain access to social care coordination services designed to facilitate ageing at home. These services will help retirees navigate care options, coordinate necessary support, and optimise the use of their allocated social care funds, ultimately promoting independence and well-being in later life.