This study investigates the integration of advanced heating, ventilation, and air conditioning (HVAC) systems with reinforcement learning (RL) control to enhance energy efficiency in low-energy buildings amid the extreme seasonal temperatures of Tehran. We conducted comprehensive simulation assessments using the EnergyPlus and HoneybeeGym platforms to evaluate two distinct reinforcement learning models: traditional Q-learning (Model A) and deep reinforcement learning (DRL) with neural networks (Model B). Model B consisted of a deep convolutional network architecture with 256 neurons in each hidden layer, employing rectified linear units as activation functions and the Adam optimizer at a learning rate of 0.001. The results demonstrated that the RL-managed systems achieved a statistically significant 20 percent reduction in energy-use intensity (p < 0.001), from 250 to 200 kWh/m² annually, relative to the baseline scenario. Thermal comfort also improved notably: the predicted mean vote (PMV) settled at 0.25, within the ASHRAE Standard 55 comfort range, and the predicted percentage of dissatisfied (PPD) fell to 10%. Model B (DRL) demonstrated roughly a 50 percent improvement in prediction accuracy over Model A, with a mean absolute error of 0.579366 versus 1.140008 and a root mean square error of 0.689770 versus 1.408069, indicating better adaptability to consistent daily trends and irregular periodicities such as weather patterns. The proposed reinforcement learning method achieved energy savings of 10–15 percent over both rule-based and model predictive control, while employing fewer building features than existing state-of-the-art control systems.
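To make the tabular Q-learning baseline (Model A) concrete, here is a minimal sketch of the control loop. The discretized state space, action set, reward, and hyperparameters are illustrative assumptions of ours, not the study's actual configuration.

```python
import random

# Toy sketch of tabular Q-learning for setpoint control. State space,
# actions, reward, and hyperparameters are illustrative assumptions only.
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1   # learning rate, discount, exploration

states = range(18, 31)    # indoor temperature (deg C), discretized
actions = (-1, 0, 1)      # lower / hold / raise the setpoint
Q = {(s, a): 0.0 for s in states for a in actions}

def choose_action(s):
    """Epsilon-greedy policy over the Q-table."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

def q_update(s, a, reward, s_next):
    """One Bellman backup: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (reward + GAMMA * best_next - Q[(s, a)])

# Toy episode: the reward penalizes deviation from a 24 deg C comfort target.
random.seed(0)
s = 28
for _ in range(500):
    a = choose_action(s)
    s_next = min(max(s + a, 18), 30)
    q_update(s, a, -abs(s_next - 24), s_next)
    s = s_next
```

A DRL agent such as Model B replaces the table `Q` with a neural network and the Bellman backup with a gradient step on the same temporal-difference target.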
Let $\Sigma$ be an alphabet and $\mu$ be a distribution on $\Sigma ^k$ for some $k \geqslant 2$. Let $\alpha \gt 0$ be the minimum probability of a tuple in the support of $\mu$ (denoted $\mathsf{supp}(\mu )$). We treat the parameters $\Sigma , k, \mu , \alpha$ as fixed and constant. We say that the distribution $\mu$ has a linear embedding if there exist an Abelian group $G$ (with the identity element $0_G$) and mappings $\sigma _i : \Sigma \rightarrow G$, $1 \leqslant i \leqslant k$, such that at least one of the mappings is non-constant and for every $(a_1, a_2, \ldots , a_k)\in \mathsf{supp}(\mu )$, $\sum _{i=1}^k \sigma _i(a_i) = 0_G$. In [Bhangale-Khot-Minzer, STOC 2022], the authors asked the following analytical question. Let $f_i: \Sigma ^n\rightarrow [\!-1,1]$ be bounded functions, such that at least one of the functions $f_i$ essentially has degree at least $d$, meaning that the Fourier mass of $f_i$ on terms of degree less than $d$ is at most $\delta$. If $\mu$ has no linear embedding (over any Abelian group), then is it necessarily the case that
$$\left|\mathop{\mathbb{E}}_{({\textbf {x}}_1, {\textbf {x}}_2, \ldots , {\textbf {x}}_k)\sim \mu ^{\otimes n}}\left[f_1({\textbf {x}}_1) f_2({\textbf {x}}_2)\cdots f_k({\textbf {x}}_k)\right]\right| \leqslant o_{d,\delta }(1),$$
where the right hand side $\to 0$ as the degree $d \to \infty$ and $\delta \to 0$?
In this paper, we answer this analytical question fully and in the affirmative for $k=3$. We also show the following two applications of the result.
1. The first application is related to hardness of approximation. Using the reduction from [5], we show that for every $3$-ary predicate $P:\Sigma ^3 \to \{0,1\}$ such that $P$ has no linear embedding, an SDP (semi-definite programming) integrality gap instance of a $P$-Constraint Satisfaction Problem (CSP) with gap $(1,s)$ can be translated into a dictatorship test with completeness $1$ and soundness $s+o(1)$, under certain additional conditions on the instance.
2. The second application is related to additive combinatorics. We show that if the distribution $\mu$ on $\Sigma ^3$ has no linear embedding, marginals of $\mu$ are uniform on $\Sigma$, and $(a,a,a)\in \mathsf{supp}(\mu )$ for every $a\in \Sigma$, then every large enough subset of $\Sigma ^n$ contains a triple $({\textbf {x}}_1, {\textbf {x}}_2,{\textbf {x}}_3)$ from $\mu ^{\otimes n}$ (and in fact a significant density of such triples).
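To make the linear-embedding condition concrete, here is a standard example of our own (not taken from the abstract) of a distribution that does have a linear embedding, and is therefore excluded by the theorem's hypothesis:

```latex
% Take $\Sigma = \mathbb{Z}_3$, $k = 3$, and let $\mu$ be uniform on the
% triples summing to zero:
\[
  \mathsf{supp}(\mu) \;=\; \bigl\{(a_1, a_2, a_3) \in \mathbb{Z}_3^3 \;:\;
    a_1 + a_2 + a_3 \equiv 0 \pmod 3\bigr\}.
\]
% Choosing $G = \mathbb{Z}_3$ and $\sigma_i = \mathrm{id}$ for $i = 1, 2, 3$
% gives non-constant mappings with $\sum_{i=1}^3 \sigma_i(a_i) = 0_G$ on all
% of $\mathsf{supp}(\mu)$, so $\mu$ has a linear embedding.
```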
Let $\pi$ be a probability distribution on $\mathbb{R}^d$ and $f$ a test function, and consider the problem of variance reduction in estimating $\mathbb{E}_\pi(f)$. We first construct a sequence of estimators for $\mathbb{E}_\pi (f)$, say $({1}/{k})\sum_{i=0}^{k-1} g_n(X_i)$, where the $X_i$ are samples from $\pi$ generated by the Metropolized Hamiltonian Monte Carlo algorithm and $g_n$ is the approximate solution of the Poisson equation obtained through the weak approximation scheme recently introduced by Mijatović and Vogrinc (2018). We then prove, under some regularity assumptions, that the estimation error variance $\sigma_\pi^2(g_n)$ can be made arbitrarily small as the approximation order parameter $n\rightarrow\infty$. To illustrate, we confirm that the assumptions are satisfied by two typical concrete models: a Bayesian linear inverse problem and a two-component mixture of Gaussian distributions.
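A toy illustration of why solving the Poisson equation reduces estimator variance, in the simplest possible setting (a two-state chain of our own devising, not the paper's HMC setup): if $g$ solves $g - Pg = f - \pi(f)$, then $f(x) + (Pg)(x) - g(x)$ equals $\pi(f)$ at every state, so the corrected estimator has zero variance.

```python
# Two-state Markov chain: exact Poisson-equation correction kills variance.
P = [[0.9, 0.1],
     [0.2, 0.8]]                  # transition matrix of the chain
pi = [2 / 3, 1 / 3]               # its stationary distribution
f = [1.0, 4.0]                    # test function
pi_f = sum(pi[x] * f[x] for x in range(2))   # target E_pi(f)

# Solve the Poisson equation g - Pg = f - pi(f), fixing g(0) = 0:
# the x = 0 row gives -P[0][1] * g1 = f[0] - pi_f.
g = [0.0, -(f[0] - pi_f) / P[0][1]]

# Corrected "estimator" f(x) + (Pg)(x) - g(x): constant in x.
corrected = [f[x] + sum(P[x][y] * g[y] for y in range(2)) - g[x]
             for x in range(2)]
print(corrected)   # both entries equal pi(f)
```

The paper's $g_n$ plays the role of an increasingly accurate numerical stand-in for this exact solution, so the residual variance shrinks as $n$ grows.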
Asymptotic properties of random graph sequences, like the occurrence of a giant component or full connectivity in Erdős–Rényi graphs, are usually derived with very specific choices for the defining parameters. The question arises as to what extent those parameter choices may be perturbed without losing the asymptotic property. For two sequences of graph distributions, asymptotic equivalence (convergence in total variation) and contiguity have been considered by Janson (2010) and others; here we use so-called remote contiguity to show that connectivity properties are preserved in more heavily perturbed Erdős–Rényi graphs. The techniques we demonstrate here with random graphs also extend to general asymptotic properties, e.g. in more complex large-graph limits, scaling limits, large-sample limits, etc.
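A quick Monte Carlo illustration of the kind of sharp parameter choice at stake (our sketch, not the paper's construction): the classical connectivity threshold $p = \log n / n$. Sampling $G(n, c\log n / n)$ for $c$ well below and well above $1$ shows full connectivity flipping from rare to typical.

```python
import math
import random

def is_connected(n, p, rng):
    """Sample G(n, p) and check connectivity with a union-find."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    components = n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    components -= 1
    return components == 1

rng = random.Random(1)
n, trials = 200, 30
results = {}
for c in (0.5, 2.0):
    p = c * math.log(n) / n
    results[c] = sum(is_connected(n, p, rng) for _ in range(trials))
    print(f"c = {c}: connected in {results[c]}/{trials} samples")
```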
This article studies the identification of complete economic models with testable assumptions. We start with a local average treatment effect ($LATE$) model where the “No Defiers,” the independent IV assumption, and the exclusion restrictions can be jointly refuted by some data distributions. We propose two relaxed assumptions that are not refutable, with one assumption focusing on relaxing the “No Defiers” assumption while the other relaxes the independent IV assumption. The identified set of $LATE$ under either of the two relaxed assumptions coincides with the classical $LATE$ Wald ratio expression whenever the original assumption is not refuted by the observed data distribution. We propose an estimator for the identified $LATE$ and derive the estimator’s limit distribution. We then develop a general method to relax a refutable assumption $A$. This relaxation method requires finding a function that measures the deviation of an econometric structure from the original assumption $A$, and a relaxed assumption $\tilde {A}$ is constructed using this measure of deviation. We characterize a condition ensuring that the identified sets under $\tilde {A}$ and $A$ coincide whenever $A$ is not refuted by the observed data distribution, and discuss the criteria for choosing among different relaxed assumptions.
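The classical Wald ratio the identified set collapses to can be computed in a few lines; the data below are an illustrative toy sample of ours, not from the article.

```python
# LATE Wald ratio: (E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0]).
def wald_ratio(z, d, y):
    """z: binary instrument, d: binary treatment, y: outcome (parallel lists)."""
    def mean_given(values, arm):
        sel = [v for v, zi in zip(values, z) if zi == arm]
        return sum(sel) / len(sel)
    itt_y = mean_given(y, 1) - mean_given(y, 0)   # reduced form
    itt_d = mean_given(d, 1) - mean_given(d, 0)   # first stage
    return itt_y / itt_d

# Toy sample: instrument raises take-up from 25% to 75%; treatment adds ~2 to y.
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 0, 0, 1, 1, 1, 1, 0]
y = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0, 1.0]
print(wald_ratio(z, d, y))   # (2.5 - 1.5) / (0.75 - 0.25) = 2.0
```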
Despite the appeal of screening travellers to prevent case importation during infectious disease outbreaks, evidence shows that symptom screening is largely ineffective in delaying the geographical spread of infection. Molecular tests offer high sensitivity and specificity and can detect infections earlier than symptom screening, suggesting potential for improved outcomes. However, they were used to screen travellers for COVID-19 with mixed success. To investigate molecular screening’s role in controlling COVID-19, and to quantify the effectiveness of screening for future pathogens of concern, we developed a probabilistic model that incorporates within-host viral kinetics. We then evaluated the potential effectiveness of screening travellers for influenza A, SARS-CoV-1, SARS-CoV-2, and Ebola virus. Even under highly optimistic assumptions, we found that the inability to detect recent infections always limits the effectiveness of traveller screening. We quantify this fundamental limit by proposing an estimator for the fraction of transmission that is preventable by screening. We also demonstrate that estimates of ascertainment overestimate reductions in transmission. These results highlight the essential role that quarantine and repeated testing play in infectious disease containment. Furthermore, our findings indicate that improving screening effectiveness requires the ability to detect infection much earlier than current state-of-the-art molecular tests.
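A toy numerical sketch of the estimator's logic (all numbers below are illustrative assumptions, not the paper's fitted kinetics): the fraction of transmission preventable by departure screening weights the detection probability at each time since infection by the transmission that still lies ahead. Recently infected travellers carry the most future transmission yet are the least detectable, which is exactly the fundamental limit the abstract describes.

```python
# time since infection (days) -> P(test detects) and remaining transmission;
# illustrative values only.
detect = {0: 0.0, 1: 0.0, 2: 0.3, 3: 0.8, 4: 0.9, 5: 0.9}
remaining_transmission = {0: 1.0, 1: 1.0, 2: 0.9, 3: 0.6, 4: 0.3, 5: 0.1}

days = sorted(detect)
w = 1.0 / len(days)   # travellers assumed uniform over time since infection

preventable = sum(w * detect[t] * remaining_transmission[t] for t in days)
total_ahead = sum(w * remaining_transmission[t] for t in days)
print(f"preventable fraction of onward transmission: {preventable / total_ahead:.2f}")
```

Even with a near-perfect test from day 3 onward, the undetectable early days cap the preventable fraction well below one in this toy model.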
Anonymous online surveys using financial incentives are an essential tool for understanding sexual networks and risk factors, including attitudes, sexual behaviors, and practices. However, these surveys are vulnerable to bots attempting to exploit the incentive. We deployed an in-person, limited-audience survey via QR code at select locations in North Carolina to assess geolocation application use among men who have sex with men and to characterize the role of app usage in infection risk and behavior. The survey was unexpectedly posted on a social media platform and went viral. Descriptive statistics were computed on repeat responses, free-text length, and demographic consistency. Between August 2022 and March 2023, we received 4,709 responses. Only 13 responses were recorded over a 6-month period until a sharp spike occurred: over 500 responses were recorded in a single hour and over 2,000 in a single day. Although free-text responses were often remarkably sophisticated, many multiple-choice responses were internally inconsistent. To protect data quality, all online surveys must incorporate defensive techniques such as response time validation, logic checks, and IP screening. With the rise of large language models, bot attacks with sophisticated responses to open-ended questions pose a growing threat to the integrity of research studies.
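A sketch of the defensive techniques named above (response-time validation, logic checks, IP screening); the field names and thresholds are hypothetical, not the study's actual instrument.

```python
MIN_SECONDS = 30       # implausibly fast completions are flagged
MAX_PER_IP = 3         # cap on submissions from one address

def flag_suspicious(resp, ip_counts):
    """Return the reasons (possibly none) a survey response looks bot-generated."""
    reasons = []
    if resp["duration_sec"] < MIN_SECONDS:
        reasons.append("response time too fast")
    if resp["years_sexually_active"] > resp["age"]:   # simple logic check
        reasons.append("logic check failed: activity years exceed age")
    ip_counts[resp["ip"]] = ip_counts.get(resp["ip"], 0) + 1
    if ip_counts[resp["ip"]] > MAX_PER_IP:
        reasons.append("too many submissions from one IP")
    return reasons

ip_counts = {}
human = {"duration_sec": 240, "age": 34, "years_sexually_active": 15, "ip": "a"}
bot = {"duration_sec": 8, "age": 21, "years_sexually_active": 30, "ip": "b"}
print(flag_suspicious(human, ip_counts))   # no flags
print(flag_suspicious(bot, ip_counts))     # two flags
```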
The rapid evolution of SARS-CoV-2 has led to the emergence of variants of concern (VOCs) characterized by increased transmissibility, pathogenicity, and resistance to neutralizing antibodies. Identifying these variants is essential for guiding public health efforts to control COVID-19. Although whole genome sequencing (WGS) is the gold standard for variant identification, its implementation is often limited in developing countries due to resource constraints. In Bolivia, genomic surveillance is a challenge due to its limited technological infrastructure and resources. An RT-qPCR-based strategy was designed to address these limitations and detect the mutations associated with VOCs and variants of interest (VOIs). The multiplex RT-qPCR commercial kits Allplex™ Master and Variants I (Seegene®) and the ValuPanel™ (Biosearch®) were used to target mutations such as HV69/70del, E484K, N501Y, P681H, and K417N/T, which are characteristic of the Alpha (B.1.1.7), Beta (B.1.351), Gamma (P.1), Omicron (B.1.1.529), Mu (B.1.621), and Zeta (P.2) variants. A total of 157 samples collected in Cochabamba from January to November 2021 were evaluated, identifying 44 Gamma, 2 Zeta, 20 Mu, and 10 Omicron cases. The strategy’s effectiveness was validated against WGS data generated with Oxford Nanopore™ technology, showing a concordance rate of 0.96. This highlights the value of the RT-qPCR strategy in guiding the selection of samples for WGS, enabling broader detection of new variants that cannot be identified by RT-qPCR alone.
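A simplified sketch of the interpretation step: matching the mutation calls from the multiplex RT-qPCR panels to candidate variants. The profiles below are a coarse illustration based on the mutations listed in the abstract, not the kits' validated interpretation rules.

```python
# Coarse mutation profiles (illustrative, not a validated ruleset).
VARIANT_PROFILES = {
    "Alpha (B.1.1.7)":     {"HV69/70del", "N501Y", "P681H"},
    "Gamma (P.1)":         {"E484K", "N501Y", "K417N/T"},
    "Mu (B.1.621)":        {"E484K", "N501Y", "P681H"},
    "Omicron (B.1.1.529)": {"HV69/70del", "N501Y", "P681H", "K417N/T"},
}

def candidate_variants(detected):
    """Variants whose full mutation profile is contained in the detected calls."""
    return sorted(name for name, profile in VARIANT_PROFILES.items()
                  if profile <= detected)

print(candidate_variants({"E484K", "N501Y", "K417N/T"}))   # Gamma only
```

In practice ambiguous or multi-hit profiles are exactly the samples worth prioritizing for WGS, which is the triage role the abstract highlights.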
Poor socket fit is the leading cause of prosthetic limb discomfort. However, currently clinicians have limited objective data to support and improve socket design. Finite element analysis predictions might help improve the fit, but this requires internal and external anatomy models. While external 3D surface scans are often collected in routine clinical computer-aided design practice, detailed internal anatomy imaging (e.g., MRI or CT) is not. We present a prototype statistical shape model (SSM) describing the transtibial amputated residual limb, generated using a sparse dataset of 33 MRI and CT scans. To describe the maximal shape variance, training scans are size-normalized to their estimated intact tibia length. A mean limb is calculated and principal component analysis used to extract the principal modes of shape variation. In an illustrative use case, the model is interrogated to predict internal bone shapes given a skin surface shape. The model attributes ~52% of shape variance to amputation height and ~17% to slender-bulbous soft tissue profile. In cross-validation, left-out shapes influenced the mean by 0.14–0.88 mm root mean square error (RMSE) surface deviation (median 0.42 mm), and left-out shapes were recreated with 1.82–5.75 mm RMSE (median 3.40 mm). Linear regression between mode scores from skin-only- and full-model SSMs allowed prediction of bone shapes from the skin with 3.56–10.9 mm RMSE (median 6.66 mm). The model showed the feasibility of predicting bone shapes from surface scans, which addresses a key barrier to implementing simulation within clinical practice, and enables more representative prosthetic biomechanics research.
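A minimal sketch of the SSM pipeline described above: size-normalize the training shapes, compute the mean shape, and extract the first principal mode by power iteration on the covariance matrix. The "shapes" here are tiny illustrative vectors of ours, not residual-limb scans.

```python
import math

def normalize(shape, scale):
    """Size-normalize a shape vector by its estimated scale (cf. tibia length)."""
    return [x / scale for x in shape]

training = [normalize(s, sc) for s, sc in [
    ([2.0, 4.0, 6.0], 2.0),
    ([3.0, 6.0, 9.0], 3.0),
    ([1.2, 1.8, 3.0], 1.0),
    ([0.8, 2.2, 3.0], 1.0),
]]
n, d = len(training), len(training[0])
mean = [sum(s[i] for s in training) / n for i in range(d)]
centered = [[s[i] - mean[i] for i in range(d)] for s in training]

# Sample covariance, then its leading eigenvector (first mode of variation)
# by power iteration.
cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
       for i in range(d)]
v = [1.0, 0.0, 0.0]   # start vector not orthogonal to the leading mode
for _ in range(100):
    w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in w))
    v = [x / norm for x in w]

print("mean shape:", [round(x, 3) for x in mean])
print("first mode:", [round(x, 3) for x in v])
```

New shapes are then described by a handful of mode scores, which is what makes the skin-to-bone regression in the abstract possible.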
Limited studies on the seasonality of pharyngitis and tonsillitis suggest subtle but unexplained fluctuations in case numbers that deviate from patterns seen in other respiratory diagnoses. Data on weekly acute respiratory infection diagnoses from 2010–2022, provided by the Polish National Healthcare Fund, included a total of 360 million visits. Daily mean temperature and relative humidity were sourced from the Copernicus Climate Data Store. The seasonal pattern was estimated using STL decomposition, while the impact of temperature was assessed with a SARIMAX model. A recurring early-summer wave of an unspecified pathogen causing pharyngitis and tonsillitis was identified. The strongest pattern was observed in children under 10, though other age groups also showed somewhat elevated case numbers. The reproductive number of the pathogen is modulated by warmer temperatures; however, summer holidays and pandemic restrictions interrupt its spread. The infection wave is relatively flat, suggesting either genuinely slow spread or multiple waves of related pathogens. The symptomatic data unambiguously demonstrate the existence of pathogens with quite distinct characteristics. Given its consistent year-to-year pattern, identifying these potential pathogens could improve treatment, including antibiotic therapy.
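A toy seasonal decomposition in the spirit of STL (a crude average-by-season stand-in operating on synthetic data, not the National Healthcare Fund series): subtract a moving-average trend, then average the detrended values by week-of-year to expose a recurring seasonal wave.

```python
import math

PERIOD = 52
weeks = range(PERIOD * 4)   # four synthetic "years" of weekly counts
series = [100 + 0.1 * t + 20 * math.sin(2 * math.pi * (t % PERIOD) / PERIOD)
          for t in weeks]

# Centred moving-average trend over one full period.
half = PERIOD // 2
trend = {t: sum(series[t - half:t + half]) / PERIOD
         for t in range(half, len(series) - half)}

# Seasonal component: mean detrended value for each week-of-year.
buckets = {w: [] for w in range(PERIOD)}
for t, tr in trend.items():
    buckets[t % PERIOD].append(series[t] - tr)
seasonal = {w: sum(vals) / len(vals) for w, vals in buckets.items() if vals}

peak_week = max(seasonal, key=seasonal.get)
print("peak seasonal week:", peak_week)
```

STL refines this idea with loess smoothing and robustness iterations, and SARIMAX then regresses the residual dynamics on covariates such as temperature.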
The chapter is fully dedicated to the theory of large deviations. To carry out the proof of the main theorem and the actual computation of various large-deviation distributions, a detailed appendix is dedicated to the saddle-point method, used to compute certain fundamental integrals that recur in the theory. Legendre transforms stem naturally from large deviations theory, and we discuss their properties “in-line” for non-experts.
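For a concrete instance of the Legendre-transform machinery (a standard textbook computation, added for orientation rather than taken from the chapter):

```latex
% Cramér rate function of a Gaussian via the Legendre transform.
% Cumulant generating function of $X \sim \mathcal{N}(\mu, \sigma^2)$:
\[
  \lambda(t) \;=\; \ln \mathbb{E}\!\left[e^{tX}\right]
            \;=\; \mu t + \tfrac{1}{2}\sigma^2 t^2 .
\]
% Its Legendre transform, the large deviations rate function:
\[
  I(x) \;=\; \sup_{t\in\mathbb{R}} \bigl\{ t x - \lambda(t) \bigr\}
       \;=\; \frac{(x-\mu)^2}{2\sigma^2},
\]
% attained at $t^* = (x-\mu)/\sigma^2$, so the empirical mean of $n$ i.i.d.
% copies satisfies $\,\mathbb{P}(\bar{X}_n \approx x) \sim e^{-n I(x)}$.
```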
This is a rich chapter in which we delve into the study of the (weak and strong) laws of large numbers and of the central limit theorem. The latter is first considered for sums of independent stochastic variables whose distributions have a finite variance, and then for variables with diverging variance. Several appendices report on both basic mathematical tools and lengthy details of computation. Among the former, the rules of variable change in probability are presented, Fourier and Laplace transforms are introduced along with their role as generating functions of moments and cumulants, and the different kinds of convergence of stochastic functions are considered and exemplified.
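As a one-line pointer to the characteristic-function route to the central limit theorem (a standard computation, added here for orientation):

```latex
% For i.i.d. $X_i$ with mean $0$ and variance $1$, and
% $S_n = (X_1 + \cdots + X_n)/\sqrt{n}$:
\[
  \varphi_{S_n}(t)
    \;=\; \left[\varphi_X\!\left(\tfrac{t}{\sqrt{n}}\right)\right]^{n}
    \;=\; \left[1 - \frac{t^2}{2n} + o\!\left(\frac{1}{n}\right)\right]^{n}
    \;\xrightarrow[n\to\infty]{}\; e^{-t^2/2},
\]
% the characteristic function of the standard Gaussian.
```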
The analysis of experimental data with several degrees of freedom is reported, starting from the Gaussian case and building on the least-squares method, whose theory is detailed at the end of the chapter for both independent and correlated data. The multi-dimensional versions of the reweighting method for data with unknown distribution, and of the bootstrap and jackknife resampling methods, are presented. How possible correlations in multivariate data affect these methods is discussed and dealt with.
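A minimal one-dimensional sketch of the jackknife mentioned above (the data values are illustrative, not from any experiment discussed in the text): leave one point out at a time, re-apply the estimator, and read the spread of the replicates as a standard error.

```python
def jackknife(data, estimator):
    """Leave-one-out replicates; returns jackknife estimate and standard error."""
    n = len(data)
    reps = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    mean_rep = sum(reps) / n
    var = (n - 1) / n * sum((r - mean_rep) ** 2 for r in reps)
    return mean_rep, var ** 0.5

data = [2.1, 2.5, 1.9, 2.4, 2.6, 2.0]
mean = lambda xs: sum(xs) / len(xs)
est, se = jackknife(data, mean)
print(est, se)   # for the mean, the jackknife SE matches the classical s/sqrt(n)
```

The multi-dimensional and correlated-data versions discussed in the chapter generalize exactly this leave-one-out replicate construction.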
The foundations of modern probability theory are briefly presented and discussed for both discrete and continuous stochastic variables. Without daring to give a rigorous mathematical construction – but citing different extremely well-written handbooks on the matter – the axiomatic theory of Kolmogorov and the concepts of joint, conditional, and marginal probability are introduced, along with the operations of union and intersection of generic random events. Eventually, Bayes’ formula is put forward with some examples. This will be the cornerstone of the statistical inference methods reported in Chapters 5 and 6.
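A worked Bayes'-formula example in the spirit of the chapter (the numbers are illustrative): the posterior probability of a rare condition given a positive test, where the false positives dominate despite a sensitive test.

```python
# Bayes' formula: P(D | +) = P(+ | D) P(D) / P(+).
prior = 0.01          # P(D): prevalence of the condition
sensitivity = 0.95    # P(+ | D)
false_pos = 0.05      # P(+ | not D)

evidence = sensitivity * prior + false_pos * (1 - prior)   # P(+)
posterior = sensitivity * prior / evidence                 # P(D | +)
print(round(posterior, 3))   # about 0.16: most positives are false positives
```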
Our last chapter is devoted to entropy. This gives us the excuse to first present Shannon’s information theory, including the derivation of his entropy and the statements and proofs of the source coding theorem and of the noisy-channel coding theorem. Then we consider dynamical systems and the production of entropy in chaotic systems, termed Kolmogorov–Sinai entropy. For non-experts, or readers who require a memory jog, we make a short recap of statistical mechanics – just enough to tie up some knots left untied in Chapter 4, when we developed large deviations theory for independent variables. Here we generalize to correlated variables and make one application to statistical mechanics. In particular, we find out that entropy is, apart from constants, a large deviations function. We end with a lightning-fast introduction to configurational entropy in disordered complex systems. Just to give a tiny glimpse of … what we do for a living!
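Shannon's entropy, the chapter's starting point, is a one-liner; the probabilities below are illustrative.

```python
import math

def shannon_entropy(probs):
    """H(p) = -sum_i p_i log2 p_i, in bits (terms with p_i = 0 contribute 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))   # a fair coin carries exactly 1 bit
print(shannon_entropy([0.9, 0.1]))   # a biased coin carries less
```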
Here we face the analysis of another kind of memoryless discrete process: branching processes, otherwise termed “chain reactions” under a more physical inspiration. Before that, we carefully deepen and generalize our knowledge of the very useful tool of generating functions. This will soon be applied to the study of the dynamics of a population, predicting whether it will certainly go extinct – and how fast – or whether it can be self-sustaining.
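The extinction prediction reduces to a fixed-point computation on the offspring generating function: the extinction probability of a Galton–Watson process is the smallest root of $G(s) = s$, found by iterating $s \leftarrow G(s)$ from $0$. The offspring law below is an illustrative choice of ours.

```python
def pgf(s, p):
    """Offspring probability generating function G(s) = sum_k p_k s^k."""
    return sum(pk * s**k for k, pk in enumerate(p))

p = [0.2, 0.3, 0.5]   # P(0 children)=0.2, P(1)=0.3, P(2)=0.5; mean 1.3 > 1
s = 0.0
for _ in range(200):
    s = pgf(s, p)     # converges to the smallest fixed point of G
print(round(s, 4))    # extinction probability
```

Here $G(s) = s$ gives $0.5s^2 - 0.7s + 0.2 = 0$, with roots $0.4$ and $1$; the supercritical mean ($1.3 > 1$) is why the extinction probability is the smaller root, $0.4$, rather than certain extinction.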
In this chapter we study the first example of a correlated memoryless phenomenon: the famous “drunkard’s walk”, formally termed the random walk. We begin from a very simple case, in a homogeneous and isotropic space on a discrete hypercubic lattice. Then we add traps here and there. Eventually we make a foray into the continuous regime, with the Fokker–Planck diffusion equation (which, we see, is what physicists call a Schrödinger equation in imaginary time), and the stochastic differential Langevin equation.
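A quick Monte Carlo check of the diffusive behaviour at the heart of the chapter (our sketch, with arbitrary sample sizes): for the simple symmetric walk on $\mathbb{Z}$, the mean squared displacement grows linearly with the number of steps.

```python
import random

def mean_squared_displacement(steps, walkers, rng):
    """Average x^2 over independent simple symmetric random walks."""
    total = 0
    for _ in range(walkers):
        x = 0
        for _ in range(steps):
            x += rng.choice((-1, 1))
        total += x * x
    return total / walkers

rng = random.Random(42)
msd = {steps: mean_squared_displacement(steps, 2000, rng)
       for steps in (100, 400)}
print(msd)   # each value is close to its key: <x^2> ~ t, the diffusive law
```

This is the discrete shadow of the Fokker–Planck description: the variance of the diffusing density grows as $2Dt$.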
Here we return to Markov processes with discrete states, but this time in continuous time. We first consider, study, and solve specific examples such as Poisson processes, divergent birth processes, and birth-and-death processes. We derive the master equations for their probability distributions and discuss important solutions. In particular, we deepen Feller’s theory of divergent birth processes. In the end we formally study the general case of Markov processes in the stationary case, writing down the forward and backward Kolmogorov master equations.
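The Poisson process that opens the chapter can be simulated the standard way, by drawing i.i.d. exponential waiting times between events; the rate and horizon below are arbitrary illustrative choices. The event count in $[0, T]$ should average $\lambda T$, as the master equation predicts.

```python
import random

def poisson_counts(rate, horizon, runs, rng):
    """Counts of events in [0, horizon] for repeated Poisson-process runs."""
    counts = []
    for _ in range(runs):
        t, n = 0.0, 0
        while True:
            t += rng.expovariate(rate)   # exponential inter-event gap
            if t > horizon:
                break
            n += 1
        counts.append(n)
    return counts

rng = random.Random(7)
counts = poisson_counts(2.0, 10.0, 5000, rng)
mean_count = sum(counts) / len(counts)
print(round(mean_count, 2))   # close to rate * horizon = 20
```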