This paper deals with the multivariate tail conditional expectation (MTCE) for generalized skew-elliptical distributions. We present the tail conditional expectation for univariate generalized skew-elliptical distributions and then extend it to the MTCE for the multivariate case. The family of generalized skew-elliptical distributions covers many important special cases, such as the generalized skew-normal, generalized skew Student-t, generalized skew-logistic and generalized skew-Laplace distributions.
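For reference, the univariate tail conditional expectation underlying the multivariate extension is the standard risk measure, and the MTCE conditions on all components jointly exceeding their marginal value-at-risk levels (textbook definitions, not the paper's own derivations):

```latex
\[
\mathrm{TCE}_q(X) \;=\; \mathbb{E}\!\left[\,X \mid X > \mathrm{VaR}_q(X)\,\right],
\qquad
\mathrm{VaR}_q(X) \;=\; \inf\{x : F_X(x) \ge q\},
\]
\[
\mathrm{MTCE}_{\mathbf q}(\mathbf X) \;=\;
\mathbb{E}\!\left[\,\mathbf X \mid X_1 > \mathrm{VaR}_{q_1}(X_1),\,\dots,\,
X_n > \mathrm{VaR}_{q_n}(X_n)\,\right],
\quad \mathbf X = (X_1,\dots,X_n)^{\top}.
\]
```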
As characterization and modeling of complex materials by phenomenological models remains challenging, data-driven computing that performs physical simulations directly from material data has attracted considerable attention. Data-driven computing is a general computational mechanics framework that consists of a physical solver and a material solver, based on which data-driven solutions are obtained through minimization procedures. This work develops a new material solver built upon the local convexity-preserving reconstruction scheme of He and Chen (2020, "A physics-constrained data-driven approach based on locally convex reconstruction for noisy database", Computer Methods in Applied Mechanics and Engineering 363, 112791) to model anisotropic nonlinear elastic solids. In this approach, a two-level local data search algorithm for material anisotropy is introduced into the material solver in online data-driven computing. A material anisotropic state characterizing the underlying material orientation is used for the manifold learning projection in the material solver. The performance of the proposed data-driven framework with noiseless and noisy material data is validated by solving two benchmark problems with synthetic material data. The data-driven solutions are compared with the constitutive model-based reference solutions to demonstrate the effectiveness of the proposed methods.
Anomaly detection in asset condition data is critical for reliable industrial asset operations, but statistical anomaly classifiers require a certain amount of normal-operation training data before acceptable accuracy can be achieved. The necessary training data are often not available in the early periods of asset operation. This paper addresses the problem with a hierarchical model for the asset fleet that systematically identifies similar assets and enables collaborative learning within clusters of similar assets. The general behavior of similar assets is represented by higher-level models, from which the parameters describing individual asset operations are sampled. Hierarchical models enable the individuals in a population comprising statistically coherent subpopulations to learn collaboratively from one another. Results obtained with the hierarchical model show a marked improvement in anomaly detection for assets with little data, compared with independent modeling or a single model common to the entire fleet.
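The collaborative-learning idea can be illustrated with a minimal empirical-Bayes shrinkage sketch (an illustrative stand-in of ours, not the paper's actual hierarchical model): each asset's parameter estimate is pulled toward its cluster mean, and assets with little data are pulled hardest.

```python
def shrink_estimates(asset_means, asset_counts, tau=5.0):
    """Shrink per-asset sample means toward their cluster mean.

    Assets with few observations (small counts) borrow more strength
    from the cluster; tau controls how strongly the cluster-level
    prior pulls. Returns the shrunk estimates and the cluster mean.
    """
    cluster_mean = sum(asset_means) / len(asset_means)
    shrunk = []
    for m, n in zip(asset_means, asset_counts):
        w = n / (n + tau)  # weight on the asset's own data
        shrunk.append(w * m + (1 - w) * cluster_mean)
    return shrunk, cluster_mean
```

With `tau=5.0`, an asset seen only twice keeps just 2/7 of its own estimate, while an asset with 50 observations is barely moved; this is the sense in which data-poor assets "collaborate" with their cluster.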
This paper presents the development process of a digital twin of a unique hydroponic underground farm in London, Growing Underground (GU). Growing 12x more per unit area than traditional greenhouse farming in the UK, the farm also consumes 4x more energy per unit area. Key to the ongoing operational success of this farm and similar enterprises is finding ways to minimize the energy use while maximizing crop growth by maintaining optimal growing conditions. As such, it belongs to the class of Controlled Environment Agriculture, where indoor environments are carefully controlled to maximize crop growth by using artificial lighting and smart heating, ventilation, and air conditioning systems. We tracked changing environmental conditions and crop growth across 89 different variables, through a wireless sensor network and unstructured manual records, and combined all the data into a database. We show how the digital twin can provide enhanced outputs for a bespoke site like GU, by creating inferred data fields, and show the limitations of data collection in a commercial environment. For example, we find that lighting is the dominant environmental factor for temperature and thus crop growth in this farm, and that the effects of external temperature and ventilation are confounded. We combine information learned from historical data interpretation to create a bespoke temperature forecasting model (root mean squared error < 1.3°C), using a dynamic linear model with a data-centric lighting component. Finally, we present how the forecasting model can be integrated into the digital twin to provide feedback to the farmers for decision-making assistance.
Using monthly data from the 2013–2016 Ebola outbreak in West Africa, we compared two calibrations for data fitting: least squares (SSE) and weighted least squares (SWSE), with weights reciprocal to the number of new infections. To compare (in hindsight) forecasts of the final disease size (the actual value was observed at month 28 of the outbreak), we fitted Bertalanffy–Pütter growth models to truncated initial data (the first 11, 12, …, 28 months). The growth curves identified the epidemic peak at month 10, and the relative errors of the forecasts (asymptotic limits) were below 10% if 16 or more months were used; for SWSE the relative errors were smaller than for SSE. However, the calibrations differed insofar as for SWSE there were good-fitting models that forecasted reasonable upper and lower bounds, while SSE was biased: the forecasts of good-fitting models systematically underestimated the final disease size. Furthermore, for SSE the normality hypothesis for the fit residuals was refuted, while for SWSE it was not. We therefore recommend considering SWSE for epidemic forecasts.
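The two calibrations differ only in the objective being minimized. A minimal sketch of the two objectives (with a logistic curve standing in for the Bertalanffy–Pütter family, and synthetic data in the usage check below; both are our illustrative assumptions):

```python
import math

def logistic(t, K, r, t0):
    """Simple sigmoidal growth curve used here as a stand-in model."""
    return K / (1.0 + math.exp(-r * (t - t0)))

def sse(params, months, cumulative):
    """Ordinary least squares: every month weighted equally."""
    K, r, t0 = params
    return sum((c - logistic(t, K, r, t0)) ** 2
               for t, c in zip(months, cumulative))

def swse(params, months, cumulative, new_cases):
    """Weighted least squares with weights 1 / (new infections),
    down-weighting the high-incidence months around the peak."""
    K, r, t0 = params
    return sum((c - logistic(t, K, r, t0)) ** 2 / max(n, 1)
               for t, c, n in zip(months, cumulative, new_cases))
```

Either objective can then be handed to a generic numerical minimizer (e.g., `scipy.optimize.minimize`) to calibrate the growth-curve parameters.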
Previous studies have revealed associations of meteorological factors with tuberculosis (TB) cases. However, few studies have examined their lag effects on TB cases. This study aimed to analyse the nonlinear lag effects of meteorological factors on the number of TB notifications in Hong Kong. Using 22 years of consecutive surveillance data in Hong Kong, we examined the association of monthly average temperature and relative humidity with the temporal dynamics of the monthly number of TB notifications using a distributed lag nonlinear model combined with Poisson regression. The relative risks (RRs) of TB notifications were >1.15 when monthly average temperatures were between 16.3 and 17.3 °C at lags of 13–15 months, reaching the peak risk of 1.18 (95% confidence interval (CI) 1.02–1.35) at 16.8 °C at a lag of 14 months. The RRs of TB notifications were >1.05 as relative humidities of 60.0–63.6% at lags of 9–11 months expanded to 68.0–71.0% at lags of 12–17 months, reaching the highest risk of 1.06 (95% CI 1.01–1.11) at 69.0% at a lag of 13 months. Nonlinear and delayed effects of average temperature and relative humidity on the TB epidemic were identified, which may provide a practical reference for improving the TB warning system.
Although testing is widely regarded as critical to fighting the COVID-19 pandemic, what measure and level of testing best reflect successful infection control remains unresolved. Our aim was to compare the sensitivity of two testing metrics – population testing number and testing coverage – to population mortality outcomes and to identify a benchmark for testing adequacy. We aggregated publicly available data through 12 April on testing and outcomes related to COVID-19 across 36 OECD (Organisation for Economic Co-operation and Development) countries and Taiwan. Spearman correlation coefficients were calculated between the aforementioned metrics and the following outcome measures: deaths per 1 million people, case fatality rate and the case proportion of critical illness. Fractional polynomials were used to generate scatter plots to model the relationship between the testing metrics and outcomes. We found that testing coverage, but not population testing number, was highly correlated with population mortality (rs = −0.79, P = 5.975 × 10⁻⁹ vs. rs = −0.3, P = 0.05) and case fatality rate (rs = −0.67, P = 9.067 × 10⁻⁶ vs. rs = −0.21, P = 0.20). A testing coverage threshold of 15–45 signified adequate testing: below 15, testing coverage was associated with exponentially increasing population mortality; above 45, increased testing did not yield a significant incremental mortality benefit. Taken together, testing coverage was better than population testing number in explaining country performance and can serve as an early and sensitive indicator of testing adequacy and disease burden.
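Spearman's rank correlation used above is simply the Pearson correlation applied to rank vectors; a self-contained sketch (the toy numbers in the usage check are ours, not the study's data):

```python
def rank(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / (vx * vy)
```

Because only ranks matter, any strictly decreasing relationship (such as higher testing coverage, lower mortality) yields rho = −1 regardless of the shape of the curve.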
Amplifying testing capacity and making better use of testing resources are crucial measures when fighting any pandemic. A pooled testing strategy for SARS-CoV-2 has theoretically been shown to increase the testing capacity of a country, especially when applied in low-prevalence settings. Experimental studies have shown that the sensitivity of reverse transcription-polymerase chain reaction is not affected when implemented in small groups. Previous models estimated the optimum group size as a function of the historical prevalence; however, this assumes a homogeneous distribution of the disease within the population. This study aimed to explore whether separating individuals by age group when pooling samples yields further savings on test kits, or affects the optimum group size estimation, compared with Dorfman's pooling based on historical prevalence. For this evaluation, the age groups of interest were defined as 0–19 years, 20–59 years and over 60 years old. Dorfman's pooling was generalised by adding statistical weights to the age groups based on the number of confirmed cases and tests performed in each segment. The findings showed that when samples are pooled by age group, fewer tests per subject are needed to diagnose one subject. Although this decrease is minuscule, it might yield considerable savings when applied at a large scale. In addition, the savings are considerably higher in settings with a high standard deviation among the positivity rates of the age segments of the general population.
Tuberculosis (TB) remains a global public health threat. Misdiagnosis and delayed therapy of sputum smear-negative TB can affect treatment outcomes and promote pathogen transmission. The application of the Xpert MTB/RIF assay in bronchoalveolar lavage fluid (BALF) has been recommended but needs clinical evidence. We carried out a prospective study at the Nanjing Public Health Medical Center from September 2018 to August 2019. Pulmonary tuberculosis (PTB) patients were enrolled if their sputum smears were negative. We compared the performance of the Xpert MTB/RIF assay in sputum and BALF using sputum culture as the reference. In addition, we applied parallel tests using sputum culture, sputum-based Xpert MTB/RIF and BALF-based Xpert MTB/RIF to jointly detect smear-negative PTB, using clinical diagnosis as the reference. With mycobacterial culture as the reference standard, Xpert MTB/RIF of BALF showed higher sensitivity (14/16, 87.5%) but relatively lower specificity (57/92, 62.0%), while Xpert MTB/RIF of sputum showed relatively lower sensitivity (6/10, 60.0%) and higher specificity (63/88, 71.6%). Compared with sputum culture, the Xpert MTB/RIF assay reduced the median detection time of MTB from 30 days to 0 days, significantly shortening the time to diagnosis for smear-negative TB patients. Among the combined detections, the positive detection proportion improved significantly compared with sputum culture alone, from 11.1% (10/90) to 46.7% (42/90) (P < 0.05). Our study showed that Xpert MTB/RIF in BALF performed better in detecting MTB in smear-negative patients.
With the rapid rise in the prevalence of non-tuberculous mycobacteria (NTM) diseases across the world, the microbiological diagnosis of NTM isolates is becoming increasingly important for the diagnosis and treatment of NTM disease. In this study, the clinical presentation, species distribution and drug susceptibility of patients with NTM disease visiting the Chongqing Public Health Medical Centre from March 2016 to April 2019 were retrospectively analysed. Among the 146 patients with NTM disease, eight NTM species (complexes) were identified. The predominant species were Mycobacterium abscessus complex (53, 36.3%), M. intracellulare (38, 26%) and M. fortuitum (17, 11.7%). In addition, two or more species were isolated from 7.5% of the patients. Pulmonary NTM disease (142, 97.3%) was the most prevalent presentation. It was observed that 40.1% of the patients with pulmonary NTM disease had chronic obstructive pulmonary disease and bronchiectasis, while 22.5% had prior tuberculosis. Male patients presented with cough and haemoptysis more often than female patients. In in vitro antimicrobial susceptibility testing, most of the species were susceptible to linezolid, amikacin and clarithromycin, while M. fortuitum exhibited low susceptibility to tobramycin. In conclusion, NTM disease, especially pulmonary NTM disease, is common in Southwest China. Species identification and drug susceptibility testing are thus extremely important to ensure appropriate treatment regimens for patient care and management.
The second edition of Statistics for the Social Sciences prepares students from a wide range of disciplines to interpret and learn the statistical methods critical to their field of study. By using the General Linear Model (GLM), the author builds a foundation that enables students to see how statistical methods are interrelated, allowing them to build on basic skills. The author makes statistics relevant to students' varying majors by using fascinating real-life examples from the social sciences. Students who use this edition will benefit from clear explanations, warnings against common erroneous beliefs about statistics, and the latest developments in the philosophy, reporting, and practice of statistics in the social sciences. The textbook is packed with helpful pedagogical features, including learning goals, guided practice, and reflection questions.
We propose, calibrate, and validate a crowdsourced approach for estimating power spectral density (PSD) of road roughness based on an inverse analysis of vertical acceleration measured by a smartphone mounted in an unknown position in a vehicle. Built upon random vibration analysis of a half-car mechanistic model of roughness-induced pavement–vehicle interaction, the inverse analysis employs an L2 norm regularization to estimate ride quality metrics, such as the widely used International Roughness Index, from the acceleration PSD. Invoking the fluctuation–dissipation theorem of statistical physics, the inverse framework estimates the half-car dynamic vehicle properties and related excess fuel consumption. The method is validated against (a) laser-measured road roughness data for both inner city and highway road conditions and (b) road roughness data for the state of California. We also show that the phone position in the vehicle only marginally affects road roughness predictions, an important condition for crowdsourced capabilities of the proposed approach.
The study of the distributions of sums of dependent risks is a key topic in actuarial sciences, risk management, reliability and in many branches of applied and theoretical probability. However, there are few results where the distribution of the sum of dependent random variables is available in a closed form. In this paper, we obtain several analytical expressions for the distribution of the aggregated risks under dependence in terms of copulas. We provide several representations based on the underlying copula and the marginal distribution functions under general hypotheses and in any dimension. Then, we study stochastic comparisons between sums of dependent risks. Finally, we illustrate our theoretical results by studying some specific models obtained from Clayton, Ali-Mikhail-Haq and Farlie-Gumbel-Morgenstern copulas. Extensions to more general copulas are also included. Bounds and the limiting behavior of the hazard rate function for the aggregated distribution of some copulas are studied as well.
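When no closed form is available, the aggregate distribution under a given copula can always be checked by simulation. A sketch using the standard Marshall–Olkin frailty sampler for the Clayton copula, with exponential marginals (the marginals, parameters, and sample size here are our illustrative choices, not the paper's models):

```python
import math
import random

def clayton_pair(theta, rng):
    """One (U1, U2) draw from a Clayton copula via the Marshall-Olkin
    algorithm: mix independent exponentials with a shared
    Gamma(1/theta, 1) frailty V, then apply the generator inverse
    psi(t) = (1 + t)^(-1/theta)."""
    v = rng.gammavariate(1.0 / theta, 1.0)
    e1, e2 = rng.expovariate(1.0), rng.expovariate(1.0)
    u1 = (1.0 + e1 / v) ** (-1.0 / theta)
    u2 = (1.0 + e2 / v) ** (-1.0 / theta)
    return u1, u2

def aggregate_cdf(s, theta=2.0, rate=1.0, n=50_000, seed=7):
    """Monte Carlo estimate of P(X1 + X2 <= s) for Exp(rate) marginals
    coupled by a Clayton copula with parameter theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        u1, u2 = clayton_pair(theta, rng)
        x1 = -math.log(1.0 - u1) / rate  # inverse-CDF transform
        x2 = -math.log(1.0 - u2) / rate
        if x1 + x2 <= s:
            hits += 1
    return hits / n
```

Such a simulation is useful for sanity-checking analytical expressions for the aggregated distribution, and for copulas outside the families treated in closed form.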
Large-scale population surveys have been an important source of data for the study of migration, and in many countries provide the only widely accessible data on migrants’ characteristics and outcomes after they arrive. For immigration policymakers, however, official survey data have some important limitations. Nonresponse to surveys is particularly likely to affect newly arrived migrants, biasing analysis toward more settled populations who have different characteristics (e.g., different fiscal costs), and hindering analysis of how integration outcomes evolve after arrival. Survey data are not well suited to capturing the dynamics of a mobile population, particularly among groups of migrants who spend substantial periods outside the country. And perhaps most importantly, official survey data usually identify migrants by country of birth and nationality (and sometimes self-reported reason for migration) but rarely include information on a person’s legal status either at arrival or at the time of data collection. This significantly limits the possibilities for evaluating policy and the impacts of policy changes: the characteristics of migrants coming for different reasons can vary enormously, so policymakers should be cautious about assuming that aggregate evidence on migrants or migration will be relevant to the specific routes on which they are taking decisions. This article illustrates some of these problems in practice, showing how official survey data in the United Kingdom have been unable to answer one of the key questions facing the government, namely how many and which EU citizens need to apply to secure their residence rights after Brexit.
Owing to limited data, we conducted a meta-analysis to re-evaluate the relationship between obesity and coronavirus-2019 (COVID-19). Literature published between 1 January 2020 and 22 August 2020 was comprehensively analysed, and RevMan3.5 was used for data analysis. A total of 50 studies, including data on 18 260 378 patients, were available. Obesity was associated with a higher risk of severe acute respiratory syndrome-coronavirus 2 (SARS-CoV2) infection (odds ratio (OR): 1.39, 95% confidence interval (CI) 1.25–1.54; P < 0.00001) and increased severity of COVID-19 (hospitalisation rate: OR: 2.45, 95% CI 1.78–3.39; P < 0.00001; severe cases: OR: 3.74, 95% CI 1.18–11.87; P: 0.02; need for intensive care unit admission: OR: 1.30, 95% CI 1.21–1.40; P < 0.00001; need for invasive mechanical ventilation: OR: 1.59, 95% CI 1.35–1.88; P < 0.00001 and mortality: OR: 1.65, 95% CI 1.21–2.25; P: 0.001). However, we found a non-linear association between BMI and the severity of COVID-19. In conclusion, we found that obesity could increase the risk of SARS-CoV2 infection and aggravate the severity of COVID-19. Further studies are needed to explore the possible mechanisms behind this association.
Data trusts have been proposed as a mechanism through which data can be more readily exploited for a variety of aims, including economic development and social-benefit goals such as medical research or policy-making. Data trusts, and similar data governance mechanisms such as data co-ops, aim to facilitate the use and re-use of datasets across organizational boundaries and, in the process, to protect the interests of stakeholders such as data subjects. However, the current discourse on data trusts does not acknowledge another common stakeholder in the data value chain—the crowd workers who are employed to collect, validate, curate, and transform data. In this paper, we report on a preliminary qualitative investigation into how crowd data workers themselves feel datasets should be used and governed. We find that while overall remuneration is important to those workers, they also value public-benefit data use but have reservations about delayed remuneration and the trustworthiness of both administrative processes and the crowd itself. We discuss the implications of our findings for how data trusts could be designed, and how data trusts could be used to give crowd workers a more enduring stake in the product of their work.
Cloud storage faces many problems during the storage process that badly affect system efficiency. One of the most serious is insufficient buffer space: data packets must wait for storage service, which can degrade the performance of the system. The storage process can be treated as a stochastic process, allowing us to determine the probability distribution of the buffer occupancy and the buffer content and to predict the performance behavior of the system at any time. This paper models a cloud storage facility as a fluid queue modulated by a Markovian queue: a buffer of infinite capacity whose net input rate is driven by an M/M/1/N queue with constant arrival and service rates. We obtain the analytical solution for the distribution of the buffer occupancy. Moreover, several performance measures and numerical results are given to illustrate the effectiveness of the proposed model.
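The modulating M/M/1/N chain has a simple closed-form stationary distribution, which fluid-queue analyses of this kind build on; a quick sketch of that standard birth-death result (our illustration, not the paper's fluid solution):

```python
def mm1n_stationary(lam, mu, N):
    """Stationary distribution of the M/M/1/N queue.

    With rho = lam/mu, the stationary probability p_k is proportional
    to rho**k for k = 0..N; normalizing gives a truncated geometric
    law (uniform in the boundary case rho = 1)."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        return [1.0 / (N + 1)] * (N + 1)
    norm = (1.0 - rho ** (N + 1)) / (1.0 - rho)
    return [(rho ** k) / norm for k in range(N + 1)]
```

The last entry, `p[N]`, is the blocking probability: the long-run fraction of time the modulating queue is full and arriving packets are lost.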
Magnant and Martin conjectured that the vertex set of any d-regular graph G on n vertices can be partitioned into $n / (d+1)$ paths (there exists a simple construction showing that this bound would be best possible). We prove this conjecture when $d = \Omega(n)$, improving a result of Han, who showed that in this range almost all vertices of G can be covered by $n / (d+1) + 1$ vertex-disjoint paths. In fact our proof gives a partition of V(G) into cycles. We also show that, if $d = \Omega(n)$ and G is bipartite, then V(G) can be partitioned into n/(2d) paths (this bound is tight for bipartite graphs).