The estimation task is classified as filtering, smoothing, or prediction, depending on the time of the estimate relative to the times of the incorporated observations. Basic techniques of filtering and smoothing are introduced. Characteristics and formulations of various filters and smoothers are discussed, including the Kalman filter, extended Kalman filter, fixed-point smoother, fixed-lag smoother, and fixed-interval smoother. Bayesian perspectives on filtering and smoothing are also discussed, with particular attention to the joint smoother and the marginal smoother.
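To make the filter cycle concrete, here is a minimal sketch of one linear Kalman filter forecast/analysis step (illustrative only; the function name `kalman_step` and the variable names are ours, not the chapter's):

```python
import numpy as np

def kalman_step(x, P, y, M, H, Q, R):
    """One forecast/analysis cycle of the linear Kalman filter.

    x, P : prior state estimate and its error covariance
    M, H : linear dynamics and observation operators (matrices)
    Q, R : model-error and observation-error covariances
    y    : observation vector
    """
    # Forecast (prediction) step
    x_f = M @ x
    P_f = M @ P @ M.T + Q
    # Analysis (update) step: incorporate the observation
    S = H @ P_f @ H.T + R              # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_a = x_f + K @ (y - H @ x_f)      # analysis state
    P_a = (np.eye(len(x)) - K @ H) @ P_f
    return x_a, P_a
```

Smoothers differ from this filter in that they also incorporate observations made after the estimation time.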
A standard probability formalism is introduced, including a definition of the probability density function (PDF) and its first four moments. The most basic PDFs, such as the uniform and Gaussian PDFs, are defined. The fundamentals of the Bayes’ formula derivation and its formulation in terms of PDFs are also presented. More importantly, data assimilation is described as a recursive Bayes’ formula, which connects the standard Bayes’ formula across different analysis times by using transition PDFs. A basic introduction to Shannon information theory is presented, followed by a definition of uncertainty in terms of entropy, thereby establishing a mathematical basis for interpreting data assimilation as information processing, a view used throughout this book. The multivariate Gaussian data assimilation framework, the one most often used in practice, is described. Common analysis solutions, including the maximum a posteriori and minimum variance methods, are derived, along with formulations of the cost function and the posterior probability.
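As a sketch of the recursive Bayes’ formula mentioned above (standard notation, not necessarily the chapter’s): with state $x_k$ and observations $y_{1:k}$ up to analysis time $k$, \[ p(x_k \mid y_{1:k}) \propto p(y_k \mid x_k) \int p(x_k \mid x_{k-1})\, p(x_{k-1} \mid y_{1:k-1})\, \mathrm{d}x_{k-1}, \] where $p(x_k \mid x_{k-1})$ is the transition PDF connecting consecutive analysis times, and the differential entropy $H(p) = -\int p(x) \ln p(x)\, \mathrm{d}x$ supplies the uncertainty measure used in the information-theoretic interpretation.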
Probabilistic prediction in terms of the probability density function and the Kolmogorov equation is introduced. The error of probabilistic prediction is defined, and its growth is further analyzed using normed measures. Error growth is also connected with the Lyapunov exponent to underline its relevance to chaotic dynamics. Furthermore, the forecast and analysis errors of recursive data assimilation are mathematically related to the Lyapunov exponent, with implications for the control of errors due to dynamical imbalances. It is shown how errors propagate in a data assimilation system and that the control of unbalanced errors is critical for successful data assimilation. In addition, Bayesian inference is identified as a mechanism that can help to implicitly control the growth of errors. A practical approach to dealing with dynamical imbalances in data assimilation using a penalty function is also presented and briefly discussed.
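As a hedged sketch of the error-growth relation referred to above, in standard notation: for a small initial perturbation $\delta x(0)$ evolved by the dynamics, the leading Lyapunov exponent is \[ \lambda = \lim_{t \to \infty} \frac{1}{t} \ln \frac{\lVert \delta x(t) \rVert}{\lVert \delta x(0) \rVert}, \] so that, for $\lambda > 0$, errors grow approximately like $\lVert \delta x(t) \rVert \approx \lVert \delta x(0) \rVert\, e^{\lambda t}$ between assimilation cycles.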
The role of the forecast error covariance in practical ensemble and variational data assimilation is described from both algebraic and dynamical points of view. This is used to motivate ensemble data assimilation. It is shown how a dynamically induced, anisotropic ensemble error covariance can benefit data assimilation, compared to the climatological (static), isotropic error covariance used in variational methods. In addition to the standard ensemble Kalman filter (EnKF), the more practical square root EnKF equations are also presented. Direct transform ensemble methods are also introduced, and their connection with both ensemble and variational methods is described. Error covariance localization in terms of the Schur product, a standard component of any realistic ensemble-based data assimilation, is also introduced and discussed. Following that, hybrid data assimilation methods, in particular the ensemble-variational (EnVar) methods, are introduced and presented in relation to pure ensemble and variational methods. As a particular example of hybrid methods, the maximum likelihood ensemble filter (MLEF) is introduced.
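As an illustrative sketch of Schur-product localization (the Gaspari-Cohn taper shown is a common choice in the literature; the function names, arguments, and the helper `localize` are ours, not the chapter's):

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Gaspari-Cohn compactly supported correlation (localization) taper."""
    r = np.abs(dist) / c
    f = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    f[inner] = (((-0.25*ri + 0.5)*ri + 0.625)*ri - 5.0/3.0)*ri**2 + 1.0
    f[outer] = ((((ro/12.0 - 0.5)*ro + 0.625)*ro + 5.0/3.0)*ro**2
                - 5.0*ro + 4.0 - 2.0/(3.0*ro))
    return f

def localize(P_e, dist, c):
    """Schur (element-wise) product of the taper with a raw ensemble
    covariance P_e; dist[i, j] is the distance between grid points i
    and j, and c sets the localization length scale."""
    return gaspari_cohn(dist, c) * P_e
```

The element-wise product suppresses spurious long-range correlations caused by the limited ensemble size while leaving nearby correlations essentially intact.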
Hands-on experiments in constructing the tangent linear and adjoint models are given, along with the practical generation of their codes, both by hand and with an automatic differentiation tool (Tapenade).
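One standard hands-on check for such codes is the dot-product (adjoint) test, $\langle \mathbf{M}\,\delta x, \delta y \rangle = \langle \delta x, \mathbf{M}^{\mathrm{T}} \delta y \rangle$; a minimal sketch, with a random matrix standing in for the tangent linear model (names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
M = lambda dx: A @ dx          # stand-in tangent linear model
M_adj = lambda dy: A.T @ dy    # its adjoint

# Dot-product (adjoint) test: <M dx, dy> must equal <dx, M^T dy>
dx, dy = rng.standard_normal(5), rng.standard_normal(5)
lhs = np.dot(M(dx), dy)
rhs = np.dot(dx, M_adj(dy))
assert np.isclose(lhs, rhs), "adjoint test failed"
```

In practice the same identity is verified line by line against the hand-coded or Tapenade-generated adjoint of the full nonlinear model code.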
This chapter focuses on the assimilation of observations from satellites, the dominant source of observational information in weather and climate applications. This includes satellite radiances, both clear-sky and all-sky. The most important challenges of all-sky radiances come from their connection to cloud microphysics, which potentially involves nonlinear, non-Gaussian, and nondifferentiable processes that are difficult for data assimilation. The complexity of the error covariance with microphysical variables is illustrated in a few real-world examples. An additional difficulty with assimilating all-sky radiances comes from correlated observation errors, which require special attention in data assimilation. Practical ways to deal with correlated observation errors are described. The nonlinearity and nondifferentiability of observation operators for all-sky radiances are also briefly explained. Since satellite radiance observations and observation operators generally contain bias, a common formulation of radiance bias-correction methods is also presented. The observations from satellites also include radio occultation and lightning observations, as well as satellite products.
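As a hedged sketch of the common predictor-based bias-correction formulation (standard in the variational bias-correction literature; the notation here is ours, not necessarily the chapter's): the radiance observation operator $h(x)$ is augmented to \[ \tilde{h}(x, \beta) = h(x) + \beta_0 + \sum_{i=1}^{N} \beta_i\, p_i(x), \] where the $p_i$ are bias predictors (e.g., air-mass or scan-angle quantities) and the coefficients $\beta$ are estimated together with the state during the assimilation.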
Generating the adjoint model (ADJM) by hand is tedious, time-consuming, and error-prone. In most practical applications of data assimilation these days, the derivative codes, including the ADJM, are generated by automatic differentiation (AD) tools, which evaluate exact derivative information for a function expressed as a computer program. Terminology and methods in AD are introduced, including the practical execution of the forward and reverse modes of differentiation. Various AD tools based on the two major AD approaches, source transformation and operator overloading, are compiled together with their webpages.
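As an illustrative sketch of the forward mode via dual numbers, i.e. the operator-overloading approach (reverse mode instead propagates sensitivities from outputs back to inputs, which is what an ADJM implements); all names here are ours:

```python
import math

class Dual:
    """Minimal forward-mode AD: carry a value and its derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val,
                    self.dot * o.val + self.val * o.dot)  # product rule
    __rmul__ = __mul__

def sin(x):
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

# d/dx [x*sin(x) + x] at x = 2, seeded with dx = 1
x = Dual(2.0, 1.0)
y = x * sin(x) + x
print(y.val, y.dot)  # y.dot == sin(2) + 2*cos(2) + 1
```

Source-transformation tools such as Tapenade instead emit new derivative source code rather than overloading arithmetic at run time.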
Hydrogen sulfide (H2S, “sulfide”) is a naturally occurring component of the marine sediment. Eutrophication of coastal waters, however, can lead to an excess of sulfide production that can prove toxic to seagrasses. We used stable sulfur isotope ratio (δ34S) measurements to assess sulfide intrusion in the seagrass Halodule wrightii, a semi-tropical species found throughout the Gulf of Mexico, Caribbean Sea, and both western and eastern Atlantic coasts. We found a gradient in δ34S values (−5.58 ± 0.54‰ to +13.58 ± 0.30‰) from roots to leaves, in accordance with prior observations and those from other species. The results may also represent the first values reported for H. wrightii rhizome tissue. The presence of sulfide-derived sulfur in varying proportions (15–55%) among leaf, rhizome, and root tissues suggests H. wrightii is able to assimilate sedimentary H2S into non-toxic forms that constitute a significant portion of the plant’s total sulfur content.
This article uses a “mystery client” approach, visiting the websites of National Statistical Offices and international microdata libraries to assess whether foundational microdata sets for countries in the Middle East and North Africa region are collected, up to date, and made available to researchers. The focus is on population and economic censuses, price data, and consumption, labor, health, and establishment surveys. The results show that about half of the expected core data sets are being collected and that only a fraction is made publicly available. As a consequence, many summary statistics, including national accounts and welfare estimates, are outdated and of limited relevance to decision-makers. Additional investment in microdata collection, and in publication of the data once collected, is strongly advised.
The 2014 Research Excellence Framework (REF) assessed the quality of university research in the UK. 20% of the assessment was allocated according to peer review of the impact of research, reflecting the growing importance of impact in UK government policy. Beyond academia, impact is defined as a change or benefit to the economy, society, culture, public policy or services, health, the environment, or quality of life. Each institution submitted a set of four-page impact case studies. These are predominantly free-form descriptions of the impact of the research and the evidence for it. Numerous analyses of these case studies have been conducted, but they have utilised either qualitative methods or basic forms of text searching. These approaches have limitations, including the time required to analyse the data manually and the frequently poor quality of the answers obtained by applying computational analysis to unstructured, context-less free-text data. This paper describes a new system to address these problems. At its core is a structured, queryable representation of the case-study data. We describe the ontology design used to structure the information and how semantic web technologies are used to store and query the data. Experiments show that this gives two significant advantages over existing techniques: improved accuracy in question answering and the capability to answer a broader range of questions by integrating data from external sources. We then investigate whether machine learning can predict each case study’s grade from this structured representation. The results provide accurate predictions for computer science impact case studies.
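To give the flavour of such a structured, queryable representation, here is a minimal sketch using the rdflib library; the namespace, class, and property names are invented for illustration and are not the paper's actual ontology:

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

# Hypothetical mini-ontology for impact case studies (illustrative only)
EX = Namespace("http://example.org/ref#")
g = Graph()
cs = EX.caseStudy1
g.add((cs, RDF.type, EX.ImpactCaseStudy))
g.add((cs, EX.unitOfAssessment, Literal("Computer Science")))
g.add((cs, EX.beneficiaryCountry, Literal("UK")))

# SPARQL query over the structured representation
q = """
SELECT ?cs WHERE {
  ?cs a ex:ImpactCaseStudy ;
      ex:unitOfAssessment "Computer Science" .
}"""
for row in g.query(q, initNs={"ex": EX}):
    print(row.cs)
```

Unlike free-text search, such queries can be joined with external structured sources (e.g., institutional or geographic data) to answer a broader range of questions.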
Graph embedding is a transformation of the nodes of a network into a set of vectors. A good embedding should capture the underlying graph topology and structure, node-to-node relationships, and other relevant information about the graph, its subgraphs, and the nodes themselves. If these objectives are achieved, an embedding is a meaningful, understandable, and often compressed representation of a network. Unfortunately, selecting the best embedding is a challenging task and very often requires domain experts. In this paper, we extend the framework for evaluating graph embeddings that was recently introduced in [15]. The framework now assigns two scores to each embedding, local and global, which measure the quality of the evaluated embedding for tasks requiring a good representation of, respectively, local and global properties of the network. The best embedding, if needed, can be selected in an unsupervised way, or the framework can identify a few embeddings that are worth further investigation. The framework is flexible and scalable and can deal with undirected/directed and weighted/unweighted graphs.
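As a toy illustration of embedding a graph and scoring how well local structure is preserved (our own naive proxy, not the local/global scores defined by the framework of [15]):

```python
import numpy as np

def spectral_embedding(A, dim=2):
    """Toy Laplacian-eigenmap embedding of an adjacency matrix A."""
    L = np.diag(A.sum(axis=1)) - A          # combinatorial Laplacian
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1:dim + 1]               # skip the trivial eigenvector

def local_score(A, X):
    """Naive 'local' quality proxy: mean embedded distance over edges
    divided by that over non-edges (smaller = edges sit closer)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    iu = np.triu_indices_from(A, k=1)
    edges = A[iu] > 0
    return D[iu][edges].mean() / D[iu][~edges].mean()

# Two triangles joined by a single edge: a small test graph
A = np.zeros((6, 6))
for i, j in [(0,1),(1,2),(0,2),(3,4),(4,5),(3,5),(2,3)]:
    A[i, j] = A[j, i] = 1
X = spectral_embedding(A)
print(local_score(A, X))   # typically < 1 when local structure is kept
```

An unsupervised selection procedure of the kind described above would compare such scores across candidate embeddings rather than rely on a domain expert.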
Coronavirus disease 2019 (COVID-19) asymptomatic cases are hard to identify, impeding transmissibility estimation. The transmissibility of COVID-19 warrants further elucidation, as it underpins key assumptions in modelling studies. Through a population-based surveillance network, we collected data on 1342 confirmed cases, with a 90-day follow-up for all asymptomatic cases. An age-stratified compartmental model containing contact information was built to estimate the transmissibility of symptomatic and asymptomatic COVID-19 cases. The difference in transmissibility between symptomatic and asymptomatic cases depended on age and was most distinct for the middle-age groups. The asymptomatic cases had a 66.7% lower transmissibility rate than symptomatic cases, and 74.1% (95% CI 65.9–80.7) of all asymptomatic cases were missed in detection. The average proportion of asymptomatic cases was 28.2% (95% CI 23.0–34.6). Simulation demonstrated that the burden of asymptomatic transmission increased as the epidemic continued and could potentially dominate total transmission. The transmissibility of asymptomatic COVID-19 cases is high, and asymptomatic cases play a significant role in outbreaks.
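As a toy sketch of how relative transmissibility and the asymptomatic proportion enter such a model (a deliberately un-stratified, discrete-time caricature with illustrative parameters keyed to the figures above, not the paper's age-stratified model):

```python
# Toy compartmental model with symptomatic (Is) and asymptomatic (Ia)
# infectious classes. Parameters are illustrative stand-ins: asymptomatic
# cases transmit at a ~66.7% lower rate; ~28.2% of cases are asymptomatic.
beta, rel_asym, p_asym, gamma = 0.3, 1 - 0.667, 0.282, 0.1
N, steps = 1e6, 300
S, Is, Ia, R = N - 1, 1.0, 0.0, 0.0
for _ in range(steps):
    force = beta * (Is + rel_asym * Ia) / N   # force of infection
    new_inf = force * S
    rec_s, rec_a = gamma * Is, gamma * Ia     # recoveries this step
    S  -= new_inf
    Is += (1 - p_asym) * new_inf - rec_s
    Ia += p_asym * new_inf - rec_a
    R  += rec_s + rec_a
print(f"final attack rate: {R / N:.2%}")
```

Even with a much lower per-case rate, the asymptomatic contribution to `force` accumulates over the epidemic, which is the mechanism behind the simulation finding quoted above.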
In this paper, we study asymmetric Ramsey properties of the random graph $G_{n,p}$. Let $r \in \mathbb{N}$ and $H_1, \ldots, H_r$ be graphs. We write $G_{n,p} \to (H_1, \ldots, H_r)$ to denote the property that whenever we colour the edges of $G_{n,p}$ with colours from the set $[r] \,{:\!=}\, \{1, \ldots, r\}$ there exists $i \in [r]$ and a copy of $H_i$ in $G_{n,p}$ monochromatic in colour $i$. There has been much interest in determining the asymptotic threshold function for this property. In several papers, Rödl and Ruciński determined a threshold function for the general symmetric case; that is, when $H_1 = \cdots = H_r$. A conjecture of Kohayakawa and Kreuter from 1997, if true, would fully resolve the asymmetric problem. Recently, the $1$-statement of this conjecture was confirmed by Mousset, Nenadov and Samotij.
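For context, a sketch of the conjecture’s standard statement (our paraphrase from the literature, not quoted from this paper): writing $m_2(H)$ for the maximum $2$-density and assuming $m_2(H_1) \ge \cdots \ge m_2(H_r)$, the conjectured threshold for $G_{n,p} \to (H_1, \ldots, H_r)$ is $n^{-1/m_2(H_1,H_2)}$, where \[ m_2(H_1, H_2) = \max\left\{ \frac{e(H)}{v(H) - 2 + 1/m_2(H_2)} \;:\; H \subseteq H_1,\ e(H) \ge 1 \right\}. \]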
Building on work of Marciniszyn, Skokan, Spöhel and Steger from 2009, we reduce the $0$-statement of Kohayakawa and Kreuter’s conjecture to a certain deterministic subproblem. To demonstrate the potential of this approach, we show this subproblem can be resolved for almost all pairs of regular graphs. This therefore resolves the $0$-statement for all such pairs of graphs.
“Return-to-player” information is used in several jurisdictions to display the long-run cost of gambling, but previous evidence suggests that these messages are frequently misunderstood by gamblers. Two ways of improving the communication of return-to-player information have been suggested: switching to an equivalent “house-edge” format, or adding a “volatility warning” clarifying that the information applies only in the statistical long run. In this study, Australian participants (N = 603) were presented with either a standard return-to-player message, the same message supplemented with a volatility warning, or a house-edge message. The return-to-player plus volatility warning message was understood correctly more frequently than the return-to-player message alone, but the house-edge message was understood best of all. Participants perceived the lowest chance of winning in the return-to-player plus volatility warning condition. These findings contribute data on the relative merits of the two proposed approaches to the design of improved gambling information.
We use probabilistic methods to study properties of mean-field models, which arise as large-scale limits of certain particle systems with mean-field interaction. The underlying particle system is such that n particles move forward on the real line. Specifically, each particle ‘jumps forward’ at some time points, with the instantaneous rate of jumps given by a decreasing function of the particle’s location quantile within the overall distribution of particle locations. A mean-field model describes the evolution of the particles’ distribution when n is large. It is essentially a solution to an integro-differential equation within a certain class. Our main results concern the existence and uniqueness of—and attraction to—mean-field models which are traveling waves, under general conditions on the jump-rate function and the jump-size distribution.
The unfolded protein response has recently been implicated as a mechanism by which 1,10-phenanthroline-containing coordination compounds trigger cell death. We explored the interaction of two such compounds (one containing copper and one containing manganese) with endoplasmic reticulum (ER) stress. Pretreatment with anisomycin significantly enhanced the cytotoxic activity of both metal-based compounds in A2780 cells, but only that of the copper-based compound in A549 cells. The effects of pretreatment with tunicamycin depended on the nature of the metal center in the compounds. In A2780 cells, the cytotoxic action of the copper compound was reduced by tunicamycin only at high concentration. In contrast, in A549 cells the efficacy of the manganese compound was reduced at all tested concentrations. Intriguingly, some impact of free 1,10-phenanthroline was also observed in A549 cells. These results are discussed in the context of the emerging evidence that the ER plays a role in the cytotoxic action of 1,10-phenanthroline-based compounds.