In this chapter the data assimilation problem is introduced as a control theory problem for partial differential equations, with initial conditions, model error, and empirical model parameters as optional control variables. An alternative interpretation of data assimilation as the processing of information in a dynamic-stochastic system is also introduced. Both approaches are addressed in more detail throughout this book. The historical development of data assimilation is documented, from the early nineteenth-century works of Legendre, Gauss, and Laplace, through optimal interpolation and Kalman filtering, to modern data assimilation based on variational and ensemble methods, and finally to emerging methods such as particle filters. This history shows that data assimilation is not a new concept: it has been of scientific and practical interest for a long time. Part of the chapter introduces the common terminology and notation used in data assimilation, with special emphasis on the observation equation, observation errors, and observation operators. Finally, a basic linear estimation problem based on least squares is presented.
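The basic least-squares estimation problem mentioned above can be illustrated with a minimal scalar sketch (the variable names and numbers below are illustrative, not taken from the book): a background estimate and an observation of the same quantity, each with a known error variance, are combined into the analysis that minimizes a quadratic cost function.

```python
# Minimal sketch (illustrative): least-squares combination of a background
# estimate x_b (error variance sb2) and an observation y (error variance so2).
# The cost J(x) = (x - x_b)^2 / (2*sb2) + (y - x)^2 / (2*so2) is minimized
# by the weighted mean computed below.

def least_squares_analysis(x_b, sb2, y, so2):
    """Return the analysis (optimal estimate) and its error variance."""
    gain = sb2 / (sb2 + so2)          # weight given to the observation
    x_a = x_b + gain * (y - x_b)      # analysis: background + weighted innovation
    sa2 = (1.0 - gain) * sb2          # analysis error variance (always reduced)
    return x_a, sa2

x_a, sa2 = least_squares_analysis(x_b=10.0, sb2=4.0, y=12.0, so2=1.0)
# gain = 4/5, so x_a = 11.6 and sa2 = 0.8 (smaller than both input variances)
```

The analysis error variance is smaller than either input variance, which is the scalar prototype of the uncertainty reduction that every data assimilation method generalizes.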
Coupled data assimilation is presented in detail. Starting from a coupled modeling system, a classification of coupled data assimilation based on coupling strength is defined. This includes uncoupled, weakly coupled, and strongly coupled data assimilation, with the coupling strength quantified using mutual information. The most interesting aspects of coupled data assimilation relate to a strongly coupled system, in which the information exchange is maximized. The challenges of strongly coupled data assimilation include accounting for the complex control variable and error covariance; these challenges grow considerably in realistic high-dimensional applications. Additional issues that can hamper strongly coupled data assimilation include non-Gaussian errors and the potentially different spatiotemporal scales of the coupled system components. To improve understanding of strongly coupled data assimilation, a simple two-component system is introduced and analyzed. The theoretical assessment is followed by real-world examples of strongly coupled forecast error covariance. Finally, coupled covariance localization is analyzed and a practical method to address it is described.
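For the special case of a two-component system with jointly Gaussian errors, the mutual information used to quantify coupling strength has a closed form depending only on the correlation between the components. A short sketch (an assumed illustration, not the book's code):

```python
import math

# Illustrative sketch: for two components with jointly Gaussian errors and
# correlation rho, the mutual information is MI = -0.5 * ln(1 - rho^2).
# MI = 0 corresponds to uncoupled components (no information exchange),
# and MI grows without bound as |rho| -> 1 (strong coupling).

def gaussian_mutual_information(rho):
    return -0.5 * math.log(1.0 - rho * rho)

uncoupled = gaussian_mutual_information(0.0)   # no information exchange
weak      = gaussian_mutual_information(0.3)
strong    = gaussian_mutual_information(0.9)   # strongly coupled
```

The monotone growth of MI with |rho| is what makes mutual information a natural scalar measure of coupling strength between system components.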
Variational data assimilation (VAR) is described in its various forms, and their mathematical formulations are explained, including three-dimensional/four-dimensional VAR, first guess at appropriate time (FGAT), the Physical-space Statistical Analysis System (PSAS), and incremental approaches. A historical overview of the calculus of variations and optimal control theory, the theories at the root of VAR, and the differences between them are also discussed; the two are represented by the Euler–Lagrange equations and Pontryagin’s maximum (minimum) principle, respectively. Furthermore, the major elements of VAR are reviewed, with an emphasis on various formalisms of the cost function, including Tikhonov regularization, strong- versus weak-constraint and incremental formulations, and on the specification and diagnosis of error covariances, including the observation, background, and model error covariances. Issues in minimizing the VAR cost function, including the gradient, preconditioning, and the assimilation period, are also addressed.
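The structure of the VAR cost function and its gradient can be sketched for the incremental case with a linear observation operator (a minimal illustration with assumed matrices, not the book's formulation):

```python
import numpy as np

# Sketch of an incremental 3D-Var cost function and its gradient, assuming
# a linear observation operator H. B_inv and R_inv are the inverse
# background and observation error covariances; d is the innovation
# y - H(x_b). All values below are illustrative.

def cost_and_grad(dx, B_inv, H, R_inv, d):
    """J(dx) = 0.5 dx^T B^-1 dx + 0.5 (d - H dx)^T R^-1 (d - H dx)."""
    r = d - H @ dx
    J = 0.5 * dx @ (B_inv @ dx) + 0.5 * r @ (R_inv @ r)
    grad = B_inv @ dx - H.T @ (R_inv @ r)
    return J, grad

B_inv = np.eye(2)
R_inv = 4.0 * np.eye(2)            # accurate observations (small R)
H = np.eye(2)
d = np.array([1.0, -0.5])          # innovation
J0, g0 = cost_and_grad(np.zeros(2), B_inv, H, R_inv, d)
# At dx = 0 the gradient is -H^T R^-1 d, i.e. it points toward the observations.
```

The gradient expression is exactly what a minimization algorithm (Chapter on minimization) or an adjoint code evaluates in practice.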
The mathematical background and formulation of the numerical minimization process are described in terms of gradient-based methods, whose ingredients include the gradient, Hessian, directional derivatives, optimality conditions for minimization, the Hessian eigensystem, the condition number of the Hessian, and conjugate vectors. Various minimization algorithms, such as the steepest descent method, Newton’s method, the conjugate gradient method, and quasi-Newton methods, are introduced along with practical examples.
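As a concrete sketch of the simplest of these algorithms (illustrative, not from the book), steepest descent with exact line search on a quadratic cost J(x) = ½xᵀAx − bᵀx converges to the solution of Ax = b, at a rate that degrades as the condition number of the Hessian A grows:

```python
import numpy as np

# Minimal sketch: steepest descent with exact line search on a quadratic
# cost J(x) = 0.5 x^T A x - b^T x, whose gradient is A x - b. For a
# symmetric positive definite Hessian A, the minimizer solves A x = b.

def steepest_descent(A, b, x0, iters=200):
    x = x0.copy()
    for _ in range(iters):
        g = A @ x - b                      # gradient at current iterate
        gg = g @ g
        if gg < 1e-30:                     # converged to machine precision
            break
        alpha = gg / (g @ (A @ g))         # exact line search step length
        x = x - alpha * g
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])     # SPD Hessian, modest condition number
b = np.array([1.0, 1.0])
x_min = steepest_descent(A, b, np.zeros(2))
# Converges to np.linalg.solve(A, b) = [0.2, 0.4]
```

Replacing the descent direction −g with conjugate directions (the conjugate gradient method) removes the zig-zagging that makes steepest descent slow for ill-conditioned Hessians.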
The estimation task is classified as filtering, smoothing, or prediction, depending on when the estimate is made relative to the incorporation of observations. Basic techniques of filtering and smoothing are introduced. The characteristics and formulations of various filters and smoothers are discussed, including the Kalman filter, extended Kalman filter, fixed-point smoother, fixed-lag smoother, and fixed-interval smoother. Bayesian perspectives on filtering and smoothing are also discussed, with particular attention to the joint and marginal smoothers.
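The forecast/analysis cycle of the Kalman filter can be sketched in the scalar case (an assumed toy example, not the book's code), where the gain and error-variance updates are one-line formulas:

```python
# Minimal sketch (illustrative): a scalar Kalman filter for the linear
# model x_k = m * x_{k-1} with observations y_k = x_k + noise.
# P is the state error variance, Q the model error variance, R the
# observation error variance.

def kalman_filter(x0, P0, m, Q, R, observations):
    x, P = x0, P0
    estimates = []
    for y in observations:
        # forecast step: propagate state and error variance
        x = m * x
        P = m * P * m + Q
        # analysis step: blend forecast with the observation
        K = P / (P + R)            # Kalman gain in [0, 1]
        x = x + K * (y - x)        # update toward the observation
        P = (1.0 - K) * P          # analysis error variance shrinks
        estimates.append(x)
    return estimates, P

est, P_final = kalman_filter(x0=0.0, P0=10.0, m=1.0, Q=0.0, R=1.0,
                             observations=[2.0, 2.0, 2.0])
# With a static truth near 2.0, the estimate is drawn toward 2.0
# and the error variance P decreases with every analysis.
```

A smoother differs only in that it also revisits past estimates using observations taken after them.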
A standard probability formalism is introduced, including a definition of the probability density function (PDF) and its first four moments. The most basic PDFs, such as the uniform and Gaussian PDFs, are defined. The fundamentals of the derivation of Bayes’ formula and its formulation in terms of PDFs are also presented. More importantly, data assimilation is described as a recursive Bayes’ formula, which connects the standard Bayes’ formula across analysis times by using transition PDFs. A basic introduction to Shannon information theory is presented, followed by a definition of uncertainty in terms of entropy, thereby establishing a mathematical basis for interpreting data assimilation as information processing, a view used throughout this book. The multivariate Gaussian data assimilation framework, the one most often used in practice, is described. Common analysis solutions, including the maximum a posteriori and minimum variance methods, are derived, together with formulations of the cost function and the posterior probability.
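One cycle of the Bayesian update can be sketched for scalar Gaussian PDFs (an assumed example, not the book's code): the posterior of a Gaussian prior and a Gaussian likelihood is again Gaussian, and its Shannon entropy H = ½ ln(2πe·var) is smaller than the prior's, i.e. assimilating the observation reduces uncertainty.

```python
import math

# Sketch (illustrative): Bayes' formula for scalar Gaussian PDFs, plus the
# entropy reduction that quantifies the information gained from the
# observation, in nats.

def gaussian_bayes_update(mu_prior, var_prior, y, var_obs):
    """Posterior mean and variance; precisions (1/var) add."""
    var_post = 1.0 / (1.0 / var_prior + 1.0 / var_obs)
    mu_post = var_post * (mu_prior / var_prior + y / var_obs)
    return mu_post, var_post

def gaussian_entropy(var):
    return 0.5 * math.log(2.0 * math.pi * math.e * var)

mu, var = gaussian_bayes_update(mu_prior=0.0, var_prior=2.0, y=1.0, var_obs=1.0)
info_gain = gaussian_entropy(2.0) - gaussian_entropy(var)   # entropy reduction
```

Chaining this update across analysis times, with a transition PDF propagating the posterior to the next prior, is exactly the recursive Bayes' formula described above.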
Probabilistic prediction in terms of the probability density function and the Kolmogorov equation is introduced. The error of probabilistic prediction is defined, and its growth is further analyzed using normed measures. Error growth is also connected with the Lyapunov exponent to underline its relevance to chaotic dynamics. Furthermore, the forecast and analysis errors of recursive data assimilation are mathematically related to the Lyapunov exponent, with implications for the control of errors due to dynamical imbalances. It is shown how errors propagate in a data assimilation system and that the control of unbalanced errors is critical for successful data assimilation. In addition, Bayesian inference is identified as a mechanism that can help implicitly control the growth of errors. A practical approach to dealing with dynamical imbalances in data assimilation using a penalty function is also presented and briefly discussed.
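The Lyapunov exponent can be estimated numerically for a simple chaotic map (an illustrative sketch, not taken from the book); a positive exponent means small forecast errors grow exponentially, which is why a recursive data assimilation cycle must keep correcting the state.

```python
import math

# Illustrative sketch: leading Lyapunov exponent of the logistic map
# x -> r*x*(1-x), estimated as lambda = (1/n) * sum ln|f'(x_k)|
# along a long orbit (after discarding a transient).

def lyapunov_logistic(r, x0=0.2, n=100000, burn=1000):
    x = x0
    for _ in range(burn):                     # discard transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n):
        total += math.log(abs(r * (1.0 - 2.0 * x)))   # ln|f'(x)|
        x = r * x * (1.0 - x)
    return total / n

lam = lyapunov_logistic(4.0)   # for r = 4 the exact value is ln(2)
```

An initial error of size e0 grows roughly as e0 * exp(lambda * n), so with lambda ≈ 0.69 per step an error doubles every iteration unless observations intervene.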
The role of the forecast error covariance in practical ensemble and variational data assimilation is described from both algebraic and dynamical viewpoints. This is used to motivate ensemble data assimilation. It is shown how a dynamically induced, anisotropic ensemble error covariance can benefit data assimilation, compared to the climatological (static), isotropic error covariance used in variational methods. In addition to the standard ensemble Kalman filter (EnKF), the more practical square root EnKF equations are also presented. Direct transform ensemble methods are also introduced, and their connection with both ensemble and variational methods is described. Error covariance localization in terms of the Schur product, a standard component of any realistic ensemble-based data assimilation, is also introduced and discussed. Following that, hybrid data assimilation, and in particular the ensemble-variational (EnVar) methods, are introduced and presented in relation to pure ensemble and variational methods. As a particular example of the hybrid methods, the maximum likelihood ensemble filter (MLEF) is introduced.
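A stochastic EnKF analysis step with Schur-product localization can be sketched in a few lines (an assumed toy setup, not the book's implementation): the sample forecast covariance is tapered elementwise by a distance-based localization matrix before the Kalman gain is formed.

```python
import numpy as np

# Illustrative sketch: stochastic EnKF analysis with Schur-product
# covariance localization. State has n grid points; only the first grid
# point is observed. L tapers spurious long-range sample covariances.

rng = np.random.default_rng(0)
n, n_ens = 10, 20
X = rng.normal(size=(n, n_ens))               # forecast ensemble (columns)

Xp = X - X.mean(axis=1, keepdims=True)        # ensemble perturbations
Pf = Xp @ Xp.T / (n_ens - 1)                  # sample forecast covariance

# Gaussian-shaped localization based on grid distance (length scale = 2)
i = np.arange(n)
L = np.exp(-0.5 * ((i[:, None] - i[None, :]) / 2.0) ** 2)
Pf_loc = L * Pf                               # Schur (elementwise) product

H = np.zeros((1, n)); H[0, 0] = 1.0           # observe the first grid point
R = np.array([[0.25]])                        # observation error covariance
K = Pf_loc @ H.T @ np.linalg.inv(H @ Pf_loc @ H.T + R)   # Kalman gain

y = np.array([1.0])
pert_obs = y + rng.normal(0.0, 0.5, size=(1, n_ens))     # perturbed observations
Xa = X + K @ (pert_obs - H @ X)               # analysis ensemble
```

Without the Schur product, the small-sample covariance `Pf` carries noisy correlations between the observed point and distant grid points, and the gain would spuriously update the whole domain.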
Hands-on experiments in constructing the tangent linear and adjoint models are presented, along with the practical generation of their codes both by hand and with an automatic differentiation tool (Tapenade).
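A hand-written tangent linear/adjoint pair is normally verified with the standard dot-product test: for the tangent linear model M of a nonlinear map, the adjoint Mᵀ must satisfy ⟨M dx, y⟩ = ⟨dx, Mᵀ y⟩ to machine precision. A minimal sketch with an illustrative two-variable model (not one of the book's models):

```python
import numpy as np

# Sketch of the adjoint correctness ("dot-product") test for a toy
# nonlinear map f(x) = (x0*x1, x0 + x1^2). The tangent linear model is
# the Jacobian-vector product M(x) dx; the adjoint is M(x)^T y.

def nonlinear_model(x):
    return np.array([x[0] * x[1], x[0] + x[1] ** 2])

def tangent_linear(x, dx):              # M(x) dx
    return np.array([x[1] * dx[0] + x[0] * dx[1],
                     dx[0] + 2.0 * x[1] * dx[1]])

def adjoint(x, y):                      # M(x)^T y (transpose, line by line)
    return np.array([x[1] * y[0] + y[1],
                     x[0] * y[0] + 2.0 * x[1] * y[1]])

x = np.array([1.5, -0.7])               # linearization state
dx = np.array([0.3, 0.2])               # perturbation
y = np.array([-1.0, 2.0])               # adjoint input
lhs = tangent_linear(x, dx) @ y
rhs = dx @ adjoint(x, y)                # must equal lhs to round-off
```

Any sign error or transposition mistake in the adjoint code immediately breaks the equality, which is why the test is run routinely after hand coding.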
This chapter focuses on the assimilation of observations from satellites, a dominant source of observational information in weather and climate. This includes satellite radiances, both clear-sky and all-sky. The most important challenges of all-sky radiances come from their connection to cloud microphysics, which potentially involves nonlinear, non-Gaussian, and nondifferentiable processes that are difficult for data assimilation. The complexity of the error covariance with microphysical variables is illustrated in a few real-world examples. An additional difficulty with assimilating all-sky radiances comes from correlated observation errors, which require special attention in data assimilation. Practical ways to deal with correlated observation errors are described. The nonlinearity and nondifferentiability of observation operators for all-sky radiances are also briefly explained. Since satellite radiance observations and observation operators generally contain bias, a common formulation of radiance bias correction methods is also presented. The observations from satellites also include radio occultation and lightning observations, as well as satellite products.
Generating the adjoint model (ADJM) by hand is tedious, time-consuming, and error-prone. In most practical applications of data assimilation today, derivative codes, including the ADJM, are generated by automatic differentiation (AD) tools, which evaluate exact derivative information for a function expressed as a program. The terminology and methods of AD are introduced, including the practical execution of the forward and reverse modes of differentiation. Various AD tools based on the two major AD approaches, source transformation and operator overloading, are listed together with their webpages.
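The operator-overloading approach, in its forward mode, can be sketched with dual numbers (an illustrative toy, not any of the tools listed in the chapter): each value carries its derivative, and overloaded arithmetic propagates both exactly, with no truncation error, unlike finite differences.

```python
# Sketch (illustrative) of forward-mode automatic differentiation via
# operator overloading: a Dual carries (value, derivative), and each
# elementary operation applies the corresponding differentiation rule.

class Dual:
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):           # product rule
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def f(x):                    # f(x) = 3x^2 + 2x, so f'(x) = 6x + 2
    return 3 * x * x + 2 * x

out = f(Dual(2.0, 1.0))      # seed derivative 1.0 at x = 2
# out.val = 16.0 (= f(2)), out.der = 14.0 (= f'(2))
```

The reverse mode, which the ADJM corresponds to, instead records the computation and propagates sensitivities backward, making the cost of a full gradient independent of the number of inputs.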
Hydrogen sulfide (H2S, “sulfide”) is a naturally occurring component of the marine sediment. Eutrophication of coastal waters, however, can lead to an excess of sulfide production that can prove toxic to seagrasses. We used stable sulfur isotope ratio (δ34S) measurements to assess sulfide intrusion in the seagrass Halodule wrightii, a semi-tropical species found throughout the Gulf of Mexico, Caribbean Sea, and both western and eastern Atlantic coasts. We found a gradient in δ34S values (−5.58 ± 0.54‰ to +13.58 ± 0.30‰) from roots to leaves, in accordance with prior observations and those from other species. The results may also represent the first values reported for H. wrightii rhizome tissue. The presence of sulfide-derived sulfur in varying proportions (15–55%) among leaf, rhizome, and root tissues suggests H. wrightii is able to assimilate sedimentary H2S into non-toxic forms that constitute a significant portion of the plant’s total sulfur content.