This chapter presents a series of astronomical applications using some of the models presented earlier in the book. Each section concerns a specific type of astronomical data situation and the associated statistical model. The examples cover a large variety of topics, from solar (sunspots) to extragalactic (type Ia supernova) data, and the accompanying codes were designed to be easily modified in order to include extra complexity or alternative data sets. Our goal is not only to demonstrate how the models presented earlier can impact the classical approach to astronomical problems but also to provide resources which will enable both young and experienced researchers to apply such models in their daily analysis.
Following the same philosophy as in previous chapters, we provide codes in R/JAGS and Python/Stan for almost all the case studies. The exceptions are examples using type Ia supernova data for cosmological parameter estimation and approximate Bayesian computation (ABC). In the former we take advantage of the Stan ordinary differential equation solver, which, at the time of writing, is not fully functional within PyStan. Thus, we take the opportunity to show one example of how Stan can also be easily called from within R. The ABC approach requires completely different ingredients and, consequently, different software. Here we have used the cosmoabc Python package in order to demonstrate how the main algorithm works in a simple toy model. We also point the reader to the main steps towards using ABC for cosmological parameter inference from galaxy cluster number counts. This is considered an advanced topic and is presented as a glimpse of the potential of Bayesian analysis beyond the exercises presented in previous chapters. Many models discussed in this book represent a step forward from the types of models generally used by the astrophysical community. In the future we expect that Bayesian methods will be the predominant statistical approach to the analysis of astrophysical data.
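To make the idea concrete, here is a minimal rejection-ABC sketch in Python for a toy Gaussian-mean problem. It illustrates only the core accept/reject loop and does not use the cosmoabc interface; the prior range, distance threshold, and sample sizes are assumptions chosen for illustration.

```python
import numpy as np

# Toy rejection-ABC sketch: infer the mean mu of a Gaussian with known sigma.
# The prior range, distance threshold and sample sizes are illustrative
# assumptions; this is not the cosmoabc interface.
rng = np.random.default_rng(0)
sigma, n_obs = 1.0, 200
mu_true = 3.0                                  # 'unknown' truth used to fake data
observed = rng.normal(mu_true, sigma, n_obs)
s_obs = observed.mean()                        # summary statistic of the data

accepted = []
for _ in range(20000):
    mu = rng.uniform(0.0, 6.0)                 # draw a candidate from the prior
    simulated = rng.normal(mu, sigma, n_obs)   # forward-simulate a data set
    if abs(simulated.mean() - s_obs) < 0.05:   # keep if summaries are close
        accepted.append(mu)

posterior_mean = float(np.mean(accepted))      # approximate posterior mean
```

The accepted candidates approximate draws from the posterior without ever evaluating a likelihood, which is the property that makes ABC attractive for simulation-based cosmological inference.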
Accessing Data
For all the examples presented in this chapter we will use publicly available astronomical data sets. These have been formatted to allow easy integration with our R and Python codes. All the code snippets in this chapter contain a path_to_data variable, which has the format path_to_data = "~<some path>", where the symbol ~ should be replaced by the complete path to our GitHub repository.
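As a minimal illustration in Python, the variable might be set as follows; the clone location and file name below are hypothetical placeholders, not actual paths from the repository.

```python
import os

# Hypothetical sketch: the repository location and file name below are
# placeholders, not actual paths from the book's GitHub repository.
path_to_data = "/home/user/BayesianModels"                     # your local clone
data_file = os.path.join(path_to_data, "data", "example.csv")  # hypothetical file
```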
Bayesian Models for Astrophysical Data provides those who are engaged in the Bayesian modeling of astronomical data with guidelines on how to develop code for modeling such data, as well as how to evaluate a model's fit. One focus of this volume is on developing statistical models of astronomical phenomena from a Bayesian perspective. A second focus is to provide the reader with statistical code that can be used for a variety of Bayesian models.
We provide fully working code, not simply code snippets, in R, JAGS, Python, and Stan for a wide range of Bayesian statistical models. We also employ several of these models in real astrophysical data situations, walking through the analysis and model evaluation. This volume should foremost be thought of as a guidebook for astronomers who wish to understand how to select the model for their data, how to code it, and finally how best to evaluate and interpret it. The codes shown in this volume are freely available online at www.cambridge.org/bayesianmodels. We intend to keep the online material continuously updated, incorporating bug fixes and improvements suggested by the community. We advise the reader to check the online material for practical coding exercises.
This is a volume devoted to applying Bayesian modeling techniques to astrophysical data. Why Bayesian modeling? First, science appears to work in accordance with Bayesian principles. At each stage in the development of a scientific study new information is used to adjust old information. As will be observed when reviewing the examples later in this volume, this is how Bayesian modeling works. A posterior distribution created from the mixing of the model likelihood (derived from the model data) and a prior distribution (outside information we use to adjust the observed data) may itself be used as a prior for yet another enhanced model. New information is continually being used in models over time to advance yet newer models. This is the nature of scientific discovery. Yet, even if we think of a model in isolation from later models, scientists always bring their own perspectives into the creation of a model on the basis of previous studies or from their own experience in dealing with the study data.
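This posterior-becomes-prior chain can be illustrated with a toy conjugate model; the trial counts below are invented purely for illustration.

```python
# Toy conjugate illustration of posterior-becomes-prior updating; the trial
# counts are invented for illustration.
a, b = 1.0, 1.0        # flat Beta(1, 1) prior on a success probability
# First study: 7 successes, 3 failures -> posterior is Beta(a + 7, b + 3)
a, b = a + 7, b + 3
# That posterior is the prior for a second study: 50 successes, 30 failures
a, b = a + 50, b + 30
posterior_mean = a / (a + b)   # mean of the final Beta posterior
```

Each update folds new data into the old state of knowledge, exactly the cycle of adjustment described above.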
Defines the concept of planetary photometry and provides a broad overview of its history, including early theories of vision and light, the origin of the stellar magnitude scale, and the seminal work of Bouguer and Lambert in the development of the field. It concludes with early photometric work to determine the brightness of objects in the solar system, and with photometry's recent success in elucidating the state of the lunar regolith prior to landing there.
Astrostatistics has only recently become a fully fledged scientific discipline. With the creation of the International Astrostatistics Association, the Astrostatistics and Astroinformatics Portal (ASAIP), and the IAU Commission on Astroinformatics and Astrostatistics, the discipline has mushroomed in interest and visibility in less than a decade.
With respect to the future, though, we believe that the above three organizations will collaborate on how best to provide astronomers with tutorials and other support for learning about the most up-to-date statistical methods appropriate for analyzing astrophysical data. But it is also vital to incorporate trained statisticians into astronomical studies. Even though some astrophysicists will become experts in statistical modeling, we cannot expect most astronomers to gain this expertise. Access to statisticians who are competent to engage in serious astrophysical research will be needed.
The future of astrostatistics will be greatly enhanced by the promotion of degree programs in astrostatistics at major universities throughout the world. At this writing there are no MS or PhD programs in astrostatistics at any university. Degree programs in astrostatistics can be developed through the dual efforts of departments of statistics and astronomy–astrophysics. Several universities are close to developing such a degree, and we fully expect that PhD programs in astrostatistics will be common 20 years from now. They would provide all the training in astrophysics now given in graduate programs but would also add courses and training at the MS level or above in statistical analysis, and in modeling in particular.
We expect that Bayesian methods will be the predominant statistical approach to the analysis of astrophysical data in the future. As computing speed and memory become greater, it is likely that new statistical methods will be developed to take advantage of the new technology. We believe that these enhancements will remain in the Bayesian tradition, but modeling will become much more efficient and reliable. We expect to see more non-parametric modeling taking place, as well as advances in spatial statistics – both two- and three-dimensional methods.
Finally, astronomy is a data-driven science, currently being flooded by an unprecedented amount of data, a trend expected to increase considerably in the next decade. Hence, it is imperative to develop new paradigms of data exploration and statistical analysis.
The CMB data comes from many different kinds of experiment, and in the competition between the small balloon-based missions and huge satellite-borne programs it is occasionally the former that win. But in the end, the high-quality sky-wide data comes from space experiments. During this century, two have dominated cosmology so far: WMAP and Planck.
In this chapter we describe how the data is acquired, how it is stored in special format maps, and how it is treated to remove all the non-cosmological contributions. After this has been done we have our data, but it is the need to remove noise and foreground contamination that is the most challenging, and the most demanding of computing resources.
We treat the problem of noise and foreground removal in as generic a way as possible, since there is no single best way of doing it. The mathematical formalism is quite complex, as is the statistical data analysis that will follow.
Introduction
Some of the history of the observations of the Cosmic Microwave Background radiation (CMB) has been recounted in the first part of this book (see Chapter 3). The culmination of the early efforts, the ‘first 30 years’, was perhaps the COBE DMR/FIRAS mission. COBE DMR produced the first all-sky maps of the CMB (Smoot et al., 1991), while the FIRAS experiment on the same satellite had established the remarkable accuracy of the Planckian form of the radiation spectrum (Mather and the COBE collaboration, 1990). Importantly, COBE had delivered a low resolution, albeit noisy, picture of the microwave sky that not only made substantial scientific advances, but also attracted widespread public attention and provided a vital scientific stimulus.
The manifest success of the COBE mission inspired a series of ground-based, balloon-based and space observatories to look at the fluctuations with more sensitivity and at higher angular resolution. Satellite experiments are expensive and take a long time to build, and so the competition to get the key results from lower-cost ground-based experiments before the space experiments could deliver was fierce, and in large part successful.
In this chapter we cover one particular aspect of frequentist statistics in order to be able to compare and contrast the approach with the Bayesian approach that we shall be discussing shortly. In much of the world that uses statistics, the principal objective appears to be to test hypotheses about given data and come to some conclusion. The conclusion might be the answer to either of the questions: ‘Is this supported by the data or not?’, or ‘Which is the better descriptor of the data?’. Control samples are commonly used as an alternative against which the data will be evaluated. Whatever the question, the questioner expects a quantitative answer.
In cosmology we only have the one Universe as a source of data, and we have no other Universe that can act as a control. Of course we do have numerical simulations that can play that role. Our goal is often to fit a parameterised model to data and we can reasonably ask how confident we should be that this is the ‘best’ answer.
Classical Inference
In generic terms, the goal of statistical analysis is to discover how some factor Y responds to changes in some input, or set of inputs, X. The observations (x, y) are paired, and while the values of x are generally under the control of the experimentalist, the response y is subject to measurement errors ∊. The analysis may take the form of establishing a model for the dependence of Y on X, or of discovering which of the factors X, or which combination of the elements of X, are the main contributors to the response Y. Here we will consider only the first of these, fitting parameterised models.
Linear Regression
The simplest data model, linear regression, is commonplace and involves determining the parameters α and β in a relationship of the form
Y = α + βX + ∊, (24.1)
where the experiment consists of observing the values of Y, the response, resulting from the treatments X. The observations Y are subject to measurement errors ∊.
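As a minimal sketch of this setup, an ordinary-least-squares fit of α and β in Python; the data are synthetic, and the true parameter values and noise level are assumptions for illustration, not values from the text.

```python
import numpy as np

# Minimal sketch: fit y = alpha + beta * x + eps by ordinary least squares.
# The data are synthetic; alpha_true and beta_true are illustrative assumptions.
rng = np.random.default_rng(42)
alpha_true, beta_true = 2.0, 0.5
x = np.linspace(0.0, 10.0, 100)          # treatments, under experimental control
eps = rng.normal(0.0, 0.3, size=x.size)  # measurement errors
y = alpha_true + beta_true * x + eps     # observed responses

# Design matrix with an intercept column; solve for (alpha, beta).
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_hat, beta_hat = coef
```

With the errors small relative to the signal, the recovered intercept and slope land close to the assumed true values.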
Unlike Newton's theory of gravitation, Einstein's theory of general relativity views the gravitational force as a manifestation of the geometry of the underlying space-time. The idea sounds good: it evokes images of billiard balls rolling around on tables with hills and valleys, where the balls’ otherwise rectilinear motion is disturbed by the geometry of their environment. The difficulty is how to achieve the parallel goal of expressing the force of gravitation geometrically, and to do so without destroying all that we have learned about physics in our local environment.
As we saw in the previous chapter, Einstein saw the principles of covariance and equivalence as a way of formalising that. The theory of gravitation should always admit local inertial frames in which our known laws of physics would hold. Moreover, the mathematical expression of the laws of physics would be the same in all inertial frames. A key step at this point was to argue that physical entities are described by mathematical objects that transform correctly under local Lorentz transformations. This brings us to Minkowski's use of 4-vectors and tensors as the mathematical embodiment of physical quantities.
However, these are local statements, not global ones. They do not tell us how the geometry would affect two widely separated inertial observers in the presence of a gravitational field. The clue was given to Einstein by Marcel Grossmann who suggested that this link would be provided by insisting that the underlying geometry was the geometry of a Riemannian space. This provides the structure to address global issues and to connect different parts of the space.
In this chapter we describe this process and provide the mathematical structure that arises when we follow up on Grossmann's plan. We learn about connecting parts of the space-time and we establish notions of derivatives, geodesics and measures of the curvature.
A Geometric Perspective
The space-time of Einstein's theory is specified by its geometry. A gravitational field is thought of as distorting the space-time away from the no-gravity space-time of Minkowski. So, just as the Minkowski space of special relativity is entirely specified by a metric tensor, or line element, telling us what the space-time separation of neighbouring points is, the space-time of the general theory is specified by a more general metric.
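For reference, the standard textbook forms of these line elements can be written as follows; the −+++ signature is an assumption here, not necessarily the convention used in this book.

```latex
% Minkowski line element of special relativity (signature -+++ assumed here):
ds^2 = -c^2\,dt^2 + dx^2 + dy^2 + dz^2
% General space-time: the line element is specified by the metric g_{\mu\nu}:
ds^2 = g_{\mu\nu}\,dx^{\mu}\,dx^{\nu}
```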
It could be said that the discovery and subsequent exploration of the cosmic microwave background radiation marked a transition from cosmology as a branch of astronomy that was largely a philosophical endeavour to cosmology as an astrophysical discipline that is now a fully fledged branch of physics. The CMB established a secure physical framework within which we could undertake sophisticated experiments that would define the parameters of that framework with ever greater precision. As understanding has grown, more and more of traditional astronomy has been embraced to provide evidence in support of this new paradigm: we have garnered evidence from the study of stars, such as supernovae, of galaxies and their velocity fields, and from observations of galaxy clusters and clustering. These studies have involved ground-based and space-based experiments at all wavelengths from radio to gamma-rays.
‘Precision Cosmology’ was born and we now have the recognised disciplines of ‘astro-particle physics’, ‘astro-statistics’ and ‘numerical cosmology’, to name but a few.
Fifty years on from the discovery of the CMB a number of issues have been clarified, but many more remain. Among the numerous areas of active research in cosmology today, there are three which have particular bearing on the first half million years: dark matter, dark energy and gravitational waves.
In the Aftermath of the CMB
The discovery of the CMB not only served to establish the Hot Big Bang theory as our cosmological paradigm, it also stimulated the growth of cosmology as a branch of physics. Fifty years later cosmology has reached a depth and precision of understanding that would have been inconceivable prior to 1965. The 50 years from 1965 to 2015 saw remarkable advances both in theory and in data acquisition and analysis, which are described briefly in this chapter. One important consequence is the blurring of the boundary between theory and observation. Theoretical advances have driven experiments to gather and analyse data through which the theory could be exploited, while ever more sophisticated experiments demand improved methods of data analysis and impose constraints on the values of the parameters of the theories.
It has previously been said that the essence of physics, and science in general, is that it should be possible to perform experiments and to have other groups repeat those experiments, thus providing verification of the results and the consequent conclusions.
We will use Newtonian versions of solutions to the Einstein field equations to describe a series of model universes that provide a framework within which we can better understand how our Universe works. Newton's theory of gravitation has been replaced by Einstein's, but in many respects Newton's theory is a pretty good approximation: good enough that we depend on it in our everyday lives. The Newtonian view is certainly easier for us to relate to and exploit, but we must understand the inherent limitations.
Here, we highlight the fundamental differences between the Newtonian and Einsteinian theories: Newton with his absolute space and universal time, and Einstein with his geometrisation of gravity. Fortunately, there are some relevant solutions of the Einstein equations that have direct Newtonian analogues. Those Newtonian analogues lack some important features, notably a description of how light propagates. Fortunately, we can graft the information from some of the Einstein models onto the Newtonian models to produce what we might call Newtonian surrogates of the Einsteinian models.
In this chapter we introduce the simplest of a series of homogeneous and isotropic cosmological models formulated within the limited framework of Newtonian gravity. These models contain only ‘dust’: pressure-free matter made up of particles that are neither created nor destroyed, and that do not interact with one another. The Universe evolves under the mutual gravitational interaction of those particles.
While this model is not realistic it nevertheless allows us to develop a full cosmological model that is the template for the more complex models that follow. We develop these models in considerable detail since much of what is done will be repeated for the other, more realistic, models.
It is worth remarking that these models are fundamental to numerical N-body simulations of the Universe in which the constituent particles are ‘dust’. Such dust models can also be used to study the growth of structure in the Universe.
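To illustrate, a hedged numerical sketch with an assumed normalisation, not taken from the text: a zero-energy Newtonian dust universe obeys (da/dt)^2 = C/a, whose exact solution is the Einstein-de Sitter power law a(t) proportional to t^(2/3). A simple Euler integration of the expansion rate reproduces it.

```python
import numpy as np

# Hedged numerical sketch (normalisation assumed, not from the text): a
# zero-energy Newtonian 'dust' universe obeys (da/dt)^2 = C/a, whose exact
# solution is the Einstein-de Sitter law a(t) = (1.5*sqrt(C)*t)**(2/3).
C = 1.0                    # constant proportional to the dust density (assumed)
dt = 1.0e-5
t = 1.0e-3                                     # start slightly after t = 0
a = (1.5 * np.sqrt(C) * t) ** (2.0 / 3.0)      # start on the exact solution
while t < 1.0:
    a += np.sqrt(C / a) * dt                   # expansion rate of the dust model
    t += dt

exact = (1.5 * np.sqrt(C) * t) ** (2.0 / 3.0)  # exact solution at the same time
```

The numerical scale factor tracks the exact two-thirds power law closely, showing how even this idealised dust model yields a complete expansion history.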
Why Bother with Newton?
Two Views of Gravity
The expansion of the Universe is dominated by the gravitational force. The best theory we have for the gravitational force is Einstein's Theory of General Relativity (Einstein, 1916a), which relates the geometry of space-time to its material content.