
The Dawes Review 10: The impact of deep learning for the analysis of galaxy surveys

Published online by Cambridge University Press: 10 January 2023

M. Huertas-Company*
Affiliation:
Instituto de Astrofísica de Canarias, c/ Vía Láctea s/n, 38025 La Laguna, Spain; Universidad de La Laguna, Avda. Astrofísico Fco. Sánchez, La Laguna, Tenerife, Spain; LERMA, Observatoire de Paris, CNRS, PSL, Université Paris-Cité, Paris, France
F. Lanusse
Affiliation:
AIM, CEA, CNRS, Université Paris-Saclay, Université Paris-Cité, Sorbonne Paris Cité, F-91191 Gif-sur-Yvette, France
Corresponding author: M. Huertas-Company, Email: mhuertas@iac.es

Abstract

The amount and complexity of data delivered by modern galaxy surveys have been steadily increasing over the past years. New facilities will soon provide imaging and spectra of hundreds of millions of galaxies. Extracting coherent scientific information from these large and multi-modal data sets remains an open issue for the community, and data-driven approaches such as deep learning have rapidly emerged as a potentially powerful solution to some long-standing challenges. This enthusiasm is reflected in an unprecedented exponential growth of publications using neural networks, which have gone from a handful of works in 2015 to an average of one paper per week in 2021 in the area of galaxy surveys. Half a decade after the first published work in astronomy mentioning deep learning, and shortly before new big data sets such as those from Euclid and LSST start becoming available, we believe it is timely to review the real impact of this new technology in the field and its potential to solve key challenges raised by the size and complexity of the new data sets. The purpose of this review is thus twofold. We first aim at summarising, in a common document, the main applications of deep learning for galaxy surveys that have emerged so far. We then extract the major achievements and lessons learned and highlight key open questions and limitations which, in our opinion, will require particular attention in the coming years. Overall, state-of-the-art deep learning methods are being rapidly adopted by the astronomical community, reflecting a democratisation of these methods. This review shows that the majority of works using deep learning to date are oriented towards computer vision tasks (e.g. classification, segmentation). This is also the domain of application where deep learning has brought the most important breakthroughs so far. However, we also report that the applications are becoming more diverse, and deep learning is used for estimating galaxy properties, identifying outliers or constraining the cosmological model. Most of these works remain at the exploratory level, though, which could partially explain their limited impact in terms of citations. Some common challenges will most likely need to be addressed before moving to the next phase of massive deployment of deep learning in the processing of future surveys: for example, uncertainty quantification, interpretability, data labelling and domain shift issues arising from training with simulations, which is a common practice in astronomy.

Information

Type
Dawes Review
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of the Astronomical Society of Australia

Figure 1. Relative change of the number of papers on arXiv:astro-ph with different keywords in the abstract as a function of time. The number of works mentioning neural networks in the abstract has experienced an unprecedented growth in the last ${\sim}6$ yr, significantly steeper than any other topic in astrophysics. Source: ArXivSorter.


Figure 2. Example of level of agreement (red circles) and model confidence (blue squares) versus classification accuracy. Each panel shows a different question in the Galaxy Zoo classification tree (smoothness, edge-on, bar). The authors report an unprecedented accuracy of ${>}90\%$. This was the first work to use CNNs in astrophysics. The figure is adapted from Dieleman et al. (2015).


Figure 3. Schematic view of a simple Vanilla type Convolutional Neural Network, the most common approach for binary classification, multi-class classification and regression in extragalactic imaging. The input, which is typically an image, is fed to a series of convolutional layers. The resulting embedding is used as input to a Multi-Layer Perceptron, which outputs a float or an array of floats. If the problem is a classification, the standard loss function is the cross-entropy (H); if it is a regression, the quadratic loss ($L_2$) is usually employed.
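
To make the setup concrete, below is a minimal PyTorch sketch of such a vanilla CNN; all layer sizes, names and the stamp size are our own illustrative assumptions, not the architecture of any specific published work.

```python
import torch
import torch.nn as nn

# Minimal sketch of the vanilla CNN of Figure 3 (sizes are assumptions).
class VanillaCNN(nn.Module):
    def __init__(self, n_out=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.mlp = nn.Sequential(               # Multi-Layer Perceptron head
            nn.Flatten(), nn.LazyLinear(128), nn.ReLU(), nn.Linear(128, n_out),
        )

    def forward(self, x):
        return self.mlp(self.features(x))

model = VanillaCNN(n_out=2)
x = torch.randn(8, 1, 64, 64)                   # batch of 64x64 galaxy stamps
logits = model(x)

# Classification: cross-entropy loss (H in Figure 3).
loss_cls = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (8,)))
# Regression: quadratic (L2) loss on a float output.
loss_reg = nn.MSELoss()(model(x)[:, 0], torch.randn(8))
```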


Figure 4. Example of two different simulated samples of strong lenses used for training a CNN. These simulations were used to detect strong lenses in the CFHTLS survey. Figure adapted from Jacobs et al. (2017).


Figure 5. Comparison of Capsule Networks and CNNs applied to classify morphologies of radio galaxies. The ROC curves show the different performances. Capsule Networks do not offer a significant gain in this context. Figure adapted from Lukic et al. (2019).


Figure 6. Example of posterior distributions of morphologies estimated from the votes of different classifiers. The leftmost column shows an image of a galaxy. The middle column shows the posterior predicted by a single network (black), while the right column shows the posterior averaged over 30 networks approximated with Monte Carlo dropout. The red vertical lines indicate the ground-truth value, which generally shows a good overlap with the posterior distribution. Figure adapted from Walmsley et al. (2020).
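
The Monte Carlo dropout approximation used in that work is straightforward to implement: dropout layers are kept stochastic at inference time and the network is sampled repeatedly. A minimal sketch, assuming a PyTorch model whose only stochastic layers are dropout, could read:

```python
import torch

def mc_dropout_predict(model, x, T=30):
    """Approximate a posterior by sampling T dropout-perturbed networks."""
    model.train()   # keeps nn.Dropout stochastic (batch-norm layers, if any,
                    # should be switched back to eval mode in practice)
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(T)])
    return samples.mean(0), samples.std(0)      # predictive mean and spread
```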


Figure 7. Example of Transfer Learning applied to galaxy morphological classification, exploring how to adapt a CNN trained on SDSS to the DES survey. Each panel shows a different morphological feature (smooth/disk, bars, from top to bottom). The different lines show the ROC curves using different training models and test sets. The work concludes that when a small subset of DES galaxies is used to refine the weights of a CNN trained on SDSS, the accuracy of the classification becomes optimal (red solid lines compared to blue dashed lines). Figure adapted from Domínguez Sánchez et al. (2019).
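
The underlying transfer-learning recipe is generic: reuse the convolutional features learned on the source survey and retrain only the last layers on a small labelled subset of the target survey. A hypothetical PyTorch sketch, using a stock ResNet rather than the actual architecture of the cited work, might be:

```python
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(num_classes=2)
# ... load here the weights previously trained on SDSS ...
for p in model.parameters():
    p.requires_grad = False                     # freeze pretrained features
model.fc = nn.Linear(model.fc.in_features, 2)   # new trainable head, to be
                                                # fine-tuned on a few DES labels
```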


Figure 8. Schematic view of a Recurrent Neural Network (RNN), which has been used for the classification of light curves in several works. The photometric sequence is fed to a recurrent block and trained with a cross-entropy loss. The RNN blocks keep a memory ($h_t$) of the previous time step, which makes them suitable for handling time sequences.
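
A minimal PyTorch sketch of such a recurrent classifier follows; the number of bands, classes and hidden units are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LightCurveRNN(nn.Module):
    """Sketch of the recurrent classifier of Figure 8 (sizes assumed)."""
    def __init__(self, n_bands=5, n_classes=3, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(input_size=n_bands, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                       # x: (batch, time, bands)
        out, _ = self.rnn(x)                    # hidden state h_t carries memory
        return self.head(out[:, -1])            # classify from last time step

model = LightCurveRNN()
logits = model(torch.randn(16, 100, 5))         # 100 epochs in 5 bands
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 3, (16,)))
```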


Figure 9. RNNs used for SN photometric light curve classification. The figure shows an example of a light curve in five photometric bands (top panel) and the classification probability as a function of the time step (bottom panel). After roughly 50 d, the supernova type is classified with high confidence. Figure adapted from Charnock & Moss (2017).


Figure 10. Example of a typical transformer architecture for the classification of light curves. A Gaussian Process is first used to homogenise the time sampling, and the resulting sequence is fed into a CNN for feature extraction. The attention modules are used to extract the final features, which are used for classification.
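
The sketch below illustrates this pipeline in PyTorch, omitting the Gaussian Process resampling step and assuming light curves already placed on a regular time grid; all sizes and names are assumptions.

```python
import torch
import torch.nn as nn

class AttentionClassifier(nn.Module):
    """Sketch of the CNN + self-attention pipeline of Figure 10."""
    def __init__(self, n_bands=6, d_model=64, n_classes=14):
        super().__init__()
        self.embed = nn.Conv1d(n_bands, d_model, kernel_size=3, padding=1)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                       # x: (batch, bands, time)
        z = self.embed(x).transpose(1, 2)       # CNN features -> (batch, time, d)
        z = self.encoder(z)                     # self-attention modules
        return self.head(z.mean(dim=1))         # average-pool over time
```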


Figure 11. Transformer applied to SN photometric light curve classification. The figure shows a confusion matrix on the PLAsTiCC simulated dataset. Most of the classes are identified with accuracies larger than 90%, as seen in the diagonal of the matrix. Figure adapted from Allam & McEwen (2021).


Figure 12. Schematic representation of a Unet-type architecture. This is typically used for addressing image-to-image problems (e.g. segmentation, galaxy properties in 2D). A first CNN (encoder) encodes information into a latent space (z), which is then fed to another CNN (decoder) which produces an image. One particularity of the Unet is the presence of skip connections between the encoding and decoding parts, which have been demonstrated to provide higher accuracies.
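
A minimal one-level Unet sketch in PyTorch makes the skip connection explicit; real Unets stack several such encoding/decoding levels, and the channel counts here are arbitrary.

```python
import torch
import torch.nn as nn

class TinyUnet(nn.Module):
    """One-level Unet sketch in the spirit of Figure 12 (sizes assumed)."""
    def __init__(self, c_in=1, c_out=1):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(c_in, 16, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.MaxPool2d(2),
                                  nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        # after concatenating the skip connection the decoder sees 32 channels
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, c_out, 1))

    def forward(self, x):
        e = self.enc(x)                            # encoder features
        z = self.down(e)                           # bottleneck (latent map z)
        u = self.up(z)                             # decoder upsampling
        return self.dec(torch.cat([u, e], dim=1))  # skip connection
```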


Figure 13. Comparison between the photometry measured with the Unet and with SExtractor. All panels show a comparison between input and recovered photometry. The top row shows the central galaxy in the stamp, while the bottom row shows the companion galaxy. The two leftmost panels show two different deep learning architectures. The rightmost panel shows the results of the baseline SExtractor. Both dispersion and bias are improved with deep learning. Figure adapted from Boucaud et al. (2020).


Figure 14. Unet used to segment different types of artefacts on CCD images. The leftmost panels show a field containing different artefacts. The middle panels show the ground truth and the rightmost panels the predictions obtained by the Unet. Figure adapted from Paillassa et al. (2020).


Figure 15. Example of pixel-level classification of a region in the CANDELS survey using a Unet. The leftmost image shows the original field. The six smaller images show different channels with different morphological types, and the rightmost image is the combination of all channels with a pixel-level morphological classification. Figure adapted from Hausen & Robertson (2020).


Figure 16. Schematic representation of a standard Generative Adversarial Network (GAN). It has been used as a generative model for deblending, identifying outliers and accelerating cosmological simulations, among other applications. A first CNN (generator) maps a random variable into an image, which is then fed to a second CNN (discriminator) together with real images in order to distinguish between the two datasets. The generator and discriminator networks are trained alternately.
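
The alternating training scheme can be summarised in a few lines of PyTorch. In this hypothetical sketch, G and D are any suitable generator and discriminator networks, with D returning one logit per image, and the optimisers would typically be Adam.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def gan_step(G, D, opt_G, opt_D, real, z_dim=64):
    """One alternating update of discriminator and generator (Figure 16)."""
    z = torch.randn(real.size(0), z_dim)        # random latent variable
    fake = G(z)
    # 1) discriminator step: real images -> 1, generated images -> 0
    opt_D.zero_grad()
    loss_D = (bce(D(real), torch.ones(real.size(0), 1)) +
              bce(D(fake.detach()), torch.zeros(real.size(0), 1)))
    loss_D.backward()
    opt_D.step()
    # 2) generator step: try to fool the discriminator into predicting 1
    opt_G.zero_grad()
    loss_G = bce(D(fake), torch.ones(real.size(0), 1))
    loss_G.backward()
    opt_G.step()
```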


Figure 17. Example of galaxy deblending using GANs. The central images show three overlapping galaxies. On either side of the central images, the original (left) and reconstructed (right) images of the pair of galaxies are shown. Figure adapted from Reiman & Göhre (2019).


Figure 18. Illustration of a Variational Autoencoder (VAE). This generative model has been extensively used in astronomy for multiple emulation tasks. A first CNN maps the input into a distribution, which is then sampled to provide the input of a second CNN, which reconstructs the input image. A regularisation term is added to ensure that the latent distribution stays close to a prior distribution.
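
The corresponding objective combines a reconstruction term with a KL regularisation towards a standard Normal prior. A minimal sketch, assuming an encoder that outputs a mean and log-variance per latent dimension, is:

```python
import torch
import torch.nn.functional as F

def vae_loss(decoder, x, mu, logvar):
    """ELBO-style VAE loss (Figure 18); the encoder outputs (mu, logvar)."""
    eps = torch.randn_like(mu)
    z = mu + eps * torch.exp(0.5 * logvar)      # reparameterisation trick
    recon = F.mse_loss(decoder(z), x, reduction='sum')   # reconstruction term
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # prior term
    return recon + kl
```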


Figure 19. Illustration of a Normalising Flow. This density-estimation model is used to transform a simple distribution (typically a Normal distribution) into a more complex density using an invertible function f, parametrised by a neural network, which is optimised by maximising the log-likelihood. The transformation can also be conditioned on an additional set of parameters (s in the figure).
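
A single affine coupling layer, a typical building block of such flows, can be sketched as follows; the log-determinant it returns is the term entering the change-of-variables log-likelihood. Dimensions are illustrative.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """RealNVP-style coupling layer sketch (dim must be even)."""
    def __init__(self, dim=4, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, dim))   # outputs (s, t)

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)             # transform x2 conditioned on x1
        s, t = self.net(x1).chunk(2, dim=-1)
        y2 = x2 * torch.exp(s) + t              # invertible scale-and-shift
        log_det = s.sum(-1)                     # contribution to log|det J|
        return torch.cat([x1, y2], dim=-1), log_det

# Training maximises log p_base(f^{-1}(x)) + log|det J| over the data.
```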


Figure 20. Representation of a Mixture Density Network (MDN). It is a generalisation of a standard CNN/ANN in which the output has been replaced by a probability distribution described by some parameters ($\mu$, $\sigma$ in the figure). MDNs provide an estimation of the random (aleatoric) uncertainty. The loss function is the negative log-likelihood of the predicted distribution.
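
A minimal sketch of a Gaussian MDN head and its negative log-likelihood loss, with assumed sizes and K mixture components, reads:

```python
import torch
import torch.nn as nn

class MDNHead(nn.Module):
    """Gaussian mixture output head (Figure 20); sizes are assumptions."""
    def __init__(self, n_features=128, K=5):
        super().__init__()
        self.pi = nn.Linear(n_features, K)       # mixture weights (logits)
        self.mu = nn.Linear(n_features, K)       # component means
        self.logsig = nn.Linear(n_features, K)   # component log-sigmas

    def loss(self, h, y):                        # h: CNN features, y: targets
        log_pi = torch.log_softmax(self.pi(h), dim=-1)
        comp = torch.distributions.Normal(self.mu(h), self.logsig(h).exp())
        log_prob = comp.log_prob(y.unsqueeze(-1))          # per component
        return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()  # NLL
```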


Figure 21. Comparison of photometric redshift performance between a deep CNN and a colour-based k-NN method (B16), reported in Pasquet et al. (2019). The top row shows the predicted redshifts vs. spectroscopic redshifts for the CNN (left) and the k-NN method (right). The distribution is noticeably tighter for the CNN, with smaller overall bias, scatter, and outlier rate. The bottom row shows the photometric redshift bias $\Delta z$ for the two methods, as a function of extinction (left panel) and disk inclination (right panel) of galaxies classified as star-forming or starburst. Having access to the whole image information, the CNN is able to automatically account for the reddening induced by looking at galaxies with high inclination, whereas the k-NN method, using only colour information, is blind to this effect and exhibits a strong bias with inclination.


Figure 22. Convolutional Neural Network used to measure galaxy structure. The left column shows the difference between true and estimated radius. The right column shows the same for the Sersic index. The top row shows the results as a function of magnitude, while the bottom row shows them as a function of radius. Figure adapted from Li et al. (2021).


Figure 23. CNN applied to estimate resolved stellar populations of galaxies. The inputs are images in different wavebands (two left columns) and the outputs are different stellar population maps (three right columns). In the stellar population maps, the top row is the ground truth, the middle row the prediction, and the bottom row the residual. Figure adapted from Buck & Wolf (2021).


Figure 24. CNN applied to reconstruct the Star Formation Histories of galaxies. The inputs are photometric images and the outputs are the Star Formation Rates in different time bins. The different panels illustrate some example predictions (red lines) together with the ground truth from cosmological simulations (blue lines). Figure adapted from Lovell et al. (2019).


Figure 25. Estimation of strong lensing parameters with CNNs. Each panel shows a different parameter (ground truth versus CNN estimation). Figure from Hezaveh et al. (2017).


Figure 26. Lens modelling with a fully differentiable approach. The panels show the posterior distribution of different parameters of the model for different SNRs, as labelled, together with the ground truth (black circle). Figure adapted from Chianese et al. (2020).


Figure 27. Supervised deep learning trained on simulations used to infer galaxy merger rates. The panels show the inferred merger fractions (left) and merger rates (right) with deep learning, as compared to other independent techniques. Figure adapted from Ferreira et al. (2020).


Figure 28. CNN applied to classify galaxies into different evolutionary stages defined by cosmological simulations. Each column shows a different phase of evolution as defined by the simulation. The top row shows the high-resolution images from the simulation. The middle row shows the same galaxies with observational realism added. The bottom row shows real observed galaxies identified by the CNN as being in one of the three different phases. Figure adapted from Huertas-Company et al. (2018).


Figure 29. The figure shows the predicted dark matter mass as a function of the true mass, along with uncertainties, using a Normalising Flow. Figure adapted from Kodi Ramanah et al. (2021).


Figure 30. Illustration of a Neural Flow for estimating posterior distributions. This type of approach is becoming common in simulation-based inference for estimating cluster masses or cosmological parameters. A first Neural Network with a standard $L_2$ loss is used to obtain some summary statistics of the data, which are then used as the condition of a Neural Flow mapping a simple distribution into an approximation of the posterior.
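
The scheme can be caricatured with a single conditional affine transformation; a real application would stack several coupling layers and use a richer summary network, so everything below is an illustrative assumption.

```python
import torch
import torch.nn as nn

class ConditionalAffine(nn.Module):
    """One-layer conditional 'flow' sketch for the scheme of Figure 30."""
    def __init__(self, dim=2, cond_dim=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(cond_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * dim))

    def log_prob(self, theta, s):                # s: summary statistics
        shift, log_scale = self.net(s).chunk(2, dim=-1)
        z = (theta - shift) * torch.exp(-log_scale)        # invert the flow
        base = torch.distributions.Normal(0., 1.).log_prob(z).sum(-1)
        return base - log_scale.sum(-1)          # change-of-variables term

# Training: minimise -flow.log_prob(theta_true, summary_net(x)).mean()
# over pairs of simulated data x and their true parameters theta_true.
```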


Figure 31. The top panel illustrates the accuracy obtained on simulations when the training and testing are done on datasets coming from the same underlying cosmological simulation. The bottom right panel shows the case in which two different simulations are used for training and testing, respectively. Figure adapted from Villanueva-Domingo et al. (2021a).


Figure 32. Variational Autoencoder for dimensionality reduction and data visualisation. The figure shows how the different types of spectra (labelled with different colours) populate different projections of the latent space. Figure adapted from Portillo et al. (2020).


Figure 33. Illustration of a self-supervised contrastive learning architecture. Multiple random augmentations of the same image (positive pairs) are fed to two different CNNs, which map them into a latent representation space. During training, pairs of completely different images (negative pairs) are also fed to the two CNNs. The contrastive loss is optimised to increase (decrease) the dot product of the representations of positive (negative) pairs. Contrastive learning is starting to be used for dimensionality reduction and as a generalised feature-extraction process for multiple downstream tasks such as galaxy classification or photometric redshift estimation.
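
A simplified, one-directional version of such a contrastive loss (in the spirit of the NT-Xent loss popularised by SimCLR) fits in a few lines; za and zb are assumed to hold the representations of two random augmentations of the same batch, so the positives sit on the diagonal and all other within-batch pairs act as negatives.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(za, zb, tau=0.1):
    """Simplified NT-Xent sketch for the setup of Figure 33."""
    za, zb = F.normalize(za, dim=1), F.normalize(zb, dim=1)
    logits = za @ zb.t() / tau            # (i, j): similarity of za_i and zb_j
    labels = torch.arange(za.size(0))     # positive pairs are on the diagonal
    return F.cross_entropy(logits, labels)
```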


Figure 34. Self-supervised learning applied to multi-band SDSS images. The left panel shows a UMAP of the representations obtained with contrastive learning. The panels on the right show the same UMAP colour coded with different galaxy properties. Similar images are clustered together. Figure adapted from Hayat et al. (2021).


Figure 35. Representations of MaNGA maps using PCA and contrastive learning, projected into a UMAP. The two leftmost columns show the plane colour coded with non-physical parameters (e.g. number of fibres in the IFU). The rightmost columns show the same maps colour coded with physical properties. Self-supervised representations cluster galaxies according to physical parameters, while PCA focuses mostly on the number of fibres. Figure adapted from Sarmiento et al. (2021).


Figure 36. Anomaly scores for different types of light curves obtained with a deep autoregressive model (bottom panel) and with a Bayesian reconstruction algorithm (top panel). Unknown light curves not used for training have larger anomaly scores with the Bayesian method than with the neural network. Figure from Muthukrishna et al. (2021).


Figure 37. Example of anomalous objects identified with a combination of WGAN and a CAE applied to the HSC survey. The top panel shows the latent space from the image residuals of the WGAN reconstruction obtained with the CAE. The images below show examples of different regions of the parameter space. Figure from Storey-Fisher et al. (2021).


Figure 38. Distribution of likelihood ratios obtained with two pixelCNN networks of observations (SDSS) and different models as labelled. The closer the histograms from simulations are to the one from SDSS, the more realistic the simulation is. The unsupervised model is able to capture the improvement of simulations with time. Figure from Zanisi et al. (2021).


Figure 39. Cartoon illustrating the method used to extract physical equations from a deep learning model applied to different datasets. Figure from Cranmer et al. (2020).


Figure 40. Illustration of learned displacement field in an N-body simulation from He et al. (2018). The first column is the reference simulation (FastPM), the second column shows a simple linear displacement (ZA), the third column shows a second order Lagrangian Perturbation Theory displacement (2LPT) and the last column shows the output of the 3D U-Net ($\mathrm{D}^{3}\mathrm{M}$). The top row shows final particle positions, the bottom row shows the final displacement field. The colour scale shows the error in position or displacement vectors between each approach and the reference FastPM simulation.


Figure 41. Illustration of N-body simulation super-resolution from Li et al. (2020a) showing from left to right, the Low-Resolution (LR) input, High-Resolution (HR) target, and Super-Resolution output of the model. The bottom row is a zoom-in on the region marked A.


Figure 42. Sequential generation and upsampling strategy of the scalable GAN for N-body modelling presented in Perraudin et al. (2019a), scaling up to $256^3$. The left illustration shows how the sequential generation of a large volume would proceed. The right plot illustrates the proposed architecture, where the generator is conditioned both on neighbouring patches and on the lower-resolution patch, which at sampling time would be generated by a separate GAN trained on coarser volumes. Distributed under the Creative Commons CC BY licence (http://creativecommons.org/licenses/by/4.0/).


Figure 43. Illustration of an application of differentiable neural mapping between the dark matter density and dark matter halo fields from Modi et al. (2018). The top row shows the initial conditions dark matter field, final dark matter field (at $z=0$), and the dark matter halo field obtained by a FoF halo finder. The bottom row shows the result of a reconstruction by gradient descent of these initial conditions, using a neural network to provide the mapping between the final density field and the halo field.


Figure 44. Illustration of lensed systems and the corresponding likelihood ratio maps estimated with simulation based inference and deep learning. The black crosses show the true values. Figure from Brehmer et al. (2019).


Figure 45. Illustration of weak-lensing mass-map reconstructions in a simulated COSMOS survey setting with the posterior sampling method of Remy et al. (2022) and the DeepMass direct posterior mean estimation method of Jeffrey et al. (2020). As can be seen, individual posterior samples (bottom row) are visually similar to a real convergence map (e.g. ground truth at the top left) but exhibit variability on structures not strongly constrained by data (e.g. outside of the survey region marked by the white contours). The top row illustrates that the DeepMass estimate indeed recovers the Bayesian posterior mean.


Figure 46. The CosmicRIM initial-conditions reconstruction technique of Modi et al. (2021a) is based on a 3D LSTM recurrent neural network and explicitly includes the gradients of the data likelihood at each iteration. The plot shows a given ground truth initial condition field (top left) and associated final field (bottom left), along with the reconstructed initial conditions (top right) and reconstructed final field (bottom right).


Table 1. Overview of the different deep learning techniques used in the fields of galaxy formation and cosmology, divided by type of application (see text for details).


Figure 47. Impact of works using deep learning for galaxy surveys. Each symbol shows a different class of application as labelled (see text for details). The top left and right panels show the number of papers and the number of citations as a function of time, respectively. The bottom left and right panels show the fraction of papers and citations in each class of application.


Figure 48. Number of citations normalised by the number of papers and years (from the first publication in that category) as a function of the number of papers per year. Each symbol shows a different category as defined in this work (see text for details). The ‘all galaxy evolution’ group includes an automatic search of all publications in the field of galaxy formation (with or without deep learning).


Table 2. Major challenges that deep learning works applied to astronomy may face and that will need to be addressed in the coming years. We also provide elements of solutions already being explored, along with the corresponding references.