Group therapy for halos: Advancing halo mass estimation for galaxy groups

Published online by Cambridge University Press:  22 January 2026

Wesley Van Kempen*
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Australia
Michelle Cluver
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology – Hawthorn Campus, Australia
Edward N. Taylor
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Australia
Darren Croton
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Australia; ARC Centre of Excellence for All-Sky Astrophysics, Australia; ARC Centre of Excellence for Dark Matter Particle Physics, Australia
Trystan Lambert
Affiliation:
Faculty of Science, International Centre for Radio Astronomy Research, ICRAR, University of Western Australia, Australia
Claudia Lagos
Affiliation:
Faculty of Science, International Centre for Radio Astronomy Research, ICRAR, University of Western Australia, Australia
*Corresponding author: Wesley Van Kempen, Email: wvankempen@swin.edu.au.

Abstract

Accurate estimation of dark matter halo masses for galaxy groups is central to studies of galaxy evolution and to leveraging group catalogues as cosmological probes. In this work, we present a comprehensive evaluation and calibration of two complementary halo mass estimators: a dynamical estimator based on a modified virial theorem (MVT) and an empirical summed stellar mass to halo mass relation (sSHMR), which uses the summed mass of the three most massive group galaxies as a proxy for halo mass. Using a suite of state-of-the-art semi-analytic models (SAMs; Shark, SAGE, and GAEA) to produce observationally motivated mock light-cone catalogues, we rigorously quantify the accuracy, uncertainty, and model dependence of each method. The MVT halo mass estimator achieves negligible systematic bias (mean $\Delta = -0.01$ dex) and low scatter (mean $\sigma = 0.20$ dex) as a function of the predicted halo mass, with no sensitivity to the SAM baryonic physics. The calibrated sSHMR yields the highest precision, with mean $\Delta = 0.02$ dex and mean $\sigma = 0.14$ dex as a function of the predicted halo mass, but exhibits greater model dependence due to its sensitivity to varying baryonic physics and physical prescriptions across the SAMs. We demonstrate the application of these estimators to observational group catalogues, including the construction of the empirical halo mass function and the mapping of quenched fractions in the stellar mass–halo mass plane. We provide clear guidance on the optimal application of each method: the MVT is recommended for GAMA-like surveys ($i < 19.2$) calibrated to $z < 0.1$ and should be used for studies that require minimal model dependence, while the sSHMR is optimal for high-precision halo mass estimation across diverse catalogues with magnitude limits of $Z < 21.2$ or brighter and to redshifts of $z \leq 0.3$.
These calibrated estimators will be of particular value for upcoming wide-area spectroscopic surveys, enabling robust and precise analyses of the galaxy–halo connection and the underlying dark matter distribution.
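As an illustration of the sSHMR input described in the abstract, the following minimal sketch computes the summed stellar mass of the three most massive galaxies in each group. The function name, the array layout, and the choice to skip groups with fewer than three members are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def summed_top3_log_mass(group_ids, log_mstar):
    """Map each group ID to log10 of the summed stellar mass of its
    three most massive members (hypothetical helper)."""
    group_ids = np.asarray(group_ids)
    mstar = 10.0 ** np.asarray(log_mstar, dtype=float)  # linear masses
    proxy = {}
    for gid in np.unique(group_ids):
        members = np.sort(mstar[group_ids == gid])[::-1]  # descending order
        if members.size >= 3:  # assumption: skip groups with < 3 members
            proxy[int(gid)] = float(np.log10(members[:3].sum()))
    return proxy

# Toy catalogue: group 1 has four members, group 2 has only two.
group_ids = [1, 1, 1, 1, 2, 2]
log_mstar = [11.0, 10.0, 10.0, 9.0, 10.5, 10.2]
proxy = summed_top3_log_mass(group_ids, log_mstar)
```

The sum is taken in linear mass and only then converted back to a logarithm; summing the logarithms directly would give a product of masses instead.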

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press on behalf of Astronomical Society of Australia

Figure 1. RA–Dec distribution of the three SGP sub-samples used for uniform completeness: SGP–2dF (red), G23–2dF (green), and G23–GAMA (orange). Comparing G23–2dF with G23–GAMA highlights the effect of spectroscopic completeness. Numbers in brackets denote sample sizes.

Figure 2. Comparison of the stellar mass function (SMF) from Shark with well-established observational SMFs and with SMFs derived from the observational data of Van Kempen et al. (2024). The figure shows the SMF ($\log \phi$) versus stellar mass ($\log M_{\star}$) for Shark v2.0 (blue circles) alongside canonical constraints from Baldry et al. (2012) (black dashed line) and Driver et al. (2022a) (grey dashed line). Red triangles, orange diamonds, and green crosses represent our observational data from Van Kempen et al. (2024) (SGP-2dF, G23-GAMA, and G23-2dF datasets, respectively). Error bars indicate Poisson uncertainties. Shark demonstrates excellent agreement with the established SMFs and across our observational datasets, with only minor deviations.

Figure 3. Comparison of the HMF derived from Shark with analytic and observational benchmarks. The blue data points represent the Shark HMF, with error bars indicating Poisson uncertainties in each mass bin. The solid black curve denotes the analytic HMF prediction for the input cosmology of Shark, computed using the hmf Python package (Murray, Power, & Robotham 2013). The grey dashed line shows the empirical fit from Driver et al. (2022a), based on GAMA5, SDSS5, and REFLEX II data. The magenta dashed curve corresponds to a Schechter function fit to the Shark HMF.
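The binned mass function with Poisson uncertainties described in this caption can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name, the bin edges, and the `volume` argument (a comoving volume in arbitrary units) are assumptions introduced for the example.

```python
import numpy as np

def mass_function(log_mass, bin_edges, volume=1.0):
    """Number density per dex in each log-mass bin, with Poisson errors.

    `volume` is a hypothetical survey or simulation volume; with the
    default of 1.0 the result is simply counts per dex.
    """
    counts, edges = np.histogram(log_mass, bins=bin_edges)
    dlogm = np.diff(edges)                        # bin widths in dex
    phi = counts / (volume * dlogm)               # number density per dex
    phi_err = np.sqrt(counts) / (volume * dlogm)  # Poisson: sigma_N = sqrt(N)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, phi, phi_err

# Toy halo sample (log10 masses) in three bins of width 0.5 dex.
log_mass = np.array([12.1, 12.2, 12.3, 12.4, 12.6, 13.2])
centres, phi, phi_err = mass_function(
    log_mass, np.array([12.0, 12.5, 13.0, 13.5]), volume=2.0
)
```

Empty bins get a Poisson error of zero under this simple prescription, so in practice such bins are usually masked or given an upper limit.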

Figure 4. Comparison of halo mass estimates derived from the velocity dispersion relation before and after calibration, across three semi-analytic models: Shark, SAGE, and GAEA. All samples have a magnitude limit of $i < 19.2$ and a redshift limit of $z < 0.1$. Top Row: The virial theorem halo mass ($\log M_{\mathrm{halo, \, VT}}$) versus the true halo mass ($\log M_{\mathrm{halo}}$) for each model, with colour indicating the logarithmic number density of halos in each bin. The dashed line denotes the one-to-one relation. Second Row: Corresponding residuals ($\Delta \log M_{\mathrm{halo}} = \log M_{\mathrm{halo, \, VT}} - \log M_{\mathrm{halo}}$), with error bars indicating the 16th and 84th percentiles in each bin. The mean offset and scatter are indicated in the legend. Third Row: The MVT halo mass estimates ($\log M_{\mathrm{halo, \, MVT}}$) compared to the true halo mass ($\log M_{\mathrm{halo}}$). Bottom Row: Corresponding median residuals ($\Delta \log M_{\mathrm{halo}} = \log M_{\mathrm{halo, \, MVT}} - \log M_{\mathrm{halo}}$), with error bars indicating the 16th and 84th percentiles in each bin. The mean offset and scatter are indicated in the legend.
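The residual statistics in the lower panels (a median with 16th and 84th percentile error bars per bin) can be computed along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the residuals are binned by true halo mass, and the mock data use an artificial 0.2 dex Gaussian scatter purely for demonstration.

```python
import numpy as np

def binned_residuals(log_m_true, log_m_est, bin_edges, min_count=1):
    """Per-bin median and 16th/84th percentiles of the residual
    log_m_est - log_m_true, binned by log_m_true.

    Returns an array with columns: bin centre, median, p16, p84.
    """
    log_m_true = np.asarray(log_m_true, dtype=float)
    resid = np.asarray(log_m_est, dtype=float) - log_m_true
    idx = np.digitize(log_m_true, bin_edges) - 1  # bin index of each halo
    rows = []
    for i in range(len(bin_edges) - 1):
        r = resid[idx == i]
        if r.size < min_count:  # skip sparsely populated bins
            continue
        p16, p50, p84 = np.percentile(r, [16, 50, 84])
        rows.append((0.5 * (bin_edges[i] + bin_edges[i + 1]), p50, p16, p84))
    return np.array(rows)

# Mock data: an unbiased estimator with 0.2 dex scatter (illustrative only).
rng = np.random.default_rng(0)
log_m_true = rng.uniform(12.0, 14.0, 5000)
log_m_est = log_m_true + rng.normal(0.0, 0.2, log_m_true.size)
stats = binned_residuals(log_m_true, log_m_est, np.linspace(12.0, 14.0, 5))
```

For Gaussian residuals, half the distance between the 84th and 16th percentiles approximates the standard deviation, which is why these percentiles serve as a robust scatter estimate.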

Table 1. Best-fitting parameters for the calibrated virial theorem relation.

Table 2. Best-fitting parameters for the summed stellar mass–halo mass relation.

Figure 5. Comparison of baryonic and halo mass relations across three SAMs: Shark, SAGE, and GAEA. The Shark sample has a fainter magnitude limit of $Z < 21.2$ and a deeper redshift limit of $z < 0.3$ compared to SAGE and GAEA ($i < 19.2$ and $z < 0.1$). Top Row: The sSHMR using the summed stellar mass of the three most massive group galaxies ($\log \sum M_{*,3}$) and the true halo mass ($\log M_{\mathrm{halo}}$) is shown for each SAM, with the colour scale indicating the logarithm of the number of groups per bin. The black dashed line in each panel represents the MCMC-derived fit to the Shark data, optimised to minimise scatter in halo mass. This same fit is overlaid on the SAGE and GAEA panels to illustrate model dependence. Second Row: The residuals in halo mass ($\Delta\log M_{\mathrm{halo}}=\log\sum M_{*,3}-\log M_{\mathrm{halo}}$) are shown as a function of the summed stellar mass. Third Row: Estimated halo masses, derived by applying the Shark-calibrated sSHMR, are plotted against the true halo masses from each simulation. The black dashed line denotes the one-to-one relation. The close alignment of the points along this line in all models demonstrates that the Shark-based calibration provides robust halo mass estimates, with low bias and scatter, even when applied to independent SAMs. Bottom Row: The residuals in halo mass are shown as a function of the estimated halo mass. In both residual panels, the colour scale again indicates the logarithm of the number of groups per bin.

Figure 6. Empirical HMF for galaxy groups with three or more members, constructed from the group catalogue of Van Kempen et al. (2024). The HMF is shown separately for groups in the SGP region dominated by 2dF coverage (red triangles), the G23 region with GAMA spectroscopy (orange diamonds), and the G23 region with 2dF spectroscopy (green crosses). Halo masses are estimated using the traditional and calibrated virial theorem method ($\log M_{\mathrm{halo,\,VT}}$ or $\log M_{\mathrm{halo,\,MVT}}$). Number densities are not corrected for survey volume or selection effects, due to the heterogeneous nature of the group sample and the challenges in defining a complete selection function. The solid black curve represents the analytic HMF prediction for the adopted cosmology of Shark, while the grey dashed line shows the empirical fit from Driver et al. (2022a) based on GAMA5, SDSS5, and REFLEX II data. Error bars reflect Poisson uncertainties.

Figure 7. Distribution of quenched fraction in the stellar mass–halo mass plane for galaxy groups in the Shark, SAGE, and GAEA simulations. The colour scale indicates the fraction of quenched galaxies (defined as those with $\log\,\mathrm{sSFR} < -11.0$) within each region of parameter space, ranging from blue (predominantly star-forming) to red (predominantly quenched). The underlying distribution is computed using a two-dimensional kernel density estimate (KDE) with adaptive smoothing. The quenched fraction is calculated as the ratio of the KDE-weighted density of quenched galaxies to the total KDE-weighted density in each cell. Black contours show the underlying density of individual galaxies. Halo masses are estimated using the calibrated relation between the sum of the stellar masses of the three most massive group galaxies and the group halo mass.
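The ratio-of-densities construction described in this caption can be sketched as below. Note the simplification: this example uses a fixed-bandwidth Gaussian kernel written directly in NumPy, whereas the caption specifies adaptive smoothing. The function names, the bandwidth of 0.2 dex, the grid, and the mock catalogue are all assumptions made for illustration.

```python
import numpy as np

def kde_on_grid(x, y, gx, gy, bw):
    """Unnormalised fixed-bandwidth Gaussian KDE on a (len(gx), len(gy)) grid."""
    dx = (gx[:, None, None] - x[None, None, :]) / bw
    dy = (gy[None, :, None] - y[None, None, :]) / bw
    return np.exp(-0.5 * (dx**2 + dy**2)).sum(axis=-1)

def quenched_fraction_map(log_mstar, log_mhalo, log_ssfr, gx, gy,
                          bw=0.2, thresh=-11.0):
    """Ratio of the quenched-galaxy KDE to the all-galaxy KDE per grid cell."""
    log_mstar = np.asarray(log_mstar, dtype=float)
    log_mhalo = np.asarray(log_mhalo, dtype=float)
    quenched = np.asarray(log_ssfr, dtype=float) < thresh  # log sSFR < -11.0
    total = kde_on_grid(log_mstar, log_mhalo, gx, gy, bw)
    q = kde_on_grid(log_mstar[quenched], log_mhalo[quenched], gx, gy, bw)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(total > 0, q / total, np.nan)  # NaN where no data

# Mock catalogue: galaxies above log M* = 10.5 are quenched (toy rule).
rng = np.random.default_rng(1)
log_mstar = rng.uniform(9.0, 11.5, 2000)
log_mhalo = rng.uniform(12.0, 14.5, 2000)
log_ssfr = np.where(log_mstar > 10.5, -12.0, -10.0)
gx = np.linspace(9.5, 11.0, 4)   # stellar-mass grid
gy = np.linspace(12.5, 14.0, 4)  # halo-mass grid
fq = quenched_fraction_map(log_mstar, log_mhalo, log_ssfr, gx, gy)
```

Because the same kernel and bandwidth are applied to both samples, the kernel normalisation cancels in the ratio and the result stays between 0 and 1.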

Figure 8. Distribution of quenched fraction in the stellar mass–halo mass plane for galaxy groups in the observational dataset of Van Kempen et al. (2024). The colour scale represents the fraction of quenched galaxies (defined as those with $\log\,\mathrm{sSFR} < -11.0$) within each region of parameter space, ranging from blue (predominantly star-forming) to red (predominantly quenched). The underlying distribution is computed using a two-dimensional kernel density estimate (KDE) with adaptive smoothing. The quenched fraction is calculated as the ratio of the KDE-weighted density of quenched galaxies to the total KDE-weighted density in each cell. Black contours indicate the underlying density of individual galaxies. The halo masses used in this figure are estimated using the calibrated sSHMR between the sum of the stellar masses of the three most massive group galaxies and the group halo mass. This figure highlights the dependence of quenching on both stellar and halo mass, with the quenched fraction increasing towards higher masses in both dimensions.

Table A1. Best-fitting coefficients for the modified virial theorem relation, including their 16th and 84th percentile scatter.

Figure A1. Posterior distributions and covariances for the six free parameters of the calibrated dispersion-based halo mass relation, as determined by MCMC sampling on the Shark model. The corner plot displays the marginalised one-dimensional distributions along the diagonal, with median values and 68% credible intervals indicated, and the two-dimensional projections of the posterior for each parameter pair in the off-diagonal panels. The parameters correspond to the normalisation and scaling of the velocity dispersion and maximum projected separation terms ($\alpha$, $\sigma_{\mathrm{lim}}$, $\beta$, $R_{\mathrm{lim}}$, $n_1$, $n_2$) in the power-law model for halo mass. The MCMC analysis was performed using the emcee sampler, with convergence assessed via autocorrelation time. This figure demonstrates that all parameters are well-constrained and highlights the correlations between them, providing a robust statistical foundation for the calibrated group halo mass estimator.

Figure B1. Posterior distributions and covariances for the four free parameters of the summed stellar mass–halo mass relation, as determined by MCMC sampling on the Shark model. The corner plot displays the marginalised one-dimensional posterior distributions for each parameter (normalisation $A$, characteristic mass $M_A$, low-mass slope $\beta$, and high-mass slope $\gamma$) along the diagonal, with median values and 68% credible intervals indicated. Off-diagonal panels show the two-dimensional projections of the posterior for each parameter pair, highlighting correlations and degeneracies. The MCMC analysis was performed using the emcee sampler, with convergence assessed via autocorrelation time. This figure demonstrates that all parameters are well-constrained, providing a robust statistical foundation for the calibrated relation between the sum of the stellar masses of the three most massive group galaxies and their host halo mass.

Table B2. Best-fitting coefficients for the sSHMR, including their 16th and 84th percentiles.