
Group therapy for halos: Advancing halo mass estimation for galaxy groups

Published online by Cambridge University Press:  22 January 2026

Wesley Van Kempen*
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Australia
Michelle Cluver
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology – Hawthorn Campus, Australia
Edward N. Taylor
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Australia
Darren Croton
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Australia; ARC Centre of Excellence for All-Sky Astrophysics, Australia; ARC Centre of Excellence for Dark Matter Particle Physics, Australia
Trystan Lambert
Affiliation:
Faculty of Science, International Centre for Radio Astronomy Research, ICRAR, University of Western Australia, Australia
Claudia Lagos
Affiliation:
Faculty of Science, International Centre for Radio Astronomy Research, ICRAR, University of Western Australia, Australia
*Corresponding author: Wesley Van Kempen, Email: wvankempen@swin.edu.au.

Abstract

Accurate estimation of dark matter halo masses for galaxy groups is central to studies of galaxy evolution and for leveraging group catalogues as cosmological probes. In this work, we present a comprehensive evaluation and calibration of two complementary halo mass estimators: a dynamical estimator based on a modified virial theorem (MVT) and an empirical summed stellar mass to halo mass relation (sSHMR), which uses the summed mass of the three most massive group galaxies as a proxy for halo mass. Using a suite of state-of-the-art semi-analytic models (SAMs; Shark, SAGE, and GAEA) to produce observationally motivated mock light-cone catalogues, we rigorously quantify the accuracy, uncertainty, and model dependence of each method. The MVT halo mass estimator achieves negligible systematic bias (mean $\Delta = -0.01$ dex) and low scatter (mean $\sigma = 0.20$ dex) as a function of the predicted halo mass, with no sensitivity to the SAM baryonic physics. The calibrated sSHMR yields the highest precision, with mean $\Delta = 0.02$ dex and mean $\sigma = 0.14$ dex as a function of the predicted halo mass but exhibits greater model dependence due to its sensitivity to varying baryonic physics and physical prescriptions across the SAMs. We demonstrate the application of these estimators to observational group catalogues, including the construction of the empirical halo mass function and the mapping of quenched fractions in the stellar mass–halo mass plane. We provide clear guidance on the optimal application of each method: the MVT is recommended for GAMA-like surveys ($i \lt 19.2$) calibrated to $z \lt 0.1$ and should be used for studies that require minimal model dependence, while the sSHMR is optimal for high-precision halo mass estimation across diverse catalogues with magnitude limits of $Z \lt 21.2$ or brighter and to redshifts of $z \leq 0.3$. 
These calibrated estimators will be of particular value for upcoming wide-area spectroscopic surveys, enabling robust and precise analyses of the galaxy–halo connection and the underlying dark matter distribution.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press on behalf of Astronomical Society of Australia

1. Introduction

The dark matter halo mass is one of the most fundamental properties governing galaxy evolution within the hierarchical structure formation paradigm; as galaxies form and evolve within these gravitationally bound dark matter structures, the halo mass strongly influences numerous galaxy properties including star formation histories, morphologies, and chemical enrichment pathways (White & Rees 1978; Blumenthal et al. 1984; Bower et al. 2006). Accurate determination of halo masses therefore underpins our ability to connect theoretical models of structure formation with observational constraints on galaxy evolution across cosmic time.

Given the importance of halo mass in the field of galaxy evolution, accurate measures of halo mass are critical; however, halo mass remains challenging to measure directly for individual groups. Weak gravitational lensing provides accurate and widely adopted halo mass estimates, and is routinely used to validate indirect estimators (Leauthaud et al. 2012; Hoekstra et al. 2013; Velander et al. 2014; Mandelbaum et al. 2016, 2018). Nonetheless, robust weak-lensing halo masses for individual galaxy groups are not universally available at typical survey depths, as the per-object signal-to-noise at group-scale masses is often insufficient without stacking or targeted deep imaging (Leauthaud et al. 2012; Velander et al. 2014; Viola et al. 2015; Simet et al. 2017). Other indirect probes include X-ray emission from hot gas (e.g. Arnaud et al. 2010) and galaxy dynamics (e.g. Old et al. 2014); these can be applied to individual groups, but carry method-specific systematics that are particularly pronounced in low-membership groups. This regime is crucial for tracing the growth of structure within group environments and for understanding the transition from group to cluster environments (Yang et al. 2007; Robotham et al. 2011). Current approaches to estimating halo masses face several challenges: weak lensing signals become increasingly difficult to detect for lower-mass systems (Viola et al. 2015), while X-ray observations require the presence of hot gas in hydrostatic equilibrium, a condition typically not met in group-scale environments (Lovisari et al. 2021).
Traditional dynamical mass estimators based on the virial theorem assume that systems are both virialised and well-sampled; these assumptions often break down for groups with low multiplicity, particularly when survey completeness is limited (Robotham et al. 2011; Old et al. 2018; Wojtak et al. 2018). Collectively, these limitations introduce substantial scatter and systematic uncertainty in halo mass estimates, particularly in the low-mass, low-multiplicity group regime where accurate halo mass measurements are most critical for understanding the impact of baryonic feedback on galaxy evolution (Wechsler & Tinker 2018).

Improved halo mass estimations are required to address several fundamental questions in modern astrophysics. First, the precise shape and amplitude of the halo mass function provide important constraints on cosmological parameters, particularly $\sigma_{8}$ and $\Omega_{m}$ (Tinker et al. 2008; Castro et al. 2021; Driver et al. 2022a). Second, understanding the coevolution of galaxies and their host haloes requires accurate mapping between observable galaxy properties and underlying halo masses (Behroozi et al. 2019; Moster, Naab, & White 2020). The scatter in the stellar–halo mass relation (SHMR) offers insights into the complex baryonic processes driving galaxy assembly, including feedback from supernovae and active galactic nuclei (AGN), and star-formation efficiency (Pillepich et al. 2018; Davies et al. 2019; Oyarzún et al. 2024; Wang & Peng 2025). More broadly, measuring the baryonic properties of galaxies as a function of halo mass remains a central goal in galaxy evolution and cosmology. For example, Chauhan et al. (2020) demonstrated that the signatures of AGN feedback on the HI content of haloes are remarkably strong, with variations between simulations exceeding 1 dex. However, subsequent work by Chauhan et al. (2021) showed that the uncertainties inherent in halo mass estimation, particularly when using the HI stacking techniques commonly adopted in the literature, can completely obscure these feedback signatures, making it impossible to distinguish between different simulation models. These studies further highlight that while dynamical mass estimators perform well for high-multiplicity systems, they are unable to resolve the underlying distribution for low-mass, low-multiplicity groups.
This underscores the critical importance of minimising halo mass uncertainties in order to robustly connect baryonic content and feedback processes to the underlying dark matter halo population.

The advent of next-generation wide-field spectroscopic surveys such as the Wide Area VISTA Extragalactic Survey (WAVES; Driver et al. 2019), the Dark Energy Spectroscopic Instrument (DESI; DESI Collaboration et al. 2016), and the 4MOST Hemisphere Survey (4HS; Taylor et al. 2023) promises to generate comprehensive catalogues of galaxy groups across cosmic time. These surveys will deliver spectroscopic redshifts for millions of galaxies, and their high completeness will facilitate the robust identification of gravitationally bound structures at the group scale, where environmental effects demonstrably alter galaxy properties and evolutionary trajectories (Peng et al. 2010; Wetzel et al. 2013; Davies et al. 2019). However, the scientific potential of these datasets depends critically on our capacity to accurately translate observable group properties into reliable halo mass estimates (Kravtsov, Vikhlinin, & Meshcheryakov 2018; Eckert et al. 2020; Tinker 2021). This connection between observable baryonic tracers and the underlying dark matter distribution remains fundamental for quantitatively testing hierarchical structure formation models.

The continuous flow of gas into, within, and out of galaxies is referred to as the baryon cycle and represents a key process governing galaxy evolution (Davé et al. 2012; Lilly et al. 2013). Complementary multi-wavelength facilities such as Euclid (Laureijs et al. 2011), the Vera Rubin Observatory (LSST Science Collaboration et al. 2009), the Square Kilometre Array (SKA; Dewdney et al. 2009), the Atacama Large Millimeter/submillimeter Array (ALMA; Wootten & Thompson 2009), the extended Roentgen Survey with an Imaging Telescope Array (eROSITA; Merloni et al. 2012), the Australian Square Kilometre Array Pathfinder (ASKAP; Johnston et al. 2008), and the Very Large Telescope/Multi Unit Spectroscopic Explorer (VLT/MUSE; Bacon et al. 2010) capture distinct components of this cycle: stellar content through optical and near-infrared observations, molecular gas reservoirs via millimetre observations, hot gas through X-ray measurements, neutral hydrogen via radio observations, and spatially resolved gas kinematics through integral-field spectroscopy (Saintonge et al. 2017; Péroux & Howk 2020; Tacconi, Genzel, & Sternberg 2020). Pairing these multi-wavelength datasets with spectroscopic information and group catalogues allows us to directly probe the influence of dark matter on the baryon cycle and empirically test how halo properties regulate key processes such as gas accretion rates, star formation efficiency, and feedback-driven outflows across diverse environments and cosmic epochs (Tumlinson, Peeples, & Werk 2017; Mitchell et al. 2020; van de Voort et al. 2021). A comprehensive understanding of these complex relationships necessitates both the statistical power of large-scale spectroscopic group catalogues and the detailed multi-wavelength characterisation of baryon cycle components. Nevertheless, precise halo mass measurements remain the critical prerequisite, particularly at the group scale ($10^{12}$–$10^{14}$ M$_{\odot}$) where the interplay between dark matter and baryonic processes most significantly influences galaxy evolution (Behroozi et al. 2019; Davies et al. 2019; Wang & Peng 2025).

Semi-analytical models (SAMs) offer a powerful framework for developing and calibrating such techniques. These models implement physically motivated prescriptions for galaxy formation within dark matter halo merger trees from N-body simulations (Prada et al. 2012; Somerville, Popping, & Trager 2015; Croton et al. 2016; Lagos et al. 2018; De Lucia et al. 2024). By producing realistic galaxy populations with known halo properties, SAMs allow us to assess the performance of different mass estimation techniques and quantify their associated uncertainties. Recent advances in SAMs, including improved treatments of gas cooling, star formation, and feedback processes, have significantly enhanced their ability to reproduce observed galaxy properties across a wide range of environments (Klypin et al. 2016; Croton et al. 2016; Lacey et al. 2016; Lagos et al. 2018; Stevens et al. 2018; Henriques et al. 2020; Lagos et al. 2024; De Lucia et al. 2024).

In this paper, we present two complementary approaches to improve observational halo mass estimates for galaxy groups. The first approach addresses the limitations of traditional virial-theorem dynamical mass estimators when applied to low-multiplicity groups by developing a corrective framework that accounts for systematic biases in groups with small velocity dispersions and projected radius measurements. The second method uses the correlation between baryonic mass and dark matter mass, probing the relationship between the summed stellar mass of the three most massive galaxies in the halo and the halo mass.
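As a minimal illustration of the input to the second approach, the sketch below computes the summed stellar mass of the three most massive members of a group. The function name is hypothetical, and the handling of groups with fewer than three members (summing whatever members exist) is an assumption made for this example rather than a choice stated in the text.

```python
import math

def summed_top_n_log_mass(log_stellar_masses, n=3):
    """Return log10 of the summed stellar mass of the n most massive members.

    log_stellar_masses: iterable of log10(M_star / Msun), one per group member.
    Assumption: if the group has fewer than n members, all members are summed.
    """
    top = sorted(log_stellar_masses, reverse=True)[:n]
    if not top:
        raise ValueError("group has no members")
    # Masses must be summed in linear units, not in log space.
    return math.log10(sum(10.0 ** m for m in top))

# Example group with four members (log10 solar masses).
group = [11.0, 10.0, 10.0, 9.0]
log_m_sum = summed_top_n_log_mass(group)
```

The sum is taken in linear mass units before converting back to a logarithm; the fourth member in the example does not contribute.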

Figure 1. RA–Dec distribution of the three SGP sub-samples used for uniform completeness: SGP–2dF (red), G23–2dF (green), and G23–GAMA (orange). Comparing G23–2dF with G23–GAMA highlights the effect of spectroscopic completeness. Numbers in brackets denote sample sizes.

This paper is organised as follows. In Sections 2 and 3, we introduce the observational dataset in which we showcase applications of the halo mass relations and describe the SAMs used to calibrate/validate our relations. Section 4 introduces our two calibrated halo mass estimators; Section 4.1 presents the first approach utilising a modified virial theorem (MVT) to estimate the halo mass, and in Section 4.2 we present a summed stellar-halo mass relation (sSHMR) using the three most massive galaxies in a group as a proxy for halo mass. In Section 5, we apply the halo mass estimations to the observational data and demonstrate their performance and use cases. Finally, Section 6 summarises our findings and outlines future directions for halo mass estimation in upcoming large-scale surveys.

Throughout this paper, we adopt a flat $\Lambda$CDM cosmology with parameters $H_{0} = 70$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_{M} = 0.3$, and $\Omega_{\Lambda} = 0.7$, unless stated otherwise. All halo masses are defined as $M_{200}$, the mass enclosed within a radius where the mean density is 200 times the critical density of the Universe.
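To make the $M_{200}$ definition concrete, the following sketch converts between $M_{200}$ and the corresponding radius $R_{200}$ for the fiducial cosmology stated above, using $\rho_{\rm crit} = 3H^2/(8\pi G)$. The function names and the unit convention for $G$ are choices made for this illustration only.

```python
import math

G = 4.30091e-9                          # Mpc (km/s)^2 / Msun
H0, OMEGA_M, OMEGA_L = 70.0, 0.3, 0.7   # fiducial flat LCDM from the text

def critical_density(z=0.0):
    """Critical density in Msun / Mpc^3 (physical) at redshift z."""
    hz_squared = H0 ** 2 * (OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)
    return 3.0 * hz_squared / (8.0 * math.pi * G)

def m200_from_r200(r200_mpc, z=0.0):
    """Mass (Msun) inside a sphere whose mean density is 200 rho_crit."""
    return (4.0 / 3.0) * math.pi * r200_mpc ** 3 * 200.0 * critical_density(z)

def r200_from_m200(m200_msun, z=0.0):
    """Physical radius (Mpc) enclosing a mean density of 200 rho_crit."""
    rho = 200.0 * critical_density(z)
    return (3.0 * m200_msun / (4.0 * math.pi * rho)) ** (1.0 / 3.0)

# A 10^13 Msun group-scale halo at z = 0.
r200 = r200_from_m200(1.0e13)
```

For a $10^{13}$ M$_{\odot}$ halo at $z = 0$ this gives a radius of a little under half a megaparsec, which sets the physical scale on which group members are expected to lie.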

2. Observational data

The primary observational dataset used in this work is the Southern Galactic Pole (SGP) catalogue introduced in Van Kempen et al. (2024). The SGP provides a highly complete spectroscopic sample of galaxies at $z \lt 0.1$ across 376 deg$^2$ ($340^\circ \lt \mathrm{RA} \lt 26^\circ$, $-35.3^\circ \lt \mathrm{Dec} \lt -25.8^\circ$), covering the Two-degree-Field Galaxy Redshift Survey (2dFGRS; Colless et al. 2001) and Galaxy And Mass Assembly (GAMA; Driver et al. 2009) G23 survey regions. Redshifts are sourced primarily from 2dFGRS and GAMA, and are supplemented with measurements from the Six-degree Field Galaxy Survey (6dFGRS; Jones et al. 2004), the 2-degree Field Lensing Survey (2dFLenS; Blake et al. 2016), the 2MASS Redshift Survey (2MRS; Macri et al. 2019), and the Million Quasars catalogue (MILLIQUAS; Flesch 2021). The SGP catalogue comprises 24 656 unique spectroscopic sources. These sources were cross-matched with photometry from the Wide-field Infrared Survey Explorer (WISE; Wright et al. 2010), combining both the WISE Extended Source Catalogue (WXSC; Jarrett et al. 2013, 2019) and the point-source AllWISE catalogue (Cluver et al. 2014, 2020). The match rate between the WISE photometry and the spectroscopic sources was approximately 93%, yielding mid-infrared measurements for 22 933 galaxies. The cross-matched WISE photometry enables robust estimates of stellar masses, derived from the W1-based relations of Jarrett et al. (2023), as well as star formation rates, calculated from W3 and W4 luminosities following the calibrations of Cluver et al. (2025).

As the SGP catalogue is constructed from multiple spectroscopic catalogues, the resulting dataset is heterogeneous in nature. To establish homogeneity for our analyses, we define three spectroscopic sub-samples: (1) SGP–2dF, comprising all galaxies with 2dFGRS photometry across the full SGP footprint; (2) G23–2dF, containing galaxies with 2dFGRS photometry restricted to the GAMA G23 region ($339^\circ \lt \mathrm{RA} \lt 351^\circ$, $-35^\circ \lt \mathrm{Dec} \lt -30^\circ$); and (3) G23–GAMA, consisting of galaxies with GAMA photometry within G23. Figure 1 presents a Right Ascension–Declination view of the homogeneous, WISE cross-matched sub-samples of the SGP dataset. The G23–GAMA sub-sample clearly demonstrates higher spectroscopic completeness, approximately 2.3 times greater than that of G23–2dF.

The SGP catalogue provides 1 413 galaxy groups, identified via a Python-based, graph-theory implementation of a friends-of-friends (FoF) algorithm, FoFpy (Lambert et al. 2020). FoFpy links galaxies when both their projected separations and line-of-sight velocity differences fall below scalable linking lengths, with a probabilistic cut used to reject links (see Lambert et al. 2020 for further details on the FoFpy algorithm). To ensure robust recovery of galaxy groups, a two-pass strategy was adopted: an initial FoF run with extended linking lengths captured larger groups, which were removed prior to a second pass with smaller linking lengths targeting smaller groups. The linking lengths were calibrated using mock light cones constructed from the Millennium simulation (Springel et al. 2005) and the Semi-Analytic Galaxy Evolution (SAGE) model (Croton et al. 2016). These mock light cones were generated with 2dFGRS ($b_{J}=19.45$) and GAMA ($i=19.2$) magnitude limits. See Van Kempen et al. (2024) for a full description of the construction of the SGP group catalogue.
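The basic FoF linking step described above can be sketched as follows. This is a deliberately simplified illustration and not the FoFpy implementation or its API: it uses fixed (non-scalable) linking lengths, Hubble-law distances, no probabilistic link rejection, and a single pass. The function names and default linking lengths are hypothetical.

```python
import math

H0 = 70.0  # km/s/Mpc, used only for approximate Hubble-law distances

def angular_sep_rad(ra1, dec1, ra2, dec2):
    """Great-circle separation in radians (haversine); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return 2.0 * math.asin(min(1.0, math.sqrt(a)))

def fof_groups(galaxies, d_link_mpc=0.5, v_link_kms=400.0):
    """Group galaxies given as (ra_deg, dec_deg, cz_kms) tuples.

    Two galaxies are linked when both their projected separation and their
    line-of-sight velocity difference are below the linking lengths; groups
    are the connected components of the resulting graph (union-find).
    """
    parent = list(range(len(galaxies)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(galaxies)):
        ra_i, dec_i, cz_i = galaxies[i]
        for j in range(i + 1, len(galaxies)):
            ra_j, dec_j, cz_j = galaxies[j]
            mean_dist = 0.5 * (cz_i + cz_j) / H0  # Mpc
            r_proj = angular_sep_rad(ra_i, dec_i, ra_j, dec_j) * mean_dist
            if r_proj < d_link_mpc and abs(cz_i - cz_j) < v_link_kms:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(galaxies)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Three nearby galaxies and one far away on the sky.
sample = [(10.0, -30.0, 6000.0), (10.05, -30.0, 6100.0),
          (10.1, -30.02, 5900.0), (20.0, -30.0, 6000.0)]
groups = fof_groups(sample)
```

A two-pass strategy like the one described in the text could be layered on top by running the function once with larger linking lengths, removing the members of the large groups found, and running it again on the remainder with smaller linking lengths.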

3. Semi-analytic models of galaxy formation

This section provides an overview of the simulated datasets used in this study, namely the suite of state-of-the-art SAMs of galaxy formation employed in this work. SAMs represent powerful theoretical tools for exploring the interplay between dark matter structure formation and baryonic physics. These models implement physically motivated prescriptions for key astrophysical processes within merger trees extracted from cosmological N-body simulations (see reviews by Baugh 2006; Benson 2010). By providing self-consistent galaxy populations with fully known dark matter halo properties, SAMs offer an ideal framework for developing and calibrating halo-mass estimation techniques. In this work, three independent SAMs are used with distinct roles: Shark (Lagos et al. 2018, 2024) serves as the fiducial calibration model for the halo mass estimates, whereas SAGE and the GAlaxy Evolution and Assembly (GAEA; De Lucia et al. 2014, 2024; Hirschmann, De Lucia, & Fontanot 2016) models are used exclusively for validation and robustness testing. No parameters of the estimators are re-tuned on SAGE or GAEA. Each of these SAMs implements different physical prescriptions while operating on a distinct N-body simulation. This multi-model approach enables assessment of robustness to variations in the underlying galaxy-formation physics and cosmology.

To ensure comparability with the observations described in Section 2, all SAM outputs are post-processed into mock light cones with realistic sky coordinates and redshifts, including peculiar-velocity-induced redshift-space distortions. Apparent magnitudes are computed and an SGP-like selection is applied (e.g. the GAMA G23 magnitude limit of $i \lt 19.2$) to emulate the survey depth. Galaxy groups in the SAMs provide ground-truth memberships and halo masses while retaining observational selection effects. Estimator inputs are restricted to observables after the SGP-like selection. This methodology ensures that the calibration and validation of halo mass estimators are performed under conditions that closely mimic real, highly complete spectroscopic surveys, thereby enhancing the reliability and applicability of the methods to current and future observational datasets. As a caveat, a fainter magnitude limit of $Z \lt 21.2$ is used for Shark in the development of the sSHMR (see Section 4.2.3).
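The two post-processing steps mentioned above, adding peculiar-velocity redshift-space distortions and applying an apparent-magnitude limit, can be illustrated with the sketch below. It is not the light-cone pipeline used for the mocks: it assumes the fiducial cosmology of this paper, the low-velocity approximation $(1 + z_{\rm obs}) = (1 + z_{\rm cos})(1 + v_{\rm los}/c)$, and no k-corrections. All function names are hypothetical.

```python
import math

C_KMS = 299792.458                      # speed of light in km/s
H0, OMEGA_M, OMEGA_L = 70.0, 0.3, 0.7   # fiducial flat LCDM from the text

def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance in Mpc (trapezoidal integration)."""
    if z <= 0.0:
        return 0.0
    dz = z / steps

    def inv_e(x):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + x) ** 3 + OMEGA_L)

    total = 0.5 * (inv_e(0.0) + inv_e(z))
    total += sum(inv_e(k * dz) for k in range(1, steps))
    return C_KMS / H0 * total * dz

def observed_redshift(z_cos, v_los_kms):
    """Combine the cosmological redshift with a line-of-sight peculiar velocity."""
    return (1.0 + z_cos) * (1.0 + v_los_kms / C_KMS) - 1.0

def apparent_magnitude(abs_mag, z_cos):
    """Apparent magnitude from the distance modulus (k-corrections ignored)."""
    d_lum_pc = (1.0 + z_cos) * comoving_distance(z_cos) * 1.0e6
    return abs_mag + 5.0 * math.log10(d_lum_pc / 10.0)

def apply_selection(galaxies, mag_limit=19.2):
    """Keep galaxies brighter than mag_limit.

    galaxies: iterable of (abs_mag, z_cos, v_los_kms).
    Returns a list of (apparent_mag, z_obs) for the selected galaxies.
    """
    kept = []
    for abs_mag, z_cos, v_los in galaxies:
        m = apparent_magnitude(abs_mag, z_cos)
        if m < mag_limit:
            kept.append((m, observed_redshift(z_cos, v_los)))
    return kept

# Two mock galaxies at z = 0.1: only the brighter one survives i < 19.2.
mock = [(-20.0, 0.1, 300.0), (-18.0, 0.1, -150.0)]
selected = apply_selection(mock)
```

Only the selected galaxies and their observed redshifts would then be passed to an estimator, which mirrors the restriction of estimator inputs to post-selection observables.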

3.1. SHARK

Shark v2.0 (Lagos et al. 2018, 2024) is an open-source, modular semi-analytic model of galaxy formation and evolution. The latest version incorporates significant advancements, including an improved treatment of angular momentum evolution, several environmental processes including ram pressure and tidal stripping, and updated feedback models. The free parameters in Shark are calibrated to reproduce the observed stellar mass function, star formation rate density, and cold gas scaling relations at $z = 0$.

The Shark runs analysed in this work are based on the SURFS suite of N-body simulations (Elahi et al. 2018), specifically medi-SURFS, which spans $(210h^{-1}\mathrm{Mpc})^3$ with $1\,536^3$ dark matter particles, yielding a particle mass of $2.21\times10^8h^{-1}\,{\rm M}_{\odot}$. Haloes, sub-haloes, and merger trees are constructed using HBT+HERONS (Chandro-Gómez et al. 2025). The Shark simulated light cones used in this work correspond to WAVES WIDE (North + South) light cones, totalling $\sim 1\,100\,\mathrm{deg}^2$ in area. These light cones were constructed using the pipeline described in Lagos et al. (2019): the survey geometry and magnitude selections are built using Stingray (Chauhan et al. 2019), and the galaxy SEDs are built using ProSpect (Robotham et al. 2020). A lower stellar mass limit of $\log M_{\star} \gt 7.5\,({\rm M}_{\odot})$ was applied to the mock light cones to ensure completeness and reliability in the resulting galaxy sample. The simulation adopts a Planck 2015 cosmology (Planck Collaboration et al. 2016), with $\Omega_{{m}} = 0.3121$, $\Omega_{\Lambda} = 0.6879$, $\Omega_{{b}} = 0.0491$, $h = 0.6751$, $\sigma_8 = 0.8150$, and $n_{{s}} = 0.9653$. All calibrations of the halo mass relations are performed on Shark.
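As a quick consistency check on the quoted resolution, the particle mass of a dark-matter-only box follows from the mean matter density, $m_{\rm p} = \Omega_{m}\,\rho_{\rm crit}\,L^3/N$. The sketch below evaluates this for the medi-SURFS numbers given above. It assumes that all matter is carried by the simulation particles and uses $\rho_{\rm crit} = 2.775\times10^{11}\,h^2\,{\rm M}_{\odot}\,{\rm Mpc}^{-3}$; the function name is hypothetical.

```python
RHO_CRIT_H2 = 2.775e11  # critical density today in h^2 Msun / Mpc^3

def particle_mass(omega_m, box_side_mpc_h, n_per_side):
    """Particle mass in h^-1 Msun for a cubic box.

    box_side_mpc_h: box side length in h^-1 Mpc.
    n_per_side: cube root of the total particle number.
    """
    mean_matter_density = omega_m * RHO_CRIT_H2       # h^2 Msun / Mpc^3
    box_volume = box_side_mpc_h ** 3                  # h^-3 Mpc^3
    return mean_matter_density * box_volume / n_per_side ** 3

# medi-SURFS: Omega_m = 0.3121, 210 h^-1 Mpc box, 1536^3 particles.
m_p = particle_mass(0.3121, 210.0, 1536)
```

The result agrees with the quoted $2.21\times10^8h^{-1}\,{\rm M}_{\odot}$ to within rounding.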

3.2. SAGE

The Semi-Analytic Galaxy Evolution (SAGE) model (Croton et al. 2016) is a flexible, publicly available semi-analytic framework, building upon the Munich model lineage (Croton et al. 2006). SAGE incorporates detailed prescriptions for radiative cooling, star formation, stellar and AGN feedback, black hole growth, and environmental processes. The model parameters are tuned to match the observed stellar mass function and galaxy colour distributions at $z = 0$.

For this study, SAGE is applied to the BOLSHOI N-body simulation (Klypin, Trujillo-Gomez, & Primack 2011), which covers a volume of $(250h^{-1}\mathrm{Mpc})^3$ with $2\,048^3$ particles, corresponding to a mass resolution of $1.35 \times 10^8h^{-1}\,{\rm M}_{\odot}$. The SAGE simulated light cones used in this work comprise 10 light cones, each with an area of $\sim 1\,960\,\mathrm{deg}^2$. Haloes are identified using the ROCKSTAR phase-space halo finder (Behroozi, Wechsler, & Wu 2013a), and merger trees are constructed with the Consistent Trees algorithm (Behroozi et al. 2013b). A lower stellar mass limit of $\log M_{\star} \gt 7.5\,({\rm M}_{\odot})$ was applied to the mock light cones to ensure completeness and reliability in the resulting galaxy sample. The adopted cosmology is based on WMAP7 (Komatsu et al. 2011), with $\Omega_{{m}} = 0.270$, $\Omega_{\Lambda} = 0.730$, $\Omega_{{b}} = 0.0469$, $h = 0.70$, $\sigma_8 = 0.82$, and $n_{{s}} = 0.95$. SAGE is used solely to validate the halo mass estimators without any re-tuning, thereby probing sensitivity to differing galaxy-formation prescriptions.

3.3. GAEA

The GAlaxy Evolution and Assembly (GAEA) semi-analytic model (De Lucia et al. 2014; Hirschmann et al. 2016; De Lucia et al. 2024) provides an independent theoretical benchmark in this work. The latest version includes updated prescriptions for AGN feedback, environmental processes affecting satellites, black hole accretion, disk instabilities, and starburst activity (Fontanot et al. 2020; De Lucia et al. 2011).

GAEA parameters are calibrated to match the stellar mass function over $0 \lt z \lt 4$, local atomic and molecular hydrogen mass functions, and AGN bolometric luminosity function evolution to $z \sim 4$. The model is run on merger trees from the Millennium Simulation (Springel et al. 2005), which adopts a $\Lambda$CDM cosmology with $\Omega_{{m}} = 0.25$, $\Omega_{{b}} = 0.045$, $\Omega_{\Lambda} = 0.75$, $h = 0.73$, $n_{{s}} = 1$, and $\sigma_8 = 0.8$. The simulation volume is $(500h^{-1}\mathrm{Mpc})^3$, with a particle mass of $8.625 \times 10^8~h^{-1}\,{\rm M}_{\odot}$. The observational cone constructed from this simulation was large enough to produce a full celestial sphere ($\sim 41\,253 \, \mathrm{deg}^2$). A lower stellar mass limit of $\log M_{\star} \gt 8\,({\rm M}_{\odot})$ was applied to the mock light cones to ensure completeness and reliability in the resulting galaxy sample. Haloes and merger trees are constructed using Subfind and Sub-LINK (Springel et al. 2001). GAEA is used solely to validate our halo mass estimates, without any re-tuning, thereby probing the estimates' sensitivity to differing galaxy-formation prescriptions.

4. Halo mass relations

In the group and cluster regime, halo mass estimates are traditionally derived from dynamical tracers, such as the velocity dispersion of member galaxies, or from abundance matching techniques that link observed galaxy properties to theoretical halo mass functions (Yang et al. 2007; Viola et al. 2015; Lim et al. 2021). However, these methods are subject to significant systematic uncertainties, particularly at low halo masses and for groups with low multiplicity, where the reliability of dynamical indicators is compromised by small number statistics and projection effects (Old et al. 2015; Robotham et al. 2011; Muldrew et al. 2012).

To provide a physically motivated and observationally calibrated framework for halo mass estimation, we calibrate our halo mass estimators to Shark. This decision is motivated by Shark’s demonstrated ability to reproduce key observables, most notably the observed stellar mass function (SMF; Figure 2) and halo mass function (HMF; Figure 3), when measured using observational techniques. Furthermore, Shark is specifically tailored for application to forthcoming large-scale surveys, most notably WAVES, but also 4HS, and will serve as the primary simulation framework for training and assessing future group-finding algorithms to be used in these surveys (Lagos et al. 2019). In addition, recent advancements in Shark, such as the implementation of the HBT+HERONS merger tree algorithm (Chandro-Gómez et al. 2025), have significantly reduced numerical artefacts that can arise in dark matter merger trees, including mass swapping, massive transients, and orphan galaxies. These improvements yield a more stable and physically consistent halo population, which is essential for the development of robust and reliable halo mass estimators.

Figure 2. Comparison of the SMF from Shark to well-established observational SMFs and SMFs produced by observational data from Van Kempen et al. (2024). The figure shows the SMF ($\log \phi$) versus stellar mass ($\log M_{\star}$) for Shark v2.0 (blue circles) alongside canonical constraints from Baldry et al. (2012) (black dashed line) and Driver et al. (2022a) (grey dashed line). Red triangles, orange diamonds, and green crosses represent our observational data from Van Kempen et al. (2024) (SGP–2dF, G23–GAMA, and G23–2dF datasets, respectively). Error bars indicate Poisson uncertainties. Shark demonstrates excellent agreement with the established SMFs and across our observational datasets, with only minor deviations.

Figure 3. Comparison of the HMF derived from Shark with analytic and observational benchmarks. The blue data points represent the Shark HMF, with error bars indicating Poisson uncertainties in each mass bin. The solid black curve denotes the analytic HMF prediction for the input cosmology of Shark, computed using the hmf Python package (Murray, Power, & Robotham 2013). The grey dashed line shows the empirical fit from Driver et al. (2022a), based on GAMA5, SDSS5, and REFLEX II data. The magenta dashed curve corresponds to a Schechter function fit to the Shark HMF.

As shown in Figure 2, Shark provides an excellent match to well-established observational SMFs (Baldry et al. Reference Baldry2012; Driver et al. Reference Driver2022a) across the full stellar mass range. The observational datasets from Van Kempen et al. (Reference Van Kempen2024) (SGP-2dF, G23-GAMA, and G23-2dF) show good agreement with the Baldry et al. (Reference Baldry2012) and Driver et al. (Reference Driver2022a) SMFs across the intermediate to high stellar mass range, which demonstrates the reliability of the sample in this regime. At lower stellar masses, however, the observed SMFs exhibit a deficit relative to these fits. This offset is attributable to the limitations of WISE photometry in detecting low-surface brightness and low stellar mass galaxies, resulting in systematic incompleteness at the faint end. Such incompleteness is a well-known limitation of near-infrared selected samples and must be considered when interpreting the low-mass behaviour of the observed SMF.

The SMF and HMF are constructed using a standard $1/V_{\mathrm{max}}$ approach to correct for survey incompleteness and selection effects. For a given mass bin ( $\Delta \log M$ ; stellar mass for the SMF and halo mass for the HMF), the number density is computed as:

(1) \begin{equation} \unicode{x03D5}(\log M) = \sum_{i=1}^{N} \frac{1}{V_{\mathrm{max}}^i \, \Delta \log M},\end{equation}

where $V_{\mathrm{max}}^i$ is the maximum comoving volume within which the ith galaxy (for the SMF) or group (for the HMF) could be observed, given the survey magnitude limits and selection criteria. For the SMF, $V_{\mathrm{max}}^i$ is determined by the redshift range over which each galaxy remains above the survey flux limit. For the HMF, following the methodology of Driver et al. (Reference Driver2022a), $V_{\mathrm{max}}^i$ is estimated from the nth brightest galaxy in each group (here, $n=3$ ), such that the group’s $V_{\mathrm{max}}$ is the maximum survey volume within which the nth brightest member remains above the survey flux limit. This approach ensures that the group sample is volume-limited with respect to its membership.
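As an illustration of Equation (1), the following is a minimal Python sketch of a $1/V_{\mathrm{max}}$ number density calculation. The function name `number_density` and its arguments are hypothetical and are not taken from the pipeline used in this work; the sketch assumes that $V_{\mathrm{max}}$ has already been computed for each object (in Mpc $^{3}$ ) and simply sums the inverse volumes in each logarithmic mass bin.

```python
import numpy as np

def number_density(log_mass, v_max, bin_edges):
    """Sketch of the 1/Vmax estimator of Equation (1).

    log_mass  : log10 masses of the galaxies (SMF) or groups (HMF)
    v_max     : maximum comoving volume of each object, in Mpc^3 (assumed precomputed)
    bin_edges : edges of the log-mass bins
    Returns the number density per bin in Mpc^-3 dex^-1.
    """
    log_mass = np.asarray(log_mass, dtype=float)
    weights = 1.0 / np.asarray(v_max, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    # Sum 1/Vmax over the objects falling in each bin.
    summed, _ = np.histogram(log_mass, bins=bin_edges, weights=weights)
    # Divide by the bin width, Delta log M.
    return summed / np.diff(bin_edges)
```

For the HMF, the same function would be called with the group $V_{\mathrm{max}}$ values set by the third brightest member, as described above.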

The halo mass function (HMF) is fit as a single Schechter function of the form:

(2) \begin{equation}\begin{array}{@{}l@{}}\unicode{x03D5}(\log M_{\rm halo}) = \ln(10) \, \unicode{x03D5}_* \, \unicode{x03B2} \left( \frac{M_{\rm halo}}{M^*} \right)^{\alpha + 1} \cdot \exp \left[ - \left( \frac{M_{\rm halo}}{M^*} \right)^{\unicode{x03B2}} \right],\end{array}\end{equation}

where $M^*$ is the characteristic halo mass, $\unicode{x03D5}_*$ is the normalisation, $\alpha$ is the low-mass slope, and $\unicode{x03B2}$ is the exponential cutoff parameter.
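Equation (2) can be evaluated directly; the sketch below is one possible Python transcription. The function name `schechter_hmf` is hypothetical, and no fitted parameter values from this work are assumed: any values passed in are the user's own.

```python
import numpy as np

def schechter_hmf(log_m_halo, log_m_star, phi_star, alpha, beta):
    """Sketch of the Schechter form in Equation (2), per dex in halo mass.

    log_m_halo : log10 halo mass (scalar or array)
    log_m_star : log10 of the characteristic halo mass M*
    phi_star   : normalisation
    alpha      : low-mass slope
    beta       : exponential cutoff parameter
    """
    # Ratio M_halo / M*, computed in log space to avoid overflow.
    x = 10.0 ** (np.asarray(log_m_halo, dtype=float) - log_m_star)
    return np.log(10.0) * phi_star * beta * x ** (alpha + 1.0) * np.exp(-(x ** beta))
```

At $M_{\rm halo} = M^*$ the expression reduces to $\ln(10)\,\unicode{x03D5}_*\,\unicode{x03B2}\,e^{-1}$, which provides a quick sanity check on any implementation.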

Figure 3 demonstrates the close agreement between the Shark HMF and analytic predictions, with only minor deviations at the lowest halo masses. The robust match between Shark and analytic expectations ensures that systematic biases in the halo mass distribution are minimised, providing a reliable foundation for calibrating observational mass proxies.

In the subsequent subsections, the formulation, calibration, accuracy, uncertainty and use cases of each halo mass estimator are described in detail. We employ SAGE and GAEA as independent tests to validate our Shark-calibrated relations. These models were run on different N-body simulations and contain different physical prescriptions, allowing us to assess whether our derived scaling relations remain robust across varied galaxy formation prescriptions and are suitable for general observational applications.

4.1. Modifying the virial theorem

The evolution of dark matter haloes is governed solely by gravitational forces, a regime that is well explored through detailed N-body simulations and SAMs, whereas galaxies are complex systems regulated by a multitude of baryonic processes, including gas cooling, star formation, and feedback (see review of Somerville & Davé Reference Somerville and Davé2015). This fundamental distinction underpins the rationale for employing dynamical, gravity-based estimators for halo mass, which are expected to exhibit minimal dependence on the details of baryonic physics and thus provide a robust, model-independent approach to halo mass estimation.

4.1.1. The virial theorem

In observational studies of galaxy groups and clusters, the velocity dispersion is measured along the line of sight, denoted as $\sigma_{\mathrm{los}}$ (in units of km s $^{-1}$ ). For a self-gravitating, isotropic, and uniform sphere of mass M and radius R, the virial theorem relates the total kinetic energy T and potential energy U as $2T + U = 0$ in equilibrium. The total kinetic energy can be expressed in terms of $\sigma_{\mathrm{los}}$ as $T = \frac{1}{2} M \sigma_{\mathrm{los}}^2$ , while the gravitational potential energy remains $U = -\frac{3}{5} \frac{GM^2}{R}$ . This yields the standard virial mass estimator:

(3) \begin{equation} M_{\mathrm{halo,~VT}}(h^{-1}M_{\odot}) = \frac{5}{3} \frac{\sigma_{\mathrm{los}}^2 R}{G},\end{equation}

where R is a radius (e.g. the maximum projected separation among group members from the median centre of the group) in units of Mpc, and G is the gravitational constant in units of M $_{\odot}^{-1}\,\mathrm{km}^2\,\mathrm{s}^{-2}\,\mathrm{Mpc}$ , yielding a mass in $h^{-1}{\rm M}_{\odot}$ . This estimator is widely used in both observational and theoretical studies of galaxy groups and clusters (e.g. Carlberg et al. Reference Carlberg1996; Eke et al. Reference Eke2004; Robotham et al. Reference Robotham2011). While the $\frac{5}{3}$ coefficient is sometimes omitted in the literature (e.g. Finn et al. Reference Finn2005; Evrard et al. Reference Evrard2008; Lau, Nagai, & Kravtsov Reference Lau, Nagai and Kravtsov2010; Poggianti et al. Reference Poggianti2010; Robotham et al. Reference Robotham2011; Alpaslan et al. Reference Alpaslan2012), it is physically motivated and essential for accurate mass estimates, particularly for massive halos.
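A minimal Python sketch of Equation (3) is given below. The function and constant names are hypothetical. The value of G is the standard gravitational constant expressed in the units quoted above, $G \approx 4.30091 \times 10^{-9}\,\mathrm{M}_{\odot}^{-1}\,\mathrm{km}^2\,\mathrm{s}^{-2}\,\mathrm{Mpc}$; the radius is assumed to be supplied in the same Mpc convention (with or without h) as the desired mass.

```python
# Gravitational constant in Mpc km^2 s^-2 Msun^-1.
G_MPC = 4.30091e-9

def virial_mass(sigma_los, radius_mpc, coefficient=5.0 / 3.0):
    """Sketch of the virial mass estimator of Equation (3).

    sigma_los   : line-of-sight velocity dispersion in km/s
    radius_mpc  : group radius in Mpc
    coefficient : 5/3 for the uniform-sphere case used in the text
    Returns the mass in solar masses (same h convention as the radius).
    """
    return coefficient * sigma_los ** 2 * radius_mpc / G_MPC
```

For example, a group with $\sigma_{\mathrm{los}} = 300$ km s $^{-1}$ and $R = 0.5$ Mpc gives a mass of roughly $1.7 \times 10^{13}$ solar masses under these assumptions.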

The line-of-sight velocity dispersion, $\sigma_{\mathrm{los}}$ , is calculated using the Gapper (gap) method (Beers, Flynn, & Gebhardt Reference Beers, Flynn and Gebhardt1990), which is particularly robust for small group sizes and is less sensitive to outliers than standard deviation-based estimators. For a group of N galaxies with ordered velocities $v_1 \lt v_2 \lt \cdots \lt v_N$ , the Gapper estimator is defined as:

(4) \begin{equation}\sigma_{\mathrm{gap}} = \frac{\sqrt{\pi}}{N(N-1)} \sum_{i=1}^{N-1} w_i g_i,\end{equation}

where $g_i = v_{i+1} - v_i$ is the velocity gap between adjacent ordered velocities, and $w_i = i(N-i)$ is a weighting factor. Through the use of ordered velocity gaps, the gap method provides lower sampling variance than standard-deviation or bi-weight scale estimators and is less sensitive to interlopers. It has thus become a standard approach in modern group catalogues (e.g. Eke et al. Reference Eke2004; Robotham et al. Reference Robotham2011; Old et al. Reference Old2015).

In most mock haloes, the brightest galaxy is assumed to be at rest with respect to the halo centre of mass. To account for this, the velocity dispersion is further corrected by a factor of $\sqrt{N/(N-1)}$ (Eke et al. Reference Eke2004; Robotham et al. Reference Robotham2011). The final velocity dispersion is then given by:

(5) \begin{equation}\sigma = \sqrt{\frac{N}{N-1} \sigma_{\mathrm{gap}}^2 } .\end{equation}

This approach ensures that the velocity dispersion estimates are unbiased and robust, even for low-multiplicity systems, and that the dominant sources of observational uncertainty are properly propagated (Robotham et al. Reference Robotham2011).
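Equations (4) and (5) can be sketched in a few lines of Python. The function names below are hypothetical, and the input is assumed to be an array of line-of-sight velocities in km s $^{-1}$ for the members of a single group.

```python
import numpy as np

def gapper_dispersion(velocities):
    """Sketch of the Gapper estimator of Equation (4)."""
    v = np.sort(np.asarray(velocities, dtype=float))
    n = v.size
    if n < 2:
        raise ValueError("at least two velocities are required")
    gaps = np.diff(v)                # g_i = v_{i+1} - v_i
    i = np.arange(1, n)              # i = 1 ... N-1
    weights = i * (n - i)            # w_i = i (N - i)
    return np.sqrt(np.pi) / (n * (n - 1)) * np.sum(weights * gaps)

def corrected_dispersion(velocities):
    """Sketch of Equation (5): apply the sqrt(N / (N - 1)) correction."""
    n = len(velocities)
    return np.sqrt(n / (n - 1.0)) * gapper_dispersion(velocities)
```

For a pair of galaxies separated by 100 km s $^{-1}$, the uncorrected estimator returns $50\sqrt{\pi} \approx 88.6$ km s $^{-1}$, and the corrected value is larger by a factor of $\sqrt{2}$.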

Figure 4. Comparison of halo mass estimates derived from the velocity dispersion relation before and after calibration, across three semi-analytic models: Shark, SAGE, and GAEA. All samples have a magnitude limit of $i\lt19.2$ and a redshift limit of $z\lt0.1$ . Top Row: The virial theorem halo mass ( $\log M_{\mathrm{halo, \, VT}}$ ) versus the true halo mass ( $\log M_{\mathrm{halo}}$ ) for each model, with colour indicating the logarithmic number density of halos in each bin. The dashed line denotes the one-to-one relation. Second Row: Corresponding residuals ( $\Delta \log M_{\mathrm{halo}} = \log M_{\mathrm{halo, \, VT}} - \log M_{\mathrm{halo}}$ ); error bars indicate the 16th and 84th percentiles in each bin. The mean offset and scatter are indicated in the legend. Third Row: The MVT halo mass estimates ( $\log$ M $_{\mathrm{halo, \, MVT}}$ ), compared to the true halo mass ( $\log M_{\mathrm{halo}}$ ). Bottom Row: Corresponding median of residuals ( $\Delta \log M_{\mathrm{halo}} = \log M_{\mathrm{halo, \, MVT}} - \log M_{\mathrm{halo}}$ ); error bars indicate the 16th and 84th percentiles in each bin. The mean offset and scatter are indicated in the legend.

The top panels of Figure 4 compare the true halo masses from the simulations to those measured using the virial theorem via Equation (3) on the simulated light-cones with a $z\lt0.1$ limit. These results demonstrate that the traditional virial theorem mass estimator is subject to substantial scatter and systematic offsets, particularly at low halo masses and for systems with low multiplicity. As shown in the second row of Figure 4, the mean offset ( $\Delta$ ) between the estimated and true halo masses is significant across all models, reaching values as large as $-2.10$ dex (SAGE), $-1.63$ dex (GAEA), and $-1.19$ dex (Shark). These biases indicate that, without modification, the virial theorem systematically underestimates halo masses, especially in the low-mass regime. In the following section, we present a calibrated virial theorem estimator designed to address these shortcomings and provide more reliable halo mass measurements in group environments.

4.1.2. Calibration of the modified virial theorem

While the virial theorem provides a physically motivated starting point, its direct application to observed galaxy groups is complicated by departures from equilibrium, projection effects, and uncertainties in group membership, especially for low-multiplicity systems (Robotham et al. Reference Robotham2011; Muldrew et al. Reference Muldrew2012; Old et al. Reference Old2015). To account for these effects, we introduce an MVT, which replaces the fixed $\frac{5}{3}$ coefficient in Equation (3) with a scale factor, A, that corrects the mass estimate based on the measured velocity dispersion and projected radius of the group:

(6) \begin{equation} M_{\mathrm{halo,\ MVT}} = A \frac{\sigma^2 R}{G},\end{equation}

where A is a function of both velocity dispersion and group radius. The calibration of A is essential to ensure unbiased halo mass estimates across the full range of group halo masses.

The calibration of the MVT was performed using a Bayesian framework, employing Markov Chain Monte Carlo (MCMC) sampling to explore the posterior probability distribution of the model parameters. The likelihood function was constructed to minimise the scatter in $\log M_{\mathrm{halo}}\,(M_{\odot})$ at fixed predicted halo mass, effectively minimising $\chi^2$ and maximising the log-likelihood. Convergence of the MCMC chains was assessed via autocorrelation analysis. Further details of the likelihood function and the Bayesian inference methodology are provided in Appendix A.

A suite of functional forms for the correction terms $A_{\sigma}$ and $A_{R}$ was tested, including exponential, inverse square, linear, logistic, and sigmoid decay models. The power-law form was found to provide the best fit to the simulation data, as quantified by the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), outperforming alternative models by a substantial margin. The final adopted form for the A coefficient is:

(7) \begin{equation} A= \frac{5}{3} + A_{\sigma} + A_{R},\end{equation}

with the power-law decay models for $A_{\sigma}$ and $A_{R}$ given by:

(8) \begin{equation} A_{\sigma} = \begin{cases} \alpha \left( \left( \frac{ \sigma }{ \sigma_{\lim} } \right)^{n_1} - 1 \right), & \text{if } \sigma \leq \sigma_{\lim}\\ 0, & \text{otherwise} \end{cases},\end{equation}
(9) \begin{equation} A_{R} = \begin{cases} \unicode{x03B2} \left( \left( \frac{ R }{ R_{\lim} } \right)^{n_2} - 1 \right), & \text{if } R \leq R_{\lim}\\ 0, & \text{otherwise} \end{cases},\end{equation}

where $\alpha$ , $\sigma_{\lim}$ , $n_1$ , $\unicode{x03B2}$ , $R_{\lim}$ , and $n_2$ are free parameters determined by the MCMC sampling. The resulting posterior distributions for all model parameters are presented in Figure A1 in Appendix A, demonstrating that the parameters are well-constrained. The best-fitting parameters for the MVT are given in Table 1 for convenience.

Notably, the calibrated $A_{\sigma}$ and $A_{R}$ coefficients decay to zero at the fitted velocity dispersion and group radius limits, reflecting that for larger multiplicity groups, the velocity dispersion and radius are well-defined and no further modification is required. Above these limits, the MVT (Equation 6) reduces to the virial theorem (Equation 3). This procedure ensures that the MVT estimator is unbiased and robust across the full range of group properties, with well-quantified uncertainties and minimal model dependence.
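The structure of Equations (6) to (9) can be sketched as follows. This is an illustrative transcription only: the function names are hypothetical, and all values in `EXAMPLE_PARAMS` are placeholders chosen for demonstration rather than the fitted values of Table 1, with the exception of $R_{\lim} = 0.369$ Mpc, which is the value quoted in the text for $h = 0.7$.

```python
# Gravitational constant in Mpc km^2 s^-2 Msun^-1.
G_MPC = 4.30091e-9

# Placeholder parameters for illustration only (NOT the Table 1 values),
# except r_lim, which is the R_lim quoted in the text for h = 0.7.
EXAMPLE_PARAMS = {
    "alpha": 1.0, "sigma_lim": 200.0, "n1": -1.0,   # hypothetical
    "beta": 1.0, "r_lim": 0.369, "n2": -1.0,        # beta, n2 hypothetical
}

def decay_term(x, amplitude, x_lim, index):
    """Power-law correction of Equations (8) and (9); zero above the limit."""
    if x <= x_lim:
        return amplitude * ((x / x_lim) ** index - 1.0)
    return 0.0

def mvt_coefficient(sigma, radius, params):
    """Scale factor A of Equation (7)."""
    a_sigma = decay_term(sigma, params["alpha"], params["sigma_lim"], params["n1"])
    a_r = decay_term(radius, params["beta"], params["r_lim"], params["n2"])
    return 5.0 / 3.0 + a_sigma + a_r

def mvt_mass(sigma, radius, params):
    """MVT halo mass of Equation (6), in solar masses."""
    return mvt_coefficient(sigma, radius, params) * sigma ** 2 * radius / G_MPC
```

Because both correction terms vanish at and above their limits, `mvt_coefficient` returns exactly $\frac{5}{3}$ for groups with large velocity dispersion and radius, recovering Equation (3).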

Table 1. Best-fitting parameters for the calibrated virial theorem relation.

To convert halo masses between different values of the Hubble parameter $H_0$ (or Hubble constant h), the following scaling should be applied:

(10) \begin{equation} R_{\lim}^{'} = R_{\lim} \cdot \frac{h}{h^{'}},\end{equation}

where $R_{\lim}^{'}$ is the group radius limit for a newly chosen $h'$ , and h is the value used in this work ( $h=0.7$ ). For example, converting to $h'=0.67$ yields $R_{\lim}^{'} = 0.369\cdot \frac{0.7}{0.67}=0.386$ Mpc. Alternatively, a factor of $\log_{10} (h'/h)$ can be added to the calculated $\log M_{\mathrm{halo}}$ from Equation (6) when using the $R_{\lim}$ in Table 1. The projected group radius for a given group should also be recalculated for the chosen h value.

4.1.3. MVT – Accuracy, uncertainty and use cases

The MVT is calibrated to Shark; we then quantify the calibration accuracy and uncertainty on Shark and cross-validate the calibrated estimator against both SAGE and GAEA. The bottom panels of Figure 4 present a quantitative comparison between the predicted halo mass from the MVT (Equation 6) and true halo masses for each model (calibrated to $z \lt 0.1$ and $i\lt19.2$ mag). The accuracy of the MVT is characterised by the mean offset (mean $\Delta$ ) between the predicted and true halo masses, while the uncertainty is quantified by the mean of the $16{\mathrm{th}}$ and $84{\mathrm{th}}$ percentiles (mean $\sigma$ ) of the residuals. These metrics are shown in the bottom panels of Figure 4 as a function of the predicted halo mass for each simulation. On the calibration model (Shark), the mean $\Delta$ is $-0.01$ dex, with a mean $\sigma$ of $0.20$ dex, indicating negligible systematic bias and moderate scatter. On the validation models, SAGE yields a mean $\Delta$ of $0.12$ dex and a mean $\sigma$ of $0.10$ dex, while GAEA exhibits a mean $\Delta$ of $0.07$ dex and a mean $\sigma$ of $0.13$ dex. These results demonstrate that the calibration procedure effectively removes systematic biases and reduces scatter across the full mass range and for all group multiplicities. The calibrated relation yields a high degree of consistency in both normalisation and slope across the models compared to the traditional virial theorem estimator (top panels), with only minor differences from the true halo mass, most notably at the low-mass end in the GAEA model. The small variance in the calibrated relation between the different models reflects the minimal dependence of the virial theorem on the details of baryonic physics, as it is fundamentally anchored in gravitational dynamics.
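The summary statistics described above can be sketched as follows. This is one possible reading of the metrics, stated here as an assumption: in each bin of predicted halo mass, the offset is taken as the median residual and the scatter as half the 16th to 84th percentile range, and both are then averaged over the populated bins. The function name `binned_residual_stats` is hypothetical.

```python
import numpy as np

def binned_residual_stats(log_m_pred, log_m_true, bin_edges, min_count=1):
    """Sketch: mean offset and mean scatter of residuals in bins of predicted mass.

    Residuals are defined as predicted minus true log halo mass.
    Per bin: offset = median residual, scatter = (p84 - p16) / 2 (assumed definition).
    Returns (mean offset, mean scatter) over bins with at least min_count objects.
    """
    log_m_pred = np.asarray(log_m_pred, dtype=float)
    resid = log_m_pred - np.asarray(log_m_true, dtype=float)
    idx = np.digitize(log_m_pred, bin_edges) - 1
    offsets, scatters = [], []
    for b in range(len(bin_edges) - 1):
        r = resid[idx == b]
        if r.size < min_count:
            continue  # skip empty or sparsely populated bins
        p16, p50, p84 = np.percentile(r, [16, 50, 84])
        offsets.append(p50)
        scatters.append(0.5 * (p84 - p16))
    if not offsets:
        raise ValueError("no bins contain enough objects")
    return float(np.mean(offsets)), float(np.mean(scatters))
```

Averaging over bins rather than over individual groups prevents the numerous low-mass groups from dominating the quoted statistics.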

The robustness of the calibrated relation was further tested by varying the magnitude and redshift limits. When imposing a fainter WAVES-wide magnitude limit ( $Z \lt 21.2$ ), the free parameters changed by 1–20%, with the correction factor decreasing due to the deeper limiting magnitude. Extending the redshift range to $z \lt 0.3$ resulted in larger changes of 5–40% in the free parameters, as stronger corrections were required as a function of redshift. These tests demonstrate that the calibrated relation presented here is specifically tailored to GAMA-like selection criteria (i.e. $i \lt 19.2$ ) and to $z \lt 0.1$ . Users applying this relation to datasets with different selection functions or redshift limits should be aware of the systematic biases that may arise, and we suggest recalibrating the relation for their specific survey parameters. Nevertheless, the application of the calibrated relation to non-ideal datasets remains preferable to relying solely on the traditional virial theorem, which does not account for selection effects or redshift-dependent biases.

In summary, the MVT is the least model-dependent method considered in this work, as it does not require knowledge of galaxy properties or of the detailed prescriptions for baryonic processes adopted in the models. This makes it ideally suited for applications where unbiased halo mass estimates are required, such as the construction of the halo mass function (HMF) in spectroscopic surveys or the derivation of cosmological parameters from group catalogues, provided that the relation is applied to the appropriate dataset.

4.2. The summed stellar-to-halo mass relation

The connection between the stellar content of galaxies and their host dark matter haloes is a cornerstone of galaxy formation theory, providing a critical link between observable baryonic properties and the underlying dark matter distribution (Behroozi, Conroy, & Wechsler Reference Behroozi, Conroy and Wechsler2010; Moster, Naab, & White Reference Moster, Naab and White2013; Wechsler & Tinker Reference Wechsler and Tinker2018). While abundance matching and halo occupation models have traditionally been used to infer this relationship on a statistical basis (e.g. Yang, Mo, & van den Bosch Reference Yang, Mo and van den Bosch2003; Vale & Ostriker Reference Vale and Ostriker2004; Behroozi et al. Reference Behroozi, Conroy and Wechsler2010; Moster et al. Reference Moster, Naab and White2013; Kravtsov et al. Reference Kravtsov, Vikhlinin and Meshcheryakov2018), direct group-based approaches offer a powerful means to calibrate halo mass estimators for individual systems (e.g. Viola et al. Reference Viola2015; Lim et al. Reference Lim2021), particularly in the group regime where dynamical methods become less reliable at low multiplicity (Old et al. Reference Old2015; Muldrew et al. Reference Muldrew2012).

4.2.1. Summed stellar mass proxy

In this section, we establish an empirical sSHMR between the summed stellar masses of the three most massive galaxies in a group ( $\Sigma M_{*,\mathrm{3}}$ ) and the group halo mass ( $M_{\mathrm{halo}}$ ). This approach is motivated by the expectation that the most massive group members should provide the most reliable baryonic tracers of the underlying halo mass. By summing the stellar mass of the three most massive galaxies, this method should mitigate stochasticity compared to a typical single galaxy tracer via a stellar-to-halo mass relation (SHMR). The choice of three galaxies represents a balance between competing factors: using only the most massive galaxy may introduce excessive scatter and result in a steeper high-mass slope in the SHMR, where small changes in stellar mass would produce disproportionately large changes in the estimated halo mass. Conversely, extending the sum to more galaxies may yield diminishing returns while adding noise from less reliable tracers. Summing over the three most massive galaxies should flatten the relation at the high-mass end and provide a more stable and physically motivated estimator. Additionally, we assess the robustness of this relation by varying the selection criteria from the GAMA-like limit ( $i \lt 19.2$ ) to a fainter WAVES-wide limit ( $Z \lt 21.2$ ) and by extending the redshift range from $z \lt 0.1$ to $z \lt 0.3$ , examining how the relation responds to different survey parameters.

4.2.2. sSHMR calibration and functional form

We use the same MCMC methodology as in Section 4.1.2. The likelihood function was constructed to minimise the scatter in $\log$ $M_{\mathrm{halo}}$ at fixed $\log \Sigma M_{\star, \, 3}$ ; convergence was assessed via autocorrelation analysis. The resulting posterior distributions for the model parameters are shown in Figure B1, and further details of the likelihood function, the fitting methodology and the uncertainty of the fitted parameters are provided in Appendix B.

The adopted functional form for the sSHMR is a double power-law in the form:

(11) \begin{equation}\begin{array}{@{}l@{}}M_{\mathrm{halo,\, 3}} \,(M_{\odot}) = A \cdot \Sigma M_{*,\, \mathrm{3}} \cdot\Big( \Big( \frac{\Sigma M_{*,\, \mathrm{3}}}{M_A} \Big)^{\unicode{x03B2}}+ \Big( \frac{\Sigma M_{*,\, \mathrm{3}}}{M_A} \Big)^{\gamma} \Big),\end{array}\end{equation}

where $\Sigma M_{*,\mathrm{3}}$ is the sum of the stellar masses of the three most massive galaxies in the group in units of $M_{\odot}$ , A is the normalisation, $M_A$ is the characteristic stellar mass in units of $M_{\odot}$ , and $\unicode{x03B2}$ and $\gamma$ are the low- and high-mass slopes, respectively. The best-fitting parameters from the MCMC sampling are summarised in Table 2.
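Equation (11) can be transcribed directly, as in the sketch below. The function name `sshmr_halo_mass` is hypothetical, and no fitted values are assumed: the normalisation, characteristic mass, and slopes must be supplied by the user from Table 2.

```python
import numpy as np

def sshmr_halo_mass(sum_mstar3, norm, m_a, beta, gamma):
    """Sketch of the double power-law sSHMR of Equation (11).

    sum_mstar3 : summed stellar mass of the three most massive members, in Msun
    norm       : normalisation A
    m_a        : characteristic stellar mass M_A, in Msun
    beta       : low-mass slope
    gamma      : high-mass slope
    Returns the estimated halo mass in Msun.
    """
    sum_mstar3 = np.asarray(sum_mstar3, dtype=float)
    x = sum_mstar3 / m_a
    return norm * sum_mstar3 * (x ** beta + x ** gamma)
```

At $\Sigma M_{*,3} = M_A$ both power-law terms equal unity, so the estimated halo mass is $2 A M_A$, which offers a simple check on any implementation.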

Table 2. Best-fitting parameters for the summed stellar mass–halo mass relation.

To convert the sSHMR between different values of the Hubble parameter $H_0$ (or h), the following scaling should be applied:

(12) \begin{equation} M_{A}^{'} = M_A \cdot \frac{h}{h^{'}},\end{equation}

where $M_{A}^{'}$ is the characteristic stellar mass for a newly chosen $h^{'}$ , and h is the value used in this work ( $h=0.7$ ). For example, converting to $h^{'} = 0.67$ yields $\log M_{A}^{'} = \log (10^{10.483} \cdot \frac{0.7}{0.67}) = 10.502$ . As the characteristic stellar mass is derived from simulations, there is only one factor of $h^{-1}$ rather than the typical $h^{-2}$ for observations. It is essential to ensure that both the stellar masses used in the sSHMR and the characteristic mass parameter $M_A$ are consistently defined with respect to the chosen value of h. Any conversion of the sSHMR to a different Hubble parameter must be accompanied by a corresponding adjustment of these quantities.
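The scalings in Equations (10) and (12) have the same form, so a single helper suffices to reproduce both worked examples. The function name `rescale_inverse_h` is hypothetical; the input values are those quoted in the text.

```python
import math

def rescale_inverse_h(value, h_old=0.7, h_new=0.67):
    """Rescale a quantity proportional to 1/h from h_old to h_new."""
    return value * h_old / h_new

# Equation (10): group radius limit in Mpc.
r_lim_new = rescale_inverse_h(0.369)
print(round(r_lim_new, 3))              # 0.386

# Equation (12): characteristic stellar mass, quoted as log10(M_A).
m_a_new = rescale_inverse_h(10.0 ** 10.483)
print(round(math.log10(m_a_new), 3))    # 10.502
```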

4.2.3. sSHMR – Accuracy, uncertainty and use cases

Following the same methods as in Section 4.1.3, the sSHMR is calibrated to Shark, where we test the accuracy and uncertainty of the calibration; we then cross-validate it with SAGE and GAEA as independent tests. The accuracy of the sSHMR calibration is characterised by the mean $\Delta$ between the predicted and true halo masses, while the uncertainty is quantified by the mean $\sigma$ . Both are presented in the second and last rows of Figure 5 as a function of both the sum of the primary three stellar masses ( $\Sigma M_{\star,\,3}$ ) and the predicted halo mass ( $M_{\mathrm{halo,\,3}}$ ) for each model. On the calibration model (Shark), the mean $\Delta$ is $0.01$ dex as a function of the summed masses, with a mean $\sigma$ of $0.12$ dex; as a function of the predicted halo mass, the mean $\Delta$ is $0.02$ dex with a mean $\sigma$ of $0.14$ dex, indicating negligible systematic bias. Compared to the MVT in Section 4.1.3, the sSHMR achieves similar accuracy in the Shark model but with roughly 30% lower uncertainty (mean $\sigma = 0.14$ dex versus $0.20$ dex as a function of the predicted halo mass).

Figure 5. Comparison of baryonic and halo mass relations across three SAMs: Shark, SAGE, and GAEA. The Shark sample has a fainter magnitude limit of $Z\lt21.2$ and a deeper redshift limit of $z\lt0.3$ compared to SAGE and GAEA ( $i\lt19.2$ and $z\lt0.1$ ). Top Row: The sSHMR using the summed stellar mass of the three most massive group galaxies ( $\log \sum {M}_{*,3}$ ) and the true halo mass ( $\log M_{halo}$ ) is shown for each SAM, with the colour scale indicating the logarithm of the number of groups per bin. The black dashed line in each panel represents the MCMC-derived fit to the Shark data, optimised to minimise scatter in halo mass. This same fit is overlaid on the SAGE and GAEA panels to illustrate model dependence. Second Row: The residuals in halo mass ( $\Delta\log M_{halo}=\log M_{\mathrm{halo,\,3}}-\log M_{halo}$ ) are shown as a function of the summed stellar mass. Third Row: Estimated halo masses, derived by applying the Shark-calibrated sSHMR, are plotted against the true halo masses from each simulation. The black dashed line denotes the one-to-one relation. The close alignment of the points along this line in all models demonstrates that the Shark-based calibration provides robust halo mass estimates, with low bias and scatter, even when applied to independent SAMs. Bottom Row: The residuals in halo mass are shown as a function of the estimated halo mass. In both residual panels, the colour scale again indicates the logarithm of the number of groups per bin.

Through cross-validation, the SAGE and GAEA models yield mean $\Delta$ values of $0.21$ and $0.37$ dex, respectively, and mean $\sigma$ values of $0.10$ and $0.12$ dex, respectively, as a function of the summed stellar masses. As a function of the predicted halo mass, SAGE and GAEA yield mean $\Delta$ values of $0.13$ and $0.33$ dex, respectively, and mean $\sigma$ values of $0.11$ and $0.12$ dex, respectively. The differences in the sSHMR between models are more pronounced than those seen for the MVT. These discrepancies arise from the varying treatments of star formation efficiency, feedback, and merger-driven stellar mass growth in each SAM, as well as differences in the underlying dark matter merger trees and halo assembly histories. Such model dependencies are an inherent limitation of baryonic tracers, as the stellar mass content of halos is sensitive to the adopted physical prescriptions and calibration strategies.

As hypothesised, the sSHMR proved to be a more effective approach than a typical SHMR. This method reduces the impact of stochasticity compared to using a single galaxy tracer, while also avoiding the diminishing returns observed when incorporating more than three members. Importantly, extending the relation to include more than three galaxies would, by definition, exclude groups with fewer than four members from being appropriately sampled. Such groups would require a separate relation, introducing additional complexity and reducing the universality of the method. Relying solely on the most massive galaxy leads to a steep slope at the high-mass end, where small increases in stellar mass correspond to disproportionately large increases in halo mass. In contrast, the use of $\Sigma M_{\star,\,3}$ flattens this relation, producing a more gradual and stable increase in halo mass with stellar mass. This improvement enhances its suitability for observational studies, as uncertainties in stellar mass estimates translate to significantly smaller errors in the inferred halo mass. The relation between $\Sigma M_{\star,\,3}$ and the true halo mass exhibits a high degree of consistency in overall form, but with notable differences in normalisation and slope between the models, as shown in Figure 5. These differences reflect the distinct implementations of baryonic physics, feedback, and merger histories in each SAM, which influence both stellar mass assembly and the mapping between baryonic and dark matter components.

Testing the robustness of the relation across different selection criteria demonstrated that extending the redshift range to $z \lt 0.3$ produced no significant changes, with the fitted free parameters varying by less than 5%. The relation shown in Figure 5 is calibrated on Shark using the fainter WAVES-wide limit ( $Z \lt 21.2$ ) extended to $z \lt 0.3$ . This is the sample shown in the Shark panels of Figure 5, whilst the SAGE and GAEA panels both display samples with $i \lt 19.2$ and $z \lt 0.1$ . Importantly, using the brighter GAMA-like limit ( $i \lt 19.2$ ) fails to capture the power-law slope at the low-mass end, making it unsuitable for calibrating this relation. The fainter $Z \lt 21.2$ limit captures the low-mass regime, enabling a complete characterisation of the sSHMR. Crucially, while the relation itself does not depend on the magnitude limit, the ability to define this relation does. This means the calibrated relation can be applied to any selection function with a brighter limit than $Z \lt 21.2$ , making it applicable to most existing group catalogues. Furthermore, since this method uses stellar mass rather than luminosity as the mass tracer, it can be applied to any group catalogue where reliable stellar mass measurements are available.

Figure 6. Empirical HMF for galaxy groups with three or more members, constructed from the group catalogue of Van Kempen et al. (Reference Van Kempen2024). The HMF is shown separately for groups in the SGP region dominated by 2dF coverage (red triangles), the G23 region with GAMA spectroscopy (orange diamonds), and the G23 region with 2dF spectroscopy (green crosses). Halo masses are estimated using the traditional and calibrated virial theorem method ( $\log M_{\mathrm{halo,\,VT}}$ or $\log M_{\mathrm{halo,\,MVT}}$ ). Number densities are not corrected for survey volume or selection effects, due to the heterogeneous nature of the group sample and the challenges in defining a complete selection function. The solid black curve represents the analytic HMF prediction for the adopted cosmology of Shark, while the grey dashed line shows the empirical fit from Driver et al. (Reference Driver2022a) based on GAMA5, SDSS5, and REFLEX II data. Error bars reflect Poisson uncertainties.

It is important to note that, while the sSHMR provides the most precise and lowest-scatter halo mass estimates, it is also the most model-dependent of the methods considered. The distribution of halo masses is tied to both the HMF and SMF of the input model – Shark. As a result, this halo mass estimator is not suitable for observationally constructing the HMF or for applications that aim to produce unbiased cosmological parameters from halo masses. For such applications, the MVT method is preferred, as its model dependence is negligible. Nevertheless, for studies focused on group-scale halo mass estimation, where the primary goal is to minimise residuals and uncertainties and to obtain a precise halo mass measurement, the sSHMR is the optimal choice, provided that reliable stellar mass measurements are available.

In summary, the calibrated sSHMR offers a powerful and practical tool for estimating group halo masses, delivering the highest precision among the methods tested. Its application should be restricted to contexts where model dependence is not a limiting factor. The method’s versatility is demonstrated by its applicability to any group catalogue with stellar mass measurements and its robustness across different survey selection functions, provided they have magnitude limits brighter than $Z \lt 21.2$ . Critically, users must ensure that their stellar mass measurements are analogous to ProSpect-derived stellar masses, as the Shark model used for this calibration is specifically tailored to ProSpect stellar masses. Failure to account for any systematic differences between stellar masses may introduce biases in the resulting halo mass estimates, particularly at the high-mass end.

5. Observational application of halo masses

The calibrated halo mass estimators developed in this work enable a range of new applications in the analysis of spectroscopic group catalogues. Here, we present two representative examples using the SGP group sample from Van Kempen et al. (Reference Van Kempen2024): (i) the construction of an empirical HMF using the MVT, and (ii) the mapping of the quenched fraction of galaxies in the stellar mass–halo mass plane using the sSHMR.

5.1. Halo mass function from observational groups

The HMF is a fundamental cosmological observable that encodes information about the growth of structure and the underlying cosmological parameters (e.g. Press & Schechter 1974; Jenkins et al. 2001; Murray et al. 2013). In Figure 6, we present the empirical HMF for galaxy groups with $N_{m} \geq 4$ members. The sample is divided into three subsamples: SGP-2dF (red triangles), which uses 2dFGRS photometry across the entire sample; G23-2dF (green crosses), which uses 2dFGRS photometry in the G23 region; and G23-GAMA (orange diamonds), which uses GAMA photometry.

Unlike the SMF, which is based on uniform measurements of individual galaxies, constructing the HMF requires consistent data for all members within each group and a well-defined selection function to enable $1/V_{\mathrm{max}}$ corrections. However, the observational group sample is heterogeneous, with the formation and detection of groups often drawing from multiple spectroscopic surveys that have varying completeness and magnitude limits (see Van Kempen et al. 2024). This diversity makes it challenging to accurately determine the completeness for each group, and thus we do not apply the $1/V_{\mathrm{max}}$ correction to the observational HMF. The absence of this correction means that the HMF is not fully corrected for incompleteness, particularly at the low-mass end. Low-mass groups are more likely to be missed because their member galaxies often fall below the survey’s limiting magnitude, and even when detected, such groups are typically found only at very low redshifts, where the survey volume is small and statistical uncertainties are larger. As a result, the observed HMF underestimates the true abundance of low-mass haloes, leading to an artificial drop-off in this range. Therefore, the HMF presented here should be regarded as a demonstration of the methodology rather than a definitive measurement.
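For illustration, an uncorrected HMF of the kind described above reduces to counting groups in bins of $\log M_{\mathrm{halo}}$ and dividing by a single survey volume and the bin width. The following Python sketch shows this under stated assumptions: the function name, bin limits, bin width, and volume are hypothetical, Poisson errors are assumed, and no $1/V_{\mathrm{max}}$ weighting is applied. It is not the code used to produce Figure 6.

```python
import math


def binned_hmf(log_masses, volume_mpc3, lo=11.0, hi=15.0, width=0.25):
    """Uncorrected halo mass function: groups per Mpc^3 per dex.

    log_masses  : iterable of log10(M_halo / M_sun) values
    volume_mpc3 : single comoving survey volume (hypothetical input)
    lo, hi, width : bin limits and bin width in dex (illustrative defaults)
    Returns (bin_centres, phi, phi_err), with Poisson errors on the counts.
    """
    n_bins = int(round((hi - lo) / width))
    counts = [0] * n_bins
    for m in log_masses:
        idx = math.floor((m - lo) / width)
        if 0 <= idx < n_bins:  # masses outside [lo, hi) are ignored
            counts[idx] += 1
    centres = [lo + (i + 0.5) * width for i in range(n_bins)]
    norm = volume_mpc3 * width  # same volume for every group: no 1/Vmax weighting
    phi = [c / norm for c in counts]
    phi_err = [math.sqrt(c) / norm for c in counts]
    return centres, phi, phi_err
```

Because every group is assigned the same volume, low-mass bins that are only sampled at low redshift will be biased low, which is the artificial drop-off described in the text.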

To ensure reliable binning of the HMF, the provided HMFs include groups with $N_{m} \geq 3$ detected in each survey (SGP-2dF, G23-2dF, and G23-GAMA). For the SGP-2dF sample, an additional restriction to $z \lt 0.08$ is imposed to minimise the effects of declining completeness at higher redshifts. The HMFs shown in Figure 6 are derived using both the standard (left panel: Equation 6) and MVT (right panel: Equation 6) estimators. As discussed in Section 4.1.1 and illustrated in Figure 4, the standard virial theorem estimator predicts an excess of very low-mass group haloes. When considering the observational survey volume and the presence of a local void at $z \sim 0.02$ (Driver et al. 2022b; Van Kempen et al. 2024), such low-mass systems are not expected to be sampled within this dataset. This underestimation of halo masses by the virial theorem could be misinterpreted as a genuine feature in the absence of this prior knowledge. In contrast, the MVT corrects for this bias, yielding a halo mass function that is more consistent with theoretical and practical expectations and providing a more reliable representation of the underlying group halo mass distribution.

The resulting HMFs are compared both to the analytic prediction for the adopted cosmology of Shark and to the empirical fit derived by Driver et al. (2022a). The empirical fit from Driver et al. (2022a) is based on halo masses drawn from three low-redshift group/cluster catalogues: the GAMA group catalogue of Robotham et al. (2011), the SDSS Data group catalogue of Tempel et al. (2017), and the ROSAT-ESO Flux Limited X-ray Galaxy Cluster Survey (REFLEX II; Böhringer et al. 2013; Böhringer, Chon, & Collins 2014; Böhringer, Chon, & Fukugita 2017). Together these span the mass range from galaxy groups to massive clusters, enabling a continuous measurement of the $z=0$ HMF. It is important to note that the dataset used by Driver et al. (2022a) to establish their empirical relation is substantially larger and more homogeneous than the sample analysed in this work. Nevertheless, our non-idealised and considerably smaller dataset yields a HMF that aligns well with theoretical expectations at the high to moderate halo mass range ($\log M_{halo}\gt12.75\,({\rm M}_{\odot})$) and underscores the significant progress made in the development of accurate halo mass estimators. However, at lower halo masses, the observed HMFs exhibit clear signs of incompleteness, further emphasising the necessity for larger and more homogeneous samples to robustly characterise this regime and improve observational constraints on the low-mass end.

Looking ahead, upcoming wide-area spectroscopic surveys such as WAVES (Driver et al. 2019) and 4HS (Taylor et al. 2023) will provide the depth, completeness, and sky coverage required to construct high-precision HMFs over a wide mass range. With sufficiently large samples and accurate halo mass estimates, such surveys will enable the use of the HMF as a cosmological probe, constraining parameters such as $\Omega_m$, $\sigma_8$, and potentially the nature of dark energy (e.g. Murray et al. 2013; Bocquet et al. 2019). The principal limiting factor for future HMF-based cosmological analyses is likely to be the uncertainty in group identification and membership assignment, particularly at low multiplicity, rather than the precision of the halo mass estimator itself.

5.2. Quenched fraction in the stellar mass–halo mass plane

The second application demonstrates the utility of the sSHMR estimator for investigating the interplay between galaxy quenching, stellar mass, and environment. To ensure compatibility with the sSHMR calibration detailed in Section 4.2.3, we applied a systematic offset of +0.13 dex to all observational stellar mass estimates, bringing them into alignment with the ProSpect-like stellar masses used in the Shark calibration. This adjustment is essential for accurate halo mass estimation using Equation (11) (see Section 4.2.3 for more details). Figure 7 presents the distribution of the quenched fraction in the stellar mass–halo mass plane for the mock datasets, while Figure 8 shows the corresponding distributions for the observational dataset. The quenched fraction is defined as the fraction of galaxies with $\log\,\mathrm{sSFR} \lt -11.0~(\mathrm{yr}^{-1})$, following the criteria established in Van Kempen et al. (2024), with observational SFR upper limits classified as quenched if they meet the W3 detection limit for their given distance, otherwise conservatively classified as star-forming (see Van Kempen et al. 2024 for full details regarding upper-limit handling).
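The stellar mass input to the sSHMR can be sketched as follows: apply the +0.13 dex offset to each member's logarithmic stellar mass, then sum the three most massive members in linear units. This is a minimal Python illustration, not the pipeline used in this work. The function name is hypothetical, and the handling of groups with fewer than three members (summing whatever members exist) is an assumption made only for this example.

```python
import math


def summed_top3_log_mass(log_mstar_members, offset_dex=0.13):
    """log10 of the summed stellar mass of the three most massive members.

    log_mstar_members : log10(M_star / M_sun) for each group member
    offset_dex        : systematic offset added to every stellar mass
                        (+0.13 dex in the text, for the observational masses)
    """
    if not log_mstar_members:
        raise ValueError("group has no members")
    # Shift every mass, then keep the three largest (fewer if the group is
    # smaller; that fallback is an assumption of this sketch).
    shifted = sorted((m + offset_dex for m in log_mstar_members), reverse=True)[:3]
    return math.log10(sum(10.0 ** m for m in shifted))
```

Since the offset is a constant in dex, applying it before or after the sum gives the same result; it is applied per galaxy here only to mirror the description in the text.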

Figure 7. Distribution of quenched fraction in the stellar mass–halo mass plane for galaxy groups in the SHARK, SAGE, and GAEA simulations. The colour scale indicates the fraction of quenched galaxies (defined as those with $\log\,\mathrm{sSFR} \lt -11.0$) within each region of parameter space, ranging from blue (predominantly star-forming) to red (predominantly quenched). The underlying distribution is computed using a two-dimensional kernel density estimate (KDE) with adaptive smoothing. The quenched fraction is calculated as the ratio of the KDE-weighted density of quenched galaxies to the total KDE-weighted density in each cell. Black contours show the underlying density of individual galaxies. Halo masses are estimated using the calibrated relation between the sum of the stellar masses of the three most massive group galaxies and the group halo mass.

The underlying distribution is computed using a two-dimensional kernel density estimate (KDE) with adaptive smoothing. The KDE is evaluated on a regular grid with the kernel full width at half maximum chosen to be approximately twice the typical uncertainty in the stellar and halo mass ($0.15$ and $0.3$ dex, respectively). This approach reduces bias from mass errors (stellar and halo) in the smoothed contours, ensuring that the observed trends are not artificially sharpened by noise or underestimated uncertainties. The quenched fraction in each cell is calculated as the ratio of the KDE-weighted density of quenched galaxies to the total KDE-weighted density, providing a robust estimate even in regions of low sampling density. Black contours indicate the underlying density of individual galaxies for reference.
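The ratio of KDE-weighted densities described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the kernel is taken to be a fixed-width Gaussian (the text describes adaptive smoothing and does not name the kernel shape), the FWHM values of 0.30 and 0.60 dex are twice the quoted uncertainties, and the function and argument names are hypothetical.

```python
import math

# Conversion from full width at half maximum to Gaussian standard deviation.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def kde_quenched_fraction(points, quenched, grid_x, grid_y,
                          fwhm_x=0.30, fwhm_y=0.60):
    """Quenched fraction on a regular grid from kernel-weighted densities.

    points   : list of (log_mstar, log_mhalo) pairs, one per galaxy
    quenched : list of booleans, True where the galaxy is quenched
    grid_x, grid_y : grid coordinates in log_mstar and log_mhalo
    Returns frac[iy][ix]; cells with zero total weight are NaN.
    """
    sx = fwhm_x * FWHM_TO_SIGMA
    sy = fwhm_y * FWHM_TO_SIGMA
    frac = []
    for gy in grid_y:
        row = []
        for gx in grid_x:
            total = 0.0
            quenched_sum = 0.0
            for (x, y), is_q in zip(points, quenched):
                # Kernel normalisation cancels in the ratio, so it is omitted.
                w = math.exp(-0.5 * (((gx - x) / sx) ** 2 + ((gy - y) / sy) ** 2))
                total += w
                if is_q:
                    quenched_sum += w
            row.append(quenched_sum / total if total > 0.0 else float("nan"))
        frac.append(row)
    return frac
```

Because both densities share the same kernel, the result is a smooth local average of the quenched flag, which stays defined wherever any galaxy contributes non-zero weight.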

The simulated quenched fraction distributions in Figure 7 highlight the diversity in how different galaxy formation models implement satellite quenching. Notably, the GAEA simulation shows a striking dependence of the quenched fraction almost exclusively on stellar mass, with minimal variation as a function of halo mass, suggesting that internal processes dominate quenching in this model. In contrast, both SHARK and SAGE display more complex dependencies, with varying degrees of environmental influence on quenching. In the observational data, there is a pronounced transition in the quenched fraction at $M_\star \sim 10^{10}\,{\rm M}_\odot$, indicating a strong stellar mass threshold for quenching, alongside a secondary, smaller dependence on halo mass. These comparisons underscore the need for improved observational constraints to better calibrate and distinguish between quenching models, particularly in the treatment of environmental effects.

Looking ahead, the larger and more complete samples provided by upcoming surveys such as WAVES and 4HS will enable this analysis to be extended in greater detail, including the incorporation of cosmic web location and other environmental metrics. This will provide new opportunities to disentangle the relative roles of internal and external quenching mechanisms in galaxy evolution.

In summary, these examples illustrate the power and flexibility of the calibrated halo mass estimators developed in this work for a range of observational applications, from cosmological tests with the HMF to detailed studies of galaxy evolution in group environments.

6. Summary and conclusions

Accurate estimation of dark-matter halo masses for galaxy groups is fundamental both for studies of galaxy evolution and for exploiting group catalogues as cosmological probes. In this work we developed, calibrated, and validated two complementary observational halo-mass estimators: a modified virial-theorem estimator (MVT) and a summed stellar–halo mass relation (sSHMR) that uses the sum of the stellar masses of the three most massive group members as the predictor. The estimators were constructed using realistic mock light cones and survey selections (Sections 4.1 and 4.2); both were calibrated on the fiducial Shark SAM and subsequently validated on SAGE and GAEA catalogues to quantify model dependence. Performance was quantified in terms of accuracy, uncertainty, and variations between models (Sections 4.1.3, 4.2.3); the MVT was designed to minimise model dependence while the sSHMR provides the lowest scatter where reliable stellar masses are available. Finally, the practical utility of the calibrated estimators was demonstrated in two observational applications, the empirical halo mass function and the mapping of quenched fractions in the stellar mass–halo mass plane (Section 5).

Figure 8. Distribution of quenched fraction in the stellar mass–halo mass plane for galaxy groups in the observational dataset of Van Kempen et al. (2024). The colour scale represents the fraction of quenched galaxies (defined as those with $\log\,\mathrm{sSFR} \lt -11.0$) within each region of parameter space, ranging from blue (predominantly star-forming) to red (predominantly quenched). The underlying distribution is computed using a two-dimensional kernel density estimate (KDE) with adaptive smoothing. The quenched fraction is calculated as the ratio of the KDE-weighted density of quenched galaxies to the total KDE-weighted density in each cell. Black contours indicate the underlying density of individual galaxies. The halo masses used in this figure are estimated using the calibrated sSHMR between the sum of the stellar masses of the three most massive group galaxies and the group halo mass. This figure highlights the dependence of quenching on both stellar and halo mass, with the quenched fraction increasing towards higher masses in both dimensions.

The primary results of this study are as follows:

  1. Modified Virial Theorem: We present a robust, physically motivated modification of the virial theorem for group-scale halo mass estimation, incorporating corrections for velocity dispersion and projected radius (Section 4.1). The calibrated estimator achieves negligible systematic bias (mean $\Delta \sim 0.01$ dex) and moderate scatter ($\sigma \sim 0.23$–$0.25$ dex) across all tested SAMs, with minimal dependence on baryonic physics (Section 4.1.3, Figure 4).

  2. Summed Stellar-to-Halo Mass Relation: We calibrate an empirical relation between the sum of the stellar masses of the three most massive group galaxies and the halo mass, optimised via Bayesian MCMC (Section 4.2). This method yields the highest precision among those tested, with typical scatter as low as $0.12$ dex in the primary calibration sample, but exhibits greater model dependence due to its sensitivity to the underlying baryonic physics and stellar mass assembly (Section 4.2.3, Figure 5).

  3. Comparison Across Models: Both estimators were calibrated to Shark and cross-validated with SAGE and GAEA, demonstrating robust performance. The virial theorem calibration is largely insensitive to the details of the SAM, while the sSHMR shows greater variance, particularly in models with differing feedback and mass resolution (notably GAEA at low masses; Sections 4.1.3, 4.2.3).

  4. Observational Applications: We showcase two key applications: (i) the construction of the empirical HMF using the calibrated virial theorem, demonstrating the feasibility of this approach for future spectroscopic surveys (Section 5, Figure 6); and (ii) the mapping of the quenched fraction in the stellar mass–halo mass plane using sSHMR-based halo masses, revealing the joint dependence of quenching on both stellar and halo mass (Section 5, Figure 8).

  5. Guidance for Future Work: For cosmological applications, such as HMF construction and parameter inference, the calibrated virial theorem is recommended due to its minimal model dependence. For studies focused on group-scale halo mass estimation and environmental effects, the sSHMR provides the highest precision, provided reliable stellar mass measurements are available (Sections 4.1.3, 4.2.3, 5).

In conclusion, the calibrated halo mass estimators developed in this work provide a robust foundation for future analyses of group catalogues in both simulated and observational contexts. Their application will be particularly valuable for upcoming wide-area spectroscopic surveys, enabling precise studies of galaxy evolution and the dark matter halo population, and facilitating the use of group catalogues as cosmological probes.

Acknowledgements

We thank the anonymous referee for helpful comments and suggestions that have improved the content and clarity of this paper. M.E.C. is a recipient of an Australian Research Council Future Fellowship (project no. FT170100273) funded by the Australian Government. D.J.C. is a recipient of an Australian Research Council Future Fellowship (project no. FT220100841) funded by the Australian Government. This publication makes use of data products from the Wide-field Infrared Survey Explorer, which is a joint project of the University of California, Los Angeles, and the Jet Propulsion Laboratory/California Institute of Technology, funded by the National Aeronautics and Space Administration. GAMA is a joint European-Australasian project based around a spectroscopic campaign using the Anglo-Australian Telescope. The GAMA input catalogue is based on data taken from the Sloan Digital Sky Survey and the UKIRT Infrared Deep Sky Survey. Complementary imaging of the GAMA regions is being obtained by a number of independent survey programmes including GALEX MIS, VST KiDS, VISTA VIKING, WISE, Herschel-ATLAS, GMRT, and ASKAP providing UV to radio coverage. GAMA is funded by the STFC (UK), the ARC (Australia), the AAO, and the participating institutions. The GAMA website is https://www.gama-survey.org/. Based on observations made with ESO Telescopes at the La Silla Paranal Observatory under programme ID 177.A-3016. This research has made use of python (https://www.python.org) and python packages: astropy (Astropy Collaboration et al. 2013, 2018, 2022), cmasher (van der Velden 2020), emcee (Foreman-Mackey et al. 2013), matplotlib http://matplotlib.org/ (Hunter 2007), NumPy http://www.numpy.org/ (van der Walt, Colbert, & Varoquaux 2011), Pandas (McKinney 2010), and SciPy https://www.scipy.org/ (Virtanen et al. 2020).

Appendix A. Virial theorem fitting parameters

The calibration of the MVT was performed using a Bayesian MCMC approach, enabling robust exploration of the posterior probability distributions for all model parameters. The likelihood function was constructed to minimise the scatter in $\log M_{\mathrm{halo}}$ at fixed predicted mass, and is defined as

(A1) \begin{equation} \ln \mathcal{L} = -\frac{1}{2} \sum_{i} \left[ \frac{(y_i - \mu_i)^2}{\sigma^2} + \ln(2\pi \sigma^2) \right],\end{equation}

where $y_i$ is the true halo mass, $\mu_i$ is the predicted halo mass from the calibrated virial theorem, and $\sigma$ is the standard deviation of the residuals. The model incorporates a power-law correction to the virial coefficient, parameterised as described in Section 4.1, with the functional form:

(A2) \begin{equation} A = \frac{5}{3} + \alpha \left[ \left( \frac{\sigma}{\sigma_{\mathrm{lim}}} \right)^{n_1} - 1 \right] + \beta \left[ \left( \frac{R}{R_{\mathrm{lim}}} \right)^{n_2} - 1 \right],\end{equation}

where $\sigma$ is the group velocity dispersion, $R$ is the projected group radius, and $\alpha$, $\sigma_{\mathrm{lim}}$, $\beta$, $R_{\mathrm{lim}}$, $n_1$, and $n_2$ are free parameters.
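Equation (A2) translates directly into a short function. The Python sketch below is illustrative only: the function and argument names are hypothetical, no units are assumed beyond requiring that each quantity share units with its corresponding limit, and the best-fitting parameter values from Table A1 are not reproduced here.

```python
def virial_coefficient(sigma_v, radius, alpha, sigma_lim, beta, r_lim, n1, n2):
    """Modified virial coefficient A of Equation (A2).

    sigma_v   : group velocity dispersion (same units as sigma_lim)
    radius    : projected group radius (same units as r_lim)
    alpha, sigma_lim, beta, r_lim, n1, n2 : the six free parameters
    """
    dispersion_term = alpha * ((sigma_v / sigma_lim) ** n1 - 1.0)
    radius_term = beta * ((radius / r_lim) ** n2 - 1.0)
    return 5.0 / 3.0 + dispersion_term + radius_term
```

By construction both correction terms vanish when $\sigma = \sigma_{\mathrm{lim}}$ and $R = R_{\mathrm{lim}}$, so the coefficient reduces to $5/3$ at that point.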

Table A1. Best-fitting coefficients for the modified virial theorem relation, including their 16th and 84th percentile scatter.

Figure A1. Posterior distributions and covariances for the six free parameters of the calibrated dispersion-based halo mass relation, as determined by MCMC sampling on the Shark model. The corner plot displays the marginalised one-dimensional distributions along the diagonal, with median values and 68% credible intervals indicated, and the two-dimensional projections of the posterior for each parameter pair in the off-diagonal panels. The parameters correspond to the normalisation and scaling of the velocity dispersion and maximum projected separation terms ($\alpha$, $\sigma_{\mathrm{lim}}$, $\beta$, $R_{\mathrm{lim}}$, $n_1$, $n_2$) in the power-law model for halo mass. The MCMC analysis was performed using the emcee sampler, with convergence assessed via autocorrelation time. This figure demonstrates that all parameters are well-constrained and highlights the correlations between them, providing a robust statistical foundation for the calibrated group halo mass estimator.

Figure B1. Posterior distributions and covariances for the four free parameters of the summed stellar mass–halo mass relation, as determined by MCMC sampling on the Shark model. The corner plot displays the marginalised one-dimensional posterior distributions for each parameter (normalisation $A$, characteristic mass $M_A$, low-mass slope $\beta$, and high-mass slope $\gamma$) along the diagonal, with median values and 68% credible intervals indicated. Off-diagonal panels show the two-dimensional projections of the posterior for each parameter pair, highlighting correlations and degeneracies. The MCMC analysis was performed using the emcee sampler, with convergence assessed via autocorrelation time. This figure demonstrates that all parameters are well-constrained, providing a robust statistical foundation for the calibrated relation between the sum of the stellar masses of the three most massive group galaxies and their host halo mass.

Uniform priors were adopted for all parameters within physically motivated bounds. The MCMC sampling was performed using an ensemble sampler with 40 walkers and up to $2 \times 10^6$ steps, with convergence assessed via the integrated autocorrelation time. Burn-in and thinning were determined dynamically based on the final autocorrelation time, ensuring that the posterior samples are independent and representative of the converged distribution.
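One common way to set burn-in and thinning from the autocorrelation time is sketched below in Python. The specific factors (discarding twice the longest autocorrelation time and thinning by half the shortest) follow a convention often used with emcee and are an assumption here; the text does not state which factors were adopted. The function name is hypothetical.

```python
def burn_in_and_thin(tau, burn_factor=2.0, thin_factor=0.5):
    """Choose burn-in and thinning from integrated autocorrelation times.

    tau         : one autocorrelation time (in steps) per parameter
    burn_factor : multiples of the longest tau to discard (assumed value)
    thin_factor : fraction of the shortest tau used as the thinning
                  interval (assumed value)
    Returns (burn_in, thin) as integers, with thin at least 1.
    """
    burn_in = int(burn_factor * max(tau))
    thin = max(1, int(thin_factor * min(tau)))
    return burn_in, thin
```

With emcee, the autocorrelation times would typically come from `sampler.get_autocorr_time()`, and the two returned values would be passed as `discard` and `thin` to `sampler.get_chain()`.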

The resulting posterior distributions and covariances for the six free parameters are shown in Figure A1. The median values and 68% confidence intervals for each of the coefficients are shown in Table A1.

The corner plot in Figure A1 illustrates that all parameters are well-constrained, with physically plausible correlations and no significant degeneracies. This calibration provides a statistically robust foundation for the application of the virial theorem-based halo mass estimator to both simulated and observational group catalogues. For further details on the calibration methodology and performance, see Sections 4.1 and 4.1.3.

Appendix B. SHMR fitting parameters

The calibration of the sSHMR was performed using a Bayesian MCMC approach, enabling a rigorous exploration of the posterior probability distributions for all model parameters. The adopted functional form for the SHMR is a double power-law,

(B1) \begin{equation} M_{\mathrm{halo}} = A\, \Sigma M_{*,\mathrm{3}} \left[ \left( \frac{\Sigma M_{*,\mathrm{3}}}{{M_A}} \right)^{\beta} + \left( \frac{\Sigma M_{*,\mathrm{3}}}{{M_A}} \right)^{\gamma} \right],\end{equation}

where $\Sigma M_{*,\mathrm{3}}$ is the sum of the stellar masses of the three most massive galaxies in the group, $A$ is the normalisation, $M_A$ is the characteristic stellar mass, and $\beta$ and $\gamma$ are the low- and high-mass slopes, respectively.
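The double power-law of Equation (B1) can be evaluated in logarithmic units as in the following Python sketch. The function and argument names are hypothetical, and passing the characteristic mass as $\log_{10} M_A$ is a choice made for this example only; the fitted values from Table B2 are not reproduced here.

```python
import math


def sshmr_log_halo_mass(log_sum_mstar, norm_a, log_m_a, beta, gamma):
    """log10(M_halo) from Equation (B1).

    log_sum_mstar : log10 of the summed stellar mass of the three most
                    massive group galaxies
    norm_a        : normalisation A (linear, must be positive)
    log_m_a       : log10 of the characteristic stellar mass M_A
    beta, gamma   : low- and high-mass slopes
    """
    x = 10.0 ** (log_sum_mstar - log_m_a)  # summed stellar mass over M_A
    return math.log10(norm_a) + log_sum_mstar + math.log10(x ** beta + x ** gamma)
```

At $\Sigma M_{*,3} = M_A$ the bracket equals 2 regardless of the slopes, so the predicted halo mass there is $2 A M_A$, which gives a quick sanity check on any implementation.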

The likelihood function was constructed to minimise the scatter in $\log M_{\mathrm{halo}}$ at fixed $\Sigma M_{*,\mathrm{3}}$, and is given by

(B2) \begin{equation} \ln \mathcal{L} = -\frac{1}{2} \sum_{i} \left[ \frac{(y_i - \mu_i)^2}{\sigma^2} + \ln(2\pi \sigma^2) \right],\end{equation}

where $y_i$ is the true halo mass, $\mu_i$ is the predicted halo mass from the SHMR, and $\sigma$ is the standard deviation of the residuals. Uniform priors were adopted for all parameters within physically motivated bounds.
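The Gaussian log-likelihood of Equation (B2), which has the same form as Equation (A1), can be written compactly. The Python sketch below is a minimal illustration with hypothetical names; it takes $\sigma$ as an input, and returning negative infinity for non-positive $\sigma$ is an assumption of this example rather than a detail given in the text.

```python
import math


def log_likelihood(true_log_mass, predicted_log_mass, sigma):
    """Gaussian log-likelihood of Equations (A1) and (B2).

    true_log_mass      : y_i, true log halo masses
    predicted_log_mass : mu_i, log halo masses predicted by the estimator
    sigma              : standard deviation of the residuals (dex)
    """
    if sigma <= 0.0:
        return -math.inf  # reject unphysical scatter (assumption of this sketch)
    var = sigma * sigma
    return -0.5 * sum(
        (y - mu) ** 2 / var + math.log(2.0 * math.pi * var)
        for y, mu in zip(true_log_mass, predicted_log_mass)
    )
```

Under uniform priors, this function plus a bounds check on the parameters is all that is needed as the log-probability passed to an ensemble sampler.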

MCMC sampling was performed using an ensemble sampler with 40 walkers and up to $2 \times 10^6$ steps, with convergence assessed via the integrated autocorrelation time. Burn-in and thinning were determined dynamically based on the final autocorrelation time, ensuring that the posterior samples are independent and representative of the converged distribution.

The resulting posterior distributions and covariances for the four free parameters are shown in Figure B1. The median and the 16th and 84th percentiles for each parameter are given in Table B2.

Table B2. Best-fitting coefficients for the sSHMR, including their 16th and 84th percentiles.

The corner plot in Figure B1 demonstrates that all parameters are well-constrained, with physically plausible correlations and no significant degeneracies. This calibration provides a statistically robust foundation for the application of the SHMR-based halo mass estimator to both simulated and observational group catalogues. For further details on the calibration methodology and performance, see Sections 4.2 and 4.2.3.

References

Alpaslan, M., et al. 2012, MNRAS, 426, 2832
Arnaud, M., et al. 2010, A&A, 517, A92
Astropy Collaboration, et al. 2013, A&A, 558, A33
Astropy Collaboration, et al. 2018, AJ, 156, 123
Astropy Collaboration, et al. 2022, ApJ, 935, 167
Bacon, R., et al. 2010, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 7735, Ground-based and Airborne Instrumentation for Astronomy III, ed. McLean, I. S., Ramsay, S. K., & Takami, H., 773508
Baldry, I. K., et al. 2012, MNRAS, 421, 621
Baugh, C. M. 2006, RPPh, 69, 3101
Beers, T. C., Flynn, K., & Gebhardt, K. 1990, AJ, 100, 32
Behroozi, P., Wechsler, R. H., Hearin, A. P., & Conroy, C. 2019, MNRAS, 488, 3143
Behroozi, P. S., Conroy, C., & Wechsler, R. H. 2010, ApJ, 717, 379
Behroozi, P. S., Wechsler, R. H., & Wu, H.-Y. 2013a, ApJ, 762, 109
Behroozi, P. S., et al. 2013b, ApJ, 763, 18
Benson, A. J. 2010, PhR, 495, 33
Blake, C., et al. 2016, MNRAS, 462, 4240
Blumenthal, G. R., Faber, S. M., Primack, J. R., & Rees, M. J. 1984, Natur, 311, 517
Bocquet, S., et al. 2019, ApJ, 878, 55
Böhringer, H., Chon, G., & Collins, C. A. 2014, A&A, 570, A31
Böhringer, H., Chon, G., & Fukugita, M. 2017, A&A, 608, A65
Böhringer, H., et al. 2013, A&A, 555, A30
Bower, R. G., et al. 2006, MNRAS, 370, 645
Carlberg, R. G., et al. 1996, ApJ, 462, 32
Castro, T., et al. 2021, MNRAS, 500, 2316
Chandro-Gómez, Á., et al. 2025, MNRAS, 539, 776
Chauhan, G., et al. 2019, MNRAS, 488, 5898
Chauhan, G., et al. 2021, MNRAS, 506, 4893
Chauhan, G., et al. 2020, MNRAS, 498, 44
Cluver, M. E., et al. 2014, ApJ, 782, 90
Cluver, M. E., et al. 2020, ApJ, 898, 20
Cluver, M. E., et al. 2025, ApJ, 979, 18
Colless, M., et al. 2001, MNRAS, 328, 1039
Croton, D. J., et al. 2006, MNRAS, 365, 11
Croton, D. J., et al. 2016, ApJS, 222, 22
Davé, R., Finlator, K., & Oppenheimer, B. D. 2012, MNRAS, 421, 98
Davies, L. J. M., et al. 2019, MNRAS, 483, 5444
De Lucia, G., Fontanot, F., Wilman, D., & Monaco, P. 2011, MNRAS, 414, 1439
De Lucia, G., Fontanot, F., Xie, L., & Hirschmann, M. 2024, A&A, 687, A68
De Lucia, G., et al. 2014, MNRAS, 445, 970
DESI Collaboration, et al. 2016, arXiv e-prints, arXiv:1611.00036
Dewdney, P. E., Hall, P. J., Schilizzi, R. T., & Lazio, T. J. L. W. 2009, IEEE Proc., 97, 1482
Driver, S. P., et al. 2009, A&G, 50, 5.12
Driver, S. P., et al. 2019, Msngr, 175, 46
Driver, S. P., et al. 2022a, MNRAS, 515, 2138
Driver, S. P., et al. 2022b, MNRAS, 513, 439
Eckert, D., et al. 2020, OJA, 3, 12
Eke, V. R., et al. 2004, MNRAS, 348, 866
Elahi, P. J., et al. 2018, MNRAS, 475, 5338
Evrard, A. E., et al. 2008, ApJ, 672, 122
Finn, R. A., et al. 2005, ApJ, 630, 206
Flesch, E. W. 2021, arXiv e-prints, arXiv:2105.12985
Fontanot, F., et al. 2020, MNRAS, 496, 3943
Foreman-Mackey, D., et al. 2013, emcee: The MCMC Hammer, Astrophysics Source Code Library, record ascl:1303.002
Henriques, B. M. B., et al. 2020, MNRAS, 491, 5795
Hirschmann, M., De Lucia, G., & Fontanot, F. 2016, MNRAS, 461, 1760
Hoekstra, H., et al. 2013, SSR, 177, 75
Hunter, J. D. 2007, CSE, 9, 90
Jarrett, T. H., et al. 2019, ApJS, 245, 25
Jarrett, T. H., et al. 2023, ApJ, 946, 95
Jarrett, T. H., et al. 2013, AJ, 145, 6
Jenkins, A., et al. 2001, MNRAS, 321, 372
Johnston, S., et al. 2008, ExA, 22, 151
Jones, D. H., et al. 2004, MNRAS, 355, 747
Klypin, A., Yepes, G., Gottlöber, S., Prada, F., & Heß, S. 2016, MNRAS, 457, 4340
Klypin, A. A., Trujillo-Gomez, S., & Primack, J. 2011, ApJ, 740, 102
Komatsu, E., et al. 2011, ApJS, 192, 18
Kravtsov, A. V., Vikhlinin, A. A., & Meshcheryakov, A. V. 2018, AstL, 44, 8
Lacey, C. G., et al. 2016, MNRAS, 462, 3854
Lagos, C. d. P., et al. 2018, MNRAS, 481, 3573
Lagos, C. d. P., et al. 2019, MNRAS, 489, 4196
Lagos, C. d. P., et al. 2024, MNRAS, 531, 3551
Lambert, T. S., Kraan-Korteweg, R. C., Jarrett, T. H., & Macri, L. M. 2020, MNRAS, 497, 2954
Lau, E. T., Nagai, D., & Kravtsov, A. V. 2010, ApJ, 708, 1419
Laureijs, R., et al. 2011, arXiv e-prints, arXiv:1110.3193
Leauthaud, A., et al. 2012, ApJ, 744, 159
Lilly, S. J., Carollo, C. M., Pipino, A., Renzini, A., & Peng, Y. 2013, ApJ, 772, 119
Lim, S. H., et al. 2021, MNRAS, 504, 5131
Lovisari, L., Ettori, S., Gaspari, M., & Giles, P. A. 2021, Universe, 7, 139
LSST Science Collaboration, et al. 2009, arXiv e-prints, arXiv:0912.0201
Macri, L. M., et al. 2019, ApJS, 245, 6
Mandelbaum, R., et al. 2016, MNRAS, 457, 3200
Mandelbaum, R., et al. 2018, MNRAS, 481, 3170
McKinney, W. 2010, in Proceedings of the 9th Python in Science Conference, ed. van der Walt, S., & Millman, J., 56–61
Merloni, A., et al. 2012, arXiv e-prints, arXiv:1209.3114
Mitchell, P. D., Schaye, J., Bower, R. G., & Crain, R. A. 2020, MNRAS, 494, 3971
Moster, B. P., Naab, T., & White, S. D. M. 2013, MNRAS, 428, 3121
Moster, B. P., Naab, T., & White, S. D. M. 2020, MNRAS, 499, 4748
Muldrew, S. I., et al. 2012, MNRAS, 419, 2670
Murray, S. G., Power, C., & Robotham, A. S. G. 2013, A&C, 3, 23
Old, L., et al. 2014, MNRAS, 441, 1513
Old, L., et al. 2015, MNRAS, 449, 1897
Old, L., et al. 2018, MNRAS, 475, 853
Oyarzún, G. A., Tinker, J. L., Bundy, K., Xhakaj, E., & Wyithe, J. S. B. 2024, ApJ, 974, 29
Peng, Y.-j., et al. 2010, ApJ, 721, 193
Péroux, C., & Howk, J. C. 2020, ARA&A, 58, 363
Pillepich, A., et al. 2018, MNRAS, 475, 648
Planck Collaboration, et al. 2016, A&A, 594, A24
Poggianti, B. M., et al. 2010, MNRAS, 405, 995
Prada, F., Klypin, A. A., Cuesta, A. J., Betancort-Rijo, J. E., & Primack, J. 2012, MNRAS, 423, 3018
Press, W. H., & Schechter, P. 1974, ApJ, 187, 425
Robotham, A. S. G., et al. 2020, MNRAS, 495, 905
Robotham, A. S. G., et al. 2011, MNRAS, 416, 2640
Saintonge, A., et al. 2017, ApJS, 233, 22
Simet, M., et al. 2017, MNRAS, 466, 3103
Somerville, R. S., & Davé, R. 2015, ARA&A, 53, 51
Somerville, R. S., Popping, G., & Trager, S. C. 2015, MNRAS, 453, 4337
Springel, V., White, S. D. M., Tormen, G., & Kauffmann, G. 2001, MNRAS, 328, 726
Springel, V., et al. 2005, Natur, 435, 629
Stevens, A. R. H., Lagos, C. d. P., Obreschkow, D., & Sinha, M. 2018, MNRAS, 481, 5543
Tacconi, L. J., Genzel, R., & Sternberg, A. 2020, ARA&A, 58, 157
Taylor, E. N., et al. 2023, Msngr, 190, 46
Tempel, E., Tuvikene, T., Kipper, R., & Libeskind, N. I. 2017, A&A, 602, A100
Tinker, J., et al. 2008, ApJ, 688, 709
Tinker, J. L. 2021, ApJ, 923, 154
Tumlinson, J., Peeples, M. S., & Werk, J. K. 2017, ARA&A, 55, 389
Vale, A., & Ostriker, J. P. 2004, MNRAS, 353, 189
van de Voort, F., et al. 2021, MNRAS, 501, 4888
van der Velden, E. 2020, JOSS, 5, 2004
van der Walt, S., Colbert, S. C., & Varoquaux, G. 2011, CSE, 13, 22
Van Kempen, W., et al. 2024, PASA, 41, e096
Velander, M., et al. 2014, MNRAS, 437, 2111
Viola, M., et al. 2015, MNRAS, 452, 3529
Virtanen, P., et al. 2020, NM, 17, 261
Wang, K., & Peng, Y. 2025, ApJ, 980, 233
Wechsler, R. H., & Tinker, J. L. 2018, ARA&A, 56, 435
Wetzel, A. R., Tinker, J. L., Conroy, C., & van den Bosch, F. C. 2013, MNRAS, 432, 336
White, S. D. M., & Rees, M. J. 1978, MNRAS, 183, 341
Wojtak, R., et al. 2018, MNRAS, 481, 324
Wootten, A., & Thompson, A. R. 2009, IEEE Proc., 97, 1463
Wright, E. L., et al. 2010, AJ, 140, 1868
Yang, X., Mo, H. J., & van den Bosch, F. C. 2003, MNRAS, 339, 1057
Yang, X., et al. 2007, ApJ, 671, 153

Figure 1. RA–Dec distribution of the three SGP sub-samples used for uniform completeness: SGP–2dF (red), G23–2dF (green), and G23–GAMA (orange). Comparing G23–2dF with G23–GAMA highlights the effect of spectroscopic completeness. Numbers in brackets denote sample sizes.

Figure 2. Comparison of the SMF from Shark to well-established observational SMFs and SMFs produced by observational data from Van Kempen et al. (2024). The figure shows the SMF ($\log \phi$) versus stellar mass ($\log M_{\star}$) for Shark v2.0 (blue circles) alongside canonical constraints from Baldry et al. (2012) (black dashed line) and Driver et al. (2022a) (grey dashed line). Red triangles, orange diamonds, and green crosses represent our observational data from Van Kempen et al. (2024) (SGP-2dF, G23-GAMA, and G23-2dF datasets, respectively). Error bars indicate Poisson uncertainties. Shark demonstrates excellent agreement with the established SMFs and across our observational datasets, with only minor deviations.

Figure 3. Comparison of the HMF derived from Shark with analytic and observational benchmarks. The blue data points represent the Shark HMF, with error bars indicating Poisson uncertainties in each mass bin. The solid black curve denotes the analytic HMF prediction for the input cosmology of Shark, computed using the hmf Python package (Murray, Power, & Robotham 2013). The grey dashed line shows the empirical fit from Driver et al. (2022a), based on GAMA5, SDSS5, and REFLEX II data. The magenta dashed curve corresponds to a Schechter function fit to the Shark HMF.
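
The binned mass function and Poisson error bars described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `mass_function`, the symmetric $\sqrt{N}$ error estimate, and the toy volume are assumptions made here for the example.

```python
import numpy as np

def mass_function(log_m, volume, bin_edges):
    """Number density per dex in bins of log mass, with Poisson errors.

    Assumptions (not from the paper): `volume` is a single comoving volume
    for the whole sample, and the error is the symmetric sqrt(N) estimate.
    """
    counts, edges = np.histogram(log_m, bins=bin_edges)
    dlogm = np.diff(edges)                      # bin widths in dex
    phi = counts / (volume * dlogm)             # number per volume per dex
    phi_err = np.sqrt(counts) / (volume * dlogm)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, phi, phi_err, counts

# Toy input: six halos in a hypothetical volume of 1000 (arbitrary units).
log_m = np.array([12.1, 12.2, 12.3, 12.4, 12.6, 13.2])
centres, phi, phi_err, counts = mass_function(
    log_m, volume=1000.0, bin_edges=np.array([12.0, 12.5, 13.0, 13.5])
)
```

For sparsely populated bins the $\sqrt{N}$ approximation becomes poor, so asymmetric Poisson intervals may be preferable there.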

Figure 4. Comparison of halo mass estimates derived from the velocity dispersion relation before and after calibration, across three semi-analytic models: Shark, SAGE, and GAEA. All samples have a magnitude limit of $i\lt19.2$ and a redshift limit of $z\lt0.1$. Top Row: The virial theorem halo mass ($\log M_{\mathrm{halo, \, VT}}$) versus the true halo mass ($\log M_{\mathrm{halo}}$) for each model, with colour indicating the logarithmic number density of halos in each bin. The dashed line denotes the one-to-one relation. Second Row: Corresponding residuals ($\Delta \log M_{\mathrm{halo}} = \log M_{\mathrm{halo, \, VT}} - \log M_{\mathrm{halo}}$), with error bars indicating the 16th and 84th percentiles in each bin. The mean offset and scatter are indicated in the legend. Third Row: The MVT halo mass estimates ($\log M_{\mathrm{halo, \, MVT}}$) compared to the true halo mass ($\log M_{\mathrm{halo}}$). Bottom Row: Corresponding median of residuals ($\Delta \log M_{\mathrm{halo}} = \log M_{\mathrm{halo, \, MVT}} - \log M_{\mathrm{halo}}$), with error bars indicating the 16th and 84th percentiles in each bin. The mean offset and scatter are indicated in the legend.
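
The residual statistics shown in the residual rows (median with 16th and 84th percentiles per bin of true halo mass) can be computed along these lines. This is a sketch under stated assumptions: the function name `binned_residual_stats`, the `min_count` cut, and the mock data with 0.2 dex Gaussian scatter are illustrative choices, not taken from the paper.

```python
import numpy as np

def binned_residual_stats(log_m_true, log_m_est, bin_edges, min_count=10):
    """Median and 16th/84th percentiles of (estimate - truth) per bin.

    Returns an array with columns: bin centre, median, p16, p84.
    Bins with fewer than `min_count` halos are skipped (an assumption).
    """
    log_m_true = np.asarray(log_m_true, dtype=float)
    resid = np.asarray(log_m_est, dtype=float) - log_m_true
    rows = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (log_m_true >= lo) & (log_m_true < hi)
        if in_bin.sum() < min_count:
            continue
        p16, p50, p84 = np.percentile(resid[in_bin], [16, 50, 84])
        rows.append((0.5 * (lo + hi), p50, p16, p84))
    return np.array(rows)

# Mock data: unbiased estimates with 0.2 dex scatter (illustrative only).
rng = np.random.default_rng(42)
log_m_true = rng.uniform(12.0, 14.5, 5000)
log_m_est = log_m_true + rng.normal(0.0, 0.2, log_m_true.size)
stats = binned_residual_stats(log_m_true, log_m_est, np.linspace(12.0, 14.5, 6))
```

Half the difference between the 84th and 16th percentiles gives a robust per-bin scatter that can be compared against the values quoted in the legend.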

Table 1. Best-fitting parameters for the calibrated virial theorem relation.

Table 2. Best-fitting parameters for the summed stellar mass–halo mass relation.

Figure 5. Comparison of baryonic and halo mass relations across three SAMs: Shark, SAGE, and GAEA. The Shark sample has a fainter magnitude limit of $Z\lt21.2$ and a deeper redshift limit of $z\lt0.3$ compared to SAGE and GAEA ($i\lt19.2$ and $z\lt0.1$). Top Row: The sSHMR using the summed stellar mass of the three most massive group galaxies ($\log \sum {M}_{*,3}$) and the true halo mass ($\log M_{\mathrm{halo}}$) is shown for each SAM, with the colour scale indicating the logarithm of the number of groups per bin. The black dashed line in each panel represents the MCMC-derived fit to the Shark data, optimised to minimise scatter in halo mass. This same fit is overlaid on the SAGE and GAEA panels to illustrate model dependence. Second Row: The residuals in halo mass ($\Delta\log M_{\mathrm{halo}}=\log\sum M_{*,3}-\log M_{\mathrm{halo}}$) are shown as a function of the summed stellar mass. Third Row: Estimated halo masses, derived by applying the Shark-calibrated sSHMR, are plotted against the true halo masses from each simulation. The black dashed line denotes the one-to-one relation. The close alignment of the points along this line in all models demonstrates that the Shark-based calibration provides robust halo mass estimates, with low bias and scatter, even when applied to independent SAMs. Bottom Row: The residuals in halo mass are shown as a function of the estimated halo mass. In both residual panels, the colour scale again indicates the logarithm of the number of groups per bin.
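
The proxy on the horizontal axis of the top row, $\log \sum M_{*,3}$, is the logarithm of the summed stellar mass of the three most massive galaxies in each group. A minimal sketch of that step is given below. The function name `summed_top3_log_mass`, the integer group IDs, and the handling of groups with fewer than three members (summing whatever members exist) are assumptions made for this example.

```python
import numpy as np

def summed_top3_log_mass(group_id, log_mstar, n_top=3):
    """Return {group id: log10 of summed stellar mass of the n_top most
    massive members}. Masses are summed in linear units, not in log space.
    """
    group_id = np.asarray(group_id)
    mstar = 10.0 ** np.asarray(log_mstar, dtype=float)
    result = {}
    for gid in np.unique(group_id):
        members = np.sort(mstar[group_id == gid])[::-1]  # descending mass
        result[gid.item()] = float(np.log10(members[:n_top].sum()))
    return result

# Toy catalogue: group 1 has four members, group 2 has three.
group_id = np.array([1, 1, 1, 1, 2, 2, 2])
log_mstar = np.array([11.0, 10.0, 10.0, 9.0, 10.0, 10.0, 10.0])
log_sum_m3 = summed_top3_log_mass(group_id, log_mstar)
```

The explicit loop keeps the logic readable; for very large catalogues a sort-and-segment approach would scale better.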

Figure 6. Empirical HMF for galaxy groups with three or more members, constructed from the group catalogue of Van Kempen et al. (2024). The HMF is shown separately for groups in the SGP region dominated by 2dF coverage (red triangles), the G23 region with GAMA spectroscopy (orange diamonds), and the G23 region with 2dF spectroscopy (green crosses). Halo masses are estimated using the traditional and calibrated virial theorem method ($\log M_{\mathrm{halo,\,VT}}$ or $\log M_{\mathrm{halo,\,MVT}}$). Number densities are not corrected for survey volume or selection effects, due to the heterogeneous nature of the group sample and the challenges in defining a complete selection function. The solid black curve represents the analytic HMF prediction for the adopted cosmology of Shark, while the grey dashed line shows the empirical fit from Driver et al. (2022a) based on GAMA5, SDSS5, and REFLEX II data. Error bars reflect Poisson uncertainties.

Figure 7. Distribution of quenched fraction in the stellar mass–halo mass plane for galaxy groups in the Shark, SAGE, and GAEA simulations. The colour scale indicates the fraction of quenched galaxies (defined as those with $\log\,\mathrm{sSFR} \lt -11.0$) within each region of parameter space, ranging from blue (predominantly star-forming) to red (predominantly quenched). The underlying distribution is computed using a two-dimensional kernel density estimate (KDE) with adaptive smoothing. The quenched fraction is calculated as the ratio of the KDE-weighted density of quenched galaxies to the total KDE-weighted density in each cell. Black contours show the underlying density of individual galaxies. Halo masses are estimated using the calibrated relation between the sum of the stellar masses of the three most massive group galaxies and the group halo mass.
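
The ratio described in this caption, KDE-weighted density of quenched galaxies over total KDE-weighted density, can be illustrated with a simplified estimator. Note the substitution: the sketch below uses a single fixed-bandwidth Gaussian kernel rather than the adaptive smoothing mentioned in the caption, and the function name `kde_quenched_fraction`, the bandwidth value, and the mock sSFR trend are all assumptions. Only the threshold $\log\,\mathrm{sSFR} \lt -11.0$ is taken from the caption.

```python
import numpy as np

def kde_quenched_fraction(x, y, quenched, grid_x, grid_y, bandwidth=0.2):
    """Quenched fraction on a grid as a ratio of kernel-weighted densities.

    The same fixed Gaussian kernel is used for numerator and denominator,
    so the kernel normalisation cancels and the ratio lies in [0, 1].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    q = np.asarray(quenched, dtype=float)
    gx, gy = np.meshgrid(grid_x, grid_y, indexing="ij")
    # Squared distance from each grid cell to each galaxy: (nx, ny, n_gal).
    d2 = (gx[..., None] - x) ** 2 + (gy[..., None] - y) ** 2
    kernel = np.exp(-0.5 * d2 / bandwidth**2)
    total = kernel.sum(axis=-1)
    quenched_density = (kernel * q).sum(axis=-1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(total > 0, quenched_density / total, np.nan)

# Mock galaxies whose sSFR declines with stellar mass (illustrative only).
rng = np.random.default_rng(0)
log_mstar = rng.uniform(9.0, 11.5, 2000)
log_mhalo = rng.uniform(12.0, 14.5, 2000)
log_ssfr = -10.0 - 0.8 * (log_mstar - 9.0) + rng.normal(0.0, 0.3, 2000)
quenched = log_ssfr < -11.0          # threshold quoted in the caption
frac = kde_quenched_fraction(
    log_mstar, log_mhalo, quenched,
    np.linspace(9.0, 11.5, 6), np.linspace(12.0, 14.5, 5),
)
```

Sharing one kernel between numerator and denominator is a deliberate choice: fitting separate KDEs to the quenched and full samples with different bandwidths can produce ratios above one.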

Figure 8. Distribution of quenched fraction in the stellar mass–halo mass plane for galaxy groups in the observational dataset of Van Kempen et al. (2024). The colour scale represents the fraction of quenched galaxies (defined as those with) within each region of parameter space, ranging from blue (predominantly star-forming) to red (predominantly quenched). The underlying distribution is computed using a two-dimensional kernel density estimate (KDE) with adaptive smoothing. The quenched fraction is calculated as the ratio of the KDE-weighted density of quenched galaxies to the total KDE-weighted density in each cell. Black contours indicate the underlying density of individual galaxies. The halo masses used in this figure are estimated using the calibrated sSHMR between the sum of the stellar masses of the three most massive group galaxies and the group halo mass. This figure highlights the dependence of quenching on both stellar and halo mass, with the quenched fraction increasing towards higher masses in both dimensions.

Table A1. Best-fitting coefficients for the modified virial theorem relation, with their 16th and 84th percentiles.

Figure A1. Posterior distributions and covariances for the six free parameters of the calibrated dispersion-based halo mass relation, as determined by MCMC sampling on the Shark model. The corner plot displays the marginalised one-dimensional distributions along the diagonal, with median values and 68% credible intervals indicated, and the two-dimensional projections of the posterior for each parameter pair in the off-diagonal panels. The parameters correspond to the normalisation and scaling of the velocity dispersion and maximum projected separation terms ($\alpha$, $\sigma_{\mathrm{lim}}$, $\beta$, $R_{\mathrm{lim}}$, $n_1$, $n_2$) in the power-law model for halo mass. The MCMC analysis was performed using the emcee sampler, with convergence assessed via autocorrelation time. This figure demonstrates that all parameters are well-constrained and highlights the correlations between them, providing a robust statistical foundation for the calibrated group halo mass estimator.

Figure B1. Posterior distributions and covariances for the four free parameters of the summed stellar mass–halo mass relation, as determined by MCMC sampling on the Shark model. The corner plot displays the marginalised one-dimensional posterior distributions for each parameter (normalisation $A$, characteristic mass $M_A$, low-mass slope $\beta$, and high-mass slope $\gamma$) along the diagonal, with median values and 68% credible intervals indicated. Off-diagonal panels show the two-dimensional projections of the posterior for each parameter pair, highlighting correlations and degeneracies. The MCMC analysis was performed using the emcee sampler, with convergence assessed via autocorrelation time. This figure demonstrates that all parameters are well-constrained, providing a robust statistical foundation for the calibrated relation between the sum of the stellar masses of the three most massive group galaxies and their host halo mass.

Table B2. Best-fitting coefficients for the sSHMR, with their 16th and 84th percentiles.