1. Introduction
The deep, high-resolution radio and optical surveys that have been developed in recent years have begun to allow us to probe the previously unreachable low surface brightness Universe. Historically, we have been oblivious to much of our Universe due to the brightness of the sky background and the limitations of instruments. Disney & Phillipps (Reference Disney and Phillipps1987) argued that the galaxies that we observe should be thought of as ‘icebergs’ or ‘crouching giants’. That is, what we observe above the sky background is not a reliable indicator of what lies beneath. Insignificant dwarf elliptical galaxies could just be the tip of giant low surface brightness spirals, such as Malin 1 (Bothun et al. Reference Bothun, Impey, Malin and Mould1987). Although the existence of low surface brightness galaxies (LSBGs) is nothing new (e.g. Sandage & Binggeli Reference Sandage and Binggeli1984; Impey, Bothun, & Malin Reference Impey, Bothun and Malin1988), it is only recently that they have been shown to make up a significant fraction of the galaxy census, with multiple populations of both low surface brightness and optically dark sources being found, including ultra diffuse galaxies (UDGs, e.g. van Dokkum et al. Reference van Dokkum2015; Koda et al. Reference Koda, Yagi, Yamanoi and Komiyama2015; Leisman et al. Reference Leisman2017; Mancera Piña et al. Reference Mancera Piña2020; For et al. Reference For2023; Gannon et al. Reference Gannon2024) and neutral atomic hydrogen (H i) clouds without or with extremely faint optical counterparts (e.g. Kilborn et al. Reference Kilborn, Forbes, Koribalski, Brough and Kern2006; Matsuoka et al. Reference Matsuoka, Ienaka, Oyabu, Wada and Takino2012; Cannon et al. Reference Cannon2015; Józsa et al. Reference Józsa2021; Wong et al. Reference Wong2021; O’Beirne et al. Reference O’Beirne2024). As deeper optical observations are obtained we are able to improve our understanding of these sources. 
Many dark H i clouds identified in the Arecibo Legacy Fast ALFA (ALFALFA) survey have since been revealed to host stellar counterparts (e.g. Du et al. Reference Du2024; Jones et al. Reference Jones2024). With current and future H i and optical surveys such as the Widefield ASKAP L-band Legacy All-sky Blind surveY (WALLABY, Koribalski et al. Reference Koribalski2020) and the Dark Energy Spectroscopic Instrument (DESI) Legacy Imaging Surveys data release 10 (hereafter referred to as the Legacy Survey; Dey et al. Reference Dey2019), we are entering an era where these LSBGs can be studied in large numbers for the first time.
At the extreme end of the spectrum of LSBGs lie dark galaxies: dark matter haloes that lack stars. Dark galaxy candidates are optically dark sources with surface brightnesses below the sensitivity limits of current optical telescopes. Additionally, they are predicted to have significant H i gas (Jimenez et al. Reference Jimenez, Heavens, Hawkins and Padoan1997; Jimenez & Heavens Reference Jimenez and Heavens2020). They could be key to reconciling Lambda Cold Dark Matter (
$\Lambda$
CDM) simulations with observations, in particular resolving the ‘Missing Satellites Problem’ (Kauffmann, White, & Guiderdoni Reference Kauffmann, White and Guiderdoni1993; Moore et al. Reference Moore1999; Klypin et al. Reference Klypin, Kravtsov, Valenzuela and Prada1999) and the ‘Too Big To Fail’ problem (Boylan-Kolchin, Bullock, & Kaplinghat Reference Boylan-Kolchin, Bullock and Kaplinghat2011; Papastergis & Shankar Reference Papastergis and Shankar2016). Hydrodynamical simulations have shown that
$\Lambda$
CDM can actually reproduce the stellar mass function of observed satellites in the Local Group (Sawala et al. Reference Sawala2016). However, reionisation and feedback from supernovae and stellar winds play a key role in suppressing star formation. In effect, this makes a proportion of the dark matter subhaloes invisible to our optical telescopes. Lee et al. (Reference Lee, Hwang, Lee, Shin and Song2024) study the formation and evolution of dark galaxies using the IllustrisTNG cosmological hydrodynamical simulation. They predict that, at the present epoch
$(z = 0)$
, dark galaxies are predominantly located in void regions and have higher spin parameters than luminous galaxies. Their study selects for relatively low-mass dark matter haloes with
$M\sim10^{9}$
M
$_{\odot}$
. The theoretical work done by Jimenez & Heavens (Reference Jimenez and Heavens2020) shows that we do not expect to find large numbers of dark galaxies above a halo mass of
$\sim10^{10}$
M
$_{\odot}$
. They predict that the number density of dark galaxies with halo masses
$\gt3\times10^{10}$
M
$_{\odot}$
is only
$10^{-6}$
Mpc
$^{-3}$
, so large volumes will be required in order to identify them. In their models, Benitez-Llambay & Frenk (Reference Benitez-Llambay and Frenk2020) find that all haloes with
$M_{200}\gt5\times10^9$
M
$_{\odot}$
should host a luminous galaxy, and a population of starless gaseous haloes should exist with masses between
$10^6$
M
$_{\odot}$
and
$5\times10^9$
M
$_{\odot}$
. In these haloes, gas is expected to be in thermal equilibrium with the ultraviolet background radiation and in hydrostatic equilibrium in the gravitational potential of the halo. Additionally, it has been postulated that dark minihaloes could host some type of compact gas cloud, such as ultra compact high velocity clouds (UCHVC; Adams, Giovanelli, & Haynes Reference Adams, Giovanelli and Haynes2013) and REionization-Limited H i Clouds (RELHICs; Benítez-Llambay et al. Reference Benítez-Llambay2017).
Looking at the low surface brightness Universe can increase our understanding of galaxy formation and evolution. The gas-star formation cycle plays a key role in the life-cycle of galaxies, centred on the balance between gas inflows from the intergalactic medium, gas consumption through star formation, and outflows (Kennicutt & Evans 2012; Lilly et al. 2013). The H i in a galaxy acts as a reservoir for star formation. If this gas is stripped then the star formation in a galaxy can become quenched. This can occur as a result of star formation feedback, as well as through ram pressure stripping and tidal interactions (e.g. Cortese, Catinella, & Smith 2021). Alternatively, isolation can also be responsible for low star formation. Giant low surface brightness spirals are usually found in low density environments and are rarely found to be interacting with other systems (Das 2013). A lack of mergers and tidal interactions combined with massive dark matter haloes may allow for increased stability with a slow, steady accretion of gas, avoiding large star-forming events. We can look for internal and external factors that cause star formation to be suppressed in these low surface brightness galaxies. H i-rich dark galaxies could give us insights into the early stages of galaxy formation, giving us a chance to study the pristine conditions of the very first galaxies.
Yet even now, true dark galaxies have been shown to be extremely rare in observations. Xu et al. (Reference Xu2023) claim that their recent detection of FAST J0139+4328 with the Five-hundred-meter Aperture Spherical radio Telescope (FAST; Jiang et al. Reference Jiang2019) is the first isolated dark galaxy detected in the local Universe. Their optical imaging, however, is limited to the shallow Panoramic Survey Telescope and Rapid Response System data (Pan-STARRS; Chambers et al. Reference Chambers2016), which has a surface brightness limit of 24 mag arcsec
$^{-2}$
(Sola et al. 2022). This source’s status as a dark galaxy is not confirmed (Benítez-Llambay et al. 2024; Karunakaran & Spekkens 2024). Prior to this candidate, there have in fact been several other promising potential dark galaxies (e.g. Kilborn et al. 2000; Kent 2010; Bílek, Müller, Vudragović, & Taylor 2020; Wong et al. 2021).
While both are devoid of stars, it is useful to distinguish between dark galaxies, which have a primordial origin, and dark clouds, which are debris from interactions. Dark clouds include tidal debris, such as the dark clouds in Taylor et al. (2022) and Józsa et al. (2022), and debris from ram pressure stripping, such as the dark cloud in the Virgo cluster studied by Oosterloo & van Gorkom (2005). Dark galaxies reside in dark matter haloes and are stable to the effects of harassment (Taylor et al. 2016), while dark clouds are dark matter poor and transient in comparison. It is often difficult to confirm the formation mechanism of a dark H i source, as was the case with VIRGOHI21. This source was initially identified as a dark galaxy candidate (Davies et al. 2004; Minchin et al. 2005) before later being revealed to favour an interaction-based origin (Bekki, Koribalski, & Kilborn 2005; Haynes, Giovanelli, & Kent 2007; Duc & Bournaud 2008).
Although dark galaxies and dark clouds are invisible to optical instruments, they are detectable by radio telescopes if they are sufficiently gas-rich, and consequently H i is an excellent probe of the optically low surface brightness Universe. H i often extends well beyond the optical disc, making it an exceedingly useful tracer of environment and galaxy evolution. The extended H i is susceptible to ram pressure stripping and tidal forces, causing it to be significantly impacted by the environment in which it resides (e.g. Oosterloo & van Gorkom 2005; Lee-Waddell et al. 2014). Large area, untargeted H i surveys are the key to identifying low surface brightness galaxies and dark galaxy candidates in large numbers. The H i Parkes All Sky Survey (HIPASS, Barnes et al. 2001; Doyle et al. 2005), however, was unable to confirm any dark galaxies due to poor angular resolution and source confusion. In the ALFALFA survey (Giovanelli et al. 2005), less than 2% of sources were missing optical counterparts (Haynes et al. 2018). Many of these had a tidal origin (e.g. Leisman et al. 2016), and a few dark galaxy candidates were followed up (e.g. Kent 2010; Janowiecki et al. 2015). By performing an optical search for LSBGs in the region covered by the Arecibo H i Strip Survey, Trachternach et al. (2006) demonstrate how optical and H i surveys sample different parts of the LSBG population and so complement each other, finding that LSBGs are expected to make up
$\gt30$
% of the local galaxy number density. With its improved angular resolution and sensitivity, WALLABY has the potential to detect dark galaxy candidates and extremely low surface brightness galaxies in large numbers by being able to better locate the origin of the emission and better separate emission from other nearby objects in denser group and cluster environments. The fast survey speed will also allow more of these rare objects to be detected in the large volume covered. Two dark H i clouds have already been identified in the pre-pilot survey observations (Wong et al. Reference Wong2021).
The paper is structured as follows. Section 2 outlines the surveys used in our analysis, and Section 3 presents the methods used to identify the dark and low surface brightness sources and calculate the H i and multiwavelength properties. Section 4 presents the dark H i sources that we identify and the global properties that they and the LSBGs possess through several scaling relations. Finally, Section 5 discusses our results and Section 6 summarises our findings. Throughout this work we use velocity in the optical convention (
$v=cz$
) and we adopt the AB magnitude convention. We assume a cosmology with
$H_{0}=70$
km s
$^{-1}$
Mpc
$^{-1}$
and
$\Omega_{m,0} = 0.3$
.
2. Surveys
2.1 WALLABY
WALLABY (Koribalski et al. Reference Koribalski2020) is an H i survey being conducted with the Australian Square Kilometre Array Pathfinder (ASKAP; Hotan et al. Reference Hotan2021). The phase 1 pilot survey targeted three 60 deg
$^{2}$
fields around the Hydra Cluster, Norma Cluster and the NGC 4636 galaxy group in a redshift range of
$z\lt0.08$
(Westmeier et al. 2022; Deg et al. 2022). The phase 2 pilot survey targeted three additional fields around the NGC 5044 group, NGC 4808 group and the Vela Cluster (Murugeshan et al. 2024). In this study we have made use of the Hydra, NGC 4808 and NGC 5044 fields as they are free from bright continuum sources and consequently have the most reliable detections. Additionally, these fields are not near the Galactic plane and overlap with the Legacy Survey. The full survey is currently underway and will observe approximately
$1.4\pi$
sr of the sky in 8 832 h over the next few years. It is expected to detect
$\sim210\,000$
galaxies out to a redshift of
$z\approx0.1$
across the majority of the southern hemisphere. WALLABY has better angular resolution and sensitivity than previous wide-field H i surveys, such as ALFALFA and HIPASS, with a 30 arcsec beam. The survey has a frequency range of 1 295.5–1 439.5 MHz and a channel resolution of 4 km s
$^{-1}$
. The phase 1 data has a target noise level of 1.6 mJy beam
$^{-1}$
per channel, which corresponds to a
$5\sigma$
column density of
$8.6 \times 10^{19} (1+z)^4$
cm
$^{-2}$
across 20 km s
$^{-1}$
. The Hydra field has a slightly higher noise level of 1.85 mJy beam
$^{-1}$
per channel. The phase 2 data has an observed median rms noise of 1.7 mJy beam
$^{-1}$
per channel, which corresponds to a 5
$\sigma$
H i column density sensitivity of
$ \sim 9.1 \times 10^{19}(1 + z)^{4}$
cm
$^{-2}$
across 20 km s
$^{-1}$
. Version 2 of the Source Finding Application (SoFiA; Serra et al. Reference Serra2015; Westmeier et al. Reference Westmeier2021) is used by the WALLABY team for source identification. This allows for many H i properties to be included in the WALLABY catalogues, including the integrated fluxes and central frequencies of the sources.
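The quoted column density sensitivities can be roughly cross-checked from the channel noise. The sketch below is illustrative only: it assumes independent 4 km s$^{-1}$ channels, a circular 30 arcsec Gaussian beam, and the standard optically thin conversion between integrated flux per beam and H i column density at $z = 0$. The function name and the constant are not taken from the WALLABY pipeline.

```python
import math

def column_density_sensitivity(rms_jy_per_beam, beam_fwhm_arcsec,
                               channel_kms=4.0, width_kms=20.0, n_sigma=5.0):
    """Approximate n-sigma H i column density sensitivity (cm^-2) at z = 0."""
    n_chan = width_kms / channel_kms
    # Noise on the integrated flux grows as sqrt(N_chan) for independent channels.
    sigma_int = rms_jy_per_beam * math.sqrt(n_chan) * channel_kms  # Jy km/s per beam
    # Optically thin conversion for a Gaussian beam with FWHM given in arcsec
    # (assumed constant: 1.104e24 cm^-2 per Jy km/s per arcsec^-2).
    return 1.104e24 * n_sigma * sigma_int / beam_fwhm_arcsec**2

phase1 = column_density_sensitivity(1.6e-3, 30.0)  # phase 1 target noise
phase2 = column_density_sensitivity(1.7e-3, 30.0)  # phase 2 median noise
print(f"phase 1: {phase1:.2e} cm^-2, phase 2: {phase2:.2e} cm^-2")
```

Under these assumptions both values come out within a few per cent of the sensitivities quoted above; the residual difference plausibly reflects the exact channel width and beam adopted by the survey team.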
2.2 DESI legacy imaging surveys
The Legacy Survey is a combination of three public projects: the Dark Energy Camera Legacy Survey (DECaLS), the Beijing-Arizona Sky Survey (BASS; Zou et al. Reference Zou2017), and the Mayall z-band Legacy Survey (MzLS). The Legacy Survey is primarily being conducted with the 4 m Blanco Telescope at the Cerro Tololo Inter-American Observatory. The Legacy Survey provides imaging in the g, r, i and z-bands over 20 000 square degrees with seeing on the order of 1 arcsec. Data release 10 has
$3\sigma$
limiting surface brightnesses of 29.8, 29.4, 27.7 and 28.0 mag arcsec
$^{-2}$
in the g, r, i and z-bands respectively, measured in
$10 \times 10$
arcsec boxes, as reported by O’Beirne et al. (2024) following the depth definition of Román et al. (2020). These deep optical images allow extremely faint optical counterparts to WALLABY H i detections to be found and dark sources to be identified. The next deeper optical photometric survey across a comparably wide field will be the Legacy Survey of Space and Time (LSST; Ivezić et al. 2019).
2.3 GALEX
The Galaxy Evolution Explorer (GALEX; Martin et al. Reference Martin2005) provides imaging in the near ultraviolet (NUV; 1 770–2 730 Å) and the far ultraviolet (FUV; 1 350–1 780 Å). The surveys include the All-sky Imaging Survey (AIS), a Medium Imaging Survey (MIS) of 1 000 deg
$^{2}$
and a Deep Imaging Survey (DIS) of 100 deg
$^{2}$
, with depths of m
$_\mathrm{AB}\sim$
20.5, m
$_\mathrm{AB}\sim$
23 and m
$_\mathrm{AB}\sim$
25 respectively. The angular resolutions of the FUV and NUV images are 4.2 and 5.3 arcsec, respectively. We use the GALEX images to estimate star formation rates in Section 3.2.3. As GALEX does not have complete sky coverage in the WALLABY fields that we have used, 5% of our dark and low surface brightness sample was not observed by GALEX.
2.4 WISE
In addition to calculating the star formation rates of the LSBGs from the GALEX UV emission, we also investigated the infrared emission from the Wide-field Infrared Survey Explorer (WISE; Wright et al. Reference Wright2010). WISE provides data over the entire sky in four mid-infrared bands: W1 (3.4
$\unicode{x03BC}$
m), W2 (4.6
$\unicode{x03BC}$
m), W3 (12
$\unicode{x03BC}$
m) and W4 (22
$\unicode{x03BC}$
m). Unfortunately, we were only able to measure the infrared photometry for 19 of the LSBGs due to the sensitivity limits of the WISE data.
3. Methods
3.1 H i properties
In this work we make use of the H i properties provided in the WALLABY catalogues, including central frequency and integrated flux. We apply the statistical flux corrections following the methods outlined in Westmeier et al. (2022) and Murugeshan et al. (2024) for the phase 1 and 2 pilot data respectively. This is done to account for the flux deficit of the faint WALLABY sources compared to the flux that would be recovered by single dish observations. We calculate the H i mass from the integrated flux measurement, propagating the flux uncertainty to calculate the uncertainty in the H i mass. The true H i mass error will be dominated by systematic errors, such as the partial detection of a galaxy, source confusion and distance uncertainty. Throughout this work we use the luminosity distances approximated by the Hubble law using the heliocentric velocities of the H i detections. The uncertainty in the H i velocities (approximately 4 km s
$^{-1}$
) is small compared with the uncertainties in typical optical velocity estimates. Approximately half of the previously catalogued sources in the Hydra and NGC 5044 fields are in the 6dF Galaxy Survey (6dFGS; Jones et al. 2009), and approximately half of the previously catalogued sources in the NGC 4808 field are in SDSS. These have average quoted velocity errors of 46 km s
$^{-1}$
and 12 km s
$^{-1}$
respectively. The median offset between our H i velocities and the corresponding optical velocities is 20 km s
$^{-1}$
.
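As an illustration of the steps above, the following sketch converts a central frequency and integrated flux into a Hubble-law distance and an H i mass with a propagated flux uncertainty. It assumes the integrated flux is given in Jy Hz and uses the standard optically thin conversion with a coefficient of 49.7; the function names and the example values are hypothetical, and the statistical flux correction is not included.

```python
C_KMS = 299792.458        # speed of light in km/s
H0 = 70.0                 # km/s/Mpc, as adopted in this work
F_REST_MHZ = 1420.405752  # rest frequency of the H i line

def redshift_from_frequency(f_obs_mhz):
    """Redshift from the observed central frequency of the H i line."""
    return F_REST_MHZ / f_obs_mhz - 1.0

def hubble_distance_mpc(z):
    """Distance from the Hubble law, using v = cz (optical convention)."""
    return C_KMS * z / H0

def hi_mass_msun(flux_jy_hz, flux_err_jy_hz, distance_mpc):
    """H i mass and its flux-propagated uncertainty in solar masses.

    Assumes optically thin gas and a flux in Jy Hz (coefficient 49.7).
    Systematic errors (confusion, distance) are not included.
    """
    scale = 49.7 * distance_mpc**2
    return scale * flux_jy_hz, scale * flux_err_jy_hz

# Hypothetical detection: 1400 MHz central frequency, 5000 +/- 500 Jy Hz.
z = redshift_from_frequency(1400.0)
d = hubble_distance_mpc(z)
mass, mass_err = hi_mass_msun(5000.0, 500.0, d)
print(f"z = {z:.4f}, D = {d:.1f} Mpc, M_HI = {mass:.2e} +/- {mass_err:.1e} Msun")
```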
In addition to galaxy properties provided in the WALLABY catalogues, we measure the sizes of the H i discs (
$D_\mathrm{HI}$
) from the integrated intensity (moment 0) maps.
$D_\mathrm{HI}$
is defined as the major axis of the 1 M
$_{\odot}$
pc
$^{-2}$
isodensity contour. To measure this, we model the H i moment 0 map as a 2-dimensional Gaussian. Commensurate with the signal-to-noise ratio of the resolved detections, the uncertainty is estimated as 10 arcsec converted to a physical size at the angular diameter distance of the source. A beam smearing correction is also applied to the H i diameter using Equations (1)–(3):



where
$ \sigma_\mathrm{galaxy}$
is the major axis standard deviation of the deconvolved galaxy,
$\sigma_\mathrm{model}$
is the major axis standard deviation of the 2-dimensional Gaussian model of the moment 0 map (which is the real galaxy convolved with the WALLABY beam) and
$\sigma_\mathrm{beam}$
is the standard deviation of the beam.
$D_\mathrm{HI}$
is the H i diameter in arcsec and A is the amplitude of the Gaussian model in M
$_{\odot}$
pc
$^{-2}$
. Equation (3) is the equation for a 1-dimensional Gaussian along the major axis. This deconvolution method is valid because the WALLABY beam is circular (30 arcsec). Due to the resolution limitations we do not correct for the effect of inclination on optical depth. This could lead to an overestimation of
$D_\mathrm{HI}$
for any edge-on galaxies. To measure
$D_\mathrm{HI}$
we require
$\sigma_\mathrm{galaxy} \gt 2\sigma_\mathrm{beam}$
to ensure the galaxy is sufficiently well resolved. Unfortunately
$89.7$
% of the dark and low surface brightness sample were not sufficiently resolved to meaningfully measure
$D_\mathrm{HI}$
. Additionally, we find that
$0.3$
% of the sample were not well-modelled by a Gaussian and
$0.8$
% were too diffuse to reach a density of 1 M
$_{\odot}$
pc
$^{-2}$
, and consequently do not have a
$D_\mathrm{HI}$
measurement.
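The beam correction and diameter measurement described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes the deconvolution is a quadrature subtraction of Gaussian widths and that $D_\mathrm{HI}$ is where a 1-dimensional Gaussian of amplitude A falls to 1 M$_{\odot}$ pc$^{-2}$. How the amplitude itself is corrected for beam dilution is not specified here, so A is taken as given.

```python
import math

# Conversion between Gaussian FWHM and standard deviation.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def deconvolved_sigma(sigma_model, beam_fwhm=30.0):
    """Major-axis sigma of the galaxy after removing a circular Gaussian beam."""
    sigma_beam = beam_fwhm * FWHM_TO_SIGMA
    if sigma_model <= sigma_beam:
        return None  # the model is no broader than the beam
    return math.sqrt(sigma_model**2 - sigma_beam**2)

def is_resolved(sigma_galaxy, beam_fwhm=30.0):
    """Resolution criterion used in the text: sigma_galaxy > 2 sigma_beam."""
    return sigma_galaxy > 2.0 * beam_fwhm * FWHM_TO_SIGMA

def hi_diameter(sigma_galaxy, amplitude, threshold=1.0):
    """Full width at which A * exp(-x^2 / (2 sigma^2)) drops to `threshold`.

    Amplitude and threshold share the same units (Msun / pc^2).
    Returns None if the peak never reaches the threshold (too diffuse).
    """
    if amplitude <= threshold:
        return None
    return 2.0 * sigma_galaxy * math.sqrt(2.0 * math.log(amplitude / threshold))
```

A source with a model width only slightly larger than the beam returns a small deconvolved width and fails `is_resolved`, which corresponds to the large unresolved fraction reported above.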
3.2 Photometry
3.2.1 Sérsic modelling
To measure photometric properties in the Legacy Survey and GALEX images, we create models of the galaxies using the Python package AstroPhot (Stone et al. 2023). We model the galaxies using a Sérsic profile (Graham & Driver 2005), given by:
$$I(R) = I_{e}\exp\left\{-b_{n}\left[\left(\frac{R}{R_{e}}\right)^{1/n}-1\right]\right\} \qquad (4)$$
where I(R) is the brightness profile as a function of semi-major axis, R is the semi-major axis length,
$I_{e}$
is the brightness at the half-light radius
$R_{e}$
, n is the Sérsic index that controls the shape of the profile, and
$b_{n}$
is a function of n that is not involved in the fit. AstroPhot fits seven parameters: the centre coordinates, position angle, axis ratio,
$I_{e}$
, n and
$R_{e}$
. These fits allow us to obtain photometric properties, including total fluxes and central surface brightnesses, and to create meaningful apertures using the effective radii, position angles and axis ratios.
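For readers unfamiliar with the profile, the sketch below evaluates a Sérsic profile and the central surface brightness implied by a fit. It does not use AstroPhot; the function names are hypothetical, and $b_{n}$ is approximated with the asymptotic series of Ciotti & Bertin (1999), which is an assumption rather than the value AstroPhot computes internally.

```python
import math

def sersic_bn(n):
    """Approximate b_n (asymptotic series; accurate for n >~ 0.4)."""
    return 2.0 * n - 1.0 / 3.0 + 4.0 / (405.0 * n) + 46.0 / (25515.0 * n**2)

def sersic_intensity(r, i_e, r_e, n):
    """Brightness at semi-major axis r for a Sersic profile."""
    return i_e * math.exp(-sersic_bn(n) * ((r / r_e) ** (1.0 / n) - 1.0))

def central_surface_brightness(mu_e, n):
    """Central surface brightness (mag arcsec^-2) from mu_e at R_e.

    Follows from I(0) = I_e * exp(b_n), i.e. mu_0 = mu_e - 2.5 b_n / ln(10).
    """
    return mu_e - 2.5 * sersic_bn(n) / math.log(10.0)

# By construction the profile returns I_e at the half-light radius.
print(sersic_intensity(5.0, 2.0, 5.0, 1.0))
print(central_surface_brightness(24.0, 1.0))
```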
Consistent multiband photometry for extended diffuse galaxies is a challenge. Using a model optimises the signal to noise, and a consistent model across the bands allows for consistent photometry, and consequently good colours. AstroPhot is a powerful tool, handling point spread functions (PSFs), multiple sources (e.g. foreground stars) and joint fitting in multiple bands. For the Legacy Survey images, we model the PSF ourselves using AstroPhot. To do this, we fit a Moffat PSF profile to five stars in each of the image cutouts. All galaxy models take the respective PSFs into account, and additionally galaxies with bright foreground stars are modelled together with the point source models of the foreground stars. We also make use of the joint modelling function of AstroPhot to model multiple bands together, allowing all parameters except
$I_{e}$
to be fit together in both bands. The g- and i-band images are modelled together, as are the NUV and FUV images. An example of the Sérsic modelling is shown in Appendix A.
There are limitations to our identification of LSBGs. We model all of our galaxies using a Sérsic profile, which may not reflect the range of properties present in all galaxies and has implicit limitations (e.g. Trujillo, Graham, & Caon 2001). Across the 1829 galaxies in the three WALLABY fields examined in this study, 6.3% of the sources are not well modelled by a Sérsic profile. This includes galaxies with foreground stars that could not be removed accurately because they saturated the detector. A further
$0.8$
% had missing coverage in the Legacy Survey image, and
$4.4$
% of the LSBGs did not have reliable models of the UV emission. Moreover, the proportion of LSBG sources in our sample could be underestimated. This is because a proportion of the H i detections contained more than one optical source. These detections could contain galaxies currently undergoing interactions within a shared H i envelope, two galaxies at different redshifts, or could alternatively be the result of the limited angular resolution of WALLABY compared to the Legacy Survey. As the individual H i properties could not be determined, these sources were not included in the sample. Similarly, the H i properties could not be accurately determined for the galaxies that had their H i emission split across multiple detections by SoFiA. Overall,
$12.1$
% of the 1829 WALLABY galaxies were not modelled for these reasons. Future work will include studying the properties and distributions of galaxy pairs and groups with interacting H i gas.
3.2.2 Stellar mass
To calculate the stellar mass of the LSBGs, we use the relation between stellar mass to light ratio (
$\frac{\Upsilon^*}{ \mathrm{M}_{\odot}/\mathrm{L}_{\odot}}$
) and
$g-i$
colour from Du et al. (2020):
$$\log_{10}\left(\frac{\Upsilon^*}{\mathrm{M}_{\odot}/\mathrm{L}_{\odot}}\right) = a + b\,(g-i) \qquad (5)$$
where
$a=-1.152$
and
$b=1.328$
are the coefficients. We account for Galactic extinction using the correction method outlined in Yuan et al. (Reference Yuan, Liu and Xiang2013), adopting the R(a) values from Schlegel et al. (Reference Schlegel, Finkbeiner and Davis1998) and the
$E(B-V)$
values from Schlafly & Finkbeiner (Reference Schlafly and Finkbeiner2011). We use the mean Schlafly & Finkbeiner (Reference Schlafly and Finkbeiner2011)
$E(B-V)$
values. Additionally, we apply k-corrections following Chilingarian, Melchior, & Zolotukhin (2010). While the r-band Legacy Survey images are deeper than the i-band images, almost a third of the low surface brightness sources had not been observed by the Legacy Survey in the r-band at the time of writing. Consequently, for consistency the
$g-i$
colour is used to calculate the stellar mass to light ratio for all sources. Du et al. (Reference Du, Cheng, Zheng and Wu2020) derive this relationship from their sample of LSBGs selected from the
$\alpha$
.40 H i survey (Haynes et al. 2011) and the Sloan Digital Sky Survey (SDSS) DR7 (Abazajian et al. 2009). LSBGs have been shown to have low star formation rates and low stellar mass densities (Burkholder, Impey, & Sprayberry 2001; Lei et al. 2018). These distinct properties suggest that LSBGs could have different formation and evolutionary histories compared to high surface brightness galaxies, and different stellar populations have very different spectral energy distributions. This leads to different stellar mass-to-light ratio scaling relations; for example, Du et al. (2020) show that the relation from Bell et al. (2003) overestimates the stellar mass for their population of LSBGs. Hence, it is important that we use a stellar mass-to-light ratio scaling relation derived specifically for LSBGs. The mass-to-light ratio can be converted to a stellar mass using Equations (6)–(8):

where
$M_{*}$
is the stellar mass, and L is the luminosity. The luminosity can be calculated from the g-band magnitude.


where
$M_g$
is the g-band absolute magnitude and
$M_{g, \mathrm{solar}}=5.05$
mag is the absolute solar magnitude in the g-band (Willmer 2018). The uncertainty associated with the stellar mass is calculated by propagating the uncertainty in the g-band flux measurement from the Sérsic modelling and the 0.24 dex scatter in the
$\log_{10}(\Upsilon^{*})$
from Du et al. (Reference Du, Cheng, Zheng and Wu2020).
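The chain from colour to stellar mass can be illustrated with a short sketch. It assumes the colour relation is linear in $g-i$ with intercept a and slope b, that the luminosity follows from the g-band absolute magnitude in the usual way, and that the flux and mass-to-light uncertainties are combined in quadrature in log space. These are assumptions about the form of the calculation, and the function names are hypothetical. Input magnitudes are taken to be already corrected for Galactic extinction and k-corrected.

```python
import math

A_COEFF = -1.152  # intercept of the colour relation (Du et al. 2020)
B_COEFF = 1.328   # slope of the colour relation (Du et al. 2020)
M_G_SUN = 5.05    # absolute g-band magnitude of the Sun (Willmer 2018)

def absolute_magnitude(m_app, distance_mpc):
    """Absolute magnitude from apparent magnitude and luminosity distance."""
    return m_app - 5.0 * math.log10(distance_mpc) - 25.0

def stellar_mass_msun(g_mag, i_mag, distance_mpc):
    """Stellar mass from the g - i colour and g-band luminosity."""
    log_ml = A_COEFF + B_COEFF * (g_mag - i_mag)  # log10 of M/L in solar units
    abs_g = absolute_magnitude(g_mag, distance_mpc)
    lum_g = 10.0 ** (-0.4 * (abs_g - M_G_SUN))    # g-band luminosity in Lsun
    return 10.0**log_ml * lum_g

def log_mass_uncertainty_dex(flux, flux_err, ml_scatter_dex=0.24):
    """Quadrature sum (assumed) of the flux error and the M/L scatter, in dex."""
    flux_dex = flux_err / (flux * math.log(10.0))
    return math.hypot(flux_dex, ml_scatter_dex)

# Hypothetical LSBG: g = 17.5, i = 17.0 at 60 Mpc.
print(f"{stellar_mass_msun(17.5, 17.0, 60.0):.2e} Msun")
```

The 0.24 dex scatter dominates the combined uncertainty unless the flux error exceeds roughly 50%.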
We measure the
$3\sigma$
upper limits on the stellar masses of the dark sources using an aperture on the Legacy Survey images within the location of the H i detection. To create a meaningful aperture, we study the correlation between H i size and effective radius for the resolved LSBGs, as shown in Figure 1. We find that this relation has a best-fitting line defined by
$\log_{10}\left(\frac{R_{e}}{\mathrm{kpc}}\right) = (1.0\pm0.1) \times \log_{10}\left(\frac{D_\mathrm{HI}}{\mathrm{kpc}}\right) - (0.8\pm0.1)$
. For the dark sources we use this relation to create a circular aperture corresponding to their measured H i size. For the purpose of choosing an appropriate aperture to measure the upper limits, we measure
$D_\mathrm{HI}$
for poorly resolved dark sources with
$\sigma_\mathrm{galaxy} \lt 2\times \sigma_\mathrm{beam}$
without correcting for the beam, noting that 4 of the 55 dark sources still do not have H i size measurements, as 3 were too diffuse and for 1 source the Gaussian model did not converge.
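The aperture choice for the dark sources can be sketched from the best-fitting relation above. The conversion to an angular radius assumes a small-angle approximation at the angular diameter distance; the function names are hypothetical, and the uncertainties on the slope and intercept are ignored.

```python
import math

def effective_radius_kpc(d_hi_kpc, slope=1.0, intercept=-0.8):
    """Effective radius predicted from the H i diameter (best-fitting line)."""
    return 10.0 ** (slope * math.log10(d_hi_kpc) + intercept)

def aperture_radius_arcsec(d_hi_kpc, angular_diameter_distance_mpc):
    """Angular radius of a circular aperture matching the predicted R_e."""
    r_e_kpc = effective_radius_kpc(d_hi_kpc)
    theta_rad = r_e_kpc / (angular_diameter_distance_mpc * 1000.0)
    return math.degrees(theta_rad) * 3600.0

# Hypothetical dark source: D_HI = 20 kpc at an angular diameter distance of 50 Mpc.
print(f"R_e = {effective_radius_kpc(20.0):.2f} kpc, "
      f"aperture radius = {aperture_radius_arcsec(20.0, 50.0):.1f} arcsec")
```

With a slope of 1.0 the relation reduces to a fixed ratio, so the predicted effective radius is about 16% of the H i diameter.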

Figure 1. The effective radius (
$R_{e}$
) as a function of H i size (
$D_\mathrm{HI}$
) for the well-resolved LSBGs. The black line shows the best-fitting line:
$\log_{10}\left(\frac{R_{e}}{\mathrm{kpc}}\right) = (1.0\pm0.1) \times \log_{10}\left(\frac{D_\mathrm{HI}}{\mathrm{kpc}}\right) - (0.8\pm0.1)$
.
3.2.3 Star formation rates
Newly formed high-mass stars emit strongly in the ultraviolet (UV). This radiation is partly absorbed by dust and re-emitted in the mid- and far-infrared; this reprocessed emission can therefore be used as a star formation rate (SFR) indicator. The SFR can be derived from the WISE W3 luminosity after correcting for the contribution of old stellar populations to the W3 band; this Rayleigh-Jeans emission is estimated using the W1 luminosity (Cluver et al. 2017). However, as the LSBGs do not have strong W3 infrared emission, we calculate their SFRs solely from the GALEX UV images. The UV traces young massive stars and hence is often used as an indicator of SFR; however, it is susceptible to dust extinction, which must be accounted for. The SFRs are calculated following the method of Hao et al. (2011) using the total fluxes measured in the NUV and FUV bands. First we apply a Galactic reddening correction using Equations (9)–(12):




where
$m_\mathrm{NUV,corr}$
and
$m_\mathrm{FUV,corr}$
are the NUV and FUV magnitudes after applying the Galactic reddening correction to the measured NUV and FUV magnitudes (
$m_\mathrm{NUV}$
and
$m_\mathrm{FUV}$
), and
$\alpha$
and
$\beta$
are functions of
$E(B-V)$
. Next we apply the k-correction following Chilingarian et al. (Reference Chilingarian, Melchior and Zolotukhin2010). Then we apply an attenuation correction using Equations (13) and (14):


where
$m_\mathrm{FUV,atten}$
is the attenuation-corrected FUV magnitude. Finally, the SFR is calculated using Equation (15):

where
$L_\mathrm{FUV}$
is the reddening- and attenuation-corrected FUV luminosity. However, if
$\mathrm{FUV}-\mathrm{NUV}\le0$
, the internal dust attenuation correction cannot be applied, as the
$\gamma$
correction parameter becomes unphysical. The uncertainty in
$\log_{10}\left(\frac{\mathrm{SFR}}{\mathrm{M_{\odot} yr^{-1}}}\right)$
is taken to be
$\pm0.115$
dex from the scatter of the relation in Equation (15) (Hao et al. Reference Hao2011).
$3\sigma$
upper limits on the SFRs for the LSBGs that were not detected in the FUV are calculated from elliptical apertures with semi-major axes of two times the effective radius from the optical Sérsic models (or as estimated in Section 3.2.2 for the dark sources).

Figure 2. The H i mass as a function of redshift for the LSBGs (blue) and strong candidate dark source detections (green) compared to the rest of the detections in the WALLABY fields (grey). The phase 1 data and phase 2 data are plotted separately in panels (a) and (b) respectively as the detection limits of the pilot surveys differ. The
$5\sigma$
detection limit shown by the solid black line is calculated using a line width of 1 MHz. The
$5\sigma$
detection limits shown by the dashed black lines are calculated using the minimum and maximum line widths.
3.3 Source classification and reliability
Galaxies are classified as LSBGs if they have a mean g-band surface brightness fainter than 23 mag arcsec
$^{-2}$
within 1 effective radius (as calculated in the Sérsic model) and are classified as dark H i sources if they have no visible counterpart in the g-band image and no Sérsic model could be made at the H i coordinates of the Legacy Survey image. The optical properties of the dark sources are further analysed by coadding all four bands of Legacy Survey images and convolving with a boxcar kernel with a size of 2.6 arcsec by 2.6 arcsec to degrade the resolution and enhance the surface brightness sensitivity, better enabling the detection of diffuse emission. These images are shown in Appendix D, and one ‘dark’ source may have evidence of a potential optical counterpart (see Section 5.2). Additionally, we inspect the dark H i sources for signatures that suggest a tidal origin. Sources that have asymmetric H i features or lie within the virial radius of a neighbouring galaxy are potentially tidal debris. We fit a Sérsic model to the neighbouring galaxies in the NASA/IPAC Extragalactic Database (NED) that are within
$\pm 300$
km s
$^{-1}$
and measure their effective radii. We use the relation
$R_{e} = 0.015R_{200}$
from Kravtsov (Reference Kravtsov2013) to estimate the virial radius from the effective radius.
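The virial-radius check described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in this work: the function names are hypothetical, the projected separation uses a simple small-angle conversion at an assumed common distance, and only the virial-radius and velocity criteria are encoded (the visual check for asymmetric H i features is not).

```python
import math

def virial_radius_kpc(r_e_kpc, ratio=0.015):
    """Invert R_e = 0.015 R_200 to estimate R_200 from the effective radius."""
    return r_e_kpc / ratio

def projected_separation_kpc(ra1_deg, dec1_deg, ra2_deg, dec2_deg, distance_mpc):
    """Projected separation of two sky positions at an assumed common distance."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    # Haversine formula for the angular separation (radians).
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    theta = 2 * math.asin(math.sqrt(a))
    return theta * distance_mpc * 1000.0  # small-angle conversion, Mpc to kpc

def within_virial_radius(sep_kpc, r_e_kpc, dv_kms, dv_max_kms=300.0):
    """True if a source lies inside a neighbour's R_200 and within +/- dv_max in velocity."""
    return abs(dv_kms) <= dv_max_kms and sep_kpc <= virial_radius_kpc(r_e_kpc)
```

For example, a neighbour with a (hypothetical) effective radius of 3 kpc would have an estimated virial radius of 200 kpc, so a dark source at a projected separation of 150 kpc and a velocity offset of 120 km s^-1 would be flagged as potentially tidal.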
It is essential to consider the reliability of the detection of the dark H i sources, as the false positive rate of the SoFiA implementation in the WALLABY pipeline still requires further investigation. False positives could arise from a number of factors, including the presence of interference and residual continuum emission in the data cube. Nevertheless, all of the dark H i sources presented in this work have passed the quality checks required to be published in the WALLABY catalogues (Westmeier et al. Reference Westmeier2022; Murugeshan et al. Reference Murugeshan2024). There are, however, notes in the catalogue to caution users against trusting the reliability of these sources without further analysis. To distinguish strong dark source candidates from uncertain detections (requiring follow-up observations), we consider the strength of the detection with respect to the survey detection limit and we inspect the unmasked H i cubes and spectra to assess the strength of the detection compared to the noise level in the cube. Figure 2 shows the H i mass plotted against the redshift for the dark H i sources and LSBGs with respect to all of the WALLABY detections in the fields used. The phase 1 and 2 pilot data are plotted separately as the phase 2 survey has a lower noise level than the Hydra field. The limiting detectable H i mass is dependent on the velocity range, so the detection limit for each dark source is calculated by integrating the median local RMS noise level (of 1.85 mJy in the Hydra field and 1.7 mJy in the phase 2 cubes) over the
$w_{20}$
. The
$5\sigma$
detection limit line shown in the figure is calculated assuming a line width of 1 MHz (211 km s
$^{-1}$
) for consistency with the WALLABY pilot data releases. However, this width is not representative of all galaxies: the
$w_{20}$
emission line widths in our sample range from 0.09 to 1.55 MHz (19 to 327 km s
$^{-1}$
). Seven of the dark sources have H i masses less than a factor of two above the detection limit (for their given
$w_{20}$
width). Dark sources lying close to the detection limit could help to explain why large numbers of dark candidates have not been detected in previous H i surveys; however, it also suggests that these seven dark sources are less reliable detections and could simply be false positives caused by noise or artefacts in the data.
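The detection-limit reasoning above can be illustrated with a short sketch. Several ingredients here are assumptions that are not stated in the text: the standard optically thin relation M_HI = 2.356e5 D^2 S_int (with D in Mpc and S_int in Jy km s^-1, neglecting redshift factors), a channel width of 18.5 kHz, and noise that adds in quadrature across the channels spanned by the line. The exact integration used for the figure may differ.

```python
import math

C_KMS = 299_792.458       # speed of light in km/s
NU_HI_MHZ = 1420.405752   # rest frequency of the H i line in MHz

def width_mhz_to_kms(width_mhz):
    """Rest-frame velocity width corresponding to a frequency width at the H i line."""
    return C_KMS * width_mhz / NU_HI_MHZ

def hi_mass_limit(distance_mpc, rms_mjy, width_mhz, n_sigma=5.0, channel_khz=18.5):
    """n-sigma H i mass limit in solar masses for a line of the given width.

    Assumes (not stated in the text) an 18.5 kHz channel and uncorrelated
    noise, so the integrated noise grows as sqrt(number of channels).
    """
    n_chan = width_mhz * 1000.0 / channel_khz
    dv_chan = width_mhz_to_kms(channel_khz / 1000.0)
    s_int = n_sigma * (rms_mjy / 1000.0) * dv_chan * math.sqrt(n_chan)  # Jy km/s
    return 2.356e5 * distance_mpc ** 2 * s_int
```

With these assumptions, `width_mhz_to_kms` reproduces the 19, 211 and 327 km s^-1 widths quoted for 0.09, 1 and 1.55 MHz, and a 1 MHz line at 100 Mpc with an RMS of 1.7 mJy gives a limit of about 5.7 × 10^8 solar masses. Because the limit scales with the square root of the line width under the quadrature assumption, narrow-line sources can be detected at lower masses than the 1 MHz curve suggests.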
We identify a total of 38 strong candidate sources (leaving 17 uncertain H i detections). All of the strong dark source candidates have a peak signal-to-noise ratio (SNR)
$\gt4.9$
, where the SNR is defined as the peak flux density divided by the rms provided in the WALLABY catalogues. Furthermore, we check for dark H i sources that could be impacted by satellite radio frequency interference (RFI), whether from sidelobes and harmonics of radionavigation satellites below 1293 MHz, or satellites that use frequencies within the clean mid band (1293 to 1437 MHz; Lourenço et al. Reference Lourenço2024). We identify three candidates (WALLABY J
$130119+053553$
, WALLABY J
$132709-163509$
and WALLABY J
$132238-204726$
) that could be affected by RFI based on their central frequencies. All of these had already been flagged as uncertain detections, which is a testament to the accuracy of our classification. Three of the strong H i source candidates have a Parkes H i detection within 15.5 arcmin (size of HIPASS beam) of the WALLABY coordinates: WALLABY J131244-155218/HIPASS J1312-15, WALLABY J131928-123828/HIPASS J1319-12 (Barnes et al. Reference Barnes2001), and WALLABY J132825-253528/HIDEEP J1329-2533 (Minchin et al. Reference Minchin2003). Deep follow-up H i observations, such as with MeerKAT (e.g. Namumba et al. Reference Namumba2021; Maccagni et al. Reference Maccagni2024; Zabel et al. Reference Zabel2024), are still required to definitively confirm the detections of the dark H i sources.
Table 1. The number of WALLABY dark H i sources and LSBGs.

Table 2. The strong dark H i detections that show evidence that suggests they could be tidal remnants.

4. Results
4.1. Source identification
Table 1 shows the number of WALLABY sources in each category for each pilot survey field used. In total we find 315 LSBGs, 38 strong dark H i detections, and 17 uncertain dark H i detections. Of the 38 strong dark H i detections, 13 have signatures that suggest they could be tidal debris (1 from the Hydra field, 1 from the NGC 4808 field and 11 from the NGC 5044 field). Table 2 presents the sources that have clear tidal features.
Table 3 contains the properties of the strong dark H i detections. This includes the WALLABY field, WALLABY name, right ascension, declination, central velocity, luminosity distance,
$w_{50}$
,
$w_{20}$
, H i size, H i mass, stellar mass
$3\sigma$
upper limit, SFR
$3\sigma$
upper limit, peak signal-to-noise ratio and whether the source has signatures to suggest that it is a tidal remnant. Throughout this section, the uncertain dark H i detections are not included in Figures 3–5, however their properties are presented in Table C1 in Appendix C. The H i contours overlaid on optical images, moment 1 maps and unmasked spectra of the strong and uncertain dark H i detections are shown in Appendices B and C respectively.
Table 3. Properties of the strong dark source detections. From left to right the columns are: WALLABY field, WALLABY name, right ascension, declination, central velocity, luminosity distance, emission line width at half maximum,
$w_{20}$
emission line width, H i size (major axis at 1 M
$_{\odot}$
pc
$^{-2}$
contour level), H i mass, stellar mass
$3\sigma$
upper limit, SFR
$3\sigma$
upper limit, peak signal-to-noise ratio, and whether the source has evidence to suggest it could be a tidal remnant.

Figure 3 shows the location of the LSBGs and dark H i sources with respect to the rest of the WALLABY detections in the three fields. The Hydra cluster and virial radius (
$r_{200}=1.44$
Mpc projected onto the sky at 61 Mpc; Reiprich & Böhringer Reference Reiprich and Böhringer2002; Reynolds et al. Reference Reynolds2022) and NGC 5044 and
$r_{500}$
overdensity radius (
$r_{500}=620$
kpc projected onto the sky at 33 Mpc; Osmond & Ponman Reference Osmond and Ponman2004) are also shown. There does not appear to be a preferential distribution of the dark H i sources within each field. Most have been detected in the NGC 5044 field, as this field covers a larger area than the other two (NGC 5044 covers
$\sim 120$
deg
$^2$
, Hydra covers
$\sim 60$
deg
$^2$
and NGC 4808 covers
$\sim 30$
deg
$^2$
).
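As a quick check of the scales drawn in Figure 3, the projected angular radii follow from the quoted physical radii and distances. This is a small-angle sketch; the function name is hypothetical.

```python
import math

def projected_radius_deg(radius_mpc, distance_mpc):
    """Angular radius (degrees) of a physical radius at a given distance (small angle)."""
    return math.degrees(radius_mpc / distance_mpc)

hydra_r200_deg = projected_radius_deg(1.44, 61.0)    # Hydra cluster virial radius
ngc5044_r500_deg = projected_radius_deg(0.62, 33.0)  # NGC 5044 r500 radius
```

Both radii come out at a little over one degree (about 1.35 deg for Hydra and 1.08 deg for NGC 5044), so each circle covers only a small part of its field.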
4.2. Global galaxy properties
Figure 4 shows the H i size-mass relation with the resolved WALLABY LSBGs and dark H i sources. The H i size is defined as the major axis of the 1 M
$_{\odot}$
pc
$^{-2}$
isodensity contour. As discussed in Section 3.1, the H i size could only be measured for the
$\sim 9.2$
% of sources that were sufficiently well resolved. For comparison, the best-fitting relation and
$3\sigma$
scatter for the galaxies studied by Wang et al. (Reference Wang2016) are shown. The relationship extends over a large range of H i masses, from
$\sim10^{5.5}$
M
$_{\odot}$
all the way to
$\sim10^{11}$
M
$_{\odot}$
. Notable galaxies have been marked by star symbols. The sample studied by Wang et al. (Reference Wang2016) contains galaxies across a range of morphologies and environments, and yet all lie on the same
$D_\mathrm{HI}-M_{HI}$
relation with a small
$1\sigma$
scatter of
$\sim0.06$
dex. Some (almost) dark H i sources are known to lie above this relation, such as the dark cloud found in Kilborn et al. (Reference Kilborn, Forbes, Koribalski, Brough and Kern2006). The WALLABY LSBGs and dark H i sources follow this relation remarkably well. The
$1\sigma$
scatters of the LSBGs and dark H i detections are only 0.04 dex and 0.08 dex, respectively. Hence, despite having extreme optical properties, our well-resolved LSBGs and strong dark H i sources appear to have typical H i sizes. We emphasise here that whilst the well-resolved dark sources are consistent with this relation, we have yet to confirm whether they are genuine H i detections. The tightness of the H i size-mass correlation is thought to arise from a constant average H i surface density across different galaxy morphologies (Broeils & Rhee Reference Broeils and Rhee1997), and environmental stripping processes, such as ram pressure stripping, that cause gas disc truncation have been shown not to impact this relation (Stevens et al. Reference Stevens2019). The H i surface density has been shown to be regulated by the conversion of H i to molecular hydrogen and star formation; however, Wang et al. (Reference Wang2016) find that these two factors cannot be the only or major drivers of the H i size-mass relation. This is consistent with our LSBGs and dark sources following the relation, despite their lack of significant star formation, to which we turn next.
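To make the comparison with the size-mass relation concrete, the sketch below evaluates the offset of a source from the relation and the implied mean H i surface density. The slope and intercept are the best-fitting values as published by Wang et al. (2016) for D_HI in kpc and M_HI in solar masses; they are not quoted in this text and should be checked against that paper. The function names are hypothetical.

```python
import math

WANG16_SLOPE = 0.506        # assumed from Wang et al. (2016)
WANG16_INTERCEPT = -3.293   # assumed from Wang et al. (2016)
WANG16_SCATTER_DEX = 0.06   # 1-sigma scatter quoted in the text

def expected_log_dhi_kpc(log_mhi):
    """log10 of the H i diameter (kpc) predicted by the relation for log10(M_HI / Msun)."""
    return WANG16_SLOPE * log_mhi + WANG16_INTERCEPT

def size_mass_offset_dex(dhi_kpc, mhi_msun):
    """Vertical offset (dex) of a measured H i diameter from the relation."""
    return math.log10(dhi_kpc) - expected_log_dhi_kpc(math.log10(mhi_msun))

def mean_surface_density(mhi_msun, dhi_kpc):
    """Mean H i surface density (Msun per pc^2) inside the diameter D_HI."""
    radius_pc = dhi_kpc * 1000.0 / 2.0
    return mhi_msun / (math.pi * radius_pc ** 2)
```

A source can then be compared with the 3-sigma band by testing `abs(size_mass_offset_dex(d, m)) <= 3 * WANG16_SCATTER_DEX`. Because the slope is close to 0.5, the mean surface density implied by the relation stays near 4 solar masses per square parsec over three orders of magnitude in H i mass, which is the constant average surface density referred to above.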

Figure 3. The location of the LSBGs (blue), dark H i sources with tidal features (orange) and the rest of the dark H i sources (green) with respect to the rest of the WALLABY sources (grey). A 1 deg
$^{2}$
box is shown in the lower left corner for scale. The Hydra cluster and virial radius are shown in panel (a) and the NGC 5044 group and
$r_{500}$
overdensity radius are shown in panel (c).

Figure 4. The H i size-mass relation for the well resolved LSBGs (blue), dark sources with tidal features (orange) and the rest of the dark sources (green). The H i size (
$D_\mathrm{HI}$
) is the major axis at 1 M
$_{\odot}$
pc
$^{-2}$
. The relation and
$3\sigma$
scatter from Wang et al. (Reference Wang2016) are shown by the solid and dotted lines respectively.

Figure 5. Gas and stellar property comparisons with the LSBGs (blue), dark sources with tidal features (orange) and the rest of the dark sources (green). The xGASS galaxies are shown in light grey, with the rolling median and interquartile range given by the black line and shaded region. The scaling relations of the ALFALFA galaxies are shown by the dashed lines. The WALLABY Eridanus galaxies are shown in dark grey. Upper limits are denoted by triangular symbols or arrows. The plots show (a) the H i mass (
$M_{HI}$
) against the stellar mass (
$M_*$
), (b) the star formation efficiency (SFE) against
$M_*$
, (c) the star formation rate (SFR) against
$M_*$
, and (d) the specific star formation rate (sSFR) against
$M_*$
.

Figure 6. Panel (a) shows the H i mass against
$w_{50}$
emission line width for the WALLABY LSBGs (blue), ALFALFA almost dark sources (grey), dark tidal sources (orange) and other candidate dark sources (green). Panel (b) presents the histogram of the redshifts of the dark tidal sources and the other dark sources.
Figure 5 presents H i mass, stellar mass, SFR and star formation efficiency (SFE) scaling relations. We use the GALEX Arecibo SDSS Survey (xGASS; Catinella et al. Reference Catinella2018) galaxies as our control sample to compare with our LSBG and dark H i sources. xGASS is a gas fraction- and volume-limited H i survey of galaxies selected by stellar mass and redshift, minimising the effect of detection limit bias. The xGASS galaxies are shown by light grey markers, and the rolling median (excluding non-detections) and interquartile range are shown by the black line and shaded region. The dashed lines show the scaling relations from ALFALFA (Huang et al. Reference Huang2012), another H i selected sample, for comparison. Additionally, we compare the WALLABY pre-pilot observation Eridanus galaxies from For et al. (Reference For2021), shown by dark grey markers. This sample consists of 55 H i detections, 43 of which are part of the Eridanus supergroup. It has a large fraction of H i deficient galaxies and their distorted H i morphologies suggest the presence of ongoing tidal interactions (Wang et al. Reference Wang2022). The LSBGs in our sample are shown in blue, with the median errorbars shown in the bottom left of the plot. The upper limits of the dark H i sources with tidal signatures and the other dark H i sources are denoted by the orange and green markers respectively.
Figure 5a shows the H i mass plotted against the stellar mass. Our LSBGs have large H i masses compared to the xGASS sample within the overlapping stellar mass range. This is a combination of selection effects: our LSBGs are, by definition, low surface brightness and so are expected to have low stellar masses, while at the same time having enough H i to have been detected in WALLABY. The LSBGs are not as gas-deficient as the Eridanus sources, and tend to follow the trend of the ALFALFA galaxies for
$\log_{10}\left(\frac{M_*}{\mathrm{M_{\odot}}}\right) \lt 9$
, while those with
$\log_{10}\left(\frac{M_*}{\mathrm{ M_{\odot}}}\right) \gt 9$
seem to have higher H i masses. There are, however, several LSBGs with
$\log_{10}\left(\frac{M_*}{\mathrm{ M_{\odot}}}\right) \lt 9$
that deviate from the ALFALFA scaling relation towards the parameter space occupied by the upper limits of the dark H i sources. The stellar masses of the LSBGs extend over a large range due to the varied nature of the sample, spanning from dwarfs all the way to large diffuse galaxies. There are 15 LSBGs with very high stellar masses (
$\log_{10}\left(\frac{M_*}{\mathrm{M_{\odot}}}\right) \gt 10$
). They are all fairly distant sources at redshifts between 0.034 and 0.078, with large effective radii (
$7.4 \lt R_e \lt 22.1 $
kpc). 12 of the 15 show regular rotation in their moment 1 maps.
Figure 5b presents the SFE (the ratio of the SFR and the H i mass) as a function of stellar mass. The SFEs of the LSBGs tend to lie below the xGASS median, and the distribution is relatively flat across the stellar mass range (which is consistent with the findings of Wong et al. Reference Wong2016). Figures 5c and 5d show the SFR and specific SFR (sSFR; the ratio of the SFR and the stellar mass) against the stellar mass. The LSBGs tend to have lower stellar masses than the xGASS galaxies, but within the range of stellar masses spanned by the xGASS sample, most of the WALLABY LSBGs have SFRs that lie above the xGASS median. The sSFRs of the LSBGs tend to be larger than those of xGASS across all stellar masses, and follow the trend of the ALFALFA galaxies. Altogether, Figure 5 illustrates that the WALLABY LSBGs are a distinct population from the xGASS galaxies by selection, and that most, but not all, tend to follow the trend of (H i selected) ALFALFA galaxies. The stellar masses and SFRs of the LSBGs are an extension of the high surface brightness population. However, once the H i is taken into consideration, a completely different mode of galaxy evolution becomes clear, with the large reservoirs of gas used sparingly for their mass. The stellar mass, SFE and SFR upper limits of the dark H i sources highlight the parameter space that is limited by the depth of our current surveys.
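The derived quantities plotted in Figure 5 are simple ratios, sketched below with hypothetical function names and purely illustrative input values (not measurements from this work). The depletion time is included only as the inverse of the SFE for intuition.

```python
def star_formation_efficiency(sfr_msun_yr, m_hi_msun):
    """SFE = SFR / M_HI, in yr^-1."""
    return sfr_msun_yr / m_hi_msun

def specific_sfr(sfr_msun_yr, m_star_msun):
    """sSFR = SFR / M_*, in yr^-1."""
    return sfr_msun_yr / m_star_msun

def depletion_time_gyr(sfr_msun_yr, m_hi_msun):
    """Time to consume the H i reservoir at the current SFR, in Gyr (1 / SFE)."""
    return m_hi_msun / sfr_msun_yr / 1e9

# Illustrative values only: SFR = 0.1 Msun/yr, M_HI = 1e9 Msun, M_* = 1e8 Msun.
sfe = star_formation_efficiency(0.1, 1e9)   # 1e-10 yr^-1
ssfr = specific_sfr(0.1, 1e8)               # 1e-9 yr^-1
t_dep = depletion_time_gyr(0.1, 1e9)        # 10 Gyr
```

For a source with only an SFR upper limit, the same ratio yields an upper limit on the SFE, since the H i mass is measured directly.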
5. Discussion
5.1. Low surface brightness galaxies
We find that LSBGs make up a significant proportion of the gas-rich galaxy population, with 17% of the 1829 WALLABY detections used in this study having low surface brightness optical counterparts. Even this proportion may underestimate the true number of LSBGs due to sensitivity limitations. Using the cosmological hydrodynamical simulation Horizon-AGN (Dubois et al. Reference Dubois2014), Martin et al. (Reference Martin2019) predicted that LSBGs (mean r-band surface brightness within 1
$R_e$
that is
$ \gt 23$
mag arcsec
$^{-2}$
) are expected to make up 47% of galaxies with
$M_{*}\gt10^8$
M
$_{\odot}$
, and 85% of galaxies with
$M_{*}\gt10^7$
M
$_{\odot}$
. Of the 315 LSBGs that we identified, 75% were not catalogued in NED (within 10 arcsec of the optical source centres). This highlights both the power of H i surveys like WALLABY for identifying gas-rich low surface brightness sources that may be missed by photometric studies, and the lack of deep optical imaging in the southern hemisphere. With H i surveys now combined with new deep optical surveys such as the Legacy Survey, we are finally entering an era where these extreme sources can be studied in large numbers for the first time, which foreshadows what could be achieved with LSST.
There were no matches between the LSBGs and UDGs in the Hydra cluster catalogued by La Marca et al. (Reference La Marca2022) and the WALLABY LSBGs. This does not necessarily imply that the cluster LSBGs and UDGs are completely devoid of H i, but rather that their H i masses may lie below the WALLABY detection threshold (WALLABY has a
$5\sigma$
H i mass sensitivity of
$\sim5.5\times10^8\,(D/100\,\mathrm{Mpc})^2$
M
$_{\odot}$
for point sources, Murugeshan et al. Reference Murugeshan2024). For et al. (Reference For2023) found similar results when searching for H i detections of UDG candidates from the Systematically Measuring Ultra-diffuse Galaxies survey (SMUDGes; Zaritsky et al. Reference Zaritsky2022) in the Eridanus supergroup using the WALLABY pre-pilot data. They found 6 UDGs that were undetected in H i and only one H i-bearing low surface brightness dwarf galaxy.
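The quoted point-source sensitivity can be turned into a rough detectability estimate. This sketch takes the approximate 5.5 × 10^8 (D/100 Mpc)^2 scaling at face value; the function names are hypothetical, and real detectability also depends on line width and local noise.

```python
import math

def wallaby_mass_limit(distance_mpc, norm_msun=5.5e8):
    """Approximate 5-sigma point-source H i mass limit (Msun) at a given distance."""
    return norm_msun * (distance_mpc / 100.0) ** 2

def max_detectable_distance_mpc(m_hi_msun, norm_msun=5.5e8):
    """Distance (Mpc) out to which a point source of mass M_HI exceeds the limit."""
    return 100.0 * math.sqrt(m_hi_msun / norm_msun)
```

At the 61 Mpc distance adopted for the Hydra cluster, this scaling corresponds to a limit of roughly 2 × 10^8 solar masses, so cluster UDGs with smaller H i reservoirs would not be expected in the WALLABY catalogue.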
In addition to calculating the SFR of the LSBGs from the GALEX UV emission, we also investigated the infrared emission from WISE. Of the 315 LSBGs, 190 were detected in GALEX FUV, while only 19 had a W3 detection in WISE. Of all the LSBGs that were detected in W3, none had a sufficiently high W3 luminosity to calculate a SFR after subtraction of the stellar continuum using the W1 luminosity correction. This lack of dust, even in sources with UV-detected star formation, suggests that the interstellar medium conditions of low surface brightness sources are relatively dust-poor compared to more massive and luminous galaxies (up to the sensitivity limits of the WISE data). Our sample of LSBGs spans a large range in stellar masses (
$5\times10^5$
M
$_{\odot}$
$ \lt M_*\lt1\times10^{11}$
M
$_{\odot}$
) and SFR (
$9\times10^{-5}$
M
$_{\odot}$
yr
$^{-1}$
$\lt$
SFR
$\lt6$
M
$_{\odot}$
yr
$^{-1}$
). This shows that the LSBGs exhibit considerable diversity, from irregular dwarfs to massive spirals. This large sample of LSBGs with significant H i content in a range of environments suggests that a low surface brightness cannot always be the result of a loss of gas and early quenching, as has been suggested for cluster ultra-diffuse galaxies (van Dokkum et al. Reference van Dokkum2015). Internal mechanisms that suppress significant star formation in these galaxies must also exist, such as supernova feedback (e.g. Di Cintio et al. Reference Di Cintio2017), and high spin (e.g. Leisman et al. Reference Leisman2017) which has been directly linked to galaxies with high gas fractions (Mancera Piña et al. Reference Mancera Piña2021).
5.2. Dark H i sources
In this work, we aim to investigate the nature of the dark H i sources in the WALLABY pilot catalogues. While they both lack optical counterparts, it is useful to differentiate between isolated dark sources and dark tidal clouds, as they have different formation mechanisms and consequently different properties. There is a possibility that a small number of the isolated dark sources could be higher H i mass counterparts of primordial dark galaxies. Using the IllustrisTNG cosmological hydrodynamical simulations, Lee et al. (Reference Lee, Hwang, Lee, Shin and Song2024) find that in the early Universe, dark galaxies initially tend to form in less dense regions. Their star formation is suppressed by heating from cosmic reionisation and a lack of mergers and interactions. They are predicted to form in dark matter haloes with high spin parameters (Jimenez & Heavens Reference Jimenez and Heavens2020) and be stable to the effects of harassment (Taylor et al. Reference Taylor2016). REionization-Limited H i Clouds (RELHIC; Benítez-Llambay et al. Reference Benítez-Llambay2017) are a type of dark galaxy predicted by simulations. They are starless, low mass dark matter haloes that host gas which is almost completely ionised but with small (H i size
$\lt1$
kpc), round neutral cores. The gas is in hydrostatic equilibrium with the gravitational potential of the dark matter halo and in thermal equilibrium with the ionising UV background. Some observed ultra compact high velocity clouds (UCHVC; Adams et al. Reference Adams, Giovanelli and Haynes2013) are consistent with RELHICs. On the other hand, dark tidal clouds lack dark matter as they are formed from stripped gas in galaxy interactions (e.g. Duc & Bournaud Reference Duc and Bournaud2008) and exist over shorter timescales. 13 of the strong H i detections have signatures that suggest they are tidal debris or dark tidal dwarf galaxies (as discussed in Section 4.1). The other 25 dark sources are either artefacts of an unknown nature or extreme LSBGs (with optical counterparts beyond the limits of the Legacy Survey). These dark galaxy candidates that we have identified are not consistent with UCHVCs or RELHICs as their H i sizes and velocity widths are too large.
Figure 6a presents the H i mass against the
$w_{50}$
emission line width for the LSBGs, dark tidal sources and the other candidate dark sources. Additionally, the almost dark galaxies from the ALFALFA survey (Leisman et al. Reference Leisman2017) are shown for comparison. The dark sources span the same parameter space as the LSBGs, with the dark tidal sources preferentially distributed in the lower H i mass region. The ALFALFA almost dark sources span the same range of
$w_{50}$
values and have a narrower H i mass range. Figure 6b highlights that we are only able to detect dark tidal cloud candidates at lower redshifts, suggesting that at higher redshifts they may be subject to source confusion, or be too low mass to be detectable. Figure 5a highlights that all the candidate dark H i sources, assuming they are genuine detections, would have to have considerable H i mass (
$\gt4.9\times10^7$
M
$_{\odot}$
), despite their low stellar mass upper limits. All of the candidates would have
$\frac{M_{HI}}{M_{*}}\gt187$
, and 89% of the dark sources have
$\frac{M_{HI}}{M_{*}}\gt500$
. The sensitivities of these new optical and H i surveys are pushing us to consider the limits of what we define as a galaxy.
Unfortunately, none of the candidate dark sources were sufficiently well resolved both spatially and spectrally to have kinematic models generated from the WALLABY Kinematic Analysis Proto-Pipeline (WKAPP; Deg et al. Reference Deg2022; Murugeshan et al. Reference Murugeshan2024), and consequently meaningful dynamical mass estimates could not be made. Higher resolution observations of the dark candidates, such as with MeerKAT, would allow us to determine which are artefacts of an unknown nature, which are tidal debris of unknown origin, and which may be ‘failed’ galaxies. Many models and simulations predict dark galaxies with halo masses of order
$\lt10^9$
M
$_{\odot}$
(e.g. Benítez-Llambay et al. Reference Benítez-Llambay2017; Jimenez & Heavens Reference Jimenez and Heavens2020; Benitez-Llambay & Frenk Reference Benitez-Llambay and Frenk2020; Lee et al. Reference Lee, Hwang, Lee, Shin and Song2024). In contrast, 17 of the 25 strong dark H i detections without tidal features have H i masses
$\gt10^{9}$
M
$_{\odot}$
. If these dark galaxy candidates are revealed to be genuine detections by follow-up H i observations, this indicates a significant gap in our current understanding of galaxy formation. On the other hand, deeper optical observations could reveal faint stellar counterparts to many of the high-mass candidates. In fact, although no optical counterpart can be easily seen in the g-band image of WALLABY J131244-155218 (Figure B16), an extremely faint optical counterpart is just visible in the co-added image of the g, r, i and z-bands (convolved with a boxcar kernel with a size of 2.6 arcsec by 2.6 arcsec), as shown in Figure 7. Co-added images of all the dark sources are presented in Appendix D.
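The co-adding and boxcar smoothing step can be sketched as below. This is an illustration only: it uses a plain unweighted sum of the bands, assumes a Legacy Survey pixel scale of 0.262 arcsec (not stated in the text), and is written with the standard library so that it is self-contained. In practice an array library would be used (for example `scipy.ndimage.uniform_filter`).

```python
PIXEL_SCALE_ARCSEC = 0.262  # assumed pixel scale; not stated in the text
BOX_ARCSEC = 2.6            # boxcar size quoted in the text

def coadd(images):
    """Pixel-wise sum of equally sized 2D images given as lists of rows."""
    return [[sum(pixels) for pixels in zip(*rows)] for rows in zip(*images)]

def boxcar_smooth(image, box_pix):
    """Replace each pixel by the mean of a box_pix x box_pix window (clipped at edges)."""
    ny, nx = len(image), len(image[0])
    half = box_pix // 2
    smoothed = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        rows = range(max(0, y - half), min(ny, y - half + box_pix))
        for x in range(nx):
            cols = range(max(0, x - half), min(nx, x - half + box_pix))
            window = [image[j][i] for j in rows for i in cols]
            smoothed[y][x] = sum(window) / len(window)
    return smoothed

# Box size in pixels for the assumed pixel scale.
box_pix = round(BOX_ARCSEC / PIXEL_SCALE_ARCSEC)
```

For uncorrelated pixel noise, averaging over a box of N pixels lowers the per-pixel noise by about the square root of N, a factor of roughly 10 for the box assumed here. This is what allows diffuse emission spread over many pixels to become visible at the cost of resolution.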

Figure 7. Co-added g, r, i and z-band image convolved with a boxcar kernel of WALLABY J131244-155218. H i contours ([0.1,0.5,1]
$\times 10^{20}$
cm
$^{-2}$
) are overlaid and the WALLABY beam is shown in the lower left corner. Although this appears to be a dark source from the g-band image, a faint optical counterpart is visible in the co-added image.
Very few of the ALFALFA H i sources were found to be dark galaxy candidates (Haynes et al. Reference Haynes2011; Janowiecki et al. Reference Janowiecki2015; Cannon et al. Reference Cannon2015). While we are yet to confirm whether all of the dark WALLABY H i sources are genuine detections, the discrepancy between the number of dark sources detected by the two surveys could arise from the different survey properties. While ALFALFA and WALLABY have similar sensitivities, the significantly better angular resolution of WALLABY means that we are better able to localise the H i emission and thus reduce source confusion. The 55 dark WALLABY detections are likely to be a mix of artefacts (especially the uncertain detections shown in Appendix C), tidal debris, extreme LSBGs and dark galaxy candidates.
6. Summary and conclusions
We have presented 315 LSBGs identified in the WALLABY pilot data. To measure the photometry of the optical and ultraviolet observations, we use the Python package AstroPhot to fit Sérsic models to the galaxies. These fits are done consistently across multiple bands and are designed to give consistent, SNR-optimised photometry. The LSBGs are defined by a mean g-band surface brightness within 1
$R_e$
fainter than 23 mag arcsec
$^{-2}$
, and the faintest LSBG in our sample has a mean g-band surface brightness within 1
$R_e$
of 25.7 mag arcsec
$^{-2}$
. All of our dark H i sources and 75% of our LSBGs had not been catalogued prior to WALLABY (within 10 arcsec of the optical source centres), highlighting both the extreme nature of these sources and the lack of multiwavelength coverage in the southern hemisphere.
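The LSBG criterion can be expressed compactly. The sketch below assumes that half of the total model flux falls within the effective radius (true by definition of the Sérsic effective radius) and that the aperture is an ellipse with semi-major axis R_e and axis ratio q. The function names and the treatment of the axis ratio are assumptions, not a description of the AstroPhot-based measurement used here.

```python
import math

def mean_sb_within_re(total_mag, r_e_arcsec, axis_ratio=1.0):
    """Mean surface brightness within R_e in mag arcsec^-2.

    Half of the total flux is spread over an ellipse of area pi * q * R_e^2.
    """
    area = math.pi * axis_ratio * r_e_arcsec ** 2
    return total_mag + 2.5 * math.log10(2.0) + 2.5 * math.log10(area)

def is_lsbg(total_mag, r_e_arcsec, axis_ratio=1.0, threshold=23.0):
    """True if the mean g-band surface brightness within R_e is fainter than the threshold."""
    return mean_sb_within_re(total_mag, r_e_arcsec, axis_ratio) > threshold
```

As an illustration with made-up values, a circular source with a total g-band magnitude of 17 and an effective radius of 10 arcsec has a mean surface brightness of about 24.0 mag arcsec^-2 and would be classified as an LSBG, whereas the same size with a magnitude of 15 (about 22.0 mag arcsec^-2) would not.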
In addition to the LSBGs, we find 55 H i detections without optical counterparts in the deepest observations available. We investigate the nature of these candidate dark H i sources in the WALLABY pilot catalogues. We assess their reliability, and identify 38 to be strong candidates. Of these, we find that 13 show signatures of tidal remnants, while the other 25 are isolated, so could be fainter LSBGs, genuinely dark galaxies, or artefacts of an unknown nature. A large proportion of the non-tidal dark sources have large H i masses (if they are genuine detections), with 68% having H i masses
$\gt 10^9$
M
$_{\odot}$
. This is in conflict with simulations, which predict an abundance of lower mass dark galaxies. If the dark sources are revealed to be genuine dark galaxy candidates by follow-up H i observations, this indicates a significant gap in our current understanding of galaxy formation. On the other hand, deeper optical observations could reveal faint stellar counterparts to many of the high mass candidates. While we are yet to confirm whether all of the dark WALLABY sources are genuine detections, the discrepancy between the number of dark sources detected by ALFALFA and WALLABY may arise from the different survey properties. Although ALFALFA and WALLABY have similar sensitivities, the significantly better angular resolution of WALLABY means that we are better able to localise the H i emission and thus reduce source confusion.
We use scaling relations to study the global galaxy properties of our dark and low surface brightness sample. Both the WALLABY LSBGs and dark sources (that are sufficiently well resolved) follow the H i size-mass relation remarkably well. Hence, despite having extreme optical properties, our LSBGs and dark H i sources do in fact have typical H i galaxy properties. The WALLABY LSBGs have high H i masses for their stellar masses when compared with the xGASS galaxies due to selection effects. On the other hand, they do have similar H i masses to the (H i-selected) ALFALFA sample for
$\log_{10}\left(\frac{M_*}{\mathrm{ M_{\odot}}}\right) \lt 9$
, while those with
$\log_{10}\left(\frac{M_*}{\mathrm{M_{\odot}}}\right) \gt 9$
seem to have higher H i masses. The sSFRs of the LSBGs follow the trend of the ALFALFA galaxies across all stellar masses, and tend to be larger than those of the xGASS galaxies. We find that the WALLABY LSBGs have low SFEs, and have stellar masses spanning five orders of magnitude, which highlights the varied morphologies across our sample, ranging from tiny dwarf galaxies to large ultra-diffuse galaxies. The stellar mass and star formation rate upper limits of the dark sources illustrate the unexplored parameter space that is limited by the sensitivity of current surveys.
Assuming that the non-tidal dark sources are not preferentially distributed with respect to the environment, across the 1.4
$\pi$
steradians to be covered by the full WALLABY survey we can expect to detect
$\sim570$
isolated dark sources. We have highlighted the challenges that the full survey will face with respect to distinguishing true dark sources from false positive detections. To confirm the reliability of our WALLABY dark sources, follow-up H i observations with a high-resolution and high-sensitivity instrument, such as MeerKAT or the upcoming Square Kilometre Array, are required. Additionally, deep optical imaging could help push down to lower surface brightness levels to reveal whether the dark sources are failed galaxies with very little stellar content, or perhaps none at all. Our understanding of galaxy formation and evolution is coupled to the galaxies that are visible (at optical wavelengths) in past and current surveys. This work provides a glimpse into the future of studying the H i-rich optically faint Universe and the potential for interplay between radio and multiwavelength observations in the upcoming Square Kilometre Array era.
Acknowledgements
We are grateful to the anonymous referee for their useful feedback and suggestions. We thank Barbara Catinella for her contributions that have improved this manuscript.
This scientific work uses data obtained from Inyarrimanha Ilgari Bundara/the Murchison Radio-astronomy Observatory. We acknowledge the Wajarri Yamaji People as the Traditional Owners and native title holders of the Observatory site. CSIRO’s ASKAP radio telescope is part of the Australia Telescope National Facility (https://ror.org/05qajvd42). Operation of ASKAP is funded by the Australian Government with support from the National Collaborative Research Infrastructure Strategy. ASKAP uses the resources of the Pawsey Supercomputing Research Centre. Establishment of ASKAP, Inyarrimanha Ilgari Bundara, the CSIRO Murchison Radio-astronomy Observatory and the Pawsey Supercomputing Research Centre are initiatives of the Australian Government, with support from the Government of Western Australia and the Science and Industry Endowment Fund.
WALLABY acknowledges technical support from the Australian SKA Regional Centre (AusSRC).
Parts of this research were supported by the Australian Research Council Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), through project number CE170100013.
This investigation has made use of the NASA/IPAC Extragalactic Database (NED) which is operated by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration, and NASA’s Astrophysics Data System.
The Legacy Surveys consist of three individual and complementary projects: the Dark Energy Camera Legacy Survey (DECaLS; Proposal ID #2014B-0404; PIs: David Schlegel and Arjun Dey), the Beijing-Arizona Sky Survey (BASS; NOAO Prop. ID #2015A-0801; PIs: Zhou Xu and Xiaohui Fan), and the Mayall z-band Legacy Survey (MzLS; Prop. ID #2016A-0453; PI: Arjun Dey). DECaLS, BASS and MzLS together include data obtained, respectively, at the Blanco telescope, Cerro Tololo Inter-American Observatory, NSF’s NOIRLab; the Bok telescope, Steward Observatory, University of Arizona; and the Mayall telescope, Kitt Peak National Observatory, NOIRLab. Pipeline processing and analyses of the data were supported by NOIRLab and the Lawrence Berkeley National Laboratory (LBNL). The Legacy Surveys project is honored to be permitted to conduct astronomical research on Iolkam Du’ag (Kitt Peak), a mountain with particular significance to the Tohono O’odham Nation.
NOIRLab is operated by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation. LBNL is managed by the Regents of the University of California under contract to the U.S. Department of Energy.
This project used data obtained with the Dark Energy Camera (DECam), which was constructed by the Dark Energy Survey (DES) collaboration. Funding for the DES Projects has been provided by the U.S. Department of Energy, the U.S. National Science Foundation, the Ministry of Science and Education of Spain, the Science and Technology Facilities Council of the United Kingdom, the Higher Education Funding Council for England, the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the Kavli Institute of Cosmological Physics at the University of Chicago, Center for Cosmology and Astro-Particle Physics at the Ohio State University, the Mitchell Institute for Fundamental Physics and Astronomy at Texas A&M University, Financiadora de Estudos e Projetos, Fundacao Carlos Chagas Filho de Amparo a Pesquisa do Estado do Rio de Janeiro, Conselho Nacional de Desenvolvimento Cientifico e Tecnologico and the Ministerio da Ciencia, Tecnologia e Inovacao, the Deutsche Forschungsgemeinschaft and the Collaborating Institutions in the Dark Energy Survey.
The Collaborating Institutions are Argonne National Laboratory, the University of California at Santa Cruz, the University of Cambridge, Centro de Investigaciones Energeticas, Medioambientales y Tecnologicas-Madrid, the University of Chicago, University College London, the DES-Brazil Consortium, the University of Edinburgh, the Eidgenossische Technische Hochschule (ETH) Zurich, Fermi National Accelerator Laboratory, the University of Illinois at Urbana-Champaign, the Institut de Ciencies de l’Espai (IEEC/CSIC), the Institut de Fisica d’Altes Energies, Lawrence Berkeley National Laboratory, the Ludwig Maximilians Universitat Munchen and the associated Excellence Cluster Universe, the University of Michigan, NSF’s NOIRLab, the University of Nottingham, the Ohio State University, the University of Pennsylvania, the University of Portsmouth, SLAC National Accelerator Laboratory, Stanford University, the University of Sussex, and Texas A&M University.
BASS is a key project of the Telescope Access Program (TAP), which has been funded by the National Astronomical Observatories of China, the Chinese Academy of Sciences (the Strategic Priority Research Program “The Emergence of Cosmological Structures” Grant # XDB09000000), and the Special Fund for Astronomy from the Ministry of Finance. The BASS is also supported by the External Cooperation Program of Chinese Academy of Sciences (Grant # 114A11KYSB20160057), and Chinese National Natural Science Foundation (Grant # 12120101003, # 11433005).
The Legacy Survey team makes use of data products from the Near-Earth Object Wide-field Infrared Survey Explorer (NEOWISE), which is a project of the Jet Propulsion Laboratory/California Institute of Technology. NEOWISE is funded by the National Aeronautics and Space Administration.
The Legacy Surveys imaging of the DESI footprint is supported by the Director, Office of Science, Office of High Energy Physics of the U.S. Department of Energy under Contract No. DE-AC02-05CH1123, by the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility under the same contract; and by the U.S. National Science Foundation, Division of Astronomical Sciences under Contract No. AST-0950945 to NOAO.
KAO acknowledges support by the Royal Society through a Dorothy Hodgkin Fellowship (DHF/R1/231105).
Data availability statement
The WALLABY source catalogue and associated data products (e.g. cubelets, moment maps, integrated spectra, radial surface density profiles) are available online through the CSIRO ASKAP Science Data Archive (CASDA) and the Canadian Astronomy Data Centre (CADC). All source and kinematic model data products are mirrored at both locations. Links to the data access services and the software tools used to produce the data products, as well as documented instructions and example scripts for accessing the data, are available from the WALLABY Data Portal (https://wallaby-survey.org/data/). The photometric properties and additional H i properties measured in this study are available on request.
Appendix A. Sérsic Models
In this section, we use WALLABY J102113-262325 to illustrate an example of Sérsic fits to an LSBG. As discussed in Section 3.2.1, we use Sérsic models to measure the photometric properties of the galaxies. The g-band and i-band Legacy Survey images are modelled together and are shown in Figure A1, and the NUV and FUV GALEX images are modelled together and are shown in Figure A2. The parameters measured from the models are shown in Table A1. From these properties, we estimated the mean g-band surface brightness within 1 effective radius to be 23.3 mag arcsec$^{-2}$, the stellar mass of this LSBG to be $3\times10^7$ M$_{\odot}$ and the SFR to be 0.013 M$_{\odot}$ yr$^{-1}$.
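As an illustrative aside, the mean surface brightness within one effective radius can be recovered from Sérsic model parameters of the kind listed in Table A1. The sketch below assumes the standard relation $\langle\mu\rangle_e = m + 2.5\log_{10}(2\pi q R_e^2)$, with $R_e$ the semi-major-axis effective radius in arcsec, which follows from half of the total flux falling inside an ellipse of area $\pi q R_e^2$. The function names and input values are hypothetical and are not the measurements in Table A1, and the convention adopted in Section 3.2.1 may differ (for example in any extinction or inclination corrections).

```python
import math


def mean_surface_brightness(m_tot, r_e_arcsec, q=1.0):
    """Mean surface brightness (mag arcsec^-2) within one effective radius.

    Assumes half of the total flux lies inside an ellipse with semi-major
    axis r_e_arcsec and axis ratio q (area = pi * q * r_e^2).
    """
    area = math.pi * q * r_e_arcsec ** 2
    return m_tot + 2.5 * math.log10(2.0 * area)


def sersic_intensity(r, i_e, r_e, n):
    """Sersic profile I(r), normalised so that I(r_e) = i_e.

    Uses the approximation b_n ~ 1.9992 n - 0.3271, which is adequate
    for roughly 0.5 < n < 10.
    """
    b_n = 1.9992 * n - 0.3271
    return i_e * math.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))


# Hypothetical inputs, chosen only for illustration (not from Table A1)
mu_e = mean_surface_brightness(m_tot=18.0, r_e_arcsec=5.0, q=1.0)
print(f"<mu>_e = {mu_e:.2f} mag arcsec^-2")
```

At fixed total magnitude, doubling $R_e$ makes the mean surface brightness fainter by $2.5\log_{10}4 \approx 1.5$ mag arcsec$^{-2}$, which is why extended, low-mass systems fall so readily into the LSBG regime.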
Table A1. Properties of the LSBG WALLABY J102113-262325 measured from the Sérsic models. The optical parameters are measured from the g and i band Legacy Survey images, and the UV parameters are measured from the NUV and FUV GALEX images. The properties presented in this table are: $m_{\lambda}$ the total apparent magnitude in the respective bands, $R_e$ the effective radius, PA the position angle, q the axis ratio, and n the Sérsic index.


Figure A1. Sérsic fits to the g and i band Legacy Survey images of LSBG WALLABY J102113-262325. The left two panels of (a) are the optical images, with the box identifying the area to be modelled. The right two panels of (a) are the Sérsic models in each of the two bands. The left two panels of (b) are the residuals obtained by subtracting the model from the target image. The right two panels of (b) are the radial surface brightness profiles. The points show the median of the pixel values at a given radius in the image and the lines show the fitted models.

Figure A2. Sérsic fits to the NUV and FUV band GALEX images of LSBG WALLABY J102113-262325. The left two panels of (a) are the UV images, with the box identifying the area to be modelled. The right two panels of (a) are the Sérsic models in each of the two bands. The left two panels of (b) are the residuals obtained by subtracting the model from the target image. The right two panels of (b) are the radial surface brightness profiles. The points show the median of the pixel values at a given radius in the image and the lines show the fitted models.
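The radial profiles in Figures A1 and A2 plot the median pixel value at each radius. A minimal sketch of such a measurement is given below, assuming circular annuli about a known centre; the function name, the binning and the synthetic test image are hypothetical, and the actual profiles may instead use elliptical annuli that follow the fitted axis ratio and position angle.

```python
import numpy as np


def radial_median_profile(image, center, bin_width=1.0, r_max=None):
    """Median pixel value in circular annuli around center = (x0, y0).

    Returns the annulus mid-point radii (in pixels) and the median of the
    finite pixel values in each annulus. Empty annuli are skipped.
    """
    y, x = np.indices(image.shape)
    r = np.hypot(x - center[0], y - center[1])
    if r_max is None:
        r_max = r.max()
    edges = np.arange(0.0, r_max + bin_width, bin_width)
    # Bin index i satisfies edges[i] <= r < edges[i + 1]
    idx = np.digitize(r.ravel(), edges) - 1
    flat = image.ravel()
    radii, medians = [], []
    for i in range(len(edges) - 1):
        vals = flat[idx == i]
        vals = vals[np.isfinite(vals)]
        if vals.size:
            radii.append(0.5 * (edges[i] + edges[i + 1]))
            medians.append(np.median(vals))
    return np.array(radii), np.array(medians)


# Synthetic image whose pixel value equals the integer part of its radius
yy, xx = np.indices((41, 41))
test_image = np.floor(np.hypot(xx - 20, yy - 20))
radii, medians = radial_median_profile(test_image, (20, 20), 1.0, r_max=20)
```

The median is less sensitive than the mean to foreground stars and background galaxies that fall within an annulus, which matters for faint, extended sources.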
Appendix B. Strong Dark Source Detections
In this section we present the images of the strong dark source detections. The dark sources that may be tidal remnants from Table 2 are noted as ‘tidal’ in the captions. In Figures B1–B38, panel (a) is the g-band Legacy Survey image with H i contours of the dark source overlaid. The dashed contour represents the edge of the SoFiA mask. The lowest solid contour corresponds to the column density equal to the local rms of the unmasked moment 0 map. Additional contours equal to 3, 5 and 7 times the local rms column density are also shown where possible. Panel (b) is the Legacy Survey g-band image zoomed out to 30 arcmin, with the mask outline and the column density contour equal to the local rms of the unmasked moment 0 map of the dark source overlaid in magenta and other WALLABY sources overlaid in black. Panel (c) is the moment 1 map (velocity field) of the dark source. The WALLABY beam is shown in the lower left corner and the scale is shown in the lower right corner of each image. Panel (d) is the unmasked H i spectrum.
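The contour levels described above can be sketched as follows. The conversion assumes optically thin H i at low redshift, a moment 0 value expressed in Jy beam$^{-1}$ km s$^{-1}$ and a Gaussian beam described by its FWHM axes in arcsec, for which $N_{\rm HI} \approx 1.104\times10^{24}\,S/(\theta_{\rm maj}\theta_{\rm min})$ cm$^{-2}$. The function names, the 30 arcsec beam and the rms value are assumptions made for this example rather than values taken from the figures.

```python
def hi_column_density(flux_jy_kms_per_beam, bmaj_arcsec, bmin_arcsec):
    """H i column density (cm^-2) from a moment 0 value.

    Assumes optically thin emission at z ~ 0, with the moment 0 value in
    Jy/beam km/s and the beam FWHM axes in arcsec.
    """
    return 1.104e24 * flux_jy_kms_per_beam / (bmaj_arcsec * bmin_arcsec)


def contour_levels(local_rms, multiples=(1, 3, 5, 7)):
    """Contour levels at the given multiples of the local rms."""
    return [m * local_rms for m in multiples]


# Hypothetical local rms of the unmasked moment 0 map (Jy/beam km/s)
rms_moment0 = 0.01
# Assumed 30 arcsec circular beam
rms_column_density = hi_column_density(rms_moment0, 30.0, 30.0)
levels = contour_levels(rms_column_density)
print([f"{level:.2e}" for level in levels])
```

Because the levels are tied to the local rms of each map, the lowest contour traces a different absolute column density from source to source.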

Figure B1. WALLABY J100321-291708 (tidal; Hydra).

Figure B2. WALLABY J125513+080246 (tidal; NGC 4808).

Figure B3. WALLABY J125915-150108 (tidal; NGC 5044 field). Note, H i is detected on the other side of the galaxy, so this dark tidal cloud candidate may be part of a larger structure such as an outflow or polar ring.

Figure B4. WALLABY J131928-123828 (tidal; NGC 5044 field).

Figure B5. WALLABY J131331-160600 (tidal; NGC 5044).

Figure B6. WALLABY J133006-205341 (tidal; NGC 5044).

Figure B7. WALLABY J132202-161829 (tidal; NGC 5044).

Figure B8. WALLABY J130606-172523 (tidal; NGC 5044).

Figure B9. WALLABY J132948-180438 (tidal; NGC 5044).

Figure B10. WALLABY J133008-203319 (tidal; NGC 5044).

Figure B11. WALLABY J133747-175606 (tidal; NGC 5044).

Figure B12. WALLABY J133057-211755 (tidal; NGC 5044).

Figure B13. WALLABY J131009-171227 (tidal; NGC 5044).

Figure B14. WALLABY J101934-261721 (Hydra).

Figure B15. WALLABY J103853-274100 (Hydra).

Figure B16. WALLABY J131244-155218 (NGC 5044).

Figure B17. WALLABY J132022-240400 (NGC 5044).

Figure B18. WALLABY J132825-253528 (NGC 5044). The i-band image is shown because the g-band image is incomplete. While there is no obvious optical counterpart, we note that this source is in a crowded field.

Figure B19. WALLABY J125721-171102 (NGC 5044).

Figure B20. WALLABY J125855-142319 (NGC 5044).

Figure B21. WALLABY J132931-181615 (NGC 5044).

Figure B22. WALLABY J131704-171858 (NGC 5044).

Figure B23. WALLABY J131600-185222 (NGC 5044).

Figure B24. WALLABY J130347-180311 (NGC 5044).

Figure B25. WALLABY J131743-181822 (NGC 5044).

Figure B26. WALLABY J132328-172821 (NGC 5044).

Figure B27. WALLABY J132059-173347 (NGC 5044).

Figure B28. WALLABY J133604-195904 (NGC 5044).

Figure B29. WALLABY J131355-115301 (NGC 5044).

Figure B30. WALLABY J131717-132332 (NGC 5044).

Figure B31. WALLABY J132814-165706 (NGC 5044).

Figure B32. WALLABY J132259-172513 (NGC 5044).

Figure B33. WALLABY J133621-200033 (NGC 5044).

Figure B34. WALLABY J132457-182105 (NGC 5044).

Figure B35. WALLABY J132957-150800 (NGC 5044).

Figure B36. WALLABY J132422-162744 (NGC 5044). This source may be a partial detection.

Figure B37. WALLABY J132848-143813 (NGC 5044).

Figure B38. WALLABY J133556-153510 (NGC 5044).
Appendix C. Uncertain Dark Sources
In this section we present the images of the uncertain dark H i sources. Table C1 presents the uncertain dark sources. Each figure in this appendix contains the same set of images as outlined in Appendix B. To emphasise that these dark candidates are uncertain detections, they are marked with $^{*}$ next to the source name.
Table C1. Properties of the uncertain dark sources. From left to right the columns are: WALLABY field, WALLABY name, right ascension, declination, central velocity, luminosity distance, emission line width at half maximum, $w_{20}$ emission line width, H i size (major axis at 1 M$_{\odot}$ pc$^{-2}$ contour level), H i mass, stellar mass $3\sigma$ upper limit, SFR $3\sigma$ upper limit and the signal-to-noise ratio.


Figure C1. WALLABY J103543-255954$^{*}$ (Hydra).

Figure C2. WALLABY J103818-285023$^{*}$ (Hydra).

Figure C3. WALLABY J130119+053553$^{*}$ (NGC 4808). Possible RFI from GPS satellite at 1 308 MHz.

Figure C4. WALLABY J130011+065105$^{*}$ (NGC 4808).

Figure C5. WALLABY J125656-202606$^{*}$ (NGC 5044).

Figure C6. WALLABY J131847-210939$^{*}$ (NGC 5044).

Figure C7. WALLABY J131844-113805$^{*}$ (NGC 5044).

Figure C8. WALLABY J132359-235510$^{*}$ (NGC 5044).

Figure C9. WALLABY J132709-163509$^{*}$ (NGC 5044). Possible RFI from GPS satellite at 1 381 MHz.

Figure C10. WALLABY J130514-203447$^{*}$ (NGC 5044).

Figure C11. WALLABY J132238-204726$^{*}$ (NGC 5044). Possible RFI from GPS satellite at 1 305 MHz.

Figure C12. WALLABY J131247-132906$^{*}$ (NGC 5044).

Figure C13. WALLABY J132719-170237$^{*}$ (NGC 5044).

Figure C14. WALLABY J130835-143159$^{*}$ (NGC 5044).

Figure C15. WALLABY J132935-153750$^{*}$ (NGC 5044).

Figure C16. WALLABY J132810-151352$^{*}$ (NGC 5044).

Figure C17. WALLABY J131805-200055$^{*}$ (NGC 5044).
Appendix D. Co-added Images
Here we present all the co-added images of the dark sources. For each source, we co-add the g, r, i and z-band images and convolve the result with a boxcar kernel of size 2.6 arcsec by 2.6 arcsec. This degrades the resolution but enhances the surface brightness sensitivity, better enabling the detection of diffuse emission. The H i contours shown are the same as those presented in Appendices B and C. WALLABY J131244-155218 (also presented in Figure B16 in Section 5.2) is the only source that shows evidence of an optical counterpart.
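The co-adding and smoothing step can be illustrated with a short sketch. It assumes the four band images are already aligned on a common pixel grid, that the stack is a simple unweighted mean, and that the pixel scale is 0.262 arcsec per pixel, so that a 2.6 arcsec boxcar spans about 10 pixels. The function names are hypothetical, and the weighting and edge handling used for Figures D1–D3 may differ.

```python
import numpy as np


def coadd(images):
    """Unweighted mean of equally shaped 2D images, ignoring NaN pixels."""
    return np.nanmean(np.stack(images), axis=0)


def boxcar_width_pixels(width_arcsec, pixel_scale_arcsec):
    """Nearest whole number of pixels spanning width_arcsec (at least 1)."""
    return max(1, int(round(width_arcsec / pixel_scale_arcsec)))


def boxcar_smooth(image, width_pix):
    """Smooth with a width_pix x width_pix boxcar (mean) kernel.

    The image is edge-padded so the output has the same shape as the input.
    """
    before = width_pix // 2
    after = width_pix - 1 - before
    padded = np.pad(image, ((before, after), (before, after)), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, (width_pix, width_pix)
    )
    return windows.mean(axis=(-2, -1))


# Synthetic pure-noise images standing in for the g, r, i and z bands
rng = np.random.default_rng(1)
bands = [rng.normal(0.0, 1.0, (200, 200)) for _ in range(4)]
stack = coadd(bands)
width = boxcar_width_pixels(2.6, 0.262)  # assumed pixel scale
smooth = boxcar_smooth(stack, width)
print(f"noise before: {stack.std():.3f}, after: {smooth.std():.3f}")
```

For uncorrelated pixel noise, averaging over an $N\times N$ pixel box lowers the per-pixel noise by roughly a factor of $N$, which is the gain in surface brightness sensitivity that the smoothing is intended to provide, at the cost of resolution.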

Figure D1. Co-added images of tidal sources.

Figure D2. Co-added images of other strong dark source detections.

Figure D3. Co-added images of uncertain dark sources.