WALLABY Pilot Survey: kNN identification of perturbed galaxies through H I morphometrics

Published online by Cambridge University Press:  24 January 2025

Benne Willem Holwerda*
Affiliation:
Department of Physics and Astronomy, University of Louisville, Louisville, KY, USA
Helga Dénes
Affiliation:
School of Physical Sciences and Nanotechnology, Yachay Tech University, Urcuquí, Ecuador
Jonghwan Rhee
Affiliation:
International Centre for Radio Astronomy Research (ICRAR), University of Western Australia, Crawley, WA, Australia
Denis Leahy
Affiliation:
Department of Physics and Astronomy, University of Calgary, Calgary, AB, Canada
Bärbel Silvia Koribalski
Affiliation:
Australia Telescope National Facility, CSIRO, Space and Astronomy, Parkes, NSW, Australia; School of Science, Western Sydney University, Penrith, NSW, Australia
Niankun Yu
Affiliation:
National Astronomical Observatories, Chinese Academy of Sciences, Beijing, People’s Republic of China; Key Laboratory of Radio Astronomy and Technology, Chinese Academy of Sciences, Beijing, People’s Republic of China
Nathan Deg
Affiliation:
Department of Physics, Engineering Physics, and Astronomy, Queen’s University, Kingston, ON, Canada
T. Westmeier
Affiliation:
International Centre for Radio Astronomy Research (ICRAR), University of Western Australia, Crawley, WA, Australia
Karen Lee-Waddell
Affiliation:
Australian SKA Regional Centre, Perth, Australia
Yago Ascasibar
Affiliation:
Departamento de Física Teórica, Universidad Autónoma de Madrid (UAM), Madrid, Spain; Centro de Investigación Avanzada en Física Fundamental (CIAFF-UAM), Madrid, Spain
Manasvee Saraf
Affiliation:
International Centre for Radio Astronomy Research (ICRAR), University of Western Australia, Crawley, WA, Australia; Australia Telescope National Facility, CSIRO, Space and Astronomy, Bentley, WA, Australia; ARC Centre of Excellence for All-Sky Astrophysics in 3 Dimensions (ASTRO 3D), Sydney, Australia
Xuchen Lin
Affiliation:
Department of Astronomy, School of Physics, Peking University, Beijing, People’s Republic of China
Barbara Catinella
Affiliation:
ARC Centre of Excellence for All-Sky Astrophysics in 3 Dimensions (ASTRO 3D), Sydney, Australia; International Centre for Radio Astronomy Research, The University of Western Australia, Crawley, WA, Australia
Kelley Hess
Affiliation:
Chalmers University of Technology, Onsala Space Observatory, Göteborg, Sweden
* Corresponding author: Benne Willem Holwerda, Email: benne.holwerda@louisville.edu.

Abstract

Galaxy morphology in stellar light can be described by a series of ‘non-parametric’ or ‘morphometric’ parameters, such as concentration-asymmetry-smoothness, Gini, $M_{20}$, and Sérsic fit. These parameters can be applied to column density maps of atomic hydrogen (H I). The H I distribution is susceptible to perturbations by environmental effects, for example, intergalactic medium pressure and tidal interactions. Therefore, H I morphology can potentially identify galaxies undergoing ram-pressure stripping or tidal interactions. We explore three fields in the WALLABY Pilot H I survey and identify perturbed galaxies based on a k-nearest neighbour (kNN) algorithm using an H I morphometric feature space. We use labelled galaxies in the combined NGC 4808 and NGC 4636 fields, described by six H I morphometrics, to train and test a kNN classifier. The kNN classifier identifies perturbed galaxies with all metrics (accuracy, precision, and recall) at 70–80%. Applying the kNN method to the deployment field, the NGC 5044 mosaic, we find that perturbed and unperturbed galaxies follow similar distributions in the stellar mass versus star formation rate relation and the Baryonic Tully–Fisher relation, but the H I mass versus stellar mass relation of the perturbed galaxies is flatter than that of the unperturbed galaxies. Our results for NGC 5044 provide a prediction for future studies of the fraction of galaxies undergoing interaction in this catalogue and can be used to build a training sample to classify such galaxies in the full WALLABY survey.
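
As an illustration of one of the morphometrics named above, the following is a minimal sketch of the Gini parameter in the form commonly used for galaxy images (the rank-weighted sum over sorted pixel values, as in Lotz et al. 2004). The function name `gini` and the toy pixel lists are assumptions made for this example; the pipeline actually used to measure the H I morphometrics is not shown in this section.

```python
def gini(pixel_values):
    """Gini parameter of the pixel values inside one source's segmentation mask.

    0 means the flux is spread evenly over all pixels,
    1 means all the flux sits in a single pixel.
    """
    values = sorted(abs(v) for v in pixel_values)
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("all pixel values are zero")
    # Rank-weighted sum: i runs from 1 to n over the sorted values.
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(values, start=1))
    return weighted / (mean * n * (n - 1))


# Made-up pixel lists standing in for column density maps.
uniform_map = [1.0] * 100                 # perfectly even distribution
concentrated_map = [0.0] * 99 + [100.0]   # all flux in one pixel

print(gini(uniform_map), gini(concentrated_map))
```

In practice the input would be the column density values of the pixels selected by the source's segmentation map, flattened into one list.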

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Astronomical Society of Australia

Table 1. Basic properties of the three galaxy group WALLABY fields analysed here.

Figure 1. Distribution of distance for galaxies in the three WALLABY fields, centred on NGC 4808, NGC 4636, and NGC 5044. The vertical dashed line is the 60 Mpc cutoff used to select the training sample in NGC 4808 and the application samples.

Figure 2. A corner plot of the H I morphometrics of galaxies in the NGC 4636 pointing based on the SoFiA segmentation maps.

Figure 3. A corner plot of the H I morphometrics of galaxies in the NGC 4808 pointing based on the SoFiA segmentation maps.

Figure 4. A corner plot of the H I morphometrics of galaxies in the NGC 5044 pointing based on the SoFiA segmentation maps.

Figure 5. The WALLABY detections within 60 Mpc in both the NGC 4808 and NGC 4636 fields, shown in grey. Superimposed is the catalogue from Lin et al. (2023), with perturbed galaxies as red circles and unperturbed galaxies as white circles. Not every source in Lin et al. (2023) has a counterpart in the two WALLABY catalogues, but a sufficient number is available for training. Because the Lin et al. (2023) catalogue is based on different data, we select all the sources within the green circle as the WALLABY training sample; those without a Lin et al. (2023) classification are deemed ‘unperturbed’.

Figure 6. The processing flag for WALLABY objects according to Lin et al. (2023): ram-pressure (1), tidal interaction (2), or merger (3). We train the kNN to distinguish between an undisturbed (0) and a processing (1) label, where the latter combines all three flags (1–3).
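
The label scheme in this caption can be sketched as a small mapping. This is only an illustration: the names `PROCESS_NAMES` and `binarise_flag` are hypothetical, and treating a missing classification (`None`) as unperturbed follows the convention described in the caption of Fig. 5.

```python
# Flag values as listed in the caption; 0 marks an undisturbed galaxy.
PROCESS_NAMES = {
    0: "undisturbed",
    1: "ram-pressure",
    2: "tidal interaction",
    3: "merger",
}


def binarise_flag(flag):
    """Collapse the processing flag into the binary label used for training.

    Returns 0 for undisturbed galaxies and 1 for any processing flag (1-3).
    A source without a classification (None) is treated as unperturbed.
    """
    if flag is None:
        return 0
    if flag not in PROCESS_NAMES:
        raise ValueError(f"unknown processing flag: {flag!r}")
    return 0 if flag == 0 else 1


# Made-up flags for six sources, one of them without a classification.
flags = [0, 1, None, 3, 2, 0]
labels = [binarise_flag(f) for f in flags]
print(labels)
```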

Figure 7. Hyperparameter choice for a set feature space with metrics as a function of the number of neighbours (k). This is the performance for the full morphometric feature space.
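
A scan over the number of neighbours, as plotted in this figure, might look like the following minimal sketch. It uses a plain Euclidean-distance kNN written with the standard library only; the toy data, the function names, and the tie-breaking rule (a tied vote goes to the smallest label) are assumptions for illustration, not the configuration used for the figure.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Label x by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance in the feature space.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    top = max(votes.values())
    # Assumption: a tied vote (possible for even k) goes to the smallest label.
    return min(label for label, count in votes.items() if count == top)


def accuracy_vs_k(train_X, train_y, test_X, test_y, k_values):
    """Return {k: accuracy on the test set} for each k in k_values."""
    scores = {}
    for k in k_values:
        predicted = [knn_predict(train_X, train_y, x, k) for x in test_X]
        scores[k] = sum(p == t for p, t in zip(predicted, test_y)) / len(test_y)
    return scores


# Made-up two-feature toy data: label 0 = unperturbed, 1 = perturbed.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(0.05, 0.1), (1.0, 0.95)]
test_y = [0, 1]

scores = accuracy_vs_k(train_X, train_y, test_X, test_y, k_values=[1, 2, 3, 5])
for k, acc in scores.items():
    print(f"k={k}: accuracy={acc:.2f}")
```

With real morphometrics, the features would normally be put on a comparable scale before computing distances, since a Euclidean metric otherwise favours the feature with the largest numerical range.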

Figure 8. The mean (left) and variance (right) maps of the precision, recall, and F1. Mean and variance are determined by drawing a random set of features in the H I morphometric space and running the kNN on it. Variance tends to be high for $k=1$ or $n=2$ features.

Table 2. The features selected for the final iteration of the kNN.

Figure 9. Hyperparameter choice for a set feature space with metrics as a function of the number of neighbours (k). This is for the optimal set of features in Table 2.

Figure 10. The average confusion matrix for the kNN ($k=2$, trained on subsamples of 80%, tested on the remaining 20% shown here) with the optimised feature space listed in Table 2 for all the members of the NGC 4808 and NGC 4636 groups. We repeated the training/test split ten times; these are the averages over all ten split-train-test iterations.
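
The repeated split-train-test procedure behind an averaged confusion matrix can be sketched as follows. Everything here is illustrative: `average_confusion` accepts any classifier as a callable, and `threshold_classifier` is a hypothetical one-feature stand-in used only to keep the example short; it is not the kNN used for the figure. The random seed and the toy data are likewise assumptions.

```python
import random


def confusion_matrix(true_labels, predicted_labels):
    """2x2 matrix with rows = true label, columns = predicted label."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def average_confusion(features, labels, fit_predict,
                      n_repeats=10, test_fraction=0.2, seed=0):
    """Average the test-set confusion matrix over repeated random splits."""
    rng = random.Random(seed)
    n = len(labels)
    n_test = max(1, round(test_fraction * n))
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        predicted = fit_predict(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [features[i] for i in test_idx],
        )
        cm = confusion_matrix([labels[i] for i in test_idx], predicted)
        for r in range(2):
            for c in range(2):
                total[r][c] += cm[r][c] / n_repeats
    return total


def threshold_classifier(train_X, train_y, test_X):
    """Hypothetical stand-in: cut one feature midway between the class means."""
    mean0 = sum(x for x, y in zip(train_X, train_y) if y == 0) / train_y.count(0)
    mean1 = sum(x for x, y in zip(train_X, train_y) if y == 1) / train_y.count(1)
    cut = 0.5 * (mean0 + mean1)
    return [1 if x > cut else 0 for x in test_X]


# Made-up, well-separated one-feature data: ten sources per class.
features = [0.1 * i for i in range(10)] + [2.0 + 0.1 * i for i in range(10)]
labels = [0] * 10 + [1] * 10

avg = average_confusion(features, labels, threshold_classifier)
print(avg)
```

Because the entries are averaged over the ten test sets, they need not be integers; each averaged matrix sums to the test-set size (four sources in this toy case).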

Table 3. The performance metrics of the WALLABY training catalogue ($D < 60$ Mpc within the green circle in Fig. 5) split into subsections using the features listed in Table 2. The mean and variance of the kNN performance are obtained by iterating ten times over this sample, splitting off 20% for testing each time.
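
For reference, the standard definitions of the metrics reported in this table, computed from confusion-matrix counts and then summarised over iterations, can be sketched as below. The counts are made up, and the use of the sample variance (`statistics.variance`) is an assumption: the caption does not state which variance estimator was used.

```python
from statistics import mean, variance


def scores_from_counts(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the 'perturbed' class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts for three split-train-test iterations: (tp, fp, fn, tn).
iterations = [(8, 2, 2, 8), (7, 3, 2, 8), (9, 2, 1, 8)]
per_iteration = [scores_from_counts(*counts) for counts in iterations]

# Mean and sample variance of each metric over the iterations.
summary = {
    name: (mean(s[name] for s in per_iteration),
           variance(s[name] for s in per_iteration))
    for name in per_iteration[0]
}
for name, (m, v) in summary.items():
    print(f"{name}: mean={m:.3f}, variance={v:.4f}")
```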

Table 4. The performance metrics in the full WALLABY training catalogue ($D < 60$ Mpc within the green circle in Fig. 5) using the features listed in Table 2.

Figure 11. The confusion matrix for the kNN ($k=2$, trained on a subsample of 80%) with the optimised feature space listed in Table 2 for all the objects in the combined catalogue of the NGC 4808 and NGC 4636 fields ($D < 60$ Mpc within the green circle in Fig. 5).

Table 5. The fraction of galaxies that were perturbed as reported by Lin et al. (2023) and the kNN trained on NGC 4808+4636 (WALLABY training sample). For comparison, the morphometric criteria for merging or perturbed galaxies from Conselice (2003), Lotz et al. (2004, 2008), and Holwerda et al. (2011d) are listed as well.

Figure 12. The kNN labelling in both the training sample (left) and the deployment field, NGC 5044 (right). Compare to the labels in Fig. 5.

Table 6. The linear fits to the stellar mass and star formation relation for the training sample and the deployment sample of NGC 5044 for all the galaxies in the sample, the unperturbed and perturbed ones. Qualitatively the fits are similar but the deployment fits have lower slopes and higher intercepts than the training sample.
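
A per-population linear fit of the kind summarised in this table can be sketched with ordinary least squares. This is an assumption-laden illustration: the fitting method actually used is not stated in this section, the fit is assumed to be done in logarithmic quantities, and the data points below are invented so that the expected slopes are easy to verify by hand.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up (log stellar mass, log star formation rate, kNN label) triples;
# label 0 = unperturbed, 1 = perturbed.
galaxies = [
    (8.0, -2.0, 0), (9.0, -1.0, 0), (10.0, 0.0, 0), (11.0, 1.0, 0),
    (8.0, -2.5, 1), (10.0, -1.5, 1), (12.0, -0.5, 1),
]

populations = {
    "all": galaxies,
    "unperturbed": [g for g in galaxies if g[2] == 0],
    "perturbed": [g for g in galaxies if g[2] == 1],
}

# Fit each population separately, as in the rows of the table.
fits = {
    name: linear_fit([g[0] for g in sample], [g[1] for g in sample])
    for name, sample in populations.items()
}
for name, (slope, intercept) in fits.items():
    print(f"{name}: slope={slope:.2f}, intercept={intercept:.2f}")
```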

Figure 13. The stellar mass and star formation relation for the WALLABY training sample (left) and the NGC 5044 deployment sample (right). Qualitatively, the results are similar for the star-forming galaxy main sequence (SFGMS): similar slopes for all three populations (unperturbed, perturbed, and all galaxies), but there are quantitative differences in the SFGMS slope and intercept between the training and the deployment samples.

Figure 14. The stellar mass and H I mass relation for the training sample (left) and the deployment sample, the NGC 5044 mosaic (right). The combined and the unperturbed samples show very similar fits, while the galaxies indicated as perturbed, in the training sample as well as in the NGC 5044 mosaic, show less H I mass for a given stellar mass.

Figure 15. The Baryonic Tully–Fisher relation for the WALLABY training sample (left) and the deployment sample on NGC 5044 (right). The velocity is computed according to equation 12, in units of m/s.

Table 7. The linear fits to the stellar and H I mass relation for the training sample and the deployment sample of NGC 5044 for all the galaxies in the sample, the unperturbed and perturbed ones. Qualitatively the fits are similar but the deployment fits have lower slopes and higher intercepts than the training sample.

Table 8. The linear fits to the Baryonic Tully–Fisher relation for the training sample and the deployment sample of NGC 5044 for all the galaxies in the sample, the unperturbed and perturbed ones.