Hydra I: An extensible multi-source-finder comparison and cataloguing tool

Published online by Cambridge University Press:  08 June 2023

M. M. Boyce*
Affiliation:
Department of Physics and Astronomy, University of Manitoba, Winnipeg, Canada
A. M. Hopkins
Affiliation:
Australian Astronomical Optics, Macquarie University, North Ryde, NSW, Australia
S. Riggi
Affiliation:
INAF, Osservatorio Astrofisico di Catania, Catania, Italy
L. Rudnick
Affiliation:
Minnesota Institute for Astrophysics, School of Physics and Astronomy, University of Minnesota, Minneapolis, MN, USA
M. Ramsay
Affiliation:
Department of Physics and Astronomy, University of Manitoba, Winnipeg, Canada
C. L. Hale
Affiliation:
School of Physics and Astronomy, Institute for Astronomy, Royal Observatory, University of Edinburgh, Blackford Hill, Edinburgh EH9 3HJ, UK
J. Marvil
Affiliation:
National Radio Astronomy Observatory, Socorro, NM, USA
M. T. Whiting
Affiliation:
CSIRO Space & Astronomy, PO Box 76 Epping, NSW 1710, Australia
P. Venkataraman
Affiliation:
Dunlap Institute for Astronomy and Astrophysics, University of Toronto, Toronto, ON, Canada
C. P. O’Dea
Affiliation:
Department of Physics and Astronomy, University of Manitoba, Winnipeg, Canada
S. A. Baum
Affiliation:
Department of Physics and Astronomy, University of Manitoba, Winnipeg, Canada
Y. A. Gordon
Affiliation:
Department of Physics and Astronomy, University of Manitoba, Winnipeg, Canada; Department of Physics, University of Wisconsin-Madison, Madison, WI, USA
A. N. Vantyghem
Affiliation:
Department of Physics and Astronomy, University of Manitoba, Winnipeg, Canada
M. Dionyssiou
Affiliation:
Dunlap Institute for Astronomy and Astrophysics, University of Toronto, Toronto, ON, Canada
H. Andernach
Affiliation:
Departamento de Astronomía, DCNE, Universidad de Guanajuato, Guanajuato, CP, GTO, Mexico
J. D. Collier
Affiliation:
Inter-University Institute for Data Intensive Astronomy (IDIA), Department of Astronomy, University of Cape Town, Rondebosch, South Africa; School of Science, Western Sydney University, Penrith, NSW, Australia
J. English
Affiliation:
Department of Physics and Astronomy, University of Manitoba, Winnipeg, Canada
B. S. Koribalski
Affiliation:
School of Science, Western Sydney University, Penrith, NSW, Australia; Australia Telescope National Facility, CSIRO Astronomy and Space Science, Epping, NSW, Australia
D. Leahy
Affiliation:
Department of Physics and Astronomy, University of Calgary, Calgary, Canada
M. J. Michałowski
Affiliation:
Astronomical Observatory Institute, Faculty of Physics, Adam Mickiewicz University, Poznań, Poland
S. Safi-Harb
Affiliation:
Department of Physics and Astronomy, University of Manitoba, Winnipeg, Canada
M. Vaccari
Affiliation:
Inter-University Institute for Data Intensive Astronomy (IDIA), Department of Astronomy, University of Cape Town, Rondebosch, South Africa; Department of Physics and Astronomy, Inter-University Institute for Data Intensive Astronomy (IDIA), University of the Western Cape, Bellville, Cape Town, South Africa; INAF - Istituto di Radioastronomia, Bologna, Italy
E. L. Alexander
Affiliation:
Jodrell Bank Centre for Astrophysics, Department of Physics and Astronomy, University of Manchester, Manchester, UK
M. Cowley
Affiliation:
School of Chemistry and Physics, Queensland University of Technology, Brisbane, QLD, Australia; Centre for Astrophysics, University of Southern Queensland, West Street, Toowoomba, QLD 4350, Australia
A. D. Kapinska
Affiliation:
National Radio Astronomy Observatory, Socorro, NM, USA
A. S. G. Robotham
Affiliation:
ICRAR, M468, University of Western Australia, Crawley, WA 6009, Australia
H. Tang
Affiliation:
Department of Astronomy, Tsinghua University, Beijing 100084, China
*Corresponding author: M. M. Boyce; Email: michelle.boyce2@umanitoba.ca.

Abstract

The latest generation of radio surveys is now producing sky survey images containing many millions of radio sources. In this context, it is highly desirable to understand the performance of radio image source finder (SF) software and to identify an approach that optimises source detection capabilities. We have created Hydra to be an extensible multi-SF and cataloguing tool that can be used to compare and evaluate different SFs. Hydra, which currently includes the SFs Aegean, Caesar, ProFound, PyBDSF, and Selavy, provides for the addition of new SFs through containerisation and configuration files. The SF input RMS noise and island parameters are optimised to a 90% ‘percentage real detections’ threshold (calculated from the difference between detections in the real and inverted images), to enable comparison between SFs. Hydra provides completeness and reliability diagnostics through observed-deep ($\mathcal{D}$) and generated-shallow ($\mathcal{S}$) images, as well as other statistics. In addition, it has a visual inspection tool for comparing residual images through various selection filters, such as S/N bins in completeness or reliability. The tool allows the user to easily compare and evaluate different SFs in order to choose their desired SF, or a combination thereof. This paper is part one of a two-part series. In this paper we introduce the Hydra software suite and validate its $\mathcal{D/S}$ metrics using simulated data. The companion paper demonstrates the utility of Hydra by comparing the performance of SFs using both simulated and real images.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of the Astronomical Society of Australia

Table 1. SF general design characteristics (see Hopkins et al. 2015; Hale et al. 2019; Bonaldi et al. 2021). NxGen indicates multiprocessing capabilities.

Figure 1. High level schematic representation of the Hydra software suite workflow. Homados provides $\mathcal{D}$ and $\mathcal{S}$-image channels for simulated/real images (‘dancing ghosts’ example image, see Norris et al. 2021). Each channel is run separately through the Typhon optimiser, which uses the SF interface provided by Cerberus. Hydra coordinates all of these activities, building catalogues and compiling statistics at the end of the process.

Figure 2. Homados $\mathcal{S}$-image generation example, using an image cutout sample from an ATLAS CDFS DR1 $2.2^{\circ}\times2.7^{\circ}$ tile (Norris et al. 2006). The figures show $\mathcal{D}$ (left) and $\mathcal{S}$ (right) images, zoomed in. The noise level scale factor, n, was set to 5 to generate the shallow image.
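The role of the noise scale factor n can be illustrated with a minimal sketch. The caption states only that n = 5 was used to generate the shallow image; the recipe below (uncorrelated Gaussian noise added in quadrature so that the total noise RMS becomes n times the deep-image RMS) is an assumption made for illustration, and `make_shallow` is a hypothetical helper, not part of Homados.

```python
import numpy as np

def make_shallow(deep, sigma_deep, n=5.0, seed=0):
    """Return a shallow image whose noise RMS is roughly n * sigma_deep.

    Assumption: the extra noise is uncorrelated Gaussian noise added in
    quadrature, so sigma_add**2 + sigma_deep**2 = (n * sigma_deep)**2.
    """
    if n < 1:
        raise ValueError("n must be >= 1 (a shallow image cannot be less noisy)")
    rng = np.random.default_rng(seed)
    sigma_add = sigma_deep * np.sqrt(n**2 - 1.0)
    return deep + rng.normal(0.0, sigma_add, size=deep.shape)

# Toy deep image containing only noise (no sources), for illustration.
sigma_deep = 1.0
deep = np.random.default_rng(1).normal(0.0, sigma_deep, size=(512, 512))
shallow = make_shallow(deep, sigma_deep, n=5.0)

# Ratio of measured noise levels; close to n for a source-free image.
noise_ratio = shallow.std() / deep.std()
```

Real radio images have beam-correlated noise, so a faithful implementation would likely need to smooth the added noise with the restoring beam; that step is omitted here.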

Table 2. Cerberus RMS and Island parameter definitions in units of $\sigma$ with respect to the background, with soft constraint $\sigma_{Island}<\sigma_{RMS}$.

Figure 3. Cerberus code generation workflow.

Figure 4. Example Typhon PRD of a $2^\circ\times2^\circ$ simulated $\mathcal{D}$-image (left) and its corresponding $\mathcal{S}$-image (right). The variation in the PRD with the RMS parameters (in $\sigma$ units) is represented along the horizontal axis, and the variation in the PRD with the island parameters is represented by the error bars. The data points represent average values: that is, Aegean and ProFound indicate the true shape of the curves, due to their insensitivity to their island parameters. The SF parameters are listed in Table 2. The dotted horizontal lines indicate the 90% PRD levels.
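A minimal sketch of the PRD threshold selection follows. It assumes PRD = 100 × (N_real − N_inverted) / N_real, which is one reading of the abstract's "difference between detections in the real and inverted images"; the function names and the detection counts are hypothetical, and the way Typhon actually scans its parameter grid is not described here.

```python
def percentage_real_detections(n_real, n_inverted):
    """PRD in percent, assuming PRD = 100 * (N_real - N_inverted) / N_real."""
    if n_real <= 0:
        return 0.0
    return 100.0 * (n_real - n_inverted) / n_real

def first_rms_reaching(prd_by_rms, threshold=90.0):
    """Return the lowest RMS parameter whose PRD reaches the threshold, or None."""
    for rms in sorted(prd_by_rms):
        if prd_by_rms[rms] >= threshold:
            return rms
    return None

# Hypothetical (N_real, N_inverted) detection counts per RMS parameter (sigma).
counts = {3.0: (1000, 400), 4.0: (800, 120), 5.0: (600, 30)}
prd_by_rms = {rms: percentage_real_detections(*c) for rms, c in counts.items()}
chosen_rms = first_rms_reaching(prd_by_rms, threshold=90.0)
```

With these made-up counts the PRD rises from 60% to 85% to 95%, so the 90% level is first reached at an RMS parameter of 5.0.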

Figure 5. Clustering Algorithm Infographic: The left panel shows the results of 3 hypothetical SFs (red, green, and gold). The middle panel shows the results after clustering, resulting in two clumps, assigned clump_id 1 (upper panel) and clump_id 2 (lower panel). A clump is defined through the spatial overlap between SF detections (i.e., components), in the $\mathcal{D}$ and $\mathcal{S}$-images together. The components are numbered independently and can be associated with the clump they end up in. For instance, components 1, 2, and 3 are linked together in the $\mathcal{D}$-image, and, in addition, 1 overlaps with 6, and 3 overlaps with 8, 9, and 10 in the $\mathcal{S}$-image. Together this set of components populates clump_id 1. Similarly, clump_id 2 is composed of components 4, 5, and 7. Clumps are centred in the Hydra Viewer (Fig. 7), with unassociated components greyed out. The right panel shows the results after the clumps are decomposed into closest (i.e., overlapping centre-to-centre) matches between SFs, in the $\mathcal{D}$ and $\mathcal{S}$-images, such that there is only one SF with a match in the $\mathcal{D}$ and $\mathcal{S}$-images. These matched sets are assigned match_ids, with boxes enclosing the extremities of the components. The Hydra Viewer displays these numbers at the centre of the boxes (which are coloured differently here, for emphasis). So clump_id 1 contains match_id 1 = {1, 2, 6} and match_id 2 = {3, 8, 9, 10}, while clump_id 2 contains match_id 3 = {4, 5, 7}.
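The clump-building step can be sketched as a connected-components problem over pairwise overlaps. The union-find approach below is an illustrative choice, not necessarily what Hydra implements, and `build_clumps` is a hypothetical name. The overlap pairs follow the caption where it is explicit (1 with 6; 3 with 8, 9, and 10); the individual links among 1, 2, 3 and among 4, 5, 7 are assumed. The subsequent decomposition into match_ids is not attempted here.

```python
from collections import defaultdict

def build_clumps(components, overlaps):
    """Group components into clumps: connected sets under pairwise overlap."""
    parent = {c: c for c in components}

    def find(c):
        # Follow parent links to the root, halving the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in overlaps:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # smallest id becomes the root

    groups = defaultdict(list)
    for c in components:
        groups[find(c)].append(c)

    # Number clumps from 1, ordered by their smallest member.
    ordered = sorted(sorted(g) for g in groups.values())
    return {i + 1: g for i, g in enumerate(ordered)}

components = range(1, 11)
# Pairs within {1, 2, 3} and {4, 5, 7} are assumed; the rest follow the caption.
overlaps = [(1, 2), (2, 3), (1, 6), (3, 8), (3, 9), (3, 10), (4, 5), (5, 7)]
clumps = build_clumps(components, overlaps)
```

For this input, `clumps` maps clump 1 to components 1, 2, 3, 6, 8, 9, 10 and clump 2 to components 4, 5, 7, as in the middle panel.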

Table 3. Configured island parameters.

Figure 6. Derivation of the distance metric used for clustering. Here we assume that the space is locally flat, so that $\Delta(C_i,C_j)\approx(\delta^2_{\texttt{RA}_{ij}}+\delta^2_{\texttt{Dec}_{ij}})^{1/2}$, where $\delta_{\texttt{RA}_{ij}}=(\texttt{RA}_j-\texttt{RA}_i)\cos(\texttt{Dec}_i)$ and $\delta_{\texttt{Dec}_{ij}}=\texttt{Dec}_j-\texttt{Dec}_i$. The distances from the centres of components $C_i$ and $C_j$ to their edges, along a ray between them, are given by $r_i$ and $r_j$, respectively: that is, $r_\mu$ is a standard geometrical expression in terms of the angle $\beta_\mu=\pi/2-(\theta_\mu-\eta)$ with respect to the ray and the semi-major axis $a_\mu$, where $\theta_\mu$ is the position angle, $\eta=-\tan^{-1}(\delta_{\texttt{Dec}_{ij}}/\delta_{\texttt{RA}_{ij}})$, and $\mu=i,\,j$. The grey area outside the ellipses is the skirt, whose extent is determined by f.
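The quantities in this caption can be sketched in a few lines. Two parts are assumptions rather than statements from the caption: the "standard geometrical expression" is taken to be the polar form of an ellipse, $r=ab/\sqrt{(b\cos\beta)^2+(a\sin\beta)^2}$ with semi-minor axis $b$, and the overlap test $\Delta\le f\,(r_i+r_j)$ is a hypothetical use of the skirt factor f. For simplicity all angles and axes are in degrees, and the function names are illustrative.

```python
import math

def flat_sky_offsets(ra_i, dec_i, ra_j, dec_j):
    """Locally flat RA/Dec offsets (degrees) from component i to component j."""
    d_ra = (ra_j - ra_i) * math.cos(math.radians(dec_i))
    d_dec = dec_j - dec_i
    return d_ra, d_dec

def separation(ra_i, dec_i, ra_j, dec_j):
    """Delta(C_i, C_j) under the flat-sky approximation (degrees)."""
    return math.hypot(*flat_sky_offsets(ra_i, dec_i, ra_j, dec_j))

def edge_distance(a, b, theta, eta):
    """Centre-to-edge distance along the ray (assumed polar ellipse form).

    theta is the position angle and eta the ray angle, both in radians.
    """
    beta = math.pi / 2 - (theta - eta)
    return a * b / math.hypot(b * math.cos(beta), a * math.sin(beta))

def components_overlap(ci, cj, f=1.0):
    """Hypothetical criterion: overlap if Delta <= f * (r_i + r_j)."""
    d_ra, d_dec = flat_sky_offsets(ci["ra"], ci["dec"], cj["ra"], cj["dec"])
    # atan2 avoids division by zero when d_ra == 0; it can differ from
    # atan(d_dec / d_ra) by pi, which leaves the ellipse radius unchanged.
    eta = -math.atan2(d_dec, d_ra)
    r_i = edge_distance(ci["a"], ci["b"], math.radians(ci["pa"]), eta)
    r_j = edge_distance(cj["a"], cj["b"], math.radians(cj["pa"]), eta)
    return math.hypot(d_ra, d_dec) <= f * (r_i + r_j)

# Three circular components (a = b = 0.01 deg) along the equator.
c1 = dict(ra=10.000, dec=0.0, a=0.01, b=0.01, pa=0.0)
c2 = dict(ra=10.015, dec=0.0, a=0.01, b=0.01, pa=0.0)
c3 = dict(ra=10.030, dec=0.0, a=0.01, b=0.01, pa=0.0)
```

Here `c1` and `c2` overlap with f = 1, while `c1` and `c3` are linked only once the skirt is enlarged (for example f = 2).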

Figure 7. Hydra Viewer Infographic: The cutout viewing section of Hydra’s local web-viewer interface. At the top is the navigation bar, which allows the user to navigate by clump ID, go to a specific clump, turn on/off cutout annotations, or examine S/N bins of diagnostic plots such as $\mathcal{C}$ and $\mathcal{R}$, by using the Mode button. The main panel contains $\mathcal{D}$ (top) and $\mathcal{S}$ (bottom) square image cutouts (first column) and SF residual image cutouts (remaining columns), centred about a given clump’s centroid. Here the annotation is turned on, with the neighbouring clumps greyed out. The table at the bottom (not to scale) is the cluster table rows for the clump, with the following columns: cluster catalogue ID, SF catalogue cross-reference ID, clump ID, subclump ID, match ID, SF or $\mathcal{J}$ catalogue name, image depth, RA ($^\circ$), Dec ($^\circ$), semi-major axis ($^{\prime\prime}$), semi-minor axis ($^{\prime\prime}$), position angle ($^\circ$), total flux density (mJy), bane RMS noise (mJy), S/N (total flux over bane RMS noise), peak flux (mJy beam$^{-1}$), normalised-residual RMS (mJy (arcmin$^2$ beam)$^{-1}$), normalised-residual MADFM (mJy (arcmin$^2$ beam)$^{-1}$), and normalised-residual $\Sigma I^2$ ((mJy (arcmin$^2$ beam))$^{-2}$). The normalised-residual statistics are normalised by the cutout area (arcmin$^2$). This statistical information is also shown below each cutout, along with the number of components (N), and cutout size (Size, in arcmin). This figure is to illustrate the layout of the Hydra viewer, not the details. It shows screen shots from the Hydra Viewer pasted together, hence the fonts appear small. The data at the bottom is raw output from the cluster table, which is not rounded in this version of the software.
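The per-cutout statistics listed in this caption can be sketched as follows. The definitions used are assumptions: RMS is taken about zero, MADFM is the median absolute deviation from the median, and each statistic is divided once by the cutout area in arcmin$^2$ (one reading of "normalised by the cutout area"). The function names are hypothetical and do not come from Hydra.

```python
import numpy as np

def residual_stats(residual, area_arcmin2):
    """Area-normalised RMS, MADFM, and sum of squares of a residual cutout."""
    res = np.asarray(residual, dtype=float)
    res = res[np.isfinite(res)]  # ignore blanked (NaN) pixels
    rms = np.sqrt(np.mean(res**2))
    madfm = np.median(np.abs(res - np.median(res)))
    sum_i2 = np.sum(res**2)
    return {
        "rms": rms / area_arcmin2,
        "madfm": madfm / area_arcmin2,
        "sum_i2": sum_i2 / area_arcmin2,
    }

def signal_to_noise(total_flux, bane_rms):
    """S/N as described in the caption: total flux over bane RMS noise."""
    return total_flux / bane_rms

# Toy 2x2 residual cutout (mJy/beam) covering 2 arcmin^2.
stats = residual_stats([[1.0, -1.0], [3.0, -3.0]], area_arcmin2=2.0)
snr = signal_to_noise(total_flux=5.0, bane_rms=0.5)
```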

Table 4. SF annotation colours.

Figure 8. An example of a $\mathcal{D}$-image cutout, with annotations turned on, consisting of 4 Aegean, 2 ProFound, 2 PyBDSF, and 2 Selavy overlapping $\mathcal{D}$-image catalogue components. The label at the top indicates it corresponds to clump_id 243, and the numbers at the centres of the cyan boxes are the match_ids (369 through 372).

Figure 9. Venn diagram of completeness and reliability, for sets of deep ($\mathcal{D}$), shallow ($\mathcal{S}$), and injected ($\mathcal{J}$) sources.

Table 5. Completeness/reliability metrics (see Fig. 9) in terms of deep ($\mathcal{D}$), shallow ($\mathcal{S}$), and injected ($\mathcal{J}$) sources.

Figure 10. Examples of deep (blue) and shallow (amber) source component overlaps, $\mathcal{C_{\mathcal{DS}}}=(\mathcal{S}\cap\mathcal{D})/\mathcal{D}$ and $\mathcal{R_{\mathcal{DS}}}=(\mathcal{S}\cap\mathcal{D})/\mathcal{S}$. Real-shallow detections are indicated by overlapping pair-wise deep-shallow detections ($\mathcal{S}\cap\mathcal{D}$), whose centres are closest. The dashed lines indicate clumps of component extent overlays.
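The two ratios in this caption reduce to set arithmetic once deep and shallow detections have been paired. In the sketch below, each detection is represented by a hypothetical match identifier, so that an identifier present in both sets stands for a deep-shallow pair in $\mathcal{S}\cap\mathcal{D}$; the identifiers and the function name are illustrative only.

```python
def completeness_reliability(deep_ids, shallow_ids):
    """Return (C_DS, R_DS) = (|S n D| / |D|, |S n D| / |S|)."""
    deep, shallow = set(deep_ids), set(shallow_ids)
    both = deep & shallow  # matched deep-shallow detections
    completeness = len(both) / len(deep) if deep else float("nan")
    reliability = len(both) / len(shallow) if shallow else float("nan")
    return completeness, reliability

# Hypothetical match ids: five deep detections, four shallow, three in common.
c_ds, r_ds = completeness_reliability({1, 2, 3, 4, 5}, {2, 3, 4, 6})
```

With these toy inputs, three of the five deep detections are recovered in the shallow image (completeness 0.6), and three of the four shallow detections have a deep counterpart (reliability 0.75). In practice the same calculation would be repeated per S/N bin.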

Figure 11. Simulated map with point-like (compact) sources. The coordinates are arbitrarily set, and the FWHM is set to $15^{\prime\prime}$.

Figure 12. Simulated image with both point-like (compact) and extended sources. The sky coordinates are arbitrarily chosen.

Table 6. Hydra $\mu$-optimised box_size and step_size inputs for Aegean, PyBDSF, and Selavy,$^{\rm a}$ using CMP and EXT $\mathcal{D/S}$-image data.

Table 7. Typhon run statistics for CMP and EXT images, with SF, image depth, SF RMS parameter ($n_{rms}$ [$\sigma$]), SF island parameter ($n_{island}$ [$\sigma$]), source counts (N), residual RMS ($\mu$Jy beam$^{-1}$), and residual MADFM ($\mu$Jy beam$^{-1}$) columns.$^{\rm a}$

Figure 13. SF CMP and EXT $\mathcal{D/S}$-image detection stacked plots (see the N columns of Table 7).

Table 8. Ratios of deep-to-injected ($\mathcal{D}\,:\,\mathcal{J}$), shallow-to-injected ($\mathcal{S}\,:\,\mathcal{J}$), and shallow-to-deep ($\mathcal{S}\,:\,\mathcal{D}$) sources. The $\mathcal{D}/\mathcal{S}$ source counts (N) are provided in Table 7, and the injected source counts are 9075 and 9974 for CMP and EXT sources, respectively (see Section 5.1).

Figure 14. Major-axis distributions for (a) CMP and (b) EXT sources (with $\mathcal{D}$ and $\mathcal{S}$ both included). The vertical dashed line represents the beam size. The distributions of the injected sources are shown in grey. Recall that size estimates between SFs are not necessarily directly comparable, as they are estimated using different methods.

Figure 15. Simulated $\mathcal{D}$-image compact (left) and extended (right) source $\mathcal{C_{D}}$ (top) and $\mathcal{R_{D}}$ (bottom) vs S/N ($\mathcal{D}$-signal/$\mathcal{D}$-noise) plots. The $\mathcal{D}$-noise is computed using bane.

Figure 16. Simulated $\mathcal{S}$-image compact (left) and extended (right) source $\mathcal{C_{S}}$ (top) and $\mathcal{R_{S}}$ (bottom) vs S/N ($\mathcal{D}$-signal/$\mathcal{S}$-noise) plots. The $\mathcal{S}$-noise is computed using bane.

Figure 17. $\mathcal{C_{DS}}$ (a and g), $\mathcal{R_{DS}}$ (b and h), $\mathcal{\tilde{C}_{DS}}$ (c and i), $\mathcal{\tilde{R}_{DS}}$ (d and j), $\delta\mathcal{C_{DS}}$ (e and k), and $\delta\mathcal{R_{DS}}$ (f and l) vs S/N for CMP (top set) and EXT (bottom set) sources. The S/N values are expressed as $\mathcal{D}$-signal/$\mathcal{S}$-noise and $\mathcal{S}$-signal/$\mathcal{S}$-noise for completeness and reliability, respectively, where the $\mathcal{S}$-noise is computed using bane.