
BLINK: An end-to-end GPU high time resolution imaging pipeline for fast radio burst searches with the Murchison Widefield Array

Published online by Cambridge University Press:  26 May 2025

Cristian Di Pietrantonio*
Affiliation:
Pawsey Supercomputing Research Centre, Kensington, WA, Australia International Centre for Radio Astronomy Research, Curtin University, Bentley, WA, Australia
Marcin Sokolowski
Affiliation:
International Centre for Radio Astronomy Research, Curtin University, Bentley, WA, Australia
Christopher Harris
Affiliation:
Pawsey Supercomputing Research Centre, Kensington, WA, Australia
Daniel Price
Affiliation:
International Centre for Radio Astronomy Research, Curtin University, Bentley, WA, Australia Square Kilometre Array Observatory (SKAO), Kensington, WA, Australia
Randall Wayth
Affiliation:
International Centre for Radio Astronomy Research, Curtin University, Bentley, WA, Australia Square Kilometre Array Observatory (SKAO), Kensington, WA, Australia
*
Corresponding author: Cristian Di Pietrantonio; Email: cristian.dipietrantonio@csiro.au

Abstract

Petabytes of archival high time resolution observations have been captured with the Murchison Widefield Array (MWA). The search for Fast Radio Bursts (FRBs) within these data using established software has been limited by its inability to scale on the supercomputing infrastructure necessary to meet the associated computational and memory requirements. Hence, past searches used a coarse integration time, on the scale of seconds, or analysed an insufficient number of hours of observations. This paper introduces BLINK, a novel radio interferometry imaging software for low-frequency FRB searches to be run on modern supercomputers. It is implemented as a suite of software libraries executing all computations on GPU, supporting both AMD and NVIDIA hardware. These libraries are designed to interface with each other and to define the BLINK imaging pipeline as a single executable program. Expensive I/O operations between imaging stages are not necessary because the stages now share the same memory space and data representation. BLINK is the first imaging pipeline implementation able to fully run on GPUs as a single process, further supporting AMD hardware and enabling Australian researchers to take advantage of Pawsey's Setonix supercomputer. In the millisecond-scale time resolution imaging test case illustrated in this paper, representative of what is required for FRB searches, the BLINK imaging pipeline achieves a 3 687x speedup compared to a traditional MWA imaging pipeline employing WSClean.

Information

Type
Research Article
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of the Astronomical Society of Australia

Table 1. Representative requirements and constraints for an FRB search with the MWA. The high time and frequency resolution requirements, the large instantaneous bandwidth of the MWA, and the dispersive delay, which causes the signal to sweep across the low-frequency band in tens of seconds, together require the FRB search software to handle half a million images for each time frame needed to compute dynamic spectra.
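The tens-of-seconds sweep quoted in the caption follows from the standard cold-plasma dispersion delay formula. A minimal sketch, in which the band edges (140–170 MHz) and the dispersion measure are illustrative assumptions rather than values taken from the paper:

```python
# Dispersive delay between two frequencies (standard cold-plasma formula):
#   dt = K_DM * DM * (nu_lo^-2 - nu_hi^-2),  nu in MHz, DM in pc cm^-3.
K_DM = 4.1488e3  # dispersion constant, s MHz^2 cm^3 pc^-1

def sweep_time(dm, nu_lo_mhz, nu_hi_mhz):
    """Seconds for a burst to sweep from nu_hi down to nu_lo."""
    return K_DM * dm * (nu_lo_mhz**-2 - nu_hi_mhz**-2)

# An MWA-like low-frequency band and a moderate extragalactic DM
# yield a sweep time of tens of seconds, as stated in the caption.
print(round(sweep_time(500.0, 140.0, 170.0), 1))
```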

Figure 1. Relative computational cost per time sample of the main steps of the imaging process. The number of fundamental numerical operations of the correlation, gridding and 2D FFT algorithms, expressed as a percentage of their sum, is a proxy for their relative computational cost. The curves shown in the plot arise when the imaging process is instantiated on MWA Phase I VCS data at an observational frequency of 150 MHz. The integration time is the critical parameter that determines whether the 2D FFT or the correlation is the dominant part of the computation.
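The trade-off described in the caption can be reproduced with rough operation counts. In the sketch below the tile count, per-channel sample rate, and image size are stand-in values for an MWA-like setup, not the paper's exact configuration; the key point is only the scaling: correlation cost per second of data is fixed, while the FFT cost grows as the integration time shrinks, because one image is produced per integration.

```python
import math

# Illustrative stand-in parameters for an MWA-like array.
N_ANT = 128           # tiles
SAMPLE_RATE = 10_000  # voltage samples per second per fine channel
N_PIX = 1024          # image side length in pixels

def ops_per_second(t_int):
    """Return (correlation_ops, fft_ops) per second of data, per channel."""
    n_baselines = N_ANT * (N_ANT + 1) // 2
    # Correlation: one complex multiply-accumulate (~8 flops) per baseline
    # per sample; independent of how long the products are integrated.
    corr = 8 * n_baselines * SAMPLE_RATE
    # 2D FFT: ~5 * N^2 * log2(N^2) flops per image; one image per
    # integration, so per-second cost scales as 1 / t_int.
    fft = (1.0 / t_int) * 5 * N_PIX**2 * math.log2(N_PIX**2)
    return corr, fft

# Millisecond integrations: FFT dominates. Second-scale: correlation dominates.
for t_int in (0.001, 1.0):
    corr, fft = ops_per_second(t_int)
    print(t_int, corr > fft)
```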

Figure 2. Significant computational costs in the context of an FRB search. The left panel shows the computational cost per time sample of correlation and 2D FFT in the imaging process for MWA Phase I VCS data, at an integration time of 50 ms and an observational frequency of 150 MHz. In the same setting, beamforming requires three orders of magnitude more computation than imaging (right panel).

Figure 3. The BLINK FRB search pipeline. The distinguishing features of the BLINK pipeline are the utilisation of GPU cores to execute every compute stage and a homogeneous data representation throughout. In addition to parallelising the computation, the proposed approach eliminates the need to use the filesystem as temporary staging storage. Intermediate data products feeding successive stages already reside in GPU memory, removing expensive CPU-GPU memory transfers. The CPU is left with the task of reading the voltage data from disk, planning and submitting GPU computations, managing memory, and presenting the final output to the user. Dashed lines indicate steps yet to be implemented.

Listing 1. Example submission script. Executing the BLINK pipeline on the Setonix supercomputer requires minimal BASH scripting. First, Slurm directives inform the scheduler of the computational resources required to run the program; in this case, a single GPU on a node in the gpu partition. BASH variables are then defined to point to input files such as observation metadata, calibration solutions, and voltage files. BASH wildcards are used to select the interval of seconds and frequency channels to process; in this case, all the data associated with OBSID 1293315072. Finally, the BLINK pipeline is executed with a single command line. Important options are: -t, the integration time; -c, the number of contiguous fine channels to average; -n, the side length in pixels of the output images.
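Since the listing itself is not reproduced in this excerpt, the following is a minimal sketch of a script matching the caption's description. Only OBSID 1293315072, the gpu partition, and the -t/-c/-n options come from the caption; the executable name (blink_pipeline), the -m/-s option letters, and all file paths are assumptions for illustration.

```shell
#!/bin/bash
#SBATCH --partition=gpu      # gpu partition on Setonix
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1    # a single GPU is sufficient
#SBATCH --time=01:00:00

# Input files (variable names and paths are illustrative).
OBSID=1293315072
METAFITS=/data/${OBSID}/${OBSID}.metafits
CALSOL=/data/${OBSID}/calibration_solutions.bin
VOLTAGES="/data/${OBSID}/*.sub"   # wildcard selects all seconds and channels

# Single-command pipeline invocation (executable name is assumed):
#   -t  integration time in seconds
#   -c  number of contiguous fine channels to average
#   -n  side length in pixels of the output images
blink_pipeline -t 0.02 -c 4 -n 1024 \
    -m "${METAFITS}" -s "${CALSOL}" ${VOLTAGES}
```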

Table 2. Observation 1. Information summary of the MWA observation imaged in a long exposure setting with the BLINK and SMART pipelines.

Figure 4. Long exposure images. The BLINK and SMART pipelines were run on the MWA Phase II Extended 1276619416 observation in multi-frequency synthesis mode, averaging images across all time bins, which resulted in the long exposure images shown in these panels. Three regions, highlighted by green circles in both images, are used to compute key statistics for comparison. These are reported in Table 3.

Table 3. Key image statistics. This table reports key statistics of the three regions highlighted in the long exposure images produced by the BLINK and SMART pipelines. The top region does not contain any source, making it ideal for sampling the noise level. The other two regions contain extended objects, providing an opportunity to compare peak fluxes.

Table 4. Observation 2. Details of the MWA observation that has been imaged at 20 ms time resolution with the BLINK imager.

Figure 5. High time resolution images. The figure shows a sample of the 20 ms images generated with the BLINK pipeline as part of the second test case. These contain bright pulses of pulsar B0950+08, whose position is marked with a red circle. The centre row shows the images with bright single pulses at UNIX times 1609279857.740 (left column) and 1609279856.480 (right column). The 20 ms images before and after the pulses are in the rows above and below the centre one, respectively. When the peak of a pulse lies close to the edge of a time bin, its residual signal appears in the before and after images. The average noise in the images is 0.5 Jy. The two brightest pulses have flux densities of 9 and 16.5 Jy.

Table 5. Execution times for the long exposure images. Execution times, with standard deviations in brackets, of the various phases of the SMART and BLINK imaging pipelines needed to process one second of observation at 1 s time resolution and 40 kHz frequency resolution. Values are averages over 4 660 s of observation. For the SMART pipeline, timings for Data loading and Output writing are not available because I/O interleaves with computation at every stage. The Correlation, Corrections, and Gridding and 2D FFT columns correspond to the execution of the offline correlator, COTTER, and WSClean, respectively.

Table 6. Execution times for high resolution images. The timing of each step of the SMART and BLINK imaging pipelines needed to process one second of observation at 20 ms time and 40 kHz frequency resolution. In this case WSClean could only complete 7% of the total number of imaging steps within the 24 h wall time limit of Garrawarla; hence, we give a projected execution time based on that information. No standard deviation is reported because timings of kernel executions and I/O operations on NVMe are quite stable.

Table 7. Data volumes. A summary of the amount of memory required to hold fundamental data products generated at each step of an imaging pipeline for the two test cases discussed in this paper. The table does not account for additional support data structures such as the array holding gridded visibilities.

Figure 6. Data volumes generated by an imaging pipeline per second of observation. In blue is the case of multi-frequency synthesis of a single image using full-bandwidth data and a long (1 s) integration time. The red bars show data volumes in the case of high time (50 $\times$ 20 ms time bins) and frequency (768 $\times$ 40 kHz channels) resolution images. No significant averaging in time or frequency occurs in the latter case, resulting in a large number of images being generated.
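The scaling behind the red bars can be reproduced with simple bookkeeping. Only the 50 $\times$ 20 ms time bins and 768 $\times$ 40 kHz channels come from the caption; the tile count, polarisation count, pixel count, and byte widths below are illustrative assumptions.

```python
# Data volume per second of observation in the high time/frequency
# resolution case. Parameters other than N_TIME and N_CHAN are
# assumptions for an MWA-like 128-tile array.
N_ANT, N_POL = 128, 4
N_TIME, N_CHAN = 50, 768   # 50 x 20 ms bins, 768 x 40 kHz channels
N_PIX = 1024
COMPLEX64, FLOAT32 = 8, 4  # bytes per visibility / per pixel

n_baselines = N_ANT * (N_ANT + 1) // 2
n_images = N_TIME * N_CHAN  # one image per time bin per channel

vis_bytes = N_TIME * N_CHAN * n_baselines * N_POL * COMPLEX64
img_bytes = n_images * N_PIX**2 * FLOAT32

print(n_images)                           # tens of thousands of images
print(vis_bytes / 1e9, img_bytes / 1e9)   # GB; images dominate here
```

Under these assumptions the image products exceed the visibilities by more than an order of magnitude, which is why avoiding filesystem staging between stages matters at this resolution.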