
Enabling Near Real-Time Remote Search for Fast Transient Events with Lossy Data Compression

Published online by Cambridge University Press:  05 September 2017

Dany Vohl*
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Hawthorn 3122, Australia Advanced Visualisation Laboratory, Digital Research & Innovation Capability Platform, Swinburne University of Technology, Hawthorn 3122, Australia
Tyler Pritchard
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Hawthorn 3122, Australia Australian Research Council Centre of Excellence for All-sky Astrophysics (CAASTRO), The University of Sydney, NSW 2006, Australia
Igor Andreoni
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Hawthorn 3122, Australia Australian Research Council Centre of Excellence for All-sky Astrophysics (CAASTRO), The University of Sydney, NSW 2006, Australia Australian Astronomical Observatory, North Ryde 2113, Australia Australian Research Council Centre of Excellence for Gravitational Wave Discovery (OzGrav), Swinburne University of Technology, Hawthorn 3122, Australia
Jeffrey Cooke
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Hawthorn 3122, Australia Australian Research Council Centre of Excellence for All-sky Astrophysics (CAASTRO), The University of Sydney, NSW 2006, Australia Australian Research Council Centre of Excellence for Gravitational Wave Discovery (OzGrav), Swinburne University of Technology, Hawthorn 3122, Australia
Bernard Meade
Affiliation:
Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Hawthorn 3122, Australia The University of Melbourne, Parkville 3010, Australia

Abstract

We present a systematic evaluation of JPEG2000 (ISO/IEC 15444) as a transport data format to enable rapid remote searches for fast transient events as part of the Deeper Wider Faster (DWF) programme. DWF uses ~20 telescopes, from radio to gamma rays, to perform simultaneous and rapid-response follow-up searches for fast transient events on millisecond-to-hours timescales. The search demands of DWF impose a set of constraints that is becoming common amongst large collaborations. Here, we focus on the rapid optical data component of DWF, led by the Dark Energy Camera (DECam) at the Cerro Tololo Inter-American Observatory (CTIO). Each DECam image comprises 70 charge-coupled devices (CCDs) in total, saved as a ~1.2-gigabyte FITS file. Near real-time data processing and fast transient candidate identification (within minutes, to allow rapid follow-up triggers on other telescopes) requires computational power exceeding what is currently available on-site at CTIO. In this context, data files need to be transmitted rapidly to a remote location for supercomputing post-processing, source finding, visualisation, and analysis. This step poses a major bottleneck in the search process, and reducing the data size allows faster transmission. To maximise the gain in transfer time while still achieving our science goals, we opt for lossy data compression, keeping in mind that the raw data are archived and can be evaluated at a later time. We evaluate how lossy JPEG2000 compression affects the process of finding transients, and find only a negligible effect for compression ratios up to ~25:1. We also find a linear relation between compression ratio and the mean estimated data-transmission speed-up factor.
Adding highly customised compression and decompression steps to the science pipeline considerably reduces the transmission time, validating their introduction to the DWF science pipeline and enabling science that would otherwise be impractical with current technology.

Information

Type
Research Article
Copyright
Copyright © Astronomical Society of Australia 2017 

Figure 1. Example of a raw, uncalibrated mosaic image, as captured by the 62 science CCDs and 8 guide CCDs of DECam. Each science CCD has dimensions of 4146 × 2160 pixels, while each guide CCD contains 2098 × 2160 pixels. Each pixel is encoded as a 32-bit integer, resulting in ~1.2 GB of storage space for the whole mosaic. Each CCD has two amplifiers, providing the ability to read the pixel arrays using either or both amplifiers. The uncalibrated image displays darker and lighter halves on each CCD, corresponding to the regions covered by each amplifier. The mosaic was visualised with SAOImage DS9 (Smithsonian Astrophysical Observatory 2000) using the heat colour map. The blue masks and dashed lines highlight the size of a science and a guide CCD, respectively.


Figure 2. JPEG2000 compression is applied as a stream of processing steps based on the discrete wavelet transform, scalar quantisation, context modelling, entropy coding, and post-compression rate allocation [adapted from Kitaeff et al. (2015)].
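As a minimal illustration of the first stage in this chain, the discrete wavelet transform splits the signal into low-pass (approximation) and high-pass (detail) coefficients. JPEG2000 itself uses the CDF 5/3 or 9/7 wavelets in two dimensions; the sketch below uses a single-level 1-D Haar transform only to convey the idea, and the function names are ours, not part of any JPEG2000 library.

```python
import numpy as np

def haar_1d(signal):
    """One level of a Haar wavelet transform: local averages (low-pass)
    and local differences (high-pass)."""
    s = np.asarray(signal, dtype=float)
    approx = (s[0::2] + s[1::2]) / 2.0   # low-pass coefficients
    detail = (s[0::2] - s[1::2]) / 2.0   # high-pass coefficients
    return approx, detail

def haar_1d_inverse(approx, detail):
    """Invert the transform exactly (lossless until quantisation is applied)."""
    out = np.empty(approx.size * 2)
    out[0::2] = approx + detail
    out[1::2] = approx - detail
    return out
```

In the real codec, loss is introduced only at the scalar quantisation and rate-allocation stages; the transform itself is invertible.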


Figure 3. Compression procedure schematic diagram. The multi-extension FITS file from DECam is lossily compressed into multiple JPEG2000 files (one per extension), which are then grouped together into a TAR file ready for transmission. Note that the primary header is merged with each extension header.
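The final grouping step can be sketched with Python's standard-library `tarfile` module. The function name and file layout below are illustrative assumptions, not the pipeline's actual code:

```python
import tarfile
from pathlib import Path

def bundle_jp2(jp2_paths, tar_path):
    """Group one JPEG2000 file per FITS extension into a single
    uncompressed TAR archive ready for transmission."""
    with tarfile.open(tar_path, "w") as tar:   # "w": no extra compression layer
        for p in jp2_paths:
            tar.add(p, arcname=Path(p).name)   # store flat file names
    return tar_path
```

An uncompressed TAR mode is the natural choice here: the JPEG2000 payloads are already near-incompressible, so gzip on top would add CPU time for no size benefit.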


Figure 4. Decompression procedure schematic diagram. The TAR file is expanded to recover all JPEG2000 files; each of them is then decompressed into a single extension FITS file. Each FITS file corresponds to a given extension of the original file, where the primary header contains the merged information of the original primary header and the current extension header.
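The header-merging step amounts to carrying the original primary-header keywords into each extension, with extension keywords taking precedence on conflicts. A dict-based sketch (the real pipeline operates on FITS headers, e.g. via `astropy.io.fits`; the function name is ours):

```python
def merge_headers(primary, extension):
    """Merge the original primary header into an extension header;
    extension keywords win on conflicts."""
    merged = dict(primary)
    merged.update(extension)   # extension values override primary ones
    return merged
```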


Table 1. Parameters in Part 1 of the JPEG2000 Standard, ordered as encountered in the encoder. The only parameter for which the default value is modified during an observation run is highlighted.


Figure 5. Schematic diagram of the experiment setup. Three images taken at different epochs form a set. Transients are added to each image of the set, using the same sky coordinates in all three images. Images of the set are coadded (median stacking) to better detect the transients and to eliminate cosmic rays—reflecting a transient lasting longer than three images' worth of time (about 120 s). Difference imaging is then applied between the stacked image and a template image, resulting in a residual image. Transient detection is applied to the residual image using the Mary pipeline, which outputs a list of transient candidates. This list is cross-matched with the list of injected sources to evaluate the completeness. We note that a loss in completeness will naturally occur when injected transients fall onto bright sources, making their detection difficult or impossible.
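The cross-match and completeness computation can be sketched as follows. This is a simplified positional match in pixel coordinates with a hypothetical matching radius; the function name and radius are illustrative, not taken from the Mary pipeline:

```python
import numpy as np

def completeness(injected, detected, radius=1.0):
    """Fraction of injected (x, y) positions with at least one detected
    candidate within `radius` pixels (simplified nearest-neighbour match)."""
    injected = np.atleast_2d(np.asarray(injected, dtype=float))
    detected = np.atleast_2d(np.asarray(detected, dtype=float))
    if injected.size == 0 or detected.size == 0:
        return 0.0
    # pairwise distances between injected and detected positions
    dists = np.linalg.norm(injected[:, None, :] - detected[None, :, :], axis=-1)
    recovered = (dists.min(axis=1) <= radius).sum()
    return recovered / len(injected)
```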


Figure 6. Normalised completeness as a function of magnitude for all evaluated compression ratios [Equation (4)]. A normalised completeness of $10^0$ indicates no difference in transient finding results between the compressed and never-compressed data. Results above and below this line show that compression affected the findings positively or negatively, respectively. Results are limited to cases where completeness on original data was ⩾0.5 (i.e. down to magnitude ~22.5).


Figure 7. Mean normalised completeness and 95% confidence interval as a function of compression ratio. Results are limited to cases where completeness on original data was ⩾0.5.


Figure 8. Box and whiskers plot showing distributions of transfer rate (MB s−1, top panel), transfer time (s, central panel), and compression ratio (:1, bottom panel) obtained during each day of the O2 (2016) and O3 (2017) runs. The median (line) is within the box bounded by the first and third quartiles ($\mathtt {IQR}$ = $\mathtt {Q3}-\mathtt {Q1}$). The whiskers are $\mathtt {Q1-1.5 \times IQR}$ and $\mathtt {Q3+1.5 \times IQR}$. Beyond the whiskers, values are considered outliers and are plotted as diamonds. We note that transfer rate varied greatly from day to day. The team varied the compression ratio during each run to keep visual quality as high as possible while maintaining sufficiently fast transfer times.
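The box-plot statistics used above follow the standard Tukey convention and can be reproduced directly; the function name below is ours, written as a minimal sketch:

```python
import numpy as np

def box_stats(values):
    """Quartiles and whisker bounds as used in a box-and-whiskers plot."""
    v = np.asarray(values, dtype=float)
    q1, med, q3 = np.percentile(v, [25, 50, 75])
    iqr = q3 - q1                         # interquartile range
    return {"Q1": q1, "median": med, "Q3": q3,
            "low_whisker": q1 - 1.5 * iqr,
            "high_whisker": q3 + 1.5 * iqr}
```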


Figure 9. Mean estimated speed-up factor ($\hat{s}$) and 95% confidence interval as a function of compression ratio for 13 081 files transferred during the O2 run (2016) and O3 run (2017).
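The intuition behind the speed-up estimate can be captured with a simple timing model: the time to transfer the original file divided by the end-to-end time for the compressed file. This is our simplified model, not necessarily the paper's exact estimator $\hat{s}$; all parameter names are illustrative:

```python
def estimated_speedup(original_mb, ratio, rate_mb_s,
                      t_compress=0.0, t_decompress=0.0):
    """Estimated transmission speed-up: t(original) / t(compressed),
    where the compressed path includes compression and decompression time."""
    t_original = original_mb / rate_mb_s
    t_compressed = (t_compress
                    + (original_mb / ratio) / rate_mb_s
                    + t_decompress)
    return t_original / t_compressed
```

With negligible codec overhead the speed-up approaches the compression ratio itself, which is consistent with the linear relation reported in the abstract; fixed per-file overheads pull it below that ceiling.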


Table 2. Summary of transmission timing results for the combined (Both) and individual (O2, O3) observation runs. Columns show compression ratio (#), transfer rate (r), estimated speed-up factor ($\hat{s}$), and estimated transfer time saved ($\hat{\theta }$). Rows show minimum, maximum, mean, median, and standard deviation of the distribution.