Strong lensing by edge-on galaxies in UNIONS

Abstract. Current searches for galaxy-scale strong lenses focus on massive Luminous Red Galaxies but tend to overlook late-type lenses, in part because of their smaller Einstein radii. We take advantage of the superb seeing of the UNIONS survey in the r-band to perform an imaging search for edge-on late-type lenses. We use Convolutional Neural Networks trained with simulated observations composed of images of real galaxies from UNIONS and real sources from HST. Using 3600 square degrees of the survey we test ∼7 million galaxies and find 56 systems with obvious signs of lensing. In addition, we empirically estimate the true prevalence of lenses in UNIONS by visually inspecting 120,000 randomly chosen images in the survey. We find that the number of edge-on lenses we discover with CNNs is compatible with these estimates.


Introduction
Machine Learning methods, applied to wide-field imaging surveys, have led to a revolution in the field of galaxy-scale strong lensing. Just in the past five years, the number of known strong lenses has increased by one order of magnitude, with some of the most recent lens-finding works reporting, on their own, tens or even hundreds of reliable candidates (e.g. Cañameras et al. 2020; Li et al. 2021; Rojas et al. 2022; Savary et al. 2022). However, the new lenses discovered are vastly dominated by early-type galaxies such as Luminous Red Galaxies (LRGs), and late-type lenses remain a minority. This can be partly explained by the smaller lensing cross-section of late-type galaxies, which makes them less likely than LRGs to produce prominent lensing features. In addition, current lens-finding teams tend to deliberately bias their samples towards LRG lenses, as this is where the chances of finding efficient lenses are best. As a result, using strong lensing as a tool to study galaxy formation and evolution remains limited to a fairly narrow range of galaxy morphologies, leaving late-type galaxies underrepresented.
The lensing cross-section of late-type galaxies peaks when they are seen edge-on, which maximizes the projected surface mass density along the line of sight. Indeed, most examples of late-type lenses do have edge-on deflectors. This turns out to be very convenient for spectroscopic follow-up aimed at measuring the stellar mass and the rotation curve of the lens. The combination of kinematics with lens modelling allows us to separate the stellar and dark mass distributions and to constrain the initial mass function (IMF) of the lens (e.g. Ferreras et al. 2010), and to break known model degeneracies such as the disc-bulge degeneracy or the dark matter-baryons degeneracy (e.g. Dutton et al. 2011). This technique, however, has not been systematically exploited in the last decade due to the low number of known edge-on disc lenses.
The largest current sample of late-type lenses resides in the Sloan WFC Edge-on Late-type Lens Survey (SWELLS, Treu et al. 2011). The SWELLS team looked for Sloan Digital Sky Survey (SDSS) spectra that contained more than one set of lines at different redshifts and then obtained high-resolution imaging using the Hubble Space Telescope (HST). They confirmed 24 edge-on lenses and studied 20 of them in detail (Brewer et al. 2012). By contrast, the largest search using wide-field optical imaging was carried out by Sygnet et al. (2010), who found 11 good candidates in the 150 deg² wide part of the Canada-France-Hawaii Telescope (CFHT) Legacy Survey. That study did not use Machine Learning. To date, there has been no systematic and targeted search for edge-on lenses in wide-field imaging surveys using Machine Learning. Most new edge-on lens candidates were in fact discovered in the course of LRG lens searches, and are thus found only in tiny proportions. For example, out of 294 candidates in Cañameras et al. (2021), only 13 have edge-on late-type galaxies as main deflectors.
In this work, we use a Convolutional Neural Network (CNN) to carry out a dedicated search for late-type lens galaxies, using data-driven imaging simulations composed exclusively of edge-on lenses for the positive class and of a random selection of galaxies of all types for the negative class. The observations used here are part of the Canada-France Imaging Survey (CFIS), now part of the Ultraviolet Near Infrared Optical Northern Survey (UNIONS). Our search is carried out in a single photometric band (r-band) under exquisite seeing conditions. As such, our work can be seen as exploratory work for the single-band VIS Euclid data.

Data
The Ultraviolet Near Infrared Optical Northern Survey (UNIONS) is a collaboration of wide-field imaging surveys of the northern hemisphere. UNIONS consists of the Canada-France Imaging Survey (CFIS), conducted at the 3.6-meter CFHT on Maunakea, members of the Pan-STARRS team, and the Wide Imaging with Subaru HyperSuprime-Cam of the Euclid Sky (WISHES) team. CFHT/CFIS is obtaining deep u and r bands; Pan-STARRS is obtaining deep i and moderate-deep z band imaging; and Subaru is obtaining deep z-band imaging through WISHES and g-band imaging through the Waterloo-Hawaii IfA g-band Survey (WHIGS). These independent efforts are directed, in part, to securing optical imaging to complement the Euclid space mission, although UNIONS is a separate collaboration aimed at maximizing the science return of these large and deep surveys of the northern skies. We focus our search on the CFIS r-band data, taking advantage of the excellent average seeing of 0.65 arcsec. The observations were taken with the wide-field imager MegaCam and reduced using the MegaPipe image processing pipeline (Gwyn 2008). Catalogs were produced using SExtractor (Bertin and Arnouts 1996, 2010).
We used observations covering 3624 deg² and applied a magnitude cut of 17 < r < 20.5. We did not apply any ellipticity cut because the lensing features, when present, reduce the measured ellipticity of the whole system. This left us with 6,978,977 extended sources to mine for lenses. For each source we extracted a stamp of 12.3 × 12.3 arcsec² (66 pixels on a side) centered on the source, along with a local model of the PSF from PSFEx (Bertin 2011, 2013) and a weight map.
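The selection described above can be sketched as follows. This is a minimal illustration, assuming a catalog held in NumPy arrays and a MegaCam pixel scale of about 0.187 arcsec per pixel (a value not stated in the text, used here only to relate 66 pixels to 12.3 arcsec). The function names and the toy data are hypothetical.

```python
import numpy as np

PIXEL_SCALE = 0.187  # arcsec per pixel; assumed value, not given in the text
STAMP_SIZE = 66      # pixels on a side, as quoted in the text


def select_by_magnitude(r_mag, bright=17.0, faint=20.5):
    """Boolean mask for sources with bright < r < faint."""
    r_mag = np.asarray(r_mag, dtype=float)
    return (r_mag > bright) & (r_mag < faint)


def cut_stamp(image, x, y, size=STAMP_SIZE):
    """Square cutout of `size` pixels centred on pixel (x, y).

    Returns None when the stamp would fall off the image edge.
    """
    half = size // 2
    x0 = int(round(x)) - half
    y0 = int(round(y)) - half
    if x0 < 0 or y0 < 0:
        return None
    if x0 + size > image.shape[1] or y0 + size > image.shape[0]:
        return None
    return image[y0:y0 + size, x0:x0 + size]


# Toy example with made-up magnitudes and a blank image.
mags = np.array([16.5, 17.2, 19.9, 20.5, 21.0])
mask = select_by_magnitude(mags)
stamp = cut_stamp(np.zeros((200, 200)), x=100, y=100)
stamp_arcsec = STAMP_SIZE * PIXEL_SCALE  # side length of the stamp in arcsec
```

Under the assumed pixel scale, a stamp of 66 pixels spans roughly 12.3 arcsec on a side, consistent with the stamp size quoted above.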
To train the CNN we simulated a data set mimicking CFIS r-band observations of edge-on lenses. We used real CFIS images for the deflector galaxies and HST imaging for the background sources. To select edge-on spirals we applied an ellipticity cut, 0.6 < (a² − b²)/(a² + b²) < 0.92, and a magnitude cut. We limited the ellipticity to 0.92 in order to avoid diffraction spikes, dead columns and other spurious detections. This selection was matched with SDSS DR18 (Almeida et al. 2023) to obtain photometric redshifts. We kept only the galaxies that had good estimates of the photo-z error. In total, there were 982,721 potential deflectors for the mock edge-on lenses. To simulate the sources we used 50,889 r-band images from the catalog compiled by Cañameras et al. (2020), which combines HST/ACS F814W high-resolution images with color information from the Hyper Suprime-Cam (HSC) to produce high-resolution images mimicking the HSC observed colors.
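As an illustration of the ellipticity cut, the sketch below takes a and b to be the semi-major and semi-minor axes of each source (an assumption; they could come, for instance, from the SExtractor `A_IMAGE` and `B_IMAGE` parameters). The function names and the toy values are hypothetical.

```python
import numpy as np


def ellipticity(a, b):
    """Ellipticity defined as (a^2 - b^2) / (a^2 + b^2)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (a**2 - b**2) / (a**2 + b**2)


def is_edge_on(a, b, e_min=0.6, e_max=0.92):
    """Boolean mask for sources with e_min < ellipticity < e_max."""
    e = ellipticity(a, b)
    return (e > e_min) & (e < e_max)


# Toy axes (arbitrary units): round, elongated, very elongated, spike-like.
a = np.array([10.0, 10.0, 10.0, 10.0])
b = np.array([8.0, 4.0, 2.5, 1.0])
e = ellipticity(a, b)
edge_on = is_edge_on(a, b)
```

With this definition, the bounds 0.6 and 0.92 correspond to axis ratios b/a of 0.5 and about 0.20, so the upper limit rejects extremely thin detections such as diffraction spikes.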

Methodology
We used a supervised approach that relies on showing correctly classified examples to the CNN, and minimizing the error between the score and the correct classification (1 for lenses, 0 otherwise). Given the low number of known edge-on lenses, we are forced to create a data set of mock CFIS observations of edge-on lenses. The lens-finding procedure has three main steps:
(1) Create a training data set by mocking CFIS observations of edge-on lenses.
(2) Train and deploy a CNN to select good edge-on lens candidates.
(3) Visually inspect the CNN's candidates to clean the final catalog of false positives.
To make a mock lens we start with a potential deflector and assume a mass-to-light ratio for it. Then, we create a mass model based on the light profile of the galaxy plus a dark matter halo. Finally, we lens an HST source and add it to the image of the deflector. This was done using the Lenstronomy package (Birrer and Amara 2018; Birrer et al. 2021), where the luminous mass is represented with a Chameleon profile (see appendix of Dutton et al. 2011) and where the dark matter is a Navarro-Frenk-White profile (Navarro et al. 1996). The final training data set consisted of 99,547 mock lenses and an equal number of visually inspected non-lenses, of which 80% were used as training data, 10% as validation data, and 10% as test data.
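The 80/10/10 partition of the data set can be sketched as below. This is only an illustration: the text does not say how the split was drawn, so the random shuffle, the seed and the function name are assumptions.

```python
import numpy as np


def split_indices(n, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle indices 0..n-1 and split them into train/validation/test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    # The test set takes whatever remains after train and validation.
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])


N_MOCK = 99_547            # mock lenses, as quoted in the text
n_total = 2 * N_MOCK       # plus an equal number of non-lenses
labels = np.concatenate([np.ones(N_MOCK, dtype=int),
                         np.zeros(N_MOCK, dtype=int)])
train_idx, val_idx, test_idx = split_indices(n_total)
```

Shuffling before splitting keeps both classes mixed in each subset, which matters here because the labels are stored as one block of lenses followed by one block of non-lenses.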
For the classification we reimplemented the CNN CMU DeepLens (Lanusse et al. 2018) in Keras (Chollet et al. 2018) and trained it for 160 epochs using the standard binary cross-entropy as loss function and a Stochastic Gradient Descent optimizer. We used a learning rate, momentum and batch size of 0.001, 0.9 and 512, respectively. We consider objects with a score larger than 0.5 as candidates for visual inspection. After training, the accuracy, precision and recall of the CNN on the test data set were 99.33%, 99.30% and 99.34%, respectively.
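For reference, the three figures of merit quoted above can be computed from the CNN scores as in the following sketch. It uses the 0.5 threshold stated in the text; the function name and the toy scores are hypothetical and do not come from the actual test set.

```python
import numpy as np


def classification_metrics(scores, labels, threshold=0.5):
    """Accuracy, precision and recall for scores thresholded at `threshold`."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    predicted = scores > threshold      # candidate if score > threshold
    truth = labels == 1                 # 1 for lenses, 0 otherwise
    tp = int(np.sum(predicted & truth))
    fp = int(np.sum(predicted & ~truth))
    fn = int(np.sum(~predicted & truth))
    tn = int(np.sum(~predicted & ~truth))
    accuracy = (tp + tn) / len(labels)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return accuracy, precision, recall


# Toy example: three lenses and three non-lenses with made-up scores.
scores = np.array([0.95, 0.80, 0.30, 0.60, 0.10, 0.05])
labels = np.array([1, 1, 1, 0, 0, 0])
accuracy, precision, recall = classification_metrics(scores, labels)
```

Precision measures the purity of the candidate list (the fraction of candidates that are true lenses), while recall measures its completeness (the fraction of true lenses that are recovered).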
The visual inspection was done by J. A. B., B. C., and F. C. using a visualization tool derived from the one used in Savary et al. (2022). The inspection was done in three steps. First, each expert inspected all of the candidates in mosaics of 10 × 10 stamps per page, selecting any object that showed any possible hint of lensing. Second, each expert inspected one by one all of the candidates selected in the previous step and graded them A, B, or X (A meaning sure lens; B, possibly a lens; and X, not a lens). Finally, all experts gathered together and decided on a final list of candidates from all the objects graded A or B in the previous step.

Preliminary results and conclusion
estimated range (16 to 638), albeit on the lower side. By contrast, the number of lens candidates of any type discovered by the CNN (56) is well outside of the estimated range (363 to 1885). This is to be expected since the CNN was trained to target edge-on lenses specifically, and they represent only a small fraction of all the lenses in the sample. Finally, we combined the lens candidates discovered during the inspection of random stamps with the candidates from the CNN and the serendipitous discoveries, resulting in a total of 78 new lens candidates and 8 re-discoveries. Out of the new candidates, 48 (4 grade A and 44 grade B) had an edge-on disc galaxy as main deflector. The other 30 new candidates (7 grade A and 23 grade B) had LRGs, galaxy groups, or galaxy clusters as main deflector.
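The comparison above rests on scaling the number of lenses seen in the randomly inspected stamps up to the full sample. A minimal sketch of such an extrapolation is given below, assuming a Wilson score interval on the observed fraction. The interval construction behind the quoted ranges is not described here, so this is an illustrative choice rather than the procedure actually used, and the count of 5 lenses is purely hypothetical.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a proportion k/n (z=1.96 for ~95%)."""
    p = k / n
    denom = 1.0 + z**2 / n
    centre = (p + z**2 / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z**2 / (4.0 * n**2)) / denom
    return max(0.0, centre - half), centre + half


def extrapolate(k, n_inspected, n_total, z=1.96):
    """Scale k detections in n_inspected stamps to a sample of n_total."""
    low_frac, high_frac = wilson_interval(k, n_inspected, z)
    best = k / n_inspected * n_total
    return best, low_frac * n_total, high_frac * n_total


N_INSPECTED = 120_000     # randomly chosen stamps inspected (from the text)
N_TOTAL = 6_978_977       # extended sources in the full sample (from the text)
K_HYPOTHETICAL = 5        # made-up number of lenses found; NOT a real result

best, low, high = extrapolate(K_HYPOTHETICAL, N_INSPECTED, N_TOTAL)
```

Because only a handful of lenses can be expected among 120,000 random stamps, the resulting interval is wide, which is why the estimated ranges span more than an order of magnitude.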
In conclusion, we showed the viability of finding edge-on lenses from single-band data in ground-based imaging surveys, in spite of their small Einstein radii. We discovered a sizable population of previously unknown edge-on lenses, along with some new candidates of lensing by LRGs, galaxy groups, and galaxy clusters. Furthermore, we showcased a methodology to evaluate the performance of automated lens finders by estimating the prevalence of candidates in the studied sample and comparing it to the number of candidates discovered by the lens finder. In this way we can evaluate the completeness of the lens search on an objective basis for that particular data set. Our work also illustrates the need for lens searches that explicitly target edge-on disc galaxies, and presents a methodology for carrying them out. Our results and methods are particularly promising for the Euclid VIS instrument, which will have even sharper images and will observe 4 times the area studied in this work.

Figure 1. Examples of Grade A edge-on lens candidates recovered by the CNN. All of them are new candidates except for the one at the right (see J1559+3146 in Cañameras et al. 2020; Savary et al. 2022).

Table 1. Summary of the visual inspection of random stamps. Top: all types of lenses. Bottom: edge-on lenses only.