
The quest for early detection of retinal disease: 3D CycleGAN-based translation of optical coherence tomography into confocal microscopy

Published online by Cambridge University Press:  16 December 2024

Xin Tian*
Affiliation:
Visual Information Laboratory, University of Bristol, Bristol, UK
Nantheera Anantrasirichai
Affiliation:
Visual Information Laboratory, University of Bristol, Bristol, UK
Lindsay Nicholson
Affiliation:
Autoimmune Inflammation Research, University of Bristol, Bristol, UK
Alin Achim
Affiliation:
Visual Information Laboratory, University of Bristol, Bristol, UK
*
Corresponding author: Xin Tian; Email: xin.tian@bristol.ac.uk

Abstract

Optical coherence tomography (OCT) and confocal microscopy are pivotal in retinal imaging, offering distinct advantages and limitations. In vivo OCT offers rapid, noninvasive imaging but can suffer from clarity issues and motion artifacts, while ex vivo confocal microscopy, providing high-resolution, cellular-detailed color images, is invasive and raises ethical concerns. To combine the benefits of both modalities, we propose a novel framework based on unsupervised 3D CycleGAN for translating unpaired in vivo OCT to ex vivo confocal microscopy images. This marks the first attempt to exploit the inherent 3D information of OCT and translate it into the rich, detailed color domain of confocal microscopy. We also introduce a unique dataset, OCT2Confocal, comprising mouse OCT and confocal retinal images, which facilitates the development of, and establishes a benchmark for, cross-modal image translation research. Our model has been evaluated both quantitatively and qualitatively, achieving Fréchet inception distance (FID) scores of 0.766 and kernel inception distance (KID) scores as low as 0.153, as well as the leading subjective mean opinion score (MOS). Our model demonstrated superior image fidelity and quality over existing methods, even with limited training data. Our approach effectively synthesizes color information from 3D confocal images, closely approximating target outcomes and suggesting enhanced potential for diagnostic and monitoring applications in ophthalmology.
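The abstract reports Fréchet inception distance (FID) scores for the translated images. In practice, FID fits a Gaussian to Inception-network features of real and generated images and measures the Fréchet distance between the two fits; a minimal sketch of that distance, assuming the feature vectors have already been extracted (the pretrained Inception feature extractor itself is omitted here):

```python
import numpy as np

def frechet_distance(feats_real, feats_fake):
    """Fréchet distance between Gaussian fits of two feature sets.

    feats_real, feats_fake: (n_samples, dim) arrays of image features,
    e.g. Inception activations. Lower is better; 0 means identical fits.
    """
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_fake, rowvar=False)
    diff = mu1 - mu2
    # Tr((S1 S2)^(1/2)) via the eigenvalues of the (non-symmetric) product;
    # tiny negative eigenvalues from floating-point noise are clipped to 0.
    eig = np.linalg.eigvals(s1 @ s2)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0, None)).sum()
    return float(diff @ diff + np.trace(s1) + np.trace(s2) - 2.0 * tr_sqrt)
```

As a sanity check, shifting every sample by a constant vector d leaves the covariance unchanged, so the distance reduces to the squared mean shift, ||d||². Note that reported FID values depend on the feature extractor and sample count, so scores are only comparable under matching evaluation settings.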

Information

Type
Research Article
Creative Commons
CC BY-NC-SA 4.0
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (http://creativecommons.org/licenses/by-nc-sa/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Figure 1. The proposed OCT-to-Confocal image translation method is based on 3D CycleGAN.


Figure 2. OCT2Confocal data. (a) The OCT cube with the confocal image stack of mouse A2R; (b) OCT projections and confocal images of three mice.


Figure 3. Example of one slice of an original four-channel retinal confocal image stack. The images show (from left to right): (a) endothelial cells lining the blood vessels (red), (b) CD4+ T cells (green), (c) cell nuclei stained with DAPI (blue), and (d) microglia and macrophages (white).


Table 1. Correlation of selected DB and NR image quality metrics with MOS


Table 2. Comparative results of different generator architectures in 3D CycleGAN-3. The table presents FID768, FID2048, and KID scores for the U-Net, WGAN-GP, and ResNet 9 generators. Lower scores indicate better performance, with the best result colored in red


Figure 4. Visual comparison of translated images using different generator architectures. This figure displays the translated confocal images using U-Net, WGAN-GP, and ResNet 9 architectures.


Figure 5. Impact of the gradient and identity loss hyperparameters $ {\lambda}_2 $ and $ {\lambda}_3 $ on FID and KID. The lowest (optimal) score is highlighted in red.


Figure 6. Visual comparison of translated confocal images with different $ {\lambda}_2 $ and $ {\lambda}_3 $ values against the optimized setting.


Figure 7. Visual comparison of translated images with varying input slice depths (5, 7, 9, 11 slices). This figure demonstrates the impact of different slice depths on the quality of image translation by the 3D CycleGAN-3 model.


Table 3. Performance of the models evaluated by the DB metrics FID and KID, alongside the subjective MOS rating. Results are reported for image sets with reference ('W Ref'), without reference ('W/O Ref'), and combined ('Total'). For each column, the best result is colored in red and the second best in blue


Figure 8. Visual comparative translation results with reference.


Figure 9. Visual comparative translation results without reference.


Figure 10. Boxplot of subjective evaluation scores for comparison across scenarios with reference (‘W Ref’), without reference (‘W/O Ref’), and the combined total (‘Total’). The circles indicate outliers in the data.


Figure 11. Example of model hallucination analysis, focusing on the red channel for vascular structures and the green channel for CD4+ T cells. Highlighted areas (yellow boxes) show where each model introduces inaccuracies in the representation of vascular and immune cell distributions.


Table 4. Comparative computational cost of different models. For each column, red indicates the most computationally efficient value for each metric


Table 5. Performance of the models evaluated by the DB metrics FID and KID, alongside the subjective MOS rating of each individual set. The best result is colored in red and the second best in blue