Fibrosis severity scoring on Sirius red histology with multiple-instance deep learning

Published online by Cambridge University Press:  18 July 2023

Sneha N. Naik*
Affiliation:
ITMAT Data Science Group, NIHR Imperial BRC, Imperial College, London, United Kingdom; Heffner Biomedical Imaging Lab, Department of Biomedical Engineering, Columbia University, New York, NY, USA
Roberta Forlano
Affiliation:
Faculty of Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College, London, United Kingdom
Pinelopi Manousou
Affiliation:
Department of Hepatology, Imperial College Healthcare NHS Trust, London, United Kingdom
Robert Goldin
Affiliation:
Section for Pathology, Imperial College, London, United Kingdom
Elsa D. Angelini
Affiliation:
ITMAT Data Science Group, NIHR Imperial BRC, Imperial College, London, United Kingdom; Heffner Biomedical Imaging Lab, Department of Biomedical Engineering, Columbia University, New York, NY, USA; Faculty of Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College, London, United Kingdom; Telecom Paris, Institut Polytechnique de Paris, LTCI, Palaiseau, France
*Corresponding author: Sneha N. Naik; Email: sn2990@columbia.edu

Abstract

Non-alcoholic fatty liver disease (NAFLD) is now the leading cause of chronic liver disease, affecting approximately 30% of people worldwide. Histopathological reading of fibrosis patterns is crucial to diagnosing NAFLD. In particular, the transition from mild to severe stages is critical, as it correlates with clinical outcomes. Deep learning applied to digitized histopathology whole-slide images (WSIs) can reduce the high inter- and intra-rater variability of such readings. We demonstrate a novel solution to score fibrosis severity on a retrospective cohort of 152 Sirius Red WSIs, with the fibrosis stage annotated at slide level by an expert pathologist. We exploit multiple-instance learning and multiple inferences to address the sparsity of pathological signs. We achieved an accuracy of $ 78.98\pm 5.86\% $, an F1 score of $ 77.99\pm 5.64\% $, and an AUC of $ 0.87\pm 0.06 $. These results set new state-of-the-art benchmarks for this application.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press
Figure 1. Overview of our bag construction and deep-learning pipelines for fibrosis scoring on Sirius Red histopathology WSIs. Two MIL bag construction methods are compared: (1) RED-1, which uses a priori knowledge of the red appearance of fibrosis to generate one bag per WSI, and (2) RAND-n, which generates $ n $ bags per WSI based only on tissue content. Deep-learning pipeline (best model): SE-ResNet18 initialized with ImageNet pre-trained weights, followed by gated attention aggregation, which outputs a binary label for a given bag.
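
The gated attention aggregation named in the caption can be illustrated with a minimal sketch. It assumes the commonly used gated-attention pooling form, in which each tile embedding receives a score from a tanh branch multiplied by a sigmoid gate, and the scores are turned into weights with a softmax. The function name `gated_attention_pool`, the weight matrices `V`, `U`, the vector `w`, and all dimensions are hypothetical; in the article the embeddings would come from the SE-ResNet18 backbone, whereas random vectors stand in for them here.

```python
import math
import random


def gated_attention_pool(tiles, V, U, w):
    """Aggregate k tile embeddings into one bag embedding.

    tiles: list of k embeddings, each a list of d floats
    V, U:  m x d projection matrices as lists of rows (hypothetical weights)
    w:     list of m floats scoring the gated features (hypothetical)
    Returns (bag_embedding, attention_weights).
    """
    def matvec(matrix, x):
        return [sum(a * b for a, b in zip(row, x)) for row in matrix]

    scores = []
    for h in tiles:
        tanh_part = [math.tanh(v) for v in matvec(V, h)]
        # Sigmoid written via tanh so it cannot overflow for large inputs.
        gate_part = [0.5 * (1.0 + math.tanh(0.5 * u)) for u in matvec(U, h)]
        scores.append(sum(wi * t * g for wi, t, g in zip(w, tanh_part, gate_part)))

    # Softmax over the k tiles (shifted by the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Bag embedding = attention-weighted sum of the tile embeddings.
    d = len(tiles[0])
    bag = [sum(a * h[j] for a, h in zip(attention, tiles)) for j in range(d)]
    return bag, attention


# Toy example: a bag of k = 10 tiles with d = 8 features (random stand-ins).
rng = random.Random(0)
k, d, m = 10, 8, 4
tiles = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]
V = [[rng.gauss(0, 0.5) for _ in range(d)] for _ in range(m)]
U = [[rng.gauss(0, 0.5) for _ in range(d)] for _ in range(m)]
w = [rng.gauss(0, 0.5) for _ in range(m)]
bag, attention = gated_attention_pool(tiles, V, U, w)
```

In a full model the bag embedding would then be passed to a classifier that outputs the binary mild/severe label, and the attention weights could be inspected per tile, as done with the highest-attention tiles shown in the later figures.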

Table 1. Performance of single-bag MIL architectures pre-trained on ImageNet, using Dataloader RED-1 with bag size $ k=10 $.

Table 2. Effect of bag sizes $ k $ using SE-ResNet18 (Gated Attention) pre-trained with ImageNet on Dataloader RED-1 with $ n=1 $ bag per WSI.

Table 3. Comparison of Dataloader RED-1 ($ n=1 $) versus Dataloader RAND-n with bag size $ k=10 $.
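
The two bag construction strategies compared here can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the page does not specify how RED-1 uses the red prior, so ranking tiles by a precomputed red-pixel fraction is assumed; the tissue threshold `min_tissue`, the per-tile fields `red_fraction` and `tissue_fraction`, and the averaging rule in `slide_score` are likewise hypothetical.

```python
import random


def red_1_bag(tiles, k=10):
    """One bag per WSI: the k tiles with the largest red fraction (assumed criterion)."""
    ranked = sorted(tiles, key=lambda t: t["red_fraction"], reverse=True)
    return [ranked[:k]]


def rand_n_bags(tiles, n, k=10, min_tissue=0.5, seed=0):
    """n bags per WSI: k tiles drawn at random among tiles with enough tissue.

    The tissue threshold and sampling without replacement within a bag
    are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    candidates = [t for t in tiles if t["tissue_fraction"] >= min_tissue]
    if len(candidates) < k:
        raise ValueError("not enough tissue tiles to fill a bag")
    return [rng.sample(candidates, k) for _ in range(n)]


def slide_score(bag_probabilities):
    """Average per-bag probabilities into one slide-level score (assumed rule)."""
    return sum(bag_probabilities) / len(bag_probabilities)


# Toy slide of 40 tiles with synthetic red and tissue fractions.
toy_tiles = [
    {"id": i, "red_fraction": (i * 7 % 40) / 100, "tissue_fraction": (i % 5) / 4}
    for i in range(40)
]
red_bags = red_1_bag(toy_tiles, k=10)           # a single bag
rand_bags = rand_n_bags(toy_tiles, n=5, k=10)   # five bags from the same slide
```

With several bags per slide, a model can be run once per bag and the resulting probabilities combined, for instance with `slide_score`, which is one plausible reading of the multiple inferences mentioned in the abstract.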

Figure 2. True positive (TP) cases of severe fibrosis illustrated on the tiles with the 10 highest attention weights. WSI color categories are indicated under each case. Arrow color-coding: blue = relevant pathological signs of bridging, purple = red artifacts, green = healthy portal tracts. (a–f) Cases with high attention correctly focused on fibrotic signs as well as on artifacts and portal tracts. (g) Case showing higher attention on portal tracts sliced longitudinally than on fibrotic signs. (h) Case with strong blurring still showing high attention on fibrotic signs.

Figure 3. False positive (FP) cases of severe fibrosis. (a) Red artifacts: the tiles with the two highest and the two lowest attention weights show a focus on red pixels from a vein cut within the tissue. (b) Borderline case: this case was reconsidered as a potential severe case by the expert clinician.

Figure 4. False negative (FN) cases. (a) Faded red stain; (b) poor-quality biopsy with tissue crumbling; (c,d) borderline cases with sparse bridging signs shown in zoomed boxes: (c) most red pixels correspond to healthy portal tracts sliced longitudinally; (d) very thin bridging patterns.