
BibliZap: An exploratory evaluation of an automated multi-level citation searching tool for systematic and rapid reviews

Published online by Cambridge University Press:  24 March 2026

Raphaël Bentegeac
Affiliation:
Public Health and Epidemiology, Lille University Hospital, France; UMR1167 RID-AGE, Pasteur Institute of Lille, France
Bastien Le Guellec
Affiliation:
UMR1167 RID-AGE, Pasteur Institute of Lille, France; Inserm, CHU Lille, U1172 LilNCog Lille Neuroscience & Cognition, Lille 2 University of Health and Law, France; Neuroradiology, Lille University Hospital, France
Victor Leblanc
Affiliation:
Hauts-de-France, Protection Maternelle et Infantile du Département du Nord, France
Rémi Lenain
Affiliation:
Nephrology, Lille University Hospital, France
Luc Dauchet
Affiliation:
Public Health and Epidemiology, Lille University Hospital, France; UMR1167 RID-AGE, Pasteur Institute of Lille, France
Victoria Gauthier
Affiliation:
Public Health and Epidemiology, Lille University Hospital, France; UMR1167 RID-AGE, Pasteur Institute of Lille, France
Erwin Gerard
Affiliation:
CHU Lille, ULR 2694—METRICS, CERIM, Public Health Department, Lille 2 University of Health and Law, France
Emmanuel Chazard
Affiliation:
CHU Lille, ULR 2694—METRICS, CERIM, Public Health Department, Lille 2 University of Health and Law, France
Philippe Amouyel
Affiliation:
Public Health and Epidemiology, Lille University Hospital, France; UMR1167 RID-AGE, Pasteur Institute of Lille, France
Estelle Aymes
Affiliation:
Public Health and Epidemiology, Lille University Hospital, France; UMR1167 RID-AGE, Pasteur Institute of Lille, France
Aghilès Hamroun*
Affiliation:
Public Health and Epidemiology, Lille University Hospital, France; UMR1167 RID-AGE, Pasteur Institute of Lille, France
* Corresponding author: Aghilès Hamroun; Email: aghiles.hamroun@gmail.com

Abstract

The exponential growth of scientific literature poses increasing challenges for evidence synthesis. Systematic reviews (SRs) usually rely on keyword-based database searches, which are limited by inconsistent terminology and indexing delays. Citation searching, which identifies studies that cite or are cited by known relevant articles, offers a complementary route to uncover additional evidence but remains poorly automated and poorly integrated into screening workflows. We developed BibliZap, an open-source, fully automated citation-searching tool built on Lens.org data, which performs multi-level forward and backward citation searches with relevance-based ranking. Its performance was evaluated across 66 published SRs, comparing five approaches: (1) PubMed-only searches; (2) PubMed followed by BibliZap restricted to the top 500 ranked results; (3) PubMed followed by full BibliZap screening; and (4–5) two exploratory early-stop strategies in which BibliZap was initiated after identifying the first or the first three relevant PubMed records. The primary outcome was sensitivity, with secondary assessments of screening workload and precision. When used after PubMed screening, BibliZap increased mean sensitivity from 75% to 97%, achieving complete recall in over half of the reviews. Screening only the top 500 outputs still allowed over 90% of reviews to reach or exceed 80% recall. BibliZap recovered a median of three included articles per review that PubMed had not retrieved, while adding a median of 6,450 additional records. Citation searching via BibliZap enhances the completeness of evidence retrieval in SRs based on restricted database searches and supports transparent, scalable workflows adaptable to rapid and exploratory review contexts.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press on behalf of The Society for Research Synthesis Methodology

Figure 1 BibliZap web application input interface. User interface for launching a BibliZap query from the public web app (https://biblizap.org). Users input one or several seed identifiers (PMIDs, DOIs, or Lens IDs) separated by spaces, select the citation depth, direction (backward, forward, or both), and the number of results to retrieve. The application is powered by The Lens and includes a Firefox plugin for quick access from article pages.


Figure 2 Overview of BibliZap’s citation searching process. Starting from one or more seed articles (bottom left), BibliZap performs forward (blue arrow: articles citing the seed) and backward (orange arrow: articles cited by the seed) citation exploration. The first expansion level (Depth 1) retrieves all directly linked articles. These are then used as new seeds to perform a second round of forward and backward citation searching (Depth 2), further expanding the network. In the figure, articles are color-coded to allow visual identification of their propagation through the network. The final output includes all unique articles retrieved across both expansion levels. Each article is assigned a relevance score, defined by the number of times it appears across all citation paths. As illustrated in the right panel, articles that appear most frequently receive the highest scores and are ranked accordingly, enabling transparent prioritization for screening.
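The expansion and scoring described in this caption can be illustrated with a minimal Python sketch. It is not BibliZap's implementation: the in-memory `cites` dictionary stands in for Lens.org data, the function names are hypothetical, and two choices are assumptions made here for illustration (each citation path that reaches an article adds one to its score, and seed articles are dropped from the output).

```python
from collections import Counter

def neighbours(article, cites, direction="both"):
    """Return the articles one citation step away from `article`.

    `cites` maps an article to the list of articles it cites (toy data).
    """
    out = []
    if direction in ("backward", "both"):
        out.extend(cites.get(article, []))  # articles cited by `article`
    if direction in ("forward", "both"):
        out.extend(a for a, refs in cites.items() if article in refs)  # articles citing it
    return out

def citation_search(seeds, cites, depth=2, direction="both"):
    """Multi-level citation search with occurrence-count scoring (a sketch)."""
    scores = Counter()
    frontier = list(seeds)
    for _ in range(depth):
        next_frontier = []
        for article in frontier:
            for hit in neighbours(article, cites, direction):
                scores[hit] += 1            # one more path reaches `hit`
                next_frontier.append(hit)   # `hit` becomes a seed for the next level
        frontier = next_frontier
    for seed in seeds:                      # assumption: seeds are not reported back
        scores.pop(seed, None)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy citation graph: S cites A and B; C cites S and A; D cites A; A cites E.
cites = {"S": ["A", "B"], "C": ["S", "A"], "D": ["A"], "A": ["E"]}
ranked = citation_search(["S"], cites, depth=2, direction="both")
print(ranked)
```

In this toy graph, A and C are each reached by two paths and therefore rank above B, D, and E, which are each reached once.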


Figure 3 Example of BibliZap output interface. Screenshot of the web application output table after running a citation query. Retrieved articles are ranked by a cumulative relevance score and presented with their metadata, including DOI, title, journal, first author, publication year, summary, citation count, and score. Users can filter, sort, and download results for further screening or integration into reference management tools.


Figure 4 Proportion of relevant articles retrieved per systematic review, according to the three main citation searching configurations. Each horizontal bar represents one review (N = 66). Stacked segments indicate the proportion of included (gold-standard) articles retrieved using PubMed only (Approach 1, dark blue), PubMed supplemented with the top 500 BibliZap results (Approach 2, grey), and PubMed supplemented with the full BibliZap output (Approach 3, black). Bars are ordered by overall sensitivity within each review.


Figure 5 Proportion of systematic reviews (N = 66) meeting predefined sensitivity thresholds (≥80%, ≥90%, and 100%) across the five search strategies. Bars represent mean proportions with 95% confidence intervals. Approaches include: PubMed only (Approach 1, dark blue), PubMed + BibliZap top 500 (Approach 2, grey), PubMed + BibliZap full output (Approach 3, black), PubMed early stop + BibliZap (Approach 4, light blue), and PubMed late stop + BibliZap (Approach 5, teal).


Figure 6 Cumulative recall curve for the PubMed + full-seed BibliZap strategy (Approach 3). Left panel: cumulative sensitivity is plotted as a function of the number of articles screened, in descending relevance order (Best Match for PubMed; BibliZap relevance score for citation searching). The transition between PubMed (dark blue) and BibliZap (black) corresponds to the point where PubMed screening ends and citation searching begins. Right panel: among gold-standard articles not retrieved by PubMed (Approach 1), the cumulative proportion recovered by BibliZap (Approach 3) is plotted against the number of ranked articles screened. Shaded areas represent 95% confidence intervals.
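A cumulative recall curve of this kind can be computed for a single review along the following lines. This is a hedged sketch, not the authors' analysis code: the function names and the toy identifiers are hypothetical, and it assumes that records already seen in the PubMed phase are not screened again in the BibliZap phase.

```python
def screening_order(pubmed_ranked, biblizap_ranked):
    """Concatenate PubMed then BibliZap results, keeping each record once."""
    seen, order = set(), []
    for rid in list(pubmed_ranked) + list(biblizap_ranked):
        if rid not in seen:  # assumption: duplicates are screened only once
            seen.add(rid)
            order.append(rid)
    return order

def cumulative_recall(ranked_ids, gold_ids):
    """Recall after screening 1, 2, ..., n records in the given order."""
    gold = set(gold_ids)
    if not gold:
        raise ValueError("the gold standard must contain at least one article")
    found, curve = set(), []
    for rid in ranked_ids:
        if rid in gold:
            found.add(rid)
        curve.append(len(found) / len(gold))
    return curve

# Toy data: four gold-standard articles, one of which ("x9") is never retrieved.
pubmed = ["p1", "p2", "p3"]       # PubMed results in Best Match order
biblizap = ["b1", "p2", "b2"]     # BibliZap results in descending score
gold = {"p1", "p3", "b2", "x9"}

order = screening_order(pubmed, biblizap)
curve = cumulative_recall(order, gold)
print(order)
print(curve)
```

The last value of `curve` is the sensitivity of the combined strategy for that review, and the index at which PubMed records end marks the transition between the two phases.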

Supplementary material: File

Bentegeac et al. supplementary material
