Development and Validation of a Literature Screening Tool: Few-Shot Learning Approach in Systematic Reviews

Phongphat Wiwatthanasetthakarn; Wanchana Ponthongmak; Panu Looareesuwan; Amarit Tansawet; Pawin Numthavaj; Gareth J McKay; John Attia; Ammarin Thakkinstian

doi:10.2196/56863

J Med Internet Res. 2024 Dec 11;26:e56863. doi: 10.2196/56863.

Authors

Phongphat Wiwatthanasetthakarn¹, Wanchana Ponthongmak¹, Panu Looareesuwan¹, Amarit Tansawet², Pawin Numthavaj¹, Gareth J McKay³, John Attia⁴, Ammarin Thakkinstian¹

Affiliations

¹ Department of Clinical Epidemiology and Biostatistics, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand.
² Department of Research and Medical Innovation, Faculty of Medicine Vajira Hospital, Navamindradhiraj University, Bangkok, Thailand.
³ Centre for Public Health, Queen's University Belfast, Belfast, United Kingdom.
⁴ Centre for Clinical Epidemiology and Biostatistics, School of Medicine and Public Health, University of Newcastle, New South Wales, Australia.

PMID: 39662894
DOI: 10.2196/56863

Abstract

Background: Systematic reviews (SRs) are considered the highest level of evidence, but their rigorous literature screening process can be time-consuming and resource-intensive. This is particularly challenging given the rapid pace of medical advancements, which can quickly make SRs outdated. Few-shot learning (FSL), a machine learning approach that learns effectively from limited data, offers a potential solution to streamline this process. Sentence-bidirectional encoder representations from transformers (S-BERT) is particularly promising for identifying relevant studies from only a few examples.

Objective: This study aimed to develop a model framework using FSL to efficiently screen and select relevant studies for inclusion in SRs, with the goal of reducing workload while maintaining high recall rates.

Methods: We developed and validated the FSL model framework using 9 previously published SR projects (2016-2018). The framework used S-BERT with titles and abstracts as input data. Key evaluation metrics, including workload reduction, cosine similarity score, and the number needed to screen at 100% recall, were estimated to determine the optimal number of eligible studies for model training. A prospective evaluation phase involving 4 ongoing SRs was then conducted. Study selections by the FSL model and by a secondary reviewer were each compared with those of the principal reviewer (considered the gold standard) to estimate the false negative rate.
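
The evaluation metrics named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-dimensional vectors below are toy stand-ins for S-BERT embeddings of titles and abstracts, scoring each candidate by its maximum cosine similarity to any seed study is an assumed aggregation rule, and all function and variable names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms

def score_candidates(seed_vecs, cand_vecs):
    """Score each candidate by its best similarity to any seed (eligible) study.

    Assumption: max aggregation; the abstract does not state how scores are combined.
    """
    return [max(cosine(c, s) for s in seed_vecs) for c in cand_vecs]

def screening_metrics(scores, labels):
    """Return (threshold, number needed to screen, workload reduction) at 100% recall.

    labels: 1 for studies the gold standard marks eligible, 0 otherwise.
    """
    # The lowest score among eligible studies is the highest threshold that keeps them all.
    threshold = min(s for s, y in zip(scores, labels) if y == 1)
    # Every study at or above the threshold must still be screened by a human.
    nns = sum(1 for s in scores if s >= threshold)
    reduction = 1 - nns / len(scores)
    return threshold, nns, reduction

# Toy embeddings (hypothetical): 2 seed studies, 5 candidate studies.
seed_vecs = [[1.0, 0.0], [0.8, 0.6]]
cand_vecs = [[1.0, 0.1], [0.6, 0.8], [0.0, 1.0], [-1.0, 0.0], [0.1, 1.0]]
labels = [1, 1, 0, 0, 0]

scores = score_candidates(seed_vecs, cand_vecs)
threshold, nns, reduction = screening_metrics(scores, labels)
print(f"threshold={threshold:.3f}, number needed to screen={nns}, "
      f"workload reduction={reduction:.0%}")
```

In this toy case both eligible studies rank above every ineligible one, so only 2 of 5 candidates need manual screening. In practice the threshold for 100% recall is only known retrospectively, which is why the prospective phase fixed a workload target instead.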

Results: Model development suggested an optimal range of 4-12 eligible studies for FSL training. Using 4-6 eligible studies during model development resulted in similarity thresholds for 100% recall ranging from 0.432 to 0.636, corresponding to a workload reduction of 51.11% (95% CI 46.36-55.86) to 97.67% (95% CI 96.76-98.58). The prospective evaluation of 4 SRs aimed for a 50% workload reduction, yielding numbers needed to screen of 497 to 1035 out of 995 to 2070 studies. The false negative rate ranged from 1.87% to 12.20% for the FSL model and from 5% to 56.48% for the secondary reviewer compared with the principal reviewer.
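
As a quick arithmetic check on the prospective figures, the sketch below computes workload reduction from the reported numbers needed to screen. Pairing 497 with 995 and 1035 with 2070 assumes the range endpoints belong to the same reviews, which the abstract does not state. The false negative rate is computed here as the share of gold-standard eligible studies that a screener missed, which is an assumed definition, and the study IDs are hypothetical.

```python
def workload_reduction(n_screened, n_total):
    """Fraction of studies that did not need manual screening."""
    return 1 - n_screened / n_total

def false_negative_rate(gold_included, screener_included):
    """Share of gold-standard eligible studies the screener failed to include.

    Assumed definition: missed eligible studies / all gold-standard eligible studies.
    """
    missed = gold_included - screener_included
    return len(missed) / len(gold_included)

# Reported range endpoints; the pairing is an assumption.
for nns, total in [(497, 995), (1035, 2070)]:
    print(f"{nns}/{total}: {workload_reduction(nns, total):.2%} workload reduction")
# 497/995: 50.05% workload reduction
# 1035/2070: 50.00% workload reduction

# Hypothetical study IDs, for illustration only.
gold = {"s1", "s2", "s3", "s4"}
model = {"s1", "s2", "s3", "s9"}
print(f"false negative rate: {false_negative_rate(gold, model):.2%}")
# false negative rate: 25.00%
```

Both endpoint pairs are consistent with the stated 50% workload target.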

Conclusions: Our FSL framework demonstrates the potential for reducing workload in SR screening by over 50%. However, the model did not achieve 100% recall at this threshold, highlighting the potential for omitting eligible studies. Future work should focus on developing a web application to implement the FSL framework, making it accessible to researchers.

Keywords: S-BERT; deep learning; few shots learning; natural language processing; sentence-bidirectional encoder representations from transformers; study selection; systematic review.

©Phongphat Wiwatthanasetthakarn, Wanchana Ponthongmak, Panu Looareesuwan, Amarit Tansawet, Pawin Numthavaj, Gareth J McKay, John Attia, Ammarin Thakkinstian. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 11.12.2024.

Publication types

Validation Study

MeSH terms

Humans
Machine Learning*
Systematic Reviews as Topic*