The added value of including thyroid nodule features into large language models for automatic ACR TI-RADS classification based on ultrasound reports

Pilar López-Úbeda; Teodoro Martín-Noguerol; Alba Ruiz-Vinuesa; Antonio Luna

doi:10.1007/s11604-024-01707-z

Jpn J Radiol. 2024 Nov 25. doi: 10.1007/s11604-024-01707-z. Online ahead of print.

Authors

Pilar López-Úbeda¹, Teodoro Martín-Noguerol², Alba Ruiz-Vinuesa³, Antonio Luna²

Affiliations

¹ NLP Department, HT Médica, Carmelo Torres 2, 23007, Jaén, Spain. p.lopez@htmedica.com.
² MRI Unit, Radiology Department, HT Médica, Carmelo Torres 2, 23007, Jaén, Spain.
³ Escuela de Ingeniería de Fuenlabrada, Universidad Rey Juan Carlos, Cam. del Molino, 5, 28942, Fuenlabrada, Madrid, Spain.

PMID: 39585560
DOI: 10.1007/s11604-024-01707-z

Abstract

Objective: The ACR Thyroid Imaging, Reporting, and Data System (TI-RADS) uses a score based on ultrasound (US) imaging to stratify the risk of nodule malignancy and recommend appropriate follow-up. This study aims to analyze US reports and explore how Natural Language Processing (NLP) with Transformer models can assign ACR TI-RADS categories from text reports using the description of thyroid nodule features.
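As an illustration of how the ACR TI-RADS score is built from nodule features, the following is a minimal sketch in Python. The point values follow the published ACR TI-RADS chart; the function names, the feature labels used as dictionary keys, and the choice to map a total of 1 point to TR1 are assumptions made for this example and are not taken from the study.

```python
# Sketch of ACR TI-RADS point scoring (illustrative; names are hypothetical).
POINTS = {
    "composition": {"cystic": 0, "spongiform": 0, "mixed": 1, "solid": 2},
    "echogenicity": {"anechoic": 0, "hyperechoic": 1, "isoechoic": 1,
                     "hypoechoic": 2, "very hypoechoic": 3},
    "shape": {"wider-than-tall": 0, "taller-than-wide": 3},
    "margin": {"smooth": 0, "ill-defined": 0, "lobulated": 2,
               "irregular": 2, "extrathyroidal extension": 3},
}
# Echogenic foci are additive: every type present contributes its points.
FOCI_POINTS = {"none": 0, "large comet-tail": 0, "macrocalcifications": 1,
               "peripheral": 2, "punctate": 3}


def tirads_points(composition, echogenicity, shape, margin, foci=()):
    """Sum the points of the five feature categories."""
    if composition == "spongiform":
        # The chart assigns no further points to spongiform nodules.
        return 0
    total = (POINTS["composition"][composition]
             + POINTS["echogenicity"][echogenicity]
             + POINTS["shape"][shape]
             + POINTS["margin"][margin])
    total += sum(FOCI_POINTS[f] for f in foci)
    return total


def tirads_level(points):
    """Map a point total to a TI-RADS level from 1 to 5."""
    if points <= 1:  # assumption: a total of 1 is grouped with TR1
        return 1
    if points == 2:
        return 2
    if points == 3:
        return 3
    if points <= 6:
        return 4
    return 5


base = tirads_points("solid", "hypoechoic", "wider-than-tall", "smooth")
with_foci = tirads_points("solid", "hypoechoic", "wider-than-tall", "smooth",
                          foci=("punctate",))
print(base, tirads_level(base))
print(with_foci, tirads_level(with_foci))
```

In this example, adding punctate echogenic foci to an otherwise identical nodule contributes 3 points, which is enough to move it from TR4 to TR5.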

Materials and methods: This retrospective study evaluated 16,847 free-text thyroid US reports from our institution. An automated system, followed by manual review by a radiologist, established baseline annotations by assigning ACR TI-RADS categories from 1 to 5. Two types of systems were evaluated and compared on the dataset. The first performed multiclass classification to predict the associated ACR TI-RADS category. The second extracted thyroid nodule features from the text reports and incorporated them into the classifier.
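The abstract does not specify how the features were extracted or how they were incorporated into the classifier. Purely as an illustration of the second type of system, the sketch below uses keyword matching to pull feature mentions out of a report and prepends them to the text that would be passed to a Transformer classifier. The patterns, the English sample report, the function names, and the `[SEP]` separator are all hypothetical and should not be read as the authors' implementation.

```python
import re

# Hypothetical keyword patterns; the study's extractor and report language
# may differ. Longer alternatives come first so they are preferred.
FEATURE_PATTERNS = {
    "composition": r"\b(solid|cystic|spongiform|mixed)\b",
    "echogenicity": r"\b(very hypoechoic|hypoechoic|isoechoic|hyperechoic|anechoic)\b",
    "shape": r"\b(taller-than-wide|wider-than-tall)\b",
    "margin": r"\b(smooth|ill-defined|lobulated|irregular)\b",
    "echogenic_foci": r"\b(punctate echogenic foci|macrocalcifications|peripheral calcifications)\b",
}


def extract_features(report):
    """Return the first mention of each feature category found in the report."""
    text = report.lower()
    features = {}
    for name, pattern in FEATURE_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            features[name] = match.group(1)
    return features


def augment_report(report, sep=" [SEP] "):
    """Prepend the extracted features to the report as extra classifier input."""
    features = extract_features(report)
    prefix = "; ".join(f"{name}: {value}" for name, value in features.items())
    return f"{prefix}{sep}{report}" if prefix else report


sample = ("Solid hypoechoic nodule in the right lobe with irregular margins "
          "and punctate echogenic foci.")
print(extract_features(sample))
print(augment_report(sample))
```

Concatenating features with the original text is only one possible design. Encoding them as separate numeric inputs alongside the text embedding would be another, and the abstract does not say which approach was used.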

Results: Models enhanced with nodule-specific features consistently outperformed those without. In particular, the BERTIN model with additional features achieved the highest accuracy, 0.8426. Moreover, we found a correlation between the presence of punctate echogenic foci, a feature often linked to malignant thyroid lesions, and increased ACR TI-RADS scores.

Conclusions: The thyroid nodule features described in thyroid US reports, such as composition, echogenicity, shape, margin, or echogenic foci, help the NLP classifier predict the associated ACR TI-RADS category more accurately.

Keywords: Natural language processing; TI-RADS; Thyroid nodule features; Transformers; US reports.

Grants and funding

PTQ2021-012120/Ministerio de Ciencia e Innovación