The added value of including thyroid nodule features into large language models for automatic ACR TI-RADS classification based on ultrasound reports

Jpn J Radiol. 2024 Nov 25. doi: 10.1007/s11604-024-01707-z. Online ahead of print.

Abstract

Objective: The ACR Thyroid Imaging, Reporting, and Data System (TI-RADS) uses a score based on ultrasound (US) imaging to stratify the risk of nodule malignancy and recommend appropriate follow-up. This study analyzes US reports and explores how natural language processing (NLP) with Transformer models can classify ACR TI-RADS categories from text reports using the described thyroid nodule features.
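As an illustration of the scoring that the classifier is meant to reproduce, the following minimal sketch adds up ACR TI-RADS points for the five feature groups and maps the total to a category. The point values follow the published ACR TI-RADS chart; the dictionary keys and function names are illustrative and do not come from the study. The chart assigns no category to a total of exactly 1 point, so treating it as TR1 here is an assumption.

```python
# Point values per the published ACR TI-RADS chart.
# Keys and function names are illustrative labels, not identifiers from the study.
POINTS = {
    "composition": {"cystic": 0, "spongiform": 0, "mixed": 1, "solid": 2},
    "echogenicity": {"anechoic": 0, "hyperechoic": 1, "isoechoic": 1,
                     "hypoechoic": 2, "very_hypoechoic": 3},
    "shape": {"wider_than_tall": 0, "taller_than_wide": 3},
    "margin": {"smooth": 0, "ill_defined": 0, "lobulated": 2,
               "irregular": 2, "extrathyroidal_extension": 3},
}
FOCI_POINTS = {"none": 0, "large_comet_tail": 0, "macrocalcifications": 1,
               "peripheral_rim": 2, "punctate": 3}


def tirads_points(composition, echogenicity, shape, margin, foci=()):
    """Sum the points of one nodule; several echogenic foci types may coexist."""
    if composition == "spongiform":
        # The chart assigns 0 points and adds nothing for the other groups.
        return 0
    total = (POINTS["composition"][composition]
             + POINTS["echogenicity"][echogenicity]
             + POINTS["shape"][shape]
             + POINTS["margin"][margin])
    # Echogenic foci are additive: every type present contributes its points.
    total += sum(FOCI_POINTS[f] for f in foci)
    return total


def tirads_category(points):
    """Map a point total to a TI-RADS category (1 to 5)."""
    if points >= 7:
        return 5
    if points >= 4:
        return 4
    if points == 3:
        return 3
    if points == 2:
        return 2
    return 1  # 0 points; a total of 1 is treated as TR1 here (assumption)


if __name__ == "__main__":
    pts = tirads_points("solid", "hypoechoic", "wider_than_tall", "smooth",
                        foci=("punctate",))
    print(pts, tirads_category(pts))
```

In this sketch a solid, hypoechoic nodule with punctate echogenic foci already reaches 7 points, which shows why that single feature weighs so heavily in the final category.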

Materials and methods: This retrospective study evaluated 16,847 free-text thyroid reports from our institution. An automated system, followed by manual review by a radiologist, established baseline annotations by assigning ACR TI-RADS categories from 1 to 5. Two types of systems were evaluated and compared on the dataset. The first performed multiclass classification to predict the associated ACR TI-RADS category; the second extracted thyroid nodule features from the textual reports and incorporated them into the classifier.
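The abstract does not state how the nodule features were extracted or how they were passed to the classifier. The sketch below shows one plausible approach under stated assumptions: a small, hypothetical keyword lexicon finds feature mentions with regular expressions, and the matches are appended to the report as bracketed tags so that a text classifier sees them explicitly. The patterns, tag format, and function names are all illustrative, and the Transformer classifier itself is left out.

```python
import re

# Hypothetical keyword patterns (assumption); a real system would need a much
# richer lexicon in the language of the reports.
FEATURE_PATTERNS = {
    "composition": r"\b(solid|cystic|spongiform|mixed)\b",
    "echogenicity": r"\b(very hypoechoic|hypoechoic|isoechoic|hyperechoic|anechoic)\b",
    "shape": r"\b(taller-than-wide|wider-than-tall)\b",
    "margin": r"\b(smooth|ill-defined|lobulated|irregular)\b",
    "echogenic_foci": r"\b(punctate echogenic foci|macrocalcifications|peripheral calcifications)\b",
}


def extract_features(report):
    """Return the first mention found for each feature group, if any."""
    text = report.lower()
    found = {}
    for name, pattern in FEATURE_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            found[name] = match.group(1)
    return found


def augment_report(report):
    """Append extracted features as bracketed tags to the original report text."""
    features = extract_features(report)
    tags = " ".join(f"[{name}={value}]" for name, value in features.items())
    return f"{report} {tags}".strip()


if __name__ == "__main__":
    example = ("Solid hypoechoic nodule with irregular margins "
               "and punctate echogenic foci.")
    print(extract_features(example))
    print(augment_report(example))
```

The first type of system would receive the raw report, while the second would receive the augmented string (or the features through some other channel), which makes the two setups directly comparable on the same labels.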

Results: Models enhanced with nodule features systematically outperformed those without. In particular, the BERTIN model with added features achieved the highest accuracy, 0.8426. Moreover, the presence of punctate echogenic foci, a feature often linked to malignant thyroid lesions, correlated with higher ACR TI-RADS scores.

Conclusions: The thyroid nodule features described in thyroid US reports, such as composition, echogenicity, shape, margin, and echogenic foci, help the NLP classifier predict the associated ACR TI-RADS category more accurately.

Keywords: Natural language processing; TI-RADS; Thyroid nodule features; Transformers; US reports.