Machine Learning for Automatic Detection of Velopharyngeal Dysfunction: A Preliminary Report

J Craniofac Surg. 2024 May 6. doi: 10.1097/SCS.0000000000010147. Online ahead of print.

Abstract

Background: Even after palatoplasty, the incidence of velopharyngeal dysfunction (VPD) can reach 30%; however, these estimates arise from high-income countries (HICs), where speech-language pathologists (SLPs) are part of standardized cleft teams. The VPD burden in low- and middle-income countries (LMICs) is unknown. This study aims to develop a machine-learning model that can detect the presence of VPD using audio samples alone.

Methods: Case and control audio samples were obtained from institutional and publicly available sources. A machine-learning model was built using Python software.

Results: The initial 110 audio samples used to train and test the model were retested after format conversion and file deidentification. Each sample was tested 5 times, yielding a precision of 100%. Sensitivity was 92.73% (95% CI: 82.41%-97.98%) and specificity was 98.18% (95% CI: 90.28%-99.95%). One hundred thirteen prospective samples, which the model had not previously encountered, were then tested. Precision was again 100%, with a sensitivity of 88.89% (95% CI: 78.44%-95.41%) and a specificity of 66% (95% CI: 51.23%-78.79%).
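
For readers who want to check the arithmetic behind these figures, the following is a minimal sketch of how sensitivity, specificity, and an exact (Clopper-Pearson) 95% confidence interval can be computed from raw counts. The abstract does not report the underlying counts or name the interval method, so both are assumptions here: the counts (51/55, 54/55, 56/63, 33/50) are inferred from the reported percentages and sample totals, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, iters: int = 100) -> float:
    """Bisection root of a function that rises from negative to positive on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) confidence interval for the proportion x/n."""
    # Lower bound: the p at which P(X >= x) equals alpha/2 (0 when x == 0).
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= x) equals alpha/2 (1 when x == n).
    upper = 1.0 if x == n else solve_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p)
    )
    return lower, upper


# ASSUMPTION: counts inferred from the reported percentages; the abstract
# does not state them. Each entry is (correctly classified, group size).
results = {
    "retest sensitivity": (51, 55),
    "retest specificity": (54, 55),
    "prospective sensitivity": (56, 63),
    "prospective specificity": (33, 50),
}

for name, (x, n) in results.items():
    lo, hi = clopper_pearson(x, n)
    print(f"{name}: {x / n:.2%} (95% CI: {lo:.2%}-{hi:.2%})")
```

If the reported intervals were computed with an exact binomial method, the printed bounds should fall close to those in the abstract; a different interval method (for example, Wilson) would give somewhat different bounds.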

Discussion: VPD affects nearly 100% of patients with unrepaired overt soft palatal clefts and up to 30% of patients who have undergone palatoplasty. VPD can render patients unintelligible, causing significant psychosocial morbidity. The true burden of VPD in LMICs is unknown and likely exceeds estimates from HICs. A phone-based machine-learning screening model could expand access to diagnostic, and potentially therapeutic, modalities for the many patients worldwide who suffer from VPD.