Optimizing a Classification Model to Evaluate Individual Susceptibility in Noise-Induced Hearing Loss: Cross-Sectional Study

JMIR Public Health Surveill. 2024 Nov 14:10:e60373. doi: 10.2196/60373.

Abstract

Background: Noise-induced hearing loss (NIHL), one of the leading causes of hearing loss in young adults, is a major health care problem with negative social and economic consequences. It is widely recognized that susceptibility varies considerably among individuals exposed to similar noise. An objective method is therefore needed to identify those who are extremely sensitive to noise in noise-exposed jobs, so that severe NIHL can be prevented.

Objective: This study aims to determine an optimal model for detecting individuals susceptible or resistant to NIHL and further explore phenotypic traits uniquely associated with their susceptibility profiles.

Methods: Cross-sectional data on hearing loss caused by occupational noise were collected from 2015 to 2021 at shipyards in Shanghai, China. Six methods identified through a literature review were applied, and their performance in classifying participants as susceptible or resistant to NIHL was evaluated. A machine learning (ML)-based diagnostic model using frequencies from 0.25 to 12.5 kHz was developed to determine the most reliable frequencies on the basis of accuracy and area under the curve. An optimal method using the most reliable frequencies was then constructed to detect individuals who were susceptible versus resistant to NIHL. Phenotypic characteristics such as age, exposure time, cumulative noise exposure, and hearing thresholds (HTs) were explored to characterize these groups.
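The frequency-selection step described above can be illustrated with a minimal sketch. The abstract does not specify the ML algorithm, so the sketch substitutes a simple threshold rule on the mean HT of each frequency pair and ranks the pairs by area under the curve and accuracy. The function names, the 25 dB HL cutoff, and the four toy audiograms are all hypothetical and are not taken from the study.

```python
from itertools import combinations


def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, cutoff):
    """Share of subjects classified correctly when score >= cutoff means NIHL."""
    preds = [1 if s >= cutoff else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def rank_frequency_pairs(audiograms, labels, cutoff=25.0):
    """Rank every frequency pair by the AUC, then accuracy, of its mean HT.

    audiograms: list of dicts mapping frequency (kHz) to HT (dB HL).
    cutoff: hypothetical decision threshold in dB HL (an assumption).
    """
    freqs = sorted(audiograms[0])
    results = []
    for f1, f2 in combinations(freqs, 2):
        scores = [(a[f1] + a[f2]) / 2 for a in audiograms]
        results.append(((f1, f2), auc(scores, labels),
                        accuracy(scores, labels, cutoff)))
    return sorted(results, key=lambda r: (r[1], r[2]), reverse=True)


# Made-up audiograms for illustration only (1 = NIHL, 0 = normal hearing).
toy_audiograms = [
    {1.0: 5, 4.0: 50, 12.5: 60},
    {1.0: 0, 4.0: 30, 12.5: 40},
    {1.0: 30, 4.0: 10, 12.5: 5},
    {1.0: 20, 4.0: 15, 12.5: 10},
]
toy_labels = [1, 1, 0, 0]
ranked = rank_frequency_pairs(toy_audiograms, toy_labels)
```

In this toy data, the 4 and 12.5 kHz pair ranks first because its mean HT separates the two label groups completely; a real analysis would use a fitted model and held-out data rather than a fixed cutoff.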

Results: A total of 6276 participants (median age 41, IQR 33-47 years; n=5372, 85.6% men) were included in the analysis. The ML-based NIHL diagnostic model with misclassified subjects showed the best performance for identifying workers in the NIHL-susceptible group (NIHL-SG) and NIHL-resistant group (NIHL-RG). The mean HTs at 4 and 12.5 kHz showed the highest predictive value for detecting those in the NIHL-SG and NIHL-RG (accuracy=0.78 and area under the curve=0.81). Individuals in the NIHL-SG selected by the optimized model were younger than those in the NIHL-RG (median 28, IQR 25-31 years vs median 35, IQR 32-39 years; P<.001), with a shorter duration of noise exposure (median 5, IQR 2-8 years vs median 8, IQR 4-12 years; P<.001) and lower cumulative noise exposure (median 90, IQR 86-92 dBA-years vs median 92.2, IQR 89.2-94.7 dBA-years; P<.001) but greater HTs (4 and 12.5 kHz; median 58.8, IQR 53.8-63.8 dB HL vs median 8.8, IQR 7.5-11.3 dB HL; P<.001).
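The abstract reports cumulative noise exposure in dBA-years without defining it. A definition commonly used in the occupational noise literature is CNE = 10 log10[(1/Tref) Σ Ti × 10^(Li/10)], where Li is the 8-hour equivalent A-weighted level of job i, Ti is the years spent in that job, and Tref is 1 year. Whether the study used exactly this form is an assumption; the sketch below, with hypothetical function names, only shows how the metric combines noise level and duration.

```python
import math


def cumulative_noise_exposure(laeq_8h_dba, years):
    """CNE in dBA-years for one constant exposure level (assumed Tref = 1 year)."""
    return laeq_8h_dba + 10 * math.log10(years)


def cumulative_noise_exposure_multi(jobs):
    """CNE across several jobs, each given as (LAeq,8h in dBA, years)."""
    # Sum the sound energy weighted by years, then convert back to decibels.
    energy = sum(years * 10 ** (level / 10) for level, years in jobs)
    return 10 * math.log10(energy)


# Hypothetical worker: 10 years at a constant 85 dBA.
single_job_cne = cumulative_noise_exposure(85, 10)
```

Because duration enters logarithmically, doubling the years of exposure at a fixed level raises the CNE by only about 3 dBA-years, which is why groups with different exposure times can still have similar CNE values.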

Conclusions: An ML-based NIHL diagnostic model with misclassified subjects, using the mean HTs at 4 and 12.5 kHz, was the most reliable method for identifying individuals susceptible or resistant to NIHL. However, further studies are needed to determine the genetic factors that govern NIHL susceptibility.

Keywords: extended high frequencies; genetic heterogeneity; linear regression; machine learning algorithms; noise-induced hearing loss; phenotypic characteristics; resistance; susceptible.

MeSH terms

  • Adult
  • China / epidemiology
  • Cross-Sectional Studies
  • Disease Susceptibility
  • Female
  • Hearing Loss, Noise-Induced* / diagnosis
  • Hearing Loss, Noise-Induced* / epidemiology
  • Humans
  • Male
  • Middle Aged
  • Noise, Occupational / adverse effects
  • Noise, Occupational / statistics & numerical data