Identification of clinical disease trajectories in neurodegenerative disorders with natural language processing

Nat Med. 2024 Apr;30(4):1143-1153. doi: 10.1038/s41591-024-02843-9. Epub 2024 Mar 12.

Abstract

Neurodegenerative disorders exhibit considerable clinical heterogeneity and are frequently misdiagnosed. This heterogeneity is often neglected and difficult to study. Therefore, innovative data-driven approaches utilizing substantial autopsy cohorts are needed to address this complexity and improve diagnosis, prognosis and fundamental research. We present clinical disease trajectories from 3,042 Netherlands Brain Bank donors, encompassing 84 neuropsychiatric signs and symptoms identified through natural language processing. This unique resource provides valuable new insights into neurodegenerative disorder symptomatology. To illustrate, we identified signs and symptoms that differed between frequently misdiagnosed disorders. In addition, we performed predictive modeling and identified clinical subtypes of various brain disorders, indicative of neural substructures being differently affected. Finally, integrating clinical diagnosis information revealed a substantial proportion of inaccurately diagnosed donors whose disorder masqueraded as another disorder. The unique datasets allow researchers to study the clinical manifestation of signs and symptoms across neurodegenerative disorders, and identify associated molecular and cellular features.

MeSH terms

  • Humans
  • Natural Language Processing*
  • Netherlands / epidemiology
  • Neurodegenerative Diseases* / diagnosis
  • Neurodegenerative Diseases* / genetics