Natural Language Processing (NLP) is an ever-growing field of computational science that aims to model natural human language. Combined with advances in machine learning, which learns patterns in data, it offers practical capabilities including automated language analysis. These approaches offer fast, replicable and objective methods, and have garnered interest from clinical researchers seeking to understand the breakdown of language due to pathological changes in the brain. The study of Alzheimer's disease (AD) and preclinical Mild Cognitive Impairment (MCI) suggests that changes in discourse (connected speech or writing) may be key to early detection of disease. There is currently no disease-modifying treatment for AD, the leading cause of dementia in people over the age of 65, but detecting those at risk of developing the disease could help with the identification and testing of medications that can take effect before the underlying pathology has irreversibly spread. We outline important components of natural language, as well as the NLP tools and approaches with which they can be extracted, analysed and used for disease identification and risk prediction. We review literature using these tools to model discourse across the spectrum of AD, including the contribution of machine learning approaches and Automatic Speech Recognition (ASR). We conclude that NLP and machine learning techniques are starting to greatly enhance research in the field, with measurable and quantifiable language components showing promise for early detection of disease, but that research and practical challenges remain for clinical implementation of these approaches. Challenges discussed include the availability of large and diverse datasets, ethics of data collection and sharing, diagnostic specificity and clinical acceptability.
Keywords: Alzheimer's disease; Discourse; Machine learning; Mild Cognitive Impairment; Natural Language Processing.
Copyright © 2020 Elsevier Ltd. All rights reserved.