A probabilistic semantic analysis of eHealth scientific literature

J Telemed Telecare. 2020 Aug-Sep;26(7-8):414-432. doi: 10.1177/1357633X19846252. Epub 2019 May 12.

Abstract

Introduction: eHealth emerged as an interdisciplinary research area about 70 years ago. This study employs probabilistic techniques to semantically analyse scientific literature related to the field of eHealth in order to identify topics and trends and discuss their comparative evolution.

Methods: The authors collected titles and abstracts of published literature on eHealth as indexed in PubMed. Basic statistical and bibliometric techniques were applied to describe the collected corpus overall; Latent Dirichlet Allocation was employed for unsupervised topic identification; topic trend analysis was performed, and correlation graphs were plotted where relevant.
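As an illustration of the unsupervised topic identification step, the following is a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling. It is not the authors' implementation: the abstract does not state which software, sampler or hyperparameters were used. The function name `fit_lda`, the values of `alpha` and `beta`, the whitespace tokenisation and the four toy documents are all assumptions made for illustration.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (hypothetical helper, illustration only).

    docs is a list of token lists. Returns (vocab, theta, phi), where theta[d]
    is the topic distribution of document d and phi[k] is the word
    distribution of topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_words = len(vocab)
    word_id = {w: i for i, w in enumerate(vocab)}

    doc_topic = [[0] * n_topics for _ in docs]            # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic given all other tokens.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return vocab, theta, phi


# Hypothetical toy "abstracts"; the study corpus held 23,988 PubMed records.
toy_docs = [
    "telemedicine image transmission radiology image".split(),
    "radiology image transmission network".split(),
    "mobile app behaviour change lifestyle".split(),
    "lifestyle behaviour mobile app intervention".split(),
]
vocab, theta, phi = fit_lda(toy_docs, n_topics=2)
```

Each row of `theta` can then be averaged per publication year to obtain a topic's share over time, which is one plausible way to carry out the trend analysis described above.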

Results: A total of 30,425 records on eHealth were retrieved from PubMed (all records up to 31 December 2017, search on 8 May 2018) and 23,988 of these were included in the study corpus. The eHealth domain shows growth higher than that of the entire PubMed corpus, with the eHealth proportion of the corpus increasing by about 7% per year on average over the last 20 years. Probabilistic topic modelling identified 100 meaningful topics, which the authors organised into nine categories: general; service model; disease; medical specialty; behaviour and lifestyle; education; technology; evaluation; and regulatory issues.
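The arithmetic behind these figures can be sketched as follows. The inclusion rate uses the two record counts reported above. The growth calculation is an assumption: the abstract does not say how the mean yearly increase was computed, so this sketch takes it as the average relative year-on-year change in the eHealth share of PubMed. The function names and the yearly counts are hypothetical and are not data from the study.

```python
def yearly_share(ehealth_counts, pubmed_counts):
    """Share of PubMed records per year that belong to the eHealth corpus."""
    return {
        year: ehealth_counts[year] / pubmed_counts[year]
        for year in sorted(ehealth_counts)
    }


def mean_annual_growth(shares):
    """Mean relative year-on-year change of the share (assumed definition)."""
    years = sorted(shares)
    rates = [shares[b] / shares[a] - 1 for a, b in zip(years, years[1:])]
    return sum(rates) / len(rates)


# Reported in the abstract: 23,988 of 30,425 retrieved records were included.
inclusion_rate = 23988 / 30425  # roughly 0.79

# Hypothetical yearly counts, chosen only to exercise the functions.
ehealth = {2015: 1000, 2016: 1100, 2017: 1210}
pubmed = {2015: 1_000_000, 2016: 1_000_000, 2017: 1_000_000}

growth = mean_annual_growth(yearly_share(ehealth, pubmed))
```

With these made-up counts the share rises by 10% each year, so `growth` is about 0.10; a value near 0.07 would correspond to the roughly 7% per year reported above, if the authors used this definition.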

Discussion: Trend analysis shows a continuous shift in focus. Early emphasis on medical image transmission and system integration has been replaced by increased focus on standards, wearables and sensor devices, now giving way to mobile applications, social media and data analytics. Attention to disease is also shifting, from the initial popularity of surgery, trauma and acute heart disease, to the emergence of chronic disease support, and the recent attention to cancer, infectious disease, mental disorders, paediatrics and perinatal care; most interestingly, the current swift increase is in research related to lifestyle and behaviour change. The steady growth of all topics related to assessment and various systematic evaluation techniques indicates a maturing research field that is moving towards real-world application.

Keywords: Latent Dirichlet Allocation; eHealth; topic modelling; trends analysis.

MeSH terms

  • Bibliometrics*
  • Chronic Disease
  • Female
  • Humans
  • Mobile Applications / trends
  • Pregnancy
  • Semantics*
  • Telemedicine / trends*
  • Wearable Electronic Devices / trends