Predictors of Medical and Dental Clinic Closure by Machine Learning Methods: Cross-Sectional Study Using Empirical Data

J Med Internet Res. 2024 Aug 30:26:e46608. doi: 10.2196/46608.

Abstract

Background: Small clinics are important in providing health care in local communities. Accurately predicting their closure would help manage health care resource allocation. There have been few studies on the prediction of clinic closure using machine learning techniques.

Objective: This study aims to test the feasibility of predicting the closure of medical and dental clinics (MCs and DCs, respectively) and investigate important factors associated with their closure using machine learning techniques.

Methods: The units of analysis were MCs and DCs, and the data source was health insurance administrative data. The study included clinics that were operating or closed between January 1, 2020, and December 31, 2021. All closed clinics were included, and closed and operating clinics were matched at a ratio of 1:2 by locality using propensity score matching based on logistic regression. This study started with 23 and 19 candidate variables to predict the closure of MCs and DCs, respectively. Key variables were extracted using permutation importance and the sequential feature selection technique, leaving 5 and 6 variables for MCs and DCs, respectively, for model learning. Four machine learning techniques were used: (1) logistic regression, (2) support vector machine, (3) random forest (RF), and (4) Extreme Gradient Boost. Model accuracy was evaluated using the area under the curve (AUC), and the factors that most strongly affected closure were identified. This study used SAS (version 9.4; SAS Institute Inc) and Python (version 3.7.9; Python Software Foundation).
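The 1:2 matching step can be illustrated with a minimal sketch. It assumes propensity scores have already been estimated (for example, by logistic regression) and pairs each closed clinic with its two nearest operating clinics by greedy nearest-neighbor matching without replacement. The function name `match_one_to_two`, the clinic identifiers, and the scores are hypothetical, and the abstract does not state which matching algorithm the authors used.

```python
def match_one_to_two(closed_scores, open_scores, ratio=2):
    """Greedy nearest-neighbor matching without replacement.

    closed_scores, open_scores: dicts mapping a clinic id to its
    propensity score (assumed to be estimated beforehand).
    Returns a dict mapping each closed clinic id to a list of
    matched operating clinic ids.
    """
    available = dict(open_scores)  # operating clinics not yet matched
    matches = {}
    # Process closed clinics in order of ascending propensity score.
    for closed_id, score in sorted(closed_scores.items(), key=lambda kv: kv[1]):
        # Rank remaining operating clinics by score distance (id breaks ties).
        nearest = sorted(
            available,
            key=lambda open_id: (abs(available[open_id] - score), open_id),
        )[:ratio]
        matches[closed_id] = nearest
        for open_id in nearest:
            del available[open_id]  # each control is used at most once
    return matches


# Hypothetical propensity scores, for illustration only.
closed = {"c1": 0.30, "c2": 0.62}
operating = {"o1": 0.28, "o2": 0.33, "o3": 0.60, "o4": 0.65, "o5": 0.90}

matched = match_one_to_two(closed, operating)
print(matched)  # {'c1': ['o1', 'o2'], 'c2': ['o3', 'o4']}
```

Greedy matching is simple but order dependent; optimal matching or a caliper on the score distance are common alternatives when the control pool is small.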

Results: The best-fit model for the closure of MCs with cross-validation was the support vector machine (AUC 0.762, 95% CI 0.746-0.777; P<.001) followed by RF (AUC 0.736, 95% CI 0.720-0.752; P<.001). The best-fit model for DCs was Extreme Gradient Boost (AUC 0.700, 95% CI 0.675-0.725; P<.001) followed by RF (AUC 0.687, 95% CI 0.661-0.712; P<.001). The most significant factor associated with the closure of MCs was years of operation, followed by population growth, population, and percentage of medical specialties. In contrast, the main factor affecting the closure of DCs was the number of patients, followed by annual variation in the number of patients, years of operation, and percentage of dental specialists.
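For readers unfamiliar with how an AUC and its 95% CI can be obtained, the following sketch computes the AUC as the Mann-Whitney probability that a closed clinic receives a higher predicted score than an operating one, and an approximate CI using the Hanley-McNeil standard error. The abstract does not say how the reported intervals were derived, so the CI method here is an assumption, and the labels and scores are made up for illustration.

```python
import math


def auc_mann_whitney(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns (auc, n_pos, n_neg).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg)), len(pos), len(neg)


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC using the Hanley-McNeil standard error."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(variance)
    # Clip to the valid AUC range [0, 1].
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical data: 1 = closed clinic, 0 = operating clinic.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc, n_pos, n_neg = auc_mann_whitney(labels, scores)
lower, upper = hanley_mcneil_ci(auc, n_pos, n_neg)
print(f"AUC {auc:.3f}, 95% CI {lower:.3f}-{upper:.3f}")
```

With only six observations the interval is very wide; the narrow intervals reported above reflect the much larger number of clinics in the study.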

Conclusions: This study showed that machine learning methods are useful tools for predicting the closure of small medical facilities with a moderate level of accuracy. Essential factors affecting medical facility closure also differed between MCs and DCs. Developing good models would prevent unnecessary medical facility closures at the national level.

Keywords: artificial intelligence; clinic bankruptcy; clinic closure; health clinic; health facility closure; health insurance; healthcare resources; hospital bankruptcy; hospital closure; machine learning; medical clinic; prediction.

MeSH terms

  • Cross-Sectional Studies
  • Dental Clinics / statistics & numerical data
  • Female
  • Humans
  • Logistic Models
  • Machine Learning*
  • Male
  • Support Vector Machine