Background: Uncertain validity of epilepsy diagnoses within health insurance claims and other large datasets has hindered efforts to study and monitor care at the population level.
Objectives: To develop and validate prediction models using longitudinal Medicare administrative data to identify patients with actual epilepsy among those with the diagnosis.
Research design, subjects, measures: We used linked electronic health records and Medicare administrative data including claims to predict epilepsy status. A neurologist reviewed electronic health record data to assess epilepsy status in a stratified random sample of Medicare beneficiaries aged 65+ years between January 2012 and December 2014. We then reconstructed the full sample using inverse probability sampling weights. We developed prediction models using longitudinal Medicare data, then evaluated each model's predictive performance in a separate sample using measures including area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity.
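As a rough illustration of how inverse probability sampling weights enter the performance measures, the sketch below computes weighted sensitivity and specificity. The function name, the record layout, and the toy data are hypothetical and are not taken from the study; each weight is assumed to be the inverse of the probability that a patient's stratum was sampled for chart review.

```python
def weighted_sens_spec(records):
    """Weighted sensitivity and specificity.

    records: iterable of (truth, predicted, weight) tuples, where truth and
    predicted are booleans and weight is an inverse probability sampling
    weight (an assumption of this sketch, not the study's actual code).
    """
    tp = fn = tn = fp = 0.0
    for truth, predicted, weight in records:
        if truth and predicted:
            tp += weight      # confirmed epilepsy, flagged by the model
        elif truth:
            fn += weight      # confirmed epilepsy, missed by the model
        elif predicted:
            fp += weight      # no epilepsy, flagged by the model
        else:
            tn += weight      # no epilepsy, not flagged
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical toy sample: each reviewed chart stands in for `weight` patients.
sample = [
    (True, True, 2.0),
    (True, True, 2.0),
    (True, False, 4.0),
    (False, False, 50.0),
    (False, False, 50.0),
    (False, False, 96.0),
    (False, True, 4.0),
]
sens, spec = weighted_sens_spec(sample)
```

In this toy sample the unweighted sensitivity would be 2/3, whereas the weighted value is 0.5, because the missed case comes from a stratum that was sampled less densely and therefore represents more patients.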
Results: Of 20,945 patients in the reconstructed sample, 2.1% had confirmed epilepsy. The best-performing prediction model to identify prevalent epilepsy required epilepsy diagnoses with multiple claims at least 60 days apart, and epilepsy-specific drug claims: AUROC=0.93 [95% confidence interval (CI), 0.90-0.96], and with an 80% diagnostic threshold, sensitivity=87.8% (95% CI, 80.4%-93.2%), specificity=98.4% (95% CI, 98.2%-98.5%). A similar model also performed well in predicting incident epilepsy (k=0.79; 95% CI, 0.66-0.92).
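The claims-based predictors named above can be pictured with a minimal sketch. It only flags whether a patient has epilepsy diagnosis claims at least 60 days apart together with an epilepsy-specific drug claim; it does not reproduce the fitted prediction model or its 80% threshold, and the function names and inputs are hypothetical.

```python
from datetime import date


def has_claims_days_apart(claim_dates, min_gap_days=60):
    """True if any two diagnosis claims are at least min_gap_days apart.

    Some pair of claims is that far apart exactly when the earliest and
    the latest claims are, so comparing those two dates is sufficient.
    """
    if len(claim_dates) < 2:
        return False
    return (max(claim_dates) - min(claim_dates)).days >= min_gap_days


def meets_claims_criteria(dx_dates, has_drug_claim):
    """Combine both predictors into a single yes/no flag (illustrative only)."""
    return has_claims_days_apart(dx_dates) and has_drug_claim


# Hypothetical patients: the first has claims 65 days apart, the second 27 days.
patient_a = meets_claims_criteria([date(2012, 1, 5), date(2012, 3, 10)], True)
patient_b = meets_claims_criteria([date(2012, 1, 5), date(2012, 2, 1)], True)
```

Here `patient_a` satisfies both conditions and `patient_b` does not, since the second patient's diagnosis claims fall within the same 60-day window.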
Conclusions: Prediction models using longitudinal Medicare data accurately predict incident and prevalent epilepsy status.