Background: Administrative databases and cancer registries are frequently used to conduct population-based research, but they often lack the clinical data necessary for risk stratification. Our objective was to determine the criterion validity of a risk-stratification algorithm based on treatment characteristics available from a pediatric cancer registry as a proxy for disease risk, by comparing it with traditional biology-based risk classifications.
Methods: We identified all children with acute lymphoblastic leukemia diagnosed at a single institution between January 2000 and June 2011, and linked them to a population-based cancer registry. Several risk algorithms were then constructed using disease-risk variables collected through chart review by a pediatric oncologist, and compared with a risk algorithm based on treatment protocol name and age, both available from the registry.
Results: Of 596 patients identified, 579 (97.1%) met inclusion criteria and were successfully linked. The registry-based algorithm showed almost perfect agreement with a biology-based algorithm based on age, initial white blood cell count, immunophenotype and cytogenetics (kappa=0.85, 95% confidence interval 0.81-0.90). Discrepant cases were often due to the presence of unusual high-risk features not captured by standard disease-risk variables but reflected in clinicians' choices of higher-intensity treatment protocols.
Conclusions: Protocol name represents a valid proxy for disease risk, allowing risk stratification in comparative effectiveness research that uses cancer registries and health services data. Future studies should examine the validity of treatment-based risk algorithms in other malignancies and with other treatment characteristics commonly found in health services data, such as the receipt of specific chemotherapeutic agents.
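The agreement statistic reported in the Results is a kappa coefficient. As a minimal sketch of how such a statistic is computed, the snippet below calculates an unweighted Cohen's kappa for two risk classifications of the same patients. The function name `cohen_kappa`, the category labels, and the ten-patient example data are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not state which kappa variant or software the authors used.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two ratings of the same subjects."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("Both label lists must be non-empty and equal in length.")
    n = len(labels_a)

    # Observed agreement: fraction of subjects given the same category by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over categories of the product of marginal proportions.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; agreement is complete.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical example data (NOT from the study): ten patients classified
# by a registry-based algorithm and by a biology-based algorithm.
registry_risk = ["standard"] * 6 + ["high"] * 4
biology_risk = ["standard"] * 5 + ["high"] * 5

kappa = cohen_kappa(registry_risk, biology_risk)
print(f"kappa = {kappa:.2f}")
```

In this toy example the two classifications agree on nine of ten patients (observed agreement 0.9) while chance agreement is 0.5, which gives a kappa of 0.8. A confidence interval like the one reported would need an additional standard-error or bootstrap step that is not shown here.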