Background: Neuroblastoma is the most common extracranial solid tumor in childhood. Amplification of MYCN in neuroblastoma is a predictor of poor prognosis. Materials and methods: DNA methylation data from the TARGET data matrix were stratified into MYCN-amplified and non-amplified groups. Differential methylation analysis, clustering, recursive feature elimination (RFE), machine learning (ML), Cox regression analysis and Kaplan-Meier estimation were performed. Results and Conclusion: A total of 663 CpGs were differentially methylated between the two groups. Of these, 25 CpGs were selected by RFE for clustering and ML, yielding a clustering accuracy of 100%. ML validation on three external datasets produced accuracy scores of 100%, 97% and 93%. Eight survival-associated CpGs were also identified. Therapeutic interventions may need to be targeted to patient subgroups.
Keywords: MYCN amplification; differential methylation analysis; machine learning; neuroblastoma; prognostic markers.
Lay abstract: Neuroblastoma is the most common extracranial solid tumor in childhood. Elevated levels of the MYCN protein in neuroblastoma are a predictor of poor prognosis and the most relevant prognostic factor in the disease. Predicting MYCN gene amplification (which leads to increased gene expression and more protein) from epigenetic data rather than genetic testing might be useful in the oncology clinic. This study was designed to identify a DNA methylation (epigenetic) signature that can be used to diagnose MYCN amplification without directly testing for the gene. The authors also aimed to correlate this DNA methylation signature with patient survival and poorer prognosis. Using statistical and computational methods applied to neuroblastoma DNA methylation data, the authors found signatures that are predictive of MYCN amplification and poor prognosis, which clinicians can use for early patient diagnosis and for selecting the best therapies for patients at high risk.
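To illustrate the recursive feature elimination (RFE) step named in the abstract, the following is a minimal sketch, not the authors' pipeline. It assumes a plain logistic regression as the ranking model and runs on synthetic beta values (12 mock CpGs, of which the first 3 differ between groups) rather than on the TARGET data or the 663-to-25 reduction reported above. The function names `fit_logistic`, `rfe` and `make_sample` are hypothetical and were introduced only for this example.

```python
import math
import random


def fit_logistic(X, y, epochs=200, lr=0.5):
    """Fit a logistic regression by batch gradient descent; return the weights."""
    n, d = len(X), len(X[0])
    # Center each feature so that the weights reflect group differences only.
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(Xc, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label  # predicted minus observed
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w


def rfe(X, y, n_keep):
    """Refit the model, drop the feature with the smallest |weight|, repeat."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        subset = [[row[j] for j in remaining] for row in X]
        w = fit_logistic(subset, y)
        weakest = min(range(len(remaining)), key=lambda k: abs(w[k]))
        remaining.pop(weakest)
    return remaining


def make_sample(amplified):
    """Synthetic methylation profile: 3 group-dependent CpGs, then 9 noise CpGs."""
    informative = [random.gauss(0.8 if amplified else 0.2, 0.05) for _ in range(3)]
    noise = [random.gauss(0.5, 0.05) for _ in range(9)]
    return informative + noise


random.seed(0)
y = [0] * 20 + [1] * 20  # 0 = non-amplified, 1 = MYCN-amplified (mock labels)
X = [make_sample(label == 1) for label in y]

selected = rfe(X, y, n_keep=3)
print("Retained feature indices:", selected)
```

In this toy setting the retained indices should be the three columns that were simulated to differ between groups. In practice, a library implementation with cross-validation and a held-out validation set would be preferable to this hand-rolled loop.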