Background: The incidence of acute lymphoblastic leukemia (ALL) is nearly 20% higher among Hispanics than among non-Hispanic Whites. Previous studies have shown evidence of association between risk of ALL and variation within the IKZF1, ARID5B, CEBPE, CDKN2A, GATA3, and BMI1-PIP4K2A genes. However, the variants identified account for <10% of the genetic risk of ALL.
Methods: We applied pathway-based analyses to genome-wide association study (GWAS) data from the California Childhood Leukemia Study to determine whether different biologic pathways were overrepresented in childhood ALL and major ALL subtypes. Furthermore, we applied causal inference and data reduction methods to prioritize candidate genes within each identified overrepresented pathway, while accounting for correlation among SNPs.
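As an illustration of what a pathway overrepresentation test computes, the following is a minimal sketch and not the study's actual pipeline. It assumes a hypergeometric (one-sided Fisher) test per pathway followed by Benjamini-Hochberg FDR adjustment; the abstract does not specify which test was used, and it does not model the correlation among SNPs that the study accounts for. The function names and the toy gene identifiers (`g0`, `g1`, ...) are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K belong to the pathway."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def overrepresentation(pathways, hits, background):
    """Map each pathway name to its FDR-adjusted overrepresentation p-value.

    pathways:   dict of pathway name -> iterable of gene IDs (hypothetical input)
    hits:       gene IDs flagged by the association analysis (assumed given)
    background: all gene IDs that were tested
    """
    background = set(background)
    hits = set(hits) & background
    names = list(pathways)
    pvals = []
    for name in names:
        members = set(pathways[name]) & background
        k = len(members & hits)
        pvals.append(hypergeom_sf(k, len(background), len(members), len(hits)))
    return dict(zip(names, benjamini_hochberg(pvals)))


# Toy example with made-up gene IDs: 20 background genes, 4 hits.
background = [f"g{i}" for i in range(20)]
pathways = {
    "pathway_A": ["g0", "g1", "g2", "g3", "g4"],
    "pathway_B": ["g10", "g11", "g12", "g13", "g14"],
}
hits = ["g0", "g1", "g2", "g3"]
result = overrepresentation(pathways, hits, background)
```

In this toy input all four hits fall in `pathway_A`, so its adjusted p-value is small, while `pathway_B` contains no hits and is not overrepresented.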
Results: Pathway analysis results indicate that different ALL subtypes may involve distinct biologic mechanisms, whereas focal adhesion is a mechanism shared across the disease subtypes. For ALL, the top five overrepresented Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways are axon guidance, protein digestion and absorption, melanogenesis, leukocyte transendothelial migration, and focal adhesion (FDR-adjusted P < 0.05). Notably, these pathways are connected to downstream MAPK or Wnt signaling pathways, which have been linked to B-cell malignancies. Several candidate genes for ALL, such as COL6A6 and COL5A1, were identified through targeted maximum likelihood estimation.
Conclusions: This is the first study to show, using pathway-based approaches, that distinct biologic pathways are overrepresented in different ALL subtypes, and to identify potential candidate genes using causal inference methods.
Impact: The findings demonstrate that newly developed bioinformatics tools and causal inference methods can provide insights that further our understanding of the pathogenesis of leukemia. Cancer Epidemiol Biomarkers Prev; 25(5); 815-22. ©2016 AACR.
©2016 American Association for Cancer Research.