Biological function integrated prediction of severe radiographic progression in rheumatoid arthritis: a nested case control study

Arthritis Res Ther. 2017 Oct 25;19(1):244. doi: 10.1186/s13075-017-1414-x.

Abstract

Background: Radiographic progression is reported to be highly heritable in rheumatoid arthritis (RA). However, previous studies using genetic loci predicted radiographic progression with insufficient accuracy. The aim of this study was to identify a biologically relevant prediction model of radiographic progression in patients with RA using a genome-wide association study (GWAS) combined with bioinformatics analysis.

Methods: We obtained genome-wide single nucleotide polymorphism (SNP) data for 374 Korean patients with RA using Illumina HumanOmni2.5Exome-8 arrays. Radiographic progression was measured as the yearly rate of change in the Sharp/van der Heijde modified score and categorized as no progression or severe progression. SNPs significantly associated with severe radiographic progression in the GWAS were mapped to functional genes and reprioritized by post-GWAS analysis. For robust prediction of radiographic progression, tenfold cross-validation using a support vector machine (SVM) classifier was conducted. Accuracy was used to select the optimal SNP set in the Hanyang Bae RA cohort. The performance of our final model was compared with that of other models based on GWAS results and SPOT (a post-GWAS analysis tool) using receiver operating characteristic (ROC) curves. The reliability of our model was confirmed using GWAS data from Caucasian patients with RA.
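The selection step described in the Methods (tenfold cross-validation, with mean accuracy used to choose the SNP set) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it uses only the Python standard library, and a nearest-centroid classifier stands in for the SVM. The function names, the genotype coding, and the assumption that SNP columns are already sorted by priority are all hypothetical.

```python
import random
from statistics import mean


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_centroid_fit(X, y):
    """Stand-in classifier (NOT an SVM): store the mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def cv_accuracy(X, y, k=10, seed=0):
    """Mean held-out accuracy over k folds (requires len(X) >= k)."""
    accs = []
    for test_idx in kfold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = nearest_centroid_fit([X[i] for i in train_idx],
                                     [y[i] for i in train_idx])
        correct = sum(nearest_centroid_predict(model, X[i]) == y[i]
                      for i in test_idx)
        accs.append(correct / len(test_idx))
    return mean(accs)


def best_snp_count(X, y, candidate_sizes, k=10):
    """Pick how many top-ranked SNP columns give the best mean CV accuracy.

    Assumes the columns of X are already ordered by priority (hypothetical).
    """
    scores = {m: cv_accuracy([row[:m] for row in X], y, k)
              for m in candidate_sizes}
    return max(scores, key=scores.get), scores
```

Calling `best_snp_count(X, y, [25, 50, 85, 100])` on a genotype matrix `X` (rows are patients, columns are ranked SNPs) and progression labels `y` would return the candidate size with the highest mean accuracy, together with the score for each size. The candidate sizes here are illustrative only.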

Results: A total of 36,091 significant SNPs with a p value <0.05 from GWAS were reprioritized using post-GWAS analysis, and approximately 2700 were identified as SNPs related to RA biological features. The best average accuracy across the ten groups was 0.6015 with 85 SNPs, and this increased to 0.7481 when combined with clinical information. In comparisons of model performance, the area under the curve (AUC) of our model (0.7872) was superior to that obtained with GWAS (AUC 0.6586, p value 8.97 × 10⁻⁵) or SPOT (AUC 0.7449, p value 0.0423). Our model strategy also showed superior prediction accuracy in Caucasian patients with RA compared with GWAS (p value 0.0049) and SPOT (p value 0.0151).
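For readers less familiar with the AUC values reported above: the AUC can be read as the probability that a randomly chosen patient with severe progression receives a higher model score than a randomly chosen patient with no progression. The sketch below computes it that way, with tied scores counted as half. The function name and the 0/1 label coding are assumptions for illustration. The statistical test behind the reported p values is not named in the abstract and is not reproduced here.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for severe progression, 0 for no progression (assumed coding).
    scores: model outputs, higher meaning more likely severe progression.
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, three of the four positive/negative pairs are ordered correctly, so the function returns 0.75.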

Conclusions: Using multiple biological functions of SNPs and repeated machine learning, our model predicted severe radiographic progression in patients with RA more robustly, and with greater biological relevance, than models using only GWAS results or other post-GWAS tools.

Keywords: Bioinformatic analysis; GWAS; Post-GWAS analysis; Radiographic progression; Rheumatoid arthritis.

MeSH terms

  • Adult
  • Arthritis, Rheumatoid / diagnostic imaging*
  • Arthritis, Rheumatoid / genetics*
  • Arthritis, Rheumatoid / pathology
  • Case-Control Studies
  • Cohort Studies
  • Disease Progression
  • Female
  • Genome-Wide Association Study / methods*
  • Humans
  • Male
  • Middle Aged
  • Polymorphism, Single Nucleotide*
  • ROC Curve
  • Radiography / methods*
  • Reproducibility of Results