Screening Biomarkers for Systemic Lupus Erythematosus Based on Machine Learning and Exploring Their Expression Correlations With the Ratios of Various Immune Cells

Front Immunol. 2022 Jun 10;13:873787. doi: 10.3389/fimmu.2022.873787. eCollection 2022.

Abstract

Background: Systemic lupus erythematosus (SLE) is an autoimmune disease caused by a malfunctioning immunomodulatory system. China has the second highest prevalence of SLE in the world, ranging from 0.03% to 0.07%. SLE is currently diagnosed using a combination of immunological markers, clinical symptoms, and sometimes invasive biopsy. Genetic biomarkers for SLE diagnosis are therefore urgently needed.

Method: From the Gene Expression Omnibus (GEO) database, we downloaded three array data sets of peripheral blood mononuclear cells (PBMCs) from SLE patients and healthy controls (GSE65391, GSE121239, and GSE61635) as the discovery data set (n = 1315 SLE, n = 122 normal), and pooled four data sets (GSE4588, GSE50772, GSE99967, and GSE24706) as the validation data set (n = 146 SLE, n = 76 normal). We screened the differentially expressed genes (DEGs) between the SLE and control samples, and applied least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE) analysis to identify candidate diagnostic biomarkers. The candidate markers' diagnostic efficacy was assessed using receiver operating characteristic (ROC) curves. Reverse transcription quantitative polymerase chain reaction (RT-qPCR) was used to confirm the expression of the putative biomarkers in our own Chinese cohort (n = 13 SLE, n = 10 normal). Finally, the proportions of 22 immune cell types in SLE patients were estimated using the CIBERSORT algorithm, and the correlations between the biomarkers' expression and the immune cell ratios were investigated.
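As an illustration of the ROC-based assessment mentioned in the Method, the area under the ROC curve (AUC) can be computed as the probability that a randomly chosen case receives a higher score than a randomly chosen control. The sketch below is a minimal, dependency-free version of that calculation; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (case, control) pairs in which
    the case scores higher than the control; ties count as half a win."""
    cases = [s for l, s in zip(labels, scores) if l == 1]
    controls = [s for l, s in zip(labels, scores) if l == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy example (made-up values): 1 = SLE, 0 = healthy control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(auc)  # 0.9375 (15 of 16 case/control pairs ranked correctly)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; for large cohorts a rank-based formulation or an established statistics library would be the more practical choice.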

Results: We obtained a total of 284 DEGs and found that they were largely involved in several immune-related pathways, such as the type I interferon signaling pathway, defense response to virus, and inflammatory response. Six candidate diagnostic biomarkers for SLE were then selected, namely ABCB1, EIF2AK2, HERC6, ID3, IFI27, and PLSCR1, whose expression levels were validated in the discovery and validation data sets. As a signature, these six genes reached area under the curve (AUC) values of 0.96 and 0.913 in the discovery and validation data sets, respectively. We then examined whether the expression of ABCB1, IFI27, and PLSCR1 in our own Chinese cohort matched that of the discovery and validation sets. Subsequently, we used CIBERSORT analysis to reveal the immune cell types potentially disturbed in SLE patients, and identified the immune cells most strongly correlated with the expression of ABCB1, IFI27, and PLSCR1.
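To make the expression-versus-immune-cell correlation step concrete, the sketch below computes a rank correlation between one gene's expression and one immune cell fraction across samples. The abstract does not state which correlation coefficient was used, so Spearman's rho is an assumption here, and the helper names and all numbers are hypothetical toy values rather than study data.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy example (made-up values): expression of one gene and the estimated
# fraction of one immune cell type in five samples.
expression = [1.2, 2.5, 3.1, 4.8, 5.0]
cell_fraction = [0.30, 0.22, 0.25, 0.10, 0.05]
rho = spearman(expression, cell_fraction)
print(round(rho, 3))
```

A negative rho in this toy case means that samples with higher expression tend to have a lower fraction of that cell type; in practice the coefficient would be computed for each biomarker against each of the estimated cell types.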

Conclusion: Our study identified ABCB1, IFI27, and PLSCR1 as potential diagnostic genes for Chinese SLE patients and identified the immune cells most closely associated with their expression. These findings provide possible biomarkers for diagnosing SLE in Chinese patients.

Keywords: CIBERSORT; diagnostic biomarker; immune cell disturbance; machine learning; systemic lupus erythematosus.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genetic Markers
  • Humans
  • Leukocytes, Mononuclear* / metabolism
  • Lupus Erythematosus, Systemic* / diagnosis
  • Lupus Erythematosus, Systemic* / genetics
  • ROC Curve
  • Support Vector Machine

Substances

  • Genetic Markers