Learning predictive signatures of HLA type from T-cell repertoires

PLoS Comput Biol. 2025 Jan 6;21(1):e1012724. doi: 10.1371/journal.pcbi.1012724. eCollection 2025 Jan.

Abstract

T cells recognize a wide range of pathogens using surface receptors that interact directly with peptides presented on major histocompatibility complexes (MHC) encoded by the HLA loci in humans. Understanding the association between T cell receptors (TCR) and HLA alleles is an important step towards predicting TCR-antigen specificity from sequences. Here we analyze the TCR alpha and beta repertoires of large cohorts of HLA-typed donors to systematically infer such associations, by looking for overrepresentation of TCRs in individuals with a common allele. TCRs associated with a specific HLA allele exhibit sequence similarities that suggest prior antigen exposure. Immune repertoire sequencing has produced large numbers of datasets; however, the HLA type of the corresponding donors is rarely available. Using our TCR-HLA associations, we trained a computational model to predict the HLA type of individuals from their TCR repertoire alone. We propose an iterative procedure that refines this model using data from large cohorts of untyped individuals, recursively typing them with the model itself. The resulting model shows good predictive performance, even for relatively rare HLA alleles.
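The overrepresentation analysis described above can be illustrated with a minimal sketch. The abstract does not state which statistical test the authors used, so the one-sided Fisher exact test here is an assumption, as are the function names, the donor record layout (`hla`, `tcrs`), and the toy cohort, which is invented purely for illustration.

```python
from math import comb


def fisher_one_sided(k, n_with, K, N):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n_with).

    N      : total number of donors
    K      : donors whose repertoire contains the TCR
    n_with : donors carrying the HLA allele
    k      : donors that carry the allele AND contain the TCR
    """
    upper = min(K, n_with)
    total = comb(N, n_with)
    tail = sum(comb(K, i) * comb(N - K, n_with - i) for i in range(k, upper + 1))
    return tail / total


def tcr_allele_pvalue(donors, tcr, allele):
    """Test whether `tcr` is overrepresented among donors carrying `allele`."""
    N = len(donors)
    n_with = sum(allele in d["hla"] for d in donors)
    K = sum(tcr in d["tcrs"] for d in donors)
    k = sum(allele in d["hla"] and tcr in d["tcrs"] for d in donors)
    return fisher_one_sided(k, n_with, K, N)


# Hypothetical toy cohort: three donors carry A*02:01, three do not.
donors = [
    {"hla": {"A*02:01"}, "tcrs": {"CASSLGQAYEQYF", "CASSPTGELFF"}},
    {"hla": {"A*02:01"}, "tcrs": {"CASSLGQAYEQYF", "CASSPTGELFF"}},
    {"hla": {"A*02:01"}, "tcrs": {"CASSLGQAYEQYF", "CASSPTGELFF"}},
    {"hla": {"A*01:01"}, "tcrs": {"CASSPTGELFF"}},
    {"hla": {"A*01:01"}, "tcrs": {"CASSPTGELFF"}},
    {"hla": {"A*03:01"}, "tcrs": {"CASSPTGELFF"}},
]

# Found only in the three allele carriers: p = 1/20 = 0.05 for this toy cohort.
p_specific = tcr_allele_pvalue(donors, "CASSLGQAYEQYF", "A*02:01")
# Found in every donor, so it carries no information about the allele: p = 1.0.
p_public = tcr_allele_pvalue(donors, "CASSPTGELFF", "A*02:01")
print(p_specific, p_public)
```

In a real cohort, such a test would be repeated over very many TCR-allele pairs, so a multiple-testing correction would be needed before calling any pair associated; that step is omitted here.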

MeSH terms

  • Alleles
  • Computational Biology* / methods
  • HLA Antigens* / genetics
  • HLA Antigens* / immunology
  • Humans
  • Receptors, Antigen, T-Cell / genetics
  • Receptors, Antigen, T-Cell / immunology
  • T-Lymphocytes* / immunology

Substances

  • HLA Antigens
  • Receptors, Antigen, T-Cell