Comprehensive Characterization of the Human Endogenous Retrovirus HERV-K(HML-6) Group: Overview of Structure, Phylogeny, and Contribution to the Human Genome

J Virol. 2019 Jul 30;93(16):e00110-19. doi: 10.1128/JVI.00110-19. Print 2019 Aug 15.

Abstract

Eight percent of the human genome is composed of human endogenous retroviruses (HERVs), remnants of ancestral germ line infections by exogenous retroviruses, which have been vertically transmitted as Mendelian characters. The HML-6 group, a member of the class II betaretrovirus-like viruses, includes several proviral loci with increased transcriptional activity in cancer and at least two elements that are known to retain an intact open reading frame and to encode small proteins: ERVK3-1, which is expressed in various healthy tissues, and HERV-K-MEL, a small Env peptide expressed in samples of cutaneous and ocular melanoma but not in normal tissues.

IMPORTANCE We reported the distribution and genetic composition of 66 HML-6 elements. We analyzed the phylogeny of the HML-6 sequences and identified two main clusters. We provided the first description of a Rec domain within the env sequence of 23 HML-6 elements. A Rec domain was also predicted within the ERVK3-1 transcript sequence, revealing its expression in various healthy tissues. Evidence about the context of insertion and colocalization of 19 HML-6 elements with functional human genes is also reported, including the sequence at 16p11.2, whose 5' long terminal repeat overlapped the exon of one transcript variant of a cellular zinc finger upregulated and involved in hepatocellular carcinoma. The present work provides the first complete overview of the HML-6 elements in GRCh37 (hg19), describing the structure, phylogeny, and genomic context of insertion of each locus. This information allows a better understanding of the genetics of one of the most expressed HERV groups in the human genome.

Keywords: HERV; HERV-K-MEL; HML-6; RetroTector; bioinformatics; endogenous retrovirus.

MeSH terms

  • Chromosome Mapping
  • Computational Biology / methods
  • Endogenous Retroviruses / classification*
  • Endogenous Retroviruses / genetics*
  • Genetic Loci
  • Genome, Human*
  • Humans
  • Molecular Sequence Annotation
  • Open Reading Frames
  • Phylogeny*
  • Proviruses / genetics*
  • Terminal Repeat Sequences