Use of nucleotide composition analysis to infer hosts for three novel picorna-like viruses

J Virol. 2010 Oct;84(19):10322-8. doi: 10.1128/JVI.00601-10. Epub 2010 Jul 28.

Abstract

Nearly complete genome sequences of three novel RNA viruses were acquired from the stool of an Afghan child. Phylogenetic analysis indicated that these viruses belong to the picorna-like virus superfamily. Because of their unique genomic organization and deep phylogenetic roots, we propose these viruses, provisionally named calhevirus, tetnovirus-1, and tetnovirus-2, as prototypes of new viral families. A newly developed nucleotide composition analysis (NCA) method was used to compare mononucleotide and dinucleotide frequencies for RNA viruses infecting mammals, plants, or insects. Using a large training data set of 284 representative picornavirus-like genomic sequences with defined host origins, NCA correctly identified the kingdom or phylum of the viral host for >95% of picorna-like viruses. NCA predicted an insect host origin for the three novel picorna-like viruses. Their presence in human stool therefore likely reflects ingestion of insect-contaminated food. As metagenomic analyses of different environments and organisms continue to yield highly divergent viral genomes, NCA provides a rapid and robust method to identify their likely cellular hosts.
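
The abstract describes NCA as a comparison of mononucleotide and dinucleotide frequencies across viral genomes. As a rough illustration of that first step only, the sketch below computes the 4 mononucleotide and 16 dinucleotide frequencies for a single RNA sequence. The function name `composition_features`, the handling of ambiguous bases, and the normalization are assumptions made for this example, not details taken from the paper; the classification step that assigns a host from these features is not shown.

```python
from collections import Counter
from itertools import product

BASES = "ACGU"

def composition_features(seq):
    """Return mono- and dinucleotide frequencies for one RNA sequence.

    Hypothetical helper for illustration; not the authors' implementation.
    DNA-style input is accepted by converting T to U. Positions holding
    ambiguous bases (e.g. N) are skipped rather than imputed.
    """
    seq = seq.upper().replace("T", "U")

    # Count single bases, ignoring anything outside A, C, G, U.
    mono_counts = Counter(b for b in seq if b in BASES)
    total_mono = sum(mono_counts.values())

    # Count overlapping dinucleotides whose two bases are both unambiguous.
    di_counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    total_di = sum(di_counts.values())

    features = {}
    for b in BASES:
        features[b] = mono_counts[b] / total_mono if total_mono else 0.0
    for a, b in product(BASES, repeat=2):
        features[a + b] = di_counts[a + b] / total_di if total_di else 0.0
    return features
```

Applied to every genome in a labeled training set, a function like this would yield one 20-value feature vector per virus, which could then be passed to a statistical classifier of the reader's choice to separate host groups.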

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Afghanistan
  • Amino Acid Sequence
  • Animals
  • Base Composition
  • Base Sequence
  • DNA Primers / genetics
  • Food Microbiology
  • Genetic Variation
  • Genome, Viral
  • Host-Pathogen Interactions
  • Humans
  • Insecta / virology
  • Molecular Sequence Data
  • Phylogeny
  • Picornaviridae / classification*
  • Picornaviridae / genetics*
  • Picornaviridae / isolation & purification
  • RNA, Viral / chemistry
  • RNA, Viral / genetics
  • Serine Proteases / genetics
  • Species Specificity
  • Untranslated Regions
  • Viral Nonstructural Proteins / genetics
  • Viral Structural Proteins / genetics

Substances

  • DNA Primers
  • RNA, Viral
  • Untranslated Regions
  • Viral Nonstructural Proteins
  • Viral Structural Proteins
  • Serine Proteases

Associated data

  • GENBANK/HM480374
  • GENBANK/HM480375
  • GENBANK/HM480376