Genome assembly and annotation of Arabidopsis halleri, a model for heavy metal hyperaccumulation and evolutionary ecology

Mol Ecol Resour. 2017 Sep;17(5):1025-1036. doi: 10.1111/1755-0998.12604. Epub 2016 Oct 26.

Abstract

The self-incompatible species Arabidopsis halleri is a close relative of the self-compatible model plant Arabidopsis thaliana. Its broad European and Asian distribution and heavy metal hyperaccumulation ability make A. halleri a useful model for ecological genomics studies. We used long-insert mate-pair libraries to improve the genome assembly of the A. halleri ssp. gemmifera Tada mine genotype (W302), collected from a site in Japan with high heavy metal contamination. After five rounds of forced selfing, heterozygosity was reduced to 0.04%, which facilitated subsequent genome assembly. Our assembly now covers 196 Mb, or 78% of the estimated genome size, and achieves a scaffold N50 length of 712 kb. To validate the assembly and annotation, we used synteny of A. halleri Tada mine with a previously published high-quality reference assembly of a closely related species, Arabidopsis lyrata. Further validation of the assembly quality comes from synteny and phylogenetic analysis of the HEAVY METAL ATPASE4 (HMA4) and METAL TOLERANCE PROTEIN1 (MTP1) regions, using published sequences from European A. halleri for comparison. Three tandemly duplicated copies of HMA4, a key gene involved in cadmium and zinc hyperaccumulation, were assembled on a single scaffold. The assembly will enhance genome-wide studies of A. halleri as well as of the allopolyploid Arabidopsis kamchatica, which is derived from A. lyrata and A. halleri.
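
For readers unfamiliar with the scaffold N50 statistic quoted in the abstract, the following is a minimal Python sketch of how such a value is commonly computed: the length of the shortest scaffold in the smallest set of longest scaffolds that together span at least half of the assembled bases. The function name `n50` and the toy lengths are illustrative assumptions, not the authors' pipeline or data. The last lines show a back-of-envelope estimate of the genome size implied by the reported 196 Mb and 78% figures; because 78% is a rounded value, the result is only approximate.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Scaffolds are sorted from longest to shortest and their lengths are
    accumulated; N50 is the length of the scaffold at which the running
    sum first reaches at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy scaffold lengths for illustration only (NOT the A. halleri scaffolds).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)

# Back-of-envelope genome size implied by the abstract's figures:
# 196 Mb assembled, reported as 78% of the estimated genome size.
assembled_mb = 196
fraction_covered = 0.78
implied_genome_size_mb = assembled_mb / fraction_covered  # roughly 251 Mb
```

In the toy example the total is 300 bp, and the two longest scaffolds (80 and 70) already sum to 150, so the N50 is 70. A larger N50 indicates a more contiguous assembly, which is why the abstract reports it alongside total assembled length.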

Keywords: Arabidopsis halleri; Tada mine; de novo assembly; functional annotation; heavy metal hyperaccumulator.

MeSH terms

  • Adenosine Triphosphatases / genetics
  • Arabidopsis / genetics*
  • Arabidopsis / growth & development
  • Arabidopsis / metabolism
  • Arabidopsis Proteins / genetics
  • Cation Transport Proteins / genetics
  • Environmental Pollution
  • Genome, Plant*
  • Japan
  • Metals, Heavy
  • Molecular Sequence Annotation*
  • Phylogeny
  • Sequence Analysis, DNA*
  • Sequence Homology
  • Synteny

Substances

  • Arabidopsis Proteins
  • Cation Transport Proteins
  • MTP1 protein, Arabidopsis
  • Metals, Heavy
  • Adenosine Triphosphatases
  • HMA4 protein, Arabidopsis