Quick and efficient approach to develop genomic resources in orphan species: Application in Lavandula angustifolia

PLoS One. 2020 Dec 11;15(12):e0243853. doi: 10.1371/journal.pone.0243853. eCollection 2020.

Abstract

Next-Generation Sequencing (NGS) technologies, by reducing the cost and increasing the throughput of sequencing, have made it possible to generate genomic data in a range of previously poorly studied species. In this study, we propose a method for the rapid development of large-scale molecular resources for orphan species. As an example, we studied true lavender (Lavandula angustifolia Mill.), a perennial sub-shrub native to the Mediterranean region whose essential oil has numerous applications in cosmetics, pharmaceuticals, and alternative medicines. The heterozygous clone "Maillette" was used as a reference for DNA and RNA sequencing. We first built a reference Unigene, composed of coding sequences, by de novo RNA-seq assembly. Then, we reconstructed the complete gene sequences (with introns and exons) using a Unigene-guided DNA-seq assembly approach. This aimed to maximize the chances of finding polymorphisms between genetically close individuals despite the lack of a reference genome. Finally, we used these resources for SNP mining within a collection of 16 commercial lavender clones and tested the SNPs in a genetic distance analysis. We obtained a cleaned reference of 8,030 genes functionally annotated in silico. We found 359K polymorphic sites and observed a high SNP frequency (mean of 1 SNP per 90 bp) and a high level of heterozygosity (more than 60% of SNPs heterozygous per genotype). Overall, we found similar genetic distances between pairs of clones, which is probably related to the out-crossing nature of the species and its restricted area of cultivation. The proposed method is transferable to other orphan species, requires few bioinformatics resources, and can be completed within a year. This is also the first reported large-scale SNP development in Lavandula angustifolia. All the genomic resources developed herein are publicly available and provide a rich pool of molecular resources to explore and exploit lavender genetic diversity in breeding programs.
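
As an illustration of the two summary statistics mentioned in the abstract (per-genotype heterozygosity and pairwise genetic distance between clones), the following is a minimal Python sketch. It is not the authors' pipeline: the abstract does not state which distance measure was used, so an allele-sharing distance is assumed here, and the clone names, genotype calls, and function names are all hypothetical toy values chosen only to show the calculation.

```python
from itertools import combinations

# Hypothetical toy genotype calls (NOT data from the study).
# Encoding assumed for this sketch: 0 = homozygous reference,
# 1 = heterozygous, 2 = homozygous alternate, None = missing call.
GENOTYPES = {
    "clone_A": [0, 1, 1, 2, 1, None],
    "clone_B": [1, 1, 0, 2, 1, 1],
    "clone_C": [2, 1, 1, 0, None, 1],
}


def heterozygous_fraction(calls):
    """Return the share of non-missing SNP calls that are heterozygous."""
    called = [g for g in calls if g is not None]
    if not called:
        return float("nan")
    return sum(1 for g in called if g == 1) / len(called)


def allele_sharing_distance(calls_a, calls_b):
    """Return the mean absolute dosage difference, scaled to [0, 1].

    Only sites called in both genotypes are compared. This measure is an
    assumption of the sketch, not the one reported in the paper.
    """
    pairs = [
        (a, b)
        for a, b in zip(calls_a, calls_b)
        if a is not None and b is not None
    ]
    if not pairs:
        return float("nan")
    return sum(abs(a - b) for a, b in pairs) / (2 * len(pairs))


def pairwise_distances(genotypes):
    """Return a dict mapping each sorted pair of names to its distance."""
    return {
        (x, y): allele_sharing_distance(genotypes[x], genotypes[y])
        for x, y in combinations(sorted(genotypes), 2)
    }


if __name__ == "__main__":
    for name, calls in GENOTYPES.items():
        print(name, "heterozygous fraction:", round(heterozygous_fraction(calls), 3))
    for pair, dist in pairwise_distances(GENOTYPES).items():
        print(pair, "distance:", round(dist, 3))
```

With real data, the genotype lists would be filled from the called SNPs for each of the 16 clones, and the resulting distance table could be passed to a clustering or ordination step.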

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Computer Simulation
  • DNA, Plant / genetics
  • Exons / genetics
  • Genome, Plant*
  • Genomics / methods*
  • Introns / genetics
  • Lavandula / genetics*
  • Molecular Sequence Annotation
  • Phylogeny
  • Polymorphism, Single Nucleotide / genetics
  • Principal Component Analysis
  • RNA-Seq
  • Transcriptome / genetics

Substances

  • DNA, Plant

Grants and funding

This project was funded by the Trust account for "Agricultural and Rural Development" (CASDAR), project number C-2017-05, on behalf of the French Ministry of Agriculture. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.