Extensive studies on the taxonomic resolution required for bioassessment have determined that resolution above species level (genus, family) is sufficient for diatoms to serve as indicators of relevant environmental pressures. The high-throughput sequencing (HTS) and metabarcoding methods now used for bioassessment traditionally employ an arbitrary sequence similarity threshold (SST) of around 95% or 97% to cluster sequences into operational taxonomic units (OTUs), a threshold considered to approximate species-level resolution. In this study, we analyzed the effect of the SST on the resulting diatom-based ecological quality index. The index is based on the distribution of OTU abundances along a defined environmental gradient and ideally avoids taxonomic assignment, which can result in high rates of unclassified OTUs and biased final values. A total of 90 biofilm samples were collected in 2014 and 2015 from 51 stream sites on Mayotte Island, in parallel with measurements of relevant physical and chemical parameters. The biofilms were sequenced by HTS using the rbcL region as the genetic marker and diatom-specific primers. Hierarchical clustering was used to group sequences into OTUs at 20 experimental SST levels (80%-99%). An OTU-based quality index (IdxOTU) was developed from a weighted average equation using the abundance profiles of the OTUs. The IdxOTU values correlated significantly with the reference pressure gradient, with maximal performance at an SST of 90% (a taxonomic resolution well above species-level delimitation). We observed a trade-off between the power to discriminate among sampling sites and index stability, which should inform future applications of the index. Taken together, these results describe an optimized and validated approach to generating robust, reproducible, and complete indices that should facilitate effective and efficient environmental monitoring.
Keywords: Diatoms; OTU; Water Framework Directive; high‐throughput sequencing; pollution assessment; sequence similarity threshold; taxonomic resolution.
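
The abstract does not give the weighted average equation behind IdxOTU, so the following is only a minimal sketch of a generic weighted-averaging index and not the authors' implementation. It assumes that each OTU receives an "optimum" equal to the abundance-weighted mean of the pressure gradient across calibration sites, and that a site's index is the abundance-weighted mean of the optima of the OTUs found there. The function names (`otu_optima`, `site_index`), the data layout, and the toy numbers are all hypothetical.

```python
def otu_optima(abundance, gradient):
    """Abundance-weighted mean gradient value per OTU (assumed calibration step).

    abundance: dict mapping site -> dict mapping OTU id -> read count
    gradient:  dict mapping site -> pressure value for that site
    """
    num, den = {}, {}
    for site, counts in abundance.items():
        for otu, n in counts.items():
            num[otu] = num.get(otu, 0.0) + n * gradient[site]
            den[otu] = den.get(otu, 0.0) + n
    # OTUs with zero total abundance carry no information and are skipped.
    return {otu: num[otu] / den[otu] for otu in num if den[otu] > 0}


def site_index(counts, optima):
    """Abundance-weighted mean of OTU optima for one sample.

    OTUs without a calibrated optimum are ignored; returns None when
    no calibrated OTU is present in the sample.
    """
    known = {otu: n for otu, n in counts.items() if otu in optima}
    total = sum(known.values())
    if total == 0:
        return None
    return sum(n * optima[otu] for otu, n in known.items()) / total


# Toy calibration data (hypothetical values, two sites and two OTUs).
calibration = {
    "s1": {"otu1": 30, "otu2": 10},
    "s2": {"otu1": 10, "otu2": 30},
}
pressure = {"s1": 1.0, "s2": 3.0}

optima = otu_optima(calibration, pressure)
new_sample_index = site_index({"otu1": 20, "otu2": 20}, optima)
```

Under these assumptions, the index would be recomputed once per SST level, since each threshold yields a different OTU table, and the resulting site values would then be compared against the reference pressure gradient.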