High quality genome assembly and annotation (v1) of the eukaryotic freshwater microalga Coccomyxa elongata SAG 216-3b

G3 (Bethesda). 2024 Dec 13:jkae294. doi: 10.1093/g3journal/jkae294. Online ahead of print.

Abstract

Unicellular green algae of the genus Coccomyxa are recognized for their worldwide distribution and ecological versatility. Coccomyxa elongata is a freshwater species of the Coccomyxa simplex clade, which also includes lichen symbionts. To facilitate future molecular and phylogenomic studies of this versatile clade of algae, we generated a high-quality genome assembly for Coccomyxa elongata Chodat & Jaag SAG 216-3b within the framework of the Biodiversity Genomics Center Cologne (BioC2) initiative. Combining long-read sequencing (PacBio HiFi and Oxford Nanopore Technologies) with chromatin conformation capture (Hi-C) sequencing yielded an assembly of 21 scaffolds with a total length of 51.4 Mb and an N50 of 2.8 Mb. Nineteen of the scaffolds represent highly complete nuclear chromosomes delimited by telomeric repeats, while the remaining two scaffolds represent the mitochondrial and plastid genomes. Transcriptome-guided gene annotation identified 14,811 protein-coding genes, of which 61% have annotated PFAM domains and 841 are predicted to be secreted. BUSCO analysis against the Chlorophyta database identified 1,494 (98.4%) complete gene models, suggesting a highly complete genome annotation.
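
The N50 reported for the assembly is a length-weighted summary statistic: it is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The abstract does not say which tool computed it, so the following is only a minimal sketch of the standard definition. The function name `n50` and the scaffold lengths in the usage lines are hypothetical illustrations, not the scaffold lengths of this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (standard definition)."""
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    # Walk the scaffolds from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first scaffold that brings the running sum to
        # at least half of the total assembly length.
        if 2 * running >= total:
            return length


# Hypothetical toy lengths (arbitrary units), NOT data from the assembly.
toy_lengths = [5, 4, 3, 2, 1]
toy_n50 = n50(toy_lengths)
```

For the toy input, the total is 15, and the two longest scaffolds (5 + 4 = 9) are the first to cover at least half of it, so the function returns 4.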

Keywords: Coccomyxa; Trebouxiophyceae; genome annotation; genome assembly; long-read sequencing; unicellular algae.