Summary: Advances in high-throughput DNA sequencing technologies and decreasing costs have fueled the identification of small genetic variants (such as single nucleotide variants and indels) across tumors. Despite efforts to standardize variant formats and vocabularies, many sources of variability persist across the databases and computational tools that annotate variants, hindering the integration of their outputs into cancer genomic analyses. In this context, we present OpenVariant, an easily extensible Python package that facilitates the reading, parsing and refinement of diverse input file formats into a customizable structure, all within a single process.
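To make the harmonization problem concrete, the following is a minimal sketch that maps two differently structured variant tables onto one set of field names. It uses only the Python standard library and does not use or reproduce the OpenVariant API. The mapping dictionaries, the `parse_variants` helper, the unified field names and the sample rows are all assumptions introduced for illustration.

```python
import csv
import io

# Hypothetical per-format mappings: unified field name -> source column name.
# Illustrative only; this is not OpenVariant's annotation schema.
MAF_LIKE = {
    "chrom": "Chromosome",
    "pos": "Start_Position",
    "ref": "Reference_Allele",
    "alt": "Tumor_Seq_Allele2",
}
VCF_LIKE = {"chrom": "#CHROM", "pos": "POS", "ref": "REF", "alt": "ALT"}


def parse_variants(handle, mapping, delimiter="\t"):
    """Yield one dict per row, keyed by the unified field names in `mapping`."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        record = {field: row[column] for field, column in mapping.items()}
        # Refinement step: drop an optional "chr" prefix and cast the position.
        if record["chrom"].startswith("chr"):
            record["chrom"] = record["chrom"][3:]
        record["pos"] = int(record["pos"])
        yield record


# Made-up sample inputs with different headers and chromosome naming.
maf_text = (
    "Chromosome\tStart_Position\tReference_Allele\tTumor_Seq_Allele2\n"
    "chr1\t12345\tA\tT\n"
)
vcf_text = "#CHROM\tPOS\tID\tREF\tALT\n1\t67890\t.\tG\tC\n"

variants = list(parse_variants(io.StringIO(maf_text), MAF_LIKE))
variants += list(parse_variants(io.StringIO(vcf_text), VCF_LIKE))
```

After this runs, `variants` holds two records with identical keys (`chrom`, `pos`, `ref`, `alt`) regardless of the source layout. Keeping the column mapping as data rather than code is one way to support a new format without changing the parser.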
Availability and implementation: OpenVariant is an open-source package available at https://github.com/bbglab/openvariant. Documentation may be found at https://openvariant.readthedocs.io.
© The Author(s) 2024. Published by Oxford University Press.