Premise: Common steps in phylogenomic matrix production include biological sequence concatenation, morphological data concatenation, insertion/deletion (indel) coding, gene content (presence/absence) coding, removing uninformative characters for parsimony analysis, recoding with reduced amino acid alphabets, and occupancy filtering. No single existing program accomplishes all of these tasks on a phylogenomic scale.
Methods and results: BAD2matrix is a Python script that performs the above-mentioned steps in phylogenomic matrix construction for DNA or amino acid sequences as well as morphological data. The script works in UNIX-like environments (e.g., Linux, macOS, Windows Subsystem for Linux).
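Two of the steps named above, occupancy filtering and sequence concatenation, can be illustrated with a minimal Python sketch. This is not BAD2matrix's code or interface: the function names `filter_by_occupancy` and `concatenate`, the dictionary-based alignment representation, the `?` missing-data symbol, and the toy data are all assumptions made for illustration only.

```python
from typing import Dict, List

Alignment = Dict[str, str]  # hypothetical representation: taxon name -> aligned sequence


def filter_by_occupancy(loci: List[Alignment], n_taxa: int,
                        min_occupancy: float) -> List[Alignment]:
    """Keep loci in which at least min_occupancy of all taxa are present."""
    return [locus for locus in loci if len(locus) / n_taxa >= min_occupancy]


def concatenate(loci: List[Alignment], missing: str = "?") -> Alignment:
    """Join aligned loci end to end, padding absent taxa with missing data."""
    taxa = sorted({name for locus in loci for name in locus})
    matrix = {name: "" for name in taxa}
    for locus in loci:
        lengths = {len(seq) for seq in locus.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a locus must have equal length")
        length = lengths.pop()
        for name in taxa:
            # A taxon without data for this locus receives a block of '?'.
            matrix[name] += locus.get(name, missing * length)
    return matrix


# Toy data (invented for this example): three loci, three taxa.
loci = [
    {"A": "ACGT", "B": "ACGA", "C": "ACGG"},
    {"A": "TT", "B": "TA"},
    {"C": "GGG"},
]
kept = filter_by_occupancy(loci, n_taxa=3, min_occupancy=0.5)
matrix = concatenate(kept)
```

In this toy case the third locus is present in only one of three taxa and is dropped, and taxon C is padded with missing-data symbols for the second locus.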
Conclusions: BAD2matrix helps simplify phylogenomic pipelines and can be downloaded from https://github.com/dpl10/BAD2matrix/tree/master under a GNU General Public License v2.
Keywords: concatenation; gene content; gene presence/absence; indel coding; morphology; occupancy filtering; phylogenomics; reduced amino acid alphabets.
© 2024 The Author(s). Applications in Plant Sciences published by Wiley Periodicals LLC on behalf of Botanical Society of America.