Variational Supertrees for Bayesian Phylogenetics

Michael D Karcher; Cheng Zhang; Frederic A Matsen 4th

doi:10.1007/s11538-024-01338-5

Bull Math Biol. 2024 Aug 5;86(9):114. doi: 10.1007/s11538-024-01338-5.

Authors

Michael D Karcher¹ ², Cheng Zhang³, Frederic A Matsen 4th⁴

Affiliations

¹ Department of Math & CS, Muhlenberg College, 2400 W Chew St, Allentown, PA, 18104, USA. michaelkarcher@muhlenberg.edu.
² Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. N., Seattle, WA, 98109, USA. michaelkarcher@muhlenberg.edu.
³ School of Mathematical Sciences and Center for Statistical Science, Peking University, No. 5 Yiheyuan Road, Haidian District, Beijing, 100871, People's Republic of China.
⁴ Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. N., Seattle, WA, 98109, USA.

Abstract

Bayesian phylogenetic inference is powerful but computationally intensive. Researchers may find themselves with two phylogenetic posteriors on overlapping data sets and may wish to approximate a combined result without having to re-run potentially expensive Markov chains on the combined data set. This raises the question: given overlapping subsets of a set of taxa (e.g. species or virus samples), and given posterior distributions on phylogenetic tree topologies for each of these taxon sets, how can we optimize a probability distribution on phylogenetic tree topologies for the entire taxon set? In this paper we develop a variational approach to this problem and demonstrate its effectiveness. Specifically, we develop an algorithm to find a suitable support of the variational tree topology distribution on the entire taxon set, as well as a gradient-descent algorithm to minimize the divergence from the restrictions of the variational distribution to each of the given per-subset probability distributions, in an effort to approximate the posterior distribution on the entire taxon set.

Keywords: Divide-and-conquer; Gradient descent; Phylogenetics; Supertrees; Variational methods.
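
To make the objective in the abstract concrete, here is a minimal toy sketch, not the authors' implementation. It assumes rooted topologies stored as sets of clades, a small hand-picked support of full trees, a softmax parameterization of the variational distribution, and KL(p_S || q_S) as the divergence; the abstract specifies none of these, and all function names and target probabilities below are hypothetical.

```python
import math

def tree(*clades):
    """A rooted topology as a set of clades, e.g. tree("AB", "ABC")."""
    return frozenset(frozenset(c) for c in clades)

def restrict(t, taxa):
    """Induced topology on `taxa`: intersect each clade, drop trivial ones."""
    return frozenset(c & taxa for c in t if len(c & taxa) >= 2)

def softmax(theta):
    m = max(theta)
    w = [math.exp(x - m) for x in theta]
    z = sum(w)
    return [x / z for x in w]

def restricted_probs(q, trees, taxa):
    """Push q on full trees forward to a distribution on restricted trees."""
    out = {}
    for qi, t in zip(q, trees):
        r = restrict(t, taxa)
        out[r] = out.get(r, 0.0) + qi
    return out

def loss_and_grad(theta, trees, targets):
    """Sum over subsets S of KL(p_S || q_S), and its gradient in theta.

    Assumes every topology with positive mass under p_S is the restriction
    of at least one tree in the support (otherwise q_S[t] is missing).
    """
    q = softmax(theta)
    loss = 0.0
    g = [0.0] * len(trees)  # partial derivatives of the loss w.r.t. q
    for taxa, p in targets:
        q_s = restricted_probs(q, trees, taxa)
        for t, pt in p.items():
            loss += pt * math.log(pt / q_s[t])
        for i, t in enumerate(trees):
            r = restrict(t, taxa)
            if r in p:
                g[i] -= p[r] / q_s[r]
    # Chain rule through the softmax: dL/dtheta_i = q_i * (g_i - sum_j q_j g_j)
    dot = sum(qi * gi for qi, gi in zip(q, g))
    return loss, [qi * (gi - dot) for qi, gi in zip(q, g)]

def fit(trees, targets, steps=2000, lr=0.5):
    """Plain gradient descent from a uniform starting distribution."""
    theta = [0.0] * len(trees)
    for _ in range(steps):
        _, grad = loss_and_grad(theta, trees, targets)
        theta = [x - lr * d for x, d in zip(theta, grad)]
    return softmax(theta)

# Hypothetical support on taxa {A, B, C, D}.
TREES = [
    tree("AB", "CD", "ABCD"),   # ((A,B),(C,D))
    tree("AB", "ABC", "ABCD"),  # (((A,B),C),D)
    tree("AC", "ABC", "ABCD"),  # (((A,C),B),D)
]
# Hypothetical per-subset distributions on two overlapping taxon subsets.
TARGETS = [
    (frozenset("ABC"), {tree("AB", "ABC"): 0.7, tree("AC", "ABC"): 0.3}),
    (frozenset("BCD"), {tree("CD", "BCD"): 0.4, tree("BC", "BCD"): 0.6}),
]

if __name__ == "__main__":
    q = fit(TREES, TARGETS)
    print([round(x, 3) for x in q])
```

The targets here were chosen so that an exact match exists: the restrictions agree with both subset distributions when the three trees carry mass 0.4, 0.3 and 0.3, so the fitted distribution should approach those values. With incompatible targets the same loop would instead settle on a compromise between the subsets.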

MeSH terms

Algorithms*
Bayes Theorem*
Computer Simulation
Markov Chains*
Mathematical Concepts*
Models, Genetic*
Phylogeny*
Probability
