Variational Supertrees for Bayesian Phylogenetics

Bull Math Biol. 2024 Aug 5;86(9):114. doi: 10.1007/s11538-024-01338-5.

Abstract

Bayesian phylogenetic inference is powerful but computationally intensive. Researchers may find themselves with two phylogenetic posteriors on overlapping data sets and may wish to approximate a combined result without re-running potentially expensive Markov chains on the combined data set. This raises the question: given overlapping subsets of a set of taxa (e.g., species or virus samples), and given a posterior distribution on phylogenetic tree topologies for each of these taxon subsets, how can we optimize a probability distribution on phylogenetic tree topologies for the entire taxon set? In this paper we develop a variational approach to this problem and demonstrate its effectiveness. Specifically, we develop an algorithm to find a suitable support for the variational tree topology distribution on the entire taxon set, together with a gradient-descent algorithm that minimizes the divergence from the restriction of the variational distribution on each taxon subset to the corresponding given per-subset distribution, in an effort to approximate the posterior distribution on the entire taxon set.
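
The optimization described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation. It rests on several assumptions that the abstract does not state: rooted topologies are stored as nested frozensets, the support on the full taxon set is a short hand-picked list, the variational distribution is a softmax over that explicit list, and the divergence is taken to be KL(p_i || restricted q) summed over the taxon subsets. All function names (tree, restrict, fit, and so on) are hypothetical.

```python
import math

def tree(*children):
    """Rooted topology as a frozenset of subtrees; leaves are taxon names."""
    return frozenset(children)

def restrict(topology, taxa):
    """Prune a topology to `taxa`, suppressing nodes left with one child."""
    if isinstance(topology, str):
        return topology if topology in taxa else None
    kept = [r for r in (restrict(c, taxa) for c in topology) if r is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else frozenset(kept)

def softmax(logits):
    m = max(logits)
    w = [math.exp(x - m) for x in logits]
    s = sum(w)
    return [x / s for x in w]

def restricted_probs(q, support, taxa):
    """Push q (over full topologies) forward onto restricted topologies."""
    out = {}
    for prob, top in zip(q, support):
        r = restrict(top, taxa)
        out[r] = out.get(r, 0.0) + prob
    return out

def loss_and_grad(logits, support, references):
    """Assumed objective: sum_i KL(p_i || q restricted to subset i).

    Every topology with positive reference probability must be the
    restriction of some support topology; otherwise q_r[top] is missing.
    """
    q = softmax(logits)
    loss = 0.0
    dq = [0.0] * len(q)  # gradient of the loss with respect to q
    for taxa, p in references:
        q_r = restricted_probs(q, support, taxa)
        for top, p_t in p.items():
            loss += p_t * math.log(p_t / q_r[top])
        for k, top in enumerate(support):
            r = restrict(top, taxa)
            if r in p:
                dq[k] -= p[r] / q_r[r]
    # Chain rule through the softmax parametrization.
    mean = sum(qk * gk for qk, gk in zip(q, dq))
    grad = [qk * (gk - mean) for qk, gk in zip(q, dq)]
    return loss, grad

def fit(support, references, steps=5000, lr=0.5):
    """Plain gradient descent on the logits, starting from uniform q."""
    logits = [0.0] * len(support)
    for _ in range(steps):
        _, grad = loss_and_grad(logits, support, references)
        logits = [x - lr * g for x, g in zip(logits, grad)]
    return softmax(logits)

# Hand-picked support on taxa {A, B, C, D} (illustrative only).
support = [
    tree(tree("A", "B"), tree("C", "D")),
    tree(tree(tree("A", "B"), "C"), "D"),
    tree(tree("A", tree("B", "C")), "D"),
]
# Made-up per-subset posteriors on two overlapping taxon subsets.
references = [
    ({"A", "B", "C"}, {tree(tree("A", "B"), "C"): 0.7,
                       tree("A", tree("B", "C")): 0.3}),
    ({"B", "C", "D"}, {tree("B", tree("C", "D")): 0.4,
                       tree(tree("B", "C"), "D"): 0.6}),
]
q = fit(support, references)
```

In this constructed example the probabilities (0.4, 0.3, 0.3) on the three support topologies reproduce both reference distributions exactly, so the assumed objective has minimum zero and gradient descent should approach those values. Real inputs rarely admit an exact match, and explicit enumeration of topologies does not scale beyond a handful of taxa, which is why the choice of support matters.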

Keywords: Divide-and-conquer; Gradient descent; Phylogenetics; Supertrees; Variational methods.

MeSH terms

  • Algorithms*
  • Bayes Theorem*
  • Computer Simulation
  • Markov Chains*
  • Mathematical Concepts*
  • Models, Genetic*
  • Phylogeny*
  • Probability