Systematic Exploration of the High Likelihood Set of Phylogenetic Tree Topologies

by Chris Whidden, et al.

Bayesian Markov chain Monte Carlo explores tree space slowly, in part because it frequently returns to the same tree topology. An alternative strategy is to explore tree space systematically and never return to the same topology. In this paper, we present an efficient parallelized method to map out the high likelihood set of phylogenetic tree topologies via systematic search, which we show to be a good approximation of the high posterior set of tree topologies. Here the 'likelihood' of a topology refers to the tree likelihood for the corresponding tree with optimized branch lengths. We call this method 'phylogenetic topographer' (PT). The PT strategy is very simple: starting in a number of local topology maxima (obtained by hill-climbing from random starting points), explore outward using local topology rearrangements, continuing only through topologies whose likelihood is within some threshold of the best observed topology. We show that the normalized topology likelihoods are a useful proxy for the Bayesian posterior probability of those topologies. By using a non-blocking hash table keyed on unique representations of tree topologies, we avoid visiting topologies more than once across all concurrent threads exploring tree space. We demonstrate that PT can be used directly to approximate a Bayesian consensus tree topology. When combined with an accurate means of evaluating per-topology marginal likelihoods, PT gives an alternative procedure for obtaining Bayesian posterior distributions on phylogenetic tree topologies.
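The search strategy described in the abstract can be illustrated with a minimal single-threaded sketch. It is not the authors' implementation: the names `explore`, `normalize`, `neighbors`, and `log_lik` are hypothetical, a plain Python dict stands in for the non-blocking hash table, and a toy one-dimensional "tree space" of integers replaces real topologies and optimized-branch-length likelihoods. The normalization step assumes that "normalized topology likelihoods" means each likelihood divided by the sum over the retained set.

```python
import math
from collections import deque

def explore(starts, neighbors, log_lik, threshold):
    """Visit each topology at most once, continuing only through those
    within `threshold` log-likelihood units of the best seen so far."""
    visited = {}                      # topology key -> log-likelihood
    queue = deque()
    for t in starts:
        if t not in visited:
            visited[t] = log_lik(t)
            queue.append(t)
    best = max(visited.values())
    while queue:
        t = queue.popleft()
        if visited[t] < best - threshold:
            continue                  # too poor: do not expand its neighbors
        for n in neighbors(t):
            if n in visited:
                continue              # never evaluate a topology twice
            visited[n] = log_lik(n)
            best = max(best, visited[n])
            queue.append(n)
    # Keep only the high likelihood set relative to the final best value.
    return {t: ll for t, ll in visited.items() if ll >= best - threshold}

def normalize(log_liks):
    """Turn log-likelihoods into weights that sum to one (stable softmax)."""
    m = max(log_liks.values())
    weights = {t: math.exp(ll - m) for t, ll in log_liks.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

# Toy stand-in for tree space: integers 0..20 with a single peak at 10.
def toy_log_lik(t):
    return -abs(t - 10)

def toy_neighbors(t):
    return [n for n in (t - 1, t + 1) if 0 <= n <= 20]

high = explore(starts=[10], neighbors=toy_neighbors,
               log_lik=toy_log_lik, threshold=2.0)
probs = normalize(high)
```

In this toy run the retained set is the five integers from 8 to 12, and the normalized weights peak at 10. In a real setting the dict key would be a canonical encoding of the topology, `neighbors` would enumerate local rearrangements, and the shared table would need to be safe for concurrent access.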




Polishness of some topologies related to automata (Extended version)

We prove that the Büchi topology, the automatic topology, the alphabetic...

19 dubious ways to compute the marginal likelihood of a phylogenetic tree topology

The marginal likelihood of a model is a key quantity for assessing the e...

A New Quartet Tree Heuristic for Hierarchical Clustering

We consider the problem of constructing an optimal-weight tree from t...

Continuous-Time Birth-Death MCMC for Bayesian Regression Tree Models

Decision trees are flexible models that are well suited for many statist...

VaiPhy: a Variational Inference Based Algorithm for Phylogeny

Phylogenetics is a classical methodology in computational biology that t...

On the minimum quartet tree cost problem

Given a set of n data objects and their pairwise dissimilarities, the go...

GeoPhy: Differentiable Phylogenetic Inference via Geometric Gradients of Tree Topologies

Phylogenetic inference, grounded in molecular evolution models, is essen...
