VaiPhy: a Variational Inference Based Algorithm for Phylogeny

by Hazal Koptagel et al.

Phylogenetics is a classical methodology in computational biology that has become highly relevant for medical investigation of single-cell data, e.g., in the context of cancer development. The exponential size of the tree space is a formidable obstacle for current Bayesian phylogenetic inference with Markov chain Monte Carlo based methods, since these rely on local operations. Although more recent variational inference (VI) based methods offer speed improvements, they rely on expensive auto-differentiation operations for learning the variational parameters. We propose VaiPhy, a remarkably fast VI based algorithm for approximate posterior inference in an augmented tree space. VaiPhy produces marginal log-likelihood estimates on par with state-of-the-art methods on real data, and is considerably faster since it does not require auto-differentiation. Instead, VaiPhy combines coordinate ascent update equations with two novel sampling schemes: (i) SLANTIS, a proposal distribution for tree topologies in the augmented tree space, and (ii) the JC sampler, which is, to the best of our knowledge, the first scheme for sampling branch lengths directly from the popular Jukes-Cantor model. We compare VaiPhy with the baselines in terms of density estimation and runtime. Additionally, we evaluate the reproducibility of the baselines. Our code is available on GitHub.
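
For readers unfamiliar with the Jukes-Cantor (JC69) model mentioned in the abstract, the following is a minimal sketch of its standard textbook formulas: the transition probabilities along a branch of length t (in expected substitutions per site) and the closed-form distance estimate between two aligned sequences. It is not the JC sampler proposed in the paper, and the function names `jc69_transition_prob` and `jc69_distance` are hypothetical, chosen only for illustration.

```python
import math


def jc69_transition_prob(t: float, same: bool) -> float:
    """JC69 probability of ending in the same state (same=True) or in one
    particular different state (same=False) after a branch of length t,
    measured in expected substitutions per site."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


def jc69_distance(seq_a: str, seq_b: str) -> float:
    """Closed-form JC69 distance estimate between two aligned sequences
    of equal length (illustrative helper, not taken from the paper)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    # Fraction of sites at which the two sequences differ.
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        # Saturation: the estimate diverges once p reaches 3/4.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

Because every substitution is equally likely under JC69, the branch-length estimate depends only on the fraction of differing sites, which is what makes closed-form treatment of branch lengths feasible for this model.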


