Title: Building Supertree for Rooted Phylogenetic Trees)
Date/Time: March 29th, 2017 @ 2:10 PM
Place: 223 Atanasoff
Major Professor: David Fernandez - Baca
Committee Members: Oliver Eulenstein, Xiaoqiu Huang, Ryan Martin, Wensheng Zhang
A phylogenetic tree is a branching diagram showing the inferred evolutionary relationships among various biological species or other entities. Building a phylogenetic tree that encompasses all living species is one of the central challenges of computational biology. One approach to overcoming this obstacle begins by identifying subsets of species for which enough data is available, and building phylogenies for each subset. The resulting trees are then synthesized into a single phylogeny —a supertree— for the combined set of species.
We present a new graph-based approach to the following basic problem in phylogenetic tree construction. Given a collection P of rooted phylogenetic trees over various subsets of a set of species. The tree compatibility problem asks whether there is a tree T with the following property: for each input tree can be obtained from the restriction of T to the species set of it by contracting zero or more edges. If such a tree T exists, we say that P is compatible.
Our approach leads to a O(MP log2 MP) algorithm for the tree compatibility problem, where MP is the total number of nodes and edges in P. Unlike previous algorithms, the running time of our method does not depend on the degrees of the nodes in the input trees. Thus, our algorithm is equally fast on highly resolved and highly unresolved trees. We also show that an extension of this approach allows us to solve, in the same running time, a more general version of the problem. In this generalization, called the ancestral compatibility problem, the input trees may have internal nodes that are labeled by higher-order taxa (that is, the labels are the names of families of species). The question is whether there exists an internally labeled tree T that respects the clusters and the ancestor/descendant relationships implied by the input phylogenies. In the future, I will implement our algorithm and study its performance on real data. And I will also integrate our algorithm into a synthesis method that deals with incompatible profiles.