Abstract
Given a set of leaf-labeled trees with identical leaf sets, the well-known Maximum Agreement SubTree (MAST) problem consists of finding a subtree that is homeomorphically included in all input trees and has the largest number of leaves. MAST and its variant, Maximum Compatible Tree (MCT), are of particular interest in computational biology. This article presents a linear-time 3-approximation algorithm for the complement version of MAST, namely identifying the smallest set of leaves to remove from the input trees to obtain isomorphic trees. We also present an O(n^2 + kn) algorithm to solve the complement version of MCT. For both problems, we thus achieve significantly lower running times than previously known algorithms. Fast running times are especially important in phylogenetics, where large collections of trees are routinely produced by resampling procedures such as the nonparametric bootstrap or by Bayesian MCMC methods.
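To make the complement formulation concrete, here is a minimal Python sketch of a naive approach to it. It is not the linear-time algorithm of the article. It relies on a standard fact about rooted trees: two trees on the same leaves are isomorphic exactly when they induce the same shape on every set of three leaves. The sketch repeatedly finds three leaves on which the input trees disagree and removes all three. Any valid solution must remove at least one leaf of each such triple, and the removed triples are disjoint, so the result is at most three times the optimum. The nested-tuple tree encoding and all function names are assumptions made for illustration, and the running time is roughly cubic in the number of leaves.

```python
from itertools import combinations

def index_tree(tree):
    """Map each leaf to the tuple of internal-node ids on its root path.

    Trees are nested tuples with string leaves (an illustrative encoding).
    """
    paths = {}
    counter = [0]

    def walk(node, path):
        if isinstance(node, str):
            paths[node] = path
            return
        counter[0] += 1
        here = path + (counter[0],)
        for child in node:
            walk(child, here)

    walk(tree, ())
    return paths

def lca_depth(paths, a, b):
    """Depth of the lowest common ancestor: length of the shared path prefix."""
    depth = 0
    for x, y in zip(paths[a], paths[b]):
        if x != y:
            break
        depth += 1
    return depth

def triple_shape(paths, a, b, c):
    """Return the pair grouped together below the other leaf, or None for a fan."""
    ab = lca_depth(paths, a, b)
    ac = lca_depth(paths, a, c)
    bc = lca_depth(paths, b, c)
    if ab > ac:
        return (a, b)
    if ac > ab:
        return (a, c)
    if bc > ab:
        return (b, c)
    return None

def approx_complement_mast(trees):
    """Naive 3-approximation: remove every leaf of each conflicting triple found."""
    all_paths = [index_tree(t) for t in trees]
    leaves = set(all_paths[0])
    removed = set()
    changed = True
    while changed:
        changed = False
        # Restricting a tree to fewer leaves does not change the shape it
        # induces on the remaining triples, so the paths need no recomputation.
        for a, b, c in combinations(sorted(leaves - removed), 3):
            shapes = {triple_shape(p, a, b, c) for p in all_paths}
            if len(shapes) > 1:
                removed |= {a, b, c}
                changed = True
                break
    return removed

t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
print(sorted(approx_complement_mast([t1, t2])))
```

On these two four-leaf trees every triple conflicts, so an optimal solution removes two leaves, while the sketch removes the three leaves of the first conflicting triple it meets. That is within the factor of three.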
Index Terms
- Linear time 3-approximation for the MAST problem