Journal Home > Volume 18 , Issue 5

Phylogenetic trees have been widely used in the study of evolutionary biology for representing the tree-like evolution of a collection of species. However, different data sets and different methods often lead to the construction of different phylogenetic trees for the same set of species. Therefore, comparing these trees to determine similarities or, equivalently, dissimilarities, becomes the fundamental issue. Typically, Tree Bisection and Reconnection (TBR) and Subtree Prune and Regraft (SPR) distances have been proposed to facilitate the comparison between different phylogenetic trees. In this paper, we give a survey on the aspects of computational complexity, fixed-parameter algorithms, and approximation algorithms for computing the TBR and SPR distances of phylogenetic trees.


menu
Abstract
Full text
Outline
About this article

Distances Between Phylogenetic Trees: A Survey

Show Author's information Feng ShiQilong Feng( )Jianer ChenLusheng WangJianxin Wang
School of Information Science and Engineering, Central South University, Changsha 410083, China
Department of Computer Science and Engineering, Texas A&M University, College Station, Texas 77843-3112, USA
Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong, China

Abstract

Phylogenetic trees have been widely used in the study of evolutionary biology for representing the tree-like evolution of a collection of species. However, different data sets and different methods often lead to the construction of different phylogenetic trees for the same set of species. Therefore, comparing these trees to determine similarities or, equivalently, dissimilarities, becomes the fundamental issue. Typically, Tree Bisection and Reconnection (TBR) and Subtree Prune and Regraft (SPR) distances have been proposed to facilitate the comparison between different phylogenetic trees. In this paper, we give a survey on the aspects of computational complexity, fixed-parameter algorithms, and approximation algorithms for computing the TBR and SPR distances of phylogenetic trees.

Keywords: approximation algorithm, phylogenetic tree, tree bisection and reconnection, subtree prune and regraft, fixed-parameter algorithm

References(41)

[1]
A. W. F.Edwardsand L. L.Cavalli-Sforza, The reconstruction of evolution, Annals of Human Genetics, vol. 27, pp. 105-106, 1963.
[2]
W.Fitch, Toward defining the course of evolution: Minimum change for a specified tree topology, Systematic Biology, vol. 20, no. 4, pp. 406-416, 1971.
[3]
D.Sankoff, Minimal mutation trees of sequences, SIAM Journal on Applied Mathematics, vol. 28, no. 1, pp. 35-42, 1975.
[4]
W. J.Le Quesne, The uniquely evolved character concept and its cladistic application, Systematic Biology, vol. 23, no. 4, pp. 513-517, 1974.
[5]
W. M.Fitchand E.Margoliash, Construction of phylogenetic trees, Science, vol. 155, no. 3760, pp. 279-284, 1967.
[6]
N.Saitouand M.Nei, The neighbor-joining method: A new method for reconstructing phylogenetic trees, Molecular Biology and Evolution, vol. 4, no. 4, pp. 406-425, 1987.
[7]
J.Felsenstein, Evolutionary trees for DNA sequences: A maximum likelihood approach, Journal of Molecular Evolution, vol. 17, no. 6, pp. 368-376, 1981.
[8]
D.Barryand J. A.Hartigan, Statistical analysis of hominoid molecular evolution, Statistical Science, vol. 2, no. 2, pp. 191-210, 1987.
[9]
D.Robinsonand L.Foulds, Comparison of phylogenetic trees, Mathematical Biosciences, vol. 53, nos. 1-2, pp. 131-147, 1981.
[10]
D.Robinson, Comparison of labeled trees with valency three, Journal of Combinatorial Theory, Series B, vol. 11, no. 2, pp. 105-119, 1971.
[11]
K.Culikand D.Wood, A note on some tree similarity measures, Information Processing Letters, vol. 15, no. 1, pp. 39-42, 1982.
[12]
W.Day, Properties of the nearest neighbor interchange metric for trees of small size, Journal of Theoretical Biology, vol. 101, no. 2, pp. 275-288, 1983.
[13]
R.Boland, E.Brown, and E.Day, Approximating minimum-length-sequence metrics: A cautionary note, Mathematical Social Sciences, vol. 4, no. 3, pp. 261-270, 1983.
[14]
J.Jarvis, J.Luedeman, and D.Shier, Counterexamples in measuring the distance between binary trees, Mathematical Social Sciences, vol. 4, no. 3, pp. 271-274, 1983.
[15]
J.Jarvis, J.Luedeman, and D.Shier, Comments on computing the similarity of binary trees, Journal of Theoretical Biology, vol. 100, no. 3, pp. 427-433, 1983.
[16]
M.Krivanke, Computing the nearest neighbor interchange metric for unlabeled binary trees is NP-complete, Journal of Classification, vol. 3, no. 1, pp. 55-60, 1986.
[17]
V.Kingand T.Warnow, On measuring the nni distance between two evolutionary trees, presented at the DIMACS Mini Workshop on Combinatorial Structures in Molecular Biology, South Plainfield, USA, 1994.
[18]
M.Li, J.Tromp, and L.Zhang, Some notes on the nearest neighbor interchange distance, in Proc. 2nd Annual International Computing and Combinatorics Conference (COCOON), Hong Kong, China, 1996, pp. 343-351.
DOI
[19]
M.Li, J.Tromp, and L.Zhang, On the nearest neighbor interchange distance between evolutionary trees, Journal on Theoretical Biology, vol. 182, no. 4, pp. 463-467, 1996.
[20]
J.Hein, Reconstructing evolution of sequences subject to recombination using parsimony, Mathematical Biosciences, vol. 98, no. 2, pp. 185-200, 1990.
[21]
J.Hein, A heuristic method to reconstruct the history of sequences subject to recombination, Journal of Molecular Evolution, vol. 36, no. 4, pp. 396-405, 1993.
[22]
P.Buneman, The recovery of trees from measures of dissimilarity, in Mathematics in the Archaeological and Historical Sciences, F. R.Hodson, D. G.Kendall, and P. T.Tautu, Ed. Edinburgh, UK: Edinburgh University Press, 1971, pp. 387-395.
[23]
D.Swofford, G.Olsen, P.Waddell, and D.Hillis, Phylogenetic inference, in Molecular Systematics, 1996, pp. 407-513.
[24]
B.Allenand M.Steel, Subtree transfer operations and their induced metrics on evolutionary trees, Annals of Combinatorics, vol. 5, no. 1, pp. 1-15, 2001.
[25]
F.Shi, J.Chen, Q.Feng, and J.Wang, Parameterized algorithms for maximum agreement forest on multiple trees, in Proc. 19th Annual International Computing and Combinatorics Conference (COCOON), Hangzhou, China, 2013, pp. 567-578.
DOI
[26]
J.Hein, T.Jiang, L.Wang, and K.Zhang, On the complexity of comparing evolutionary trees, Discrete Applied Mathematics, vol. 71, no. 1-3, pp. 153-169, 1996.
[27]
Y.Yang, A fixed-parameterized algorithm for the maximum agreement forest problem on multifurcating trees, Master dissertation, Central South University, Changsha, China, 2012. .
[28]
R.Downeyand M.Fellows, Parameterized Complexity. New York, USA: Springer, 1999.
DOI
[29]
J.Chen, Parameterized computation and complexity: A new approach dealing with NP-hardness, Journal of Computer Science and Technology, vol. 20, no. 1, pp. 18-37, 2005.
[30]
M.Hallettand C.McCartin, A faster FPT algorithm for the maximum agreement forest problem, Theory of Computing Systems, vol. 41, no. 3, pp. 539-550, 2007.
[31]
C.Whiddenand N.Zeh, A unifying view on approximation and FPT of agreement forests, in Proc. Algorithms in Bioinformatics, Philadelphia, PA, USA, 2009, pp. 390-402.
DOI
[32]
J.Chen, J.Fan, and S.Sze, Parameterized and approximation algorithms for the MAF problem in multifurcating trees, presented at the 39th International Workshop on Graph-Theoretic Concepts in Computer Science, Lübeck, Germany, 2013.
[33]
M.Bordewichand C.Semple, On the computational complexity of the rooted subtree prune and regraft distance, Annals of Combinatorics, vol. 8, no. 4, pp. 409-423, 2005.
[34]
G.Hickey, F.Dehne, A.Rau-Chaplin, and C.Blouin, SPR distance computation for unrooted trees, Evolutionary Bioinformatics, vol. 4, pp. 17-27, 2008.
[35]
M.Bonetand K. St.John, On the complexity of uSPR distance, IEEE/ACM Transaction on Computational Biology and Bioinformatics, vol. 7, no. 3, pp. 572-576, 2010.
DOI
[36]
M.Bordewich, C.McCartin, and C.Semple, A 3-approximation algorithm for the subtree distance between phylogenies, Journal of Discrete Algorithms, vol. 6, no. 3, pp. 458-471, 2008.
[37]
C.Whidden, R.Beiko, and N.Zeh, Fixed-parameter and approximation algorithms for maximum agreement forests, CoRR. abs/1108.2664, 2011.
[38]
Z.Chenand L.Wang, HybridNET: A tool for constructing hybridization networks, Bioinformatics, vol. 26, no. 22, pp. 2912-2913, 2010.
[39]
E.Rodrigues, M.Sagot, and Y.Wakabayashi, The maximum agreement forest problem: Approximation algorithms and computational experiments, Theoretical Computer Science, vol. 374, nos. 1-3, pp. 91-110, 2007.
[40]
M.Bonet, K. St.John, R.Mahindru, and N.Amenta, Approximating subtree distances between phylogenies, Journal of Computational Biology, vol. 13, no. 8, pp. 1419-1434, 2006.
[41]
O.Paun, C.Lehnebach, J.Johansson, P.Lockhart, and E.Hrandl, Phylogenetic relationships and biogeography of Ranunculus and allied genera (Ranunculaceae) in the Mediter-ranean region and in the European Alpine System, Taxon, vol. 54, no. 4, pp. 911-932, 2005.
Publication history
Copyright
Acknowledgements
Rights and permissions

Publication history

Received: 06 August 2013
Revised: 15 August 2013
Accepted: 20 August 2013
Published: 03 October 2013
Issue date: October 2013

Copyright

© The author(s) 2013

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61103033, 61173051, 61232001, and 70921001).

Rights and permissions

Return