ABSTRACT
Inspired by recent efforts to model cancer evolution with phylogenetic trees, we consider the problem of finding a consensus tumor evolution tree from a set of conflicting input trees. In contrast to traditional phylogenetic trees, the tumor trees we consider carry mutation labels on internal vertices as well as on leaves, and a single vertex may be labeled by multiple mutations. We describe several distance measures between these tumor trees and present an algorithm, GraPhyC, that solves the consensus problem. Our approach builds a weighted directed graph in which vertices are sets of mutations and each edge is weighted by a function of the number of times a parental relationship is observed between its constituent mutations in the set of input trees. We find a minimum-weight spanning arborescence in this graph and prove that the resulting tree minimizes the total distance to all input trees for one of our presented distance measures. We evaluate GraPhyC using both simulated and real data. On simulated data, we show that our method outperforms a baseline method at finding an appropriate representative tree. On a set of tumor trees derived from both whole-genome and deep sequencing data from a chronic lymphocytic leukemia patient, our approach identifies a tree that is not in the set of input trees but that has characteristics supported by other reported evolutionary reconstructions of this tumor.
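The graph construction described above can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: each vertex holds a single mutation, all trees share a known root (here the hypothetical label `"r"`), and an edge's weight is the number of input trees that lack that parent-child pair. The function names are hypothetical, and the minimum-weight spanning arborescence is found by exhaustive search rather than by an Edmonds-style algorithm, so the sketch is only practical for a handful of mutations.

```python
from collections import Counter
from itertools import product


def parent_child_counts(trees):
    """Count how many input trees contain each (parent, child) edge."""
    counts = Counter()
    for tree in trees:
        counts.update(set(tree))
    return counts


def reaches_root(parent, root):
    """True if every vertex reaches the root by following parent pointers."""
    for v in parent:
        seen = set()
        while v != root:
            if v in seen:  # cycle (including a self-loop): not an arborescence
                return False
            seen.add(v)
            v = parent[v]
    return True


def consensus_tree(trees, root):
    """Exhaustively search for a minimum-weight spanning arborescence.

    Assumed weight of edge (p, c): number of input trees lacking that edge.
    """
    counts = parent_child_counts(trees)
    nodes = sorted({v for tree in trees for edge in tree for v in edge})
    children = [v for v in nodes if v != root]
    best, best_cost = None, None
    # Try every assignment of one parent to each non-root vertex.
    for choice in product(nodes, repeat=len(children)):
        parent = dict(zip(children, choice))
        if not reaches_root(parent, root):
            continue
        cost = sum(len(trees) - counts[(p, c)] for c, p in parent.items())
        if best_cost is None or cost < best_cost:
            best, best_cost = parent, cost
    return sorted((p, c) for c, p in best.items()), best_cost


# Three toy input trees over mutations A, B, C, given as (parent, child) edges.
trees = [
    [("r", "A"), ("A", "B"), ("A", "C")],
    [("r", "A"), ("A", "B"), ("B", "C")],
    [("r", "A"), ("A", "B"), ("A", "C")],
]
edges, cost = consensus_tree(trees, "r")
print(edges, cost)
```

In this toy input the trees disagree only on the parent of `C`, and the search keeps the edge `A -> C` that two of the three trees support. For realistic input sizes the exhaustive loop would need to be replaced by a polynomial-time arborescence algorithm.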