Hypothesis testing for topological data analysis

Robinson, Andrew; Turner, Katharine

doi:10.1007/s41468-017-0008-7

Original Paper
Published: 04 November 2017

Volume 1, pages 241–261 (2017)

Journal of Applied and Computational Topology

Abstract

Persistent homology is a vital tool for topological data analysis. Previous work has developed statistical estimators for characteristics of collections of persistence diagrams. However, tools for statistical inference when the observations are themselves persistence diagrams are limited. Specifically, there is a need for tests that assess the strength of evidence against the claim that two samples arise from the same population or process. This expository paper introduces randomization-style null hypothesis significance tests (NHST) and shows how they can be used with sets of persistence diagrams. The hypothesis test is based on a loss function that comprises pairwise distances between the elements of each sample and all the elements in the other sample. We use this method to analyze a range of simulated and experimental data. Through these examples we experimentally explore the power of the p-values. Our results show that the randomization-style NHST based on pairwise distances can distinguish between samples from different processes, which suggests that its use for hypothesis tests upon persistence diagrams is reasonable. We demonstrate its application on a real dataset of fMRI data from patients with ADHD.
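
To make the procedure described in the abstract concrete, the following is a minimal Python sketch of a randomization test on a precomputed matrix of pairwise distances. It is not the authors' implementation. The statistic (the mean distance between elements of one sample and elements of the other), the function names `cross_sample_loss` and `permutation_p_value`, and the default number of permutations are all assumptions made for illustration. Any metric on persistence diagrams is assumed to have been evaluated beforehand, so the toy data below uses points on a line as stand-ins for diagrams. The `+ 1` in the p-value follows the advice of Phipson and Smyth (2010) that permutation p-values should never be zero.

```python
import random


def cross_sample_loss(dist, group_a, group_b):
    """Mean distance between each element of group_a and each element of group_b.

    Illustrative statistic (an assumption), not the paper's exact loss function.
    """
    total = sum(dist[i][j] for i in group_a for j in group_b)
    return total / (len(group_a) * len(group_b))


def permutation_p_value(dist, n_a, n_permutations=999, seed=0):
    """Randomization test on a symmetric distance matrix (list of lists).

    Rows 0..n_a-1 are taken as sample A and the remaining rows as sample B.
    Labels are shuffled at random; the p-value is the proportion of labelings
    whose cross-sample loss is at least as large as the observed one.
    """
    indices = list(range(len(dist)))
    observed = cross_sample_loss(dist, indices[:n_a], indices[n_a:])
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(indices)  # random relabeling of the pooled observations
        loss = cross_sample_loss(dist, indices[:n_a], indices[n_a:])
        if loss >= observed:
            at_least_as_extreme += 1
    # Count the observed labeling itself so the p-value is never zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Toy stand-in for two samples of diagrams: points on a line,
    # with absolute difference as the distance.
    points = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
    dist = [[abs(a - b) for b in points] for a in points]
    print(permutation_p_value(dist, n_a=5))
```

Working from a precomputed distance matrix keeps the test independent of how the distances between diagrams are obtained, so the same sketch applies to whichever metric is chosen.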

Notes

The reader should note that in Turner et al. (2014b) the focus is on the L1 distance.

References

Baddeley, A., Silverman, B.: A cautionary example on the use of second-order methods for analyzing point patterns. Biometrics 40(4), 1089–1093 (1984)
Baddeley, A., Turner, R., et al.: Spatstat: an R package for analyzing spatial point patterns. J. Stat. Softw. 12(6), 1–42 (2005)
Balakrishnan, S., Fasy, B., Lecci, F., Rinaldo, A., Singh, A., Wasserman, L.: Statistical inference for persistent homology (2013). arXiv:1303.7117
Bendich, P., Edelsbrunner, H., Kerber, M.: Computing robustness and persistence for images. Vis. Comput. Graph. IEEE Trans. 16(6), 1251–1260 (2010)
Berger, J.: Could Fisher, Jeffreys and Neyman have agreed on testing? Stat. Sci. 18(1), 1–32 (2003)
Biscio, C., Møller, J.: The accumulated persistence function, a new useful functional summary statistic for topological data analysis, with a view to brain artery trees and spatial point process applications (2016). arXiv:1611.00630
Bubenik, P.: Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res. 16(1), 77–102 (2015)
Bubenik, P., Kim, P.T.: A statistical approach to persistent homology. Homol. Homotopy Appl. 9(2), 337–362 (2007)
Casella, G., Berger, R.L.: Statistical Inference. Duxbury Press, Belmont (1990)
Cericola, C., Johnson, I., Kiers, J., Krock, M., Purdy, J., Torrence, J.: Extending hypothesis testing with persistence homology to three or more groups (2016). arXiv:1602.03760
Cerri, A., Ferri, M., Giorgi, D.: Retrieval of trademark images by means of size functions. Graph. Models 68(5), 451–471 (2006)
Chazal, F., Glisse, M., Labruère, C., Michel, B.: Optimal rates of convergence for persistence diagrams in topological data analysis (2013). arXiv:1305.6239
Edgington, E.S., Onghena, P.: Randomization Tests, 4th edn. Chapman & Hall/CRC, Boca Raton (2007)
Ellis, S.P., Klein, A.: Describing high-order statistical dependence using “concurrence topology”, with application to functional MRI brain data (2012). arXiv:1212.1642
Gamble, J., Heo, G.: Exploring uses of persistent homology for statistical analysis of landmark-based shape data. J. Multivariate Anal. 101(9), 2184–2199 (2010)
Gao, J.X.: Visionlab. WWW (2004). http://visionlab.uta.edu/shape_data.htm
Hatcher, A.: Algebraic Topology. Cambridge University Press (2002)
Latecki, L.J., Lakamper, R., Eckhardt, T.: Shape descriptors for non-rigid shapes with a single closed contour. In: Computer Vision and Pattern Recognition, 2000. Proceedings. IEEE Conference on, vol. 1, pp. 424–429. IEEE (2000)
Mileyko, Y., Mukherjee, S., Harer, J.: Probability measures on the space of persistence diagrams. Inverse Probl. 27(12), 124007 (2011)
Pawitan, Y.: In All Likelihood: Statistical Modelling and Inference Using Likelihood. Clarendon Press, Oxford (2001)
Phipson, B., Smyth, G.K.: Permutation p-values should never be zero: calculating exact p-values when permutations are randomly drawn. Stat. Appl. Genet. Mol. Biol. 9(1) (2010)
Robins, V., Turner, K.: Principal component analysis of persistent homology rank functions with case studies of spatial point patterns, sphere packing and colloids (2015). arXiv:1507.01454
Sikora, T.: The MPEG-7 visual standard for content description-an overview. Circuits Syst. Video Technol. IEEE Trans. 11(6), 696–702 (2001)
Turner, K.: Means and medians of sets of persistence diagrams (2013). arXiv:1307.8300
Turner, K., Mileyko, Y., Mukherjee, S., Harer, J.: Fréchet means for distributions of persistence diagrams. Discret. Comput. Geom. 52(1), 44–70 (2014a)
Turner, K., Mukherjee, S., Boyer, D.M.: Persistent homology transform for modeling shapes and surfaces. Inf. Inference 3(4), 310–344 (2014b)
Welsh, A.H.: Aspects of Statistical Inference. Wiley, New York (1996)

Acknowledgements

We thank Steve Ellis and Arno Klein for providing us with the persistence diagrams produced in their work. The authors would like to acknowledge the assistance of the Defence Science Institute in facilitating this work.

Author information

Authors and Affiliations

CEBRA, University of Melbourne, Melbourne, Australia
Andrew Robinson
Australian National University, Canberra, Australia
Katharine Turner

Corresponding author

Correspondence to Katharine Turner.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Robinson, A., Turner, K. Hypothesis testing for topological data analysis. J Appl. and Comput. Topology 1, 241–261 (2017). https://doi.org/10.1007/s41468-017-0008-7

Published: 04 November 2017
Issue Date: December 2017
DOI: https://doi.org/10.1007/s41468-017-0008-7
