Hypothesis testing for topological data analysis

  • Original Paper
  • Journal of Applied and Computational Topology

Abstract

Persistent homology is a vital tool for topological data analysis. Previous work has developed statistical estimators for characteristics of collections of persistence diagrams, but tools for statistical inference on observations that are themselves persistence diagrams remain limited. Specifically, there is a need for tests that assess the strength of evidence against the claim that two samples arise from the same population or process. This expository paper introduces randomization-style null hypothesis significance tests (NHST) and shows how they can be used with sets of persistence diagrams. The hypothesis test is based on a loss function that comprises pairwise distances between the elements of each sample and all the elements in the other sample. We use this method to analyze a range of simulated and experimental data, and through these examples we experimentally explore the power of the p-values. Our results show that the randomization-style NHST based on pairwise distances can distinguish between samples from different processes, which suggests that its use for hypothesis tests on persistence diagrams is reasonable. We demonstrate its application on a real dataset of fMRI data from patients with ADHD.
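
As a rough illustration of the randomization-style test described in the abstract, the following Python sketch permutes group labels over a precomputed distance matrix and compares the observed loss with the permuted losses. It is not the authors' implementation. The function names, the choice of the cross-sample sum of distances as the loss, the convention that larger values count as more extreme, and the toy data are all assumptions made for illustration. In practice the matrix would hold distances between persistence diagrams, computed separately.

```python
import random


def cross_loss(dist, group_a, group_b):
    """Sum of distances between every element of group_a and every element of group_b."""
    return sum(dist[i][j] for i in group_a for j in group_b)


def randomization_test(dist, n_a, n_perm=999, seed=0):
    """Monte Carlo randomization test on a precomputed distance matrix.

    dist   : square symmetric matrix (list of lists); rows 0..n_a-1 are
             sample A and the remaining rows are sample B (assumed layout)
    n_a    : size of sample A
    n_perm : number of random relabelings to draw
    Returns (observed_loss, p_value).
    """
    rng = random.Random(seed)
    labels = list(range(len(dist)))
    observed = cross_loss(dist, labels[:n_a], labels[n_a:])

    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # random reassignment of observations to groups
        if cross_loss(dist, labels[:n_a], labels[n_a:]) >= observed:
            hits += 1

    # Add one to numerator and denominator so the estimated p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy stand-in for persistence diagrams: integers with absolute difference
# as the distance (an assumption; real diagrams need a diagram distance).
values = [0, 1, 2, 3, 50, 51, 52, 53]
dist = [[abs(x - y) for y in values] for x in values]
observed, p_value = randomization_test(dist, n_a=4)
```

Because the total of all pairwise distances is fixed under relabeling, a large cross-sample sum corresponds to a small within-sample sum, so this statistic orders the relabelings the same way as a within-sample loss would. With non-integer distances, exact ties can be broken by floating-point rounding, so a small tolerance in the comparison may be appropriate.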

Notes

  1. The reader should note that in Turner et al. (2014b) the focus is on the L1 distance.

References

  • Baddeley, A., Silverman, B.: A cautionary example on the use of second-order methods for analyzing point patterns. Biometrics 40(4), 1089–1093 (1984)

  • Baddeley, A., Turner, R., et al.: Spatstat: an R package for analyzing spatial point patterns. J. Stat. Softw. 12(6), 1–42 (2005)

  • Balakrishnan, S., Fasy, B., Lecci, F., Rinaldo, A., Singh, A., Wasserman, L.: Statistical inference for persistent homology (2013). arXiv:1303.7117

  • Bendich, P., Edelsbrunner, H., Kerber, M.: Computing robustness and persistence for images. Vis. Comput. Graph. IEEE Trans. 16(6), 1251–1260 (2010)

  • Berger, J.: Could Fisher, Jeffreys and Neyman have agreed on testing? Stat. Sci. 18(1), 1–32 (2003)

  • Biscio, C., Møller, J.: The accumulated persistence function, a new useful functional summary statistic for topological data analysis, with a view to brain artery trees and spatial point process applications (2016). arXiv:1611.00630

  • Bubenik, P.: Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res. 16(1), 77–102 (2015)

  • Bubenik, P., Kim, P.T.: A statistical approach to persistent homology. Homol. Homotopy Appl. 9(2), 337–362 (2007)

  • Casella, G., Berger, R.L.: Statistical Inference. Duxbury Press, Belmont (1990)

  • Cericola, C., Johnson, I., Kiers, J., Krock, M., Purdy, J., Torrence, J.: Extending hypothesis testing with persistence homology to three or more groups (2016). arXiv:1602.03760

  • Cerri, A., Ferri, M., Giorgi, D.: Retrieval of trademark images by means of size functions. Graph. Models 68(5), 451–471 (2006)

  • Chazal, F., Glisse, M., Labruère, C., Michel, B.: Optimal rates of convergence for persistence diagrams in topological data analysis (2013). arXiv:1305.6239

  • Edgington, E.S., Onghena, P.: Randomization Tests, 4th edn. Chapman & Hall/CRC, Boca Raton (2007)

  • Ellis, S.P., Klein, A.: Describing high-order statistical dependence using “concurrence topology”, with application to functional MRI brain data (2012). arXiv:1212.1642

  • Gamble, J., Heo, G.: Exploring uses of persistent homology for statistical analysis of landmark-based shape data. J. Multivariate Anal. 101(9), 2184–2199 (2010)

  • Gao, J.X.: Visionlab. WWW (2004). http://visionlab.uta.edu/shape_data.htm

  • Hatcher, A.: Algebraic Topology. Cambridge University Press (2002)

  • Latecki, L.J., Lakamper, R., Eckhardt, T.: Shape descriptors for non-rigid shapes with a single closed contour. In: Computer Vision and Pattern Recognition, 2000. Proceedings. IEEE Conference on, vol. 1, pp. 424–429. IEEE (2000)

  • Mileyko, Y., Mukherjee, S., Harer, J.: Probability measures on the space of persistence diagrams. Inverse Probl. 27(12), 124007 (2011)

  • Pawitan, Y.: In All Likelihood: Statistical Modelling and Inference Using Likelihood. Clarendon Press, Oxford (2001)

  • Phipson, B., Smyth, G.K.: Permutation p-values should never be zero: calculating exact p-values when permutations are randomly drawn. Stat. Appl. Genet. Mol. Biol. 9(1) (2010)

  • Robins, V., Turner, K.: Principal component analysis of persistent homology rank functions with case studies of spatial point patterns, sphere packing and colloids (2015). arXiv:1507.01454

  • Sikora, T.: The MPEG-7 visual standard for content description-an overview. Circuits Syst. Video Technol. IEEE Trans. 11(6), 696–702 (2001)

  • Turner, K.: Means and medians of sets of persistence diagrams (2013). arXiv:1307.8300

  • Turner, K., Mileyko, Y., Mukherjee, S., Harer, J.: Fréchet means for distributions of persistence diagrams. Discret. Comput. Geom. 52(1), 44–70 (2014a)

  • Turner, K., Mukherjee, S., Boyer, D.M.: Persistent homology transform for modeling shapes and surfaces. Inf. Inference 3(4), 310–344 (2014b)

  • Welsh, A.H.: Aspects of Statistical Inference. Wiley, New York (1996)

Acknowledgements

We thank Steve Ellis and Arno Klein for providing us with the persistence diagrams produced in their work. The authors would like to acknowledge the assistance of the Defence Science Institute in facilitating this work.

Author information

Correspondence to Katharine Turner.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Robinson, A., Turner, K. Hypothesis testing for topological data analysis. J Appl. and Comput. Topology 1, 241–261 (2017). https://doi.org/10.1007/s41468-017-0008-7

  • DOI: https://doi.org/10.1007/s41468-017-0008-7
