DOI: 10.1145/3318464.3389760
research-article

Debunking Four Long-Standing Misconceptions of Time-Series Distance Measures

Published: 31 May 2020

ABSTRACT

Distance measures are core building blocks in time-series analysis and the subject of active research for decades. Unfortunately, the most detailed experimental study in this area is outdated (over a decade old) and, naturally, does not reflect recent progress. Importantly, this study (i) omitted multiple distance measures, including a classic measure in the time-series literature; (ii) considered only a single time-series normalization method; and (iii) reported only raw classification error rates without statistically validating the findings, resulting in or fueling four misconceptions in the time-series literature. Motivated by the aforementioned drawbacks and our curiosity to shed some light on these misconceptions, we comprehensively evaluate 71 time-series distance measures. Specifically, our study includes (i) 8 normalization methods; (ii) 52 lock-step measures; (iii) 4 sliding measures; (iv) 7 elastic measures; (v) 4 kernel functions; and (vi) 4 embedding measures. We extensively evaluate these measures across 128 time-series datasets using rigorous statistical analysis. Our findings debunk four long-standing misconceptions that significantly alter the landscape of what is known about existing distance measures. With the new foundations in place, we discuss open challenges and promising directions.
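
To make the categories named above concrete, the following is a minimal sketch of one normalization method and one representative each of the lock-step, sliding, and elastic families. It is an illustration under stated assumptions, not the study's implementation: the function names are hypothetical, the choice of z-normalization, Euclidean distance, a cross-correlation-based distance, and unconstrained dynamic time warping as representatives is ours, and the sliding measure assumes both series have the same length.

```python
import math


def z_normalize(x):
    """Rescale a series to zero mean and unit variance (one possible normalization)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0:
        return [0.0] * n  # constant series: nothing to scale
    return [(v - mean) / std for v in x]


def euclidean(x, y):
    """Lock-step: the i-th point of x is compared only with the i-th point of y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def sliding_distance(x, y):
    """Sliding: 1 minus the best normalized cross-correlation over all shifts.

    Assumes len(x) == len(y) (an assumption of this sketch).
    """
    n = len(x)
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if norm == 0:
        return 1.0
    best = -float("inf")
    for shift in range(-(n - 1), n):
        # Inner product of x with y shifted by `shift`, ignoring points that fall off the ends.
        cc = sum(x[i] * y[i - shift] for i in range(n) if 0 <= i - shift < n)
        best = max(best, cc / norm)
    return 1.0 - best


def dtw(x, y):
    """Elastic: dynamic time warping with no warping-window constraint."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])
```

For two series that contain the same bump at different positions, the lock-step measure reports a large distance, while the sliding and elastic measures can align the bumps before comparing them. This difference in behavior is the kind of property that the evaluation compares across measures and datasets.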

References

  1. Amaia Abanda, Usue Mori, and Jose A Lozano. 2019. A review on distance based time series classification. Data Mining and Knowledge Discovery 33, 2 (2019), 378--412.Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Rakesh Agrawal, Christos Faloutsos, and Arun N. Swami. 1993. Efficient Similarity Search In Sequence Databases. In FODO. 69--84.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Rakesh Agrawal, King-Ip Lin, Harpreet S. Sawhney, and Kyuseok Shim. 1995. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceeding of the 21th International Conference on Very Large Data Bases. Citeseer, 490--501.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Shadab Alam, Franco D Albareti, Carlos Allende Prieto, Friedrich Anders, Scott F Anderson, Timothy Anderton, Brett H Andrews, Eric Armengaud, Éric Aubourg, Stephen Bailey, et al.2015. The eleventh and twelfth data releases of the Sloan Digital Sky Survey: final data from SDSS-III. The Astrophysical Journal Supplement Series 219, 1(2015), 12.Google ScholarGoogle ScholarCross RefCross Ref
  5. Jonathan Alon, Stan Sclaroff, George Kollios, and Vladimir Pavlovic. 2003. Discovering clusters in motion time-series data. In CVPR. 375--381.Google ScholarGoogle Scholar
  6. Francisco Martinez Alvarez, Alicia Troncoso, Jose C Riquelme, and Jesus S Aguilar Ruiz. 2010. Energy time series forecasting based on pattern sequence similarity. IEEE Transactions on Knowledge and Data Engineering 23, 8 (2010), 1230--1243.Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Henrik André-Jönsson and Dushan Z Badal. 1997. Using signature files for querying time-series data. In European Symposium on Principles of Data Mining and Knowledge Discovery. Springer, 211--220.Google ScholarGoogle ScholarCross RefCross Ref
  8. Johannes Aßfalg, Hans-Peter Kriegel, Peer Kröger, Peter Kunath, Alexey Pryakhin, and Matthias Renz. 2006. Similarity search on time series based on threshold queries. In International Conference on Extending Database Technology. Springer, 276--294.Google ScholarGoogle Scholar
  9. Martin Bach-Andersen, Bo Rømer-Odgaard, and Ole Winther. 2017. Flexible non-linear predictive models for large-scale wind turbine diagnostics. Wind Energy 20, 5 (2017), 753--764.Google ScholarGoogle ScholarCross RefCross Ref
  10. Anthony Bagnall, Hoang Anh Dau, Jason Lines, Michael Flynn, James Large, Aaron Bostrom, Paul Southam, and Eamonn Keogh. 2018.The UEA multivariate time series classification archive, 2018. arXivpreprint arXiv:1811.00075(2018).Google ScholarGoogle Scholar
  11. Anthony Bagnall, Jason Lines, Aaron Bostrom, James Large, and Eamonn Keogh. 2017. The great time series classification bake off: are view and experimental evaluation of recent algorithmic advances. Data Mining and Knowledge Discovery 31, 3 (2017), 606--660.Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Anthony J Bagnall and Gareth J Janacek. 2004. Clustering time series from ARMA models with clipped data. In KDD. 49--58.Google ScholarGoogle Scholar
  13. Ziv Bar-Joseph. 2004. Analyzing time series gene expression data. Bioinformatics 20, 16 (2004), 2493--2503.Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Ziv Bar-Joseph, Georg K Gerber, David K Gifford, Tommi S Jaakkola, and Itamar Simon. 2003. Continuous representations of time-series gene expression data.Journal of Computational Biology 10, 3--4 (2003),341--356.Google ScholarGoogle Scholar
  15. Ziv Bar-Joseph, Anthony Gitter, and Itamar Simon. 2012. Studying and modelling dynamic biological processes using time-series gene expression data.Nature Reviews Genetics13, 8 (2012), 552.Google ScholarGoogle Scholar
  16. Gustavo EAPA Batista, Eamonn J Keogh, Oben Moses Tataw, and Vinicius MA De Souza. 2014. CID: an efficient complexity-invariant distance for time series.Data Mining and Knowledge Discovery 28, 3(2014), 634--669.Google ScholarGoogle Scholar
  17. Nurjahan Begum and Eamonn Keogh. 2014. Rare time series motif discovery from unbounded streams. Proceedings of the VLDB Endowment 8, 2 (2014), 149--160.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Donald J Berndt and James Clifford. 1994. Using Dynamic TimeWarping to Find Patterns in Time Series. In AAAI Workshop on KDD. 359--370.Google ScholarGoogle Scholar
  19. Bharat B Biswal, Maarten Mennes, Xi-Nian Zuo, Suril Gohel, ClareKelly, Steve M Smith, Christian F Beckmann, Jonathan S Adelstein, Randy L Buckner, Stan Colcombe, et al. 2010. Toward discovery science of human brain function. Proceedings of the National Academy of Sciences 107, 10 (2010), 4734--4739.Google ScholarGoogle ScholarCross RefCross Ref
  20. R Bracewell. 1965. Pentagram notation for cross correlation. The Fourier transform and its applications. New York: McGraw-Hill46(1965), 243.Google ScholarGoogle Scholar
  21. Markus M Breunig, Hans-Peter Kriegel, Raymond T Ng, and Jörg Sander. 2000. LOF: identifying density-based local outliers. In ACM sigmod record, Vol. 29. ACM, 93--104.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Peter J Brockwell and Richard A Davis. 2016.Introduction to timeseries and forecasting. springer.Google ScholarGoogle Scholar
  23. Lisa Gottesfeld Brown. 1992. A survey of image registration techniques. ACM computing surveys (CSUR)24, 4 (1992), 325--376.Google ScholarGoogle Scholar
  24. Yuhan Cai and Raymond Ng. 2004. Indexing spatio-temporal trajectories with Chebyshev polynomials. In SIGMOD. 599--610.Google ScholarGoogle Scholar
  25. Alessandro Camerra, Themis Palpanas, Jin Shieh, and Eamonn Keogh. 2010. iSAX 2.0: Indexing and mining one billion time series. In 2010 IEEE International Conference on Data Mining. IEEE, 58--67.Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Sung-Hyuk Cha. 2007. Comprehensive survey on distance/similarity measures between probability density functions. City1, 2 (2007), 1.Google ScholarGoogle Scholar
  27. Lei Chen and Raymond Ng. 2004. On the marriage of Lp-norms and edit distance. InVLDB. 792--803.Google ScholarGoogle Scholar
  28. Lei Chen, M Tamer Özsu, and Vincent Oria. 2005. Robust and fast similarity search for moving object trajectories. In SIGMOD. 491--502.Google ScholarGoogle Scholar
  29. Qiuxia Chen, Lei Chen, Xiang Lian, Yunhao Liu, and Jeffrey Xu Yu.2007. Indexable PLA for efficient similarity search. In VLDB. 435--446.Google ScholarGoogle Scholar
  30. Yueguo Chen, Mario A Nascimento, Beng Chin Ooi, and Anthony KHTung. 2007. Spade: On shape-based pattern detection in streaming time series. In ICDE. 786--795.Google ScholarGoogle Scholar
  31. Bill Chiu, Eamonn Keogh, and Stefano Lonardi. 2003. Probabilistic discovery of time series motifs. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 493--498.Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Kelvin Kam Wing Chu and Man Hon Wong. 1999. Fast time-series searching with scaling and shifting. In Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. Citeseer, 237--248.Google ScholarGoogle Scholar
  33. Richard Cole, Dennis Shasha, and Xiaojian Zhao. 2005. Fast window correlations over uncooperative time series. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. ACM, 743--749.Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. James W Cooley and John W Tukey. 1965. An algorithm for the machine calculation of complex Fourier series. Math. Comp.19, 90(1965), 297--301.Google ScholarGoogle ScholarCross RefCross Ref
  35. Corinna Cortes and Vladimir Vapnik. 1995. Support-vector networks. Machine learning 20, 3 (1995), 273--297.Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Madalena Costa, Ary L Goldberger, and C-K Peng. 2002. Multiscale entropy analysis of complex physiologic time series.Physical review letters 89, 6 (2002), 068102.Google ScholarGoogle Scholar
  37. Nello Cristianini and John Shawe-Taylor. 2000. An introduction to support vector machines and other kernel-based learning methods. Cambridge university press.Google ScholarGoogle ScholarCross RefCross Ref
  38. Marco Cuturi. 2011. Fast global alignment kernels. In Proceedings of the 28th international conference on machine learning (ICML-11). 929--936.Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Michele Dallachiesa, Besmira Nushi, Katsiaryna Mirylenka, and Themis Palpanas. 2012. Uncertain time-series similarity: Return to the basics. Proceedings of the VLDB Endowment 5, 11 (2012), 1662--1673.Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Michele Dallachiesa, Themis Palpanas, and Ihab F Ilyas. 2014. Top-k nearest neighbor search in uncertain data series.Proceedings of the VLDB Endowment 8, 1 (2014), 13--24.Google ScholarGoogle Scholar
  41. Hoang Anh Dau, Eamonn Keogh, Kaveh Kamgar, Chin-Chia MichaelYeh, Yan Zhu, Shaghayegh Gharghabi, Chotirat Ann Ratanamahatana,Yanping, Bing Hu, Nurjahan Begum, Anthony Bagnall, Abdullah Mueen, Gustavo Batista, and Hexagon-ML. 2018. The UCR Time Series Classification Archive. https://www.cs.ucr.edu/~eamonn/time_series_data_2018/.Google ScholarGoogle Scholar
  42. Janez Demar. 2006. Statistical comparisons of classifiers over multiple data sets. The Journal of Machine Learning Research 7 (2006),1--30.Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Michel-Marie Deza and Elena Deza. 2006.Dictionary of distances. Elsevier.Google ScholarGoogle Scholar
  44. Michel Marie Deza and Elena Deza. 2009. Encyclopedia of distances. In Encyclopedia of distances. Springer, 1--583.Google ScholarGoogle Scholar
  45. Hui Ding, Goce Trajcevski, Peter Scheuermann, Xiaoyue Wang, and Eamonn Keogh. 2008. Querying and mining of time series data: experimental comparison of representations and distance measures. Proceedings of the VLDB Endowment 1, 2 (2008), 1542--1552.Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Rui Ding, Qiang Wang, Yingnong Dang, Qiang Fu, Haidong Zhang,and Dongmei Zhang. 2015. Yading: Fast clustering of large-scale time series data. Proceedings of the VLDB Endowment 8, 5 (2015), 473--484.Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Alejandro Domínguez. 2015. A history of the convolution operation [Retrospectroscope]. IEEE pulse6, 1 (2015), 38--49.Google ScholarGoogle ScholarCross RefCross Ref
  48. Karima Echihabi, Kostas Zoumpatianos, Themis Palpanas, and Houda Benbrahim. 2018. The lernaean hydra of data series similarity search:An experimental evaluation of the state of the art. Proceedings of the VLDB Endowment 12, 2 (2018), 112--127.Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Jason Ernst and Ziv Bar-Joseph. 2006. STEM: a tool for the analysis of short time series gene expression data. BMC bioinformatics 7, 1(2006), 191.Google ScholarGoogle Scholar
  50. Philippe Esling and Carlos Agon. 2012. Time-series data mining. ACM Computing Surveys (CSUR)45, 1 (2012), 12.Google ScholarGoogle Scholar
  51. Christos Faloutsos, M. Ranganathan, and Yannis Manolopoulos. 1994. Fast Subsequence Matching in Time-series Databases. In SIGMOD. 419--429.Google ScholarGoogle Scholar
  52. Manuel Fernández-Delgado, Eva Cernadas, Senén Barro, and Dinani Amorim. 2014. Do we need hundreds of classifiers to solve real world classification problems?The journal of machine learning research15,1 (2014), 3133--3181.Google ScholarGoogle Scholar
  53. Elias Frentzos, Kostas Gratsias, and Yannis Theodoridis. 2007. Index-based most similar trajectory search. In ICDE. 816--825.Google ScholarGoogle Scholar
  54. Milton Friedman. 1937. The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Amer. Statist. Assoc. 32 (1937), 675--701.Google ScholarGoogle ScholarCross RefCross Ref
  55. Daniel G Gavin, W Wyatt Oswald, Eugene R Wahl, and John W Williams. 2003. A statistical approach to evaluating distance metrics and analog assignments for pollen records.Quaternary Research 60, 3 (2003), 356--367.Google ScholarGoogle Scholar
  56. Martin Gavrilov, Dragomir Anguelov, Piotr Indyk, and Rajeev Motwani. 2000. Mining the stock market: Which measure is best. In Proc. of the 6th ACM SIGKDD. 487--496.Google ScholarGoogle Scholar
  57. Rafael Giusti and Gustavo EAPA Batista. 2013. An Empirical Comparison of Dissimilarity Measures for Time Series Classification. In BRACIS. 82--88.Google ScholarGoogle Scholar
  58. Steve Goddard, Sherri K Harms, Stephen E Reichenbach, Tsegaye Tadesse, and William J Waltman. 2003. Geospatial decision support for drought risk management. Commun. ACM46, 1 (2003), 35--37.Google ScholarGoogle Scholar
  59. Dina Q Goldin and Paris C Kanellakis. 1995. On similarity queries for time-series data: constraint specification and implementation. In International Conference on Principles and Practice of Constraint Programming. Springer, 137--153.Google ScholarGoogle ScholarCross RefCross Ref
  60. Tomasz Górecki and Maciej Luczak. 2013. Using derivatives in time series classification.Data Mining and Knowledge Discovery 26, 2(2013), 310--331.Google ScholarGoogle Scholar
  61. Aditya Grover, Ashish Kapoor, and Eric Horvitz. 2015. A deep hybrid model for weather forecasting. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 379--386.Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. Joel Grus. 2019. Data science from scratch: first principles with python. O'Reilly Media.Google ScholarGoogle Scholar
  63. Jon Hills, Jason Lines, Edgaras Baranauskas, James Mapp, and Anthony Bagnall. 2014. Classification of time series by shapelet transformation. Data Mining and Knowledge Discovery 28, 4 (2014), 851--881.Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. Ove Hoegh-Guldberg, Peter J Mumby, Anthony J Hooten, Robert S Steneck, Paul Greenfield, Edgardo Gomez, C Drew Harvell, Peter FSale, Alasdair J Edwards, Ken Caldeira, et al. 2007. Coral reefs under rapid climate change and ocean acidification. Science 318, 5857 (2007), 1737--1742.Google ScholarGoogle Scholar
  65. Rie Honda, Shuai Wang, Tokio Kikuchi, and Osamu Konishi. 2002.Mining of moving objects from time-series images and its application to satellite weather imagery. Journal of Intelligent Information Systems 19, 1 (2002), 79--93.Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. Bing Hu, Yanping Chen, and Eamonn Keogh. 2013. Time Series Classification under More Realistic Assumptions. In SDM. 578--586.Google ScholarGoogle Scholar
  67. Pablo Huijse, Pablo A Estevez, Pavlos Protopapas, Jose C Principe, and Pablo Zegers. 2014. Computational intelligence challenges and applications on large-scale astronomical time series databases. IEEE Computational Intelligence Magazine 9, 3 (2014), 27--39.Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. Young-Seon Jeong, Myong K Jeong, and Olufemi A Omitaomu. 2011. Weighted dynamic time warping for time series classification. Pattern Recognition 44, 9 (2011), 2231--2240.Google ScholarGoogle ScholarDigital LibraryDigital Library
  69. Konstantinos Kalpakis, Dhiral Gada, and Vasundhara Puttagunta.2001. Distance measures for effective clustering of ARIMA time-series. In ICDM. 273--280.Google ScholarGoogle Scholar
  70. Kunio Kashino, Gavin Smith, and Hiroshi Murase. 1999. Time-series active search for quick retrieval of audio and video. In 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No. 99CH36258), Vol. 6. IEEE, 2993--2996.Google ScholarGoogle ScholarDigital LibraryDigital Library
  71. Shrikant Kashyap and Panagiotis Karras. 2011. Scalable knn search on vertically stored time series. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1334--1342.Google ScholarGoogle ScholarDigital LibraryDigital Library
  72. Eamonn Keogh. 2006. A decade of progress in indexing and mining large time series databases. In VLDB. 1268--1268.Google ScholarGoogle Scholar
  73. Eamonn Keogh, Kaushik Chakrabarti, Michael Pazzani, and Sharad Mehrotra. 2001. Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases. In SIGMOD. 151--162.Google ScholarGoogle Scholar
  74. Eamonn Keogh and Jessica Lin. 2005. Clustering of time-series subsequences is meaningless: Implications for previous and future research. Knowledge and Information Systems 8, 2 (2005), 154--177.Google ScholarGoogle ScholarDigital LibraryDigital Library
  75. Eamonn Keogh and Chotirat Ann Ratanamahatana. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7, 3 (2005), 358--386.Google ScholarGoogle ScholarDigital LibraryDigital Library
  76. Chan Kin-pong and Fu Ada. 1999. Efficient Time Series Matching by Wavelets. In ICDE. 126--133.Google ScholarGoogle Scholar
  77. S Knieling, J Niediek, E Kutter, J Bostroem, CE Elger, and F Mormann. 2017. An online adaptive screening procedure for selective neuronal responses. Journal of neuroscience methods291 (2017), 36--42.Google ScholarGoogle Scholar
  78. Flip Korn, H. V. Jagadish, and Christos Faloutsos. 1997. Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences. In SIGMOD. 289--300.Google ScholarGoogle Scholar
  79. Yann LeCun, Yoshua Bengio, et al.1995. Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks 3361, 10 (1995), 1995.Google ScholarGoogle Scholar
  80. Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. Nature 521, 7553 (2015), 436.Google ScholarGoogle Scholar
  81. Yann A LeCun, Léon Bottou, Genevieve B Orr, and Klaus-Robert Müller. 2012. Efficient backprop. InNeural networks: Tricks of the trade. Springer, 9--48.Google ScholarGoogle Scholar
  82. Qi Lei, Jinfeng Yi, Roman Vaculin, Lingfei Wu, and Inderjit S Dhillon.2017. Similarity preserving representation learning for time series analysis. arXiv preprint arXiv:1702.03584(2017).Google ScholarGoogle Scholar
  83. Chung-Sheng Li, Philip S. Yu, and Vittorio Castelli. 1996. Hierarchyscan: A hierarchical similarity search algorithm for databases of long sequences. In ICDE. IEEE, 546--553.Google ScholarGoogle Scholar
  84. Xiang Lian, Lei Chen, Jeffrey Xu Yu, Guoren Wang, and Ge Yu. 2007. Similarity match over high speed time-series streams. InICDE. 1086--1095.Google ScholarGoogle Scholar
  85. Jessica Lin, Michail Vlachos, Eamonn Keogh, and Dimitrios Gunopulos. 2004. Iterative incremental clustering of time series. In EDBT. 106--122.Google ScholarGoogle Scholar
  86. Michele Linardi and Themis Palpanas. 2018. Scalable, variable-length similarity search in data series: The ULISSE approach. Proceedings of the VLDB Endowment 11, 13 (2018), 2236--2248.Google ScholarGoogle ScholarDigital LibraryDigital Library
  87. Jason Lines and Anthony Bagnall. 2015. Time series classification with ensembles of elastic distance measures.Data Mining and Knowledge Discovery 29, 3 (2015), 565--592.Google ScholarGoogle Scholar
  88. Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou. 2008. Isolation forest. In2008 Eighth IEEE International Conference on Data Mining. IEEE, 413--422.Google ScholarGoogle ScholarDigital LibraryDigital Library
  89. Helmut Lütkepohl, Markus Krätzig, and Peter CB Phillips. 2004. Applied time series econometrics. Cambridge university press.Google ScholarGoogle Scholar
  90. Mohammad Saeid Mahdavinejad, Mohammadreza Rezvan, Moham-madamin Barekatain, Peyman Adibi, Payam Barnaghi, and Amit P Sheth. 2017. Machine learning for Internet of Things data analysis: Asurvey. Digital Communications and Networks(2017).Google ScholarGoogle Scholar
  91. Rosario N Mantegna. 1999. Hierarchical structure in financial markets.The European Physical Journal B-Condensed Matter and Complex Systems 11, 1 (1999), 193--197.Google ScholarGoogle Scholar
  92. Pierre-François Marteau. 2008. Time warp edit distance with stiffness adjustment for time series matching. IEEE Transactions on Pattern Analysis and Machine Intelligence 31, 2 (2008), 306--318.Google ScholarGoogle ScholarDigital LibraryDigital Library
  93. Pierre-François Marteau and Sylvie Gibet. 2014. On recursive edit distance kernels with application to time series classification. IEEE transactions on neural networks and learning systems 26, 6 (2014),1121--1133.Google ScholarGoogle Scholar
  94. Francisco Martínez-Álvarez, Alicia Troncoso, Gualberto Asencio-Cortés, and José Riquelme. 2015. A survey on data mining techniques applied to electricity-related time series forecasting. Energies 8, 11(2015), 13162--13193.Google ScholarGoogle ScholarCross RefCross Ref
  95. Richard McCleary, Richard A Hay, Erroll E Meidinger, and David McDowall. 1980.Applied time series analysis for the social sciences. Sage Publications Beverly Hills, CA.Google ScholarGoogle Scholar
  96. Vasileios Megalooikonomou, Qiang Wang, Guo Li, and Christos Faloutsos. 2005. A multiresolution symbolic representation of time series. In Data Engineering, 2005. ICDE 2005. Proceedings. 21st International Conference on. IEEE, 668--679.Google ScholarGoogle ScholarDigital LibraryDigital Library
  97. Katsiaryna Mirylenka, Vassilis Christophides, Themis Palpanas, Ioannis Pefkianakis, and Martin May. 2016. Characterizing home device usage from wireless traffic time series.Google ScholarGoogle Scholar
  98. Katsiaryna Mirylenka, Michele Dallachiesa, and Themis Palpanas. 2017. Data series similarity using correlation-aware measures. In Proceedings of the 29th International Conference on Scientific and Statistical Database Management. 1--12.Google ScholarGoogle ScholarDigital LibraryDigital Library
  99. A Morales-Esteban, Francisco Martínez-Álvarez, A Troncoso, JL Justo,and Cristina Rubio-Escudero. 2010. Pattern recognition to forecast seismic time series.Expert Systems with Applications 37, 12 (2010),8333--8342.Google ScholarGoogle Scholar
  100. Michael D Morse and Jignesh M Patel. 2007. An efficient and accurate method for evaluating time series similarity. In SIGMOD. 569--580.Google ScholarGoogle Scholar
  101. Abdullah Mueen, Eamonn Keogh, and Neal Young. 2011. Logical-shapelets: An expressive primitive for time series classification. In KDD. 1154--1162.Google ScholarGoogle ScholarDigital LibraryDigital Library
  102. Abdullah Mueen, Eamonn Keogh, Qiang Zhu, Sydney Cash, and Brandon Westover. 2009. Exact discovery of time series motifs. In Proceedings of the 2009 SIAM international conference on data mining. SIAM, 473--484.Google ScholarGoogle ScholarCross RefCross Ref
  103. Abdullah Mueen, Yan Zhu, Michael Yeh, Kaveh Kamgar, Krishnamurthy Viswanathan, Chetan Gupta, and Eamonn Keogh. 2017.The Fastest Similarity Search Algorithm for Time Series Subsequences under Euclidean Distance. http://www.cs.unm.edu/~mueen/FastestSimilaritySearch.html.Google ScholarGoogle Scholar
  104. Peter Nemenyi. 1963. Distribution-free Multiple Comparisons. Ph.D. Dissertation. Princeton University.Google ScholarGoogle Scholar
  105. Themis Palpanas. 2015. Data series management: the road to big sequence analytics. ACM SIGMOD Record 44, 2 (2015), 47--52.Google ScholarGoogle ScholarDigital LibraryDigital Library
  106. Themis Palpanas. 2016. Big sequence management: A glimpse of the past, the present, and the future. InInternational Conference onCurrent Trends in Theory and Practice of Informatics. Springer, 63--80.Google ScholarGoogle ScholarDigital LibraryDigital Library
  107. Panagiotis Papapetrou, Vassilis Athitsos, Michalis Potamias, GeorgeKollios, and Dimitrios Gunopulos. 2011. Embedding-based subsequence matching in time-series databases. TODS 36, 3 (2011), 17.Google ScholarGoogle ScholarDigital LibraryDigital Library
  108. John Paparrizos. 2019. 2018 UCR Time-Series Archive: Backward Compatibility, Missing Values, and Varying Lengths. https://github.com/johnpaparrizos/UCRArchiveFixes.Google ScholarGoogle Scholar
  109. John Paparrizos and Michael J Franklin. 2019. GRAIL: efficient time-series representation learning. Proceedings of the VLDB Endowment12, 11 (2019), 1762--1777.Google ScholarGoogle ScholarDigital LibraryDigital Library
  110. John Paparrizos and Luis Gravano. 2015. k-shape: Efficient and accurate clustering of time series. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. ACM, 1855--1870.Google ScholarGoogle ScholarDigital LibraryDigital Library
  111. John Paparrizos and Luis Gravano. 2017. Fast and Accurate Time-Series Clustering. ACM Transactions on Database Systems (TODS)42, 2 (2017), 8.Google ScholarGoogle Scholar
  112. Athanasios Papoulis. 1962. The Fourier integral and its applications. McGraw-Hill.Google ScholarGoogle Scholar
  113. C-K Peng, Shlomo Havlin, H Eugene Stanley, and Ary L Goldberger. 1995. Quantification of scaling exponents and crossover phenomenain nonstationary heartbeat time series. Chaos: An Interdisciplinary Journal of Nonlinear Science 5, 1 (1995), 82--87.Google ScholarGoogle ScholarCross RefCross Ref
  114. François Petitjean, Germain Forestier, Geoffrey I Webb, Ann E Nicholson, Yanping Chen, and Eamonn Keogh. 2014. Dynamic time warping averaging of time series allows faster and more accurate classification. In 2014 IEEE international conference on data mining. IEEE, 470--479.Google ScholarGoogle ScholarDigital LibraryDigital Library
  115. François Petitjean, Germain Forestier, Geoffrey I Webb, Ann E Nicholson, Yanping Chen, and Eamonn Keogh. 2016. Faster and more accurate classification of time series by exploiting a novel dynamic time warping averaging algorithm. Knowledge and Information Systems 47, 1 (2016), 1--26.Google ScholarGoogle ScholarDigital LibraryDigital Library
  116. François Petitjean, Alain Ketterlin, and Pierre Gançarski. 2011. A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition 44, 3 (2011), 678--693.Google ScholarGoogle ScholarDigital LibraryDigital Library
  117. Davood Rafiei and Alberto Mendelzon. 1997. Similarity-based queries for time series data. In ACM SIGMOD Record, Vol. 26. ACM, 13--25.Google ScholarGoogle ScholarDigital LibraryDigital Library
  118. Thanawin Rakthanmanon, Bilson Campana, Abdullah Mueen, Gustavo Batista, Brandon Westover, Qiang Zhu, Jesin Zakaria, and Eamonn Keogh. 2012. Searching and mining trillions of time series subsequences under dynamic time warping. InKDD. 262--270.Google ScholarGoogle Scholar
  119. Chotirat Ann Ralanamahatana, Jessica Lin, Dimitrios Gunopulos, Eamonn Keogh, Michail Vlachos, and Gautam Das. 2005. Mining time series data. InData mining and knowledge discovery handbook. Springer, 1069--1103.Google ScholarGoogle Scholar
  120. Chotirat Ann Ratanamahatana and Eamonn Keogh. 2004. Making time-series classification more accurate using learned constraints. In SDM. 11--22.Google ScholarGoogle Scholar
  121. Usman Raza, Alessandro Camerra, Amy L Murphy, Themis Palpanas, and Gian Pietro Picco. 2015. Practical data prediction for real-world wireless sensor networks.IEEE Transactions on Knowledge and DataEngineering 27, 8 (2015), 2231--2244.Google ScholarGoogle ScholarDigital LibraryDigital Library
  122. John Rice. 2006.Mathematical statistics and data analysis. Cengage Learning.Google ScholarGoogle Scholar
  123. Joshua S Richman and J Randall Moorman. 2000. Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology 278, 6(2000), H2039--H2049.Google ScholarGoogle ScholarCross RefCross Ref
  124. Kexin Rong, Clara E Yoon, Karianne J Bergen, Hashem Elezabi, Peter Bailis, Philip Levis, and Gregory C Beroza. 2018. Locality-sensitive hashing for earthquake detection: A case study of scaling data-driven science. Proceedings of the VLDB Endowment11, 11 (2018), 1674--1687.Google ScholarGoogle ScholarDigital LibraryDigital Library
  125. Eduardo J Ruiz, Vagelis Hristidis, Carlos Castillo, Aristides Gionis, and Alejandro Jaimes. 2012. Correlating financial time series with micro-blogging activity. InProceedings of the fifth ACM international conference on Web search and data mining. ACM, 513--522.Google ScholarGoogle Scholar
  126. Hiroaki Sakoe and Seibi Chiba. 1971. A dynamic programming approach to continuous speech recognition. In ICA. 65--69.Google ScholarGoogle Scholar
  127. Hiroaki Sakoe and Seibi Chiba. 1978. Dynamic programming algorithm optimization for spoken word recognition. IEEE transactions on acoustics, speech, and signal processing 26, 1 (1978), 43--49.Google ScholarGoogle ScholarCross RefCross Ref
  128. Yasushi Sakurai, Spiros Papadimitriou, and Christos Faloutsos. 2005.Braid: Stream mining through group lag correlations. In SIGMOD. ACM, 599--610.Google ScholarGoogle Scholar
  129. Patrick Schäfer and Mikael Högqvist. 2012. SFA: a symbolic fourier approximation and index for similarity search in high dimensional datasets. InProceedings of the 15th International Conference on Extend-ing Database Technology. ACM, 516--527.Google ScholarGoogle ScholarDigital LibraryDigital Library
  130. Bernhard Schölkopf, Alexander Smola, and Klaus-Robert Müller. 1997. Kernel principal component analysis. InInternational Conference on Artificial Neural Networks. Springer, 583--588.Google ScholarGoogle ScholarCross RefCross Ref
  131. Bernhard Schölkopf, Alexander Smola, and Klaus-Robert Müller. 1998. Nonlinear component analysis as a kernel eigenvalue problem. Neural computation 10, 5 (1998), 1299--1319.Google ScholarGoogle ScholarDigital LibraryDigital Library
  132. Bernhard Schölkopf and Alexander J Smola. 2002. Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT press.Google ScholarGoogle Scholar
  133. Pavel Senin, Jessica Lin, Xing Wang, Tim Oates, Sunil Gandhi,Arnold P Boedihardjo, Crystal Chen, and Susan Frankenstein. 2015. Time series anomaly discovery with grammar-based compression. In Edbt. 481--492.Google ScholarGoogle Scholar
  134. Dennis Shasha. 1999. Tuning time series queries in finance: Case studies and recommendations. IEEE Data Eng. Bull. 22, 2 (1999),40--46.Google ScholarGoogle Scholar
  135. Jin Shieh and Eamonn Keogh. 2008. iSAX: indexing and mining terabyte sized time series. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 623--631.
  136. Yutao Shou, Nikos Mamoulis, and David Cheung. 2005. Fast and exact warping of time series using adaptive segmental approximations. Machine Learning 58, 2--3 (2005), 231--267.
  137. Alexandra Stefan, Vassilis Athitsos, and Gautam Das. 2013. The move-split-merge metric for time series. TKDE 25, 6 (2013), 1425--1438.
  138. Ruey S Tsay. 2014. Financial Time Series. Wiley StatsRef: Statistics Reference Online (2014), 1--23.
  139. Kuniaki Uehara and Mitsuomi Shimada. 2002. Extraction of primitive motion and discovery of association rules from human motion data. In Progress in Discovery Science. Springer, 338--348.
  140. Michail Vlachos, Marios Hadjieleftheriou, Dimitrios Gunopulos, and Eamonn Keogh. 2006. Indexing multidimensional time-series. The VLDB Journal 15, 1 (2006), 1--20.
  141. Michail Vlachos, George Kollios, and Dimitrios Gunopulos. 2002. Discovering similar multidimensional trajectories. In Proceedings 18th international conference on data engineering. IEEE, 673--684.
  142. Gabriel Wachman, Roni Khardon, Pavlos Protopapas, and Charles R Alcock. 2009. Kernels for periodic time series arising in astronomy. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 489--505.
  143. Hao Wang, Yilun Cai, Yin Yang, Shiming Zhang, and Nikos Mamoulis. 2014. Durable Queries over Historical Time Series. TKDE 26, 3 (2014), 595--607.
  144. Xiaoyue Wang, Abdullah Mueen, Hui Ding, Goce Trajcevski, Peter Scheuermann, and Eamonn Keogh. 2013. Experimental comparison of representation methods and distance measures for time series data. Data Mining and Knowledge Discovery (2013), 1--35.
  145. Xiaozhe Wang, Kate Smith, and Rob Hyndman. 2006. Characteristic-based clustering for time series data. Data Mining and Knowledge Discovery 13, 3 (2006), 335--364.
  146. Yang Wang, Peng Wang, Jian Pei, Wei Wang, and Sheng Huang. 2013. A data-adaptive and dynamic segmentation index for whole matching on time series. Proceedings of the VLDB Endowment 6, 10 (2013), 793--804.
  147. T Warren Liao. 2005. Clustering of time series data - a survey. Pattern Recognition 38, 11 (2005), 1857--1874.
  148. Peter J Webster, Greg J Holland, Judith A Curry, and H-R Chang. 2005. Changes in tropical cyclone number, duration, and intensity in a warming environment. Science 309, 5742 (2005), 1844--1846.
  149. Frank Wilcoxon. 1945. Individual comparisons by ranking methods. Biometrics Bulletin (1945), 80--83.
  150. Billy M Williams and Lester A Hoel. 2003. Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: Theoretical basis and empirical results. Journal of transportation engineering 129, 6 (2003), 664--672.
  151. Lingfei Wu, Ian En-Hsu Yen, Jinfeng Yi, Fangli Xu, Qi Lei, and Michael Witbrock. 2018. Random Warping Series: A Random Features Method for Time-Series Embedding. In AISTATS. 793--802.
  152. Xiaopeng Xi, Eamonn Keogh, Christian Shelton, Li Wei, and Chotirat Ann Ratanamahatana. 2006. Fast time series classification using numerosity reduction. In Proceedings of the 23rd international conference on Machine learning. ACM, 1033--1040.
  153. Yimin Xiong and Dit-Yan Yeung. 2002. Mixtures of ARMA models for model-based time series clustering. In ICDM. 717--720.
  154. Jaewon Yang and Jure Leskovec. 2011. Patterns of temporal variation in online media. In WSDM. 177--186.
  155. Dragomir Yankov, Eamonn Keogh, and Umaa Rebbapragada. 2008. Disk aware discord discovery: Finding unusual time series in terabyte sized datasets. Knowledge and Information Systems 17, 2 (2008), 241--262.
  156. Lexiang Ye and Eamonn Keogh. 2009. Time series shapelets: a new primitive for data mining. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 947--956.
  157. Chin-Chia Michael Yeh, Yan Zhu, Liudmila Ulanova, Nurjahan Begum, Yifei Ding, Hoang Anh Dau, Diego Furtado Silva, Abdullah Mueen, and Eamonn Keogh. 2016. Matrix profile I: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets. In 2016 IEEE 16th international conference on data mining (ICDM). IEEE, 1317--1322.
  158. Chin-Chia Michael Yeh, Yan Zhu, Liudmila Ulanova, Nurjahan Begum, Yifei Ding, Hoang Anh Dau, Zachary Zimmerman, Diego Furtado Silva, Abdullah Mueen, and Eamonn Keogh. 2018. Time series joins, motifs, discords and shapelets: a unifying view that exploits the matrix profile. Data Mining and Knowledge Discovery 32, 1 (2018), 83--123.
  159. Mi-Yen Yeh, Kun-Lung Wu, Philip S Yu, and Ming-Syan Chen. 2009. PROUD: a probabilistic approach to processing similarity queries over uncertain data streams. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology. ACM, 684--695.
  160. Byoung-Kee Yi and Christos Faloutsos. 2000. Fast time sequence indexing for arbitrary Lp norms. VLDB.
  161. Jesin Zakaria, Abdullah Mueen, and Eamonn Keogh. 2012. Clustering Time Series Using Unsupervised-Shapelets. In ICDM. 785--794.
  162. Pavel Zezula, Giuseppe Amato, Vlastislav Dohnal, and Michal Batko. 2006. Similarity search: the metric space approach. Vol. 32. Springer Science & Business Media.
  163. Guoqing Zheng, Yiming Yang, and Jaime Carbonell. 2016. Efficient shift-invariant dictionary learning. In SIGKDD. ACM, 2095--2104.
  164. Kostas Zoumpatianos, Stratos Idreos, and Themis Palpanas. 2016. ADS: the adaptive data series index. The VLDB Journal-The International Journal on Very Large Data Bases 25, 6 (2016), 843--866.

                      • Published in

                        SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
                        June 2020
                        2925 pages
                        ISBN: 9781450367356
                        DOI: 10.1145/3318464

                        Copyright © 2020 ACM

                        Publisher

                        Association for Computing Machinery

                        New York, NY, United States

                        Publication History

                        • Published: 31 May 2020

                        Acceptance Rates

                        Overall Acceptance Rate: 785 of 4,003 submissions, 20%