ABSTRACT
Distance measures are core building blocks in time-series analysis and have been the subject of active research for decades. Unfortunately, the most detailed experimental study in this area is more than a decade old and, naturally, does not reflect recent progress. Importantly, that study (i) omitted multiple distance measures, including a classic measure in the time-series literature; (ii) considered only a single time-series normalization method; and (iii) reported only raw classification error rates without statistically validating the findings. These limitations created or fueled four misconceptions in the time-series literature. Motivated by these drawbacks and by our curiosity to shed light on the misconceptions, we comprehensively evaluate 71 time-series distance measures. Specifically, our study includes (i) 8 normalization methods; (ii) 52 lock-step measures; (iii) 4 sliding measures; (iv) 7 elastic measures; (v) 4 kernel functions; and (vi) 4 embedding measures. We extensively evaluate these measures across 128 time-series datasets using rigorous statistical analysis. Our findings debunk four long-standing misconceptions and significantly alter the landscape of what is known about existing distance measures. With these new foundations in place, we discuss open challenges and promising directions.
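To make the categories named in the abstract more concrete, the following minimal Python sketch contrasts one normalization step with one representative from each of three measure families. The specific choices are illustrative assumptions rather than details taken from the abstract: z-normalization as the normalization, Euclidean distance as a lock-step measure, a cross-correlation-based distance as a sliding measure, and dynamic time warping (DTW) as an elastic measure. The function names are hypothetical, and the code is not the implementation evaluated in the study.

```python
import math


def z_normalize(x):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    if std == 0:
        return [0.0] * len(x)  # constant series: avoid division by zero
    return [(v - mean) / std for v in x]


def euclidean(x, y):
    """Lock-step: the i-th point of x is compared only with the i-th point of y."""
    if len(x) != len(y):
        raise ValueError("lock-step comparison needs equal-length series")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def sliding_distance(x, y):
    """Sliding: 1 minus the best normalized cross-correlation over all shifts."""
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if norm == 0:
        return 1.0  # an all-zero series correlates with nothing
    best = -float("inf")
    for shift in range(-(len(y) - 1), len(x)):
        # Inner product of x with y shifted by `shift`, over the overlap only.
        cc = sum(
            x[i] * y[i - shift]
            for i in range(len(x))
            if 0 <= i - shift < len(y)
        )
        best = max(best, cc)
    return 1.0 - best / norm


def dtw(x, y):
    """Elastic: dynamic time warping without a window constraint."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


if __name__ == "__main__":
    # Two toy series containing the same bump, two time steps apart.
    a = [0, 0, 1, 2, 1, 0, 0, 0]
    b = [0, 0, 0, 0, 1, 2, 1, 0]
    za, zb = z_normalize(a), z_normalize(b)
    print("lock-step (Euclidean):", euclidean(za, zb))
    print("sliding (cross-correlation):", sliding_distance(za, zb))
    print("elastic (DTW):", dtw(za, zb))
```

On this toy pair the lock-step measure penalizes the temporal offset, while the sliding and elastic measures can compensate for it by shifting or warping the time axis. Which family performs better on real data is an empirical question of the kind the study addresses, and this sketch makes no claim about it.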
Index Terms
- Debunking Four Long-Standing Misconceptions of Time-Series Distance Measures