A review on distance based time series classification

Data Mining and Knowledge Discovery

Abstract

Time series classification is a growing research topic due to the vast amount of time series data being generated across a wide variety of fields. The particular nature of the data makes classification a challenging task, and different approaches have been taken, including the distance based approach. 1-NN has been widely used within distance based time series classification because it is simple yet performs well. However, its supremacy may be attributable to its ability to use time series specific distances within the classification process rather than to the classifier itself. With the aim of exploiting these distances within more complex classifiers, new approaches have arisen in the past few years that are competitive with, or outperform, the 1-NN based approaches. In some cases, these new methods use the distance measure to transform the series into feature vectors, bridging the gap between time series and traditional classifiers. In other cases, the distances are employed to obtain a time series kernel, enabling the use of kernel methods for time series classification. One of the main challenges is that a kernel function must be positive semi-definite, a matter that is also addressed in this review. The review includes a taxonomy of the methods that classify time series using a distance based approach, as well as a discussion of the strengths and weaknesses of each method.
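As a rough illustration of the three ways of using a time series distance that the abstract mentions, the following minimal Python sketch shows 1-NN classification, a distance based feature vector, and a distance based kernel. It assumes dynamic time warping (DTW) with an absolute-difference local cost and no warping window as the distance; the function names, the toy series and the `sigma` parameter are hypothetical choices made for this example and are not taken from the article.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two numeric sequences (absolute-difference
    local cost, no warping window), computed row by row."""
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible predecessor cells
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def one_nn_predict(train_series, train_labels, query, dist=dtw_distance):
    """1-NN: return the label of the training series closest to the query."""
    distances = [dist(query, s) for s in train_series]
    return train_labels[distances.index(min(distances))]


def distance_features(series, prototypes, dist=dtw_distance):
    """Represent a series by its distances to a set of prototype series,
    so that any conventional vector based classifier can be applied."""
    return [dist(series, p) for p in prototypes]


def gaussian_dtw_kernel(a, b, sigma=1.0):
    """Gaussian kernel with DTW substituted for the Euclidean distance.
    Assumption for illustration only: a kernel built this way is NOT
    guaranteed to be positive semi-definite."""
    d = dtw_distance(a, b)
    return math.exp(-(d ** 2) / (2.0 * sigma ** 2))


# Hypothetical toy data: two "peak" series and two "valley" series.
train_series = [
    [0, 0, 1, 2, 1, 0],
    [0, 1, 2, 1, 0, 0],
    [2, 1, 0, 0, 1, 2],
    [2, 2, 1, 0, 1, 2],
]
train_labels = ["peak", "peak", "valley", "valley"]
query = [0, 1, 1, 2, 1, 0]

predicted = one_nn_predict(train_series, train_labels, query)
features = distance_features(query, train_series)
self_similarity = gaussian_dtw_kernel(query, query)
```

Here the whole training set is used as the prototype set for the feature vector, which is only one possible choice. Because the kernel above may be indefinite, a practical kernel method would have to handle that case in some way, which is the positive semi-definiteness issue the abstract refers to.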

Notes

  1. UCR is a repository of time series datasets (Chen et al. 2015a) which is often used as a benchmark for evaluating time series classification methods. These datasets are greatly varied with respect to their application domains, time series lengths, number of classes, and sizes of the training and testing sets.

  2. On-line handwritten digit data set (Guyon et al. 1994).

References

  • Adams CC (2004) The knot book: an elementary introduction to the mathematical theory of knots. American Mathematical Society, Providence

  • Bagnall A, Janacek G (2014) A run length transformation for discriminating between autoregressive time series. J Classif 31(2):274–295

  • Bagnall A, Lines J, Bostrom A, Large J, Keogh E (2017) The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Min Knowl Discov 31(3):606–660

  • Bahlmann C, Haasdonk B, Burkhardt H (2002) Online handwriting recognition with support vector machines: a kernel approach. In: Proceedings of international workshop on frontiers in handwriting recognition, IWFHR, pp 49–54

  • Belkin M, Niyogi P (2002) Laplacian Eigenmaps and spectral techniques for embedding and clustering. Adv Neural Inf Process Syst 14:585–591

  • Berndt D, Clifford J (1994) Using dynamic time warping to find patterns in time series. Workshop Knowl Discovery Databases 398:359–370

  • Borg I, Groenen P (1997) Modern multidimensional scaling: theory and applications. Springer, Berlin

  • Bostrom A, Bagnall A (2014) Binary shapelet transform for multiclass time series classification. Trans Large Scale Data Knowl Centered Syst 8800:24–46

  • Bostrom A, Bagnall A, Lines J (2016) Evaluating improvements to the shapelet transform. www-bcf.usc.edu. Accessed 21 Nov 2017

  • Casacuberta F, Vidal E, Rulot H (1987) On the metric properties of dynamic time warping. IEEE Trans Acoustics Speech Signal Process 35(11):1631–1633

  • Chen L, Ng R (2004) On the marriage of Lp-norms and edit distance. In: International conference on very large data bases, pp 792–803

  • Chen P, Fan R, Lin C (2006) A study on SMO-type decomposition methods for support vector machines. IEEE Trans Neural Netw Learn Syst 17(4):893–908

  • Chen Y, Hu B, Keogh E, Batista GEAPA (2013) DTW-D: time series semi-supervised learning from a single example. In: Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining, p 383

  • Chen Y, Garcia E, Gupta M (2009) Similarity-based classification: concepts and algorithms. J Mach Learn Res 10(206):747–776

  • Chen Y, Keogh E, Hu B, Begum N, Bagnall A, Mueen A, Batista GEAPA (2015a) The UCR time series classification archive

  • Chen Z, Zuo W, Hu Q, Lin L (2015b) Kernel sparse representation for time series classification. Inf Sci 292:15–26

  • Corduas M, Piccolo D (2008) Time series clustering and classification by the autoregressive metric. Comput Stat Data Anal 52(4):1860–1872

  • Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297

  • Cortes C, Haffner P, Mohri M (2004) Rational kernels: theory and algorithms. J Mach Learn Res 5:1035–1062

  • Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27

  • Cuturi M (2011) Fast global alignment kernels. In: Proceedings of the 28th ICML international conference on machine learning, pp 929–936

  • Cuturi M, Vert J (2007) A kernel for time series based on global alignments. IEEE Trans Acoustics Speech Signal Process 1:413–416

  • Decoste D, Schölkopf B (2002) Training invariant support vector machines using selective sampling. Mach Learn 46:161–190

  • Ding H, Trajcevski G, Scheuermann P, Wang X, Keogh E (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. Proc VLDB Very Large Database Endow 1(2):1542–1552

  • Esling P, Agon C (2012) Time-series data mining. ACM Comput Surv 45(1):1–34

  • Faloutsos C, Ranganathan M, Manolopoulos Y (1994) Fast subsequence matching in time-series databases. In: ACM SIGMOD international conference on management of data, pp 419–429

  • Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1):119–139

  • Fu TC (2011) A review on time series data mining. Eng Appl Artif Intell 24(1):164–181

  • Gaidon A, Harchaoui Z, Schmid C (2011) A time series kernel for action recognition. In: Proceedings of the British machine vision conference, pp 63.1–63.11

  • Giusti R, Silva DF, Batista GEAPA (2016) Improved time series classification with representation diversity and SVM. In: International conference on machine learning and applications, pp 1–6

  • Grabocka J, Schilling N, Wistuba M, Schmidt-Thieme L (2014) Learning time-series shapelets. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 392–401

  • Graepel T, Herbrich R, Bollmann-Sdorra P, Obermayer K (1999) Classification on pairwise proximity data. Adv Neural Inf Process Syst 11:438–444

  • Greub WH (1975) Linear algebra. Springer, Berlin

  • Gudmundsson S, Runarsson TP, Sigurdsson S (2008) Support vector machines and dynamic time warping for time series. In: Joint conference on neural networks (IEEE world congress on computational intelligence), pp 2772–2776

  • Guyon I, Schomaker L, Plamondon R, Liberman M, Janet S (1994) UNIPEN project of on-line data exchange, pp 29–33

  • Haasdonk B (2005) Feature space interpretation of SVMs with indefinite kernels. IEEE Trans Pattern Anal Mach Intell 27(4):482–492

  • Haasdonk B, Bahlmann C (2004) Learning with distance substitution kernels. In: Joint pattern recognition symposium, pp 220–227

  • Hayashi A, Mizuhara Y, Suematsu N (2005) Embedding time series data for classification. In: International workshop on machine learning and data mining in pattern recognition, pp 356–365

  • He Q, Zhi D, Zhuang F, Shang T, Shi Z (2012) Fast time series classification based on infrequent shapelets. In: Proceedings of the 11th ICMLA international conference on machine learning and applications vol 1, pp 215–219

  • Hills J, Lines J, Baranauskas E, Mapp J, Bagnall A (2014) Classification of time series by shapelet transformation. Data Min Knowl Discovery 28(4):851–881

  • Hochreiter S, Obermayer K (2006) Support vector machines for dyadic data. Neural Comput 18(6):1472–1510

  • Iwana BK, Frinken V, Riesen K, Uchida S (2017) Efficient temporal pattern recognition by means of dissimilarity space embedding with discriminative prototypes. Pattern Recognit 64:268–276

  • Jacobs DW, Weinshall D, Gdalyahu Y (2000) Classification with nonmetric distances: image retrieval and class representation. IEEE Trans Pattern Anal Mach Intell 22(6):583–600

  • Jain B, Spiegel S (2015) Dimension reduction in dissimilarity spaces for time series classification. In: International workshop on advanced analytics and learning on temporal data, pp 31–46

  • Jalalian A, Chalup SK (2013) GDTW-P-SVMs: variable-length time series analysis using support vector machines. Neurocomputing 99:270–282

  • Janyalikit T, Sathianwiriyakhun P, Sivaraks H, Ratanamahatana CA (2016) An enhanced support vector machine for faster time series classification. In: Asian conference on intelligent information and database systems, pp 616–625

  • Jeong YS, Jayaraman R (2015) Support vector-based algorithms with weighted dynamic time warping kernel function for time series classification. Knowl Based Syst 75(June):184–191

  • Jeong Y, Jeong MK, Omitaomu OA (2011) Weighted dynamic time warping for time series classification. Pattern Recognit 44(9):2231–2240

  • Kate RJ (2015) Using dynamic time warping distances as features for improved time series classification. Data Min Knowl Discovery 30(2):283–312

  • Kaya H, Gündüz-Öğüdücü S (2013) SAGA: a novel signal alignment method based on genetic algorithm. Inf Sci 228:113–130

  • Kaya H, Gündüz-Öğüdücü S (2015) A distance based time series classification framework. Inf Syst 51:27–42

  • Keogh E, Kasetty S (2002) On the need for time series data mining benchmarks. In: Proceedings of the 8th ACM SIGKDD international conference on knowledge discovery and data mining, p 102

  • Keogh E, Ratanamahatana CA (2005) Exact indexing of dynamic time warping. Knowl Inf Syst 7:358–386

  • Korn F, Jagadish HV, Faloutsos C (1997) Efficiently supporting ad hoc queries in large datasets of time sequences. In: Proceedings of the 1997 ACM SIGMOD international conference on management of data, pp 289–300

  • Kumara K, Agrawal R, Bhattacharyya C (2008) A large margin approach for writer independent online handwriting classification. Pattern Recognit Lett 29(7):933–937

  • Lei H, Sun B (2007) A study on the dynamic time warping in kernel machines. In: Proceedings of the 3rd SITIS international IEEE conference on signal-image technologies and internet-based system, pp 839–845

  • Lei Q, Yi J, Vaculin R, Wu L, Dhillon IS (2017) Similarity preserving representation learning for time series analysis. arXiv: 1702.03584 [cs]

  • Leslie C, Eskin E, Noble WS (2002) The spectrum kernel: a string kernel for SVM protein classification. In: Proceedings of the pacific symposium on biocomputing, pp 564–575

  • Li M, Chen X, Li X, Ma B, Vitányi PMB (2004) The similarity metric. IEEE Trans Inf Theory 50(12):3250–3264

  • Li X, Lin J (2018) Evolving separating references for time series classification. In: Proceedings of the 2018 SIAM international conference on data mining, pp 243–251

  • Liberman M (1993) TI46 speech corpus. In: Linguistic data consortium

  • Lichman M (2013) UCI machine learning repository

  • Lin J, Keogh E, Wei L, Lonardi S (2007) Experiencing SAX: a novel symbolic representation of time series. Data Min Knowl Discovery 15(2):107–144

  • Lines J, Bagnall A (2015) Time series classification with ensembles of elastic distance measures. Data Min Knowl Discovery 29(3):565–592

  • Lines J, Davis LM, Hills J, Bagnall A (2012) A shapelet transform for time series classification. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, p 289

  • Lods A, Malinowski S, Tavenard R, Amsaleg L (2017) Learning DTW-preserving shapelets. In: International symposium on intelligent data analysis. Springer, Cham, pp 198–209

  • Lu Z, Leen KT, Huang Y, Erdogmus D (2008) A reproducing kernel Hilbert space framework for pairwise time series distances. In: Proceedings of the 25th ICML international conference on machine learning, vol 56, pp 624–631

  • Marteau PF (2009) Time warp edit distance with stiffness adjustment for time series matching. IEEE Trans Pattern Anal Mach Intell 31(2):306–318

  • Marteau PF, Gibet S (2010) Constructing positive definite elastic kernels with application to time series classification. In: CoRR, pp 1–18

  • Marteau PF, Gibet S (2014) On recursive edit distance kernels with application to time series classification. IEEE Trans Neural Netw Learn Syst 26(6):1–15

  • Marteau PF, Bonnel N, Ménier G (2012) Discrete elastic inner vector spaces with application in time series and sequence mining. IEEE Trans Knowl Data Eng 25(9):2024–2035

  • Mizuhara Y, Hayashi A, Suematsu N (2006) Embedding of time series data by using dynamic time warping distances. Syst Comput Jpn 37(3):1–9

  • Mori U, Mendiburu A, Keogh E, Lozano JA (2017) Reliable early classification of time series based on discriminating the classes over time. Data Min Knowl Discovery 31(1):233–263

  • Mueen A, Keogh E, Young N (2011) Logical-shapelets: an expressive primitive for time series classification. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1154–1162

  • Ong CS, Mary X, Canu S, Smola AJ (2004) Learning with non-positive kernels. In: Proceedings of the 21st ICML international conference on machine learning, p 81

  • Pȩkalska E, Duin RPW (2005) The dissimilarity representation for pattern recognition: foundations and applications

  • Pȩkalska E, Paclík P, Duin RPW (2001) A generalized kernel approach to dissimilarity-based classification. J Mach Learn Res 2:175–211

  • Pȩkalska E, Duin RPW, Paclík P (2006) Prototype selection for dissimilarity-based classifiers. Pattern Recognit 39(2):189–208

  • Popivanov I, Miller RJ (2002) Similarity search over time-series data using wavelets. In: Proceedings 18th international conference on data engineering (ICDE), pp 212–221

  • Pree H, Herwig B, Gruber T, Sick B, David K, Lukowicz P (2014) On general purpose time series similarity measures and their use as kernel functions in support vector machines. Inf Sci 281:478–495

  • Rahimi A, Recht B (2008) Random features for large-scale kernel machines. In: Advances in neural information processing systems

  • Rakthanmanon T, Keogh E (2013) Fast shapelets: a scalable algorithm for discovering time series shapelets. In: Proceedings of the 13th ICDM international conference on data mining, pp 668–676

  • Rasmussen C, Williams C (2006) Gaussian processes for machine learning. Springer, Berlin

  • Rüping S (2001) SVM kernels for time series analysis. Technical report

  • Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans Acoustics Speech Signal Process 26(1):43–49

  • Schölkopf B (2001) Learning with kernels: support vector machines, regularization, optimization, and beyond

  • Senin P, Lin J, Wang X, Oates T, Gandhi S, Boedihardjo AP, Chen C, Frankenstein S, Lerner M (2014) GrammarViz 2.0: a tool for grammar-based pattern discovery in time series. In: Joint European conference on machine learning and knowledge discovery in databases, pp 468–472

  • Serrà J, Arcos JL (2014) An empirical evaluation of similarity measures for time series classification. Knowl Based Syst 67:305–314

  • Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, Cambridge

  • Shimodaira H, Noma KI, Nakai M, Sagayama S (2002) Dynamic time-alignment kernel in support vector machine. Adv Neural Inf Process Syst 2(1):921–928

  • Sivaramakrishnan KR, Bhattacharyya C (2004) Time series classification for online Tamil handwritten character recognition: a kernel based approach. In: International conference on neural information processing, pp 800–805

  • Smyth P (1997) Clustering sequences with hidden Markov models. Adv Neural Inf Process Syst 9:648–654

  • Sun R, Luo ZQ (2016) Guaranteed matrix completion via non-convex factorization. IEEE Trans Inf Theory 62(11):6535–6579

  • Tan PN, Steinbach M, Kumar V (2005) Introduction to data mining. Addison Wesley, Boston

  • Tax DMJ, Duin RPW (2004) Support vector data description. Mach Learn 54:45–66

  • Troncoso A, Arias M, Riquelme JC (2015) A multi-scale smoothing kernel for measuring time-series similarity. Neurocomputing 167:8–17

  • Vapnik V (1998) Statistical learning theory, vol 2. Wiley, New York

  • Wachman G, Khardon R, Protopapas P, Charles RA (2009) Kernels for periodic time series arising in astronomy. In: European conference on machine learning and knowledge discovery in databases

  • Wang X, Mueen A, Ding H, Trajcevski G, Scheuermann P, Keogh E (2013) Experimental comparison of representation methods and distance measures for time series data. Data Min Knowl Discovery 26(2):275–309

  • Wang X, Lin J, Senin P, Alamos L, Oates T, Gandhi S, Boedihardjo AP, Chen C, Frankenstein S (2016) RPM: representative pattern mining for efficient time series classification. In: Proceedings of the 19th international conference on extending database technology, pp 185–196

  • Weston J, Schölkopf B, Eskin E, Leslie C, Noble WS (2003) Dealing with large diagonals in kernel matrices. In: Annals of the institute of statistical mathematics, vol 55, pp 391–408

  • Wilson RC, Hancock ER, Pȩkalska E, Duin RPW (2014) Spherical and hyperbolic embeddings of data. IEEE Trans Pattern Anal Mach Intell 36(11):2255–2269

  • Wu G, Chang EY, Zhang Z (2005a) An analysis of transformation on non-positive semidefinite similarity matrix for kernel machines. In: Proceedings of the 22nd ICML international conference on machine learning, p 8

  • Wu G, Chang EY, Zhang Z (2005b) Learning with non-metric proximity matrices. In: Proceedings of the 13th ACM international conference on multimedia, p 411

  • Wu L, Yen IEH, Xu F, Ravikumar P, Witbrock M (2018a) D2KE: from distance to kernel and embedding, pp 1–18. arXiv:1802.04956

  • Wu L, Yen IE-H, Yi J, Xu F, Lei Q, Witbrock M (2018b) Random warping series: a random features method for time-series embedding. Proc Twenty-First Int Conf Artif Intell Stat 84:793–802

  • Xi X, Keogh E, Shelton C, Wei L, Ratanamahatana CA (2006) Fast time series classification using numerosity reduction. In: Proceedings of the 23rd ICML international conference on machine learning, pp 1033–1040

  • Xing Z, Pei J, Keogh E (2010) A brief survey on sequence classification. ACM SIGKDD Explor Newsl 12(1):40

  • Xue Y, Zhang L, Tao Z, Wang B, Li F (2017) An altered kernel transformation for time series classification. In: International conference on neural information processing, pp 455–465

  • Ye L, Keogh E (2009) Time series shapelets: a new primitive for data mining. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, p 947

  • Ye L, Keogh E (2011) Time series shapelets: a novel technique that allows accurate, interpretable and fast classification. Data Min Knowl Discovery 22(1–2):149–182

  • Zhang D, Zuo W, Zhang D, Zhang H (2010) Time series classification using support vector machine with Gaussian elastic metric kernel. In: Proceedings of international conference on pattern recognition, pp 29–32

  • Zhang L, Chang P, Liu J, Yan Z, Wang T, Li F (2012) Kernel sparse representation-based classifier. IEEE Trans Signal Process 60(4):1684–1695

Acknowledgements

The authors would like to thank the people who contributed to the UCR time series repository, and to express their sincere appreciation for the comments and advice provided by Eamonn Keogh and Lingfei Wu to improve this paper. This research is supported by the Basque Government through the BERC 2018-2021 program and by the Spanish Ministry of Economy and Competitiveness MINECO through BCAM Severo Ochoa excellence accreditation SEV-2013-0323 and through project TIN2017-82626-R funded by (AEI/FEDER, UE) with acronym GECECPAST. It is also supported by the Research Groups 2013-2018 (IT-609-13) programs (Basque Government) and by TIN2016-78365-R (Spanish Ministry of Economy, Industry and Competitiveness). A. Abanda is also supported by the Grant BES-2016-076890.

Author information

Corresponding author

Correspondence to Amaia Abanda.

Additional information

Responsible editor: Eamonn Keogh.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Abanda, A., Mori, U. & Lozano, J.A. A review on distance based time series classification. Data Min Knowl Disc 33, 378–412 (2019). https://doi.org/10.1007/s10618-018-0596-4

  • DOI: https://doi.org/10.1007/s10618-018-0596-4
