Time-series data mining

Authors:
Philippe Esling

Institut de Recherche et Coordination Acoustique/Musique (IRCAM), Paris, France

Carlos Agon

Institut de Recherche et Coordination Acoustique/Musique (IRCAM), Paris, France

ACM Computing Surveys, Volume 45, Issue 1, Article No. 12, pp. 1–34. https://doi.org/10.1145/2379776.2379788

Published: 07 December 2012

Abstract

In almost every scientific field, measurements are performed over time. These observations lead to a collection of organized data called time series. The purpose of time-series data mining is to extract all meaningful knowledge from the shape of the data. Even if humans have a natural capacity to perform these tasks, it remains a complex problem for computers. In this article we provide a survey of the techniques applied in time-series data mining. The first part is devoted to an overview of the tasks that have captured most of the interest of researchers. Considering that, in most cases, time-series tasks rely on the same components for implementation, we divide the literature according to these common aspects, namely representation techniques, distance measures, and indexing methods. The relevant literature has been categorized for each individual aspect. Four types of robustness could then be formalized, and any kind of distance could then be classified. Finally, the study presents various research trends and avenues that can be explored in the near future. We hope that this article can provide a broad and deep understanding of the time-series data mining research field.

References

Abonyi, J., Fell, B., Nemeth, S., and Arva, P. 2003. Fuzzy clustering based segmentation of time-series. In Proceedings of the 5th International Symposium on Intelligent Data Analysis (IDA 03). Springer, 275--285.Google Scholar
Agrawal, R., Faloutsos, C., and Swami, A. 1993. Efficient similarity search in sequence databases. In Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms. Springer, 69--84. Google ScholarDigital Library
Agrawal, R., Lin, K.-I., Sawhney, H. S., and Shim, K. 1995. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings of the 21st International Conference on Very Large Data Bases. Morgan Kaufmann, 490--501. Google ScholarDigital Library
Ahmed, N., Atiya, A., El Gayar, N., El-Shishiny, H., and Giza, E. 2009. An empirical comparison of machine learning models for time series forecasting. Econometr. Rev. 29, 5, 594--621.Google ScholarCross Ref
Ahmed, T., Oreshkin, B., and Coates, M. 2007. Machine learning approaches to network anomaly detection. In Proceedings of the 2nd USENIX Workshop on Tackling Computer Systems Problems with Machine Learning Techniques. USENIX Association, 1--6. Google ScholarDigital Library
An, J., Chen, H., Furuse, K., Ohbo, N., and Keogh, E. 2003. Grid-Based indexing for large time series databases. In Intelligent Data Engineering and Automated Learning. Lecture Notes in Computer Science, vol. 1983. Springer 614--621.Google Scholar
Antunes, C. and Oliveira, A. 2001. Temporal data mining: An overview. In Proceedings of the KDD Workshop on Temporal Data Mining. 1--13.Google Scholar
Argyros, T. and Ermopoulos, C. 2003. Efficient subsequence matching in time series databases under time and amplitude transformations. In Proceedings of the 3rd IEEE International Conference on Data Mining. 481--484. Google ScholarDigital Library
Assent, I., Krieger, R., Afschari, F., and Seidl, T. 2008. The TS-tree: Efficient time series search and retrieval. In Proceedings of the 11th International Conference on Extending Database Technology. 25--29. Google ScholarDigital Library
Assent, I., Wichterich, M., Krieger, R., Kremer, H., and Seidl, T. 2009. Anticipatory DTW for efficient similarity search in time series databases. Proc. VLDB Endowm. 2, 1, 826--837. Google ScholarDigital Library
Aßfalg, J., Kriegel, H., Kroger, P., Kunath, P., Pryakhin, A., and Renz, M. 2006. Similarity search on time series based on threshold queries. In Proceedings of the 10th International Conference on Extending Database Technology. 276. Google ScholarDigital Library
Aßfalg, J., Kriegel, H., Kröger, P., Kunath, P., Pryakhin, A., and Renz, M. 2008. Similarity search in multimedia time series data using amplitude-level features. In Proceedings of the 14th International Conference on Advances in Multimedia Modeling. Springer, 123--133. Google ScholarDigital Library
Bagnall, A. and Janacek, G. 2005. Clustering time series with clipped data. Mach. Learn. 58, 2, 151--178. Google ScholarDigital Library
Bagnall, A., Janacek, G., De la Iglesia, B., and Zhang, M. 2003. Clustering time series from mixture polynomial models with discretised data. In Proceedings of the 2nd Australasian Data Mining Workshop. 105--120.Google Scholar
Bagnall, A., Ratanamahatana, C., Keogh, E., Lonardi, S., and Janacek, G. 2006. A bit level representation for time series data mining with shape based similarity. Data Min. Knowl. Discov. 13, 1, 11--40. Google ScholarDigital Library
Bai, J. and Ng, S. 2008. Forecasting economic time series using targeted predictors. J. Econometr. 146, 2, 304--317.Google ScholarCross Ref
Bakshi, B. and Stephanopoulos, G. 1994. Representation of process trends--IV. Induction of real-time patterns from operating data for diagnosis and supervisory control. Comput. Chemi. Engin. 18, 4, 303--332.Google ScholarCross Ref
Bakshi, B. and Stephanopoulos, G. 1995. Reasoning in time: Modeling, analysis, and pattern recognition of temporal process trends. Adv. Chem. Engin. 22, 485--548.Google ScholarCross Ref
Bandera, J., Marfil, R., Bandera, A., Rodríguez, J., Molina-Tanco, L., and Sandoval, F. 2009. Fast gesture recognition based on a two-level representation. Pattern Recogn. Lett. 30, 13, 1181--1189. Google ScholarDigital Library
Barone, P., Carfora, M., and March, R. 2009. Segmentation, classification and denoising of a time series field by a variational method. J. Math. Imag. Vis. 34, 2, 152--164. Google ScholarDigital Library
Barreto, G. 2007. Time series prediction with the self-organizing map: A review. Perspect. Neural-Symbo. Integr. 77, 1, 135--158.Google ScholarCross Ref
Bartolini, I., Ciaccia, P., and Patella, M. 2005. Warp: Accurate retrieval of shapes using phase of fourier descriptors and time warping distance. IEEE Trans. Pattern Anal. Mach. Intell. 27, 1, 142--147. Google ScholarDigital Library
Bayer, R. and McCreight, E. 1972. Organization and maintenance of large ordered indexes. Acta Info. 1, 3, 173--189.Google ScholarDigital Library
Beckmann, N., Kriegel, H., Schneider, R., and Seeger, B. 1990. The R*-tree: An efficient and robust access method for points and rectangles. ACM SIGMOD Rec. 19, 2, 322--331. Google ScholarDigital Library
Berchtold, S., Keim, D., and Kriegel, H. 2002. The X-tree: An index structure for high-dimensional data. Read. Multimedia Comput. Netw. 4, 1, 451--463. Google ScholarDigital Library
Berkhin, P. 2006. A survey of clustering data mining techniques. Group. Multidimen. Data, 25--71.Google Scholar
Berndt, D. and Clifford, J. 1994. Using dynamic time warping to find patterns in time series. In Proceedings of the AAAI-94 Workshop on Knowledge Discovery in Databases. 229--248.Google Scholar
Berretti, S., Del Bimbo, A., and Pala, P. 2000. Retrieval by shape similarity with perceptual distance and effective indexing. IEEE Trans. Multimedia 2, 4, 225--239. Google ScholarDigital Library
Bhargava, R., Kargupta, H., and Powers, M. 2003. Energy consumption in data analysis for on-board and distributed applications. In Proceedings of the ICML. Vol. 3.Google Scholar
Bicego, M., Murino, V., and Figueiredo, M. 2003. Similarity-Based clustering of sequences using hidden Markov models. Lecture Notes in Computer Science, vol. 2743. Springer, 95--104.Google Scholar
Bohm, C., Berchtold, S., and Keim, D. 2001. Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases. ACM Comput. Surv. 33, 3, 322--373. Google ScholarDigital Library
Bollobas, B., Das, G., Gunopulos, D., and Mannila, H. 1997. Time-Series similarity problems and well-separated geometric sets. In Proceedings of the 13th Symposium on Computational Geometry. 454--456. Google ScholarDigital Library
Box, G., Jenkins, G., and Reinsel, G. 1976. Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco. Google ScholarDigital Library
Brockwell, P. and Davis, R. 2002. Introduction to Time Series and Forecasting. Springer.Google Scholar
Brockwell, P. and Davis, R. 2009. Time Series: Theory and Methods. Springer.Google Scholar
Buhler, J. and Tompa, M. 2002. Finding motifs using random projections. J. Comput. Biol. 9, 2, 225--242.Google ScholarCross Ref
Burkom, H., Murphy, S., and Shmueli, G. 2007. Automated time series forecasting for biosurveillance. Statist. Medi. 26, 22, 4202--4218.Google ScholarCross Ref
Cai, Y. and Ng, R. 2004. Indexing spatio-temporal trajectories with Chebyshev polynomials. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 599--610. Google ScholarDigital Library
Cao, L. and Tay, F. 2009. Feature selection for support vector machines in financial time series forecasting. In Intelligent Data Engineering and Automated Learning. Lecture Notes in Computer Science, vol. 1983. Springer, 41--65. Google ScholarDigital Library
Chakrabarti, K. and Mehrotra, S. 1999. The hybrid tree: An index structure for high dimensional feature spaces. In Proceedings of the 15th International Conference on Data Engineering. 440--447. Google ScholarDigital Library
Chan, F., Fu, A., and Yu, C. 2003. Haar wavelets for efficient similarity search of time-series: With and without time warping. IEEE Trans. Knowl. Data Engin. 15, 3, 686--705. Google ScholarDigital Library
Chan, K. and Fu, A. 1999. Efficient time series matching by wavelets. In Proceedings of the 15th IEEE International Conference on Data Engineering. 126--133. Google ScholarDigital Library
Chandola, V., Banerjee, A., and Kumar, V. 2009. Anomaly detection: A survey. ACM Comput. Surv. 41, 3, 15. Google ScholarDigital Library
Chappelier, J. and Grumbach, A. 1996. A Kohonen map for temporal sequences. In Proceedings of the Conference on Neural Networks and Their Applications. 104--110.Google Scholar
Chen, L. and Ng, R. 2004. On the marriage of Lp-norms and edit distance. In Proceedings of the 30th International Conference on Very Large Data Bases. 792--803. Google ScholarDigital Library
Chen, L., Ozsu, M., and Oria, V. 2005. Robust and fast similarity search for moving object trajectories. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 491--502. Google ScholarDigital Library
Chen, Q., Chen, L., Lian, X., Liu, Y., and Yu, J. 2007a. Indexable PLA for efficient similarity search. In Proceedings of the 33rd International Conference on Very Large Data Bases. VLDB Endowment, 435--446. Google ScholarDigital Library
Chen, X., Kwong, S., and Li, M. 2000. A compression algorithm for DNA sequences and its applications in genome comparison. In Proceedings of the 4th Annual International Conference on Computational Molecular Biology. 107. Google ScholarDigital Library
Chen, X. and Zhan, Y. 2008. Multi-Scale anomaly detection algorithm based on infrequent pattern of time series. J. Comput. Appl. Math. 214, 1, 227--237. Google ScholarDigital Library
Chen, Y., Nascimento, M., Ooi, B., and Tung, A. 2007b. Spade: On shape-based pattern detection in streaming time series. In Proceedings of the IEEE 23rd International Conference on Data Engineering. 786--795.Google Scholar
Chhieng, V. and Wong, R. 2010. Adaptive distance measurement for time series databases. In Lecture Notes in Computer Science, vol. 4443. Springer, 598--610. Google ScholarDigital Library
Chiu, B., Keogh, E., and Lonardi, S. 2003. Probabilistic discovery of time series motifs. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 493--498. Google ScholarDigital Library
Chuah, M. and Fu, F. 2007. ECG anomaly detection via time series analysis. In Frontiers of High Performance Computing and Networking ISPA 07 Workshops. Springer, 123--135. Google ScholarDigital Library
Corduas, M. and Piccolo, D. 2008. Time series clustering and classification by the autoregressive metric. Comput. Statist. Data Anal. 52, 4, 1860--1872. Google ScholarDigital Library
Cormode, G., Muthukrishnan, S., and Zhuang, W. 2007. Conquering the divide: Continuous clustering of distributed data streams. In Proceedings of the IEEE 23rd International Conference on Data Engineering. 1036--1045.Google Scholar
Costa Santos, C., Bernardes, J., Vitanyi, P., and Antunes, L. 2006. Clustering fetal heart rate tracings by compression. In Proceedings of the 19th International Symposium on Computer-Based Medical Systems. 685--690. Google ScholarDigital Library
Das, G., Gunopulos, D., and Mannila, H. 1997. Finding similar time series. In Proceedings of the 1st European Symposium on Principles of Data Mining and Knowledge Discovery (PKDD'97). Springer, 88--100. Google ScholarDigital Library
Degli Esposti, M., Farinelli, C., and Menconi, G. 2009. Sequence distance via parsing complexity: Heartbeat signals. Chaos, Sol. Fractals 39, 3, 991--999.Google ScholarCross Ref
Deng, K., Moore, A., and Nechyba, M. 1997. Learning to recognize time series: Combining ARMA models with memory-based learning. In Proceedings of the IEEE International Symposium on Computational Intelligence in Robotics and Automation. 246--251. Google ScholarDigital Library
Denton, A. 2005. Kernel-Density-Based clustering of time series subsequences using a continuous random-walk noise model. In Proceedings of the 5th IEEE International Conference on Data Mining. 122--129. Google ScholarDigital Library
Ding, H., Trajcevski, G., Scheuermann, P., Wang, X., and Keogh, E. 2008. Querying and mining of time series data: Experimental comparison of representations and distance measures. Proc. VLDB Endowm. 1, 2, 1542--1552. Google ScholarDigital Library
Domingos, P. and Hulten, G. 2000. Mining high-speed data streams. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 71--80. Google ScholarDigital Library
Dong, G., Han, J., Lakshmanan, L., Pei, J., Wang, H., and Yu, P. 2003. Online mining of changes from data streams: Research problems and preliminary results. In Proceedings of the ACM SIGMOD Workshop on Management and Processing of Data Streams.Google Scholar
Faloutsos, C. and Megalooikonomou, V. 2007. On data mining, compression, and kolmogorov complexity. Data Min. Knowl. Discov. 15, 1, 3--20. Google ScholarDigital Library
Faloutsos, C., Ranganathan, M., and Manolopulos, Y. 1994. Fast subsequence matching in time-series databases. SIGMOD Rec. 23, 419--429. Google ScholarDigital Library
Ferreira, P., Azevedo, P., Silva, C., and Brito, R. 2006. Mining approximate motifs in time series. In Lecture Notes in Computer Science, vol. 4265. Springer, 89--101. Google ScholarDigital Library
Flanagan, J. 2003. A non-parametric approach to unsupervised learning and clustering of symbol strings and sequences. In Proceedings of the 4th Workshop on Self-Organizing Maps (WSOM03). 128--133.Google Scholar
Frentzos, E., Gratsias, K., and Theodoridis, Y. 2007. Index-Based most similar trajectory search. In Proceedings of the IEEE 23rd International Conference on Data Engineering. 816--825.Google Scholar
Fröhwirth-Schnatter, S. and Kaufmann, S. 2008. Model-Based clustering of multiple time series. J. Bus. Econ. Statist. 26, 1, 78--89.Google ScholarCross Ref
Fu, A., Keogh, E., Lau, L., Ratanamahatana, C., and Wong, R. 2008. Scaling and time warping in time series querying. The VLDB J. Int. J. Very Large Data Bases 17, 4, 921. Google ScholarDigital Library
Fuchs, E., Gruber, T., Pree, H., and Sick, B. 2010. Temporal data mining using shape space representations of time series. Neurocomput. 74, 1-3, 379--393. Google ScholarDigital Library
Gaber, M., Zaslavsky, A., and Krishnaswamy, S. 2005. Mining data streams: A review. ACM SIGMOD Rec. 34, 2, 18--26. Google ScholarDigital Library
Gaffney, S. and Smyth, P. 1999. Trajectory clustering with mixtures of regression models. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 63--72. Google ScholarDigital Library
Ge, X. and Smyth, P. 2000. Deformable Markov model templates for time-series pattern matching. In Proceedings of the 6th ACM International Conference on Knowledge Discovery and Data Mining. 81--90. Google ScholarDigital Library
Geurts, P. 2001. Pattern extraction for time series classification. In Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery. 115--127. Google ScholarDigital Library
Golab, L. and Ozsu, M. 2003. Issues in data stream management. ACM SIGMOD Rec. 32, 2, 5--14. Google ScholarDigital Library
Goldin, D. and Kanellakis, P. 1995. On similarity queries for time-series data: Constraint specification and implementation. In Proceedings of the Principles and Practice of Constraint Programming (CP95). Springer, 137--153. Google ScholarDigital Library
Goldin, D., Millstein, T., and Kutlu, A. 2004. Bounded similarity querying for time-series data. Info. Comput. 194, 2, 203--241. Google ScholarDigital Library
Gullo, F., Ponti, G., Tagarelli, A., and Greco, S. 2009. A time series representation model for accurate and fast similarity detection. Pattern Recogn. 42, 11, 2998--3014. Google ScholarDigital Library
Gupta, S., Ray, A., and Keller, E. 2007. Symbolic time series analysis of ultrasonic data for early detection of fatigue damage. Mechan. Syst. Signal Process. 21, 2, 866--884.Google ScholarCross Ref
Gusfield, D. 1997. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press. Google ScholarDigital Library
Han, J. and Kamber, M. 2006. Data Mining: Concepts and Techniques. Morgan Kaufmann. Google ScholarDigital Library
Harris, R. and Sollis, R. 2003. Applied Time Series Modelling and Forecasting. J. Wiley.Google Scholar
Hebrail, G. and Hugueney, B. 2000. Symbolic representation of long time-series. In Symbolic Data Analysis at the 4th European Conference on Principles of Data Mining and Knowledge Discovery. 56--65.Google Scholar
Hellerstein, J., Koutsoupias, E., and Papadimitriou, C. 1997. On the analysis of indexing schemes. In Proceedings of the 16th ACM Symposium on Principles of Database Systems. 249--256. Google ScholarDigital Library
Herrera, L., Pomares, H., Rojas, I., Guillén, A., Prieto, A., and Valenzuela, O. 2007. Recursive prediction for long term time series forecasting using advanced models. Neurocomput. 70, 16-18, 2870--2880. Google ScholarDigital Library
Himberg, J., Korpiaho, K., Tikanmaki, J., and Toivonen, H. 2001a. Time series segmentation for context recognition in mobile devices. In Proceedings of the 1st IEEE International Conference on Data Mining. 203--210. Google ScholarDigital Library
Himberg, J., Mantyjarvi, J., and Korpipaa, P. 2001b. Using PCA and ICA for exploratory data analysis in situation awareness. In Proceedings of the International Conference on Multisensor Fusion and Integration for Intelligent Systems. 127--131.Google Scholar
Huang, Y. and Yu, P. 1999. Adaptive query processing for time-series data. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 282--286. Google ScholarDigital Library
Hulten, G., Spencer, L., and Domingos, P. 2001. Mining time-changing data streams. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 97--106. Google ScholarDigital Library
Indyk, P., Koudas, N., and Muthukrishnan, S. 2000. Identifying representative trends in massive time series data sets using sketches. In Proceedings of the 26th International Conference on Very Large Data Bases. Morgan Kaufmann Publishers Inc., 363--372. Google ScholarDigital Library
Janacek, G., Bagnall, A., and Powell, M. 2005. A likelihood ratio distance measure for the similarity between the Fourier transform of time series. In Lecture Notes in Computer Science, vol. 3518. Springer, 737--743. Google ScholarDigital Library
Jeng, S. and Huang, Y. 2008. Time series classification based on spectral analysis. Comm. Statisti. Simul. Comput. 37, 1, 132--142.Google ScholarCross Ref
Kalpakis, K., Gada, D., and Puttagunta, V. 2001. Distance measures for effective clustering of ARIMA time-series. In Proceedings of the IEEE International Conference on Data Mining. 273--280. Google ScholarDigital Library
Kauppinen, H., Seppanen, T., and Pietikainen, M. 1995. An experimental comparison of autoregressive and Fourier-based descriptors in 2D shape classification. IEEE Trans. Pattern Anal. Mach. Intell. 17, 2, 201--207. Google ScholarDigital Library
Kehagias, A. 2004. A hidden Markov model segmentation procedure for hydrological and environmental time series. Stochas. Environ. Res. Risk Assessm. 18, 2, 117--130.Google ScholarCross Ref
Keogh, E., Chakrabarti, K., and Pazzani, M. 2001a. Locally adaptive dimensionality reduction for indexing large time series databases. In Proceedings of ACM Conference on Management of Data. 151--162. Google ScholarDigital Library
Keogh, E., Chakrabarti, K., Pazzani, M., and Mehrotra, S. 2001b. Dimensionality reduction for fast similarity search in large time series databases. Knowl. Info. Syst. 3, 3, 263--286.Google ScholarCross Ref
Keogh, E., Chu, S., Hart, D., and Pazzani, M. 2003a. Segmenting time series: A survey and novel approach. Data Min. Time Series Databases, 1--21.Google Scholar
Keogh, E. and Kasetty, S. 2002. On the need for time series data mining benchmarks : A survey and empirical demonstration. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 102--111. Google ScholarDigital Library
Keogh, E. and Kasetty, S. 2003. On the need for time series data mining benchmarks: A survey and empirical demonstration. Data Min. Knowl. Discov. 7, 4, 349--371. Google ScholarDigital Library
Keogh, E., Lin, J., Lee, S., and Herle, H. 2007. Finding the most unusual time series subsequence: Algorithms and applications. Knowl. Inf. Syst. 11, 1, 1--27. Google ScholarDigital Library
Keogh, E., Lin, J., and Truppel, W. 2003b. Clustering of time series subsequences is meaningless: Implications for previous and future research. In Proceedings of the 3rd IEEE International Conference on Data Mining. 115--122. Google ScholarDigital Library
Keogh, E., Lonardi, S., and Ratanamahatana, C. 2004. Towards parameter-free data mining. In Proceedings of 10th ACM International Conference on Knowledge Discovery and Data Mining. 206--215. Google ScholarDigital Library
Keogh, E. and Pazzani, M. 1998. An enhanced representation of time series which allows fast and accurate classification, clustering and relevance feedback. In Proceedings of the 4th International Conference of Knowledge Discovery and Data Mining. AAAI Press, 239--241.Google ScholarDigital Library
Keogh, E. and Ratanamahatana, C. 2005. Exact indexing of dynamic time warping. Knowl. Info. Syst. 7, 3, 358--386. Google ScholarDigital Library
Kerr, G., Ruskin, H., Crane, M., and Doolan, P. 2008. Techniques for clustering gene expression data. Comput. Biol. Med. 38, 3, 283--293. Google ScholarDigital Library
Kim, S., Park, S., and Chu, W. 2001. An index-based approach for similarity search supporting time warping in large sequence databases. In Proceedings of the 17th International Conference on Data Engineering. IEEE Computer Society, 607--614. Google ScholarDigital Library
Kontaki, M., Papadopoulos, A., and Manolopoulos, Y. 2007. Adaptive similarity search in streaming time series with sliding windows. Data Knowl. Engin. 63, 2, 478--502. Google ScholarDigital Library
Kontaki, M., Papadopoulos, A., and Manolopoulos, Y. 2009. Similarity search in time series. In Handbook of Research on Innovations in Database Technologies and Applications. 288--299.Google Scholar
Korn, F., Jagadish, H., and Faloutsos, C. 1997. Efficiently supporting ad hoc queries in large datasets of time sequences. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 289--300. Google ScholarDigital Library
Koskela, T. 2003. Neural network methods in analysing and modelling time varying processes. Ph.D. thesis, Helsinki University of Technology Laboratory of Computational Engineering.Google Scholar
Kumar, N., Lolla, N., Keogh, E., Lonardi, S., Ratanamahatana, C., and Wei, L. 2005. Time Series bitmaps: A practical visualization tool for working with large time-series databases. In Proceedings of the SIAM Data Mining Conference. 531--535.Google Scholar
Latecki, L., Megalooikonomou, V., Wang, Q., Lakaemper, R., Ratanamahatana, C., and Keogh, E. 2005. Elastic partial matching of time series. Knowl. Discov. Databases, 577--584. Google ScholarDigital Library
Latecki, L., Wang, Q., Koknar-Tezel, S., and Megalooikonomou, V. 2007. Optimal subsequence bijection. In Proceedings of the IEEE International Conference on Data Mining (ICDM). 565--570. Google ScholarDigital Library
Law, M. and Kwok, J. 2000. Rival penalized competitive learning for model-based sequence clustering. In Proceedings of the 15th International Conference on Pattern Recognition. Vol. 2. 2186--2195.Google Scholar
Li, C., Yu, P., and Castelli, V. 1998. MALM: A framework for mining sequence database at multiple abstraction levels. In Proceedings of the 7th International Conference on Information and Knowledge Management. ACM, 267--272. Google ScholarDigital Library
Lian, X. and Chen, L. 2007. Efficient similarity search over future stream time series. IEEE Trans. Knowl. Data Engin. 20, 1, 40--54. Google ScholarDigital Library
Lian, X., Chen, L., and Wang, B. 2010. Approximate similarity search over multiple stream time series. In Lecture Notes in Computer Science, vol. 4443. Springer, 962--968. Google ScholarDigital Library
Liao, T. 2005. Clustering of time series data--A survey. Pattern Recogn. 38, 11, 1857--1874. Google ScholarDigital Library
Liew, A., Leung, S., and Lau, W. 2000. Fuzzy image clustering incorporating spatial continuity. IEEE Proc. Vis. Image Signal Process. 147, 2, 185--192.Google ScholarCross Ref
Lin, J. and Keogh, E. 2005. Clustering of time-series subsequences is meaningless: Implications for previous and future research. Knowl. Info. Syst 8, 2, 154--177. Google ScholarDigital Library
Lin, J., Keogh, E., Lonardi, S., and Chiu, B. 2003. A symbolic representation of time series, with implications for streaming algorithms. In Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. ACM New York, 2--11. Google ScholarDigital Library
Lin, J., Keogh, E., Lonardi, S., Lankford, J., and Nystrom, D. 2004. Visually mining and monitoring massive time series. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 460--469. Google ScholarDigital Library
Lin, J. and Li, Y. 2009. Finding structural similarity in time series data using bag-of-patterns representation. In Proceedings of the 21st International Conference on Scientific and Statistical Database Management. Springer, 461--477. Google ScholarDigital Library
Lin, T., Kaminski, N., and Bar-Joseph, Z. 2008. Alignment and classification of time series gene expression in clinical studies. Bioinf. 24, 13, 147--155. Google ScholarDigital Library
Liu, Z., Yu, J., Lin, X., Lu, H., and Wang, W. 2005. Locating Motifs in Time Series Data. Springer, 343--353. Google ScholarDigital Library
Lotte, F., Congedo, M., Lécuyer, A., Lamarche, F., and Arnaldi, B. 2007. A review of classification algorithms for EEG-based brain--computer interfaces. J. Neural Engin. 4, 1--13.Google ScholarCross Ref
Lowitz, T., Ebert, M., Meyer, W., and Hensel, B. 2009. Hidden markov models for classification of heart rate variability in RR time series. In World Congress on Medical Physics and Biomedical Engineering. Springer, 1980--1983.Google Scholar
Ma, J. and Perkins, S. 2003. Online novelty detection on temporal sequences. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 613--618. Google ScholarDigital Library
Mannila, H. and Seppnen, J. 2001. Recognizing similar situations from event sequences. In Proceedings of the 1st SIAM Conference on Data Mining. 1--16.Google Scholar
Marteau, P. 2008. Time warp edit distance with stiffness adjustment for time series matching. IEEE Trans. Pattern Anal. Mach. Intell. 31, 2, 306--318. Google ScholarDigital Library
Megalooikonomou, V., Li, G., and Wang, Q. 2004. A dimensionality reduction technique for efficient similarity analysis of time series databases. In Proceedings of the 13th ACM International Conference on Information and Knowledge Management. ACM, 160--161. Google ScholarDigital Library
Megalooikonomou, V., Wang, Q., Li, G., and Faloutsos, C. 2005. A multiresolution symbolic representation of time series. In Proceedings of the 21st International Conference on Data Engineering. 668--679. Google ScholarDigital Library
Mohammad, Y. and Nishida, T. 2009. Constrained motif discovery in time series. New Gener. Comput. 27, 4, 319--346.Google ScholarCross Ref
Morse, M. and Patel, J. 2007. An efficient and accurate method for evaluating time series similarity. In Proceedings of the ACM International Conference on Management of Data. 569--580. Google ScholarDigital Library
Mueen, A., Keogh, E., Zhu, Q., Cash, S., and Westover, B. 2009. Exact discovery of time series motifs. In Proceedings of the SIAM International Conference on Data Mining (SDM). 473--484.Google Scholar
Muhammad Fuad, M. and Marteau, P. 2008. Extending the edit distance using frequencies of common characters. In Proceedings of the 19th International Conference on Database and Expert Systems Applications. Springer, 150--157. Google ScholarDigital Library
Nanopoulos, A., Alcock, R., and Manolopoulos, Y. 2001. Feature-Based classification of time-series data. In Information Processing and Technology. 49--61. Google ScholarDigital Library
Ogras, Y. and Ferhatosmanoglu, H. 2006. Online summarization of dynamic time series data. The VLDB J. The Int. J. Very Large Data Bases 15, 1, 84--98. Google ScholarDigital Library
Otu, H. and Sayood, K. 2003. A new sequence distance measure for phylogenetic tree construction. Bioinf. 19, 16, 2122--2130.
Ouyang, R., Ren, L., Cheng, W., and Zhou, C. 2010. Similarity search and pattern discovery in hydrological time series data mining. Hydrol. Process. 24, 9, 1198--1210.
Palpanas, T., Keogh, E., Zordan, V., Gunopulos, D., and Cardle, M. 2004a. Indexing large human-motion databases. In Proceedings of the 13th International Conference on Very Large Data Bases. 780--791.
Palpanas, T., Vlachos, M., Keogh, E., and Gunopulos, D. 2008. Streaming time series summarization using user-defined amnesic functions. IEEE Trans. Knowl. Data Engin. 20, 7, 992--1006.
Palpanas, T., Vlachos, M., Keogh, E., Gunopulos, D., and Truppel, W. 2004b. Online amnesic approximation of streaming time series. In Proceedings of the 20th International Conference on Data Engineering. 338--349.
Panuccio, A., Bicego, M., and Murino, V. 2002. A hidden Markov model-based approach to sequential data clustering. In Lecture Notes in Computer Science, vol. 2396. Springer, 734--743.
Papadimitriou, S., Sun, J., and Yu, P. 2006. Local correlation tracking in time series. In Proceedings of the 6th International Conference on Data Mining. 456--465.
Papadimitriou, S. and Yu, P. 2006. Optimal multi-scale patterns in time series streams. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 647--658.
Park, S., Chu, W., Yoon, J., and Hsu, C. 2000. Efficient searches for similar subsequences of different lengths in sequence databases. In Proceedings of the 16th International Conference on Data Engineering. 23--32.
Park, S., Lee, D., and Chu, W. 1999. Fast retrieval of similar subsequences in long sequence databases. In Proceedings of the 3rd IEEE Knowledge and Data Engineering Exchange Workshop. 60--67.
Patel, P., Keogh, E., Lin, J., and Lonardi, S. 2002. Mining motifs in massive time series databases. In Proceedings of IEEE International Conference on Data Mining (ICDM02). 370--377.
Perng, C., Wang, H., Zhang, S., and Parker, D. 2000. Landmarks: A new model for similarity-based pattern querying in time series databases. In Proceedings of the 16th International Conference on Data Engineering. 33--42.
Pesaran, M., Pettenuzzo, D., and Timmermann, A. 2006. Forecasting time series subject to multiple structural breaks. Rev. Econ. Studies 73, 4, 1057--1084.
Popivanov, I. and Miller, R. 2002. Similarity search over time-series data using wavelets. In Proceedings of the International Conference on Data Engineering. 212--224.
Povinelli, R., Johnson, M., Lindgren, A., and Ye, J. 2004. Time series classification using Gaussian mixture models of reconstructed phase spaces. IEEE Trans. Knowl. Data Engin. 16, 6, 779--783.
Rafiei, D. and Mendelzon, A. 1998. Efficient retrieval of similar time sequences using DFT. In Proceedings of the 5th International Conference of Foundations of Data Organization and Algorithms. 249--257.
Ratanamahatana, C. and Keogh, E. 2004a. Everything you know about dynamic time warping is wrong. In Proceedings of the 3rd Workshop on Mining Temporal and Sequential Data. 1--11.
Ratanamahatana, C. and Keogh, E. 2004b. Making time-series classification more accurate using learned constraints. In Proceedings of SIAM International Conference on Data Mining. 11--22.
Ratanamahatana, C., Keogh, E., Bagnall, A., and Lonardi, S. 2005. A novel bit level time series representation with implication of similarity search and clustering. Adv. Knowl. Discov. Data Min. 771--777.
Ratanamahatana, C. and Wanichsan, D. 2008. Stopping criterion selection for efficient semi-supervised time series classification. Studies Comput. Intell. 149, 1--14.
Ravi Kanth, K., Agrawal, D., and Singh, A. 1998. Dimensionality reduction for similarity searching in dynamic databases. ACM SIGMOD Rec. 27, 2, 166--176.
Reeves, G., Liu, J., Nath, S., and Zhao, F. 2009. Managing massive time series streams with multi-scale compressed trickles. Proc. VLDB Endow. 2, 1, 97--108.
Reinert, G., Schbath, S., and Waterman, M. 2000. Probabilistic and statistical properties of words: An overview. J. Comput. Biol. 7, 1-2, 1--46.
Rodriguez, J. and Kuncheva, L. 2007. Time series classification: Decision forests and SVM on interval and DTW features. In Proceedings of the Workshop on Time Series Classification, 13th International Conference on Knowledge Discovery and Data Mining.
Sakurai, Y., Yoshikawa, M., and Faloutsos, C. 2005. FTW: Fast similarity search under the time warping distance. In Proceedings of the 24th ACM Symposium on Principles of Database Systems. 326--337.
Sakurai, Y., Yoshikawa, M., Uemura, S., and Kojima, H. 2000. The A-tree: An index structure for high-dimensional spaces using relative approximation. In Proceedings of the 26th International Conference on Very Large Data Bases. 516--526.
Salvador, S. and Chan, P. 2007. Toward accurate dynamic time warping in linear time and space. Intell. Data Anal. 11, 5, 561--580.
Salvador, S., Chan, P., and Brodie, J. 2004. Learning states and rules for time series anomaly detection. In Proceedings of the 17th International FLAIRS Conference. 300--305.
Sebastian, T., Klein, P., and Kimia, B. 2003. On aligning curves. IEEE Trans. Pattern Anal. Mach. Intell. 25, 1, 116--125.
Sebastiani, P., Ramoni, M., Cohen, P., Warwick, J., and Davis, J. 1999. Discovering dynamics using Bayesian clustering. In Lecture Notes in Computer Science, vol. 1642. Springer, 199--209.
Sfetsos, A. and Siriopoulos, C. 2004. Time series forecasting with a hybrid clustering scheme and pattern recognition. IEEE Trans. Syst. Man Cybernet. A 34, 3, 399--405.
Shasha, D. and Zhu, Y. 2004. High Performance Discovery in Time Series: Techniques and Case Studies. Springer.
Shatkay, H. and Zdonik, S. 1996. Approximate queries and representations for large data sequences. In Proceedings of the 12th International Conference on Data Engineering. 536--545.
Shieh, J. and Keogh, E. 2008. ISAX: Indexing and mining terabyte sized time series. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 623--631.
Smyth, P. 1997. Clustering sequences with hidden Markov models. Adv. Neural Info. Process. Syst. 648--654.
Song, H. and Li, G. 2008. Tourism demand modelling and forecasting--A review of recent research. Tour. Manag. 29, 2, 203--220.
Sorjamaa, A., Hao, J., Reyhani, N., Ji, Y., and Lendasse, A. 2007. Methodology for long-term prediction of time series. Neurocomput. 70, 16-18, 2861--2869.
Srisai, D. and Ratanamahatana, C. 2009. Efficient time series classification under template matching using time warping alignment. In Proceedings of the 4th International Conference on Computer Sciences and Convergence Information Technology. IEEE, 685--690.
Stiefmeier, T., Roggen, D., and Troster, G. 2007. Gestures are strings: Efficient online gesture spotting and classification using string matching. In Proceedings of the ICST 2nd International Conference on Body Area Networks. 1--8.
Struzik, Z. and Siebes, A. 1999. Measuring time series similarity through large singular features revealed with wavelet transformation. In Proceedings of the 10th International Workshop on Database and Expert Systems Applications. 162--166.
Subasi, A. 2007. EEG signal classification using wavelet feature extraction and a mixture of expert model. Expert Syst. Appl. 32, 4, 1084--1093.
Tang, H. and Liao, S. 2008. Discovering original motifs with different lengths from time series. Knowl.-Based Syst. 21, 7, 666--671.
Tsay, R. 2005. Analysis of Financial Time Series. Wiley-Interscience.
Vasko, K. and Toivonen, H. 2002. Estimating the number of segments in time series data using permutation tests. In Proceedings of the IEEE International Conference on Data Mining. 466--473.
Vlachos, M., Gunopulos, D., and Kollios, G. 2002. Discovering similar multidimensional trajectories. In Proceedings of the 18th International Conference on Data Engineering. IEEE Computer Society, 673--684.
Vlachos, M., Gunopulos, D., and Das, G. 2004. Indexing time series under conditions of noise. In Data Mining in Time Series Databases. 67--100.
Vlachos, M., Hadjieleftheriou, M., Gunopulos, D., and Keogh, E. 2006. Indexing multidimensional time series. The VLDB J. 15, 1, 1--20.
Vlachos, M., Lin, J., Keogh, E., and Gunopulos, D. 2003. A wavelet-based anytime algorithm for k-means clustering of time series. In Proceedings of the Workshop on Clustering High Dimensionality Data and Its Applications. 23--30.
Vlachos, M., Yu, P., and Castelli, V. 2005. On periodicity detection and structural periodic similarity. In Proceedings of the SIAM International Conference on Data Mining. 449--460.
Wagner, N., Michalewicz, Z., Khouja, M., and McGregor, R. 2007. Time series forecasting for dynamic environments: The DyFor genetic program model. IEEE Trans. Evolut. Comput. 11, 4, 433--452.
Weigend, A. and Gershenfeld, N. 1994. Time Series Prediction: Forecasting the Future and Understanding the Past. Addison Wesley.
Weiss, G. 2004. Mining with rarity: A unifying framework. ACM SIGKDD Explor. Newslett. 6, 1, 7--19.
Xi, X., Keogh, E., Shelton, C., Wei, L., and Ratanamahatana, C. 2006. Fast time series classification using numerosity reduction. In Proceedings of the 23rd International Conference on Machine Learning. 1040.
Xi, X., Keogh, E., Wei, L., and Mafra-Neto, A. 2007. Finding motifs in database of shapes. In Proceedings of SIAM International Conference on Data Mining. 249--260.
Xie, J. and Yan, W. 2007. Pattern-Based characterization of time series. Int. J. Info. Syst. Sci. 3, 3, 479--491.
Xiong, Y. and Yeung, D. 2004. Time series clustering with ARMA mixtures. Pattern Recogn. 37, 8, 1675--1689.
Yadav, R., Kalra, P., and John, J. 2007. Time series prediction with single multiplicative neuron model. Appl. Soft Comput. 7, 4, 1157--1163.
Yankov, D., Keogh, E., Medina, J., Chiu, B., and Zordan, V. 2007. Detecting time series motifs under uniform scaling. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 844--853.
Yankov, D., Keogh, E., and Rebbapragada, U. 2008. Disk aware discord discovery: Finding unusual time series in terabyte sized datasets. Knowl. Info. Syst. 17, 2, 241--262.
Ye, D., Wang, X., Keogh, E., and Mafra-Neto, A. 2009. Autocannibalistic and anyspace indexing algorithms with applications to sensor data mining. In Proceedings of the SIAM International Conference on Data Mining (SDM 09). 85--96.
Ye, L. and Keogh, E. 2009. Time series shapelets: A new primitive for data mining. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 947--956.
Yi, B. and Faloutsos, C. 2000. Fast time sequence indexing for arbitrary Lp norms. In Proceedings of the 26th International Conference on Very Large Data Bases. 385--394.
Yi, B., Jagadish, H., and Faloutsos, C. 1998. Efficient retrieval of similar time sequences under time warping. In Proceedings of the 14th International Conference on Data Engineering. 201--208.
Yoon, H., Yang, K., and Shahabi, C. 2005. Feature subset selection and feature ranking for multivariate time series. IEEE Trans. Knowl. Data Engin., 1186--1198.
Ypma, A. and Duin, R. 1997. Novelty detection using self-organizing maps. Progress Connect.-Based Info. Syst. 2, 1322--1325.
Zhan, Y., Chen, X., and Xu, R. 2007. Outlier detection algorithm based on pattern representation of time series. Appl. Res. Comput. 24, 11, 96--99.
Zhang, X., Wu, J., Yang, X., Ou, H., and Lv, T. 2009. A novel pattern extraction method for time series classification. Optimiz. Engin. 10, 2, 253--271.
Zhong, S. and Ghosh, J. 2002. HMMs and coupled HMMs for multi-channel EEG classification. In Proceedings of the IEEE International Joint Conference on Neural Networks. 1154--1159.
Zhong, S., Khoshgoftaar, T., and Seliya, N. 2007. Clustering-Based network intrusion detection. Int. J. Reliab. Qual. Safety Engin. 14, 2, 169--187.

Index Terms

Time-series data mining
1. Information systems
  1. Information retrieval
    1. Document representation
  2. Information systems applications
2. Mathematics of computing
  1. Probability and statistics

Published in

ACM Computing Surveys Volume 45, Issue 1
November 2012
455 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/2379776

Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 7 December 2012
- Revised: 1 August 2011
- Accepted: 1 August 2011
- Received: 1 January 2011
Author Tags
Distance measures
data indexing
data mining
query by content
sequence matching
similarity measures
stream analysis
temporal analysis
time series
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 682
- Total Downloads: 13,651
- Downloads (Last 12 months): 1,127
- Downloads (Last 6 weeks): 126