Abstract
In almost every scientific field, measurements are performed over time. These observations lead to a collection of organized data called time series. The purpose of time-series data mining is to extract all meaningful knowledge from the shape of the data. Although humans have a natural capacity to perform these tasks, they remain a complex problem for computers. In this article, we survey the techniques applied to time-series data mining. The first part is devoted to an overview of the tasks that have captured most of the interest of researchers. Because most time-series tasks rely on the same components for implementation, we divide the literature according to these common aspects, namely representation techniques, distance measures, and indexing methods. The relevant literature is categorized for each individual aspect. Four types of robustness can then be formalized, and any kind of distance can then be classified. Finally, the study presents various research trends and avenues that can be explored in the near future. We hope that this article provides a broad and deep understanding of the time-series data mining research field.
- Abonyi, J., Fell, B., Nemeth, S., and Arva, P. 2003. Fuzzy clustering based segmentation of time-series. In Proceedings of the 5th International Symposium on Intelligent Data Analysis (IDA 03). Springer, 275--285.Google Scholar
- Agrawal, R., Faloutsos, C., and Swami, A. 1993. Efficient similarity search in sequence databases. In Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms. Springer, 69--84. Google ScholarDigital Library
- Agrawal, R., Lin, K.-I., Sawhney, H. S., and Shim, K. 1995. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings of the 21st International Conference on Very Large Data Bases. Morgan Kaufmann, 490--501. Google ScholarDigital Library
- Ahmed, N., Atiya, A., El Gayar, N., El-Shishiny, H., and Giza, E. 2009. An empirical comparison of machine learning models for time series forecasting. Econometr. Rev. 29, 5, 594--621.Google ScholarCross Ref
- Ahmed, T., Oreshkin, B., and Coates, M. 2007. Machine learning approaches to network anomaly detection. In Proceedings of the 2nd USENIX Workshop on Tackling Computer Systems Problems with Machine Learning Techniques. USENIX Association, 1--6. Google ScholarDigital Library
- An, J., Chen, H., Furuse, K., Ohbo, N., and Keogh, E. 2003. Grid-Based indexing for large time series databases. In Intelligent Data Engineering and Automated Learning. Lecture Notes in Computer Science, vol. 1983. Springer 614--621.Google Scholar
- Antunes, C. and Oliveira, A. 2001. Temporal data mining: An overview. In Proceedings of the KDD Workshop on Temporal Data Mining. 1--13.Google Scholar
- Argyros, T. and Ermopoulos, C. 2003. Efficient subsequence matching in time series databases under time and amplitude transformations. In Proceedings of the 3rd IEEE International Conference on Data Mining. 481--484. Google ScholarDigital Library
- Assent, I., Krieger, R., Afschari, F., and Seidl, T. 2008. The TS-tree: Efficient time series search and retrieval. In Proceedings of the 11th International Conference on Extending Database Technology. 25--29. Google ScholarDigital Library
- Assent, I., Wichterich, M., Krieger, R., Kremer, H., and Seidl, T. 2009. Anticipatory DTW for efficient similarity search in time series databases. Proc. VLDB Endowm. 2, 1, 826--837. Google ScholarDigital Library
- Aßfalg, J., Kriegel, H., Kroger, P., Kunath, P., Pryakhin, A., and Renz, M. 2006. Similarity search on time series based on threshold queries. In Proceedings of the 10th International Conference on Extending Database Technology. 276. Google ScholarDigital Library
- Aßfalg, J., Kriegel, H., Kröger, P., Kunath, P., Pryakhin, A., and Renz, M. 2008. Similarity search in multimedia time series data using amplitude-level features. In Proceedings of the 14th International Conference on Advances in Multimedia Modeling. Springer, 123--133. Google ScholarDigital Library
- Bagnall, A. and Janacek, G. 2005. Clustering time series with clipped data. Mach. Learn. 58, 2, 151--178. Google ScholarDigital Library
- Bagnall, A., Janacek, G., De la Iglesia, B., and Zhang, M. 2003. Clustering time series from mixture polynomial models with discretised data. In Proceedings of the 2nd Australasian Data Mining Workshop. 105--120.Google Scholar
- Bagnall, A., Ratanamahatana, C., Keogh, E., Lonardi, S., and Janacek, G. 2006. A bit level representation for time series data mining with shape based similarity. Data Min. Knowl. Discov. 13, 1, 11--40. Google ScholarDigital Library
- Bai, J. and Ng, S. 2008. Forecasting economic time series using targeted predictors. J. Econometr. 146, 2, 304--317.Google ScholarCross Ref
- Bakshi, B. and Stephanopoulos, G. 1994. Representation of process trends--IV. Induction of real-time patterns from operating data for diagnosis and supervisory control. Comput. Chemi. Engin. 18, 4, 303--332.Google ScholarCross Ref
- Bakshi, B. and Stephanopoulos, G. 1995. Reasoning in time: Modeling, analysis, and pattern recognition of temporal process trends. Adv. Chem. Engin. 22, 485--548.Google ScholarCross Ref
- Bandera, J., Marfil, R., Bandera, A., Rodríguez, J., Molina-Tanco, L., and Sandoval, F. 2009. Fast gesture recognition based on a two-level representation. Pattern Recogn. Lett. 30, 13, 1181--1189. Google ScholarDigital Library
- Barone, P., Carfora, M., and March, R. 2009. Segmentation, classification and denoising of a time series field by a variational method. J. Math. Imag. Vis. 34, 2, 152--164. Google ScholarDigital Library
- Barreto, G. 2007. Time series prediction with the self-organizing map: A review. Perspect. Neural-Symbo. Integr. 77, 1, 135--158.Google ScholarCross Ref
- Bartolini, I., Ciaccia, P., and Patella, M. 2005. Warp: Accurate retrieval of shapes using phase of fourier descriptors and time warping distance. IEEE Trans. Pattern Anal. Mach. Intell. 27, 1, 142--147. Google ScholarDigital Library
- Bayer, R. and McCreight, E. 1972. Organization and maintenance of large ordered indexes. Acta Info. 1, 3, 173--189.Google ScholarDigital Library
- Beckmann, N., Kriegel, H., Schneider, R., and Seeger, B. 1990. The R*-tree: An efficient and robust access method for points and rectangles. ACM SIGMOD Rec. 19, 2, 322--331. Google ScholarDigital Library
- Berchtold, S., Keim, D., and Kriegel, H. 2002. The X-tree: An index structure for high-dimensional data. Read. Multimedia Comput. Netw. 4, 1, 451--463. Google ScholarDigital Library
- Berkhin, P. 2006. A survey of clustering data mining techniques. Group. Multidimen. Data, 25--71.Google Scholar
- Berndt, D. and Clifford, J. 1994. Using dynamic time warping to find patterns in time series. In Proceedings of the AAAI-94 Workshop on Knowledge Discovery in Databases. 229--248.Google Scholar
- Berretti, S., Del Bimbo, A., and Pala, P. 2000. Retrieval by shape similarity with perceptual distance and effective indexing. IEEE Trans. Multimedia 2, 4, 225--239. Google ScholarDigital Library
- Bhargava, R., Kargupta, H., and Powers, M. 2003. Energy consumption in data analysis for on-board and distributed applications. In Proceedings of the ICML. Vol. 3.Google Scholar
- Bicego, M., Murino, V., and Figueiredo, M. 2003. Similarity-Based clustering of sequences using hidden Markov models. Lecture Notes in Computer Science, vol. 2743. Springer, 95--104.Google Scholar
- Bohm, C., Berchtold, S., and Keim, D. 2001. Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases. ACM Comput. Surv. 33, 3, 322--373. Google ScholarDigital Library
- Bollobas, B., Das, G., Gunopulos, D., and Mannila, H. 1997. Time-Series similarity problems and well-separated geometric sets. In Proceedings of the 13th Symposium on Computational Geometry. 454--456. Google ScholarDigital Library
- Box, G., Jenkins, G., and Reinsel, G. 1976. Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco. Google ScholarDigital Library
- Brockwell, P. and Davis, R. 2002. Introduction to Time Series and Forecasting. Springer.Google Scholar
- Brockwell, P. and Davis, R. 2009. Time Series: Theory and Methods. Springer.Google Scholar
- Buhler, J. and Tompa, M. 2002. Finding motifs using random projections. J. Comput. Biol. 9, 2, 225--242.Google ScholarCross Ref
- Burkom, H., Murphy, S., and Shmueli, G. 2007. Automated time series forecasting for biosurveillance. Statist. Medi. 26, 22, 4202--4218.Google ScholarCross Ref
- Cai, Y. and Ng, R. 2004. Indexing spatio-temporal trajectories with Chebyshev polynomials. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 599--610. Google ScholarDigital Library
- Cao, L. and Tay, F. 2009. Feature selection for support vector machines in financial time series forecasting. In Intelligent Data Engineering and Automated Learning. Lecture Notes in Computer Science, vol. 1983. Springer, 41--65. Google ScholarDigital Library
- Chakrabarti, K. and Mehrotra, S. 1999. The hybrid tree: An index structure for high dimensional feature spaces. In Proceedings of the 15th International Conference on Data Engineering. 440--447. Google ScholarDigital Library
- Chan, F., Fu, A., and Yu, C. 2003. Haar wavelets for efficient similarity search of time-series: With and without time warping. IEEE Trans. Knowl. Data Engin. 15, 3, 686--705. Google ScholarDigital Library
- Chan, K. and Fu, A. 1999. Efficient time series matching by wavelets. In Proceedings of the 15th IEEE International Conference on Data Engineering. 126--133. Google ScholarDigital Library
- Chandola, V., Banerjee, A., and Kumar, V. 2009. Anomaly detection: A survey. ACM Comput. Surv. 41, 3, 15. Google ScholarDigital Library
- Chappelier, J. and Grumbach, A. 1996. A Kohonen map for temporal sequences. In Proceedings of the Conference on Neural Networks and Their Applications. 104--110.Google Scholar
- Chen, L. and Ng, R. 2004. On the marriage of Lp-norms and edit distance. In Proceedings of the 30th International Conference on Very Large Data Bases. 792--803. Google ScholarDigital Library
- Chen, L., Ozsu, M., and Oria, V. 2005. Robust and fast similarity search for moving object trajectories. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 491--502. Google ScholarDigital Library
- Chen, Q., Chen, L., Lian, X., Liu, Y., and Yu, J. 2007a. Indexable PLA for efficient similarity search. In Proceedings of the 33rd International Conference on Very Large Data Bases. VLDB Endowment, 435--446. Google ScholarDigital Library
- Chen, X., Kwong, S., and Li, M. 2000. A compression algorithm for DNA sequences and its applications in genome comparison. In Proceedings of the 4th Annual International Conference on Computational Molecular Biology. 107. Google ScholarDigital Library
- Chen, X. and Zhan, Y. 2008. Multi-Scale anomaly detection algorithm based on infrequent pattern of time series. J. Comput. Appl. Math. 214, 1, 227--237. Google ScholarDigital Library
- Chen, Y., Nascimento, M., Ooi, B., and Tung, A. 2007b. Spade: On shape-based pattern detection in streaming time series. In Proceedings of the IEEE 23rd International Conference on Data Engineering. 786--795.Google Scholar
- Chhieng, V. and Wong, R. 2010. Adaptive distance measurement for time series databases. In Lecture Notes in Computer Science, vol. 4443. Springer, 598--610. Google ScholarDigital Library
- Chiu, B., Keogh, E., and Lonardi, S. 2003. Probabilistic discovery of time series motifs. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 493--498. Google ScholarDigital Library
- Chuah, M. and Fu, F. 2007. ECG anomaly detection via time series analysis. In Frontiers of High Performance Computing and Networking ISPA 07 Workshops. Springer, 123--135. Google ScholarDigital Library
- Corduas, M. and Piccolo, D. 2008. Time series clustering and classification by the autoregressive metric. Comput. Statist. Data Anal. 52, 4, 1860--1872. Google ScholarDigital Library
- Cormode, G., Muthukrishnan, S., and Zhuang, W. 2007. Conquering the divide: Continuous clustering of distributed data streams. In Proceedings of the IEEE 23rd International Conference on Data Engineering. 1036--1045.Google Scholar
- Costa Santos, C., Bernardes, J., Vitanyi, P., and Antunes, L. 2006. Clustering fetal heart rate tracings by compression. In Proceedings of the 19th International Symposium on Computer-Based Medical Systems. 685--690. Google ScholarDigital Library
- Das, G., Gunopulos, D., and Mannila, H. 1997. Finding similar time series. In Proceedings of the 1st European Symposium on Principles of Data Mining and Knowledge Discovery (PKDD'97). Springer, 88--100. Google ScholarDigital Library
- Degli Esposti, M., Farinelli, C., and Menconi, G. 2009. Sequence distance via parsing complexity: Heartbeat signals. Chaos, Sol. Fractals 39, 3, 991--999.Google ScholarCross Ref
- Deng, K., Moore, A., and Nechyba, M. 1997. Learning to recognize time series: Combining ARMA models with memory-based learning. In Proceedings of the IEEE International Symposium on Computational Intelligence in Robotics and Automation. 246--251. Google ScholarDigital Library
- Denton, A. 2005. Kernel-Density-Based clustering of time series subsequences using a continuous random-walk noise model. In Proceedings of the 5th IEEE International Conference on Data Mining. 122--129. Google ScholarDigital Library
- Ding, H., Trajcevski, G., Scheuermann, P., Wang, X., and Keogh, E. 2008. Querying and mining of time series data: Experimental comparison of representations and distance measures. Proc. VLDB Endowm. 1, 2, 1542--1552. Google ScholarDigital Library
- Domingos, P. and Hulten, G. 2000. Mining high-speed data streams. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 71--80. Google ScholarDigital Library
- Dong, G., Han, J., Lakshmanan, L., Pei, J., Wang, H., and Yu, P. 2003. Online mining of changes from data streams: Research problems and preliminary results. In Proceedings of the ACM SIGMOD Workshop on Management and Processing of Data Streams.Google Scholar
- Faloutsos, C. and Megalooikonomou, V. 2007. On data mining, compression, and kolmogorov complexity. Data Min. Knowl. Discov. 15, 1, 3--20. Google ScholarDigital Library
- Faloutsos, C., Ranganathan, M., and Manolopulos, Y. 1994. Fast subsequence matching in time-series databases. SIGMOD Rec. 23, 419--429. Google ScholarDigital Library
- Ferreira, P., Azevedo, P., Silva, C., and Brito, R. 2006. Mining approximate motifs in time series. In Lecture Notes in Computer Science, vol. 4265. Springer, 89--101. Google ScholarDigital Library
- Flanagan, J. 2003. A non-parametric approach to unsupervised learning and clustering of symbol strings and sequences. In Proceedings of the 4th Workshop on Self-Organizing Maps (WSOM03). 128--133.Google Scholar
- Frentzos, E., Gratsias, K., and Theodoridis, Y. 2007. Index-Based most similar trajectory search. In Proceedings of the IEEE 23rd International Conference on Data Engineering. 816--825.Google Scholar
- Fröhwirth-Schnatter, S. and Kaufmann, S. 2008. Model-Based clustering of multiple time series. J. Bus. Econ. Statist. 26, 1, 78--89.Google ScholarCross Ref
- Fu, A., Keogh, E., Lau, L., Ratanamahatana, C., and Wong, R. 2008. Scaling and time warping in time series querying. The VLDB J. Int. J. Very Large Data Bases 17, 4, 921. Google ScholarDigital Library
- Fuchs, E., Gruber, T., Pree, H., and Sick, B. 2010. Temporal data mining using shape space representations of time series. Neurocomput. 74, 1-3, 379--393. Google ScholarDigital Library
- Gaber, M., Zaslavsky, A., and Krishnaswamy, S. 2005. Mining data streams: A review. ACM SIGMOD Rec. 34, 2, 18--26. Google ScholarDigital Library
- Gaffney, S. and Smyth, P. 1999. Trajectory clustering with mixtures of regression models. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 63--72. Google ScholarDigital Library
- Ge, X. and Smyth, P. 2000. Deformable Markov model templates for time-series pattern matching. In Proceedings of the 6th ACM International Conference on Knowledge Discovery and Data Mining. 81--90. Google ScholarDigital Library
- Geurts, P. 2001. Pattern extraction for time series classification. In Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery. 115--127. Google ScholarDigital Library
- Golab, L. and Ozsu, M. 2003. Issues in data stream management. ACM SIGMOD Rec. 32, 2, 5--14. Google ScholarDigital Library
- Goldin, D. and Kanellakis, P. 1995. On similarity queries for time-series data: Constraint specification and implementation. In Proceedings of the Principles and Practice of Constraint Programming (CP95). Springer, 137--153. Google ScholarDigital Library
- Goldin, D., Millstein, T., and Kutlu, A. 2004. Bounded similarity querying for time-series data. Info. Comput. 194, 2, 203--241. Google ScholarDigital Library
- Gullo, F., Ponti, G., Tagarelli, A., and Greco, S. 2009. A time series representation model for accurate and fast similarity detection. Pattern Recogn. 42, 11, 2998--3014. Google ScholarDigital Library
- Gupta, S., Ray, A., and Keller, E. 2007. Symbolic time series analysis of ultrasonic data for early detection of fatigue damage. Mechan. Syst. Signal Process. 21, 2, 866--884.Google ScholarCross Ref
- Gusfield, D. 1997. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press. Google ScholarDigital Library
- Han, J. and Kamber, M. 2006. Data Mining: Concepts and Techniques. Morgan Kaufmann. Google ScholarDigital Library
- Harris, R. and Sollis, R. 2003. Applied Time Series Modelling and Forecasting. J. Wiley.Google Scholar
- Hebrail, G. and Hugueney, B. 2000. Symbolic representation of long time-series. In Symbolic Data Analysis at the 4th European Conference on Principles of Data Mining and Knowledge Discovery. 56--65.Google Scholar
- Hellerstein, J., Koutsoupias, E., and Papadimitriou, C. 1997. On the analysis of indexing schemes. In Proceedings of the 16th ACM Symposium on Principles of Database Systems. 249--256. Google ScholarDigital Library
- Herrera, L., Pomares, H., Rojas, I., Guillén, A., Prieto, A., and Valenzuela, O. 2007. Recursive prediction for long term time series forecasting using advanced models. Neurocomput. 70, 16-18, 2870--2880. Google ScholarDigital Library
- Himberg, J., Korpiaho, K., Tikanmaki, J., and Toivonen, H. 2001a. Time series segmentation for context recognition in mobile devices. In Proceedings of the 1st IEEE International Conference on Data Mining. 203--210. Google ScholarDigital Library
- Himberg, J., Mantyjarvi, J., and Korpipaa, P. 2001b. Using PCA and ICA for exploratory data analysis in situation awareness. In Proceedings of the International Conference on Multisensor Fusion and Integration for Intelligent Systems. 127--131.Google Scholar
- Huang, Y. and Yu, P. 1999. Adaptive query processing for time-series data. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 282--286. Google ScholarDigital Library
- Hulten, G., Spencer, L., and Domingos, P. 2001. Mining time-changing data streams. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 97--106. Google ScholarDigital Library
- Indyk, P., Koudas, N., and Muthukrishnan, S. 2000. Identifying representative trends in massive time series data sets using sketches. In Proceedings of the 26th International Conference on Very Large Data Bases. Morgan Kaufmann Publishers Inc., 363--372. Google ScholarDigital Library
- Janacek, G., Bagnall, A., and Powell, M. 2005. A likelihood ratio distance measure for the similarity between the Fourier transform of time series. In Lecture Notes in Computer Science, vol. 3518. Springer, 737--743. Google ScholarDigital Library
- Jeng, S. and Huang, Y. 2008. Time series classification based on spectral analysis. Comm. Statisti. Simul. Comput. 37, 1, 132--142.Google ScholarCross Ref
- Kalpakis, K., Gada, D., and Puttagunta, V. 2001. Distance measures for effective clustering of ARIMA time-series. In Proceedings of the IEEE International Conference on Data Mining. 273--280. Google ScholarDigital Library
- Kauppinen, H., Seppanen, T., and Pietikainen, M. 1995. An experimental comparison of autoregressive and Fourier-based descriptors in 2D shape classification. IEEE Trans. Pattern Anal. Mach. Intell. 17, 2, 201--207. Google ScholarDigital Library
- Kehagias, A. 2004. A hidden Markov model segmentation procedure for hydrological and environmental time series. Stochas. Environ. Res. Risk Assessm. 18, 2, 117--130.Google ScholarCross Ref
- Keogh, E., Chakrabarti, K., and Pazzani, M. 2001a. Locally adaptive dimensionality reduction for indexing large time series databases. In Proceedings of ACM Conference on Management of Data. 151--162. Google ScholarDigital Library
- Keogh, E., Chakrabarti, K., Pazzani, M., and Mehrotra, S. 2001b. Dimensionality reduction for fast similarity search in large time series databases. Knowl. Info. Syst. 3, 3, 263--286.Google ScholarCross Ref
- Keogh, E., Chu, S., Hart, D., and Pazzani, M. 2003a. Segmenting time series: A survey and novel approach. Data Min. Time Series Databases, 1--21.Google Scholar
- Keogh, E. and Kasetty, S. 2002. On the need for time series data mining benchmarks : A survey and empirical demonstration. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 102--111. Google ScholarDigital Library
- Keogh, E. and Kasetty, S. 2003. On the need for time series data mining benchmarks: A survey and empirical demonstration. Data Min. Knowl. Discov. 7, 4, 349--371. Google ScholarDigital Library
- Keogh, E., Lin, J., Lee, S., and Herle, H. 2007. Finding the most unusual time series subsequence: Algorithms and applications. Knowl. Inf. Syst. 11, 1, 1--27. Google ScholarDigital Library
- Keogh, E., Lin, J., and Truppel, W. 2003b. Clustering of time series subsequences is meaningless: Implications for previous and future research. In Proceedings of the 3rd IEEE International Conference on Data Mining. 115--122. Google ScholarDigital Library
- Keogh, E., Lonardi, S., and Ratanamahatana, C. 2004. Towards parameter-free data mining. In Proceedings of 10th ACM International Conference on Knowledge Discovery and Data Mining. 206--215. Google ScholarDigital Library
- Keogh, E. and Pazzani, M. 1998. An enhanced representation of time series which allows fast and accurate classification, clustering and relevance feedback. In Proceedings of the 4th International Conference of Knowledge Discovery and Data Mining. AAAI Press, 239--241.Google ScholarDigital Library
- Keogh, E. and Ratanamahatana, C. 2005. Exact indexing of dynamic time warping. Knowl. Info. Syst. 7, 3, 358--386. Google ScholarDigital Library
- Kerr, G., Ruskin, H., Crane, M., and Doolan, P. 2008. Techniques for clustering gene expression data. Comput. Biol. Med. 38, 3, 283--293. Google ScholarDigital Library
- Kim, S., Park, S., and Chu, W. 2001. An index-based approach for similarity search supporting time warping in large sequence databases. In Proceedings of the 17th International Conference on Data Engineering. IEEE Computer Society, 607--614. Google ScholarDigital Library
- Kontaki, M., Papadopoulos, A., and Manolopoulos, Y. 2007. Adaptive similarity search in streaming time series with sliding windows. Data Knowl. Engin. 63, 2, 478--502. Google ScholarDigital Library
- Kontaki, M., Papadopoulos, A., and Manolopoulos, Y. 2009. Similarity search in time series. In Handbook of Research on Innovations in Database Technologies and Applications. 288--299.Google Scholar
- Korn, F., Jagadish, H., and Faloutsos, C. 1997. Efficiently supporting ad hoc queries in large datasets of time sequences. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 289--300. Google ScholarDigital Library
- Koskela, T. 2003. Neural network methods in analysing and modelling time varying processes. Ph.D. thesis, Helsinki University of Technology Laboratory of Computational Engineering.Google Scholar
- Kumar, N., Lolla, N., Keogh, E., Lonardi, S., Ratanamahatana, C., and Wei, L. 2005. Time Series bitmaps: A practical visualization tool for working with large time-series databases. In Proceedings of the SIAM Data Mining Conference. 531--535.Google Scholar
- Latecki, L., Megalooikonomou, V., Wang, Q., Lakaemper, R., Ratanamahatana, C., and Keogh, E. 2005. Elastic partial matching of time series. Knowl. Discov. Databases, 577--584. Google ScholarDigital Library
- Latecki, L., Wang, Q., Koknar-Tezel, S., and Megalooikonomou, V. 2007. Optimal subsequence bijection. In Proceedings of the IEEE International Conference on Data Mining (ICDM). 565--570. Google ScholarDigital Library
- Law, M. and Kwok, J. 2000. Rival penalized competitive learning for model-based sequence clustering. In Proceedings of the 15th International Conference on Pattern Recognition. Vol. 2. 2186--2195.Google Scholar
- Li, C., Yu, P., and Castelli, V. 1998. MALM: A framework for mining sequence database at multiple abstraction levels. In Proceedings of the 7th International Conference on Information and Knowledge Management. ACM, 267--272. Google ScholarDigital Library
- Lian, X. and Chen, L. 2007. Efficient similarity search over future stream time series. IEEE Trans. Knowl. Data Engin. 20, 1, 40--54. Google ScholarDigital Library
- Lian, X., Chen, L., and Wang, B. 2010. Approximate similarity search over multiple stream time series. In Lecture Notes in Computer Science, vol. 4443. Springer, 962--968. Google ScholarDigital Library
- Liao, T. 2005. Clustering of time series data--A survey. Pattern Recogn. 38, 11, 1857--1874. Google ScholarDigital Library
- Liew, A., Leung, S., and Lau, W. 2000. Fuzzy image clustering incorporating spatial continuity. IEEE Proc. Vis. Image Signal Process. 147, 2, 185--192.Google ScholarCross Ref
- Lin, J. and Keogh, E. 2005. Clustering of time-series subsequences is meaningless: Implications for previous and future research. Knowl. Info. Syst 8, 2, 154--177. Google ScholarDigital Library
- Lin, J., Keogh, E., Lonardi, S., and Chiu, B. 2003. A symbolic representation of time series, with implications for streaming algorithms. In Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. ACM New York, 2--11. Google ScholarDigital Library
- Lin, J., Keogh, E., Lonardi, S., Lankford, J., and Nystrom, D. 2004. Visually mining and monitoring massive time series. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 460--469. Google ScholarDigital Library
- Lin, J. and Li, Y. 2009. Finding structural similarity in time series data using bag-of-patterns representation. In Proceedings of the 21st International Conference on Scientific and Statistical Database Management. Springer, 461--477. Google ScholarDigital Library
- Lin, T., Kaminski, N., and Bar-Joseph, Z. 2008. Alignment and classification of time series gene expression in clinical studies. Bioinf. 24, 13, 147--155. Google ScholarDigital Library
- Liu, Z., Yu, J., Lin, X., Lu, H., and Wang, W. 2005. Locating Motifs in Time Series Data. Springer, 343--353. Google ScholarDigital Library
- Lotte, F., Congedo, M., Lécuyer, A., Lamarche, F., and Arnaldi, B. 2007. A review of classification algorithms for EEG-based brain--computer interfaces. J. Neural Engin. 4, 1--13.Google ScholarCross Ref
- Lowitz, T., Ebert, M., Meyer, W., and Hensel, B. 2009. Hidden markov models for classification of heart rate variability in RR time series. In World Congress on Medical Physics and Biomedical Engineering. Springer, 1980--1983.Google Scholar
- Ma, J. and Perkins, S. 2003. Online novelty detection on temporal sequences. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 613--618. Google ScholarDigital Library
- Mannila, H. and Seppnen, J. 2001. Recognizing similar situations from event sequences. In Proceedings of the 1st SIAM Conference on Data Mining. 1--16.Google Scholar
- Marteau, P. 2008. Time warp edit distance with stiffness adjustment for time series matching. IEEE Trans. Pattern Anal. Mach. Intell. 31, 2, 306--318. Google ScholarDigital Library
- Megalooikonomou, V., Li, G., and Wang, Q. 2004. A dimensionality reduction technique for efficient similarity analysis of time series databases. In Proceedings of the 13th ACM International Conference on Information and Knowledge Management. ACM, 160--161. Google ScholarDigital Library
- Megalooikonomou, V., Wang, Q., Li, G., and Faloutsos, C. 2005. A multiresolution symbolic representation of time series. In Proceedings of the 21st International Conference on Data Engineering. 668--679. Google ScholarDigital Library
- Mohammad, Y. and Nishida, T. 2009. Constrained motif discovery in time series. New Gener. Comput. 27, 4, 319--346.Google ScholarCross Ref
- Morse, M. and Patel, J. 2007. An efficient and accurate method for evaluating time series similarity. In Proceedings of the ACM International Conference on Management of Data. 569--580. Google ScholarDigital Library
- Mueen, A., Keogh, E., Zhu, Q., Cash, S., and Westover, B. 2009. Exact discovery of time series motifs. In Proceedings of the SIAM International Conference on Data Mining (SDM). 473--484.Google Scholar
- Muhammad Fuad, M. and Marteau, P. 2008. Extending the edit distance using frequencies of common characters. In Proceedings of the 19th International Conference on Database and Expert Systems Applications. Springer, 150--157. Google ScholarDigital Library
- Nanopoulos, A., Alcock, R., and Manolopoulos, Y. 2001. Feature-Based classification of time-series data. In Information Processing and Technology. 49--61. Google ScholarDigital Library
- Ogras, Y. and Ferhatosmanoglu, H. 2006. Online summarization of dynamic time series data. The VLDB J. The Int. J. Very Large Data Bases 15, 1, 84--98. Google ScholarDigital Library
- Otu, H. and Sayood, K. 2003. A new sequence distance measure for phylogenetic tree construction. Bioinf. 19, 16, 2122--2130.Google ScholarCross Ref
- Ouyang, R., Ren, L., Cheng, W., and Zhou, C. 2010. Similarity search and pattern discovery in hydrological time series data mining. Hydrol. Process. 24, 9, 1198--1210.Google ScholarCross Ref
- Palpanas, T., Keogh, E., Zordan, V., Gunopulos, D., and Cardle, M. 2004a. Indexing large human-motion databases. In Proceedings of the 13th International Conference on Very Large Data Bases. 780--791. Google ScholarDigital Library
- Palpanas, T., Vlachos, M., Keogh, E., and Gunopulos, D. 2008. Streaming time series summarization using user-defined amnesic functions. IEEE Trans. Knowl. Data Engin. 20, 7, 992--1006. Google ScholarDigital Library
- Palpanas, T., Vlachos, M., Keogh, E., Gunopulos, D., and Truppel, W. 2004b. Online amnesic approximation of streaming time series. In Proceedings of the 20th International Conference on Data Engineering. 338--349. Google ScholarDigital Library
- Panuccio, A., Bicego, M., and Murino, V. 2002. A hidden Markov model-based approach to sequential data clustering. In Lecture Notes in Computer Science, vol. 2396. Springer, 734--743. Google ScholarDigital Library
- Papadimitriou, S., Sun, J., and Yu, P. 2006. Local correlation tracking in time series. In Proceedings of the 6th International Conference on Data Mining. 456--465. Google ScholarDigital Library
- Papadimitriou, S. and Yu, P. 2006. Optimal multi-scale patterns in time series streams. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 647--658. Google ScholarDigital Library
- Park, S., Chu, W., Yoon, J., and Hsu, C. 2000. Efficient searches for similar subsequences of different lengths in sequence databases. In Proceedings of the 16th International Conference on Data Engineering. 23--32. Google ScholarDigital Library
- Park, S., Lee, D., and Chu, W. 1999. Fast retrieval of similar subsequences in long sequence databases. In Proceedings of the 3rd IEEE Knowledge and Data Engineering Exchange Workshop. 60--67. Google ScholarDigital Library
- Patel, P., Keogh, E., Lin, J., and Lonardi, S. 2002. Mining motifs in massive time series databases. In Proceedings of IEEE International Conference on Data Mining (ICDM02). 370--377. Google ScholarDigital Library
- Perng, C., Wang, H., Zhang, S., and Parker, D. 2000. Landmarks: A new model for similarity-based pattern querying in time series databases. In Proceedings of the 16th International Conference on Data Engineering. 33--42. Google ScholarDigital Library
- Pesaran, M., Pettenuzzo, D., and Timmermann, A. 2006. Forecasting time series subject to multiple structural breaks. Rev. Econ. Studies 73, 4, 1057--1084.Google ScholarCross Ref
- Popivanov, I. and Miller, R. 2002. Similarity search over time-series data using wavelets. In Proceedings of the International Conference on Data Engineering. 212--224. Google ScholarDigital Library
- Povinelli, R., Johnson, M., Lindgren, A., and Ye, J. 2004. Time series classification using Gaussian mixture models of reconstructed phase spaces. IEEE Trans. Knowl. Data Engin. 16, 6, 779--783. Google ScholarDigital Library
- Rafiei, D. and Mendelzon, A. 1998. Efficient retrieval of similar time sequences using DFT. In Proceedings of the 5th International Conference of Foundations of Data Organization and Algorithms. 249--257.Google Scholar
- Ratanamahatana, C. and Keogh, E. 2004a. Everything you know about dynamic time warping is wrong. In Proceedings of the 3rd Workshop on Mining Temporal and Sequential Data. 1--11.Google Scholar
- Ratanamahatana, C. and Keogh, E. 2004b. Making time-series classification more accurate using learned constraints. In Proceedings of SIAM International Conference on Data Mining. 11--22.Google Scholar
- Ratanamahatana, C., Keogh, E., Bagnall, A., and Lonardi, S. 2005. A novel bit level time series representation with implication of similarity search and clustering. Adv. Knowl. Discov. Data Min. 771--777. Google ScholarDigital Library
- Ratanamahatana, C. and Wanichsan, D. 2008. Stopping criterion selection for efficient semi-supervised time series classification. Studies Comput. Intell. 149, 1--14.Google Scholar
- Ravi Kanth, K., Agrawal, D., and Singh, A. 1998. Dimensionality reduction for similarity searching in dynamic databases. ACM SIGMOD Rec. 27, 2, 166--176. Google ScholarDigital Library
- Reeves, G., Liu, J., Nath, S., and Zhao, F. 2009. Managing massive time series streams with multi-scale compressed trickles. Proc. VLDB Endow. 2, 1, 97--108. Google ScholarDigital Library
- Reinert, G., Schbath, S., and Waterman, M. 2000. Probabilistic and statistical properties of words: An overview. J. Comput. Biol. 7, 1-2, 1--46.Google ScholarCross Ref
- Rodriguez, J. and Kuncheva, L. 2007. Time series classification: Decision forests and SVM on interval and DTW features. In Proceedings of the Workshop on Time Series Classification, 13th International Conference on Knowledge Discovery and Data Mining.Google Scholar
- Sakurai, Y., Yoshikawa, M., and Faloutsos, C. 2005. FTW: Fast similarity search under the time warping distance. In Proceedings of the 24th ACM Symposium on Principles of Database Systems. 326--337. Google ScholarDigital Library
- Sakurai, Y., Yoshikawa, M., Uemura, S., and Kojima, H. 2000. The A-tree: An index structure for high-dimensional spaces using relative approximation. In Proceedings of the 26th International Conference on Very Large Data Bases. 516--526. Google ScholarDigital Library
- Salvador, S. and Chan, P. 2007. Toward accurate dynamic time warping in linear time and space. Intell. Data Anal. 11, 5, 561--580. Google ScholarDigital Library
- Salvador, S., Chan, P., and Brodie, J. 2004. Learning states and rules for time series anomaly detection. In Proceedings of the 17th International FLAIRS Conference. 300--305.Google Scholar
- Sebastian, T., Klein, P., and Kimia, B. 2003. On aligning curves. IEEE Trans. Pattern Anal. Mach. Intell. 25, 1, 116--125. Google ScholarDigital Library
- Sebastiani, P., Ramoni, M., Cohen, P., Warwick, J., and Davis, J. 1999. Discovering dynamics using Bayesian clustering. In Lecture Notes in Computer Science, vol. 1642. Springer, 199--209. Google ScholarDigital Library
- Sfetsos, A. and Siriopoulos, C. 2004. Time series forecasting with a hybrid clustering scheme and pattern recognition. IEEE Trans. Syst. Man Cybernet. A 34, 3, 399--405. Google ScholarDigital Library
- Shasha, D. and Zhu, Y. 2004. High Performance Discovery in Time Series: Techniques and Case Studies. Springer. Google ScholarDigital Library
- Shatkay, H. and Zdonik, S. 1996. Approximate queries and representations for large data sequences. In Proceedings of the 12th International Conference on Data Engineering. 536--545. Google ScholarDigital Library
- Shieh, J. and Keogh, E. 2008. ISAX: Indexing and mining terabyte sized time series. In Proceeding of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 623--631. Google ScholarDigital Library
- Smyth, P. 1997. Clustering sequences with hidden Markov models. Adv. Neural Info. Process. Syst. 648--654.Google Scholar
- Song, H. and Li, G. 2008. Tourism demand modelling and forecasting--A review of recent research. Tour. Manag. 29, 2, 203--220.Google ScholarCross Ref
- Sorjamaa, A., Hao, J., Reyhani, N., Ji, Y., and Lendasse, A. 2007. Methodology for long-term prediction of time series. Neurocomput. 70, 16-18, 2861--2869. Google ScholarDigital Library
- Srisai, D. and Ratanamahatana, C. 2009. Efficient time series classification under template matching using time warping alignment. In Proceedings of the 4th International Conference on Computer Sciences and Convergence Information Technology. IEEE, 685--690. Google ScholarDigital Library
- Stiefmeier, T., Roggen, D., and Troster, G. 2007. Gestures are strings: Efficient online gesture spotting and classification using string matching. In Proceedings of the ICST 2nd International Conference on Body Area Networks. 1--8. Google ScholarDigital Library
- Struzik, Z., Siebes, A., and CWI, A. 1999. Measuring time series similarity through large singular features revealed with wavelet transformation. In Proceedings of the 10th International Workshop on Database and Expert Systems Applications. 162--166. Google ScholarDigital Library
- Subasi, A. 2007. EEG signal classification using wavelet feature extraction and a mixture of expert model. Expert Syst. Appl. 32, 4, 1084--1093. Google ScholarDigital Library
- Tang, H. and Liao, S. 2008. Discovering original motifs with different lengths from time series. Knowl.-Based Syst. 21, 7, 666--671. Google ScholarDigital Library
- Tsay, R. 2005. Analysis of Financial Time Series. Wiley-Interscience.Google Scholar
- Vasko, K. and Toivonen, H. 2002. Estimating the number of segments in time series data using permutation tests. In Proceedings of the IEEE International Conference on Data Mining. 466--473. Google ScholarDigital Library
- Vlachos, M., Gunopoulos, D., and Kollios, G. 2002. Discovering similar multidimensional trajectories. In Proceedings of the 18th International Conference on Data Engineering. IEEE Computer Society, 673--684. Google ScholarDigital Library
- Vlachos, M., Gunopulos, D., and Das, G. 2004. Indexing time series under conditions of noise. In Data Mining in Time Series Databases. 67--100.Google Scholar
- Vlachos, M., Hadjieleftheriou, M., Gunopulos, D., and Keogh, E. 2006. Indexing multidimensional time series. The VLDB J. 15, 1, 1--20. Google ScholarDigital Library
- Vlachos, M., Lin, J., Keogh, E., and Gunopulos, D. 2003. A wavelet-based anytime algorithm for k-means clustering of time series. In Proceedings of the Workshop on Clustering High Dimensionality Data and Its Applications. 23--30.Google Scholar
- Vlachos, M., Yu, P., and Castelli, V. 2005. On periodicity detection and structural periodic similarity. In Proceedings of the SIAM International Conference on Data Mining. 449--460.Google Scholar
- Wagner, N., Michalewicz, Z., Khouja, M., and McGregor, R. 2007. Time series forecasting for dynamic environments: The DyFor genetic program model. IEEE Trans. Evolut. Comput. 11, 4, 433--452. Google ScholarDigital Library
- Weigend, A. and Gershenfeld, N. 1994. Time Series Prediction: Forecasting the Future and Understanding the Past. Addison Wesley.Google Scholar
- Weiss, G. 2004. Mining with rarity: A unifying framework. ACM SIGKDD Explor. Newslett. 6, 1, 7--19. Google ScholarDigital Library
- Xi, X., Keogh, E., Shelton, C., Wei, L., and Ratanamahatana, C. 2006. Fast time series classification using numerosity reduction. In Proceedings of the 23rd International Conference on Machine Learning. 1040. Google ScholarDigital Library
- Xi, X., Keogh, E., Wei, L., and Mafra-neto, A. 2007. Finding motifs in database of shapes. In Proceedings of SIAM International Conference on Data Mining. 249--260.Google Scholar
- Xie, J. and Yan, W. 2007. Pattern-Based characterization of time series. Int. J. Info. Syst. Sci. 3, 3, 479--491.Google Scholar
- Xiong, Y. and Yeung, D. 2004. Time series clustering with ARMA mixtures. Pattern Recogn. 37, 8, 1675--1689.Google ScholarCross Ref
- Yadav, R., Kalra, P., and John, J. 2007. Time series prediction with single multiplicative neuron model. Appl. Soft Comput. 7, 4, 1157--1163. Google ScholarDigital Library
- Yankov, D., Keogh, E., Medina, J., Chiu, B., and Zordan, V. 2007. Detecting time series motifs under uniform scaling. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 844--853. Google ScholarDigital Library
- Yankov, D., Keogh, E., and Rebbapragada, U. 2008. Disk aware discord discovery: Finding unusual time series in terabyte sized datasets. Knowl. Info. Syst. 17, 2, 241--262. Google ScholarDigital Library
- Ye, D., Wang, X., Keogh, E., and Mafra-Neto, A. 2009. Autocannibalistic and anyspace indexing algorithms with applications to sensor data mining. In Proceedings of the SIAM International Conference on Data Mining (SDM 09). 85--96.Google Scholar
- Ye, L. and Keogh, E. 2009. Time series shapelets: A new primitive for data mining. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 947--956. Google ScholarDigital Library
- Yi, B. and Faloutsos, C. 2000. Fast time sequence indexing for arbitrary Lp norms. In Proceedings of the 26th International Conference on Very Large Data Bases. 385--394. Google ScholarDigital Library
- Yi, B., Jagadish, H., and Faloutsos, C. 1998. Efficient retrieval of similar time sequences under time warping. In Proceedings of the 14th International Conference on Data Engineering. 201--208. Google ScholarDigital Library
- Yoon, H., Yang, K., and Shahabi, C. 2005. Feature subset selection and feature ranking for multivariate time series. IEEE Trans. Knowl. Data Engin., 1186--1198. Google ScholarDigital Library
- Ypma, A. and Duin, R. 1997. Novelty detection using self-organizing maps. Progress Connect.-Based Info. Syst. 2, 1322--1325.Google Scholar
- Zhan, Y., Chen, X., and Xu, R. 2007. Outlier detection algorithm based on pattern representation of time series. Appl. Res. Comput. 24, 11, 96--99.Google Scholar
- Zhang, X., Wu, J., Yang, X., Ou, H., and Lv, T. 2009. A novel pattern extraction method for time series classification. Optimiz. Engin. 10, 2, 253--271.Google ScholarCross Ref
- Zhong, S. and Ghosh, J. 2002. HMMs and coupled HMMs for multi-channel EEG classification. In Proceedings of the IEEE International Joint Conference on Neural Networks. 1154--1159.Google Scholar
- Zhong, S., Khoshgoftaar, T., and Seliya, N. 2007. Clustering-Based network intrusion detection. Int. J. Reliab. Qual. Safety Engin. 14, 2, 169--187.Google ScholarCross Ref
Index Terms
- Time-series data mining