DOI: 10.1145/1031171.1031203
Article

A dimensionality reduction technique for efficient similarity analysis of time series databases

Published: 13 November 2004

ABSTRACT

Efficiently searching for similarities among time series and discovering interesting patterns is an important and non-trivial problem with applications in many domains. The high dimensionality of the data makes the analysis very challenging. To address this problem, many dimensionality reduction methods have been proposed. PCA (Piecewise Constant Approximation) and its variants have been shown to be efficient in time series indexing and similarity retrieval. However, in certain applications, the many false alarms introduced by the approximation may reduce overall performance dramatically. In this paper, we introduce a new piecewise dimensionality reduction technique based on vector quantization. The new technique, PVQA (Piecewise Vector Quantized Approximation), partitions each sequence into equi-length segments and uses vector quantization to represent each segment by the closest codeword (under a distance metric) from a codebook of key-sequences. Calculations become more efficient because the new representation has significantly lower dimensionality. We demonstrate the utility and efficiency of the proposed technique on real and simulated datasets. By exploiting prior knowledge about the data, the proposed technique generally outperforms PCA and its variants in similarity searches.
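
The segment-and-quantize idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the codebook is built, so plain k-means (Lloyd iterations) is assumed here, and squared Euclidean distance is assumed as the metric. All function names (`split_segments`, `train_codebook`, `encode`, `approx_distance`) and the toy data are hypothetical.

```python
import random

def split_segments(series, seg_len):
    """Cut a series into consecutive equal-length segments (a short tail is dropped)."""
    n = len(series) // seg_len
    return [series[i * seg_len:(i + 1) * seg_len] for i in range(n)]

def sq_dist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(segment, codebook):
    """Index of the codeword closest to the segment."""
    return min(range(len(codebook)), key=lambda k: sq_dist(segment, codebook[k]))

def train_codebook(segments, size, iters=20, seed=0):
    """Assumed training step: k-means over the training segments."""
    rng = random.Random(seed)
    codebook = [list(s) for s in rng.sample(segments, size)]
    for _ in range(iters):
        buckets = [[] for _ in codebook]
        for s in segments:
            buckets[nearest(s, codebook)].append(s)
        for k, members in enumerate(buckets):
            if members:  # an empty cell keeps its previous codeword
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

def encode(series, codebook, seg_len):
    """Represent a series as one codeword index per segment."""
    return [nearest(s, codebook) for s in split_segments(series, seg_len)]

def approx_distance(code_a, code_b, codebook):
    """Approximate distance between two series from their codes alone."""
    total = sum(sq_dist(codebook[i], codebook[j]) for i, j in zip(code_a, code_b))
    return total ** 0.5

# Toy data: two length-8 series, segment length 2, codebook of 2 key-sequences.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [1, 1, 0, 0, 1, 1, 0, 0]
segments = [s for series in (a, b) for s in split_segments(series, 2)]
codebook = train_codebook(segments, size=2)
code_a = encode(a, codebook, seg_len=2)
code_b = encode(b, codebook, seg_len=2)
dist_ab = approx_distance(code_a, code_b, codebook)
```

In this sketch each length-8 series is reduced to four codeword indices, and distances between series are computed from codeword-to-codeword distances, which could be precomputed once per codebook.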


          Published in

          CIKM '04: Proceedings of the thirteenth ACM international conference on Information and knowledge management
          November 2004
          678 pages
          ISBN: 1581138741
          DOI: 10.1145/1031171

          Copyright © 2004 ACM


          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Acceptance Rates

          Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
