skip to main content
10.1145/1353343.1353410acmotherconferencesArticle/Chapter ViewAbstractPublication PagesedbtConference Proceedingsconference-collections
research-article
Free Access

Querying time-series streams

Published:25 March 2008Publication History

ABSTRACT

Index trees created using distance based indexing are difficult to maintain online since the distance function involved is often costly to compute. This problem is intensified when the database we are dealing with, is frequently updated, as only limited time is available to perform the maintenance. In this paper, we propose a novel tree maintenance mechanism for the problem of answering approximate k-Nearest Neighbor queries with a probabilistic guarantee on timeseries streams. When the underlying data change, we may choose to defer updating the tree as long as the probabilistic guarantee of answering queries is high. To prolong such deferment, we present innovative techniques that maintain the utility of the tree by migrating its pivots and by partially reconstructing it. As the probabilistic guarantee decays with time and crosses the minimum guarantee threshold, all of the deferred updates are performed. In essence, our work offers an elegant compromise between the accuracy guarantee of query results and the cost of providing them. With extensive empirical studies, we also show the flexibility and efficiency of our approach.

References

  1. B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and issues in data stream systems. In PODS, pages 1--16, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. B. Babcock, M. Datar, R. Motwani, and L. O'Callaghan. Maintaining variance and k-medians over data stream windows. In PODS, pages 234--243, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. T. Bozkaya and Z. M. Özsoyoglu. Indexing large metric spaces for similarity search queries. ACM Trans. Database Syst., 24(3):361--404, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. S. Brin. Near neighbor search in large metric spaces. In VLDB, pages 574--584, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. B. Bustos, G. Navarro, and E. Chávez. Pivot selection techniques for proximity searching in metric spaces. Pattern Recognition Letters, 24(14):2357--2366, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Y. Cai, K. A. Hua, G. Cao, and T. Xu. Real-time processing of range-monitoring queries in heterogeneous mobile databases. IEEE Trans. Mob. Comput., 5(7):931--942, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. L. Chen and R. T. Ng. On the marriage of Lp-norms and edit distance. In VLDB, pages 792--803, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. R. Cheng, D. V. Kalashnikov, and S. Prabhakar. Querying imprecise data in moving object environments. IEEE Trans. Knowl. Data Eng., 16(9):1112--1127, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. P. Ciaccia, M. Patella, and P. Zezula. M-tree: An efficient access method for similarity search in metric spaces. In VLDB, pages 426--435, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. B. Cui, B. C. Ooi, J. Su, and K.-L. Tan. Contorting high dimensional data for efficient main memory processing. In SIGMOD Conference, pages 479--490, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Y. Diao, M. Altinel, M. J. Franklin, H. Zhang, and P. M. Fischer. Path sharing and predicate evaluation for high-performance XML filtering. ACM Trans. Database Syst., 28(4):467--516, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. W.-C. Fu, P. M.-S. Chan, Y.-L. Cheung, and Y. S. Moon. Dynamic vp-tree indexing for n-nearest neighbor search given pair-wise distances. VLDB J., 9(2):154--173, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. L. Gao and X. S. Wang. Continually evaluating similarity-based pattern queries on a streaming time series. In SIGMOD Conference, pages 370--381, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. V. Gopalkrishnan, P. Chairunnanda, and A. Najib. Efficient index maintenance for answering approximate queries in a volatile environment. Technical Report TR/VG/07/03, CAIS, Nanyang Technological University, Singapore, 2007.Google ScholarGoogle Scholar
  15. G. R. Hjaltason and H. Samet. Index-driven similarity search in metric spaces. ACM Trans. Database Syst., 28(4):517--580, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. H. V. Jagadish, B. C. Ooi, K.-L. Tan, C. Yu, and R. Zhang. iDistance: An adaptive B+-tree based indexing method for nearest neighbor search. ACM Trans. Database Syst., 30(2):364--397, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. N. Koudas, B. C. Ooi, K.-L. Tan, and R. Z. 0003. Approximate NN queries on streams with guaranteed error/performance bounds. In VLDB, pages 804--815, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Y.-N. Law and C. Zaniolo. An adaptive nearest neighbor classification algorithm for data streams. In PKDD, pages 108--120, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. B. Liu, W.-C. Lee, and D. L. Lee. Distributed caching of multi-dimensional data in mobile environments. In Mobile Data Management, pages 229--233, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. W. P. M. Polly and M. H. Wong. Efficient and robust feature extraction and pattern matching of time series by a lattice structure. In CIKM, pages 271--278, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. S. Saltenis, C. S. Jensen, S. T. Leutenegger, and M. A. Lopez. Indexing the positions of continuously moving objects. In SIGMOD Conference, pages 331--342, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Y. Tao, R. Cheng, X. Xiao, W. K. Ngai, B. Kao, and S. Prabhakar. Indexing multi-dimensional uncertain data with arbitrary probability density functions. In VLDB, pages 922--933, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. J. K. Uhlmann. Metric trees. Appl. Math. Lett., 4(5):61--62, 1991.Google ScholarGoogle ScholarCross RefCross Ref
  24. J. K. Uhlmann. Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett., 40(4):175--179, 1991.Google ScholarGoogle ScholarCross RefCross Ref
  25. M. Vlachos, M. Hadjieleftheriou, D. Gunopulos, and E. J. Keogh. Indexing multi-dimensional time-series with support for multiple distance measures. In KDD, pages 216--225, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. M. Vlachos, C. Meek, Z. Vagena, and D. Gunopulos. Identifying similarities, periodicities and bursts for online search queries. In SIGMOD Conference, pages 131--142, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. L. Wei, E. J. Keogh, H. V. Herle, and A. Mafra-Neto. Atomic wedgie: Efficient query filtering for streaming times series. In ICDM, pages 490--497, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. O. Wolfson, A. P. Sistla, S. Chamberlain, and Y. Yesha. Updating and querying databases that track mobile units. Distributed and Parallel Databases, 7(3):257--387, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. H. Wu, B. Salzberg, and D. Zhang. Online event-driven subsequence matching over financial data streams. In SIGMOD Conference, pages 23--34, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Y. Xia and S. Prabhakar. Q+Rtree: Efficient indexing for moving object database. In DASFAA, pages 175--182, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. P. N. Yianilos. Data structures and algorithms for nearest neighbor search in general metric spaces. In SODA, pages 311--321, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Querying time-series streams

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Other conferences
          EDBT '08: Proceedings of the 11th international conference on Extending database technology: Advances in database technology
          March 2008
          762 pages
          ISBN:9781595939265
          DOI:10.1145/1353343

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 25 March 2008

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          Overall Acceptance Rate7of10submissions,70%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader