Multi-resolution bitmap indexes for scientific data

Published: 01 August 2007

Abstract

The unique characteristics of scientific data and queries cause traditional indexing techniques to perform poorly on scientific workloads, occupy excessive space, or both. Refinements of bitmap indexes have been proposed previously as a solution to this problem. In this article, we describe the difficulties we encountered in deploying bitmap indexes with scientific data and queries from two real-world domains. In particular, previously proposed methods of binning, encoding, and compressing bitmap vectors either were quite slow for processing the large-range query conditions our scientists used, or required excessive storage space. Nor could the indexes easily be built or used on parallel platforms. In this article, we show how to solve these problems through the use of multi-resolution, parallelizable bitmap indexes, which support a fine-grained trade-off between storage requirements and query performance. Our experiments with large data sets from two scientific domains show that multi-resolution, parallelizable bitmap indexes occupy an acceptable amount of storage while improving range query performance by roughly a factor of 10, compared to a single-resolution bitmap index of reasonable size.
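
The abstract does not spell out how the multi-resolution index is laid out, so the following is only a minimal sketch of the general idea, under several assumptions that are not taken from the article: two resolution levels, equal-width bins, uncompressed bitmaps stored as Python integers, and a row-by-row candidate check for bins that the query range only partly covers. All function and variable names are hypothetical. The sketch shows why a coarse level can help large-range queries: wherever a coarse bin lies entirely inside the range, one coarse bitmap stands in for several fine ones.

```python
from typing import List, Tuple


def build_bitmaps(values: List[int], bin_width: int, num_bins: int) -> List[int]:
    """Build one bitmap per bin; bit i of a bitmap is set if values[i] is in that bin.

    Assumes every value is a non-negative integer below bin_width * num_bins.
    """
    bitmaps = [0] * num_bins
    for row, v in enumerate(values):
        bitmaps[v // bin_width] |= 1 << row
    return bitmaps


def range_query(
    values: List[int],
    fine: List[int],
    coarse: List[int],
    fine_width: int,
    fan_out: int,
    lo: int,
    hi: int,
) -> Tuple[List[int], int]:
    """Answer lo <= v < hi; return (matching row ids, number of bitmaps read).

    Each coarse bin is assumed to span exactly fan_out consecutive fine bins.
    """
    coarse_width = fine_width * fan_out
    sure = 0      # rows known to match from fully covered bins
    maybe = 0     # rows in partly covered edge bins, to be checked against the data
    touched = 0   # bitmaps read, a rough proxy for query cost
    b = lo // fine_width
    last = (hi - 1) // fine_width
    while b <= last:
        start = b * fine_width
        if b % fan_out == 0 and start >= lo and start + coarse_width <= hi:
            # A whole coarse bin fits in the range: one bitmap covers fan_out fine bins.
            sure |= coarse[b // fan_out]
            b += fan_out
        elif start >= lo and start + fine_width <= hi:
            # A fine bin fits entirely in the range.
            sure |= fine[b]
            b += 1
        else:
            # Edge bin: only part of it overlaps the range.
            maybe |= fine[b]
            b += 1
        touched += 1
    rows = [
        i
        for i, v in enumerate(values)
        if (sure >> i) & 1 or ((maybe >> i) & 1 and lo <= v < hi)
    ]
    return rows, touched


# Illustrative data only: 8 rows, 12 fine bins of width 10, 3 coarse bins of width 40.
VALUES = [3, 17, 42, 58, 64, 99, 25, 80]
FINE_WIDTH, FAN_OUT, NUM_FINE = 10, 4, 12
FINE = build_bitmaps(VALUES, FINE_WIDTH, NUM_FINE)
COARSE = build_bitmaps(VALUES, FINE_WIDTH * FAN_OUT, NUM_FINE // FAN_OUT)

rows, touched = range_query(VALUES, FINE, COARSE, FINE_WIDTH, FAN_OUT, 5, 95)
print(rows, touched)
```

In this toy setup the query range 5 to 95 spans ten fine bins, so an index with only the fine level would read ten bitmaps, while the two-level lookup reads fewer because one coarse bitmap replaces four fine ones. The price is the extra storage for the coarse level, which is the kind of storage-versus-query-time trade-off the abstract refers to.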
