ABSTRACT
In this paper, we present ForeCache, a general-purpose tool for exploratory browsing of large datasets. ForeCache utilizes a client-server architecture, where the user interacts with a lightweight client-side interface to browse datasets, and the data to be browsed is retrieved from a DBMS running on a back-end server. We assume a detail-on-demand browsing paradigm, and optimize the back-end support for this paradigm by inserting a separate middleware layer in front of the DBMS. To improve response times, the middleware layer fetches data ahead of the user as she explores a dataset.
We consider two different mechanisms for prefetching: (a) learning what to fetch from the user's recent movements, and (b) using data characteristics (e.g., histograms) to find data similar to what the user has viewed in the past. We incorporate these mechanisms into a single prediction engine that adjusts its prediction strategies over time, based on changes in the user's behavior. We evaluated our prediction engine with a user study, and found that our dynamic prefetching strategy provides: (1) significant improvements in overall latency when compared with non-prefetching systems (430% improvement); and (2) substantial improvements in both prediction accuracy (25% improvement) and latency (88% improvement) relative to existing prefetching techniques.
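The two mechanisms can be illustrated with a minimal sketch. This is not ForeCache's implementation: the move vocabulary, the first-order Markov model with add-one smoothing, the L1 histogram distance, and all names below are assumptions chosen for illustration. How the prediction engine switches between strategies over time is not shown.

```python
from collections import Counter, defaultdict

# Hypothetical move vocabulary for a pan-and-zoom tile browser (an assumption).
MOVES = ("left", "right", "up", "down", "zoom_in", "zoom_out")


class MovePredictor:
    """Mechanism (a): rank the next move from the user's recent movements.

    Uses a first-order Markov model over moves; the model choice is an
    illustrative assumption, not the paper's stated method.
    """

    def __init__(self):
        # counts[prev_move][next_move] = number of times the transition was seen
        self.counts = defaultdict(Counter)

    def observe(self, prev_move, next_move):
        self.counts[prev_move][next_move] += 1

    def rank(self, last_move):
        seen = self.counts[last_move]
        # Add-one smoothing so unseen moves still get a nonzero probability.
        total = sum(seen.values()) + len(MOVES)
        return sorted(MOVES, key=lambda m: -(seen[m] + 1) / total)


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def rank_by_similarity(candidates, viewed_histograms):
    """Mechanism (b): rank candidate tiles by similarity to tiles already viewed.

    candidates maps a tile id to its histogram; a tile's score is its
    distance to the closest previously viewed histogram (smaller is better).
    """
    def score(tile_id):
        return min(histogram_distance(candidates[tile_id], v)
                   for v in viewed_histograms)

    return sorted(candidates, key=score)
```

In a middleware layer like the one described above, the top-ranked moves or tiles would be the ones fetched from the DBMS ahead of the user's next request.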
Index Terms
- Dynamic Prefetching of Data Tiles for Interactive Visualization