Abstract
We focus in this work on an aspect of online computation that standard competitive analysis does not address: distinguishing request sequences for which nontrivial online algorithms are useful from request sequences for which all algorithms perform equally poorly. This work is motivated by advanced system and architecture designs that allow the operating system to dynamically allocate resources to online protocols such as prefetching and caching. To use these features, the operating system needs to identify data streams that can benefit from more resources.
Our approach in this work is based on the relation between entropy, compression, and gambling, which has been studied extensively in information theory. It has been shown that in some settings, entropy can either fully or at least partially characterize the expected outcome of an iterative gambling game. Our goal is to study the extent to which the entropy of the input characterizes the expected performance of online algorithms for problems that arise in computer applications. We study entropy-based bounds for three classical online problems: list accessing, prefetching, and caching. Our bounds relate the performance of the best online algorithm to the entropy, a parameter intrinsic to the request sequence. This is in contrast to the competitive ratio of competitive analysis, which quantifies the performance of an online algorithm with respect to an optimal offline algorithm. For the prefetching problem, we give explicit upper and lower bounds for the performance of the best prefetching algorithm in terms of the entropy of the request sequence. In contrast, we show that the entropy of the request sequence alone does not fully capture the performance of online list accessing and caching algorithms.
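To make the link between entropy and prefetching concrete, the following is a minimal sketch, not the construction analyzed in this work. It assumes the zeroth-order empirical entropy of a finite request sequence as a stand-in for the entropy of the request source, and it uses a hypothetical single-item prefetcher that always fetches the item requested most often so far. The function names `empirical_entropy` and `frequency_prefetch_miss_rate` are illustrative only.

```python
import math
from collections import Counter


def empirical_entropy(requests):
    """Zeroth-order empirical entropy of a request sequence, in bits per request.

    Assumption: symbol frequencies in the sequence stand in for the
    probabilities of the underlying request source.
    """
    n = len(requests)
    if n == 0:
        return 0.0
    counts = Counter(requests)
    # Sum of p * log2(1/p) over the distinct items in the sequence.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def frequency_prefetch_miss_rate(requests):
    """Fraction of requests missed by a hypothetical one-item prefetcher.

    Before each request, the prefetcher fetches the item seen most often
    so far (nothing before the first request). A request is a miss when
    it differs from the prefetched item.
    """
    if not requests:
        return 0.0
    seen = Counter()
    misses = 0
    for item in requests:
        predicted = seen.most_common(1)[0][0] if seen else None
        if predicted != item:
            misses += 1
        seen[item] += 1
    return misses / len(requests)


if __name__ == "__main__":
    for stream in ("aaaaaaab", "abcdabcd"):
        print(stream, empirical_entropy(stream), frequency_prefetch_miss_rate(stream))
```

In this toy setting, a stream dominated by one item has low empirical entropy and few misses, while a stream spread uniformly over many items has high entropy and many misses. That is the kind of relationship the prefetching bounds quantify; the sketch itself carries no guarantees.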
Index Terms
- Entropy-based bounds for online algorithms