ABSTRACT
Disk schedulers in current operating systems are generally work-conserving, i.e., they schedule a request as soon as the previous request has finished. Such schedulers often require multiple outstanding requests from each process to meet system-level goals of performance and quality of service. Unfortunately, many common applications issue disk read requests in a synchronous manner, interspersing successive requests with short periods of computation. The scheduler chooses the next request too early; this induces deceptive idleness, a condition where the scheduler incorrectly assumes that the process that issued the last request has no further requests, and is forced to switch to a request from another process.
We propose the anticipatory disk scheduling framework to solve this problem in a simple, general and transparent way, based on the non-work-conserving scheduling discipline. Our FreeBSD implementation is observed to yield large benefits on a range of microbenchmarks and real workloads. The Apache webserver delivers between 29% and 71% more throughput on a disk-intensive workload. The Andrew filesystem benchmark runs faster by 8%, due to a speedup of 54% in its read-intensive phase. Variants of the TPC-B database benchmark exhibit improvements between 2% and 60%. Proportional-share schedulers are seen to achieve their contracts accurately and efficiently.
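The effect of deceptive idleness can be illustrated with a toy simulation. The sketch below is not the paper's FreeBSD implementation; it is a minimal model under assumed conditions: each process reads sequentially from its own disk region, issues one synchronous request at a time, and computes for `think` ms between requests. Switching between processes costs a fixed `seek`. The function name `simulate`, its parameters, and all timing values are hypothetical and chosen only for illustration. Setting `wait` to 0 gives a work-conserving scheduler; a positive `wait` lets the disk idle briefly in anticipation of the last-served process's next request.

```python
def simulate(n_procs, reqs_each, think, seek, transfer, wait):
    """Return the time (ms) needed to serve every request in a toy disk model.

    wait is the longest the disk may idle for the last-served process's
    next request; wait=0 models a work-conserving scheduler.
    All parameters are illustrative assumptions, not measured values.
    """
    remaining = [reqs_each] * n_procs   # requests each process still has to issue
    ready_at = [0.0] * n_procs          # time each process's next request arrives
    now, last = 0.0, None               # current time, last-served process
    while any(remaining):
        live = [p for p in range(n_procs) if remaining[p]]
        if last in live and ready_at[last] <= now + wait:
            # The same process's next request arrives within the waiting
            # window (or is already pending), so the disk stays with it.
            chosen = last
        else:
            if last in live:
                now += wait             # the wait expired without a request
            # Fall back to the request that arrived (or will arrive) earliest.
            chosen = min(live, key=lambda p: ready_at[p])
        start = max(now, ready_at[chosen])
        # Moving to another process's region costs a seek.
        cost = transfer + (seek if chosen != last else 0.0)
        now = start + cost
        remaining[chosen] -= 1
        ready_at[chosen] = now + think  # synchronous: compute, then issue again
        last = chosen
    return now


# Two processes, 100 reads each, 1 ms compute, 8 ms seek, 1 ms transfer.
work_conserving = simulate(2, 100, think=1.0, seek=8.0, transfer=1.0, wait=0.0)
anticipatory = simulate(2, 100, think=1.0, seek=8.0, transfer=1.0, wait=2.0)
print(work_conserving, anticipatory)
```

With these assumed parameters the work-conserving run alternates between the two processes and pays a seek on every request (1800 ms in total), while the anticipatory run waits 1 ms for each follow-up request and finishes in 414 ms. The toy model lets one process run to completion before serving the other, so it says nothing about fairness or about how long a practical scheduler should be willing to wait.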
Index Terms
- Anticipatory scheduling: a disk scheduling framework to overcome deceptive idleness in synchronous I/O