ABSTRACT
A detailed understanding of high-performance computing (HPC) file system read and write (I/O) workloads allows stakeholders to evaluate the effectiveness of the I/O infrastructure and to identify bottlenecks and other issues. Always-on, server-side monitoring, such as that provided by the Lustre Monitoring Tool, offers a comprehensive and nonintrusive mechanism for capturing details of the I/O workload. The statistical properties of data movement to and from mass storage on an HPC system reveal transaction patterns that connect the server-side observations back to the compute-side jobs that caused them. This paper lays out strategies for characterizing such patterns using I/O statistics.
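As a rough illustration of connecting server-side observations to a job, the sketch below summarizes the transfer-rate samples that fall inside one job's run window. It is not the method used in the poster and does not call the Lustre Monitoring Tool; the sample format (timestamp in seconds, bytes per second), the function name `summarize_job_io`, and the choice of statistics are all assumptions made for this example.

```python
from statistics import mean, pstdev


def summarize_job_io(samples, job_start, job_end):
    """Summarize server-side I/O rates observed while one job was running.

    samples   -- hypothetical list of (timestamp_s, bytes_per_s) tuples,
                 e.g. periodic readings exported from a server-side monitor
    job_start -- job start time in seconds (inclusive)
    job_end   -- job end time in seconds (exclusive)

    Returns a dict of summary statistics, or None if no sample falls
    inside the job window.
    """
    # Keep only the rates recorded while the job was active.
    window = [rate for t, rate in samples if job_start <= t < job_end]
    if not window:
        return None
    return {
        "n_samples": len(window),
        "mean_rate": mean(window),
        "peak_rate": max(window),
        # Population standard deviation: the window is treated as the
        # complete set of observations for this job.
        "stdev_rate": pstdev(window),
    }


if __name__ == "__main__":
    # Made-up readings for illustration only.
    readings = [(0, 0.0), (5, 100.0), (10, 300.0), (15, 0.0)]
    print(summarize_job_io(readings, job_start=5, job_end=15))
```

On a shared file system, several jobs usually overlap in time, so a window-based summary like this attributes all traffic in the window to one job. Separating concurrent jobs would need additional information that this sketch does not model.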
Index Terms
- Poster: I/O workload analysis with server-side data collection