Abstract
Transactional Memory (TM), Thread-Level Speculation (TLS), and Checkpointed multiprocessors are three popular architectural techniques based on the execution of multiple, cooperating speculative threads. In these environments, correctly maintaining data dependences across threads requires mechanisms for disambiguating addresses across threads, invalidating stale cache state, and making committed state visible. These mechanisms are both conceptually involved and hard to implement. In this paper, we present Bulk, a novel approach to simplify these mechanisms. The idea is to hash-encode a thread's access information in a concise signature, and then support in hardware signature operations that efficiently process sets of addresses. Such operations implement the mechanisms described. Bulk operations are inexact but correct, and provide substantial conceptual and implementation simplicity. We evaluate Bulk in the context of TLS using SPECint2000 codes and TM using multithreaded Java workloads. Despite its simplicity, Bulk has competitive performance with more complex schemes. We also find that signature configuration is a key design parameter.
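The signature idea in the abstract can be illustrated in software. The following is a minimal sketch, not the hardware design evaluated in the paper: the class name `Signature`, the 64-bit signature width, the 64-byte line size, the modulo hash, and the `must_squash` conflict rule are all assumptions chosen for illustration. It shows how a set of addresses can be hash-encoded into a fixed-size bit vector in the style of a Bloom filter, and how set operations on whole signatures can stand in for per-address comparisons.

```python
class Signature:
    """Fixed-size bit-vector summary of a set of addresses (Bloom-filter style).

    All parameters here are hypothetical and not taken from the paper.
    """

    def __init__(self, bits=64, line_bytes=64):
        self.bits = bits
        self.line_bytes = line_bytes
        self.vector = 0  # Python int used as a bit vector

    def _index(self, address):
        # Hypothetical hash: drop the line offset, then fold with modulo.
        return (address // self.line_bytes) % self.bits

    def insert(self, address):
        """Record an access to `address` in the signature."""
        self.vector |= 1 << self._index(address)

    def may_contain(self, address):
        """True if `address` may be in the set; never False for an inserted address."""
        return bool(self.vector & (1 << self._index(address)))

    def intersects(self, other):
        """True if the two encoded address sets may overlap."""
        return (self.vector & other.vector) != 0


def must_squash(committer_writes, other_reads, other_writes):
    """Assumed conflict rule: squash a thread if the committing thread's
    write signature overlaps that thread's read or write signature."""
    return (committer_writes.intersects(other_reads)
            or committer_writes.intersects(other_writes))


if __name__ == "__main__":
    w_commit = Signature()
    w_commit.insert(0x1000)

    r_other = Signature()
    r_other.insert(0x1040)  # different line, different bit
    print(must_squash(w_commit, r_other, Signature()))

    r_alias = Signature()
    r_alias.insert(0x2000)  # different line, same bit as 0x1000
    print(must_squash(w_commit, r_alias, Signature()))
```

In this sketch, addresses `0x1000` and `0x2000` hash to the same bit, so the second check reports a conflict that does not exist. That is the sense in which signature operations are inexact but correct: aliasing can cause an unnecessary squash, but a true overlap is never missed. Wider signatures or better hash functions reduce aliasing, which is consistent with the abstract's remark that signature configuration is a key design parameter.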
Bulk Disambiguation of Speculative Threads in Multiprocessors
ISCA '06: Proceedings of the 33rd Annual International Symposium on Computer Architecture