FlexSig: Implementing flexible hardware signatures

Published: 26 January 2012

Abstract

With the advent of chip multiprocessors, new techniques have been developed to make parallel programming easier and more reliable, including new parallel programming paradigms and new methods for executing programs more efficiently and reliably. These improvements usually require hardware support to avoid slowing the system down.

Signatures based on Bloom filters are widely used as hardware support for parallel programming in chip multiprocessors. They are used in Transactional Memory, thread-level speculation, parallel debugging, deterministic replay, and other tools and applications. The main limitation of hardware signatures is their lack of flexibility: signatures designed with a given configuration, tailored to the requirements of a specific tool or application, are unlikely to fit other requirements well.

In this paper, a new hardware signature organization, called Flexible Signatures (FlexSig), is proposed. FlexSig can dynamically change the resources assigned to a given signature and the number of signatures in the system by redistributing the available hardware resources according to system requirements. This provides more flexibility than traditional fixed-resource signatures based on Bloom filters, while maintaining a low false positive rate.

FlexSig has been evaluated by comparing it with signatures based on parallel Bloom filters, and we conclude that FlexSig outperforms (in terms of false positive rate) conventional parallel Bloom filters in most cases, due to its ability to use all the signature resources available.
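
For context, the baseline the abstract compares against, a signature based on parallel Bloom filters, can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the design evaluated in the paper: the class name `ParallelBloomSignature`, the sizes, and the multiplicative hash are hypothetical, and hardware signatures typically use XOR-based hash functions instead. Each of the banks has its own hash function and a fixed share of the bits, which is the fixed-resource property FlexSig sets out to relax.

```python
class ParallelBloomSignature:
    """Illustrative parallel Bloom filter: one hash function per bank."""

    def __init__(self, total_bits=1024, num_banks=4):
        if total_bits % num_banks != 0:
            raise ValueError("total_bits must be a multiple of num_banks")
        self.num_banks = num_banks
        self.bank_bits = total_bits // num_banks  # fixed share per bank
        self.banks = [0] * num_banks  # each int is used as a bit vector

    def _index(self, bank, address):
        # Hypothetical hash (an assumption for this sketch): multiply by a
        # per-bank odd constant, fold the high bits down, reduce to the bank.
        mixed = (address * (0x9E3779B1 + 2 * bank)) & 0xFFFFFFFF
        mixed ^= mixed >> 15
        return mixed % self.bank_bits

    def insert(self, address):
        # Set exactly one bit in every bank.
        for b in range(self.num_banks):
            self.banks[b] |= 1 << self._index(b, address)

    def may_contain(self, address):
        # True if the matching bit is set in all banks. False positives are
        # possible; false negatives are not.
        return all(
            (self.banks[b] >> self._index(b, address)) & 1
            for b in range(self.num_banks)
        )

    def clear(self):
        self.banks = [0] * self.num_banks


# Usage: record a small read set, then test addresses against it.
sig = ParallelBloomSignature(total_bits=1024, num_banks=4)
read_set = [0x1000, 0x1040, 0x2A80]
for addr in read_set:
    sig.insert(addr)
hits = [sig.may_contain(addr) for addr in read_set]
```

Because the number of bits and banks is fixed at construction time, a signature sized for one tool cannot borrow unused bits from another, which is the inflexibility the abstract describes.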

Published in

ACM Transactions on Architecture and Code Optimization, Volume 8, Issue 4
Special Issue on High-Performance Embedded Architectures and Compilers
January 2012
765 pages
ISSN: 1544-3566
EISSN: 1544-3973
DOI: 10.1145/2086696

Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 26 January 2012
        • Accepted: 1 November 2011
        • Revised: 1 October 2011
        • Received: 1 July 2011

        Qualifiers

        • research-article
        • Research
        • Refereed
