skip to main content
10.1145/2000064.2000113acmconferencesArticle/Chapter ViewAbstractPublication PagesiscaConference Proceedingsconference-collections
research-article

DBAR: an efficient routing algorithm to support multiple concurrent applications in networks-on-chip

Published:04 June 2011Publication History

ABSTRACT

With the emergence of many-core architectures, it is quite likely that multiple applications will run concurrently on a system. Existing locally and globally adaptive routing algorithms largely overlook issues associated with workload consolidation. The shortsightedness of locally adaptive routing algorithms limits performance due to poor network congestion avoidance. Globally adaptive routing algorithms attack this issue by introducing a congestion propagation network to obtain network status information beyond neighboring nodes. However, they may suffer from intra- and inter-application interference during output port selection for consolidated workloads, coupling the behavior of otherwise independent applications and negatively affecting performance.

To address these two issues, we propose Destination-Based Adaptive Routing (DBAR). We design a novel low-cost congestion propagation network that leverages both local and non-local network information for more accurate congestion estimates. Thus, DBAR offers effective adaptivity for congestion beyond neighboring nodes. More importantly, by integrating the destination into the selection function, DBAR mitigates intra- and inter-application interference and offers dynamic isolation among regions. Experimental results show that DBAR can offer better performance than the best baseline algorithm for all measured configurations; it is well suited for workload consolidation. The wiring overhead of DBAR is low and DBAR provides improvement in the energy-delay product for medium and high injection rates.

Skip Supplemental Material Section

Supplemental Material

isca_8b_3.mp4

mp4

109.5 MB

References

  1. G. Ascia, V. Catania, M. Palesi, and D. Patti. Implementation and analysis of a new selection strategy for adaptive routing in networks-on-chip. Computers, IEEE Transactions on, 57(6):809--820, June 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. S. Bell et al. TILE64 - processor: A 64-core SoC with mesh interconnect. In ISSCC 2008, pages 88--598, February 2008.Google ScholarGoogle Scholar
  3. G.-M. Chiu. The odd-even turn model for adaptive routing. Parallel and Distributed Systems, IEEE Transactions on, 11(7):729--738, July 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. C.-L. Chou and R. Marculescu. Run-time task allocation considering user behavior in embedded multiprocessor networks-on-chip. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, 29(1):78--91, January 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. C.-L. Chou, U. Ogras, and R. Marculescu. Energy- and performance-aware incremental mapping for networks on chip with multiple voltage levels. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, 27(10):1866--1879, October 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. W. Dally and C. Seitz. Deadlock-free message routing in multiprocessor interconnection networks. Computers, IEEE Transactions on, C-36(5):547--553, May 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. W. Dally and B. Towles. Route packets, not wires: on-chip interconnection networks. In DAC 2001, pages 684--689, May 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. W. Dally and B. Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. W. J. Dally and H. Aoki. Deadlock-free adaptive routing in multicomputer networks using virtual channels. Parallel and Distributed Systems, IEEE Transactions on, 4:466--475, April 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. J. Duato. A new theory of deadlock-free adaptive routing in wormhole networks. Parallel and Distributed Systems, IEEE Transactions on, 4(12):1320--1331, December 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. J. Duato. A necessary and sufficient condition for deadlock-free adaptive routing in wormhole networks. Parallel and Distributed Systems, IEEE Transactions on, 6(10):1055--1067, October 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. Duato. A necessary and sufficient condition for deadlock-free routing in cut-through and store-and-forward networks. Parallel and Distributed Systems, IEEE Transactions on, 7(8):841--854, August 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. N. Enright Jerger and L. Peh. On-Chip Networks. Morgan and Claypool Publishers, San Francisco, CA, USA, 1 edition, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. W.-C. Feng and K. G. Shin. Impact of selection functions on routing algorithm performance in multicomputer networks. In ICS 1997, pages 132--139, July 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. M. Galles. Spider: a high-speed network interconnect. Micro, IEEE, 17(1):34--39, January-February 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. C. Glass and L. Ni. The turn model for adaptive routing. In ISCA 1992, pages 278--287, June 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. P. Gratz, B. Grot, and S. Keckler. Regional congestion awareness for load balance in networks-on-chip. In HPCA 2008, pages 203--214, February 2008.Google ScholarGoogle ScholarCross RefCross Ref
  18. P. Gratz, K. Sankaralingam, H. Hanson, P. Shivakumar, R. McDonald, S. Keckler, and D. Burger. Implementation and evaluation of a dynamically routed processor operand network. In NOCS 2007, pages 7--17, May 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Y. Hoskote, S. Vangal, A. Singh, N. Borkar, and S. Borkar. A 5-GHz mesh interconnect for a Teraflops processor. Micro, IEEE, 27(5):51--61, September-October 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. J. Hu and R. Marculescu. DyAD - smart routing for networks-on-chip. In DAC 2004, pages 260--263, June 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. J. Hu and R. Marculescu. Energy- and performance-aware mapping for regular NoC architectures. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, 24(4):551--562, April 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. ITRS. International Technology Roadmap for Semiconductors, 2007 edition. http://www.itrs.net, 2007.Google ScholarGoogle Scholar
  23. M. Karol, M. Hluchyj, and S. Morgan. Input versus output queueing on a space-division packet switch. Communications, IEEE Transactions on, 35(12):1347--1356, December 1987.Google ScholarGoogle ScholarCross RefCross Ref
  24. J. Kim, D. Park, T. Theocharides, N. Vijaykrishnan, and C. Das. A low latency router supporting adaptivity for on-chip interconnects. In DAC 2005, pages 559--564, June 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. A. Kumar, P. Kundu, A. Singh, L.-S. Peh, and N. Jha. A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS. In ICCD 2007, pages 63--70, October 2007.Google ScholarGoogle ScholarCross RefCross Ref
  26. T. Lei and S. Kumar. A two-step genetic algorithm for mapping task graphs to a network on chip architecture. In DSD 2003, pages 180--187, September 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. M. Li, Q.-A. Zeng, and W.-B. Jone. DyXY - a proximity congestion-aware deadlock-free dynamic routing method for network on chip. In DAC 2006, pages 849--852, June 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. D. llitzky, J. Hoffman, A. Chun, and B. Esparza. Architecture of the scalable communications core's network on chip. Micro, IEEE, 27(5):62--74, September-October 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. J. C. Martínez, F. Silla, P. López, and J. Duato. On the influence of the selection function on the performance of networks of workstations. In ISHPC 2000, pages 292--299, October 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. G. Michelogiannakis, D. Sanchez, W. Dally, and C. Kozyrakis. Evaluating bufferless flow control for on-chip networks. In NOCS 2010, pages 9--16, May 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. O. Mutlu and T. Moscibroda. Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems. In ISCA 2008, pages 63--74, June 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. L.-S. Peh and W. Dally. A delay model and speculative architecture for pipelined routers. In HPCA 2001, pages 255--266, May 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. R. S. Ramanujam and B. Lin. Destination-based adaptive routing on 2D mesh networks. In ANCS 2010, pages 19:1--19:12, October 25-26 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. S. Rodrigo, J. Flich, J. Duato, and M. Hummel. Efficient unicast and multicast support for CMPs. In MICRO 2008, pages 364--375, November 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. L. Schwiebert and R. Bell. Performance tuning of adaptive wormhole routing through selection function choice. J. Parallel Distrib. Comput., 62:1121--1141, July 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. A. Singh, W. Dally, A. Gupta, and B. Towles. GOAL: a load-balanced adaptive routing algorithm for torus networks. In ISCA 2003, pages 194--205, June 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. SPEC. SPEC benchmarks. http://www.spec.org, 2009.Google ScholarGoogle Scholar
  38. TPC. TPC benchmarks. http://www.tpc.org, 2008.Google ScholarGoogle Scholar
  39. S. Woo, M. Ohara, E. Torrie, J. Singh, and A. Gupta. The SPLASH-2 programs: characterization and methodological considerations. In ISCA 1995, pages 24--36, June 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. S. Zhuravlev, S. Blagodurov, and A. Fedorova. Addressing shared resource contention in multicore processors via scheduling. In ASPLOS 2010, pages 129--142, March 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. DBAR: an efficient routing algorithm to support multiple concurrent applications in networks-on-chip

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        ISCA '11: Proceedings of the 38th annual international symposium on Computer architecture
        June 2011
        488 pages
        ISBN:9781450304726
        DOI:10.1145/2000064
        • cover image ACM SIGARCH Computer Architecture News
          ACM SIGARCH Computer Architecture News  Volume 39, Issue 3
          ISCA '11
          June 2011
          462 pages
          ISSN:0163-5964
          DOI:10.1145/2024723
          Issue’s Table of Contents

        Copyright © 2011 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 4 June 2011

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate543of3,203submissions,17%

        Upcoming Conference

        ISCA '24

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader