ABSTRACT
A new bi-modal asynchronous arbitration node is introduced for use as a building block in an adaptive asynchronous interconnection network. The target network topology is a variant of the Mesh-of-Trees (MoT), combining a binary fan-out network (for routing) and a binary fan-in network (for arbitration) for each source-sink pair. The key feature of the new arbitration node is that it reconfigures dynamically based on the traffic it receives, entering a special "single-channel-bias" mode for one channel when the other channel has no recent activity. In this mode, arbitration is bypassed entirely on the critical path, resulting in significantly lower node latency and, in high-traffic scenarios, improved throughput.
The router nodes were implemented in IBM 90 nm technology using ARM standard cells. SPICE simulations indicate that, when in biased mode, the bi-modal arbitration node provides a significant reduction in latency (41.6%) and an increase in throughput (19.8%, in high-traffic single-channel scenarios). Node reconfiguration requires at most 338 ps. Simulations were then performed on two distinct MoT indirect networks, "baseline" and "adaptive" (the latter incorporating the new bi-modal node), using eight diverse synthetic benchmarks with mixes of random and deterministic traffic. The adaptive network improved system latency by up to 19.8% and throughput by up to 27.8%. Overall end-to-end latencies, through 6 router nodes and 5 hops, of 1.8-2.8 ns (at 25% load) and throughputs of 0.27-1.8 Gigaflits/s (at saturation rate) were also observed.
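The mode-switching behavior described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuit from the paper: the abstract does not say how "no recent activity" is detected, so the `idle_threshold` parameter (a count of consecutive flits on one channel), the class name `BiModalArbiterModel`, and the path labels `"arbitrated"` and `"bypass"` are all hypothetical choices made for illustration.

```python
class BiModalArbiterModel:
    """Toy behavioral model of a two-input, bi-modal arbitration node.

    Assumption (not from the source): the node biases toward a channel after
    `idle_threshold` consecutive flits arrive on it with none on the other.
    """

    def __init__(self, idle_threshold=2):
        self.idle_threshold = idle_threshold  # hypothetical tuning parameter
        self.mode = "default"                 # "default" or "biased"
        self.biased_channel = None            # channel favored in biased mode
        self._last_channel = None
        self._streak = 0

    def accept(self, channel):
        """Process one flit on channel 0 or 1 and return the path it took."""
        if channel not in (0, 1):
            raise ValueError("channel must be 0 or 1")

        if self.mode == "biased":
            if channel == self.biased_channel:
                return "bypass"  # arbitration is skipped for this flit
            # The other channel became active: fall back to default mode.
            self.mode = "default"
            self.biased_channel = None
            self._last_channel = channel
            self._streak = 1
            return "arbitrated"

        # Default mode: every flit is arbitrated; track single-channel streaks.
        if channel == self._last_channel:
            self._streak += 1
        else:
            self._last_channel = channel
            self._streak = 1
        if self._streak >= self.idle_threshold:
            self.mode = "biased"
            self.biased_channel = channel
        return "arbitrated"


node = BiModalArbiterModel(idle_threshold=2)
paths = [node.accept(ch) for ch in [0, 0, 0, 1, 1, 1]]
print(paths)
```

In this model a sustained single-channel stream soon takes the bypass path, while a flit on the other channel forces a return to default mode; the cost of that switch (reported as at most 338 ps for the real node) is not modeled here.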
Index Terms
- A low-latency adaptive asynchronous interconnection network using bi-modal router nodes