tutorial

Performance Evaluation of NoC-Based Multicore Systems: From Traffic Analysis to NoC Latency Modeling

Published: 11 May 2016

Abstract

In this survey, we review several approaches for predicting the performance of Network-on-Chip (NoC)-based multicore systems, from traffic models to complex NoC models for latency evaluation. We first review typical traffic models used to represent application workloads in NoCs. Specifically, we review Markovian and non-Markovian (e.g., self-similar or long-range memory-dependent) traffic models and discuss their applications to multicore platform design. Then, we review the analytical techniques for predicting NoC performance under a given input traffic. We investigate analytical models for both average and maximum delay evaluation. We also review the developments and design challenges of NoC simulators. One interesting research direction in NoC performance evaluation is to combine simulation and analytical models in order to exploit the advantages of both. Toward this end, we discuss several newly proposed approaches that use hardware-based or learning-based techniques. Finally, we summarize several open problems and give our perspective on addressing these challenges.

  135. T. F. Wenisch, R. E. Wunderlich, M. Ferdman, A. Ailamaki, B. Falsafi, and J. C. Hoe. 2006. SimFlex: Statistical sampling of computer system simulation. IEEE Micro 26, 4 (July 2006), 18--31. Google ScholarGoogle ScholarDigital LibraryDigital Library
  136. P. T. Wolkotte, P. K. F. Holzenspies, and G. J. M. Smit. 2007. Fast, accurate and detailed NoC simulations. In Proceedings of the 1st International Symposium on Networks-on-Chip (NOCS’07). 323--332. Google ScholarGoogle ScholarDigital LibraryDigital Library
  137. S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta. 1995. The SPLASH-2 programs: Characterization and methodological considerations. In Proceedings of the 22nd International Symposium on Computer Architecture. Google ScholarGoogle ScholarDigital LibraryDigital Library
  138. Wormsim. 2008. Wormsim simulator. (2008). http://www.ece.cmu.edu/∼sld/software/worm_sim.php.Google ScholarGoogle Scholar
  139. Y. Wu, G. Min, M. Ould-Khaoua, H. Yin, and L. Wang. 2010. Analytical modelling of networks in multicomputer systems under bursty and batch arrival traffic. The Journal of Supercomputing 51, 2 (2010), 115--130. Google ScholarGoogle ScholarDigital LibraryDigital Library
  140. T. Yoshihara, S. Kasahara, and Y. Takahashi. 2001. Practical time-scale fitting of self-similar traffic with Markov-modulated Poisson process. Telecommunication Systems 17, 1 (2001), 185--211. Google ScholarGoogle ScholarDigital LibraryDigital Library
  141. X. Zhao and Z. Lu. 2013. Per-flow delay bound analysis based on a formalized microarchitectural model. In Proceedings of the 2013 7th IEEE/ACM International Symposium on Networks on Chip (NoCS). 1--8.Google ScholarGoogle Scholar
  142. M. Zolghadr, K. Mirhosseini, S. Gorgin, and A. Nayebi. 2011. GPU-based NoC simulator. In Proceedings of the 2011 9th IEEE/ACM International Conference on Formal Methods and Models for Codesign (MEMOCODE). 83--88.Google ScholarGoogle Scholar

      • Published in

        ACM Transactions on Design Automation of Electronic Systems, Volume 21, Issue 3
        Special Section on New Physical Design Techniques for the Next Generation Integration Technology and Regular Papers
        July 2016
        434 pages
        ISSN:1084-4309
        EISSN:1557-7309
        DOI:10.1145/2926747
        • Editor: Naehyuck Chang

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 11 May 2016
        • Accepted: 1 December 2015
        • Revised: 1 October 2015
        • Received: 1 June 2015

        Qualifiers

        • tutorial
        • Research
        • Refereed
