Abstract
Reconfigurable platforms are a promising technology that offers the trade-off between flexibility and performance demanded by many recent embedded system applications, especially in fields such as multimedia processing. These applications typically involve multiple ad-hoc tasks for hardware acceleration, which are usually represented using formalisms such as Data Flow Diagrams (DFDs), Data Flow Graphs (DFGs), Control and Data Flow Graphs (CDFGs), or Petri Nets. However, none of these models can capture, at the same time, the pipeline behavior between tasks (which can therefore coexist in order to minimize the application execution time), their communication patterns, and their data dependencies. This article shows that knowledge of all this information can be effectively exploited to reduce the resource requirements and improve the timing performance of modern reconfigurable systems, where a set of hardware accelerators is used to support the computation. For this purpose, the article proposes a novel task representation model, named Temporal Constrained Data Flow Diagram (TCDFD), which includes all this information. It also presents a mapping-scheduling algorithm that takes advantage of the new TCDFD model and aims at minimizing the dynamic reconfiguration overhead while meeting the communication requirements among the tasks. Experimental results show that the presented approach saves up to 75% of resources and reduces the reconfiguration overhead by up to 89% with respect to other state-of-the-art techniques for reconfigurable platforms.
Index Terms
- A Mapping-Scheduling Algorithm for Hardware Acceleration on Reconfigurable Platforms