Abstract
Physical models use mathematical equations to characterize physical systems such as airway mechanics, neuron networks, or chemical reactions. Previous work has shown that field-programmable gate arrays (FPGAs) execute physical models efficiently. To improve on those implementations, this article applies graph-theoretic techniques to synthesize physical models onto FPGAs in two phases. The first phase maps the physical model equations onto a structured virtual processing element (PE) graph using graph-theoretic folding techniques. The second phase maps the structured virtual PE graph onto physical PE regions on an FPGA using graph embedding theory. For models whose topology is not amenable to embedding, a simulated annealing algorithm is introduced that can map any physical model onto an FPGA regardless of the model's underlying topology. We further extend the simulated annealing approach by using existing graph drawing algorithms to generate the initial placement. Compared to previous work on physical model implementation on FPGAs, embedding increases clock frequency by 25% on average (for applicable topologies), whereas simulated annealing increases clock frequency by 13% on average. The embedding approach typically produces a circuit whose frequency is limited by the FPGA clock rather than by routing. Additionally, complex models that previously could not be routed became routable when placement constraints were used.
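As a rough illustration of the simulated annealing idea mentioned above, the following minimal sketch places the nodes of a PE graph onto a rectangular grid of slots by swapping slot contents. It is not the article's algorithm: the cost function (total Manhattan distance between connected PEs), the geometric cooling schedule, the random initial placement, and all names and parameter values (`anneal_placement`, `wirelength`, `t_start`, `alpha`, and so on) are assumptions chosen for illustration only.

```python
import math
import random


def wirelength(edges, pos):
    """Total Manhattan distance over all PE-to-PE connections (assumed cost)."""
    return sum(abs(pos[u][0] - pos[v][0]) + abs(pos[u][1] - pos[v][1])
               for u, v in edges)


def anneal_placement(num_pes, edges, rows, cols, seed=0,
                     t_start=5.0, t_end=0.01, alpha=0.95, moves_per_temp=200):
    """Place PEs 0..num_pes-1 on a rows x cols grid; return (positions, cost)."""
    rng = random.Random(seed)
    slots = [(r, c) for r in range(rows) for c in range(cols)]
    if num_pes > len(slots):
        raise ValueError("grid has fewer slots than PEs")

    # Random initial placement (hypothetical; a graph drawing algorithm
    # could supply a better starting point).
    rng.shuffle(slots)
    occupant = list(range(num_pes)) + [None] * (len(slots) - num_pes)
    pos = {pe: slots[pe] for pe in range(num_pes)}

    cost = wirelength(edges, pos)
    best_cost, best_pos = cost, dict(pos)

    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            # Propose swapping the contents of two distinct slots.
            i, j = rng.sample(range(len(slots)), 2)
            a, b = occupant[i], occupant[j]
            if a is None and b is None:
                continue
            occupant[i], occupant[j] = b, a
            if a is not None:
                pos[a] = slots[j]
            if b is not None:
                pos[b] = slots[i]

            new_cost = wirelength(edges, pos)
            delta = new_cost - cost
            # Always accept improvements; accept worse moves with a
            # probability that shrinks as the temperature drops.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best_cost, best_pos = cost, dict(pos)
            else:
                # Reject: undo the swap.
                occupant[i], occupant[j] = a, b
                if a is not None:
                    pos[a] = slots[i]
                if b is not None:
                    pos[b] = slots[j]
        t *= alpha  # geometric cooling (assumed schedule)

    return best_pos, best_cost


# Example: a 7-node complete binary tree placed on a 3 x 3 grid.
tree_edges = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)]
placement, total = anneal_placement(7, tree_edges, rows=3, cols=3)
print(total, placement)
```

Because the sketch makes no assumption about the shape of the graph, it accepts any edge list, which mirrors the topology-independence claimed for the simulated annealing approach. A real placer would also need a timing-aware cost and device-specific placement constraints, neither of which is modeled here.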
Index Terms
- Graph-Based Approaches to Placement of Processing Element Networks on FPGAs for Physical Model Simulation