ABSTRACT
Moore's Law states that the number of transistors on a device doubles every two years; however, it is often (mis)quoted in terms of its impact on CPU performance. That important corollary of Moore's Law holds that improved clock frequency combined with improved architecture yields a doubling of CPU performance every 18 months. This paper examines the impact of Moore's Law on the peak floating-point performance of FPGAs. Performance trends are analyzed for individual operations as well as for a common instruction mix (multiply accumulate). The key result is that peak FPGA floating-point performance is growing significantly faster than peak CPU floating-point performance.
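The doubling periods quoted above can be sketched as simple compound-growth arithmetic. The following is an illustrative sketch, not from the paper: the function name and the fixed-doubling-period model performance(t) = p0 · 2^(t/T) are assumptions used only to make the two rates concrete.

```python
# Illustrative sketch (assumption, not from the paper): project growth under
# a fixed doubling period, performance(t) = p0 * 2 ** (t / T).
def projected_growth(p0, doubling_months, elapsed_months):
    """Return the projected value after `elapsed_months`,
    starting from `p0` and doubling every `doubling_months`."""
    return p0 * 2 ** (elapsed_months / doubling_months)

# Over six years (72 months), a 24-month doubling period (transistor count)
# gives an 8x increase, while an 18-month period (the CPU-performance
# corollary) gives 16x.
transistor_growth_6yr = projected_growth(1.0, 24, 72)  # -> 8.0
cpu_perf_growth_6yr = projected_growth(1.0, 18, 72)    # -> 16.0
```

The gap between the two curves widens exponentially with time, which is why the choice of doubling period matters when extrapolating peak performance.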
FPGAs vs. CPUs: trends in peak floating-point performance