DOI: 10.1145/968280.968305
Article

FPGAs vs. CPUs: trends in peak floating-point performance

Published: 22 February 2004

ABSTRACT

Moore's Law states that the number of transistors on a device doubles every two years; however, it is often (mis)quoted based on its impact on CPU performance. This important corollary of Moore's Law states that improved clock frequency plus improved architecture yields a doubling of CPU performance every 18 months. This paper examines the impact of Moore's Law on the peak floating-point performance of FPGAs. Performance trends for individual operations are analyzed as well as the performance trend of a common instruction mix (multiply accumulate). The important result is that peak FPGA floating-point performance is growing significantly faster than peak floating-point performance for a CPU.
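The two doubling periods quoted in the abstract compound quite differently, and a few lines of arithmetic make the gap concrete. The sketch below is illustrative only: the function name `growth_factor` and the six-year horizon are assumptions introduced here, not taken from the paper, and the FPGA growth rate is not modeled because the abstract does not state one.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Return the multiplicative growth after `years` for a quantity
    that doubles once every `doubling_period_years`."""
    if doubling_period_years <= 0:
        raise ValueError("doubling period must be positive")
    return 2.0 ** (years / doubling_period_years)


# Doubling periods quoted in the abstract.
TRANSISTOR_DOUBLING_YEARS = 2.0  # transistor count: every two years
CPU_PERF_DOUBLING_YEARS = 1.5    # CPU performance: every 18 months

if __name__ == "__main__":
    horizon_years = 6.0  # assumption: illustrative horizon, not from the paper
    print("transistors:", growth_factor(horizon_years, TRANSISTOR_DOUBLING_YEARS))
    print("CPU performance:", growth_factor(horizon_years, CPU_PERF_DOUBLING_YEARS))
```

Over the assumed six-year horizon, a two-year doubling period gives an 8x increase, while an 18-month doubling period gives a 16x increase.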


        • Published in

          FPGA '04: Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
          February 2004
          266 pages
          ISBN: 1581138296
          DOI: 10.1145/968280

          Copyright © 2004 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 125 of 627 submissions, 20%
