ABSTRACT
The architectural differences between ASICs and FPGAs limit the performance gains achievable when ASIC-based approximation principles are applied to FPGA-based reconfigurable computing systems. This paper presents a novel approximate multiplier architecture customized for FPGA-based fabrics, an efficient design methodology, and an open-source library. Our designs provide higher area, latency, and energy gains, along with better output accuracy, than state-of-the-art ASIC-based approximate multipliers. Moreover, compared to the multiplier IP offered by Xilinx Vivado, our proposed design achieves up to 30%, 53%, and 67% gains in area, latency, and energy, respectively, while incurring an insignificant accuracy loss (below 1% average relative error). Our library of approximate multipliers is open-source and available online at https://cfaed.tudresden.de/pd-downloads to fuel further research and development in this area, thereby enabling a new research direction for the FPGA community.
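The accuracy figure quoted above is an average relative error. As a rough illustration of how such a metric can be measured, the following Python sketch exhaustively compares a toy approximate multiplier against exact multiplication. The function names, the 8-bit width, and the approximation itself (dropping the lowest partial-product columns) are assumptions made for this example; this is not the architecture proposed in the paper, and the paper's exact error definition may differ.

```python
from itertools import product


def approx_multiply(a: int, b: int, width: int = 8, dropped_columns: int = 4) -> int:
    """Toy approximate unsigned multiplier (hypothetical, for illustration only).

    Sums the partial-product bits a_i * b_j * 2^(i+j) but discards every bit
    whose column index i + j is below `dropped_columns`.
    """
    result = 0
    for i in range(width):
        if not (a >> i) & 1:
            continue
        for j in range(width):
            if (b >> j) & 1 and i + j >= dropped_columns:
                result += 1 << (i + j)
    return result


def average_relative_error(multiply, width: int = 8) -> float:
    """Mean of |exact - approx| / exact over all non-zero operand pairs.

    Zero operands are skipped because the relative error is undefined
    when the exact product is zero.
    """
    total = 0.0
    count = 0
    for a, b in product(range(1, 1 << width), repeat=2):
        exact = a * b
        total += abs(exact - multiply(a, b)) / exact
        count += 1
    return total / count


if __name__ == "__main__":
    error = average_relative_error(approx_multiply)
    print(f"average relative error: {error:.4%}")
```

Setting `dropped_columns=0` makes the toy multiplier exact, which gives a quick sanity check of the metric before evaluating any real approximate design.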
Area-Optimized Low-Latency Approximate Multipliers for FPGA-based Hardware Accelerators
2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC)