High Performance Approximate Memories for Image Processing Applications

Jothin, R.; Mohamed, M. Peer

doi:10.1007/s10836-020-05879-0

Published: 21 May 2020

Volume 36, pages 419–428, (2020)

Journal of Electronic Testing

Abstract

Efficient utilization of on-chip Static Random Access Memory (SRAM) space is increasingly important in processor core design for modern Field Programmable Gate Array (FPGA) based Digital Signal Processing (DSP) applications. The proposed High-performance Approximate Single Port (HASP) SRAM architecture stores a significant amount of data to achieve high performance. The constraints that come with high performance are balanced against one another to provide high accuracy, high speed, low power and area efficiency. In the proposed High-performance Approximate Sub-Bank Dual Port (HASBDP1 and HASBDP2) memory architectures, HASP is employed and modified to work as a True Dual Port (TDP) SRAM with energy and area efficiency. The performance of the proposed memories is evaluated by comparing their speed, area and power with those of existing approaches. The proposed HASP SRAM consumes 14.99% less power and uses thirteen fewer logic elements than the conventional Single Port (SP) SRAM. On these design metrics, the proposed HASBDP SRAMs outperform the conventional TDP and sub-bank Dual Port (DP) SRAM approaches. The proposed HASBDP2 exhibits 29.09% and 22.37% higher peak signal-to-noise ratio (PSNR), and 32.94% and 28.48% higher structural similarity index (SSIM), than the truncated least significant bit and static segment on-chip approximate memories, respectively.
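The PSNR comparison against a truncated least significant bit memory can be made concrete with a small sketch. The code below is not the authors' HASP or HASBDP design. It assumes that a truncated-LSB approximate memory simply zeroes the k lowest bits of each 8-bit pixel before storage, and it measures the resulting quality loss with the standard PSNR formula. The function names, the pixel values and the choice k = 4 are hypothetical, chosen only for illustration.

```python
import math


def truncate_lsbs(pixel: int, k: int) -> int:
    """Zero the k least significant bits of an 8-bit pixel (assumed truncation model)."""
    mask = (0xFF >> k) << k  # e.g. k = 4 gives 0xF0
    return pixel & mask


def psnr(original, approximated, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(approximated):
        raise ValueError("sequences must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, approximated)) / len(original)
    if mse == 0:
        return math.inf  # identical data: no error to measure
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 8-bit pixel values, not taken from the paper.
pixels = [12, 57, 130, 200, 255, 93, 64, 181]
stored = [truncate_lsbs(p, 4) for p in pixels]

print(stored)
print(round(psnr(pixels, stored), 2), "dB")
```

A higher PSNR means the stored image is closer to the original, which is the sense in which the abstract reports HASBDP2 improving on the truncated-LSB and static segment memories. SSIM is a separate, structure-aware metric and is not computed in this sketch.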

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, Infant Jesus College of Engineering, Thoothukudi, India
R. Jothin
Anna University, Chennai, India
M. Peer Mohamed

Corresponding author

Correspondence to R. Jothin.

Additional information

Responsible Editor: S. T. Chakradhar

About this article

Cite this article

Jothin, R., Mohamed, M.P. High Performance Approximate Memories for Image Processing Applications. J Electron Test 36, 419–428 (2020). https://doi.org/10.1007/s10836-020-05879-0

Received: 15 September 2019
Accepted: 15 April 2020
Published: 21 May 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s10836-020-05879-0
