ABSTRACT
Convolutional neural networks (CNNs) play a vital role in machine learning. Emerging resistive random-access memories (RRAMs) and RRAM-based processing-in-memory architectures have demonstrated great potential for boosting both the performance and energy efficiency of CNNs. However, because multi-bit RRAM process technology remains immature, it is difficult to implement and fabricate a CNN accelerator chip based on multi-bit RRAM devices. Moreover, existing single-bit RRAM-based CNN accelerators support only binary or ternary CNNs, which suffer more than 10% accuracy loss compared with full-precision CNNs. This paper proposes a configurable multi-precision CNN computing framework based on single-bit RRAM, which consists of an RRAM-computing-overhead-aware network quantization algorithm and a configurable multi-precision CNN computing architecture. The proposed method achieves accuracy equivalent to full-precision CNNs while reducing storage consumption and latency through multi-precision quantization. The designed architecture supports accelerating multi-precision CNNs, even when precision varies across layers. Experimental results show that the proposed framework reduces computing area by 70% and computing energy by 75% on average, with nearly no accuracy loss, and delivers 1.6~8.6× higher equivalent energy efficiency than existing RRAM-based architectures with only 1.07% area overhead.
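The paper itself includes no source code; the Python sketch below is only an illustration of the standard bit-slicing idea that makes multi-precision computation possible on single-bit RRAM: each bit-plane of a quantized weight matrix is stored in its own binary crossbar, and the multi-bit matrix-vector product is recovered by shift-and-add over the per-plane crossbar outputs. The uniform quantizer, the idealized crossbar model, and all function names here are assumptions for illustration, not the authors' design.

```python
# Illustrative sketch (NOT the paper's implementation) of bit-sliced
# matrix-vector multiplication on single-bit crossbars.
import numpy as np

def quantize_weights(w, bits):
    """Uniformly quantize weights from [-1, 1) to signed `bits`-bit integers."""
    scale = 2 ** (bits - 1)
    q = np.clip(np.round(w * scale), -scale, scale - 1).astype(np.int32)
    return q, 1.0 / scale

def to_bit_planes(q, bits):
    """Split signed integers into `bits` binary {0,1} planes (two's complement)."""
    u = q & ((1 << bits) - 1)  # unsigned two's-complement view
    return [((u >> b) & 1).astype(np.int32) for b in range(bits)]

def crossbar_mvm(plane, x):
    """Idealized single-bit crossbar: binary conductance matrix times input vector."""
    return plane.T @ x

def multi_precision_mvm(w, x, bits):
    """Multi-bit matrix-vector product built from single-bit crossbars."""
    q, scale = quantize_weights(w, bits)
    planes = to_bit_planes(q, bits)
    # Shift-and-add the per-bit-plane crossbar results...
    acc = sum(crossbar_mvm(p, x) * (1 << b) for b, p in enumerate(planes))
    # ...then correct for the negative weight of the sign (MSB) plane.
    acc -= (1 << bits) * crossbar_mvm(planes[-1], x)
    return acc * scale

rng = np.random.default_rng(0)
w = rng.uniform(-1.0, 1.0, (64, 16))   # 64 inputs -> 16 outputs
x = rng.integers(0, 4, 64)             # small-integer activations
q, scale = quantize_weights(w, 4)
print(np.allclose(multi_precision_mvm(w, x, 4), (q.T @ x) * scale))
```

Under this reading, choosing a different `bits` value per layer corresponds to the per-layer precision configurability the abstract describes: lower-precision layers simply occupy fewer bit-plane crossbars and fewer shift-and-add cycles.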