ABSTRACT
Convolutional neural networks (CNNs) play a vital role in machine learning. Emerging resistive random-access memories (RRAMs) and RRAM-based processing-in-memory architectures have demonstrated great potential for boosting both the performance and energy efficiency of CNNs. However, because multi-bit RRAM process technology remains immature, it is difficult to implement and fabricate a CNN accelerator chip based on multi-bit RRAM devices. Moreover, existing single-bit RRAM-based CNN accelerators support only binary or ternary CNNs, which suffer more than 10% accuracy loss compared with full-precision CNNs. This paper proposes a configurable multi-precision CNN computing framework based on single-bit RRAM, which consists of an RRAM-computing-overhead-aware network quantization algorithm and a configurable multi-precision CNN computing architecture. The proposed method achieves accuracy equivalent to full-precision CNNs while reducing storage consumption and latency through multi-precision quantization. The designed architecture supports accelerating multi-precision CNNs, even when precision varies across layers. Experimental results show that the proposed framework reduces computing area by 70% and computing energy by 75% on average, with nearly no accuracy loss, and delivers 1.6~8.6× higher equivalent energy efficiency than existing RRAM-based architectures with only 1.07% area overhead.
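The paper itself includes no source code; the Python sketch below is only an illustration of the standard bit-slicing idea that makes multi-precision computation possible on single-bit RRAM: each bit-plane of a quantized weight matrix is stored in its own binary crossbar, and the multi-bit matrix-vector product is recovered by shift-and-add over the per-plane crossbar outputs. The uniform quantizer, the idealized crossbar model, and all function names here are assumptions for illustration, not the authors' design.

```python
# Illustrative sketch (NOT the paper's implementation) of bit-sliced
# matrix-vector multiplication on single-bit crossbars.
import numpy as np

def quantize_weights(w, bits):
    """Uniformly quantize weights from [-1, 1) to signed `bits`-bit integers."""
    scale = 2 ** (bits - 1)
    q = np.clip(np.round(w * scale), -scale, scale - 1).astype(np.int32)
    return q, 1.0 / scale

def to_bit_planes(q, bits):
    """Split signed integers into `bits` binary {0,1} planes (two's complement)."""
    u = q & ((1 << bits) - 1)  # unsigned two's-complement view
    return [((u >> b) & 1).astype(np.int32) for b in range(bits)]

def crossbar_mvm(plane, x):
    """Idealized single-bit crossbar: binary conductance matrix times input vector."""
    return plane.T @ x

def multi_precision_mvm(w, x, bits):
    """Multi-bit matrix-vector product built from single-bit crossbars."""
    q, scale = quantize_weights(w, bits)
    planes = to_bit_planes(q, bits)
    # Shift-and-add the per-bit-plane crossbar results...
    acc = sum(crossbar_mvm(p, x) * (1 << b) for b, p in enumerate(planes))
    # ...then correct for the negative weight of the sign (MSB) plane.
    acc -= (1 << bits) * crossbar_mvm(planes[-1], x)
    return acc * scale

rng = np.random.default_rng(0)
w = rng.uniform(-1.0, 1.0, (64, 16))   # 64 inputs -> 16 outputs
x = rng.integers(0, 4, 64)             # small-integer activations
q, scale = quantize_weights(w, 4)
print(np.allclose(multi_precision_mvm(w, x, 4), (q.T @ x) * scale))
```

Under this reading, choosing a different `bits` value per layer corresponds to the per-layer precision configurability the abstract describes: lower-precision layers simply occupy fewer bit-plane crossbars and fewer shift-and-add cycles.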