
2L-3W: 2-Level 3-Way Hardware–Software Co-verification for the Mapping of Convolutional Neural Network (CNN) onto FPGA Boards

  • Original Research
  • Published in: SN Computer Science

Abstract

FPGAs have become a popular choice for deploying Convolutional Neural Networks (CNNs), and many researchers have explored the deployment and mapping of CNNs on FPGAs. However, verifying these deployments at design time remains one of the biggest challenges, and the need for design-time verification is growing rapidly because of the use of CNNs in safety-critical applications. To the best of our knowledge, this is the first work to propose a 2-Level 3-Way (2L-3W) hardware–software co-verification methodology at design time. 2L-3W provides a step-by-step guide for the successful mapping, deployment, and verification of CNNs on FPGA boards. The 2-Level verification ensures that the implementation at each stage (software and hardware) follows the desired behavior. The 3-Way co-verification provides a cross-paradigm (software, design architecture, and hardware) layer-by-layer parameter check to ensure the correct implementation and mapping of CNNs onto FPGA boards. The proposed 2L-3W co-verification methodology has been evaluated over several test cases. In each case, the prediction and layer-by-layer outputs of the CNN deployed on the PYNQ FPGA board (hardware), the intermediate layer-by-layer outputs of the CNN implemented in Vivado HLS (design architecture), and the prediction and layer-by-layer outputs at the software level (Caffe) are compared with a Python script to obtain a similarity score. The comparison quantifies the degree of success of the CNN mapping to the FPGA and, in the case of an unsuccessful mapping, helps identify at design time the layer to be debugged. We demonstrated our technique on the LeNet CNN and the LeNet-3D CNN (a Caffe-inspired network for the CIFAR-10 dataset), and the co-verification yielded layer-by-layer similarity scores of 99%.
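The 3-way layer-by-layer comparison described above can be sketched in Python. This is a minimal illustration, not the authors' actual script: the function names, the dictionary-of-arrays layout for layer outputs, and the element-wise tolerance check are all assumptions made for the sketch.

```python
import numpy as np

def layer_similarity(ref, test, tol=1e-3):
    """Percentage of elements in `test` that match `ref` within tolerance `tol`."""
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    assert ref.shape == test.shape, "layer outputs must have identical shapes"
    # Element-wise agreement within an absolute tolerance, averaged over the layer.
    return 100.0 * np.isclose(test, ref, atol=tol).mean()

def co_verify(software_layers, hls_layers, hardware_layers, tol=1e-3):
    """3-way check: compare HLS and hardware outputs against the software
    (golden) outputs, layer by layer. Each argument maps layer names to
    arrays of that layer's output values."""
    report = {}
    for name, sw_out in software_layers.items():
        report[name] = {
            "sw_vs_hls": layer_similarity(sw_out, hls_layers[name], tol),
            "sw_vs_hw": layer_similarity(sw_out, hardware_layers[name], tol),
        }
    return report
```

A per-layer report like this makes a failed mapping easy to localize: the first layer whose similarity drops below the expected threshold is the one to debug.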




Funding

This work is partially supported by the National Science Foundation NSF CNS #1852126, the Carnegie Classification Funding from College of Engineering, and the Center for Manufacturing Research (CMR) at Tennessee Technological University.

Author information

Corresponding author

Correspondence to Tolulope A. Odetola.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions


About this article


Cite this article

Odetola, T.A., Groves, K.M., Mohammed, Y. et al. 2L-3W: 2-Level 3-Way Hardware–Software Co-verification for the Mapping of Convolutional Neural Network (CNN) onto FPGA Boards. SN COMPUT. SCI. 3, 60 (2022). https://doi.org/10.1007/s42979-021-00954-5

