ABSTRACT
Deep Neural Networks (DNNs) have emerged as a powerful and versatile set of techniques, with successes on challenging artificial intelligence (AI) problems. Applications in domains such as image and video processing, autonomous cars, natural language processing, speech synthesis and recognition, genomics, and many others have embraced deep learning as their foundation. DNNs achieve superior accuracy for these applications at the cost of high computational complexity: they use very large models that require hundreds of MBs of data storage, exaops of computation, and high bandwidth for data movement. Despite these impressive advances, it still takes days to weeks to train state-of-the-art deep networks on large datasets, which directly limits the pace of innovation and adoption. In this paper, we present a multi-pronged approach to address the challenges in meeting both the throughput and the energy efficiency goals for DNN training.