DOI: 10.1145/3081333.3081336
research-article

MobileDeepPill: A Small-Footprint Mobile Deep Learning System for Recognizing Unconstrained Pill Images

Published: 16 June 2017

ABSTRACT

Correct identification of prescription pills based on their visual appearance is a key step in assuring patient safety and facilitating more effective patient care. With high-quality cameras and substantial computational power now available on smartphones, it is both possible and helpful to identify unknown prescription pills using a smartphone. Towards this goal, in 2016 the U.S. National Library of Medicine (NLM) of the National Institutes of Health (NIH) announced a nationwide competition calling for a mobile vision system that can automatically recognize pills from a mobile phone picture under unconstrained real-world settings. In this paper, we present the design and evaluation of such a mobile pill image recognition system, called MobileDeepPill. The development of MobileDeepPill involves three key innovations: a triplet loss function that attains invariance to the real-world noise that degrades the quality of pill images taken by mobile phones; a multi-CNN model that collectively captures the shape, color, and imprint characteristics of the pills; and a Knowledge Distillation-based deep model compression framework that significantly reduces the size of the multi-CNN model without degrading its recognition performance. Our deep learning-based pill image recognition algorithm won First Prize (champion) in the NIH NLM Pill Image Recognition Challenge. Given its promising performance, we believe MobileDeepPill helps NIH tackle a critical problem with significant societal impact and will benefit millions of healthcare personnel and the general public.
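
For readers unfamiliar with the two loss functions named in the abstract, the following is a minimal sketch of their commonly used textbook forms. It is an illustration only: the abstract does not give the paper's exact formulations, so the function names, the squared Euclidean distance, the margin value, and the temperature value are all assumptions.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic hinge-style triplet loss (assumed form, not the paper's).

    Pushes the anchor closer to the positive (same pill) than to the
    negative (different pill) by at least `margin`.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature parameter."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    This is the generic soft-target form of knowledge distillation; the
    temperature of 2.0 is an arbitrary illustrative choice.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

With these definitions, a triplet whose negative is already far from the anchor contributes zero loss, while one whose positive is farther than its negative is penalized. The distillation loss is smallest when the student's softened outputs match the teacher's.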

        Published in

        MobiSys '17: Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services
        June 2017, 520 pages
        ISBN: 9781450349284
        DOI: 10.1145/3081333

        Copyright © 2017 ACM

        Publisher: Association for Computing Machinery, New York, NY, United States

        Acceptance Rates

        MobiSys '17 paper acceptance rate: 34 of 188 submissions (18%)
        Overall acceptance rate: 274 of 1,679 submissions (16%)
