
Abstract

One persistent criticism of AI has been that we don't really understand what's going on inside the black box of an AI model. What are all the hidden neurons doing when a CNN recognizes a cat? How is an AI model able to generalize to unseen examples? Another persistent criticism has been that we don't really know what happens during the training process. What is the nature of its optimization landscape? Why doesn't training get stuck in a local minimum?
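The question about local minima can be probed with a small experiment. The sketch below, which is only illustrative and not taken from the chapter (the tiny 2-8-1 tanh network, the synthetic data, and all hyperparameters are assumptions chosen for brevity), evaluates the loss along the straight line from a random initialization to a trained solution. In the spirit of linear-interpolation studies of neural loss landscapes, a roughly monotone decrease along this line suggests the path to the solution is not riddled with bad local minima.

```python
# Minimal sketch: probe the loss along the line between an initial and a
# trained parameter vector of a tiny NumPy MLP (all details are assumptions).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))              # synthetic 2-D inputs
y = (X[:, 0] * X[:, 1] > 0).astype(float)  # XOR-like labels

def unpack(theta):
    """Split the flat parameter vector into the two weight matrices."""
    W1 = theta[:16].reshape(2, 8)
    W2 = theta[16:].reshape(8, 1)
    return W1, W2

def loss(theta):
    """Mean squared error of a 2-8-1 tanh network with parameters theta."""
    W1, W2 = unpack(theta)
    h = np.tanh(X @ W1)
    p = h @ W2
    return float(np.mean((p[:, 0] - y) ** 2))

def grad(theta, eps=1e-5):
    """Numerical gradient via central finite differences (slow but simple)."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (loss(theta + d) - loss(theta - d)) / (2 * eps)
    return g

theta0 = rng.normal(scale=0.5, size=24)    # random initialization
theta = theta0.copy()
for _ in range(500):                       # plain gradient descent
    theta -= 0.2 * grad(theta)

# Loss along the straight line theta0 -> theta, alpha in [0, 1].
for alpha in np.linspace(0.0, 1.0, 11):
    interp = (1 - alpha) * theta0 + alpha * theta
    print(f"alpha={alpha:.1f}  loss={loss(interp):.4f}")
```

On a toy problem like this the interpolated loss typically decreases almost monotonically; the chapter's point is that analogous behavior is observed, perhaps surprisingly, in much larger networks.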




Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Dube, S. (2021). Why AI Works. In: An Intuitive Exploration of Artificial Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-030-68624-6_4


  • DOI: https://doi.org/10.1007/978-3-030-68624-6_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68623-9

  • Online ISBN: 978-3-030-68624-6
