Abstract
We propose deep reinforcement learning as a model-free method for exploring the landscape of string vacua. As a concrete application, we utilize an artificial intelligence agent known as an asynchronous advantage actor-critic to explore type IIA compactifications with intersecting D6-branes. As the agent explores different string background configurations by changing D6-brane configurations, it receives rewards and punishments related to string consistency conditions and proximity to Standard Model vacua. These are used in turn to update the agent's policy and value neural networks to improve its behavior. Through reinforcement learning, the agent's performance on both tasks (finding consistent string backgrounds and finding Standard Model vacua) improves significantly, and for some tasks it finds a factor of \( \mathcal{O}(200) \) more solutions than a random walker. In one case, we demonstrate that the agent learns a human-derived strategy for finding consistent string models. In another case, where no human-derived strategy exists, the agent learns a genuinely new strategy that achieves the same goal twice as efficiently per unit time. Our results demonstrate that the agent learns to solve various string theory consistency conditions simultaneously, which are phrased in terms of non-linear, coupled Diophantine equations.
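The actor-critic loop described above can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the authors' implementation: states are short integer vectors (loosely analogous to D6-brane winding numbers), actions increment or decrement one entry, and "consistency" is reduced to a single toy Diophantine constraint on the state. The paper's setup instead uses neural policy and value networks with many asynchronous parallel workers (A3C); here both are tabular and there is a single worker.

```python
import math
import random

# Toy stand-in for the RL setup: the "consistency condition" is the single
# toy Diophantine constraint sum(state) == TARGET. The environment, reward
# shaping, and all names here are illustrative inventions.

TARGET = 3  # toy tadpole-like charge the state must saturate

ACTIONS = [(i, d) for i in range(3) for d in (-1, +1)]  # (entry, +/-1)

def step(state, action):
    i, d = action
    new = list(state)
    new[i] += d
    return tuple(new)

def reward(state):
    # +1 for satisfying the toy constraint, small shaped punishment otherwise
    gap = abs(sum(state) - TARGET)
    return 1.0 if gap == 0 else -0.1 * gap

policy = {}  # state -> action logits (tabular "policy network")
value = {}   # state -> baseline V(s)  (tabular "value network")

def logits(s):
    return policy.setdefault(s, [0.0] * len(ACTIONS))

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    tot = sum(exps)
    return [e / tot for e in exps]

def sample_action(s):
    probs = softmax(logits(s))
    r, acc = random.random(), 0.0
    for k, p in enumerate(probs):
        acc += p
        if acc >= r:
            return k
    return len(probs) - 1

def train(episodes=2000, horizon=8, gamma=0.9, lr=0.1):
    for _ in range(episodes):
        s = (0, 0, 0)
        for _ in range(horizon):
            k = sample_action(s)
            s2 = step(s, ACTIONS[k])
            r = reward(s2)
            # one-step advantage: A = r + gamma * V(s') - V(s)
            adv = r + gamma * value.get(s2, 0.0) - value.get(s, 0.0)
            value[s] = value.get(s, 0.0) + lr * adv  # critic update
            z = logits(s)
            probs = softmax(z)
            for j in range(len(z)):
                # actor update along grad log pi: delta_{jk} - pi(j|s)
                z[j] += lr * adv * ((1.0 if j == k else 0.0) - probs[j])
            s = s2
            if r == 1.0:  # a "consistent" configuration; end the episode
                break
```

After `train()`, a greedy rollout from the origin tends to satisfy the toy constraint within a few steps. A3C scales this pattern up by having many such workers update shared network parameters asynchronously.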
Open Access
This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.
ArXiv ePrint: 1903.11616
Cite this article
Halverson, J., Nelson, B. & Ruehle, F. Branes with brains: exploring string vacua with deep reinforcement learning. J. High Energ. Phys. 2019, 3 (2019). https://doi.org/10.1007/JHEP06(2019)003