Abstract
In this paper, we consider the challenge of maximizing an unknown function f whose evaluations are noisy and costly to acquire. An iterative procedure uses the previous measurements to actively select the next evaluation of f that is predicted to be the most useful. We focus on the case where the function can be evaluated in parallel in batches of fixed size, and we analyze the benefit over the purely sequential procedure in terms of cumulative regret. We introduce the Gaussian Process Upper Confidence Bound and Pure Exploration algorithm (GP-UCB-PE), which combines the UCB strategy and Pure Exploration within the same batch of evaluations along the parallel iterations. We prove theoretical upper bounds on the regret with batches of size K for this procedure, which show an improvement of order \(\sqrt{K}\) for fixed iteration cost over purely sequential versions. Moreover, the multiplicative constants involved are dimension-free. We also confirm empirically the efficiency of GP-UCB-PE on real and synthetic problems compared to state-of-the-art competitors.
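To make the idea of mixing UCB and Pure Exploration in one batch concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a finite 1-D candidate grid, a zero-mean GP prior with a squared-exponential kernel, and a fixed confidence parameter `beta`. The names `rbf_cov`, `condition` and `select_batch`, the "relevant region" filter, and the toy objective are all illustrative assumptions, none of which is stated in the abstract. The first point of the batch maximizes the upper confidence bound; the remaining K-1 points greedily maximize the posterior variance, which can be updated before any observation arrives because it does not depend on the observed values.

```python
import math

def rbf_cov(xs, length_scale=0.2):
    """Prior covariance of an (assumed) squared-exponential kernel on 1-D points."""
    return [[math.exp(-0.5 * ((a - b) / length_scale) ** 2) for b in xs] for a in xs]

def condition(mu, cov, j, noise_var, y=None):
    """Condition the GP on a noisy observation at grid index j.

    The covariance update does not depend on y, so y=None updates the
    covariance only (this is what the pure-exploration picks rely on).
    """
    n = len(mu)
    denom = cov[j][j] + noise_var
    col = [row[j] for row in cov]
    new_cov = [[cov[a][b] - col[a] * col[b] / denom for b in range(n)] for a in range(n)]
    if y is None:
        return mu, new_cov
    new_mu = [m + c * (y - mu[j]) / denom for m, c in zip(mu, col)]
    return new_mu, new_cov

def select_batch(mu, cov, k, beta=4.0, noise_var=0.01):
    """Return k grid indices: one UCB point, then k-1 maximum-variance points."""
    n = len(mu)
    width = [math.sqrt(beta * max(cov[i][i], 0.0)) for i in range(n)]
    ucb = [mu[i] + width[i] for i in range(n)]
    best_lcb = max(mu[i] - width[i] for i in range(n))
    # Assumed simplification: explore only where the UCB reaches the best LCB.
    region = [i for i in range(n) if ucb[i] >= best_lcb]
    batch = [max(range(n), key=lambda i: ucb[i])]
    work = cov
    for _ in range(k - 1):
        # Shrink the variance around the last pick without observing f there.
        _, work = condition(mu, work, batch[-1], noise_var)
        batch.append(max(region, key=lambda i: work[i][i]))
    return batch

# Toy usage with a hypothetical objective, evaluated without noise for brevity.
xs = [i / 20 for i in range(21)]
mu, cov = [0.0] * len(xs), rbf_cov(xs)
for _ in range(3):                      # three parallel iterations
    for j in select_batch(mu, cov, k=3):
        mu, cov = condition(mu, cov, j, 0.01, y=math.sin(6 * xs[j]))
print(xs[max(range(len(xs)), key=lambda i: mu[i])])
```

Updating the covariance one point at a time avoids any matrix inversion, which keeps the sketch dependency-free; on a large candidate set a Cholesky-based GP library would be the more practical choice. If the region contains fewer than K candidates, this sketch may select the same point more than once.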
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Contal, E., Buffoni, D., Robicquet, A., Vayatis, N. (2013). Parallel Gaussian Process Optimization with Upper Confidence Bound and Pure Exploration. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2013. Lecture Notes in Computer Science, vol 8188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40988-2_15
DOI: https://doi.org/10.1007/978-3-642-40988-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40987-5
Online ISBN: 978-3-642-40988-2
eBook Packages: Computer Science (R0)