ABSTRACT
In this paper we initiate an investigation of generalizations of the Probably Approximately Correct (PAC) learning model that attempt to significantly weaken the target function assumptions. The ultimate goal in this direction is informally termed agnostic learning, in which we make virtually no assumptions on the target function. The name derives from the fact that as designers of learning algorithms, we give up the belief that Nature (as represented by the target function) has a simple or succinct explanation.
We give a number of both positive and negative results that provide an initial outline of the possibilities for agnostic learning. Our results include hardness results for the most obvious generalization of the PAC model to an agnostic setting, an efficient and general agnostic learning method based on dynamic programming, relationships between loss functions for agnostic learning, and an algorithm for learning in a model for problems involving hidden variables.
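The abstract describes the dynamic-programming method only at a high level. As a purely illustrative sketch of the flavor of such an approach (not the paper's actual algorithm), the following Python example uses dynamic programming to minimize empirical disagreements over a simple hypothesis class: piecewise-constant {0,1}-valued classifiers on the line with at most s pieces. The class choice, the function name `best_piecewise_constant`, and the O(s·n²) recurrence are all assumptions made for this sketch; the agnostic ingredient is that no hypothesis in the class is assumed to fit the sample perfectly.

```python
# Illustrative sketch only: a dynamic program that finds the minimum number of
# disagreements achievable on a labeled sample by a piecewise-constant
# classifier with at most s pieces. This is an assumed example of DP-based
# empirical-error minimization, not the algorithm from the paper itself.

def best_piecewise_constant(xs, ys, s):
    """Minimum disagreements of an at-most-s-piece constant classifier on (xs, ys)."""
    # Sort the sample by position so that pieces correspond to contiguous runs.
    pairs = sorted(zip(xs, ys))
    labels = [y for _, y in pairs]
    n = len(labels)

    # prefix_ones[i] = number of 1-labels among the first i points.
    prefix_ones = [0] * (n + 1)
    for i, y in enumerate(labels):
        prefix_ones[i + 1] = prefix_ones[i] + y

    def block_cost(j, i):
        # Cost of giving points j..i-1 one constant label: the rarer label
        # within the block must be misclassified.
        ones = prefix_ones[i] - prefix_ones[j]
        return min(ones, (i - j) - ones)

    INF = float("inf")
    # dp[k][i] = min disagreements on the first i points using at most k pieces.
    dp = [[INF] * (n + 1) for _ in range(s + 1)]
    dp[0][0] = 0
    for k in range(1, s + 1):
        dp[k][0] = 0
        for i in range(1, n + 1):
            dp[k][i] = min(dp[k - 1][j] + block_cost(j, i) for j in range(i))
    return dp[s][n]


if __name__ == "__main__":
    xs = [0.1, 0.2, 0.35, 0.5, 0.6, 0.75, 0.9]
    ys = [0,   0,   1,    1,   0,   1,    1]
    print(best_piecewise_constant(xs, ys, s=3))  # -> 1 unavoidable error
```

In this sketch, the best hypothesis is allowed to disagree with the data (here one point cannot be fit with three pieces), which is exactly the regime agnostic learning addresses: the learner competes with the best hypothesis in the class rather than assuming the target lies inside it.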