ABSTRACT
This paper presents a new technique and analysis for applying on-line learning algorithms to active learning problems. Our algorithm, Active Vote, works by actively selecting instances that force several perturbed copies of an on-line algorithm to make mistakes. The main intuition behind our result is that the number of mistakes made by the optimal on-line algorithm is a lower bound on the number of labels needed for active learning. We give performance bounds for Active Vote in both a batch and an on-line model of active learning; these bounds depend on the algorithm having a set of unlabeled instances on which the perturbed on-line algorithms disagree. The motivating application for Active Vote is an Internet advertisement rating program. We conduct experiments on data collected for this advertisement problem as well as on standard datasets, and show that Active Vote achieves an order-of-magnitude decrease in the number of labeled instances compared with various passive learning algorithms such as Support Vector Machines.
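The selection rule the abstract describes, querying a label only when perturbed copies of an on-line learner disagree, can be sketched as follows. This is not the paper's Active Vote implementation; it is a minimal query-by-committee style illustration using perceptrons with perturbed initial weights, where the class name, the Gaussian perturbation, and all parameters are assumptions for the sketch.

```python
import random

class PerturbedPerceptronCommittee:
    """Committee of perceptrons whose initial weights are randomly perturbed.

    A label is requested from the oracle only when committee members
    disagree on an instance; members that were wrong then receive the
    standard perceptron mistake update.
    """

    def __init__(self, n_features, n_members=5, noise=0.1, seed=0):
        r = random.Random(seed)
        self.weights = [[noise * r.gauss(0, 1) for _ in range(n_features)]
                        for _ in range(n_members)]
        self.queries = 0  # number of labels requested so far

    def _vote(self, w, x):
        s = sum(wi * xi for wi, xi in zip(w, x))
        return 1 if s >= 0 else -1

    def votes(self, x):
        return [self._vote(w, x) for w in self.weights]

    def maybe_query(self, x, oracle):
        v = self.votes(x)
        if min(v) == max(v):        # unanimous committee: skip the label
            return False
        y = oracle(x)               # disagreement: spend one label
        self.queries += 1
        for w, vi in zip(self.weights, v):
            if vi != y:             # perceptron update for wrong members
                for j in range(len(w)):
                    w[j] += y * x[j]
        return True

    def predict(self, x):
        """Majority vote of the committee."""
        return 1 if sum(self.votes(x)) >= 0 else -1
```

On a stream of linearly separable instances, only a fraction of the stream triggers disagreement, so far fewer labels are requested than a passive learner would consume; this is the label-saving effect the abstract's bounds formalize.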