Active learning using adaptive resampling

Authors:
Vijay S. Iyengar
IBM Research Division, T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY

Chidanand Apte
IBM Research Division, T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY

Tong Zhang
IBM Research Division, T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY

KDD '00: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
August 2000
Pages 91–98
https://doi.org/10.1145/347090.347110

Published: 01 August 2000


Index Terms

Active learning using adaptive resampling
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Supervised learning by classification
    2. Machine learning approaches
      1. Classification and regression trees
2. Information systems
  1. Information systems applications
    1. Data mining

Published in
KDD '00: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
August 2000
537 pages
ISBN: 1581132336
DOI: 10.1145/347090
Chairmen:
Raghu Ramakrishnan, Univ. of Wisconsin
Sal Stolfo, Columbia Univ., New York, NY
Roberto Bayardo, IBM Almaden Research Center, San Jose, CA
Ismail Parsa, Epsilon
Copyright © 2000 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
active learning
adaptive resampling
classification
data mining
machine learning