Article

Sequential cost-sensitive decision making with reinforcement learning

Authors:
Edwin Pednault, IBM T. J. Watson Research Center, Yorktown Heights, NY
Naoki Abe, IBM T. J. Watson Research Center, Yorktown Heights, NY
Bianca Zadrozny, University of California, San Diego, La Jolla, CA

KDD '02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 2002, Pages 259–268
https://doi.org/10.1145/775047.775086

Published: 23 July 2002

ABSTRACT

Recently, there has been increasing interest in the issues of cost-sensitive learning and decision making in a variety of applications of data mining. A number of approaches have been developed that are effective at optimizing cost-sensitive decisions when each decision is considered in isolation. However, the issue of sequential decision making, with the goal of maximizing total benefits accrued over a period of time instead of immediate benefits, has rarely been addressed. In the present paper, we propose a novel approach to sequential decision making based on the reinforcement learning framework. Our approach attempts to learn decision rules that optimize a sequence of cost-sensitive decisions so as to maximize the total benefits accrued over time. We use the domain of targeted marketing as a testbed for empirical evaluation of the proposed method. We conducted experiments using approximately two years of monthly promotion data derived from the well-known KDD Cup 1998 donation data set. The experimental results show that the proposed method for optimizing total accrued benefits outperforms the usual targeted-marketing methodology of optimizing each promotion in isolation. We also analyze the behavior of the targeting rules that were obtained and discuss their appropriateness to the application domain.
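
The contrast the abstract draws between optimizing each promotion in isolation and optimizing total benefits over time can be illustrated with a small sketch. This is not the paper's method or data: it uses plain tabular batch Q-learning on a hypothetical two-state customer model, and the state names, actions, rewards, and the discount factor `gamma` are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical logged transitions: (state, action, reward, next_state).
# Mailing an inactive customer loses money now but reactivates them.
LOG = [
    ("inactive", "mail", -1.0, "active"),
    ("inactive", "skip", 0.0, "inactive"),
    ("active", "mail", 3.0, "active"),
    ("active", "skip", 0.0, "inactive"),
]
ACTIONS = ("mail", "skip")
STATES = sorted({s for s, _, _, _ in LOG})


def mean_immediate_reward(log):
    """Average observed immediate reward for each (state, action) pair."""
    sums, counts = defaultdict(float), defaultdict(int)
    for s, a, r, _ in log:
        sums[(s, a)] += r
        counts[(s, a)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def batch_q_learning(log, gamma=0.9, sweeps=200):
    """Tabular batch Q-learning: repeated synchronous sweeps over the log."""
    q = defaultdict(float)
    for _ in range(sweeps):
        targets = defaultdict(list)
        for s, a, r, s2 in log:
            # One-step target: reward plus discounted best next-state value.
            best_next = max(q[(s2, b)] for b in ACTIONS)
            targets[(s, a)].append(r + gamma * best_next)
        for key, vals in targets.items():
            q[key] = sum(vals) / len(vals)
    return q


def greedy_policy(values):
    """Pick the highest-valued action per state; unseen pairs count as -inf."""
    return {
        s: max(ACTIONS, key=lambda a: values.get((s, a), float("-inf")))
        for s in STATES
    }


# Myopic rule: maximize the immediate reward of each decision in isolation.
myopic = greedy_policy(mean_immediate_reward(LOG))
# Sequential rule: maximize the discounted total reward over time.
q = batch_q_learning(LOG)
sequential = greedy_policy(q)

print("myopic:", myopic)
print("sequential:", sequential)
```

In this toy model the myopic rule skips inactive customers because mailing them has a negative immediate reward, while the sequential rule mails them because the later rewards from a reactivated customer outweigh the up-front cost.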

Index Terms

Sequential cost-sensitive decision making with reinforcement learning
1. Computing methodologies
  1. Machine learning

Published in
KDD '02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
July 2002
719 pages
ISBN:158113567X
DOI:10.1145/775047
Conference Chair: Osmar R. Zaïane, University of Alberta, Canada
General Chair: Randy Goebel, University of Alberta, Canada
Program Chairs: David Hand, Imperial College, UK; Daniel Keim, AT&T; Raymond Ng, University of British Columbia, Canada
Copyright © 2002 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
KDD '02 Paper Acceptance Rate: 44 of 307 submissions, 14%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%