ABSTRACT
Recently, there has been increasing interest in the issues of cost-sensitive learning and decision making in a variety of applications of data mining. A number of approaches have been developed that are effective at optimizing cost-sensitive decisions when each decision is considered in isolation. However, the issue of sequential decision making, with the goal of maximizing total benefits accrued over a period of time instead of immediate benefits, has rarely been addressed. In the present paper, we propose a novel approach to sequential decision making based on the reinforcement learning framework. Our approach attempts to learn decision rules that optimize a sequence of cost-sensitive decisions so as to maximize the total benefits accrued over time. We use the domain of targeted marketing as a testbed for empirical evaluation of the proposed method. We conducted experiments using approximately two years of monthly promotion data derived from the well-known KDD Cup 1998 donation data set. The experimental results show that the proposed method for optimizing total accrued benefits outperforms the usual targeted-marketing methodology of optimizing each promotion in isolation. We also analyze the behavior of the targeting rules that were obtained and discuss their appropriateness to the application domain.
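The contrast between optimizing each promotion in isolation and optimizing total accrued benefits can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two customer states, the reward values, the deterministic transitions, the discount factor `gamma`, and all function names are hypothetical. The sketch also uses plain Q-value iteration on a known toy model, a simplification, whereas the paper learns its rules from promotion data.

```python
# Hypothetical two-state customer model (illustrative, not from the paper).
INACTIVE, ACTIVE = 0, 1
NO_MAIL, MAIL = 0, 1
STATES = (INACTIVE, ACTIVE)
ACTIONS = (NO_MAIL, MAIL)

# (state, action) -> (immediate net reward, next state); all values assumed.
MODEL = {
    (INACTIVE, NO_MAIL): (0.0, INACTIVE),
    (INACTIVE, MAIL): (-1.0, ACTIVE),   # mailing cost, no immediate response
    (ACTIVE, NO_MAIL): (0.0, INACTIVE),
    (ACTIVE, MAIL): (3.0, ACTIVE),      # donation minus mailing cost
}


def myopic_policy():
    """Pick the action with the highest immediate reward in each state."""
    return {s: max(ACTIONS, key=lambda a: MODEL[(s, a)][0]) for s in STATES}


def q_value_iteration(gamma=0.9, sweeps=500):
    """Iterate Q(s, a) = r + gamma * max_b Q(s', b) for a fixed number of sweeps."""
    q = {sa: 0.0 for sa in MODEL}
    for _ in range(sweeps):
        q = {
            (s, a): r + gamma * max(q[(s2, b)] for b in ACTIONS)
            for (s, a), (r, s2) in MODEL.items()
        }
    return q


def greedy_policy(q):
    """Pick the action with the highest long-run value in each state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


if __name__ == "__main__":
    q = q_value_iteration()
    print("myopic:    ", myopic_policy())
    print("sequential:", greedy_policy(q))
```

In this toy model the myopic rule never mails an inactive customer, because the immediate reward of mailing is negative, while the rule derived from the long-run values does mail, because the mailing moves the customer to a state with higher future returns.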