Abstract
We study the Markov decision process under the criterion of maximizing the probability that the total discounted reward exceeds a target level. We focus on the dynamic programming equations of the model. We give various properties of the optimal return operator and, for the infinite planning-horizon model, characterize the optimal value function as a maximal fixed point of this operator. Various turnpike results relating the finite- and infinite-horizon models are also given.
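To make the target-level criterion concrete, the following is a minimal sketch of a finite-horizon recursion of the kind the abstract refers to. It is not taken from the paper. It assumes the common formulation in which the state is augmented with the remaining target `x`, and after earning reward `r` the remaining target becomes `(x - r) / beta`. The toy MDP (`P`, `R`, `BETA`), the function name `target_prob`, and the convention that reaching the target exactly counts as success are all illustrative assumptions.

```python
from typing import List, Tuple

# Hypothetical toy MDP, invented for illustration (not from the paper).
# P[s][a] lists (next_state, probability) pairs; R[s][a] is the one-step reward.
P: List[List[List[Tuple[int, float]]]] = [
    [[(0, 1.0)], [(1, 0.5), (2, 0.5)]],  # state 0: action 0 "safe", action 1 "risky"
    [[(1, 1.0)]],                        # state 1: absorbing, pays 5 per step
    [[(2, 1.0)]],                        # state 2: absorbing, pays 0 per step
]
R: List[List[float]] = [[1.0, 0.0], [5.0], [0.0]]
BETA = 0.5  # discount factor (assumed value)


def target_prob(s: int, x: float, n: int) -> float:
    """Maximal probability that the n-step discounted reward from state s
    is at least x (assumption: meeting the target exactly counts)."""
    if n == 0:
        # No steps left: success only if nothing remains to be earned.
        return 1.0 if x <= 0 else 0.0
    best = 0.0
    for a, transitions in enumerate(P[s]):
        # After earning R[s][a], the rest must reach (x - r) / beta
        # measured in next-period units.
        x_next = (x - R[s][a]) / BETA
        value = sum(p * target_prob(s2, x_next, n - 1) for s2, p in transitions)
        best = max(best, value)
    return best


if __name__ == "__main__":
    for target in (1.2, 2.0, 3.0):
        print(target, target_prob(0, target, 2))
```

With a two-step horizon from state 0, playing safe twice earns 1.5 with certainty, while the risky action earns 2.5 with probability 0.5. The recursion therefore returns 1.0 for a target of 1.2, 0.5 for a target of 2.0, and 0.0 for a target of 3.0, showing how the optimal action depends on the target level and not only on the state.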
Additional information
Communicated by M. Pachter
About this article
Cite this article
Bouakiz, M., Kebir, Y. Target-level criterion in Markov decision processes. J Optim Theory Appl 86, 1–15 (1995). https://doi.org/10.1007/BF02193458
DOI: https://doi.org/10.1007/BF02193458