Target-level criterion in Markov decision processes

Bouakiz, M.; Kebir, Y.

doi:10.1007/BF02193458

Contributed Papers
Published: July 1995

Volume 86, pages 1–15, (1995)

Journal of Optimization Theory and Applications

Abstract

The Markov decision process is studied under the criterion of maximizing the probability that the total discounted reward exceeds a target level. We focus on the dynamic programming equations of the model. We give various properties of the optimal return operator and, for the infinite planning-horizon model, we characterize the optimal value function as a maximal fixed point of this operator. Various turnpike results relating the finite- and infinite-horizon models are also given.
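
To make the criterion concrete, the following is a minimal sketch of a finite-horizon recursion of the kind such a model typically leads to. It is not taken from the paper: the augmented state (current state plus remaining target), the strict-inequality boundary condition, the toy transition table `P`, the reward table `R`, the discount factor `BETA`, and the function name `target_value` are all assumptions made for illustration. The idea is that after earning reward r, the remaining target x is rescaled to (x - r) / BETA for the next stage.

```python
from functools import lru_cache

# Hypothetical two-state, two-action MDP; every number here is illustrative.
# P[s][a] lists (next_state, probability); R[s][a] is the immediate reward.
P = {
    0: {0: [(0, 1.0)], 1: [(0, 0.5), (1, 0.5)]},
    1: {0: [(1, 1.0)], 1: [(0, 1.0)]},
}
R = {
    0: {0: 1.0, 1: 0.0},
    1: {0: 4.0, 1: 0.0},
}
BETA = 0.5  # assumed discount factor


@lru_cache(maxsize=None)
def target_value(n, s, x):
    """Maximal probability that the n-stage discounted reward from s exceeds x."""
    if n == 0:
        # No stages left: the accumulated reward is 0, which exceeds x only if x < 0.
        return 1.0 if x < 0 else 0.0
    best = 0.0
    for a in P[s]:
        # Remaining target after collecting R[s][a], expressed in next-stage units.
        y = (x - R[s][a]) / BETA
        value = sum(p * target_value(n - 1, s2, y) for s2, p in P[s][a])
        best = max(best, value)
    return best


if __name__ == "__main__":
    for x in (1.25, 1.75, 2.0):
        print(x, target_value(2, 0, x))
```

In this toy instance the "safe" action in state 0 yields a two-stage total of 1.5 with certainty, while the "risky" action reaches a total of 2.0 with probability 0.5. A target of 1.25 is therefore met with probability 1 by playing safe, whereas a target of 1.75 can only be exceeded by taking the risk, which illustrates how the best action depends on the remaining target and not only on the state.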

Author information

Authors and Affiliations

ATT-IOD, Morristown, New Jersey
M. Bouakiz (Staff Member)
Departments of Management Science and Mathematical Sciences, Loyola University of Chicago, Chicago, Illinois
Y. Kebir (Senior Lecturer)

Additional information

Communicated by M. Pachter

About this article

Cite this article

Bouakiz, M., Kebir, Y. Target-level criterion in Markov decision processes. J Optim Theory Appl 86, 1–15 (1995). https://doi.org/10.1007/BF02193458

Issue Date: July 1995
DOI: https://doi.org/10.1007/BF02193458
