Convex Analytic Methods in Markov Decision Processes

  • Chapter
Handbook of Markov Decision Processes

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 40)

Abstract

This article describes the convex analytic approach to classical Markov decision processes wherein they are cast as a static convex programming problem in the space of measures. Applications to multiobjective problems are described.
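The static convex program mentioned in the abstract can be illustrated on a finite average-cost MDP via the classical linear program over state-action frequencies (occupation measures), in the spirit of Manne's formulation cited in the references. The sketch below is not from the chapter itself; the toy transition matrices, costs, and variable names are invented for illustration.

```python
# Minimal sketch: average-cost MDP as a linear program over the
# occupation measure x(s, a) = long-run state-action frequency.
# Constraints: stationarity (balance) of x and normalization to a
# probability measure. Toy data (P, c) is purely illustrative.
import numpy as np
from scipy.optimize import linprog

# Toy MDP: 2 states, 2 actions.
# P[a][s][s'] = transition probability, c[s][a] = one-step cost.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.7, 0.3]]])  # action 1
c = np.array([[1.0, 2.0],
              [4.0, 0.5]])

S, A = 2, 2
cost = c.flatten()  # variable order: (s0,a0), (s0,a1), (s1,a0), (s1,a1)

# Balance: for each s', sum_a x(s',a) = sum_{s,a} P(s'|s,a) x(s,a).
A_eq = []
for sp in range(S):
    row = np.zeros(S * A)
    for s in range(S):
        for a in range(A):
            row[s * A + a] -= P[a, s, sp]
            if s == sp:
                row[s * A + a] += 1.0
    A_eq.append(row)
# Normalization: x is a probability measure on state-action pairs.
A_eq.append(np.ones(S * A))
b_eq = np.zeros(S + 1)
b_eq[-1] = 1.0

res = linprog(cost, A_eq=np.array(A_eq), b_eq=b_eq,
              bounds=[(0, None)] * (S * A))
x = res.x.reshape(S, A)
print("optimal average cost:", res.fun)
print("occupation measure:\n", x)
```

An optimal stationary policy is recovered by conditioning: in state s, choose action a with probability x(s, a) / sum_a' x(s, a'). The multiobjective problems described in the chapter add further linear constraints on x, which is what makes the convex analytic viewpoint convenient.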


References

  1. E. Altman, Constrained Markov Decision Processes, Chapman and Hall/CRC, Boca Raton, Florida, 1999.

  2. E. Altman, “Applications of Markov decision processes in communication networks: a survey”, Chapter 16, this volume.

  3. E. Altman and A. Shwartz, “Markov decision problems and state-action frequencies”, SIAM Journal on Control and Optimization 29, pp. 786–809, 1991.

  4. E. Altman and F. Spieksma, “The linear program approach in Markov decision processes revisited”, ZOR — Methods and Models in Operations Research 42, pp. 169–188, 1995.

  5. E. J. Anderson and P. Nash, Linear Programming in Infinite Dimensional Spaces, John Wiley, Chichester, 1987.

  6. K. J. Arrow, E. W. Barankin and D. Blackwell, “Admissible points of convex sets”, in Contributions to the Theory of Games (eds. H. W. Kuhn and A. W. Tucker), Princeton University Press, Princeton, NJ, pp. 87–91, 1950.

  7. S. Bhatnagar and V. S. Borkar, “A convex analytic framework for ergodic control of semi-Markov processes”, Mathematics of Operations Research 20, pp. 923–936, 1995.

  8. A. G. Bhatt and V. S. Borkar, “Occupation measures for controlled Markov processes: characterization and optimality”, The Annals of Probability 24, pp. 1531–1562, 1996.

  9. V. S. Borkar, “A convex analytic approach to Markov decision processes”, Probability Theory and Related Fields 78, pp. 583–602, 1988.

  10. V. S. Borkar, “Control of Markov chains with long-run average cost criterion: the dynamic programming equations”, SIAM Journal on Control and Optimization 27, pp. 642–657, 1989.

  11. V. S. Borkar, “Controlled Markov chains with constraints”, Sadhana: Indian Academy of Sciences Proceedings in Engineering Sciences 15, pp. 405–413, 1990.

  12. V. S. Borkar, Topics in Controlled Markov Chains, Pitman Research Notes in Mathematics No. 240, Longman Scientific and Technical, Harlow, England, 1991.

  13. V. S. Borkar, “Controlled diffusions with constraints II”, Journal of Mathematical Analysis and Applications 176, pp. 310–321, 1993.

  14. V. S. Borkar, “Ergodic control of Markov chains with constraints — the general case”, SIAM Journal on Control and Optimization 32, pp. 176–186, 1994.

  15. V. S. Borkar, Probability Theory: An Advanced Course, Springer Verlag, New York, 1995.

  16. V. S. Borkar, “Uniform stability of controlled Markov processes”, in System Theory: Modeling, Analysis and Control (eds. T. E. Djaferis and I. C. Schick), Kluwer Academic Publishers, Boston, pp. 106–120, 1999.

  17. V. S. Borkar and M. K. Ghosh, “Controlled diffusions with constraints”, Journal of Mathematical Analysis and Applications 152, pp. 88–108, 1990.

  18. G. Choquet, Lectures on Analysis, Vol. II: Representation Theory, W. A. Benjamin, Inc., Reading, Mass., 1969.

  19. C. Derman, Finite State Markovian Decision Processes, Academic Press, New York, 1970.

  20. C. Derman and M. Klein, “Some remarks on finite horizon Markovian decision models”, Operations Research 13, pp. 272–278, 1965.

  21. L. Dubins, “On extreme points of convex sets”, Journal of Mathematical Analysis and Applications 5, pp. 237–244, 1962.

  22. G. Fayolle, V. A. Malyshev and M. V. Menshikov, Topics in the Constructive Theory of Countable Markov Chains, Cambridge University Press, Cambridge, UK, 1995.

  23. E. A. Feinberg, “Nonrandomized Markov and semi-Markov strategies in dynamic programming”, Theory of Probability and its Applications 27, pp. 116–126, 1982.

  24. E. A. Feinberg, “Controlled Markov processes with arbitrary numerical criteria”, Theory of Probability and its Applications 27, pp. 486–503, 1982.

  25. E. A. Feinberg, “Constrained semi-Markov decision processes with average rewards”, ZOR — Methods and Models in Operations Research 39, pp. 257–288, 1995.

  26. E. A. Feinberg, “On measurability and representation of strategic measures in Markov decision processes”, in Statistics, Probability and Game Theory: Papers in Honour of David Blackwell (eds. T. S. Ferguson et al.), IMS Lecture Notes — Monograph Series 30, Hayward, pp. 29–43, 1996.

  27. E. A. Feinberg, “Total reward criteria”, Chapter 5, this volume.

  28. E. A. Feinberg and A. Shwartz, “Constrained discounted dynamic programming”, Mathematics of Operations Research 21, pp. 922–945, 1996.

  29. E. A. Feinberg and I. M. Sonin, “Notes on equivalent stationary policies in Markov decision processes with total rewards”, Mathematical Methods of Operations Research 44, pp. 205–221, 1996.

  30. M. K. Ghosh, “Markov decision processes with multiple costs”, Operations Research Letters 9, pp. 257–260, 1990.

  31. O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes, Springer Verlag, New York, 1996.

  32. O. Hernández-Lerma and J. B. Lasserre, Further Topics in Discrete-Time Markov Control Processes, Springer Verlag, New York, 1999.

  33. O. Hernández-Lerma and J. B. Lasserre, “The linear programming approach”, Chapter 12, this volume.

  34. A. Hordijk and L. C. M. Kallenberg, “Linear programming and Markov decision chains”, Management Science 25, pp. 352–362, 1979.

  35. A. Hordijk and L. C. M. Kallenberg, “Constrained undiscounted stochastic dynamic programming”, Mathematics of Operations Research 9, pp. 276–289, 1984.

  36. D. Kadelka, “On randomized policies and mixtures of deterministic policies in dynamic programming”, Methods of Operations Research 46, pp. 67–75, 1983.

  37. L. C. M. Kallenberg, “Finite state and action MDPs”, Chapter 2, this volume.

  38. D. Krass, Contributions to the theory and applications of Markov decision processes, Ph.D. Thesis, Department of Mathematics, The Johns Hopkins University, Baltimore, Maryland.

  39. N. V. Krylov, “Once more about the connection between elliptic operators and Itô’s stochastic equations”, in Statistics and Control of Stochastic Processes, Steklov Seminar (eds. N. V. Krylov, R. Sh. Liptser and A. A. Novikov), Optimization Software, New York, pp. 69–101, 1985.

  40. N. V. Krylov, “An approach in the theory of controlled diffusion processes”, Theory of Probability and its Applications 31, pp. 604–626, 1987.

  41. T. G. Kurtz and R. Stockbridge, “Existence of Markov controls and characterization of optimal Markov controls”, SIAM Journal on Control and Optimization 36, pp. 609–653, 1998.

  42. D. G. Luenberger, Optimization by Vector Space Methods, John Wiley, New York, 1969.

  43. A. Manne, “Linear programming and sequential decisions”, Management Science 6, pp. 259–267, 1960.

  44. A. B. Piunovskiy, Optimal Control of Random Sequences in Problems with Constraints, Kluwer Academic Publishers, Dordrecht, 1997.

  45. A. B. Piunovskiy, “Controlled random sequences: methods of convex analysis and problems with functional constraints”, Russian Math. Surveys 6, pp. 129–192, 1998.

  46. M. Puterman, Markov Decision Processes, John Wiley, New York, 1994.

  47. K. W. Ross, “Randomized and past-dependent policies for Markov decision problems with multiple constraints”, Operations Research 37, pp. 474–477, 1989.

  48. K. W. Ross and R. Varadarajan, “Markov decision processes with sample path constraints: the communicating case”, Operations Research 37, pp. 780–790, 1989.

  49. L. I. Sennott, “Average reward optimization theory for denumerable state spaces”, Chapter 5, this volume.

  50. N. Shimkin and A. Shwartz, “Guaranteed performance regions in Markovian systems with competing decision makers”, IEEE Transactions on Automatic Control 38, pp. 84–95, 1993.

  51. P. Tuominen and R. L. Tweedie, “Subgeometric rates of convergence of f-ergodic Markov chains”, Advances in Applied Probability 26, pp. 775–798, 1994.

Copyright information

© 2002 Springer Science+Business Media New York

About this chapter

Cite this chapter

Borkar, V.S. (2002). Convex Analytic Methods in Markov Decision Processes. In: Feinberg, E.A., Shwartz, A. (eds) Handbook of Markov Decision Processes. International Series in Operations Research & Management Science, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0805-2_11

  • DOI: https://doi.org/10.1007/978-1-4615-0805-2_11

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5248-8

  • Online ISBN: 978-1-4615-0805-2

  • eBook Packages: Springer Book Archive
