Convex Analytic Methods in Markov Decision Processes

  • Chapter
Handbook of Markov Decision Processes

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 40)

Abstract

This article describes the convex analytic approach to classical Markov decision processes wherein they are cast as a static convex programming problem in the space of measures. Applications to multiobjective problems are described.
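The static convex program mentioned in the abstract can be illustrated on a finite average-cost MDP via the classical linear program over state-action frequencies (occupation measures), in the spirit of Manne's formulation cited in the references. The sketch below is not from the chapter itself; the toy transition matrices, costs, and variable names are invented for illustration.

```python
# Minimal sketch: average-cost MDP as a linear program over the
# occupation measure x(s, a) = long-run state-action frequency.
# Constraints: stationarity (balance) of x and normalization to a
# probability measure. Toy data (P, c) is purely illustrative.
import numpy as np
from scipy.optimize import linprog

# Toy MDP: 2 states, 2 actions.
# P[a][s][s'] = transition probability, c[s][a] = one-step cost.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.7, 0.3]]])  # action 1
c = np.array([[1.0, 2.0],
              [4.0, 0.5]])

S, A = 2, 2
cost = c.flatten()  # variable order: (s0,a0), (s0,a1), (s1,a0), (s1,a1)

# Balance: for each s', sum_a x(s',a) = sum_{s,a} P(s'|s,a) x(s,a).
A_eq = []
for sp in range(S):
    row = np.zeros(S * A)
    for s in range(S):
        for a in range(A):
            row[s * A + a] -= P[a, s, sp]
            if s == sp:
                row[s * A + a] += 1.0
    A_eq.append(row)
# Normalization: x is a probability measure on state-action pairs.
A_eq.append(np.ones(S * A))
b_eq = np.zeros(S + 1)
b_eq[-1] = 1.0

res = linprog(cost, A_eq=np.array(A_eq), b_eq=b_eq,
              bounds=[(0, None)] * (S * A))
x = res.x.reshape(S, A)
print("optimal average cost:", res.fun)
print("occupation measure:\n", x)
```

An optimal stationary policy is recovered by conditioning: in state s, choose action a with probability x(s, a) / sum_a' x(s, a'). The multiobjective problems described in the chapter add further linear constraints on x, which is what makes the convex analytic viewpoint convenient.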


References

  1. E. Altman, Constrained Markov Decision Processes, Chapman and Hall/CRC, Boca Raton, Florida, 1999.

  2. E. Altman, “Applications of Markov decision processes in communication networks: a survey”, Chapter 16, this volume.

  3. E. Altman and A. Shwartz, “Markov decision problems and state-action frequencies”, SIAM Journal on Control and Optimization 29, pp. 786–809, 1991.

  4. E. Altman and F. Spieksma, “The linear program approach in Markov decision processes revisited”, ZOR — Methods and Models in Operations Research 42, pp. 169–188, 1995.

  5. E. J. Anderson and P. Nash, Linear Programming in Infinite Dimensional Spaces, John Wiley, Chichester, 1987.

  6. K. J. Arrow, E. W. Barankin and D. Blackwell, “Admissible points of convex sets”, in Contributions to the Theory of Games (eds. H. W. Kuhn and A. W. Tucker), Princeton University Press, Princeton, NJ, pp. 87–91, 1950.

  7. S. Bhatnagar and V. S. Borkar, “A convex analytic framework for ergodic control of semi-Markov processes”, Mathematics of Operations Research 20, pp. 923–936, 1995.

  8. A. G. Bhatt and V. S. Borkar, “Occupation measures for controlled Markov processes: characterization and optimality”, The Annals of Probability 24, pp. 1531–1562, 1996.

  9. V. S. Borkar, “A convex analytic approach to Markov decision processes”, Probability Theory and Related Fields 78, pp. 583–602, 1988.

  10. V. S. Borkar, “Control of Markov chains with long-run average cost criterion: the dynamic programming equations”, SIAM Journal on Control and Optimization 27, pp. 642–657, 1989.

  11. V. S. Borkar, “Controlled Markov chains with constraints”, Sadhana: Indian Academy of Sciences Proceedings in Engineering Sciences 15, pp. 405–413, 1990.

  12. V. S. Borkar, Topics in Controlled Markov Chains, Pitman Research Notes in Mathematics No. 240, Longman Scientific and Technical, Harlow, England, 1991.

  13. V. S. Borkar, “Controlled diffusions with constraints II”, Journal of Mathematical Analysis and Applications 176, pp. 310–321, 1993.

  14. V. S. Borkar, “Ergodic control of Markov chains with constraints — the general case”, SIAM Journal on Control and Optimization 32, pp. 176–186, 1994.

  15. V. S. Borkar, Probability Theory: An Advanced Course, Springer Verlag, New York, 1995.

  16. V. S. Borkar, “Uniform stability of controlled Markov processes”, in System Theory: Modeling, Analysis and Control (eds. T. E. Djaferis and I. C. Schick), Kluwer Academic Publishers, Boston, pp. 106–120, 1999.

  17. V. S. Borkar and M. K. Ghosh, “Controlled diffusions with constraints”, Journal of Mathematical Analysis and Applications 152, pp. 88–108, 1990.

  18. G. Choquet, Lectures on Analysis, Vol. II: Representation Theory, W. A. Benjamin, Inc., Reading, Mass., 1969.

  19. C. Derman, Finite State Markovian Decision Processes, Academic Press, New York, 1970.

  20. C. Derman and M. Klein, “Some remarks on finite horizon Markovian decision models”, Operations Research 13, pp. 272–278, 1965.

  21. L. Dubins, “On extreme points of convex sets”, Journal of Mathematical Analysis and Applications 5, pp. 237–244, 1962.

  22. G. Fayolle, V. A. Malyshev and M. V. Menshikov, Topics in the Constructive Theory of Countable Markov Chains, Cambridge University Press, Cambridge, UK, 1995.

  23. E. A. Feinberg, “Nonrandomized Markov and semi-Markov strategies in dynamic programming”, Theory of Probability and its Applications 27, pp. 116–126, 1982.

  24. E. A. Feinberg, “Controlled Markov processes with arbitrary numerical criteria”, Theory of Probability and its Applications 27, pp. 486–503, 1982.

  25. E. A. Feinberg, “Constrained semi-Markov decision processes with average rewards”, ZOR — Methods and Models in Operations Research 39, pp. 257–288, 1995.

  26. E. A. Feinberg, “On measurability and representation of strategic measures in Markov decision processes”, in Statistics, Probability and Game Theory: Papers in Honour of David Blackwell (eds. T. S. Ferguson et al.), IMS Lecture Notes — Monograph Series 30, Hayward, pp. 29–43, 1996.

  27. E. A. Feinberg, “Total reward criteria”, Chapter 5, this volume.

  28. E. A. Feinberg and A. Shwartz, “Constrained discounted dynamic programming”, Mathematics of Operations Research 21, pp. 922–945, 1996.

  29. E. A. Feinberg and I. M. Sonin, “Notes on equivalent stationary policies in Markov decision processes with total rewards”, Mathematical Methods of Operations Research 44, pp. 205–221, 1996.

  30. M. K. Ghosh, “Markov decision processes with multiple costs”, Operations Research Letters 9, pp. 257–260, 1990.

  31. O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes, Springer Verlag, New York, 1996.

  32. O. Hernández-Lerma and J. B. Lasserre, Further Topics in Discrete-Time Markov Control Processes, Springer Verlag, New York, 1999.

  33. O. Hernández-Lerma and J. B. Lasserre, “The linear programming approach”, Chapter 12, this volume.

  34. A. Hordijk and L. C. M. Kallenberg, “Linear programming and Markov decision chains”, Management Science 25, pp. 352–362, 1979.

  35. A. Hordijk and L. C. M. Kallenberg, “Constrained undiscounted stochastic dynamic programming”, Mathematics of Operations Research 9, pp. 276–289, 1984.

  36. D. Kadelka, “On randomized policies and mixtures of deterministic policies in dynamic programming”, Methods of Operations Research 46, pp. 67–75, 1983.

  37. L. C. M. Kallenberg, “Finite state and action MDPs”, Chapter 2, this volume.

  38. D. Krass, Contributions to the theory and applications of Markov decision processes, Ph.D. Thesis, Department of Mathematics, The Johns Hopkins University, Baltimore, Maryland.

  39. N. V. Krylov, “Once more about the connection between elliptic operators and Itô’s stochastic equations”, in Statistics and Control of Stochastic Processes, Steklov Seminar (eds. N. V. Krylov, R. Sh. Liptser and A. A. Novikov), Optimization Software, New York, pp. 69–101, 1985.

  40. N. V. Krylov, “An approach in the theory of controlled diffusion processes”, Theory of Probability and its Applications 31, pp. 604–626, 1987.

  41. T. G. Kurtz and R. Stockbridge, “Existence of Markov controls and characterization of optimal Markov controls”, SIAM Journal on Control and Optimization 36, pp. 609–653, 1998.

  42. D. G. Luenberger, Optimization by Vector Space Methods, John Wiley, New York, 1969.

  43. A. Manne, “Linear programming and sequential decisions”, Management Science 6, pp. 259–267, 1960.

  44. A. B. Piunovskiy, Optimal Control of Random Sequences in Problems with Constraints, Kluwer Academic Publishers, Dordrecht, 1997.

  45. A. B. Piunovskiy, “Controlled random sequences: methods of convex analysis and problems with functional constraints”, Russian Math. Surveys 6, pp. 129–192, 1998.

  46. M. Puterman, Markov Decision Processes, John Wiley, New York, 1994.

  47. K. W. Ross, “Randomized and past-dependent policies for Markov decision problems with multiple constraints”, Operations Research 37, pp. 474–477, 1989.

  48. K. W. Ross and R. Varadarajan, “Markov decision processes with sample path constraints: the communicating case”, Operations Research 37, pp. 780–790, 1989.

  49. L. I. Sennott, “Average reward optimization theory for denumerable state spaces”, Chapter 5, this volume.

  50. N. Shimkin and A. Shwartz, “Guaranteed performance regions in Markovian systems with competing decision makers”, IEEE Transactions on Automatic Control 38, pp. 84–95, 1993.

  51. P. Tuominen and R. L. Tweedie, “Subgeometric rates of convergence of f-ergodic Markov chains”, Advances in Applied Probability 26, pp. 775–798, 1994.

Copyright information

© 2002 Springer Science+Business Media New York

About this chapter

Cite this chapter

Borkar, V.S. (2002). Convex Analytic Methods in Markov Decision Processes. In: Feinberg, E.A., Shwartz, A. (eds) Handbook of Markov Decision Processes. International Series in Operations Research & Management Science, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0805-2_11

  • DOI: https://doi.org/10.1007/978-1-4615-0805-2_11

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5248-8

  • Online ISBN: 978-1-4615-0805-2

  • eBook Packages: Springer Book Archive
