Abstract
This article describes the convex analytic approach to classical Markov decision processes, wherein the control problem is recast as a static convex programming problem over a space of measures. Applications to multiobjective problems are also described.
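For a finite discounted MDP, the static program the abstract refers to reduces to the classical linear program over discounted occupation measures: minimize the expected cost subject to balance (flow) constraints, with the optimal stationary policy recovered by normalizing the optimal measure. The sketch below uses made-up transition and cost data (all numbers are illustrative, not from the article) and solves the LP with `scipy.optimize.linprog`:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 2-state, 2-action discounted MDP (illustrative data only).
nS, nA, beta = 2, 2, 0.9
# P[a, s, s2] = probability of moving from state s to s2 under action a
P = np.array([[[0.9, 0.1],
               [0.2, 0.8]],
              [[0.5, 0.5],
               [0.6, 0.4]]])
c = np.array([[1.0, 2.0],    # c[s, a] = one-stage cost
              [0.5, 3.0]])
alpha = np.array([0.5, 0.5])  # initial state distribution

# Variable x[s, a] = discounted occupation measure, flattened row-major.
# Balance constraint for each state j:
#   sum_a x[j, a] - beta * sum_{s, a} P[a, s, j] x[s, a] = alpha[j]
A_eq = np.zeros((nS, nS * nA))
for j in range(nS):
    for s in range(nS):
        for a in range(nA):
            A_eq[j, s * nA + a] = (s == j) - beta * P[a, s, j]

res = linprog(c=c.ravel(), A_eq=A_eq, b_eq=alpha, bounds=(0, None))
x = res.x.reshape(nS, nA)
# Optimal stationary (possibly randomized) policy: pi(a|s) = x(s,a)/sum_a x(s,a)
policy = x / x.sum(axis=1, keepdims=True)
print(policy)
```

Summing the balance constraints over states shows the total mass of any feasible measure is 1/(1-beta), so the feasible set is compact and the LP attains its minimum at an extreme point, which corresponds to a deterministic stationary policy in the unconstrained case.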
Copyright information
© 2002 Springer Science+Business Media New York
About this chapter
Cite this chapter
Borkar, V.S. (2002). Convex Analytic Methods in Markov Decision Processes. In: Feinberg, E.A., Shwartz, A. (eds) Handbook of Markov Decision Processes. International Series in Operations Research & Management Science, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0805-2_11
Print ISBN: 978-1-4613-5248-8
Online ISBN: 978-1-4615-0805-2