
Markov Control Processes with the Expected Total Cost Criterion: Optimality, Stability, and Transient Models


Abstract

This paper studies the expected total cost (ETC) criterion for discrete-time Markov control processes on Borel spaces with possibly unbounded cost-per-stage functions. It presents optimality results, including conditions for a control policy to be ETC-optimal and for the ETC-value function to be a solution of the dynamic programming equation. Conditions are also given for the ETC-value function to be the limit of the α-discounted cost value function as α ↑ 1, and for the Markov control process to be "stable" in the sense of Lagrange and almost surely. In addition, transient control models are fully analyzed. The paper thus provides a fairly complete, updated, survey-like presentation of the ETC criterion for Markov control processes on Borel spaces.
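For orientation, the objects named in the abstract have the following standard forms in the Markov control literature (a sketch in the usual notation; the paper's own notation and assumptions may differ). Here c is the cost-per-stage function, Q the transition kernel, A(x) the set of admissible actions at state x, and π a control policy:

```latex
% Expected total cost (ETC) of a policy \pi from initial state x:
V(\pi, x) = \mathbb{E}_x^{\pi}\!\left[\sum_{t=0}^{\infty} c(x_t, a_t)\right]

% ETC-value function:
V^{*}(x) = \inf_{\pi} V(\pi, x)

% Dynamic programming equation that V^{*} is shown to satisfy:
V^{*}(x) = \min_{a \in A(x)} \left\{ c(x,a) + \int_X V^{*}(y)\, Q(dy \mid x, a) \right\}

% \alpha-discounted value function, whose limit as \alpha \uparrow 1
% is compared with V^{*}:
V_{\alpha}^{*}(x) = \inf_{\pi} \mathbb{E}_x^{\pi}\!\left[\sum_{t=0}^{\infty} \alpha^{t} c(x_t, a_t)\right]
```

The abstract's limit result concerns conditions under which V*_α(x) → V*(x) as α ↑ 1.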




Cite this article

Hernández-Lerma, O., Carrasco, G. & Pérez-Hernández, R. Markov Control Processes with the Expected Total Cost Criterion: Optimality, Stability, and Transient Models. Acta Applicandae Mathematicae 59, 229–269 (1999). https://doi.org/10.1023/A:1006368714127
