Abstract
This paper studies the expected total cost (ETC) criterion for discrete-time Markov control processes on Borel spaces with possibly unbounded cost-per-stage functions. It presents optimality results, including conditions for a control policy to be ETC-optimal and for the ETC value function to be a solution of the dynamic programming equation. Conditions are also given for the ETC value function to be the limit of the α-discounted cost value function as α ↑ 1, and for the Markov control process to be "stable" in the sense of Lagrange and almost surely. In addition, transient control models are fully analyzed. The paper thus provides a fairly complete, updated, survey-like presentation of the ETC criterion for Markov control processes on Borel spaces.
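The vanishing-discount relation mentioned above, in which the ETC value function arises as the limit of α-discounted value functions as α ↑ 1, can be illustrated numerically on a toy transient model. The sketch below is not from the paper: the two-state model, the cost vector `c`, and the absorption probabilities `p` are assumptions chosen so that the ETC value is finite and computable in closed form, namely min over actions of c(a)/p(a), the expected cost accumulated until absorption.

```python
import numpy as np

# Hypothetical two-state transient control model (illustration only):
# state 0 is controlled; state 1 is absorbing and cost-free.
# Action a incurs cost c[a] and moves to the absorbing state
# with probability p[a], otherwise the process stays in state 0.
c = np.array([1.0, 3.0])   # cost per stage for actions 0 and 1
p = np.array([0.3, 0.8])   # absorption probability for actions 0 and 1

def discounted_value(alpha, iters=10_000):
    """alpha-discounted optimal value at state 0, by value iteration
    on the dynamic programming equation v = min_a [c(a) + alpha*(1-p(a))*v]."""
    v = 0.0
    for _ in range(iters):
        v = float(np.min(c + alpha * (1.0 - p) * v))
    return v

# ETC value at state 0: expected total cost until absorption.
etc_value = float(np.min(c / p))

for alpha in (0.9, 0.99, 0.999):
    print(f"alpha = {alpha}: V_alpha(0) = {discounted_value(alpha):.4f}")
print(f"ETC value V(0) = {etc_value:.4f}")
```

Since the stage costs are nonnegative, V_α(0) increases toward the ETC value as α ↑ 1, which the printed values exhibit.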
Hernández-Lerma, O., Carrasco, G. & Pérez-Hernández, R. Markov Control Processes with the Expected Total Cost Criterion: Optimality, Stability, and Transient Models. Acta Applicandae Mathematicae 59, 229–269 (1999). https://doi.org/10.1023/A:1006368714127