Deterministic Discounted Markov Decision Processes with Fuzzy Rewards/Costs

Hugo Cruz-Suárez; Raúl Montes-de-Oca; R. Israel Ortega-Gutiérrez

doi:10.26599/FIE.2023.9270020

Fuzzy Information and Engineering 2023, 15(3): 274-290 https://doi.org/10.26599/FIE.2023.9270020

Open Access | Published: 01 September 2023

Hugo Cruz-Suárez^¹, Raúl Montes-de-Oca^², R. Israel Ortega-Gutiérrez^¹

^¹ Facultad de Ciencias Físico Matemáticas, Benemérita Universidad Autónoma de Puebla, Puebla 72570, México

^² Departamento de Matemáticas, Universidad Autónoma Metropolitana-Iztapalapa, CDMX 09340, México

Keywords: deterministic Markov decision process, discounted criterion, fuzzy reward, fuzzy cost, trapezoidal fuzzy number

Cite this article:

Cruz-Suárez H, Montes-de-Oca R, Ortega-Gutiérrez RI. Deterministic Discounted Markov Decision Processes with Fuzzy Rewards/Costs. Fuzzy Information and Engineering, 2023, 15(3): 274-290. https://doi.org/10.26599/FIE.2023.9270020


Abstract

This article studies infinite-horizon deterministic Markov decision processes (MDPs) in a fuzzy environment, modeled by allowing either fuzzy rewards or fuzzy costs. These rewards and costs are assumed to be of a suitable trapezoidal type. For both classes of MDPs (those with fuzzy rewards and those with fuzzy costs), the objective function is the fuzzy total discounted function, and the corresponding optimal decision problems are posed with respect to the max order on fuzzy numbers. For each optimal decision problem, the optimal policy and the optimal value function are related to, and obtained from, the solution of a suitable standard MDP, that is, an MDP with a non-fuzzy reward function or a non-fuzzy cost function. To illustrate the theory, three examples are given: an economic growth model (EGM), a deterministic version of the linear-quadratic model (LQM), and an optimal consumption model (OCM). All three models have uncountable state spaces. The non-fuzzy versions of the EGM and the OCM have unbounded reward functions, and the non-fuzzy version of the LQM has an unbounded cost function.
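The abstract names three ingredients without defining them here: trapezoidal fuzzy numbers, the max order, and a fuzzy total discounted function. The following Python sketch illustrates them under stated assumptions and is not the authors' construction. It assumes the common four-parameter representation (a, b, c, d) of a trapezoidal fuzzy number, takes the max order to mean that both endpoints of every α-cut are ordered, and uses the hypothetical names `Trapezoid`, `max_order_leq`, and `discounted_total`, which do not come from the article. The "suitable trapezoidal type" assumed in the paper may be more restrictive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number (a, b, c, d), assumed to satisfy a <= b <= c <= d."""
    a: float
    b: float
    c: float
    d: float

    def alpha_cut(self, alpha):
        """Interval of points whose membership is at least alpha (0 <= alpha <= 1)."""
        return (self.a + alpha * (self.b - self.a),
                self.d - alpha * (self.d - self.c))

    def scale(self, k):
        """Multiply by a non-negative scalar k (assumption: k >= 0)."""
        return Trapezoid(k * self.a, k * self.b, k * self.c, k * self.d)

    def __add__(self, other):
        """Parameter-wise sum of two trapezoidal fuzzy numbers."""
        return Trapezoid(self.a + other.a, self.b + other.b,
                         self.c + other.c, self.d + other.d)


def max_order_leq(x, y):
    """x <= y when both alpha-cut endpoints of x lie below those of y for all alpha.

    The endpoints are linear in alpha, so checking alpha = 0 and alpha = 1
    suffices, which amounts to comparing the four parameters.
    """
    return x.a <= y.a and x.b <= y.b and x.c <= y.c and x.d <= y.d


def discounted_total(rewards, beta):
    """Finite-horizon truncation of a discounted sum of fuzzy rewards."""
    total = Trapezoid(0.0, 0.0, 0.0, 0.0)
    for t, r in enumerate(rewards):
        total = total + r.scale(beta ** t)
    return total


r = Trapezoid(1, 2, 3, 4)
print(discounted_total([r, r, r], beta=0.5))
print(max_order_leq(r, Trapezoid(2, 3, 4, 5)))   # comparable pair
print(max_order_leq(Trapezoid(0, 2, 3, 6), r))   # not comparable in this direction
```

Under these assumptions the max order is only a partial order: the last pair above is not ordered in either direction. The discounted sum is linear in each of the four parameters, which gives some intuition for why a fuzzy problem of this kind can be tied to a non-fuzzy one.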



About this article

Publication history

Received: 05 April 2023

Revised: 27 May 2023

Accepted: 24 June 2023

Published: 01 September 2023

Issue date: September 2023

Rights and permissions

This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).