This article studies infinite-horizon deterministic Markov decision processes (MDPs) in a fuzzy environment, which is modeled by considering MDPs with fuzzy rewards and MDPs with fuzzy costs. The rewards and costs are assumed to be of a suitable trapezoidal type. For both classes of MDPs, the objective function is the fuzzy total discounted function, and the corresponding optimal decision problems are posed with respect to the max order on fuzzy numbers. For each optimal decision problem, the optimal policy and the optimal value function are obtained from the solution of a suitable standard MDP, that is, an MDP with a non-fuzzy reward function or a non-fuzzy cost function. The theory is illustrated with an economic growth model (EGM), a deterministic version of the linear-quadratic model (LQM), and an optimal consumption model (OCM). All three models have uncountable state spaces; the non-fuzzy versions of the EGM and the OCM have unbounded reward functions, and the non-fuzzy version of the LQM has an unbounded cost function.
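As a minimal sketch of the objects named in the abstract, the Python below represents a trapezoidal fuzzy number by its four parameters, accumulates a discounted sum of fuzzy rewards, and compares two fuzzy numbers. It assumes the common characterization that, for trapezoidal numbers, the max order reduces to componentwise comparison of the four parameters. The class `Trapezoid`, its methods, and `discounted_total` are hypothetical names introduced for illustration, and the finite-horizon sum only approximates the infinite-horizon objective; none of this is taken from the article's own construction.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number (a, b, c, d) with a <= b <= c <= d.

    Illustrative type only; the name and layout are assumptions.
    """
    a: float
    b: float
    c: float
    d: float

    def __post_init__(self) -> None:
        if not (self.a <= self.b <= self.c <= self.d):
            raise ValueError("require a <= b <= c <= d")

    def __add__(self, other: "Trapezoid") -> "Trapezoid":
        # Sum of trapezoidal numbers: add the parameters componentwise.
        return Trapezoid(self.a + other.a, self.b + other.b,
                         self.c + other.c, self.d + other.d)

    def scale(self, k: float) -> "Trapezoid":
        # Multiplication by a nonnegative scalar keeps the parameter order.
        if k < 0:
            raise ValueError("only nonnegative scalars are handled here")
        return Trapezoid(k * self.a, k * self.b, k * self.c, k * self.d)

    def max_leq(self, other: "Trapezoid") -> bool:
        # Assumed form of the max order for trapezoidal numbers:
        # componentwise comparison. This is a partial order, so two
        # numbers can be incomparable.
        return (self.a <= other.a and self.b <= other.b
                and self.c <= other.c and self.d <= other.d)


def discounted_total(rewards: Iterable[Trapezoid], beta: float) -> Trapezoid:
    """Finite-horizon discounted sum of fuzzy rewards, sum_t beta^t * r_t."""
    total = Trapezoid(0.0, 0.0, 0.0, 0.0)
    for t, reward in enumerate(rewards):
        total = total + reward.scale(beta ** t)
    return total
```

Because the comparison is componentwise, two policies can produce fuzzy totals that are incomparable, for example `Trapezoid(0, 2, 3, 6)` and `Trapezoid(1, 2, 3, 4)`. This is one reason a reduction to a standard, non-fuzzy MDP is useful.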
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).