We explore caching both at the network edge and within User Equipment (UE) to alleviate the traffic load of wireless networks. We develop a joint cache placement and delivery policy that maximizes the Quality of Service (QoS) while simultaneously minimizing backhaul load and UE power consumption, in the presence of an unknown, time-variant file popularity. Because file requests in a time slot are affected by download success in the previous slot, the caching system becomes a non-stationary Partially Observable Markov Decision Process (POMDP). We solve the problem in a deep reinforcement learning framework based on the Advantage Actor-Critic (A2C) algorithm, comparing Feed-Forward Neural Networks (FFNN) with a Long Short-Term Memory (LSTM) approach specifically designed to exploit the correlation of the file popularity distribution across time slots. Simulation results show that LSTM-based A2C outperforms FFNN-based A2C in terms of sample efficiency and optimality, demonstrating superior performance for the non-stationary POMDP problem. For caching at the UEs, we provide a distributed algorithm that reaches the objectives dictated by the agent controlling the network, with minimum energy consumption at the UEs and minimum communication overhead.