Open Access

Data Temperature Informed Streaming for Optimising Large-Scale Multi-Tiered Storage

Department of Computing, University of Derby, Derby, DE22 1GB, UK
Department of Informatics, University of Leicester, Leicester, LE1 7RH, UK
Department of Computer Science, COMSATS University Islamabad, Islamabad 45550, Pakistan
Edinburgh Napier University, Edinburgh, EH11 4BN, UK

Abstract

Data temperature is a response to the ever-growing amount of data. These data must be stored, but it has been observed that only a small portion of them is accessed frequently at any one time. This leads to the concept of hot and cold data. Cold data can be migrated away from high-performance nodes to free up performance for higher-priority data. Existing studies classify hot and cold data primarily on the basis of data age and usage frequency. We present this as a limitation in the current implementation of data temperature: age automatically assumes that all new data have priority, and usage is purely reactive. We propose new variables and conditions that enable smarter decisions about which data are hot or cold and allow greater user control over data location and movement. We identify new metadata variables and user-defined variables to extend the current data temperature value. We further establish rules and conditions for limiting unnecessary data movement, which helps prevent wasted input/output (I/O) costs. We also propose a hybrid algorithm that combines existing variables with the new variables and conditions into a single data temperature. The proposed system provides higher accuracy, increases performance, and gives greater user control for optimal positioning of data within multi-tiered storage solutions.
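
To make the idea of a single hybrid temperature concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm from the paper: the abstract does not give the variables, weights, or thresholds, so every name here (`FileStats`, `user_priority`, `pinned_tier`, the weights 0.2/0.4/0.4, the threshold and margin) is a hypothetical assumption chosen only to show the shape of the approach. It combines age, usage frequency, and a user-defined priority into one score, and uses a margin around the threshold so that small fluctuations do not trigger a migration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileStats:
    """Hypothetical per-object metadata; field names are assumptions."""
    age_days: float                    # time since the data were created
    accesses_last_week: int            # recent usage frequency
    user_priority: float               # user-defined weight in [0, 1]
    pinned_tier: Optional[str] = None  # optional user override of placement


def temperature(stats: FileStats,
                max_age_days: float = 365.0,
                max_accesses: int = 100) -> float:
    """Combine age, usage, and user priority into one score in [0, 1].

    The weights below are illustrative assumptions, not values from the paper.
    """
    recency = 1.0 - min(stats.age_days, max_age_days) / max_age_days
    usage = min(stats.accesses_last_week, max_accesses) / max_accesses
    return 0.2 * recency + 0.4 * usage + 0.4 * stats.user_priority


def target_tier(stats: FileStats, current_tier: str,
                hot_threshold: float = 0.6, margin: float = 0.1) -> str:
    """Decide placement, moving data only when the score clears a margin."""
    if stats.pinned_tier is not None:
        return stats.pinned_tier  # user control takes precedence
    score = temperature(stats)
    if current_tier == "cold" and score >= hot_threshold + margin:
        return "hot"
    if current_tier == "hot" and score <= hot_threshold - margin:
        return "cold"
    return current_tier  # inside the margin: avoid an unnecessary move


# New but unused data are not automatically treated as hot.
fresh_unused = FileStats(age_days=0, accesses_last_week=0, user_priority=0.0)
# Old data that are heavily used and flagged by the user can be promoted.
old_busy = FileStats(age_days=400, accesses_last_week=100, user_priority=1.0)
# A borderline score stays where it is, on either tier.
borderline = FileStats(age_days=400, accesses_last_week=50, user_priority=1.0)
```

In this sketch the margin plays the role of the movement-limiting rules mentioned above: an object whose score hovers near the threshold is left in place, so no I/O is spent moving it back and forth.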

Big Data Mining and Analytics
Pages 371-398

Cite this article:
Davies-Tagg D, Anjum A, Zahir A, et al. Data Temperature Informed Streaming for Optimising Large-Scale Multi-Tiered Storage. Big Data Mining and Analytics, 2024, 7(2): 371-398. https://doi.org/10.26599/BDMA.2023.9020039

Views: 881; Downloads: 85; Crossref: 2; Web of Science: 2; Scopus: 2; CSCD: 0

Received: 24 April 2023
Revised: 29 July 2023
Accepted: 07 December 2023
Published: 22 April 2024
© The author(s) 2023.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).