Data Temperature Informed Streaming for Optimising Large-Scale Multi-Tiered Storage

Dominic Davies-Tagg; Ashiq Anjum; Ali Zahir; Lu Liu; Muhammad Usman Yaseen; Nick Antonopoulos

doi:10.26599/BDMA.2023.9020039


Open Access


Dominic Davies-Tagg¹, Ashiq Anjum², Ali Zahir², Lu Liu², Muhammad Usman Yaseen³, Nick Antonopoulos⁴

¹ Department of Computing, University of Derby, Derby, DE22 1GB, UK

² Department of Informatics, University of Leicester, Leicester, LE1 7RH, UK

³ Department of Computer Science, COMSATS University Islamabad, Islamabad 45550, Pakistan

⁴ Edinburgh Napier University, Edinburgh, EH11 4BN, UK


Abstract

Data temperature is a response to the ever-growing amount of data. These data must be stored, but it has been observed that only a small portion of the data is accessed frequently at any one time. This leads to the concept of hot and cold data. Cold data can be migrated away from high-performance nodes to free up performance for higher-priority data. Existing studies classify hot and cold data primarily on the basis of data age and usage frequency. We present this as a limitation of current implementations of data temperature: classifying by age assumes that all new data have priority, and classifying by usage is purely reactive. We propose new variables and conditions that enable smarter decisions about which data are hot or cold and allow greater user control over data location and movement. We identify new metadata variables and user-defined variables to extend the current data temperature value. We further establish rules and conditions for limiting unnecessary movement of data, which helps to prevent wasted input/output (I/O) costs. We also propose a hybrid algorithm that combines the existing variables with the new variables and conditions into a single data temperature. The proposed system provides higher accuracy, increases performance, and gives greater user control for optimal positioning of data within multi-tiered storage solutions.
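
To make the idea of a single combined temperature concrete, the following is a minimal sketch, not the algorithm from the paper. The abstract does not give the formula, so the weighted sum, the weights, the scales, the thresholds, and every name here (`DataItem`, `temperature`, `next_tier`, `user_priority`) are assumptions chosen only for illustration. It combines age, usage frequency, and one user-defined variable into a score, and uses two separate thresholds so that data are not moved back and forth on small score changes.

```python
from dataclasses import dataclass

# All names, weights, scales, and thresholds are illustrative assumptions,
# not values taken from the paper.


@dataclass
class DataItem:
    age_days: float          # time since the data were created
    accesses_per_day: float  # recent usage frequency
    user_priority: float     # hypothetical user-defined variable in [0, 1]
    tier: str = "cold"       # current tier: "hot" or "cold"


def temperature(item, w_age=0.2, w_usage=0.5, w_user=0.3,
                age_scale=30.0, usage_scale=10.0):
    """Combine age, usage, and user priority into one score in [0, 1]."""
    recency = 1.0 / (1.0 + item.age_days / age_scale)      # newer -> closer to 1
    usage = min(item.accesses_per_day / usage_scale, 1.0)  # capped at 1
    return w_age * recency + w_usage * usage + w_user * item.user_priority


def next_tier(item, promote_at=0.6, demote_at=0.4):
    """Move data only when the score clearly crosses a threshold."""
    score = temperature(item)
    if item.tier == "cold" and score >= promote_at:
        return "hot"
    if item.tier == "hot" and score <= demote_at:
        return "cold"
    return item.tier  # inside the band: stay put and avoid the I/O cost
```

Under these assumed weights, a brand-new item that is never accessed and has no user priority stays cold, which reflects the abstract's point that age alone should not imply priority. An item whose score falls between the two thresholds keeps its current tier, which is one simple way to limit unnecessary movement.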

Keywords

data temperature; hot and cold data; multi-tiered storage; metadata variable; multi-temperature system


Big Data Mining and Analytics

Volume 7, Issue 2, June 2024

Pages 371-398

DOI: 10.26599/BDMA.2023.9020039


Cite this article:

Davies-Tagg D, Anjum A, Zahir A, et al. Data Temperature Informed Streaming for Optimising Large-Scale Multi-Tiered Storage. Big Data Mining and Analytics, 2024, 7(2): 371-398. https://doi.org/10.26599/BDMA.2023.9020039


Received: 24 April 2023

Revised: 29 July 2023

Accepted: 07 December 2023

Published: 22 April 2024

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).