Optimal Caching for Low Latency in Distributed Coded Storage Systems

12/05/2020
by Kaiyang Liu, et al.

Erasure codes are widely regarded as a promising way to improve data reliability at low storage cost. In modern geo-distributed storage systems, however, erasure codes can incur high data access latency because each read must retrieve data from multiple remote storage nodes. This hinders the wider adoption of erasure codes in data-intensive applications. This paper proposes novel caching schemes to achieve low latency in distributed coded storage systems. Experiments on Amazon Simple Storage Service confirm a positive correlation between latency and the physical distance over which data is retrieved. The average data access latency is used as the performance metric to quantify the benefits of caching. Assuming that future data popularity and network latency information are available, an offline caching scheme is proposed to find the optimal caching solution. Guided by this optimal scheme, an online caching scheme is proposed that uses data popularity and network latency information measured in real time. Experimental results demonstrate that the online scheme closely approximates the optimal scheme at dramatically reduced computational complexity.
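As a rough illustration of the performance metric, the sketch below computes a popularity-weighted average access latency under a simplified model that is an assumption here, not the paper's formulation: an item is recoverable from any k coded chunks, chunks are fetched in parallel from the fastest nodes, and cached chunks are served locally at negligible cost. The names `access_latency`, `average_latency`, and `cache_latency_ms` are hypothetical.

```python
from typing import Sequence


def access_latency(node_latencies_ms: Sequence[float], k: int,
                   cached_chunks: int, cache_latency_ms: float = 0.0) -> float:
    """Latency to gather k chunks when `cached_chunks` of them are local.

    Assumed model: remote chunks are fetched in parallel from the fastest
    nodes, so the read completes when the slowest needed chunk arrives.
    """
    if not 0 <= cached_chunks <= k:
        raise ValueError("cached_chunks must be between 0 and k")
    remote_needed = k - cached_chunks
    if remote_needed == 0:
        return cache_latency_ms
    if remote_needed > len(node_latencies_ms):
        raise ValueError("not enough remote nodes to fetch the missing chunks")
    fastest = sorted(node_latencies_ms)[:remote_needed]
    return max(cache_latency_ms, fastest[-1])


def average_latency(popularity: Sequence[float],
                    latencies_per_item: Sequence[Sequence[float]],
                    k: int, cached: Sequence[int]) -> float:
    """Popularity-weighted mean of the per-item access latencies."""
    total = sum(popularity)
    weighted = sum(
        p * access_latency(lat, k, c)
        for p, lat, c in zip(popularity, latencies_per_item, cached)
    )
    return weighted / total


if __name__ == "__main__":
    # Illustrative latencies (ms) to six remote nodes; values are made up.
    nodes = [20, 50, 80, 120, 200, 300]
    for c in range(5):
        print(c, access_latency(nodes, k=4, cached_chunks=c))
```

Under this model, each additional cached chunk removes the slowest remaining remote fetch, which is why caching decisions depend on both data popularity and per-node network latency.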


