Simulating Data Access Profiles of Computational Jobs in Data Grids

02/26/2019
by Volodimir Begy, et al.

The data access patterns of applications running in computing grids are changing due to the recent proliferation of high-speed local and wide area networks. Data-intensive jobs are no longer strictly required to run at the computing sites where their input data are located. Instead, jobs may access the data using arbitrary combinations of data placement, stage-in and remote data access. These data access profiles exhibit partially non-overlapping throughput bottlenecks, a fact that can be exploited to minimize the time jobs spend waiting for input data. In this work we present a novel grid computing simulator that places heavy emphasis on the various data access profiles. The fundamental assumptions underlying our simulator are justified by empirical experiments performed in the Worldwide LHC Computing Grid (WLCG) at CERN. We demonstrate how to calibrate the simulator parameters against the true system using posterior inference with likelihood-free Markov Chain Monte Carlo. Thereafter, we validate the simulator's output against an authentic production workload from WLCG, demonstrating its remarkable accuracy.
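To make the calibration idea concrete, here is a minimal sketch of likelihood-free (ABC) MCMC on a toy problem. It is not the authors' simulator or calibration code. The one-parameter transfer-time model, the function names (`simulate_transfer_times`, `abc_mcmc`), the uniform prior, the summary statistic (mean transfer time) and all numeric settings are assumptions chosen for illustration only.

```python
import random
import statistics


def simulate_transfer_times(bandwidth, file_sizes, rng):
    """Hypothetical toy simulator (an assumption, not the paper's model):
    transfer time = size / bandwidth, perturbed by log-normal noise."""
    return [size / bandwidth * rng.lognormvariate(0.0, 0.1) for size in file_sizes]


def abc_mcmc(observed, file_sizes, n_steps=3000, epsilon=1.0, step=2.0,
             prior=(1.0, 100.0), seed=0):
    """ABC-MCMC with a uniform prior and a symmetric Gaussian proposal.

    A proposal is accepted when it lies inside the prior support and the
    simulated summary statistic falls within `epsilon` of the observed one,
    so the likelihood is never evaluated.
    """
    rng = random.Random(seed)
    obs_summary = statistics.mean(observed)

    def close_enough(theta):
        sim = simulate_transfer_times(theta, file_sizes, rng)
        return abs(statistics.mean(sim) - obs_summary) < epsilon

    # Initialise by rejection sampling from the prior so that the chain
    # starts in a region the acceptance rule can actually reach.
    theta = rng.uniform(*prior)
    while not close_enough(theta):
        theta = rng.uniform(*prior)

    chain = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        # With a uniform prior and a symmetric proposal, the Metropolis
        # ratio reduces to a prior-support check.
        if prior[0] <= proposal <= prior[1] and close_enough(proposal):
            theta = proposal
        chain.append(theta)
    return chain


if __name__ == "__main__":
    data_rng = random.Random(42)
    sizes = [data_rng.uniform(100.0, 1000.0) for _ in range(50)]  # toy file sizes
    observed = simulate_transfer_times(20.0, sizes, data_rng)     # "true" bandwidth 20
    chain = abc_mcmc(observed, sizes)
    print("posterior mean bandwidth:", statistics.mean(chain[500:]))
```

In this sketch the tolerance `epsilon` trades accuracy against acceptance rate: a smaller value gives a closer approximation to the posterior but makes the chain move less often.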


