Scalable Infrastructure for Workload Characterization of Cluster Traces

by Thomas van Loo, et al.

In recent years, workload characterization has been pursued as a way to gain a foothold in the emerging serverless cloud market, especially for large production cloud clusters such as those of Google and AWS. Analyzing and characterizing real workloads from a large production cloud cluster benefits cloud providers, researchers, and everyday users, but analyzing the workload traces of these clusters has been an arduous task because the data is heterogeneous. This article proposes a scalable infrastructure based on Google's Dataproc for analyzing the workload traces of cloud environments. We evaluate the proposed infrastructure using the Google cluster-usage-traces-v3 workload traces. We characterize the workload in this dataset, focusing on the heterogeneity of the workload, the variation in job durations, aspects of resource consumption, and the overall availability of resources provided by the cluster. The findings reported in this article will benefit cloud infrastructure providers and users in managing cloud computing resources, especially on serverless platforms.




A Deep Dive into the Google Cluster Workload Traces: Analyzing the Application Failure Characteristics and User Behaviors

Large-scale cloud data centers have gained popularity due to their high ...

On the Potential of Execution Traces for Batch Processing Workload Optimization in Public Clouds

With the growing amount of data, data processing workloads and the manag...

Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider

Function as a Service (FaaS) has been gaining popularity as a way to dep...

Cloud Workload Prediction based on Workflow Execution Time Discrepancies

Infrastructure as a service clouds hide the complexity of maintaining th...

Miners in the Cloud: Measuring and Analyzing Cryptocurrency Mining in Public Clouds

Cryptocurrencies, arguably the most prominent application of blockchains...

Mystique: Accurate and Scalable Production AI Benchmarks Generation

Building and maintaining large AI fleets to efficiently support the fast...

Implementing a scalable and elastic computing environment based on Cloud Containers

In this article we look at the potential of cloud containers and we prov...
