SCHeMa: Scheduling Scientific Containers on a Cluster of Heterogeneous Machines

03/24/2021
by Thanasis Vergoulis, et al.

In the era of data-driven science, conducting computational experiments that involve analysing large datasets on heterogeneous computational clusters is part of the everyday routine for many scientists. Moreover, to ensure the credibility of their results, these analyses must be easily reproducible by other researchers. Although various technologies that could facilitate the work of scientists in this direction have been introduced in recent years, there is still a lack of open-source platforms that combine them to this end. In this work, we describe and demonstrate SCHeMa, an open-source platform that facilitates the execution and reproducibility of computational analyses on heterogeneous clusters, leveraging containerization, experiment packaging, workflow management, and machine learning technologies.
