Harnessing the Power of Many: Extensible Toolkit for Scalable Ensemble Applications

10/23/2017
by Vivek Balasubramanian, et al.

Many scientific problems require multiple distinct computational tasks to be executed in order to achieve a desired solution. We introduce the Ensemble Toolkit (EnTK) to address the challenges of scale, diversity and reliability that these problems pose. We describe the design and implementation of EnTK, characterize its performance, and integrate it with two exemplar use cases: seismic inversion and adaptive analog ensembles. We perform nine experiments characterizing EnTK overheads, strong and weak scalability, and the performance of the two use case implementations, at scale and on production infrastructures. We show how EnTK meets the following general requirements: (i) dedicated abstractions to support the description and execution of ensemble applications; (ii) execution on heterogeneous computing infrastructures; (iii) efficient scalability up to O(10^4) tasks; and (iv) fault tolerance. We discuss novel computational capabilities that EnTK enables and the scientific advantages arising from them. We propose EnTK as an important and unique addition to the suite of tools in support of production scientific computing.

