Prototyping a ROOT-based distributed analysis workflow for HL-LHC: the CMS use case

07/24/2023
by Tommaso Tedeschi, et al.

The storage and computing challenges expected in the next era of the Large Hadron Collider (LHC) give the LHC experiments a strong motivation to rethink their computing models at many levels. Considerable effort has gone into optimizing the use of computing resources for data analysis, which leads both to lower hardware requirements and to faster turnaround for physics analyses. In this scenario, the Compact Muon Solenoid (CMS) collaboration is involved in several activities aimed at benchmarking different solutions for running High Energy Physics (HEP) analysis workflows. A promising direction is to evolve the software towards more user-friendly approaches featuring a declarative programming model and interactive workflows. The computing infrastructure should keep up with this trend by offering modern interfaces on the one hand and hiding the complexity of the underlying environment on the other, while efficiently leveraging the already deployed grid infrastructure and scaling toward opportunistic resources such as public clouds or HPC centers. This article presents the first example of using the ROOT RDataFrame technology to exploit such next-generation approaches for a production-grade CMS physics analysis. A new analysis facility is created to offer users a modern interactive web interface based on JupyterLab that can leverage HTCondor-based grid resources at different geographical sites. The physics analysis is converted from a legacy iterative approach to the modern declarative approach offered by RDataFrame and distributed over multiple computing nodes. The new scenario offers not only an overall improved programming experience, but also an order-of-magnitude speedup with respect to the previous approach.
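To make the contrast between the iterative and the declarative style more concrete, here is a minimal pure-Python sketch. It does not use ROOT, and it is not the RDataFrame API or the analysis described in the article: the `LazyFrame` class, the event fields `pt1` and `pt2`, the cut values, and the helper functions are all hypothetical names invented for illustration. The idea it tries to show is that a declarative description only records the steps, so the same description can be applied independently to each partition of the data and the partial results merged afterwards.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Event = Dict[str, float]


@dataclass(frozen=True)
class LazyFrame:
    """Toy stand-in for a declarative dataframe (hypothetical, not ROOT).

    Calling define() or filter() only records a step; nothing is computed
    until a result is requested for a concrete set of events.
    """
    steps: Tuple[Tuple[str, str, Callable[[Event], object]], ...] = ()

    def define(self, name: str, fn: Callable[[Event], float]) -> "LazyFrame":
        return LazyFrame(self.steps + (("define", name, fn),))

    def filter(self, pred: Callable[[Event], bool]) -> "LazyFrame":
        return LazyFrame(self.steps + (("filter", "", pred),))

    def _run(self, events: Iterable[Event]) -> Iterator[Event]:
        for ev in events:
            row = dict(ev)  # copy so the input events are never modified
            keep = True
            for kind, name, fn in self.steps:
                if kind == "define":
                    row[name] = fn(row)
                elif not fn(row):
                    keep = False
                    break
            if keep:
                yield row

    def count(self, events: Iterable[Event]) -> int:
        """Number of events that pass all recorded filters."""
        return sum(1 for _ in self._run(events))


def legacy_count(events: List[Event]) -> int:
    """Iterative style: one explicit event loop mixing logic and bookkeeping."""
    n = 0
    for ev in events:
        pt_sum = ev["pt1"] + ev["pt2"]
        if ev["pt1"] > 20.0 and pt_sum > 50.0:
            n += 1
    return n


def partitioned_count(frame: LazyFrame, events: List[Event], n_partitions: int) -> int:
    """Apply the same recorded steps to each partition, then merge the partial counts.

    Here the partitions are processed one after another in the same process;
    in a distributed setup each partition would be handled by a different worker.
    """
    size = max(1, -(-len(events) // n_partitions))  # ceiling division
    partitions = [events[i:i + size] for i in range(0, len(events), size)]
    return sum(frame.count(part) for part in partitions)


# Synthetic events, made up for this sketch only.
events = [
    {"pt1": float(10 + (3 * i) % 40), "pt2": float(5 + (7 * i) % 45)}
    for i in range(1000)
]

# Declarative description of the same selection as legacy_count.
frame = (
    LazyFrame()
    .filter(lambda r: r["pt1"] > 20.0)
    .define("pt_sum", lambda r: r["pt1"] + r["pt2"])
    .filter(lambda r: r["pt_sum"] > 50.0)
)

print(partitioned_count(frame, events, 4) == legacy_count(events))
```

Because the description is separate from the loop that executes it, changing the number of partitions does not require touching the selection itself, which is the property that makes this style convenient to distribute.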

research
01/27/2022

A distributed computing infrastructure for LOFAR Italian community

The LOw-Frequency ARray is a low-frequency radio interferometer composed...
research
10/10/2020

AstroDS – A Distributed Storage for Astrophysics of Cosmic Rays. Current Status

Currently, the processing of scientific data in astroparticle physics is...
research
01/22/2019

Using Big Data Technologies for HEP Analysis

The HEP community is approaching an era where the excellent performances ...
research
07/15/2021

Exploring Object Stores for High-Energy Physics Data Storage

Over the last two decades, ROOT TTree has been used for storing over one...
research
06/02/2022

A Serverless Engine for High Energy Physics Distributed Analysis

The Large Hadron Collider (LHC) at CERN has generated in the last decade...
research
06/23/2021

Design and engineering of a simplified workflow execution for the MG5aMC event generator on GPUs and vector CPUs

Physics event generators are essential components of the data analysis s...
research
03/09/2023

Dedicated Analysis Facility for HEP Experiments

High-energy physics (HEP) provides an ever-growing amount of data. To analy...