hep_tables: Heterogeneous Array Programming for HEP

03/22/2021
by Gordon Watts, et al.

Array operations are one of the most concise ways of expressing the common filtering and simple aggregation operations that are the hallmark of the first step of a particle physics analysis: selection, filtering, basic vector operations, and filling histograms. The High Luminosity run of the Large Hadron Collider (HL-LHC), scheduled to start in 2026, will require physicists to regularly skim datasets that are over a PB in size, and to repeatedly run over datasets that are hundreds of TB in size, too big to fit in memory. Declarative programming techniques separate the intent of the physicist from the mechanics of finding the data, processing it, and using distributed computing to do so efficiently, all of which are required to extract the desired plot or data in a timely fashion. This paper describes a prototype library that provides a framework for different sub-systems to cooperate in producing this data, using an array-programming declarative interface. The prototype has a ServiceX data-delivery sub-system and an Awkward Array sub-system cooperating to generate the requested data. The ServiceX system runs against ATLAS xAOD data.
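To make the idea of a declarative array interface concrete, here is a minimal pure-Python sketch. It is not the hep_tables API: the `Column` class, the `execute` and `histogram` helpers, the `jet_pt` field name, and the toy event values (assumed to be in MeV) are all hypothetical and exist only for illustration. The point is that the physicist's expression merely records the selection, and a separate backend decides how and where to run it.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Column:
    """Hypothetical deferred column: records operations instead of running them."""
    name: str
    ops: Tuple[Tuple[str, Callable], ...] = ()

    def map(self, fn: Callable) -> "Column":
        # Return a new description with one more recorded step; no data is touched.
        return Column(self.name, self.ops + (("map", fn),))

    def filter(self, pred: Callable) -> "Column":
        return Column(self.name, self.ops + (("filter", pred),))


def execute(col: Column, events: List[dict]) -> List[List[float]]:
    """Toy in-memory backend: replay the recorded steps event by event.

    The per-event (jagged) structure is preserved. A real system could hand
    the same description to a remote or distributed backend instead.
    """
    result = []
    for event in events:
        values = list(event[col.name])
        for kind, fn in col.ops:
            if kind == "map":
                values = [fn(v) for v in values]
            else:  # "filter"
                values = [v for v in values if fn(v)]
        result.append(values)
    return result


def histogram(jagged: List[List[float]], edges: Sequence[float]) -> List[int]:
    """Count flattened values into half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for values in jagged:
        for v in values:
            for i in range(len(counts)):
                if edges[i] <= v < edges[i + 1]:
                    counts[i] += 1
                    break
    return counts


# Toy events with a variable number of jets each (values assumed to be in MeV).
events = [
    {"jet_pt": [52000.0, 31000.0, 12000.0]},
    {"jet_pt": []},
    {"jet_pt": [95000.0, 18000.0]},
]

# Intent only: jet pT converted to GeV, keeping jets above 30 GeV.
good_jet_pt = Column("jet_pt").map(lambda pt: pt / 1000.0).filter(lambda pt: pt > 30.0)

# Mechanics: the backend runs the recorded steps and the result is histogrammed.
selected = execute(good_jet_pt, events)
counts = histogram(selected, [0.0, 50.0, 100.0])
```

In this sketch `good_jet_pt` holds only a description of the work, so the same expression could in principle be split across cooperating backends, which is the kind of cooperation the abstract describes between the data-delivery and array sub-systems.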


