Computing and Compressing Electron Repulsion Integrals on FPGAs

by   Xin Wu, et al.

The computation of electron repulsion integrals (ERIs) over Gaussian-type orbitals (GTOs) is a challenging problem in quantum-mechanics-based atomistic simulations. In practical simulations, several trillions of ERIs may have to be computed for every time step. In this work, we investigate FPGAs as accelerators for the ERI computation. We use template parameters, here within the Intel oneAPI tool flow, to create customized designs for 256 different ERI quartet classes, based on their orbitals. To maximize data reuse, all intermediates are buffered in FPGA on-chip memory with customized layout. The pre-calculation of intermediates also helps to overcome data dependencies caused by multi-dimensional recurrence relations. The involved loop structures are partially or even fully unrolled for high throughput of FPGA kernels. Furthermore, a lossy compression algorithm utilizing arbitrary bitwidth integers is integrated in the FPGA kernels. To our best knowledge, this is the first work on ERI computation on FPGAs that supports more than just the single most basic quartet class. Also, the integration of ERI computation and compression it a novelty that is not even covered by CPU or GPU libraries so far. Our evaluation shows that using 16-bit integer for the ERI compression, the fastest FPGA kernels exceed the performance of 10 GERIS (10 × 10^9 ERIs per second) on one Intel Stratix 10 GX 2800 FPGA, with maximum absolute errors around 10^-7 - 10^-5 Hartree. The measured throughput can be accurately explained by a performance model. The FPGA kernels deployed on 2 FPGAs outperform similar computations using the widely used libint reference on a two-socket server with 40 Xeon Gold 6148 CPU cores of the same process technology by factors up to 6.0x and on a new two-socket server with 128 EPYC 7713 CPU cores by up to 1.9x.


page 1

page 8


A Compilation Flow for the Generation of CNN Inference Accelerators on FPGAs

We present a compilation flow for the generation of CNN inference accele...

Synergistic CPU-FPGA Acceleration of Sparse Linear Algebra

This paper describes REAP, a software-hardware approach that enables hig...

A Survey of FPGA-Based Robotic Computing

Recent researches on robotics have shown significant improvement, spanni...

Experience with PCIe streaming on FPGA for high throughput ML inferencing

Achieving maximum possible rate of inferencing with minimum hardware res...

Theoretical Model of Computation and Algorithms for FPGA-based Hardware Accelerators

While FPGAs have been used extensively as hardware accelerators in indus...

AutoDSE: Enabling Software Programmers Design Efficient FPGA Accelerators

Adopting FPGA as an accelerator in datacenters is becoming mainstream fo...

ANDROMEDA: An FPGA Based RISC-V MPSoC Exploration Framework

With the growing demands of consumer electronic products, the computatio...

Please sign up or login with your details

Forgot password? Click here to reset