ARENA: Asynchronous Reconfigurable Accelerator Ring to Enable Data-Centric Parallel Computing

11/10/2020
by Cheng Tan, et al.

Next-generation HPC systems and data centers are likely to be reconfigurable and data-centric, driven by the trend toward hardware specialization and the emergence of data-driven applications. In this paper, we propose ARENA, an asynchronous reconfigurable accelerator ring architecture, as a potential scenario for what future HPC systems and data centers will look like. Although we use coarse-grained reconfigurable arrays (CGRAs) as the substrate platform, our key contribution is not only the CGRA-cluster design itself but also the combination of a new architecture and programming model that enables asynchronous tasking across a cluster of reconfigurable nodes, so as to bring specialized computation to the data rather than the reverse. We presume distributed data storage without assuming any prior knowledge of the data distribution. Hardware specialization occurs at runtime, when a task finds that the majority of the data it requires is available at the present node. In other words, we dynamically generate specialized CGRA accelerators where the data reside. The asynchronous tasking that brings computation to data is achieved by circulating the task token, which describes the data-flow graphs to be executed for a task, among the CGRA cluster connected by a fast ring network. Evaluations on a set of HPC and data-driven applications across different domains show that ARENA can provide better parallel scalability with reduced data movement (53.9%). Compared with contemporary compute-centric parallel models, ARENA can bring on average a 4.37x speedup. The synthesized CGRAs and their task dispatchers occupy only 2.93 mm^2 of chip area under 45 nm process technology and can run at 800 MHz with an average power consumption of 759.8 mW. ARENA also supports the concurrent execution of multiple applications, offering ideal architectural support for future high-performance parallel computing and data analytics systems.
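The token-circulation idea in the abstract can be illustrated with a small software model. This is only a minimal sketch under stated assumptions, not the paper's implementation: the names `TaskToken`, `Node`, and `circulate`, the representation of data as sets of integer keys, and the "more than half of the required data is local" threshold are all hypothetical choices made to mirror the phrase "the majority of data it requires".

```python
from dataclasses import dataclass, field


@dataclass
class TaskToken:
    """Hypothetical task token: an id plus the data keys the task needs."""
    task_id: int
    required_keys: frozenset
    hops: int = 0  # ring hops taken before the task was accepted


@dataclass
class Node:
    """Hypothetical ring node holding a slice of the distributed data."""
    node_id: int
    local_keys: set
    executed: list = field(default_factory=list)

    def local_fraction(self, token: TaskToken) -> float:
        # Fraction of the task's required data that lives on this node.
        if not token.required_keys:
            return 1.0
        return len(token.required_keys & self.local_keys) / len(token.required_keys)


def circulate(nodes, token, start=0, threshold=0.5):
    """Pass the token around the ring, one lap at most.

    The first node holding more than `threshold` of the required data
    accepts the task (assumed stand-in for configuring a CGRA there).
    Returns the accepting node id, or None if no node qualifies.
    """
    n = len(nodes)
    for step in range(n):
        node = nodes[(start + step) % n]
        if node.local_fraction(token) > threshold:
            node.executed.append(token.task_id)
            token.hops = step
            return node.node_id
    token.hops = n
    return None


# Four nodes on a ring, each owning four consecutive keys.
nodes = [Node(i, set(range(4 * i, 4 * i + 4))) for i in range(4)]
token = TaskToken(task_id=1, required_keys=frozenset({4, 8, 9, 10}))
winner = circulate(nodes, token, start=0)
print(winner, token.hops)
```

In this toy run, node 2 owns three of the four required keys, so the token passes nodes 0 and 1 and the task runs on node 2 after two hops. The model ignores timing, asynchrony, and the data-flow graph carried by the token.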
