ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code

06/22/2023
by   Kazuaki Matsumura, et al.

Automatic code optimization is a complex process that typically involves applying multiple discrete algorithms, each of which modifies the program structure irreversibly. However, the design of these algorithms is often monolithic, and because they do not cooperate, similar analyses must be implemented repeatedly. To address this issue, modern optimization techniques such as equality saturation allow for exhaustive term rewriting at various levels of inputs, thereby simplifying compiler design. In this paper, we propose using equality saturation to optimize the sequential code used in directive-based programming for GPUs. Our approach simultaneously reduces computation and memory accesses while achieving high memory throughput. Our fully automated framework constructs single-assignment forms from its inputs so that they can be rewritten in their entirety while preserving dependencies, and then extracts optimal cases. Through practical benchmarks, we demonstrate a significant performance improvement on several compilers. Furthermore, we highlight the advantages of computational reordering and emphasize the significance of memory-access order for modern GPUs.

research
07/18/2021

Effective GPU Sharing Under Compiler Guidance

Modern computing platforms tend to deploy multiple GPUs (2, 4, or more) ...
research
04/17/2020

GEVO: GPU Code Optimization using Evolutionary Computation

GPUs are a key enabler of the revolution in machine learning and high pe...
research
04/28/2022

Black-Scholes Option Pricing on Intel CPUs and GPUs: Implementation on SYCL and Optimization Techniques

The Black-Scholes option pricing problem is one of the widely used finan...
research
04/24/2019

Exploring Memory Persistency Models for GPUs

Given its high integration density, high speed, byte addressability, and...
research
10/29/2022

Enabling Data Movement and Computation Pipelining in Deep Learning Compiler

Pipelining between data loading and computation is a critical tensor pro...
research
12/01/2022

BaCO: A Fast and Portable Bayesian Compiler Optimization Framework

We introduce the Bayesian Compiler Optimization framework (BaCO), a gene...
