POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning

by Shalini Jain, et al.

The ever-increasing memory requirements of many applications have led to demands that embedded devices may not be able to meet. Constraining memory usage in such cases is of paramount importance. At the same time, code size improvements should not have a negative impact on runtime. Improving execution time while optimizing for code size is a nontrivial but significant task. The ordering of standard optimization sequences in modern compilers is fixed, and these sequences are created heuristically by compiler domain experts based on their expertise. However, this ordering is suboptimal and does not generalize well across all cases. We present a reinforcement learning based solution to the phase ordering problem, where the ordering improves both execution time and code size. We propose two different approaches to model the sequences: one by manual ordering, and the other based on a graph called the Oz Dependence Graph (ODG). Our approach uses minimal data as the training set and is integrated with LLVM. We show results on x86 and AArch64 architectures on benchmarks from SPEC-CPU 2006, SPEC-CPU 2017 and MiBench. We observe that the proposed model based on ODG outperforms the current Oz sequence both in terms of size and execution time by 6.19 2017 benchmarks, on an average.
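
To make the framing concrete, the following is a minimal sketch of phase ordering posed as a sequential decision problem. It is not the authors' implementation. The abstract does not specify the learning algorithm, the reward, or the sub-sequences, so everything here is an assumption for illustration: tabular Q-learning stands in for the learner, the reward is a weighted sum of relative size and time improvements, the grouping of LLVM pass names into sub-sequences is hypothetical, and a toy cost table (`BASE`, `SYNERGY`) replaces actual compilation and measurement.

```python
import random

# Hypothetical sub-sequences (the pass names exist in LLVM; the grouping is illustrative).
SUBSEQUENCES = [
    ["simplifycfg", "instcombine"],
    ["inline", "globaldce"],
    ["loop-rotate", "licm"],
    ["gvn", "dce"],
]

# Toy cost model (assumption): (size_factor, time_factor) for each action.
BASE = [(0.98, 0.99), (0.97, 1.01), (1.01, 0.96), (0.99, 0.98)]
# Order-dependent effects (assumption): (previous action, action) -> factors.
SYNERGY = {(1, 0): (0.95, 0.97), (2, 3): (0.97, 0.93)}


def reward(prev_size, new_size, prev_time, new_time, alpha=0.5, beta=0.5):
    """Weighted sum of relative size and time improvements (weights are assumed)."""
    size_gain = (prev_size - new_size) / prev_size
    time_gain = (prev_time - new_time) / prev_time
    return alpha * size_gain + beta * time_gain


def step(size, exec_time, prev, action):
    """Apply one sub-sequence in the toy model; stands in for compile-and-measure."""
    fs, ft = SYNERGY.get((prev, action), BASE[action])
    return size * fs, exec_time * ft


def train(episodes=500, horizon=3, eps=0.2, lr=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning; the state is the index of the last applied sub-sequence."""
    rng = random.Random(seed)
    n = len(SUBSEQUENCES)
    q = [[0.0] * n for _ in range(n + 1)]  # row n is the start state
    for _ in range(episodes):
        size, exec_time, state = 100.0, 10.0, n
        for _ in range(horizon):
            if rng.random() < eps:
                action = rng.randrange(n)  # explore
            else:
                action = max(range(n), key=lambda i: q[state][i])  # exploit
            new_size, new_time = step(size, exec_time, state, action)
            r = reward(size, new_size, exec_time, new_time)
            q[state][action] += lr * (r + gamma * max(q[action]) - q[state][action])
            size, exec_time, state = new_size, new_time, action
    return q


def greedy_order(q, horizon=3):
    """Read out an ordering by following the highest-valued action from each state."""
    n = len(SUBSEQUENCES)
    state, order = n, []
    for _ in range(horizon):
        action = max(range(n), key=lambda i: q[state][i])
        order.append(SUBSEQUENCES[action])
        state = action
    return order


if __name__ == "__main__":
    for passes in greedy_order(train()):
        print(" ".join(passes))
```

In a real setting, `step` would run the chosen passes on a program and measure binary size and execution time (or an estimate of it), and the state would encode the program itself rather than only the last action taken.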

