Improved Parallel Cache-Oblivious Algorithms for Dynamic Programming and Linear Algebra

09/25/2018
by Yan Gu, et al.

For many cache-oblivious algorithms for dynamic programming and linear algebra, we observe that the key factor affecting the cache complexity is the number of input entries involved in each basic computation cell. In this paper, we propose a level of abstraction that captures this property, which we call the k-d grid computation structure. We then show computational lower bounds for this grid structure, and propose efficient and highly parallel algorithms for computing it. These algorithms optimize the number of arithmetic operations, the parallel depth, and the cache complexity, both in the classic setting, where reads and writes have the same cost, and in the asymmetric variant, where writes are more expensive than reads. Using the abstraction, with the proposed algorithms as its implementation, we obtain cache-oblivious algorithms for many fundamental problems with improved cache complexities in both the classic and asymmetric settings. The cache bounds are optimal in most of the applications we consider. We also reduce the parallel depth of many of these problems. We believe that the novelty of our framework is of interest and leads to many new questions for future work.
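To make the idea of a grid of basic computation cells concrete, the sketch below shows the classic recursive (divide-and-conquer) cache-oblivious matrix multiplication. It is not the algorithm proposed in the paper. It only illustrates, under our own assumptions, how matrix multiplication can be read as a three-dimensional grid in which cell (i, j, k) takes two input entries, `A[i][k]` and `B[k][j]`, and contributes to `C[i][j]`. The function names, the `base` cutoff, and the power-of-two size restriction are all choices made for this illustration.

```python
def _multiply_block(A, B, C, i0, j0, k0, n, base):
    """Accumulate the n x n block product into C.

    The block covers rows i0..i0+n-1, columns j0..j0+n-1, and the
    reduction range k0..k0+n-1. No cache size appears anywhere, which is
    what makes the recursion cache-oblivious.
    """
    if n <= base:
        # Base case: visit every grid cell (i, j, k) of the block directly.
        for i in range(i0, i0 + n):
            for j in range(j0, j0 + n):
                acc = C[i][j]
                for k in range(k0, k0 + n):
                    acc += A[i][k] * B[k][j]  # one cell: two inputs
                C[i][j] = acc
        return
    h = n // 2
    # Split the cube of cells into 8 sub-cubes and recurse on each.
    for di in (0, h):
        for dj in (0, h):
            for dk in (0, h):
                _multiply_block(A, B, C, i0 + di, j0 + dj, k0 + dk, h, base)


def matmul(A, B, base=2):
    """Multiply two n x n matrices (lists of lists); n must be a power of two."""
    n = len(A)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("this sketch assumes n is a power of two")
    C = [[0] * n for _ in range(n)]
    _multiply_block(A, B, C, 0, 0, 0, n, base)
    return C
```

This version runs sequentially. In a parallel setting, the two sub-cubes that share the same `(di, dj)` write to the same block of `C`, so they cannot simply run concurrently without extra care. The sketch does not attempt to reproduce the depth or asymmetric-cost optimizations the abstract refers to.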
