DePa: Simple, Provably Efficient, and Practical Order Maintenance for Task Parallelism
A number of problems in parallel computing require reasoning about the dependency structure in parallel programs. For example, dynamic race detection relies on efficient "on-the-fly" determination of dependencies between sequential and parallel tasks (e.g. to quickly determine whether or not two memory accesses occur in parallel). Several solutions to this "parallel order maintenance" problem has been proposed, but they all have several drawbacks, including lack of provable bounds, high asymptotic or practical overheads, and poor support for parallel execution. In this paper, we present a solution to the parallel order maintenance problem that is provably efficient, fully parallel, and practical. Our algorithm – called DePa – represents a computation as a graph and encodes vertices in the graph with two components: a dag-depth and a fork-path. In this encoding, each query requires O(f/ω) work, where f is the minimum dynamic nesting depth of the two vertices compared, and ω is the word-size. In the common case (where f is small, e.g., less than 100), each query requires only a single memory lookup and a small constant number of bitwise instructions. Furthermore, graph maintenance at forks and joins requires only constant work, resulting in no asymptotic impact on overall work and span. DePa is therefore work-efficient and fully parallel.
READ FULL TEXT