Fusion of parallel array operations

Mads Ruben Burgdorff Kristensen, Simon Andreas Frimann Lund, Troels Blum, James Emil Avery

11 Citations (Scopus)

Abstract

We address the problem of fusing array operations based on criteria such as shape compatibility, data reuse, and minimizing for data reuse, the fusion problem has been formulated as a static weighted graph partitioning problem (known as the Weighted Loop Fusion problem). We show that this scheme cannot accurately track data reuse between multiple independent loops, since it overestimates total data reuse of certain cases. Our formulation in terms of partitions allows use of realistic cost functions that can track resource usage accurately. We give correctness proofs, and prove that WSP can maximize data reuse in programs exactly, in contrast to prior work. For the exact optimal solution, which is NP-hard to find, we present a branch-and-bound algorithm together with a polynomial-time preconditioner that reduces the problem size significantly in practice. We further present a polynomial-time greedy approximation that is fast enough to use for JIT-compilation and gives near-optimal results in practice. All algorithms have been implemented in the automatic parallelization platform Bohrium, run on a set of benchmarks, and compared to existing methods from the literature.

Original languageEnglish
Title of host publication PACT' 16 : Proceedings of the 2016 International Conference on Parallel Architectures and Compilation
Number of pages15
PublisherAssociation for Computing Machinery
Publication date2016
Pages71-85
ISBN (Print)978-1-4503-4121-9
DOIs
Publication statusPublished - 2016
EventThe 25th International Conference on Parallel Architectures and Compilation Techniques - Haifa, Israel
Duration: 11 Sept 201615 Sept 2016
Conference number: 25

Conference

ConferenceThe 25th International Conference on Parallel Architectures and Compilation Techniques
Number25
Country/TerritoryIsrael
CityHaifa
Period11/09/201615/09/2016

Fingerprint

Dive into the research topics of 'Fusion of parallel array operations'. Together they form a unique fingerprint.

Cite this