Transparent GPU Execution of NumPy Applications

Troels Blum, Mads Ruben Burgdorff Kristensen, Brian Vinter

7 Citations (Scopus)

Abstract

In this work, we present a back-end for the Python library NumPy that utilizes the GPU seamlessly. We use dynamic code generation to generate kernels, and data is moved transparently to and from the GPU. For the integration into NumPy, we use the Bohrium runtime system. Bohrium hooks into NumPy through the implicit data parallelization of array operations; this approach requires no annotations or other code modifications. The key motivation for our GPU computation back-end is to transform high-level Python/NumPy applications into low-level GPU-executable kernels, with the goal of obtaining high performance, high productivity, and high portability (HP3). We provide a performance study of the GPU back-end that includes four well-known benchmark applications, Black-Scholes, Successive Over-relaxation, Shallow Water, and N-body, implemented in pure Python/NumPy. We demonstrate an 834 times speedup for the Black-Scholes application, and an average speedup of 124 times across the four benchmarks.
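To illustrate what "implicit data parallelization of array operations" can look like in practice, here is a minimal sketch of a Black-Scholes call-price computation written only with whole-array NumPy operations. It is not the authors' benchmark code: the function names (`cnd`, `black_scholes_call`), the parameter names, and the choice of a polynomial approximation for the normal CDF are assumptions made for illustration. The sketch runs on plain NumPy; how a back-end such as Bohrium would be enabled is not shown in the abstract and is deliberately left out.

```python
import numpy as np


def cnd(x):
    """Approximate the standard normal CDF with array operations only.

    Uses the Abramowitz-Stegun polynomial approximation (an assumption
    made for this sketch, not taken from the paper).
    """
    a1, a2, a3, a4, a5 = (0.31938153, -0.356563782, 1.781477937,
                          -1.821255978, 1.330274429)
    ax = np.abs(x)
    k = 1.0 / (1.0 + 0.2316419 * ax)
    poly = k * (a1 + k * (a2 + k * (a3 + k * (a4 + k * a5))))
    w = 1.0 - np.exp(-ax * ax / 2.0) / np.sqrt(2.0 * np.pi) * poly
    # The approximation holds for x >= 0; mirror it for negative inputs.
    return np.where(x < 0, 1.0 - w, w)


def black_scholes_call(s, k, t, r, v):
    """European call price for spot s, strike k, maturity t (years),
    risk-free rate r, and volatility v. Inputs may be scalars or arrays."""
    d1 = (np.log(s / k) + (r + 0.5 * v * v) * t) / (v * np.sqrt(t))
    d2 = d1 - v * np.sqrt(t)
    return s * cnd(d1) - k * np.exp(-r * t) * cnd(d2)


# Hypothetical usage: price many options at once with no explicit loop.
spots = np.linspace(80.0, 120.0, 5)
prices = black_scholes_call(spots, 100.0, 1.0, 0.05, 0.2)
```

Every step above is an element-wise operation on entire arrays, which is the kind of code a runtime can, in principle, map onto data-parallel kernels without annotations from the programmer.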

Original language: English
Title of host publication: Parallel and Distributed Processing Symposium Workshops PhD Forum (IPDPSW), 2014 IEEE 28th International
Publication date: 27 Nov 2014
Publication status: Published - 27 Nov 2014
