Strategies for regular segmented reductions on GPU

Rasmus Wriedt Larsen, Troels Henriksen


Abstract

We present and evaluate an implementation technique for regular segmented reductions on GPUs. Existing techniques tend to be either consistent in performance but relatively inefficient in absolute terms, or optimised for specific workloads and therefore slow on certain inputs. We propose three different strategies for segmented reduction of regular arrays, each optimised for a particular workload. We demonstrate an implementation in the Futhark compiler that can employ all three strategies and automatically selects the appropriate one at runtime. While our evaluation is in the context of the Futhark compiler, the implementation technique is applicable to any library or language that needs segmented reductions. We evaluate the technique on four microbenchmarks, two of which we also compare to implementations in the CUB library for GPU programming, as well as on two application benchmarks from the Rodinia suite. On the latter, we obtain speedups ranging from 1.3× to 1.7× over a previous implementation based on scans.
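
To make the terminology concrete, the following is a minimal CPU-side sketch in Python of what a regular segmented reduction computes: a flat array is split into equally sized segments, and each segment is reduced independently. It is not the Futhark implementation and does not reproduce the three strategies of the paper. The function names, the `threshold` parameter, and the rule for choosing between the two code paths are assumptions made purely for illustration of runtime selection.

```python
from functools import reduce
from operator import add


def segmented_reduce(op, neutral, flat, segment_size):
    """Reduce each fixed-size segment of `flat` sequentially, left to right."""
    if segment_size <= 0 or len(flat) % segment_size != 0:
        raise ValueError("len(flat) must be a multiple of a positive segment_size")
    return [
        reduce(op, flat[start:start + segment_size], neutral)
        for start in range(0, len(flat), segment_size)
    ]


def tree_reduce(op, neutral, values):
    """Pairwise reduction, the shape a parallel reduction would take.

    Requires `op` to be associative; element order is preserved.
    """
    values = list(values)
    if not values:
        return neutral
    while len(values) > 1:
        paired = [op(values[i], values[i + 1]) for i in range(0, len(values) - 1, 2)]
        if len(values) % 2 == 1:
            paired.append(values[-1])  # carry the unpaired last element forward
        values = paired
    return values[0]


def segmented_reduce_auto(op, neutral, flat, segment_size, threshold=4):
    """Pick a code path at runtime from the segment size.

    `threshold` is a hypothetical tuning knob, not a value from the paper.
    """
    if segment_size < threshold:
        # Small segments: one sequential pass per segment.
        return segmented_reduce(op, neutral, flat, segment_size)
    # Larger segments: tree-shaped reduction within each segment.
    if len(flat) % segment_size != 0:
        raise ValueError("len(flat) must be a multiple of segment_size")
    return [
        tree_reduce(op, neutral, flat[start:start + segment_size])
        for start in range(0, len(flat), segment_size)
    ]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    print(segmented_reduce_auto(add, 0, data, 3))  # two segments of three elements
```

Both code paths return the same result for any associative operator; only the order in which partial results are combined differs, which is the property that lets an implementation choose a strategy freely at runtime.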

Original language: English
Title of host publication: Proceedings of the 6th ACM SIGPLAN International Workshop on Functional High-Performance Computing
Number of pages: 11
Publisher: Association for Computing Machinery
Publication date: 2017
Pages: 42-52
ISBN (Electronic): 978-1-4503-5181-2
Publication status: Published - 2017
Event: 6th ACM SIGPLAN International Workshop on Functional High-Performance Computing - Oxford, United Kingdom
Duration: 7 Sept 2017
Conference number: 6

Keywords

  • Functional programming
  • GPGPU
  • Parallelism
