Financial software on GPUs: between Haskell and Fortran

Cosmin Eugen Oancea, Christian Andreetta, Jost Berthold, Alain Frisch, Fritz Henglein

10 Citations (Scopus)

Abstract

This paper presents a real-world pricing kernel for financial derivatives and evaluates the language and compiler tool chain that would allow expressive, hardware-neutral algorithm implementation and efficient execution on graphics processing units (GPUs). The language issues refer to preserving algorithmic invariants, e.g., inherent parallelism made explicit by map-reduce-scan functional combinators. Efficient execution is achieved by manually applying a series of generally applicable compiler transformations that allow the generated OpenCL code to yield speedups as high as 70x and 540x on a commodity mobile and desktop GPU, respectively. Apart from the concrete speedups attained, our contributions are twofold: First, from a language perspective, we illustrate that even state-of-the-art auto-parallelization techniques are incapable of discovering all the requisite data parallelism when rendering the functional code in Fortran-style imperative array processing form. Second, from a performance perspective, we study which compiler transformations are necessary to map the high-level functional code to hand-optimized OpenCL code for GPU execution. We discover a rich optimization space with nontrivial trade-offs and cost models. Memory reuse in map-reduce patterns, strength reduction, branch divergence optimization, and memory access coalescing exhibit significant impact individually. When combined, they enable essentially full utilization of all GPU cores. Functional programming has played a crucial double role in our case study: capturing the naturally data-parallel structure of the pricing algorithm in a transparent, reusable and entirely hardware-independent fashion; and supporting the correctness of the subsequent compiler transformations to a hardware-oriented target language by a rich class of universally valid equational properties.
Given the observed difficulty of automatically parallelizing imperative sequential code and the inherent labor of porting hardware-oriented and -optimized programs, our case study suggests that functional programming technology can facilitate high-level expression of leading-edge performant portable high-performance systems for massively parallel hardware architectures.
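
As a rough illustration of what "parallelism made explicit by map-reduce-scan combinators" can look like, the following minimal sketch prices a toy European call over a handful of fixed scenarios. It is not the paper's kernel and is written in Python rather than Haskell or OpenCL; the names `path_from_returns`, `call_payoff` and `price`, and the payoff itself, are hypothetical choices made only for this example.

```python
from functools import reduce
from itertools import accumulate
import operator


def path_from_returns(spot, gross_returns):
    """Scan: the running product of gross returns yields the price path."""
    return list(accumulate(gross_returns, operator.mul, initial=spot))


def call_payoff(strike, path):
    """Payoff of a European call on the final price of one path (toy payoff)."""
    return max(path[-1] - strike, 0.0)


def price(spot, strike, discount, scenarios):
    """Map a payoff over independent scenarios, then reduce with + to a mean."""
    payoffs = map(
        lambda returns: call_payoff(strike, path_from_returns(spot, returns)),
        scenarios,
    )
    total = reduce(operator.add, payoffs, 0.0)
    return discount * total / len(scenarios)


if __name__ == "__main__":
    # Two hand-written scenarios of per-step gross returns (assumed data).
    scenarios = [[1.5, 1.0], [0.5, 1.0]]
    print(price(100.0, 100.0, 1.0, scenarios))
```

Because each scenario is processed independently inside the map and the reduction uses an associative operator, nothing in this formulation fixes an evaluation order, which is the property the abstract credits with keeping the data parallelism visible.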

Original language: English
Title: FHPC’12: Proceedings of the 1st ACM SIGPLAN Workshop on Functional High Performance Computing
Number of pages: 12
Publisher: Association for Computing Machinery
Publication date: 2012
Pages: 61-72
ISBN (Print): 978-1-4503-1577-7
DOI
Status: Published - 2012
Event: 1st ACM SIGPLAN Workshop on Functional High-Performance Computing - Copenhagen, Denmark
Duration: 15 Sep 2012 to 15 Sep 2012
Conference number: 1

Conference

Conference: 1st ACM SIGPLAN Workshop on Functional High-Performance Computing
Number: 1
Country/Territory: Denmark
City: Copenhagen
Period: 15/09/2012 to 15/09/2012
