Towards a streaming model for nested data parallelism

Frederik Meisner Madsen; Andrzej Filinski

doi:10.1145/2502323.2502330

7 citations (Scopus)

Abstract

The language-integrated cost semantics for nested data parallelism pioneered by NESL provides an intuitive, high-level model for predicting performance and scalability of parallel algorithms with reasonable accuracy. However, this predictability, obtained through a uniform, parallelism-flattening execution strategy, comes at the price of potentially prohibitive space usage in the common case of computations with an excess of available parallelism, such as dense-matrix multiplication.

We present a simple nested data-parallel functional language and associated cost semantics that retains NESL's intuitive work-depth model for time complexity, but also allows highly parallel computations to be expressed in a space-efficient way, in the sense that memory usage on a single (or a few) processors is of the same order as for a sequential formulation of the algorithm, and in general scales smoothly with the actually realized degree of parallelism, not the potential parallelism.

The refined semantics is based on distinguishing formally between fully materialized (i.e., explicitly allocated in memory all at once) "vectors" and potentially ephemeral "sequences" of values, with the latter being bulk-processable in a streaming fashion. This semantics is directly compatible with previously proposed piecewise execution models for nested data parallelism, but allows the expected space usage to be reasoned about directly at the source-language level.

The language definition and implementation are still very much work in progress, but we do present some preliminary examples and timings, suggesting that the streaming model has practical potential.
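The distinction the abstract draws between materialized "vectors" and streamable "sequences" can be illustrated with a minimal sketch. It is written in Python, not in the paper's language or in NESL, and every name in it (`products`, `sum_materialized`, `sum_streamed`, `chunk_size`) is a hypothetical chosen for illustration. It assumes only that a sequence may be consumed in bounded-size chunks, so the number of live elements tracks the chunk size and not the total available parallelism.

```python
from itertools import islice


def products(xs, ys):
    """Lazily yield every pairwise product x * y (the 'sequence' view)."""
    for x in xs:
        for y in ys:
            yield x * y


def sum_materialized(xs, ys):
    """Build all products first (the 'vector' view).

    Returns the sum and the number of elements held at once,
    which is len(xs) * len(ys).
    """
    vector = list(products(xs, ys))
    return sum(vector), len(vector)


def sum_streamed(xs, ys, chunk_size=4):
    """Consume the products in fixed-size chunks.

    Returns the sum and the largest chunk ever held, which is
    at most chunk_size regardless of the input sizes.
    """
    stream = products(xs, ys)
    total, peak = 0, 0
    while True:
        # Only one chunk is materialized at a time; in a parallel
        # setting each chunk would be the unit of bulk processing.
        chunk = list(islice(stream, chunk_size))
        if not chunk:
            break
        peak = max(peak, len(chunk))
        total += sum(chunk)
    return total, peak
```

Both functions compute the same total; they differ only in how many intermediate values exist at once. The chunk size plays the role of the realized degree of parallelism in this sketch.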

Original language	English
Title	FHPC '13 : proceedings of the 2nd ACM SIGPLAN Workshop on Functional High-Performance Computing
Number of pages	12
Publisher	Association for Computing Machinery
Publication date	2013
Pages	13-24
ISBN (Electronic)	978-1-4503-2381-9
DOI	https://doi.org/10.1145/2502323.2502330
Status	Published - 2013
Event	2nd ACM SIGPLAN Workshop on Functional High-Performance Computing - Boston, USA. Duration: 23 Sep 2013 → 23 Sep 2013. Conference number: 2

Conference

Conference	2nd ACM SIGPLAN Workshop on Functional High-Performance Computing
Number	2
Country/Territory	USA
City	Boston
Period	23/09/2013 → 23/09/2013

Access to document

10.1145/2502323.2502330

Citation formats

@inproceedings{2a97ddcec55543f8b39fd9bbe7a9e829,

title = "Towards a streaming model for nested data parallelism",

abstract = "The language-integrated cost semantics for nested data parallelism pioneered by NESL provides an intuitive, high-level model for predicting performance and scalability of parallel algorithms with reasonable accuracy. However, this predictability, obtained through a uniform, parallelism-flattening execution strategy, comes at the price of potentially prohibitive space usage in the common case of computations with an excess of available parallelism, such as dense-matrix multiplication. We present a simple nested data-parallel functional language and associated cost semantics that retains NESL's intuitive work--depth model for time complexity, but also allows highly parallel computations to be expressed in a space-efficient way, in the sense that memory usage on a single (or a few) processors is of the same order as for a sequential formulation of the algorithm, and in general scales smoothly with the actually realized degree of parallelism, not the potential parallelism. The refined semantics is based on distinguishing formally between fully materialized (i.e., explicitly allocated in memory all at once) {"}vectors{"} and potentially ephemeral {"}sequences{"} of values, with the latter being bulk-processable in a streaming fashion. This semantics is directly compatible with previously proposed piecewise execution models for nested data parallelism, but allows the expected space usage to be reasoned about directly at the source-language level. The language definition and implementation are still very much work in progress, but we do present some preliminary examples and timings, suggesting that the streaming model has practical potential.",

author = "Madsen, {Frederik Meisner} and Andrzej Filinski",

year = "2013",

doi = "10.1145/2502323.2502330",

language = "English",

pages = "13--24",

booktitle = "FHPC '13",

publisher = "Association for Computing Machinery",

note = "2nd ACM SIGPLAN Workshop on Functional High-Performance Computing, FHPC '13 ; Conference date: 23-09-2013 Through 23-09-2013",

}

TY - GEN

T1 - Towards a streaming model for nested data parallelism

AU - Madsen, Frederik Meisner

AU - Filinski, Andrzej

N1 - Conference code: 2

PY - 2013

Y1 - 2013

N2 - The language-integrated cost semantics for nested data parallelism pioneered by NESL provides an intuitive, high-level model for predicting performance and scalability of parallel algorithms with reasonable accuracy. However, this predictability, obtained through a uniform, parallelism-flattening execution strategy, comes at the price of potentially prohibitive space usage in the common case of computations with an excess of available parallelism, such as dense-matrix multiplication. We present a simple nested data-parallel functional language and associated cost semantics that retains NESL's intuitive work-depth model for time complexity, but also allows highly parallel computations to be expressed in a space-efficient way, in the sense that memory usage on a single (or a few) processors is of the same order as for a sequential formulation of the algorithm, and in general scales smoothly with the actually realized degree of parallelism, not the potential parallelism. The refined semantics is based on distinguishing formally between fully materialized (i.e., explicitly allocated in memory all at once) "vectors" and potentially ephemeral "sequences" of values, with the latter being bulk-processable in a streaming fashion. This semantics is directly compatible with previously proposed piecewise execution models for nested data parallelism, but allows the expected space usage to be reasoned about directly at the source-language level. The language definition and implementation are still very much work in progress, but we do present some preliminary examples and timings, suggesting that the streaming model has practical potential.

AB - The language-integrated cost semantics for nested data parallelism pioneered by NESL provides an intuitive, high-level model for predicting performance and scalability of parallel algorithms with reasonable accuracy. However, this predictability, obtained through a uniform, parallelism-flattening execution strategy, comes at the price of potentially prohibitive space usage in the common case of computations with an excess of available parallelism, such as dense-matrix multiplication. We present a simple nested data-parallel functional language and associated cost semantics that retains NESL's intuitive work-depth model for time complexity, but also allows highly parallel computations to be expressed in a space-efficient way, in the sense that memory usage on a single (or a few) processors is of the same order as for a sequential formulation of the algorithm, and in general scales smoothly with the actually realized degree of parallelism, not the potential parallelism. The refined semantics is based on distinguishing formally between fully materialized (i.e., explicitly allocated in memory all at once) "vectors" and potentially ephemeral "sequences" of values, with the latter being bulk-processable in a streaming fashion. This semantics is directly compatible with previously proposed piecewise execution models for nested data parallelism, but allows the expected space usage to be reasoned about directly at the source-language level. The language definition and implementation are still very much work in progress, but we do present some preliminary examples and timings, suggesting that the streaming model has practical potential.

U2 - 10.1145/2502323.2502330

DO - 10.1145/2502323.2502330

M3 - Article in proceedings

SP - 13

EP - 24

BT - FHPC '13

PB - Association for Computing Machinery

T2 - 2nd ACM SIGPLAN Workshop on Functional High-Performance Computing

Y2 - 23 September 2013 through 23 September 2013

ER -