Parallel and Scalable Sparse Basic Linear Algebra Subprograms

Weifeng Liu

Abstract

Sparse basic linear algebra subprograms (BLAS) are fundamental building blocks for
numerous scientific computations and graph applications. Compared with Dense
BLAS, parallelization of Sparse BLAS routines entails extra challenges due to the irregularity
of sparse data structures. This thesis proposes new fundamental algorithms
and data structures that accelerate Sparse BLAS routines on modern massively parallel
processors: (1) a new heap data structure named ad-heap, for faster heap operations
on heterogeneous processors, (2) a new sparse matrix representation named CSR5, for
faster sparse matrix-vector multiplication (SpMV) on homogeneous processors such
as CPUs, GPUs and Xeon Phi, (3) a new CSR-based SpMV algorithm for a variety of
tightly coupled CPU-GPU heterogeneous processors, and (4) a new framework and
associated algorithms for sparse matrix-matrix multiplication (SpGEMM) on GPUs
and heterogeneous processors.
The thesis compares the proposed methods with state-of-the-art approaches on
six homogeneous and five heterogeneous processors from Intel, AMD and nVidia.
Using in total 38 sparse matrices as a benchmark suite, the experimental results show
that the proposed methods obtain significant performance improvement over the best
existing algorithms.
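For readers unfamiliar with the operation the abstract refers to, the following is a minimal sketch of a textbook sequential SpMV over the standard compressed sparse row (CSR) layout. It is not the CSR5 format or any algorithm from the thesis; the function name `csr_spmv` and the array names `row_ptr`, `col_idx` and `values` are assumptions chosen for illustration. The example matrix has rows with 2, 0 and 2 nonzeros, a small instance of the row-length irregularity that makes Sparse BLAS routines harder to parallelize than their dense counterparts.

```python
from typing import List, Sequence


def csr_spmv(row_ptr: Sequence[int], col_idx: Sequence[int],
             values: Sequence[float], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i] .. row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    (Illustrative names, not taken from the thesis.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # The trip count of this inner loop differs from row to row,
        # which is the source of load imbalance in parallel versions.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Example 3x3 matrix:
# [[1, 0, 2],
#  [0, 0, 0],
#  [0, 3, 4]]
row_ptr = [0, 2, 2, 4]
col_idx = [0, 2, 1, 2]
values = [1.0, 2.0, 3.0, 4.0]

y = csr_spmv(row_ptr, col_idx, values, [1.0, 1.0, 1.0])
print(y)
```

Assigning one row per thread in a loop like this gives uneven work when row lengths vary widely, which is the kind of problem the abstract attributes to the irregularity of sparse data structures.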

Original language	English

Publisher	The Niels Bohr Institute, Faculty of Science, University of Copenhagen
Number of pages	181
Status	Published - 2015

Access to document

PhDThesis_Liu: publisher's published version, 7.67 MB

http://rex.kb.dk/KGL:KGL:KGL01009197651

Citation formats

@phdthesis{d2f99abe8ffe40c78da0682c84242b68,

title = "Parallel and Scalable Sparse Basic Linear Algebra Subprograms",

abstract = "Sparse basic linear algebra subprograms (BLAS) are fundamental building blocks for numerous scientific computations and graph applications. Compared with Dense BLAS, parallelization of Sparse BLAS routines entails extra challenges due to the irregularity of sparse data structures. This thesis proposes new fundamental algorithms and data structures that accelerate Sparse BLAS routines on modern massively parallel processors: (1) a new heap data structure named ad-heap, for faster heap operations on heterogeneous processors, (2) a new sparse matrix representation named CSR5, for faster sparse matrix-vector multiplication (SpMV) on homogeneous processors such as CPUs, GPUs and Xeon Phi, (3) a new CSR-based SpMV algorithm for a variety of tightly coupled CPU-GPU heterogeneous processors, and (4) a new framework and associated algorithms for sparse matrix-matrix multiplication (SpGEMM) on GPUs and heterogeneous processors. The thesis compares the proposed methods with state-of-the-art approaches on six homogeneous and five heterogeneous processors from Intel, AMD and nVidia. Using in total 38 sparse matrices as a benchmark suite, the experimental results show that the proposed methods obtain significant performance improvement over the best existing algorithms.",

author = "Weifeng Liu",

year = "2015",

language = "English",

publisher = "The Niels Bohr Institute, Faculty of Science, University of Copenhagen",

}

TY - BOOK

T1 - Parallel and Scalable Sparse Basic Linear Algebra Subprograms

AU - Liu, Weifeng

PY - 2015

Y1 - 2015

N2 - Sparse basic linear algebra subprograms (BLAS) are fundamental building blocks for numerous scientific computations and graph applications. Compared with Dense BLAS, parallelization of Sparse BLAS routines entails extra challenges due to the irregularity of sparse data structures. This thesis proposes new fundamental algorithms and data structures that accelerate Sparse BLAS routines on modern massively parallel processors: (1) a new heap data structure named ad-heap, for faster heap operations on heterogeneous processors, (2) a new sparse matrix representation named CSR5, for faster sparse matrix-vector multiplication (SpMV) on homogeneous processors such as CPUs, GPUs and Xeon Phi, (3) a new CSR-based SpMV algorithm for a variety of tightly coupled CPU-GPU heterogeneous processors, and (4) a new framework and associated algorithms for sparse matrix-matrix multiplication (SpGEMM) on GPUs and heterogeneous processors. The thesis compares the proposed methods with state-of-the-art approaches on six homogeneous and five heterogeneous processors from Intel, AMD and nVidia. Using in total 38 sparse matrices as a benchmark suite, the experimental results show that the proposed methods obtain significant performance improvement over the best existing algorithms.

AB - Sparse basic linear algebra subprograms (BLAS) are fundamental building blocks for numerous scientific computations and graph applications. Compared with Dense BLAS, parallelization of Sparse BLAS routines entails extra challenges due to the irregularity of sparse data structures. This thesis proposes new fundamental algorithms and data structures that accelerate Sparse BLAS routines on modern massively parallel processors: (1) a new heap data structure named ad-heap, for faster heap operations on heterogeneous processors, (2) a new sparse matrix representation named CSR5, for faster sparse matrix-vector multiplication (SpMV) on homogeneous processors such as CPUs, GPUs and Xeon Phi, (3) a new CSR-based SpMV algorithm for a variety of tightly coupled CPU-GPU heterogeneous processors, and (4) a new framework and associated algorithms for sparse matrix-matrix multiplication (SpGEMM) on GPUs and heterogeneous processors. The thesis compares the proposed methods with state-of-the-art approaches on six homogeneous and five heterogeneous processors from Intel, AMD and nVidia. Using in total 38 sparse matrices as a benchmark suite, the experimental results show that the proposed methods obtain significant performance improvement over the best existing algorithms.

UR - http://rex.kb.dk/KGL:KGL:KGL01009197651

M3 - Ph.D. thesis

BT - Parallel and Scalable Sparse Basic Linear Algebra Subprograms

PB - The Niels Bohr Institute, Faculty of Science, University of Copenhagen

ER -