Parallel transposition of sparse data structures

Hao Wang; Weifeng Liu; Kaixi Hou; Wu-chun Feng

doi:10.1145/2925426.2926291

eScience

26 Citations (Scopus)

Abstract

Many applications in computational sciences and social sciences exploit sparsity and connectivity of acquired data. Even though many parallel sparse primitives such as sparse matrix-vector (SpMV) multiplication have been extensively studied, some other important building blocks, e.g., parallel transposition for sparse matrices and graphs, have not received the attention they deserve. In this paper, we first identify that the transposition operation can be a bottleneck of some fundamental sparse matrix and graph algorithms. Then, we revisit the performance and scalability of parallel transposition approaches on x86-based multi-core and many-core processors. Based on the insights obtained, we propose two new parallel transposition algorithms: ScanTrans and MergeTrans. The experimental results show that our ScanTrans method achieves an average of 2.8-fold (up to 6.2-fold) speedup over the parallel transposition in the latest vendor-supplied library on an Intel multicore CPU platform, and the MergeTrans approach achieves an average of 3.4-fold (up to 11.7-fold) speedup on an Intel Xeon Phi many-core processor.

Original language	English
Title of host publication	Proceedings of the 2016 International Conference on Supercomputing
Number of pages	13
Place of Publication	Istanbul, Turkey
Publisher	Association for Computing Machinery
Publication date	2016
Article number	33
ISBN (Electronic)	978-1-4503-4361-9
DOIs	https://doi.org/10.1145/2925426.2926291
Publication status	Published - 2016
Event	30th International Conference on Supercomputing - Istanbul, Turkey; Duration: 1 Jun 2016 → 3 Jun 2016; Conference number: 30

Conference

Conference	30th International Conference on Supercomputing
Number	30
Country/Territory	Turkey
City	Istanbul
Period	01/06/2016 → 03/06/2016

Keywords

AVX
CSR
Graph algorithms
Intel Xeon Phi
Sparse matrix
SpGEMM
SpMV
Transposition

Access to Document

10.1145/2925426.2926291

http://dl.acm.org/ft_gateway.cfm?id=2926291&type=pdf

Cite this

@inproceedings{4a46bf478ae44f2d9383991991cd77de,

title = "Parallel transposition of sparse data structures",

abstract = "Many applications in computational sciences and social sciences exploit sparsity and connectivity of acquired data. Even though many parallel sparse primitives such as sparse matrix-vector (SpMV) multiplication have been extensively studied, some other important building blocks, e.g., parallel transposition for sparse matrices and graphs, have not received the attention they deserve. In this paper, we first identify that the transposition operation can be a bottleneck of some fundamental sparse matrix and graph algorithms. Then, we revisit the performance and scalability of parallel transposition approaches on x86-based multi-core and many-core processors. Based on the insights obtained, we propose two new parallel transposition algorithms: ScanTrans and MergeTrans. The experimental results show that our ScanTrans method achieves an average of 2.8-fold (up to 6.2-fold) speedup over the parallel transposition in the latest vendor-supplied library on an Intel multicore CPU platform, and the MergeTrans approach achieves an average of 3.4-fold (up to 11.7-fold) speedup on an Intel Xeon Phi many-core processor.",

keywords = "AVX, CSR, Graph algorithms, Intel Xeon Phi, Sparse matrix, SpGEMM, SpMV, Transposition",

author = "Hao Wang and Weifeng Liu and Kaixi Hou and Wu-chun Feng",

year = "2016",

doi = "10.1145/2925426.2926291",

language = "English",

booktitle = "Proceedings of the 2016 International Conference on Supercomputing",

publisher = "Association for Computing Machinery",

note = "30th International Conference on Supercomputing ; Conference date: 01-06-2016 Through 03-06-2016",

}

TY - GEN

T1 - Parallel transposition of sparse data structures

AU - Wang, Hao

AU - Liu, Weifeng

AU - Hou, Kaixi

AU - Feng, Wu-chun

N1 - Conference code: 30

PY - 2016

Y1 - 2016

N2 - Many applications in computational sciences and social sciences exploit sparsity and connectivity of acquired data. Even though many parallel sparse primitives such as sparse matrix-vector (SpMV) multiplication have been extensively studied, some other important building blocks, e.g., parallel transposition for sparse matrices and graphs, have not received the attention they deserve. In this paper, we first identify that the transposition operation can be a bottleneck of some fundamental sparse matrix and graph algorithms. Then, we revisit the performance and scalability of parallel transposition approaches on x86-based multi-core and many-core processors. Based on the insights obtained, we propose two new parallel transposition algorithms: ScanTrans and MergeTrans. The experimental results show that our ScanTrans method achieves an average of 2.8-fold (up to 6.2-fold) speedup over the parallel transposition in the latest vendor-supplied library on an Intel multicore CPU platform, and the MergeTrans approach achieves an average of 3.4-fold (up to 11.7-fold) speedup on an Intel Xeon Phi many-core processor.

AB - Many applications in computational sciences and social sciences exploit sparsity and connectivity of acquired data. Even though many parallel sparse primitives such as sparse matrix-vector (SpMV) multiplication have been extensively studied, some other important building blocks, e.g., parallel transposition for sparse matrices and graphs, have not received the attention they deserve. In this paper, we first identify that the transposition operation can be a bottleneck of some fundamental sparse matrix and graph algorithms. Then, we revisit the performance and scalability of parallel transposition approaches on x86-based multi-core and many-core processors. Based on the insights obtained, we propose two new parallel transposition algorithms: ScanTrans and MergeTrans. The experimental results show that our ScanTrans method achieves an average of 2.8-fold (up to 6.2-fold) speedup over the parallel transposition in the latest vendor-supplied library on an Intel multicore CPU platform, and the MergeTrans approach achieves an average of 3.4-fold (up to 11.7-fold) speedup on an Intel Xeon Phi many-core processor.

KW - AVX

KW - CSR

KW - Graph algorithms

KW - Intel Xeon Phi

KW - Sparse matrix

KW - SpGEMM

KW - SpMV

KW - Transposition

UR - http://www.scopus.com/inward/record.url?scp=84978501295&partnerID=8YFLogxK

U2 - 10.1145/2925426.2926291

DO - 10.1145/2925426.2926291

M3 - Article in proceedings

AN - SCOPUS:84978501295

BT - Proceedings of the 2016 International Conference on Supercomputing

PB - Association for Computing Machinery

CY - Istanbul, Turkey

T2 - 30th International Conference on Supercomputing

Y2 - 1 June 2016 through 3 June 2016

ER -
