Abstract
Many applications in computational sciences and social sciences exploit sparsity and connectivity of acquired data. Even though many parallel sparse primitives such as sparse matrix-vector (SpMV) multiplication have been extensively studied, some other important building blocks, e.g., parallel transposition for sparse matrices and graphs, have not received the attention they deserve. In this paper, we first identify that the transposition operation can be a bottleneck of some fundamental sparse matrix and graph algorithms. Then, we revisit the performance and scalability of parallel transposition approaches on x86-based multi-core and many-core processors. Based on the insights obtained, we propose two new parallel transposition algorithms: ScanTrans and MergeTrans. The experimental results show that our ScanTrans method achieves an average of 2.8-fold (up to 6.2-fold) speedup over the parallel transposition in the latest vendor-supplied library on an Intel multicore CPU platform, and the MergeTrans approach achieves on average of 3.4-fold (up to 11.7-fold) speedup on an Intel Xeon Phi many-core processor.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2016 International Conference on Supercomputing |
Number of pages | 13 |
Place of Publication | Istabbul, Turkey |
Publisher | Association for Computing Machinery |
Publication date | 2016 |
Article number | 33 |
ISBN (Electronic) | 978-1-4503-4361-9 |
DOIs | |
Publication status | Published - 2016 |
Event | 30th International Conference on Supercomputing - Istanbul, Turkey Duration: 1 Jun 2016 → 3 Jun 2016 Conference number: 30 |
Conference
Conference | 30th International Conference on Supercomputing |
---|---|
Number | 30 |
Country/Territory | Turkey |
City | Istanbul |
Period | 01/06/2016 → 03/06/2016 |
Keywords
- AVX
- CSR
- Graph algorithms
- Intel Xeon Phi
- Sparse matrix
- SpGEMM
- SpMV
- Transposition