Abstract
High-bandwidth On-Package Memory (OPM) extends the conventional memory hierarchy by adding a new on-package layer between the classic on-chip cache and off-chip DRAM. Due to its relative location and capacity, OPM is often used as a new type of last-level cache (LLC). Despite its adoption in modern processors, the performance and power impact of OPM on HPC applications, especially scientific kernels, is still unknown. In this paper, we fill this gap by conducting a comprehensive evaluation of a wide spectrum of scientific kernels with a large number of representative inputs, including dense, sparse, and medium, on two Intel OPMs: eDRAM on multicore Broadwell and MCDRAM on manycore Knights Landing. Guided by our general optimization models, we demonstrate OPM's effectiveness in easing programmers' tuning efforts to reach ideal throughput for both compute-bound and memory-bound applications.
Original language | English |
---|---|
Title of host publication | Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC |
Number of pages | 14 |
Volume | 17 |
Place of Publication | Denver, USA |
Publication date | 12 Nov 2017 |
ISBN (Print) | 978-145035114-0/17/11 |
Publication status | Published - 12 Nov 2017 |