Abstract
Structural alignments of Ribonucleic acid (RNA) sequences solved by the Sankoff algorithm are computationally expensive and often require constraints to be used in practice. Modern Graphics Processing Units (GPUs) contain more than 1000 cores, which compute in parallel to speed up applications. Here, we present a GPU-based solution to the RNA structural alignment problem that makes use of precalculated base pair probabilities on the individual sequences. We designed and developed an unconstrained version of the Sankoff algorithm, obtaining the optimal result and calculating the entire four-dimension dynamic programming matrix (4D DP). Our approach uses a two-level wavefront strategy to exploit parallelism. The 4D DP matrix is divided in one external matrix (EM) and several internal matrices (IM). We applied wavefront strategies on the EM and IMs in a two-level hierarchical way. At the first level, the wavefront is applied to the EM, calculating the cells that belong to the same diagonal in parallel. In the second level, since each cell in the EM is itself an IM matrix, the cells that belong to the same IM diagonal are calculated in parallel. The results obtained with real RNA sequences show that our GPU version is capable of outperforming a multicore CPU version of the unconstrained version of the Sankoff algorithm. Compared with the CPU-based version running on 32 cores, our approach is able to achieve a speedup of 7.81x on the NVidia Tesla P100. In this case, the execution time was reduced from 6 hours and 18 minutes (32 cores) to 48 minutes and 20 seconds (GPU).
Original language | English |
---|---|
Article number | e5468 |
Journal | Concurrency Computation |
ISSN | 1532-0626 |
DOIs | |
Publication status | Published - 25 May 2020 |
Keywords
- base-pairing probabilities
- GPUs
- high-performance computing
- RNA
- Sankoff algorithm