Abstract
Motivation: Regulatory, non-coding RNAs often function by forming a duplex with other RNAs. It is therefore of interest to predict putative RNA-RNA duplexes in silico on a genome-wide scale. Current computational methods for predicting these interactions range from fast complementarity-based searches to those that take intramolecular binding into account. Together these methods constitute a trade-off between speed and accuracy, while leaving room for improvement within the context of genome-wide screens. A fast pre-filtering of putative duplexes would therefore be desirable.

Results: We present RIsearch, an implementation of a simplified Turner energy model for fast computation of hybridization, which significantly reduces runtime while maintaining accuracy. Its time complexity for sequences of lengths m and n is O(m·n), with a much smaller pre-factor than other tools. We show that this energy model is an accurate approximation of the full energy model for near-complementary RNA-RNA duplexes. RIsearch uses a Smith-Waterman-like algorithm with a dinucleotide scoring matrix that approximates the Turner nearest-neighbor energies. We show in benchmarks that we achieve a speed improvement of at least 2.4× compared with RNAplex, the currently fastest method for searching near-complementary regions. RIsearch shows a prediction accuracy similar to RNAplex on two datasets of known bacterial short RNA (sRNA)-messenger RNA (mRNA) and eukaryotic microRNA (miRNA)-mRNA interactions. Using RIsearch as a pre-filter in genome-wide screens reduces the number of binding site candidates reported by miRNA target prediction programs, such as TargetScanS and miRanda, by up to 70%. Likewise, substantial filtering was performed on bacterial RNA-RNA interaction data.
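
The Smith-Waterman-like search described in the Results can be illustrated with a minimal sketch. It is not RIsearch's implementation: the weights below are invented placeholders rather than Turner nearest-neighbor energies, scores are "higher is more stable" instead of free energies, and the names `site_score` and `best_duplex` are hypothetical. It only shows how a local alignment of a query against the reversed target, with a score that looks one dinucleotide back, runs in time proportional to m·n.

```python
# Toy weights (hypothetical, NOT Turner parameters); higher means more stable.
PAIR = {("G", "C"): 3, ("C", "G"): 3,
        ("A", "U"): 2, ("U", "A"): 2,
        ("G", "U"): 1, ("U", "G"): 1}
STACK_BONUS = 1   # added when the preceding diagonal positions can also pair
MISMATCH = -3     # penalty for two opposed bases that cannot pair
GAP = -4          # linear penalty for an unpaired base (bulge)


def site_score(q, tr, i, j):
    """Score for placing q[i] opposite tr[j], looking one dinucleotide back.

    Simplification: the stacking bonus depends only on the sequences, so it
    is also granted when the alignment path reaches (i, j) through a gap.
    """
    weight = PAIR.get((q[i], tr[j]))
    if weight is None:
        return MISMATCH
    if i > 0 and j > 0 and (q[i - 1], tr[j - 1]) in PAIR:
        weight += STACK_BONUS
    return weight


def best_duplex(query, target):
    """Return (score, query_end, target_start) of the best local duplex.

    query_end is the exclusive 0-based end in the query; target_start is the
    0-based target position paired with the last query base of the duplex.
    Runs in O(m * n) time and O(n) memory.
    """
    q = query.upper().replace("T", "U")
    # Duplex strands are antiparallel, so align against the reversed target.
    tr = target.upper().replace("T", "U")[::-1]
    m, n = len(q), len(tr)
    prev = [0] * (n + 1)
    best = (0, 0, 0)
    for i in range(1, m + 1):
        cur = [0] * (n + 1)
        for j in range(1, n + 1):
            cur[j] = max(
                0,                                          # start a new local duplex
                prev[j - 1] + site_score(q, tr, i - 1, j - 1),  # extend the helix
                prev[j] + GAP,                              # unpaired query base
                cur[j - 1] + GAP,                           # unpaired target base
            )
            if cur[j] > best[0]:
                best = (cur[j], i, n - j)
        prev = cur
    return best
```

Calling `best_duplex("GGGAAA", "UUUCCC")` aligns the query against the fully complementary target over all six positions, whereas two strands with no pairable bases return a score of 0. A real tool would replace the toy weights with a scoring matrix derived from measured stacking energies.
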
Original language | English |
---|---|
Journal | Bioinformatics |
Volume | 28 |
Issue number | 21 |
Pages (from-to) | 2738-2746 |
Number of pages | 9 |
ISSN | 1367-4803 |
DOI | |
Status | Published - Nov. 2012 |