TY - JOUR
T1 - CAGEfightR
T2 - analysis of 5'-end data using R/Bioconductor
AU - Thodberg, Malte
AU - Thieffry, Axel
AU - Vitting-Seerup, Kristoffer
AU - Andersson, Robin
AU - Sandelin, Albin
PY - 2019/10/4
Y1 - 2019/10/4
N2 - Background: 5′-end sequencing assays, and Cap Analysis of Gene Expression (CAGE) in particular, have been instrumental in studying transcriptional regulation. 5′-end methods provide genome-wide maps of transcription start sites (TSSs) with base pair resolution. Because active enhancers often feature bidirectional TSSs, such data can also be used to predict enhancer candidates. The current availability of mature and comprehensive computational tools for the analysis of 5′-end data is limited, preventing efficient analysis of new and existing 5′-end data. Results: We present CAGEfightR, a framework for analysis of CAGE and other 5′-end data implemented as an R/Bioconductor-package. CAGEfightR can import data from BigWig files and allows for fast and memory efficient prediction and analysis of TSSs and enhancers. Downstream analyses include quantification, normalization, annotation with transcript and gene models, TSS shape statistics, linking TSSs to enhancers via co-expression, identification of enhancer clusters, and genome-browser style visualization. While built to analyze CAGE data, we demonstrate the utility of CAGEfightR in analyzing nascent RNA 5′-data (PRO-Cap). CAGEfightR is implemented using standard Bioconductor classes, making it easy to learn, use and combine with other Bioconductor packages, for example popular differential expression tools such as limma, DESeq2 and edgeR. Conclusions: CAGEfightR provides a single, scalable and easy-to-use framework for comprehensive downstream analysis of 5′-end data. CAGEfightR is designed to be interoperable with other Bioconductor packages, thereby unlocking hundreds of mature transcriptomic analysis tools for 5′-end data. CAGEfightR is freely available via Bioconductor: bioconductor.org/packages/CAGEfightR.
AB - Background: 5′-end sequencing assays, and Cap Analysis of Gene Expression (CAGE) in particular, have been instrumental in studying transcriptional regulation. 5′-end methods provide genome-wide maps of transcription start sites (TSSs) with base pair resolution. Because active enhancers often feature bidirectional TSSs, such data can also be used to predict enhancer candidates. The current availability of mature and comprehensive computational tools for the analysis of 5′-end data is limited, preventing efficient analysis of new and existing 5′-end data. Results: We present CAGEfightR, a framework for analysis of CAGE and other 5′-end data implemented as an R/Bioconductor-package. CAGEfightR can import data from BigWig files and allows for fast and memory efficient prediction and analysis of TSSs and enhancers. Downstream analyses include quantification, normalization, annotation with transcript and gene models, TSS shape statistics, linking TSSs to enhancers via co-expression, identification of enhancer clusters, and genome-browser style visualization. While built to analyze CAGE data, we demonstrate the utility of CAGEfightR in analyzing nascent RNA 5′-data (PRO-Cap). CAGEfightR is implemented using standard Bioconductor classes, making it easy to learn, use and combine with other Bioconductor packages, for example popular differential expression tools such as limma, DESeq2 and edgeR. Conclusions: CAGEfightR provides a single, scalable and easy-to-use framework for comprehensive downstream analysis of 5′-end data. CAGEfightR is designed to be interoperable with other Bioconductor packages, thereby unlocking hundreds of mature transcriptomic analysis tools for 5′-end data. CAGEfightR is freely available via Bioconductor: bioconductor.org/packages/CAGEfightR.
KW - Transcription start site
KW - Promoter
KW - Enhancer
KW - Enhancer RNA
KW - CAGE
KW - 5 '-end methods
KW - R-package
KW - Bioconductor
U2 - 10.1186/s12859-019-3029-5
DO - 10.1186/s12859-019-3029-5
M3 - Journal article
C2 - 31585526
SN - 1471-2105
VL - 20
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - 487
ER -