Abstract
The human body consists of over 200 different cell types, each carrying the same DNA information. Some of this information is needed by all cells, whereas other information is needed only in specific cell types or under certain circumstances. Consequently, all cells regulate their gene expression so that each cell expresses only a particular set of genes, yielding a limited number of proteins with specialized functions. Tight control of differential gene expression is therefore necessary for cellular diversity. Conversely, aberrant gene regulation disrupts the cell’s fundamental processes, which in turn can cause disease. Hence, understanding gene regulation is essential for deciphering the code of life. With the development of high-throughput sequencing (HTS) technology and the accompanying large-scale data analysis, genome-wide assays have significantly increased our understanding of gene regulation. This thesis describes the integration and analysis of HTS data across several important aspects of gene regulation.
Gene expression can be regulated at different stages when the genetic information is passed from gene to protein: through epigenetic modifications, transcription regulators or post-transcriptional controls. The following papers concern several layers of gene regulation with questions answered by different HTS approaches.
Genome-wide screening of epigenetic changes by ChIP-seq allowed us to study both spatial and temporal alterations of histone modifications (Papers I and II). Coupling the data with machine learning approaches, we established a prediction framework to assess which histone marks, and which nucleosome positions of those marks, are most informative for predicting promoter usage (Paper I). Focusing on the same promoter across the cell cycle, we observed that histone modifications undergo temporal alterations that are very distinct from their spatial regulatory functions at different promoters (Paper II).
By aggregating different HTS methods, including CAGE, 3’end-seq, GRO-seq, RNAPII ChIP-seq and small RNA-seq, we delineated the landscape of promoters with bidirectional transcription that yield steady-state RNA in only one direction (Paper III). A subsequent motif analysis enabled us to uncover specific DNA signals (early polyA sites) that make RNA on the reverse strand sensitive to degradation.
Cross-species comparison of transcription factor binding sites (TFBSs) using ChIP-seq (Paper IV) suggested that the majority of TFBSs were species-specific, with some exceptions. We found that the retention of TFBSs between human and mouse was significantly higher when the sites were close to the genes they regulated or had globally elevated usage by multiple TFs.
Using RNA-seq and 5’end-seq in combination with depletion of 5’ exonuclease as well as nonsense-mediated decay (NMD) factors, we systematically analyzed NMD substrates and their degradation intermediates in human cells (Paper V). Gene enrichment analysis of the detected NMD substrates revealed a previously unappreciated NMD-based regulatory mechanism for genes hosting multiple intronic snoRNAs, which can facilitate differential expression of individual snoRNAs from a single host gene locus.
Finally, using RNA-seq and small RNA-seq, we assessed both gene and miRNA expression signatures in patients with Crohn’s disease (CD) (Paper VI). We found that miRNAs had better diagnostic power than genes in CD, and we detected several novel miRNA-gene interactions in CD.
Original language | English |
---|---|
Publisher | Department of Biology, Faculty of Science, University of Copenhagen |
Number of pages | 240 |
Status | Published - 2014 |