Abstract
MOTIVATION: High-density oligonucleotide tiling array technology holds the promise of a better description of the complexity and dynamics of transcriptional landscapes. In organisms such as bacteria and yeasts, transcription can be measured on a genome-wide scale with a resolution >25 bp. The statistical models currently used to handle these data remain, however, very simple, the most popular being the piecewise constant Gaussian model with a fixed number of breakpoints.
RESULTS: This article describes a new methodology based on a hidden Markov model that embeds the segmentation of a continuous-valued signal in a probabilistic setting. For a computationally affordable cost, this framework (i) alleviates the difficulty of choosing a fixed number of breakpoints, and (ii) permits retrieving more information than a unique segmentation by giving access to the whole probability distribution of the transcription profile. Importantly, the model is also enriched to account for subtle effects such as signal 'drift' and covariates. The relevance of this framework is demonstrated on a Bacillus subtilis dataset.
AVAILABILITY: The software is distributed under the GPL.
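ILLUSTRATION: The contrast drawn above, between a single segmentation and a full probability distribution over the transcription profile, can be sketched with a toy hidden Markov model. This is a minimal sketch under strong assumptions and not the model of the article: the hidden state is restricted to a small, user-supplied set of expression levels, emissions are Gaussian with a known standard deviation, and drift and covariates are ignored. The function name `segment_posterior` and the parameters `levels`, `sd` and `p_stay` are hypothetical. The forward-backward recursions return, for every probe, the posterior probability of each level, and for every pair of adjacent probes, the posterior probability of a breakpoint, so the number of breakpoints never has to be fixed in advance.

```python
import math


def gaussian_logpdf(x, mean, sd):
    """Log-density of a Gaussian with the given mean and standard deviation."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def segment_posterior(signal, levels, sd=0.5, p_stay=0.95):
    """Toy HMM (assumed, simplified): hidden state = one of `levels` (len >= 2).

    Returns (post, change) where post[t][j] is P(level j at probe t | signal)
    and change[t] is P(level differs between probes t and t+1 | signal).
    """
    n, k = len(signal), len(levels)
    log_stay = math.log(p_stay)
    log_move = math.log((1.0 - p_stay) / (k - 1))  # uniform jump to another level

    def log_trans(i, j):
        return log_stay if i == j else log_move

    emit = [[gaussian_logpdf(x, mu, sd) for mu in levels] for x in signal]

    # Forward pass: fwd[t][j] = log P(signal[0..t], state_t = j)
    fwd = [[0.0] * k for _ in range(n)]
    for j in range(k):
        fwd[0][j] = -math.log(k) + emit[0][j]  # uniform initial distribution
    for t in range(1, n):
        for j in range(k):
            fwd[t][j] = emit[t][j] + logsumexp(
                [fwd[t - 1][i] + log_trans(i, j) for i in range(k)]
            )

    # Backward pass: bwd[t][i] = log P(signal[t+1..] | state_t = i)
    bwd = [[0.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for i in range(k):
            bwd[t][i] = logsumexp(
                [log_trans(i, j) + emit[t + 1][j] + bwd[t + 1][j] for j in range(k)]
            )

    log_evidence = logsumexp(fwd[n - 1])
    post = [
        [math.exp(fwd[t][j] + bwd[t][j] - log_evidence) for j in range(k)]
        for t in range(n)
    ]

    # Breakpoint probability = 1 - P(same level at t and t+1 | signal)
    change = []
    for t in range(n - 1):
        log_same = logsumexp(
            [fwd[t][j] + log_stay + emit[t + 1][j] + bwd[t + 1][j] for j in range(k)]
        )
        change.append(max(0.0, 1.0 - math.exp(log_same - log_evidence)))
    return post, change


# Made-up intensities with one shift between the fourth and fifth probes.
signal = [0.1, -0.2, 0.0, 0.1, 2.1, 1.9, 2.0, 2.2]
post, change = segment_posterior(signal, levels=[0.0, 2.0])
```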
Original language | English |
---|---|
Journal | Bioinformatics (Online) |
Volume | 25 |
Issue number | 18 |
Pages (from-to) | 2341-2347 |
Number of pages | 7 |
ISSN | 1367-4811 |
DOIs | |
Publication status | Published - 15 Sept 2009 |
Externally published | Yes |
Keywords
- Bacillus subtilis/genetics
- Computational Biology/methods
- Gene Expression Profiling/methods
- Genome
- Oligonucleotide Array Sequence Analysis/methods
- Sequence Analysis, DNA/methods
- Transcription, Genetic