The block hidden Markov model for biological sequence analysis

Kyoung Jae Won, Adam Prügel-Bennett, Anders Krogh*

*Corresponding author for this work
4 Citations (Scopus)

Abstract

Hidden Markov models (HMMs) are widely used for biological sequence analysis because of their ability to incorporate biological information in their structure. An automatic means of optimising the structure of HMMs would be highly desirable. To maintain biologically interpretable blocks inside the HMM, we used a genetic algorithm (GA) that has HMM blocks in its coding representation. We developed special genetic operators that maintain the useful HMM blocks. To prevent over-fitting, the data set used for comparing the performance of the HMMs is kept separate from the one used for Baum-Welch training. The algorithm is applied to finding HMM structures for the promoter and coding region of C. jejuni. The GA-HMM found an HMM superior to a previously published hand-coded HMM designed for the same task.
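
As an illustration only, the idea of genetic operators that act on whole HMM blocks can be sketched as follows. The abstract does not specify the paper's encoding or operators, so everything here is an assumption: the block kinds (`linear`, `loop`, `forward_jump`), the `(kind, n_states)` tuple encoding, and the function names `block_crossover` and `block_mutation` are hypothetical. The sketch shows only that cut points and mutations fall on block boundaries, so no block is ever split.

```python
import random

# Hypothetical block kinds; the paper's actual block types are not given in the abstract.
BLOCK_KINDS = ("linear", "loop", "forward_jump")


def block_crossover(parent_a, parent_b, rng):
    """One-point crossover whose cut points fall only between blocks.

    Each parent is a list of (kind, n_states) tuples with at least two blocks.
    Blocks are copied whole, so their internal structure is never broken.
    """
    cut_a = rng.randint(1, len(parent_a) - 1)
    cut_b = rng.randint(1, len(parent_b) - 1)
    child_1 = parent_a[:cut_a] + parent_b[cut_b:]
    child_2 = parent_b[:cut_b] + parent_a[cut_a:]
    return child_1, child_2


def block_mutation(chromosome, rng, kinds=BLOCK_KINDS, max_states=5):
    """Return a copy of the chromosome with one whole block redrawn at random."""
    mutated = list(chromosome)
    index = rng.randrange(len(mutated))
    mutated[index] = (rng.choice(kinds), rng.randint(1, max_states))
    return mutated


rng = random.Random(0)
parent_a = [("linear", 3), ("loop", 1), ("linear", 2)]
parent_b = [("loop", 2), ("forward_jump", 4)]
child_1, child_2 = block_crossover(parent_a, parent_b, rng)
mutant = block_mutation(parent_a, rng)
```

In a full GA along these lines, each candidate structure would be trained with Baum-Welch on one data set and scored on a separate held-out set, as the abstract describes; that training and scoring step is omitted from this sketch.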

Original language: English
Book series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3213
Pages (from-to): 64-70
Number of pages: 7
ISSN: 0302-9743
Publication status: Published - 1 Dec 2004
