The block hidden Markov model for biological sequence analysis

Kyoung Jae Won, Adam Prügel-Bennett, Anders Krogh^*

^*Corresponding author for this work

4 Citations (Scopus)

Abstract

Hidden Markov models (HMMs) are widely used for biological sequence analysis because their structure can incorporate biological information. An automatic means of optimising the structure of HMMs would be highly desirable. To maintain biologically interpretable blocks inside the HMM, we used a Genetic Algorithm (GA) that has HMM blocks in its coding representation. We developed special genetic operators that maintain the useful HMM blocks. To prevent over-fitting, the performance of the HMMs is compared on a data set separate from the one used for Baum-Welch training. The algorithm is applied to finding HMM structures for the promoter and coding region of C. jejuni. The GA-HMM found an HMM superior to a hand-coded HMM designed for the same task that has been published in the literature.
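The abstract does not spell out the genetic operators, so the following is only a minimal sketch of what a block-preserving GA step could look like. It assumes a chromosome is an ordered list of HMM blocks, that crossover cuts only at block boundaries so no block is split, and that candidates are ranked by a score computed on held-out data. The names `Block`, `block_crossover`, `select_best`, and the block labels are hypothetical and not taken from the paper.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Block:
    """A unit of HMM structure treated as indivisible by the GA (hypothetical)."""
    kind: str      # illustrative label, e.g. "linear" or "self_loop"
    n_states: int  # number of HMM states inside the block


Chromosome = List[Block]


def block_crossover(a: Chromosome, b: Chromosome,
                    rng: random.Random) -> Tuple[Chromosome, Chromosome]:
    """One-point crossover whose cut points fall only on block boundaries.

    Both parents must be non-empty. Each child keeps at least one block,
    and every block is copied whole, never split.
    """
    cut_a = rng.randint(1, len(a))  # randint is inclusive on both ends
    cut_b = rng.randint(1, len(b))
    child1 = a[:cut_a] + b[cut_b:]
    child2 = b[:cut_b] + a[cut_a:]
    return child1, child2


def select_best(population: List[Chromosome],
                validation_score: Callable[[Chromosome], float]) -> Chromosome:
    """Pick the candidate with the highest score on held-out data.

    `validation_score` is supplied by the caller. In the setting described
    above it would train the HMM parameters on the training set and score
    the result on a separate validation set.
    """
    return max(population, key=validation_score)
```

Keeping the cut points on block boundaries is the design choice that lets each child remain a sequence of interpretable blocks; ranking candidates on held-out data corresponds to the over-fitting safeguard mentioned in the abstract.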

Original language	English
Book series	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	3213
Pages (from-to)	64-70
Number of pages	7
ISSN	0302-9743
Status	Published - 1 Dec 2004

Other files and links

Link to publication in Scopus

Citation formats

The block hidden Markov model for biological sequence analysis. / Won, Kyoung Jae; Prügel-Bennett, Adam; Krogh, Anders.
In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 3213, 01.12.2004, p. 64-70.

Publication: Contribution to journal › Journal article › Research › peer-review

@article{cd75a457149b426cbcb4207da3b0385a,
title = "The block hidden Markov model for biological sequence analysis",
abstract = "Hidden Markov models (HMMs) are widely used for biological sequence analysis because their structure can incorporate biological information. An automatic means of optimising the structure of HMMs would be highly desirable. To maintain biologically interpretable blocks inside the HMM, we used a Genetic Algorithm (GA) that has HMM blocks in its coding representation. We developed special genetic operators that maintain the useful HMM blocks. To prevent over-fitting, the performance of the HMMs is compared on a data set separate from the one used for Baum-Welch training. The algorithm is applied to finding HMM structures for the promoter and coding region of C. jejuni. The GA-HMM found an HMM superior to a hand-coded HMM designed for the same task that has been published in the literature.",
author = "Won, {Kyoung Jae} and Adam Pr{\"u}gel-Bennett and Anders Krogh",
year = "2004",
month = dec,
day = "1",
language = "English",
volume = "3213",
pages = "64--70",
journal = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
issn = "0302-9743",
publisher = "Springer",
}

TY - JOUR
T1 - The block hidden Markov model for biological sequence analysis
AU - Won, Kyoung Jae
AU - Prügel-Bennett, Adam
AU - Krogh, Anders
PY - 2004/12/1
Y1 - 2004/12/1
N2 - Hidden Markov models (HMMs) are widely used for biological sequence analysis because their structure can incorporate biological information. An automatic means of optimising the structure of HMMs would be highly desirable. To maintain biologically interpretable blocks inside the HMM, we used a Genetic Algorithm (GA) that has HMM blocks in its coding representation. We developed special genetic operators that maintain the useful HMM blocks. To prevent over-fitting, the performance of the HMMs is compared on a data set separate from the one used for Baum-Welch training. The algorithm is applied to finding HMM structures for the promoter and coding region of C. jejuni. The GA-HMM found an HMM superior to a hand-coded HMM designed for the same task that has been published in the literature.
UR - http://www.scopus.com/inward/record.url?scp=31744440287&partnerID=8YFLogxK
M3 - Journal article
AN - SCOPUS:31744440287
SN - 0302-9743
VL - 3213
SP - 64
EP - 70
JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -
