J Henderson, S Salzberg, K H Fasman.
Abstract
This study describes a new Hidden Markov Model (HMM) system for segmenting uncharacterized genomic DNA sequences into exons, introns, and intergenic regions. Separate HMM modules were designed and trained for specific regions of DNA: exons, introns, intergenic regions, and splice sites. The modules were then tied together to form a biologically feasible topology. The integrated HMM was trained further on a set of eukaryotic DNA sequences and tested by using it to segment a separate set of sequences. The resulting HMM system, called VEIL (Viterbi Exon-Intron Locator), correctly labels 92% of all bases in the test data, with a correlation coefficient of 0.73. Under the more stringent test of exact exon prediction, VEIL correctly located both ends of 53% of the coding exons, and 49% of the exons it predicted were exactly correct. These results compare favorably to the best previous results for gene structure prediction and demonstrate the benefits of using HMMs for this problem.
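The segmentation idea can be illustrated with a minimal sketch of Viterbi decoding over a three-state HMM. Everything here is an assumption made for illustration: the state set omits the splice-site modules, and the `START`, `TRANS`, and `EMIT` probabilities are invented toy values, not parameters or topology from VEIL.

```python
import math

# Hypothetical three-state topology; VEIL's real topology is far richer.
STATES = ("intergenic", "exon", "intron")

# Toy probabilities chosen for illustration only (not trained values).
START = {"intergenic": 0.8, "exon": 0.1, "intron": 0.1}
TRANS = {
    "intergenic": {"intergenic": 0.9, "exon": 0.1, "intron": 0.0},
    "exon": {"intergenic": 0.05, "exon": 0.9, "intron": 0.05},
    "intron": {"intergenic": 0.0, "exon": 0.1, "intron": 0.9},
}
EMIT = {
    "intergenic": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "exon": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "intron": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def _log(p):
    """Log probability, mapping zero to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq):
    """Return the most probable state label for each base in seq."""
    if not seq:
        return []
    # Log score of the best path ending in each state at the first base.
    scores = [{s: _log(START[s]) + _log(EMIT[s][seq[0]]) for s in STATES}]
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for base in seq[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] + _log(TRANS[r][s]))
            cur[s] = prev[best] + _log(TRANS[best][s]) + _log(EMIT[s][base])
            ptr[s] = best
        scores.append(cur)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

Calling `viterbi("ACGTGGCC")` yields one label per base; contiguous runs of the same label correspond to predicted segments. Working in log space avoids numerical underflow on long sequences.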
Year: 1997 PMID: 9228612 DOI: 10.1089/cmb.1997.4.127
Source DB: PubMed Journal: J Comput Biol ISSN: 1066-5277 Impact factor: 1.479