C Iseli, C V Jongeneel, P Bucher.
Abstract
One of the problems associated with the large-scale analysis of unannotated, low-quality EST sequences is the detection of coding regions and the correction of the frameshift errors that they often contain. We introduce a new type of hidden Markov model that explicitly deals with the possibility of errors in the sequence being analyzed, and incorporates a method for correcting these errors. This model was implemented in an efficient and robust program, ESTScan. We show that ESTScan can detect and extract coding regions from low-quality sequences with high selectivity and sensitivity, and is able to accurately correct frameshift errors. In the framework of genome sequencing projects, ESTScan could become a very useful tool for gene discovery, for quality control, and for the assembly of contigs representing the coding regions of genes.
Year: 1999 PMID: 10786296
Source DB: PubMed Journal: Proc Int Conf Intell Syst Mol Biol ISSN: 1553-0833