Zhuo Wang, Yazhu Chen, Yixue Li.
Abstract
As genomes are sequenced for more and more organisms, a growing volume of raw sequence needs to be annotated. Gene prediction, the computational task of locating protein-coding regions, is one of the essential problems in bioinformatics. Two classes of methods are generally adopted: similarity-based searches and ab initio prediction. Here, we review the development of gene prediction methods, summarize the measures for evaluating predictor quality, highlight open problems in this area, and discuss future research directions.
Year: 2004 PMID: 15901250 PMCID: PMC5187414 DOI: 10.1016/s1672-0229(04)02028-5
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Ab initio Gene Prediction Programs (Possibly with Homology Integration)
| Program | Organism | Algorithm | Website | Homology |
|---|---|---|---|---|
| GeneID | Vertebrates, plants | DP | | |
| FGENESH | Human, mouse, Drosophila, rice | HMM | | |
| GeneParser | Vertebrates | NN | | EST |
| Genie | Drosophila, human, other | GHMM | | protein |
| GenLang | Vertebrates, Drosophila, dicots | Grammar rule | | |
| GENSCAN | Vertebrates, Arabidopsis, maize | GHMM | | |
| GlimmerM | Small eukaryotes, Arabidopsis, rice | IMM | | |
| GRAIL | Human, mouse, Arabidopsis, Drosophila | NN, DP | | EST, cDNA |
| HMMgene | Vertebrates, | CHMM | | |
| AUGUSTUS | Human, Arabidopsis | IMM, WWAM | | |
| MZEF | Human, mouse, Arabidopsis, fission yeast | Quadratic discriminant analysis | | |
DP, dynamic programming; NN, neural network; MM, Markov model; HMM, hidden Markov model; CHMM, class HMM; GHMM, generalized HMM; IMM, interpolated MM; WWAM, windowed weight array model.
Fig. 1. State transition of an HMM modeling eukaryotic genes.
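The idea behind an HMM gene model can be sketched with a toy example: each position of the sequence is assigned a hidden state, and the Viterbi algorithm recovers the most probable state path. The three-state layout (intergenic, exon, intron) and every probability below are hypothetical values chosen for illustration only. They are not the states or parameters of the figure or of any program listed above, which use far richer models (splice sites, reading frames, length distributions).

```python
import math

STATES = ("N", "E", "I")  # toy state set: intergenic, exon, intron

# Hypothetical parameters for illustration only; not trained values.
START = {"N": 0.8, "E": 0.1, "I": 0.1}
TRANS = {
    "N": {"N": 0.90, "E": 0.10, "I": 0.00},  # intergenic cannot jump to intron
    "E": {"N": 0.05, "E": 0.85, "I": 0.10},
    "I": {"N": 0.00, "E": 0.10, "I": 0.90},  # intron must return to an exon
}
EMIT = {
    "N": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "E": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},  # GC-rich toy exons
    "I": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},  # AT-rich toy introns
}


def _log(p):
    """Log of a probability, mapping 0 to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq):
    """Return the most probable state path for seq, one letter per base."""
    if not seq:
        return ""
    # score[t][s]: best log-probability of any path ending in state s at t
    score = [{s: _log(START[s]) + _log(EMIT[s][seq[0]]) for s in STATES}]
    back = []  # back[t][s]: predecessor of s on that best path
    for base in seq[1:]:
        prev = score[-1]
        cur, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] + _log(TRANS[r][s]))
            cur[s] = prev[best] + _log(TRANS[best][s]) + _log(EMIT[s][base])
            ptr[s] = best
        score.append(cur)
        back.append(ptr)
    # Trace back from the best final state
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return "".join(reversed(path))


if __name__ == "__main__":
    print(viterbi("ATATATGCGCGCGCGCGCATATAT"))
```

Working in log space avoids numerical underflow on long sequences, and zero-probability transitions become negative infinity so that forbidden state changes never appear in the decoded path.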
Fig. 2. Evaluation of gene prediction accuracy at the exon level.
Accuracy Comparisons of Gene Prediction Programs
| Program | Sn | Sp | MR | WR |
|---|---|---|---|---|
| GENSCAN | 0.78 | 0.81 | 0.09 | 0.05 |
| FGENEH | 0.61 | 0.64 | 0.15 | 0.12 |
| GeneID | 0.44 | 0.46 | 0.28 | 0.24 |
| Genie | 0.55 | 0.48 | 0.17 | 0.33 |
| GenLang | 0.51 | 0.52 | 0.21 | 0.22 |
| GeneParser2 | 0.35 | 0.40 | 0.34 | 0.17 |
| GRAIL2 | 0.36 | 0.43 | 0.25 | 0.11 |
| SORFIND | 0.42 | 0.47 | 0.24 | 0.14 |
| Xpound | 0.15 | 0.18 | 0.33 | 0.13 |
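Exon-level measures of this kind can be computed as in the minimal sketch below. It assumes the commonly used convention that a predicted exon counts as correct only when both boundaries match an annotated exon exactly, and it reads MR and WR as the fractions of annotated exons with no overlapping prediction (missing) and of predicted exons with no overlapping annotation (wrong). That reading of the column headers is an assumption, and the function name and coordinates are hypothetical.

```python
def exon_level_accuracy(annotated, predicted):
    """Compare exons given as (start, end) tuples on a single sequence.

    Assumed definitions (not confirmed by the table itself):
      Sn = exactly matching exons / annotated exons
      Sp = exactly matching exons / predicted exons
      MR = annotated exons with no overlapping prediction / annotated exons
      WR = predicted exons with no overlapping annotation / predicted exons
    """
    annotated, predicted = set(annotated), set(predicted)
    correct = annotated & predicted  # both boundaries identical

    def overlaps(exon, others):
        # Closed intervals overlap when each starts before the other ends
        return any(exon[0] <= o[1] and o[0] <= exon[1] for o in others)

    missing = [e for e in annotated if not overlaps(e, predicted)]
    wrong = [e for e in predicted if not overlaps(e, annotated)]
    return {
        "Sn": len(correct) / len(annotated) if annotated else 0.0,
        "Sp": len(correct) / len(predicted) if predicted else 0.0,
        "MR": len(missing) / len(annotated) if annotated else 0.0,
        "WR": len(wrong) / len(predicted) if predicted else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical coordinates for illustration
    annotated = [(100, 200), (300, 400), (500, 600), (700, 800)]
    predicted = [(100, 200), (300, 420), (900, 950)]
    print(exon_level_accuracy(annotated, predicted))
```

In this example only one of the four annotated exons is predicted exactly, so Sn is 0.25. The prediction (300, 420) overlaps an annotated exon without matching its boundaries, so it counts as neither correct nor wrong.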