Keith Knapp, Yi-Ping Phoebe Chen.
Abstract
We present an independent evaluation of six recent hidden Markov model (HMM) genefinders. Each was tested on a new dataset (FSH298), and the results showed no dramatic improvement over the genefinders tested five years ago. In addition, we introduce a comprehensive taxonomy of predicted exons and classify each resulting exon accordingly. These results are useful for measuring, with finer granularity, the effects of changes in a genefinder. We present an analysis of these results and identify four patterns of inaccuracy common to all HMM-based results.
Year: 2006 PMID: 17170005 PMCID: PMC1802560 DOI: 10.1093/nar/gkl1026
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Performance of six Genefinders on the FSH298 dataset
| Genefinder | No genes | SN | SP | AC | CC | CR | PC | OL | ME | WE | SNE | SPE | AVG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Twinscan | 7 | 0.90 | 0.95 | 0.89 | 0.88 | 0.50 | 0.34 | 0.07 | 0.12 | 0.07 | 0.59 | 0.51 | 0.55 |
| GenomeScan | 43 | 0.88 | 0.83 | 0.72 | 0.81 | 0.63 | 0.07 | 0.01 | 0.26 | 0.14 | 0.76 | 0.74 | 0.75 |
| GlimmerHMM | 9 | 0.89 | 0.79 | 0.80 | 0.80 | 0.61 | 0.13 | 0.03 | 0.14 | 0.21 | 0.69 | 0.63 | 0.66 |
| Augustus | 0 | 0.81 | 0.78 | 0.78 | 0.76 | 0.63 | 0.12 | 0.01 | 0.15 | 0.17 | 0.64 | 0.63 | 0.64 |
| GeneZilla | 0 | 0.70 | 0.67 | 0.67 | 0.65 | 0.40 | 0.16 | 0.05 | 0.17 | 0.31 | 0.47 | 0.40 | 0.44 |
| SNAP ( | 9 | 0.72 | 0.71 | 0.69 | 0.66 | 0.35 | 0.20 | 0.08 | 0.31 | 0.34 | 0.40 | 0.36 | 0.38 |
| SNAP ( | 7 | 0.47 | 0.22 | 0.22 | 0.19 | 0.04 | 0.10 | 0.09 | 0.52 | 0.76 | 0.11 | 0.04 | 0.08 |
Metrics are reported at the whole-genome, nucleotide and exon levels. At the whole-genome level, No genes is the number of sequences for which no gene was predicted. At the nucleotide level, sensitivity (SN), specificity (SP), approximate correlation (AC) and the correlation coefficient (CC) are displayed. At the exon level, the columns are correct exons (CR), partially correct exons (PC), overlapping exons (OL), missed exons (ME), wrong exons (WE), exon sensitivity (SNE), exon specificity (SPE) and the mean (AVG) of SNE and SPE. All genefinders successfully completed each of the 298 sequences except Twinscan and GenomeScan, which completed 295 and 294, respectively. SNAP was trained on two organisms, A. thaliana (A.thal) and H. sapiens (H.sap).
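The caption defines AVG as the mean of SNE and SPE, which can be checked directly against the table. The sketch below is only an illustration: the half-up rounding to two decimals is an assumption (the source does not state a rounding rule), and because the two SNAP row labels are truncated in the table they are distinguished here only by row order.

```python
from decimal import Decimal, ROUND_HALF_UP

# (SNE, SPE, reported AVG) copied from the table above.
# The SNAP labels are truncated in the source, so the two rows are
# identified by their order only (hypothetical names).
ROWS = {
    "Twinscan": ("0.59", "0.51", "0.55"),
    "GenomeScan": ("0.76", "0.74", "0.75"),
    "GlimmerHMM": ("0.69", "0.63", "0.66"),
    "Augustus": ("0.64", "0.63", "0.64"),
    "GeneZilla": ("0.47", "0.40", "0.44"),
    "SNAP (first row)": ("0.40", "0.36", "0.38"),
    "SNAP (second row)": ("0.11", "0.04", "0.08"),
}


def exon_avg(sne: str, spe: str) -> Decimal:
    """Mean of exon sensitivity and specificity, rounded half-up (assumed)."""
    mean = (Decimal(sne) + Decimal(spe)) / 2
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


computed = {name: exon_avg(sne, spe) for name, (sne, spe, _) in ROWS.items()}

# Rows whose recomputed AVG differs from the value printed in the table.
mismatches = [
    name for name, (_, _, reported) in ROWS.items()
    if computed[name] != Decimal(reported)
]

for name, value in computed.items():
    print(f"{name:18s} AVG = {value}")
print("rows that disagree with the table:", mismatches)
```

Decimal arithmetic is used instead of floats so that halfway cases such as 0.635 round predictably.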
Figure 1. Class 1 exons. Match exactly at both boundaries.
Figure 2. Partially correct. Classes 3–6 match only one boundary.
Figure 3. Overlapping exons. No boundaries match but the exons do overlap true annotated exons for classes 2, 7, 8 and 9.
Figure 4. Wrong exons. Classes 10 and 11 reverse a boundary. Classes 12 and 13 neither match a boundary nor overlap an annotated exon.
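The four groups described in the figure captions can be sketched as a simple boundary comparison. This is a minimal illustration and not the authors' implementation: the function name, the inclusive `(start, end)` coordinate convention and the single-strand assumption are hypothetical, and it assigns only the four coarse groups, not the 13 individual classes (the reversed-boundary classes 10 and 11 are not modelled separately).

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end), inclusive, same strand (assumed)


def classify_exon(pred: Exon, annotated: List[Exon]) -> str:
    """Assign a predicted exon to one of the four coarse groups."""
    start, end = pred

    # Both boundaries match one annotated exon exactly.
    if pred in annotated:
        return "correct"

    # Exactly one boundary coincides with an annotated boundary.
    if any(start == a_start or end == a_end for a_start, a_end in annotated):
        return "partially correct"

    # No boundary matches, but the intervals share at least one position.
    if any(start <= a_end and a_start <= end for a_start, a_end in annotated):
        return "overlapping"

    # Neither a matching boundary nor any overlap.
    return "wrong"


annotated = [(100, 200), (300, 400)]
for pred in [(100, 200), (100, 250), (150, 350), (500, 600)]:
    print(pred, "->", classify_exon(pred, annotated))
```

A full classifier for the taxonomy would need the per-class definitions from the figures, which are not reproduced in this text.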
Distribution of predicted exons by feature
| Program | Initial | Internal | Term | Single |
|---|---|---|---|---|
| Actual exon distribution | 0.10 | 0.78 | 0.10 | 0.01 |
| Mean for all genefinders | 0.12 | 0.72 | 0.12 | 0.02 |
| Augustus | 0.11 | 0.73 | 0.13 | 0.03 |
| GeneZilla | 0.11 | 0.77 | 0.11 | 0.01 |
| GenomeScan | 0.11 | 0.76 | 0.11 | 0.02 |
| GlimmerHMM | 0.12 | 0.73 | 0.13 | 0.02 |
| Twinscan | 0.08 | 0.70 | 0.05 | 0.01 |
| SNAP ( | 0.20 | 0.60 | 0.17 | 0.03 |
For each program, the proportion of exons predicted as a particular feature is displayed. Term is an abbreviation for Terminal.
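One way to compare each predicted distribution with the actual one is to sum the absolute differences across the four features. The sketch below does this with the table values; the distance measure is an illustrative choice and is not taken from the paper, and the SNAP row is keyed simply as "SNAP" because its label is truncated in the table.

```python
FEATURES = ("Initial", "Internal", "Term", "Single")

# Values copied from the table above, in FEATURES order.
ACTUAL = (0.10, 0.78, 0.10, 0.01)
PREDICTED = {
    "Augustus": (0.11, 0.73, 0.13, 0.03),
    "GeneZilla": (0.11, 0.77, 0.11, 0.01),
    "GenomeScan": (0.11, 0.76, 0.11, 0.02),
    "GlimmerHMM": (0.12, 0.73, 0.13, 0.02),
    "Twinscan": (0.08, 0.70, 0.05, 0.01),
    "SNAP": (0.20, 0.60, 0.17, 0.03),  # label truncated in the source
}


def l1_distance(pred, actual):
    """Sum of absolute per-feature differences between two distributions."""
    return sum(abs(p - a) for p, a in zip(pred, actual))


distances = {
    name: round(l1_distance(values, ACTUAL), 2)
    for name, values in PREDICTED.items()
}
closest = min(distances, key=distances.get)

for name, dist in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{name:11s} {dist:.2f}")
print("closest to the actual distribution:", closest)
```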
PET distribution
| Class | Augustus (%) | GeneZilla (%) | GenomeScan (%) | GlimmerHMM (%) | Twinscan (%) | SNAP (%) | Avg (%) |
|---|---|---|---|---|---|---|---|
| 1 | 72.19 | 49.00 | 73.28 | 65.33 | 82.42 | 36.08 | 63.05 |
| 13 | 11.54 | 31.67 | 14.41 | 18.09 | 4.73 | 32.75 | 18.86 |
| 12 | 5.24 | 5.15 | 4.31 | 5.74 | 1.37 | 9.63 | 5.24 |
| 5 | 2.49 | 5.15 | 1.73 | 2.09 | 2.61 | 4.58 | 3.11 |
| 4 | 4.27 | 2.00 | 2.04 | 2.70 | 0.83 | 5.79 | 2.94 |
| 3 | 3.63 | 2.67 | 2.29 | 3.25 | 2.20 | 3.26 | 2.89 |
| 6 | 1.06 | 2.26 | 1.36 | 1.26 | 4.48 | 3.50 | 2.32 |
| 8 | 0.21 | 0.68 | 0.19 | 0.38 | 0.25 | 2.12 | 0.64 |
| 2 | 0.38 | 0.29 | 0.19 | 0.51 | 0.17 | 1.31 | 0.47 |
| 9 | 0.38 | 0.64 | 0.42 | 0.44 | 0.41 | 0.44 | 0.46 |
| 7 | 0.13 | 0.48 | 0.00 | 0.21 | 0.54 | 0.54 | 0.32 |
| 10 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 11 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
For each class, the mean across genefinders of its percentage of all predicted exons was calculated (Avg column). On average, about 5% of all predicted exons are class 12, while ∼19% are class 13.
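The per-class averages can be rolled up into the four groups named in the figure captions (class 1 correct; classes 3–6 partially correct; classes 2, 7, 8 and 9 overlapping; classes 10–13 wrong). The following sketch uses the Avg column only; the dictionary and variable names are hypothetical.

```python
# Avg (%) column from the table above, keyed by class number.
AVG_BY_CLASS = {
    1: 63.05, 13: 18.86, 12: 5.24, 5: 3.11, 4: 2.94, 3: 2.89, 6: 2.32,
    8: 0.64, 2: 0.47, 9: 0.46, 7: 0.32, 10: 0.00, 11: 0.00,
}

# Grouping of classes as described in the figure captions.
GROUPS = {
    "correct": (1,),
    "partially correct": (3, 4, 5, 6),
    "overlapping": (2, 7, 8, 9),
    "wrong": (10, 11, 12, 13),
}

# Sum the average percentage of each class within its group.
totals = {
    group: round(sum(AVG_BY_CLASS[c] for c in classes), 2)
    for group, classes in GROUPS.items()
}

for group, pct in totals.items():
    print(f"{group:18s} {pct:6.2f}%")
```

The totals are reported as computed from the table and are not normalised.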
Class seven exon distribution by genefinder and feature
| Genefinder | Initial | Internal | Term | Single |
|---|---|---|---|---|
| Augustus | 0 | 2 | 1 | 0 |
| GeneZilla | 3 | 9 | 1 | 2 |
| GenomeScan | 0 | 0 | 0 | 0 |
| GlimmerHMM | 2 | 4 | 0 | 0 |
| Twinscan | 0 | 8 | 1 | 4 |
| SNAP ( | 6 | 9 | 1 | 0 |
GenomeScan was the only genefinder to predict zero exons for a class (not including classes 10 and 11, which no genefinder predicted). Term is an abbreviation for Terminal.
Initial and terminal exon comparison
| Genefinder | Class 3 Initial | Class 5 Initial | Class 4 Terminal | Class 6 Terminal | Class 4 Single | Class 6 Single |
|---|---|---|---|---|---|---|
| Augustus | 12 | 22 | 26 | 1 | 6 | 0 |
| GeneZilla | 17 | 52 | 22 | 8 | 1 | 0 |
| GenomeScan | 2 | 15 | 5 | 3 | 1 | 0 |
| GlimmerHMM | 16 | 36 | 23 | 6 | 4 | 0 |
| Twinscan | 8 | 29 | 3 | 3 | 2 | 0 |
| SNAP ( | 35 | 74 | 92 | 3 | 8 | 0 |
| SNAP ( | 41 | 99 | 74 | 9 | 28 | 0 |