Evan Keibler, Michael R Brent.
Abstract
SUMMARY: Eval is a flexible tool for analyzing the performance of gene annotation systems. It provides summaries and graphical distributions for many descriptive statistics about any set of annotations, regardless of their source. It also compares sets of predictions to standard annotations and to one another. Input is in the standard Gene Transfer Format (GTF). Eval can be run interactively or via the command line, in which case output options include easily parsable tab-delimited files. AVAILABILITY: To obtain the module package with documentation, go to http://genes.cse.wustl.edu/ and follow links for Resources, then Software. Please contact brent@cse.wustl.edu
Year: 2003 PMID: 14565849 PMCID: PMC270064 DOI: 10.1186/1471-2105-4-50
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
A sampling of the less common statistics calculated by Eval when comparing the output of TWINSCAN and GENSCAN on the "semi-artificial" gene set used in [1] to the gold standard annotation. Standard statistics such as gene and exon sensitivity and specificity are also calculated but are not shown.
| Feature | Statistic | TWINSCAN | GENSCAN |
| Transcripts | Exons Per Transcript | 6.46 | 5.93 |
| | CDS Overlap Specificity | 96.55% | 70.59% |
| | CDS Overlap Sensitivity | 87.64% | 97.19% |
| | All Introns Matched Specificity | 26.90% | 8.60% |
| | All Introns Matched Sensitivity | 21.91% | 10.67% |
| | Start and Stop Codon Specificity | 44.14% | 17.65% |
| | Start and Stop Codon Sensitivity | 35.96% | 21.91% |
| Initial Exons | Overlap Specificity | 70.16% | 35.47% |
| | Overlap Sensitivity | 77.54% | 73.91% |
| Terminal Exons | 5' Splice Specificity | 74.36% | 36.22% |
| | 5' Splice Sensitivity | 74.64% | 71.01% |
| Introns | 80% Overlap Specificity | 73.11% | 48.07% |
| | 80% Overlap Sensitivity | 80.19% | 72.58% |
| Nucleotides | Correct Specificity | 84.61% | 64.76% |
| | Correct Sensitivity | 84.26% | 88.87% |
| Splice Acceptors | Correct Specificity | 77.23% | 52.69% |
| | Correct Sensitivity | 84.90% | 81.30% |
| Splice Donors | Correct Specificity | 76.18% | 53.02% |
| | Correct Sensitivity | 84.63% | 80.19% |
| Start Codons | Correct Specificity | 61.97% | 34.90% |
| | Correct Sensitivity | 49.44% | 37.64% |
| Stop Codons | Correct Specificity | 82.22% | 47.95% |
| | Correct Sensitivity | 62.36% | 58.99% |
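As a reading aid for the "Correct" rows above, the following is a minimal sketch of how sensitivity and specificity are conventionally computed in gene-prediction evaluation: sensitivity is the fraction of annotated features that are predicted correctly, and specificity is the fraction of predicted features that are correct (called precision in other fields). It assumes Eval follows this convention and that "correct" means an exact coordinate match. The function name and the feature tuples are hypothetical and are not taken from Eval itself.

```python
def sensitivity_specificity(predicted, annotated):
    """Return (sensitivity, specificity) for two collections of features.

    Each feature is any hashable value, e.g. a (start, end) tuple.
    A prediction counts as correct only if it matches an annotation exactly
    (an assumption made for this sketch).
    """
    predicted, annotated = set(predicted), set(annotated)
    true_positives = len(predicted & annotated)
    # Sensitivity: share of annotated features that were found.
    sensitivity = true_positives / len(annotated) if annotated else 0.0
    # Specificity (gene-finding usage): share of predictions that are right.
    specificity = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Hypothetical toy data: four predicted features, three annotated ones.
predicted_features = [(100, 200), (300, 400), (500, 600), (700, 800)]
annotated_features = [(100, 200), (300, 400), (900, 1000)]
sn, sp = sensitivity_specificity(predicted_features, annotated_features)
```

In this toy case two features match, so sensitivity is 2/3 and specificity is 2/4.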
Figure 1. Panel A. Distributions of exons-per-gene for TWINSCAN [4] and GENSCAN [5] gene predictions and RefSeq mRNA sequences aligned to the genome. The plot reveals that, although TWINSCAN predicts too few genes in the 5–20 exon range, it predicts the right proportion of genes with more than 25 exons. Panel B. Fraction of RefSeq genes that TWINSCAN and GENSCAN predict exactly right, as a function of the genomic length of the RefSeq, excluding UTRs. Both figures were made in Excel by importing Eval output as tab-separated files. Data in both panels were generated using the NCBI34 version of the human genome and TWINSCAN 1.2.
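To illustrate the kind of data behind Panel A, here is a minimal sketch that counts coding exons per transcript from GTF lines and writes the distribution as tab-separated text. It is not Eval's implementation. It assumes the standard nine-column GTF layout with a `transcript_id "..."` attribute, and it counts `CDS` features as exons, which is a choice made for this sketch. The function names and sample records are hypothetical.

```python
import re
from collections import Counter, defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def exons_per_transcript(gtf_lines, feature="CDS"):
    """Count features of one type per transcript_id in GTF text lines."""
    counts = defaultdict(int)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature:
            continue  # malformed line or a feature type we do not count
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


def histogram_tsv(counts):
    """Format the exons-per-transcript distribution as tab-separated text."""
    histogram = Counter(counts.values())
    rows = ["exons\ttranscripts"]
    rows += [f"{n}\t{histogram[n]}" for n in sorted(histogram)]
    return "\n".join(rows)


# Hypothetical GTF records: one two-exon transcript and one single-exon transcript.
sample_gtf = [
    'chr1\tTWINSCAN\tstart_codon\t100\t102\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t1";',
    'chr1\tTWINSCAN\tCDS\t100\t200\t.\t+\t0\tgene_id "g1"; transcript_id "g1.t1";',
    'chr1\tTWINSCAN\tCDS\t300\t400\t.\t+\t1\tgene_id "g1"; transcript_id "g1.t1";',
    'chr1\tTWINSCAN\tCDS\t900\t990\t.\t-\t0\tgene_id "g2"; transcript_id "g2.t1";',
]
per_transcript = exons_per_transcript(sample_gtf)
tsv_text = histogram_tsv(per_transcript)
```

The resulting text has one header row and one row per exon count, so it can be imported into a spreadsheet in the same way the caption describes for Eval output.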
The results of building a Venn diagram based on exact exon matches among the aligned RefSeqs, TWINSCAN 1.2 predictions, and GENSCAN predictions, on the NCBI34 build of the human genome. All exons are first combined into clusters that have the same begin and end points. These clusters are then partitioned into the subset of exons annotated only by RefSeq (R), the subset annotated only by TWINSCAN (T), the subset annotated only by GENSCAN (G), the subset annotated by RefSeq and TWINSCAN but not GENSCAN (RT), etc. For each of these subsets, the table shows the number of clusters in the subset. It also shows the percentage of all exons from each of the input sets that are included in that subset. The last column shows the percentage of all clusters included in that subset.
| Subset in partition | Cluster Count | % of RefSeq exons | % of TWINSCAN exons | % of GENSCAN exons | % of all clusters |
| R | 29,680 | 20.29% | 0.00% | 0.00% | 7.21% |
| T | 44,672 | 0.00% | 22.04% | 0.00% | 10.84% |
| G | 166,765 | 0.00% | 0.00% | 51.72% | 40.48% |
| RT | 15,141 | 10.55% | 7.47% | 0.00% | 3.68% |
| RG | 12,812 | 9.29% | 0.00% | 3.97% | 3.11% |
| TG | 57,795 | 0.00% | 28.52% | 17.92% | 14.03% |
| RTG | 85,069 | 59.88% | 41.97% | 26.38% | 20.65% |
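The partitioning described in the caption can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not Eval's own code. Exons are represented as hypothetical `(seqname, strand, start, end)` tuples, and a cluster is taken to be all exons with identical tuples. An input set may list the same exon more than once (for example, from alternative transcripts), so the per-set percentages are computed over exons while the cluster count is computed over distinct coordinates. The function name and the toy data are invented for the example.

```python
from collections import Counter, defaultdict


def venn_partition(annotation_sets):
    """Partition exact-match exon clusters by which input sets contain them.

    annotation_sets maps a one-letter label (e.g. "R", "T", "G") to a list
    of exon tuples. Returns a dict keyed by subset name such as "RT".
    """
    labels = list(annotation_sets)
    # cluster coordinates -> number of exons contributed by each label
    clusters = defaultdict(Counter)
    for label, exons in annotation_sets.items():
        for coords in exons:
            clusters[coords][label] += 1

    totals = {label: len(exons) for label, exons in annotation_sets.items()}
    cluster_counts = Counter()
    exon_counts = defaultdict(Counter)
    for members in clusters.values():
        # Subset name lists the contributing labels in input order.
        subset = "".join(label for label in labels if label in members)
        cluster_counts[subset] += 1
        for label, n in members.items():
            exon_counts[subset][label] += n

    rows = {}
    for subset, n_clusters in cluster_counts.items():
        rows[subset] = {
            "clusters": n_clusters,
            "pct_of_set": {
                label: (100.0 * exon_counts[subset][label] / totals[label]
                        if totals[label] else 0.0)
                for label in labels
            },
            "pct_of_clusters": 100.0 * n_clusters / len(clusters),
        }
    return rows


# Hypothetical toy input; the first RefSeq exon appears in two transcripts.
example_sets = {
    "R": [("chr1", "+", 100, 200), ("chr1", "+", 100, 200), ("chr1", "+", 300, 400)],
    "T": [("chr1", "+", 100, 200), ("chr1", "+", 500, 600)],
    "G": [("chr1", "+", 300, 400), ("chr1", "+", 500, 600), ("chr1", "+", 700, 800)],
}
rows = venn_partition(example_sets)
```

With this toy input there are four clusters, one each in RT, RG, TG, and G. Counting exons rather than clusters for the per-set percentages is consistent with the table above, where the TWINSCAN column follows directly from the cluster counts (44,672 of the 202,677 clusters containing a TWINSCAN exon is 22.04%) while the RefSeq column does not.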