Manal Kalkatawi, Intikhab Alam, Vladimir B Bajic.
Abstract
BACKGROUND: Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs).
Year: 2015 PMID: 26283419 PMCID: PMC4539851 DOI: 10.1186/s12864-015-1826-4
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 BEACON workflow. This diagram illustrates the flow of processes in the BEACON tool.
Statistics for the different annotations of the H. utahensis genome, along with the extended annotations. For orphan and functional genes, we show the actual number of genes and the percentage relative to the total number of annotated genes.
| Annotation features | NCBI: Original | NCBI: Complemented by annotation of function from AAMG and RAST | AAMG: Original | AAMG: Complemented by annotation of function from NCBI and RAST | RAST: Original | RAST: Complemented by annotation of function from NCBI and AAMG | Extended annotations: EA | Extended annotations: Unique | Extended annotations: EUA |
|---|---|---|---|---|---|---|---|---|---|
| CDS | 2998 | 2998 | 3040 | 3040 | 3041 | 3041 | 2980 | 698 | 3678 |
| rRNA | 4 | 4 | 3 | 3 | 3 | 3 | 4 | 0 | 4 |
| tRNA | 45 | 45 | 45 | 45 | 45 | 45 | 45 | 0 | 45 |
| ncRNA | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| frameshift/Pseudo | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 3048 | 3048 | 3088 | 3088 | 3089 | 3089 | 3029 | 699 | 3728 |
| Orphan genes | 1014 (33.27 %) | 777 (25.49 %) | 885 (28.66 %) | 837 (27.10 %) | 1203 (38.94 %) | 819 (26.51 %) | 672 (22.19 %) | 399 (57.08 %) | 1071 (28.73 %) |
| Functional genes | 2034 (66.73 %) | 2271 (74.51 %) | 2203 (71.34 %) | 2251 (72.90 %) | 1886 (61.06 %) | 2270 (73.49 %) | 2357 (77.81 %) | 300 (42.92 %) | 2657 (71.27 %) |
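The percentages in the last two rows follow directly from the counts, as the caption describes. Here is a minimal sketch of that arithmetic for the three original annotations, using only numbers from the table above. The dictionary layout and the helper name `share` are illustrative choices, not part of BEACON.

```python
# Counts copied from the table above (H. utahensis, original annotations).
totals = {"NCBI": 3048, "AAMG": 3088, "RAST": 3089}   # "Total" row
orphans = {"NCBI": 1014, "AAMG": 885, "RAST": 1203}   # "Orphan genes" row


def share(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


# Orphan share per annotation method (AM).
orphan_pct = {am: share(orphans[am], totals[am]) for am in totals}

# Functional genes are the annotated genes that are not orphans.
functional = {am: totals[am] - orphans[am] for am in totals}
functional_pct = {am: share(functional[am], totals[am]) for am in totals}

for am in totals:
    print(am, orphans[am], orphan_pct[am], functional[am], functional_pct[am])
```

The same helper applies to the complemented and extended columns if their totals and orphan counts are substituted.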
Fig. 2 AMs features comparison stats. Histogram showing the distribution of the statistical and comparison data for the H. utahensis genome.
Fig. 3 Venn diagram showing unique/common genes from different AMs. The diagram illustrates the intersections between the different annotations of the H. utahensis genome.
AAMG and RAST annotations compared to the NCBI annotation, which is taken as the reference. False Negatives (FN) are genes that exist in the NCBI annotation but are not predicted by an AM. False Positives (FP) are genes predicted by an AM but not present in the NCBI annotation.
| Gene calls | Genes annotated by RAST | % of NCBI genes | Genes annotated by AAMG | % of NCBI genes |
|---|---|---|---|---|
| Detected identical | 2421 (CDS = 2376 rRNA = 0 tRA | 79.43 % | 2780 (CDS = 2732 rRNA = 3 tRA | 91.21 % |
| Detected similar | 126 (CDS = 123 rRNA = 3 tRA | 4.13 % | 71 (CDS = 71 rRNA = 0 tRA | 2.33 % |
| FN - Short overlap | 74 | 2.43 % | 32 | 1.05 % |
| FN - No overlap | 427 | 14.01 % | 165 | 5.41 % |
| FP - Short overlap | 13 | - | 0 | - |
| FP - No overlap | 529 | - | 237 | - |
| Similarity score | 83.00 % | | 92.93 % | |
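The "% of NCBI genes" columns can be checked against the NCBI total of 3048 genes reported in the first table. The sketch below assumes that the four reference-side categories (identical, similar, and the two FN rows) partition the NCBI gene set, which the counts support because they sum to 3048 for both AMs. The key names are hypothetical labels chosen for this example. The similarity score is not recomputed, since its formula is not given here.

```python
NCBI_TOTAL = 3048  # total NCBI genes, from the "Total" row of the first table

# Reference-side gene calls per AM, copied from the table above.
calls = {
    "RAST": {"identical": 2421, "similar": 126, "fn_short": 74, "fn_none": 427},
    "AAMG": {"identical": 2780, "similar": 71, "fn_short": 32, "fn_none": 165},
}


def pct_of_reference(count, reference_total=NCBI_TOTAL):
    """Express a gene count as a percentage of the reference annotation."""
    return round(100.0 * count / reference_total, 2)


percentages = {}
fn_total = {}
for am, counts in calls.items():
    # Every NCBI gene falls into exactly one of the four categories.
    assert sum(counts.values()) == NCBI_TOTAL
    percentages[am] = {name: pct_of_reference(n) for name, n in counts.items()}
    # All false negatives: NCBI genes the AM did not recover.
    fn_total[am] = counts["fn_short"] + counts["fn_none"]

print(percentages)
print(fn_total)
```

False positives are left out of the percentages because they are not NCBI genes, which is why the table shows "-" in those cells.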