Takatomo Fujisawa, Shinobu Okamoto, Toshiaki Katayama, Mitsuteru Nakao, Hidehisa Yoshimura, Hiromi Kajiya-Kanegae, Sumiko Yamamoto, Chiyoko Yano, Yuka Yanaka, Hiroko Maita, Takakazu Kaneko, Satoshi Tabata, Yasukazu Nakamura.
Abstract
To understand newly sequenced genomes of closely related species, comprehensively curated reference genome databases are becoming increasingly important. We have extended CyanoBase (http://genome.microbedb.jp/cyanobase), a genome database for cyanobacteria, and newly developed RhizoBase (http://genome.microbedb.jp/rhizobase), a genome database for rhizobia, nitrogen-fixing bacteria associated with leguminous plants. Both databases focus on the representation and reusability of reference genome annotations, which are continuously updated by manual curation. Domain experts have extracted names, products and functions of each gene reported in the literature. To ensure effectiveness of this procedure, we developed the TogoAnnotation system offering a web-based user interface and a uniform storage of annotations for the curators of the CyanoBase and RhizoBase databases. The number of references investigated for CyanoBase increased from 2260 in our previous report to 5285, and for RhizoBase, we perused 1216 references. The results of these intensive annotations are displayed on the GeneView pages of each database. Advanced users can also retrieve this information through the representational state transfer-based web application programming interface in an automated manner.
Year: 2013 PMID: 24275496 PMCID: PMC3965071 DOI: 10.1093/nar/gkt1145
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. An example GeneView page for the sll1867 gene of Synechocystis sp. PCC 6803. Manually curated gene symbol(s) and gene product(s) are shown in the ‘Gene symbol Extracted from literature’ and ‘Gene product Extracted from literature’ fields in the ‘Summary’ section.
Number of curated publications and annotated genes for each organism of CyanoBase and RhizoBase
| Database | Organism | References | Annotations | Annotated genes | Total genes |
|---|---|---|---|---|---|
| CyanoBase | | 2346 | 80 204 | 3064 | 3725 |
| CyanoBase | | 959 | 29 154 | 2754 | 6223 |
| CyanoBase | | 815 | 17 060 | 794 | 2715 |
| CyanoBase | | 270 | 6768 | 2528 | 2528 |
| CyanoBase | | 264 | 3999 | 265 | 3235 |
| CyanoBase | | 151 | 3349 | 768 | 6794 |
| CyanoBase | | 143 | 5532 | 751 | 2310 |
| CyanoBase | | 119 | 1731 | 258 | 5724 |
| CyanoBase | | 64 | 2155 | 390 | 1756 |
| CyanoBase | | 52 | 5600 | 4483 | 4484 |
| CyanoBase | | 44 | 919 | 248 | 2326 |
| CyanoBase | | 37 | 539 | 135 | 1928 |
| CyanoBase | | 9 | 787 | 260 | 6676 |
| CyanoBase | | 5 | 22 | 14 | 4498 |
| CyanoBase | | 5 | 38 | 22 | 2579 |
| CyanoBase | | 2 | 5 | 2 | 2580 |
| RhizoBase | | 550 | 26 636 | 8366 | 8374 |
| RhizoBase | | 240 | 9801 | 1990 | 6287 |
| RhizoBase | | 115 | 2373 | 865 | 7343 |
| RhizoBase | | 107 | 5224 | 989 | 990 |
| RhizoBase | | 83 | 3426 | 781 | 7342 |
| RhizoBase | | 8 | 46 | 17 | 6437 |
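The table reports annotated and total gene counts per organism but not the resulting coverage. A minimal Python sketch of that arithmetic follows, assuming coverage is simply annotated genes divided by total genes. The function names `coverage` and `reference_totals` are illustrative, not part of either database, and rows are identified only by database and position because organism names are absent from this record.

```python
from collections import defaultdict

# Rows copied from the table above:
# (database, references, annotations, annotated genes, total genes).
ROWS = [
    ("CyanoBase", 2346, 80204, 3064, 3725),
    ("CyanoBase", 959, 29154, 2754, 6223),
    ("CyanoBase", 815, 17060, 794, 2715),
    ("CyanoBase", 270, 6768, 2528, 2528),
    ("CyanoBase", 264, 3999, 265, 3235),
    ("CyanoBase", 151, 3349, 768, 6794),
    ("CyanoBase", 143, 5532, 751, 2310),
    ("CyanoBase", 119, 1731, 258, 5724),
    ("CyanoBase", 64, 2155, 390, 1756),
    ("CyanoBase", 52, 5600, 4483, 4484),
    ("CyanoBase", 44, 919, 248, 2326),
    ("CyanoBase", 37, 539, 135, 1928),
    ("CyanoBase", 9, 787, 260, 6676),
    ("CyanoBase", 5, 22, 14, 4498),
    ("CyanoBase", 5, 38, 22, 2579),
    ("CyanoBase", 2, 5, 2, 2580),
    ("RhizoBase", 550, 26636, 8366, 8374),
    ("RhizoBase", 240, 9801, 1990, 6287),
    ("RhizoBase", 115, 2373, 865, 7343),
    ("RhizoBase", 107, 5224, 989, 990),
    ("RhizoBase", 83, 3426, 781, 7342),
    ("RhizoBase", 8, 46, 17, 6437),
]


def coverage(annotated_genes: int, total_genes: int) -> float:
    """Fraction of an organism's genes that carry at least one curated annotation."""
    return annotated_genes / total_genes


def reference_totals(rows):
    """Sum the per-organism reference counts for each database."""
    totals = defaultdict(int)
    for database, references, _annotations, _annotated, _total in rows:
        totals[database] += references
    return dict(totals)


if __name__ == "__main__":
    for database, references, _annotations, annotated, total in ROWS:
        print(f"{database:10s} refs={references:5d} coverage={coverage(annotated, total):6.1%}")
    print(reference_totals(ROWS))
```

For the first CyanoBase row this gives 3064 / 3725, or about 82.3% coverage. The CyanoBase reference counts sum to 5285, matching the figure in the abstract. The RhizoBase rows listed here sum to 1103, fewer than the 1216 stated in the abstract; this record does not explain the difference.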
Summary of data types and the number of items accessible from the SPARQL endpoint
| Data type | Number | RDF | Reference |
|---|---|---|---|
| CyanoBase | | | |
| Genome project | 39 | ○ | |
| Gene | 138 896 | ○ | |
| Publication | 5285 | | |
| Operon^a | 86 | ○ | |
| Protein complex^a | 68 | ○ | |
| Protein–protein interaction | 3054 | ○ | ( |
| RhizoBase | | | |
| Genome project | 20 | ○ | |
| Gene | 116 140 | ○ | |
| Publication | 1216 | | |
| Protein–protein interaction | 2987 | ○ | ( |
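Counts like those in the table could in principle be obtained by sending a `COUNT` query to a SPARQL endpoint and reading the standard SPARQL JSON results format. The Python sketch below shows only that generic mechanism. The endpoint URL, the class IRI and the sample response are hypothetical placeholders, since this record does not give the actual endpoint address or RDF vocabulary; the value 39 in the sample simply echoes the CyanoBase "Genome project" row.

```python
import json
from urllib.parse import urlencode

# Hypothetical placeholders: the real endpoint and class IRIs are not given in this record.
ENDPOINT = "http://example.org/sparql"
GENOME_PROJECT_CLASS = "http://example.org/vocab#GenomeProject"


def build_count_query(class_iri: str) -> str:
    """SPARQL query counting the distinct resources typed with the given class."""
    return f"SELECT (COUNT(DISTINCT ?s) AS ?n) WHERE {{ ?s a <{class_iri}> }}"


def build_request_url(endpoint: str, query: str) -> str:
    """GET URL per the SPARQL Protocol: the query text goes in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def parse_count(results_json: str) -> int:
    """Read the single ?n binding from a SPARQL JSON results document."""
    document = json.loads(results_json)
    return int(document["results"]["bindings"][0]["n"]["value"])


# Hand-written sample in the SPARQL JSON results format, not real endpoint output.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["n"]},
    "results": {"bindings": [{
        "n": {
            "type": "literal",
            "datatype": "http://www.w3.org/2001/XMLSchema#integer",
            "value": "39",
        }
    }]},
})

if __name__ == "__main__":
    print(build_request_url(ENDPOINT, build_count_query(GENOME_PROJECT_CLASS)))
    print(parse_count(SAMPLE_RESPONSE))
```

The sketch deliberately stops short of a network call. With a real endpoint, the URL would be fetched with an `Accept: application/sparql-results+json` header and the response body passed to `parse_count`.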