Chen Xie, Xizeng Mao, Jiaju Huang, Yang Ding, Jianmin Wu, Shan Dong, Lei Kong, Ge Gao, Chuan-Yun Li, Liping Wei.
Abstract
High-throughput experimental technologies often identify dozens to hundreds of genes related to, or changed in, a biological or pathological process. From these genes one wants to identify biological pathways that may be involved and diseases that may be implicated. Here, we report a web server, KOBAS 2.0, which annotates an input set of genes with putative pathways and disease relationships based on mapping to genes with known annotations. It allows for both ID mapping and cross-species sequence similarity mapping. It then performs statistical tests to identify statistically significantly enriched pathways and diseases. KOBAS 2.0 incorporates knowledge across 1327 species from 5 pathway databases (KEGG PATHWAY, PID, BioCyc, Reactome and Panther) and 5 human disease databases (OMIM, KEGG DISEASE, FunDO, GAD and NHGRI GWAS Catalog). KOBAS 2.0 can be accessed at http://kobas.cbi.pku.edu.cn.
Year: 2011 PMID: 21715386 PMCID: PMC3125809 DOI: 10.1093/nar/gkr483
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. KOBAS 2.0 workflow. The input can be gene IDs, FASTA sequences, or tabular BLAST output. KOBAS 2.0 has two programs, ‘annotate’ and ‘identify’. The first program annotates input genes with pathways and diseases by ID mapping or sequence similarity mapping. The second program identifies statistically significantly enriched pathways and diseases.
Pathway and disease databases supported by KOBAS 2.0
| Database name | Data content | File format | Number of species | Number of pathways or diseases in human | Number of genes mapped to KEGG GENES/all genes in human | URL |
|---|---|---|---|---|---|---|
| KEGG PATHWAY | Pathway | Text | 1327 | 220 | 5595/5595 | |
| PID Curated | Pathway | XML | 1 | 192 | 2782/3315 | |
| PID BioCarta | Pathway | XML | 1 | 254 | 1907/2391 | |
| PID Reactome | Pathway | XML | 1 | 996 | 3783/4405 | |
| BioCyc | Pathway | Text and Table | 6 | 277 | 1087/1120 | |
| Reactome | Pathway | Table | 22 | 68 | 4366/4534 | |
| Panther | Pathway | Table | 43 | 154 | 2170/2207 | |
| OMIM | Disease | Table | 1 | 4990 | 3792/3792 | |
| KEGG DISEASE | Disease | Text | 1 | 323 | 798/798 | |
| FunDO | Disease | Table | 1 | 561 | 3888/4029 | |
| GAD | Disease | Table | 1 | 3770 | 3164/3238 | |
| NHGRI | Disease | Table | 1 | 369 | 1975/2191 | |
a The numbers in this table are summarized from the KOBAS 2.0 backend database as updated on November 23, 2010. All analyses using KOBAS 2.0 in this article are based on this data version.
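The "genes mapped to KEGG GENES/all genes" column can be read as a mapping coverage. As a small worked example, the sketch below converts a few of those cells into fractions. The function name `coverage` and the dictionary are illustrative only and are not part of KOBAS 2.0; the cell values are copied from the table above.

```python
def coverage(cell: str) -> float:
    """Turn a 'mapped/all' table cell such as '2782/3315' into a fraction."""
    mapped, total = (int(part) for part in cell.split("/"))
    return mapped / total


# Cells copied from the table above (human genes only).
table_cells = {
    "KEGG PATHWAY": "5595/5595",
    "PID Curated": "2782/3315",
    "BioCyc": "1087/1120",
    "NHGRI": "1975/2191",
}

for name, cell in table_cells.items():
    print(f"{name}: {coverage(cell):.1%} of genes mapped to KEGG GENES")
```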
Figure 2. Screenshot of the output of ‘annotate’. The 371 upregulated probe sets in CA are assigned to KEGG human genes by sequence similarity mapping. Users can view the result in table format (the default) or in raw format (which can be downloaded to a local disk). Users can also pass the result directly to ‘identify’ as input for further analysis.
Figure 3. Screenshot of the output of ‘identify’. Statistically significantly enriched pathways and diseases identified for the 371 upregulated probe sets in CA are sorted by increasing corrected P-value. Only those with corrected P ≤ 0.05 are shown. As with the output of ‘annotate’, users can view the result in table format (the default) or in raw format (which can be downloaded to a local disk).
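The kind of analysis summarized in Figure 3 (test each term for enrichment, correct the P-values, sort by corrected P-value, keep corrected P ≤ 0.05) can be sketched as follows. This record does not say which statistical test or multiple-testing correction KOBAS 2.0 applies, so the sketch assumes a one-sided hypergeometric test and the Benjamini-Hochberg correction, both common choices for enrichment analysis. All function names and the toy counts are hypothetical and are not taken from KOBAS 2.0.

```python
from math import comb


def hypergeom_sf(k, n_background, n_term, n_input):
    """P(X >= k): chance of seeing at least k term genes among n_input genes
    drawn without replacement from n_background genes, n_term of which
    carry the annotation."""
    upper = min(n_term, n_input)
    total = comb(n_background, n_input)
    favourable = sum(
        comb(n_term, i) * comb(n_background - n_term, n_input - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg corrected P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so the corrected values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def identify_enriched(terms, n_input, n_background, alpha=0.05):
    """terms maps a term name to (hits in input set, genes annotated in background).
    Returns (name, P, corrected P) rows sorted by increasing corrected P,
    keeping only rows with corrected P <= alpha."""
    names = list(terms)
    pvals = [
        hypergeom_sf(terms[name][0], n_background, terms[name][1], n_input)
        for name in names
    ]
    corrected = benjamini_hochberg(pvals)
    rows = sorted(zip(names, pvals, corrected), key=lambda row: row[2])
    return [row for row in rows if row[2] <= alpha]


if __name__ == "__main__":
    # Toy counts, invented for illustration only.
    toy_terms = {"pathway_A": (8, 40), "pathway_B": (1, 200)}
    for name, p, q in identify_enriched(toy_terms, n_input=20, n_background=1000):
        print(f"{name}\tP={p:.3g}\tcorrected P={q:.3g}")
```

In the toy data, `pathway_A` has 8 hits where fewer than 1 would be expected by chance, so it passes the cutoff, while `pathway_B` has fewer hits than expected and is filtered out.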