
GOChase-II: correcting semantic inconsistencies from Gene Ontology-based annotations for gene products.

Yu Rang Park, Jihun Kim, Hye Won Lee, Young Jo Yoon, Ju Han Kim.

Abstract

BACKGROUND: The Gene Ontology (GO) provides a controlled vocabulary for describing genes and gene products. Despite the undoubted importance of GO, several drawbacks of GO and GO-based annotations have been reported. We identified three types of semantic inconsistency in GO-based annotations: semantically redundant, biological-domain-inconsistent and taxonomy-inconsistent annotations.
METHODS: To detect semantic inconsistencies in GO annotations, we used the hierarchical structure of the GO graph and the tree structure of the NCBI taxonomy. Annotations from twenty-seven biological databases were collected and searched for semantically inconsistent annotations.
RESULTS: The distributions and possible causes of the semantic inconsistencies were investigated using twenty-seven biological databases with GO-based annotations. We found that some annotation evidence codes were associated with the inconsistencies. The numbers of gene products and species in a database, which are related to the complexity of database management, also correlate with the inconsistencies. Consequently, numerous annotation errors arise and are propagated throughout biological databases and GO-based high-level analyses. GOChase-II was developed to detect and correct both syntactic and semantic errors in GO-based annotations.
CONCLUSIONS: We identified several inconsistencies in GO-based annotations and provide software, GOChase-II, for correcting these semantic inconsistencies in addition to the corrections for syntactic errors already offered by GOChase-I.

Year:  2011        PMID: 21342572      PMCID: PMC3044297          DOI: 10.1186/1471-2105-12-S1-S40

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


Background

The Gene Ontology (GO) project was started to provide semantic standards for annotating the molecular attributes of genes and gene products [1]. GO is a controlled vocabulary for describing genes and gene products in terms of their associated biological processes, cellular components and molecular functions. Structurally, GO is a Directed Acyclic Graph (DAG) in which the terms are the nodes and the relationships are the edges [2]. GO has grown enormously. The number of organism groups participating in the GO Consortium has grown every quarter, from the initial three to roughly two dozen [3]. Many biological databases use GO to annotate the molecular attributes of genes and gene products [4,5]. GO-based analyses of microarray and mass spectrometry data have been carried out successfully [3]. Recently, a new generation of GO-based tools has been developed to enhance biological knowledge in tasks such as protein structure classification [6], gene-phenotype association prediction [7] and gene network building [8]. More details are available at the GO website (http://www.geneontology.org/GO.tools.shtml). The Unified Medical Language System (UMLS) Metathesaurus has been integrated with GO to expand UMLS into the biological domain [9]. Despite the undoubted importance of GO, several drawbacks of GO and GO-based annotations have been reported. Masseroli pointed out structural and semantic problems of GO such as metonymy, species-specific terms and multiple paths [10]. Dolan et al. evaluated the reliability of GO-based annotations [11]. Poor inter-annotator reliability of GO-based annotations for human-mouse orthologous gene pairs was reported between two gene-annotation groups, MGI and GOA. Park et al. identified syntactic errors caused by two GO-update operations, ‘new obsoletions’ and ‘new term merges’, used in the course of GO version changes [12].
They introduced GOChase to detect and correct syntactic errors and error propagation in GO-based annotations (http://www.snubi.org/software/GOChase/). In the present study, we further identified three types of semantic error in GO-based annotations: redundant, biological-domain-inconsistent and taxonomy-inconsistent annotations. The first type is “redundant annotation.” Under the current GO annotation paradigm, a gene annotated to a GO term is considered to be implicitly annotated to all parents of that term. Assigning both a parent and a child term to the same gene is therefore regarded as “redundant annotation.” When the parent and child terms are annotated to a gene product with different evidence codes, however, the annotations cannot be called completely redundant. For example, an experiment may provide enough evidence to annotate to a parent but not to any specific child, whereas a more specific annotation may be predicted by sequence comparison or other computation. In such cases both annotations would be retained, the parent because of its experimental support and the child for its specificity. We therefore analyzed redundant annotations separately according to whether the parent and child terms use the same evidence code. The second type is “biological-domain-inconsistent annotation.” A GO term should avoid species-specific definitions and should instead be applicable to more than one taxonomic class of organisms (The Gene Ontology Consortium, 2000). Some GO terms nevertheless have species-specific characteristics, such as nucleus (GO:0005634), which is specific to eukaryotes, and unidirectional conjugation (GO:0009291), which is specific to prokaryotes. As GO-based annotation expands to more species, such species-specific terms become increasingly problematic. For example, the gene product with UniProt ID O24899 from Helicobacter pylori, a bacterium, is wrongly annotated to nucleus, a eukaryote-only GO term.
The third type is “taxonomy-inconsistent annotation.” Recently, the GO Consortium introduced terms with taxonomy restrictions, which pair species-specific terms with the NCBI taxonomy groups for which they are or are not appropriate (http://www.geneontology.org/GO.sensu.shtml). Forty-four taxonomic groups had taxonomy-restricted terms in the January 2010 GO version. A taxonomy-inconsistent annotation occurs when a taxonomy-restricted term is annotated to a gene that does not belong to the corresponding taxonomy group. The GO Consortium checks for inconsistent annotations using taxonomy-restricted terms and provides reports of them. Nevertheless, many annotations have been produced without regard to taxonomy-restricted terms. For example, we found that a eukaryote-restricted GO term, Golgi apparatus (GO:0005794), was wrongly annotated to 27 gene products of Escherichia coli, a bacterium. In the present study, we analyzed the distributions of semantic inconsistencies in GO-based annotations using 27 major biological databases. To understand the factors influencing such inconsistent annotations, we performed a correlation analysis between the inconsistent annotations and their possible attributes, including the usage of evidence codes (http://www.geneontology.org/GO.evidence.shtml), the number of gene products, the number of species and the number of GO terms. We developed a set of web-based utilities, GOChase-II, to correct the semantic inconsistencies in addition to the corrections for syntactic errors already offered by GOChase-I [12].

Material and methods

Databases

We obtained GO DB downloads from the GO database site (http://www.geneontology.org/GO.downloads.database.shtml). We collected GO-based annotations for genes and gene products from 27 major biological databases, including NCBI’s Gene and Ensembl. The GO DB schema used for data integration was obtained from http://www.geneontology.org/images/diag-godb-er.jpg. To extract the GO-update history, we downloaded GO monthly reports from January 2000 to December 2007 from the GO FTP site (ftp://ftp.geneontolgy.org). Since January 2008, however, the GO Consortium has not provided monthly reports, so we used the OBO-Edit tool to generate GO change reports over the preceding month [15]. OBO-Edit-generated reports provide four additional types of change: change comment, change synonym, change category, and change external reference. They also cover the six types of change defined by the monthly reports: new term, new obsoletion, term name change, new definition, new term merge and term movement. We parsed these 11 types of change to resolve the GO-update history. The NCBI taxonomy database (ftp://ftp.ncbi.nih.gov/pub/taxonomy/) was downloaded to find and correct biological-domain-inconsistent annotations. The NCBI taxonomy database indexes over 320,000 named organisms that are represented in the databases with at least one nucleotide or protein sequence [16].
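Once the update history has been parsed, an outdated GO ID can be brought up to date by replaying the recorded operations. The following is a minimal sketch of that idea, not the GOChase implementation: it assumes the history has already been reduced to two hypothetical lookup structures, `merged_into` (from 'new term merge' records) and `obsolete` (from 'new obsoletion' records), and the identifiers are placeholders rather than real GO IDs.

```python
def resolve_go_id(go_id, merged_into, obsolete):
    """Follow merge records until an ID with no further merge is reached.

    Returns (resolved_id, status), where status is 'current', 'merged'
    or 'obsolete'. `merged_into` and `obsolete` are hypothetical inputs
    derived from the parsed update history.
    """
    seen = {go_id}
    status = "current"
    while go_id in merged_into:
        go_id = merged_into[go_id]
        status = "merged"
        if go_id in seen:
            # Guard against malformed history with circular merges.
            raise ValueError("cycle in merge history")
        seen.add(go_id)
    if go_id in obsolete:
        status = "obsolete"
    return go_id, status


# Placeholder identifiers, for illustration only.
merged_into = {"GO:old": "GO:mid", "GO:mid": "GO:new"}
obsolete = {"GO:gone"}

print(resolve_go_id("GO:old", merged_into, obsolete))
print(resolve_go_id("GO:gone", merged_into, obsolete))
```

Chained merges are followed to the end, so an ID merged several versions ago still resolves to the term in use today; an obsolete term is only flagged, since choosing an alternative term requires curator judgment.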

Semantic inconsistencies

The hierarchical relationships extracted from the GO DAGs were used to determine redundant annotations. For each gene product, every pair of GO terms annotated to it in the 27 biological databases was tested for a parent-child relationship (Table 1). We further distinguished redundant annotations in which the parent and child terms annotated to a gene product use the same evidence code from those in which they use different evidence codes. In some cases (see Background), redundant parent-child annotations with different evidence codes provide supporting data.
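The parent-child test can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the DAG is assumed to be available as a hypothetical `parents` mapping from each term to its direct parents, annotations are assumed to be `(gene_product, term, evidence_code)` tuples, "parent" is read as any ancestor, and a pair counts as "same evidence code" when the two annotations share at least one code. Term and gene names are toy values.

```python
from collections import defaultdict


def ancestors(term, parents):
    """All terms reachable from `term` by following parent edges in the DAG."""
    found, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def redundant_pairs(annotations, parents):
    """List (gene, ancestor_term, child_term, same_evidence) for every gene
    product annotated to both a term and one of its ancestors."""
    by_gene = defaultdict(dict)
    for gene, term, evidence in annotations:
        by_gene[gene].setdefault(term, set()).add(evidence)
    result = []
    for gene, terms in by_gene.items():
        for child in terms:
            for ancestor in terms.keys() & ancestors(child, parents):
                same = bool(terms[child] & terms[ancestor])
                result.append((gene, ancestor, child, same))
    return sorted(result)


# Toy DAG and annotations (hypothetical identifiers).
parents = {"T:child": ["T:parent"], "T:parent": ["T:root"]}
annotations = [
    ("geneA", "T:child", "IDA"), ("geneA", "T:parent", "IEA"),
    ("geneB", "T:child", "IEA"), ("geneB", "T:root", "IEA"),
    ("geneC", "T:child", "IDA"),
]
for row in redundant_pairs(annotations, parents):
    print(row)
```

Here geneA is redundant with different evidence codes, geneB is redundant with the same evidence code through a grandparent term, and geneC is not redundant.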
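Table 1

Redundant annotations in biological databases

Counts are given for gene products annotated with GO terms, GO annotations applied to gene products, and GO terms used in gene product annotations.

| Database | DB version (mm/dd/yy) | GO version (mm/dd/yy) | Gene products: redundant annotation | Gene products: redundant annotation (same evidence code) | Total gene products | GO annotations: redundant annotation | GO annotations: redundant annotation (same evidence code) | Total GO annotations | GO terms: redundant annotation | GO terms: redundant annotation (same evidence code) | Total GO terms |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Ensembl (a) | 09/01/09 | 01/01/10 | 307,467 | 299,911 | 673,180 | 783,687 | 707,335 | 4,395,125 | 2,978 | 2,551 | 12,309 |
| Gene (b) | 12/15/09 | 01/01/10 | 88,797 | 73,001 | 235,852 | 223,772 | 143,537 | 1,234,220 | 3,369 | 2,632 | 15,363 |
| AspGD (c) | 12/21/09 | 01/01/10 | 523 | 225 | 3,425 | 640 | 259 | 15,340 | 239 | 107 | 3,259 |
| CGD | 11/24/09 | 01/01/10 | 474 | 229 | 4,040 | 772 | 309 | 20,009 | 254 | 104 | 3,332 |
| dictyBase | 12/27/09 | 01/01/10 | 2,590 | 1,619 | 7,489 | 4,651 | 2,377 | 31,064 | 368 | 241 | 2,403 |
| EcoCyc | 12/14/09 | 01/01/10 | 173 | 155 | 1,869 | 273 | 219 | 4,992 | 132 | 111 | 1,388 |
| FB | 10/30/09 | 01/01/10 | 3,355 | 1,823 | 12,509 | 7,301 | 2,740 | 68,316 | 1,077 | 656 | 4,924 |
| GeneDB_Pfalciparum | 10/27/05 | 01/01/10 | 21 | 16 | 2,206 | 21 | 16 | 4,632 | 17 | 15 | 663 |
| GeneDB_Spombe | 09/28/09 | 01/01/10 | 2,797 | 1,330 | 5,213 | 4,009 | 1,662 | 34,114 | 495 | 297 | 3,394 |
| GeneDB_Tbrucei | 07/18/07 | 01/01/10 | 234 | 191 | 2,977 | 251 | 202 | 10,414 | 61 | 52 | 935 |
| GR_protein | 08/26/09 | 01/01/10 | 426 | 369 | 41,321 | 552 | 445 | 49,721 | 90 | 77 | 646 |
| JCVI_CMR | 07/22/09 | 01/01/10 | 446 | 412 | 21,271 | 455 | 417 | 54,398 | 90 | 83 | 2,350 |
| MGI | 12/17/09 | 01/01/10 | 14,927 | 13,466 | 18,167 | 50,970 | 33,966 | 151,652 | 1,564 | 1,214 | 7,327 |
| NCBI | 03/03/08 | 01/01/10 | 324 | 187 | 11,274 | 457 | 319 | 27,647 | 66 | 63 | 492 |
| PDB | 12/17/09 | 01/01/10 | 10,234 | 10,234 | 21,849 | 18,263 | 18,263 | 83,588 | 283 | 283 | 1,884 |
| PseudoCAP | 06/28/06 | 01/01/10 | 584 | 244 | 1,519 | 720 | 275 | 7,284 | 54 | 30 | 859 |
| RefSeq | 12/14/09 | 01/01/10 | 1,945 | 1,945 | 12,166 | 2,748 | 2,748 | 36,201 | 125 | 125 | 1,440 |
| RGD | 10/02/09 | 01/01/10 | 9,932 | 8,008 | 17,352 | 29,961 | 15,120 | 180,606 | 1,893 | 1,341 | 9,094 |
| SGD | 12/25/09 | 01/01/10 | 5,273 | 4,482 | 6,353 | 23,815 | 11,575 | 76,188 | 1,118 | 766 | 4,222 |
| SGN | 10/23/09 | 01/01/10 | 12 | 9 | 155 | 12 | 9 | 1,253 | 10 | 8 | 653 |
| TAIR | 12/23/09 | 01/01/10 | 8,102 | 6,871 | 51,713 | 10,615 | 8,656 | 149,466 | 646 | 473 | 4,103 |
| TIGR_CMR | 11/14/07 | 01/01/10 | 757 | 731 | 40,653 | 782 | 756 | 101,965 | 95 | 92 | 2,441 |
| UniProt | 12/17/09 | 01/01/10 | 206 | 41 | 1,290 | 381 | 67 | 9,381 | 11 | 9 | 173 |
| UniProtKB/Swiss-Prot | 12/17/09 | 01/01/10 | 384,061 | 380,296 | 419,241 | 1,303,909 | 1,279,924 | 3,416,194 | 1,709 | 1,514 | 11,507 |
| UniProtKB/TrEMBL | 12/17/09 | 01/01/10 | 3,615,614 | 3,615,469 | 5,981,451 | 9,116,513 | 9,115,708 | 28,760,356 | 1,420 | 1,402 | 9,262 |
| WB | 11/26/09 | 01/01/10 | 5,252 | 5,041 | 15,667 | 9,904 | 8,926 | 91,611 | 497 | 381 | 2,738 |
| ZFIN | 12/23/09 | 01/01/10 | 7,047 | 6,856 | 15,074 | 17,683 | 16,454 | 101,152 | 603 | 509 | 3,019 |

(a) http://www.ensembl.org/index.html

(b) http://www.ncbi.nlm.nih.gov/gene

(c) http://www.geneontology.org/GO.downloads.annotations.shtml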

To find biological-domain inconsistencies in GO annotations, we reviewed the GO terms and manually extracted 410 ‘eukaryote-only’ and 73 ‘prokaryote-only’ terms, including such terms as RNA import into nucleus and ketodeoxyoctanoate biosynthesis (see Additional files 1 and 2). All gene products in the 27 databases were divided into non-prokaryotic and non-eukaryotic classes according to the species definition in the NCBI taxonomy. A biological-domain-inconsistent annotation was identified by testing the consistency between the species of a gene product and the ‘prokaryote-only’ or ‘eukaryote-only’ classification of the annotated term. There were 44 taxonomy groups with taxonomy-restricted terms in the January 2010 GO version. A taxonomy-inconsistent annotation was identified as an inconsistency between a species-specific GO term and the species of origin of the annotated gene product.
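Both checks reduce to asking whether a restricted taxon lies on the lineage of the gene product's species in the taxonomy tree. The sketch below illustrates this under stated assumptions: `parent_of` is a hypothetical child-to-parent mapping (in practice it would be loaded from the NCBI taxonomy dump), and `only_in` / `never_in` are hypothetical names for the restriction lists of a term. The taxonomy IDs are real NCBI identifiers (2 Bacteria, 2759 Eukaryota, 562 Escherichia coli, 9606 Homo sapiens, 131567 cellular organisms), but the parent links are heavily abbreviated for illustration.

```python
def lineage(taxid, parent_of):
    """The taxon itself plus all of its ancestors in the taxonomy tree."""
    seen = set()
    while taxid is not None and taxid not in seen:
        seen.add(taxid)
        taxid = parent_of.get(taxid)
    return seen


def is_taxon_consistent(gene_taxid, only_in, never_in, parent_of):
    """True if the species satisfies a term's taxonomy restrictions.

    only_in:  taxa of which at least one must contain the species
              (empty means no such restriction).
    never_in: taxa that must not contain the species.
    """
    line = lineage(gene_taxid, parent_of)
    if only_in and not (line & set(only_in)):
        return False
    if never_in and (line & set(never_in)):
        return False
    return True


# Abbreviated toy tree: intermediate ranks are omitted.
parent_of = {562: 2, 9606: 2759, 2: 131567, 2759: 131567, 131567: 1}

# Golgi apparatus (GO:0005794) is treated here as eukaryote-only.
print(is_taxon_consistent(562, [2759], [], parent_of))   # E. coli gene product
print(is_taxon_consistent(9606, [2759], [], parent_of))  # human gene product
```

The same function covers the manually curated ‘eukaryote-only’ and ‘prokaryote-only’ lists if each list is expressed as an `only_in` or `never_in` restriction.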

Attributes for inconsistent annotation

In search of possible attributes associated with the inconsistent annotations, we evaluated five candidate attributes by correlation analysis: the use of different evidence codes, the number of gene products, the number of species, the number of GO terms, and the average number of GO annotations. Every GO annotation is supposed to indicate its type of evidence. There are 18 evidence codes currently available. When no evidence code was assigned to an annotation, we marked it as ‘Not Available (NA)’.
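The text does not state which correlation coefficient was used. As an illustration only, the sketch below assumes the Pearson coefficient computed across databases, pairing an attribute value per database (for example, the number of annotations with a given evidence code) with the number of inconsistent annotations in that database. The input numbers are toy values, not data from this study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.

    Returns None when either sequence is constant, because the
    coefficient is undefined in that case.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


# Toy per-database values (hypothetical).
attribute_per_db = [1, 2, 3, 4]
inconsistent_per_db = [2, 4, 6, 8]
print(pearson_r(attribute_per_db, inconsistent_per_db))
```

A rank-based coefficient such as Spearman's would be a reasonable alternative when a few very large databases dominate the counts.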

Results

To analyze the distributions of the semantic inconsistencies in GO-based annotations, we calculated the distribution of redundant annotations in the 27 biological databases (Table 1). All databases contain redundant annotations. The fraction of redundant annotations per database ranges from 0.9% to 91% for gene products (31% on average), from 2% to 26% for GO terms (13% on average), and from 0.4% to 38% for GO annotations (12% on average). UniProtKB/Swiss-Prot shows the highest redundancy for gene products (91%) and GO annotations (38%). The database showing the highest redundancy in GO terms is Ensembl (24%). GeneDB_Pfalciparum shows the lowest values among the databases: 0.9% for gene products, 2.5% for GO terms and 0.4% for GO annotations. In all databases, the fraction of redundant annotations based on the same evidence code is larger than that based on different evidence codes. The distributions of biological-domain-inconsistent annotations were calculated using the prokaryote-only and eukaryote-only GO terms we defined. Biological-domain-inconsistent annotations were found in thirteen databases for non-prokaryotic gene products and in eight databases for non-eukaryotic gene products (Table 2). Most databases have fewer than 100 inconsistent annotations; the four exceptions are Ensembl, NCBI Gene, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL. In both biological domains, UniProtKB/TrEMBL has the highest proportion of inconsistent annotations by all three measures. Taxonomy-inconsistent annotations were found in 27 of the 44 taxonomy groups having at least one taxonomy-restricted GO term (Table 3 in Additional file 3, see Methods). Table 3 in Additional file 3 shows the numbers of taxonomy-inconsistent annotations (as numerators) and the numbers of taxonomy-restricted GO terms used (as denominators) in the 27 databases. A blank cell means that there is no annotation with a taxonomy-restricted GO term.
Taxonomy-inconsistent annotations are not evenly distributed across databases or taxonomy groups (Table 3 in Additional file 3). NCBI Gene, Ensembl, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL, for example, have inconsistent annotations in most taxonomy groups, whereas no taxonomy-inconsistent annotation was found in five databases: CGD (0/2174), GeneDB_Tbrucei (0/422), NCBI (0/1291), PseudoCAP (0/20), and UniProt (0/17). Interestingly, all annotations for Passeriformes (11/11) are taxonomy inconsistent. Cellular organisms show the lowest taxonomy-inconsistent annotation rate (7/42344) among the 27 taxonomy groups.
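The redundancy fractions quoted for individual databases follow directly from the counts in Table 1. As a worked check, the short sketch below recomputes them for three databases; the counts are as read from Table 1, and the helper name `redundancy_pct` is introduced here for illustration.

```python
def redundancy_pct(redundant, total):
    """Share of redundant items as a percentage of the total."""
    return 100.0 * redundant / total


# (redundant, total) counts as read from Table 1.
table1 = {
    "UniProtKB/Swiss-Prot": {
        "gene products": (384_061, 419_241),
        "GO annotations": (1_303_909, 3_416_194),
    },
    "GeneDB_Pfalciparum": {
        "gene products": (21, 2_206),
        "GO annotations": (21, 4_632),
        "GO terms": (17, 663),
    },
    "Ensembl": {
        "GO terms": (2_978, 12_309),
    },
}

for database, measures in table1.items():
    for measure, (redundant, total) in measures.items():
        print(f"{database}, {measure}: {redundancy_pct(redundant, total):.2f}%")
```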
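Table 2

Biological-domain-inconsistent annotations in biological databases

Counts are given for gene products annotated with GO terms, GO annotations applied to gene products, and GO terms used in gene product annotations.

| Biological domain | Database | DB version (mm/dd/yy) | GO version (mm/dd/yy) | Gene products: biological-domain inconsistent | Total gene products | GO annotations: biological-domain inconsistent | Total GO annotations | GO terms: biological-domain inconsistent | Total GO terms |
|---|---|---|---|---|---|---|---|---|---|
| Non-prokaryotic gene product | Ensembl (a) | 09/01/09 | 01/01/10 | 711 | 1,891,586 | 760 | 4,395,125 | 13 | 12,309 |
| Non-prokaryotic gene product | Gene (b) | 12/15/09 | 01/01/10 | 1,517 | 2,391,443 | 1,647 | 1,133,060 | 34 | 14,762 |
| Non-prokaryotic gene product | AspGD (c) | 12/21/09 | 01/01/10 | 2 | 3,425 | 2 | 15,340 | 1 | 3,259 |
| Non-prokaryotic gene product | dictyBase | 12/27/09 | 01/01/10 | 1 | 7,489 | 1 | 31,064 | 1 | 2,403 |
| Non-prokaryotic gene product | FB | 10/30/09 | 01/01/10 | 1 | 12,509 | 1 | 68,316 | 1 | 4,924 |
| Non-prokaryotic gene product | GeneDB_Tbrucei | 07/18/07 | 01/01/10 | 2 | 2,977 | 2 | 10,414 | 1 | 935 |
| Non-prokaryotic gene product | MGI | 12/17/09 | 01/01/10 | 2 | 18,167 | 2 | 151,652 | 2 | 7,327 |
| Non-prokaryotic gene product | PDB | 12/17/09 | 01/01/10 | 26 | 9,170 | 26 | 31,686 | 3 | 1,024 |
| Non-prokaryotic gene product | RGD | 10/02/09 | 01/01/10 | 2 | 18,363 | 4 | 180,606 | 3 | 9,094 |
| Non-prokaryotic gene product | TAIR | 12/23/09 | 01/01/10 | 3 | 51,713 | 3 | 149,466 | 1 | 4,103 |
| Non-prokaryotic gene product | UniProtKB/Swiss-Prot | 12/17/09 | 01/01/10 | 2,680 | 122,261 | 3,803 | 1,035,209 | 17 | 10,589 |
| Non-prokaryotic gene product | UniProtKB/TrEMBL | 12/17/09 | 01/01/10 | 20,573 | 2,333,592 | 23,876 | 12,234,060 | 24 | 8,586 |
| Non-prokaryotic gene product | WB | 11/26/09 | 01/01/10 | 12 | 15,667 | 13 | 91,611 | 5 | 2,738 |
| Non-eukaryotic gene product | Gene (b) | 12/15/09 | 01/01/10 | 53,088 | 3,595,041 | 76,597 | 101,160 | 319 | 2,497 |
| Non-eukaryotic gene product | EcoCyc (c) | 12/14/09 | 01/01/10 | 2 | 1,869 | 2 | 4,992 | 1 | 1,388 |
| Non-eukaryotic gene product | JCVI_CMR | 07/22/09 | 01/01/10 | 16 | 21,271 | 16 | 54,398 | 3 | 2,350 |
| Non-eukaryotic gene product | PDB | 12/17/09 | 01/01/10 | 83 | 16,580 | 85 | 66,027 | 12 | 1,689 |
| Non-eukaryotic gene product | TIGR_CMR | 11/14/07 | 01/01/10 | 70 | 40,653 | 70 | 101,965 | 4 | 2,441 |
| Non-eukaryotic gene product | UniProt | 12/17/09 | 01/01/10 | 48 | 248 | 67 | 7,870 | 3 | 44 |
| Non-eukaryotic gene product | UniProtKB/Swiss-Prot | 12/17/09 | 01/01/10 | 4,454 | 324,523 | 5,297 | 2,581,774 | 30 | 3,306 |
| Non-eukaryotic gene product | UniProtKB/TrEMBL | 12/17/09 | 01/01/10 | 77,047 | 4,459,834 | 83,965 | 20,009,318 | 49 | 4,048 |

(a) http://www.ensembl.org/index.html

(b) http://www.ncbi.nlm.nih.gov/gene

(c) http://www.geneontology.org/GO.current.annotations.shtml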

To investigate which factors are related to each type of inconsistent annotation, we analyzed the correlation between the three types of inconsistent annotation and 23 possible attributes (Table 3). As shown in Table 3, Inferred from Electronic Annotation (IEA) shows the highest correlation with redundant annotation (r=0.99) and taxonomy-inconsistent annotation (r=0.99). Biological-domain-inconsistent annotation shows a high correlation with the number of gene products (r=0.97). The number of species and the average number of GO annotations show high correlations with all types of inconsistent annotation, whereas the number of GO terms shows a low correlation with all of them (Table 3).
Table 3

Gene Ontology distribution incorrectly annotated across evidence codes and the related factors

Each cell gives the number of inaccurate annotations, with the correlation coefficient in parentheses.

| Evidence code | Redundant annotation | Biological-domain-inconsistent annotation | Taxonomy-inconsistent annotation | Total inaccurate annotation | Total no. of GO annotations |
|---|---|---|---|---|---|
| NR | 0 (*) | 0 (*) | 0 (*) | 0 (*) | 6 |
| ISM | 30 (-0.07) | 2 (0.26) | 0 (*) | 32 (-0.06) | 279 |
| ISA | 287 (-0.05) | 1,385 (0.40) | 0 (*) | 1,672 (-0.04) | 11,756 |
| IGC | 322 (-0.05) | 11 (0.43) | 0 (*) | 333 (-0.04) | 888 |
| IC | 1,193 (-0.03) | 1,265 (0.40) | 0 (*) | 2,458 (-0.02) | 12,490 |
| IEP | 2,344 (-0.03) | 2,467 (0.46) | 0 (*) | 4,811 (-0.02) | 27,889 |
| EXP | 3,273 (0.08) | 1,221 (0.16) | 0 (*) | 4,494 (0.08) | 20,781 |
| IGI | 3,628 (-0.02) | 4,171 (0.41) | 0 (*) | 7,799 (-0.02) | 34,985 |
| RCA | 6,469 (-0.07) | 7,710 (0.18) | 0 (*) | 14,179 (-0.06) | 85,014 |
| NAS | 7,921 (0.01) | 4,285 (0.39) | 0 (*) | 12,206 (0.02) | 58,687 |
| IPI | 10,555 (0.11) | 1,163 (0.31) | 0 (*) | 11,718 (0.11) | 72,597 |
| ISO | 14,119 (-0.05) | 15,956 (0.39) | 16 (-0.06) | 30,091 (-0.04) | 115,268 |
| TAS | 15,944 (0.01) | 8,331 (0.39) | 5 (-0.01) | 24,280 (0.01) | 113,414 |
| ND | 18,987 (-0.07) | 1 (0.51) | 0 (*) | 18,988 (-0.05) | 366,152 |
| ISS | 24,314 (0.01) | 12,828 (0.53) | 49 (0.01) | 37,191 (0.03) | 377,770 |
| IMP | 25,994 (-0.03) | 29,932 (0.38) | 160 (-0.06) | 56,086 (-0.03) | 242,825 |
| IDA | 44,327 (0.01) | 29,863 (0.38) | 17 (-0.02) | 74,207 (0.01) | 311,481 |
| IEA | 11,433,355 (0.99) | 219,560 (0.75) | 56,180 (0.99) | 11,709,095 (0.99) | 42,984,075 |
| NA (Not Available) | 803 (-0.02) | 3,654 (0.61) | 12 (-0.04) | 4,469 (-0.01) | 18,102 |
| No. of gene products | (0.71) | (0.97) | (0.69) | (0.72) | |
| No. of species | (0.99) | (0.78) | (0.99) | (0.99) | |
| No. of GO terms | (0.35) | (0.57) | (0.34) | (0.36) | |
| Average no. of annotations | (0.99) | (0.76) | (0.99) | (0.99) | |

GOChase-II implementation

GOChase-I [12] is a set of web-based utilities for detecting and correcting syntactic errors in GO-based annotations caused by GO versioning and tracing problems. In contrast, GOChase-II (http://www.snubi.org/software/GOChase2/) attempts to correct semantic errors in GO-based annotations. It provides four web-based interfaces. (1) GOChase-History resolves the whole evolution history of a GO ID. As an example, the GO term sorocarp development (GO:0030587) has repeatedly moved back and forth among fifteen GO terms (reproduction, cell communication, development, response to external stimulus, physiological process, biological_process, response to biotic stimulus, morphogenesis, multicellular organismal development, anatomical structure morphogenesis, anatomical structure development, asexual reproduction, fruiting body development in response to starvation, fruiting body development, response to starvation) through 31 GO operations in fifteen updates between March 2002 and November 2008. (2) GOChase-Species resolves the distribution of the usage of a GO term across different species and displays the distribution on the taxonomy tree. The Species function is a powerful tool for analyzing the species specificity of a GO term. Some terms are limited to specific species whereas others are used for a wide variety of species. For example, negative regulation of vulval development (GO:0040027) is annotated 395 times but exclusively to Caenorhabditis elegans (i.e. 100%). It is suggested that cyanelle may be a species-specific term. We identified 3,548 GO terms annotated to only a single species in the January 2010 GO version (see Additional file 4). On the other hand, oxidoreductase activity (GO:0016491) is annotated 800,048 times to 108,929 different species (i.e. 7.3 times per species on average). The Species function can also be used to find incorrect uses of species-specific terms.
(3) GOChase-Correct highlights a 'merged term' and redirects it to the correct 'target term' into which it has been merged. For an obsolete term, GOChase provides the alternative terms. GOChase-Correct also corrects redundant and biological-domain-inconsistent annotations. (4) When one inputs a GO ID, GOChase will resolve all gene products annotated with the GO ID across all the databases in Table 1. GOChaser provides GO enrichment analysis for input gene-expression clusters. Although most GO enrichment analysis tools have similar functionality [14], GOChaser has the unique capability of correcting both syntactic and semantic errors to improve the analysis results. GOChaser provides two statistical models, the hypergeometric test and Fisher’s exact test, with correction for multiple hypothesis testing (Bonferroni correction).
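The enrichment statistics named above can be illustrated with a minimal sketch. It is not GOChaser's implementation; the function names and the toy numbers are assumptions made here. The upper-tail hypergeometric probability is computed exactly with integer binomial coefficients, which is also the p-value of the one-sided Fisher's exact test on the corresponding 2x2 table, and the Bonferroni correction multiplies each p-value by the number of GO terms tested.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k) for X ~ Hypergeometric.

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    drawn:      number of genes in the input cluster
    k:          cluster genes annotated with the GO term
    """
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, drawn)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy example: 10 background genes, 3 annotated, cluster of 4 with 2 hits.
p = hypergeom_upper_tail(2, population=10, annotated=3, drawn=4)
print(p)
print(bonferroni([p, 0.01, 0.5]))
```

Exact integer arithmetic avoids the rounding problems of factorial-based formulas, at the cost of speed for very large gene sets.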

Conclusion and discussion

We identified and corrected three types of semantic inconsistency in GO-based annotations for gene products from 27 major biological databases. GO has become a widely accepted ontology in the biomedical field. Its under-managed errors and inconsistencies may reflect its short history, its ever-growing complexity, and the vast amount of biological domain knowledge it covers. Recently, the GO Consortium started refining GO contents and structure [17]. The present study demonstrates that the GO community may be empowered by bioinformatics tools that provide error-proofing mechanisms for the GO hierarchical relationships, species-specific definitions and GO term usage guidelines. To sum up our results, no database is free from semantically inconsistent annotations. Among the three types of semantically inconsistent annotation, redundant annotation is the most common error. About 12% of all annotations are redundant. Only a few biological-domain-inconsistent annotations are found in the 18 biological databases because of the small number of ‘eukaryote-only’ (410) and ‘prokaryote-only’ (71) GO terms. The high correlation between IEA and inconsistent annotations (Table 3) suggests that IEA has lower reliability than the other evidence codes. Electronically generated associations made without human judgment are labelled as IEA. The GO Consortium proposes a hierarchy of reliability among evidence codes (http://www.geneontology.org/GO.evidence.shtml). In general, TAS and IDA show higher reliability. TAS and IDA have a low correlation with all three types of inconsistent annotation, and most evidence codes that are curated by humans also have a low correlation with all types of inconsistent annotation. This result implies that the hierarchy of reliability among evidence codes is preserved in inaccurate annotations. The numbers of gene products and species in a database show high correlations with all types of inconsistent annotation except taxonomy-inconsistent annotation. This suggests that the complexity of database maintenance may affect the occurrence of inconsistent annotations. Such databases therefore have a particularly strong need for a sound mechanism such as GOChase-II to avoid semantic inconsistencies caused by multiple user groups.

Authors' contributions

YRP conceived the study, wrote the manuscript and implemented the web-based program. JK wrote the manuscript and validated the inconsistent annotation. HWL validated the taxonomy-specific GO terms and helped to draft the manuscript. YJY calculated history data of GO term. JHK coordinated and supervised the study. All authors read and approved the final manuscript.

Additional file 1

Additional file 2

Additional file 3

Additional file 4
References

1.  Creating the gene ontology resource: design and implementation.

Journal:  Genome Res       Date:  2001-08       Impact factor: 9.043

2.  The Unified Medical Language System (UMLS): integrating biomedical terminology.

Authors:  Olivier Bodenreider
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

3.  A procedure for assessing GO annotation consistency.

Authors:  Mary E Dolan; Li Ni; Evelyn Camon; Judith A Blake
Journal:  Bioinformatics       Date:  2005-06       Impact factor: 6.937

4.  Ontological analysis of gene expression data: current tools, limitations, and open problems.

Authors:  Purvesh Khatri; Sorin Drăghici
Journal:  Bioinformatics       Date:  2005-06-30       Impact factor: 6.937

5.  OBO-Edit--an ontology editor for biologists.

Authors:  John Day-Richter; Midori A Harris; Melissa Haendel; Suzanna Lewis
Journal:  Bioinformatics       Date:  2007-06-01       Impact factor: 6.937

6.  BisoGenet: a new tool for gene network building, visualization and analysis.

Authors:  Alexander Martin; Maria E Ochagavia; Laya C Rabasa; Jamilet Miranda; Jorge Fernandez-de-Cossio; Ricardo Bringas
Journal:  BMC Bioinformatics       Date:  2010-02-17       Impact factor: 3.169

7.  Gene Ontology: looking backwards and forwards.

Authors:  Suzanna E Lewis
Journal:  Genome Biol       Date:  2004-12-15       Impact factor: 13.583

8.  The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology.

Authors:  Evelyn Camon; Michele Magrane; Daniel Barrell; Vivian Lee; Emily Dimmer; John Maslen; David Binns; Nicola Harte; Rodrigo Lopez; Rolf Apweiler
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

9.  Tetrahymena Genome Database (TGD): a new genomic resource for Tetrahymena thermophila research.

Authors:  Nicholas A Stover; Cynthia J Krieger; Gail Binkley; Qing Dong; Dianna G Fisk; Robert Nash; Anand Sethuraman; Shuai Weng; J Michael Cherry
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

10.  Database resources of the National Center for Biotechnology Information.

Authors:  Eric W Sayers; Tanya Barrett; Dennis A Benson; Evan Bolton; Stephen H Bryant; Kathi Canese; Vyacheslav Chetvernin; Deanna M Church; Michael Dicuccio; Scott Federhen; Michael Feolo; Lewis Y Geer; Wolfgang Helmberg; Yuri Kapustin; David Landsman; David J Lipman; Zhiyong Lu; Thomas L Madden; Tom Madej; Donna R Maglott; Aron Marchler-Bauer; Vadim Miller; Ilene Mizrachi; James Ostell; Anna Panchenko; Kim D Pruitt; Gregory D Schuler; Edwin Sequeira; Stephen T Sherry; Martin Shumway; Karl Sirotkin; Douglas Slotta; Alexandre Souvorov; Grigory Starchenko; Tatiana A Tatusova; Lukas Wagner; Yanli Wang; W John Wilbur; Eugene Yaschenko; Jian Ye
Journal:  Nucleic Acids Res       Date:  2009-11-12       Impact factor: 16.971
