Frederic B Bastian, Marcus C Chibucos, Pascale Gaudet, Michelle Giglio, Gemma L Holliday, Hong Huang, Suzanna E Lewis, Anne Niknejad, Sandra Orchard, Sylvain Poux, Nives Skunca, Marc Robinson-Rechavi.
Abstract
Biocuration has become a cornerstone for analyses in biology, and to meet demand, the number of annotations has grown considerably in recent years. However, the reliability of these annotations varies, so it has become necessary to assess the confidence in annotations. Although several resources already provide confidence information about the annotations that they produce, a standard way of providing such information has yet to be defined. This lack of standardization undermines the propagation of knowledge across resources, as well as the credibility of results from high-throughput analyses. Seeded at a workshop during the Biocuration 2012 conference, a working group has been created to address this problem. We present here the elements that were identified as essential for assessing confidence in annotations, as well as a draft ontology, the Confidence Information Ontology (CIO), to illustrate how the problems identified could be addressed. We hope that this effort will provide a home for discussing this major issue among the biocuration community.
Tracker URL: https://github.com/BgeeDB/confidence-information-ontology
Ontology URL: https://raw.githubusercontent.com/BgeeDB/confidence-information-ontology/master/src/ontology/cio-simple.obo
Year: 2015 PMID: 25957950 PMCID: PMC4425939 DOI: 10.1093/database/bav043
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Partial overview of the CIO. The first branching of the CIO distinguishes annotations supported by a single evidence line from those supported by multiple evidence lines. In the latter case, further subclasses refine the overall confidence in the annotation, derived from all available evidence lines considered together.
Figure 2. Overview of the ‘confidence statement from single evidence’ branch. The CIO defines three basic confidence statements, corresponding to a simple rating system, that can be used modularly for single evidence annotations, plus a rejected term, used to tag retracted results.
Figure 3. Example of conflicting versus congruent terms. This figure presents the branch ‘confidence statement from multiple evidence lines of same type’; the rationale would be the same if applied to evidence lines of multiple types. This term has two subclasses: ‘confidence statement from conflicting evidence lines of same type’ and ‘confidence statement from congruent evidence lines of same type’. The ‘congruent evidence lines’ term has three subclasses that define the overall level of confidence obtained from the set of supporting evidence lines. Similarly, the ‘weakly conflicting evidence lines’ term has three subclasses that define the overall level of confidence obtained from the set of available evidence lines. The ‘strongly conflicting evidence lines’ term has no such subclasses because, in that case, the evidence lines do not allow a consensus conclusion to be reached.
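The branch described in Figure 3 can be sketched as a small child-to-parent map. This is only an illustration, not the ontology file itself: the labels for the overall-confidence subclasses are assumed to follow the pattern of the term labels listed in this record, placing the weakly and strongly conflicting terms under the conflicting term is an assumption, and the helper names `ancestors` and `children` are hypothetical.

```python
# Sketch of the "same type" branch from Figure 3 as a child -> parent map.
# Labels for subclasses not spelled out in the text are assumed to follow
# the naming pattern "<parent label>, overall confidence <level>".
ROOT = "confidence statement from multiple evidence lines of same type"
CONGRUENT = "confidence statement from congruent evidence lines of same type"
CONFLICTING = "confidence statement from conflicting evidence lines of same type"
WEAK = "confidence statement from weakly conflicting evidence lines of same type"
STRONG = "confidence statement from strongly conflicting evidence lines of same type"

# Assumption: weakly and strongly conflicting terms sit under CONFLICTING.
PARENT = {CONGRUENT: ROOT, CONFLICTING: ROOT, WEAK: CONFLICTING, STRONG: CONFLICTING}

# Congruent and weakly conflicting terms each get three overall-confidence subclasses.
for base in (CONGRUENT, WEAK):
    for level in ("high", "medium", "low"):
        PARENT[f"{base}, overall confidence {level}"] = base


def ancestors(term):
    """Return the chain of parent terms, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def children(term):
    """Return the direct subclasses of a term, sorted by label."""
    return sorted(child for child, parent in PARENT.items() if parent == term)
```

In this sketch, `children(STRONG)` is empty, which mirrors the caption: strongly conflicting evidence lines carry no overall confidence level.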
Example homology annotation from Bgee
| Entity name | Qualifier | Taxon name | Line type | Evidence term name | Confidence term name | Reference ID |
|---|---|---|---|---|---|---|
| Autopod | — | Vertebrata | RAW | Traceable author statement | Medium confidence from single evidence | PMID:23598338 |
| Autopod | NOT | Vertebrata | RAW | Traceable author statement | Medium confidence from single evidence | PMID:23598338 |
| Autopod | — | Vertebrata | SUMMARY | — | Confidence statement from strongly conflicting evidence lines of same type | — |
This table shows columns 4, 5, 7, 8, 10, 12 and 13 of the Bgee homology annotation file. The first two rows represent conflicting annotations from single evidence, about the homology of the autopod among Vertebrata. The third is an auto-generated row, summarizing the status of this homology hypothesis, from all evidence lines available.
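To illustrate how a SUMMARY row could be derived from RAW rows, here is a minimal sketch. It is not Bgee's actual procedure: the `RawAnnotation` class and the `summarize` function are hypothetical, and the rule is deliberately simplified. It only decides whether the evidence lines are congruent or conflicting (by comparing the Qualifier column) and whether they share one evidence type. It does not try to distinguish weak from strong conflict or to assign an overall confidence level, since the criteria for those are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawAnnotation:
    """One RAW row; field names are illustrative, not Bgee column names."""
    entity: str
    negated: bool      # True when the Qualifier column is NOT
    evidence: str      # evidence term name
    confidence: str    # confidence term name
    reference: str


def summarize(lines):
    """Return a coarse confidence term for a set of RAW lines (simplified rule)."""
    if not lines:
        raise ValueError("need at least one RAW line")
    if len(lines) == 1:
        # A single evidence line keeps its own confidence statement.
        return lines[0].confidence
    # Lines conflict when some assert the hypothesis and others negate it.
    agreement = "conflicting" if len({l.negated for l in lines}) > 1 else "congruent"
    # Evidence lines are of the "same type" when they share one evidence term.
    scope = "same type" if len({l.evidence for l in lines}) == 1 else "multiple types"
    return f"confidence statement from {agreement} evidence lines of {scope}"


rows = [
    RawAnnotation("autopod", False, "traceable author statement",
                  "medium confidence from single evidence", "PMID:23598338"),
    RawAnnotation("autopod", True, "traceable author statement",
                  "medium confidence from single evidence", "PMID:23598338"),
]
summary = summarize(rows)
```

For the two RAW rows of the table above, this sketch yields the general conflicting term of the same-type branch; the table's SUMMARY row uses the more specific ‘strongly conflicting’ subclass.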
List of most informative terms from the CIO
| Interpretation | Main branch | Term label |
|---|---|---|
| Assertion should be trusted | Single evidence | High confidence from single evidence |
| Assertion should be trusted | Multiple evidence lines, same type | Confidence statement from congruent evidence lines of same type, overall confidence high |
| Assertion should be trusted | Multiple evidence lines, same type | Confidence statement from congruent evidence lines of same type, overall confidence medium |
| Assertion should be trusted | Multiple evidence lines, multiple types | Confidence statement from congruent evidence lines of multiple types, overall confidence high |
| Assertion should be trusted | Multiple evidence lines, multiple types | Confidence statement from congruent evidence lines of multiple types, overall confidence medium |
| Assertion needs additional support | Single evidence | Low confidence from single evidence |
| Assertion needs additional support | Multiple evidence lines, same type | Confidence statement from strongly conflicting evidence lines of same type |
| Assertion needs additional support | Multiple evidence lines, same type | Confidence statement from weakly conflicting evidence lines of same type, overall confidence low |
| Assertion needs additional support | Multiple evidence lines, multiple types | Confidence statement from strongly conflicting evidence lines of multiple types |
| Assertion needs additional support | Multiple evidence lines, multiple types | Confidence statement from weakly conflicting evidence lines of multiple types, overall confidence low |
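A consumer of annotations could apply this list as a plain lookup when filtering. The sketch below assumes only the ten term labels listed above. The names `INTERPRETATION` and `interpret` are hypothetical, and matching on lowercased labels is a convenience of this example rather than something the CIO prescribes.

```python
TRUSTED = "Assertion should be trusted"
NEEDS_SUPPORT = "Assertion needs additional support"

# Term labels copied from the list above, lowercased for matching.
INTERPRETATION = {
    "high confidence from single evidence": TRUSTED,
    "confidence statement from congruent evidence lines of same type, overall confidence high": TRUSTED,
    "confidence statement from congruent evidence lines of same type, overall confidence medium": TRUSTED,
    "confidence statement from congruent evidence lines of multiple types, overall confidence high": TRUSTED,
    "confidence statement from congruent evidence lines of multiple types, overall confidence medium": TRUSTED,
    "low confidence from single evidence": NEEDS_SUPPORT,
    "confidence statement from strongly conflicting evidence lines of same type": NEEDS_SUPPORT,
    "confidence statement from weakly conflicting evidence lines of same type, overall confidence low": NEEDS_SUPPORT,
    "confidence statement from strongly conflicting evidence lines of multiple types": NEEDS_SUPPORT,
    "confidence statement from weakly conflicting evidence lines of multiple types, overall confidence low": NEEDS_SUPPORT,
}


def interpret(term_label):
    """Return the listed interpretation for a CIO term label, or None if it is not in the list."""
    return INTERPRETATION.get(term_label.strip().lower())
```

Terms outside this list, such as ‘Medium confidence from single evidence’, return `None` here, leaving the decision to the caller.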