Rashmie Abeysinghe, Eugene W Hinderer, Hunter N B Moseley, Licong Cui.
Abstract
MOTIVATION: The Gene Ontology (GO) is the unifying biological vocabulary for codifying, managing and sharing biological knowledge. Quality issues in GO, if not addressed, can cause misleading results or missed biological discoveries. Manual identification of potential quality issues in GO is a challenging and arduous task, given its growing size. We introduce an automated auditing approach for suggesting potentially missing is-a relations, which may further reveal erroneous is-a relations.
Year: 2020 PMID: 32065617 PMCID: PMC7214018 DOI: 10.1093/bioinformatics/btaa106
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Sequence representations for concept negative regulation of cellular protein catabolic process (GO:1903363)
| Sequence representation | Tag annotation |
|---|---|
Sequence representations for concept innate immune response activating cell surface receptor signaling pathway (GO:0002220)
| Sequence representation |
|---|
Sequence representations for concept negative regulation of cellular protein catabolic process (GO:1903363) after antonym tagging
| Sequence representation | Tag annotation |
|---|---|
Fig. 1. An example of two GO concepts satisfying the monotonicity rule and revealing a missing is-a relation: GO:0071450 is-a GO:0071241 (see the bolded, dashed arrow)
Fig. 2. An example of two GO concepts satisfying the monotonicity rule and revealing an erroneous is-a relation: nucleotide catabolic process (GO:0009166) is-a biosynthetic process (GO:0009058) (see the bolded arrow with a cross)
Fig. 3. An example of four GO concepts satisfying the intersection rule and revealing a missing is-a relation: negative regulation of ornithine catabolic process (GO:1903267) is a subtype of negative regulation of cellular amine catabolic process (GO:0033242) (see the bolded, dashed arrow)
Fig. 4. An example of four GO concepts satisfying the intersection rule and revealing an erroneous existing relation: positive regulation of B cell deletion (GO:0002869) is-a regulation of acute inflammatory response (GO:0002673) (see the bolded arrow with a cross)
Number of potentially missing is-a relations suggested by each conditional rule
| Conditional rule | No. of potentially missing is-a relations |
|---|---|
| Monotonicity rule | 819 |
| Intersection rule | 691 |
| Sub-concept rule | 669 |
The numbers of potentially missing is-a relations, valid missing is-a relations, valid erroneous is-a relations and valid problematic is-a relations, respectively, in the evaluation sample for each conditional rule
| Conditional rule | No. of potentially missing | No. of valid missing | No. of valid erroneous | Total no. of valid problematic | Precision (%) |
|---|---|---|---|---|---|
| Monotonicity rule | 99 | 54 | 6 | 60 | 60.61 |
| Intersection rule | 81 | 44 | 5 | 49 | 60.49 |
| Sub-concept rule | 63 | 29 | N/A | 29 | 46.03 |
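
The precision column is consistent with dividing the total number of valid problematic relations by the number of potentially missing relations in the evaluation sample. The sketch below recomputes it from the counts in the table under that assumption; the names `samples` and `precision` are illustrative and do not come from the paper, and treating the N/A erroneous count as zero is also an assumption.

```python
# Counts copied from the evaluation table:
# (potentially missing, valid missing, valid erroneous).
# The sub-concept rule reports N/A for erroneous relations, stored as None.
samples = {
    "Monotonicity rule": (99, 54, 6),
    "Intersection rule": (81, 44, 5),
    "Sub-concept rule": (63, 29, None),
}


def precision(suggested, valid_missing, valid_erroneous):
    """Percentage of suggested relations that experts confirmed as problematic."""
    # Assumption: an N/A erroneous count contributes nothing to the total.
    valid_problematic = valid_missing + (valid_erroneous or 0)
    return round(100 * valid_problematic / suggested, 2)


precisions = {rule: precision(*counts) for rule, counts in samples.items()}

for rule, value in precisions.items():
    print(f"{rule}: {value:.2f}%")
```

Under this reading, the recomputed values match the reported 60.61, 60.49 and 46.03.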
Examples of valid problematic (missing or erroneous) is-a relations verified by domain experts
| Conditional rule | Problematic is-a relation | Type |
|---|---|---|
| Monotonicity rule | | Missing |
| Monotonicity rule | | Missing |
| Monotonicity rule | | Missing |
| Monotonicity rule | | Missing |
| Monotonicity rule | | Erroneous |
| Intersection rule | | Missing |
| Intersection rule | | Missing |
| Intersection rule | | Erroneous |
| Sub-concept rule | | Missing |
| Sub-concept rule | | Missing |