Casper van Mourik, Rezvan Ehsani, Finn Drabløs.
Abstract
OBJECTIVE: Properties of gene products can be described, or annotated, with Gene Ontology (GO) terms. For many genes, though, we have limited information about their products, for example with respect to function. This is particularly true for long non-coding RNAs (lncRNAs), whose function is unknown in most cases. However, it has been shown that annotation as described by GO terms can to some extent be predicted by enrichment analysis on properties of co-expressed genes.
Keywords: Annotation; Gene ontology; Long non-coding RNAs; Prediction
Year: 2021 PMID: 33931103 PMCID: PMC8086094 DOI: 10.1186/s13104-021-05580-1
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Fig. 1 The GAPGOM pipeline for annotation prediction. The flowchart shows the main steps of GAPGOM. For a query gene and a certain property type, a gene set is defined based on similarity over a property library. The gene set is then populated with data from an annotation library, and enriched terms of the gene set are identified. The predicted annotation may optionally be compared to the actual annotation if the query gene is part of a benchmark dataset (i.e., one with known annotation).
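The steps in the caption can be illustrated with a minimal Python sketch. This is not the GAPGOM implementation or its API: the function names (`pearson`, `hypergeom_tail`, `predict_terms`), the use of Pearson correlation as the similarity measure, the hypergeometric test for enrichment, the `top_k` and `alpha` parameters, and all gene and term labels in the toy data are assumptions made for illustration only.

```python
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / comb(N, n)


def predict_terms(query, expression, annotation, top_k=3, alpha=0.05):
    """Return (term, p-value) pairs enriched among the query's co-expressed genes."""
    # Step 1: define the gene set by similarity over the property library
    # (here: the top_k genes most correlated with the query).
    background = [g for g in expression if g != query]
    ranked = sorted(
        background,
        key=lambda g: pearson(expression[query], expression[g]),
        reverse=True,
    )
    gene_set = ranked[:top_k]

    # Step 2: populate the gene set from the annotation library.
    candidate_terms = {t for g in gene_set for t in annotation.get(g, ())}

    # Step 3: keep terms that are over-represented in the gene set.
    enriched = []
    for term in sorted(candidate_terms):
        K = sum(term in annotation.get(g, ()) for g in background)
        k = sum(term in annotation.get(g, ()) for g in gene_set)
        p = hypergeom_tail(k, K, len(gene_set), len(background))
        if p <= alpha:
            enriched.append((term, p))
    return sorted(enriched, key=lambda item: item[1])


# Toy data (hypothetical gene names, profiles, and term labels).
expression = {
    "lnc1": [1, 2, 3, 4, 5],
    "gA": [2, 4, 6, 8, 10],
    "gB": [1, 2, 3, 4, 6],
    "gC": [2, 3, 4, 5, 7],
    "gD": [5, 4, 3, 2, 1],
    "gE": [3, 1, 4, 1, 5],
    "gF": [5, 5, 1, 1, 3],
    "gG": [2, 2, 2, 3, 2],
    "gH": [9, 1, 8, 2, 7],
}
annotation = {
    "gA": {"TERM_X"},
    "gB": {"TERM_X"},
    "gC": {"TERM_X", "TERM_Y"},
    "gD": {"TERM_Y"},
    "gE": {"TERM_Y"},
    "gF": {"TERM_Y"},
    "gG": {"TERM_Z"},
    "gH": {"TERM_Y"},
}

predicted = predict_terms("lnc1", expression, annotation)
print(predicted)
```

In this toy setting the three most co-expressed genes all carry `TERM_X`, so it is the only term that passes the threshold. A real analysis would also need multiple-testing correction, which is omitted here.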
Main alternative approaches for annotation prediction with GAPGOM tools
| Approach | Tool | Property | Annotation | Benchmark (a) |
|---|---|---|---|---|
| 1 | lncRNA2GOA | Expression | GO | TopoICSim |
| 2 | lncRNA2GOA | Expression | Class | 2 × 2 |
| 3 | TopoICSim | GO | Class | 2 × 2 |
(a) 2 × 2 represents benchmarking using a confusion matrix over true and false positive and negative predictions
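As a small illustration of the 2 × 2 benchmark, the sketch below tallies a confusion matrix for binary class predictions in Python. The function name `confusion_2x2`, the boolean class encoding, and the toy labels are assumptions for illustration; the accuracy figure is a standard derived quantity and is not taken from the source.

```python
def confusion_2x2(actual, predicted):
    """Count true/false positives and negatives for boolean class labels."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            counts["TP"] += 1
        elif not a and p:
            counts["FP"] += 1
        elif a and not p:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Toy benchmark: True means the gene belongs to the class of interest.
actual_class = [True, True, False, False, True, False]
predicted_class = [True, False, False, True, True, False]

matrix = confusion_2x2(actual_class, predicted_class)
accuracy = (matrix["TP"] + matrix["TN"]) / len(actual_class)
print(matrix, accuracy)
```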