Jingshan Huang1, Fernando Gutierrez2, Harrison J Strachan1, Dejing Dou2, Weili Huang3, Barry Smith4, Judith A Blake5, Karen Eilbeck6, Darren A Natale7, Yu Lin8, Bin Wu9, Nisansa de Silva2, Xiaowei Wang10, Zixing Liu11, Glen M Borchert12, Ming Tan11, Alan Ruttenberg13.
Abstract
As a special class of non-coding RNAs (ncRNAs), microRNAs (miRNAs) perform important roles in numerous biological and pathological processes. The realization of miRNA functions depends largely on how miRNAs regulate specific target genes. It is therefore critical to identify, analyze, and cross-reference miRNA-target interactions to better explore and delineate miRNA functions. Semantic technologies can help in this regard. We previously developed a miRNA domain-specific application ontology, Ontology for MIcroRNA Target (OMIT), whose goal was to serve as a foundation for semantic annotation, data integration, and semantic search in the miRNA field. In this paper we describe our continuing effort to develop the OMIT and demonstrate its use within a semantic search system, OmniSearch, designed to facilitate knowledge capture of miRNA-target interaction data. The important changes in the current version of the OMIT are: (1) following a modularized ontology design (with 2559 terms imported from the NCRO ontology); (2) encoding all 1884 human miRNAs (vs. 300 in previous versions); and (3) setting up a GitHub project site along with an issue tracker for more effective community collaboration on ontology development. The OMIT ontology is free and open to all users, accessible at: http://purl.obolibrary.org/obo/omit.owl. The OmniSearch system is also free and open to all users, accessible at: http://omnisearch.soc.southalabama.edu/index.php/Software.
Keywords: Biomedical ontology; Data annotation; Data integration; Non-coding RNA; Ontology development; SPARQL query; Semantic search; Target gene; microRNA
Year: 2016 PMID: 27175225 PMCID: PMC4863347 DOI: 10.1186/s13326-016-0064-2
Source DB: PubMed Journal: J Biomed Semantics
A subset of imported terms and relations
| Imported term or relation | Source ontology | Original ID |
|---|---|---|
| RO:part of | Relation Ontology | BFO_0000050 |
| RO:participates in | Relation Ontology | RO_0000056 |
| RO:has participant | Relation Ontology | RO_0000057 |
| BFO:entity | Basic Formal Ontology | BFO_0000001 |
| BFO:continuant | Basic Formal Ontology | BFO_0000002 |
| BFO:independent continuant | Basic Formal Ontology | BFO_0000004 |
| BFO:occurrent | Basic Formal Ontology | BFO_0000003 |
| BFO:material entity | Basic Formal Ontology | BFO_0000040 |
| CHEBI:molecular entity | Chemical Entities of Biological Interest Ontology | CHEBI_23367 |
| CHEBI:ribonucleic acid | Chemical Entities of Biological Interest Ontology | CHEBI_33697 |
| CHEBI:ribosomal RNA | Chemical Entities of Biological Interest Ontology | CHEBI_18111 |
| CHEBI:small nuclear RNA | Chemical Entities of Biological Interest Ontology | CHEBI_74035 |
| CHEBI:transfer RNA | Chemical Entities of Biological Interest Ontology | CHEBI_17843 |
| NCRO:human_miRNA | Non-coding RNA Ontology | NCRO_0000810 |
| NCRO:hsa-miR-125b-1-3p | Non-coding RNA Ontology | NCRO_0003283 |
| NCRO:hsa-miR-125b-2-3p | Non-coding RNA Ontology | NCRO_0003284 |
| NCRO:hsa-miR-125b-5p | Non-coding RNA Ontology | NCRO_0003282 |
| NCRO:miRNA_target_gene | Non-coding RNA Ontology | NCRO_0000025 |
| NCRO:miRNA_and_target_gene_binding | Non-coding RNA Ontology | NCRO_0000003 |
| NCRO:protein_miRNA_promoter_binding | Non-coding RNA Ontology | NCRO_0000011 |
| IAO:information content entity | Information Artifact Ontology | IAO_0000030 |
| IAO:measurement datum | Information Artifact Ontology | IAO_0000109 |
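
The "Original ID" values above follow the OBO convention of a prefix, an underscore, and a numeric local part. A minimal sketch of how such an ID could be expanded into a full IRI or a compact CURIE is given below; it assumes the standard OBO PURL base (the same base used by the OMIT download URL in the abstract), and the helper names `obo_iri` and `obo_curie` are hypothetical, not part of OMIT or OmniSearch.

```python
# Assumption: term IRIs share the PURL base used by the OMIT download link.
OBO_BASE = "http://purl.obolibrary.org/obo/"


def obo_iri(original_id: str) -> str:
    """Expand an OBO-style ID such as 'BFO_0000050' into a full IRI."""
    return OBO_BASE + original_id


def obo_curie(original_id: str) -> str:
    """Convert 'BFO_0000050' into the CURIE form 'BFO:0000050'."""
    prefix, _, local = original_id.partition("_")
    return f"{prefix}:{local}"


if __name__ == "__main__":
    # IDs taken from the table of imported terms.
    for term_id in ("BFO_0000050", "NCRO_0003282", "IAO_0000109"):
        print(obo_curie(term_id), "->", obo_iri(term_id))
```
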
Fig. 1 The design of core terms and relations in the OMIT ontology (both terms and relations are represented in the format of PREFIX:label)
A subset of new OMIT terms
| OMIT new term | Direct parent term | Human-readable explanation |
|---|---|---|
| computationally_asserted_evidence | IAO:information content entity | Evidence obtained from some computational methods. |
| information_from_miRNA_target_prediction_database | OMIT:computationally_asserted_evidence | Records obtained from various miRNA target prediction databases. |
| prediction_from_miRDB | OMIT:information_from_miRNA_target_prediction_database | Records specifically obtained from the miRDB database. |
| prediction_from_TargetScan | OMIT:information_from_miRNA_target_prediction_database | Records specifically obtained from the TargetScan database. |
| prediction_from_miRanda | OMIT:information_from_miRNA_target_prediction_database | Records specifically obtained from the miRanda database. |
| target_score_in_miRDB | IAO:measurement datum | The score of some specific miRNA-target binding prediction from the miRDB database. |
| gene_context_score_in_TargetScan | IAO:measurement datum | The context score of some specific miRNA-target binding prediction from the TargetScan database. |
| mirSVR_score_in_miRanda | IAO:measurement datum | The mirSVR score of some specific miRNA-target binding prediction from the miRanda database. |
| information_from_NCBI_gene | IAO:information content entity | Records obtained from NCBI Gene according to gene IDs or gene symbols. |
| information_from_NCBI_nucleotide | IAO:information content entity | Records obtained from NCBI Nucleotide according to GenBank Accession numbers. |
| information_from_PubMed | IAO:information content entity | Records obtained from the PubMed database according to PMIDs. |
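
The direct-parent column above defines a small class hierarchy. As an illustration only, the sketch below stores a few of those parent links in a dictionary and walks upward from a term to its listed ancestors. The `PARENT` mapping and the `ancestors` helper are hypothetical names; only the term labels come from the table, and the walk stops at the first term whose parent is not listed here.

```python
# Child -> direct parent, copied from the table of new OMIT terms.
PARENT = {
    "OMIT:computationally_asserted_evidence": "IAO:information content entity",
    "OMIT:information_from_miRNA_target_prediction_database":
        "OMIT:computationally_asserted_evidence",
    "OMIT:prediction_from_miRDB":
        "OMIT:information_from_miRNA_target_prediction_database",
    "OMIT:prediction_from_TargetScan":
        "OMIT:information_from_miRNA_target_prediction_database",
    "OMIT:prediction_from_miRanda":
        "OMIT:information_from_miRNA_target_prediction_database",
    "OMIT:target_score_in_miRDB": "IAO:measurement datum",
}


def ancestors(term: str) -> list[str]:
    """Return the chain of parents from nearest to farthest."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


if __name__ == "__main__":
    for parent in ancestors("OMIT:prediction_from_miRDB"):
        print(parent)
```
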
A subset of new OMIT relations
| New relation | Domain | Range | Human-readable explanation |
|---|---|---|---|
| OMIT:miRNA_target_assumption_based_on | NCRO:miRNA_and_target_gene_binding | OMIT:computationally_asserted_evidence | Specific miRNA-target binding prediction is based on some computationally asserted evidence. |
| OMIT:is_quality_measurement_of | IAO:measurement datum | OMIT:computationally_asserted_evidence | A piece of measurement datum (e.g., the target score in miRDB) is a quality measurement of computationally asserted evidence. |
| OMIT:is_gene_template_of_protein | NCRO:miRNA_target_gene | OMIT:target_protein | A miRNA target gene serves as a template of relevant protein. |
| RO:has participant | OMIT:prediction_from_miRDB | SO:miRNA | Each miRNA-target binding prediction record has one miRNA as a participant. |
| RO:has participant | OMIT:prediction_from_miRDB | NCRO:miRNA_target_gene | Each miRNA-target binding prediction record has one target as a participant. |
| RO:part of | OMIT:target_score_in_miRDB | OMIT:prediction_from_miRDB | Each miRNA-target binding prediction record from miRDB contains one score. |
| RO:part of | OMIT:PubMed_summary_in_NCBI_gene | OMIT:information_from_NCBI_gene | Each record from NCBI Gene contains one or more PubMed summaries. |
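
To make the domain/range reading of these relations concrete, the sketch below encodes one miRDB prediction record as subject-predicate-object triples and answers simple pattern queries over them. This is an illustration under stated assumptions, not how OmniSearch stores its data: the instance names (`ex:record_1`, `ex:mirna_1`, `ex:gene_1`, `ex:score_1`) and the `match` helper are hypothetical, and only the class and relation labels are taken from the tables.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# One hypothetical miRDB prediction record, shaped by the relations table.
TRIPLES: List[Triple] = [
    ("ex:record_1", "rdf:type", "OMIT:prediction_from_miRDB"),
    ("ex:mirna_1", "rdf:type", "NCRO:hsa-miR-125b-5p"),
    ("ex:gene_1", "rdf:type", "NCRO:miRNA_target_gene"),
    ("ex:score_1", "rdf:type", "OMIT:target_score_in_miRDB"),
    # Each record has one miRNA and one target gene as participants.
    ("ex:record_1", "RO:has participant", "ex:mirna_1"),
    ("ex:record_1", "RO:has participant", "ex:gene_1"),
    # Each record contains one score, which also measures its quality.
    ("ex:score_1", "RO:part of", "ex:record_1"),
    ("ex:score_1", "OMIT:is_quality_measurement_of", "ex:record_1"),
]


def match(s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the given pattern; None acts as a wildcard."""
    return [t for t in TRIPLES
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


if __name__ == "__main__":
    # Which entities participate in the prediction record?
    for _, _, participant in match(s="ex:record_1", p="RO:has participant"):
        print(participant)
```
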
Fig. 2 Semantic annotation and data integration flowchart in the OmniSearch system
Fig. 3 Semantic search architecture in the OmniSearch system
Fig. 4 GUI design in the OmniSearch system
Fig. 5 Search results for the question of “What is the role of hsa-miR-125b-5p in cancer drug resistance?”
The system time and saved time for end users
| Query | First search criterion | Second search criterion | System time (seconds) | User time (seconds) | Percentage of saved time for end users | Percentage of saved time on DAVID analysis | Percentage of saved time on result comparison |
|---|---|---|---|---|---|---|---|
| 1 | hsa-miR-1231 | cell movement | 2.51 | 10 | 62 % | 55 % | 61 % |
| 2 | hsa-miR-1288-5p | cell proliferation | 2.89 | 9 | 61 % | 51 % | 62 % |
| 3 | hsa-miR-143-3p | mitosis | 5.54 | 10 | 61 % | 52 % | 60 % |
| 4 | hsa-miR-192-5p | leukemic infiltration | 2.24 | 8 | 53 % | 53 % | 59 % |
| 5 | hsa-miR-216a-5p | drug resistance, multiple | 4.09 | 11 | 65 % | 55 % | 62 % |
| 6 | hsa-miR-29c-3p | recurrence | 8.99 | 11 | 68 % | 53 % | 63 % |
| 7 | hsa-miR-3155a | dna cleavage | 1.21 | 6 | 53 % | 47 % | 55 % |
| 8 | hsa-miR-320b | drug resistance | 17.59 | 18 | 73 % | 51 % | 66 % |
| 9 | hsa-miR-3622a-5p | entosis | 0.30 | 6 | 51 % | 43 % | 57 % |
| 10 | hsa-miR-371b-5p | mitochondrial dynamics | 3.89 | 12 | 66 % | 59 % | 64 % |
| 11 | hsa-miR-3934-5p | dna methylation | 0.93 | 8 | 61 % | 45 % | 59 % |
| 12 | hsa-miR-4263 | mutagenesis | 1.65 | 6 | 52 % | 46 % | 56 % |
| 13 | hsa-miR-4431 | mitochondrial degradation | 0.17 | 6 | 53 % | 47 % | 55 % |
| 14 | hsa-miR-4505 | cell transformation, neoplastic | 4.25 | 10 | 63 % | 55 % | 61 % |
| 15 | hsa-miR-4648 | cell polarity | 0.71 | 6 | 52 % | 45 % | 57 % |
| 16 | hsa-miR-4700-3p | neoplasm regression, spontaneous | 1.56 | 7 | 53 % | 51 % | 59 % |
| 17 | hsa-miR-4756-5p | endocytosis | 3.76 | 10 | 67 % | 53 % | 62 % |
| 18 | hsa-miR-4802-3p | drug resistance, microbial | 1.67 | 7 | 55 % | 47 % | 59 % |
| 19 | hsa-miR-501-3p | insulin resistance | 1.78 | 8 | 57 % | 43 % | 61 % |
| 20 | hsa-miR-520a-3p | ubiquitination | 13.31 | 17 | 75 % | 55 % | 65 % |
| Average | - | - | 3.95 | 9.30 | 60.05 % | 50.30 % | 60.15 % |
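
The "Average" row can be rechecked directly from the 20 query rows. The short sketch below recomputes the two time averages as plain arithmetic means; the variable names are illustrative, and the values are copied from the table.

```python
from statistics import mean

# System time and user time (seconds) for queries 1-20, copied from the table.
system_time = [2.51, 2.89, 5.54, 2.24, 4.09, 8.99, 1.21, 17.59, 0.30, 3.89,
               0.93, 1.65, 0.17, 4.25, 0.71, 1.56, 3.76, 1.67, 1.78, 13.31]
user_time = [10, 9, 10, 8, 11, 11, 6, 18, 6, 12,
             8, 6, 6, 10, 6, 7, 10, 7, 8, 17]

avg_system = round(mean(system_time), 2)
avg_user = round(mean(user_time), 2)

print(f"average system time: {avg_system:.2f} s")  # 3.95 s
print(f"average user time:   {avg_user:.2f} s")    # 9.30 s
```
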
Reduced number of publications after applying the MeSH-term filter “drug resistance”
| Target gene symbol | Original number of papers | Number of papers after MeSH filtering | Percentage reduced |
|---|---|---|---|
| ABCC5 | 50 | 16 | 68 % |
| DPH2 | 13 | 2 | 85 % |
| FOXQ1 | 31 | 3 | 90 % |
| CIAPIN1 | 43 | 4 | 91 % |
| SLC38A9 | 12 | 1 | 92 % |
| MCL1 | 452 | 31 | 93 % |
| MKNK2 | 30 | 2 | 93 % |
| BAG4 | 32 | 2 | 94 % |
| ARID3B | 18 | 1 | 94 % |
| HSPB2 | 79 | 4 | 95 % |
| THEMIS2 | 20 | 1 | 95 % |
| BAK1 | 266 | 11 | 96 % |
| SULT4A1 | 27 | 1 | 96 % |
| FUT4 | 57 | 2 | 96 % |
| GPC6 | 29 | 1 | 97 % |
| DDX54 | 29 | 1 | 97 % |
| MBD1 | 58 | 2 | 97 % |
| PRDM1 | 118 | 4 | 97 % |
| DTNB | 30 | 1 | 97 % |
| LIN28A | 91 | 3 | 97 % |
| SIRT7 | 33 | 1 | 97 % |
| ZBTB7A | 67 | 2 | 97 % |
| NCOR2 | 240 | 7 | 97 % |
| TTPA | 35 | 1 | 97 % |
| MAP3K10 | 35 | 1 | 97 % |
| SGPL1 | 36 | 1 | 97 % |
| MYO18A | 36 | 1 | 97 % |
| EIF4EBP1 | 217 | 6 | 97 % |
| LIMK1 | 109 | 3 | 97 % |
| TP53INP1 | 37 | 1 | 97 % |
| CYTH1 | 39 | 1 | 97 % |
| SLC7A1 | 41 | 1 | 98 % |
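
The "Percentage reduced" column is consistent with the share of papers removed by the filter, rounded to the nearest whole percent. The record does not state the formula, so the sketch below is an inference from the tabulated values, and `percent_reduced` is a hypothetical helper name.

```python
def percent_reduced(original: int, filtered: int) -> int:
    """Share of papers removed by the MeSH filter, as a whole percent."""
    if original <= 0:
        raise ValueError("original paper count must be positive")
    return round(100 * (original - filtered) / original)


if __name__ == "__main__":
    # (gene symbol, original papers, papers after filtering) from the table.
    rows = [("ABCC5", 50, 16), ("MCL1", 452, 31), ("SLC7A1", 41, 1)]
    for gene, original, filtered in rows:
        print(gene, f"{percent_reduced(original, filtered)} %")
```
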
An example set of publications correctly/incorrectly filtered by “drug resistance”
| Gene symbol | Total number of publications without applying the “drug resistance” filter | Total number of publications that contain the MeSH term “drug resistance” | Total number of incorrectly filtered publications |
|---|---|---|---|
| IRF4 | 130 | 3 | 0 |
| ARID3B | 18 | 1 | 0 |
| SGPL1 | 36 | 1 | 0 |
| ESRRA | 131 | 3 | 0 |
| PAFAH1B1 | 129 | 1 | 0 |
| ETS1 | 287 | 5 | 0 |
| TTPA | 35 | 1 | 0 |
| DVL3 | 60 | 1 | 1 |
| THEMIS2 | 20 | 1 | 0 |
| VTCN1 | 66 | 1 | 0 |
| WDR5 | 128 | 1 | 0 |
| ETV6 | 198 | 4 | 0 |
| TAZ | 74 | 1 | 0 |
| IL6R | 300 | 1 | 0 |
| DPH2 | 13 | 2 | 0 |
| BTG2 | 84 | 1 | 0 |
| CYP24A1 | 146 | 2 | 0 |
| LIN28A | 91 | 3 | 0 |
| TRPS1 | 69 | 1 | 0 |
| CSNK2A1 | 619 | 5 | 3 |
| TP53INP1 | 37 | 1 | 0 |
| GPC6 | 29 | 1 | 0 |
| DICER1 | 291 | 3 | 0 |