Adil Salhi, Sónia Negrão, Magbubah Essack, Mitchell J L Morton, Salim Bougouffa, Rozaimi Razali, Aleksandar Radovanovic, Benoit Marchand, Maxat Kulmanov, Robert Hoehndorf, Mark Tester, Vladimir B Bajic.
Abstract
Tomato is the most economically important horticultural crop used as a model to study plant biology and particularly fruit development. Knowledge obtained from tomato research initiated improvements in tomato and, being transferable to other such economically important crops, has led to a surge of tomato-related research and published literature. We developed the DES-TOMATO knowledgebase (KB) for exploration of information related to tomato. Information exploration is enabled through terms from 26 dictionaries and combinations of these terms. To illustrate the utility of DES-TOMATO, we provide several examples of how one can efficiently use this KB to retrieve known or potentially novel information. DES-TOMATO is free for academic and nonprofit users and can be accessed at http://cbrc.kaust.edu.sa/des_tomato/, using any of the mainstream web browsers, including Firefox, Safari and Chrome.
Year: 2017 PMID: 28729549 PMCID: PMC5519719 DOI: 10.1038/s41598-017-05448-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Workflow used within DES to create a KB such as DES-TOMATO.
List of dictionaries used in DES-TOMATO.
| Dictionary | Enriched Unique Terms in the KB | Source |
|---|---|---|
| Chemical Entities of Biological Interest (ChEBI) | 4561 | pre-existing in DES |
| Metabolites (MetaboLights) | 1556 | pre-existing in DES |
| Enzymes (IntEnz) | 1182 | pre-existing in DES |
| Toxins (T3DB) | 886 | pre-existing in DES |
| Antibiotics | 244 | pre-existing in DES |
| Industrially Important Enzymes (EC) | 215 | pre-existing in DES |
| Pathways (KEGG, Reactome, UniPathway, PANTHER) | 576 | pre-existing in DES |
| Biological Process (GO) | 1288 | pre-existing in DES |
| Molecular Function (GO) | 474 | pre-existing in DES |
| Cellular Component (GO) | 466 | pre-existing in DES |
| Green Plants Genes (EntrezGene) | 16579 | newly compiled |
| Solanaceae Genes (EntrezGene) | 2994 | newly compiled |
| Bacteria Genes (EntrezGene) | 2879 | pre-existing in DES |
| Fungi Genes (EntrezGene) | 2758 | pre-existing in DES |
| Viruses Genes (EntrezGene) | 971 | pre-existing in DES |
| Archaea Genes (EntrezGene) | 536 | pre-existing in DES |
| Green Plants (NCBI Taxonomy) | 5733 | newly compiled |
| Fungi (NCBI Taxonomy) | 2426 | pre-existing in DES |
| Bacteria (NCBI Taxonomy) | 1498 | pre-existing in DES |
| Viruses (NCBI Taxonomy) | 1109 | pre-existing in DES |
| Solanaceae (NCBI Taxonomy) | 297 | newly compiled |
| Source Microbes for Antibiotics | 113 | pre-existing in DES |
| Archaea (NCBI Taxonomy) | 40 | pre-existing in DES |
| Tomato Species (NCBI Taxonomy) | 15 | newly compiled |
| Plant-related Vocabulary | 2688 | newly compiled |
| Stress-related Vocabulary | 759 | newly compiled |
References for the data sources indicated in Table 1 are as follows: ChEBI (Hastings et al.[47]), MetaboLights[113], IntEnz[114], T3DB[115], Industrially Important Enzymes EC[116, 117], GO[118], KEGG[119], Reactome[120], PANTHER[121], UniPathway[122], EntrezGene[48], NCBI Taxonomy[123], KOBAS[52].
Glossary.
| Terms or Phrases | Definition |
|---|---|
| Enriched Terms | Biological terms or keywords (e.g. lycopene, peroxidase activity, …) |
| Enriched Term Pairs | Connection/association (possibly biological) between two terms that is inferred based on the co-occurrence of these terms (e.g. signaling and salicylic acid; lycopene and carotenoids; …) |
| Hypothesis | New connection of terms; a starting point for possible further investigation (e.g. AGO5 and ‘DNA methylation’; SNI1 and ‘jasmonic acid’) |
| KOBAS Pathways | Enriched pathways that were identified by the set of genes and/or proteins extracted from tomato-based literature |
| Dictionary | A set of terms, which are categorized into themes (e.g. Pathways, Metabolites, or Genes) |
| Network Viewer | A tool for the visualization of term associations as a graph of interlinked nodes |
| Term Co-occurrences | A list of all the enriched terms from all dictionaries that are potentially associated with the term in question |
| Term Link Sources | A graph/pie chart that visualizes the distribution of data sources (dictionaries) from which associations to the term in question are drawn |
Note that sometimes ontologies reuse and integrate entities from other ontologies/sources when appropriate, such as is the case for FLOPO and PTO ontologies.
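The glossary defines an enriched term pair as an association inferred from the co-occurrence of two terms. A minimal sketch of that counting step is given below, assuming plain-text documents and naive case-insensitive substring matching. The function name `count_term_pairs` and the toy documents are hypothetical, and the sketch leaves out the enrichment statistics that DES itself applies.

```python
from collections import Counter
from itertools import combinations


def count_term_pairs(documents, dictionary_terms):
    """Count, for each pair of terms, how many documents mention both.

    Matching is a naive case-insensitive substring test (an assumption
    made for illustration; a real text-mining pipeline would tokenize
    and normalize terms).
    """
    pair_counts = Counter()
    for text in documents:
        lowered = text.lower()
        # Sort so that each pair always appears in the same order.
        present = sorted(t for t in dictionary_terms if t.lower() in lowered)
        for pair in combinations(present, 2):
            pair_counts[pair] += 1
    return pair_counts


# Toy placeholder documents (hypothetical, not taken from the KB).
toy_documents = [
    "toy abstract: lycopene and carotenoids",
    "toy abstract: salicylic acid signaling",
    "toy abstract: lycopene, carotenoids, signaling",
]
terms = ["lycopene", "carotenoids", "salicylic acid", "signaling"]

pair_counts = count_term_pairs(toy_documents, terms)
for pair, n_docs in pair_counts.most_common():
    print(pair, n_docs)
```

The per-pair document count plays the same role as the publication counts shown on the edges of the network figures, although how DES derives those numbers is not described in this record.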
Examples of gene-gene associations identified in the KB with semantic similarity equal to 1.0.
| Gene Symbol/Description | Gene Symbol/Description | Common Annotations |
|---|---|---|
| | | “protein kinase activity”;“molecular_function”;“GO:0004672” “protein binding”;“molecular_function”;“GO:0005515” “ATP binding”;“molecular_function”;“GO:0005524” “protein phosphorylation”;“biological_process”;“GO:0006468” |
| | | genes are involved in photoreceptor activity (GO:0009881) |
| | | “peroxidase activity”;“molecular_function”;“GO:0004601” “peroxidase activity”;“molecular_function”;“GO:0004601” “response to oxidative stress”;“biological_process”;“GO:0006979” “heme binding”;“molecular_function”;“GO:0020037” “oxidation-reduction process”;“biological_process”;“GO:0055114” |
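To give a feel for why gene pairs that share all their annotations score 1.0, the sketch below uses Jaccard overlap of GO identifier sets. This is a simple stand-in chosen for illustration: this record does not state which semantic similarity measure was used, and ontology-aware measures also take the GO hierarchy into account. The gene names `gene_x`, `gene_y` and `gene_z` are hypothetical; the GO identifiers come from the first row of the table.

```python
def jaccard_similarity(annotations_a, annotations_b):
    """Jaccard overlap of two annotation sets (a stand-in, not the paper's measure)."""
    a, b = set(annotations_a), set(annotations_b)
    if not a and not b:
        return 0.0  # convention chosen here: two unannotated genes score 0
    return len(a & b) / len(a | b)


# GO identifiers listed for the kinase-related pair in the table.
kinase_terms = {"GO:0004672", "GO:0005515", "GO:0005524", "GO:0006468"}

gene_x = set(kinase_terms)               # hypothetical gene with all four terms
gene_y = set(kinase_terms)               # hypothetical gene with identical terms
gene_z = {"GO:0004672", "GO:0005524"}    # hypothetical gene with only two of them

print(jaccard_similarity(gene_x, gene_y))  # identical sets
print(jaccard_similarity(gene_x, gene_z))  # partial overlap
```

Under this stand-in, identical annotation sets give the maximum score and partial overlap gives a lower one, which mirrors the thresholded counts in the table that follows.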
Change in the number of gene pairs as the required semantic similarity level increases.
| Semantic Similarity | Number of Gene Pairs | Percentage (out of 2,227) |
|---|---|---|
| >=0.4 | 1098 | 49% |
| >=0.45 | 991 | 45% |
| >=0.5 | 943 | 42% |
| >=0.55 | 913 | 41% |
| >=0.6 | 875 | 39% |
| >=0.65 | 832 | 37% |
| >=0.7 | 794 | 36% |
| >=0.75 | 760 | 34% |
| >=0.8 | 697 | 31% |
| >=0.85 | 674 | 30% |
| >=0.9 | 613 | 28% |
| >=0.95 | 579 | 26% |
| =1 | 575 | 26% |
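The percentage column can be recomputed from the pair counts and the total of 2,227 pairs given in the table header. The sketch below does this; the variable and function names are illustrative, and values are printed to one decimal place, so they may differ in the last digit from the rounded column.

```python
TOTAL_PAIRS = 2227  # total stated in the table header

# Threshold -> number of gene pairs, copied from the table.
pairs_at_threshold = {
    0.40: 1098, 0.45: 991, 0.50: 943, 0.55: 913, 0.60: 875,
    0.65: 832, 0.70: 794, 0.75: 760, 0.80: 697, 0.85: 674,
    0.90: 613, 0.95: 579, 1.00: 575,
}


def percent_of_total(count, total=TOTAL_PAIRS):
    """Share of all gene pairs, in percent."""
    return 100.0 * count / total


for threshold, count in pairs_at_threshold.items():
    print(f"{threshold:.2f}  {count:5d}  {percent_of_total(count):5.1f}%")
```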
Examples of gene-gene pairs that are functionally associated but have no semantic similarity.
| Gene Symbol/Description | Gene Symbol/Description | Common Annotations | Reference |
|---|---|---|---|
| | | Volatile organic compounds (albuterol and 1,3-propanediol) were shown to promote lateral root formation that correlates with an increase in levels of EXP2 and IAA3 in the roots of tomato plants | |
| | | Filament-like plant proteins (FPP) belong to a family of long coiled-coil proteins that interact with the nuclear envelope-associated protein MAF1 | |
| | | Both DAD1 and pirin are mediators of programmed cell death in plants. However, DAD1 was shown to interact with BCL2 family members, while pirin plays more of a downstream role as it forms an NF-kB, BCL3, Pirin complex that is capable of modulating NF-kB-driven gene expression through interaction with an NF-kB DNA-binding site. | |
Plant-related ontologies used to compile the “Plant-related Vocabulary”.
| Ontology | Description |
|---|---|
| PO | |
| FLOPO | |
| PTO/TO | |
| PECO/EO | |
| SPTO | |
Figure 2. Step-by-step illustration of how DES-TOMATO can be used to identify components of genetic resistance for P. syringae (marked in yellow). The pink octagons represent the “Solanaceae Genes” dictionary; the dark green triangles represent the “Bacteria (NCBI Taxonomy)” dictionary; and the pale green trapezoids represent the “Plant-related Vocabulary” dictionary. The edge color is distributed across a color spectrum from hot/red (high frequency co-occurrence/strong association) to cold/blue (small number of co-occurrences, weaker association). The numbers on the edges provide the number of publications that link the associated nodes.
Figure 3. Step-by-step illustration of how DES-TOMATO can be used to find relevant candidate genes involved in salinity tolerance by focusing on Na+ homeostasis and plasma membrane H+-ATPases (in yellow). In the network, the pink octagons represent the “Solanaceae Genes” dictionary. The edge color is distributed across a color spectrum from hot/red (high frequency co-occurrence/strong association) to cold/blue (small number of co-occurrences, weaker association). The numbers on the edges provide the number of publications that link the associated nodes.
Figure 4. A simple demonstration of how to use “Explore hypotheses”. Boxed in yellow are the criteria used to direct or test the hypotheses generated.
Figure 5. A simple demonstration of how S. lycopersicum enriched pathways can be explored using “KOBAS Pathways”. Boxed in yellow are the criteria adapted for this exploration process.