Ryan A Miller, Martina Kutmon, Anwesha Bohler, Andra Waagmeester, Chris T Evelo, Egon L Willighagen.
Abstract
To grasp the complexity of biological processes, biological knowledge is often translated into schematic diagrams of, for example, signalling and metabolic pathways. These pathway diagrams describe relevant connections between biological entities and incorporate domain knowledge in a visual format that is easier for humans to interpret. These diagrams can also be represented in machine-readable formats, as done in the KEGG, Reactome, and WikiPathways databases. However, while humans are good at interpreting the message of the creators of diagrams, algorithms struggle when the diversity in drawing approaches increases. WikiPathways supports multiple drawing styles, which need harmonizing to offer semantically enriched access. Particularly challenging here are the interactions between the biological entities that underlie the biological causality. These interactions provide information about the biological process (metabolic conversion, inhibition, etc.), the direction, and the participating entities. Availability of the interactions in a semantic and harmonized format is essential for searching the full network of biological interactions. Here we study how the graphically modelled biological knowledge in diagrams can be semantified and harmonized, and exemplify how the resulting data is used to programmatically answer biological questions. We find that we can translate graphically modelled knowledge to a sufficient degree into a semantic model and discuss some of the current limitations. We then use this to show that reproducible notebooks can be used to explore up- and downstream targets of MECP2 and to analyse the sphingolipid metabolism. Our results demonstrate that most of the graphical biological knowledge from WikiPathways is modelled into the semantic layer with the semantic information intact and connectivity information preserved.
Being able to evaluate how biological elements affect each other is useful and allows, for example, the identification of upstream or downstream targets that will have a similar effect when modified.
Year: 2022 PMID: 35436299 PMCID: PMC9015122 DOI: 10.1371/journal.pone.0263057
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Abbreviations for semantic web technologies used to harmonize the biological interaction information from WikiPathways.
| Abbreviation | Full Name/Meaning |
|---|---|
| GPML | Graphical Pathway Markup Language |
| GPMLRDF | RDF for Graphical Pathway Markup Language |
| MIM | Molecular Interaction Map |
| RDF | Resource Description Framework |
| SBGN | Systems Biology Graphical Notation |
| ShEx | Shape Expressions |
| SPARQL | SPARQL Protocol and RDF Query Language |
| WikiPathways RDF | The combination of GPMLRDF and WPRDF |
| WPRDF | RDF for WikiPathways |
Fig 1. Differences in drawing of MIM vs SBGN inhibition interaction.
A shows a MIM inhibition interaction. B shows an SBGN inhibition interaction.
Datanode type counts, as defined by the WikiPathways ontology.
The datanode counts for each type of node found in the WikiPathways RDF.
| Datanode Type | Count (WPRDF) | Count (GPMLRDF but not WPRDF) |
|---|---|---|
| Datanode | 28402 | —– |
| GeneProduct | 21270 | 1084 |
| Protein | 8255 | 141 |
| Metabolite | 4038 | 219 |
| RNA | 1204 | 66 |
| Complex | 980 | 16 |
| Unknown | —– | 218 |
| Pathway | —– | 250 |
Interaction type counts, as defined in the WikiPathways ontology.
The sum of DirectedInteraction and NonDirected equals the Interaction total. Of the directed interactions, subsets are typed as Conversion, Inhibition, etc. The NonSpecified interactions are a subset of the NonDirected interactions. More than 12 thousand interactions are found only in the GPMLRDF.
| Interaction Type | Count (WPRDF) | Count (GPMLRDF but not WPRDF) |
|---|---|---|
| Interaction | 15525 | —– |
| DirectedInteraction | 11819 | —– |
| Conversion | 1447 | —– |
| Inhibition | 1091 | —– |
| Catalysis | 1231 | —– |
| ComplexBinding | 940 | —– |
| Binding | 1513 | —– |
| Stimulation | 842 | —– |
| TranscriptionTranslation | 256 | —– |
| NonDirected | 3706 | —– |
| NonSpecified | 2766 | —– |
| Unknown | —– | 12287 |
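The arithmetic stated in the caption can be checked directly against the WPRDF column. The following is a minimal sketch with the counts copied from the table above; the dictionary and variable names are illustrative and not part of the WikiPathways RDF.

```python
# Counts copied from the WPRDF column of the interaction type table.
wprdf_counts = {
    "Interaction": 15525,
    "DirectedInteraction": 11819,
    "NonDirected": 3706,
    "NonSpecified": 2766,
}

# The caption states that DirectedInteraction + NonDirected equals the total.
directed_plus_nondirected = (
    wprdf_counts["DirectedInteraction"] + wprdf_counts["NonDirected"]
)
total_matches = directed_plus_nondirected == wprdf_counts["Interaction"]

# NonSpecified is described as a subset of NonDirected, so it cannot be larger.
nonspecified_fits = wprdf_counts["NonSpecified"] <= wprdf_counts["NonDirected"]

# Share of all interactions that carry a direction.
directed_share = wprdf_counts["DirectedInteraction"] / wprdf_counts["Interaction"]

print(f"total matches: {total_matches}")
print(f"NonSpecified fits inside NonDirected: {nonspecified_fits}")
print(f"directed share: {directed_share:.1%}")
```

The subtype rows (Conversion, Catalysis, and so on) are left out of the check on purpose: the participant tables suggest that one interaction can carry several types, so their counts need not add up to the DirectedInteraction total.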
Fig 2. Interaction types that are not found in Table 6.
A shows a complex binding of SULT1A1, SULT1E1 and SULT2A1 that catalyzes the conversion of cis-4-hydroxytamoxifen to trans-4-sulfoxytamoxifen, together with the conversion of PAPS to PAP, found in Tamoxifen Metabolism (wikipathways:WP691). B shows a transcription translation interaction from BST2 to BST2 in Host-pathogen interaction of human corona viruses - MAPK signaling pathway (wikipathways:WP4877).
Interaction identifier counts by data source.
The identifier types for the interactions with a source from the WikiPathways RDF.
| Database Source | Interactions |
|---|---|
| Rhea | 313 |
| Uniprot-TrEMBL | 213 |
| KEGG Pathway | 28 |
| pato | 8 |
| kegg.compound | 8 |
| ChEBI | 6 |
| KEGG Reaction | 3 |
| Reactome | 3 |
| WikiPathways | 2 |
| XMetDB | 2 |
| SPIKE | 2 |
| BIND | 1 |
Participants for interactions.
The table below shows twenty example interactions: the first twenty interactions from the WikiPathways RDF, each with its interaction type and its participants.
| Interaction | Interaction Type | Interaction Participants |
|---|---|---|
| WP3668_r97639/ComplexBinding/b916e | Binding | Complex, GeneProduct |
| WP2879_r94789/ComplexBinding/c939e | Binding | Complex, GeneProduct, Metabolite |
| WP4262_r97132/ComplexBinding/dae4b | Binding | Complex, GeneProduct, Metabolite |
| WP585_r94686/WP/Interaction/ida141949 | Catalysis | GeneProduct, Protein |
| WP2533_r95594/WP/Interaction/adbe3 | Catalysis | Conversion, DirectedInteraction, Interaction, Protein |
| WP1601_r95004/WP/Interaction/ida833b0dc | Catalysis | Conversion, DirectedInteraction, GeneProduct, Interaction |
| WP1423_r94289/WP/Interaction/idde73da53 | Catalysis | DirectedInteraction, GeneProduct, Interaction |
| WP3865_r88186/ComplexBinding/d5e4f | ComplexBinding | Complex, GeneProduct |
| WP2446_r87639/ComplexBinding/e75ff | ComplexBinding | Complex, GeneProduct, Protein, Rna |
| WP2795_r97631/ComplexBinding/b5fa4 | ComplexBinding | Complex, GeneProduct, Protein |
| WP3580_r96434/WP/Interaction/id6d378f23 | Conversion | Metabolite |
| WP134_r94935/WP/Interaction/a5dec | Conversion | Metabolite |
| WP3627_r90137/WP/Interaction/id14d637fe | Conversion | Metabolite |
| WP2436_r97673/WP/Interaction/b1b2f | Conversion | Metabolite |
| WP4149_r94399/WP/Interaction/id30000f59 | Inhibition | GeneProduct, Protein |
| WP2261_r89520/WP/Interaction/id65877034 | Inhibition | GeneProduct, Protein |
| WP306_r97459/WP/Interaction/e8847 | Inhibition | GeneProduct, Protein |
| WP2526_r96312/WP/Interaction/ddfe1 | Stimulation | Protein |
| WP1984_r95143/WP/Interaction/id8ba5f251 | Stimulation | GeneProduct, Metabolite |
| WP1984_r95143/WP/Interaction/iddde89331 | Stimulation | GeneProduct, Protein |
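The interaction identifiers in the first column appear to follow a regular layout. The sketch below splits one into its parts, assuming the pattern `<pathway ID>_r<revision>/<path segments>/<element ID>`; that reading is inferred from the rows above and is not a documented grammar, and the `InteractionRef` type and `parse_interaction_ref` function are hypothetical helpers.

```python
import re
from typing import NamedTuple


class InteractionRef(NamedTuple):
    """Parts of an interaction identifier (hypothetical helper type)."""
    pathway: str
    revision: int
    element_id: str


# Assumed layout: "WP<digits>_r<digits>/<one or more segments>/<element id>".
_REF_PATTERN = re.compile(r"^(WP\d+)_r(\d+)/(?:.+/)?([^/]+)$")


def parse_interaction_ref(ref: str) -> InteractionRef:
    """Split an identifier such as 'WP585_r94686/WP/Interaction/ida141949'."""
    match = _REF_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"unexpected identifier layout: {ref!r}")
    pathway, revision, element_id = match.groups()
    return InteractionRef(pathway, int(revision), element_id)


# Two identifiers taken from the table above.
complex_binding = parse_interaction_ref("WP3668_r97639/ComplexBinding/b916e")
catalysis = parse_interaction_ref("WP585_r94686/WP/Interaction/ida141949")
print(complex_binding)
print(catalysis)
```

Grouping rows by the `pathway` field would then show, for example, that the last two Stimulation rows both come from WP1984.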
Top 20 most frequent directed interactions by participant combination.
The most abundant interaction is a directed interaction between two metabolites. The table shows the interaction participants, how often each combination occurs in the WikiPathways RDF, and the interaction type.
| Interaction Participants | Count | Type |
|---|---|---|
| Metabolite, Metabolite | 2675 | DirectedInteraction |
| GeneProduct, GeneProduct | 1423 | DirectedInteraction |
| GeneProduct, Protein, GeneProduct, Protein | 1334 | DirectedInteraction |
| Metabolite, Metabolite | 1125 | Conversion |
| Metabolite | 474 | DirectedInteraction |
| GeneProduct, Protein, GeneProduct | 445 | DirectedInteraction |
| GeneProduct, GeneProduct, Protein | 438 | DirectedInteraction |
| GeneProduct, Protein | 420 | DirectedInteraction |
| GeneProduct | 315 | DirectedInteraction |
| DirectedInteraction, Interaction, GeneProduct | 315 | DirectedInteraction |
| GeneProduct, Protein, Protein | 292 | DirectedInteraction |
| Metabolite, GeneProduct | 291 | DirectedInteraction |
| DirectedInteraction, Interaction, GeneProduct | 274 | Catalysis |
| Protein, Protein | 273 | Stimulation |
| GeneProduct, GeneProduct | 270 | Inhibition |
| Protein, Protein | 262 | DirectedInteraction |
| DirectedInteraction, Interaction, Conversion, Protein | 227 | DirectedInteraction |
| DirectedInteraction, Interaction, Conversion, Protein | 226 | Catalysis |
| GeneProduct, Metabolite | 180 | DirectedInteraction |
| GeneProduct, DirectedInteraction, Interaction | 151 | DirectedInteraction |
Curation query showing interaction, GPML graph ref from the WikiPathways RDF, and label for node at end of interaction.
| GPML Interaction | GPML Graph Ref | Participant Label |
|---|---|---|
| WP107_r105846/Interaction/d2818 | e82 | EIF4E |
| WP107_r105846/Interaction/cc170 | ceb | ITGB4BP |
| WP107_r105846/Interaction/f3bb6 | fc8 | EIF5A |
| WP1403_r106688/Interaction/ide379f87c | b9666 | GLUT4 |
| WP1403_r106688/Interaction/b1235 | f344c | Calcium |
| WP1403_r106688/Interaction/c4810 | c9726 | FA Synthase |
| WP1403_r106688/Interaction/f8d22 | d9cf5 | cAMP |
| WP1403_r106688/Interaction/d8a35 | a84ee | Leptin |
| WP1403_r106688/Interaction/b166c | ad4a4 | Malonyl-CoA |
| WP1403_r106688/Interaction/e0f9b | d4875 | Fatty Acid Oxidation |
| WP1403_r106688/Interaction/af18d | dcd84 | MEF2B |
| WP1403_r106688/Interaction/e4288 | b35fe | Torc2 |
| WP1403_r106688/Interaction/c0527 | aeb8f | HMG CoA Reductase |
| WP1403_r106688/Interaction/cff59 | d8c91 | HuR |
| WP1403_r106688/Interaction/ae70c | b3840 | Metformin |
| WP1403_r106688/Interaction/d14e4 | b2489 | Glucose |
| WP1403_r106688/Interaction/bedc0 | af2e8 | Raptor |
| WP1403_r106688/Interaction/c7163 | f156e | PI3K (III) |
| WP1403_r106688/Interaction/a04e2 | df1d0 | HNF4A |
| WP1403_r106688/Interaction/d7df8 | f3d7e | 4E-BP1 |
Fig 3. Example ShEx shape for the WikiPathways harmonized Conversion interaction element (RDF shown in the top half), which requires two or more participant IRIs and exactly one source IRI and one target IRI.
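The ShEx shape itself is not reproduced in this text, so the following sketch only mirrors the cardinalities named in the caption in plain Python. The keys `participants`, `source`, and `target` are placeholder names, not the actual WikiPathways predicates, and `conversion_violations` is a hypothetical helper.

```python
from typing import Dict, List


def conversion_violations(node: Dict[str, List[str]]) -> List[str]:
    """Return the cardinality rules from the caption that `node` breaks."""
    problems = []
    if len(node.get("participants", [])) < 2:
        problems.append("needs two or more participant IRIs")
    if len(node.get("source", [])) != 1:
        problems.append("needs exactly one source IRI")
    if len(node.get("target", [])) != 1:
        problems.append("needs exactly one target IRI")
    return problems


# Example built from the GD2 to GD1b conversion in the sphingolipid table.
valid_conversion = {
    "participants": ["hmdb:HMDB0004925", "hmdb:HMDB0004926"],
    "source": ["hmdb:HMDB0004925"],
    "target": ["hmdb:HMDB0004926"],
}
# Same node with the target dropped, to show a failing case.
missing_target = {
    "participants": ["hmdb:HMDB0004925"],
    "source": ["hmdb:HMDB0004925"],
}

print(conversion_violations(valid_conversion))
print(conversion_violations(missing_target))
```

A real validation would run the ShEx shape against the RDF with a ShEx processor; this sketch only makes the three constraints explicit.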
MECP2 upstream and downstream targets.
In the table a source node is shown with its label, as well as the target and its label. The pathway in which the interaction is found and the interaction id are also provided.
| Source | Source Label | Target | Target Label | Pathway | Interaction |
|---|---|---|---|---|---|
| ensembl:ENSG00000169057 | MECP2 | chebi:CHEBI:29987 | glutamate | wikipathways:WP3584_r96364 | Interaction/id4f207df3 |
| ensembl:ENSG00000169057 | MECP2 | chebi:CHEBI:29987 | Glutamate | wikipathways:WP3584_r96364 | Interaction/id4f207df3 |
| ensembl:ENSG00000169057 | MECP2 | ensembl:ENSG00000118260 | CREB1 | wikipathways:WP3584_r96364 | Interaction/ida4a8b443 |
| ensembl:ENSG00000169057 | MECP2 | ensembl:ENSG00000118260 | CREB | wikipathways:WP3584_r96364 | Interaction/ida4a8b443 |
| ensembl:ENSG00000169057 | MECP2 | ensembl:ENSG00000176697 | BDNF | wikipathways:WP3584_r96364 | Interaction/id4a259c62 |
| ensembl:ENSG00000169057 | MECP2 | ensembl:ENSG00000155511 | GRIA1 | wikipathways:WP3584_r96364 | Interaction/id3bcd32 |
| ensembl:ENSG00000169057 | MECP2 | ensembl:ENSG00000155511 | AMPA | wikipathways:WP3584_r96364 | Interaction/id3bcd32 |
| ensembl:ENSG00000169813 | HNRNPF | ensembl:ENSG00000169057 | MECP2 | wikipathways:WP3584_r96364 | Interaction/id1c3def3d |
| ensembl:ENSG00000196132 | MYT1 | ensembl:ENSG00000169057 | MECP2 | wikipathways:WP3584_r96364 | Interaction/id8e7af5c |
| ensembl:ENSG00000169045 | HNRNPH1 | ensembl:ENSG00000169057 | MECP2 | wikipathways:WP3584_r96364 | Interaction/ida6a9fa9d |
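Reading upstream and downstream targets off such a result is a matter of checking on which side of each interaction MECP2 appears. The sketch below does this for the rows in the table above, reduced to source, target, and interaction ID; the variable names are illustrative, and identifiers rather than labels are compared because several rows differ only in label.

```python
# (source, target, interaction) triples copied from the table above.
rows = [
    ("ensembl:ENSG00000169057", "chebi:CHEBI:29987", "Interaction/id4f207df3"),
    ("ensembl:ENSG00000169057", "chebi:CHEBI:29987", "Interaction/id4f207df3"),
    ("ensembl:ENSG00000169057", "ensembl:ENSG00000118260", "Interaction/ida4a8b443"),
    ("ensembl:ENSG00000169057", "ensembl:ENSG00000118260", "Interaction/ida4a8b443"),
    ("ensembl:ENSG00000169057", "ensembl:ENSG00000176697", "Interaction/id4a259c62"),
    ("ensembl:ENSG00000169057", "ensembl:ENSG00000155511", "Interaction/id3bcd32"),
    ("ensembl:ENSG00000169057", "ensembl:ENSG00000155511", "Interaction/id3bcd32"),
    ("ensembl:ENSG00000169813", "ensembl:ENSG00000169057", "Interaction/id1c3def3d"),
    ("ensembl:ENSG00000196132", "ensembl:ENSG00000169057", "Interaction/id8e7af5c"),
    ("ensembl:ENSG00000169045", "ensembl:ENSG00000169057", "Interaction/ida6a9fa9d"),
]

MECP2 = "ensembl:ENSG00000169057"

# Downstream: MECP2 is the source. Upstream: MECP2 is the target.
# Sets remove the rows that repeat an interaction under a second label.
downstream = sorted({target for source, target, _ in rows if source == MECP2})
upstream = sorted({source for source, target, _ in rows if target == MECP2})

print("downstream of MECP2:", downstream)
print("upstream of MECP2:", upstream)
```

For this table the sketch yields four distinct downstream targets (glutamate, CREB1, BDNF, GRIA1) and three upstream regulators (HNRNPF, MYT1, HNRNPH1).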
Fig 4. Example of direct interactions of gene products that both influence MECP2 and are influenced by MECP2 from Rett syndrome causing genes (wikipathways:WP4312).
In this example, MECP2 is influenced by HDAC1 and CDKL5. MECP2 in turn influences SHANK3 and inhibits the activity of FOXG1.
Fig 5. Representation of the conversion of different sphingolipids to their products and the enzyme catalyzing each reaction, from the Ganglio Sphingolipid Metabolism pathway (wikipathways:WP1423).
In this case, GD3 is converted to GD2 by the enzyme B4GALNT1. GD2 is in turn converted to GD1b, a reaction catalyzed by B3GALT4.
Sphingolipid conversion interactions.
For each conversion, the table gives the enzyme, the source metabolite and its label, the target metabolite and its label, and the interaction ID.
| Enzyme | Metabolite Source | Source Label | Metabolite Target | Metabolite Target Label | Interaction |
|---|---|---|---|---|---|
| ensembl:ENSG00000115525 | hmdb:HMDB0006750 | Lactosylceramide | hmdb:HMDB0004844 | GM3 | Interaction/idb121743e |
| ensembl:ENSG00000115525 | hmdb:HMDB0006750 | LacCer | hmdb:HMDB0004844 | GM3 | Interaction/idb121743e |
| ensembl:ENSG00000169359 | hmdb:HMDB0004913 | GD3 | pubchem.compound:73427362 | O-Acetylated GD3 | Interaction/id5f3f21f |
| ensembl:ENSG00000235863 | hmdb:HMDB0004925 | GD2 | hmdb:HMDB0004926 | GD1b | Interaction/idde73da53 |
| ensembl:ENSG00000101638 | hmdb:HMDB0004927 | GT1b | hmdb:HMDB0004928 | GQ1bA | Interaction/idc09b2721 |
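The first two rows of the table share one interaction ID and differ only in the source label (Lactosylceramide and LacCer), presumably because the same metabolite carries more than one label. A minimal sketch of how such rows could be merged into one record per interaction follows; the data is copied from the table above, while the record layout and variable names are assumptions made for illustration.

```python
# Rows copied from the sphingolipid conversion table:
# (enzyme, source, source label, target, target label, interaction).
rows = [
    ("ensembl:ENSG00000115525", "hmdb:HMDB0006750", "Lactosylceramide",
     "hmdb:HMDB0004844", "GM3", "Interaction/idb121743e"),
    ("ensembl:ENSG00000115525", "hmdb:HMDB0006750", "LacCer",
     "hmdb:HMDB0004844", "GM3", "Interaction/idb121743e"),
    ("ensembl:ENSG00000169359", "hmdb:HMDB0004913", "GD3",
     "pubchem.compound:73427362", "O-Acetylated GD3", "Interaction/id5f3f21f"),
    ("ensembl:ENSG00000235863", "hmdb:HMDB0004925", "GD2",
     "hmdb:HMDB0004926", "GD1b", "Interaction/idde73da53"),
    ("ensembl:ENSG00000101638", "hmdb:HMDB0004927", "GT1b",
     "hmdb:HMDB0004928", "GQ1bA", "Interaction/idc09b2721"),
]

# One record per interaction ID; alternative labels are collected in sets.
merged = {}
for enzyme, source, source_label, target, target_label, interaction in rows:
    record = merged.setdefault(interaction, {
        "enzyme": enzyme,
        "source": source,
        "target": target,
        "source_labels": set(),
        "target_labels": set(),
    })
    record["source_labels"].add(source_label)
    record["target_labels"].add(target_label)

for interaction, record in merged.items():
    print(interaction, record["enzyme"], sorted(record["source_labels"]),
          "->", sorted(record["target_labels"]))
```

Keying on the interaction ID keeps the identifiers intact and treats the labels as display names only, so the five rows reduce to four distinct conversions.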