Vasundra Touré, Åsmund Flobak, Anna Niarakis, Steven Vercruysse, Martin Kuiper.
Abstract
Causal molecular interactions are key building blocks in computational modeling, where they facilitate the assembly of regulatory networks. Logical regulatory networks can be used to predict biological and cellular behaviors through system perturbations and in silico simulations. Today, broad sets of causal interactions are available in a variety of biological knowledge resources. However, different visions, based on distinct biological interests, have led to the development of multiple ways to describe and annotate causal molecular interactions. It can therefore be challenging to efficiently explore the various resources of causal interactions and to maintain an overview of the recorded contextual information that ensures valid use of the data. This review lists the different types of public resources with causal interactions, the different views on biological processes that they represent, the various data formats they use for data representation and storage, and the data exchange and conversion procedures that are available to extract and download these interactions. This may further raise awareness among the targeted audience, i.e. logical modelers and other scientists interested in molecular causal interactions, as well as database managers and curators, about the abundance and variety of causal molecular interaction data, and the variety of tools and approaches to convert them into one interoperable resource.
Keywords: biological pathway; causal interactions; computational biology; data representation; databases; interoperability; logical modeling
Year: 2021 PMID: 33378765 PMCID: PMC8294520 DOI: 10.1093/bib/bbaa390
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1
Example representation of explicit and implicit causal interactions in AF and PD. In AF (left panel), causal interactions are evident from the network’s structure: the source entity has an effect on the target entity. In the example, the effect is ‘negative’ [represented with a directed, inhibitory edge (ending in a pipe symbol)]. In PD (right panel), the causality is implicit: the diagram shows a metabolic reaction (state change of an entity) in which a reactant entity is consumed to produce a product entity. This reversible reaction is catalyzed by catalyst 1 and catalyst 2. The product acts as a catalyst in a second reaction. Therefore, catalyst 1 has a positive effect on the product, since it enables the product to perform its biological function as a catalyst. Conversely, catalyst 2 has a negative effect on the product, since it prevents the product from performing that function. The causality can be inferred as follows: both catalyst 1 and catalyst 2 affect the activity of the product entity. The product is in an active state (i.e. the state in which it catalyzes another reaction), and therefore catalyst 1 activates this particular activity of the product, while catalyst 2 inhibits it. In the AF case, the logic equation states that the target is present or active in the absence of the inhibitor (specified with the operator ‘NOT’). In the PD case, the logic equation states that the product is present or active in the presence of catalyst 1 and in the absence of catalyst 2 (specified with the operators ‘AND’ and ‘NOT’).
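The two logic equations described in this caption can be written out as a minimal Python sketch. The function and variable names (`af_target`, `pd_product`, `truth_table`) are illustrative assumptions and are not taken from the figure or from any tool mentioned in the review.

```python
from itertools import product


def af_target(inhibitor: bool) -> bool:
    """AF case: the target is active in the absence of the inhibitor (NOT)."""
    return not inhibitor


def pd_product(catalyst1: bool, catalyst2: bool) -> bool:
    """PD case: the product is active when catalyst 1 is present
    AND catalyst 2 is absent (AND, NOT)."""
    return catalyst1 and not catalyst2


def truth_table(rule, n_inputs: int) -> dict:
    """Evaluate a Boolean rule for every combination of its inputs."""
    return {
        inputs: rule(*inputs)
        for inputs in product([False, True], repeat=n_inputs)
    }


if __name__ == "__main__":
    for inputs, value in truth_table(pd_product, 2).items():
        print(inputs, "->", value)
```

Enumerating the truth table makes the PD rule easy to inspect: the product is active for exactly one of the four input combinations, namely catalyst 1 present and catalyst 2 absent.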
Summary of the listed explicit and implicit data resources with causal information. The number of causal interactions, models or maps is provided as recorded at the time of writing this manuscript. The ‘Data provenance’ column indicates whether a reference to a manuscript or information source leading to the description of a causal interaction is provided. The ‘Exports’ column indicates the formats in which causal interactions are accessible or in which the pathways can be extracted. The ‘Latest content update’ column indicates, to the best of our knowledge, the last year in which the content of the resource was modified. The ‘+’ symbol indicates that the database supports a specific characteristic.
| | Resource | Data | Data provenance | API | Exports | Latest content update | Species |
|---|---|---|---|---|---|---|---|
| Causal interactions or AF pathways | CBN | 9712 causal interactions | + | + | BEL, SIF | 2020 | |
| | GO-CAM | 2956 models | + | + | OWL, GAPB, SIF, JNL | 2020 | >10 species |
| | KEGG (AF) | 538 pathways, together with PD | + | | KGML | 2020 | >20 species |
| | SignaLink | 89,000 causal interactions | + | | SIF | 2011 | |
| | SIGNOR | 24,657 causal interactions | + | + | PSI-MITAB2.8 | 2020 | |
| | SPIKE | 9503 causal interactions | + | | SBML, SIF, BioPAX, XML | 2012 | |
| | WikiPathways | 2891 pathways | + | | BioPAX, GPML | 2020 | >20 species |
| Logical models | BioModels | 18 models | + | + | SBML | 2020 | |
| | Cell Collective | 78 models | + | | SBML qual | 2020 | |
| | GINsim | 59 models | + | | SBML qual, ZGINML | 2020 | >10 species |
| | PyBoolNet | 24 models | | | BNET | 2020 | |
| PD pathways | ACSN | 13 maps | + | | SBGN, XML, CellDesigner | 2018 | |
| | Disease Maps | 6 maps | + | | SBML, SBGN, BioPAX | 2020 | |
| | KEGG (PD) | 538 pathways, together with AF | + | | KGML | 2020 | >20 species |
| | PANTHER | 177 pathways | + | | SBML, BioPAX | 2020 | 142 species |
| | Reactome | 2441 human pathways | + | + | PSI-MITAB2.7, SBML, SBGN, BioPAX | 2020 | |
Figure 2
Different representations of causal interactions. (1A) Protein-based causal interaction where both entities are gene products: Protein A (AKT1) represses the activity of Protein B (CDKN1B) by decreasing the quantity of the protein through repression (meaning that this is most likely an indirect regulation with an intermediate entity, i.e. the gene that enables the production of Protein B); (1B) Gene-based causal interaction where Protein A negatively regulates the transcription of Gene B; (2) Activity-based causality where an Activity A positively regulates an Activity B. Usually, the entities that perform these activities are known and annotated; (3) Process-based causality where a biological Process A (DNA damage) positively regulates a biological Process B (apoptosis).
Comparison of annotations in different formats for causal molecular interactions. Data and metadata types, described in the MI2CAST guidelines, that can currently be annotated and stored in each format are listed with the ontologies and controlled vocabularies used. If no ontologies or controlled vocabularies are specified, the ‘+’ sign is given, meaning that the format explicitly stores this type of data or metadata. Table inspired by [3].
| | SIF | SBML qual | PSI-MITAB2.8 | GO-CAM | BEL |
|---|---|---|---|---|---|
| Source entity | + | + | UniProtKB, RefSeq, ChEBI, EMBL/DDBJ/GenBank, Entrez Gene, Ensembl, EnsemblGenome | UniProtKB, Model Organism Database (MOD) gene identifier, Protein Ontology | HGNC, ChEBI, RefSeq, Entrez Gene, etc. |
| Interaction effect | + | + | PSI-MI | RO | BEL term: increases, decreases, etc. |
| Target entity | + | + | UniProtKB, RefSeq, ChEBI, EMBL/DDBJ/GenBank, Entrez Gene, Ensembl, EnsemblGenome | UniProtKB, Model Organism Database (MOD) gene identifier, Protein Ontology | HGNC, ChEBI, RefSeq, etc. |
| Reference | | | PMID | PMID or MOD reference | PMID |
| Evidence | | | PSI-MI | ECO | ECO |
| Experimental setup | ECO | ||||
| Biological mechanism | | | PSI-MI | combination of biological activity and relationship | GO:BP |
| Biological activity | | | GO:MF | GO:MF | BEL term: act((prefix:id), ma(prefix:id)), etc. |
| Biological type | | | PSI-MI | | BEL term: g(), r(), etc. See documentation. |
| Biological modification | | | PSI-MOD | Protein Ontology | BEL term: p(prefix:identifier, pmod(prefix:identifier)), etc. |
| Taxon entity | | | NCBI Taxonomy | NCBI Taxonomy | NCBI Taxonomy |
| Taxon interaction | NCBI Taxonomy | NCBI Taxonomy | |||
| Tissue type | | | | Uberon (animals), Plant Ontology, Fungal Anatomy Ontology, MOD-specific ontologies | BRENDA Tissue Ontology |
| Cell type/Cell line | | | | Cell Ontology, MOD-specific ontologies | BRENDA Tissue Ontology, Cell Ontology, Cell Type Ontology |
| Cellular compartment | + | GO:CC | GO:CC |
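
As the table indicates, SIF carries only the source entity, the interaction effect and the target entity. A minimal sketch of reading such triples in Python is given below. The relation labels (`inhibits`, `activates`) and the function name `parse_sif` are assumptions for illustration, since SIF itself does not prescribe a vocabulary for the interaction type.

```python
def parse_sif(text: str) -> list:
    """Parse SIF-style lines of the form 'source relation target [target ...]'.

    Simplification (assumption): fields are split on any whitespace, so node
    names containing spaces are not supported. Lines with fewer than three
    fields (e.g. isolated nodes) are skipped.
    """
    edges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        source, relation, *targets = fields
        for target in targets:
            edges.append((source, relation, target))
    return edges


if __name__ == "__main__":
    # Hypothetical content; the relation labels are not standardized by SIF.
    example = "AKT1\tinhibits\tCDKN1B\nA\tactivates\tB\tC\n"
    for edge in parse_sif(example):
        print(edge)
```

Everything beyond the triple (reference, evidence, taxon, compartment and so on) has no place in this format, which is why the richer formats in the table are needed when contextual annotations must be preserved.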
Figure 3
Graphical summary of the manuscript. The figure describes connections between the different types of data resources (causal interaction, AF pathway, PD pathway, logical model and prior knowledge), together with the names of the resources that provide each type of information and the data formats that can be generated. The directed arrows show the possibilities for transforming one type of data into another. Each arrow is labeled with the tools (or resources) that enable (or provide) the conversion from one resource to another. For instance, a logical model can be obtained from a PD pathway using the tool CaSQ. The dotted lines highlight the integrated resources into which several databases have been incorporated.
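
To make the idea of turning causal interactions into a logical model concrete, here is a minimal Python sketch. It assumes one simple default convention: a target is active if at least one activator is active (or if it has no activators) and no inhibitor is active. This convention, the signed-edge encoding (+1 or -1) and the name `default_rules` are assumptions for illustration. The sketch does not reproduce the method of CaSQ or of any other tool shown in the figure.

```python
def default_rules(edges):
    """Build one Boolean rule per target from signed causal edges.

    edges: iterable of (source, sign, target), with sign +1 (activation)
    or -1 (inhibition). Returns {target: rule}, where rule(state) takes a
    dict mapping entity names to booleans.
    """
    activators, inhibitors = {}, {}
    for source, sign, target in edges:
        group = activators if sign > 0 else inhibitors
        group.setdefault(target, []).append(source)

    def make_rule(target):
        acts = activators.get(target, [])
        inhs = inhibitors.get(target, [])

        def rule(state):
            # Assumed convention: any activator suffices; any inhibitor blocks.
            activated = any(state[a] for a in acts) if acts else True
            return activated and not any(state[i] for i in inhs)

        return rule

    targets = set(activators) | set(inhibitors)
    return {target: make_rule(target) for target in targets}


if __name__ == "__main__":
    # Hypothetical edges mirroring the PD example of Figure 1.
    rules = default_rules([
        ("catalyst1", +1, "product"),
        ("catalyst2", -1, "product"),
    ])
    print(rules["product"]({"catalyst1": True, "catalyst2": False}))
```

With the two edges shown, the generated rule is equivalent to `catalyst1 AND NOT catalyst2`, matching the PD logic equation of Figure 1. Other conventions for combining regulators are possible and would yield different models from the same interactions.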