Konstantinos Mechteridis, Michael Lauber, Jan Baumbach, Markus List.
Abstract
De novo pathway enrichment is a systems biology approach in which OMICS data are projected onto a molecular interaction network to identify subnetworks representing condition-specific functional modules and molecular pathways. Compared to classical pathway enrichment analysis methods, de novo pathway enrichment is not limited to predefined lists of pathways from (curated) databases and is thus particularly suited for discovering novel disease mechanisms. While several tools have been proposed for pathway enrichment, the integration of de novo pathway enrichment in end-to-end OMICS analysis workflows in the R programming language is currently limited to a single tool. To close this gap, we have implemented the R package KeyPathwayMineR (KPM-R). The package extends the features and usability of existing versions of KeyPathwayMiner by leveraging the power, flexibility, and versatility of R and by providing various novel functionalities for data preparation, visualization, and comparison. In addition, thanks to its interoperability with a plethora of existing R packages in, e.g., Bioconductor, CRAN, and GitHub, KPM-R allows users to carry out the initial preparation of the datasets and to meaningfully interpret the extracted subnetworks. To demonstrate the package's potential, KPM-R was applied to bulk RNA-Seq data of nasopharyngeal swabs from SARS-CoV-2 infected individuals and to single-cell RNA-Seq data of aging mouse tissue from the Tabula Muris Senis atlas.
Keywords: R package; data integration; network analysis; pathway enrichment; systems biology
Year: 2022 PMID: 35173764 PMCID: PMC8842393 DOI: 10.3389/fgene.2021.812853
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
KPM run options and their description.
| Parameter | Description | Default value |
|---|---|---|
| execution | Defines the execution type of KPM-R, which can be run either “Local” | “Local” |
| strategy | Can be either “INES” or “GLONE”. If the GloNE strategy is selected, the user does not need to set the | “GLONE” |
| algorithm | The algorithm that should be used to extract the pathways. It can be set to “Greedy”, “ACO” or “Optimal”. | “Greedy” |
| use_range_k | Boolean parameter that describes whether parameter | FALSE |
| k_min, k_max, k_step | Numeric parameters that control the number of node exceptions allowed in a solution. If the use_range_k parameter is set to FALSE, only k_min must be defined. Otherwise, a range must be defined, with k_min and k_max giving the lower and upper boundary, respectively, and k_step giving the increment from one iteration to the next. For example, setting k_min = 4, k_max = 8 and k_step = 2 would mean that KPM will be executed with | k_min = 1, k_max = 3, k_step = 1 |
| use_range_l | Boolean parameter that describes whether parameter | FALSE |
| l_min, l_max, l_step | Numeric parameters that control the number of case exceptions within a node. Similar to the | l_min = 0, l_max = 0, l_step = 1 |
| link_type | When using multiple datasets, the user must specify a logical formula to combine them. Accepted values are “OR”, “AND”, or a custom formula. | “OR” |
| graph_id | ID of the network on the web server that should be used in a remote run. | |
| negative_nodes | Character vector containing biological entities that should be considered inactive. | |
| positive_nodes | Character vector containing biological entities that should be considered active. | |
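
The range parameters in the table can be illustrated with a small, language-neutral sketch. The Python snippet below is not part of KPM-R and does not call its API. The helper names `expand_range` and `configurations` are hypothetical, and the snippet assumes that the upper boundary is inclusive and that every K value is paired with every L value, neither of which is stated in the table text that survives here.

```python
from itertools import product


def expand_range(use_range, v_min, v_max, v_step):
    """Return the values a ranged parameter would take.

    Assumption (not confirmed by the source): v_max is inclusive.
    When use_range is False, only v_min is used, as the table describes.
    """
    if not use_range:
        return [v_min]
    if v_step <= 0:
        raise ValueError("step must be a positive integer")
    return list(range(v_min, v_max + 1, v_step))


def configurations(k_values, l_values):
    """Pair every K value with every L value (assumed full grid)."""
    return list(product(k_values, l_values))


# Values taken from the example in the table: k_min = 4, k_max = 8, k_step = 2.
k_values = expand_range(True, 4, 8, 2)
# Default L settings from the table, with use_range_l = FALSE.
l_values = expand_range(False, 0, 0, 1)
configs = configurations(k_values, l_values)

for k, l in configs:
    print(f"K = {k}, L = {l}")
```

Under these assumptions, each printed (K, L) pair corresponds to one parameter configuration for which subnetworks would be extracted, which is the unit the Shiny app lets the user select.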
FIGURE 1 Typical workflow of KPM-R.
FIGURE 2 Shiny app for visualizing, browsing, and saving pathways from a result object. The displayed subnetworks were extracted from SARS-CoV-2 gene expression data. (A,F) The user can switch between pathway and union network view. (B) Panel to select which parameter configuration and subnetwork to visualize. (C) Export buttons that allow the extraction of the current pathway as edges or nodes. (D) The user can select a gene from the network for closer inspection. (E) For every network, statistics are displayed, which provide information on the number of nodes, edges, and the average number of active cases per node. (G) In the union network view, the selection panel allows selecting pathways to examine from which subnetwork the nodes originate. When using INES to run KPM, the exception nodes are marked as red squares, as shown in the pathway view.
FIGURE 3 Pathway comparison plots can be used to find the optimal pathway in the extracted solution. The pathways shown were extracted from the analyzed GEO SARS-CoV-2 dataset and used to limit further exploration to configurations with at least 100 differentially expressed genes per case on average.
FIGURE 4 SARS-CoV-2 network from the configuration L = 220 and K = 20. Many genes of the network are known to be important players in the human immune response. Exception nodes are visualized as orange squares and significant nodes as blue circles.
FIGURE 5 Enrichment analysis of the extracted SARS-CoV-2 network. The network was tested for enrichment within the KEGG, Reactome, and WikiPathways databases.
FIGURE 6 Extracted networks from single-cell data of murine limb muscle. (A) Network with the configuration L = 0 and K = 0 based on mesenchymal stem cells from the limb muscle. The genes consist mainly of extracellular matrix proteins and are differentially expressed in all old mice. (B) Network of mesenchymal satellite stem cells with K = 0 and L = 4. With the exception of Tpt1 and Gnb2l1, all nodes are ribosomal proteins.
FIGURE 7 Network from single-cell data of satellite stem cells from the murine limb muscle with the configuration K = 6 and L = 4. Exception nodes are visualized as orange squares and significant nodes as blue circles.