Christina Y Yu, Antonina Mitrofanova.
Abstract
Biomarker discovery is at the heart of personalized treatment planning and cancer precision therapeutics, encompassing disease classification and prognosis, prediction of treatment response, and therapeutic targeting. However, many biomarkers represent passenger rather than driver alterations, limiting their utilization as functional units for therapeutic targeting. We suggest that identification of driver biomarkers through mechanism-centric approaches, which take into account upstream and downstream regulatory mechanisms, is fundamental to the discovery of functionally meaningful markers. Here, we examine computational approaches that identify mechanism-centric biomarkers elucidated from gene co-expression networks, regulatory networks (e.g., transcriptional regulation), protein-protein interaction (PPI) networks, and molecular pathways. We discuss their objectives, advantages over gene-centric approaches, and known limitations. Future directions highlight the importance of input and model interpretability, method and data integration, and the role of recently introduced technological advantages, such as single-cell sequencing, which are central for effective biomarker discovery and time-cautious precision therapeutics.
Keywords: biomarkers; mechanism-centric approaches; precision medicine; predictive models; treatment response
Year: 2021 PMID: 34408770 PMCID: PMC8365516 DOI: 10.3389/fgene.2021.687813
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. Mechanism-centric approaches in biomarker discovery and precision therapeutics. A variety of data, including single- and multi-omic sources, knowledge bases, and phenotype/clinical information, can be used as inputs to mechanism-centric approaches to identify functional biomarkers of disease and therapeutic response. We describe mechanism-centric methods that are based on co-expression networks, regulatory networks, PPI networks, and molecular pathways.
FIGURE 2. Co-expression network methods: WGCNA and lmQCM. (A) Pairwise gene correlations are calculated from gene expression (microarray or RNA-seq) data. (B) The co-expression matrix is transformed into a topological overlap matrix and subjected to hierarchical clustering for module identification. A cluster dendrogram is shown, with different gene modules identified by the color bar on the bottom. (C) The co-expression matrix is used to construct a network, with genes as nodes and the correlation coefficient between any two genes as the edge weight. Module identification is achieved through a greedy search for highly correlated subnetworks.
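As a rough illustration of panels (A) and (C) in the caption above, the sketch below computes pairwise Pearson correlations and then grows a single module greedily. It is not WGCNA or lmQCM; the function names, the toy expression values, and the 0.8 threshold are assumptions made purely for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_matrix(expr):
    """Panel A: pairwise gene-gene correlations as a nested dict."""
    genes = sorted(expr)
    corr = {g: {g: 1.0} for g in genes}
    for g1, g2 in combinations(genes, 2):
        corr[g1][g2] = corr[g2][g1] = pearson(expr[g1], expr[g2])
    return corr


def greedy_module(corr, min_mean_abs_corr=0.8):
    """Panel C (simplified): seed with the strongest edge, then keep adding
    the gene with the highest mean |correlation| to the current module
    until no candidate reaches the (assumed) threshold."""
    genes = sorted(corr)
    seed = max(combinations(genes, 2), key=lambda p: abs(corr[p[0]][p[1]]))
    module = list(seed)
    candidates = set(genes) - set(module)

    def mean_link(gene):
        return sum(abs(corr[gene][m]) for m in module) / len(module)

    while candidates:
        best = max(sorted(candidates), key=mean_link)
        if mean_link(best) < min_mean_abs_corr:
            break
        module.append(best)
        candidates.remove(best)
    return sorted(module)


# Hypothetical toy data: genes A, B, C co-vary across 5 samples; D does not.
expr = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10.5],
    "C": [1.1, 2.0, 3.2, 3.9, 5.1],
    "D": [5, 1, 4, 2, 3],
}
corr = coexpression_matrix(expr)
module = greedy_module(corr)
print(module)
```

The threshold plays the role of a stopping rule for the greedy search; real methods use more elaborate density criteria and merge overlapping modules.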
FIGURE 3. Interrogation of transcriptional regulatory networks: Master Regulator Inference Algorithm (MARINa) and Virtual Inference of Protein-activity by Enriched Regulon analysis (VIPER). (A) A differential signature is defined between two phenotypes of interest (left) as input to MARINa; or on a single-sample level (right) as input to VIPER. (B) The transcriptional regulon is identified from an Algorithm for the Reconstruction of Gene Regulatory Networks (ARACNe) tissue-specific transcriptional regulatory network, which includes a transcriptional regulator (TR) and its activated and repressed targets. (C) The activated and repressed targets of the regulon are mapped onto the corresponding signature and used to determine the TR’s transcriptional activity.
FIGURE 4. RegNetDriver. (A) DNase-seq of DNase I hypersensitive sites from a specific tissue type, information to identify TFs from binding motifs, and information on known regulatory gene pairs are used as input to reconstruct (B) a tissue-specific regulatory network. TF hubs are determined from nodes with the top 25% out-degree centrality. (C) Significantly perturbed TF hubs are identified using SNV, SV, and DNA methylation data.
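The hub-selection step in panel (B) can be sketched as ranking regulators by out-degree and keeping the top quarter. This is a minimal sketch, not the RegNetDriver implementation: the edge list, the TF and gene names, and the tie-breaking rule (alphabetical) are hypothetical, and only nodes with at least one outgoing edge are ranked.

```python
from collections import Counter
from math import ceil


def tf_hubs(edges, top_fraction=0.25):
    """Return the regulators in the top `top_fraction` by out-degree.

    `edges` is an iterable of (regulator, target) pairs; duplicate edges
    are counted once. Ties are broken alphabetically (an assumption).
    """
    out_degree = Counter(tf for tf, _ in set(edges))
    ranked = sorted(out_degree, key=lambda tf: (-out_degree[tf], tf))
    n_hubs = ceil(top_fraction * len(ranked))
    return ranked[:n_hubs]


# Hypothetical regulatory network: TF1 regulates five genes, the rest fewer.
edges = [
    ("TF1", "g1"), ("TF1", "g2"), ("TF1", "g3"), ("TF1", "g4"), ("TF1", "g5"),
    ("TF2", "g1"), ("TF2", "g6"),
    ("TF3", "g2"),
    ("TF4", "g7"),
]
hubs = tf_hubs(edges)
print(hubs)
```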
FIGURE 5. Illustration of the PPI network-based approach by Chuang et al. Gene expression microarray data with phenotype information is overlaid onto a PPI network that is constructed from existing knowledge. Subnetwork activities are calculated per sample based on z-transformed gene expression values, with subnetworks defined by the PPI network. Discriminative potential for each subnetwork is determined by mutual information (or alternatively, t-score or Wilcoxon score) that measures the association between sample activities and phenotypes. Subnetworks with discriminative potential between phenotypes are identified by a greedy search for locally maximal discriminative potential scores. Discriminative subnetworks are further assessed in significance testing to identify statistically significant discriminative subnetworks.
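The activity and scoring steps in the caption above can be sketched as follows. This is an illustrative sketch only: it assumes activity is the sum of per-gene z-scores divided by the square root of the subnetwork size, and it uses a Welch t-score (one of the alternatives the caption names) instead of mutual information. Gene names, expression values, and labels are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev, variance


def z_transform(values):
    """Z-transform one gene's expression across samples."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd else 0.0 for v in values]


def subnetwork_activity(expr, subnetwork):
    """Per-sample activity: summed z-scores of member genes / sqrt(k)."""
    z = [z_transform(expr[g]) for g in subnetwork]
    k = len(subnetwork)
    return [sum(col) / sqrt(k) for col in zip(*z)]


def t_score(activity, labels):
    """Welch t-statistic comparing activities of phenotype 1 vs phenotype 0."""
    a = [x for x, lab in zip(activity, labels) if lab == 1]
    b = [x for x, lab in zip(activity, labels) if lab == 0]
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical data: 6 samples, the first three carry phenotype 1.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "g1": [5, 6, 7, 1, 2, 3],
    "g2": [4, 6, 5, 1, 2, 1],
    "g3": [3, 1, 2, 3, 2, 1],  # unrelated to phenotype
}
score_informative = t_score(subnetwork_activity(expr, ["g1", "g2"]), labels)
score_noise = t_score(subnetwork_activity(expr, ["g3"]), labels)
print(score_informative, score_noise)
```

A greedy search would repeatedly add the neighboring gene in the PPI network that most improves this score, stopping when no addition helps.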
FIGURE 6. Pathway-based modeling: pathCHEMO and pathER. (A) Therapeutic response distribution is defined based on time to therapeutic failure. Tails of this distribution are utilized in pathCHEMO and a full spectrum of therapeutic responses is utilized in pathER. (B) Molecular pathways are utilized as a knowledge base in pathway-based modeling. Genes in such pathways can be affected on multiple levels, such as differential expression (i.e., orange square) and DNA methylation (i.e., green satellite). (C) Molecular pathways are assessed for their integrated enrichment and association with therapeutic response.
Summary of mechanism-centric methods discussed in this review.
| Method | Data modality | Utilize knowledge base? |
| --- | --- | --- |
| Centered Concordance Index (CCI) | Single-omic | No |
| Eigengenes | Single-omic | No |
| Hubs | Single-omic | No |
| MARINa | Single-omic | No |
| VIPER | Single-omic | No |
| RegNetDriver | Multi-omic | Yes |
| | Multi-omic | Yes |
| pathCHEMO | Multi-omic | Yes |
| pathER | Single-omic | Yes |