Literature DB >> 23482392

Subpathway-GM: identification of metabolic subpathways via joint power of interesting genes and metabolites and their topologies within pathways.

Chunquan Li¹, Junwei Han, Qianlan Yao, Chendan Zou, Yanjun Xu, Chunlong Zhang, Desi Shang, Lingyun Zhou, Chaoxia Zou, Zeguo Sun, Jing Li, Yunpeng Zhang, Haixiu Yang, Xu Gao, Xia Li.

Abstract

Various 'omics' technologies, including microarrays and gas chromatography mass spectrometry, can be used to identify hundreds of interesting genes, proteins and metabolites, such as differential genes, proteins and metabolites associated with diseases. Identifying metabolic pathways has become an invaluable aid to understanding the genes and metabolites associated with studying conditions. However, the classical methods used to identify pathways fail to accurately consider joint power of interesting gene/metabolite and the key regions impacted by them within metabolic pathways. In this study, we propose a powerful analytical method referred to as Subpathway-GM for the identification of metabolic subpathways. This provides a more accurate level of pathway analysis by integrating information from genes and metabolites, and their positions and cascade regions within the given pathway. We analyzed two colorectal cancer and one metastatic prostate cancer data sets and demonstrated that Subpathway-GM was able to identify disease-relevant subpathways whose corresponding entire pathways might be ignored using classical entire pathway identification methods. Further analysis indicated that the power of a joint genes/metabolites and subpathway strategy based on their topologies may play a key role in reliably recalling disease-relevant subpathways and finding novel subpathways.

Entities: CellLine Chemical Disease Gene Species

Mesh：

Substances：
Histamine

Year: 2013 PMID： 23482392 PMCID： PMC3643575 DOI： 10.1093/nar/gkt161

Source DB: PubMed Journal: Nucleic Acids Res ISSN： 0305-1048 Impact factor: 16.971

INTRODUCTION

Various ‘omics’ technologies, such as microarrays, RNA-seq and gas chromatography mass spectrometry (GC–MS), can be used to identify potentially interesting (e.g. differential) genes and metabolites, including those associated with specific diseases. One of the challenges is to use such information to provide a better understanding of the underlying biological phenomena. Metabolic pathway analysis has become an invaluable aid to understanding the molecules generated by these ‘omics’ technologies. Information on the metabolic pathways investigated is available via pathway databases, such as KEGG, which manually curates electronic high-quality pathway structure information on the enzymes and metabolites involved in the metabolic processes (1). One of the most widely applied pathway analysis methods is the overrepresentation approach (ORA), which compares the number of interesting genes (metabolites) that hit a given pathway with the number of genes (metabolites) expected to hit the given pathway by chance. If the observed number is significantly different from that expected by chance, the pathway is reported as significant. A statistical model, such as the hypergeometric test, can be used to calculate the enrichment significance (P-values). Many methods, including ORA and gene set enrichment analysis (GSEA), have been developed to identify pathways [mainly reviewed in (2–7)]. However, these existing methods share obvious limitations in terms of their abilities to identify pathways, especially metabolic pathways (2,8,9). Because metabolic pathways involve both genes and metabolites, the dysfunction of which play important roles in diseases (10–12), integrative pathway analysis of both genes and metabolites will thus help to interpret the underlying biological phenomena. However, most existing methods ignore this integrative approach and were designed and developed mainly for the analysis of interesting genes (2–4), leading to a relative lack of information on metabolites in the pathways. Pathway identification methods based on metabolite lists, such as MBRole (5), MetPA (6) and MSEA (7), have recently been developed. However, these methods use only metabolite information, and thus also ignore the power of joint gene/metabolite analysis. Kamburov et al. recently developed a new metabolic pathway analysis method, IMPaLA, for the identification of biochemical pathways related to tumor-cell chemosensitivity. This method integrates enrichment significance (P-values) of pathways calculated through ORA based on interesting genes and interesting metabolites to improve pathway identification (10). Kamburov et al. (10) assumed independence of pathways associations from data sets of genes and metabolites, and the joint statistical significance was thus calculated by multiplying the P-values for genes and metabolites, respectively (P-valuegene × P-valuemetabolite). Although IMPaLA can effectively improve pathway identification, simply changing the statistical model might not fully reflect the superiority of a joint approach to analyzing genes and metabolites of interest within metabolic pathways. From a biological perspective, dysfunctional genes are closely related to dysfunctional metabolites in pathways, and biological features, such as their topology, might thus be more useful in helping to identify pathways than simply changing the statistical analysis (see examples in the ‘Results’ section). Improvements in pathway identification methods may need to be made at a fundamental biological level, rather than a statistical level (2). Moreover, fewer differential metabolites are detected by metabolomic studies compared with differentially expressed genes. However, metabolites may be located at important positions in pathways, such as arachidonic acid in the arachidonic acid metabolism pathway. The importance of metabolites in pathway structure is ignored by most of the existing pathway identification methods. More importantly, metabolic pathways stored in pathway databases are often too large to allow the accurate interpretation of the relevant biological phenomena. Key subpathway regions representative of the entire corresponding pathway may be more useful in terms of interpreting the relevant biological phenomena. Moreover, several studies have shown that abnormalities in subpathway regions of metabolic pathways contribute to the etiology of diseases (9,13,14), although most of the existing methods only identify entire pathways, rather than the key subpathway regions. We previously attempted to identify key metabolic subpathways using only gene information (13). However, the joint use of genes and metabolites has still been largely ignored, and the positional importance of genes and metabolites should be considered to locate key regions of pathways related to the underlying biological phenomena, such as specific diseases. In this study, we developed a powerful analytical method called Subpathway-GM for the identification of biologically meaningful metabolic subpathways. Subpathway-GM integrates ‘interesting genes’ and ‘interesting metabolites’ related to the study condition (e.g. disease) into the corresponding enzyme and metabolite nodes (referred to as signature nodes) within the metabolic pathway. We then analyzed lenient distance similarities of signature nodes within the pathway structure to locate key metabolic cascade subpathway regions (see ‘Materials and Methods’ section). Finally, a hypergeometric test was used to evaluate the enrichment significance of these subpathway regions. We applied Subpathway-GM to two colorectal cancer and one metastatic prostate cancer data sets and demonstrated that this method was able to identify disease-related subpathway regions successfully and reliably.

MATERIALS AND METHODS

Data sets

We analyzed two colorectal cancer data sets and one metastatic prostate cancer data set. We obtained differential genes from gene expression data and differential metabolites from metabolomic experimental studies.

Colorectal cancer data set 1

The gene expression profile data included 32 pairs of colorectal adenoma and adjacent normal mucosa tissues, originally analyzed by Sabates-Bellver et al. (15). The data are publicly available at the GEO database (GEO accession number = GSE8671). We used both the significance analysis of microarray (SAM) method (16) and the fold-change (FC) method to identify differentially expressed genes. A gene was considered to be differentially expressed when it was significant in the SAM method at a significance level of 0.001 (False discovery rate FDR < 0.001) and the log2|fold-change| value of the gene was also >1 (i.e. FC > 2 or FC < 0.5). A total of 2053 differentially expressed genes were identified by the aforementioned strategies. Differential metabolites were directly obtained from the results of several experimental studies, including Chan et al. (17), Qiu et al. (18,19) and Denkert et al. (20). The metabolites were extracted from these articles and converted to KEGG compound IDs. Finally, 90 unique differential metabolites associated with colorectal cancer were obtained.

Colorectal cancer data set 2

The gene expression profile data, including 70 colorectal cancer samples and 12 normal samples, was initially analyzed by Hong et al. (21) (GEO accession number = GSE9348). We identified 1452 differentially expressed genes using the same strategy as for colorectal cancer data set 1 (FDR < 0.001 and FC > 2 or FC < 0.5). The differential metabolites used were the same as in colorectal cancer data set 1.

Metastatic prostate cancer data set

Gene expression profile data, including six benign prostate samples, seven clinically localized prostate cancer samples and six metastatic prostate cancer samples, were initially analyzed by Varambally et al. (22) (GEO accession number = GSE3325). The localized and metastatic prostate cancer samples were used to identify differentially expressed genes associated with metastatic prostate cancer, using the SAM method (FDR < 0.01) and the FC method (FC > 2 or FC < 0.5) simultaneously. A total of 1773 differential genes were identified. Multiple metabolomic profiles (liquid chromatography/GC–MS), including 16 benign adjacent prostate samples, 12 clinically localized prostate cancer samples and 14 metastatic prostate cancer samples, were obtained from studies of Sreekumar et al. (12). The localized and metastatic samples were used to identify differentially expressed genes. We used the method (Wilcoxon rank-sum test) described by Sreekumar et al. to identify differential metabolites, and then converted these metabolite names to KEGG compound IDs. Finally, 53 metabolites associated with metastatic prostate cancer were identified (P < 0.01; Wilcoxon rank-sum test).

Methods

Subpathway-GM has been implemented as a freely available web-based and R-based tool (http://bioinfo.hrbmu.edu.cn/SubpathwayGM). Figure 1 depicts the schematic overview of Subpathway-GM. The step-by-step method is provided in Supplementary Text. The users inputs interesting the genes and metabolites of interest, and metabolic subpathways can then be identified mainly through: (i) mapping genes and metabolites of interest to graphs of pathways after graph-based reconstruction of metabolic pathways; (ii) locating subpathways within pathways according to signature nodes; (iii) evaluating the statistical significance of subpathways. The method is described in more detail later in the text.

Figure 1.

Schematic overview of Subpathway-GM.

Map genes and metabolites of interest to graphs of pathways

We converted all 150 metabolic pathways in KEGG (17 December 2010) to the directed graphs based on biochemical reaction information in KGML files (for an XML representation of KEGG pathway information, http://www.kegg.jp/kegg/xml/). For each pathway, we extracted all metabolites and enzymes within the pathway as nodes in the corresponding graph. If a metabolite participated in a reaction as a substrate or product, a directed edge was used to connect the corresponding metabolite nodes to the enzyme (i.e. reaction) nodes. Substrates were directed towards the enzyme nodes, whereas enzyme nodes were directed towards their products. Reversible reactions had twice as many edges as irreversible reactions. This strategy for converting pathway graphs had the advantage of developing graph algorithms. More importantly, positional information for metabolites and genes encoding enzymes could be extracted efficiently and used via a graph model that keeps pathway structure. Interesting genes and metabolites can be then mapped to the corresponding nodes within the pathway graphs. Notably, interesting metabolites can be mapped directly to metabolite (substrate and product) nodes. Interesting genes can be assigned to Enzyme Commission (EC) (or KEGG Ontology (KO)) numbers and matched to enzyme nodes. KO numbers are recommended by KEGG for mapping genes to pathways; therefore, we used KO numbers as the default numbers in Subpathway-GM. Finally, the mapped nodes within each pathway graph were defined as signature nodes.

Locate subpathways within pathways according to signature nodes

Signature nodes within pathways represent information on genes and metabolites of interest. These nodes can effectively help to locate subpathways associated with the genes and metabolites of interest, through further considering their topologies within pathways. Furthermore, distances between some nodes in a subpathway are usually similar. We, therefore, used ‘lenient’ distance similarity of signature nodes to locate subpathways. Specifically, for each pathway that contained signature nodes, we computed the shortest path between any two signature nodes in the given pathway graph. If the shortest path between two signature nodes was shorter than n + 1, then the two signature nodes and other non-signature nodes at their shortest path were added to the same node set. The parameter n indicates the maximum permitted non-signature node number at the shortest path between signature nodes. We then extracted the corresponding subgraph in the pathway graph according to each node set, and finally defined these subgraphs with node number ≥s as the subpathway regions because subgraphs with small scales can not usually form biologically relevant subpathway. Figure 1 shows the subpathways within entire pathways identified by Subpathway-GM with n = 2 and s = 5. Our subpathway strategy was based on lenient distance similarity. When a node within a pathway appeared within the permitted shortest path between any two signature nodes, the node was merged into the corresponding subpathway. If a node within the entire pathway had high topological centrality (e.g. high degree and/or betweenness), then that node would be more likely to occur within the permitted shortest path, and thus appear in the final subpathway. In metabolic pathways, nodes in the key regions usually have high topological centrality (e.g. the region near arachidonic acid in arachidonic acid metabolism). These key nodes, therefore, tend to appear in subpathways even though they are not signature nodes. Subpathway-GM will thus tend to identify key regions in entire pathways that might be representative of the corresponding entire pathways. Flexibility can be introduced to this subpathway strategy by varying the parameter n. A smaller value of n means that only those nodes meeting stricter distance similarities will be added to the corresponding subpathway, and the identified subpathways thus become smaller compared with larger values of n. The number of non-signature nodes within subpathways will also tend to reduce because of the smaller number of permitted non-signature nodes, and the ratio of signature nodes in the located subpathway regions will thus increase as n decreases. This can help users to identify subpathways closely associated with genes and metabolites of interest. In contrast, increasing n will usually increase the ratio of non-signature nodes and the size of the subpathways. Key nodes within entire pathways will also tend to be added to the subpathway because of the high topological centrality of key nodes. Subpathway-GM will thus tend to identify the key regions in entire pathways that might be representative of the corresponding entire pathways.

Evaluate the statistical significance of subpathways

We used the hypergeometric test to calculate the statistical significance of each subpathway. To achieve this, the following values needed to be calculated: (i) the number of interesting genes and metabolites submitted for analysis; (ii) the number of background genes and metabolites; (iii) the number of background genes and metabolites annotated to each subpathway; and (iv) the number of interesting genes and metabolites annotated to each subpathway. The default background is usually considered to be the whole genome and metabolome of the given organism (2). In this study, we, therefore, used all human genes in KEGG as the background genes, and all metabolites in the Human Metabolome Database (HMDB) (23) and KEGG Human Pathway (1) as background metabolites because human metabolites are not included in KEGG. The following equation represents an example calculation of the statistical significance of a subpathway. If the whole genome (metabolome) has a total of genes ( metabolites), of which () are involved in the subpathway under investigation, and the set of genes (metabolites) submitted for analysis has a total of genes ( metabolites), of which () are involved in the same subpathway, then the P-value can be calculated to evaluate the enrichment significance of the subpathway as follows:

RESULTS

We analyzed two colorectal cancer and one metastatic prostate cancer data sets using Subpathway-GM with the parameters n = 5 and s = 5. This study focused on identifying disease-related subpathways. We, therefore, set the parameter s = 5 because this type of subpathways (s ≥ 5) has been reportedly associated with disease in some studies and is considered by many groups to represent a pathway. To set an appropriate n value for the identification of disease-related subpathways, we examined the distances among known disease nodes within pathways based on disease genes and metabolites in the Genetic Association Database (GAD) (24) and HMDB (23) (Figure 2A). The average shortest distance among these nodes was 8.02, which was significantly smaller than that between all nodes within metabolic pathways (P < 2.2E-16; Wilcoxon rank-sum test). We further computed the shortest distance between each disease node and its nearest disease node and found that the distance was <5 for 85% disease nodes (Figure 2B). Some studies have suggested that genes associated with the same disease show close tendencies in biological pathways, and that their biological functions tend to be similar (25–27). A value of n = 5 thus seems to represent the closeness of genes and metabolites in diseases.

Figure 2.

Identification of metabolic subpathways associated with colorectal cancer. (A) Distances among known disease nodes within metabolic pathways. (B) Empirical cumulative distribution functions of shortest path lengths between each disease node and its nearest disease node within pathways. (C) Plots of pathway significance (–log10 P-value) in Subpathway-GM, Pathway-G, Pathway-M and IMPaLA. Subpathway-GM identified 26 significant metabolic subpathways, corresponding to 25 entire pathways. Plus sign indicates that the pathway was identified by the corresponding method at the 1% significance level. Bold labels represent the additional pathways identified by Subpathway-GM. (D) Interaction network of the subpathway identified by Subpathway-GM. Two subpathways are connected by an edge if they share a non-empty intersection of metabolites or genes. Edge width between subpathways is proportional to the number of genes and metabolites shared by the two connected subpathways. Node size is proportional to the degree of the node. Node color reflects statistical significance of pathway (P-value). Subpathways well supported by existing literature are shown with a black border node. To compare Subpathway-GM with other methods at the system level, we used three specific ORAs to identify pathways for each data set: Pathway-G, Pathway-M and IMPaLA. These ORA methods are commonly used for pathway identification and have been included in many pathway analysis tools (2,6,7,10). Pathway-G uses only genes of interest to identify entire pathways via hypergeometric tests (2). Similar to Pathway-G, Pathway-M uses metabolites of interest (5–7), whereas IMPaLA integrates genes and metabolites of interest to identify entire pathways via hypergeometric tests but does not consider pathway structure and subpathway identification strategy (10).

Colorectal cancer 1

Our first example was chosen to illustrate the effectiveness of Subpathway-GM for identifying subpathways associated with colorectal cancer. Subpathway-GM located 56 potential subpathways from all metabolic pathways using 2053 differential genes and 90 differential metabolites. With a strict cut-off value of P < 0.01, Subpathway-GM finally identified 26 significant metabolic subpathways (Figure 2C). When many subpathways are considered, a high–false-positive discovery rate is likely to result; therefore, we calculated FDR corrected P-values for these subpathways using the Benjamini–Hochberg FDR method (Supplementary Table S1). The results showed that 26 subpathways with P < 0.01 remained significant at the strict cut-off of FDR < 0.02, suggesting a low–false discovery rate. We, therefore, continued to use the non-corrected P-values in the following analyses. The 26 significant metabolic subpathways identified corresponded to 25 entire pathways (Figure 2C). Of these, up to 20 (80%) were well reported to be associated with cancer (indicated by black border in Figure 2D). Detailed evidences were provided in Supplementary Table S1. Several other pathways also interact with the reported cancer pathways, suggesting a possible association (Figure 2D and Supplementary Text). Many pathways identified by Subpathway-GM were undetected by Pathway-G, -M or IMPaLA. Pathway-G found only three significant pathways at the 1% significance level (see Supplementary Data set S1), all of which were included in the 25 significant pathways found by Subpathway-GM (Figure 2C). However, Subpathway-GM identified an additional 22 (88%) not identified by Pathway-G (see Supplementary Data set S1). The powers of Pathway-M and IMPaLA also seemed to be limited, and 68 and 40% of the 25 significant pathways identified by Subpathway-GM were not considered as significant by Pathway-M and IMPaLA, respectively (see Supplementary Data set S1). Surprisingly, up to 10 pathways identified by Subpathway-GM were simultaneously ignored by Pathway-G, -M and IMPaLA (indicated by black label in Figure 2C). We focused on seven of these 10 additional pathways that contained both differential genes and metabolites (Table 1). These pathways had high P-values in Pathway-G, -M and IMPaLA, and we further tested their ranks in these pathways. The result showed that almost all pathways were ranked >25 in Pathway-G, -M and IMPaLA (Table 1), indicating that they might be ignored by Pathway-G, -M and IMPaLA from a ‘rank’ point of view.

Table 1.

Seven additional subpathways identified by Subpathway-GM using colorectal cancer data set 1

Subpathway ID	Pathway name	S-P(R)	S-FDR	I-P(R)	G-P(R)	M-P(R)	Representative	Reference
path:00380_3	Tryptophan metabolism	0.00037(7)	0.0041	0.037(42)	0.095(24)	0.38(48)	YES	(11,28,29,30,31)
path:00010_1	Glycolysis/gluconeogenesis	0.0017(14)	0.0070	0.020(35)	0.39(42)	0.052(31)		(17,18,32–34)
path:00562_1	Inositol phosphate metabolism	0.0021(15)	0.0081	0.10(56)	0.16(30)	0.66(55)		(32,33,35–37)
path:00340_1	Histidine metabolism	0.0039(18)	0.011	0.017(33)	0.049(14)	0.34(46)	YES	(38,39)
path:00590_1	Arachidonic acid metabolism	0.0040(19)	0.011	0.034(39)	0.038(12)	0.88(59)	YES	(40–42)
path:00500_1	Starch and sucrose metabolism	0.0048(22)	0.011	0.049(46)	0.12(26)	0.40(49)	YES
path:00270_2	Cysteine and methionine metabolism	0.0051(24)	0.011	0.012(29)	0.67(59)	0.018(21)	YES	(43,44)

S-P(R): P-values (P) and ranks (R) of pathways in Subpathway-GM; I-P(R), G-P(R), M-P(R): P-values (P) and ranks (R) for IMPaLA, Pathway-G and Pathway-M respectively; S-FDR: FDR corrected P-values of pathways in Subpathway-GM.

Seven additional subpathways identified by Subpathway-GM using colorectal cancer data set 1 S-P(R): P-values (P) and ranks (R) of pathways in Subpathway-GM; I-P(R), G-P(R), M-P(R): P-values (P) and ranks (R) for IMPaLA, Pathway-G and Pathway-M respectively; S-FDR: FDR corrected P-values of pathways in Subpathway-GM. The most significant of the seven additional subpathways (path:00380_3) belonged to tryptophan metabolism pathway (Figure 3). Subpathway-GM yielded a P-value of 0.00037 (FDR corrected to 0.0029), but the tryptophan metabolism pathway was not considered as significant in Pathway-G, -M (P > 0.05) or IMPaLA (P > 0.01). Abnormality of the key tryptophan–serotonin regions (top region in Figure 3) has been reported to cause tumor cell proliferation in colon and prostate cancers [reviewed in (28)]. Moreover, the key region where tryptophan is converted by indoleamine-2,3-dioxygenase to kynurenime (red arrow region in Figure 3) was closely related to immune activation, cell proliferation and impaired quality of life in colorectal cancer [reviewed in (29)]. Recent studies also showed that the tryptophan-2,3-dioxygenase (TDO)–kynurenime region effectively suppressed antitumor immune responses and promoted tumor cell survival and motility (11). The differential metabolites and enzymes in tryptophan metabolism pathway encoded by differential expressed genes were located in the local cascade region and formed signature nodes in Subpathway-GM. Notably, differential tryptophan was at the center of the pathway. Subpathway-GM used distance similarity information between these signature nodes to identify the pathway, as well as the key subpathway region, effectively.

Figure 3.

Tryptophan metabolism pathway where the differential genes and metabolites of colorectal cancer were annotated. Nodes near asterisk symbol belong to the key subpathway region (path:00380_3) identified by Subpathway-GM. Enzymes (rectangular nodes) mapped by differential genes are shown with red node labels and borders. Metabolites (circle nodes) mapped by differential metabolites were showed with red node borders. The second subpathway (path:00010_1) belonged to the glycolysis/gluconeogenesis pathway, which is highly associated with cancer cell metabolism. During the early part of the 20th century, Warburg found that cancer cells consume glucose and acidify their environment with lactate (Warburg effect) (32,33). The differential genes and metabolites on the colorectal cancer data set were mostly located in the pyruvate and lactate metabolism region of the pathway, which was successfully identified by Subpathway-GM (Supplementary Figure S1). Many studies also indicated that this region was closely related to energy demand of colorectal cancer tissues (17,18,34). The third subpathway (path:00562_1) belonged to the inositol phosphate metabolism pathway. Subpathway-GM analysis yielded a P-value of 0.0021 (FDR corrected to 0.0081) for this subpathway, but the corresponding pathway was ignored by Pathway-G, -M and IMPaLA even at the 10% significance level (P = 0.16, 0.66 and 0.1, respectively). The subpathway path:00562_1 (Supplementary Figure S2), especially its starting region, which contains phosphoinositide 3-kinase and the tumor suppressor Phosphatase and tensin homolog (PTEN), was reported to be highly associated with tumor growth and survival (35,36). The end product of the subpathway, inositol, was differential in colorectal cancer and located in the center of the pathway. Inositol has been highly associated with the initiation of cancer and the control of cancer metastases (32,33,37). The fourth subpathway (path:00340_1) belonged to histidine metabolism, whereby histidine is converted by histidine decarboxylase to histamine, which is highly associated with cancer [reviewed in (38)]. Cianchi et al. (39) demonstrated that dual inhibition of the histidine decarboxylase and cyclooxygenase (COX) subpathways in arachidonic acid metabolism might be as a possible therapeutic tool for the treatment of colorectal cancer. The fifth subpathway (path:00590_1) belonged to arachidonic acid metabolism with Subpathway-GM P-value of 0.004 (FDR corrected to 0.011). Pathway-M, however, only gave a P-value of 0.88 because only one metabolite (arachidonic acid) was annotated to the pathway. The arachidonic acid pathway was composed of three key metabolic regions: COX, Lysyl oxidase (LOX) and Human Cytochrome P450 (CYP) (Figure 4), each of which has been implicated in several types of cancers, including colorectal and prostate cancers [reviewed in (40,41); see Supplementary Text]. Arachidonic acid is located in the central position of three key metabolic regions of the pathway. Subpathway-GM thus used the close positional relationship between differential arachidonic acid and nearby differential genes to identify one irregular subpathway (path:00590_1) that contained the certain part of the three regions (the * region in Figure 4), especially, the Arachidonate lipoxygenases (ALOX5) region.

Figure 4.

Arachidonic acid metabolism pathway where the differential genes and metabolites of colorectal cancer were annotated. Nodes near asterisk symbol belong to the key subpathway (path:00590_1) region identified by Subpathway-GM. The region contained the certain parts of three subsystems: COX, LOX and CYP450. Most of the differential genes involved in LOX5 belonged to the LOX subsystem. Subpathway-GM tends to locate key regions representative of the corresponding entire pathways. We further tested whether the subpathways identified as significant by Subpathway-GM tended to be representative of the corresponding entire pathways in the colorectal cancer data set. We obtained all the nodes within the 26 significant subpathways and calculated their degrees and betweenness within the corresponding entire pathways. Degree and betweenness are the two most popular topological centrality indexes. The degree and betweenness of nodes within subpathways were significantly higher than that for all nodes within entire pathways (Table 2). For example, the average degree of nodes within the significant subpathways was 4.00, which was significantly higher than the average degree of all nodes, which was 2.14 (P = 4.05E-94; Wilcoxon rank-sum test). We also found that both genes and metabolites within significant subpathways tended to exhibit high topological centrality (Table 2). These results suggest that nodes within significant subpathways may be important in pathways in terms of both local (degree) and global (betweenness) topologies, and they indicate that the subpathways identified by Subpathway-GM tend to be representative of the entire pathways.

Table 2.

Degree and betweenness of nodes within significant subpathways and corresponding entire pathways

Centrality	Molecules	Subpathway	Entire pathway	P-values
Degree	Genes and metabolites	4.00	2.14	4.05E-94
	Genes	3.95		5.96E-115
	Metabolites	4.18		5.05E-61
Betweenness	Genes and metabolites	617.59	212.18	6.36E-113
	Genes	614.29		5.34E-82
	Metabolites	658.47		1.02E-46

Degree and betweenness of nodes within significant subpathways and corresponding entire pathways We also examined these significant subpathways identified by Subpathway-GM from a biological point of view. We defined a subpathway as representative when it covered most core parts of the corresponding entire pathway, based on our biological knowledge. Nineteen (∼73.07%) of the 26 subpathways showed high tendencies to be representative of the corresponding entire pathways (Supplementary Table S1). For example, the aforementioned subpathways were representative of the corresponding tryptophan, arachidonic acid and histidine metabolism pathways, respectively (Table 1). Tryptophan was at the center of the tryptophan pathway, which contains three main regions: tryptophan–kynurenime, tryptophan–serotonin and tryptophan–tryptamine. Subpathway-GM effectively identified the core parts of these regions as subpathway path:00380_3 (Figure 3). The arachidonic acid pathway was composed of three key metabolic regions: COX, LOX and CYP, with arachidonic acid at the center of three regions. Subpathway-GM identified subpathway path:00590_1, including arachidonic acid, COX, LOX and CYP. Taken together, Subpathway-GM can thus not only identify cancer-related subpathways but also tends to locate key regions representative of entire pathways. Seven subpathways were not defined as representative of their corresponding entire pathways, but they were still reported to be highly associated with cancer (Supplementary Table S1). These included glycolysis/gluconeogenesis and inositol phosphate metabolism, which displayed obvious local subpathway features (e.g. Supplementary Figures S1 and S2). For example, subpathway path:00562_1 belonged to the inositol phosphate metabolism pathway. Inositol is at the center of this pathway, which included many paths to consume and release the metabolite. Because the subpathway only covered one of these paths, and does not cover most regions of the pathway, we did not consider this subpathway to be representative of the corresponding entire pathway. The subpathway path:00562_1 (Supplementary Figure S2), however, especially the starting region of the subpathway, was reported to be highly associated with tumor growth and survival (35,36). The subpathway path:00010_1 comprised the pyruvate and lactate metabolism region, which is located in the bottom region of the glycolysis/gluconeogenesis pathway (Supplementary Figure S1). Pyruvate is at the center of the pathway and showed a high degree and betweenness in the pathway. Lactate has been widely reported to be associated with cancer (Warburg effect) (32,33). The differential genes and metabolites involved in the aforementioned pathways were located in the local cascade region of pathways. Pathway-G, -M and IMPaLA tended to miss this kind of pathways because of the low ratios of the differential genes and metabolites involved, resulting in low enrichment significance. However, Subpathway-GM can effectively detect the biologically meaningful pathways by accurately identifying key subpathway regions through the joint power of differential genes and metabolites, and their topologies within pathways.

Colorectal cancer 2

Our second data set was chosen to illustrate the reliability of the metabolic pathways identified by Subpathway-GM analysis in colorectal cancer data set 1 by analysis of gene expression in a second colorectal cancer data set 2. In this data set, we identified 1773 differentially expressed genes, including 792 (38.59%) of the genes identified in data set 1. Subpathway-GM analysis of the data set 2 detected 64 potential subpathways from all metabolic pathways, 36 of which, corresponding to 33 entire pathways, were finally identified as significant at the 1% significance level. Twenty (80%) of 25 significant pathways found in the data set 1 were also in data set 2, although only 38.59% differential genes in data set 1 were also in data set 2. The overlap between pathways found in the data sets 1 and 2 was also highly statistically significant (P < 1.00E-11; hypergeometric test). Most of the cancer-related pathways in data set 1, such as tryptophan metabolism, glycolysis/gluconeogenesis and arachidonic acid metabolism, were also identified in data set 2, indicating that Subpathway-GM can reliably identify disease-related metabolic pathways. Furthermore, we compared the results of the two data sets 1 and 2 at the subpathway level. We used the Jaccard index to compute similarities between molecules (genes and metabolites) in the corresponding subpathways for 20 common pathways found in the data sets 1 and 2. The similarity values of 17 pathways were >50%, and the average value for all 20 pathways was 66%. These results suggest that the results of Subpathway-GM analysis were reliable at the subpathway level. Moreover, eight pathways in data set 2, including tryptophan metabolism, glycolysis/gluconeogenesis and arachidonic acid metabolism, were identified by Subpathway-GM, but not by Pathway-G, -M or IMPaLA.

Metastatic prostate cancer

To further demonstrate the use of Subpathway-GM, we applied this analysis to a metastatic prostate cancer data set (12,22). Because the metabolic pathways associated with metastatic prostate cancer have not been widely reported, we expected to identify novel pathways in this data set. Subpathway-GM identified 16 significant subpathways at the 1% significance level, nine (56.25%) of which were involved in amino acid metabolism. Some studies suggested that amino acid metabolism might be highly associated with metastatic cancer (12,43), for example, ‘glycine, serine and threonine metabolism’ was reported to be highly associated with metastatic prostate cancer (12). Subpathway-GM yielded a P-value of 0.000027 (FDR corrected to 0.00042) and identified the metastasis-related key subpathway path:00260_1 (Supplementary Figure S3) (12). Of 16 pathways identified by Subpathway-GM, up to nine (56.25%) were ignored by Pathway-G, -M and IMPaLA simultaneously (Supplementary Table S2). The most significant additional subpathway (path:00380_1) contained the key tryptophan–serotonin regions in the tryptophan metabolism pathway (Supplementary Figure S4). Dizeyi et al. (30,31) demonstrated that serotonin can activate mitogen-activated protein kinase and PI3K/Akt signaling pathways, which play an important role in prostate cancer progression, especially in androgen-independent state disease. In the third additional pathway ‘cysteine and methionine metabolism’, Subpathway-GM identified the metastasis-related key subpathway region (path:00270_3) involving methionine metabolites (Supplementary Figure S5). In the pathway, methionine metabolites, including cystathionine and cysteine, can significantly increase the ability to predict aggressive prostate cancer (43). Furthermore, methionine metabolism involves mechanisms for sarcosine formation (43), and Sreekumar et al. (12) identified sarcosine as a potential key metabolic intermediary of prostate cancer cell invasion and aggressivity. In addition, the arachidonic acid metabolism pathway, especially COX regions in the pathway, has been highly associated with prostate cancer progression [reviewed in (45)]. To the best of our knowledge, some of the additional pathways identified by Subpathway-GM, such as histidine metabolism, have not been reported in association with metastatic prostate cancer. As shown in Figure 5A, the histamine region in the pathway was accurately identified by Subpathway-GM and the subpathway yielded a P-value of 0.0016 (FDR corrected to 0.0094). Histamine was located in the central region in the subpathway, suggesting a potential high association with metastatic prostate cancer. We further explore this using a transwell chamber assay to detect the effect of histamine on cell migratory ability in vitro (see Supplementary Text). Briefly, the prostate cancer cell line DH145 was treated with different final concentrations of histamine (1–6 μmol/l) for 24 h. The result showed that low concentration histamine could promote prostate cancer cell migratory ability and had a dose-dependent effect (Figure 5B and D). In contrast, high concentration histamine inhibited the cell migration. To exclude the effect of histamine on cell viability, viability was determined by 3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay after treatment of the cells with histamine (see Supplementary Text). The result showed that histamine had no effect on cell viability (Supplementary Figure S6). Cells were treated with 3 μmol/l of histamine for different period (0–24 h) to detect any time-dependent effects. The results confirmed that histamine promoted prostate cancer cell’s ability of migration in a time-dependent manner (Figure 5C and 5E). In addition, several genes previously shown to play important roles in multiple cancers also emerged in the identified subpathway, including Histidine decarboxylase (HDC), Histamine N-methyltransferase (HNMT), Monoamine oxidase (MAO) and Aldehyde dehydrogenase [NAD(P)+] (ALDH) (Figure 5A). Overall, these results suggest that dysfunction of the histamine region may be highly associated with metastatic prostate cancer.

Figure 5.

Analysis of the histamine region in histidine metabolism. (A) The histamine region (path:00340_1) is located in the center of the histidine metabolism pathway. Zoomed region displays the subpathway in detail. (B and D) Dose-dependent effect of histamine on migration detected using transwell chamber assay. Cell migration ability increased as histamine concentration increased. Prostate cancer cells showed the greatest migration at 3 μmol/l. (C and E) Cells treated with 3 μmol/l histamine were incubated for different periods (0–24 h). Histamine promoted prostate cancer cell migration in a time-dependent manner. Each experiment aforementioned was performed in triplicate.

DISCUSSION

Metabolic pathways involve genes (enzymes) and metabolites, which both play important roles in the functions of metabolic pathways related to studying conditions such as diseases (10–12). The integrative analysis of genes and metabolites at the pathway structure level will, therefore, help to locate and evaluate key metabolic subpathways. Based on this idea, we integrated genes and metabolites relevant to a given disease into pathways, and then identified key metabolic subpathways via cascades among signature nodes within the pathway structure. The resulting Subpathway-GM method was applied to differential genes and metabolites in colorectal cancer. The results showed that most of the pathways identified by Subpathway-GM were highly associated with the initiation and progression of colorectal cancer. Interestingly, many pathways corresponding to significant subpathways identified by Subpathway-GM were obviously ignored by classical pathway identification methods, such as Pathway-G, -M and IMPaLA. We focused on seven pathways that were identified by Subpathway-GM but not by any of the other three methods. Surprisingly, six pathways were highly associated with colorectal cancer, suggesting that Subpathway-GM was able to recall more colorectal cancer-associated pathways. Moreover, the results obtained for one colorectal cancer data set at the entire pathway and subpathway levels were reliably reproduced in a second colorectal cancer data set. Application of Subpathway-GM to a metastatic prostate cancer data set not only successfully identified already-reported metastasis-related metabolic subpathway regions but also predicted multiple novel potential metastasis-related subpathways, such as histamine metabolism. These results thus demonstrate the power of a joint gene/metabolite subpathway strategy based on their topology in terms of recalling and predicting more biologically meaningful pathways. Compared with other entire pathway identification methods, Subpathway-GM was able to locate key subpathway regions accurately via positional information of differential genes and metabolites within pathways. This strategy can not only locate key subpathway regions associated with diseases but also tends to identify the key regions representative of entire pathways. The key subpathway regions identified by Subpathway-GM contained fewer genes and metabolites than entire pathways, allowing researchers to use alternative low-throughput technologies to confirm the local subpathway regions related to specific diseases (46). This study focused on finding disease-related subpathways by setting the parameter n according to the distance between known disease genes and metabolites. The parameter s was also set as 5 to increase the sensitivity of identifying potential disease-related subpathways. However, users would be able to vary these parameters according to their needs. For example, increasing s could be used to focus more on the representative subpathways by filtering out small subpathways because smaller subpathway are less likely to represent the corresponding entire pathway. In addition, n can indirectly influence the size of the located subpathway. Increasing n increases the size of the subpathway because lenient distance similarity tends to merge more nodes into the same subpathway. Moreover, the key nodes within entire pathways also tend more to be added to the located subpathway because of high topological centrality of key nodes. Subpathway-GM would thus be expected to identify the key representative regions in entire pathways. Our previous ‘k-cliques’ method, which was similar to Pathway-G but uses pathway structure, was able to identify metabolic subpathways based on interesting genes (13). However, the method only used enzyme relations (without considering metabolites) to mine regular ‘circle’ shape subpathways. On the one hand, this method did not support joint input from interesting genes and metabolites like most of pathway identification methods, and thus ignored joint power of genes and metabolites. On the other hand, many disease-related subpathways displayed relatively irregular shapes in pathways, especially when considering interesting metabolite information. These irregular regions can usually be identified effectively by the Subpathway-GM, but not by the ‘k-cliques’ method, because of the different subpathway mining strategies of the two methods. We further compared Subpathway-GM with the ‘k-cliques’ method using data sets 1, 2 and 3, and we found that Subpathway-GM was more effective in identifying subpathways as a result of the integrative use of genes and metabolites, the strategy for locating subpathway regions and control of subpathway overlap (see Supplementary Text). Subpathway-GM thus detected 16, 13 and 10 significant pathways in data sets 1, 2 and 3, respectively, that were ignored by the ‘k-cliques’ method. The limitations of current metabolomic technology mean that fewer differentially metabolites are detected compared with differentially expressed genes, which may result in pathway analysis strategies tending to ignore metabolite information. However, metabolites may be located in important positions in pathways, and Subpathway-GM takes into account the importance of metabolites in locating and evaluating subpathways. Notably, if differential metabolites appear in an important position in the pathways, these metabolites and their nearby genes can play crucial roles in identifying key subpathway regions through the formation of signature nodes. For example, the differential arachidonic acid in the arachidonic acid metabolism pathway effectively helps to locate the key subpathway region composed of LOX, COX and CYP according to our subpathway strategy. Tryptophan in tryptophan metabolism provides another example. These metabolites were differential in colorectal cancer and were thus defined as signature nodes to locate key subpathways in Subpathway-GM. In analysis of data from ‘omics’ technologies such as microarrays, Pathway-G, -M and IMPaLA belong to the ‘cut-off–based’ methods, which are useful for identifying pathways with strong changes reflecting major changes in genes or/and metabolites (e.g. differential genes and metabolites) (2,8). The ‘cut-off–free’ methods, such as GSEA and its versions (e.g. MSEA), are also useful because they can detect molecules with weaker changes, and thus identify biologically meaningful but seemingly weak pathways (2,8). Subpathway-GM is a ‘cut-off–based’ method but can also recall biologically meaningful, seemingly weak pathways that contain key subpathway regions with strong changed nodes. The key principle of Subpathway-GM is that seemingly weak pathways from the entire pathway view may still show strong changes in key subpathway regions. Through effectively setting the parameter n, Subpathway-GM can thus endure some genes and metabolites with seemingly weak changes. ‘Cut-off–free’ methods are still not applied for the integrative pathway identification of genes and metabolites. A fact is that acquiring integrative rank list of genes and metabolites is usually difficult based on different technologies, such as microarray, GC–MS, liquid chromatography–MS and nuclear magnetic resonance. Subpathway-GM thus demonstrates a significant advantage in terms of the integrative pathway identification of genes and metabolites. Instead of needing to know rank lists or differential levels, users only need to provide a list of the molecules of interest. Subpathway-GM thus has considerable potential to complement ORA and GSEA in pathway identification, as a result of its utilization of both genes and metabolites and its subpathway strategy based on their topologies.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online: Supplementary Tables 1 and 2, Supplementary Figures 1–6, Supplementary Text, Supplementary Data set 1 and Supplementary References [47-61].

FUNDING

Funds for Creative Research Groups of the National Natural Science Foundation of China [81121003]; National Natural Science Foundation of China [61170154, 61073136, 81270511, 31200996 and 81000933]; Specialized Research Fund for the Doctoral Program of Higher Education of China [20102307120027 and 20102307110022]. Funding for open access charge: The National Natural Science Foundation of China [81121003]. Conflict of interest statement. None declared.

58 in total

Review 1. Gene-set analysis and reduction.

Authors: Irina Dinu; John D Potter; Thomas Mueller; Qi Liu; Adeniyi J Adewale; Gian S Jhangri; Gunilla Einecke; Konrad S Famulski; Philip Halloran; Yutaka Yasui
Journal: Brief Bioinform Date: 2008-10-04 Impact factor: 11.622

Review 2. Inhibition of arachidonic acid metabolism and its implication on cell proliferation and tumour-angiogenesis.

Authors: C A C Hyde; S Missailidis
Journal: Int Immunopharmacol Date: 2009-02-23 Impact factor: 4.932

3. Metabolomic profiles delineate potential role for sarcosine in prostate cancer progression.

Authors: Arun Sreekumar; Laila M Poisson; Thekkelnaycke M Rajendiran; Amjad P Khan; Qi Cao; Jindan Yu; Bharathi Laxman; Rohit Mehra; Robert J Lonigro; Yong Li; Mukesh K Nyati; Aarif Ahsan; Shanker Kalyana-Sundaram; Bo Han; Xuhong Cao; Jaeman Byun; Gilbert S Omenn; Debashis Ghosh; Subramaniam Pennathur; Danny C Alexander; Alvin Berger; Jeffrey R Shuster; John T Wei; Sooryanarayana Varambally; Christopher Beecher; Arul M Chinnaiyan
Journal: Nature Date: 2009-02-12 Impact factor: 49.962

4. Invasion associated up-regulation of nuclear factor kappaB target genes in colorectal cancer.

Authors: David Horst; Jan Budczies; Thomas Brabletz; Thomas Kirchner; Falk Hlubek
Journal: Cancer Date: 2009-11-01 Impact factor: 6.860

5. Induction of tumor cell apoptosis by taurine Schiff base copper complex is associated with the inhibition of proteasomal activity.

Authors: Xia Zhang; Caifeng Bi; Yuhua Fan; Qiuzhi Cui; Di Chen; Yan Xiao; Q Ping Dou
Journal: Int J Mol Med Date: 2008-11 Impact factor: 4.101

6. Serum metabolite profiling of human colorectal cancer using GC-TOFMS and UPLC-QTOFMS.

Authors: Yunping Qiu; Guoxiang Cai; Mingming Su; Tianlu Chen; Xiaojiao Zheng; Ye Xu; Yan Ni; Aihua Zhao; Lisa X Xu; Sanjun Cai; Wei Jia
Journal: J Proteome Res Date: 2009-10 Impact factor: 4.466

7. Metabolic profiling of human colorectal cancer using high-resolution magic angle spinning nuclear magnetic resonance (HR-MAS NMR) spectroscopy and gas chromatography mass spectrometry (GC/MS).

Authors: Eric Chun Yong Chan; Poh Koon Koh; Mainak Mal; Peh Yean Cheah; Kong Weng Eu; Alexandra Backshall; Rachel Cavill; Jeremy K Nicholson; Hector C Keun
Journal: J Proteome Res Date: 2009-01 Impact factor: 4.466

8. HMDB: a knowledgebase for the human metabolome.

Authors: David S Wishart; Craig Knox; An Chi Guo; Roman Eisner; Nelson Young; Bijaya Gautam; David D Hau; Nick Psychogios; Edison Dong; Souhaila Bouatra; Rupasri Mandal; Igor Sinelnikov; Jianguo Xia; Leslie Jia; Joseph A Cruz; Emilia Lim; Constance A Sobsey; Savita Shrivastava; Paul Huang; Philip Liu; Lydia Fang; Jun Peng; Ryan Fradette; Dean Cheng; Dan Tzur; Melisa Clements; Avalyn Lewis; Andrea De Souza; Azaret Zuniga; Margot Dawe; Yeping Xiong; Derrick Clive; Russ Greiner; Alsu Nazyrova; Rustem Shaykhutdinov; Liang Li; Hans J Vogel; Ian Forsythe
Journal: Nucleic Acids Res Date: 2008-10-25 Impact factor: 16.971

9. Metabolite profiling of human colon carcinoma--deregulation of TCA cycle and amino acid turnover.

Authors: Carsten Denkert; Jan Budczies; Wilko Weichert; Gert Wohlgemuth; Martin Scholz; Tobias Kind; Silvia Niesporek; Aurelia Noske; Anna Buckendahl; Manfred Dietel; Oliver Fiehn
Journal: Mol Cancer Date: 2008-09-18 Impact factor: 27.401

10. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists.

Authors: Da Wei Huang; Brad T Sherman; Richard A Lempicki
Journal: Nucleic Acids Res Date: 2008-11-25 Impact factor: 16.971

44 in total

1. Coordinate Regulation of Metabolite Glycosylation and Stress Hormone Biosynthesis by TT8 in Arabidopsis.

Authors: Amit Rai; Shivshankar Umashankar; Megha Rai; Lim Boon Kiat; Johanan Aow Shao Bing; Sanjay Swarup
Journal: Plant Physiol Date: 2016-07-18 Impact factor: 8.340

2. A novel dysregulated pathway-identification analysis based on global influence of within-pathway effects and crosstalk between pathways.

Authors: Junwei Han; Chunquan Li; Haixiu Yang; Yanjun Xu; Chunlong Zhang; Jiquan Ma; Xinrui Shi; Wei Liu; Desi Shang; Qianlan Yao; Yunpeng Zhang; Fei Su; Li Feng; Xia Li
Journal: J R Soc Interface Date: 2015-01-06 Impact factor: 4.118

Review 3. Investigating the aetiology of adverse events following HPV vaccination with systems vaccinology.

Authors: Joan Campbell-Tofte; Aristidis Vrahatis; Knud Josefsen; Jesper Mehlsen; Kaj Winther
Journal: Cell Mol Life Sci Date: 2018-10-16 Impact factor: 9.261

4. Comprehensive analysis of microglia gene and subpathway signatures for glioma prognosis and drug screening: linking microglia to glioma.

Authors: Chunlong Zhang; Jiaxin Zhao; Wanqi Mi; Yuxi Zhang; Xiaoling Zhong; Guiyuan Tan; Feng Li; Xia Li; Yanjun Xu; Yunpeng Zhang
Journal: J Transl Med Date: 2022-06-21 Impact factor: 8.440

5. Exploring autophagy-related prognostic genes of Alzheimer's disease based on pathway crosstalk analysis.

Authors: Fang Qian; Wei Kong; Shuaiqun Wang
Journal: Bosn J Basic Med Sci Date: 2022-09-16 Impact factor: 3.759

Review 6. Metabolomic Biomarkers of Prostate Cancer: Prediction, Diagnosis, Progression, Prognosis, and Recurrence.

Authors: Rachel S Kelly; Matthew G Vander Heiden; Edward Giovannucci; Lorelei A Mucci
Journal: Cancer Epidemiol Biomarkers Prev Date: 2016-04-06 Impact factor: 4.254

7. MPINet: metabolite pathway identification via coupling of global metabolite network structure and metabolomic profile.

Authors: Feng Li; Yanjun Xu; Desi Shang; Haixiu Yang; Wei Liu; Junwei Han; Zeguo Sun; Qianlan Yao; Chunlong Zhang; Jiquan Ma; Fei Su; Li Feng; Xinrui Shi; Yunpeng Zhang; Jing Li; Qi Gu; Xia Li; Chunquan Li
Journal: Biomed Res Int Date: 2014-06-25 Impact factor: 3.411

8. Integrating multi-omics for uncovering the architecture of cross-talking pathways in breast cancer.

Authors: Li Wang; Yun Xiao; Yanyan Ping; Jing Li; Hongying Zhao; Feng Li; Jing Hu; Hongyi Zhang; Yulan Deng; Jiawei Tian; Xia Li
Journal: PLoS One Date: 2014-08-19 Impact factor: 3.240

9. Identification of miRNA-mediated core gene module for glioma patient prediction by integrating high-throughput miRNA, mRNA expression and pathway structure.

Authors: Chunlong Zhang; Chunquan Li; Jing Li; Junwei Han; Desi Shang; Yunpeng Zhang; Wei Zhang; Qianlan Yao; Lei Han; Yanjun Xu; Wei Yan; Zhaoshi Bao; Gan You; Tao Jiang; Chunsheng Kang; Xia Li
Journal: PLoS One Date: 2014-05-08 Impact factor: 3.240

10. From genotype to functional phenotype: unraveling the metabolomic features of colorectal cancer.

Authors: Oliver F Bathe; Farshad Farshidfar
Journal: Genes (Basel) Date: 2014-07-22 Impact factor: 4.096