Literature DB >> 19390668

Perturbation of interaction networks for application to cancer therapy.

Adrian P Quayle1, Asim S Siddiqui, Steven J M Jones.   

Abstract

We present a computational approach for studying the effect of potential drug combinations on the protein networks associated with tumor cells. The majority of therapeutics are designed to target single proteins, yet most diseased states are characterized by a combination of many interacting genes and proteins. Using the topology of protein-protein interaction networks, our methods can explicitly model the possible synergistic effect of targeting multiple proteins using drug combinations in different cancer types. The methodology can be conceptually split into two distinct stages. Firstly, we integrate protein interaction and gene expression data to develop network representations of different tissue types and cancer types. Secondly, we model network perturbations to search for target combinations which cause significant damage to a relevant cancer network but only minimal damage to an equivalent normal network. We have developed sets of predicted target and drug combinations for multiple cancer types, which are validated using known cancer and drug associations, and are currently in experimental testing for prostate cancer. Our methods also revealed significant bias in curated interaction data sources towards targets with associations compared with high-throughput data sources from model organisms. The approach developed can potentially be applied to many other diseased cell types.

Entities:  

Keywords:  Networks; cancer; drug combinations; network perturbation; protein-protein interactions

Year:  2007        PMID: 19390668      PMCID: PMC2666951     

Source DB:  PubMed          Journal:  Cancer Inform        ISSN: 1176-9351


Introduction

Aims and motivation

Research into the genetic basis of many different diseases has developed our knowledge of the links between particular genes, proteins and diseases, such as the wellknown P53 gene and cancer. Most diseased states are polygenic, however, and cannot be explained or characterized by a single gene, but rather by a combination of interacting genes and their products (Vogelstein et al. 2000). Therapeutic drugs traditionally target single highly connected proteins in the networks associated with diseased cells, in order to elicit a response. Some degree of efficacy is possible using such an approach, but a knowledge and characterization of the gene and protein interaction networks associated with a diseased cell state is important for the development of improved therapeutics. For example, highly connected proteins are more likely to have a critical role in the protein interaction networks of normal cells, and hence side effects and chemotoxicity will often result with such an approach (Keyomarsi and Pardee, 2003). In addition to disease therapy, a network level approach is also required to understand many other phenotypic processes, such as development (Gilbert, 2000). We present an approach which characterizes cancer types on a network level, by developing a model of the interaction networks present in tumor cells. The primary motivation for using such an aproach is to search for potential combinations of drugs which give improved efficacy for cancer treatments compared to existing therapies typically involving a single drug. The potential improved efficacy from using a multi-drug approach to cancer therapy has already been recognized (Mitchell, 2003), but research has not yet attempted to discover novel drug combinations from a knowledge of the underlying associated networks. For example, if a given drug is known to inhibit tumor growth to some extent, we can computationally search for another drug to be used in combination with the known drug to give improved efficacy. Part of the premise of such an approach is that network topology is a useful predictor for cancer and diseased targets in general. Network topology has been shown to influence many biological properties including gene essentiality (Jeong et al. 2001; Yu et al. 2004), expression (Lukashin et al. 2003; Herrgard et al. 2003) and function (Dunn et al. 2005; Guimerà and Amaral, 2005) amongst others, and therefore some influence or correlation with target suitability seems likely. Yet, to our knowledge this correlation has not yet been studied in any detail. We develop models of the networks associated with tumor cells and equivalent normal cells in H. sapiens by mapping gene expression comparison data onto networks of protein-protein interactions. The interaction data is collated both from established sources of curated H. sapiens interaction data and from a set of ortholog interactions in model organisms. We search for potential drug combinations using a computational simulation of the effect of removing target protein combinations from the network associated with a given tumor cell (cancer network). A number of alternative measures or ranking methods are introduced to model the criticality of these protein combinations, which are used to predict potential drug combinations by relating the relevant proteins to known drugs. A network level model is a relatively new approach to cancer research but—together with the integration of gene expression data—has already been noted as a useful avenue for drug discovery for cancer therapies (Huang, 1999). Due to current limitations in data coverage and accuracy, in particular for protein interaction data, such a network-level model cannot be expected to provide a complete solution, but may provide significant predictive power. Interaction network models can be developed on a number of levels, using a global network model such as in the current work, possibly incorporating different network types, or alternatively using more detailed models of sub-networks or individual pathways. One of the ultimate goals of biological network models is to produce an accurate model of the complete network of interactions in a cell or tissue type, often referred to as the interactome (Walhout et al. 2002). In this approach, signalling networks, metabolic networks, and protein networks are all explicitly included in a single model, which is currently only feasible for relatively simple organisms such as E. coli (Juty et al. 2001). Models for more complex organisms focus on particular network types, such as transcriptional networks (Bolouri and Davidson, 2002), or metabolic networks (Kell, 2004; Stelling et al. 2002). Topological network models have shown that many cellular networks exhibit a power-law degree disribution (Jeong et al. 2000), and that their structure has a modular nature (Ravasz et al. 2002; Rives and Galitski, 2003).

Network perturbation

We use network perturbation to search for novel target combinations, and consider the perturbation of one network relative to another, which we call “preferential perturbation”. A single network perturbation approach attempts to maximize the perturbation to a cancer network, whereas an approach involving preferential perturbation in this case attempts to maximize the perturbation to a cancer network, while minimizing the resultant perturbation to a related normal network. The susceptibility of networks to attack and failure has been studied for many network models and real world networks (Holme et al. 2002). The robustness of real-world networks is critical in a wide-range of contexts, including power-grids and general transport networks, communication networks and the Internet (Holme et al. 2002) (Cohen et al. 2000). Most studies have investigated the robustness of single networks or systems of networks, and have not considered the robustness of one network relative to another. The preferential perturbation of a pair of networks with non-zero similarity is a very different problem to single network perturbation, and we study the principles and topological dependency of preferential network perturbation in a separate publication (Quayle et al. 2006). We show that the extent of preferential perturbation of a pair of networks depends on a number of topological parameters, including the network similarity, size and average degree amongst others. The work in (Quayle et al. 2006) uses random (ER model) networks (Erdös and Rényi, 1960) and so called “scale-free” (BA model) networks (Barabási and Albert, 1999) to develop results and principles for general networks.

Model and Methods

We describe the methods and approach used in detail in Sections 2.1 to 2.3, in three conceptually distinct and sequential phases. The first phase is the collation of interaction data (Section 2.1), which generates underlying network models. Gene expression data is used to generate networks associated with specific cell types (Section 2.2) and the final phase involves the simulation of network perturbations to study these networks (Section 2.3). We also describe the disease and drug association data used (Section 2.4) and details of statistical techniques used for data analysis (Section 2.5).

Protein-protein interaction data

We combine a number of interaction data sources to improve coverage over that which could be acheived using only one such data source. We collated and populated curated H. sapiens interaction data from BIND (Bader et al. 2003), HPRD (Peri et al. 2003) and DIP (Xenarios et al. 2002) to give a total of 9,020 interactions between 4,524 HUGO genes (Wain et al. 2004). Since the interaction data currently available for H. sapiens is relatively limited in terms of coverage, we also added a set of predicted interactions in H. sapiens which was derived from interactions in model organisms by Lehner and Fraser (Lehner and Fraser, 2004). By mapping proteins referenced in interaction data from the model organisms C. elegans, D. melanogaster and S. cerevisiae to orthologous proteins in H. sapiens using the Inparanoid algorithm (O’Brien et al. 2005), a set of predicted interactions in H. sapiens was obtained. We extracted the core dataset of interactions which have the highest associated confidence score according to the Inparanoid mapping algorithm, to give 6,958 unique interactions. The majority of results presented here are based only on the higher confidence curated data sources, but the set of predicted interactions gives improved coverage.

Generation of networks associated with cell types

The combination of data sources described in Section 2.1 generates a network which is a subset of the true network of all possible interactions due to data limitations, and is also subject to the accuracy of interaction data. The particular proteins and interactions present in a cell varies significantly between cell types, depending on the gene and protein expression. A knowledge of the gene expression profile for a given cell type can be used to generate a model of the associated interaction network. We used expression comparisons of SAGE libraries (Blackshaw et al. 2003) from a particular type of tumor cell and equivalent normal cell to determine those genes which are up-regulated or down-regulated, according to a significance level of 5% and at least a five-fold change in the expression level (after conversion to a parts per million measure). A tag to gene mapping stage is also required, for which we used the mappings available from SAGE Genie (Boon et al. 2002). Approximately 80% of the tags were successfully mapped to genes, which is a reasonable fraction for tag mapping. A more detailed discussion of alternative approaches and complications involved in tag to gene mapping is described in Pleasance and Jones (Pleasance and Jones, 2005). We map genes that are up-regulated in the cancer library to the cancer network, and down-regulated genes to the normal network, and the non-differentially regulated genes are mapped to both networks. This second phase gives two sub-networks from an underlying network to represent the networks associated with particular types of tumor and normal cells. In this phase of the method, approximately 50% of the genes map to the underlying interaction network (see Section 3.2 for more detailed discussion on data coverage). Networks for seven different cancer types are generated from expression comparisons of short SAGE libraries available from the Cancer Genome Anatomy Project (CGAP) (Lash et al. 2000), as listed in Table 1.
Table 1.

CGAP SAGE library IDs used in expression comparisons for each cancer type.

Cancer TypeCancerNormal
StomachGSM2385GSM784
ColonGSM755GSM728
PancreasGSM743GSM716
ProstateGSM740GSM739
BreastGSM672GSM14756
LungGSM14806GSM14805
BrainGSM14762GSM763
In the current work we use a topological characterization of the networks generated from the techniques described in Sections 2.1 and 2.2, as the first phase towards the development of a more detailed understanding and model of these networks. We study the change in network topology when a combination of proteins (network vertices, which we use to refer to proteins later in the paper) is removed, which we call a network perturbation. We search for optimum target combinations which, when removed, maximize the fragmentation of a given cancer network, while minimizing the fragmentation of the associated normal network. It is hoped that this may lead to novel drug combinations, which can target particular proteins, and selectively perturb the cancer network. We measure network perturbation by the size of the giant component, which is the largest component in a network. This is a standard approach used in network robustness studies (Holme et al. 2002; Albert et al. 2000). We define the perturbation score for a single network as, where G1 and G1 are the giant component sizes of network 1 before and after perturbation respectively (Quayle et al. 2006). Equation 1 includes normalization based on the initial network, such that the perturbation score ranges between zero and one. We define a preferential perturbation as the difference in perturbation scores for two networks such that, where P12 reflects a preferential perturbation between two networks. An exhaustive search for the optimum vertex combination which maximizes the perturbation score is not possible due to the size of the associated search space, and a number of alternative optimization approaches were considered. It was found that a technique using successive vertex removal gives the highest score on average (Quayle et al. 2006). That is, the highest ranking vertex is initially removed, parameters are recalculated based on the new networks, the new highest ranking vertex is removed, and this process is repeated iteratively until the required number of vertices have been removed to make up a vertex combination. Alternative vertex ranking methods were studied, based on parameters such as vertex degree, diffierence in degree, betweenness, diffierence in betweenness, and the resulting perturbation score (see (Quayle et al. 2006) for further explanation of these methods and a detailed study of their success). It was found that no single method is a universally optimal method, but rather the best method depends on the regime of interest (number or fraction of vertices to remove) and also details of network topology. Therefore we present and compare results from all methods considered for the perturbation of cancer and normal networks.

Target association data

Given a novel target combination we wish to determine whether those targets are druggable or disease associated, and in particular whether they are cancer associated. We used the Therapeutic Target Database (TTD) as a source of target-drug associations (Chen et al. 2002), which includes information on protein and nucleic acid targets, the diseased target, corresponding drugs or ligands and pathway information. After mapping to HUGO IDs we obtained 659 targets, 231 of which have drug associations, and 641 with disease associations at the time of population. Many of the listed disease associations are cancer associations, and many targets are associated with multiple cancers. An additional set of 299 cancer associations was obtained from a literature census of mutated genes implicated in human cancer (Futreal et al. 2004), giving 275 associations after mapping to HUGO IDs.

Correlation tests

We studied the correlation of vertex properties between networks which provides a number of insights, but is principally useful for quantifying the influence of the underlying network on the resulting cancer and normal networks. The types of analysis are based either on initial network topology or perturbation results, for which we use standard linear regression (Section 2.5.1) or Spearman’s rank correlation coefficient (Section 2.5.2) respectively.

Initial topology

Correlations between vertex topological properties have been studied for model networks (Holme et al. 2002; Quayle et al. 2006), and a similar analysis can be made for the cancer and normal networks, and the underlying network. For example, the correlation between vertex degree and betweenness for a given network is visualized by a scatter plot, where the x and y values of each point are the degree and betweenness of a vertex. A least-squares linear regression coefficient is used to quantify the strength of the correlation. We calculate the linear correlation coefficients between different cancer networks and between the underlying network and cancer networks.

Perturbation results

The output from a perturbation simulation is an ordered vertex list, that is, a set of vertex ranks. The correlation between the rank positions of vertices in two different perturbations can be calculated using Spearman’s rank correlation coefficient which is suitable specifically for ranked data. The output from two perturbations typically contains different vertex sets and a different number of vertices, so the coefficient is calculated based only on vertices in both output lists. Spearman’s rank correlation coefficient, r is given by the expression, where n is the number of paired ranks, and d is the diffierence between the two ranks in a pair (in this case for a given vertex) (Zar, 1999). Note that the above expression is strictly only valid for data with no tied ranks, which is true for the perturbation results since only one vertex is removed at a time.

Results and Discussion

The underlying network model constructed from the combination of data sources was analyzed by calculating a range of network properties. Table 2 shows relevant statistics for the data sources and data source combinations, where we use the network terminology vertices and edges to refer to proteins and protein-protein interactions respectively.
Table 2.

Data statistics: Topological properties of data sources and data combinations where C1 = BIND + HPRD + DIP and C2 = BIND + HPRD + DIP + Predicted.

BINDHPRDDIPPredictedCombination C1Combination C2
Number of vertices, N25013349597269645245989
Number of edges, n357455217306958902015776
Average degree, k2.8583.2972.4465.1623.9885.268
Average shortest path, l5.3005.9456.7376.4855.0435.091
Clustering coefficient, C0.0730.0580.0940.1390.0780.104
Assortativity coefficient, r−0.1690.0410.0310.446–0.111–0.052
A number of network parameters are shown in Table 2 which we explain in more detail below, since these parameters are discussed in later sections of this paper. The degree of a vertex is the number of edges connected to that vertex, and the average degree is the vertex degree averaged across all vertices in a network. Similarly, there is a shortest path length between every pair of vertices in a network, and the path length is averaged over all vertex pairs to give the average shortest path. The betweenness of a vertex (Joy et al. 2005), which is discussed later, measures the number of shortest paths passing through a given vertex. The clustering coefficient measures the fraction of transitive triples or triangles between nearest neighbours, and the clustering coefficient of vertex i, C, is given by, where k is the degree or number of nearest neighbours of vertex i, and E is the number of edges connecting between these nearest neighbours (Watts and Strogatz, 1998; Wasserman and Faust, 1994). The clustering coefficient of a network is calculated by averaging C over all vertices of the network. Finally the assortativity coefficient measures the tendency of vertices to connect to other vertices which have a similar degree (see (Newman, 2002) for an exact definition). In other words, if the “hubs” in a network tend to connect to other hubs, then the network shows assortative mixing. The assortativity coefficient is defined between minus one for a perfectly disassortative network and plus one for a perfectly assortative network, where a value of zero means there is no assortative mixing. It has been shown that social networks are generally assortative, whereas biological and technological networks are generally disassortative (Newman, 2002). We group the data sources into two data source combinations, where the first is a combination of primary H. sapiens interaction data, and the second combination contains all data sources including predicted interactions. DIP has the least interactions and the smallest average degree, and correspondingly the largest average shortest path. The data source of predicted interactions is the most connected, and has a surprisingly high assortativity coefficient of r = 0.446, since biological networks are generally disassortative. This likely reflects sample bias in this dataset of core interactions, since each data source is only a subset of the complete set of interactions, and is subject to bias in terms of known interactions. This bias and the fact that data derived from orthologous protein-protein interactions is expected to be less accurate than primary H. sapiens interactions are the motivation for using two alternative data source combinations.

Topology and similarity of cancer and normal network pairs

We use the definition of network similarity, S, in terms of network edges given by, where n is the number of edges in common between the two networks, and n is the total number of edges in the combined network (see (Quayle et al. 2006) for further explanation). At the two extremes, if the networks are identical then no preferential perturbation is possible, but if the networks have no similarity then a complete perturbation of one network relative to the other is possible if sufficient vertices are removed, and the problem is equivalent to single network perturbation. Between these limits, some maximum possible preferential perturbation score is associated with two given networks. A detailed derivation of the similarity of both independent and “correlated” network pairs has been determined (Quayle et al. 2006). Correlated networks are derived from a single underlying network, and the cancer and normal network pairs are correlated network pairs, since they are derived from the underlying interaction network. The similarity of such correlated networks is independent of network topology if the vertices which make up the correlated networks are sampled randomly from the underlying network. The similarity of randomly sampled correlated networks varies directly with the vertex set similarity, V, which is given by, where N is the number of vertices common to the two networks, and N is the total number of distint vertices in the networks. For random vertex sampling, we showed by analytical derivation that the expected network similarity, E(S), is equal to the square of the vertex set similarity. Fourteen cancer and normal network pairs were generated from the seven cancer types listed in Section 2.2 and the two data source combinations described in Section 3.1. After tag mapping, the results from an expression comparison contain around 7,000 genes, which is typically reduced to around 2,000 after singletons—a SAGE tag with only one occurrence in the corresponding library—are removed. There is significant variation in this stage, however, and some cancer types such as prostate have far fewer singletons, leaving around 4,000 genes after the removal of singletons. Approximately 50% of these genes map to the underlying network, giving network sizes of the order of 1,000 genes. The similarity of the generated network pairs was calculated and plotted against the vertex set similarity for each network pair, as shown in Figure 1. The variation of the expected network similarity for a general network pair with random vertex sampling is also shown.
Figure 1.

Network similarity and vertex set similarity values for cancer and normal network pairs (data combinations C1 and C2), compared directly to the expected network similarity for a general network topology with random vertex sampling.

Figure 1 shows that the network similarity and vertex set similarity values for the cancer and normal networks are generally close to the predicted variation for random vertex sampling. Network pairs for different cancer types have network similarity values within the approximate range 0.55 ≤ S ≤ 0.70, with the exception of prostate cancer, where S ∼ 0.90. Therefore the maximum preferential perturbation score is likely to be lower for the prostate cancer networks, depending also on other topological parameters such as network-size, average degree and clustering coefficient, which all influence network robustness to some extent (Quayle et al. 2006). Table 3 show the values of such parameters for selected networks, which is a sample from a larger table in the Supplementary Materials section.
Table 3.

Topological parameters of selected networks, where the network name reflects the data combination (C1 or C2), cancer type, and whether the network is a cancer or normal network.

NetworkNumber of vertices, NNumber of edges, nGiant component size, GAssortativity coefficient, r
C1 stomach cancer887586414−0.1499
C1 stomach normal848501366−0.1843
C1 colon cancer992709502−0.1325
C1 colon normal1069843569−0.1197
C1 pancreas cancer8985853910.0584
C1 pancreas normal8594522860.0106

Preferential perturbation of cancer and normal network pairs

Vertex ranking methods

We study the preferential perturbation of cancer and normal networks using the vertex ranking methods described in Section 2.3. We are primarily interested in removing combinations of only a few vertices (proteins) to give novel target combinations, which corresponds to a small fraction of the total number of vertices in a given network. Successive vertex removal is applied until a network is fully fragmented and all vertices are removed, which provides useful additional network characterization. We refer to the variation of the perturbation score with the number of vertices removed as a “perturbation profile”. Figure 2 shows preferential perturbation profiles averaged across the fourteen cancer and normal network pairs based on the number of vertices removed. Notably a preferential perturbation profile generally increases up to a maximum, and if more vertices are removed beyond this point the score decreases, as the normal network becomes more fragmented.
Figure 2.

Preferential perturbation averaged over fourteen cancer and normal network pairs, for alternative vertex ranking methods and for initial vertex removals (small N) (a), and large N up to maximum (b).

As predicted from results developed in (Quayle et al. 2006), Figure 2 shows that no single vertex ranking method is universally optimal, but rather the best method depends on the regime of interest and the definition or metric of method success. Alternative metrics of success are discussed further below, but often the extent of perturbation or perturbation score may be the most suitable metric. According to this metric, for small N the perturbation scores method is the most effective, and betweenness-based methods are generally more effective than degree-based methods. For ER and BA model networks degree-based methods are in fact more effective than betweenness-based methods in this regime. The cancer and normal networks are significantly more clustered than equivalent ER and BA model networks of the same size and average degree, as shown in Table 4, which is partly responsible for the observed difference in effectiveness of these methods. The observed clustering fits results from previous studies which have shown that many biological networks have an inherent modularity and clustering (Ravasz et al. 2002; Girvan and Newman, 2002).
Table 4.

Clustering coefficients of example cancer and normal networks, compared to equivalent ER and BA model networks with the same N and k.

Cancer networkER modelBA modelObserved
C2 stomach cancer0.00190.00190.0869
C2 stomach normal0.00170.00180.0789
C2 colon cancer0.00170.00200.0858
C2 colon normal0.00160.00260.0924
C2 pancreas cancer0.00170.00190.0901
C2 pancreas normal0.00200.00210.1017
C2 prostate cancer0.00140.00720.0953
C2 prostate normal0.00140.00770.0946
C2 breast cancer0.00150.00470.0935
C2 breast normal0.00160.00540.0885
C2 lung cancer0.00160.00270.0905
C2 lung normal0.00180.00560.0921
C2 brain cancer0.00140.00560.0940
C2 brain normal0.00130.00400.0861
The success of vertex ranking methods relates closely to whether or not they provide an accurate measure of the “centrality” of vertices in a network. As discussed, degree and betweenness are alternative measures of vertex centrality, and for most network topologies betweenness-based methods are more effective (Holme et al. 2002; Quayle et al. 2006). This is not surprising, since betweenness is a global measure (requires knowledge of network structure) whereas degree is only a local measure (detailed network structure is not needed to measure the degree of a given vertex). The perturbation scores method is a relatively poor measure of vertex centrality, since this method effectively forces at least some perturbation for each vertex removal, and hence tends to target more peripheral vertices. Therefore, although this method on average gives the greatest preferential perturbation score for small N, this may not be the most effective predictor of useful targets. Alternative metrics for analyzing method effectiveness are the score at the maximum in a profile, and the average perturbation gradient, which is given by, where max represents the values of these parameters at the maximum in a given profile (see (Quayle et al. 2006) for the motivation behind these parameters). When searching for novel target combinations, we focus on combinations of up to 5 targets, in other words the regime for small N, below the maximum. The current static topological model gives a conservative estimate of the fragmentation of a real network, since it cannot take into account dynamical effects such as possible cascades and subtle dependencies between interactions, which may be observed in the real biological network. Therefore, alternative metrics based on the maximum in a profile may be more realistic, or at least equally effective predictors of the effect of perturbations on a network, rather than simply the score at a given N value. Tables 5 and 6 show values for these metrics averaged across the fourteen network pairs for each vertex ranking method. Vertex betweenness is significantly the most effective method according to both metrics, which cannot be seen in the averaged profiles shown in Figure 2 since maxima occur at different N values in different networks. These results show a similar order of method effectiveness as predicted (Quayle et al. 2006), and suggest that the betweenness method may provide the most useful target predictions.
Table 5.

Average maximum preferential perturbation score for alternative vertex ranking methods and the standard deviation.

Ranking methodScoreσ
Betweenness0.55890.0952
Diffierence in betweenness0.39100.1619
Degree0.31330.1079
Diffierence in degree0.24120.0756
Perturbation scores0.36910.1464
Table 6.

Average perturbation gradient for alternative vertex ranking methods and the standard deviation.

Ranking methodGradientσ
Betweenness9.95062.5663
Diffierence in betweenness3.94632.4213
Degree3.21551.4297
Diffierence in degree0.78790.6310
Perturbation scores2.76342.2960
Target results for prostate cancer networks for combinations of up to 5 targets are shown in Table 7, for which we are testing the predictions in our own prostate cancer research group. Although betweenness is the most effective method according to the two metrics as shown in Tables 5 and 6, the first few vertex removals using this method actually give a negative perturbation score for this particular network pair (i.e. the normal network is more fragmented than the cancer network). On the other hand, diffierence in betweenness generates a comparable score to that generated using the perturbation scores method.
Table 7.

Successive targets obtained using alternative ranking methods and the associated perturbation score for the preferential perturbation of prostate networks of data combination type C1.

Verticesbetweennessdiff in betweennessdegreedifference in degreeperturbation scores

removedtargetscoretargetscoretargetscoretargetscoretargetscore
1PXN−0.0026SERPINA30.0019PXN−0.0027SERPINA30.0019APP0.0117
2VCL−0.0011APP0.0136VCL−0.0011SLC9A3R10.0038CDC420.0187
3SMAD2−0.0006SLC9A3R10.0155SMAD2−0.0006HIST2H2BE0.0067MAPK30.0216
4TLN1−0.0020DSP0.0165TLN1−0.0020DSP0.0077CDKN1A0.0246
5EGFR−0.0026HIST2H2BE0.0193SMAD3−0.0036CBX40.0086PBX20.0275
The results shown in Figure 2 and Tables 5 to 7 give strong evidence that degree-based methods are less useful than betweenness-based methods. Furthermore, betweenness and difference in betweenness are more effective measures of vertex centrality than degree and difference in degree respectively. What is less clear from the above analysis is exactly which metric of method success is most appropriate, given the current model. Therefore we are initially investigating highly ranked targets from the betweenness, diffierence in betweenness and perturbation scores in experimental testing.

Correlation tests

Targets with a high betweenness in a given cancer network also tend to have a high betweenness in the underlying network from which the cancer network was derived. This has the result that highly ranked targets according to the betweenness method for a given cancer type are often highly ranked targets in another cancer type, due to the correlation of both cancer networks with the underlying network. The difference in betweenness ranking method selects targets which are less strongly correlated with the topology of the underlying network, and thus the highly ranked targets tend to be more specific to a given cancer type. It is possible that both approaches may predict useful novel targets, since some known cancer targets are highly specific to a given cancer type (Fukazawa et al. 2004), while others are known to provide a therapeutic response in many cancer types (Shelton et al. 2005). To analyze the strength of these correlations between the underlying network and the cancer and normal networks, and between different cancer types, we used correlation tests as described in detail in Section 2.5.1. Table 8 shows the vertex betweenness correlation between the underlying networks and cancer networks. Typical correlation coefficients are within the approximate range 0.8 < r < 0.9, indicating a significant correlation in this case. Correlations between vertex degree are slightly stronger between a cancer network and the underlying network from which it was derived, but weaker between a cancer network and a different underlying network (see Supplementary Materials). This is an interesting result which shows that betweenness is less sensitive to the addition or removal of links (between different underlying networks) than degree, and that vertex betweenness captures the overall or core topology of a network more effectively than vertex degree. This inference is consistent with the fact that betweenness provides a better measure of vertex centrality than degree. Since the current underlying networks are only subnetworks of the true or complete interaction network, this result indicates that such a subnetwork can provide a useful representation of the “core” topology of the true interaction network.
Table 8.

Linear regression correlation coefficients for initial vertex betweenness between the underlying networks (C1 and C2) and cancer networks.

C1C2
C1 stomach0.82310.8080
C1 colon0.83970.8484
C1 pancreas0.68510.6043
C1 prostate0.88790.8274
C1 breast0.86480.8467
C1 lung0.79630.7515
C1 brain0.89660.8545

C2 stomach0.76250.8539
C2 colon0.78710.8441
C2 pancreas0.63490.7356
C2 prostate0.82110.8871
C2 breast0.80470.8701
C2 lung0.76260.8246
C2 brain0.82450.8854
It is expected that the observed correlations with initial network topology extend to the vertices (targets) selected in perturbations in different cancer types, since perturbation ranking methods are based on betweenness and degree. We investigated the vertex correlations between perturbation results for different cancer types using Spearman’s rank correlation coefficient, as described in detail in Section 2.5.2. Correlation coefficients were calculated between each cancer network or cancer and normal network pair, for each vertex ranking method. The Supplementary Materials contains detailed results for the betweenness and difference in betweenness ranking methods, and Table 9 shows the average rank correlation coefficient for each vertex ranking method.
Table 9.

Average Spearman’s rank correlation coefficient between perturbation results for each vertex ranking method, averaged across all cancer and normal network pairs.

Ranking method<rs>
betweenness0.6019
diffierence in betweenness0.1784
degree0.6261
diffierence in degree0.3189
perturbation scores0.1809
The average correlation coefficients for the betweenness and degree ranking methods are significantly higher than those from other methods, since these methods do not take into account the relevant normal network topology. Therefore a target with a high rank in the perturbation results using the betweenness method in a given cancer type, is likely to also have a high rank in the results for another cancer type, if it is present in the network (recall the correlations are only based on common targets). There is less correlation in the perturbation results using other ranking methods, and the diffierence in betweenness ranking method gives the least correlation, reflecting the sensitivity of this method to topological differences. Correlations between perturbation results were also compared between the underlying network and the cancer networks, giving similar results for different ranking methods.

Target associations

The outcome of the current work is a set of ranked targets for a given cancer type, which we are testing for their potential as targets for cancer therapy. An approach for assessing the likelihood that these targets are useful, potentially novel cancer targets is to validate the target ranks against known cancer targets. If our methods preferentially select for known cancer targets, this indicates that highly ranked novel targets have a greater likelihood of being useful cancer targets. We applied a global statistical analysis on the ranks of targets with known cancer associations, disease associations and drug associations, using the association data described in Section 2.4. The null hypothesis is that targets with a specified association type are selected randomly from a given network, and the alternative hypothesis is that such targets have a greater chance of selection by our methods, or in other words are more highly ranked than at random. The analysis generates a p-value using hypergeometric probability distributions for each cancer and normal network pair for a given association type and ranking method. These p-values are then combined using the Z-transform test (Whitlock, 2005) to give an overall p-value for each ranking method, for a given association type. The combined p-value therefore measures the ability of a given ranking method to preferentially select for cancer targets, for example. The resulting p-values are given in Table 10, which shows that according to the null hypothesis the betweenness, degree and perturbation scoring methods all strongly select for cancer and disease associated targets. This seems to validate the effectiveness of these methods, but the null hypothesis is naive, since it assumes independence between tests and does not account for possible biases or network correlations as calculated in Section 3.3.2. As discussed, we expect some biases in the underlying interaction data for known cancer targets, since these targets are known to be therapeutically interesting. These biases will tend to increase the degree and betweenness of targets with associations in the underlying networks and cancer and normal networks. We therefore calculate the average degree and betweenness of targets with associations in these networks, and compare them with the average degree and betweenness of all targets in the networks. Tables 11 and 12 show ratio values for the average betweenness for different association types.
Table 10.

Combined p-values for alternative ranking methods and target associations. Methods: be-betweenness, db-difference in betweenness, de-degree, dd-diffierence in degree, ps-perturbation scores.

MethodCancerDiseaseDrug
be2.47 × 10−9< 1 × 10−150.0002
db0.00100.00080.7168
de1.92 × 10−61.73 × 10−140.0080
dd0.72600.51510.7800
ps1.45 × 10−51.38 × 10−130.0006
Table 11.

Ratio of average betweenness for targets with associations over the average betweenness of all targets in the underlying data source combinations.

Data source combinationCancer associationsDisease associationsDrug associations
C13.47342.44972.2374
C23.43562.32182.1695
Table 12.

Ratio of average betweenness for targets with associations over the average betweenness of all targets in different cancer networks for data combination C1.

Cancer typeCancer associationsDisease associationsDrug associations
stomach3.70413.15123.5090
colon1.51891.70631.4188
pancreas2.17912.84053.0614
prostate2.65492.16533.5936
breast2.30211.75871.8214
lung1.99762.54781.7136
brain3.57843.11941.9577
The ratio values are consistently greater than 1, which shows that targets with associations have significantly greater betweenness than an average target, both in the cancer networks and the underlying networks. The average value of the equivalent ratios for the average degree are typically around 1.5 (see Supplementary Materials), which is less than the ratios for average betweenness. The betweenness distributions of these networks have a larger spread or standard deviation than that of equivalent degree distributions, which in part explains this difference. Since targets with associations tend to have a higher than average betweenness and degree in the underlying curated networks, much of the significance observed in Table 10 is due to the influence of the underlying network, rather than the cancer specific networks. This result is biologically reasonable, and is potentially very useful for the discovery of cancer targets with therapeutic applications in many cancer types. For example, the well-characterized EGFR target is highly ranked from our methods in many cancer types, and is known to be implicated in many cancer types including prostate cancer (Shelton et al. 2005), where it is the fifth highest ranked target using the betweenness ranking method. Many other key cancer targets are highly-ranked using our methods in different cancer types, such as SMAD3, MAPK3, RAF1 and TP53, amongst others. In order to quantify the bias in these curated datasets to some extent, we ran a similar analysis using networks generated only from high-throughput data. All methods of network generation and analysis are as described previously except the underlying network is now the complete set of interactions predicted by Lehner & Fraser (Lehner and Fraser, 2004). This high throughput data is biased, but the types of bias are different to those expected in curated data, and therefore such an analysis is useful (Mrowka et al. 2001; Serebriiskii et al. 2000). Tables 13 and 14 show equivalent results for networks generated from high-throughput data to the results in Tables 11 and 12 for curated interaction data (see Supplementary Materials for degree ratios).
Table 13.

Ratio of average betweenness for targets with associations over the average betweenness of all targets in the high-throughput dataset.

Cancer associationsDisease associationsDrug associations
1.72381.15381.8700
Table 14.

Ratio of average betweenness for targets with associations over the average betweenness of all targets in different cancer networks generated from the high-throughput dataset.

Cancer typeCancer associationsDisease associationsDrug associations
stomach1.08411.29932.5160
colon1.05731.53342.2316
pancreas1.26210.87912.3329
prostate0.95201.10331.8761
breast1.25421.20981.9720
lung1.31490.99441.9793
brain1.33371.21942.4215
The betweenness ratios are much lower for networks generated from high-throughput data than from curated data, which indicates that much of the significance is likely to be due to biases in the data, and gives a clear demonstration of bias in curated interaction data sources. Interestingly, the betweenness and degree ratios are still significant for targets with drug associations, which shows how drugs have typically been developed for relatively “central” targets with high degree and betweenness. We also ran perturbations of networks generated from high-throughput data, and applied the same analysis as described for curated data sources to generate Table 10. Equivalent results from high-throughput data sources are shown in Table 15. In this case, none of the association types are significantly selected, with the exception of drug associations, which are selected by the betweenness and degree methods. These results are consistent with the differences between network properties of targets with associations for curated and high-throughput data as shown in Tables 11 to 14.
Table 15.

Combined p-values for alternative ranking methods and target associations from perturbations of networks generated from high-throughput data. Methods: be-betweenness, db-difference in betweenness, de-degree, dd-diffierence in degree, ps-perturbation scores.

MethodCancerDiseaseDrug
be0.87780.15212:12 × 10−5
db0.93530.79460.9955
dd0.38050.53170.9453
de0.67020.06944.82 × 10−6
ps0.98670.94730.9937
The differences in results for curated and high-throughput interaction data sources highlight some of the current problems with interaction data, where curated sources contain significant biases towards well studied targets of interest, and high-throughput sources contain a high percentage of false positives. As more high-quality interaction data becomes available, a network-level approach will become increasingly important for target and drug discovery. Given the current data limitations, however, our predictions for novel target combinations have returned many targets known to be important in the relevant cancer types, and we are pursuing these predictions for their applicability in cancer therapy. We have also run perturbations where only targets with known drug associations are selected so that the results can be tested using readily available drug combinations. Many of the highest ranked targets with drug associations are also known cancer targets.

Summary

In summary, we have developed and studied a novel method for predicting target and drug combinations based on network topology in multiple cancer types. By simulating network fragmentation from targeting multiple proteins in a given cancer type, such a network-level approach facilitates a search for novel target combinations. Our methods significantly select for cancer associated targets using curated interaction data sources, and return many targets of interest in cancer therapy. When using predicted H. sapiens interaction data generated from high-throughput model organism data sets, the methods do not significantly select for known cancer associations. The difference in results between networks generated from curated data sources and high-throughput data sources reveals significant bias in curated data towards targets of interest, with known associations. We have predicted sets of target and drug combinations in seven different cancer types, which are currently in experimental testing for prostate cancer using both drug studies and siRNA techniques. Target lists for stomach cancer and networks of data combination C1. Target lists for colon cancer and networks of data combination C1. Target lists for pancreas cancer and networks of data combination C1. Target lists for prostate cancer and networks of data combination C1. Target lists for breast cancer and networks of data combination C1. Target lists for lung cancer and networks of data combination C1. Target lists for brain cancer and networks of data combination C1. Network properties. Network similarity for every pair of cancer networks. Vertex set similarity for every pair of cancer networks. Linear regression correlation coefficients for initial vertex degree between underlying networks and corresponding cancer networks. Comparison of perturbation results (target rankings) using Spearman’s rank correlation coefficient based on the betweenness method. Only the common targets are compared, and the ranks are relative to this target set. Comparison of perturbation results (target rankings) using Spearman’s rank correlation coefficient based on the difference in betweenness method. Only the common targets are compared, and the ranks are relative to this target set. Ratio of average degree for targets with associations over the average degree of all targets in the underlying data source combinations. Ratio of average degree for targets with associations over the average degree of all targets in different cancer networks generated from the curated only data source combination. Ratio of average degree for targets with associations over the average degree of all targets in the high-throughput dataset. Ratio of average degree for targets with associations over the average degree of all targets in different cancer networks generated from the high throughput dataset.
Table 1.

Target lists for stomach cancer and networks of data combination C1.

Vertices removedbetweennessdifference in betweennessdegreedifference in degreeperturbation scores
1PXNG22P1PXNPXNGAPD
2VCLVCLVCLVCLCTBP1
3EGFRCOL1A1EGFRCOL1A1CTSB
4SMAD3EGFRRNF11APEX1COL1A1
5COL1A1PXNSMAD3COL1A2MIF
6PIN1SMAD3RAF1KRT18RAF1
7CTNNB1UBE2IJUNG22P1EGFR
8UBE2ISFRS1COL1A1HSPH1C1QBP
9MAPK3HSPA1ACDC42CTSBG22P1
10HSPA1AAPEX1HDAC2HSPD1APEX1
Table 2.

Target lists for colon cancer and networks of data combination C1.

Vertices removedbetweennessdifference in betweennessdegreedifference in degreeperturbation scores
1VCLSMAD2PXNSMAD2SMAD2
2PXNGNB2L1VCLGNB2L1BCAP31
3TLN1MYCTLN1FUSVCL
4UBE2IFUSSMAD2MYCPXN
5PCNAHLA-CPLK1HLA-CCALR
6PIN1RPS20FNBP3VCLMAPK1
7SMAD2MAPK1CDC42SRRM1FUS
8GNB2L1IDEPCNASACM1LAPP
9SMARCA4SNRPCAPPIGF2PCNA
10PLK1RAC1SP1NPM1YWHAZ
Table 3.

Target lists for pancreas cancer and networks of data combination C1.

Vertices removedbetweennessdifference in betweennessdegreedifference in degreeperturbation scores
1JUNERBB2FNBP3RELAERBB2
2HDAC3JUNRELASMAD3LRP1
3SMAD3SMAD3ITGB1COL1A1NCL
4ERBB2RELASMAD3THBS1RELA
5UBE2INCLAPPERBB2MMP2
6CSNK2A1THBS1JUNFOSSMAD3
7GNB2L1APPHDAC3EP300FNBP3
8ZNF265UBE2ICOL1A1COL1A2GSN
9APPYY1YY1LRP1ACTG1
10YY1HDAC3ERBB2NCLCTSB
Table 4.

Target lists for prostate cancer and networks of data combination C1.

Vertices removedbetweennessdifference in betweennessdegreedifference in degreeperturbation scores
1PXNSERPINA3PXNSERPINA3APP
2VCLAPPVCLSLC9A3R1CDC42
3SMAD2SLC9A3R1SMAD2HIST2H2BEMAPK3
4TLN1DSPTLN1DSPCDKN1A
5EGFRHIST2H2BESMAD3CBX4PBX2
6SMAD3YWHAZRNF11KRT18HSPH1
7CREBBPIQGAP2EGFRCOPS3COL4A1
8SRCMAPK3CREBBPCCNB1HIST2H2BE
9HDAC3COPS3RAF1TOMM20VDAC1
10RAF1DCNEP300RPS3KA1SHC1
Table 5.

Target lists for breast cancer and networks of data combination C1.

Vertices removedbetweennessdifference in betweennessdegreedifference in degreeperturbation scores
1VCLSMAD2PXNSMAD2SMAD2
2PXNCOL1A1VCLCOL1A1EP300
3TLN1ERBB2TLN1COL1A2MPHOSPH6
4SMAD3COL1A2SMAD2LMO4NFKB1
5SMAD2CSKSMAD3ERBB2PTMA
6ERBB2RNF11FNBP3A2MNFKBIB
7PAK1CCT3CREBBPCSKPBX2
8FNBP3LGALS1ITGB1EXOSC4PXN
9XPO1YY1RNF11PBX2UBE2I
10PTMARELBRAF1SLC9A3R1VCL
Table 6.

Target lists for lung cancer and networks of data combination C1.

Vertices removedbetweennessdifference in betweennessdegreedifference in degreeperturbation scores
1VCLYWHAHVCLPCNABECN1
2SMAD3PCNASMAD3LRP1NFKB2
3JUNLRP1PCNAA2MCTNNB1
4CSNK2A2DDX3XRNF11HLA-DRAYWHAH
5FNBP3CSNK2A2FNBP3CUL1HLA-C
6RNF11AKT1NFKBIAHLA-CLRP1
7HSF1CD63JUNCOL1A2CD63
8HDAC3B2MAPPYWHAHAUP1
9AKT1HSPA5MPHOSPH6TFRCCSNK2A2
10A2MSPARCHDAC3CEBPAAKT1
Table 7.

Target lists for brain cancer and networks of data combination C1.

Vertices removedbetweennessdifference in betweennessdegreedifference in degreeperturbation scores
1VCLVCLVCLITGB1HLA-C
2TP53ITGB1TP53HSPA1AITGB1
3TLN1RAF1TLN1COL1A1RAF1
4SMAD3TP53SMAD3RAF1SMAD3
5CTNNB1PIK3R1RNF11VCLHD
6PIK3R1HSPA1AHDCOL1A2RAN
7RAF1SMAD3CTNNB1NFKBIAHSPA1A
8RNF11KITAPPXPO1PTK2B
9HSF1TLN1ITGB1TGFBR2ITGA5
10HDCOL1A1E2F4RANLSM1
Table 8.

Network properties.

NetworkNumber of vertices, NNumber of edges, nGiant component size, GAssortativity coefficient, rAverage degree, kAverage shortest path, lClustering coefficient CModularity, Q
C1 stomach cancer887586414−0.14991.32135.82500.03090.7666
C1 stomach normal848501366−0.18431.18165.71180.02710.7620
C1 colon cancer992709502−0.13251.42945.55040.02850.7332
C1 colon normal1069843569−0.11971.57715.61760.03430.7477
C1 pancreas cancer8985853910.05841.30286.86270.03350.7837
C1 pancreas normal8594522860.01061.05236.76590.02990.7806
C1 prostate cancer16271462926−0.09921.79715.82790.04680.7166
C1 prostate normal16541517954−0.09601.83435.80970.04650.7137
C1 breast cancer1243976620−0.12501.57035.60870.04640.7062
C1 breast normal12931100690−0.11681.70145.45270.04210.7072
C1 lung cancer1085685450−0.10231.26266.20260.03740.7633
C1 lung normal1120811574−0.09551.44826.29180.03450.7698
C1 brain cancer15221360830−0.09621.78715.53380.04270.6749
C1 brain normal14571190766−0.09101.63345.81380.03920.7042
C2 stomach cancer115812436410.09612.14685.67230.08690.7324
C2 stomach normal112910645800.13001.88485.74950.07890.7507
C2 colon cancer130014397200.03792.21385.48630.08580.7145
C2 colon normal139215658070.02272.24855.57660.09240.7109
C2 pancreas cancer113710726010.36021.88566.67530.09010.7829
C2 pancreas normal111912215570.60852.18236.43200.10170.7649
C2 prostate cancer2141310714570.16872.90235.64410.09530.6868
C2 prostate normal2168323014880.18882.97975.64570.09460.6794
C2 breast cancer165120669800.13302.50275.59490.09350.6677
C2 breast normal1665216110080.14192.59575.46450.08850.6649
C2 lung cancer140815937720.27172.26276.19700.09050.7628
C2 lung normal145818659010.26372.55825.99250.09210.7318
C2 brain cancer2016274613270.18092.72425.69880.09400.6672
C2 brain normal1939242012220.10622.49615.88290.08610.7127
Table 9.

Network similarity for every pair of cancer networks.

NetworkC1 stomachC1 colonC1 pancreasC1 prostateC1 breastC1 lungC1 brainC2 stomachC2 colonC2 pancreasC2 prostateC2 breastC2 lungC2 brain
C1 stomach1.00000.21480.15710.22200.22900.18120.19240.47140.12750.10610.11200.12330.09830.1040
C1 colon1.00000.18170.23210.21660.20070.19180.49270.12580.12000.12120.11260.1067
C1 pancreas1.00000.17910.15970.22820.19110.54570.09200.08830.12150.1033
C1 prostate1.00000.30860.20890.28450.47060.19470.13820.1744
C1 breast1.00000.19750.21920.47240.11940.1272
C1 lung1.00000.21580.43000.1183
C1 brain1.00000.4953

C2 stomach0.47140.13290.09530.15950.15090.11250.13721.00000.24280.16980.22990.26200.21610.2161
C2 colon0.49270.10900.16410.14180.12320.13501.00000.18560.24450.22810.23200.2219
C2 pancreas0.54570.13990.11730.15520.14721.00000.18350.16610.21030.1712
C2 prostate0.47060.16390.10840.16271.00000.33700.25870.3284
C2 breast0.47240.11060.13971.00000.23950.2727
C2 lung0.43000.14021.00000.2580
C2 brain0.49531.0000
Table 10.

Vertex set similarity for every pair of cancer networks.

NetworkC1 stomachC1 colonC1 pancreasC1 prostateC1 breastC1 lungC1 brainC2 stomachC2 colonC2 pancreasC2 prostateC2 breastC2 lungC2 brain
C1 stomach1.00000.34410.31640.35310.38670.35630.34580.76600.28190.26900.27660.30560.29150.2710
C1 colon1.00000.35390.37480.34400.39300.35160.76310.30210.29520.27620.32300.2778
C1 pancreas1.00000.35390.32160.38770.32970.78980.27740.25690.31620.2593
C1 prostate1.00000.45760.41540.48330.75990.37900.35550.3921
C1 breast1.00000.38320.41430.75290.32150.3307
C1 lung1.00000.38820.77060.3073
C1 brain1.00000.7550

C2 stomach0.76600.28820.26370.30810.32870.30030.30031.00000.32940.30770.34380.36430.34490.3325
C2 colon0.76310.28990.32260.29020.32570.30171.00000.34340.36220.33590.37950.3366
C2 pancreas0.78980.31370.28030.33210.29141.00000.34010.30650.36610.3127
C2 prostate0.75990.36290.32760.38911.00000.45450.40170.4783
C2 breast0.75290.30850.34281.00000.36440.4082
C2 lung0.77060.33121.00000.3779
C2 brain0.75501.0000
Table 11.

Linear regression correlation coefficients for initial vertex degree between underlying networks and corresponding cancer networks.

C1C2
C1 stomach0.84940.6805
C1 colon0.83660.6852
C1 pancreas0.82150.6810
C1 prostate0.90790.6832
C1 breast0.89480.6991
C1 lung0.87790.6619
C1 brain0.91570.7050
C2 stomach0.60800.8842
C2 colon0.65030.8725
C2 pancreas0.57180.8582
C2 prostate0.64420.9254
C2 breast0.63040.9077
C2 lung0.54140.8895
C2 brain0.65160.9233
Table 12.

Comparison of perturbation results (target rankings) using Spearman’s rank correlation coefficient based on the betweenness method. Only the common targets are compared, and the ranks are relative to this target set.

NetworkC1 stomachC1 colonC1 pancreasC1 prostateC1 breastC1 lungC1 brainC2 stomachC2 colonC2 pancreasC2 prostateC2 breastC2 lungC2 brain
C1 stomach1.00000.81630.70570.57390.79760.77210.65020.74130.59920.56020.46650.43890.48110.4766
C1 colon1.00000.77170.57800.84430.74950.62910.78790.54680.45210.46020.41320.4196
C1 pancreas1.00000.63500.73230.70550.56700.79200.52380.44230.45250.4471
C1 prostate1.00000.61410.61280.78090.80300.67050.65320.5749
C1 breast1.00000.71710.61780.55970.48470.4340
C1 lung1.00000.64120.59730.5432
C1 brain1.00000.7489

C2 stomach0.74130.50650.41590.45220.55900.62480.48741.00000.66970.69100.59820.50280.57010.5574
C2 colon0.78790.65040.50700.60820.70690.49281.00000.66400.58670.61530.61170.5969
C2 pancreas0.79200.52600.59480.56190.49851.00000.66390.55070.54700.5877
C2 prostate0.80300.46660.51370.60661.00000.72890.69940.7280
C2 breast0.55970.48730.56531.00000.71150.6674
C2 lung0.59730.62241.00000.6877
C2 brain0.74891.0000
Table 13.

Comparison of perturbation results (target rankings) using Spearman’s rank correlation coefficient based on the difference in betweenness method. Only the common targets are compared, and the ranks are relative to this target set.

NetworkC1 stomachC1 colonC1 pancreasC1 prostateC1 breastC1 lungC1 brainC2 stomachC2 colonC2 pancreasC2 prostateC2 breastC2 lungC2 brain
C1 stomach1.00000.14590.2992−0.00610.21700.37570.27680.59610.11630.1800−0.00620.06320.37860.1459
C1 colon1.00000.09660.20420.36640.42100.11410.63740.09730.08950.18480.2884−0.0890
C1 pancreas1.00000.18260.16970.44320.17260.6436−0.0030−0.06520.44530.0826
C1 prostate1.00000.31360.31410.12600.40120.22570.27280.0600
C1 breast1.00000.25170.02440.58500.1557–0.0561
C1 lung1.00000.19190.61860.1034
C1 brain1.00000.6197

C2 stomach0.59610.11580.1911–0.01950.19550.29020.30441.00000.09310.12990.01040.10240.36710.1901
C2 colon0.63740.09890.19040.30640.35210.07621.00000.11970.15590.18840.32890.0596
C2 pancreas0.64360.14590.11490.43590.18831.00000.07680.05760.43070.1453
C2 prostate0.40120.09500.19340.10071.00000.16270.28090.1058
C2 breast0.58500.26810.17311.00000.30310.1051
C2 lung0.61860.16551.00000.1340
C2 brain0.61971.0000
Table 14.

Ratio of average degree for targets with associations over the average degree of all targets in the underlying data source combinations.

Data source combinationCancer associationsDisease associationsDrug associations
C12.27151.77771.6333
C21.90411.43781.3457
Table 15.

Ratio of average degree for targets with associations over the average degree of all targets in different cancer networks generated from the curated only data source combination.

Cancer typeCancer associationsDisease associationsDrug associations
C1 stomach2.03402.00672.4387
C1 colon1.43331.38891.3356
C1 pancreas1.88391.99361.9390
C1 prostate1.78601.76282.4421
C1 breast1.63181.56011.2491
C1 lung1.67191.99921.5839
C1 brain2.10832.17771.7906
C2 stomach1.48321.35561.6414
C2 colon0.99181.07280.9576
C2 pancreas1.57661.41421.3132
C2 prostate1.37321.22761.6521
C2 breast1.34671.10520.9614
C2 lung1.23331.23840.9642
C2 brain1.48021.49961.5017
Table 16.

Ratio of average degree for targets with associations over the average degree of all targets in the high-throughput dataset.

Cancer associationsDisease associationsDrug associations
1.22551.27932.1912
Table 17.

Ratio of average degree for targets with associations over the average degree of all targets in different cancer networks generated from the high throughput dataset.

Cancer typeCancer associationsDisease associationsDrug associations
stomach0.95201.00981.8037
colon0.99511.37802.2372
pancreas1.02530.88982.9431
prostate1.02671.31932.0682
breast1.15241.12671.8870
lung1.20701.28252.7589
brain1.25871.12932.4459
  42 in total

1.  Emergence of scaling in random networks

Authors: 
Journal:  Science       Date:  1999-10-15       Impact factor: 47.728

2.  Metabolic network structure determines key aspects of functionality and regulation.

Authors:  Jörg Stelling; Steffen Klamt; Katja Bettenbrock; Stefan Schuster; Ernst Dieter Gilles
Journal:  Nature       Date:  2002-11-14       Impact factor: 49.962

3.  Modular organization of cellular networks.

Authors:  Alexander W Rives; Timothy Galitski
Journal:  Proc Natl Acad Sci U S A       Date:  2003-01-21       Impact factor: 11.205

Review 4.  Modeling transcriptional regulatory networks.

Authors:  Hamid Bolouri; Eric H Davidson
Journal:  Bioessays       Date:  2002-12       Impact factor: 4.345

5.  BIND: the Biomolecular Interaction Network Database.

Authors:  Gary D Bader; Doron Betel; Christopher W V Hogue
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

6.  Hierarchical organization of modularity in metabolic networks.

Authors:  E Ravasz; A L Somera; D A Mongru; Z N Oltvai; A L Barabási
Journal:  Science       Date:  2002-08-30       Impact factor: 47.728

7.  Reconciling gene expression data with known genome-scale regulatory network structures.

Authors:  Markus J Herrgård; Markus W Covert; Bernhard Ø Palsson
Journal:  Genome Res       Date:  2003-10-14       Impact factor: 9.043

8.  Combining probability from independent tests: the weighted Z-method is superior to Fisher's approach.

Authors:  M C Whitlock
Journal:  J Evol Biol       Date:  2005-09       Impact factor: 2.411

Review 9.  Selective protection of normal proliferating cells against the toxic effects of chemotherapeutic agents.

Authors:  Khandan Keyomarsi; Arthur B Pardee
Journal:  Prog Cell Cycle Res       Date:  2003

10.  Development of human protein reference database as an initial platform for approaching systems biology in humans.

Authors:  Suraj Peri; J Daniel Navarro; Ramars Amanchy; Troels Z Kristiansen; Chandra Kiran Jonnalagadda; Vineeth Surendranath; Vidya Niranjan; Babylakshmi Muthusamy; T K B Gandhi; Mads Gronborg; Nieves Ibarrola; Nandan Deshpande; K Shanker; H N Shivashankar; B P Rashmi; M A Ramya; Zhixing Zhao; K N Chandrika; N Padma; H C Harsha; A J Yatish; M P Kavitha; Minal Menezes; Dipanwita Roy Choudhury; Shubha Suresh; Neelanjana Ghosh; R Saravana; Sreenath Chandran; Subhalakshmi Krishna; Mary Joy; Sanjeev K Anand; V Madavan; Ansamma Joseph; Guang W Wong; William P Schiemann; Stefan N Constantinescu; Lily Huang; Roya Khosravi-Far; Hanno Steen; Muneesh Tewari; Saghi Ghaffari; Gerard C Blobe; Chi V Dang; Joe G N Garcia; Jonathan Pevsner; Ole N Jensen; Peter Roepstorff; Krishna S Deshpande; Arul M Chinnaiyan; Ada Hamosh; Aravinda Chakravarti; Akhilesh Pandey
Journal:  Genome Res       Date:  2003-10       Impact factor: 9.043

View more
  2 in total

Review 1.  Rapid learning for precision oncology.

Authors:  Jeff Shrager; Jay M Tenenbaum
Journal:  Nat Rev Clin Oncol       Date:  2014-01-21       Impact factor: 66.675

2.  Leveraging models of cell regulation and GWAS data in integrative network-based association studies.

Authors:  Andrea Califano; Atul J Butte; Stephen Friend; Trey Ideker; Eric Schadt
Journal:  Nat Genet       Date:  2012-07-27       Impact factor: 38.330

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.