B Dutta, L Pusztai, Y Qi, F André, V Lazar, G Bianchini, N Ueno, R Agarwal, B Wang, C Y Shiang, G N Hortobagyi, G B Mills, W F Symmans, G Balázsi.
Abstract
BACKGROUND: The rapid collection of diverse genome-scale data raises the urgent need to integrate and utilise these resources for biological discovery or biomedical applications. For example, diverse transcriptomic and gene copy number variation data are currently collected for various cancers, but relatively few current methods are capable of utilising the emerging information.
Year: 2012 PMID: 22343619 PMCID: PMC3304402 DOI: 10.1038/bjc.2011.584
Source DB: PubMed Journal: Br J Cancer ISSN: 0007-0920 Impact factor: 7.640
Figure 1. Overview of data analysis strategy and driver-network identification. (A) Clinical subtype-specific seed gene selection: genes that were frequently amplified and showed strong correlation between amplification and mRNA expression were selected as seed genes. (B) Driver-network construction: starting with a seed gene, all of its nearest neighbours in an a priori assembled signalling, PPI and TF network space are considered for inclusion provided that at least one of two criteria (differential expression and co-expression) is satisfied (see Materials and Methods). Once the first neighbours of the seed gene are included, their neighbours are similarly tested in an iterative process to further expand the driver-network. We used extensive re-sampling, where a subset of the patients/cell lines was randomly selected to construct the driver-network samples. Genes that appeared in at least 50% of the driver-network samples were included in the final driver-network.
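The construction in panel (B) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the `make_criteria` callback (which stands in for the differential expression and co-expression tests evaluated on a patient subset), the number of re-samples and the subset fraction are all assumptions. Only the iterative neighbour expansion and the 50% inclusion threshold come from the caption.

```python
import random
from collections import Counter, deque


def expand_network(seed, neighbours, passes_criteria):
    """Grow a network outwards from a seed gene, breadth first.

    neighbours: dict mapping a gene to its neighbours in the prior
        signalling/PPI/TF network.
    passes_criteria: hypothetical callable (candidate, parent) -> bool that
        stands in for the differential expression / co-expression tests.
    """
    included = {seed}
    queue = deque([seed])
    while queue:
        gene = queue.popleft()
        for candidate in neighbours.get(gene, ()):
            if candidate not in included and passes_criteria(candidate, gene):
                included.add(candidate)
                queue.append(candidate)  # its neighbours are tested next
    return included


def consensus_network(seed, neighbours, samples, make_criteria,
                      n_resamples=100, fraction=0.8, min_support=0.5,
                      rng=None):
    """Keep genes present in at least `min_support` of re-sampled networks.

    make_criteria: hypothetical callable taking a sample subset and returning
        a passes_criteria function evaluated on that subset only.
    n_resamples and fraction are assumed values, not taken from the source.
    """
    rng = rng or random.Random(0)
    subset_size = max(1, int(len(samples) * fraction))
    counts = Counter()
    for _ in range(n_resamples):
        subset = rng.sample(samples, subset_size)
        counts.update(expand_network(seed, neighbours, make_criteria(subset)))
    return {g for g, c in counts.items() if c / n_resamples >= min_support}
```

With a toy network `{"S": ["A", "B"], "A": ["C"]}` and a criterion that rejects only `"B"`, the expansion reaches `"C"` through `"A"` while `"B"` is never added.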
Figure 2. Breast cancer subtype-specific (ER+: A and D, HER2+: B and E, TNBC: C and F) driver-networks in two separate data sets (Chin et al. (2009): A–C, Andre et al. (2006): D–F). The size of each node is proportional to the differential expression level of the corresponding gene. Yellow and blue nodes represent upregulation and downregulation, respectively, from gene expression data. The shapes indicate the type of genomic change, squares representing the seed genes with copy number alterations, circles representing differential expression without copy number alteration, hexagons representing both copy number and mRNA expression changes, and triangles representing inclusion based on differential co-expression without differential expression. Downregulated genes were included in the network either as seed genes (square nodes) or based on differential positive or negative co-expression (triangle nodes), as only significant overexpression was used for network expansion. The width and colour of an edge connecting two nodes reflect the magnitude and sign of the correlation (red: positive; green: negative) between two genes within the driver-network. An arrow pointing from one member gene to another indicates a transcriptional or signalling relationship, whereas lines represent PPIs. Genes common between the two driver-networks have blue font colour.
Reproducibility of the driver-networks from independent data sets and from seed gene- and expression randomisations (mean±s.d.)
| | ER+ | HER2+ | TN |
|---|---|---|---|
| Normalised overlap of driver-networks | 0.49 | 0.20 | 0.30 |
| Normalised overlap from seed randomisation (P-value) | 0.36±0.06 (0.01) | 0.16±0.04 (0.08) | 0.18±0.06 (0.01) |
| Normalised overlap from expression randomisation (P-value) | 0.12±0.04 (0.00) | 0.12±0.03 (0.00) | 0.09±0.03 (0.00) |
Abbreviations: ER=oestrogen receptor; HER2=human epidermal growth factor receptor 2; TN=triple receptor-negative.
Statistical significance of the reproducibility (P-value) was estimated by comparing the observed normalised overlap with the distribution of overlaps obtained from randomisation. The greatest and least normalised overlap of the driver-networks between the two data sets was observed for the ER+ and HER2+ subtypes, respectively.
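The comparison of an observed overlap with a randomisation distribution can be illustrated as an empirical P-value. The sketch below rests on two assumptions: the overlap normalisation is not defined in this record, so a Jaccard index is used purely as a stand-in, and the P-value is taken as the fraction of randomised overlaps at least as large as the observed one. Both function names are hypothetical.

```python
def jaccard_overlap(genes_a, genes_b):
    """Stand-in normalised overlap (Jaccard index) of two gene sets.

    Assumption: the source does not state how the overlap is normalised,
    so this is only one plausible choice.
    """
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        raise ValueError("both gene sets are empty")
    return len(a & b) / len(union)


def empirical_p_value(observed, randomised_overlaps):
    """Fraction of randomised overlaps that reach or exceed the observed one."""
    if not randomised_overlaps:
        raise ValueError("need at least one randomised overlap")
    at_least = sum(1 for value in randomised_overlaps if value >= observed)
    return at_least / len(randomised_overlaps)
```

Under this convention, a reported P-value of 0.00 would mean that none of the randomised networks overlapped as strongly as the real ones, at the resolution set by the number of randomisations.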
Figure 3. Functional relevance of the triple-negative breast cancer (TNBC) driver-networks. ‘Driver-network samples’ correspond to 30 genes randomly selected from the TNBC driver-networks, and ‘differential expression samples’ correspond to 10 genes differentially expressed in all the data sets. Each of these genes was knocked down with four different siRNAs in eight TNBC and four non-TNBC cell lines. The viability score represents the average over knockdowns of all the genes in all the cell lines (see the Materials and Methods and the text for details). Functional validation showed that genes from the driver-networks had a subtype-specific effect on viability. The driver-network members were also functionally more important than the strongest candidates from differential gene expression analysis.
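As a rough illustration of the averaging described in the caption, the following sketch computes a mean viability over all genes and siRNAs for a chosen group of cell lines. The data layout (a dictionary keyed by gene, siRNA and cell line) and the function name are assumptions made for the example. The actual scoring procedure is described in the paper's Materials and Methods and may differ.

```python
from statistics import mean


def viability_score(measurements, cell_lines):
    """Average viability over all genes and siRNAs in the given cell lines.

    measurements: dict mapping (gene, sirna, cell_line) -> viability value
        (hypothetical layout chosen for this sketch).
    cell_lines: iterable of cell-line names to include, e.g. the TNBC lines.
    """
    wanted = set(cell_lines)
    values = [value for (gene, sirna, line), value in measurements.items()
              if line in wanted]
    if not values:
        raise ValueError("no measurements for the requested cell lines")
    return mean(values)
```

Calling the function once with the TNBC lines and once with the non-TNBC lines gives the two group averages whose difference would indicate a subtype-specific effect.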