Catharina Olsen, Kathleen Fleming, Niall Prendergast, Renee Rubio, Frank Emmert-Streib, Gianluca Bontempi, John Quackenbush, Benjamin Haibe-Kains.
Abstract
Quantitative validation of gene regulatory networks (GRNs) inferred from observational expression data is a difficult task that usually involves time-intensive and costly laboratory experiments. We were able to show that gene knock-down experiments can be used to quantitatively assess the quality of large-scale GRNs via a purely data-driven approach (Olsen et al. 2014). Our new validation framework also enables the statistical comparison of multiple network inference techniques, which was a long-standing challenge in the field. In this Data in Brief we detail the contents and quality controls for the gene expression data (available from the NCBI Gene Expression Omnibus repository under accession number GSE53091) associated with our study published in Genomics (Olsen et al. 2014). We also provide R code to access the data and reproduce the analysis presented in this article.
Keywords: Colon cancer; Gene expression; Knock-down; Microarray; shRNA
Year: 2015 PMID: 26484195 PMCID: PMC4535466 DOI: 10.1016/j.gdata.2015.03.011
Source DB: PubMed Journal: Genom Data ISSN: 2213-5960
Fig. 1. Quality controls for the Affymetrix raw data generated in [5]. The CEL file name for each experiment is provided on the left side, followed by the percentage of present and absent calls (in red) following the Affymetrix guidelines. The blue region in the middle of the plot represents the 3-fold region for the scale factor, which is considered acceptable according to Affymetrix guidelines; any scale factor outside this region is drawn in red, as it is considered an indicator of poor quality. Beta-actin and GAPDH 3′–5′ ratios are represented on the right side by triangles and circles, respectively; ratios higher than 1.25 are drawn in red, as they are considered indicators of poor quality.
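The two thresholds named in the Fig. 1 caption can be illustrated with a minimal Python sketch. It is not the authors' R code. The function name `flag_arrays`, the input layout, and all example values are hypothetical. The sketch also assumes that the 3-fold region is centered on the mean log2 scale factor across arrays, which the caption does not state.

```python
import math

def flag_arrays(arrays, fold=3.0, ratio_cutoff=1.25):
    """Return {name: [reasons]} for arrays failing either check.

    arrays maps a CEL file name to a dict with the keys
    'scale_factor', 'actin_ratio' and 'gapdh_ratio' (hypothetical layout).
    """
    logs = [math.log2(a["scale_factor"]) for a in arrays.values()]
    center = sum(logs) / len(logs)       # assumed center of the blue region
    half_width = math.log2(fold) / 2     # region spans `fold` in total
    flagged = {}
    for name, a in arrays.items():
        reasons = []
        if abs(math.log2(a["scale_factor"]) - center) > half_width:
            reasons.append("scale factor")
        if a["actin_ratio"] > ratio_cutoff:
            reasons.append("beta-actin ratio")
        if a["gapdh_ratio"] > ratio_cutoff:
            reasons.append("GAPDH ratio")
        if reasons:
            flagged[name] = reasons
    return flagged

# Made-up values for illustration only; they are not taken from GSE53091.
example = {
    "a.CEL": {"scale_factor": 1.0, "actin_ratio": 1.0, "gapdh_ratio": 0.9},
    "b.CEL": {"scale_factor": 1.1, "actin_ratio": 1.1, "gapdh_ratio": 1.0},
    "c.CEL": {"scale_factor": 1.2, "actin_ratio": 0.9, "gapdh_ratio": 1.1},
    "d.CEL": {"scale_factor": 4.0, "actin_ratio": 1.0, "gapdh_ratio": 1.4},
}
flagged = flag_arrays(example)
print(flagged)
```

Here only `d.CEL` is flagged, once for its scale factor and once for its GAPDH ratio.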
Fig. 2. Call percentage for each CEL file. The colors correspond to the biological replicate number. The quality of the first two replicates is lower than for the remaining five replicates.
For each biological replicate, the date of data generation is specified. There are three main batches: 2008 (biological replicate 1), 2009 (biological replicate 2) and 2011 (biological replicates 3–7).
| Biological replicate | 2008-12-16 | 2008-12-17 | 2009-07-15 | 2011-07-19 | 2011-07-20 |
|---|---|---|---|---|---|
| 1 | 11 | 10 | 0 | 0 | 0 |
| 2 | 0 | 0 | 22 | 0 | 0 |
| 3 | 0 | 0 | 0 | 15 | 5 |
| 4 | 0 | 0 | 0 | 19 | 1 |
| 5 | 0 | 0 | 0 | 17 | 3 |
| 6 | 0 | 0 | 0 | 18 | 2 |
| 7 | 0 | 0 | 0 | 2 | 0 |
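The grouping into three batches can be checked directly from the table. The Python sketch below copies the non-zero counts as printed above and groups replicates by the year of the date column. It assumes that each cell counts the arrays generated on that date. The helper name `batches_by_year` is hypothetical.

```python
from collections import defaultdict

# Non-zero cells of the table above: replicate -> {date: count}.
counts = {
    1: {"2008-12-16": 11, "2008-12-17": 10},
    2: {"2009-07-15": 22},
    3: {"2011-07-19": 15, "2011-07-20": 5},
    4: {"2011-07-19": 19, "2011-07-20": 1},
    5: {"2011-07-19": 17, "2011-07-20": 3},
    6: {"2011-07-19": 18, "2011-07-20": 2},
    7: {"2011-07-19": 2},
}

def batches_by_year(counts):
    """Map each year to the sorted list of replicates with data in that year."""
    batches = defaultdict(set)
    for replicate, per_date in counts.items():
        for date, n in per_date.items():
            if n > 0:
                batches[date[:4]].add(replicate)  # the year is the first 4 characters
    return {year: sorted(reps) for year, reps in batches.items()}

batches = batches_by_year(counts)
totals = {rep: sum(per_date.values()) for rep, per_date in counts.items()}
print(batches)
print(totals)
```

Each replicate falls in exactly one year, which is consistent with the three batches described in the table caption.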
Fig. 3. Each plot shows the difference of expression for the eight core genes. The knocked-down gene is highlighted in light blue. The significance level is indicated by ‘-’ for p < 0.1, ‘*’ for p < 0.05, ‘**’ for p < 0.01 and ‘***’ for p < 0.001 using a Wilcoxon signed rank test.
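The significance markers used in Fig. 3 amount to a simple threshold lookup, sketched below in Python. The function name `significance_marker` is hypothetical. The p-value is assumed to have been computed beforehand with a Wilcoxon signed rank test, which is not reproduced here.

```python
def significance_marker(p):
    """Return the Fig. 3 marker for a p-value, or '' when p >= 0.1."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    if p < 0.1:
        return "-"
    return ""

# Illustrative p-values only; they are not results from the study.
markers = [significance_marker(p) for p in (0.0005, 0.004, 0.03, 0.08, 0.5)]
print(markers)
```

The thresholds are tested from the strictest to the most lenient, so each p-value receives the strongest marker it qualifies for.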
| Specifications | |
|---|---|
| Organism/cell line/tissue | |
| Sex | |
| Sequencer or array type | |
| Data format | |
| Experimental factors | |
| Experimental features | |
| Consent | |
| Sample source location | |