Hanga C Galfalvy, Loubna Erraji-Benchekroun, Peggy Smyrniotopoulos, Paul Pavlidis, Steven P Ellis, J John Mann, Etienne Sibille, Victoria Arango.
Abstract
BACKGROUND: Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods.
Year: 2003 PMID: 12962547 PMCID: PMC212256 DOI: 10.1186/1471-2105-4-37
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. MAS5.0, MBEI and RMA signal variability. The variability in signal intensity measurement obtained with three different probe-level data extraction methods is represented by the lowess curves of the coefficient of variation. The X-axis represents increasing signal intensities, as measured by the percentage of arrays on which a gene is detected as present (% present calls). Presence calls were obtained with MBEI (BA9, n = 39; BA47, n = 36). Note that the curves for the two brain areas are very close to each other for all three methods.
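As a rough illustration of the two quantities plotted in Figure 1, the coefficient of variation (standard deviation divided by mean) and the percentage of present calls can each be computed per probe set across arrays. The sketch below is a minimal example under stated assumptions: the function names and toy values are hypothetical and not taken from the study, the sample standard deviation (n - 1) is assumed, present calls are assumed to be coded as "P", and the lowess smoothing step is not shown.

```python
from statistics import mean, stdev


def coefficient_of_variation(signals):
    """Return sample stdev / mean for one probe set's signals across arrays."""
    m = mean(signals)
    if m == 0:
        raise ValueError("mean signal is zero; CV is undefined")
    return stdev(signals) / m


def percent_present(calls):
    """Percentage of arrays on which a probe set has a 'P' (present) call."""
    return 100.0 * sum(1 for c in calls if c == "P") / len(calls)


# Hypothetical toy data for a single probe set on four arrays.
toy_signals = [100.0, 110.0, 90.0, 105.0]
toy_calls = ["P", "P", "A", "P"]

cv = coefficient_of_variation(toy_signals)
pp = percent_present(toy_calls)
print(f"CV = {cv:.3f}, present calls = {pp:.0f}%")
```

Plotting the CV of every probe set against its percentage of present calls, and then smoothing, would give a curve of the kind the caption describes.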
MAS, MBEI and RMA detection sensitivity for Y-chromosome-linked genes.
| probe set | BA9 MAS5 | BA9 MBEI | BA9 RMA | BA47 MAS5 | BA47 MBEI | BA47 RMA |
| 211149_at | ||||||
| 207246_at | ||||||
| 204409_s_at | ||||||
| 207063_at | ||||||
| 205001_s_at | ||||||
| 206624_at | ||||||
| 206700_s_at | ||||||
| 214983_at | ||||||
| 205000_at | ||||||
| 201909_at | ||||||
| 204410_at | ||||||
"X" denotes p-values that passed the Benjamini-Hochberg false discovery rate screen (based on log2 values). None of the other Y-chromosome-linked probesets were significantly different between males and females.
Male-Female differentially expressed genes.
| FC | M/F FC | |||||||||
| 201909_ | BC010286 | ribosomal protein S4, Y-linked | RPS4Y | Yp11.3 | protein biosynthesis | 8.9 | 10.8 | >1700 | ||
| 204409_s_ | BC005248 | eukaryotic translation initiation factor 1A | EIF1AY | Yq11.2 | translation initiation | 1.9 | 2.3 | >72 | ||
| 204410_ | AF000987 | eukaryotic translation initiation factor 1A | EIF1AY | Yq11.2 | translation initiation | 1.6 | 1.7 | >72 | ||
| 205000_ | AF000984 | DEAD/H (Asp-Glu-Ala-Asp/His)box | DBY | Yq11 | RNA helicase | 6.4 | 8.5 | >96000 | ||
| 205001_s_ | AF000985 | DEAD/H (Asp-Glu-Ala-Asp/His)box | DBY | Yq11 | RNA helicase | 1.9 | 1.9 | >96000 | ||
| 206624_ | Y13618 | ubiquitin specific protease 9 | USP9Y | Yq11.2 | deubiquitylation | 4.7 | 5.1 | |||
| 206700_s_ | U52191 | SMC (mouse) homolog | SMCY | Yq11 | transcription factor | 2.9 | 3.3 | >5000 | ||
| 207063_ | AF119903 | hypothetical protein PRO2834 | | Yq11.2 | unknown | 1.3 | 1.5 | |||
| 207246_ | M30607 | zinc finger protein | ZFY | Yp11.3 | transcription regulation | 1.3 | 1.2 | |||
| 214983_ | AL080135 | hypothetical protein DFKZp434I143 | | Y | unknown | 1.9 | 2.1 | >100,000 | ||
| 207703_ | AB023168 | KIAA0951 protein | | Xp22.3 | unknown | 2.5 | 2.7 | |||
| 214218_s_ | AV699347 | XIST | XIST | Xq13.2 | X-gene inactivation | -6.4 | -7.8 | <-3400 | ||
| 221728_x_ | AK025198 | XIST | XIST | Xq13.2 | X-gene inactivation | -12 | -15 | <-3400 | ||
| FC | ||||||||||
| 211149_ | AF000994 | ubiquitously transcribed tetratricopeptide rep | UTY | Yq11 | protein-protein interaction | 1.1 | 1.2 | 0.0148 | >67,000 | |
| 210292_s_ | AF332218 | protocadherin 11 X-linked | PCDH11X | Xq21.3 | cell-cell recognition | 1.5 | 1.8 | 0.0012 | ||
| 211937_ | NM_001417 | eukaryotic translation initiation factor 4B | EIF4B | 12q12 | translation regulation | 1.2 | 1.1 | 0.0916 | ||
| 219737_s_ | AI524125 | protocadherin 9 | PCDH9 | 13q | cell-cell recognition | 1.3 | 1.0 | 0.4902 | ||
Bold denotes p-values that passed the Benjamini-Hochberg false discovery rate screen. M/F FC, male versus female fold change (see Methods); Sybr-PCR, real-time PCR.
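The Benjamini-Hochberg screen referred to in the table notes can be sketched as a generic step-up procedure. This is an illustration, not the authors' code: the function name, the level q = 0.05 and the example p-values are assumptions chosen for the example.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return booleans marking which p-values pass the BH screen at level q.

    Step-up rule: find the largest rank k such that p_(k) <= (k / m) * q,
    then accept the k smallest p-values.
    """
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            k_max = rank
    passed = [False] * m
    for idx in order[:k_max]:
        passed[idx] = True
    return passed


# Hypothetical p-values, one per probe set (q = 0.05 is an assumed level).
example_p = [0.041, 0.001, 0.216, 0.008, 0.039]
flags = benjamini_hochberg(example_p, q=0.05)
print(flags)
```

Because the rule is step-up, a p-value that misses its own threshold can still pass when a larger p-value further down the ranking meets its threshold.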
Figure 2. Y-chromosome-linked probesets: male-female expression comparisons. RMA-based averaged values (± STDEV) are displayed. A) Probesets with significant differences in expression levels for male and female samples in BA9 and/or BA47. All male-female comparisons were statistically significant with the exception of #11 in BA9 and #12 and #13 in BA47 (see also Table 2). Probesets are organized according to the order of Y-linked genes in Table 2. B) Selected Y-linked probesets without sex differences. All these genes were detected as "absent" by MAS5.0 or MBEI. Signal levels represent background estimates. Probesets are: 1, 201909_at; 2, 204409_s_at; 3, 204410_at; 4, 205000_at; 5, 205001_s_at; 6, 206624_at; 7, 206700_s_at; 8, 207063_at; 9, 207246_at; 10, 214983_at; 11, 211149_at; 12, 208067_x_at; 13, 211227_s_at; 14, 214983_at; 15, 217261_at; 16, 217162_at; 17, 221179_at; 18, 211461_at; 19, 209596_at; 20, 210322_x_at; 21, 216376_x_at; 22, 216922_x_at; 23, 211462_s_at; 24, 207909_x_at; 25, 207918_s_at; 26, 207912_s_at.
Figure 3. Distribution of the t-test p-values for sex differences or random group labels. Distribution of the p-values from the t-tests comparing males and females. These p-values are slightly lower than expected under a uniform distribution, consistent with a mixture of p-values from differentially expressed and unaffected genes.
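One simple way to compare an observed set of p-values with the uniform distribution expected when no genes differ (for example, under random group labels) is to bin them and look at the excess in the lowest bin. The sketch below assumes ten equal-width bins; the function names and the example p-values are hypothetical and are not data from the study.

```python
def pvalue_histogram(pvalues, n_bins=10):
    """Count p-values in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for p in pvalues:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        # p == 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


def excess_in_first_bin(pvalues, n_bins=10):
    """Observed minus expected count in the lowest bin under uniformity."""
    counts = pvalue_histogram(pvalues, n_bins)
    expected = len(pvalues) / n_bins
    return counts[0] - expected


# Hypothetical p-values: mostly spread out, with a few very small ones.
example_p = [0.01, 0.02, 0.03, 0.15, 0.35, 0.55, 0.75, 0.95, 0.45, 0.65]
counts = pvalue_histogram(example_p)
excess = excess_in_first_bin(example_p)
print(counts, excess)
```

A positive excess in the lowest bin is what a mixture of unaffected genes (roughly uniform p-values) and differentially expressed genes (small p-values) would be expected to produce.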
Microarray quality control parameters.
| Mean ± SEM | 3.12 ± 0.13 | 1.12 ± 0.05 | 40.76 ± 3.41 | 52.4 ± 0.5 | 1.14 ± 0.03 | 0.96 ± 0.02 | |
| Max | 4.97 | 2.14 | 94.17 | 57.1 | 1.91 | 1.28 | |
| Min | 1.95 | 0.71 | 16.04 | 45.4 | 0.87 | 0.83 | |
| Mean ± SEM | 2.82 ± 0.11 | 1.21 ± 0.07 | 34.95 ± 2.69 | 53.5 ± 0.6 | 1.45 ± 0.09 | 1.10 ± 0.04 | |
| Max | 4.49 | 2.32 | 81.07 | 58.7 | 3.26 | 1.61 | |
| Min | 1.76 | 0.57 | 13.65 | 44.9 | 0.95 | 0.82 | |
Quality control parameters for brain sample microarrays from Brodmann areas 9 (BA9, 39 arrays) and 47 (BA47, 36 arrays). Values were derived from MAS5.0 array analysis.