Jeanine S Morey, James C Ryan, Frances M Van Dolah.
Abstract
Quantitative real-time PCR (qPCR) is a commonly used validation tool for confirming gene expression results obtained from microarray analysis; however, microarray and qPCR data often disagree. The current study assesses factors contributing to the correlation between these methods in five separate experiments employing two-color 60-mer oligonucleotide microarrays and qPCR using SYBR green. Overall, significant correlation was observed between microarray and qPCR results (rho=0.708, p<0.0001, n=277) using these platforms. The contribution of factors including up- vs. down-regulation, spot intensity, p-value, fold-change, cycle threshold (Ct), array averaging, tissue type, and tissue preparation was assessed. Filtering of microarray data for measures of quality (fold-change and p-value) proves to be the most critical factor, with significant correlations of rho>0.80 consistently observed when quality scores are applied.
Year: 2006 PMID: 17242735 PMCID: PMC1779618 DOI: 10.1251/bpo126
Source DB: PubMed Journal: Biol Proced Online ISSN: 1480-9222 Impact factor: 3.244
Genes selected for validation by qPCR from microarray analyses (supplemental table).
| Gene | GenBank accession | Amplicon location | Annealing temperature | Data set* |
|---|---|---|---|---|
| 7-dehydrocholesterol reductase | NM_001360 | 127-277 | 58°C | 5 |
| Activator of heat shock 90kDa protein ATPase homolog 1 | NM_012111 | 494-621 | 60°C | 5 |
| Adrenomedullin | NM_001124 | 40-148 | 55°C | 5 |
| Aldolase C, fructose-bisphosphate | NM_005165 | 60-252 | 60°C | 5 |
| Alpha tubulin-like | NM_145042 | 686-848 | 62°C | 5 |
| bHLH-PAS type transcription factor NXF | NM_153553.1 | 388-578 | 58°C | 1 & 2 |
| Calpain 10 | NM_011796 | 844-1626 | 64°C | 1 & 2 |
| Calponin 1 | NM_009922 | 303-771 | 62°C | 1 & 2 |
| CCAAT/enhancer binding protein, beta | NM_009883 | 1174-1305 | 58°C | 1 & 2 |
| CCAAT/enhancer binding protein, delta | NM_007679 | 1278-1434 | 60°C | 3 |
| CD52 antigen | NM_013706 | 52-234 | 62°C | 3 |
| c-FOS | NM_010234 | 36-390 | 62°C | 1, 2, 3, & 4 |
| Chemokine (C-X-C motif) receptor 4 | NM_009911 | 158-288 | 58°C | 4 |
| Cofilin 1 | BB097704 | 55-273 | 58°C | 3 |
| Cold inducible RNA binding protein | NM_007705 | 221-349 | 58°C | 1 & 2 |
| Cyclin G2 | NM_004354 | 124-271 | 60°C | 5 |
| Cytochrome P450, family 1, subfamily a, polypeptide 2 | NM_009993 | 489-609 | 62°C | 4 |
| Cytochrome P450, family 2, subfamily a, polypeptide 4 | NM_009997 | 458-628 | 55°C | 4 |
| Cytochrome P450, family 2, subfamily f, polypeptide 2 | NM_007817 | 236-408 | 55°C | 4 |
| Cytochrome P450, family 51, subfamily a, polypeptide 1 | NM_000786 | 10-172 | 60°C | 5 |
| DNA damage inducible transcript 4 | NM_019058 | 365-499 | 60°C | 5 |
| Dystrobrevin beta | NM_007886 | 1496-1838 | 62°C | 1 & 2 |
| Fatty acid desaturase 1 | NM_013402 | 359-514 | 60°C | 5 |
| FK506 binding protein 11 | NM_024169 | 253-644 | 62°C | 1 & 2 |
| G protein-coupled receptor 114 | NM_153837 | 150-284 | 58°C | 5 |
| G1p2, Isg15 ubiquitin-like modifier | NM_015783 | 107-234 | 58°C | 3 |
| Glucocorticoid induced gene 1 | NM_133218 | 1236-1566 | 55°C | 1 & 2 |
| Glutamate receptor, ionotropic, kainate 5 | NM_008168 | 911-1769 | 62°C | 1 & 2 |
| Glutamine Synthetase | X16314.1 | 309-829 | 62°C | 1 & 2 |
| Granzyme A | NM_010370 | 187-475 | 62°C | 3 |
| Growth arrest and DNA-damage-inducible 45 beta | NM_008655 | 351-506 | 60°C | 1, 2, & 3 |
| Growth arrest and DNA-damage-inducible 45 gamma | NM_011817 | 225-348 | 55°C | 1 & 2 |
| Growth Hormone | NM_008117.2 | 347-447 | 58°C | 1 |
| Intercellular adhesion molecule 2 | NM_000873 | 85-207 | 60°C | 5 |
| Interferon regulatory factor 7 | AK079685 | 703-837 | 60°C | 3 |
| Interleukin 21 receptor | NM_021798 | 213-406 | 58°C | 5 |
| Internexin neuronal intermediate filament protein | NM_010563 | 1022-1281 | 55°C | 1 & 2 |
| Jun dimerization protein 2 | NM_130469 | 44-211 | 60°C | 5 |
| Jun-B | NM_008416 | 994-1214 | 62°C | 1 & 2 |
| KRAB zinc finger protein KR18 | XM_139814 | 108-206 | 60°C | 4 |
| Mitogen-activated protein kinase kinase kinase 6 | NM_016693 | 677-791 | 62°C | 1, 2, & 4 |
| Neural Proliferation, differentiation and control gene 1 | NM_008721 | 200-586 | 62°C | 1 & 2 |
| Neurotransmitter transporter, creatine Slc6a8 | NM_133987 | 2203-2499 | 60°C | 1 & 2 |
| Neurotransmitter transporter, glycine Slc6a9 | NM_008135 | 67-285 | 58°C | 3 |
| Nuclear factor of kappa light chain gene enhancer in B-cells inhibitor, alpha | NM_010907 | 292-467 | 58°C | 1 & 2 |
| Nucleoporin 88kDa | NM_002532 | 65-218 | 60°C | 5 |
| pEL98 protein | D00208 | 231-341 | 60°C | 3 |
| Peroxiredoxin 3 | NM_007452 | 86-488 | 62°C | 1 & 2 |
| Pigpen protein | AK086587 | 529-722 | 60°C | 1 & 2 |
| Placenta-specific 8 | NM_139198 | 88-196 | 58°C | 3 |
| Programmed cell death 10 | NM_019745 | 508-791 | 58°C | 1 & 2 |
| Prostaglandin-endoperoxide synthase 2 | NM_011198 | 1231-1410 | 62°C | 1 & 2 |
| RAB1 | NM_008996 | 81-237 | 58°C | 1 & 2 |
| RAB33A | NM_004794 | 344-510 | 58°C | 5 |
| S100 calcium binding protein A8 | NM_013650 | 245-348 | 55°C | 3 |
| S100 calcium binding protein A9 | NM_009114 | 35-174 | 62°C | 3 |
| Serum/glucocorticoid regulated kinase | NM_011361 | 158-329 | 60°C | 1, 2, & 4 |
| Spindle pole body component 24 homolog | NM_182513 | 381-486 | 60°C | 5 |
| Squalene epoxidase | NM_003129 | 609-785 | 60°C | 5 |
| SRY-box containing gene 2 | NM_011443 | 633-756 | 55°C | 4 |
| Thioredoxin 1 | NM_011660 | 88-335 | 55°C | 1 & 2 |
| Transcription factor 7-like 2 | NM_030756 | 345-496 | 60°C | 5 |
| Transformation related protein 53 inducible nuclear protein 1 | NM_021897 | 236-416 | 58°C | 4 |
| Tubulin, alpha 4 | NM_009447 | 94-310 | 60°C | 1, 2, 3, & 4 |
| Tumor necrosis factor (ligand) superfamily, member 13b | NM_033622 | 661-1056 | 65°C | 1 & 2 |
| Tumor protein p53 inducible nuclear protein 1 | NM_033285 | 1883-2040 | 58°C | 5 |
| UDP-glucuronic acid decarboxylase | AK005536 | 570-706 | 58°C | 4 |
| X-box binding protein 1 | NM_005080 | 610-715 | 60°C | 5 |
Following microarray analyses, genes were chosen based on fold change and p-value for validation by qPCR. GenBank accession numbers are listed. *1: Mouse brain DA TC, 2: Mouse brain DA DR, 3: Mouse blood DA TC, 4: Mouse brain PbTx TC, 5: Human AZA TC. DA: domoic acid. PbTx: brevetoxin. AZA: azaspiracid. TC: time course. DR: dose response.
Correlations of microarray and qPCR data.
| Data set | Genes verified | Spearman's rho | p-value | n |
|---|---|---|---|---|
| Mouse brain DA TC | 28 | 0.686 | <0.0001 | 84 |
| Mouse brain DA DR | 29 | 0.676 | <0.0001 | 58 |
| Mouse blood DA TC | 12 | 0.748 | <0.0001 | 36 |
| Mouse brain PbTx TC | 13 | 0.633 | <0.0001 | 39 |
| Human AZA TC | 20 | 0.727 | <0.0001 | 60 |
| All Data Sets | 68 | 0.708 | <0.0001 | 277 |
Five data sets were analyzed, both individually and combined to form a sixth large data set referred to as "all data." The number of individual genes verified for each data set is shown, as well as the resulting total number of data points used to calculate each correlation (n). All correlations were calculated using Spearman’s Rho. DA, domoic acid; PbTx, brevetoxin; AZA, azaspiracid; TC, time course; DR, dose response.
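The rank correlation reported in the table can be illustrated with a short calculation. The following is a minimal sketch in plain Python, assuming each data point is a pair of log2 fold changes (array, qPCR); the function names and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical paired measurements (log2 fold change) for six data points.
array_log2fc = [1.2, -0.4, 0.8, 2.1, -1.0, 0.3]
qpcr_log2fc = [1.5, 0.1, 0.5, 2.8, -0.6, -0.1]
rho = spearman_rho(array_log2fc, qpcr_log2fc)
print(f"rho = {rho:.3f}")
```

Because the statistic depends only on ranks, it is insensitive to the differing dynamic ranges of the two platforms, which may be one reason a rank-based measure suits this comparison.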
Fig. 1. Analysis of data correlation categorized by direction of regulation, spot intensity, and cycle threshold.
Correlation of microarray and qPCR data as it relates to (A) direction of regulation, (B) Log (spot intensity), and (C) cycle threshold. Spot intensity data were binned by quartiles; because the intensities from each experiment differed slightly, actual intensities are not indicated in the legend. Asterisks indicate a statistically significant correlation of array and qPCR data (p<0.05). The hatched bars represent the compilation of the five individual data sets, referred to as "all data." Statistical differences of correlations, determined by ANOVA, are indicated by different letters. All correlations were calculated using Spearman’s Rho. The number of samples included in each correlation is shown at the base of the bar. ND, no data (insufficient sample size, n≤2); DA, domoic acid; PbTx, brevetoxin; AZA, azaspiracid; TC, time course; DR, dose response.
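Quartile binning of spot intensity can be sketched as follows. This is an illustrative example only: it assumes quartiles are computed separately within each experiment (which the caption suggests but does not state outright), and the function names and intensity values are hypothetical.

```python
from bisect import bisect_left
from statistics import quantiles

def quartile_bins(log_intensity):
    """Return a quartile index (0 = lowest, 3 = highest) for each value."""
    # Three cut points split the data into four groups.
    cuts = quantiles(log_intensity, n=4, method="inclusive")
    # bisect_left counts the cut points strictly below the value.
    return [bisect_left(cuts, v) for v in log_intensity]

def group_by_quartile(log_intensity, array_fc, qpcr_fc):
    """Collect (array, qPCR) pairs by the intensity quartile of each spot."""
    groups = {q: [] for q in range(4)}
    for q, a, p in zip(quartile_bins(log_intensity), array_fc, qpcr_fc):
        groups[q].append((a, p))
    return groups

# Hypothetical log10 spot intensities from a single experiment.
log_intensity = [2.1, 2.4, 2.9, 3.3, 3.6, 3.8, 4.2, 4.5]
print(quartile_bins(log_intensity))
```

A correlation can then be computed within each group, which is why bins with very few data points are reported as ND.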
Fig. 2. Analysis of data correlation categorized by fold change.
Correlation of microarray and qPCR data as it relates to (A) fold change measured by microarray and (B) fold change measured by qPCR. The combined data set of "all data" was queried first by microarray fold change (1.4 fold cut-off) and then by (C) spot intensity and (D) Ct values to determine the overall impact of fold change on array and qPCR data correlations. Asterisks indicate a statistically significant correlation (p<0.05). The hatched bars represent the compilation of the five individual data sets, referred to as "all data." Statistical differences of correlations, determined by ANOVA, are indicated by different letters. All correlations were calculated using Spearman’s Rho. The number of samples included in each correlation is shown at the base of the bar. ND, no data (insufficient sample size, n≤2); DA, domoic acid; PbTx, brevetoxin; AZA, azaspiracid; TC, time course; DR, dose response.
Fig. 3. Analysis of data correlation categorized by p-values from microarray analyses.
(A) Correlation of microarray and qPCR data as it relates to microarray spot p-values. The combined data set of "all data" was analyzed by (B) spot intensity and (C) Ct values to determine the effect of p-values (0.0001 cutoff) on array and qPCR data correlation. P-values are based on calculations including signal strength, background values, spot morphology, fold change, variation between replicates, etc. Asterisks indicate a statistically significant result (p<0.05). The hatched bars represent the compilation of the five individual data sets, referred to as "all data." Statistical differences of correlations, determined by ANOVA, are indicated by different letters. All correlations were calculated using Spearman’s Rho. The number of samples included in each correlation is shown at the base of the bar. ND, no data (insufficient sample size, n≤2); DA, domoic acid; PbTx, brevetoxin; AZA, azaspiracid; TC, time course; DR, dose response.
Fig. 4. Combined effects of array fold change and p-value on data correlation.
Correlation of microarray and qPCR data as it relates to both array fold change and p-value. Analyses of the combined data set of "all data" indicate that fold change has the greatest impact on array and qPCR data correlation. However, array data quality, measured here as a p-value, is essential to predicting reliable data. Asterisks indicate a statistically significant correlation (p<0.05). All correlations were calculated using Spearman’s Rho. The number of samples included in each correlation is shown at the base of the bar.
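The combined filter can be pictured as a two-way classification of each data point. The sketch below assumes array results are stored as expression ratios (treated / control) with a per-spot p-value, and it uses the 1.4-fold and 0.0001 cutoffs quoted in the captions for Figs. 2 and 3; the record layout and function names are hypothetical.

```python
import math
from collections import defaultdict

FC_CUTOFF = 1.4     # fold-change cutoff quoted for Fig. 2
P_CUTOFF = 0.0001   # p-value cutoff quoted for Fig. 3

def quality_class(ratio, p_value):
    """Return (passes_fold_change, passes_p_value) for one array data point."""
    # Working in log space makes 1.4-fold up and 1.4-fold down symmetric.
    passes_fc = abs(math.log2(ratio)) >= math.log2(FC_CUTOFF)
    passes_p = p_value <= P_CUTOFF
    return passes_fc, passes_p

def partition(points):
    """Group (array_ratio, p_value, qpcr_ratio) records by quality class."""
    groups = defaultdict(list)
    for array_ratio, p_value, qpcr_ratio in points:
        key = quality_class(array_ratio, p_value)
        groups[key].append((array_ratio, qpcr_ratio))
    return dict(groups)

# Hypothetical records: (array ratio, array p-value, qPCR ratio).
points = [
    (2.0, 0.00001, 2.6),   # large change, high confidence
    (0.5, 0.01, 0.7),      # large change, low confidence
    (1.2, 0.00001, 1.1),   # small change, high confidence
    (0.9, 0.5, 1.3),       # small change, low confidence
]
groups = partition(points)
```

Each of the four groups can then be correlated separately, so the contribution of fold change and of p-value can be compared directly.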
Fig. 5. Effects of composite array use on array and qPCR data correlation.
Correlation of microarray and qPCR results based on data from individual animals versus the use of weighted averages from composite array data. While minor differences were observed depending on the data set used, neither approach consistently yielded higher correlations. The use of composite arrays does not appear to appreciably influence the observed correlations with qPCR data. Asterisks indicate a statistically significant correlation (p<0.05). All correlations were calculated using Spearman’s Rho. Statistical differences of correlation, determined by Wilcoxon’s test, are indicated by different letters in the "all genes" data set. The number of samples included in each correlation is shown at the base of the bar.
Fig. 6. Correlation of microarray and qPCR data from fresh vs. frozen tissue.
Correlation of microarray and qPCR results based on data from RNA extracted from fresh mouse brains in the DA TC and DR versus RNA extracted from flash-frozen brains in the PbTx TC. While minor differences were observed depending on the data set used, neither tissue preparation consistently yielded higher correlations. The use of frozen tissue, rather than fresh, does not appear to appreciably influence the observed correlations with qPCR data. Asterisks indicate a statistically significant correlation (p<0.05). All correlations were calculated using Spearman’s Rho. Statistical differences of correlation, determined by ANOVA, are indicated by different letters in the "all genes" data set. The number of samples included in each correlation is shown at the base of the bar.
Correlations of genes included in microarray trend analysis versus genes excluded from microarray trend analysis.
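| Category | rho (included) | p-value (included) | n (included) | rho (excluded) | p-value (excluded) | n (excluded) |
|---|---|---|---|---|---|---|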
| All Primers | 0.882 | <0.0001 | 45 | 0.306 | 0.049 | 42 |
| Up-regulated on array | 0.898 | <0.0001 | 33 | 0.048 | 0.8278 | 23 |
| Down-regulated on array | -0.140 | 0.6641 | 12 | -0.243 | 0.3155 | 19 |
| Fold Change ≥ ± 1.4 by array | 0.893 | <0.0001 | 28 | ND | ND | 0 |
| Fold Change < ± 1.4 by array | 0.586 | 0.0134 | 17 | 0.306 | 0.049 | 42 |
| Fold Change ≥ ± 1.4 by qPCR | 0.893 | <0.0001 | 25 | 0.562 | 0.0366 | 14 |
| Fold Change < ± 1.4 by qPCR | 0.537 | 0.0147 | 20 | -0.009 | 0.9629 | 28 |
| Ct > median | 0.870 | <0.0001 | 22 | 0.465 | 0.0339 | 21 |
| Ct < median | 0.895 | <0.0001 | 23 | 0.004 | 0.9866 | 21 |
| Log(Intensity) > median | 0.843 | <0.0001 | 22 | -0.072 | 0.7557 | 21 |
| Log(Intensity) < median | 0.934 | <0.0001 | 23 | 0.443 | 0.0444 | 21 |
| p-value ≤ 0.0001 | 0.910 | <0.0001 | 26 | 0.872 | 0.0539 | 5 |
| p-value > 0.0001 | 0.708 | 0.0007 | 19 | 0.180 | 0.2875 | 37 |
| Up- or down-regulation conserved | 80.00%* | | 36 | 76.19% | | 32 |
For verification of the mouse brain DA time course, 14 genes from the array trend set (≥1.7 fold change in at least one time point and a composite p-value ≤0.0001) were selected for validation by qPCR because they were of biological interest, given their known responses to DA or the degree of change exhibited by microarray. In addition, 13 genes of lesser confidence that were excluded from array trend analysis were verified by qPCR. Both data sets were normalized to tubulin, alpha 4. The effects of direction of regulation, degree of fold change, Ct value, spot intensity, and array p-value on correlation were examined. All correlations were calculated using Spearman’s Rho. *This value is the percentage of genes for which the direction of change was determined to be the same by both array and qPCR analyses. ND: no data (insufficient sample size, n≤2).
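The direction-conservation percentages in the last table row follow from simple counts: 36 of 45 data points for the included genes and 32 of 42 for the excluded genes. The sketch below reproduces that arithmetic and shows one way such a count could be obtained from paired log2 fold changes; the function name, the example values, and the choice to treat a log ratio of exactly zero as "not up-regulated" are assumptions made for illustration.

```python
def direction_concordance(array_log2fc, qpcr_log2fc):
    """Return (count, percent) of pairs whose direction of change agrees."""
    pairs = list(zip(array_log2fc, qpcr_log2fc))
    # A pair agrees when both values are up (> 0) or both are not up.
    same = sum(1 for a, q in pairs if (a > 0) == (q > 0))
    return same, 100 * same / len(pairs)

# Percentages recomputed from the counts reported in the table.
included_pct = round(100 * 36 / 45, 2)   # 80.0
excluded_pct = round(100 * 32 / 42, 2)   # 76.19

# Hypothetical paired log2 fold changes for five data points.
array_log2fc = [0.9, -0.3, 1.4, -1.1, 0.2]
qpcr_log2fc = [1.1, 0.4, 2.0, -0.8, 0.5]
count, percent = direction_concordance(array_log2fc, qpcr_log2fc)
print(count, percent)  # 4 80.0
```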