Anvar Suyundikov, John R Stevens, Christopher Corcoran, Jennifer Herrick, Roger K Wolff, Martha L Slattery.
Abstract
BACKGROUND: Most currently used normalization methods for miRNA array data are based on methods developed for mRNA arrays despite fundamental differences between the data characteristics. The application of conventional quantile normalization can mask important expression differences by ignoring demographic and environmental factors. We present a generalization of the conventional quantile normalization method, making use of available subject-level covariates in a colorectal cancer study.
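For readers unfamiliar with the baseline, the following is a minimal sketch of conventional quantile normalization only, assuming a features-by-samples NumPy matrix. The function name `quantile_normalize` and the arbitrary tie handling are illustrative assumptions; the covariate-weighted generalization presented in the paper is not reproduced here.

```python
import numpy as np


def quantile_normalize(x):
    """Conventional quantile normalization (illustrative sketch).

    x: 2-D array, rows = features (e.g. miRNAs), columns = samples.
    Returns an array of the same shape in which every column shares
    the same empirical distribution.
    """
    x = np.asarray(x, dtype=float)
    # Sort each sample (column) independently.
    sorted_cols = np.sort(x, axis=0)
    # Reference distribution: mean across samples at each rank.
    reference = sorted_cols.mean(axis=1)
    # 0-based rank of each value within its column.
    # Ties are broken arbitrarily here (an assumption of this sketch).
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    # Replace each value by the reference value at its rank.
    return reference[ranks]
```

Because every sample is forced onto one common reference distribution regardless of subject characteristics, this baseline has no mechanism for using covariates, which is the limitation the abstract points to.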
Year: 2015 PMID: 26653287 PMCID: PMC4675058 DOI: 10.1186/s12864-015-2199-4
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Non-normalized and quantile normalized miRNA expressions of tumor samples from non-risk and risk group subjects
Summaries of continuous covariates in real CRC data
| Covariate | Mean | SD |
|---|---|---|
| Age at diagnosis or selection (years) | 64.1 | 9.8 |
| Average num. cigarettes per day | 12.5 | 14.7 |
| Calories (kcal) | 2504.7 | 1199.3 |
| BMI | 27.6 | 5.4 |
| Lutein + zeaxanthin (mcg) | 3119.3 | 2542.3 |
| Vitamin D (mcg) | 6.7 | 5.0 |
| Lycopene (mcg) | 8850.5 | 8195.1 |
Summaries of binary or discrete covariates in real CRC data
| Covariate | Summary |
|---|---|
| Gender | 54 % male, 46 % female |
| Recent aspirin/NSAID use | 64 % no, 36 % yes |
| Recent smoker | 83 % no, 17 % yes |
| (among women) menopause | 12 % pre, 88 % post |
| (among post-menopausal women) taking HRT within 2 years | 30 % yes, 70 % no |
| Data collection center | 79 % Kaiser, 21 % Utah |
| Race | 81.6 % White, 8.5 % Hispanic, 7.6 % Black, 2.1 % other |
| Smoking status | 13 % current, 45 % former, 42 % never |
| Long-term alcohol consumption | 38 % none, 35 % moderate, 27 % high |
| SEER summary stage | 1 % in situ, 34 % localized, 52 % regional, 12 % distant, 1 % unknown |
| AJCC severity stage | 1 % 0 (in situ), 26 % 1, 31 % 2, 30 % 3, 12 % 4 (distant) |
| Colon or rectal cancer | 76 % colon, 24 % rectal |
Fig. 2 TPR and FDR for sample sizes of 200 and 400 for conventional and weighted quantile normalizations
Fig. 3 TPR and FDR for sample sizes of 200 and 400 for the quantile normalization and the weighted quantile normalization while accounting for unrelated covariates
Fig. 4 Scatter plot of adjusted p-values of the CRC miRNA data, normalized by the quantile and the weighted quantile normalization methods (in log-scale). The green and red circles represent the miRNAs that are found significant only in the horizontal and vertical analyses, respectively.