Douglas W Mahoney, Terry M Therneau, S Keith Anderson, Jin Jen, Jean-Pierre A Kocher, Monica M Reinholz, Edith A Perez, Jeanette E Eckel-Passow.
Abstract
BACKGROUND: Formalin-fixed, paraffin-embedded (FFPE) tissues are most commonly used for routine pathology analysis and for long-term tissue preservation in the clinical setting. Many institutions have large archives of FFPE tissues that provide a unique opportunity for understanding genomic signatures of disease. However, genome-wide expression profiling of FFPE samples has been challenging due to RNA degradation. Because of the significant heterogeneity in tissue quality, normalization and analysis of these data present particular challenges. The distribution of intensity values from archival tissues is inherently noisy and skewed due to differential sample degradation, raising two primary concerns: whether a highly skewed array will unduly influence initial normalization of the data, and whether outlier arrays can be reliably identified.
Year: 2013 PMID: 23360712 PMCID: PMC3626608 DOI: 10.1186/1756-0500-6-33
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Figure 1: Plot of NUSE vs. median array expression. The circled points represent arrays that were considered to be of poor quality by Stress/dfArray, and the horizontal line represents the cutoff suggested for NUSE. The bead-level standard deviation information was available from one plate of the experiment (n=96).
Figure 2: Arrays from the breast cancer case study were grouped into four quadrants according to their interquartile range (IQR = Q3-Q1) and skewness (skew = (Q2-Q1)/IQR; symmetric distributions will have skew=0.5). Quadrant R1 consists of arrays with IQR>2 and skew>0.2. Quadrant R2 consists of arrays with IQR>2 and skew≤0.2. Quadrant R3 consists of arrays with IQR≤2 and skew>0.2. Quadrant R4 consists of arrays with IQR≤2 and skew≤0.2. The percentage of the 1618 arrays that falls into each quadrant is summarized along the top of the figure. Panel (A) depicts pre-normalized intensity values; Panel (B) depicts the RLE metric.
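The quadrant grouping in the Figure 2 caption can be illustrated with a minimal Python sketch. The formulas and cutoffs (IQR = Q3-Q1, skew = (Q2-Q1)/IQR, thresholds of 2 and 0.2) are taken from the caption; the function names `iqr_and_skew` and `quadrant`, the use of NumPy, and its default percentile interpolation are assumptions made for illustration and are not the authors' implementation. The sketch also assumes each array has a nonzero IQR.

```python
import numpy as np

def iqr_and_skew(intensities):
    """Return (IQR, skew) for one array's intensity values.

    IQR = Q3 - Q1 and skew = (Q2 - Q1) / IQR, as defined in the caption.
    Assumes IQR > 0; a constant array would divide by zero.
    """
    q1, q2, q3 = np.percentile(intensities, [25, 50, 75])
    iqr = q3 - q1
    skew = (q2 - q1) / iqr
    return iqr, skew

def quadrant(iqr, skew, iqr_cut=2.0, skew_cut=0.2):
    """Map an (IQR, skew) pair to the quadrant labels R1..R4 from the caption."""
    if iqr > iqr_cut:
        return "R1" if skew > skew_cut else "R2"
    return "R3" if skew > skew_cut else "R4"

# Hypothetical usage: a symmetric toy array has skew 0.5.
iqr, skew = iqr_and_skew(np.arange(9.0))
label = quadrant(iqr, skew)
```

Applied per array (one column of the intensity matrix at a time), the labels can then be tabulated to obtain the percentage of arrays in each quadrant.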
Quality assessment strategies for Formalin Fixed Paraffin-Embedded tissues analyzed with Illumina’s DASL assay
| Normalize Data | |
| Calculate | Calculate |
| (Plot | |
| Stage 1: Remove arrays with | Stage 1: Remove arrays with |
| Renormalize data after removing bad arrays | Renormalize data after removing bad arrays |
| Calculate | Calculate |
| Stage 2: Investigate arrays with | Stage 2: Remove arrays with |
| Final normalization after removing all outlying arrays | |
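The staged workflow in the table above (normalize, score, remove, renormalize, score again, remove, final normalization) can be sketched as follows. The table as preserved here does not retain the quality metrics or cutoffs used at each stage, so everything specific in this sketch is an assumption: quantile normalization stands in for the normalization step, the per-array IQR of the relative log expression (each value minus its feature median) stands in for the quality metric, and `stage1_cut` and `stage2_cut` are hypothetical thresholds. The sketch removes flagged arrays at both stages, whereas one column of the table only investigates them at stage 2.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize a features x arrays matrix (assumed stand-in method)."""
    order = np.argsort(x, axis=0)
    ranks = np.argsort(order, axis=0)               # rank of each value within its array
    mean_sorted = np.sort(x, axis=0).mean(axis=1)   # reference distribution
    return mean_sorted[ranks]

def rle_iqr(x):
    """Per-array IQR of relative log expression (assumed stand-in metric)."""
    rle = x - np.median(x, axis=1, keepdims=True)   # subtract each feature's median
    q1, q3 = np.percentile(rle, [25, 75], axis=0)
    return q3 - q1

def two_stage_filter(raw, stage1_cut, stage2_cut, metric=rle_iqr):
    """Return indices of retained arrays and the final normalized matrix."""
    keep = np.arange(raw.shape[1])
    for cut in (stage1_cut, stage2_cut):
        normalized = quantile_normalize(raw[:, keep])   # (re)normalize survivors
        keep = keep[metric(normalized) <= cut]          # drop arrays above the cutoff
    return keep, quantile_normalize(raw[:, keep])       # final normalization
```

Renormalizing between stages matters because a badly degraded array shifts the reference distribution that every other array is mapped onto, so scores computed before its removal can be distorted.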
Figure 3: Plot of IQR (A) and skewness (B) of un-normalized data versus Median for all 1618 arrays. Plot of IQR (C) and skewness (D) of un-normalized data versus median dfArray for the 1410 arrays with median Stress < 1.5.
Figure 4: Plot of versus median (A) and 75th percentile absolute value of (B) with reference dashed lines indicating the respective threshold for outlier detection for each method. Plot of median Stress versus 75th percentile absolute value of dfArray (C) with dashed lines indicating the respective threshold for outlier detection. Open circles indicate outliers as identified by lumi Outlier.
Figure 5: The relative increase in feature variance (A) and bias in the estimated feature intensity (B) by excluding any array (light gray line) and excluding the arrays that were considered to be outliers by the (medium gray line). The 25th, 50th, and 75th percentiles of feature intensity for the 1378 arrays are listed along the top of the figure, and the reference lines at 1 (A) and 0 (B) represent the reference samples.