Daniel Bottomly, Peter A Ryabinin, Jeffrey W Tyner, Bill H Chang, Marc M Loriaux, Brian J Druker, Shannon K McWeeney, Beth Wilmot.
Abstract
BACKGROUND: Patient-specific aberrant expression patterns in conjunction with functional screening assays can guide elucidation of the cancer genome architecture and identification of therapeutic targets. Since most statistical methods for expression analysis are focused on differences between experimental groups, the performance of approaches for patient-specific expression analyses is currently less well characterized. A comparison of methods for the identification of genes that are dysregulated relative to a single sample in a given set of experimental samples, to our knowledge, has not been performed.
Year: 2013 PMID: 24286512 PMCID: PMC3971350 DOI: 10.1186/gm509
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Figure 1. The outlying degree outperforms other methods in both high and low variability simulated datasets. (A) Expression data were simulated from two distributions (a normal distribution with a mean of seven and a standard deviation of one, and a t-distribution with the non-centrality parameter set to seven and fifteen degrees of freedom) that were at the extremes of what would typically be observed in microarray data, with the distribution of hypothetical patient data situated somewhere in the middle. (B, C) The outlying degree (k = 9) significantly outperformed both the Zscore and Rscore methods in terms of power and false discovery for all combinations of effect size and distribution type. However, all the methods were only effective when encountering high effect sizes (four to five) with low variability (normal distribution). The grey areas indicate 0.95 confidence intervals. Note that for the false discovery rate, the estimates were very stable and the grey area is not readily observable. OD, outlying degree.
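The simulation setup in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code. The caption does not define the scores, so the sketch assumes that the Zscore is the sample's deviation from the mean of all samples in units of their standard deviation, and that the outlying degree is the sum of the distances from a sample to its k nearest neighbours among the other samples. The function names and the choice of 12 samples with a five-unit shift in the usage lines are hypothetical.

```python
import random
import statistics


def simulate_normal(n, mean=7.0, sd=1.0, rng=random):
    """Low-variability case: normal with mean 7 and standard deviation 1."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def simulate_noncentral_t(n, ncp=7.0, df=15, rng=random):
    """High-variability case: noncentral t with ncp 7 and 15 degrees of freedom.

    Uses the standard construction T = (Z + ncp) / sqrt(V / df), where
    Z ~ N(0, 1) and V ~ chi-square(df) = Gamma(shape=df/2, scale=2).
    """
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        v = rng.gammavariate(df / 2.0, 2.0)
        draws.append((z + ncp) / (v / df) ** 0.5)
    return draws


def zscore(values, index):
    """Assumed Zscore: deviation of one sample from the mean of all samples."""
    return (values[index] - statistics.mean(values)) / statistics.stdev(values)


def outlying_degree(values, index, k):
    """Assumed outlying degree: sum of distances to the k nearest other samples."""
    dists = sorted(abs(values[index] - v)
                   for i, v in enumerate(values) if i != index)
    return sum(dists[:k])


# Usage: one gene across 12 hypothetical samples, with sample 0 shifted by 5 units.
rng = random.Random(0)
gene = simulate_normal(12, rng=rng)
gene[0] += 5.0
print(zscore(gene, 0), outlying_degree(gene, 0, k=9))
```

Repeating this over many simulated genes, with and without a shift, would give the kind of power and false discovery estimates shown in panels (B, C); the thresholds used to call a gene an outlier are not stated in the caption and are left out here.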
Figure 2. The weighted outlying degree can attenuate the effect of sample-specific technical variability. (A) An example of a simulated dataset from the normal distribution with a technical factor affecting 2,500 of the 10,000 genes of sample one, making it divergent. The size of the effect is a two-unit decrease. (B, C) Power and false discovery rate estimates for the methods, based on simulations similar to (A), where either 2,500 or 7,500 genes of one or three samples were affected. The effect size was kept at five units. The WODb method outperforms the others, at least for the case where the number of divergent samples was equal to three. The grey areas indicate 0.95 confidence intervals. Note that for the false discovery rate, the estimates were very stable and the grey area is not readily observable. OD, outlying degree method; WODa, weighted outlying degree with weighting performed after nearest neighbor computations; WODb, weighted outlying degree with weighting performed before nearest neighbor computations.
Figure 3. The outlying degree is more robust to variability across samples than the Zscore in experimental data. (A) The top five genes for both the Zscore and outlying degree methods were found for sample 09206. For comparison purposes, we plotted the distribution of the expression levels of the 12 patient samples for the top five ranked genes in either method. It was clear that the Zscore ranked higher those genes where a single outlier was found with the remainder of the samples tightly grouped together, whereas the outlying degree (k = 6) ranked higher those genes with large differences while tolerating more expression variability between the samples. Exon-level summaries of the genes ranked highest in (A) (Rank 1) are shown for both (B) the Zscore and (C) the outlying degree methods.
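The ranking behaviour described in Figure 3 can be reproduced on two made-up genes. The expression values below are invented for illustration and are not taken from the study. The score definitions are assumptions as well: the Zscore is computed against the mean and standard deviation of all 12 samples, and the outlying degree is taken as the sum of distances to the k = 6 nearest other samples.

```python
import statistics


def zscore(values, index):
    # Assumed definition: deviation from the all-sample mean in SD units.
    return (values[index] - statistics.mean(values)) / statistics.stdev(values)


def outlying_degree(values, index, k):
    # Assumed definition: sum of distances to the k nearest other samples.
    dists = sorted(abs(values[index] - v)
                   for i, v in enumerate(values) if i != index)
    return sum(dists[:k])


# Hypothetical gene 1: sample 0 is 2 units above a tightly grouped remainder.
tight = [9.0, 7.0, 7.1, 6.9, 7.0, 7.1, 6.9, 7.0, 7.1, 6.9, 7.0, 7.0]
# Hypothetical gene 2: sample 0 is 5 units above a more variable remainder.
wide = [12.0, 6.0, 8.0, 6.5, 7.5, 7.0, 6.2, 7.8, 6.8, 7.2, 6.4, 7.6]

z_tight, z_wide = zscore(tight, 0), zscore(wide, 0)
od_tight, od_wide = outlying_degree(tight, 0, 6), outlying_degree(wide, 0, 6)

# The Zscore favours the tightly grouped gene despite its smaller difference...
print("Zscore ranks tight gene first:", z_tight > z_wide)
# ...while the outlying degree favours the gene with the larger difference.
print("OD ranks wide gene first:", od_wide > od_tight)
```

Under these assumptions the Zscore is penalized by the spread of the remaining samples, while the outlying degree depends only on absolute distances to neighbours, which is consistent with the contrast the caption draws.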