Qike Li, A Grant Schissler, Vincent Gardeux, Ikbel Achour, Colleen Kenost, Joanne Berghout, Haiquan Li, Hao Helen Zhang, Yves A Lussier.
Abstract
BACKGROUND: Transcriptome analytic tools are commonly used across patient cohorts to develop drugs and predict clinical outcomes. However, these cohort-based methods are not designed for single-patient transcriptome analysis, which precision medicine requires as it pursues more accurate and individualized treatment decisions. We previously developed and validated the N-of-1-pathways framework using two methods, Wilcoxon and Mahalanobis Distance (MD), for personal transcriptome analysis derived from a pair of samples of a single patient. Although both methods uncover concordantly dysregulated pathways, they are not designed to detect dysregulated pathways containing both up- and down-regulated genes (bidirectional dysregulation), which are ubiquitous in biological systems.
Keywords: Head and neck squamous cell carcinomas (HNSCCs); Mixture Model; N-of-1-pathways; Precision Medicine; RNA-Seq; Single-Subject Analysis
Year: 2017 PMID: 28589853 PMCID: PMC5461551 DOI: 10.1186/s12920-017-0263-4
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.622
Fig. 1. The outline of MixEnrich. Single-subject paired transcriptomes (e.g., healthy and tumor, left panel) are used as the input for the clustering procedure (middle panel). The mixture model clusters all mRNAs of the subject into two groups, determining dysregulated mRNAs. The dysregulated mRNAs are then tested for enrichment into pathways using a Fisher’s Exact Test. FC = fold change; |log2FC| = the absolute value of log2 transformed fold-change; DEG = differentially expressed mRNAs; FET = Fisher’s Exact Test
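The clustering step described in the Fig. 1 caption can be illustrated with a minimal sketch. It assumes a two-component Gaussian mixture fitted by EM to the |log2FC| values and a posterior cutoff of 0.5; the caption specifies neither, so both are assumptions, as are the function name `posterior_dysregulated` and the simulated fold changes. It is not the authors' implementation.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_dysregulated(abs_log2fc, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture by EM (an assumed model) and
    return, per mRNA, the posterior probability of the higher-mean component."""
    n = len(abs_log2fc)
    means = [min(abs_log2fc), max(abs_log2fc)]  # start the components far apart
    grand_mean = sum(abs_log2fc) / n
    spread = math.sqrt(sum((v - grand_mean) ** 2 for v in abs_log2fc) / n) or 1.0
    sds = [spread, spread]
    weights = [0.5, 0.5]
    resp = [0.5] * n
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for every mRNA
        resp = []
        for v in abs_log2fc:
            p0 = weights[0] * normal_pdf(v, means[0], sds[0])
            p1 = weights[1] * normal_pdf(v, means[1], sds[1])
            resp.append(p1 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        n1 = sum(resp)
        n0 = n - n1
        if min(n0, n1) < 1e-9:
            break  # one component is empty; keep the last responsibilities
        # M-step: update means, standard deviations, and mixing weights
        pairs = list(zip(resp, abs_log2fc))
        means = [sum((1 - r) * v for r, v in pairs) / n0,
                 sum(r * v for r, v in pairs) / n1]
        var0 = sum((1 - r) * (v - means[0]) ** 2 for r, v in pairs) / n0
        var1 = sum(r * (v - means[1]) ** 2 for r, v in pairs) / n1
        sds = [max(math.sqrt(var0), 1e-3), max(math.sqrt(var1), 1e-3)]
        weights = [n0 / n, n1 / n]
    if means[1] < means[0]:  # make sure "component 1" is the large-|log2FC| group
        resp = [1 - r for r in resp]
    return resp

# Hypothetical single-subject data: 900 unaltered mRNAs and 100 mRNAs
# dysregulated in either direction (log2FC near +2 or -2).
random.seed(0)
log2fc = [random.gauss(0.0, 0.2) for _ in range(900)]
log2fc += [random.choice([-1, 1]) * 2.0 + random.gauss(0.0, 0.2) for _ in range(100)]
abs_log2fc = [abs(v) for v in log2fc]

posterior = posterior_dysregulated(abs_log2fc)
dysregulated = [i for i, p in enumerate(posterior) if p > 0.5]
print(len(dysregulated), "mRNAs flagged as dysregulated")
```

Clustering on the absolute value is what lets up- and down-regulated mRNAs fall into the same "dysregulated" group, which is the bidirectional case the abstract highlights.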
Contingency table for Fisher’s Exact Test
| | Dysregulated mRNAs | Unaltered mRNAs | Row sums |
|---|---|---|---|
| mRNAs in target pathway | | | |
| mRNAs not in target pathway | | | |
| Column sums | | | |
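To make the enrichment step concrete, here is a minimal sketch of a one-sided Fisher's Exact Test on a 2×2 table laid out as above, computed from the hypergeometric distribution with the standard library only. The function name, the argument names `a`, `b`, `c`, `d`, the choice of a one-sided (over-representation) test, and the example counts are all assumptions for illustration; the source does not state which alternative was used.

```python
from math import comb

def fisher_enrichment_pvalue(a, b, c, d):
    """One-sided Fisher's Exact Test p-value for over-representation.

    a: dysregulated mRNAs in the target pathway
    b: unaltered mRNAs in the target pathway
    c: dysregulated mRNAs not in the target pathway
    d: unaltered mRNAs not in the target pathway
    """
    pathway_size = a + b          # first row sum
    n_dysregulated = a + c        # first column sum
    total = a + b + c + d
    # P(X >= a) for X ~ Hypergeometric(total, pathway_size, n_dysregulated)
    upper = min(pathway_size, n_dysregulated)
    favourable = sum(
        comb(pathway_size, k) * comb(total - pathway_size, n_dysregulated - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(total, n_dysregulated)

# Small check against a hand-computable table: (16 + 1) / 70
p_small = fisher_enrichment_pvalue(3, 1, 1, 3)

# Hypothetical pathway of 65 mRNAs, 20 of them dysregulated, in a
# transcriptome of 20,501 mRNAs with 1,000 dysregulated overall.
p_pathway = fisher_enrichment_pvalue(20, 45, 980, 19476)
print(p_small, p_pathway)
```

Because the test only counts how many dysregulated mRNAs fall in the pathway, it is indifferent to the direction of each mRNA's change.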
Simulation parameters
| Parameter | Description of the parameter | Values tested |
|---|---|---|
| bg.FC | Fold change of dysregulated background mRNAs | {1, 1.3, 1.5, 2} |
| bg.Pct | Percentage of dysregulated mRNAs as noise in the background | {0, 0.01, 0.05, 0.1, 0.2} |
| p.S | Number of mRNAs randomly chosen in the target pathway | {5, 10, [15, 490] by step 25, 500} |
| p.dPct | Percentage of dysregulated mRNAs in the target pathway | {(0, 1] by step 0.05} |
| p.FC | Fold change of mRNAs in the target pathway | {1.3, 1.5, 2} |
| p.upPct | Percentage of up-regulated mRNAs among dysregulated mRNAs in the target pathways | {0, 0.1, 0.2, 0.3, 0.4, 0.5} |
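The size of this simulation grid can be checked with a short enumeration. The sketch below assumes that every background setting with a fold change of 1 or a noise percentage of 0 is equivalent to "no background noise" and is counted once; the source does not state this, but under that assumption the counts agree with the 107,640 AUCs per method and the 234 combinations of the four other parameters quoted in the captions of Fig. 3 and Fig. 4. The variable names mirror the caption abbreviations (bg.FC, bg.Pct, p.S, p.dPct, p.FC, p.upPct).

```python
from itertools import product

bg_fc = [1, 1.3, 1.5, 2]                            # background fold change
bg_pct = [0, 0.01, 0.05, 0.1, 0.2]                  # background noise fraction
p_s = [5, 10] + list(range(15, 491, 25)) + [500]    # pathway sizes
p_dpct = [k / 20 for k in range(1, 21)]             # (0, 1] by step 0.05
p_fc = [1.3, 1.5, 2]                                # pathway fold change
p_uppct = [0, 0.1, 0.2, 0.3, 0.4, 0.5]              # up-regulated fraction

# Naive Cartesian product of all tested values
full_grid = (len(bg_fc) * len(bg_pct) * len(p_s)
             * len(p_dpct) * len(p_fc) * len(p_uppct))

# Assumption: fold change 1 or percentage 0 both mean "no background noise",
# so those settings collapse into the single pair (1, 0).
backgrounds = {
    (fc, pct) if fc != 1 and pct != 0 else (1, 0)
    for fc, pct in product(bg_fc, bg_pct)
}

other_four = len(backgrounds) * len(p_fc) * len(p_uppct)
per_method = other_four * len(p_s) * len(p_dpct)
print(full_grid, other_four, per_method)  # prints: 165600 234 107640
```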
Dataset description
| Dataset and Study | Dataset I: | Dataset II: | Dataset III: |
|---|---|---|---|
| Type | Healthy lung tissues | Head and Neck squamous cell carcinomas | Breast invasive carcinoma |
| Source | TCGA | TCGA | TCGA |
| Date | March 2013 | May 2015 | October 2016 |
| Platform | Illumina RNA-Seq V.2 | Illumina RNA-Seq V.2 | Illumina RNA-Seq V.2 |
| Genes mapped | 20,502 | 20,501 | 20,501 |
| Patients | | | |
| Total | 55 | 45 pairs | 112 pairs |
| Healthy | 55 | 45 | 112 |
| Tumor | not applicable | 45 | 112 |
| URL | | | |
Fig. 2. Illustrative ROC curves and comparison of the overall performance of three single-subject methods. MixEnrich is compared to MD and Wilcoxon in overall performance across all simulated pathway dysregulation scenarios via area under ROC curves (AUCs). Panel a shows an example of ROC curves for the three methods derived from the following setting: 20% of mRNAs in the background were dysregulated at a fold change of 2; 20% of mRNAs in the target pathways (size of 65 genes) were dysregulated at a fold change of 1.3, with half of them up-regulated. Each boxplot in Panel b visualizes all resultant AUCs of the corresponding method across all simulation settings (outliers are not illustrated)
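For readers unfamiliar with the AUC summary used in these comparisons, a minimal sketch follows. It uses the rank-based definition (the probability that a truly dysregulated pathway receives a higher score than an unaltered one, with ties counted as one half). The function name, the scores, and the labels are hypothetical, and the source does not describe how the authors computed their ROC curves.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from per-pathway scores and true labels.

    scores: larger means "more likely dysregulated" (e.g. 1 - p-value)
    labels: 1 for truly dysregulated pathways, 0 for unaltered pathways
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative pathway")
    # Count positive/negative pairs ranked correctly; ties count one half
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical scores for six simulated pathways (three truly dysregulated)
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)   # 8 of 9 pairs are ranked correctly
print(auc)
```

An AUC of 1 means every dysregulated pathway outranks every unaltered one, while 0.5 corresponds to random ranking.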
Fig. 3. Evaluation of performance as each parameter of the simulation varies. Each column corresponds to one simulation parameter (horizontal axis), while each row corresponds to a method (names on the left of the vertical axis). Each panel, defined by the combination of a simulation parameter and a method, contains all 107,640 AUCs resulting from that method. For example, in the panel of pathway dysregulation percentage (p.dPct) for N-of-1-pathways Wilcoxon (bottom left panel), each boxplot illustrates the distribution of AUCs resulting from Wilcoxon at a fixed value of p.dPct (horizontal axis) while varying all the other five simulation parameters. For the sake of clarity, outliers are not shown
Fig. 4. Joint effect of the pathway size and proportion of dysregulated mRNAs on the performance of the MixEnrich method. Panel a shows an example contour plot of the AUC values under the combination of the four parameters (values are shown at the upper right corner of the contour plot) for MixEnrich. Every point in the contour plot is colored by the AUC value, with the color key shown in the upper left corner of the panel. Points with the same AUC values are connected by contour lines in the plot. We used the area above the 95% curve (AAC95%; white area above the contour line of AUC = 0.95) as the overall joint performance measure when the two parameters change simultaneously. Panel b shows the distribution of every possible AAC95% for each method; each boxplot includes 234 data points; each point in a boxplot corresponds to a specific combination of the four other parameters (bg.Pct, bg.FC, p.upPct, and p.FC) while allowing p.S and p.dPct to vary
Fig. 5. MixEnrich shows higher performance than other single-subject and cohort-based methods (the latter utilized on small samples). Each boxplot corresponding to the N-of-1-pathways methods (MixEnrich in purple, MD in green, and Wilcoxon in orange) consists of 15 AUCs resulting from 15 tested patients. Each boxplot corresponding to the cohort-based methods (DESeq + Enrichment in red and GSEA in blue) includes 50 AUCs resulting from 50 distinct subsets of the 15 tested patients (validation case study of head and neck squamous cell carcinoma patients). Cohort-based methods were performed across 3, 6 and 12 patients (Pt). The number of distinct subjects is shown below the horizontal axis as human icons to further illustrate how many distinct subjects are required in cohort-based analyses to obtain improvements of the AUC (vertical axis). In addition, the three single-subject analyses predict between 200 and 300 candidate pathways at FDR = 1%, while cohort-based statistics operating on 3 to 12 individuals predict only 50 pathways at FDR = 5% and over 200 at FDR = 20% (data not shown), which explains in part the observed differences in accuracy