Joanne Berghout, Qike Li, Nima Pouladi, Jianrong Li, Yves A Lussier.
Abstract
Analysis of single-subject transcriptome response data is an unmet need of precision medicine, made challenging by the high dimension, dynamic nature and difficulty in extracting meaningful signals from biological or stochastic noise. We have proposed a method for single subject analysis that uses a mixture model for transcript fold-change clustering from isogenically paired samples, followed by integration of these distributions with Gene Ontology Biological Processes (GO-BP) to reduce dimension and identify functional attributes. We then extended these methods to develop functional signing metrics for gene set process regulation by incorporating biological repressor relationships encoded in GO-BP as negatively_regulates edges. Results revealed reproducible and biologically meaningful signals from analysis of a single subject's response, opening the door to future transcriptomic studies where subject and resource availability are currently limiting. We used inbred mouse strains fed different diets to provide isogenic biological replicates, permitting rigorous validation of our method. We compared significant genotype-specific GO-BP term results for overlap and rank order across three replicate pairs per genotype, and cross-methods to reference standards (limma+FET, SAM+FET, and GSEA). All single-subject analytics findings were robust and highly reproducible (median area under the ROC curve=0.96, n=24 genotypes × 3 replicates), providing confidence and validation of this approach for analyses in single subjects. R code is available online at http://www.lussiergroup.org/publications/PathwayActivity.
Year: 2018 PMID: 29218900 PMCID: PMC5730358
Source DB: PubMed Journal: Pac Symp Biocomput ISSN: 2335-6928
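The two-step idea summarized in the abstract (mixture-model clustering of fold changes from one paired sample, then gene set enrichment) can be illustrated with a minimal sketch. It is not the authors' R implementation: the use of absolute log2 fold changes, the two-component Gaussian mixture fitted by a hand-written EM loop, the 0.5 posterior cutoff, the simulated data, and the names `posterior_altered` and `enrichment_p` are all assumptions made for illustration.

```python
import math
import random


def posterior_altered(values, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture.

    Returns, for each value, the posterior probability of belonging to the
    component with the larger mean (read here as "altered").
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    mu = [lo + 0.25 * span, lo + 0.75 * span]
    sd = [span / 4.0, span / 4.0]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value
        # (the 1/sqrt(2*pi) constant cancels in the normalization).
        resp = []
        for x in values:
            dens = [
                weight[k] * math.exp(-0.5 * ((x - mu[k]) / sd[k]) ** 2) / sd[k]
                for k in (0, 1)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and standard deviations.
        for k in (0, 1):
            nk = sum(r[k] for r in resp) or 1e-12
            weight[k] = nk / len(values)
            mu[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, values)) / nk
            sd[k] = max(math.sqrt(var), 1e-3)
    high = 0 if mu[0] > mu[1] else 1
    return [r[high] for r in resp]


def enrichment_p(k, set_size, n_altered, n_total):
    """One-sided Fisher's exact (hypergeometric) p-value, P(X >= k)."""
    upper = min(set_size, n_altered)
    tail = sum(
        math.comb(n_altered, i) * math.comb(n_total - n_altered, set_size - i)
        for i in range(k, upper + 1)
    )
    return tail / math.comb(n_total, set_size)


# Simulated |log2 fold change| values for 1000 hypothetical transcripts:
# 900 unaltered (near 0.2) and 100 altered (near 2.0). Not real data.
rng = random.Random(0)
abs_lfc = [abs(rng.gauss(0.2, 0.1)) for _ in range(900)]
abs_lfc += [abs(rng.gauss(2.0, 0.4)) for _ in range(100)]

post = posterior_altered(abs_lfc)
altered = {i for i, p in enumerate(post) if p > 0.5}

# Hypothetical gene set: 5 unaltered and 15 altered transcripts.
gene_set = set(range(5)) | set(range(900, 915))
n_altered = len(altered)
p_value = enrichment_p(len(gene_set & altered), len(gene_set), n_altered, len(abs_lfc))
print(n_altered, p_value)
```

In a real analysis, the enrichment p-values for all GO-BP terms would also need multiple-testing correction before applying an FDR threshold.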
Figure 3. Incorporation of signed functional activation of GO-BP using negatively_regulates
Count of GO-BP terms identified for each genotype by limma+FET (n=3 paired subjects/genotype; FDR 5%), SAM+FET (n=3 paired subjects/genotype; FDR 5%), GSEA (n=3/diet/genotype; FDR 20%), or as 3 replicate isogenic single-subjects via MixEnrich (ME; n=1 pair/genotype; FDR 5%).
| Sex | Method | 129 | A/J | BALB | C3H | C57BL | CAST | DBA/2 | I/Ln | MRL | NZB | PERA | SM/J |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Males | limma+FET | 1034 | 1085 | 609 | 926 | 393 | 507 | 84 | 298 | 253 | 1317 | 1275 | 955 |
| Males | SAM+FET | 827 | 13 | 408 | 295 | 11 | 229 | 38 | 74 | 62 | 397 | 192 | 555 |
| Males | GSEA | 1171 | 1102 | 251 | 648 | 1181 | 524 | 19 | 588 | 21 | 1316 | 714 | 1203 |
| Males | ME rep 1 | 1015 | 1155 | 934 | 872 | 865 | 269 | 816 | 530 | 1493 | 885 | 1182 | |
| Males | ME rep 2 | 1231 | 1121 | 649 | 752 | 1351 | 691 | 541 | 343 | 285 | 1050 | 868 | 817 |
| Males | ME rep 3 | 970 | 1005 | 774 | 921 | 633 | 728 | 295 | 1170 | 394 | 1280 | 962 | 1045 |
| Males | all 3 ME | 670 | 752 | 439 | 554 | 470 | 427 | 166 | 276 | 199 | 857 | 580 | 578 |
| Males | 3 ME+ | 582 | 698 | 368 | 524 | 306 | 322 | 65 | 204 | 165 | 823 | 551 | 534 |
| Males | all methods | 368 | 5 | 67 | 159 | 1 | 127 | 6 | 15 | 14 | 283 | 123 | 277 |
| Females | limma+FET | 70 | 1422 | 156 | 342 | 123 | 1100 | 48 | 797 | 127 | 1165 | 515 | 140 |
| Females | SAM+FET | 40 | 171 | 102 | 93 | 71 | 99 | 91 | 245 | 73 | 119 | 134 | 88 |
| Females | GSEA | 150 | 811 | 0 | 62 | 31 | 752 | 12 | 230 | 39 | 1117 | 25 | 77 |
| Females | ME rep 1 | 400 | 1521 | 603 | 474 | 440 | 843 | 355 | 652 | 575 | 932 | 719 | 241 |
| Females | ME rep 2 | 465 | 1256 | 275 | 435 | 1235 | 306 | 528 | 306 | 1096 | 708 | 281 | |
| Females | ME rep 3 | 386 | 1496 | 361 | 324 | 450 | 870 | 273 | 850 | 318 | 1225 | 470 | 338 |
| Females | all 3 ME | 217 | 923 | 189 | 217 | 178 | 630 | 158 | 366 | 171 | 737 | 305 | 157 |
| Females | 3 ME+ | 53 | 866 | 118 | 198 | 85 | 571 | 41 | 329 | 104 | 705 | 236 | 110 |
| Females | all methods | 5 | 127 | 0 | 12 | 3 | 64 | 9 | 59 | 9 | 62 | 6 | 9 |
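The consensus rows of the table are counts of GO-BP terms shared between result sets. A minimal sketch of how such counts could be derived from per-method term sets follows. The term identifiers are made up, the function name `consensus_counts` is hypothetical, and reading the "3 ME+" row as "shared by all three ME replicates and the limma+FET set" is an assumption, since the row label is truncated in the source.

```python
def consensus_counts(me_replicates, limma_terms, other_methods):
    """Return (all 3 ME, 3 ME plus limma, all methods) overlap counts.

    me_replicates: list of sets of term IDs, one per single-subject replicate.
    limma_terms:   set of term IDs from the limma+FET cohort analysis.
    other_methods: list of sets from the remaining methods (e.g. SAM+FET, GSEA).
    """
    all_me = set.intersection(*me_replicates)
    me_and_limma = all_me & limma_terms  # assumed meaning of "3 ME+"
    all_methods = me_and_limma.intersection(*other_methods)
    return len(all_me), len(me_and_limma), len(all_methods)


# Made-up term identifiers, for illustration only.
me_reps = [
    {"term_a", "term_b", "term_c", "term_d"},
    {"term_a", "term_b", "term_c"},
    {"term_a", "term_b", "term_c", "term_e"},
]
limma = {"term_a", "term_b", "term_f"}
sam = {"term_a", "term_g"}
gsea = {"term_a", "term_b"}

counts = consensus_counts(me_reps, limma, [sam, gsea])
print(counts)
```

Because each consensus set is an intersection, its count can never exceed the count of any set that feeds into it, which is a quick sanity check on a table of this kind.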
Figure 1. (A) Distribution of limma+FET adjusted p-values for GO-BP terms identified in common by all 3 ME replicates in B6 male mice versus orphan GO-BPs identified by a single ME replicate. (B) ROC and (C) precision-recall curves assessing the results of GO-BP enrichment for three replicate pairs analyzed for each genotype by MixEnrich (red, three lines), or cohort-derived reference sets of n=3 high fat diet versus low fat diet mice analyzed by SAM+FET (dark gray, one line) and GSEA (light gray, one line). All curves were compared to a reference standard generated by limma+FET at FDR 5%.
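The area under a ROC curve of the kind shown in panel (B) can be computed directly from per-term scores and reference labels. The sketch below uses the rank-based (Mann-Whitney) formulation: the AUC equals the fraction of positive/negative term pairs in which the reference-positive term receives the higher score, with ties counted as one half. The scores and labels are invented, and the choice of score (for example, one minus the single-subject FDR) is an assumption rather than the authors' stated procedure.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: a higher score means stronger single-subject evidence;
# label 1 means the term is in the reference set (e.g. limma+FET at FDR 5%).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
print(auc)
```

The pairwise loop is quadratic in the number of terms, which is acceptable for a few thousand GO-BP terms; a sort-based version would scale better.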
Figure 2. High correlation between rank order of GO-BP terms based on odds ratio (OR). (A) High Spearman's correlation coefficient (rho) across all strains; only males shown. (B) Example plot of rank correlation using NZB female data, comparing MixEnrich replicate 3 to the limma analysis conducted across all NZB females, rho=0.75, p<10^-25. Data are ranked from smallest to largest, so the largest ORs are ranked at (∼1500, ∼1500), rather than (1,1). (C) Example plot of log2(OR) values for the same pairwise comparison.
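Spearman's rho, as used in this figure, is Pearson's correlation computed on ranks. A minimal sketch follows, ranking from smallest to largest as the caption describes and assigning average ranks to ties. The odds ratios are invented, and the function names `average_ranks` and `spearman_rho` are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; ties get the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented odds ratios for five GO-BP terms from two analyses.
or_single_subject = [1.2, 3.5, 0.8, 2.0, 5.1]
or_cohort = [1.0, 2.9, 1.1, 2.5, 4.0]

rho = spearman_rho(or_single_subject, or_cohort)
print(rho)
```

Because only ranks enter the calculation, the result is the same whether the inputs are odds ratios or log2(OR) values.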