Jörg Menche, Emre Guney, Amitabh Sharma, Patrick J Branigan, Matthew J Loza, Frédéric Baribaud, Radu Dobrin, Albert-László Barabási.
Abstract
Gene expression data are routinely used to identify genes that, on average, exhibit different expression levels between a case and a control group. Yet very few of these differentially expressed genes are detectably perturbed in individual patients. Here, we develop a framework to construct personalized perturbation profiles for individual subjects, identifying the set of genes that are significantly perturbed in each individual. This allows us to characterize the heterogeneity of the molecular manifestations of complex diseases by quantifying the expression-level similarities and differences among patients with the same phenotype. We show that, despite the high heterogeneity of the individual perturbation profiles, patients with asthma, Parkinson's disease and Huntington's disease share a broad pool of sporadically disease-associated genes, and that individuals with statistically significant overlap with this pool have an 80-100% chance of being diagnosed with the disease. The developed framework opens up the possibility of applying gene expression data in the context of precision medicine, with important implications for biomarker identification, drug development, diagnosis and treatment.
Year: 2017 PMID: 28649437 PMCID: PMC5445628 DOI: 10.1038/s41540-017-0009-0
Source DB: PubMed Journal: NPJ Syst Biol Appl ISSN: 2056-7189
Fig. 1 Personalized gene expression analysis. a Example of the distribution of expression levels for the asthma biomarker POSTN. While the group-based comparison (FC = 1.2, p-value < 3 × 10⁻⁶) suggests a global up-regulation of POSTN, many asthmatic individuals exhibit normal or even down-regulated POSTN levels. b Fraction of case subjects in which genes that are designated as differentially expressed (DE) in a standard group-wise analysis display normal expression levels, or expression levels that suggest a dysregulation in the opposite direction. The distributions show the respective fractions over all group-wise DE genes. All whisker bars throughout this manuscript indicate the 5th, 25th, 50th, 75th, and 95th percentiles of the respective distributions. Small numbers within the bars indicate the absolute number of patients that the respective median fraction corresponds to. c–e Illustration of the proposed approach towards individual perturbation profiles: instead of comparing two groups of case and control subjects, we compare each case subject individually with the background of control subjects (c). Genes whose expression level is sufficiently far from the range observed in the control subjects (d) are denoted as perturbed in the respective individual. Together, the perturbed genes constitute a personalized, i.e., subject-specific, “barcode” (e)
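The per-subject comparison described in panels c–e can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "sufficiently far from the range observed in the control subjects" is measured as a z-score against the control mean and standard deviation, with the cutoff taken from the z thresh = 2.5 value quoted in the Fig. 4 caption. The function name `perturbation_profile`, the gene `GENE_B`, and all expression values are hypothetical.

```python
from statistics import mean, stdev

def perturbation_profile(case_expr, control_expr, z_thresh=2.5):
    """Return {gene: +1 or -1} for genes perturbed in one case subject.

    case_expr:    {gene: expression value} for a single case subject
    control_expr: {gene: [expression values across control subjects]}
    Assumption: a gene counts as perturbed when |z| > z_thresh, where z is
    computed against the control mean and sample standard deviation.
    """
    profile = {}
    for gene, value in case_expr.items():
        ref = control_expr[gene]
        mu, sigma = mean(ref), stdev(ref)
        if sigma == 0:
            continue  # no spread among controls: z-score undefined, skip gene
        z = (value - mu) / sigma
        if abs(z) > z_thresh:
            profile[gene] = 1 if z > 0 else -1  # up- or down-regulated
    return profile

# Toy data (made up for illustration only).
controls = {
    "POSTN": [5.0, 5.2, 4.8, 5.1, 4.9],
    "GENE_B": [7.0, 7.1, 6.9, 7.2, 6.8],
}
subject = {"POSTN": 6.0, "GENE_B": 7.05}
profile = perturbation_profile(subject, controls)
```

In this toy example only POSTN ends up in the subject's "barcode"; GENE_B stays within the control range and is left out.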
Fig. 2 Heterogeneity among the PEEPs. a Distribution of the fraction of all PEEPs in which a gene identified in a standard group-wise analysis appears (for asthma). b Fraction of all group-wise DE genes found in the PEEPs for asthma patients. c, d Pairwise overlap of the genes in the PEEPs as measured by the Jaccard index (c) and the number of common genes (d). e Fraction of all case subject pairs whose gene overlap is statistically significant (Fisher’s exact test, p-value < 0.05). f Distribution of the fraction of asthma patient PEEPs in which a gene appears
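The two pairwise overlap measures named in panels c–e can be illustrated with a short sketch. It assumes each profile is reduced to a set of gene identifiers and that the significance test is the one-sided (enrichment) form of Fisher's exact test, which equals the upper tail of the hypergeometric distribution; the caption does not say which form was used. The function names and gene labels are hypothetical.

```python
from math import comb

def jaccard(a, b):
    """Jaccard index: shared genes divided by genes in either profile."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def overlap_pvalue(a, b, n_genes):
    """One-sided Fisher's exact test for the overlap of two gene sets.

    Returns P(overlap >= observed) when len(b) genes are drawn at random
    from a background of n_genes, of which len(a) belong to profile a.
    """
    a, b = set(a), set(b)
    k = len(a & b)
    upper = min(len(a), len(b))
    tail = sum(
        comb(len(a), i) * comb(n_genes - len(a), len(b) - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(n_genes, len(b))

# Toy profiles over a hypothetical background of 20 measured genes.
profile_1 = {"g1", "g2", "g3"}
profile_2 = {"g2", "g3", "g4"}
j = jaccard(profile_1, profile_2)
p = overlap_pvalue(profile_1, profile_2, n_genes=20)
```

The choice of background size `n_genes` matters: the same overlap is more surprising against a larger set of measured genes, so it should match the number of genes actually assayed.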
Fig. 3 Functional characteristics of the genes in PEEPs. a A schematic figure illustrating how the same pathway associated with a specific function may be disrupted by perturbations at different locations in different subjects. b Individual perturbations of all asthmatic subjects within the asthma-specific pathway "IFN-γ and Th2 cytokines-induced inflammatory signaling in normal and asthmatic airway epithelium". Each row corresponds to one pathway gene and each column to one subject. On the right: the number of subjects that have the respective gene up-regulated or down-regulated. Below: number of up-regulated or down-regulated genes within the pathway for each subject. c Pairwise similarity, as measured by the Jaccard index, of the pathway perturbations of all subject pairs whose profiles are significantly enriched within the pathway (Fisher’s exact test with Bonferroni correction, p-value < 0.05) for all considered asthma-specific pathways (see Supplementary Table 1). Note that only the genes within the respective pathway are used for the comparison. d–f As in (c), but averaged over all Gene Ontology (GO) terms and general MSigDB pathways that are significantly enriched in the profiles of the respective subjects (see Methods). BP biological process, MF molecular function, CC cellular component
Fig. 4 Integrating the personalized profiles into a predictive pool of disease-associated genes. a, b Distribution of the number of individual perturbation profiles in which a gene appears for control (a) and case (b) subjects of the three considered diseases. The two curves in each panel correspond to the actual data and to the random expectation according to a model of randomly selected genes (green). c Venn diagram of three broad gene pools compiled from genes that are in at least X individual perturbation profiles. d The ROC for the disease state classification by the fraction of the broad gene pool that is contained in a subject’s perturbation profile. The AUC values are 0.77 ± 0.03, 0.81 ± 0.06 and 1.0 ± 0.0 for asthma, PD and HD, respectively (mean value ± standard deviation computed over 100 cross-validations). e Sensitivity and specificity as a function of the fraction of the broad gene pool for asthma (z_thresh = 2.5; X = 10). f Illustration of the disease model suggested by the analysis of PEEPs
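The pool-based classification summarized in panels c–e can be sketched roughly as below. This is a simplified illustration under stated assumptions: the pool is built from all case profiles at once, whereas the caption reports values averaged over 100 cross-validations, so this version would be optimistic on real data. The AUC is computed as the probability that a randomly chosen case scores higher than a randomly chosen control (ties count one half). All function names, gene labels, and profiles are hypothetical.

```python
from collections import Counter

def build_pool(case_profiles, min_count):
    """Genes appearing in at least min_count case profiles (the X of panel c)."""
    counts = Counter(g for p in case_profiles for g in set(p))
    return {g for g, c in counts.items() if c >= min_count}

def pool_fraction(profile, pool):
    """Fraction of the gene pool contained in one subject's profile."""
    return len(set(profile) & pool) / len(pool) if pool else 0.0

def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = sum((c > k) + 0.5 * (c == k)
               for c in case_scores for k in control_scores)
    return wins / (len(case_scores) * len(control_scores))

def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Classify a subject as diseased when its score is >= cutoff."""
    sens = sum(s >= cutoff for s in case_scores) / len(case_scores)
    spec = sum(s < cutoff for s in control_scores) / len(control_scores)
    return sens, spec

# Toy perturbation profiles (made up for illustration only).
cases = [{"A", "B"}, {"A", "C"}, {"B", "C", "D"}, {"C"}]
controls = [{"D"}, {"A"}, set()]

pool = build_pool(cases, min_count=2)
case_scores = [pool_fraction(p, pool) for p in cases]
control_scores = [pool_fraction(p, pool) for p in controls]
area = auc(case_scores, control_scores)
sens, spec = sensitivity_specificity(case_scores, control_scores, cutoff=0.5)
```

Sweeping `cutoff` over the observed scores traces out the sensitivity and specificity curves of the kind shown in panel e; a held-out split of the case subjects would be needed to mimic the cross-validated estimates.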