Sivateja Tangirala, Chirag J Patel.
Abstract
While both genes and environment contribute to phenotype, deciphering environmental contributions to phenotype is a challenge. Elucidating how different phenotypes may share similar environmental etiologies is also challenging. One way to identify environmental influences is through a discordant monozygotic (MZ) twin study design. Here, we first assessed differential gene expression in MZ discordant twin pairs (affected vs. non-affected) for seven phenotypes (chronic fatigue syndrome, obesity, ulcerative colitis, major depressive disorder, intermittent allergic rhinitis, physical activity, and intelligence quotient), comparing the spectrum of genes differentially expressed across the seven phenotypes individually. Second, we performed a meta-analysis for each gene to identify commonalities and differences in gene expression signatures between the seven phenotypes. In our integrative analyses, we found that there may be a common gene expression signature (with small effect sizes) across the phenotypes; however, differences between phenotypes with respect to differentially expressed genes were more prominent. Therefore, defining common environmentally induced pathways in phenotypes remains elusive. We make our work accessible by providing a new database (DiscTwinExprDB: http://apps.chiragjpgroup.org/disctwinexprdb/) for investigators to study non-genotypic influence on gene expression.
Year: 2018 PMID: 29311748 PMCID: PMC5758574 DOI: 10.1038/s41598-017-18585-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Analysis Procedure. A schematic diagram depicting the analysis pipeline. (1) Data Selection involved a filtration process for selecting twin expression datasets. (2) Differential Expression Analysis was carried out (using probe or transcript-level values) to find significant differentially expressed transcripts using FDR and effect size thresholds. (3) Meta-Analytic Gene Level Summarization was carried out to summarize transcript-level differences to gene-level differences.
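Step (2) of the pipeline can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure for FDR control (the caption says only "FDR") and takes the effect size to be the mean within-pair difference, affected minus unaffected co-twin. The function names are hypothetical, and the per-transcript p-values are assumed to come from a paired test computed elsewhere.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR), in input order.

    Assumption: the source does not name its FDR method; BH is used here
    only as a common default.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def mean_paired_difference(affected, unaffected):
    """Mean within-pair expression difference (affected minus unaffected)."""
    diffs = [a - u for a, u in zip(affected, unaffected)]
    return sum(diffs) / len(diffs)
```

Because each affected twin is compared only with their own co-twin, the within-pair difference removes shared genotype from the contrast, which is the point of the discordant MZ design.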
Summary of Datasets.
| Study identifier | Reference(s) | Number of Twin Pairs | Phenotype | Number of Genes | Platform | Sample Source (Tissue and Cell Lines) | Source |
|---|---|---|---|---|---|---|---|
| GSE22619 | Lepage | 10 | ulcerative colitis (UC) | 22836 | GPL570 | Primary mucosal tissue, colon | GEO |
| GSE16059 | Byrnes | 44 | chronic fatigue syndrome (CFS) | 22836 | GPL570 | Peripheral venous blood | GEO |
| GSE20319 | Leskinen | 10 | physical activity (PA) | 19429 | GPL6884 | Musculus vastus lateralis | GEO |
| GSE33476 | Yu | 17 | intelligence quotient (IQ) | 18638 | GPL6244 | Lymphoblastoid cell lines | GEO |
| GSE37146 | Sjogren | 11 | intermittent allergic rhinitis (IAR) | 19580 | GPL6102 | Peripheral blood mononuclear cells | GEO |
| MDD(dbGAP) | Wright | 28 | major depressive disorder (MDD) | 19284 | GPL13667 | Peripheral blood | dbGAP |
| E-MEXP-1425 | Pietiläinen | 13 | obesity (OB) | 22836 | GPL570 | Adipose tissue | Array Express |
This table shows the phenotype and number of genes being measured, sample size, platform, tissue, source, and reference paper for each of the seven studies.
Figure 2. Volcano plots for seven phenotypes. The mean differences versus the negative log (base 10) of FDR for the seven phenotypes (each with greater than 10 twin pairs). Blue indicates FDR-significant genes (FDR < 0.05) and red indicates FDR-nonsignificant genes. The black lines indicate the effect size thresholds (95th percentile of the absolute value of mean expression differences for each phenotype).
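The two thresholds drawn in the volcano plots can be combined into one selection rule, sketched below. The sketch assumes that per-gene mean differences and FDR values are already available as dictionaries keyed by gene. The function names, the linear-interpolation percentile, and the inclusive `>=` comparison are illustrative choices, not details taken from the source.

```python
def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def select_genes(mean_diffs, fdrs, fdr_cutoff=0.05, effect_pct=95):
    """Genes passing both the FDR cutoff and the effect-size threshold.

    The effect-size threshold is the effect_pct-th percentile of the
    absolute mean differences within this one phenotype.
    """
    threshold = percentile([abs(d) for d in mean_diffs.values()], effect_pct)
    return {
        gene
        for gene, diff in mean_diffs.items()
        if fdrs[gene] < fdr_cutoff and abs(diff) >= threshold
    }
```

Because the threshold is computed per phenotype, a gene has to be both statistically significant and among the largest shifts for that phenotype to be selected.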
Percentages of Overlaps of Significant (FDR < 0.05 and Absolute Value Effect Size Threshold of 95th percentile) Genes.
| Phenotype | PA | UC | IAR_invitro | CFS | IQ | MDD | OB |
|---|---|---|---|---|---|---|---|
| PA | 0.08 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| UC | 0.01 | 1.86 | 0.01 | 0.00 | 0.09 | 0.01 | 0.06 |
| IAR_invitro | 0.00 | 0.01 | 0.37 | 0.00 | 0.01 | 0.00 | 0.00 |
| CFS | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 |
| IQ | 0.00 | 0.09 | 0.01 | 0.00 | 3.63 | 0.00 | 0.01 |
| MDD | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 | 0.03 | 0.01 |
| OB | 0.00 | 0.06 | 0.00 | 0.00 | 0.01 | 0.01 | 0.59 |
For each phenotype pair, this table shows the percentage of genes that are significant in both phenotypes, out of all genes measured in both.
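A minimal sketch of how one cell of this table could be computed, assuming each phenotype is represented by a set of measured genes and a set of significant genes. The function and argument names are hypothetical. The denominator is restricted to genes measured in both studies because the platforms differ.

```python
def overlap_percentage(sig_a, measured_a, sig_b, measured_b):
    """Percent of genes significant in both phenotypes, out of genes measured in both."""
    shared_measured = measured_a & measured_b
    if not shared_measured:
        return 0.0  # no common genes: define the overlap as zero
    shared_sig = sig_a & sig_b & shared_measured
    return 100.0 * len(shared_sig) / len(shared_measured)
```

Passing the same phenotype as both arguments gives the percentage of that phenotype's measured genes that are significant, which corresponds to a diagonal cell.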
Spearman correlations of mean gene expression differences between phenotypes.
| Phenotype | PA | UC | IAR_invitro | CFS | MDD | IQ | OB |
|---|---|---|---|---|---|---|---|
| PA | 1.00 | 0.00 | −0.02 | 0.02 | 0.01 | 0.02 | 0.00 |
| UC | 0.00 | 1.00 | −0.01 | 0.18 | 0.04 | 0.07 | 0.04 |
| IAR_invitro | −0.02 | −0.01 | 1.00 | 0.00 | 0.03 | −0.04 | 0.00 |
| CFS | 0.02 | 0.18 | 0.00 | 1.00 | 0.18 | 0.09 | −0.13 |
| MDD | 0.01 | 0.04 | 0.03 | 0.18 | 1.00 | 0.03 | −0.04 |
| IQ | 0.02 | 0.07 | −0.04 | 0.09 | 0.03 | 1.00 | −0.02 |
| OB | 0.00 | 0.04 | 0.00 | −0.13 | −0.04 | −0.02 | 1.00 |
This table shows the Spearman correlation of mean gene expression differences for each pair of the seven phenotypes.
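One entry of the correlation table could be computed as sketched below. The sketch assumes the correlation is taken over genes measured in both phenotypes, with ties given average ranks. The function names are hypothetical, and the dictionaries map gene identifiers to mean expression differences.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_shared_genes(diffs_a, diffs_b):
    """Spearman correlation of mean differences over genes present in both inputs."""
    shared = sorted(set(diffs_a) & set(diffs_b))
    x = [diffs_a[g] for g in shared]
    y = [diffs_b[g] for g in shared]
    return pearson(average_ranks(x), average_ranks(y))
```

A rank-based correlation is a natural choice here because the studies use different platforms and tissues, so the raw difference scales are not directly comparable.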
Figure 3. Empirical Cumulative Distribution Function Plot of I² values. The distribution of I² values across all measured genes (from the seven studies).
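For reference, I² is commonly defined from Cochran's Q as max(0, (Q − df) / Q) × 100%, where df is the number of studies minus one. The sketch below computes it for one gene from per-phenotype effect estimates and their variances. The inverse-variance (fixed-effect) weighting and the function name are assumptions, since the record does not state how the authors' meta-analysis was weighted.

```python
def i_squared(effects, variances):
    """Return (pooled effect, Cochran's Q, I^2 in percent) for one gene.

    Assumption: inverse-variance fixed-effect weights; the source does not
    specify its weighting scheme.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return pooled, q, 0.0  # identical effects: no heterogeneity
    return pooled, q, max(0.0, (q - df) / q) * 100.0
```

A gene with a high I² has expression differences that vary across phenotypes more than sampling error would explain, while a gene with I² near zero behaves consistently across them.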