| Literature DB >> 30266904 |
Luan Lin1,2, Quan Chen1,2, Jeanne P Hirsch3, Seungyeul Yoo1,2, Kayee Yeung4, Roger E Bumgarner4, Zhidong Tu1,2, Eric E Schadt1,2,5, Jun Zhu6,7,8.
Abstract
A large amount of panomic data has been generated in populations for understanding causal relationships in complex biological systems. Both genetic and temporal models can be used to establish causal relationships among molecular, cellular, or phenotypical traits, but with limitations. To fully utilize high-dimension temporal and genetic data, we develop a multivariate polynomial temporal genetic association (MPTGA) approach for detecting temporal genetic loci (teQTLs) of quantitative traits monitored over time in a population and a temporal genetic causality test (TGCT) for inferring causal relationships between traits linked to the locus. We apply MPTGA and TGCT to simulated data sets and a yeast F2 population in response to rapamycin, and demonstrate increased power to detect teQTLs. We identify a teQTL hotspot locus interacting with rapamycin treatment, infer putative causal regulators of the teQTL hotspot, and experimentally validate RRD1 as the causal regulator for this teQTL hotspot.Entities:
Mesh:
Substances:
Year: 2018 PMID: 30266904 PMCID: PMC6162292 DOI: 10.1038/s41467-018-06203-3
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1Overview of causal relationships in complex biological systems. a Bayesian network is a directed graph. However, causal relationships cannot be unambiguously inferred from a directed graph due to Markov equivalent structures. b Systematic genetic perturbation data enables causality inference and causal network construction by eliminating biologically impossible structures. c Two traits are related via a feedback loop. d Time-series data provide information for inferring causality. e Time-series data alone are not sufficient for distinguishing causality and varied time delay. Additional information is needed to infer true causal relationships
Fig. 2Performance comparison among different time-dependent genetic association methods. Performance was assessed under different strength of inter time point correlation (auto-correlation) based simulated data. a ROC curves of different methods when auto-correlation is low; b ROC curves of different methods when auto-correlation is intermediate; c ROC curves of different methods when auto-correlation is strong; d area under the curve (AUC) of different methods under different strength of auto-correlation
Number of eQTL identified and average 1-LOD drop CI by different approaches (static, MPTGA, union, regression, and Fisher’s method)
| Static (T0) | MPTGA | Union | Regression | Fisher | |
|---|---|---|---|---|---|
| #eQTLs identified | 3288 | 3819 | 3669 | 3453 | 3011 |
| Avg. 95% CI (kb) | 37.5 | 28.7 | 30.8 | 17.9 | 35 |
CI confidence interval, MPTGA multivariate polynomial temporal genetic association
Fig. 3Number of eQTLs linked to each genome location based on different approaches. The genome was divided into 602 bins of 20 kb each and shown in chromosomal order. The red lines were the cutoff for eQTL hotspot identification. a The static method; b the MPTGA method; c the regression method; d the union method; e the Fisher’s p-value method
Hotspots identified by different approaches and enrichment analysis of rapamycin response signature
| Hotspot | Chr | Pos | Size of eQTL hotspot | Enrichment for the rapamycin signature | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| T0 | MPTGA | UNION | Regre. | Fisher | T0 | MPTGA | UNION | Regression | Fisher | ||||||||
| perc (%) | perc (%) | perc (%) | perc (%) | perc (%) | |||||||||||||
| 1 | 1 | 50,000 | 34a | NA | NA | NA | NA | 2.9 | 1 | NA | NA | NA | NA | NA | NA | NA | NA |
| 2 | 1 | 190,000 | 35 | NA | NA | NA | NA | 0 | 1 | NA | NA | NA | NA | NA | NA | NA | NA |
| 3 | 2 | 430,000 | 44a | NA | NA | 44 | NA | 0 | 1 | NA | NA | NA | NA | 5.9 | 0.61 | NA | NA |
| 4 | 2 | 550,000 | 367a | 662 | 584 | 607 | 372 | 5.4 | 0.24 | 7.7 | 9.59e−5b | 4.8 | 0.42 | 5.9 | 0.06 | 5.9 | 0.13 |
| 5 | 3 | 90,000 | 60a | 38 | 40 | 105 | 40 | 10 | 0.0551 | 18.4 | 1.45e−3b | 12.5 | 0.034 | 8.6 | 0.05 | 12.5 | 0.034 |
| 6 | 3 | 170,000 | 34 | NA | NA | 26 | 28 | 8.8 | 0.2022 | NA | NA | NA | NA | 11.5 | 0.11 | 10.7 | 0.13 |
| 7 | 4 | 90,000 | NA | 63c | NA | 25c | NA | NA | NA | 20.6 | 3.85e−6b | NA | NA | 4.0 | 1 | NA | NA |
| 8 | 5 | 190,000 | NA | 162c | NA | 57c | NA | NA | NA | 17.9 | 1.28e−10b | NA | NA | 8.8 | 0.12 | NA | NA |
| 9 | 7 | 410,000 | 43 | 43 | 38 | NA | NA | 2.3 | 1 | 0 | 1 | 0.0 | 1 | NA | NA | NA | NA |
| 10 | 8 | 110,000 | 97a | 256 | 121 | 111 | 87 | 6.2 | 0.28 | 4.3 | 0.63 | 4.1 | 0.66 | 2.7 | 0.89 | 3.4 | 0.77 |
| 11 | 9 | 70,000 | NA | 65c | NA | NA | NA | NA | NA | 23.1 | 1.44e−7b | NA | NA | NA | NA | NA | NA |
| 12 | 9 | 250,000 | NA | NA | NA | 27c | NA | NA | NA | NA | NA | NA | NA | 3.7 | 1 | NA | NA |
| 13 | 12 | 670,000 | 363a | 171 | 295 | 267 | 248 | 2.8 | 0.97 | 4.1 | 0.67 | 2.7 | 0.97 | 2.2 | 0.99 | 2.8 | 0.94 |
| 14 | 13 | 70,000 | 239a | 237 | 201 | 163 | 172 | 7.1 | 0.046 | 9.7 | 4.64e−4b | 6.0 | 0.21 | 5.5 | 0.33 | 7.0 | 0.095 |
| 15 | 13 | 910,000 | 42 | NA | NA | 26 | NA | 4.8 | 0.58 | NA | NA | NA | NA | 7.7 | 0.34 | NA | NA |
| 16 | 14 | 410,000 | 145a | NA | NA | 46 | NA | 2.1 | 0.97 | NA | NA | NA | NA | 6.5 | 0.35 | NA | NA |
| 17 | 15 | 150,000 | 486a | 1000 | 1183 | 742 | 987 | 6.8 | 0.0131 | 7.7 | 7.71e−7b | 6.1 | 0.0042b | 6.7 | 0.0026b | 6.3 | 0.0041b |
| 18 | 16 | 510,000 | 138a | 168 | 52 | 147 | 67 | 3.6 | 0.76 | 11.9 | 7.08e−5b | 7.7 | 0.21 | 4.8 | 0.51 | 4.4 | 0.6 |
An NA in the table means the locus was not an eQTL hotspot in the corresponding method. When comparing with the rapamycin response signature, the percentage of genes in a hotspot overlapping with the rapamycin signature and the Fisher’s exact test p-value are reported. The expected percentage by chance is 4.6%. The hotspots significantly enriched for the rapamycin signature were marked (p < 0.05 after multiple testing correction)
aHotspots overlapped with those previously defined static eQTL hotspots
bThe hotspots significantly enriched for the rapamycin signature (p < 0.05 after multiple testing correction)
cThe hotspots identified only via time-dependent approaches
eQTL expression quantitative trait loci, MPTGA multivariate polynomial temporal genetic association
GO term enrichment analysis for eQTL hotspots identified by MPTGA
| Hotspot | GO-Term | ||
|---|---|---|---|
| Chr | Pos | ||
| 2 | 550,000a | Structural constituent of ribosome | 3.97E−57 |
| 3 | 90,000a | Branched chain family amino acid | 2.08E−10 |
| 4 | 90,000a | Structural constituent of ribosome | 2.03E−37 |
| 5 | 190,000a | Structural constituent of ribosome | 1.27E−126 |
| 7 | 410,000 | Protein folding | 6.85E−14 |
| 8 | 110,000 | Mating projection tip | 7.27E−6 |
| 9 | 70,000a | Structural constituent of ribosome | 1.77E−50 |
| 12 | 670,000 | Ergosterol biosynthesis | 6.37E−25 |
| 13 | 70,000a | Phosphate transport | 2.52E−5 |
| 15 | 150,000a | Structural constituent of ribosome | 7.25E−43 |
| 16 | 510,000a | Structural constituent of ribosome | 4.00E−28 |
aThe hotspots were enriched for the rapamycin signature as in Table 2
eQTL expression quantitative trait loci, MPTGA multivariate polynomial temporal genetic association
Causal regulators for teQTL hotspots inferred by the TCGT test
| Hotspot | Yvert et al. | BN full | TGCT | |
|---|---|---|---|---|
| Chr | Pos | |||
|
| 550,000a | |||
|
| 90,000a |
| ||
|
| 90,000a | NA | NA | |
|
| 190,000a | NA | NA | |
|
| 410,000 | NA | NA | |
|
| 110,000 |
|
| |
|
| 70,000a | NA | NA | |
|
| 670,000 |
|
| |
|
| 70,000a | None | None | |
|
| 150,000a | None |
| |
|
| 510,000a | NA | NA |
The putative causal regulators were compared with previous predictions in Yvert et al.[29] and Zhu et al.[3]
Genes in bold font are overlapping with previous findings
aThe hotspots were enriched for the rapamycin signature as in Table 2
NA not applicable, teQTL temporal expression quantitative trait loci, TGCT temporal genetic causality test
Fig. 4The teQTL hotspot chrIX:70,000 was only identified by the MPTGA method. a A typical pattern of expression traits linked to the teQTL hotspot. b Plot of – log 10(p-value) of RPP2A association at chrIX based on the MPTGA method. c Plot of – log 10(p-value) of RPP2A association at chrIX based on the regression method. d Plot of – log 10(p-value) of RPP2A association at chrIX based on the union method. Dashed lines in (b, c), and d are the p-value cutoff corresponding to FDR = 0.05
Fig. 5RRD1 is a putative key causal regulator for the teQTL hotspot chrIX:70,000. a RRD1 ko signatures without and with rapamycin treatment were only compared with teQTL hotspots identified by MPTGA. RRD1 ko signature with rapamycin treatment (RRD1 X rapamycin interaction signature) is enriched in multiple teQTL hotspots including ones at chrIX:70,000 and chrXV:170,000. b The teQTL plot for RRD1 based on MPTGA. RRD1 physically locates at chrIX:70,000 and has a strong cis-eQTL at the locus and a trans-eQTL at chrXV:150,000. c The causal network based on TGCT indicates that genes linked to the teQTL hotspot chrIX:70,000 are also regulated by multiple genetic perturbations so that RRD1 ko signature with rapamycin treatment overlaps with multiple teQTL hotspots. Nodes of larger size are putative key causal genes for teQTL hotspots. Red and green nodes represent upregulated and downregulated genes in RRD1 ko, in comparison to wild-type strain when treated with rapamycin, respectively
Enrichment analysis of genes that can extend yeast replicative lifespan in hotspots
| Hotspot | T0 | MPTGA | UNION | REGRESSION | FISHER | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| CHR | Pos | Fold enrichment | Fold enrichment | Fold enrichment | Fold Enrichment | Fold Enrichment | |||||
|
| 50,000 | 0a | 1 | NA | NA | NA | NA | NA | NA | NA | NA |
|
| 190,000 | 1.37 | 0.43 | NA | NA | NA | NA | NA | NA | NA | NA |
|
| 430,000 | 0a | 1 | NA | NA | NA | NA | 1.63 | 0.28 | NA | NA |
|
| 550,000 | 0.72a | 0.91 | 1.77 | 3.27E−5b | 0.98 | 0.57 | 1.14 | 0.24 | 0.97 | 0.6 |
|
| 90,000 | 0.4a | 1 | 0 | 1 | 0.6 | 1 | 0.68 | 0.82 | 0 | 1 |
|
| 170,000 | 0 | 1 | NA | NA | NA | NA | 0 | 1 | 0 | 1 |
|
| 90,000 | NA | NA | 3.04c | 0.004b | NA | NA | 0c | 1 | NA | NA |
|
| 190,000 | NA | NA | 4.29c | 1.29E−11b | NA | NA | 5.04c | 2.81E−6b | NA | NA |
|
| 330,000 | NA | NA | NA | NA | NA | NA | 0.86 | 1 | NA | NA |
|
| 410,000 | 2.79 | 0.03 | 1.67 | 0.27 | 1.89 | 0.21 | NA | NA | NA | NA |
|
| 110,000 | 0a | 1 | 0.47 | 0.98 | 0.2 | 1 | 0.22 | 1 | 0.28 | 1 |
|
| 70,000 | NA | NA | 4.42c | 1.20E−5b | NA | NA | NA | NA | NA | NA |
|
| 250,000 | NA | NA | NA | NA | NA | NA | 0.89c | 1 | NA | NA |
|
| 670,000 | 0.99a | 0.55 | 0.98 | 0.58 | 1.14 | 0.35 | 1.17 | 0.32 | 1.35 | 0.15 |
|
| 70,000 | 0.5a | 0.97 | 0.61 | 0.94 | 0.83 | 0.74 | 0.88 | 0.68 | 0.84 | 0.73 |
|
| 910,000 | 2.28 | 0.1 | NA | NA | NA | NA | 3.69 | 0.02 | NA | NA |
|
| 490,000 | 1.13a | 0.42 | NA | NA | NA | NA | 1.56 | 0.3 | NA | NA |
|
| 150,000 | 0.89a | 0.74 | 1.25 | 0.05 | 0.87 | 0.87 | 0.87 | 0.81 | 0.83 | 0.91 |
|
| 510,000 | 1.39a | 0.22 | 2 | 0.01 | 0.92 | 0.65 | 0.98 | 0.58 | 0.72 | 0.78 |
Similar to Table 2
aHotspots are overlapped with those previously defined static eQTL hotspots
bThe hotspots were enriched for the rapamycin signature
cThe hotspots were identified only via time-dependent approaches
MPTGA multivariate polynomial temporal genetic association, NA not applicable