Anna J Jasinska1,2, Ivette Zelaya3, Susan K Service1, Christine B Peterson4, Rita M Cantor1,5, Oi-Wa Choi1, Joseph DeYoung1, Eleazar Eskin5,6, Lynn A Fairbanks1, Scott Fears1, Allison E Furterer7, Yu S Huang1, Vasily Ramensky1, Christopher A Schmitt1, Hannes Svardal8, Matthew J Jorgensen9, Jay R Kaplan9, Diego Villar10, Bronwen L Aken11, Paul Flicek11, Rishi Nag11, Emily S Wong11, John Blangero12, Thomas D Dyer12, Marina Bogomolov13, Yoav Benjamini14, George M Weinstock15, Ken Dewar16, Chiara Sabatti17,18, Richard K Wilson19, J David Jentsch20,21, Wesley Warren19, Giovanni Coppola1,22, Roger P Woods21,22, Nelson B Freimer1,5.
Abstract
By analyzing multitissue gene expression and genome-wide genetic variation data in samples from a vervet monkey pedigree, we generated a transcriptome resource and produced the first catalog of expression quantitative trait loci (eQTLs) in a nonhuman primate model. This catalog contains more genome-wide significant eQTLs per sample than comparable human resources and identifies sex- and age-related expression patterns. Findings include a master regulatory locus that likely has a role in immune function and a locus regulating hippocampal long noncoding RNAs (lncRNAs), whose expression correlates with hippocampal volume. This resource will facilitate genetic investigation of quantitative traits, including brain and behavioral phenotypes relevant to neuropsychiatric disorders.
Year: 2017 PMID: 29083405 PMCID: PMC5714271 DOI: 10.1038/ng.3959
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Fig. 1 PCA of the 1,000 genes with the most variable expression levels. Analysis was performed separately by tissue; sample size was 60 animals for adrenal, blood, fibroblasts, and pituitary and 59 for BA46, caudate, and hippocampus. Numbers in the labels for the x and y axes indicate the proportion of total variance accounted for by that PC.
Fig. 2 Boxplot of log counts per million (CPM) expression in samples of BA46 from 58 animals vs. timepoint, for three genes with a strong relationship between expression pattern and age. The interquartile range defines the height of the box, and whiskers extend to 1.5× the interquartile range. Outliers are indicated as individual points. In each box, the median is represented by the horizontal black bar.
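The box, whisker, and outlier conventions in the Fig. 2 caption can be illustrated with a minimal sketch. It assumes the common Tukey convention (whiskers drawn to the most extreme data point within 1.5× the interquartile range of the box) and linear-interpolation quartiles; the caption does not state which quartile method the authors used, and the function name `box_stats` and the example values are hypothetical.

```python
import statistics


def box_stats(values, whisker=1.5):
    """Summarize one box of a Tukey-style boxplot (illustrative only)."""
    # Quartiles by linear interpolation; the quartile method is an assumption.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    # Anything beyond the fences is drawn as an individual outlier point.
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical log-CPM values for one gene at one timepoint (not from the paper).
example = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 20.0]
stats = box_stats(example)
print(stats)
```

In this made-up example the value 20.0 falls outside the upper fence and would be plotted as a single point, while the upper whisker stops at 8.0.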
Gene expression data sets. The number of probes/genes with at least one significant local eQTL, and the number with at least one significant distant eQTL (at Bonferroni-corrected thresholds), are presented. We have 80% power to detect distant eQTLs accounting for 15% of the variability in expression in Dataset 1 and 66% of the variability in Dataset 2.
| Dataset | Tissue | Probes/genes analyzed | Local eQTL | Distant eQTL | % Distant eQTL on same chr |
|---|---|---|---|---|---|
| Dataset 1 (microarray) | Blood | 3,417 | 461 | 215 | 80.8% |
| Dataset 2 (RNA-Seq) | Adrenal | 25,187 | 555 | 80 | 54.5% |
| Dataset 2 (RNA-Seq) | BA46 | 27,530 | 307 | 30 | 81.8% |
| Dataset 2 (RNA-Seq) | Blood | 33,776 | 60 | 4 | 100.0% |
| Dataset 2 (RNA-Seq) | Caudate | 28,249 | 441 | 47 | 69.0% |
| Dataset 2 (RNA-Seq) | Fibroblast | 22,328 | 239 | 43 | 33.2% |
| Dataset 2 (RNA-Seq) | Hippocampus | 26,957 | 361 | 45 | 70.6% |
| Dataset 2 (RNA-Seq) | Pituitary Gland | 27,236 | 596 | 80 | 77.5% |
Dataset 1: microarray data with an initial set of 22,184 probes on the Illumina HumanRef-8 v2 array (6,018 probes passed the filters described in Supplementary Table 1; 3,417 were heritable). Dataset 2: RNA-Seq data with an initial set of 33,994 genes annotated in vervet.
Local eQTL are eQTL within 1 Mb of the gene. Bonferroni threshold for Dataset 1: 4.8 × 10⁻⁸; Bonferroni threshold for Dataset 2: 6.5 × 10⁻¹⁰.
Distant eQTL are more than 1 Mb away from the gene, and may be on the same or a different chromosome. Bonferroni threshold for Dataset 1: 1.5 × 10⁻¹¹; Bonferroni threshold for Dataset 2: 5.3 × 10⁻¹³.
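The local/distant definition in these footnotes can be sketched as a small classifier. This is an illustration, not the authors' pipeline: the footnotes do not say whether distance is measured to the gene boundary or to another reference point, so the sketch assumes the nearest gene boundary. The function name `classify_eqtl` and all coordinates are hypothetical.

```python
LOCAL_WINDOW_BP = 1_000_000  # 1 Mb window, as stated in the table footnotes


def classify_eqtl(snp_chrom, snp_pos, gene_chrom, gene_start, gene_end):
    """Label a SNP-gene pair as 'local' or 'distant'.

    Assumption: distance is measured from the SNP to the nearest gene
    boundary; a SNP inside the gene has distance 0.
    """
    if snp_chrom != gene_chrom:
        # Different chromosome: always distant.
        return "distant"
    if gene_start <= snp_pos <= gene_end:
        distance = 0
    else:
        distance = min(abs(snp_pos - gene_start), abs(snp_pos - gene_end))
    return "local" if distance <= LOCAL_WINDOW_BP else "distant"


# Hypothetical coordinates, for illustration only.
print(classify_eqtl("chr9", 1_500_000, "chr9", 1_000_000, 1_050_000))  # same chr, 450 kb away
print(classify_eqtl("chr9", 5_000_000, "chr9", 1_000_000, 1_050_000))  # same chr, ~3.95 Mb away
print(classify_eqtl("chr3", 1_020_000, "chr9", 1_000_000, 1_050_000))  # different chromosome
```

Distant eQTL on the same chromosome as the gene (the last column of the table) correspond to the second case.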
Comparison of specific genes with local eQTL in Vervet Dataset 2 to GTEx. For each tissue we present the number of genes with at least one significant local eQTL in Vervet (at FDR thresholds).
| Tissue | Vervet number of individuals | # Local eQTL Vervet Genes | GTEx number of individuals | GTEx number of eGenes | # Vervet Genes with Human Ortholog | # Orthologous Genes Tested in GTEx | % Tested Genes with p < 0.05 | % Tested Genes with p < 0.05/(# Tested Genes) | % Tested Genes significant genome-wide in GTEx |
|---|---|---|---|---|---|---|---|---|---|
| Adrenal | 58 | 2932 | 126 | 2915 | 1828 | 1674 | 100% | 28.7% | 18.2% |
| Blood | 58 | 574 | 338 | 5438 | 264 | 229 | 100% | 70.7% | 38.9% |
| Caudate | 57 | 3140 | 100 | 2396 | 1737 | 1548 | 100% | 24.6% | 14.1% |
| Hippocampus | 58 | 2437 | 81 | 1405 | 1436 | 1296 | 100% | 18.4% | 9.2% |
| Pituitary | 58 | 3395 | 87 | 2222 | 1863 | 1743 | 100% | 20.7% | 13.0% |
The number of eGenes found in the multi-tissue hierarchical FDR procedure applied to vervet Dataset 2 and to GTEx.
Vervet genes with a human ortholog that were not tested in GTEx were filtered out by the GTEx QC procedures.
The significance threshold is corrected for the number of genes compared between Vervet and GTEx (column 7).
Genes were declared significant by GTEx at an FDR of 0.05.
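The corrected threshold described in the footnotes (0.05 divided by the number of orthologous genes tested in GTEx) can be worked through per tissue. The gene counts below are copied from column 7 of the table; the helper name `bonferroni_threshold` is hypothetical, and the sketch only reproduces the arithmetic, not the authors' analysis.

```python
ALPHA = 0.05  # nominal level named in the column header

# "# Orthologous Genes Tested in GTEx" (column 7 of the table above).
orthologs_tested = {
    "Adrenal": 1674,
    "Blood": 229,
    "Caudate": 1548,
    "Hippocampus": 1296,
    "Pituitary": 1743,
}


def bonferroni_threshold(n_tests, alpha=ALPHA):
    """Return the per-test p-value cutoff alpha / n_tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


thresholds = {tissue: bonferroni_threshold(n) for tissue, n in orthologs_tested.items()}
for tissue, cutoff in thresholds.items():
    print(f"{tissue}: p < {cutoff:.2e}")
```

Because blood has the fewest tested genes, it gets the least stringent corrected cutoff of the five tissues.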
Fig. 3 Master regulatory locus on vervet chromosome CAE 9. Upper panel: Ensembl view of the CAE 9 region. Lower panel: The minimum −log10(p-value) for each SNP in association analyses vs. expression in 347 animals of microarray probes on different chromosomes. The symbols are color-coded to represent the number of probes significantly associated with each SNP: 1-2 probes (black), 3-4 probes (yellow), 5-6 probes (blue), 7-10 probes (purple), 11-14 probes (red). Symbols indicate the p-value from analysis of expression in Dataset 2 (RNA-Seq). Cross: p<2.35e-05; X: p<0.001; circle: p>0.001. The large red X at the top of the plot is CAE9_82694171.
Fig. 4 Hippocampal volume QTL and local hippocampal eQTLs in RNA-Seq analysis. Top panel: purple dotted line is the multipoint LOD score for hippocampal volume (measured in 347 animals). Circles represent evidence for association of SNPs with hippocampal expression in 58 animals of three genes: LOC103222765 (red), LOC103222769 (blue) and LOC103222771 (gold). Solid circles indicate genome-wide significant associations. The region between the black vertical lines is enlarged in the middle and bottom panels. The horizontal dotted line represents the genome-wide significance threshold for local eQTLs. Middle panel: SNPs with −log10(p-value)>8 for association with expression in hippocampus; color codes are as in the top panel. Bottom panel: Genes located between 68.7 and 69 Mb (the eQTL region). Color codes are as in the top panel. The Pearson correlations for expression between these three genes are: LOC103222765-LOC103222769 r=-0.16; LOC103222765-LOC103222771 r=0.32; LOC103222769-LOC103222771 r=0.60.
Fig. 5 Correlation in 16 animals of hippocampal volume (MRI) with hippocampal expression of LOC103222765 (left), LOC103222769 (middle) and LOC103222771 (right). The expression data are from qRT-PCR. Quantification was performed using the relative standard curve method, with the reference gene HPRT1 used as an endogenous control for normalization of the interpolated lncRNA quantities. Hippocampal volume measurements are residuals from a regression on covariates of age and sex. “r” is the Pearson correlation coefficient, and the p-value tests the null hypothesis that r=0. The Pearson correlations between expression of these three genes are: LOC103222765-LOC103222769 r=0.56; LOC103222765-LOC103222771 r=0.64; LOC103222769-LOC103222771 r=0.63.
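The statistic reported in the Fig. 5 caption can be sketched in a few lines. The caption says only that the p-value tests the null hypothesis r=0; the t-statistic below, with n-2 degrees of freedom, is the standard test for that hypothesis and is an assumption about the method used. The function names and the data are hypothetical and are not values from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t-statistic for H0: r = 0, referred to a t distribution with n - 2 df."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical volume residuals and normalized expression values (made up).
volume_resid = [1.0, 2.0, 3.0, 4.0, 5.0]
expression = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(volume_resid, expression)
t = t_statistic(r, len(volume_resid))
print(f"r = {r:.3f}, t = {t:.3f}, df = {len(volume_resid) - 2}")
```

With the 16 animals in Fig. 5, the corresponding test would have 14 degrees of freedom; converting t to a p-value requires a t-distribution routine from a statistics library.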