Emily Mahoney, Vaibhav Janve, Timothy J Hohman, Logan Dumitrescu.
Abstract
Gene-based methods such as PrediXcan use expression quantitative trait loci to build tissue-specific gene expression models when only genetic data are available. There are known sex differences in tissue-specific gene expression and in the genetic architecture of gene expression, but such differences have not been incorporated into predicted gene expression models to date. We built sex-aware PrediXcan models using whole blood transcriptomic data from the Genotype-Tissue Expression (GTEx) project (195 females and 371 males) and evaluated their performance in an independent dataset. Specifically, PrediXcan models were built following the method described in Gamazon et al. 2015, but we included both whole-sample and sex-specific models. Validation used lymphoblast RNA sequencing data from the EUR cohort of the 1000 Genomes Project (178 females and 171 males). Correlations (R2) between observed and predicted expression were evaluated in 5,283 autosomal genes to determine model performance. In sum, we successfully predicted 1,149 genes in males and 623 in females, while 3,511 genes did not appear to be sex-specific. Of the sex-specific genes, 15% (189 genes in males and 73 genes in females) exhibited higher R2 in sex-specific models than in whole-sample models, although the overall gain in predictive power was generally minimal and well within measurement error. Nevertheless, two female-specific genes and six male-specific genes showed significantly better prediction when using the sex-specific weights versus the whole-sample weights; furthermore, several of these genes play a role in mitochondrial metabolism, which is known to be influenced by sex hormones. Taken together, these results support previous reports of the small contribution of genetic architecture to sex-specific expression. Still, sex-aware PrediXcan models were able to provide robust sex-specific prediction signals. Future studies exploring the contribution of the X chromosome and tissue specificity to sex-specific genetically regulated expression will clarify the utility of this method.
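The prediction and evaluation steps described in the abstract can be illustrated with a minimal sketch. It assumes that predicted expression is a weighted sum of SNP dosages and that R2 is the squared Pearson correlation between observed and predicted expression; the SNP names, weights, genotypes, and expression values are hypothetical and are not taken from the study.

```python
def predict_expression(dosages, weights):
    """Predicted expression for one person: weighted sum of SNP dosages."""
    return sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


def r_squared(observed, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        return 0.0  # correlation is undefined without variation
    return cov ** 2 / (var_o * var_p)


# Hypothetical model weights and genotypes (0, 1, or 2 copies of the effect allele).
weights = {"snp_a": 0.5, "snp_b": -0.2}
genotypes = [
    {"snp_a": 0, "snp_b": 2},
    {"snp_a": 1, "snp_b": 1},
    {"snp_a": 2, "snp_b": 0},
    {"snp_a": 2, "snp_b": 1},
]
observed = [-0.3, 0.4, 0.9, 1.1]  # hypothetical measured expression

predicted = [predict_expression(g, weights) for g in genotypes]
r2 = r_squared(observed, predicted)
```

Under these assumptions, a whole-sample model and a sex-specific model differ only in the `weights` dictionary, so the same two functions could score both on the same validation individuals.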
Year: 2022 PMID: 34890163 PMCID: PMC8924937
Source DB: PubMed Journal: Pac Symp Biocomput ISSN: 2335-6928
PrediXcan Model Characteristics and Performance
| Group | N Samples | N Genes | Mean R2 (SE) | Range R2 (min – max) |
|---|---|---|---|---|
| Model Characteristics in GTEx | | | | |
| all | 566 | 6055 | 0.144 (0.002) | 0.010 – 0.768 |
| males | 371 | 5372 | 0.137 (0.002) | 0.010 – 0.752 |
| females | 195 | 3789 | 0.133 (0.002) | 0.019 – 0.593 |
| Models Validated in 1000G | | | | |
| all | 349 | 2081 | 0.095 (0.003) | 0.011 – 0.852 |
| males | 171 | 188 | 0.091 (0.007) | 0.023 – 0.620 |
| females | 178 | 73 | 0.072 (0.009) | 0.021 – 0.357 |
Limited to models with R2>0.01 and p-value<0.05
Limited to validated models, where 1) predicted expression positively associated (ie, p<0.05 and beta>0) with actual expression and 2) whole-sample R2>sex-specific R2
Limited to validated models, where 1) predicted expression positively associated (ie, p<0.05 and beta>0) with actual expression and 2) sex-specific R2>whole-sample R2
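The inclusion rules in the table notes can be written as a small set of predicates. This is only a sketch of the stated thresholds: the function names are hypothetical, and it assumes the association statistics (`beta`, `p_value`) come from regressing observed expression on predicted expression, which the notes do not spell out.

```python
def passes_gtex_filter(r2, p_value):
    """Model counts as successfully built: R2 > 0.01 and p-value < 0.05."""
    return r2 > 0.01 and p_value < 0.05


def is_validated(beta, p_value):
    """Predicted expression is positively associated with actual expression."""
    return p_value < 0.05 and beta > 0


def assign_group(beta, p_value, r2_whole, r2_sex):
    """Place a validated gene with whichever model has the higher R2."""
    if not is_validated(beta, p_value):
        return "not validated"
    if r2_sex > r2_whole:
        return "sex-specific"
    if r2_whole > r2_sex:
        return "whole-sample"
    return "tie"


# R2 values from the first row of the Sex-Specific Genes table;
# beta and p_value are hypothetical placeholders.
example = assign_group(beta=0.4, p_value=0.001, r2_whole=0.163, r2_sex=0.184)
```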
Fig. 1. Overlap of Successfully Modeled Genes in GTEx. The number of genes whose gene expression was successfully modeled in GTEx (ie, R2>0.01 and p-value<0.05) is presented for the whole sample (green), males (orange), and females (blue).
Fig. 2. Comparison of Sex-Specific vs Whole-Sample Models. Model fit (R2) for each gene in males is presented on the x-axis and for females on the y-axis. Points are colored based on whether they had a better R2 from the sex-specific Elastic Net model than from the whole-sample model.
Fig. 3. Change in R2 from Sex-Specific vs Whole-Sample Models in GTEx. Change in R2 (sex-specific R2 minus whole-sample R2) is presented on the x-axis with the number of genes on the y-axis. Bars are stacked with change in males in grey and change in females in red. Colored vertical lines indicate the average R2 change in each sex (0.010 for females, 0.003 for males).
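The quantity plotted in Fig. 3 is a per-gene difference, and the vertical lines are its average within each sex. A minimal sketch of that arithmetic follows; the gene labels and R2 values are hypothetical and serve only to show the calculation.

```python
def r2_changes(r2_sex, r2_whole):
    """Per-gene change: sex-specific R2 minus whole-sample R2."""
    return {gene: r2_sex[gene] - r2_whole[gene]
            for gene in r2_sex if gene in r2_whole}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Hypothetical R2 values for three genes modeled in one sex.
whole_sample = {"gene_1": 0.10, "gene_2": 0.25, "gene_3": 0.05}
sex_specific = {"gene_1": 0.12, "gene_2": 0.24, "gene_3": 0.07}

changes = r2_changes(sex_specific, whole_sample)
average_change = mean(changes.values())
```

A positive change means the sex-specific model fit that gene better than the whole-sample model; a negative change means the opposite.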
Sex-Specific Genes
Male-specific genes

| Gene | Whole Sample: N SNPs | Whole Sample: GTEx R2 | Whole Sample: 1000G R2 | Males: N SNPs | Males: GTEx R2 | Males: 1000G R2 | R2 Change |
|---|---|---|---|---|---|---|---|
| | 12 | 0.055 | 0.163 | 9 | 0.080 | 0.184 | 0.021 [0.001, 0.040] |
| | 40 | 0.210 | 0.076 | 34 | 0.219 | 0.102 | 0.025 [0.001, 0.047] |
| | 7 | 0.016 | 0.021 | 24 | 0.042 | 0.096 | 0.048 [0.001, 0.094] |
| | 51 | 0.132 | 0.115 | 32 | 0.179 | 0.142 | 0.060 [0.015, 0.098] |
| | 46 | 0.151 | 0.016 | 18 | 0.155 | 0.051 | 0.038 [0.002, 0.070] |
| | 14 | 0.276 | 0.104 | 11 | 0.282 | 0.142 | 0.027 [0.004, 0.047] |

Female-specific genes

| Gene | Whole Sample: N SNPs | Whole Sample: GTEx R2 | Whole Sample: 1000G R2 | Females: N SNPs | Females: GTEx R2 | Females: 1000G R2 | R2 Change |
|---|---|---|---|---|---|---|---|
| | 67 | 0.282 | 0.213 | 51 | 0.313 | 0.306 | 0.089 [0.048, 0.129] |
| | 40 | 0.065 | 0.241 | 28 | 0.163 | 0.322 | 0.054 [0.014, 0.097] |
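The record does not say how the bracketed intervals next to each R2 change were obtained. One common way to attach an interval to a difference in R2 is a percentile bootstrap over individuals, sketched below purely as an assumption; the function names, the 2.5th and 97.5th percentile cutoffs, and the toy data are all hypothetical and should not be read as the authors' procedure.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation; 0.0 when either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return 0.0 if vx == 0 or vy == 0 else cov ** 2 / (vx * vy)


def bootstrap_r2_change(observed, pred_whole, pred_sex, n_boot=1000, seed=0):
    """Percentile interval for R2(sex-specific) minus R2(whole-sample)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(observed)
    diffs = []
    for _ in range(n_boot):
        # Resample individuals with replacement, keeping triplets aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        obs = [observed[i] for i in idx]
        whole = [pred_whole[i] for i in idx]
        sex = [pred_sex[i] for i in idx]
        diffs.append(r_squared(obs, sex) - r_squared(obs, whole))
    diffs.sort()
    lo_idx = (n_boot * 25) // 1000                     # ~2.5th percentile
    hi_idx = max(lo_idx, (n_boot * 975) // 1000 - 1)   # ~97.5th percentile
    return diffs[lo_idx], diffs[hi_idx]


# Hypothetical observed and predicted expression for eight individuals.
observed = [0.1, 0.5, 0.9, 1.2, 1.8, 2.1, 2.4, 3.0]
pred_whole = [0.3, 0.2, 1.1, 0.9, 1.5, 2.5, 1.9, 2.6]
pred_sex = [0.2, 0.4, 1.0, 1.1, 1.7, 2.2, 2.3, 2.9]

lower, upper = bootstrap_r2_change(observed, pred_whole, pred_sex, n_boot=200)
```

Under this approach, an interval whose lower bound stays above zero would indicate that the sex-specific model predicts better than the whole-sample model for that gene in the resampled data.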
Fig. 4. Genes that Showed Significantly Better Prediction Using Female-Specific Models. This figure presents the two female-specific genes that were validated in 1000G, with MRPS10 in the top row and MRPL14 in the bottom row. The y-axis in all four plots is the observed expression of each gene within females. In the plots on the left, the x-axis is the whole-sample predicted expression (that is, predicted expression based on Elastic Net models with both males and females), while the x-axis in the plots on the right is the female-specific predicted expression. The R2 for each model is indicated in the bottom-right corner of each plot.