Abstract
The discovery of expression quantitative trait loci (eQTLs) and their target genes (eGenes) has not only compensated for the limitations of genome-wide association studies for complex phenotypes but has also provided a basis for predicting gene expression. Efforts have been made to develop analytical methods in statistical genetics, a key discipline in eQTL analysis. In particular, mixed model- and deep learning-based analytical methods have been extremely beneficial in mapping eQTLs and predicting gene expression. Nevertheless, we still face many challenges associated with eQTL discovery. Here, we discuss two key aspects of these challenges: (1) the complexity of eTraits, which is shaped by factors such as polygenicity and epistasis, and (2) the voluminous work required for various types of eQTL profiles. The properties and prospects of statistical methods, including the mixed model method, Bayesian inference, the deep learning method, and the integration method, are presented as future directions for eQTL discovery. This review will help expedite the design and use of efficient methods for eQTL discovery and eTrait prediction.
Keywords: complex phenotype; expression quantitative trait locus; regulation of gene expression; statistical genetics; target gene
Year: 2022 PMID: 35205280 PMCID: PMC8871770 DOI: 10.3390/genes13020235
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Examples of eQTL studies and reviews using a mixed model or deep learning.
| eQTL | Methods | Results | Reference |
|---|---|---|---|
| Mixed model | | | |
| cis-neQTL | GEMMA | Inflammation-dependent cis-neQTLs in intestine | [ |
| neQTL | GaLA-QTLM | neQTL mapping with ancestry data in multi-ethnic population | [ |
| neQTL | MM with pedigree-based covariance matrix | Cell type-specific neQTLs in brain and blood for Alzheimer’s disease | [ |
| neQTL | BOLT-LMM, GEMMA | Cis-/trans-neQTLs in peripheral blood and their contribution to heritability | [ |
| cis-neQTL | StructLMM | Cell-context interaction with environmental variables | [ |
| neQTL, rQTL, pQTL | AIREML, BLUP | Efficient translational control of pQTLs for ribosomal protein genes | [ |
| neQTL, rQTL, pQTL | AIREML | Inclusion of optimal number of eQTLs in constructing polygenic covariance matrix | [ |
| General | REML and others | Review on frequentist mixed model methodology | [ |
| General | Gibbs sampling, MCMC | Review on Bayesian mixed model methodology | [ |
| Deep learning | | | |
| meQTL | CpGenie (CNN) | Prediction of the allele-specific impact of nucleotide variants on proximal CpG methylation | [ |
| cQTL | DeepHiC (CNN) | Functional prediction of nucleotide substitution on chromatin interaction using Hi-C data | [ |
| neQTL, dsQTL, atacQTL | MtBNN | Incorporating a Bayesian approach to assessing functional impact of non-coding variants | [ |
| General | DNN | Introduction to deep learning for genomics covering more than eQTL mapping | [ |
neQTL, narrow-sense expression quantitative trait locus; rQTL, ribosome occupancy QTL; pQTL, protein abundance QTL; meQTL, methylation QTL; cQTL, chromatin interaction QTL; dsQTL, DNase sensitivity QTL; atacQTL, assay for transposase accessible chromatin QTL; GEMMA, genome-wide efficient mixed-model association; MM, mixed model; AIREML, average information restricted maximum likelihood; BLUP, best linear unbiased prediction; MCMC, Markov chain Monte Carlo; CNN, convolutional neural network; RNN, recurrent neural network; DNN, deep neural network.
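The mixed model methods listed in the table above test a variant for association with expression while accounting for relatedness through a covariance (kinship) matrix. The following is a minimal sketch of that idea, not the implementation of GEMMA, BOLT-LMM, or any other listed tool. It assumes a single tested variant, an intercept as the only covariate, a precomputed kinship matrix `K`, and a simple grid search over the variance ratio `delta` (residual variance divided by polygenic variance) using maximum likelihood rather than REML or AIREML. The function name `lmm_association` and all parameter names are hypothetical.

```python
import numpy as np

def lmm_association(y, snp, K, deltas=np.logspace(-3, 3, 61)):
    """Test one variant under y = X*beta + g + e with g ~ N(0, sg2*K), e ~ N(0, se2*I).

    Illustrative sketch only: delta = se2/sg2 is chosen by a grid search on the
    maximum-likelihood profile, which is an assumption made for brevity.
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    X = np.column_stack([np.ones(n), np.asarray(snp, dtype=float)])

    # Eigendecompose K once; rotating by U makes the covariance diagonal.
    S, U = np.linalg.eigh(K)
    S = np.clip(S, 0.0, None)          # guard against tiny negative eigenvalues
    y_r, X_r = U.T @ y, U.T @ X

    best = None
    for delta in deltas:
        w = 1.0 / (S + delta)          # inverse variances up to the factor sg2
        XtWX = X_r.T @ (w[:, None] * X_r)
        beta = np.linalg.solve(XtWX, X_r.T @ (w * y_r))
        resid = y_r - X_r @ beta
        sg2 = np.sum(w * resid**2) / n  # ML estimate of the polygenic variance
        loglik = -0.5 * (n * np.log(2 * np.pi * sg2) + np.sum(np.log(S + delta)) + n)
        if best is None or loglik > best[0]:
            best = (loglik, delta, beta, sg2, XtWX)

    loglik, delta, beta, sg2, XtWX = best
    se = np.sqrt(sg2 * np.linalg.inv(XtWX)[1, 1])
    return {"beta": beta[1], "se": se, "z": beta[1] / se, "delta": delta}
```

When `K` is the identity matrix, the weights are uniform and the estimate reduces to ordinary least squares, which is a convenient sanity check. Production tools differ in how they estimate variance components and in how they scale to many variants and genes.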
Figure 1. Obstacles to expression quantitative trait locus (eQTL) identification. General obstacles to both cis-eQTL and trans-eQTL identification are presented in red, whereas obstacles to only trans-eQTL identification are presented in dark green. The asterisk indicates obstacles that increase the difficulty of inference of causality.
Types of expression quantitative trait loci (eQTLs) associated with various molecular layers.
| Name | Abbrev. | Molecular Phenotype Associated with eQTL (Method) | Ref. |
|---|---|---|---|
| | | | |
| chromatin accessibility QTL | caQTL | Active and potential regulatory DNA elements, e.g., dsQTL (DNase-seq, ATAC-seq) | [ |
| methylation QTL | meQTL | DNA methylation for altering chromatin structure, mainly in regulatory regions such as promoters and intron–exon boundaries (ChIP-seq) | [ |
| histone QTL | hQTL | Magnitude of histone post-translational modifications for chromosomal packaging, e.g., H3K4me3 for promoters, H3K4me1 for enhancers, H3K27ac for promoters and enhancers (ChIP-seq) | [ |
| TF binding QTL | bQTL | Transcription factor binding, e.g., NF-κB, PU.1/Spi1, Stat1, JunD, and Pou2f1/Oct1 (ChIP-seq) | [ |
| | | | |
| promoter interacting eQTL | pieQTL | eQTLs overlapping active cis-regulatory elements that interact with their target gene promoters (HiChIP) | [ |
| chromatin interaction QTL | cQTL | Allelic differences of chromatin interactions between two homologous chromosomes mediated by CTCF and RNAPII (ChIA-PET) | [ |
| promoter enhancer interaction QTL | peQTL | Allele-specific RNAPII-mediated chromatin interactions with phased transcript (ChIA-PET) | [ |
| | | | |
| narrow-sense eQTL | neQTL | Gene expression level as the sum of all transcripts of each gene. We use the term “neQTL” to distinguish this from “eQTL”, a generic term for all types (RNA-seq) | [ |
| miRNA eQTL | miR-eQTL | Expression level of miRNA for post-transcriptional and translational regulation (small RNA-seq) | [ |
| lncRNA eQTL | lncR-eQTL | Expression level of lncRNA for transcriptional, post-transcriptional, and epigenetic regulation (RNA-seq) | [ |
| circRNA-eQTL | Circ-eQTL | Expression level of circRNA for sequestration of miRNAs/proteins, splicing interference, and transcriptional and translational regulation (RNA-seq) | [ |
| response eQTL | reQTL | Transcriptomic response to external stimuli (RNA-seq) | [ |
| repeat eQTL | repeat-eQTL | Retrotransposon-derived repeat element as a source for evolution of new transcripts (RNA-seq) | [ |
| splicing QTL | sQTL | Relative abundance of the transcript isoforms of a gene or the intron excision ratios of an intron cluster for regulation of alternative splicing (RNA-seq) | [ |
| transcript ratio QTL | trQTL | Ratio of each transcript to the total gene expression for transcript usage, splicing, and transcript structure (RNA-seq) | [ |
| allele-specific expression QTL | aseQTL | Transcription differences between the two haplotypes in a heterozygous individual (RNA-seq) | [ |
| poly(A) ratio QTL | apaQTL | Alternative polyadenylation for mRNA stability and translation efficiency (RNA-seq) | [ |
| RNA editing QTL | edQTL | RNA editing level for post-transcriptional processes such as RNA splicing, localization, stability, and translational efficiency (RNA-seq) | [ |
| m6A QTL | m6A-QTL | N6-methyladenosine level in mRNA transcripts for mRNA processing (m6A-seq) | [ |
| RNA synthesis rate QTL | rsQTL | Transcription rates (4sU-seq) | [ |
| RNA decay QTL | rdQTL | mRNA decay rates for modulating steady-state transcript levels (RNA-seq) | [ |
| transcription initiation QTL | tiQTL | Activity of transcribed transcriptional regulatory elements (tTREs) in promoter and enhancer regions (PRO-seq) | [ |
| directional initiation QTL | diQTL | Directionality of divergent bidirectional transcription at tTREs using the log ratio of plus-strand reads over minus-strand reads (PRO-seq) | [ |
| | | | |
| ribosome occupancy QTL | rQTL | Ribosome occupancy for translational regulation and translation efficiency (Ribo-seq) | [ |
| protein abundance QTL | pQTL | Protein expression level for post-transcriptional regulation (mass spectrometry) | [ |
| | | | |
| metabolite QTL | mQTL | Small endogenous molecules or metabolites that reflect the dynamic response to physiological, pathophysiological, and/or developmental stimuli (NMR or mass spectrometry) | [ |
| microbiome QTL | mbQTL | Microbial composition in multiple host tissues such as gut and skin (16S rRNA and ITS sequencing) | [ |
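Two of the molecular phenotypes in the table above are defined as simple ratios, and a short sketch may make those definitions concrete: the trQTL phenotype (ratio of each transcript to the total gene expression) and the diQTL phenotype (log ratio of plus-strand reads over minus-strand reads). The function names, the base-2 logarithm, and the pseudocount are assumptions made for illustration and are not taken from the cited studies.

```python
import numpy as np

def transcript_ratios(transcript_expr):
    """trQTL-style phenotype: each transcript's share of total gene expression.

    transcript_expr: array of shape (samples, transcripts) for a single gene.
    Samples with zero total expression yield NaN (assumed convention).
    """
    expr = np.asarray(transcript_expr, dtype=float)
    total = expr.sum(axis=1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(total > 0, expr / total, np.nan)

def directionality_index(plus_reads, minus_reads, pseudocount=1.0):
    """diQTL-style phenotype: log2 ratio of plus-strand to minus-strand reads.

    The pseudocount (hypothetical default of 1) avoids taking the log of zero.
    """
    plus = np.asarray(plus_reads, dtype=float) + pseudocount
    minus = np.asarray(minus_reads, dtype=float) + pseudocount
    return np.log2(plus / minus)
```

Either phenotype would then be treated as the trait in an association test against genotype, in the same way as a gene expression level.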