Understanding Non-Mendelian Genetic Risk.

Abstract

This opinion paper highlights strategies for better understanding the non-Mendelian genetic risk revealed by genome-wide association studies (GWAS) of complex diseases. This genetic risk resides predominantly in non-coding regulatory DNA, such as enhancers. The mechanisms, the causal variants (mainly SNPs), and their target genes are, however, not always apparent, although they are likely involved in a network of risk determinants; identifying them is a bottleneck in fully understanding the genetics of complex phenotypes. Here, we propose strategies to identify functional SNPs and link risk enhancers with their target genes. The strategies are 1) identifying fine-mapped SNPs that break/form response elements within chromatin bio-features in relevant cell types, 2) considering the nearest gene on linear DNA, 3) analyzing eQTLs, 4) mapping differentially methylated DNA regions and relating them to gene expression, 5) employing genomic editing with CRISPR/Cas9 and 6) identifying topologically associating chromatin domains using chromatin conformation capture.

Keywords: GWAS functionality; SNP; chromatin; genes; genomics; non-mendelian genetic risk

Year: 2019 PMID: 32476988 PMCID: PMC7235392 DOI: 10.2174/1389202920666191018085511

Source DB: PubMed Journal: Curr Genomics ISSN: 1389-2029 Impact factor: 2.236

INTRODUCTION

Non-Mendelian genomic risk is a relatively new field in understanding genetic diseases, such as cancer. It contrasts with Mendelian-inherited mutations, which can be followed in families via linkage analyses and have been known and studied for some time [1]. The contrast was revealed by genome-wide association studies (GWAS) of many complex traits, which identified non-Mendelian risk loci containing polymorphic variants mostly occurring in non-coding DNA; functional analyses of such risk have largely lagged behind the original GWAS signal identification. Up to 2016, out of 3,836 successful GWAS, only 84 had yielded some (but not complete) mechanistic understanding [2]. In most studies, risk SNPs have only been associated with disease and not shown to cause it. The reasons are that a GWAS signal at a particular locus has several surrogates in linkage disequilibrium and that these, in turn, are linked to the functions of closely mapped (nearest) genes of interest [2]. Therefore, two main questions remain: which SNP or SNPs are functional/causal, and which genes functionally translate the risk signal?

FUNCTIONAL/CAUSAL RISK SNPs

SNP alleles are in linkage disequilibrium (LD) to varying degrees in different racial-ethnic groups and at different loci [3]. Fine mapping can be achieved by direct genotyping and/or imputation against the 1000 Genomes reference panel [4]; SNPs with the lowest p-values and greatest effect sizes are the most likely causal ones. The functionality of SNPs in a given cell type can be gleaned from nucleosome positioning (DNase I hypersensitivity and ATAC-seq) and the surrounding histone modifications (H3K27ac and H3K4me1) [5]; SNPs at these sites are likely to be functionally involved, especially if they break/form transcription factor motifs [6].
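The "break/form" criterion can be made concrete by scoring both alleles of a SNP against a position weight matrix (PWM) and comparing the results. The following is a minimal Python sketch under stated assumptions: the 4-bp log-odds matrix `TOY_PWM`, the example sequence, and the function names are all hypothetical and do not describe a real transcription factor or the tools used in the cited work.

```python
# Hypothetical log-odds PWM for a 4-bp motif: one dict per motif position.
# Values are illustrative only and do not describe a real transcription factor.
TOY_PWM = [
    {"A": 1.2, "C": -1.0, "G": -1.0, "T": -0.5},
    {"A": -1.0, "C": 1.5, "G": -0.8, "T": -1.0},
    {"A": -0.7, "C": -1.0, "G": 1.4, "T": -1.0},
    {"A": -1.0, "C": -0.9, "G": -1.0, "T": 1.3},
]


def best_window_score(seq, pwm, snp_index):
    """Best PWM score among the windows of seq that overlap snp_index."""
    width = len(pwm)
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(seq) - width)
    scores = []
    for start in range(first, last + 1):
        window = seq[start:start + width]
        scores.append(sum(pwm[i][base] for i, base in enumerate(window)))
    return max(scores)


def motif_delta(seq, snp_index, alt_base, pwm):
    """Return (ref_score, alt_score, alt - ref) for a single-base substitution."""
    alt_seq = seq[:snp_index] + alt_base + seq[snp_index + 1:]
    ref_score = best_window_score(seq, pwm, snp_index)
    alt_score = best_window_score(alt_seq, pwm, snp_index)
    return ref_score, alt_score, alt_score - ref_score


# Hypothetical locus: the reference G at index 4 completes the motif ACGT.
ref_score, alt_score, delta = motif_delta("TTACGTAA", 4, "A", TOY_PWM)
```

A negative `delta` suggests that the alternate allele weakens (breaks) the motif, and a positive one that it strengthens (forms) it. Real analyses would use curated motif collections and an explicit background model, and would scan both strands.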

IDENTIFICATION OF GENES THAT FUNCTIONALLY TRANSLATE THE RISK SIGNAL

Genes near GWAS risk signals may not necessarily impose risk. This is because many, if not most, GWAS risk signals reside in regulatory DNA, such as enhancers. Matching enhancers with genes has revealed widespread intergenic and intragenic (intronic) interactions, which in turn may control gene expression at some genomic distance [7]. There are several examples of risk enhancers controlling genes at a distance, but a striking one is an obesity- and type 2 diabetes-associated non-coding sequence within an intron of the gene FTO, which was shown to be functionally connected with the gene IRX3, a megabase away [8]. Even at very short map distances, more than 40% of enhancers skip over the nearest gene and interact with distant ones [9]. Furthermore, some enhancers regulate multiple genes, and several enhancers may interact with a given gene [10, 11]. Enhancers containing risk SNPs (risk enhancers) may influence phenotypes (both normal and pathological) via complex mechanisms [12]. The problem of understanding non-Mendelian genetic risk can be formulated, in the first instance, as which SNPs are causal and how best to match risk enhancers with the promoters of the genes they regulate, thus revealing risk mechanisms. Five enhancer/promoter matching strategies can be considered to shed light on the above conundrum.

STRATEGIES TO MATCH ENHANCERS WITH GENE PROMOTERS

Nearest Gene

This strategy is the one used most often to assign genes to GWAS loci. In some cases (but not all), this makes perfect sense, especially if the nearest gene also happens to carry Mendelian-inherited mutations. Germline mutations and SNPs at genes such as TERT, p53, and BRCA1/2 indicate that these genes are involved in cancer etiology [13]. However, as stated above, in many cases the nearest gene is not the one involved in complex disease risk.
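For reference, the nearest-gene heuristic amounts to a lookup on linear coordinates. Here is a minimal sketch, assuming a sorted list of transcription start sites on a single chromosome; the gene names, positions, and the function `nearest_gene` are hypothetical.

```python
from bisect import bisect_left


def nearest_gene(snp_pos, tss_list):
    """Return the (position, gene) entry closest to snp_pos on linear DNA.

    tss_list: (position, gene_name) tuples on one chromosome, sorted by position.
    """
    positions = [pos for pos, _ in tss_list]
    i = bisect_left(positions, snp_pos)
    # Only the entries immediately left and right of the SNP can be nearest.
    candidates = tss_list[max(0, i - 1):i + 1]
    return min(candidates, key=lambda item: abs(item[0] - snp_pos))


# Hypothetical transcription start sites.
tss = [(1_000, "GENE_A"), (5_000, "GENE_B"), (9_000, "GENE_C")]
closest = nearest_gene(6_200, tss)
```

The lookup uses linear distance only, which is exactly why this strategy can mislead when an enhancer skips over its neighbor.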

eQTL Analyses

Expression quantitative trait loci (eQTL) analyses are based on the correlation between variant genotypes (homozygous reference, heterozygous, homozygous alternate allele) and gene expression across a large number of samples [14]. In this approach, a priori candidates must be identified to see which genes in cis are the most likely ones, since genome-wide analyses are constrained by the significance thresholds imposed by multiple-hypothesis testing. eQTL analyses suffer from lack of power (resulting in false negatives), cell type heterogeneity, and false positives due to stochastic variation and abundant association [15].
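At its core, a cis-eQTL test regresses expression on allele dosage (0, 1 or 2 copies of the alternate allele). The sketch below, with made-up data and hypothetical function names, computes the least-squares slope and Pearson correlation, and uses a Bonferroni correction as one simple illustration of why restricting the test to a few cis candidates relaxes the significance threshold. It is not the procedure of any specific eQTL package.

```python
def eqtl_slope(dosages, expression):
    """Least-squares slope and Pearson r of expression on allele dosage."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx
    r = sxy / (sxx * syy) ** 0.5
    return slope, r


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Made-up data: six samples, two per genotype class.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, r = eqtl_slope(dosages, expression)

# Assumed test counts: 20,000 genes genome-wide versus 10 cis candidates.
genome_wide = bonferroni_threshold(0.05, 20_000)
cis_only = bonferroni_threshold(0.05, 10)
```

With the assumed counts, the cis-only threshold is 2,000 times less stringent than the genome-wide one, which illustrates the power argument for a priori candidates. A real analysis would also model covariates such as ancestry, batch, and cell type composition.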

Epigenetic Traits

By comparing DNA methylation with gene expression levels across multiple samples, one can correlate increased DNA methylation at enhancers with inhibition of gene expression. The method is based on the negative correlation between CpG methylation at enhancers and their gene regulatory activities. It was recently employed to compare tumor with normal tissues [9, 16]. The advent of genome-wide bisulfite sequencing (to detect all methylated DNA sites) will, in the future, reveal active and inactive enhancers in many cell types. An updated software tool for this analysis, called ELMER2, has since been published [17]. It is important to note that this type of analysis is only correlative and thus cannot be used to understand precise mechanisms of direct interactions.
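The negative correlation underlying this strategy can be illustrated with a rank correlation between enhancer methylation and the expression of a candidate gene across samples. This is a minimal sketch with made-up values and hypothetical function names; it does not reproduce the ELMER2 algorithm.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up values: enhancer CpG methylation (beta) and expression of one gene.
methylation = [0.10, 0.25, 0.40, 0.60, 0.85]
gene_expression = [9.1, 8.0, 6.5, 4.2, 2.0]
rho = spearman(methylation, gene_expression)
```

A strongly negative `rho` marks the gene as a candidate target of the enhancer. As noted above, such a result is correlative and says nothing about physical interaction.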

Genomic Editing

Genome editing using CRISPR/Cas9 technology has gained prominence due to the precision with which it can be done and its potential benefits both in vitro (experimental systems) and in vivo (correcting genetic defects) [18]. This powerful technology may be employed to identify enhancer/target gene pairs. Enhancers containing risk variants can be edited using CRISPR/Cas9 by direct deletion or allelic replacement [19]. Both types of manipulation can be followed by RNA-seq to determine the resulting changes in gene expression. Risk enhancers and insulator sites (CTCF binding) containing risk SNPs can be manipulated to bring different enhancer/promoter matches into play [20]. A major concern is the possibility that the guide RNAs used to target the locus in question may also bind to irrelevant sites and create off-target artifacts; this can be addressed by using different non-overlapping guide RNAs at the same locus, but this is expensive and labor-intensive. It is important to note that this approach does not distinguish between direct effects and indirect effects mediated by intermediate genes.
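Choosing several non-overlapping guide RNAs at one locus can be sketched as follows. The code assumes the commonly used SpCas9 convention of a 20-nt protospacer followed by an NGG PAM, which the text does not specify; it scans the forward strand only and makes no attempt to predict off-target binding. The sequence and function names are hypothetical.

```python
import re


def find_guides(seq, guide_len=20):
    """Forward-strand protospacers followed by an NGG PAM, as (start, guide)."""
    pattern = r"(?=([ACGT]{%d})[ACGT]GG)" % guide_len
    # The lookahead makes each match zero-width, so overlapping candidates
    # are all reported.
    return [(m.start(), m.group(1)) for m in re.finditer(pattern, seq)]


def non_overlapping(guides, guide_len=20):
    """Greedy left-to-right choice of guides whose protospacers do not overlap."""
    chosen = []
    next_free = 0
    for start, guide in guides:
        if start >= next_free:
            chosen.append((start, guide))
            next_free = start + guide_len
    return chosen


# Hypothetical locus with two overlapping candidates (PAMs TGG and GGG).
locus = "A" * 20 + "TGGG"
candidates = find_guides(locus)
independent = non_overlapping(candidates)
```

Concordant expression changes from independent guides argue against an off-target artifact, although, as stated above, they still cannot separate direct from indirect effects.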

Chromatin Conformation Capture

Several versions of this approach have been developed. All are based on the crosslinking of intact chromatin, followed by restriction enzyme digestion and subsequent ligation. This covalently links DNA fragments that are distant on linear DNA, revealing looping and what have been described as topologically associating domains (TADs). Gene-enhancer pairs most likely function within such TADs. The versions are chromatin conformation capture (3C), which tests one-to-one interactions; circular chromatin conformation capture (4C), between a locus (aka viewpoint) and genome-wide targets; chromatin conformation capture carbon copy (5C), between many interactions at a locus; and finally genome-wide chromatin conformation capture (Hi-C), between many viewpoints and targets. Many reviews have been written on this topic; two are cited here [21, 22]. These powerful methods directly assess enhancer/promoter interactions and are not dependent on gene intermediates. False positives may result from fortuitous, non-relevant contacts.
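Once TAD boundaries have been called, the statement that gene-enhancer pairs most likely function within a TAD translates into a simple filter on candidate promoters. The following is a minimal sketch with hypothetical coordinates, gene names, and function names; TAD calling itself is not shown.

```python
def tad_containing(pos, tads):
    """Return the (start, end) half-open TAD interval containing pos, or None."""
    for start, end in tads:
        if start <= pos < end:
            return (start, end)
    return None


def same_tad(enhancer_pos, promoter_pos, tads):
    """True if both positions fall within the same TAD."""
    tad = tad_containing(enhancer_pos, tads)
    return tad is not None and tad == tad_containing(promoter_pos, tads)


# Hypothetical TADs on one chromosome and one risk enhancer.
tads = [(0, 500_000), (500_000, 1_200_000)]
enhancer = 450_000
promoters = {"GENE_NEAR": 520_000, "GENE_FAR": 100_000}

in_tad = [gene for gene, pos in promoters.items() if same_tad(enhancer, pos, tads)]
```

In this made-up example, the gene closest on linear DNA lies across a TAD boundary and is filtered out, while the more distant gene within the same domain is retained.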

CONCLUSION

The strategies outlined above are complementary and should be used in combination to ensure viable mechanistic insight into non-Mendelian genetic risk. Such understanding will yield novel insight (and likely therapeutic targets) in the genetic etiology of complex diseases.

22 in total

Review 1. The lod score method.

Authors: J P Rice; N L Saccone; J Corbett
Journal: Adv Genet Date: 2001 Impact factor: 1.944

Review 2. Genotype imputation for genome-wide association studies.

Authors: Jonathan Marchini; Bryan Howie
Journal: Nat Rev Genet Date: 2010-07 Impact factor: 53.242

Review 3. Long-Range Chromatin Interactions.

Authors: Job Dekker; Tom Misteli
Journal: Cold Spring Harb Perspect Biol Date: 2015-10-01 Impact factor: 10.005

4. Abundant associations with gene expression complicate GWAS follow-up.

Authors: Boxiang Liu; Michael J Gloudemans; Abhiram S Rao; Erik Ingelsson; Stephen B Montgomery
Journal: Nat Genet Date: 2019-05 Impact factor: 38.330

5. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation.

Authors: Guoliang Li; Xiaoan Ruan; Raymond K Auerbach; Kuljeet Singh Sandhu; Meizhen Zheng; Ping Wang; Huay Mei Poh; Yufen Goh; Joanne Lim; Jingyao Zhang; Hui Shan Sim; Su Qin Peh; Fabianus Hendriyan Mulawadi; Chin Thing Ong; Yuriy L Orlov; Shuzhen Hong; Zhizhuo Zhang; Steve Landt; Debasish Raha; Ghia Euskirchen; Chia-Lin Wei; Weihong Ge; Huaien Wang; Carrie Davis; Katherine I Fisher-Aylor; Ali Mortazavi; Mark Gerstein; Thomas Gingeras; Barbara Wold; Yi Sun; Melissa J Fullwood; Edwin Cheung; Edison Liu; Wing-Kin Sung; Michael Snyder; Yijun Ruan
Journal: Cell Date: 2012-01-20 Impact factor: 41.582