Zhaozhong Zhu, Kohei Hasegawa, Carlos A Camargo, Liming Liang.
Abstract
Asthma is a heterogeneous respiratory disease reflecting distinct pathobiologic mechanisms. These mechanisms are based, at least partly, on different genetic factors shared by many other conditions, such as allergic diseases and obesity. Investigating the shared genetic effects enables better understanding of the mechanisms of phenotypic correlations and is less subject to confounding by environmental factors. The increasing availability of large-scale genome-wide association studies (GWASs) for asthma has enabled researchers to examine the genetic contributions to the epidemiologic associations between asthma subtypes and those between coexisting diseases and/or traits and asthma. Studies have found not only shared but also distinct genetic components between asthma subtypes, indicating that the heterogeneity is related to distinct genetics. This review summarizes a recently compiled analytic approach, genome-wide cross-trait analysis, to determine shared and distinct genetic architecture. Genome-wide cross-trait analysis comprises several analytic components: genetic correlation, cross-trait meta-analysis, Mendelian randomization, polygenic risk score, and functional analysis. In this article, we discuss in detail the scientific goals that can be achieved by these analyses, their advantages, and their limitations. We also make recommendations for future directions: (1) ethnicity-specific asthma GWASs and (2) application of cross-trait methods to multiomics data to dissect the heritability found in GWASs. Finally, these analytic approaches are also applicable to complex and heterogeneous traits beyond asthma.
Keywords: Asthma; GWAS; allergic diseases; cross-trait; heterogeneity; obesity; shared genetics
Year: 2020 PMID: 32693092 PMCID: PMC7368660 DOI: 10.1016/j.jaci.2020.07.004
Source DB: PubMed Journal: J Allergy Clin Immunol ISSN: 0091-6749 Impact factor: 10.793
Fig 1. Venn diagram of shared and distinct genetics between asthma subtypes and those between coexistent diseases or traits and asthma. A, Shared and distinct genetics between asthma subtypes (eg, allergic asthma vs nonallergic asthma). B, Shared genetics between coexistent diseases/traits (eg, allergic diseases, obesity, mental health disorders) and asthma. The size of the overlap between each of the 3 example coexistent diseases/traits and asthma is based on the order of their genetic similarity, with allergic diseases sharing the most genetic components with asthma, followed by obesity and mental health disorders. A and B, The area covered by horizontal cross-lines indicates similar underlying genetic components/mechanisms in allergic asthma and shared genetics of allergic diseases and asthma.
Fig 2. Directed acyclic graphs of the relationship between shared genetic or environmental factors and traits. A, Shared environmental factors (not affected by shared genetic variants) will not bias genetic correlation. After appropriate control for population ancestry, the genetic effects b1 and b2 are unrelated to E, and therefore the genetic correlation (the correlation between b1 and b2) is not related to E. B, Another situation in which shared environmental factors (not affected by shared genetic variants) will not bias genetic correlation. C, Shared environmental factors (partially affected by shared genetic variants). In this case, E is not considered a confounder; rather, it is considered a mediator in the causal pathway of interest; c represents the effect of shared genetic variants on environmental factors, and d1 and d2 represent the effect of shared environmental factors on traits.
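One way to read the quantities in Fig 2 is as coefficients of a simple linear model. The sketch below is an illustration under assumptions the caption does not state: all effects are taken to be linear and additive, G denotes the shared genetic variant, Y1 and Y2 denote the two traits, and b1 and b2 are treated as the direct effects of G in panel C.

```latex
\begin{aligned}
E   &= c\,G + \varepsilon_E,\\
Y_1 &= b_1 G + d_1 E + \varepsilon_1,\\
Y_2 &= b_2 G + d_2 E + \varepsilon_2,\\
\text{total effect of } G \text{ on } Y_k &= b_k + c\,d_k, \qquad k = 1, 2.
\end{aligned}
```

Under this reading, setting c = 0 (panels A and B) leaves a total genetic effect of b_k, so E does not enter the genetic correlation. With c ≠ 0 (panel C), the term c·d_k is the part of the genetic effect that is mediated through E.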
Fig 3. Data availability for GWAS of asthma and study design of genome-wide cross-trait analysis. Genome-wide genetic correlation analysis is used to examine the genetic correlation between a pair of traits by using genome-wide SNPs. Cross-trait meta-analysis is used to determine the shared genetic variants between multiple traits. Mendelian randomization analysis is used to examine the causal effect of the exposure trait on the other trait by using the genetic variants for exposure trait as the instrument variables. eQTL enrichment analysis is used to determine the enrichment of genetic variants associated with complex traits in eQTL. Fine mapping credible set analysis is used to examine whether there is a potential causal variant at each locus. Variant functional annotation is used to predict the functional effect of an individual SNP on a transcript. eQTL colocalization analysis is used to determine the shared causal variants between GWAS signals and eQTL signals. CAAPA, Consortium on Asthma among African-ancestry Populations in the Americas; TAGC, Trans-National Asthma Genetic Consortium.
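To make the cross-trait meta-analysis step more concrete, here is a minimal sketch of a generic two-trait omnibus test for a single variant. It is not the algorithm used by ASSET, CPASSOC, or MTAG. The function name, the inputs, and the choice of a plain chi-square test with 2 degrees of freedom are all assumptions made for illustration. The parameter `r` stands for the correlation between the two z-scores under the null, which is nonzero when the two GWASs share samples.

```python
import math

def two_trait_omnibus_test(z1, z2, r):
    """Test H0: the variant is associated with neither trait.

    z1, z2: per-trait GWAS z-scores for one variant.
    r: assumed correlation of the two z-scores under the null
       (hypothetical input; nonzero when the GWASs share samples).
    Returns (statistic, p_value). The statistic is z' R^-1 z, which follows
    a chi-square distribution with 2 degrees of freedom under H0.
    """
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    stat = (z1 * z1 - 2 * r * z1 * z2 + z2 * z2) / (1 - r * r)
    p_value = math.exp(-stat / 2)  # exact chi-square survival function for 2 df
    return stat, p_value

stat, p = two_trait_omnibus_test(3.0, 3.0, r=0.5)
print(f"chi-square = {stat:.2f}, p = {p:.2e}")
```

With the same z-scores, setting `r=0` gives a larger statistic than `r=0.5`. This is one way to see how ignoring sample overlap can overstate the evidence when the two z-scores point in the same direction.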
Glossary of terms related to genome-wide cross-trait analysis
| Term | Definition |
|---|---|
| Cross-trait meta-analysis | A meta-analysis testing the null hypothesis that none of the traits being examined is associated with the genetic variant. One genetic variant is tested at a time. |
| eQTLs | Genetic variants that are associated with gene expression levels. |
| Genetic correlation | Assuming that all genetic variants have some effect on a trait and that their effect sizes follow a Gaussian distribution (called the infinitesimal model), the genetic correlation between 2 traits (A and B) measures the Pearson correlation between the genetic variant effects on traits A and B. |
| GWAS | An analytic method that tests the association between each genetic variant and a specific phenotype (a disease status or a quantitative trait). One genetic variant is tested at a time. |
| HLA/MHC region | A genomic region of approximately 3.6 Mb located on chromosome 6p21, which is mainly known for its pervasive pleiotropic effect and immune-related function. The extended MHC region is at 25 to 34 Mb on chromosome 6. |
| Horizontal pleiotropy | A genetic variant or gene having independent effects on multiple traits that do not have a causal effect on each other. |
| Instrumental variables | Variables that are associated with the modifiable exposure or risk factor of interest and affect the outcome only through the exposure or risk factor. |
| Mendelian randomization | An analytic approach that examines the causality of an observed association of a modifiable exposure or risk factor with an outcome of interest by using ≥1 genetic instrumental variables. |
| Polygenic risk score | A score based on a set of disease and/or trait-associated genetic variants, commonly defined as the weighted sum of their genotypes. Weights are chosen by their association effect on the disease and/or trait, directly from GWAS or further modified on the basis of a suitable statistical model incorporating all genetic variants on the genome. |
| Vertical pleiotropy (genetic causality) | A genetic variant or gene having an effect on a trait that has a causal effect on another trait. |
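Two of the glossary definitions translate directly into arithmetic, and the following sketch illustrates them on simulated numbers. It assumes the true per-variant effects are known, which is never the case in practice. Tools such as LDSC or GCTA estimate genetic correlation from GWAS data instead. All function names, the simulated effects, and the example dosages and weights are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate_effects(n_variants, rho, seed=0):
    """Draw per-variant effects on traits A and B from a bivariate
    Gaussian with correlation rho (an infinitesimal-model toy example)."""
    rng = random.Random(seed)
    beta_a, beta_b = [], []
    for _ in range(n_variants):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        beta_a.append(u)
        beta_b.append(rho * u + math.sqrt(1 - rho ** 2) * v)
    return beta_a, beta_b

def polygenic_risk_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies per variant)."""
    return sum(g * w for g, w in zip(dosages, weights))

# Genetic correlation: Pearson correlation of the effects on traits A and B.
beta_a, beta_b = simulate_effects(20000, rho=0.5)
print(round(pearson(beta_a, beta_b), 2))

# Polygenic risk score for one person: 0*0.10 + 1*(-0.05) + 2*0.20.
print(polygenic_risk_score([0, 1, 2], [0.10, -0.05, 0.20]))
```

The weights in the score would normally come from GWAS effect estimates, possibly adjusted by a statistical model as the glossary entry describes. The three values here are placeholders.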
Summary of genome-wide cross-trait analysis methods
| Analysis method | Software | Advantages | Disadvantages | Examples of application in asthma or complex traits/PMID |
|---|---|---|---|---|
| Genetic correlation | LDSC/S-LDSC | Requires only GWAS summary statistics; computationally efficient; accounts for additive confounding in single-trait heritability (such as population stratification) and confounding in genetic correlation (such as overlapping samples); can allow relatively flexible heritability architecture in MAF, LD, and functional categories (the authors of LDSC recommended S-LDSC) | Is sensitive to other genetic architectures not captured by the baseline LD model; requires that the reference panel LD and GWAS summary statistics be computed from the same population | Childhood asthma and adult asthma/30929738, allergic diseases and asthma/29785011, obesity and asthma/31669095, mental health disorders and asthma/31619474 |
| Genetic correlation | GCTA/GCTA-LDMS | Estimates genetic correlation with high accuracy; the LD and association effect are computed from the same genotype data; accounts for different genetic architectures by MAF and LD categories (the authors of GCTA recommended GCTA-LDMS-I) | Requires genotype data; computation is infeasible for an extremely large data set | Complex traits/21167468 |
| Genetic correlation | SumHer/BLD-LDAK | Is similar to LDSC but assumes a specific parametric model for MAF/LD-dependent genetic architecture and multiplicative inflation bias due to population stratification or family relatedness; can allow the same baseline LD categories as in LDSC (the authors of LDAK recommended BLD-LDAK/BLD-LDAK-alpha) | Is sensitive to genetic architectures that deviate from the assumed parametric model; requires that the reference panel LD and GWAS summary statistics be computed from the same population | Complex traits/32203469 |
| Cross-trait meta-analysis | ASSET | Accounts for overlapping samples | Is applicable only to binary traits | Allergic diseases and asthma/29785011, mental health disorders and asthma/31619474 |
| Cross-trait meta-analysis | CPASSOC | Is applicable to both binary and continuous traits | Yields potential false positives due to overlapping samples | Obesity and asthma/31669095 |
| Cross-trait meta-analysis | MTAG | Accounts for possibly unknown sample overlap | Requires an assumption that all variants share the same genetic correlation across all traits (ie, no subset-specific effect is assumed) | Complex traits/29292387 |
| Mendelian randomization | Inverse variance–weighted approach | Is applicable when the genetic variants’ pleiotropic effects (genetic variant–outcome direct effect) happen to cancel out | Accounts for only the designed scenario; requires independent variants | Asthma and cancer/32006205 |
| Mendelian randomization | Egger regression | Is applicable when the genetic variant–exposure association is independent of the pleiotropic effect; appears to protect against false positives in several simulation studies | Accounts for only the designed scenario; requires independent variants; when the outcome GWAS has low power, its power to detect a causal effect could be substantially smaller than that of other methods | Asthma and cancer/32006205 |
| Mendelian randomization | Weighted median estimator | Is applicable when <50% (by count or total weight) of the genetic variants are invalid instruments | Accounts for only the designed scenario | Asthma and cancer/32006205 |
| Mendelian randomization | Weighted mode-based estimate (weighted MBE) | Is applicable when the variants satisfying the exclusion restriction assumption give a causal effect estimate that is the majority among the effect estimates from all variants in the analysis; appears to protect against false positives in several simulation studies | Accounts for only the designed scenario | Asthma and cancer/32006205 |
| Mendelian randomization | GSMR | Accounts for LD between variants; detects and accounts for outliers that could violate the exclusion restriction assumption | Requires sufficient numbers of GWAS significant variants; requires a genetic variant–exposure association that is independent of the pleiotropic effect | Obesity and asthma/31669095, mental health disorders and asthma/31619474 |
| Mendelian randomization | MR-PRESSO | Detects and accounts for outliers that could violate the exclusion restriction assumption | Requires independent and sufficient numbers of GWAS significant variants; requires a genetic variant–exposure association that is independent of the pleiotropic effect | Asthma and cancer/32006205 |
| Mendelian randomization | LCV | Distinguishes genetic correlation from genetic causality; uses all genome-wide variants | Does not estimate causal effect size, but does provide a scale parameter with higher magnitude indicating that it is closer to causality; requires that the LD reference match the study populations | Obesity and asthma/31669095 |
| Mendelian randomization | CAUSE | Allows genetic correlation and genetic causality for different variants; can estimate causal effect size; uses all genome-wide variants; has a better trade-off between power and avoidance of false positives than other methods | Requires independent variants by pruning GWAS results; may have a higher false-positive rate than Egger regression and MBE when a large fraction of variants affect the exposure and outcome through a strong shared factor and the power of both the exposure and outcome GWASs is high | Complex traits/32451458 |
| Polygenic risk score | LDpred | Accounts for LD between SNPs | Is computation-intensive when the number of SNPs is more than a couple of million | Asthma/32522462 |
| Polygenic risk score | MTGBLUP | Simultaneously estimates genetic effect and genetic correlation for multiple traits | Assumes an infinitesimal genetic architecture; is computation-intensive when the number of SNPs is more than a couple of million and becomes prohibitive for more than hundreds of thousands of samples | Complex traits/25640677 |
| Polygenic risk score | MTAG | Provides improved polygenic prediction thanks to a consistent estimator, and its effect estimates always have a lower genome-wide mean squared error than the corresponding single-trait GWAS estimates do | Requires an assumption that all variants share the same genetic correlation across all traits (ie, no subset-specific effect is assumed); power might be reduced when the genetic correlations between traits are not very high | Complex traits/29292387 |
| Polygenic risk score | CTPR | Optimizes the prediction accuracy for the primary trait of interest by taking advantage of shared genetic effects among multiple traits; secondary traits of GWAS can be individual-level data or summary statistics | Requires individual-level data for the primary trait of interest | Complex traits/30718517 |
BLD, Background linkage disequilibrium; CAUSE, causal analysis using summary effect estimates; CTPR, cross-trait penalized regression; GCTA, genome-wide complex trait analysis; GSMR, generalized summary data–based Mendelian randomization; LCV, latent causal variable; LDMS-I, linkage disequilibrium and minor allele frequency stratified-I; MAF, minor allele frequency; MR-PRESSO, Mendelian randomization pleiotropy residual sum and outlier; MTAG, multitrait analysis of GWAS; MTGBLUP, multi-trait genomic best linear unbiased prediction; PMID, PubMed identifier; S-LDSC, stratified LDSC; SumHer, single-nucleotide polymorphism heritability.
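The first three Mendelian randomization rows of the table can be contrasted on toy summary statistics. The following is a minimal sketch rather than the implementation found in any published package. The function names and the five invented variants are assumptions for illustration: `bx` holds variant-exposure effects, `by` holds variant-outcome effects, and `se_y` holds the standard errors of `by`. The true causal effect is set to 0.3.

```python
import math

def ivw_estimate(bx, by, se_y):
    """Fixed-effect IVW: weighted regression of by on bx through the origin."""
    num = sum(x * y / s ** 2 for x, y, s in zip(bx, by, se_y))
    den = sum(x * x / s ** 2 for x, s in zip(bx, se_y))
    return num / den, math.sqrt(1.0 / den)  # (estimate, standard error)

def egger_estimate(bx, by, se_y):
    """MR-Egger: weighted regression of by on bx with an intercept.
    Returns (slope, intercept); the intercept absorbs directional pleiotropy."""
    xs = [abs(x) for x in bx]  # orient each variant to a positive exposure effect
    ys = [y if x >= 0 else -y for x, y in zip(bx, by)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, xs)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, xs, ys))
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, xs))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def weighted_median_estimate(bx, by, se_y):
    """Weighted median of per-variant ratio estimates by/bx, with linear
    interpolation between the two ratios that straddle the 50% weight point."""
    ratios = [y / x for x, y in zip(bx, by)]
    weights = [(x / s) ** 2 for x, s in zip(bx, se_y)]
    order = sorted(range(len(ratios)), key=lambda i: ratios[i])
    total, cum = sum(weights), 0.0
    prev_p = prev_r = None
    for i in order:
        p = (cum + weights[i] / 2) / total  # weight percentile at this ratio
        cum += weights[i]
        if p >= 0.5:
            if prev_p is None:
                return ratios[i]
            return prev_r + (ratios[i] - prev_r) * (0.5 - prev_p) / (p - prev_p)
        prev_p, prev_r = p, ratios[i]
    return ratios[order[-1]]

# Toy data (invented): true causal effect 0.3.
bx = [0.10, 0.15, 0.20, 0.25, 0.30]
se_y = [0.01] * 5
by_pleio = [0.3 * x + 0.02 for x in bx]   # every variant has the same direct effect
by_outlier = [0.3 * x for x in bx]
by_outlier[0] += 0.05                     # one invalid instrument

print("IVW, directional pleiotropy:  ", ivw_estimate(bx, by_pleio, se_y)[0])
print("Egger (slope, intercept):     ", egger_estimate(bx, by_pleio, se_y))
print("Weighted median, one outlier: ", weighted_median_estimate(bx, by_outlier, se_y))
```

In this toy setting the pleiotropic effects do not cancel out, so the IVW estimate is pulled away from 0.3, while the Egger slope recovers it and the intercept picks up the shared direct effect. The weighted median stays at 0.3 because the single invalid variant carries well under half of the total weight.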
Fig 4. Diagram of horizontal pleiotropy and vertical pleiotropy with examples for asthma. BMI, Body mass index.