Yitian Zhou, Reedik Mägi, Lili Milani, Volker M Lauschke.
Abstract
Abnormal plasma apolipoprotein levels are consistently implicated in CVD risk. Although 30% to 60% of their interindividual variability is genetic, common genetic variants explain only 10% to 20% of these differences. Rare genetic variants may be major sources of the missing heritability, yet quantitative evaluations of their contribution to phenotypic variability are lacking. Here, we analyzed whole-genome and whole-exome sequencing data from 138,632 individuals across seven major human populations to present a systematic overview of genetic apolipoprotein variability. We provide population-specific frequencies of 38 clinically important apolipoprotein alleles and identify a further 6,875 genetic variants, 33% of which are novel and 98.7% of which are rare with minor allele frequencies <1%. We predicted the functional impact of rare variants and found that their relative importance differed drastically between genes and among ethnicities. Importantly, we validated the clinical relevance of multiple variants with predicted effects by leveraging association data from the CARDIoGRAM (Coronary Artery Disease Genomewide Replication and Meta-analysis) and Global Lipids Genetics consortia. Overall, we provide a consolidated overview of population-specific apolipoprotein genetics as a valuable data resource for scientists and clinicians, estimate the importance of rare genetic variants for the missing heritability of apolipoprotein-associated disease traits, and pinpoint multiple novel apolipoprotein variants with putative population-specific impacts on serum lipid levels.
Keywords: Alzheimer’s disease; cholesterol; lipid traits; population genetics
Year: 2018 PMID: 30076208 PMCID: PMC6168301 DOI: 10.1194/jlr.P086710
Source DB: PubMed Journal: J Lipid Res ISSN: 0022-2275 Impact factor: 5.922
Fig. 1. Landscape of genetic variability in human APO genes. A: Stacked column and pie chart showing the variant composition of the 11 analyzed APO genes. In total, 8,886 variants were identified in 138,632 individuals, of which 6,875 are located in exons. The majority of exonic variants are missense mutations (57%), followed by synonymous variants (24%). B: Scatter plot in which the number of variants identified in each gene is plotted against the respective gene length. Linear regression line is shown. APOB is by far the largest APO gene and harbors the most variants. C: However, when the number of variants is normalized by gene length, APOB was the most conserved, harboring approximately three times fewer variants per kilobase than the least conserved gene, APOC1.
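The length normalization described in panel C can be illustrated with a minimal sketch. The function name, the gene labels, and all counts and lengths below are hypothetical placeholders chosen for easy arithmetic; they are not values from the study.

```python
def variants_per_kb(variant_count: int, gene_length_bp: int) -> float:
    """Return the number of variants per kilobase of gene length."""
    if gene_length_bp <= 0:
        raise ValueError("gene_length_bp must be positive")
    return variant_count / (gene_length_bp / 1000)


# Placeholder (variant count, gene length in bp) pairs, NOT data from the paper.
example_genes = {
    "GENE_LONG": (3000, 40000),
    "GENE_SHORT": (120, 500),
}

# Variant density per gene, then genes ordered from most to least conserved
# (lowest to highest variant density).
density = {
    name: variants_per_kb(count, length)
    for name, (count, length) in example_genes.items()
}
ranked = sorted(density, key=density.get)
```

In this toy input the longer gene carries far more variants in absolute terms but ranks as more conserved once counts are expressed per kilobase, which is the same kind of reversal the caption reports between panels B and C.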
TABLE 1. Genetic diversity of a selection of clinically important APO variants across major human populations
| Population frequencies (in %) | ||||||||||
| Defining variants as RSID (HGVS) | Variant type | EUR | AFR | EAS | SAS | AMR | AJ | Clinical parameters | Effect size or strength of association | Reference |
| rs5742904 (NC_000002.11:g.21229160C>T) | Missense (R3527Q) | <0.1 | <0.1 | 0 | 0 | <0.1 | 0 | Ischemic heart disease | OR = 7 | ( |
| rs1042031 (NC_000002.11:g.21225753C>T) | Missense (E4181K) | 18.3 | 15.4 | 4.8 | 10.3 | 12.3 | 14.8 | Ischemic cerebrovascular disease | HR = 0.5 | ( |
| Ischemic stroke | HR = 0.2 | |||||||||
| rs1367117 (NC_000002.11:g.21263900G>A) | Missense (T98I) | 31.9 | 11.2 | 12.7 | 16 | 28.9 | 18.7 | CAD | β=0.035 | ( |
| LDL-C | β=0.12 | |||||||||
| TG | β=0.025 | |||||||||
| LDL | β’=4.05 | ( | ||||||||
| rs1042034 (NC_000002.11:g.21225281C>T) | Missense (S4338N) | 78.4 | 85.2 | 27.3 | 48.6 | 74.2 | 80.5 | TG | β’=−5.99 | ( |
| rs693 (NC_000002.11:g.21232195G>A) | Synonymous (T2515T) | 50 | 22.1 | 5.6 | 26.7 | 38 | 34.9 | LDL-C | ( | |
| LDL | β=0.123 | ( | ||||||||
| rs562338 (NC_000002.11:g.21288321A>G) | UTR | 18 | 59.6 | 0.19 | N.A. | 16.5 | 29.8 | LDL-C | ( | |
| rs754523 (NC_000002.11:g.21311691A>G) | UTR | 31.1 | 21.9 | 29.5 | N.A. | 29.1 | 29.1 | LDL-C | ( | |
| rs515135 (NC_000002.11:g.21286057T>C) | UTR | 18.2 | 47.7 | 9.5 | N.A. | 18.8 | 30.1 | CAD | OR = 1.03-1.08 | ( |
| rs673548 (NC_000002.11:g.21237544G>A) | Intron | 22.7 | 21.1 | 73 | N.A. | 22 | 14.2 | TG | β=−0.081 | ( |
| Wild-type | 77.4 | 67.5 | 83.6 | 85.7 | 86.4 | 80.6 | ||||
| Missense (R176C) | 7.7 | 10.8 | 7.5 | 4.2 | 3.2 | 7.8 | AD | OR = 0.6-2.6 | ( | |
| Missense (C130R) | 14.9 | 21.7 | 8.9 | 10.1 | 10.4 | 11.6 | AD | OR = 2.2-33.1 | ( | |
| CAD | OR = 1.06 | ( | ||||||||
| NAFLD | OR = 0.51 | ( | ||||||||
| rs4420638 (NC_000019.9:g.45422946A>G) | UTR | 18.4 | 20.2 | 11.4 | N.A. | 10.8 | 14.6 | LDL | β’=7.14 | ( |
| LDL-C | ( | |||||||||
| β=0.19 | ( | |||||||||
| rs439401 (NC_000019.9:g.45414451T>C) | UTR | 36.8 | 15.3 | 58.6 | N.A. | 54.9 | 43 | TG | β’=−5.5 | ( |
| rs670 (NC_000011.9:g.116708413C>T) | Promoter | 17.3 | 14.8 | 27.8 | N.A. | 27.7 | 15.2 | LDL | OR = 1.66 | ( |
| TC | OR = 1.77 | |||||||||
| rs5082 (NC_000001.10:g.161193683G>A) | Promoter | 40.6 | 22.5 | 8.2 | N.A. | 23.4 | 35.8 | Obesity | OR = 1.84 | ( |
| rs675 (NC_000011.9:g.116691675T>A) | Missense (T367S) | 19.7 | 11.5 | <0.1 | 12.9 | 9.4 | 21.5 | CAD | HR = 2.07 | ( |
| rs5110 (NC_000011.9:g.116691634C>A) | Missense (Q380H) | 7.8 | 1.5 | <0.1 | 1.9 | 3.6 | 6.7 | TG | ( | |
| VLDL | ||||||||||
| HDL | ||||||||||
| rs1729407 (NC_000011.9:g.116677370C>G) | Intergenic | 50.8 | 12.7 | 30.4 | N.A. | 50.7 | 35.4 | HDL | ( | |
| Promoter, Kozak, Intron and UTR | 8.1 | 0 | 23.8 | 17.7 | 13 | N.A. | TG | 20-30% elevation | ( | |
| Missense (S19W) | 6.4 | 6.2 | <0.1 | 3.8 | 15.3 | 6.8 | TG | OR = 7.79 | ( | |
| rs662799 (NC_000011.9:g.116663707G>A) | Promoter | 6.9 | 12.1 | 29.2 | N.A. | 15 | 10.6 | TG | ( | |
| TG | ( | |||||||||
| HDL | ||||||||||
| rs2266788 (NC_000011.9:g.116660686G>A) | UTR | 7.4 | 1.6 | 21 | N.A. | 13.6 | 10.9 | TG | ( | |
| CAD | OR = 1.15 | ( | ||||||||
| rs2075291 (NC_000011.9:g.116661392C>A) | Missense (G185C) | <0.1 | 0.3 | 6.9 | 0.8 | <0.1 | 0 | TG | OR = 11.73 | ( |
| CAD | OR = 2.09 | ( | ||||||||
| rs11568822 (NC_000019.9:g.45417640_45417641insCGTT) | Promoter | 23 | 35 | N.A. | N.A. | 16.5 | N.A. | AD | OR = 1.84 | ( |
| rs5128 NC_000011.9:g.116703640G>C | UTR | 9.5 | 15.5 | 30.3 | 14 | CAD | OR = 1.3 | ( | ||
| rs2854116 NC_000011.9:g.116700169C>T | Promoter | 39.1 | 70.7 | 41.6 | 47 | Metabolic syndrome | OR = 1.73 | ( | ||
| CAD | OR = 1.28 | ( | ||||||||
| NAFLD | ( | |||||||||
| rs2854117 NC_000011.9:g.116700142T>C | Promoter | 28.9 | 68.3 | 42.4 | 35.4 | TG | ( | |||
| HDL | ||||||||||
| NAFLD | ( | |||||||||
| rs147210663 NC_000011.9:g.116701560G>A | Missense (A43T) | <0.1 | 0.2 | <0.1 | 1.1 | TG | ( | |||
| HDL | ||||||||||
| rs76353203 NC_000011.9:g.116701353C>G | Stop-gain (R19X) | <0.1 | <0.1 | <0.1 | 0 | CAC | OR = 0.35 | ( | ||
| CHD | HR = 0.68 | |||||||||
| rs138326449 NC_000011.9:g.116701354G>A | Splice donor | 0.2 | <0.1 | 0 | 0.2 | TG | ( | |||
| rs8178847 NC_000017.10:g.64216815C>T | Missense (R154H) | 6.7 | 9.8 | 6.3 | VT | OR = 1.55 | ( | |||
| rs1801689 NC_000017.10:g.64210580A>C | Missense (C325G) | 3.3 | 0.5 | <0.1 | LDL | ( | ||||
| rs1801690 NC_000017.10:g.64208285C>G | Missense (W335S) | 5.6 | 0.8 | 6.8 | TG | ( | ||||
| APOA-I levels | ||||||||||
| rs3760291 NC_000017.10:g.64226197G>T | Promoter | 26.1 | 7.2 | 6.3 | TC | ( | ||||
| LDL | ||||||||||
| APOB levels | ||||||||||
| APOE levels | ||||||||||
| rs707922 NC_000006.11:g.31625507G>T | Missense (G111V) | 5.2 | 31.9 | 15.9 | TC | ( | ||||
| LDL | ||||||||||
| rs805296 NC_000006.11:g.31622893T>C | Promoter | 1.3 | 11.5 | 11.6 | CAD | OR = 1.9 | ( | |||
| T2DM | OR = 2.29 | ( | ||||||||
| rs940494 NC_000007.13:g.56348924A>G | Promoter | 22.4 | 14.1 | 9.4 | CAD | OR = 1.82 | ( | |||
AD, Alzheimer’s disease; AFR, Africans; AJ, Ashkenazi Jews; AMR, Latinos; CAC, coronary artery calcification; CHD, coronary heart disease; EAS, East Asians; EUR, Europeans; HDL, HDL levels; HGVS, Human Genome Variation Society; HR, hazard ratio; LDL, LDL levels; LDL-C, LDL cholesterol; N.A., not available; RA, rheumatoid arthritis; RSID, reference SNP cluster ID; SAS, South Asians; T2DM, type 2 diabetes mellitus; TC, total cholesterol levels; TG, triglyceride levels; VLDL, VLDL levels; VT, venous thrombosis; β, standardized effect size; β′, nonstandardized effect size.
Obtained from Gayà-Vidal et al. (97).
Obtained from Gao et al. (98).
Obtained from Lucatelli et al. (99).
Fig. 2. Rare genetic variants are important contributors to the functional variability of apolipoproteins. A: The vast majority of the identified apolipoprotein variants were rare (98.7%) or very rare (96.7%) with MAF < 1% or < 0.1%, respectively. B: The fraction of common and rare APO variants with putatively functional effects, as predicted by five computational algorithms. Note that all algorithms indicate that rare variants are enriched in mutations with functional effects. *** P < 0.0001 in paired heteroscedastic t-test. Error bars indicate SD. C: The number of CNVs per individual in APO genes is shown for six major human populations. AFR, Africans; AMR, Latinos; EAS, East Asians; EUR, Europeans; SAS, South Asians. Deletions and duplications are indicated in light and dark shades, respectively. Note that APOA1, APOA4, and APOC3 are in the same locus. D: The number of variants with putatively functional consequences per individual is shown for each analyzed APO gene across six populations. The inset indicates the fractions of putatively functional variants that are explained by rare variants (see also supplemental Fig. S1). Average value is indicated by red dashed line. Genes whose aggregated functional variant frequency is <1% are by definition explained exclusively by rare variants and are not shown.
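The two frequency cutoffs used in panel A (MAF < 1% for rare and MAF < 0.1% for very rare) can be applied as in the following minimal sketch. The function name and the example frequencies are hypothetical and are not taken from the study; note that, with these cutoffs, every very rare variant also counts as rare.

```python
RARE_MAF = 0.01        # MAF < 1%, as stated in the caption
VERY_RARE_MAF = 0.001  # MAF < 0.1%, as stated in the caption


def rarity_fractions(mafs):
    """Return (fraction rare, fraction very rare) for a list of MAFs.

    MAFs are expected as proportions between 0 and 0.5, not percentages.
    """
    if not mafs:
        raise ValueError("mafs must not be empty")
    if any(m < 0.0 or m > 0.5 for m in mafs):
        raise ValueError("each MAF must lie between 0 and 0.5")
    rare = sum(1 for m in mafs if m < RARE_MAF)
    very_rare = sum(1 for m in mafs if m < VERY_RARE_MAF)
    return rare / len(mafs), very_rare / len(mafs)


# Placeholder frequencies, NOT data from the paper.
example_mafs = [0.0002, 0.0005, 0.004, 0.2]
rare_fraction, very_rare_fraction = rarity_fractions(example_mafs)
```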
Fig. 3. Structural variability of APOB and APOE. A: Domain map of APOB containing signal peptide (amino acids 1–27), proline-rich domains (amino acids 2,578–2,767; 3,243–3,318; and 3,714–3,892) (see Ref. 99), and the LDLR binding region (amino acids 3,386–3,396) (see Ref. 25). Additionally, the C terminus of the truncated APOB48 isoform is indicated. B: The domain map of APOE is shown with signal peptide (amino acids 1–18), LDLR binding region (amino acids 152–168), and lipid-binding region (amino acids 262–290) highlighted (see Ref. 100). Aligned line plots indicate the total number of variants (black line) and number of functional variants (red line) per 100 bp. Functionality is predicted by five orthogonal computational algorithms, and the average ± SD is shown. Variants that were common (MAF > 1%) in at least one population and deemed functional by all employed algorithms are highlighted. Pie charts indicate their relative abundance in the six major populations analyzed. AFR, Africans; AJ, Ashkenazi Jews; AMR, Latinos; EAS, East Asians; EUR, Europeans; SAS, South Asians.
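The line plots in this figure report variant counts per 100 bp. One plausible way to produce such counts is to bin variant positions into consecutive fixed-width windows, as in the minimal sketch below. The function name, the use of non-overlapping windows, and the example coordinates are assumptions made for illustration; the caption does not state how the windows were defined.

```python
from collections import Counter


def count_per_window(positions, region_start, window_bp=100):
    """Count variants in consecutive, non-overlapping windows.

    Returns a dict mapping window index (0 = first window starting at
    region_start) to the number of variant positions inside that window.
    Positions before region_start are ignored.
    """
    if window_bp <= 0:
        raise ValueError("window_bp must be positive")
    counts = Counter(
        (pos - region_start) // window_bp
        for pos in positions
        if pos >= region_start
    )
    return dict(sorted(counts.items()))


# Placeholder coordinates, NOT positions from the paper.
example_positions = [1000, 1050, 1099, 1100, 1350]
window_counts = count_per_window(example_positions, region_start=1000)
```

Windows without any variant are simply absent from the result; a plotting step would treat them as zero.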
Selection of putatively functional variants in APOB and APOE
| | | Population Frequencies (%) | | | | | | Global Lipids Genetics Consortium | | CARDIoGRAM | |
| Defining Variants as RSID (HGVS) | Variant Type | EUR | AFR | EAS | SAS | AMR | AJ | Effect Size | P | Log_odds | P |
| rs6752026 (NC_000002.11:g.21260934G>A) | Missense (P145S) | <0.1 | 12.8 | 0 | <0.1 | 0.7 | <0.1 | N.A. | N.A. | N.A. | N.A. |
| rs13306198 (NC_000002.11:g.21260084G>A) | Missense (T194M) | <0.1 | <0.1 | 5.5 | 0.2 | 0.3 | 0 | N.A. | N.A. | N.A. | N.A. |
| rs13306194 (NC_000002.11:g.21252534G>A) | Missense (R532W) | 0.1 | <0.1 | 13.4 | 0.2 | <0.1 | <0.1 | N.A. | N.A. | N.A. | N.A. |
| rs61736761 (NC_000002.11:g.21238007G>T) | Missense (L1212M) | <0.1 | 8.9 | 0 | 0.9 | 0.5 | <0.1 | N.A. | N.A. | N.A. | N.A. |
| rs1801699 (NC_000002.11:g.21233999T>C) | Missense (N1914S) | 1.9 | 0.5 | <0.1 | 0.6 | 6.5 | 1.6 | 0.091 | 2.6 × 10^−6 | N.A. | N.A. |
| rs533617 (NC_000002.11:g.21233972T>C) | Missense (H1923R) | 4 | 0.7 | <0.1 | 3.7 | 0.9 | 2.4 | 0.14 | 9.6 × 10^−45 | −0.096 | 0.034 |
| rs12713675 (NC_000002.11:g.21232373G>T) | Missense (A2456D) | <0.1 | 5.3 | 0 | <0.1 | 0.3 | 0 | N.A. | N.A. | N.A. | N.A. |
| rs676210 (NC_000002.11:g.21231524G>A) | Missense (P2739L) | 21.6 | 14.7 | 72.5 | 50.2 | 25.7 | 19.4 | 0.059 | 4.1 × 10^−39 | 0.03 | 0.077 |
| rs12720854 (NC_000002.11:g.21229905T>C) | Missense (S3279G) | 0.3 | 1.7 | <0.1 | 0.1 | 0.6 | 1 | N.A. | N.A. | N.A. | N.A. |
| rs12720855 (NC_000002.11:g.21229860A>G) | Missense (S3294P) | <0.1 | 5.3 | 0 | <0.1 | 0.3 | 0 | N.A. | N.A. | N.A. | N.A. |
| rs1042023 (NC_000002.11:g.21229446G>C) | Missense (Q3432E) | 1.1 | 0.1 | 0 | 0.1 | 0.5 | 0.1 | N.A. | N.A. | N.A. | N.A. |
| rs533904656 (NC_000019.9:g.45411025G>A) | Missense (A18T) | 0 | 0 | 0.2 | 0 | 0 | 0 | N.A. | N.A. | N.A. | N.A. |
| rs769452 (NC_000019.9:g.45411110T>C) | Missense (L46P) | 0.3 | <0.1 | 0 | <0.1 | <0.1 | 0.5 | N.A. | N.A. | N.A. | N.A. |
| rs769455 (NC_000019.9:g.45412040C>T) | Missense (R163C) | <0.1 | 2 | 0 | <0.1 | 0.2 | 0 | N.A. | N.A. | N.A. | N.A. |
| rs749750245 (NC_000019.9:g.45412172C>T) | Missense (R207C) | 0 | 0 | 0 | 0 | 0.2 | 0 | N.A. | N.A. | N.A. | N.A. |
| rs140808909 (NC_000019.9:g.45412337G>A) | Missense (E262K) | 0 | 0 | 0.3 | 0 | 0 | 0 | N.A. | N.A. | N.A. | N.A. |
| rs190853081 (NC_000019.9:g.45412340G>A) | Missense (E263K) | 0 | 0 | 0.3 | 0 | 0 | 0 | N.A. | N.A. | N.A. | N.A. |
AFR, Africans; AJ, Ashkenazi Jews; AMR, Latinos; EAS, East Asians; EUR, Europeans; HGVS, Human Genome Variation Society; N.A., not available; RSID, reference SNP cluster ID; SAS, South Asians.
Fig. 4. Putatively deleterious variants are enriched in mutations with effects on blood lipid traits. Identified APO variants were overlaid with GWAS data provided by the Global Lipids Genetics Consortium for total cholesterol (A), LDL cholesterol (B), HDL cholesterol (C), and serum triglycerides (D). Sizes of dots indicate P values of the associations between variant and the respective clinical parameter. P < 10^−4 indicates significance of association after Bonferroni correction. Importantly, variants predicted to be deleterious (indicated in red) were significantly enriched in mutations affecting lipid traits (P < 0.001; chi-squared test) compared with variants predicted to be functionally neutral (indicated in green). When individual lipid parameters were compared, variant associations were significant for cholesterol traits (total, LDL, and HDL cholesterol; P < 0.05) but not serum triglyceride levels (P = 0.48; heteroscedastic two-tailed t-test). * P < 0.05; n.s., not significant.
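The caption names two statistical steps: a Bonferroni-corrected significance threshold and a chi-squared test for enrichment. Both can be sketched with the Python standard library as shown below. This is an illustration under stated assumptions, not the authors' analysis: the 2 × 2 counts are invented, the number of tests (500) is an arbitrary value picked so that 0.05/500 equals 10^−4, and the test is computed without a continuity correction, which the caption does not specify either way.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def chi_squared_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test (no continuity correction) on a 2x2 table.

    Table layout (rows = predicted class, columns = trait association):
        deleterious: a associated, b not associated
        neutral:     c associated, d not associated
    Returns (chi-squared statistic, P value for 1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical number of tests, NOT stated in the source.
threshold = bonferroni_threshold(alpha=0.05, n_tests=500)

# Invented counts, NOT data from the paper.
chi2, p_value = chi_squared_2x2(a=30, b=70, c=10, d=90)
```

With these invented counts, 30% of the "deleterious" group versus 10% of the "neutral" group show an association, and the resulting P value falls below 0.001.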