Yun Rose Li1,2,3, Joseph T Glessner1,2, Bradley P Coe4, Jin Li1,5, Maede Mohebnasab1, Xiao Chang1, John Connolly1, Charlly Kao1, Zhi Wei6, Jonathan Bradfield1, Cecilia Kim1, Cuiping Hou1, Munir Khan1, Frank Mentch1, Haijun Qiu1, Marina Bakay1, Christopher Cardinale1, Maria Lemma1, Debra Abrams1, Andrew Bridglall-Jhingoor1, Meckenzie Behr1, Shanell Harrison1, George Otieno1, Alexandria Thomas1, Fengxiang Wang1, Rosetta Chiavacci1, Lawrence Wu1, Dexter Hadley3, Elizabeth Goldmuntz2,7, Josephine Elia2,8,9, John Maris2,10, Robert Grundmeier11, Marcella Devoto2,12,13,14, Brendan Keating1, Michael March1, Renata Pellagrino1, Struan F A Grant1,2, Patrick M A Sleiman1, Mingyao Li14, Evan E Eichler4,15, Hakon Hakonarson16,17.
Abstract
Copy number variants (CNVs) are suggested to have a widespread impact on the human genome and phenotypes. To understand the role of CNVs across human diseases, we examine the CNV genomic landscape of 100,028 unrelated individuals of European ancestry, using SNP and CGH array datasets. We observe an average CNV burden of ~650 kb, identifying a total of 11,314 deletion, 5625 duplication, and 2746 homozygous deletion CNV regions (CNVRs). In all, 13.7% are unreported, 58.6% overlap with at least one gene, and 32.8% interrupt coding exons. These CNVRs are significantly more likely to overlap OMIM genes (2.94-fold), GWAS loci (1.52-fold), and non-coding RNAs (1.44-fold), compared with random distribution (P < 1 × 10−3). We uncover CNV associations with four major disease categories, including autoimmune, cardio-metabolic, oncologic, and neurological/psychiatric diseases, and identify several drug-repurposing opportunities. Our results demonstrate robust frequency definition for large-scale rare variant association studies, identify CNVs associated with major disease categories, and illustrate the pleiotropic impact of CNVs in human disease.
Year: 2020 PMID: 31937769 PMCID: PMC6959272 DOI: 10.1038/s41467-019-13624-1
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Genome-wide frequency and distribution of CNVs.
a “Manhattan plot” showing the distribution of identified CNVs across the genome, with genomic positions along the x-axis, including deletions (left), duplications (center), and homozygous deletions (bottom); the y-axis shows the observed frequency of each CNV in the total study population. b Histogram of duplication, deletion, and homozygous deletion frequencies for all identified CNVs as a function of total CNV burden per individual (in bp on a log10 scale). c Histogram of duplication, deletion, and homozygous deletion frequencies as a function of total CNV burden (in bp on a log10 scale) per individual. d Density plot showing each individual’s deletion burden (in bp on a log10 scale) plotted against that same individual’s duplication burden (in bp on a log10 scale). Individuals with the same event burden (bp affected) are binned into hexagons, and the color reflects the total count of individuals in each bin. Most individuals who have more duplications also have more deletions; however, some patients carry a relatively large burden of deletions or duplications without a comparable burden of the converse.
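As a rough illustration of the per-individual burden plotted in panels b to d, the following Python sketch sums the base pairs covered by each individual's calls, split by CNV type, and converts the totals to a log10 scale. The input layout (tuples of individual ID, CNV type, start, end), the function name `burden_log10`, and the assumption that calls within one individual do not overlap are hypothetical and are not taken from the study's pipeline.

```python
import math
from collections import defaultdict


def burden_log10(calls):
    """Return {individual: {cnv_type: log10(total bp)}}.

    `calls` is an iterable of (individual_id, cnv_type, start, end) tuples
    with half-open coordinates. Hypothetical layout; calls belonging to one
    individual are assumed not to overlap each other.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for individual, cnv_type, start, end in calls:
        totals[individual][cnv_type] += end - start
    return {
        individual: {t: math.log10(bp) for t, bp in by_type.items() if bp > 0}
        for individual, by_type in totals.items()
    }


# Toy data, invented for illustration only.
toy_calls = [
    ("s1", "del", 1_000, 11_000),
    ("s1", "del", 50_000, 140_000),
    ("s1", "dup", 0, 1_000),
    ("s2", "dup", 0, 100_000),
]
burden = burden_log10(toy_calls)
```

Plotting each individual's `del` value against the `dup` value would give the kind of scatter that panel d summarizes with hexagonal bins.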
Fig. 2 Functional enrichment analysis.
a Relative enrichment ratios (ER) for mapping of a CNV to a locus annotated with each of the given categories. ER was calculated based on 10,000 repeated simulations for each category/CNVR combination (see Methods). Regions of the genome bearing CNVs were significantly more likely to map to loci that are gene-bearing (Genic regions), exonic, or conserved (PHAST or PhyloP), and were enriched for functional loci, including miRNA targets (RNAi), CpG islands (CpG), microsatellites, recombination hotspots (Recombination), and transcription factor-binding sites (TF BD). They were also more likely to map to regions where CNVs have previously been reported in the Database of Genomic Variants (DGV). ER enrichment ratio; DUP duplication; DEL deletion; HD homozygous deletion. The enrichment results are reported separately for “ALL” CNVRs and CNVRs optimized for Dup, Del, or HD CNVs. b Contribution of CNVRs with different transcriptional impact, as annotated by the Ensembl Variant Effect Predictor for the nearest gene, shown for all CNVRs, deletion CNVRs, duplication CNVRs, disease-associated CNVRs, homozygous deletion CNVRs, and all CNVs identified.
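The caption describes an enrichment ratio derived from 10,000 repeated simulations but does not spell out the procedure here (see Methods). One common way to obtain such a ratio is a placement permutation test, sketched below in Python under stated assumptions: each CNVR is relocated uniformly at random on a single linear genome while keeping its length, and the ratio is the observed number of CNVRs overlapping an annotation divided by the mean simulated number. The function names, the single-chromosome genome, and the toy coordinates are hypothetical and may differ from what the authors did.

```python
import random


def overlaps(region, features):
    """True if half-open region (start, end) overlaps any feature interval."""
    start, end = region
    return any(start < f_end and f_start < end for f_start, f_end in features)


def enrichment_ratio(cnvrs, features, genome_length, n_sim=10_000, seed=0):
    """Observed/expected overlap count plus a one-sided empirical P value.

    Assumption: the null model places each CNVR uniformly at random on one
    linear genome, preserving its length.
    """
    rng = random.Random(seed)
    observed = sum(overlaps(r, features) for r in cnvrs)
    sim_counts = []
    for _ in range(n_sim):
        count = 0
        for start, end in cnvrs:
            length = end - start
            new_start = rng.randrange(0, genome_length - length + 1)
            count += overlaps((new_start, new_start + length), features)
        sim_counts.append(count)
    expected = sum(sim_counts) / n_sim
    ratio = observed / expected if expected > 0 else float("inf")
    # Add-one correction keeps the empirical P value strictly above zero.
    p_emp = (1 + sum(c >= observed for c in sim_counts)) / (n_sim + 1)
    return ratio, p_emp


# Toy data, invented for illustration only.
toy_features = [(100, 200)]
toy_cnvrs = [(120, 150), (160, 180), (500, 520)]
ratio, p_emp = enrichment_ratio(toy_cnvrs, toy_features, 10_000, n_sim=1_000)
```

With the add-one correction, the smallest P value reachable with 10,000 simulations is about 1 × 10−4, so the number of simulations bounds how small a reported empirical P value can be.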
Fig. 3 Genomic landscape of disease-associated CNVs.
a Subjects (n) enrolled in the study, by disease category. Patients are classified into one of the four major disease categories or as healthy controls. b Distribution of disease-associated deletion (top) and duplication (bottom) CNVRs by length. Disease categories are color-coded. CNVRs associated with disease categories are on average larger than CNVRs not associated with disease categories (black-colored line), but do not significantly differ from one another. c Circos plot illustrating the distribution of CNVRs identified in the context of genomic elements. For all other layers, from innermost to outermost, the tracks show: sno/miRNAs (1), miRNA targets (2), conserved (PHAST) sites (3), frequency of duplication CNVRs (4), frequency of deletion CNVRs (5), recombination rates (r) (6), expression of major ENCODE cell lines (7) corresponding to the candidate genes impacted by the respective disease-associated CNVRs (DA-CNVRs) (8). The innermost linkages reflect genes encoding the respective proteins for all pairwise protein–protein interactions affected by DA-CNVRs.
Table 1 Select loci enriched with CNVs in four major disease categories.
| Pheno | Cyto band | Del/Dup | P value | Annotations | |
|---|---|---|---|---|---|
| Cardio | 1p36.31 | del | 5.3E–14 | ACOT7 upregulation protects against fatty acid oversupply in the heart [21212523] | |
| Neuro | 1q21.1 | del, dup | 6.0E–31, 1.6E–27 | | |
| Cancer | 2p24.3 | dup | 2.4E–61 | Amplified/duplicated in neuroblastoma [23401364 and 15013217], DDX1-MYCN duplication in nephroblastoma [24161495] | |
| Cancer | 2q31.1 | del | 6.2E–26 | Overexpressed in breast cancer [19022662], deletion in osteosarcoma [15298715], CNV reduced hepatocellular carcinoma cell line expression [28863781] | |
| Aid | 4p13 | del | 1.7E–18 | Zn regulation in white blood cells [25927708] | |
| Cancer | 4p13 | del | 4.9E–23 | BEND4 in colorectal cancer [21636702], CCDC4 in MDR pancreatic adenocarcinoma [18453221], deletion in bladder and other carcinomas [11906820] | |
| Aid | 4p16.3 | dup | 1.4E–21 | E3 Ubiquitin ligase, involved in meiotic recombination [23396135] | |
| Neuro | 4p16.3 | both | 2.5E–24 | Neurogenesis during cortical development [22842144], neuronal differentiation [20823227] | |
| Neuro | 5q23.1 | dup | 3.8E–27 | Induced in Parkinson’s disease [24444419] | |
| Cancer | 5q31.2 | del | 3.5E–20 | Medullary thyroid carcinoma [25435367], tumor suppressor signaling [23959801], oral SCC [23541579], HCC [17934217], colorectal cancer [15532096], brain tumors [9417864] | |
| Neuro | 5q35.1 | del | 3.9E–23 | Parkinson’s disease [19162339] | |
| Neuro | 5q35.3 | dup | 6.6E–75 | Schizophrenia [21784156] | |
| Neuro | 6p22.1 | dup | 1.8E–52 | HIST1H2BG associated with schizophrenia [23904455] | |
| Cancer | 7p12.1 | dup | 5.1E–58 | Lung adenocarcinoma [21151896] | |
| Cancer | 7p12.2 | del | 3.7E–17 | Inactivation in lymphoma [11980663 and 11839096], deletion of IKZF1 or ZNFN1A1 in acute lymphoblastic leukemia [19129520, 26050650, 29519871, 15390181] | |
| Aid | 7p15.3 | del | 6.9E–19 | Inflammatory bowel disease [28067908] | |
| Cancer | 8p22 | del | 7.5E–28 | Mutated in AML [24189654], recurrent copy number loss in breast cancer [29545918], downregulation of MiR-383 in intron of SGCZ [28243881], gene fusions NCAM2-SGCZ in metastatic small-cell gallbladder neuroendocrine carcinoma [28040546] | |
| Cancer | 8q11.1 | dup | 2.0E–31 | Upstream of | |
| Cancer | 8q24.13 | del | 1.6E–17 | Leukemia [27390356], double minute in acute myeloid leukemia [18503831], colon cancer [19691111], prostate cancer [24962028] | |
| Neuro | 10p14 | dup | 8.0E–19 | Neuroblastoma apoptosis-related RNA-binding protein [9858671] | |
| Neuro | 10q22.2 | del | 1.5E–23 | Epilepsy and glioma [26329539] | |
| Cancer | 10q26.3 | del | 8.4E–15 | Silenced in colorectal cancer [22238052], higher methylation level in colorectal cancer [23546389, 22238052, 23321599] | |
| Cancer | 11p12 | dup | 2.7E–26 | Marker for Langerhans cell sarcoma [25837753], dysregulation of CD44 in different types of cancer [25025570, 30631039, 30443182, 30211160, 30317669, 30463359] | |
| Aid | 11q13.4 | del | 1.8E–26 | Protective in sepsis [25873251], protective in inflammation [23925522], decreases the severity of multiple sclerosis in a mouse model [21857957] | |
| Cancer | 11q22.3 | dup | 9.0E–18 | NSCLC [21874247], an important molecular determinant of response to sepantronium bromide [25064833, 28465296, 25568070] | |
| Neuro | 14q24.3 | dup | 5.9E–41 | Pediatric pineal germinomas [27889662], vanishing white matter syndrome [22678813] | |
| Aid | 14q31.3 | del | 4.5E–23 | Autoantigen in SLE [23401699], prostate cancer [26890304] | |
| Cancer | 16p13.3 | del | 4.4E–50 | Increased in brain tumor [16141199] | |
| Neuro | 16p13.3 | del | 1.7E–18 | Neural crest development [16943273], loss of Sox8 alleles in Hirschsprung disease [15572147] | |
| Neuro | 17q12 | del | 9.4E–18 | ACACA deleted in autism [23375656], ACACA associated with Alzheimer’s disease [22982105] | |
| Neuro | 17q21.1 | dup | 5.7E–73 | Major depression [23671070], bipolar disorder [26746321, 25359533] | |
| Aid | 17q25.3 | dup | 6.1E–34 | Associated with N-glycosylation of human immunoglobulin G; shows pleiotropy with autoimmune diseases and haematological cancers [23382691] | |
| Cancer | 19p13.3 | dup | 7.6E–25 | Cancer [28188128], CLL [28165464], lung cancer [25982285], near | |
| Neuro | 19p13.3 | dup | 6.5E–47 | Decreased HCN2 reduces learning abilities [21593326], HCN2 gene deletion decreased neuropathic pain [21903816] | |
| Cardio | 22q11.21 | del | 3.7E–23 | Velocardiofacial syndrome [26278718] |
CNVRs presented in Table 1 show disease category-specific enrichment reaching statistical significance (P < 9 × 10−14), which is adjusted for multiple comparisons based on results obtained from repeated simulations (see Methods). See Supplementary Data 5 for extended results (P < 5 × 10−8). Numbers in brackets denote PMIDs.
Table 2 Homozygous deletion CNVRs associated with a major disease category.
| Pheno | Chr:Pos (hg18) | P value | Cases | Controls | Phenotype information | |
|---|---|---|---|---|---|---|
| Cardio | chr11:55204003–55204003 | 8.0E–10 | 157 | 1204 | Obesity [21131291] | |
| Cancer | chr11:81194909–81194909 | 2.9E–07 | 80 | 129 | Target of FOXF2; deficiency important in E > M transition | |
| Neuro | chr1:78432711–78432711 | 6.1E–06 | 20 | 6 | Missense (c.785C > T; p.L262R) and nonsense (c.903G > A; p.W301X) mutations in human GIPC3 cause congenital sensorineural hearing impairment | |
| Aid | chr1:167466049–167466049 | 3.0E–05 | 15 | 6 | Venous thromboembolism | |
| Neuro | chr13:97328242–97330758 | 3.8E–05 | 28 | 16 | Schizophrenia | |
| Neuro | chr6:67105019–67105019 | 4.1E–05 | 97 | 110 | AR retinitis pigmentosa | |
| Cardio | chr5:113188389–113197319 | 4.2E–05 | 34 | 195 | mRNA metabolism | |
| Aid | chr3:191217916–191217916 | 4.5E–05 | 28 | 23 | Homozygous loss-of-function mutation causes severe non-syndromic high myopia with early-onset cataracts. | |
| Aid | chr12:27539678–27545813 | 8.1E–05 | 10 | 2 | Receptor tyrosine kinase | |
| Neuro | chr3:163625169–163625169 | 1.1E–04 | 57 | 55 | NA | |
| Aid | chr5:117421055–117421055 | 2.5E–04 | 79 | 124 | NA | |
| Neuro | chr19:40354649–40354649 | 2.6E–04 | 51 | 49 | NA | |
| Neuro | chr15:50050557–50057972 | 2.7E–04 | 12 | 3 | Neural tube development [20178782] | |
| Aid | chr3:75511365–75532825 | 3.8E–04 | 57 | 82 | NA |
CNVRs presented in Table 2 are select loci from those that reached the experimentally defined significance threshold (P < 5 × 10−4), which adjusts for multiple comparisons based on results obtained from repeated simulations (see Methods). See Supplementary Data 6 for additional loci that are marginally associated with at least one disease category.