Feyza Yilmaz1,2, Megan Null3, David Astling4, Hung-Chun Yu2, Joanne Cole2,5, Stephanie A Santorico3,5,6, Benedikt Hallgrimsson7, Mange Manyama8, Richard A Spritz2,5, Audrey E Hendricks3,5,6, Tamim H Shaikh9,10.
Abstract
BACKGROUND: Copy number variations (CNVs) account for a substantial proportion of inter-individual genomic variation. However, a majority of genomic variation studies have focused on single-nucleotide variations (SNVs), with limited genome-wide analysis of CNVs in large cohorts, especially in populations that are under-represented in genetic studies including people of African descent.
Keywords: African; Bantu; CNV; Copy number variation; Genome-wide
Year: 2021 PMID: 34001112 PMCID: PMC8130444 DOI: 10.1186/s12920-021-00978-z
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 CNV analysis and filtering pipeline. Workflow showing the filtering steps applied to detected CNVs to obtain a set of high-confidence CNVs used for further analysis
Fig. 2 CNV blocks. a Schematic of the delineation of CNV blocks, followed by determination of the total count within the Bantu cohort and categorization based on frequency. Black and gray rectangles represent five overlapping CNVs observed in different individuals (1–5); the letters a–h, j, and k mark the start and end coordinates of the CNVs. Blue rectangles represent CNV blocks. b An actual example of CNV block delineation from our CNV dataset
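The block delineation described in the Fig. 2 caption can be illustrated with a minimal sketch. It assumes one plausible reading of the schematic: every CNV start or end coordinate is a block boundary, and each block is annotated with the number of CNVs that span it. The function name `cnv_blocks`, the half-open interval convention, and the example coordinates are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def cnv_blocks(cnvs: List[Interval]) -> List[Tuple[int, int, int]]:
    """Split overlapping CNVs at every breakpoint and count CNVs per block.

    Assumption: a block is the segment between two consecutive breakpoints,
    and its count is the number of CNVs that fully span that segment.
    """
    # Every CNV start or end is a potential block boundary.
    breakpoints = sorted({p for start, end in cnvs for p in (start, end)})
    blocks = []
    for left, right in zip(breakpoints, breakpoints[1:]):
        count = sum(1 for start, end in cnvs if start <= left and right <= end)
        if count > 0:  # skip gaps that no CNV covers
            blocks.append((left, right, count))
    return blocks


# Three hypothetical overlapping CNVs from different individuals.
example = [(100, 500), (200, 600), (300, 400)]
blocks = cnv_blocks(example)
for left, right, count in blocks:
    print(left, right, count)
```

Under this reading, the per-block counts are what would then be binned into frequency categories; the thresholds for those categories are not given in this section, so they are left out of the sketch.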
Number and size distribution of CNVs in Bantu Africans
| CNV length | Number of probes | Loss (count) | Gain (count) | Total (count) |
|---|---|---|---|---|
| 1–10 kb | 5–10 | 129,752 | 4878 | 134,630 |
| | 11–25 | 80,315 | 14,633 | 94,948 |
| | 26–50 | 11,733 | 4049 | 15,782 |
| | 51–100 | 985 | 910 | 1895 |
| | > 100 | 4 | 55 | 59 |
| | Total | 222,789 | 24,525 | 247,314 |
| > 10–100 kb | 5–10 | 15,900 | 1680 | 17,580 |
| | 11–25 | 45,287 | 9715 | 55,002 |
| | 26–50 | 41,636 | 12,560 | 54,196 |
| | 51–100 | 18,785 | 6590 | 25,375 |
| | > 100 | 3714 | 2323 | 6037 |
| | Total | 125,322 | 32,868 | 158,190 |
| > 100–300 kb | 5–10 | 37 | 10 | 47 |
| | 11–25 | 650 | 76 | 726 |
| | 26–50 | 1122 | 492 | 1614 |
| | 51–100 | 1373 | 1119 | 2492 |
| | > 100 | 3413 | 1990 | 5403 |
| | Total | 6595 | 3687 | 10,282 |
| ≥ 300 kb | 5–10 | 2 | 4 | 6 |
| | 11–25 | 1 | 1 | 2 |
| | 26–50 | 12 | 28 | 40 |
| | 51–100 | 8 | 9 | 17 |
| | > 100 | 298 | 728 | 1026 |
| | Total | 321 | 770 | 1091 |
Fig. 3 Genomic map of CNVRs. CNVRs detected in our cohort are shown as colored density plots across individual chromosomes represented by ideograms. Density was calculated by dividing the genome into equal-sized windows (n = 1,000,000) and counting the number of CNVRs overlapping each window. Color key: red, loss CNVRs; blue, gain CNVRs; green, loss and gain CNVRs
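The windowed density calculation in the Fig. 3 caption can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `window_counts`, the half-open coordinates, and the toy sequence length are assumptions, and a small window count is used in place of the 1,000,000 windows mentioned in the caption.

```python
from typing import List, Tuple


def window_counts(regions: List[Tuple[int, int]], length: int, n_windows: int) -> List[int]:
    """Count regions overlapping each of n_windows equal-sized windows.

    Regions are half-open [start, end) intervals on a sequence of the given
    length. A region spanning several windows is counted once in each.
    """
    size = -(-length // n_windows)  # ceiling division so the windows cover the sequence
    counts = [0] * n_windows
    for start, end in regions:
        first = start // size
        last = min((end - 1) // size, n_windows - 1)
        for w in range(first, last + 1):
            counts[w] += 1
    return counts


# Toy example: a 500 bp sequence split into 5 windows of 100 bp each.
counts = window_counts([(0, 150), (120, 130), (400, 450)], length=500, n_windows=5)
print(counts)
```

Counting a region in every window it overlaps matches the caption's wording ("overlapping each of the windows"); counting each region only once, for example by midpoint, would be a different design choice.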
Bantu CNVRs overlap with CNV datasets
| CNV datasets | Total CNVRs |
|---|---|
| All four | 1952 |
| Any three | 10,046 |
| Any two | 4712 |
| DGV only | 338 |
| gnomAD only | 1 |
| Low mappability regions only | 396 |
| African CNVR only | 1 |
| None | 48 |
| Total | 17,494 |
DGV: CNVRs generated from Database of Genomic Variants CNVs; gnomAD: CNVRs generated from Genome Aggregation Database CNVs; African CNVR: CNVRs identified by Nyangiri and colleagues (44); All four: CNVRs observed in all four datasets; Any three and Any two: CNVRs observed in any three or any two of the above datasets, respectively
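The row categories in the overlap table can be expressed as a small classification rule. The sketch below assumes that overlap with each of the four comparison datasets has already been determined for a CNVR, and it forms every single-dataset label uniformly as "<name> only". The function name `overlap_category` and the `DATASETS` tuple are hypothetical, and the overlap criterion itself is not shown because it is not stated in this section.

```python
from typing import Set

# The four comparison datasets named in the table.
DATASETS = ("DGV", "gnomAD", "Low mappability regions", "African CNVR")


def overlap_category(hits: Set[str]) -> str:
    """Map the set of datasets a CNVR overlaps to a table row label."""
    unknown = hits - set(DATASETS)
    if unknown:
        raise ValueError(f"unknown dataset(s): {sorted(unknown)}")
    n = len(hits)
    if n == 4:
        return "All four"
    if n == 3:
        return "Any three"
    if n == 2:
        return "Any two"
    if n == 1:
        return f"{next(iter(hits))} only"
    return "None"  # overlaps no dataset: a candidate novel CNVR


print(overlap_category({"DGV", "gnomAD"}))
print(overlap_category(set()))
```

Because the categories are mutually exclusive, the row counts add up to the total number of CNVRs (17,494), and the "None" row (48) corresponds to CNVRs that overlap none of the compared datasets.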
Fig. 4 Novel CNVRs. The chromosomal locations of CNVRs detected in the Bantu cohort that did not overlap with the known CNV datasets included in our comparison analysis. Vertical, colored lines represent individual CNVRs. Color key: red, loss; blue, gain; green, loss and gain