Timothy Hughes, Lars Hansson, Ibrahim Akkouh, Riad Hajdarevic, Jorunn S Bringsli, Anja Torsvik, Elin Inderhaug, Vidar M Steen, Srdjan Djurovic.
Abstract
Alpha defensins are anti-microbial peptides of the innate immune system. The defensin A1 and A3 genes are located in a repeat array of variable copy number (the DEFA1A3 locus) and encode the human neutrophil peptides 1, 2 and 3. The possibility that copy number variation (CNV) may be associated with infection susceptibility and autoimmune pathology motivated the study of DEFA1A3 CNV across populations. We enhanced two existing methods (one qPCR-based and one sequencing-based) to enable copy number estimation that discriminates between the DEFA1 and DEFA3 genes. We used these methods to quantify A1/A3 copy number variation in 2504 samples from the 1000 Genomes high-coverage dataset, and performed FiberFISH assays on selected samples to visualize the haplotypes. These methods produce accurate estimates and show that there are substantial differences between populations. The African population is a clear outlier with a high frequency of the ancestral pure DEFA1 haplotype, but also harbours exceptionally long haplotypes of 24 copies of both DEFA1 and DEFA3, whilst the East Asian population displays the highest mean level of DEFA3 copy number. Further, our findings demonstrate that qPCR can be an accurate method for CNV estimation and that defensins substantially extend the known range of copy number variation for a human protein-coding gene.
Year: 2020 PMID: 32499510 PMCID: PMC7272440 DOI: 10.1038/s41598-020-65675-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Overview of the defensin-α locus and HTS DEFA1/DEFA3 ratio estimation. Genes are shown in black and pseudogenes in grey. WG: whole genome sequencing. GenomeStrip CNV regions are the regions defined in Handsaker et al.[8], which do not capture the true nature of the underlying DEFA1A3 repeat.
DEFA1 and DEFA3 copy number distribution statistics by super-population.
| Super-population | DEFA1 Min | DEFA1 Max | DEFA1 Mean | DEFA3 Min | DEFA3 Max | DEFA3 Mean | DEFA3 Absence (%) |
|---|---|---|---|---|---|---|---|
| EUR | 0.8 | 11.6 | 5.7 | 0 | 4.4 | 1.4 | 13.1 |
| AFR | 0.9 | 14.5 | 5.7 | 0 | 10.8 | 1.5 | 28.7 |
| EAS | 0.7 | 16.1 | 6.1 | 0 | 5.1 | 1.5 | 11.5 |
| AMR | 0.6 | 10.3 | 5.2 | 0 | 4.5 | 1.4 | 12.7 |
| SAS | 1.0 | 11.1 | 5.6 | 0 | 5.3 | 1.3 | 11.2 |
| ALL | 0.6 | 16.1 | 5.7 | 0 | 10.8 | 1.4 | 16.3 |
The DEFA3 absence number is the percent of genotypes in the population with zero copies of DEFA3.
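As an illustration of how the DEFA3 absence column could be computed, here is a minimal Python sketch. It assumes that a genotype counts as "zero copies" when its DEFA3 estimate rounds to 0; the source does not state the calling rule, so that threshold, the function name `defa3_absence_percent`, and the example values are all hypothetical.

```python
def defa3_absence_percent(defa3_copy_numbers):
    """Percent of genotypes whose DEFA3 estimate rounds to zero copies.

    Assumption (not from the source): an estimate is called as zero
    copies when it rounds to 0 at integer precision.
    """
    if not defa3_copy_numbers:
        raise ValueError("need at least one genotype")
    absent = sum(1 for cn in defa3_copy_numbers if round(cn) == 0)
    return 100.0 * absent / len(defa3_copy_numbers)


# Hypothetical per-sample DEFA3 estimates, not data from the study.
example = [0.0, 1.1, 2.0, 0.1, 3.2, 1.0, 0.0, 2.1]
print(defa3_absence_percent(example))  # 3 of 8 genotypes -> 37.5
```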
Figure 2. HTS copy number estimates by super-population. (A) DEFA1 and DEFA3 copy number for all samples (light grey lines delineate genotypes with total CN of 5, 10, and 15). (B) 2D densities for each super-population. For both (A) and (B): AFR African (black), EUR European (blue), EAS East Asian (gold), SAS South Asian (green), AMR Admixed American (red).
Figure 3. Haplotypes by FiberFISH, annotated with DEFA1A3 copy number. Probe colours as defined in Fig. 1: DEFA1A3 gene (red), DEFT1P pseudogene (green), left flank (red and blue), right flank (blue only). For HG02554, qPCR estimates match FiberFISH estimates exactly, as this FiberFISH sample was used for calibration/scaling of the qPCR estimates.
Figure 4. Concordance of HTS and qPCR copy number genotyping methods. (A) DEFA1, (B) DEFA3, (C) DEFA1A3, (D) DEFA1A3 vs DEFA1 + DEFA3 (both by qPCR). ρ: Pearson correlation coefficient. Solid line: y = x. Thin lines: y = x + 1 and y = x − 1.
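The two concordance summaries named in the Figure 4 caption, the Pearson correlation coefficient ρ and the band between y = x + 1 and y = x − 1, can be sketched in Python as follows. This is only an illustrative sketch: the function names and the paired values are hypothetical, and reporting the fraction of samples inside the ±1 band is an assumption about how the thin lines might be used, not a statistic given in the source.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fraction_within_one(x, y):
    """Fraction of pairs lying between y = x - 1 and y = x + 1 (inclusive)."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(1 for a, b in zip(x, y) if abs(a - b) <= 1) / len(x)


# Hypothetical paired copy number estimates (HTS vs qPCR), not study data.
hts = [4.9, 6.1, 7.0, 8.2, 10.1]
qpcr = [5.0, 6.0, 7.4, 9.5, 10.0]
print(round(pearson(hts, qpcr), 3))
print(fraction_within_one(hts, qpcr))
```

Keeping the two summaries separate reflects that a high ρ alone does not guarantee agreement in absolute copy number, which is what the y = x line and the ±1 band in the figure address.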