Vinicius H da Silva, Veronika N Laine, Mirte Bosse, Kees van Oers, Bert Dibbits, Marcel E Visser, Richard P M A Crooijmans, Martien A M Groenen.
Abstract
BACKGROUND: Understanding variation in genome structure is essential for understanding phenotypic differences within populations and the evolutionary history of species. A promising form of this structural variation is copy number variation (CNV). CNVs can be generated by different recombination mechanisms, such as non-allelic homologous recombination, that rely on specific characteristics of the genome architecture. These structural variants can therefore be more abundant at particular genes, ultimately leading to variation in phenotypes under selection. Detailed characterization of CNVs can therefore reveal evolutionary footprints of selection and provide insight into their contribution to phenotypic variation in wild populations.
Keywords: Duplication; Genetic variation; Inheritance; Parus major; Recombination
Year: 2018 PMID: 29703149 PMCID: PMC6389189 DOI: 10.1186/s12864-018-4577-1
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 CNV inheritance in mother-daughter family structure. We inferred the percentage of CNVs in mothers that overlap CNVs of the same state (gain or loss) in their respective daughters. The x-axis shows groups of CNVs classified by the number of SNP probes supporting each CNV; CNVs supported by 50 or more SNP probes are grouped together. The y-axis shows the percentage of inherited CNVs, i.e. the ratio of inherited CNVs to all CNVs in each probe group. Dot size reflects the number of CNVs per group. a: All CNVRs. b: Polymorphic CNVRs (≥ 21 birds, at least 1% of the population with CNVs identified)
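The calculation described in the Fig. 1 caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the dictionary keys (`chrom`, `start`, `end`, `state`, `n_probes`), the function names, and the "any shared base counts as overlap" rule are assumptions made for the example.

```python
from collections import defaultdict


def overlaps(a, b):
    """True if two CNVs share chromosome, state (gain/loss) and at least one base.

    Assumption: any positional overlap counts; the source does not state
    a minimum overlap fraction.
    """
    return (
        a["chrom"] == b["chrom"]
        and a["state"] == b["state"]
        and a["start"] <= b["end"]
        and b["start"] <= a["end"]
    )


def inherited_percentage(pairs, cap=50):
    """Percentage of maternal CNVs also seen in the daughter, per probe group.

    pairs: list of (mother_cnvs, daughter_cnvs) tuples.
    CNVs supported by `cap` or more probes are pooled into one group.
    Returns {probe_group: (percent_inherited, n_cnvs_in_group)}.
    """
    total = defaultdict(int)
    inherited = defaultdict(int)
    for mother_cnvs, daughter_cnvs in pairs:
        for cnv in mother_cnvs:
            group = min(cnv["n_probes"], cap)
            total[group] += 1
            if any(overlaps(cnv, d) for d in daughter_cnvs):
                inherited[group] += 1
    return {g: (100.0 * inherited[g] / total[g], total[g]) for g in sorted(total)}
```

The second element of each returned tuple corresponds to the dot size in the figure (number of CNVs per group).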
Fig. 2 CNVR example and the strategy used to estimate the frequency of CNVs that share breakpoints. The frequency for a given genomic interval is the number of CNVs starting or ending at certain SNP probes. All windows around the breakpoints are 10 kb wide and may have one frequency for the common start positions and one for the common end positions
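A minimal sketch of the breakpoint-frequency idea in Fig. 2 is given below. It assumes CNVs are `(chrom, start, end)` tuples whose coordinates are SNP probe positions, and that each 10 kb window is centred on the breakpoint; both the data layout and the centring are assumptions for illustration, since the caption does not specify them.

```python
from collections import Counter

WINDOW = 10_000  # window size in bp, as stated in the caption


def breakpoint_frequencies(cnvs, window=WINDOW):
    """Count how many CNVs start and how many end at each probe position.

    cnvs: iterable of (chrom, start, end) tuples.
    Returns one row per distinct breakpoint, with separate start and end
    frequencies and a window centred on the breakpoint (assumed layout).
    """
    cnvs = list(cnvs)
    starts = Counter((chrom, start) for chrom, start, _ in cnvs)
    ends = Counter((chrom, end) for chrom, _, end in cnvs)
    half = window // 2
    rows = []
    for chrom, pos in sorted(set(starts) | set(ends)):
        rows.append(
            {
                "chrom": chrom,
                "window": (max(0, pos - half), pos + half),
                "start_freq": starts[(chrom, pos)],  # Counter gives 0 if absent
                "end_freq": ends[(chrom, pos)],
            }
        )
    return rows
```

Keeping start and end counts separate mirrors the caption's statement that a window may carry one frequency for common start positions and one for common end positions.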
Homologous regions in the great tit genome with more than 90% sequence identity, and the respective proportions of intra- and interchromosomal homologies
| Homology | Number of regions | Total size (Mb) | Similarity (% ± SD) |
|---|---|---|---|
| Intrachromosomal | 1111 | 2.66 | 92.97 ± 2.26 |
| Interchromosomal | 879 | 1.58 | 92.78 ± 2.1 |
| All | 1512 | 3.44 | 92.89 ± 2.25 |
Fig. 3 Overlap of CNV breakpoints with repetitive regions in the genome. CNV breakpoints with a frequency of 10 or more are grouped together. a: Homologous regions with more than 90% similarity and 1 kb in length. b: Masked regions such as retroelements, RNA-related regions, DNA transposons and tandem repeats
Fig. 4 Colocalization of CNV breakpoints (10 kb windows with a frequency ≥ 4) and homologous regions binned by sequence identity. The y-axis depicts the ratio between the observed and expected numbers of overlaps (based on 10,000 random simulations) between CNV breakpoints and homologous regions. Each homologous region is placed in one of the bins on the x-axis, which are based on inter- or intrachromosomal percent identity. Permutation p-values are based on the number of random simulations that yielded more overlaps than observed (* ≤ 0.05 and *** ≤ 0.001)
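The observed/expected ratio and permutation p-value described for Fig. 4 could be computed along these lines. This is a sketch under simplifying assumptions: a single linear chromosome of length `genome_length`, half-open `(start, end)` intervals, and random relocation of breakpoint windows that preserves their length. The caption does not describe the randomization scheme, so all names and the shuffling strategy here are hypothetical.

```python
import random


def count_overlaps(windows, regions):
    """Number of windows overlapping at least one region (half-open intervals)."""
    return sum(
        any(ws < r_end and r_start < we for r_start, r_end in regions)
        for ws, we in windows
    )


def permutation_test(windows, regions, genome_length, n_sim=10_000, seed=0):
    """Return (observed, expected, observed/expected ratio, p-value).

    The p-value is the fraction of simulations with MORE overlaps than
    observed, following the caption's wording.
    """
    rng = random.Random(seed)
    observed = count_overlaps(windows, regions)
    sims = []
    for _ in range(n_sim):
        shuffled = []
        for ws, we in windows:
            length = we - ws
            # Place the window uniformly at random, keeping it inside the genome.
            new_start = rng.randint(0, genome_length - length)
            shuffled.append((new_start, new_start + length))
        sims.append(count_overlaps(shuffled, regions))
    expected = sum(sims) / n_sim
    ratio = observed / expected if expected > 0 else float("inf")
    p_value = sum(s > observed for s in sims) / n_sim
    return observed, expected, ratio, p_value
```

A real analysis would typically shuffle within chromosomes and exclude gaps or unassembled regions; those refinements are omitted here for brevity.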
Fig. 5 Overlap of CNV breakpoints with functional features and regions prone to breakage. a: CpG sites. b: Transcription start sites (TSSs). c: AT-rich intervals. CNV breakpoints observed in 30 birds or more are grouped together for the CpG and AT-rich features. For TSSs, in contrast, we grouped breakpoints with a frequency of 10 or more, because most groups of high-frequency CNV breakpoints are small, which can impair confident comparison with scarcer features such as TSSs (compared with CpG or AT-rich sites)
Biological pathways enriched at CNVRs in the great tit genome
| ID | Description | Number of proteins | Adjusted p-value | Protein ratio |
|---|---|---|---|---|
| hsa05412 | Arrhythmogenic right ventricular cardiomyopathy (ARVC) | 59 | 5.15 × 10⁻⁶ | 0.728 |
| hsa04020 | Calcium signaling pathway | 126 | 1.16 × 10⁻⁴ | 0.583 |
| hsa04360 | Axon guidance | 127 | 3.99 × 10⁻⁴ | 0.57 |
| hsa04724 | Glutamatergic synapse | 78 | 8.2 × 10⁻⁴ | 0.609 |
| hsa04514 | Cell adhesion molecules (CAMs) | 75 | 8.2 × 10⁻⁴ | 0.638 |
| hsa04925 | Aldosterone synthesis and secretion | 60 | 8.2 × 10⁻⁴ | 0.61 |
| hsa04713 | Circadian entrainment | 67 | 3.1 × 10⁻³ | 0.604 |
| hsa00220 | Arginine biosynthesis | 19 | 3.15 × 10⁻³ | 0.826 |
| hsa04970 | Salivary secretion | 48 | 1.34 × 10⁻² | 0.615 |
| hsa04022 | cGMP-PKG signaling pathway | 105 | 1.73 × 10⁻² | 0.591 |
| hsa05410 | Hypertrophic cardiomyopathy (HCM) | 55 | 1.73 × 10⁻² | 0.536 |
| hsa04740 | Olfactory transduction | 29 | 1.73 × 10⁻² | 0.674 |
| hsa05010 | Alzheimer’s disease | 78 | 3.84 × 10⁻² | 0.545 |
| hsa04750 | Inflammatory mediator regulation of TRP channels | 60 | 4.92 × 10⁻² | 0.561 |
| hsa05414 | Dilated cardiomyopathy | 57 | 4.92 × 10⁻² | 0.564 |
ID = pathway identification code; Description = pathway name; Number of proteins = number of protein names with genes overlapping CNVRs; Adjusted p-value = enrichment FDR corrected p-value; Protein ratio = ratio between protein names with genes in CNVRs and all protein names assigned to a specific pathway
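The two derived columns in the table legend can be illustrated with a short sketch. The protein ratio follows directly from the legend. For the FDR correction, the legend does not name the procedure, so the Benjamini-Hochberg step-up method below is an assumption, chosen only because it is a common way to obtain FDR-adjusted p-values. Function names and example numbers are hypothetical.

```python
def protein_ratio(n_in_cnvrs, n_in_pathway):
    """Proteins with genes overlapping CNVRs divided by all proteins in the pathway."""
    return n_in_cnvrs / n_in_pathway


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: BH is used here as one common FDR procedure; the source
    only says the p-values are FDR corrected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, `protein_ratio(59, 81)` gives roughly 0.728; the denominator 81 is a hypothetical pathway size chosen so the result resembles the first table row, not a value reported in the source.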