Agnieszka Żmieńko, Anna Samelak, Piotr Kozłowski, Marek Figlerowicz.
Abstract
Copy number variants (CNVs) are genomic rearrangements resulting from gains or losses of DNA segments. Typically, the term refers to rearrangements of sequences larger than 1 kb. This type of polymorphism has recently been shown to be a key contributor to intra-species genetic variation, along with single-nucleotide polymorphisms and short insertion-deletion polymorphisms. Over the last decade, a growing number of studies have highlighted the importance of copy number variation (CNV) as a factor affecting human phenotype, and individual CNVs have been linked to risks for severe diseases. In plants, the exploration of the extent and role of CNV is still just beginning. Initial genomic analyses indicate that CNVs are prevalent in plants and have greatly affected plant genome evolution. Many CNV events have been observed in outcrossing and autogamous species. CNVs are usually found on all chromosomes, with CNV hotspots interspersed with regions of very low genetic variation. Although CNV is mainly associated with intergenic regions, many CNVs encompass protein-coding genes. The collected data suggest that CNV mainly affects the members of large families of functionally redundant genes. Thus, the effects of individual CNV events on phenotype are usually modest. Nevertheless, there are many cases in which CNVs for specific genes have been linked to important traits such as flowering time, plant height and resistance to biotic and abiotic stress. Recent reports suggest that CNVs may form rapidly in response to stress.
Year: 2013 PMID: 23989647 PMCID: PMC4544587 DOI: 10.1007/s00122-013-2177-7
Source DB: PubMed Journal: Theor Appl Genet ISSN: 0040-5752 Impact factor: 5.699
Fig. 1 Potential effects of CNV on gene expression. a–c Examples of CNVs that result in an elevated transcript level; d–f Examples of CNVs that result in a decreased level of the full-length transcript. Gene CNV (complete duplication or deletion) may change the effective gene dosage (a, b, d). CNV affecting an enhancer sequence may alter the transcription level without a change in gene copy number (c). Partial gene deletion (e) or insertion of a duplicated sequence (f) may disrupt gene structure and functionality. P promoter, G gene, R enhancer sequence
Genome-scale CNV genotyping studies in plant genomes
| Method | Accessions | CNVs count and characteristics | Gene content | References |
|---|---|---|---|---|
| Maize | ||||
| CGH, 2.12M NimbleGen 45–60-mer probes, matching B73 genome | Mo17 and B73 accessions | >400 CNVs and >1,700 presence–absence variants were identified (according to most stringent analysis criteria); detected differences mainly indicated lower copy number in Mo17 | At least 50 genes were located in CNV segments and 180 in presence–absence variants | Springer et al. ( |
| CGH, 105K Agilent 60-mer probes, matching 45,000 ESTs and unigenes of B73 line | 14 inbred lines, including B73 reference line | >2,000 CNVs were identified; 42 % of regions were detected only in one line; 57 % of changes indicated lower copy number in various accessions in comparison to B73; CNVs were distributed uniformly across chromosomes but higher CNV density was observed toward the telomeres | Due to probe design, all CNVs covered genic regions | Beló et al. ( |
| CGH, 120K NimbleGen 45–60-mer probes, matching 32,000 genes predicted in B73 genome | 19 inbred maize accessions, 14 wild or inbred teosinte accessions | 3,410 CNV genes had increased copy number in B73; 479 CNV genes had increased copy number in the tested accessions; CNV density resembled general genic density across the chromosomes; 86 % of structural variants were observed both in maize and in teosinte | Due to probe design, all CNVs covered genic regions; CNVs were observed in ~10 % of genes surveyed | Swanson-Wagner et al. ( |
| Whole-genome NGS, Illumina 75-bp paired-end reads, read-depth analysis, de novo assembly and annotation | Zheng58, 5003, 478, 178, Chang7-2, and Mo17 inbred lines | Only presence–absence variants were investigated; 296 genes putatively missing from one or more investigated lines were found; 570 putative novel genes were identified which were absent from B73 reference genome but present in the other of the six inbred lines; 157 genes were confirmed to be missing from B73, while about 300 are likely to be present in B73 line but not in the current genome sequence release | All analyzed presence–absence variants were in gene-coding regions; most deletion events involved only a single gene and some involved 2–4 adjacent genes; one large deletion on chromosome 6 of the Mo17 genome, spanning ~2 Mb, involved at least 18 out of 24 genes | Lai et al. ( |
| Whole-genome NGS, Illumina 76-100 bp paired-end reads, read-depth analysis | 83 maize lines, 17 | 90 % of the non-overlapping 10-kb windows showed variation in read depth (at 1 % false discovery rate) and 70 % of windows had such variation in at least 10 of analyzed lines. | 10,000 gene-coding regions (32 %) exhibited at least twofold variation in read depth | Chia et al. ( |
| Arabidopsis | ||||
| Combination of CGH (Affymetrix Tiling 1.0R arrays) and whole-genome NGS (Illumina 35-36 bp single or paired-end reads, read-depth analysis) | Eil-0, Lc-0, Sav-0, Tsu-1, Col-0 (used as a reference) accessions | On average, 55,000 25-bp tiles per accession had a relative hybridization signal ratio <−1.0 (log2) compared to the reference DNA and zero read coverage across their entire length | 1,220 (Eil-0), 1,312 (Lc-0), 1,344 (Sav-0) and 987 (Tsu-1) genes with deletions were identified; over 36 % of deletions affected coding regions and transposable element genes were over-represented; about 20 % of protein-coding gene deletions were common to the four accessions | Santuari et al. ( |
| Whole-genome NGS, Illumina 42-64 bp paired-end reads, read-depth and paired-end analysis, de novo assembly | 80 naturally inbred accessions representing eight geographic regions from Eurasia and North Africa | 1,059 copy number variable regions were inferred, each represented by 1–13 CNV genotypes; CNVs size ranged from 1 to 13 kb | 393 CNVs overlapped with coding sequences, covering over 500 protein-coding genes | Cao et al. ( |
| Whole-genome NGS, Illumina 36–75 bp single- and paired-end reads, read-depth and paired-end analysis, reference-based assembly | Ler accession (comparative analysis to Col0) | 2,315 large indels including CNVs were found in Ler, widely dispersed along chromosomes | 316 genes were affected by large indels; 130 single-copy genes had complete deletion in Ler; 107 Ler-specific genes were predicted | Lu et al. ( |
| Rice | ||||
| CGH, 720K NimbleGen 45-60-mer probes, 500 bp spacing | | 641 CNVs covering ~7.6 Mb of the rice genome were found; CNVs ranged from 1 to 180 kb; most CNVs indicated lower copy number in Guang-lu-ai 4 | 500 genes with lower copy number and 19 genes with higher copy number were identified in Guang-lu-ai 4 in comparison with Nipponbare | Yu et al. ( |
| Whole-genome NGS, Illumina 45–100 bp paired-end reads, read-depth and paired-end analysis, de novo assembly | 40 cultivated rice accessions (Nipponbare was used as a reference) and 10 accessions of wild | 1,415 novel genes were found (48 % of them were observed in only one accession and 22 % only in wild rice); 1,327 possible gene loss events were detected by read-depth analysis and 839 were supported by paired-end mapping; 1,676 CNVs with increased copy number in at least one accession were found | All analyzed presence/absence variants and over 50 % of CNVs covered genic regions; 39 % of CNV genes coded for hypothetical or functionally unknown proteins and many of the annotated genes were disease-resistance related | Xu et al. ( |
| Sorghum | ||||
| Whole-genome NGS, Illumina 44 bp paired-end reads, read-depth and paired-end analysis, de novo assembly | Keller, E-Tian, Ji2731 and BTx623 (used as a reference) accessions | 16,487 presence/absence variants with average length of 2,394 bp were found; 17,111 CNVs (13,427 gains and 3,684 losses) ranging from 2 kb to 48 Mb were detected | Presence/absence variants co-localized with 1,416 genes; CNVs co-localized with 2,600 genes; 32 of them were identified in all three lines | Zheng et al. ( |
| Soybean | ||||
| CGH, 700K NimbleGen 50–75 bp probes with 1 kb median interval; exome NGS, NimbleGen soybean exome chip, Illumina 76-bp paired-end reads | Kingwa and Williams cultivars; individuals of Williams 82 cultivar | High level of structural variation was observed between Williams and Kingwa genotypes on all 20 chromosomes; significant level of CNV was also observed among individuals of Williams 82 cultivar, mainly within known regions of heterogeneity; most of those CNVs were also detected between the parental Williams and Kingwa genotypes | 25 genes showed presence–absence variation between Williams 82 individuals; 5 of them were LRR genes; 22 of them reside within 10-Mb region of chromosome 3 | Haun et al. ( |
| CGH, 700K NimbleGen array, 50-75 bp probes with 1 kb median interval; exome NGS, NimbleGen soybean exome chip; Illumina 76-bp paired-end reads | Archer, Minsor, Noir 1, Williams 82 (used as a reference) accessions | 188–267 CNVs per genotype comparison were discovered, with the median size 18–23 kb; at least 133 presence–absence variants were found; unequal distribution of CNVs was observed (e.g., little variation on chromosomes 5 and 11 but extended variation regions on chromosomes 3 and 18) | 672 genes localized within CNVs; these were mainly copy-loss events; genes with function in disease resistance and response to biotic stress were abundant | McHale et al. ( |
| Whole-genome NGS, Illumina 45- or 76-bp paired-end reads, read-depth analysis, paired-end mapping | 17 wild and 14 cultivated accessions | Over 186,000 presence–absence variants were identified between wild and cultivated soybeans; comparison of genomes of wild W05 accession (de novo sequenced at 80×) and the reference Williams revealed over 5,500 large presence–absence variants (>500 bp) | 856 genes were localized within regions of variation between W05 and Williams 82; over 40 % of them related to binding, metabolic and catalytic processes; 28 variants were absent from genomes of all cultivated accessions and were primarily related to disease resistance and metabolism | Lam et al. ( |
| Wheat | ||||
| Liquid-phase targeted exome NGS, Illumina 40-bp single end reads, read-depth analysis | Tetraploid | 85 CNVs and 9 deletions were identified: 77 copy gain events/8 deletions were found in the cultivated genome and 8 copy gain events/1 deletion in the wild wheat | Genes within CNVs encoded proteins involved in response to biotic and abiotic stresses, regulating gene expression or translation, cellular metabolism and kinases | Saintenac et al. ( |
| Potato | ||||
| BAC-FISH analysis, using 18 randomly selected BAC clones mapping to potato chromosome 6 | Atlantic and Katahdin cultivars; selected BACs were surveyed in additional 14 cultivars | 6 BACs generated signals suggesting deletions in Atlantic and Katahdin cultivars. For BACs RH102I10 and RH83C08, deletions were detected in multiple cultivars | One BAC clone RH102I10 was analyzed in terms of gene content. It spans 19 annotated genes; 4 of them were analyzed and their normalized transcript levels correlated positively and significantly with RH102I10 copy number in different genotypes; in addition, female gametes with fewer copies of RH102I10 were found to be inferior compared with those with more copies of this CNV | Iovene et al. ( |
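Several of the studies above rely on read-depth analysis, for example by comparing depth across non-overlapping 10-kb windows and reporting regions with at least twofold variation. The following Python sketch illustrates that general idea only; it is not the pipeline used in any of the cited studies. The function names, the median normalization, the pseudocount and the toy read positions are all assumptions introduced here for illustration.

```python
from math import ceil, log2
from statistics import median


def window_depths(read_starts, chrom_length, window=10_000):
    """Count reads starting in each non-overlapping window (0-based positions)."""
    counts = [0] * ceil(chrom_length / window)
    for pos in read_starts:
        if 0 <= pos < chrom_length:
            counts[pos // window] += 1
    return counts


def log2_ratios(sample, reference, pseudocount=1.0):
    """Per-window log2(sample/reference) after median normalization (an assumed choice)."""
    s_med, r_med = median(sample), median(reference)
    if s_med == 0 or r_med == 0:
        raise ValueError("median depth is zero; cannot normalize")
    scale = r_med / s_med  # put both libraries on the same depth scale
    return [log2((s * scale + pseudocount) / (r + pseudocount))
            for s, r in zip(sample, reference)]


def call_windows(ratios, min_abs_log2=1.0):
    """Label windows whose smoothed depth ratio differs at least twofold (|log2| >= 1)."""
    return ["gain" if r >= min_abs_log2
            else "loss" if r <= -min_abs_log2
            else "neutral"
            for r in ratios]


# Toy data (hypothetical): three 10-kb windows on a 30-kb sequence.
reference_reads = [w * 10_000 + i for w in range(3) for i in range(100)]
sample_reads = (list(range(100))
                + [10_000 + i for i in range(300)]
                + [20_000 + i for i in range(10)])

ref_depth = window_depths(reference_reads, 30_000)
sample_depth = window_depths(sample_reads, 30_000)
calls = call_windows(log2_ratios(sample_depth, ref_depth))
print(calls)  # ['neutral', 'gain', 'loss']
```

Real analyses would additionally need to handle mappability, GC bias and statistical testing (for instance, a false discovery rate threshold), none of which this sketch attempts.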
Confirmed examples of CNV affecting plant phenotype
| CNV region | Attribute | Gene(s)/product(s) | Description | References |
|---|---|---|---|---|
| Soybean | ||||
| Rhg1 locus on chromosome 18, 31 kb | | | Overexpression of all three genes together (but not individual genes) provides resistance to nematode; 10 tandem copies are present in | Cook et al. ( |
| Palmer amaranth | ||||
| Distributed all over the genome | Acquired resistance to glyphosate treatment | | Increased copy number of | Gaines et al. ( |
| Barley | ||||
| Boron-tolerance QTL on chromosome 4H | High boron tolerance of Algerian landrace Sahara 3771 | | Tolerant Sahara 3771 genotype contains ~4 times more | Sutton et al. ( |
| Frost resistance-2 locus on chromosome 5, genetically linked with Vrn1-locus | vrn-H1 winter allele associated with winter-hardy genotypes and Vrn-H1 spring allele associated with non-winter-hardy genotypes | A cluster of | Tandem segmental duplications through the CBF2A–CBF4B genomic region differentiate freeze-tolerant genotypes from sensitive genotypes which carry single copies of those genes | Knox et al. ( |
| Wheat | ||||
| Vrn-1 locus on chromosome 5A | Differing vernalization requirements associated with Vrn1-A allele, which influence flowering time | | Copy number of | Díaz et al. ( |
| Ppd-1 locus on chromosome 2B | Day-neutral phenotype associated with Ppd-B1a alleles in several varieties, influencing flowering time | | Day-neutral genotypes carry 2-4 haploid copies of | Díaz et al. ( |
| Rht-D1 locus on chromosome 4D | Dominant | | Tandem segmental duplication (TSD) of a >1 Mb region results in two copies of Rht-D1b; Rht1-D1c is threefold more effective in reducing plant height than a single Rht-D1b | Pearce et al. ( |
| Rice | ||||
| Submergence 1 (Sub1) locus on chromosome 9 | Tolerance-specific allele Sub1A-1 associated with enhanced submergence tolerance in | | Presence of | Xu et al. ( |
| Maize | ||||
| Aluminum (Al) tolerance QTL in telomeric region of chromosome 6 | Al tolerance associated with ZmMATE1 gene in a tolerant line Al237 | | Tandem triplication of | Maron et al. ( |
| Tunicate1 (Tu1) locus on long arm of chromosome 4 | A dominant mutation causing a pleiotropic phenotype; it affects phase transition, branch meristem formation, spikelet initiation, and sex determination; the predominant feature is the tunicate phenotype, in which mature kernels of the cob are covered by glumes | | In pod corn, the 5′ regulatory region of the ZMM19 gene is fused by a 1.8-Mb chromosomal inversion to the 3′ region of a gene expressed in the inflorescence, which leads to a mild half-tunicate phenotype. A 30-kb tandem duplication of the rearranged region results in the severe tunicate phenotype observed in some plants | Han et al. ( |
Fig. 2 Gene CNV contributes to wheat phenotypic diversity. a CNV of Vrn-A1 gene controls flowering time by affecting vernalization requirement; b CNV of Ppd-B1 controls flowering time by affecting photoperiod sensitivity; c CNV of Rht-D1b gene (a truncated version of Rht-D1a) determines severity of plant dwarfism phenotype. In all three cases, the impact of gene copy number on observed phenotype has been verified experimentally. Source data: a, b Díaz et al. (2012); c Li et al. (2012)
Fig. 3 Glyphosate resistance in Palmer amaranth mediated by CNV of EPSPS gene. a Graphical representation of the shikimate pathway. Step 7 is catalyzed by EPSPS enzyme; b–d mechanism of EPSPS inhibition by glyphosate and its overcoming by increased number of EPSPS gene copies. In absence of glyphosate, PEP and S3P bind to EPSPS (b). When glyphosate is present, it competitively binds to EPSPS, mimicking an intermediate state of the ternary enzyme–substrates complex and inhibiting EPSPS (c). Amplification of EPSPS gene leads to production of additional protein molecules and PEP binding, even in presence of glyphosate (d). e Differences in EPSPS gene copy number between glyphosate-susceptible and glyphosate-resistant Palmer amaranth individuals. EPSPS 5-enolpyruvylshikimate-3-phosphate synthase, PEP phosphoenolpyruvate, S3P shikimate-3-phosphate, EPSP 5-enolpyruvylshikimate 3-phosphate, G glyphosate
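The dosage argument in panels b–d of Fig. 3 can be illustrated with the textbook rate law for competitive inhibition. This is a sketch under stated assumptions and is not taken from the source: it assumes Michaelis–Menten kinetics with respect to PEP at saturating S3P, and it assumes that the total EPSPS concentration scales linearly with gene copy number n. The symbols k_cat, K_m, K_i and e_0 (enzyme produced per gene copy) are introduced here for illustration.

```latex
% Competitive inhibition of EPSPS by glyphosate (G) with respect to PEP.
% Assumption: total enzyme [E]_T = n e_0, where n is the EPSPS copy number.
\[
  v \;=\; \frac{k_{\mathrm{cat}}\,[E]_T\,[\mathrm{PEP}]}
               {K_m\left(1 + \dfrac{[G]}{K_i}\right) + [\mathrm{PEP}]},
  \qquad [E]_T = n\,e_0 .
\]
```

Under these assumptions the rate v is proportional to n at any fixed glyphosate concentration, so amplifying the gene raises pathway flux even though each enzyme molecule remains equally sensitive to the inhibitor.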