Philippe Lashermes, Yann Hueber, Marie-Christine Combes, Dany Severac, Alexis Dereeper.
Abstract
Allopolyploidization is a biological process that has played a major role in plant speciation and evolution. Genomic changes are common consequences of polyploidization, but their dynamics over time are still poorly understood. Coffea arabica, a recently formed allotetraploid, was chosen to study genetic changes that accompany allopolyploid formation. Both RNA-seq and DNA-seq data were generated from two genetically distant C. arabica accessions. Genomic structural variation was investigated using C. canephora, one of its diploid progenitors, as reference genome. The fate of 9047 duplicate homeologous genes was inferred and compared between the accessions. The pattern of SNP density along the reference genome was consistent with the allopolyploid structure. Large genomic duplications or deletions were not detected. Two homeologous copies were retained and expressed in 96% of the genes analyzed. Nevertheless, duplicated genes were found to be affected by various genomic changes leading to homeolog loss or silencing. Genetic and epigenetic changes were evidenced that could have played a major role in the stabilization of the unique ancestral allotetraploid and its subsequent diversification. While the early evolution of C. arabica mainly involved homeologous crossover exchanges, the later stage appears to have relied on more gradual evolution involving gene conversion and homeolog silencing.
Keywords: evolution; gene conversion; genome dominance; homoeologous recombination; polyploidy
Year: 2016 PMID: 27440920 PMCID: PMC5015950 DOI: 10.1534/g3.116.030858
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Flow chart of methods used to analyze the fate of duplicate genes in allotetraploid C. arabica.
Characterization of four regions in C. arabica (acc. Caturra) carrying contiguous genes exhibiting homeolog losses, using the 11 chromosomes of C. canephora as genomic reference sequence
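| Region | Chromosome | Number of genes | First/last gene | Size (kb) |
| --- | --- | --- | --- | --- |
| A | 2 | 4 | Cc02g15410/Cc02g15440 | 15.4 |
| B | 2 | 7 | Cc02g39930/Cc02g39990 | 68.0 |
| C | 7 | 142 | Cc07g00010/Cc07g01770 | 1197.9 |
| D | 10 | 3 | Cc10g00010/Cc10g00030 | 27.8 |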
Figure 2. GC-content normalized DNA copy number profile and FREEC-predicted copy number alterations (red: gains; blue: losses) for the C. arabica genome (acc. Caturra), using a nonoverlapping 50-kb sliding window and the 11 chromosomes of C. canephora as genomic reference sequence. Automatically predicted copy numbers are shown as a black line.
Figure 3. SNP density along the 11 homeologous chromosome groups in C. arabica (acc. Caturra). The 11 chromosomes of C. canephora (acc. DH200-94) were used as reference genome. SNP density in C. arabica was estimated in nonoverlapping 10-kb sliding windows, and coverage criteria were applied to each position to minimize the rate of false SNPs. The relative proportions (percentage of nucleotides) of transposable elements (green) and genes (blue) in C. canephora (1-Mb sliding window) are shown at the bottom.
Figure 4. Examples of regions exhibiting homeologous SNP deficit in C. arabica (acc. Caturra) on homeologous chromosome groups 2 and 7. Nonoverlapping 10-kb sliding windows were used to estimate SNP density in C. arabica (yellow track). To minimize the rate of false-positive SNPs, a minimum depth of coverage of 10 was required for a position to be taken into consideration, and positions with a depth of coverage more than twice the overall sample mean were discarded (blue track).
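The windowing and depth filters described in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `snp_density`, its parameters, the 0-based coordinates, and the choice to report SNPs per retained position are all assumptions made for illustration, and the mean depth is taken over the supplied positions rather than over a whole sample.

```python
def snp_density(depth, snp_positions, window=10_000, min_depth=10, max_factor=2.0):
    """Per-window SNP density over positions that pass the depth filters.

    depth: list of read depths, one per reference position (0-based).
    snp_positions: iterable of 0-based positions where a SNP was called.
    Returns a list of (snp_count, retained_positions, density) per window;
    density is None when no position in the window passes the filters.
    """
    # Assumption: the "overall sample mean" is approximated by the mean of
    # the depths supplied here.
    mean_depth = sum(depth) / len(depth)
    max_depth = max_factor * mean_depth

    n_windows = (len(depth) + window - 1) // window
    retained = [0] * n_windows
    snps = [0] * n_windows
    snp_set = set(snp_positions)

    for pos, d in enumerate(depth):
        # Keep positions with depth >= min_depth and <= max_factor * mean depth.
        if min_depth <= d <= max_depth:
            w = pos // window  # index of the nonoverlapping window
            retained[w] += 1
            if pos in snp_set:
                snps[w] += 1

    return [(s, r, s / r if r else None) for s, r in zip(snps, retained)]


# Toy example with 20-bp windows instead of 10 kb.
toy_depth = [12] * 20 + [5] * 5 + [100] * 5 + [12] * 10
toy_result = snp_density(toy_depth, [3, 7, 22, 27, 35], window=20)
```

In the toy example, the SNPs at positions 22 and 27 are ignored because their depths (5 and 100) fall outside the accepted range, so only positions with reliable coverage contribute to the density.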
Determination of the subgenome origin of homeolog loss (A) and silencing events (B) detected in two accessions of C. arabica
| AR41 | 174 | 120 | 39 | 15 |
| Caturra | 148 | 110 | 27 | 11 |
| Loss shared by both accessions | 143 | 108 | 25 | 10 |
| Loss not shared | 36 | 14 | 17 | 5 |
Figure 5. Distribution of loci exhibiting either homeolog loss (A) or homeolog silencing (B) identified in C. arabica (acc. Caturra) across C. canephora reference chromosomes. Single events of either homeolog loss or homeolog silencing are in red, and regions carrying contiguous genes exhibiting either homeolog loss or homeolog silencing are in blue.
Gene ontology enrichment analysis of genes exhibiting homeolog silencing shared by the two accessions of C. arabica analyzed
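| GO ID | Term | Category | FDR | P-value | Test, annotated | Reference, annotated | Test, not annotated | Reference, not annotated | Over/Under |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GO:0043229 | Intracellular organelle | C | 4.151711E-5 | 3.610184E-7 | 13 | 3977 | 53 | 3898 | Under |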
| GO:0043226 | Organelle | C | 4.151711E-5 | 3.610184E-7 | 13 | 3977 | 53 | 3898 | Under |
| GO:0043227 | Membrane-bounded organelle | C | 7.544177E-5 | 1.312031E-6 | 13 | 3881 | 53 | 3994 | Under |
| GO:0043231 | Intracellular membrane-bounded organelle | C | 7.544177E-5 | 1.312031E-6 | 13 | 3881 | 53 | 3994 | Under |
| GO:0005623 | Cell | C | 1.361508E-4 | 3.219604E-6 | 24 | 5116 | 42 | 2759 | Under |
| GO:0044464 | Cell part | C | 1.361508E-4 | 3.561238E-6 | 24 | 5098 | 42 | 2777 | Under |
| GO:0044424 | Intracellular part | C | 1.361508E-4 | 4.143721E-6 | 19 | 4510 | 47 | 3365 | Under |
| GO:0034641 | Cellular nitrogen compound metabolic process | P | 1.882623E-4 | 6.944386E-6 | 2 | 1852 | 64 | 6023 | Under |
| GO:0005622 | Intracellular | C | 1.882623E-4 | 7.366786E-6 | 21 | 4679 | 45 | 3196 | Under |
| GO:0006725 | Cellular aromatic compound metabolic process | P | 5.411463E-4 | 3.058653E-5 | 2 | 1720 | 64 | 6155 | Under |
| GO:1901360 | Organic cyclic compound metabolic process | P | 5.411463E-4 | 3.058653E-5 | 2 | 1720 | 64 | 6155 | Under |
| GO:0046483 | Heterocycle metabolic process | P | 5.411463E-4 | 3.058653E-5 | 2 | 1720 | 64 | 6155 | Under |
| GO:0006139 | Nucleobase-containing compound metabolic process | P | 5.411463E-4 | 3.058653E-5 | 2 | 1720 | 64 | 6155 | Under |
| GO:0005634 | Nucleus | C | 8.456478E-3 | 5.147422E-4 | 2 | 1413 | 64 | 6462 | Under |
| GO:0044444 | Cytoplasmic part | C | 8.769347E-3 | 6.260661E-4 | 14 | 3281 | 52 | 4594 | Under |
| GO:0010467 | Gene expression | P | 8.769347E-3 | 6.331017E-4 | 0 | 912 | 66 | 6963 | Under |
| GO:0003676 | Nucleic acid binding | F | 8.769347E-3 | 6.481691E-4 | 1 | 1146 | 65 | 6729 | Under |
| GO:0034645 | Cellular macromolecule biosynthetic process | P | 1.039839E-2 | 9.925819E-4 | 0 | 860 | 66 | 7015 | Under |
| GO:0044249 | Cellular biosynthetic process | P | 1.039839E-2 | 9.925819E-4 | 0 | 860 | 66 | 7015 | Under |
| GO:0009059 | Macromolecule biosynthetic process | P | 1.039839E-2 | 9.925819E-4 | 0 | 860 | 66 | 7015 | Under |
| GO:0044271 | Cellular nitrogen compound biosynthetic process | P | 1.039839E-2 | 9.925819E-4 | 0 | 860 | 66 | 7015 | Under |
| GO:1901576 | Organic substance biosynthetic process | P | 1.039839E-2 | 9.946287E-4 | 0 | 861 | 66 | 7014 | Under |
| GO:0006807 | Nitrogen compound metabolic process | P | 1.267849E-2 | 1.267849E-3 | 7 | 2175 | 59 | 5700 | Under |
| GO:0005737 | Cytoplasm | C | 2.685331E-2 | 2.802085E-3 | 19 | 3719 | 47 | 4156 | Under |
| GO:0044237 | Cellular metabolic process | P | 2.979136E-2 | 3.238192E-3 | 14 | 3065 | 52 | 4810 | Under |
Analysis was performed using the full set of 9047 analyzed genes as reference group and Fisher’s exact test with a false discovery rate (FDR) correction for multiple testing.
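As an illustration of the test named in this note, the following sketch computes a two-sided Fisher's exact P-value for one 2x2 table and applies a Benjamini-Hochberg adjustment to a list of P-values. Several points are assumptions rather than facts from the source: that the four count columns of each row form the 2x2 table (annotated and not annotated, in the test and reference sets), that the test was two-sided, and that the FDR correction was of the Benjamini-Hochberg type. The function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Counts taken from the "Gene expression" (GO:0010467) row, read under the
# column assumption stated above: 0 and 66 in the test set, 912 and 6963
# in the reference set.
p_gene_expression = fisher_exact_two_sided(0, 66, 912, 6963)
```

The value of `p_gene_expression` can be compared with the P-value listed for that row. Agreement would support the assumed reading of the columns, while a discrepancy would suggest that the original analysis used a different table layout or test sidedness.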