Danilo Augusto Sforça1, Sonia Vautrin2, Claudio Benicio Cardoso-Silva1, Melina Cristina Mancini1, María Victoria Romero-da Cruz1, Guilherme da Silva Pereira3, Mônica Conte1, Arnaud Bellec2, Nair Dahmer1, Joelle Fourment2, Nathalie Rodde2, Marie-Anne Van Sluys4, Renato Vicentini1, Antônio Augusto Franco Garcia3, Eliana Regina Forni-Martins1, Monalisa Sampaio Carneiro5, Hermann Paulo Hoffmann5, Luciana Rossini Pinto6, Marcos Guimarães de Andrade Landell6, Michel Vincentz1, Helene Berges2, Anete Pereira de Souza1.
Abstract
Sugarcane (Saccharum spp.) is highly polyploid and aneuploid. Modern cultivars are derived from hybridization between S. officinarum and S. spontaneum. This combination results in a genome with variable ploidy among loci, a huge genome size (~10 Gb), and a high content of repetitive regions. An approach combining genomics, transcriptomics, and genetic mapping can improve our understanding of sugarcane genetics. The hypothetical HP600 and Centromere Protein C (CENP-C) genes from sugarcane were used to elucidate the allelic expression and the genomic and genetic behavior of this complex polyploid. The physically linked, side-by-side genes HP600 and CENP-C were found in two different homeologous chromosome groups with ploidies of eight and ten. The first region (Region01) was orthologous to a Sorghum bicolor region, with all haplotypes of HP600 and CENP-C expressed, although HP600 exhibited unbalanced haplotype expression. The second region (Region02) was a scrambled sugarcane sequence formed from different noncollinear genes and containing partial duplications of HP600 and CENP-C (paralogs). This duplication resulted in a non-expressed HP600 pseudogene and a recombined fusion of CENP-C with the ortholog of Sobic.003G299500, with at least two chimeric gene haplotypes expressed. The duplication was determined to have occurred after the separation of sorghum and sugarcane and before the formation of the Saccharum genus. A linkage map was constructed using markers from the nonduplicated part of Region01 and from the duplicated region (Region01 and Region02). We compared the physical and linkage maps, demonstrating that markers located in duplicated regions can be mapped together with markers in the nonduplicated region. Our results contribute directly to the improvement of linkage mapping in complex polyploids and improve the integration of physical and genetic data for sugarcane breeding programs. Thus, we describe the complexity involved in sugarcane genetics, genomics, and allelic dynamics, which can be useful for understanding complex polyploid genomes.
Keywords: chimerical gene; genetic mapping; homologs; physical mapping; polyploid; sugarcane
Year: 2019 PMID: 31134109 PMCID: PMC6514446 DOI: 10.3389/fpls.2019.00553
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. Schematic representation of the sugarcane BAC haplotypes from Region01 and Region02. Squares of the same color represent sugarcane genes orthologous to Sorghum bicolor genes. Dotted lines connect homologous sugarcane genes at different positions. In Region02, the CENP-C haplotypes are represented by two squares (blue and pink), where each square represents a partial gene fusion. The dark gray strip represents the region shared by Region01 and Region02 (the duplication). The genes in light gray (from S. bicolor) were not found in the sugarcane BACs. The representation is not to scale. The orientation of transcription is indicated by the direction of the arrow at the end of each gene.
Figure 2. Representation of each sugarcane BAC from Region01 and Region02. Arrows and rectangles of the same color represent homologous genes in sugarcane. Black rectangles represent repeat regions. Yellow lines represent gaps. Similar regions are connected between BACs by gray shading. The orientation of transcription is indicated by the direction of the arrow at the end of each gene. The representation is to scale.
Figure 3. FISH of the sugarcane BACs. (A) Hybridization of BAC Shy065N22 in mitotic cells of sugarcane variety SP80-3280, showing eight signals for Region01. (B) Hybridization of BAC Shy048L15 in mitotic cells of sugarcane variety SP80-3280, showing 10 signals for Region02.
Figure 4. Fusion gene formation of CENP-C and Sobic.003G299500. (A) Genomic location of sorghum CENP-C and Sobic.003G299500. (B) Sugarcane genomic CENP-C haplotypes in Region01 (all expressed). (C) Partially duplicated sugarcane paralogs of the CENP-C and Sobic.003G299500 haplotypes in Region02 (only haplotypes XI/XII/XIII and haplotype XIV have evidence of expression). (D) Sugarcane ortholog of Sobic.003G299500 found in the sugarcane R570 BAC library. (E) Transcripts from sugarcane SP80-3280 mapped against the CDS of the sugarcane CENP-C haplotypes from Region01. (F) Transcripts from sugarcane SP80-3280 mapped against the sugarcane chimeric paralogs of CENP-C and Sobic.003G299500. The transcripts span the fusion point of the paralogs, providing evidence of fusion gene formation. (G) Transcripts from sugarcane SP80-3280 mapped against the CDS of the sugarcane R570 Sobic.003G299500 ortholog.
Genomic frequencies of the SNPs in the HP600 haplotypes in Region01.
| 1 | C | G -> C | SNP (transversion) | 12 | 443 | 101 | Yes | 0.23 | 1 | 7 | 0.125 | 2.32E-09 | 2 | 6 | 0.25 | 2.98E-01 |
| 2 | – | -C | Deletion | 78 | 515 | 28 | Yes | 0.05 | 1 | 7 | 0.125 | 1.13E-07 | 2 | 6 | 0.25 | 4.76E-32 |
| 3 | T | C -> T | SNP (transition) | 133 | 542 | 38 | Yes | 0.07 | 1 | 7 | 0.125 | 5.16E-05 | 2 | 6 | 0.25 | 1.62E-27 |
| 4 | A | G -> A | SNP (transition) | 153 | 577 | 33 | Yes | 0.06 | 1 | 7 | 0.125 | 9.76E-08 | 2 | 6 | 0.25 | 1.56E-34 |
| 5 | TT | GG -> TT | Substitution | 166 | 699 | 137 | Yes | 0.2 | 1 | 7 | 0.125 | 1.18E-07 | 2 | 6 | 0.25 | 8.85E-04 |
| 6 | T | C -> T | SNP (transition) | 263 | 569 | 55 | No | 0.1 | 1 | 7 | 0.125 | 4.23E-02 | 1 | 7 | 0.125 | 4.23E-02 |
| 7 | – | (GAG)3 -> (GAG)2 | Deletion (tandem repeat) | 283 | 654 | 42 | No | 0.06 | 1 | 7 | 0.125 | 4.35E-07 | 1 | 7 | 0.125 | 4.35E-07 |
| 8 | C | T -> C | SNP (transition) | 429 | 849 | 83 | No | 0.1 | 1 | 7 | 0.125 | 1.68E-02 | 1 | 7 | 0.125 | 1.68E-02 |
| 9 | A | G -> A | SNP (transition) | 434 | 993 | 69 | No | 0.07 | 1 | 7 | 0.125 | 1.68E-08 | 1 | 7 | 0.125 | 1.68E-08 |
| 10 | C | G -> C | SNP (transversion) | 436 | 1035 | 275 | Yes | 0.27 | 2 | 6 | 0.25 | 2.51E-01 | 3 | 5 | 0.375 | 1.196E-13 |
| 11 | T | G -> T | SNP (transversion) | 463 | 936 | 56 | No | 0.06 | 1 | 7 | 0.125 | 5.11E-11 | 1 | 7 | 0.125 | 5.11E-11 |
| 12 | A | C -> A | SNP (transversion) | 519 | 679 | 57 | No | 0.08 | 1 | 7 | 0.125 | 9.10E-04 | 1 | 7 | 0.125 | 9.10E-04 |
Genome and transcriptome SNPs were used. The global expression (across diverse tissues) was used to determine whether the genomic frequency could explain the transcription frequency (H0). A binomial test was used to verify H0. The marked p-values reflect the acceptance of H0.
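The binomial test mentioned in the note above can be illustrated with a minimal sketch. It assumes an exact two-sided test in which the variant read count is compared against the genomic frequency implied by the haplotype dosage (for example, 1 variant haplotype out of 8 gives 0.125). The function names `binom_logpmf` and `binom_test_two_sided` are hypothetical, and the exact test variant and software used by the authors are not stated here, so computed p-values may differ slightly from the tabulated ones.

```python
import math

def binom_logpmf(k, n, p):
    """Log-probability of k successes in n trials, for 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def binom_test_two_sided(k, n, p):
    """Exact two-sided binomial test (assumed variant of the test).

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under H0: success probability == p.
    """
    log_obs = binom_logpmf(k, n, p)
    tol = 1e-7  # tolerance for floating-point ties
    total = 0.0
    for i in range(n + 1):
        lp = binom_logpmf(i, n, p)
        if lp <= log_obs + tol:
            total += math.exp(lp)
    return min(1.0, total)

# First HP600 row: 101 variant reads out of 443 total reads.
variant_reads, coverage = 101, 443

# H0 under a dosage of 1 variant : 7 reference haplotypes (1/8).
p_dosage_1 = binom_test_two_sided(variant_reads, coverage, 1 / 8)
# H0 under a dosage of 2 variant : 6 reference haplotypes (2/8).
p_dosage_2 = binom_test_two_sided(variant_reads, coverage, 2 / 8)

print(f"dosage 1:7 -> p = {p_dosage_1:.3g}")
print(f"dosage 2:6 -> p = {p_dosage_2:.3g}")
```

A small p-value rejects H0, meaning the transcript frequency is not explained by that genomic dosage. For this row the 1:7 dosage is rejected while the 2:6 dosage is not, which matches the direction of the tabulated values.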
Genomic frequencies of the SNPs in the CENP-C haplotypes in Region01 and Region02.
| 1 | G | C -> G | SNP (transversion) | 106 | 16 | 13 | Yes | 0.81 | 5 | 3 | 0.63 | 1.95E-01 | 4 | 4 | 0.5 | 2.13E-02 |
| 2 | G | A -> G | SNP (transition) | 150 | 19 | 8 | Yes | 0.42 | 1 | 7 | 0.13 | 1.25E-03 | 2 | 6 | 0.25 | 1.08E-01 |
| 3 | C | G -> C | SNP (transversion) | 246 | 34 | 7 | Yes | 0.21 | 1 | 7 | 0.13 | 1.87E-01 | 2 | 6 | 0.25 | 6.93E-01 |
| 4 | T | A -> T | SNP (transversion) | 369 | 65 | 7 | Yes | 0.11 | 1 | 7 | 0.13 | 8.51E-01 | 2 | 6 | 0.25 | 6.14E-03 |
| 5 | A | G -> A | SNP (transition) | 371 | 68 | 19 | No | 0.28 | 1 | 7 | 0.13 | 6.21E-04 | 1 | 7 | 0.13 | 6.21E-04 |
| 6 | C | T -> C | SNP (transition) | 390 | 64 | 15 | No | 0.23 | 1 | 7 | 0.13 | 1.32E-02 | 1 | 7 | 0.13 | 1.32E-02 |
| 7 | G | T -> G | SNP (transversion) | 513 | 46 | 12 | Yes | 0.26 | 3 | 5 | 0.38 | 1.28E-01 | 4 | 4 | 0.5 | 1.64E-03 |
| 8 | A | G -> A | SNP (transition) | 518 | 45 | 10 | Yes | 0.22 | 2 | 6 | 0.25 | 7.34E-01 | 3 | 5 | 0.375 | 4.40E-02 |
| 9 | T | G -> T | SNP (transversion) | 731 | 54 | 8 | Yes | 0.15 | 2 | 6 | 0.25 | 1.14E-01 | 3 | 5 | 0.375 | 3.58E-04 |
| 10 | C | A -> C | SNP (transversion) | 1008 | 56 | 9 | No | 0.16 | 1 | 7 | 0.13 | 4.17E-01 | 1 | 7 | 0.13 | 4.17E-01 |
| 11 | T | C -> T | SNP (transition) | 1061 | 91 | 29 | Yes | 0.32 | 2 | 6 | 0.25 | 1.46E-01 | 3 | 5 | 0.375 | 2.81E-01 |
| 12 | T | C -> T | SNP (transition) | 1088 | 77 | 41 | Yes | 0.53 | 4 | 4 | 0.50 | 6.48E-01 | 3 | 5 | 0.375 | 6.37E-03 |
| 13 | T | C -> T | SNP (transition) | 1190 | 76 | 9 | Yes | 0.12 | 2 | 6 | 0.25 | 7.49E-03 | 3 | 5 | 0.375 | 1.10E-06 |
| 14 | A | G -> A | SNP (transition) | 1209 | 76 | 20 | No | 0.26 | 1 | 7 | 0.13 | 1.31E-03 | 1 | 7 | 0.13 | 1.31E-03 |
| 15 | T | A -> T | SNP (transversion) | 1251 | 62 | 10 | Yes | 0.16 | 2 | 6 | 0.25 | 1.41E-01 | 3 | 5 | 0.375 | 3.29E-04 |
| 16 | G | A -> G | SNP (transition) | 1255 | 62 | 55 | Yes | 0.89 | 6 | 2 | 0.75 | 1.19E-02 | 5 | 3 | 0.625 | 5.15E-06 |
| 17 | – | -ATG | Deletion | 1307 | 75 | 9 | Yes | 0.12 | 1 | 7 | 0.13 | 1.00E+00 | 2 | 6 | 0.25 | 7.38E-03 |
| 18 | G | A -> G | SNP (transition) | 1314 | 90 | 23 | Yes | 0.26 | 1 | 7 | 0.13 | 6.50E-04 | 2 | 6 | 0.25 | 9.03E-01 |
| 19 | G | T -> G | SNP (transversion) | 1347 | 103 | 13 | Yes | 0.13 | 2 | 6 | 0.25 | 2.88E-03 | 3 | 5 | 0.375 | 3.09E-08 |
| 20 | A | T -> A | SNP (transversion) | 1384 | 101 | 37 | Yes | 0.37 | 1 | 7 | 0.13 | 5.30E-10 | 2 | 6 | 0.25 | 1.09E-02 |
| 21 | G | C -> G | SNP (transversion) | 1424 | 80 | 9 | No | 0.11 | 1 | 7 | 0.13 | 8.66E-01 | 1 | 7 | 0.13 | 8.66E-01 |
| 22 | A | C -> A | SNP (transversion) | 1437 | 84 | 10 | Yes | 0.12 | 1 | 7 | 0.13 | 1.00E+00 | 2 | 6 | 0.25 | 5.12E-03 |
| 23 | TT | AA -> TT | Substitution | 1481 | 62 | 7 | No | 0.11 | 1 | 7 | 0.13 | 1.00E+00 | 1 | 7 | 0.13 | 1.00E+00 |
| 24 | G | A -> G | SNP (transition) | 1527 | 106 | 90 | Yes (duplication) | 0.85 | ||||||||
| 25 | C | T -> C | SNP (transition) | 1540 | 139 | 86 | Yes (duplication) | 0.62 | ||||||||
| 26 | A | T -> A | SNP (transversion) | 1584 | 253 | 235 | Yes (duplication) | 0.93 | ||||||||
| 27 | A | G -> A | SNP (transition) | 1638 | 247 | 39 | Yes (duplication) | 0.16 | ||||||||
| 28 | C | A -> C | SNP (transversion) | 1648 | 209 | 106 | Yes (duplication) | 0.51 | ||||||||
| 29 | A | C -> A | SNP (transversion) | 1739 | 122 | 16 | Yes (duplication) | 0.13 | ||||||||
| 30 | T | C -> T | SNP (transition) | 1751 | 132 | 32 | Yes (duplication) | 0.24 | ||||||||
| 31 | A | G -> A | SNP (transition) | 1753 | 138 | 16 | Yes (duplication) | 0.12 | ||||||||
| 32 | A | C -> A | SNP (transversion) | 1762 | 131 | 21 | No (duplication) | 0.16 | ||||||||
| 33 | T | A -> T | SNP (transversion) | 1776 | 125 | 75 | Yes (duplication) | 0.6 | ||||||||
| 34 | C | G -> C | SNP (transversion) | 1796 | 88 | 31 | No (duplication) | 0.35 | ||||||||
| 35 | G | C -> G | SNP (transversion) | 1808 | 37 | 25 | Yes | 0.68 | 4 | 3 | 0.57 | 0.00E+00 | 4 | 4 | 0.57 | 8.90E-01 |
| 36 | T | C -> T | SNP (transition) | 1808 | 78 | 41 | Yes (duplication) | 0.53 | ||||||||
| 37 | T | C -> T | SNP (transition) | 1814 | 78 | 27 | Yes (duplication) | 0.35 | ||||||||
| 38 | T | C -> T | SNP (transition) | 1827 | 68 | 7 | Yes (duplication) | 0.1 | ||||||||
| 39 | A | T -> A | SNP (transversion) | 1830 | 65 | 8 | Yes (duplication) | 0.12 | ||||||||
| 40 | A | G -> A | SNP (transition) | 1839 | 62 | 23 | Yes (duplication) | 0.37 | ||||||||
| 41 | A | G -> A | SNP (transition) | 1853 | 52 | 6 | Yes (duplication) | 0.12 | ||||||||
| 42 | C | A -> C | SNP (transversion) | 1866 | 47 | 30 | Yes (duplication) | 0.64 | ||||||||
| 43 | A | C -> A | SNP (transversion) | 1910 | 152 | 34 | Yes (duplication) | 0.22 | ||||||||
| 44 | A | G -> A | SNP (transition) | 1917 | 158 | 103 | Yes (duplication) | 0.65 | ||||||||
| 45 | G | T -> G | SNP (transversion) | 1922 | 165 | 110 | Yes (duplication) | 0.67 | ||||||||
| 46 | T | A -> T | SNP (transversion) | 1938 | 170 | 41 | Yes (duplication) | 0.24 | ||||||||
| 47 | A | C -> A | SNP (transversion) | 2039 | 196 | 37 | Yes (duplication) | 0.19 | ||||||||
| 48 | T | C -> T | SNP (transition) | 2043 | 196 | 143 | Yes (duplication) | 0.73 | ||||||||
| 49 | G | T -> G | SNP (transversion) | 2080 | 177 | 88 | Yes (duplication) | 0.5 | ||||||||
| 50 | C | A -> C | SNP (transversion) | 2123 | 126 | 89 | Yes (duplication) | 0.71 | ||||||||
Genome and transcriptome SNPs were used. The global expression (across diverse tissues) was used to determine whether the genomic frequency could explain the transcription frequency (H0). A binomial test was used to verify H0. The marked p-values reflect the acceptance of H0.
Figure 5. Ploidy and dosage in the sugarcane genomic DNA (BACs) and the SuperMASSA estimation. The location of each SNP is shown on one haplotype from Region01 and one haplotype from Region02. “SuperMASSA Best Ploidy” is the best ploidy estimated by SuperMASSA with a posterior probability >0.8. “SuperMASSA Expected Ploidy” is the result obtained when the ploidy of the loci was fixed in SuperMASSA according to the BAC-FISH and BAC sequencing results. “Genomic Ploidy” is the ploidy of the loci according to the BAC-FISH and BAC sequencing results. “*” indicates that the SNP was found only in the transcriptome.
Figure 6. Schematic representation of the sugarcane linkage map. SNPs from the sugarcane variety SP80-3280 were used to create multiple linkage maps that incorporate information from the sugarcane genome (BACs). (A) Linkage groups using the markers in Figure 5. (B) Linkage groups excluding markers that are duplicated according to the BACs. (C) Linkage groups excluding markers in the wrong order according to the BACs. (D) Linkage group formed when attempting to create a single group. (E) Best linkage groups using the BAC information.