Mariane de Mendonça Vilela1, Luiz Eduardo Del Bem1, Marie-Anne Van Sluys2, Nathalia de Setta3, João Paulo Kitajima4, Guilherme Marcelo Queiroga Cruz2, Danilo Augusto Sforça1, Anete Pereira de Souza1, Paulo Cavalcanti Gomes Ferreira5, Clícia Grativol6, Claudio Benicio Cardoso-Silva1, Renato Vicentini1, Michel Vincentz1.
Abstract
Whole genome duplication has played an important role in plant evolution and diversification. Sugarcane is an important crop with a complex hybrid polyploid genome, for which the process of adaptation to polyploidy is still poorly understood. To improve our knowledge of sugarcane genome evolution and of the homo/homeologous gene expression balance, we sequenced and analyzed 27 bacterial artificial chromosomes (BACs) of the sugarcane cultivar R570, containing the putative single-copy genes LFY (seven haplotypes), PHYC (four haplotypes), and TOR (seven haplotypes). Comparative genomic approaches showed that these sugarcane loci have a high degree of conservation of gene content and collinearity (synteny) with the orthologous regions of sorghum and rice, but were invaded by transposable elements (TEs). All the homo/homeologous haplotypes of LFY, PHYC, and TOR are likely to be functional, because they are all under purifying selection (dN/dS ≪ 1). However, they were found to contribute unequally to the overall expression of the corresponding gene. SNPs, indels, and amino acid substitutions allowed us to infer the S. officinarum or S. spontaneum origin of the TOR haplotypes, which in turn led to the estimate that these two sugarcane ancestral species diverged between 2.5 and 3.5 Ma. In addition, analysis of shared TE insertions in TOR haplotypes suggested that two autopolyploidization events may have occurred in the lineage that gave rise to S. officinarum, after its divergence from S. spontaneum.
Keywords: R570; autopolyploidization; genome evolution; homo/homeologues expression; sugarcane
Year: 2017 PMID: 28082603 PMCID: PMC5381655 DOI: 10.1093/gbe/evw293
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. Phylogenetic analysis and schematic synteny of TOR (A), PHYC (B), and LFY (C) genomic regions for sugarcane haplotypes (homo/homeologous), sorghum and rice. TOR BACs have just one gene and a noncollinear pseudogene. Phylogenies were generated by neighbor-joining analysis of LFY, PHYC, and TOR nucleotide coding sequence (cds) alignments. The scale bar represents the relative genetic distance (number of substitutions per site). Numbers close to the branches are bootstrap values. Genes are represented by black arrows and pseudogenes by textured arrows. Collinear genes are linked by gray strips. Transposable elements (TEs) are represented by colored arrows: blue arrows are gypsy-like TEs; green arrows are copia-like TEs; red and orange arrows are non-LTR TEs; pink arrows are DNA transposons; gray and white arrows are undefined insertions. Shades of the same color represent similarity between TEs or undefined insertions. Textured and colored arrows are TEs and undefined insertions that have no similarity with any other in the genomic region. Shared TEs are linked by strips of the TE color and identified from “a” to “l” according to supplementary table S4, Supplementary Material online. Predicted insertion times of shared LTR-retrotransposons are indicated below the respective arrows. Predicted insertion events of most shared TEs are reported in the phylogenetic tree using symbols (black circles, stars, squares, diamonds and triangles). Numbers 1–12 indicate sugarcane annotated genes (table 1); I and II indicate sorghum noncollinear genes; i–vi indicate rice noncollinear genes. LFY is gene number 9, PHYC is number 4 and TOR is number 1. Dashed curved arrows connect duplicated genes. Red lines outline sorghum genes edited into one gene, based on the structure of their orthologous counterparts in other grasses.
Table 1. Annotated Genes in Sugarcane BACs and Their Orthologous Relationships with Sorghum and Rice Genes
| Genomic Region | Gene No. | Functional Annotation | BLASTX E-Value | Orthologous Locus in Sorghum | Orthologous Locus in Rice |
|---|---|---|---|---|---|
| LFY | 1 | | 0.0E+00 | Sb06g027386 | Os04g51100 |
| | 2 | | 0.0E+00 | Sb06g027382 + Sb06g027384 (c) | Os04g51090 |
| | 3 | | 0.0E+00 | Sb06g027380 | Os04g51080 |
| | 4 | | 0.0E+00 | Sb01g013060 | Os04g51040 and Os04g51050 (d) |
| | 5 (a) | | – | – | – |
| | 6 | | 0.0E+00 | Sb06g027350 + Sb06g027360 (c) | Os04g51030 |
| | 7 (b) | | – | – | – |
| | 8 (b) | | – | – | – |
| | 9 | | 5.0E−171 | Sb06g027340 | Os04g51000 |
| | 10 | | 1.0E−108 | Sb06g027330 | Os04g50990 |
| | 11 | | 6.0E−54 | Sb06g027320 | Os04g50970 |
| | 12 | | 0.0E+00 | Sb06g027315 | Os04g50960 |
| PHYC | 1 | | 0.0E+00 | Sb01g007830 | Os03g54100 |
| | 2 | | 0.0E+00 | Sb01g007840 | Os03g54091 |
| | 3 (b) | | – | – | – |
| | 4 | | 0.0E+00 | Sb01g007850 | Os03g54084 |
| | 5 | | 3.0E−27 | Sb01g007870 | Os03g54050 |
| | 6 | | 0.0E+00 | Sb01g007878 | Os09g13630 |
| | 7 | | 0.0E+00 | Sb01g007880 and Sb01g007900 (d) | Os03g54000 |
| | 8 | | 0.0E+00 | Sb01g007890 and Sb01g007910 + Sb01g007920 (c, d) | Os03g53990 |
| | 9 | | 8.0E−69 | Sb01g007930 | Os03g53980 |
| TOR | 1 | | – | – | – |
| | 2 | | 0.00E+00 | Sb09g017790 | Os05g14550 |

(a) The pseudogene does not have an orthologous counterpart in the sorghum and rice genomic regions.
(b) Duplicated gene.
(c) Sorghum genes edited into one gene based on the structure of their orthologous counterparts in rice (Oryza sativa), maize (Zea mays), and Brachypodium (Brachypodium distachyon). These loci were named Sb06g027350 + Sb06g027360, Sb06g027382 + Sb06g027384, and Sb01g007910 + Sb01g007920.
(d) The predicted gene in sugarcane has orthology with two distinct genes in the related species (fig. 1).
Fig. 2. Origin of TOR haplotypes based on shared insertions and sequence polymorphisms consistent with tree topology. “O” indicates evidence for S. officinarum origin, “S” indicates S. spontaneum origin, and “-” indicates undetermined origin (for details see text, supplementary fig. S3 and table S5, Supplementary Material online).
Estimated dS, dN, and dN/dS for LFY, PHYC, and TOR Genes
| Gene | BAC | dS (sugarcane × sorghum) | dN (sugarcane × sorghum) | dN/dS (a) (sugarcane × sorghum) | dS (sugarcane × rice) | dN (sugarcane × rice) | dN/dS (a) (sugarcane × rice) |
|---|---|---|---|---|---|---|---|
| TOR | 007C22 | 0.064 | 0.007 | 0.108 | 0.545 | 0.033 | 0.060 |
| | 030H10 | 0.062 | 0.008 | 0.124 | 0.551 | 0.033 | 0.060 |
| | 102H05 | 0.061 | 0.008 | 0.126 | 0.556 | 0.033 | 0.060 |
| | 187L21 | 0.062 | 0.007 | 0.116 | 0.554 | 0.033 | 0.060 |
| | 043C15 | 0.070 | 0.005 | 0.075 | 0.568 | 0.031 | 0.054 |
| | 156D23 | 0.069 | 0.005 | 0.071 | 0.569 | 0.030 | 0.053 |
| | Average | 0.065 | 0.007 | 0.103 | 0.557 | 0.032 | 0.058 |
| | σ | 0.004 | 0.001 | 0.022 | 0.009 | 0.001 | 0.003 |
| PHYC | 038J02 | 0.066 | 0.014 | 0.206 | 0.414 | 0.056 | 0.136 |
| | 056J11 | 0.070 | 0.013 | 0.183 | 0.418 | 0.055 | 0.132 |
| | 095E16 | 0.077 | 0.014 | 0.187 | 0.414 | 0.056 | 0.136 |
| | 173M11 | 0.075 | 0.014 | 0.190 | 0.414 | 0.056 | 0.136 |
| | Average | 0.072 | 0.014 | 0.192 | 0.415 | 0.056 | 0.135 |
| | σ | 0.004 | 0.000 | 0.009 | 0.002 | 0.000 | 0.002 |
| LFY | 011C13 | 0.087 | 0.008 | 0.097 | 0.290 | 0.072 | 0.249 |
| | 015P19 | 0.096 | 0.012 | 0.123 | 0.302 | 0.073 | 0.240 |
| | 070I10 | 0.093 | 0.009 | 0.101 | 0.302 | 0.075 | 0.249 |
| | 202G24 | 0.089 | 0.012 | 0.132 | 0.296 | 0.073 | 0.245 |
| | 239D06 | 0.095 | 0.013 | 0.135 | 0.308 | 0.073 | 0.236 |
| | 236J21 | 0.085 | 0.013 | 0.158 | 0.285 | 0.076 | 0.267 |
| | 253I01 | 0.076 | 0.010 | 0.127 | 0.287 | 0.068 | 0.235 |
| | Average | 0.089 | 0.011 | 0.125 | 0.296 | 0.073 | 0.246 |
| | σ | 0.006 | 0.002 | 0.019 | 0.008 | 0.002 | 0.010 |

(a) All dN/dS values are statistically significant for purifying selection (dN/dS ≪ 1; Z-test P-value = 0).
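The Average and σ rows can be recomputed from the per-BAC values. The following is a minimal sketch using the sugarcane × sorghum dN/dS column of the first block of the table. That σ is the population standard deviation (division by n rather than n − 1) is an assumption inferred from the tabulated numbers; the source does not state it.

```python
from statistics import mean, pstdev

# dN/dS (sugarcane x sorghum) for the six BACs in the first block of the table
dn_ds_sorghum = [0.108, 0.124, 0.126, 0.116, 0.075, 0.071]

avg = mean(dn_ds_sorghum)
# Assumption: sigma in the table is the population standard deviation (divide by n)
sigma = pstdev(dn_ds_sorghum)

print(f"average = {avg:.3f}, sigma = {sigma:.3f}")
```

Under this assumption the rounded values agree with the Average and σ entries of that block; the sample standard deviation would give a slightly larger σ.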
Correlation between SNP Genomic Dosage and Relative Expression Level of TOR and PHYC Haplotypes
| Gene | SNP position | Main haplotype allele | Alternative haplotype allele | Haplotype dosage, genomic data (main) | Haplotype dosage, genomic data (alternative) | RNA-seq read counts (main) | RNA-seq read counts (alternative) | RNA-seq depth (# mapped reads) | Haplotype relative dosage, gDNA (main) | Haplotype relative dosage, gDNA (alternative) | Haplotype relative expression, mRNA (main) | Haplotype relative expression, mRNA (alternative) | Exact binomial test P-value | 030H10 | 214B20 | 187L21 | 102H05 | 007C22 | 043C15 | 156D23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TOR | 5463 | C | T | 6 | 1 | 34 | 6 | 40 | 0.86 | 0.14 | 0.85 | 0.15 | 0.823 | X | | | | | | |
| | 5887 | G | A | 5 | 2 | 48 | 18 | 66 | 0.71 | 0.29 | 0.73 | 0.27 | 0.892 | X | X | | | | | |
| | 6315 | G | A | 6 | 1 | 30 | 1 | 31 | 0.86 | 0.14 | 0.97 | 0.03 | 0.117 | X | | | | | | |
| | 6604 | A | T | 6 | 1 | 65 | 9 | 74 | 0.86 | 0.14 | 0.88 | 0.12 | 0.740 | X | | | | | | |
| | 6646 | G | T | 5 | 2 | 44 | 15 | 59 | 0.71 | 0.29 | 0.75 | 0.25 | 0.667 | X | X | | | | | |
| | 6924 | C | T | 6 | 1 | 96 | 11 | 107 | 0.86 | 0.14 | 0.90 | 0.10 | 0.271 | X | | | | | | |
| | 7295 | T | C | 4 | 3 | 102 | 67 | 169 | 0.57 | 0.43 | 0.60 | 0.40 | 0.437 | X | X | X | | | | |
| | 7356 | C | T | 6 | 1 | 216 | 0 | 216 | 0.86 | 0.14 | 1.00 | 0.00 | 7.84E-15 | X | | | | | | |
| | 7380 | T | C | 4 | 3 | 117 | 58 | 175 | 0.57 | 0.43 | 0.67 | 0.33 | 0.009 | X | X | X | | | | |
Note.—Gray filled cells indicate statistically significant correlation between the genomic dosage of SNP loci and the SNP RNA-seq frequency (P-value < 0.05, Exact Binomial Test). “X” indicates the haplotypes containing the alternative SNP.
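The comparison between genomic dosage and RNA-seq read counts can be sketched as an exact binomial test in which the expected fraction of main-haplotype reads equals the main haplotype's share of the genomic copies (for example 6/7). The code below is an illustrative sketch, not the authors' pipeline: the function names are hypothetical, and the two-sided rule (summing the probabilities of all outcomes no more likely than the observed one) is an assumption about how the test was run.

```python
from math import exp, lgamma, log

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials, computed in log space."""
    log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coef + k * log(p) + (n - k) * log(1 - p))

def exact_binomial_p(k, n, p):
    """Two-sided exact P-value: total probability of every outcome that is
    no more likely than the observed count k (assumed convention)."""
    threshold = binom_pmf(k, n, p) * (1 + 1e-7)  # small tolerance for float ties
    total = sum(q for q in (binom_pmf(i, n, p) for i in range(n + 1)) if q <= threshold)
    return min(1.0, total)

def dosage_p_value(main_dose, alt_dose, main_reads, alt_reads):
    """Test main-haplotype read counts against the genomic dosage ratio."""
    expected_main = main_dose / (main_dose + alt_dose)
    return exact_binomial_p(main_reads, main_reads + alt_reads, expected_main)

# (SNP position, main dosage, alternative dosage, main reads, alternative reads),
# copied from the table rows above
ROWS = [
    (5463, 6, 1, 34, 6),
    (5887, 5, 2, 48, 18),
    (6315, 6, 1, 30, 1),
    (6604, 6, 1, 65, 9),
    (6646, 5, 2, 44, 15),
    (6924, 6, 1, 96, 11),
    (7295, 4, 3, 102, 67),
    (7356, 6, 1, 216, 0),
    (7380, 4, 3, 117, 58),
]

p_values = {snp: dosage_p_value(*counts) for snp, *counts in ROWS}
significant = [snp for snp, p in p_values.items() if p < 0.05]
print(significant)
```

With these inputs, only the SNPs whose tabulated P-value is below 0.05 (positions 7356 and 7380) should be flagged, meaning that the expression of those haplotypes departs from what their genomic dosage alone would predict.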