Yinan Zhang, Meijun Guo, Jie Shen, Xie Song, Shuqi Dong, Yinyuan Wen, Xiangyang Yuan, Pingyi Guo.
Abstract
Resistance (R) genes play an important role in the defense of plants against invading pathogens. In Setaria italica and closely related grass species, R genes have been identified through genetic mapping and genome-wide homology/domain searching. However, to date there has been no systematic analysis of the evolutionary features of R genes across all sequenced grass genomes. Here, we identified and comprehensively compared R genes in all 12 assembled grass genomes and an outgroup species (Arabidopsis thaliana) through synteny and selection analyses of multiple genomes. We found that the two groups of R genes containing nucleotide-binding site (NBS) domains, R tandem duplications (TDs) and R singletons, adopted different strategies and showed different features in their evolution. Based on Ka/Ks analysis between syntenic R locus pairs of TDs or singletons, we conclude that R singletons are under stronger purifying selection than R TDs and are therefore more conserved among grass species, while R genes located in TD arrays have evolved much faster through diversifying selection. Furthermore, using the variome datasets of S. italica populations, we scanned for selection signals on genes and observed that a subset of R singleton genes has been under purifying selection in populations of S. italica, which is consistent with the pattern observed in syntenic R singletons among different grass species. Additionally, we checked the synteny relationships of reported R genes in grass species and found that the functionally mapped R genes for novel resistance traits tend to occur in TDs and are heavily diverged from their syntenic orthologs in other grass species, such as the black streak R gene Rxo1 in Z. mays and the blast R gene Pi37 in O. sativa. 
These findings indicate that R genes in TDs use tandem duplication to evolve faster and accumulate more mutations, which facilitates functional innovation against variable threats from a fluctuating environment, whereas R singletons allow R genes to maintain sequence stability and conserved function.
Year: 2019 PMID: 31341223 PMCID: PMC6656885 DOI: 10.1038/s41598-019-47121-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Genomic information of 12 grass species and one outgroup species studied in this work.
| Index | Species | #Chr* | #Genes | Genome (Mb) | WGP** |
|---|---|---|---|---|---|
| 1 | | 9 | 34584 | 405.74 | 1 |
| 2 | | 9 | 35214 | 394.9 | 1 |
| 3 | | 18 | 91838 | 1689.57 | 2 |
| 4 | | 9 | 37232 | 554.13 | 1 |
| 5 | | 10 | 34211 | 732.15 | 1 |
| 6 | | 10 | 88760 | 2067.86 | 2 |
| 7 | | / | 28437 | 243.17 | 1 |
| 8 | | 21 | 107891 | 14547.26 | 3 |
| 9 | | 10 | 29898 | 234.14 | 1 |
| 10 | | 5 | 34310 | 271.16 | 1 |
| 11 | | 12 | 42006 | 374.47 | 1 |
| 12 | | 11 | 36528 | 472.96 | - |
| 13 | | 5 | 27751 | 119.67 | - |
*Chr: Chromosomes; **WGP: whole genome polyploidization.
Figure 1. Phylogenetic tree built on Ks loci of syntenic orthologous genes among the 12 grass species. Red stars denote whole-genome duplication events; double red stars denote whole-genome triplication events.
Figure 2. The phylogenetic tree of 535 R genes in S. italica. The tree was built with the maximum-likelihood algorithm in MEGA. Solid red circles denote S. italica-specific R genes, i.e., S. italica R genes that have no syntenic orthologs in any of the other 11 grass species. Branches colored in orange are R genes from tandem duplication arrays; the others are R gene singletons.
Statistics summary of R genes in the 13 studied species.
| Index | Species | #R candidate | #NBS R | NBS R/Gene | #R TD* | #TD R** | TD R/NBS | #R singleton | #R locus |
|---|---|---|---|---|---|---|---|---|---|
| 1 | | 1,264 | 535 | 1.55% | 106 | 303 | 0.57 | 232 | 338 |
| 2 | | 1,152 | 465 | 1.32% | 92 | 241 | 0.52 | 224 | 316 |
| 3 | | 2,949 | 1,267 | 1.38% | 212 | 524 | 0.41 | 743 | 955 |
| 4 | | 1,101 | 420 | 1.13% | 86 | 234 | 0.56 | 186 | 272 |
| 5 | | 1,036 | 425 | 1.24% | 82 | 233 | 0.55 | 192 | 274 |
| 6 | | 1,101 | 306 | 0.34% | 36 | 79 | 0.26 | 227 | 263 |
| 7 | | 410 | 97 | 0.34% | 4 | 8 | 0.08 | 89 | 93 |
| 8 | | 5,712 | 2,747 | 2.55% | 547 | 1,588 | 0.58 | 1,159 | 1,706 |
| 9 | | 945 | 377 | 1.26% | 78 | 203 | 0.54 | 174 | 252 |
| 10 | | 1,091 | 464 | 1.35% | 105 | 267 | 0.58 | 197 | 302 |
| 11 | | 1,390 | 587 | 1.40% | 116 | 339 | 0.58 | 248 | 364 |
| 12 | | 863 | 190 | 0.52% | 24 | 67 | 0.35 | 123 | 147 |
| 13 | | 896 | 229 | 0.83% | 49 | 124 | 0.54 | 105 | 154 |
*R TD: a tandem duplication array composed of R genes;
**TD R: an R gene that comes from a tandem array; +: tetrapolyploidization; ++: hexapolyploidization.
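The ratio and locus columns of the statistics summary table appear to be derived from the count columns. The following is a minimal sketch, assuming that NBS R/Gene is #NBS R divided by the #Genes value for the same index in the genomic information table, that TD R/NBS is #TD R divided by #NBS R, and that #R locus is #R TD plus #R singleton. The function name `derived_columns` is hypothetical; the input numbers are transcribed from the tables for indices 1, 7, and 8.

```python
# Sketch: recompute the derived columns of the R gene summary table.
# Assumption: the formulas below are inferred from the table values,
# not stated explicitly in the source.

def derived_columns(n_genes, n_nbs_r, n_r_td, n_td_r, n_r_singleton):
    """Return the three derived columns for one species row."""
    return {
        "nbs_r_per_gene_pct": round(100 * n_nbs_r / n_genes, 2),
        "td_r_per_nbs": round(n_td_r / n_nbs_r, 2),
        "r_locus": n_r_td + n_r_singleton,
    }

# (n_genes, n_nbs_r, n_r_td, n_td_r, n_r_singleton), keyed by table index.
ROWS = {
    1: (34584, 535, 106, 303, 232),
    7: (28437, 97, 4, 8, 89),
    8: (107891, 2747, 547, 1588, 1159),
}

for index, counts in ROWS.items():
    print(index, derived_columns(*counts))
```

Under these assumptions, the recomputed values for the three rows agree with the published columns, and in each row #TD R plus #R singleton equals #NBS R.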
The subgroups of 535 R genes in S. italica.
| Index | Domains | #Copy | Sum |
|---|---|---|---|
| 1 | CC-NBS-LRR | 159 | 230 |
| 2 | CC-NBS | 41 | |
| 3 | CC-NBS-NBS-LRR | 14 | |
| 4 | CC-NBS-NBS | 12 | |
| 5 | CC-NBS-LRR-LRR | 3 | |
| 6 | CC-LRR-NBS-LRR | 1 | |
| 7 | NBS-LRR | 156 | 302 |
| 8 | NBS | 88 | |
| 9 | NBS-NBS | 45 | |
| 10 | NBS-NBS-LRR | 10 | |
| 11 | NBS-LRR-LRR | 2 | |
| 12 | LRR-NBS-NBS-LRR | 1 | |
| 13 | TIR-NBS | 2 | 3 |
| 14 | TIR-NBS-TIR | 1 | |
| Total | | | 535 |
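The Sum column groups the 14 domain architectures into three subgroups. A minimal sketch of that grouping follows, assuming the subgroup is determined by the first domain in the architecture string (CC or TIR, otherwise plain NBS); this rule is inferred from the table layout rather than stated in the source, and the helper name `subgroup` is hypothetical.

```python
from collections import Counter

# Copy numbers transcribed from the table of S. italica R gene subgroups.
COPIES = {
    "CC-NBS-LRR": 159, "CC-NBS": 41, "CC-NBS-NBS-LRR": 14,
    "CC-NBS-NBS": 12, "CC-NBS-LRR-LRR": 3, "CC-LRR-NBS-LRR": 1,
    "NBS-LRR": 156, "NBS": 88, "NBS-NBS": 45, "NBS-NBS-LRR": 10,
    "NBS-LRR-LRR": 2, "LRR-NBS-NBS-LRR": 1,
    "TIR-NBS": 2, "TIR-NBS-TIR": 1,
}

def subgroup(architecture):
    """Assumed rule: group by the leading domain (CC, TIR, or plain NBS)."""
    first = architecture.split("-")[0]
    return first if first in ("CC", "TIR") else "NBS"

sums = Counter()
for architecture, copies in COPIES.items():
    sums[subgroup(architecture)] += copies

print(dict(sums), sum(sums.values()))
```

With this rule the subgroup sums match the Sum column (230, 302, and 3), and they add up to the 535 NBS R genes.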
Figure 3. The frequency distribution of Ka/Ks values between syntenic R gene pairs from R singletons (colored in red) or R TD arrays (colored in light orange).
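As a reading aid for the Ka/Ks comparison, the sketch below applies the conventional interpretation of the ratio: values well below 1 suggest purifying selection, values near 1 suggest neutral evolution, and values above 1 suggest diversifying selection. The tolerance `tol`, the function names, and the two lists of ratios are illustrative assumptions, not values from the paper, and this is not the authors' pipeline.

```python
from statistics import median

def classify_ka_ks(ratio, tol=0.05):
    """Label a Ka/Ks ratio with the conventional selection regime."""
    if ratio < 1 - tol:
        return "purifying"
    if ratio > 1 + tol:
        return "diversifying"
    return "neutral"

def summarize(ratios):
    """Median ratio and fraction of gene pairs under purifying selection."""
    labels = [classify_ka_ks(r) for r in ratios]
    return {
        "median": median(ratios),
        "frac_purifying": labels.count("purifying") / len(labels),
    }

# Hypothetical Ka/Ks values for syntenic gene pairs (illustration only).
singleton_pairs = [0.12, 0.20, 0.31, 0.25]
td_pairs = [0.45, 0.80, 1.30, 0.62]

print("singletons:", summarize(singleton_pairs))
print("TD arrays: ", summarize(td_pairs))
```

A lower median for singleton pairs than for TD pairs would correspond to the pattern described in the abstract, where singletons are under stronger purifying selection.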
Functional mutation comparison between total genes and R genes.
| | #Genes | #Functional | Functional/gene | P values |
|---|---|---|---|---|
| Total gene | 34,584 | 79,296 | 2.29 | <2.20 × 10^-16 |
| NBS R gene | 535 | 9,006 | 16.83 | |
| R singleton | 232 | 2,734 | 11.78 | 3.68 × 10^-10 |
| TD R gene | 303 | 6,272 | 20.70 | |
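The third numeric column of the functional mutation table is consistent with the number of functional mutations divided by the number of genes. The sketch below recomputes it under that assumption. The labels `r_singleton` and `td_r` for the rows with 232 and 303 genes are also an assumption, based on the matching counts for S. italica in the statistics summary table. The statistical test behind the P values is not given in this excerpt, so it is not reproduced here.

```python
# (n_genes, n_functional_mutations) transcribed from the table.
TABLE = {
    "total": (34584, 79296),
    "nbs_r": (535, 9006),
    "r_singleton": (232, 2734),  # assumed label for the 232-gene row
    "td_r": (303, 6272),         # assumed label for the 303-gene row
}

def per_gene_rate(n_genes, n_functional):
    """Functional mutations per gene, rounded as in the table."""
    return round(n_functional / n_genes, 2)

rates = {name: per_gene_rate(*counts) for name, counts in TABLE.items()}
print(rates)

# Fold difference between NBS R genes and the genome-wide rate.
print(round(rates["nbs_r"] / rates["total"], 1))
```

The singleton and TD rows also add up to the NBS R row in both gene count and mutation count, which supports the assumed labels.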
Figure 4. The distribution of π values of total genes (colored in black) and R genes (light blue) in the population of S. italica. R genes were further separated into R singletons (red) and R tandem arrays (light orange).
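Nucleotide diversity (π) is commonly defined as the average number of pairwise differences per site among sequences sampled from a population. The sketch below implements that textbook definition for equal-length aligned sequences. It is not the authors' variome pipeline, which is not described in this excerpt, and the toy sequences are hypothetical.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average pairwise differences per site for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and equally long")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions over every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)

# Hypothetical alignment of one gene in three accessions.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(nucleotide_diversity(toy_alignment))
```

Genes under purifying selection are expected to show lower π than the genome-wide background, which is the kind of contrast the figure draws between R singletons and R tandem arrays.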
The syntenic orthologous genes of O. sativa blast disease resistance gene Pi37 in the other 11 grass species.
| Species | Pi37 synteny | Identity% | Coverage% | E-value | Tandem array |
|---|---|---|---|---|---|
| | | / | / | / | |
| | | 68.34 | 100 | 0 | / |
| | | 68.03 | 100 | 0 | / |
| | | 66.26 | 56.90 | 0 | |
| | | 66.29 | 47.60 | 0 | |
| | | 64.56 | 55.12 | 0 | |
| | | 45.11 | 38.84 | 4.42E-84 | / |
| | / | / | / | / | / |
| | | 62.11 | 100 | 0 | / |
| | | 61.95 | 100 | 0 | |
| | | 61.53 | 92.48 | 0 | / |
| | | 55.43 | 100 | 0 | / |
| | | 27.54 | 48.14 | 1.68E-40 | / |