The family of resistance gene analogues (RGAs) with a nucleotide-binding site (NBS) domain accounts for the largest number of disease resistance genes and is one of the largest gene families in plants. We have identified 868 RGAs in the genome of the apple (Malus × domestica Borkh.) cultivar 'Golden Delicious'. This represents 1.51% of the total number of predicted genes for this cultivar. Several evolutionary features are pronounced in M. domestica, including a high fraction (80%) of RGAs occurring in clusters. This suggests frequent tandem duplication and ectopic translocation events. Of the RGAs anchored to chromosomes, 56% are located preferentially on six chromosomes (Chr 2, 7, 8, 10, 11, and 15), and 14% are located on Chr 2. TIR-NBS and non-TIR-NBS classes of RGAs are largely confined to different chromosomes, and the RGAs on Chr 11 belong almost exclusively (99%) to the non-TIR-NBS class. A phylogenetic reconstruction was conducted to study the evolution of RGAs in the Rosaceae family. More than 1400 RGAs were identified in six species based on their NBS domain, and a neighbor-joining analysis was used to reconstruct the phylogenetic relationships among the protein sequences. Specific phylogenetic clades were found for RGAs of Malus, Fragaria, and Rosa, indicating genus-specific evolution of resistance genes. However, strikingly similar RGAs were shared in Malus, Pyrus, and Prunus, indicating high conservation of specific RGAs and suggesting a monophyletic origin of these three genera.
When a genome sequence is available, the analysis of large gene families can contribute to the understanding of major events responsible for molecular evolution. This is the case for resistance gene analogues (RGAs) with a nucleotide-binding site (NBS) domain [1]–[5]. The NBS domain is part of the larger NB-ARC domain that hydrolyses ATP and GTP and functions as a molecular switch for signal transduction after pathogen recognition [6]. Many resistance proteins encoded by RGAs contain a leucine-rich repeat (LRR) domain [7], [8], which is involved in protein–protein interactions and in pathogen recognition [9]. Proteins encoded by RGAs can be further classified according to the presence of the toll/interleukin-1 receptor (TIR) domain or other N-terminal features, such as coiled-coil (CC) and BED finger (Bed) [3], [10], [11]. The N-terminal features are involved in downstream specificity and signaling regulation [12]. RGAs evolved for pathogen recognition and frequently match specific pathogen avirulence factors to trigger signal transduction cascades and defense responses [9].
The genome sequencing of model plants has enabled the study of RGA families in monocots and dicots, including Arabidopsis thaliana [11], [13], Brassica rapa [14], Carica papaya [15], [16], Cucumis sativus [17], Glycine max [18], [19], Zea mays [20], [21], Medicago truncatula [22], Oryza sativa [23]–[25], Populus trichocarpa [26], Sorghum bicolor [27], Vitis vinifera [2], [5], [28], [29], Brachypodium distachyon [30], [31], Solanum tuberosum [32], and Solanum lycopersicum [33]. According to these studies, approximately 0.2–1.3% of the genes predicted in plant genomes correspond to RGAs, which occur at a density of 0.3–1.6 per megabase (Mb). The genome of apple (Malus × domestica Borkh.) also contains a large number of RGAs [34].
Apple is characterized by a recent whole genome duplication (WGD) [34]. The role and relevance of such radical genomic changes in plant evolution have been widely demonstrated, but the number and timing of WGDs in the different plant species are only partially understood [35]. Polyploidy is common in angiosperms [22], [37], and most if not all extant species are thought to be ancient polyploids [38]. However, ancestral genomes are in most cases dispersed on multiply rearranged chromosomes and have also suffered wholesale gene losses [5], [39]. Given that synonymous substitutions are immune to selection pressure [40], the per-site synonymous substitution rate (Ks) is widely used to infer the time of WGD and to describe the relationships among chromosomes [2], [34].
In this study, the cluster organization of RGAs and their distribution across chromosomes were analyzed in relation to the recent duplication of the apple genome. In addition, the phylogeny of RGAs from domesticated and wild Malus species, together with RGAs from other Rosaceae, P. trichocarpa, and V. vinifera, was considered to clarify the evolutionary history of apple and its related species.
Results
Classes of RGAs in Malus × domestica
Based on the presence of the NBS domain, 868 RGAs were identified in the genome of the M. domestica cultivar ‘Golden Delicious’, and all of them showed a significant (more than 90%) protein similarity with RGAs of A. thaliana, P. trichocarpa, and V. vinifera. In addition, 124 putative RGA alleles were found, and they were not further analyzed. By domain analysis, RGAs were assigned to TIR-NBS-LRR (TNL) and CC-NBS-LRR (CNL) classes. In particular, 505 RGAs were classified as NBS-LRR (NL), including CNL subclass, and 231 RGAs were classified as TIR-NBS (TN), including TIR-NBS-LRR (TNL), NBS-LRR-TIR (NLT), TIR-CC-NBS-LRR (TCNL), TIR-CC-NBS (TCN), and TIR-NBS (TN) subclasses (Table 1). In addition, 132 RGAs were characterized only by the presence of the NBS (N) or CC-NBS (CN) domains.
Table 1
Classification and organization of resistance gene analogues (RGAs) with a nucleotide-binding site (NBS) domain in different plant genomes.

| Characteristic | Malus × domestica | Arabidopsis thaliana | Populus trichocarpa | Vitis vinifera | Oryza sativa | Cucumis sativus | Carica papaya | Sorghum bicolor | Brassica rapa | Brachypodium distachyon | Glycine max | Zea mays |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of total predicted genes | 57,524 | 27,228 | 45,654 | 33,514 | 41,911 (28,236 [30]) | 26,682 | 28,591 | 27,640 | nd | 25,532 | 46,430 | 32,540 |
| Genome size (Mb) | 750 | 125 | 485 | 487 | 389 | 243 | 372 | 730 | 529 | 272 | 1,115 | 2,500 |
| Number of RGAs | 868 | 178 | 402 | 391 | 535 | 61 | 54 | 211 (245 [30]) | 92 | 178 (238 [30]) | 429 [30] | 129 [80] |
| NBS-LRR class (a) | 505 (58%) | 57 (32%) | 236 (59%) | 194 (57%) | 480 (89%) | nd | 31 (57%) | 184 (74%) [30] | 17 (18%) | 212 (89%) [30] | 236 (55%) [30] | 95 (74%) [30] |
| TIR-NBS class (b) | 231 (27%) | 115 (64%) | 94 (23%) | 42 (13%) | 3 (1%) | nd | 7 (13%) | 2 (1%) [31] | 42 (46%) | nd | 154 (36%) [30] | nd |
| Other RGAs (c) | 132 (15%) | 6 (4%) | 72 (18%) | 103 (30%) | 52 (10%) | nd | 16 (30%) | 61 (25%) [30] | 33 (36%) | 27 (11%) [30] | 39 (9%) [30] | 34 (26%) [30] |
| RGAs/total genes (%) | 1.51 | 0.65 | 0.88 | 1.01 | 1.27 | 0.23 | 0.19 | 0.76 (0.88 [30]) | nd | 0.69 (0.9 [30]) | 0.92 [30] | 0.39 [30] |
| RGAs per Mb | 1.16 | 1.42 | 0.82 | 0.7 | 1.5 | 0.25 | 0.15 | 0.28 (0.33 [30]) | 0.92 | 0.65 (0.87 [30]) | 0.38 [30] | 0.056 [30] |
| Average number of exons in RGAs | 4.51 | 4.19 | 2.35 | 3.96 | 3.72 | nd | nd | nd | 4.2 | nd | nd | nd |
| Number of single RGAs | 156 | 46 | 135 | 55 | 104 | 20 | 12 | nd | 18 | nd | nd | nd |
| Number of clusters | 152 | 39 | 75 | 52 | 157 | 11 | 13 | nd | 24 | nd | nd | nd |
| Maximum number of RGAs in clusters | 21 | 11 | 19 | 26 | 11 | 9 | 7 | nd | 5 | nd | | |
| Average number of RGAs in cluster | 4.11 | 3.21 | 3.75 | 5.78 | 3.48 | 3.72 | 2.92 | nd | 2.54 | nd | nd | nd |
| Source | this paper and [34] | [11] | [26] | [2], [29] | [24], [25] | [17] | [16] | [27] | [14] | [31] | [19] | [21] |

(a) NBS-LRR class includes: NBS-LRR (NL) and CC-NBS-LRR (CNL). Percentage (%) of this class relative to the total number of RGAs is reported in brackets.
(b) TIR-NBS class includes: TIR-NBS-LRR (TNL), NBS-LRR-TIR (NLT), TIR-CC-NBS-LRR (TCNL), TIR-CC-NBS (TCN), and TIR-NBS (TN). Percentage (%) of this class relative to the total number of RGAs is reported in brackets.
(c) Class of other RGAs includes: NBS (N) and CC-NBS (CN). Percentage (%) of this class relative to the total number of RGAs is reported in brackets.
nd: not declared by the authors.
The 868 RGAs accounted for 1.51% of M. domestica predicted genes, a percentage slightly higher than that in the other plant genomes considered (Table 1). The density of RGAs per Mb was similar for M. domestica and other genomes, with the exception of Z. mays, C. papaya, C. sativus, and S. bicolor.
The mean number of exons detected in apple RGAs was 4.51, and the mean number of exons in the CNL class (3.46) was lower than in the TNL class (6.41; P<0.001). Overall, the number of exons in RGAs of M. domestica was consistent with the number in A. thaliana and B. rapa but higher than the number in V. vinifera, P. trichocarpa, and O. sativa (Table 1). Moreover, 23% of CNL RGAs are encoded by a single exon, while all TNL RGAs have at least three exons.
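The summary figures reported for apple can be recomputed from the counts in Table 1. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative and are not part of the original analysis.

```python
# Values taken from the Malus x domestica column of Table 1.
TOTAL_GENES = 57_524      # predicted genes
GENOME_SIZE_MB = 750      # genome size in megabases
CLASS_COUNTS = {"NBS-LRR": 505, "TIR-NBS": 231, "Other": 132}

def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

total_rgas = sum(CLASS_COUNTS.values())
rga_gene_share = percent(total_rgas, TOTAL_GENES)   # RGAs / total genes (%)
rga_density = total_rgas / GENOME_SIZE_MB           # RGAs per Mb
class_shares = {name: percent(n, total_rgas) for name, n in CLASS_COUNTS.items()}

print(f"Total RGAs: {total_rgas}")
print(f"RGAs / predicted genes: {rga_gene_share:.2f}%")
print(f"RGAs per Mb: {rga_density:.2f}")
for name, share in class_shares.items():
    print(f"{name}: {share:.0f}% of RGAs")
```

The same three ratios can be applied to any other column of Table 1 for which the gene number, genome size, and class counts are declared.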
Genome Organization and Phylogeny of RGAs in Malus × domestica
Contigs anchored to the genome were used to assess the distribution of RGAs in the apple genome [34]. Of the RGAs, 778 (90%) were located across the 17 apple chromosomes (Figure 1). Among the anchored RGAs, 435 (56%) were assigned to six chromosomes: Chr 2, 7, 8, 10, 11, and 15 (Figure 1 and Table 2). Conversely, Chr 4, 6, 13, 14, and 16 had a low content of RGAs (27, 9, 17, 22, and 14 RGAs, respectively). RGAs were mainly (80%) grouped in clusters: 152 clusters included 622 of the anchored RGAs (Figure 1, Table 2 and Table S1). On average, four RGAs were present in a cluster, and the largest cluster, located on Chr 2, contained 21 RGAs. Several clusters of RGAs can be associated with QTLs affecting disease resistance of Malus genotypes (Figure 1).
Figure 1
Chromosomal organization of RGAs in Malus × domestica.
A: Phylogenetic analysis of the NBS domain was carried out by the neighbor-joining method [65] on RGA protein sequences from M. domestica cultivar ‘Golden Delicious’. Major phylogenetic clades (from CN1 to CN5 and from TN1 to TN6) correspond to the classification based on protein domains. TN1 (light blue): TIR-NBS-LRR; TN2 (light purple): TIR-NBS-LRR and TIR-NBS; TN3 (black): TIR-NBS-LRR; TN4 (blue): TIR-NBS-LRR, CC-TIR-NBS, and TIR-NBS; TN5 (orange): TIR-NBS-LRR and TIR-NBS; TN6 (dark purple): TIR-NBS-LRR; CN1 (pink): CC-NBS-LRR; CN2 (red): CC-NBS-LRR and NBS-LRR; CN3 (light green): CC-NBS-LRR, NBS-LRR, NBS; CN4 (green): CC-NBS-LRR, NBS-LRR, NBS; CN5 (dark green): CC-NBS-LRR, NBS-LRR, NBS.
B: RGAs assigned to chromosomes (Chr) are represented by dots with colors corresponding to major phylogenetic clades. The size of each chromosome is given in megabases (Mb, on the left side), whereas the markers of the genetic map are shown in black (on the right side). Resistance-related genes different from RGAs are shown in red. Known quantitative trait loci (QTL) for resistance to apple scab (brown), powdery mildew (green), aphids (light blue), fire blight (red) and rust mite (blue) are shown by bars on the left side of chromosomes [67]–[73], together with the major resistance genes to apple scab (Vd3 and Rvi genes) [74]–[76], powdery mildew (Pl1) [77], and aphids (Sd-1, Sd-2, Er1, Er2) [78], [79].
Table 2
Organization and distribution of resistance gene analogues (RGAs) with a nucleotide-binding site (NBS) domain in the apple (Malus × domestica) chromosomes.
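
| Chromosome | Number of RGAs | Number of single RGAs | Number of clusters | Average number of RGAs/cluster |
|---|---|---|---|---|
| 1 | 43 | 10 | 7 | 4.7 |
| 2 | 109 | 14 | 15 | 6.3 |
| 3 | 47 | 12 | 11 | 3.2 |
| 4 | 27 | 12 | 6 | 2.5 |
| 5 | 48 | 11 | 11 | 3.4 |
| 6 | 9 | 2 | 2 | 3.5 |
| 7 | 57 | 4 | 11 | 4.8 |
| 8 | 76 | 11 | 14 | 4.6 |
| 9 | 40 | 7 | 10 | 3.3 |
| 10 | 56 | 14 | 14 | 3.0 |
| 11 | 79 | 7 | 10 | 7.2 |
| 12 | 37 | 11 | 6 | 4.3 |
| 13 | 17 | 9 | 4 | 2.0 |
| 14 | 22 | 6 | 4 | 4.0 |
| 15 | 58 | 14 | 14 | 3.1 |
| 16 | 14 | 3 | 4 | 2.8 |
| 17 | 39 | 9 | 8 | 3.8 |
| Not anchored RGAs | 90 | – | – | – |
| Total | 868 | 156 | 152 | 4.1 |

The last three columns describe the genome organization of RGAs.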
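The chromosome-level shares quoted in the text follow directly from Table 2. As a hedged illustration (the data structure and variable names below are ours, not the authors'), the anchored total, the share held by the six RGA-rich chromosomes, and the clustered fraction can be recomputed as follows.

```python
# Per-chromosome values from Table 2: chromosome -> (number of RGAs, single RGAs).
TABLE2 = {
    1: (43, 10), 2: (109, 14), 3: (47, 12), 4: (27, 12), 5: (48, 11),
    6: (9, 2), 7: (57, 4), 8: (76, 11), 9: (40, 7), 10: (56, 14),
    11: (79, 7), 12: (37, 11), 13: (17, 9), 14: (22, 6), 15: (58, 14),
    16: (14, 3), 17: (39, 9),
}
RGA_RICH_CHROMOSOMES = (2, 7, 8, 10, 11, 15)

anchored = sum(total for total, _ in TABLE2.values())
singles = sum(single for _, single in TABLE2.values())
clustered = anchored - singles                      # RGAs that sit in clusters
rich_total = sum(TABLE2[c][0] for c in RGA_RICH_CHROMOSOMES)

rich_share = 100.0 * rich_total / anchored          # share on the six chromosomes
chr2_share = 100.0 * TABLE2[2][0] / anchored        # share on Chr 2 alone
clustered_share = 100.0 * clustered / anchored      # share of RGAs in clusters

print(f"Anchored RGAs: {anchored}")
print(f"On Chr 2, 7, 8, 10, 11, 15: {rich_total} ({rich_share:.0f}%)")
print(f"On Chr 2: {TABLE2[2][0]} ({chr2_share:.0f}%)")
print(f"In clusters: {clustered} ({clustered_share:.0f}%)")
```

All percentages here are relative to the RGAs anchored to chromosomes, not to the full set of 868 RGAs.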
As previously shown in Arabidopsis [6], [11], RGAs of TIR-NBS and non-TIR-NBS classes had different topologies in the phylogenetic analysis (Figure 1A). In particular, six major TIR-NBS clades (numbered from TN1 to TN6) and five major non-TIR-NBS clades (numbered from CN1 to CN5) were identified in apple. RGAs of the TIR-NBS class were mainly located on Chr 2, 5, 9, 12, 15, 16, and 17, with Chr 16 hosting the TIR-NBS class almost exclusively (Figure S1A and Table S1). Chr 3, 4, 8, 11, 13, and 14 were mainly characterized by the non-TIR-NBS class, and Chr 11 had almost exclusively RGAs of the non-TIR-NBS class. Considering TIR-NBS and non-TIR-NBS phylogenetic clades, the major clade TN6 represented more than one-third of the RGAs on Chr 1 and 6, while the major clade CN4 included more than half of the RGAs on Chr 11 and 14 (Figure S1A). Moreover, the major clade TN4 was located preferentially (63%) on Chr 2 (Figure S1B).
Phylogeny of RGAs in Domesticated and Wild Malus Species
Twenty-four wild Malus species (Table S2) were considered, and PCR fragments were amplified from germplasm. After sequence comparison, unique fragments were translated into amino acid sequences (Table S1), and 115 of them matched NBS sequences of known resistance proteins with an E-value lower than 1E−10. Phylogenetic analysis indicated that RGAs of wild Malus species grouped mainly in clades that included sequences of the domesticated apple (Figure 2). A significant fraction of phylogenetic clades contained only a few RGAs, probably due to the short sequence of the NBS domain used for this analysis. Some clades consisted mainly of sequences from wild species and contained only a few RGAs of the domesticated apple.
Figure 2
Phylogenesis of RGAs from Malus × domestica and from wild Malus species.
Phylogenetic analysis of the NBS domain was carried out by the neighbor-joining method [65] using RGA sequences of M. domestica cultivar ‘Golden Delicious’ (black) and wild Malus species (red). Proteins present in contiguous positions on the tree and belonging to the same species are merged (collapsed branches are indicated by the + sign). The phylogenetic tree reveals 18 clades specific to M. domestica, six clades specific to wild Malus species, and 49 clades that include RGA sequences of both domesticated and wild apple species.
Phylogeny of RGAs among Rosaceae Species
A total of 693 Rosaceae RGA sequences were downloaded from NCBI (75 from Rubus, 293 from Prunus, 16 from Fragaria, 125 from Rosa, 34 from Pyrus, and 150 public sequences from Malus species) and compared to the 868 RGAs of M. domestica and the 210 sequences obtained from wild Malus species (Table S1). In the phylogenetic tree of Rosaceae species (Figure 3), 49 clades were specific to the genus Malus and included sequences from two or more Malus species. Most of the remaining clades were represented by RGAs from two or more Rosaceae genera. In particular, three clades comprised RGAs of Malus, Pyrus, and Prunus, indicating a monophyletic origin of the three genera and strong conservation of some RGA sequences in these plants. Few clades were represented by non-apple RGAs, and clades specific to Fragaria or Rosa were also present.
Figure 3
Phylogenesis of RGAs from Malus species (wild and domesticated apple), Pyrus communis, Prunus species, Fragaria ananassa, Rubus idaeus, and Rosa species.
Phylogenetic analysis of the NBS domain was carried out by the neighbor-joining method [65] using RGA sequences of domesticated and wild Malus species (green), Pyrus spp. (yellow), Prunus spp. (purple), Fragaria spp. (red), Rosa spp. (orange), and Rubus spp. (blue). Proteins present in contiguous positions of the tree are merged (collapsed branches are indicated by the + sign). The phylogenetic tree indicates 49, three, and one clades specific to Malus spp., Fragaria spp., and Rosa spp., respectively. Clades with RGAs of different genera: three clades of Malus spp. and Prunus spp.; seven clades of Malus spp. and Pyrus spp.; two clades of Malus spp. and Rubus spp.; four clades of Malus spp. and Rosa spp.; two clades of Fragaria spp. and Rosa spp.; two clades of Malus spp., Rosa spp., and Rubus spp.; three clades of Malus spp., Pyrus spp., and Rosa spp.; three clades of Malus spp., Prunus spp., and Rubus spp.; four clades of Malus spp., Prunus spp., Rosa spp., and Rubus spp.; three clades of Malus spp., Prunus spp., Pyrus spp., Rosa spp., and Rubus spp.; two clades of Malus spp., Fragaria spp., Prunus spp., Rosa spp., and Rubus spp.; one clade of Malus spp., Fragaria spp., Pyrus spp., Rosa spp., and Rubus spp.
Comparison of RGAs among Malus × domestica, Populus trichocarpa, and Vitis vinifera
RGA sequences can also be compared across different plant families, and a phylogenetic tree of RGAs from M. domestica, wild Malus species, V. vinifera, and P. trichocarpa (Table S1) was obtained (Figure 4). Several clades included sequences from two or three species, and two major clades, named Md1 and Md2, comprised only sequences of M. domestica (Figure 4). However, sequences of the Md1 clade were grouped in three subclades in the phylogenetic tree of RGAs from Rosaceae species (Figure S2). RGAs of subclades Md1 sc2 and Md1 sc3 did not show similarity with any Rosaceae RGAs, whereas sequences of Md1 subclade 1 (Md1 sc1) shared significant similarity with four RGAs of Pyrus (Figure S2). Clade Md2 included one and two RGAs from Rubus and Rosa, respectively. Most of the RGAs of the clade Md2 are located on Chr 2, 3, 7, 11, 12, and 15.
Figure 4
Phylogenesis of RGAs from Malus species (wild and domesticated apple), Populus trichocarpa and Vitis vinifera.
Phylogenetic analysis of the NBS domain was carried out by the neighbor-joining method [65] using RGA sequences of domesticated and wild Malus species (green), P. trichocarpa (cyan), and V. vinifera (purple). Proteins present in contiguous positions on the tree and belonging to the same species are merged (collapsed branches are indicated by the + sign). Two phylogenetic clades comprise only sequences of M. domestica (Md1 and Md2).
Duplication of RGAs in Malus × domestica
To study the recent duplication of RGAs in the M. domestica genome, Ks values were determined, and results from recent gene duplications were highlighted (Figure S3). Links among different RGAs helped to describe the relationships among the duplicated apple chromosomes [34]. Homologous apple chromosomes had more than 10 links, except for Chr 13 and 16, which hosted only a low number of RGAs. Chr 6 was not included in this analysis because it contains only nine RGAs, six of them derived from the recent WGD. Moreover, the duplicated chromosomes had RGAs belonging to the same phylogenetic clades (Figure S4).
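The idea of counting links between chromosomes can be illustrated with a short sketch. It assumes that each candidate duplicate is available as a triple of two chromosome numbers and a Ks value, and that a pair counts as a recent duplication when its Ks does not exceed a cutoff. The cutoff, the function name, and the toy values are hypothetical; the text does not state the threshold that was used.

```python
from collections import Counter

def count_chromosome_links(pairs, ks_max):
    """Count RGA pairs with Ks <= ks_max for each pair of distinct chromosomes.

    pairs: iterable of (chrom_a, chrom_b, ks) triples, one per RGA pair.
    ks_max: hypothetical Ks cutoff for calling a duplication recent.
    """
    links = Counter()
    for chrom_a, chrom_b, ks in pairs:
        if chrom_a == chrom_b:
            continue  # duplications within one chromosome are not links
        if ks <= ks_max:
            links[tuple(sorted((chrom_a, chrom_b)))] += 1
    return links

# Toy input with invented Ks values, for illustration only.
toy_pairs = [(3, 11, 0.20), (11, 3, 0.25), (5, 10, 0.10), (3, 11, 1.40), (11, 11, 0.05)]
toy_links = count_chromosome_links(toy_pairs, ks_max=0.3)
for (chrom_a, chrom_b), n in sorted(toy_links.items()):
    print(f"Chr {chrom_a} - Chr {chrom_b}: {n} link(s)")
```

Sorting each chromosome pair before counting makes the tally independent of the order in which the two RGAs of a pair are listed.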
Discussion
To counteract pathogens, plants rely on the innate immunity of their cells and on systemic signals emanating from infection sites [9], [41]. Pathogen effectors from very diverse organisms are recognized by resistance proteins encoded by RGAs and activate plant defense responses [6], [9]. NBS-mediated disease resistance is effective against obligate biotrophic and hemibiotrophic pathogens but not against necrotrophs, which kill host tissues during colonization [42].
In apple, the abundance of RGAs is only partly related to genome size (750 Mb), which is much smaller than in maize (2300 Mb; [21]) or soybean (1115 Mb; [19]). The TIR-NBS class accounts for the largest group of RGAs in A. thaliana (64%; [11]) and B. rapa (46%; [14]). In P. trichocarpa [26], V. vinifera [2], [5], [28], [29], and C. papaya [16], [30], the percentage of the TIR-NBS class is much lower than in the previously mentioned species. The TIR-NBS class is present at a very low frequency in O. sativa (1%; [24]) and S. bicolor (1%; [27]) and is absent in B. distachyon and Z. mays [30], supporting the conclusion that this class is specific for dicotyledons. In apple, 231 RGAs of the TIR-NBS class have been identified, and they are mainly located on Chr 2, 5, 9, 12, 15, 16, and 17. However, the number of RGAs belonging to the non-TIR-NBS class in apple (505) is greater than in all other species considered, and these RGAs are mainly located on Chr 3, 4, 8, 11, 13, and 14. The existence of chromosome-specific RGA classes suggests that groups of chromosomes evolved separately, but further analyses are required to test this hypothesis. In grapevine, the existence of two chromosome groups has been inferred based on RGA cluster similarity, and the two groups seem to have evolved independently [2]. Moreover, the TIR-NBS class is specific for only one of the two components of the V. vinifera genome, suggesting an independent evolution of the RGA classes [2].
In apple, 56% of RGAs (435 of 778 anchored) are located preferentially on six chromosomes, with 14% located on Chr 2. In large gene families, genes are commonly organized in clusters and superclusters [4], [5], [11], [14], [16], [25], [26], as demonstrated here for the apple genome. Of the RGA clusters in apple, 71% (108 of 152) include RGAs from the same phylogenetic clade, and 29% include RGAs from two to three different clades. Clusters frequently consist of tandem duplications of the same gene [5], [43]. Heterogeneous clusters, in which sequences belong to different phylogenetic lineages, are also present, most probably as a result of different molecular mechanisms such as ectopic recombination, chromosomal translocation, and gene transposition, as has been recently highlighted for the grapevine genome [2]. This kind of genome evolution could be explained in terms of positive selection for cluster complexity, which could serve as the basis for the generation of new resistance specificities [4], [44].
The role of tandem duplication in the apple genome is supported by low Ks values among RGAs of the same cluster, as is already known for other species [2], [5], [14], [22], [43]. Gene duplication in a position different from the original cluster has to be preceded by gene transposition, as predicted for A. thaliana and V. vinifera RGAs [1], [2]. Thus, a successful transposition is the starting point for the creation of a new RGA cluster, and the selection for disease resistance could favor the process [5], [45]. Moreover, analysis of RGA transposition has indicated that the putative component genomes of V. vinifera may have evolved independently and later fused and evolved together in the same nucleus [2].
Velasco et al. [34] have shown that recent WGD has increased the chromosome number in apple from nine in the putative ancestor to the current 17. The recent duplication of RGAs due to a WGD event supports the existence of i) a tetraploid state of the genome in which a pair of chromosomes exists with a second homologous pair; ii) duplications inside chromosomes, particularly for Chr 11 where recent duplications can be observed; and iii) duplications in different chromosomes, suggesting recent events of gene transposition. Eight of the 17 chromosomes (Chr 3 and 11, 5 and 10, 9 and 17, and 13 and 16) represent a direct duplication of four ancestral chromosomes, and each of the extant Chr 4, 6, 12, and 14 derives from translocation between two ancestral chromosomes [34]. More complex events have generated the remaining five chromosomes, which are derived from three ancestral chromosomes. The distribution of RGA clades along duplicated chromosomes indicates a similar position of orthologous RGAs along each chromosome doublet (Chr 3 and 11, 5 and 10, 9 and 17, and 13 and 16). These results strongly support the origin of the apple chromosomes as described by Velasco et al. [34] and indicate that RGA distribution might be used to dissect plant genome evolution [2]. As is the case for other species, the process of gene duplication has shaped the apple genome in different ways, including the selective retention of paralogs associated with specific biological processes, the amplification of specific gene families, and an extensive subfunctionalization of paralogs. Both the major WGD event and small-scale duplications could be responsible for the high number of apple RGAs. A remarkable feature of gene duplication in apple is the high proportion of paralogs showing divergent expression patterns [46]. Extensive subfunctionalization could have contributed to the acquisition of new traits specific to apple or to the Pyrinae lineage [47]. Sequences of Eurosid genomes provide evidence of ancient genome duplications that occurred early in evolution, suggesting a polyploid origin for most Eudicots [28], [48].
Most of the RGAs of wild Malus species are closely related to RGAs of the domesticated apple. Although RGA sequencing from wild Malus species was partial and could include alleles of the same gene, phylogenetic analysis revealed clades specific to wild Malus species, indicating, as expected, the potential to enlarge the genetic variation of RGAs in domesticated apple. Moreover, the comparison of apple RGAs with those of other Rosaceae indicates the existence of specific clades for apple. In addition, several clades include a mixture of RGAs from Malus, Pyrus, and Prunus, indicating that similar resistance genes are still shared in different genera of the Rosaceae. While these results support the monophyletic origin of the three genera, clades specific for each genus were also found. The existence of genus- or species-specific clades indicates the existence of mechanisms for cluster conservation, as reported by Plocik et al. [49].
Phylogenetic relationships within the Rosaceae inferred from RGAs are consistent with phylogenies based on chloroplast and other nuclear genes [50], [51]. The phylogenetic analysis of the RGAs from Malus, Vitis, and Populus shows that Malus contains two large non-TIR-NBS clades that are specific to Malus. This inference should be considered with caution, because the RGA sequences used in our analysis are from only a few species. Several other reasons could explain the variation of RGAs in Rosaceae species, such as the inter-specific variation of the RGA family size observed in dicotyledonous plants. Similar situations were reported for other gene families in the Archaea [52], bacteria [52], [53], and mammals [54], [55]. The variation of RGA family size between species could be attributed to gene duplication, deletion, pseudogenization, and functional diversification [56]–[58]. The last case is supported by the necessity of a species to adapt to rapidly changing pathogen populations.
Concluding Remarks
This paper analyses the RGAs of Malus spp. and other Rosaceae species to reveal specific evolutionary features of M. domestica. RGAs of M. domestica are mainly located in clusters and are mapped preferentially on six chromosomes. TIR-NBS and non-TIR-NBS classes of RGAs are located in different chromosome groups. Phylogenetic reconstruction in the Rosaceae family revealed specific clades of RGAs for Malus spp., Fragaria spp., and Rosa spp., indicating genus-specific evolution of resistance genes. However, strikingly similar RGAs were shared in different species of Malus, Pyrus, and Prunus, highlighting a monophyletic origin of these three genera and the high conservation of some RGA sequences in these plants.
Materials and Methods
Identification of RGAs in the Apple Genome
The RGA sequences were identified from the predicted proteins of M. domestica cultivar ‘Golden Delicious’ [34] based on their NB-ARC domain profile (PF00931 [59]) using HMMER [60]. Putative RGA alleles were identified as predicted genes that have more than 90% sequence similarity to, and overlap with, another RGA along each scaffold of the heterozygous apple genome. Apple RGAs were validated by BLAST-N analysis (more than 90% protein sequence similarity) against known A. thaliana, P. trichocarpa, and V. vinifera genes. RGAs were grouped into different classes based on the presence of the TIR, LRR, CC, and BED finger domains [43]. The motifs were derived from the domain profiles retrieved from the PFAM (http://pfam.janelia.org), PANTHER (http://www.pantherdb.org/), and SMART (http://smart.embl-heidelberg.de) databases and from the COILS program; a stringent threshold of 0.9 was used so that CC domains were specifically detected [61]. Resistance-related proteins were also identified based on kinase domains (IPR000719, PF07714, PF00069). Additional putative apple resistance genes were selected using BLAST with Arabidopsis proteins as reference sequences, based on a 60% similarity threshold.
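As a rough illustration of the profile-search step, the sketch below filters a HMMER per-sequence table (the `--tblout` output of `hmmsearch`) for hits to the NB-ARC profile PF00931. The function name `nbs_candidates` and the E-value cutoff `max_evalue` are assumptions made for this example; the text does not state which threshold was applied, and the sketch is not the pipeline that was actually used.

```python
def nbs_candidates(tblout_lines, profile_acc="PF00931", max_evalue=1e-5):
    """Return {protein_id: best full-sequence E-value} for hits to one profile.

    Expects lines in the hmmsearch --tblout layout: whitespace-separated
    columns, with target name in column 1, query accession in column 4,
    and full-sequence E-value in column 5. Comment lines start with '#'.
    The default max_evalue is a hypothetical cutoff, not one from the paper.
    """
    hits = {}
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(maxsplit=18)  # the last column is free text
        target, query_acc = fields[0], fields[3]
        evalue = float(fields[4])
        # Pfam accessions carry a version suffix (e.g. "PF00931.xx").
        if query_acc.split(".")[0] == profile_acc and evalue <= max_evalue:
            hits[target] = min(evalue, hits.get(target, evalue))
    return hits
```

In practice the function would be fed the lines of a table file, for example `nbs_candidates(open("hits.tbl"))`, where `hits.tbl` is a placeholder file name.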
Identification of RGA Clusters in the Apple Genome
The Arabidopsis definition of an RGA cluster [4] was adopted: two or more RGAs in a cluster should be located within an average of 250 kb and should not be interrupted by more than 21 open reading frames other than RGAs, as previously adopted for grapevine RGA clusters [2].
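The cluster rule above can be sketched in a few lines of Python. This is a simplified reading, not the authors' implementation: it assumes that "within an average of 250 kb" can be approximated by requiring neighbouring RGAs to start no more than 250 kb apart, and that the 21-ORF limit applies to the non-RGA genes lying between two neighbouring RGAs. The `Gene` record and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    chrom: str
    start: int      # start coordinate in bp
    is_rga: bool    # True if the gene model is an RGA


def find_rga_clusters(genes, max_gap_bp=250_000, max_non_rga=21):
    """Group RGAs into clusters of two or more members per chromosome.

    Assumed simplification: neighbouring RGAs join the same cluster when
    they start within max_gap_bp of each other and are separated by at
    most max_non_rga non-RGA gene models.
    """
    by_chrom = {}
    for gene in genes:
        by_chrom.setdefault(gene.chrom, []).append(gene)

    clusters = []
    for members in by_chrom.values():
        members.sort(key=lambda g: g.start)
        current, intervening = [], 0
        for gene in members:
            if not gene.is_rga:
                intervening += 1
                continue
            too_far = current and gene.start - current[-1].start > max_gap_bp
            if current and (too_far or intervening > max_non_rga):
                if len(current) >= 2:
                    clusters.append(current)
                current = []
            current.append(gene)
            intervening = 0
        if len(current) >= 2:
            clusters.append(current)
    return clusters
```

Singletons are dropped on purpose, because the definition requires at least two RGAs per cluster.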
Isolation of RGAs from Wild Species
Four pairs of degenerate primers targeting the NBS domain [62], [63] were used to amplify RGA sequences from 26 different Malus accessions present in the USDA apple germplasm collection at Geneva (NY, USA) (www.ars-grin.gov/npgs/index.html; Table S2). The homologous sequences represent the following species: M. baccata, M. florentina, M. floribunda, M. fusca, M. halliana, M. honanensis, M. hupehensis, M. kansuensis, M. micromalus, M. orientalis, M. prattii, M. prunifolia, M. pumila, M. robusta, M. sargentii, M. sieboldii, M. sieversii, M. sikkimensis, M. sublobata, M. sylvestris, M. transitoria, and M. yunnanensis (Table S2). PCR fragments were cloned in pGEM-T Easy (Promega), and two clones for each fragment were sequenced. Sequences were screened, cleaned, and compared with resistance genes previously identified in Rosaceae and in other Angiosperms. BLAST DNA similarity searches were performed against the RGA sequences of the apple genome using a collection of established RGAs. The RGAs were translated using tBLAST-N. Clones were filtered based on hit quality, because most of the RGA clones encoded between 24 and 40 amino acid residues. Queries having only a single hit below 90% identity were removed, and those with multiple smaller hits were annotated manually. RGA sequences from wild Malus species were submitted to the NCBI database (www.ncbi.nlm.nih.gov) under the accession numbers reported in Table S1.
Phylogenetic Analyses
Public RGA sequences from Rosaceae, P. trichocarpa, and V. vinifera Release 2 were downloaded from GenBank (http://www.ncbi.nlm.nih.gov; Table S1). RGA sequences from wild Malus species were also included (Table S1). Protein sequences of the NBS domain of RGAs from M. domestica were aligned together with NBS sequences of wild Malus species, P. trichocarpa, V. vinifera, and the other Rosaceae species using hidden Markov models with the Sequence Alignment and Modeling Software System (SAM-T2K [64]); the sequences were formatted for analysis with the Phylip phylogenetic inference package [65]. The SEQBOOT tool of the Phylip package was used to generate 500 bootstraps of the data set, and the PROTDIST tool was used to construct 500 bootstrapping distance matrices using the Dayhoff PAM matrix [65]. These matrices were jumbled twice and processed with the FITCH tool to create a phylogenetic tree. A neighbor-joining tree of the 500 bootstraps was also constructed (jumbling the sequence input order twice), and a majority-rule consensus tree was assembled.
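To make the bootstrapping step concrete, the sketch below shows what column resampling of a protein alignment looks like in principle: each replicate draws alignment columns with replacement until it has the original length. It illustrates the general technique only and is not the SEQBOOT implementation; the function name, the dictionary input format, and the fixed random seed are assumptions for this example.

```python
import random


def bootstrap_alignment(alignment, n_replicates=500, seed=0):
    """Resample alignment columns with replacement.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    Returns a list of n_replicates dicts with the same names and length.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")

    rng = random.Random(seed)  # fixed seed so runs are repeatable
    replicates = []
    for _ in range(n_replicates):
        # The same column indices are used for every sequence, so each
        # sampled column stays intact across the alignment.
        columns = [rng.randrange(length) for _ in range(length)]
        replicates.append(
            {name: "".join(alignment[name][c] for c in columns) for name in names}
        )
    return replicates
```

Each replicate would then be turned into a distance matrix and a tree, and the resulting trees summarised in a majority-rule consensus, as described in the paragraph above.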
Determination of the Ks Value
Based on a CLUSTALW nucleotide alignment of M. domestica RGA sequences, a total of 302,253 Ks values were obtained [66]. The connections between chromosomes were defined on the basis of the number of RGAs and Ks values. A connection between two chromosomes was accepted if at least ten RGAs had a Ks value lower than or equal to the first-quartile value of 0.25 [34].
Supporting Information
A: Distribution (percentage) of the major phylogenetic clades of apple RGAs (Figure 1A) on the 17 M. domestica chromosomes (Chr). B: Percentage of chromosome (Chr) assignment to the major phylogenetic clades. Colours of major phylogenetic clades and chromosomes are listed below each chart. (TIF)
Phylogenesis of RGAs from Rosaceae species. Phylogenetic analysis of the NBS domain was carried out by the neighbor-joining method [65] using RGA sequences of domesticated and wild Malus species (green), Pyrus spp. (yellow), Prunus spp. (purple), Fragaria spp. (red), Rosa spp. (orange), and Rubus spp. (blue). The composition of the phylogenetic clades (Md1 and Md2; Figure 4) and subclades (sc) of sequences mainly from M. domestica is highlighted. Proteins present in contiguous positions on the tree are merged (collapsed branches are indicated by the + sign). (TIF)
Connections between apple chromosomes based on Ks values from pairwise comparisons of RGAs. Joining lines represent connections between two RGAs among duplicated chromosomes [35] (blue, red, pink, green), among non-duplicated chromosomes (yellow), and within the same chromosome (gray). Each line represents a connection between two RGAs with a Ks value lower than 0.25 [35]. A connection between two chromosomes was accepted if at least ten pairwise comparisons had a Ks value lower than 0.25. (TIF)
Distribution of RGAs among chromosome (Chr) doublets derived from the recent whole genome duplication of apple. Colours of major phylogenetic clades (Figure 1A) are indicated. (TIF)
List of accession numbers and abbreviations of resistance gene analogues (RGAs) with a nucleotide-binding site (NBS) domain from the species analysed, including wild Malus species. Chromosome location, code, and class based on protein domain analysis are indicated for each of the RGAs of M. domestica. (XLS)
List of wild Malus species accessions (USDA apple germplasm collection at Geneva, NY, USA; www.ars-grin.gov/npgs/index.html) used for the isolation of RGAs. (DOCX)
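The chromosome-connection rule described under "Determination of the Ks Value" (a connection is accepted when at least ten comparisons fall at or below a Ks of 0.25) can be sketched as follows. The input format, a list of `(chromosome_a, chromosome_b, ks)` tuples for RGA pairs, and the function name are assumptions for this example; the sketch counts pairwise comparisons, which is one of the two wordings used in the text.

```python
from collections import Counter


def chromosome_connections(pairs, ks_max=0.25, min_pairs=10):
    """Return {(chrom_a, chrom_b): count} for accepted connections.

    pairs: iterable of (chrom_a, chrom_b, ks), one entry per RGA pair.
    A chromosome pair is kept when at least min_pairs comparisons have
    ks <= ks_max. Pairs within one chromosome appear as (c, c).
    """
    counts = Counter()
    for chrom_a, chrom_b, ks in pairs:
        if ks <= ks_max:
            # Sort so that (2, 7) and (7, 2) are counted together.
            counts[tuple(sorted((chrom_a, chrom_b)))] += 1
    return {key: n for key, n in counts.items() if n >= min_pairs}
```

Changing `ks_max` or `min_pairs` shows how sensitive the inferred connections are to the two thresholds.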