The evolution of the Gp-Rbp-1 gene in Globodera pallida includes multiple selective replacements.

Jean Carpentier, Magali Esquibet, Didier Fouville, Maria J Manzanares-Dauleux, Marie-Claire Kerlan, Eric Grenier.

Abstract

The Globodera pallida SPRYSEC Gp-Rbp-1 gene encodes a secreted protein which induces effector-triggered immunity (ETI) mediated by the Solanum tuberosum disease resistance gene Gpa2. Nonetheless, it is not known how the Andes orogeny, the richness in Solanum species found along the Cordillera or the introduction of the nematode into Europe have affected the diversity of Gp-Rbp-1 and its recognition by Gpa2. We generated a dataset of 157 highly polymorphic Gp-Rbp-1 sequences and identified three Gp-Rbp-1 evolutionary pathways: the 'Northern Peru', 'Peru clade I/European' and 'Chilean' paths. These may have been shaped by passive dispersion of the nematode and by climatic variations that have influenced the nature and diversity of wild host species. We also confirmed that, by an analysis of the selection pressures acting on Gp-Rbp-1, this gene has evolved under positive/diversifying selection, but differently among the three evolutionary pathways described. Using this extended sequence dataset, we were able to detect eight sites under positive selection. Six sites appear to be of particular interest because of their predicted localization to the extended loops of the B30.2 domain and/or support by several computational methods. The P/S 187 position was previously identified for its effect on the interaction with GPA2. The functional importance of the other five amino acid polymorphisms observed was investigated using Agrobacterium transient transformation assays. None of these new residues, however, appears to be directly involved in Gpa2-mediated plant defence mechanisms. Thus, the P/S polymorphism observed at position 187 remains the sole variation sufficient to explain the recognition of Gp-Rbp-1 by Gpa2.

Substances：
Helminth Proteins

Year: 2011 PMID: 22192092 PMCID: PMC3440577 DOI: 10.1111/j.1364-3703.2011.00769.x

Source DB: PubMed Journal: Mol Plant Pathol ISSN: 1364-3703 Impact factor: 5.663

INTRODUCTION

Plant–parasite interactions are often analysed in the context of the gene‐for‐gene model described by Flor (1955). In this model, both plant disease resistance and avirulence genes evolve under diversifying selection, the former to improve the recognition of pathogen effectors and the latter to escape recognition by resistance proteins (Flor, 1955; Jones and Dangl, 2006). In cyst nematodes, the role of certain secreted proteins in the host–pathogen interaction has been well documented (Davis ). One of these secreted proteins, Gp‐RBP‐1 (Blanchard ), belongs to the SPRYSEC multigene family and is described as the avirulence gene whose product is recognized by the resistance protein GPA2 (Sacco ). The SPRYSEC gene family encodes nematode secreted proteins specifically expressed in the dorsal oesophageal gland of the nematode early in the parasitic cycle (Rehman ). Apart from the presence of a signal peptide (SP), SPRYSEC genes contain a single B30.2 domain, which is an extended domain structure comprising PRY and SPRY units known to be involved in protein–protein interactions (Woo ). The subcellular expression in plants of several members of this gene family from the potato cyst nematode, Globodera pallida, has been characterized (Jones ). Although the Gp‐RBP‐1 protein appears to be localized in the cytoplasm, other SPRYSEC members show different subcellular localization patterns (nucleus and nucleolus), suggesting that SPRYSEC proteins can target a range of host proteins and act as suppressors of host defences (Jones ; Rehman ). Among the G. pallida SPRYSECs, the Gp‐RBP‐1 protein has been shown to be able to trigger a hypersensitive response (HR) in the presence of the coiled coil‐nucleotide binding‐leucine‐rich repeat (CC‐NB‐LRR) protein Gpa2. 
Transient expression assays co‐expressing Gpa2 with variants of Gp‐Rbp‐1 in tobacco leaves have also suggested that a proline residue at position 187 (P/S polymorphism) in the Gp‐RBP‐1 protein sequence is required for recognition by Gpa2 (Sacco ). Despite all this functional information, the evolution of Gp‐Rbp‐1 has not been investigated to date and, in particular, it is unknown how the phylogeographical history of G. pallida may have had an impact on the diversity of Gp‐Rbp‐1. Moreover, the Gp‐Rbp‐1 polymorphism assessed in Sacco's study was mostly based on sequences obtained from European G. pallida populations. Considering the diversity observed in indigenous South American G. pallida populations, it is not clear whether the P‐187 residue alone is still sufficient for Gpa2 recognition. Indeed, for millions of years, Solanum species and nematodes have co‐evolved and have gradually spread throughout most of the South American continent, with the Andes orogeny playing a dominant role in the South American conquest by G. pallida (Grenier ). Nearly 200 wild potato species can be found in South America. Many of these wild species are present in Argentina, Bolivia and Peru (where 93 species have been recorded), whereas, in other countries, such as Chile or Brazil, the spectrum of Solanum species is much poorer (Hijmans and Spooner, 2001). Genetic studies on Peruvian populations of G. pallida, conducted using neutral markers, have shown a clear south‐to‐north phylogeographical pattern with five well‐supported clades (Picard ). All cyst nematodes sampled in Europe have been shown to be derived from populations belonging to Peru clade I, the southernmost Peruvian clade, which also shows the greatest genetic diversity (Picard ). In this study, we broadly explored the variability of Gp‐Rbp‐1 by sequencing it in various novel G. pallida populations, especially populations from Peru (clades I–V) and Chile. 
We analysed the genetic structure of more than 150 sequences from 22 G. pallida populations to study the effects of ecological changes linked to the Andes orogeny or the accidental introduction of G. pallida in Europe. Our findings suggest that the evolution of Gp‐Rbp‐1 has been shaped by three distinct evolutionary pathways. We then investigated the presence of selective pressures considering these three evolutionary pathways and were able to identify novel sites under positive selection. Finally, we used transient expression assays in tobacco leaves to examine the effect of these polymorphisms on the outcome of the interaction between S‐187 Gp‐Rbp‐1 variants and Gpa2.

RESULTS

Structure and variability of Gp‐Rbp‐1

The previously reported Gp‐Rbp‐1 sequences vary in length, but all are characterized by the presence of SP and B30.2 domains divided into a 129‐amino‐acid SPRY subdomain and two PRY subdomains. We amplified a total of 149 sequences from 19 populations with an average of eight different sequences per population. We completed our dataset with 33 sequences available in GenBank/EMBL databases, including sequences from the three populations D383, Rookmaker and Guiclan. Of these 182 sequences, 13 contain a SPRY subdomain longer or shorter than 129 amino acids and were therefore removed from the final dataset. Very few of the 169 remaining sequences were identical. The coding DNA sequence length varied extensively, from 720 to 870 bp. These variations in length result from large indels affecting the B30.2 domain and the number of PRY subdomains (Fig. 1). Thus, 12 further sequences corresponding to variants showing large indels in the B30.2 domain were also removed (Fig. 1). The final sequence dataset used for all further analyses therefore included 157 sequences: 30 from GenBank and 127 from our study.
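The filtering steps described above can be checked with a few lines of arithmetic. This is only a bookkeeping sketch: the counts come from the text, while the variable names are illustrative and not part of the original analysis.

```python
# Counts taken from the text; variable names are hypothetical.
amplified = 149        # sequences amplified in this study
from_databases = 33    # sequences added from GenBank/EMBL

total = amplified + from_databases            # 182
after_spry_filter = total - 13                # SPRY subdomain not 129 aa long
final_dataset = after_spry_filter - 12        # large indels in the B30.2 domain

# Composition of the final dataset as reported in the text.
final_genbank, final_study = 30, 127

# Sequences removed from each source, implied by the reported counts.
removed_genbank = from_databases - final_genbank   # 3
removed_study = amplified - final_study            # 22

print(total, after_spry_filter, final_dataset)     # 182 169 157
print(removed_genbank + removed_study)             # 25
```

The 25 removed sequences (13 + 12) are consistent with the reported composition of the final dataset, which implies that three database sequences and 22 newly amplified sequences were discarded.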

Figure 1

Structure and sequence size variation in the Gp‐Rbp‐1 sequence dataset. Insertions and deletions are indicated on the gene structure by an arrow. Indel sizes and positions are boxed above the arrows (numbering was made on the basis of the FJ392677.1 sequence used here as a reference). The size of the sequence in nucleotides and the number of sequences concerned are indicated on the right‐hand side of each structure. Information available (Sacco ) on the strength of the Gpa2‐dependent hypersensitive response (HR) induction is indicated as follows: +++, complete collapse and rapid desiccation of the infiltration patch within 2 days; ++, complete collapse of the infiltration patch by 3 days post‐infiltration; +, slow and incomplete collapse with residual live cells; NT, not previously tested. Sequences that were kept in the final sequence dataset and used for further analysis are highlighted in bold. nt, nucleotide.

Polymorphism analysis conducted using DnaSP software revealed 250 polymorphic sites dispersed along the entire Gp‐Rbp‐1 sequence: 137 singletons and 113 parsimony‐informative sites. Among these 250 mutations, 72% are nonsynonymous and represent, on average, one possible amino acid replacement every seven amino acids along the entire protein sequence. In contrast with the findings reported following analysis with neutral markers (Picard ), we found no clear evidence of a reduction in variability from south to north Peru (Fig. 2). The maximum genetic diversity was observed around latitude 13°S, which corresponds to the geographical region of the G. pallida clades II and III. Nonetheless, strong local effects were also observed. For example, the populations P297 and P298 showed different levels of nucleotide diversity, even though they are geographically very close.
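The two kinds of polymorphic site and the nucleotide diversity mentioned above can be illustrated on a toy alignment. This is a minimal sketch, not the DnaSP implementation: it assumes a gap-free alignment, counts a variable site as parsimony-informative when at least two states each occur at least twice, and treats every other variable site as a singleton. The function names and the toy sequences are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def classify_sites(alignment):
    """Return (singleton_sites, parsimony_informative_sites) for aligned sequences."""
    singletons = informative = 0
    for column in zip(*alignment):
        counts = Counter(column)
        if len(counts) < 2:
            continue  # invariant site
        states_seen_twice = sum(1 for n in counts.values() if n >= 2)
        if states_seen_twice >= 2:
            informative += 1
        else:
            singletons += 1
    return singletons, informative

def nucleotide_diversity(alignment):
    """Average number of pairwise differences per site."""
    length = len(alignment[0])
    pairs = list(combinations(alignment, 2))
    differences = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return differences / (len(pairs) * length)

# Toy alignment (hypothetical sequences, not Gp-Rbp-1 data).
toy = ["ATGCA", "ATGCT", "ACGCA", "ACGCA"]
print(classify_sites(toy))                   # (1, 1)
print(round(nucleotide_diversity(toy), 4))   # 0.2333
```

In the toy alignment, the second column (T/T/C/C) is parsimony-informative and the last column (A/T/A/A) is a singleton; software packages may handle gaps and sites with more than two states differently.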

Figure 2

Variation in Gp‐Rbp‐1 nucleotide diversity (ρ) among the 12 Peruvian populations distributed from southern to northern Peru. The locations of the 12 Peruvian nematode populations studied are indicated on the Peru map. A colour code was used to indicate the affiliation of each population to its phylogeographical clade. The nucleotide diversity (ρ) calculated from the Gp‐Rbp‐1 sequences for each of these populations and its standard deviation are shown on the right‐hand side of the map and according to the geographical position of the population.

Results from factorial analysis based on the Gp‐Rbp‐1 dissimilarity matrix are shown in Fig. 3: axes 1 and 2 explain 27.4% and 13.6% of the total observed variability, respectively. The third calculated axis, which explained 6.9% of the variability, was not considered in our study. No clear structure was observed in the five clades from the Peruvian populations. However, sequences originating from populations of the same geographical region (Peru vs. Chile vs. Europe) tended to cluster together. Two exceptions to this rule are noteworthy: (i) the relationship between the European and Peru clade I sequences; and (ii) the relationship between Chilean and Peru P309 Gp‐Rbp‐1 sequences. The factorial analysis (Fig. 3) showed a central group (composed essentially of Gp‐Rbp‐1 sequences from European, clade I, clade II and clade III populations), which itself is split into three distinct groups. We interpreted these results in terms of three distinct evolutionary pathways: (i) the ‘Northern Peru path’, including sequences from populations of clades II–V; (ii) the ‘Peru clade I/European path’, grouping sequences from European and Peru clade I populations; and (iii) the ‘Chilean path’, encompassing sequences from P309 and Tierra del Fuego populations. Based on analysis of molecular variance (AMOVA), the consideration of eight geographical G. pallida groups (five Peruvian clades + the Chilean population + the P309 population + European populations) could only explain 11.9% of the total variability (Table 1A). By contrast, 28.2% of the variability was explained by the three evolutionary pathways (Table 1B), thus confirming the observations made in the factorial analysis.

Figure 3

Table 1

Analysis of molecular variance (AMOVA) of the Gp‐Rbp‐1 dataset under two different partitionings.

(A)
Source of variation	Degrees of freedom	Sum of squares	Variance components	Percentage of variation
Among groups	7	736.93	1.98	11.94
Among populations within groups	14	931.50	8.75	52.68
Within populations	135	793.53	5.88	35.38
Total	156	2461.96	16.61
(B)
Source of variation	Degrees of freedom	Sum of squares	Variance components	Percentage of variation
Among evolutionary pathways	2	294.25	2.94	28.20
Among populations within evolutionary pathways	19	432.96	2.51	24.11
Within populations	135	671.28	4.97	47.69
Total	156	1398.49	10.42

AMOVA considering: (A) eight geographical groups (five Peruvian clades, one Chilean group, one P309 group and one European group); or (B) three evolutionary pathways (‘Northern Peru path’, ‘Peru clade I/European path’ and ‘Chilean path’). In each case, the variance was calculated among the different groups or pathways, among the 22 populations within each group or pathway and within the 22 populations. Sequences obtained from Peruvian populations of clades II–V were assigned to the ‘Northern Peru path’, sequences obtained from European populations and from Peru clade I populations were assigned to the ‘Peru clade I/European path’, and sequences from the P309 and Tierra del Fuego populations were included in the ‘Chilean path’.
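The percentage of variation reported for each level is the share of that level's variance component in the summed components. The sketch below recomputes the percentages from the variance components in Table 1; the dictionary keys and the function name are illustrative, and small discrepancies from the published values are expected because the tabulated components are rounded to two decimals.

```python
def percentage_of_variation(components):
    """Share (in %) of each variance component in the total variance."""
    total = sum(components.values())
    return {level: round(100 * value / total, 2) for level, value in components.items()}

# Variance components from Table 1 (A): eight geographical groups.
partition_a = {
    "among groups": 1.98,
    "among populations within groups": 8.75,
    "within populations": 5.88,
}
# Variance components from Table 1 (B): three evolutionary pathways.
partition_b = {
    "among pathways": 2.94,
    "among populations within pathways": 2.51,
    "within populations": 4.97,
}

pct_a = percentage_of_variation(partition_a)
pct_b = percentage_of_variation(partition_b)
print(pct_a)
print(pct_b)
```

Under these numbers the among-group share rises from about 12% with the geographical partition to about 28% with the pathway partition, which is the comparison drawn in the text.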

Figure 3. Factorial analysis based on the Gp‐Rbp‐1 dissimilarity matrix. Results of the factorial analysis are shown in a two‐axis system representing 41% of the variability (27.4% for the horizontal axis and 13.6% for the vertical axis). Each point corresponds to one Gp‐Rbp‐1 sequence with a colour code corresponding to its geographical group or clade of origin. The distance between two points represents the dissimilarity value between these two sequences in the two‐axis representation. The three arrows represent the three evolutionary pathways supported by the sequence dataset: the ‘Northern Peru path’, the ‘Peru clade I/European path’ and the ‘Chilean path’.

Gp‐Rbp‐1 has been subject to positive selection

Site‐specific models for the detection of positive selection were applied to the alignment of the 157 Gp‐Rbp‐1 sequences obtained. We compared four evolutionary models implemented in the codeml program of PAML in two pairwise comparisons: M1 vs. M2 and M7 vs. M8. The M2 and M8 models of positive selection fitted our dataset significantly better (P < 0.001) than the corresponding null models (Table 2). Both the M2 and M8 models identified eight sites under positive selection (C/G/R/S 37, H/N/Y 100, K/E 102, A/G 103, K/Q/R 119, F/I/V 184, P/S 187 and K/E 248) with a posterior probability greater than 98%. Residues 103, 119 and 184 were confirmed by single‐likelihood ancestor counting (SLAC) or fixed‐effects likelihood (FEL) methods, and residue 187 was the only one found by all methods. Of the eight sites predicted to be under positive selection by PAML, seven were found in the B30.2 domain and, more specifically, five (100, 102, 103, 184 and 187) mapped to predicted loops that shape surface A of the SPRY domain, based on comparison with SPRYSEC 19 (Murrin and Talbot, 2007) (Fig. 4).

Table 2

Choice of the best evolutionary model fitting the Gp‐Rbp‐1 dataset.

Model type	Null model	Positive selection model	Null model	Positive selection model
Model	M1	M2	M7	M8
ln(likelihood)	−4056.08	−4009.35	−4059.88	−4009.62
2Δl		93.46		100.52
χ² (n = 2, P = 0.01)		9.21		9.21

To identify the model that best fitted our data, we used PAML (Phylogenetic Analysis by Maximum Likelihood). Null models (M1 and M7) are compared with positive selection models (M2 and M8) using a likelihood score for each model. The significance of the likelihood score difference was estimated by comparing the null model and positive selection model (2Δl) with a χ² table (likelihood ratio test, LRT). In the two comparisons (M1 vs. M2 and M7 vs. M8), 2Δl > 9.210 indicated that the positive selection models were a better fit than the null models with our dataset.
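The likelihood ratio test summarized in Table 2 can be reproduced from the tabulated log-likelihoods. The sketch below assumes two degrees of freedom for each comparison, as stated in the table; for exactly two degrees of freedom the χ² tail probability has the closed form exp(−x/2), so only the standard library is needed. The function name is illustrative.

```python
import math

def likelihood_ratio_test(lnl_null, lnl_alternative):
    """Return (2*delta_lnL, p-value) for a chi-square test with 2 degrees of freedom."""
    statistic = 2 * (lnl_alternative - lnl_null)
    p_value = math.exp(-statistic / 2)  # chi-square survival function, valid for df = 2 only
    return statistic, p_value

# Log-likelihoods from Table 2.
stat_m1_m2, p_m1_m2 = likelihood_ratio_test(-4056.08, -4009.35)
stat_m7_m8, p_m7_m8 = likelihood_ratio_test(-4059.88, -4009.62)

# Critical value at P = 0.01 with 2 degrees of freedom.
critical_value = -2 * math.log(0.01)

print(round(stat_m1_m2, 2), round(stat_m7_m8, 2))  # 93.46 100.52
print(round(critical_value, 2))                     # 9.21
```

Both statistics exceed the critical value by a wide margin, so the corresponding p-values are far below the 0.01 threshold used in the table.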

Figure 4

Distribution of the dN/dS ratio along the Gp‐Rbp‐1 amino acid sequence. Analyses were conducted using the codeml module of PAML on the Gp‐Rbp‐1 sequences. Amino acid variants found to be subjected to positive selection with a posterior probability of >95% are indicated above each site. Positions under positive selection that were supported by PAML and at least one other method [single‐likelihood ancestor counting (SLAC) or fixed‐effects likelihood (FEL)] are indicated with a star. Sequence portions corresponding to the SPRYSEC extended loops in the B30.2 protein structure are shaded in grey, and amino acids under positive selection found in these portions are boxed. The positions of the predicted different domains (SP, PRY and SPRY) are indicated below the graph.

We identified 33 haplotypes (Table S1, see Supporting Information) corresponding to all observed combinations of the eight sites found under positive selection. The eight most frequent haplotypes (haplotypes 1–8) are represented by 9–23 sequences and include two‐thirds of the sequences. Among these most frequent haplotypes, it should be noted that: (i) all P309 and Tierra del Fuego Gp‐Rbp‐1 sequences correspond to the same haplotype (haplotype 7), which was only found in these two populations; and (ii) haplotype 8 seems to be restricted to Peru clade I populations.
Loss of haplotype 8 in Europe and northern Peru may have been caused by genetic drift or by a founder effect. Between two and seven haplotypes were identified in most of the studied populations (mean number of haplotypes per population, 3). Populations D383, Guiclan, P235, P308, P309 and Tierra del Fuego were represented by only one haplotype, which, in some cases, may reflect the small number of sequences retained in the final sequence dataset. In addition to the D383 population, we recovered only avirulent variants of Gp‐Rbp‐1 (i.e. P‐187 sequences) in five other populations: Luffness, Pukekoe, P205, P212 and P235. Nearly all the polymorphisms observed for the eight sites under positive selection were found in the ‘Northern Peru path’ and in the ‘Peru clade I/European path’ (Fig. 5). The exceptions were R‐37, which is missing in the ‘Northern Peru path’, and R‐119, which is missing in the ‘Peru clade I/European path’. Residues H‐100, A‐103 and I‐184 were predominant in each evolutionary pathway (present in more than 58% of the sequences). Residues G‐103, F‐184 and K‐248 were found in less than 20% of the sequences belonging to the ‘Peru clade I/European path’ and the ‘Northern Peru path’, and were missing in the ‘Chilean path’.
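A haplotype, as used here, is the combination of residues found at the eight positively selected sites. The following sketch shows one way such haplotypes could be tallied. It assumes gap-free protein sequences numbered like the reference sequence, and the toy sequences and helper names are invented for illustration; they are not haplotypes from Table S1.

```python
from collections import Counter

# Positions of the eight sites under positive selection (1-based, from the text).
SITES = (37, 100, 102, 103, 119, 184, 187, 248)

def haplotype(protein, sites=SITES):
    """Concatenate the residues found at the selected sites."""
    return "".join(protein[position - 1] for position in sites)

def make_toy_protein(residues, length=260):
    """Build a placeholder sequence carrying the given residues at SITES (hypothetical helper)."""
    sequence = ["X"] * length
    for position, residue in zip(SITES, residues):
        sequence[position - 1] = residue
    return "".join(sequence)

# Three toy sequences: two share a haplotype, the third differs only at position 187.
toy_proteins = [
    make_toy_protein("SHKAKISK"),
    make_toy_protein("SHKAKISK"),
    make_toy_protein("SHKAKIPK"),
]

haplotype_counts = Counter(haplotype(p) for p in toy_proteins)
print(haplotype_counts.most_common())  # [('SHKAKISK', 2), ('SHKAKIPK', 1)]
```

The seventh character of each haplotype string corresponds to position 187, so avirulent (P‐187) and virulent (S‐187) variants can be separated by inspecting that character alone.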

Figure 5

Variability observed in the Gp‐Rbp‐1 dataset at sites predicted to be under positive selection. The eight codons considered were 37, 100, 102, 103, 119, 184, 187 and 248 (numbering is made on the basis of the FJ392677.1 sequence used here as a reference). For each of these codons, the bars represent the proportion of each amino acid variant observed at this site in each of the three evolutionary pathways: the ‘Chilean path’ (C), the ‘Clade I/European path’ (E) and the ‘Northern Peru path’ (P).

The P‐187 position plays a predominant role in the recognition of Gp‐Rbp‐1 by Gpa2

Among the 33 haplotypes identified previously, 19 have a serine amino acid in position 187, but show variability for the seven other sites under positive selection. When only the six sites that were supported by several methods and/or predicted to be localized to the loops of the B30.2 domain were considered, 12 haplotypes still contained the most frequent S‐187 residue. We tested whether, among these 12 haplotypes (haplotypes 4, 5, 7, 8, 11, 14, 16, 22, 23, 24, 28 and 29), sequence variation at the sites under positive selection had an effect on the interaction with GPA2. For this, 10 Gp‐Rbp‐1 variants were transiently expressed with Gpa2 in tobacco leaves (haplotypes 4 and 22 were tested previously in Sacco ). Four days after agro‐infiltration, for each experimental replicate, we observed an HR with our positive control, but not with the 10 Gp‐Rbp‐1 variants expressed with Gpa2 (Fig. S1, see Supporting Information), as observed previously for haplotypes 4 and 22 in Sacco ). Although it was not possible to test all the haplotype combinations and despite the detection of novel sites under positive selection, our results support the previously suggested idea that the recognition of Gp‐Rbp‐1 by Gpa2 is fully controlled by the proline residue at position 187.

DISCUSSION

One of the main objectives of this study was to broadly explore the variability of the Gp‐Rbp‐1 gene in G. pallida populations. Our results revealed considerable sequence variability, which could be assigned to three evolutionary pathways. Furthermore, Gp‐Rbp‐1 appears to have been subject to positive selection and this selection appears to have mostly affected sites within the encoded protein. All variants of Gp‐Rbp‐1 in our dataset were amplified from G. pallida cDNA with a single set of specific primers, and showed best BLAST hits with Gp‐Rbp‐1. However, as Gp‐Rbp‐1 belongs to the SPRYSEC multigene family, we cannot rule out the possibility that some of the sequences obtained are not true Gp‐Rbp‐1 variants, but actually represent other members of the SPRYSEC family. To limit this risk, we conducted our analysis on a reduced sequence dataset in which only sequences containing a SPRY subdomain length of 129 amino acids were retained. We also removed 12 sequences that had large indels in the B30.2 domain. However, these sequences may represent Gp‐Rbp‐1 variants obtained through alternative splicing. Alternative splicing events represent one of the main sources of proteomic diversity in multicellular eukaryotes (Nilsen and Graveley, 2010). Some cases of alternative splicing of putative effector proteins have been described in plant parasitic nematodes (Yu ; Zahler, 2005). It is known that Gp‐Rbp‐1 contains six introns, including intron 3, which is located between positions 243 and 244 of the most commonly identified variant of the Gp‐Rbp‐1 cDNA (Blanchard ). It is therefore probable that the large indels observed in some of the sequences correspond to intron 3 retention (+45 nucleotides) or exon 4 skipping (−72 nucleotides).
Based on the previously published results by Sacco ) and our agro‐infiltration experiments (data not shown), it appears that both types of splice variant are still able to trigger Gpa2 recognition as long as they encode a proline at residue 187, although the timing and intensity of the response varied. As shown by AMOVA, a substantial part (one‐third to nearly one‐half) of the variability remains at the intrapopulation level. This variation represents the interindividual variation that can be observed in cysts (cysts contain hundreds of larvae). Globodera pallida reproduces by obligate outcrossing and intrapopulation variation can also be increased when a single female mates with several males. A high level of intrapopulation variability has been reported previously for the nematode pectate lyase gene, pel‐2 (Geric Stare ). Globodera pallida has co‐evolved for a long time with wild plants rather than cultivated plants. We can thus hypothesize that the considerable intrapopulation sequence variation observed in the genes involved in the plant–nematode interaction is related to this history of co‐evolution with the many wild Solanum species that can be found in South America, the native area of this nematode. Gp‐Rbp‐1 evolution appears to have occurred independently in the five phylogeographical clades described for the G. pallida Peruvian populations (Picard ). In particular, populations from clade I were not the most polymorphic for Gp‐Rbp‐1 and we did not observe a reduction in diversity from south to north Peru. The maximum amount of diversity was observed in G. pallida populations sampled around latitude 13°S (geographical region of clades II and III), where Hijmans and Spooner (2001) previously observed the largest number of wild potato species (around 25 species in 250 km²). Thus, the diversity of Gp‐Rbp‐1 sequences appears to be linked to host plant diversity.
Moreover, our results suggest that Gp‐Rbp‐1 was influenced by three evolutionary pathways that correlated with major dispersion events which represented drastic changes in terms of host richness and climate. The first evolutionary pathway, which we called the ‘Northern Peru path’, affects most of the Gp‐Rbp‐1 sequences of G. pallida populations from Peruvian clades II–V. This evolutionary pathway may be linked to the Andean uplift which impacted the landscape, climate and ecosystem during the pre‐Quaternary (Hoorn ). The second evolutionary pathway, called the ‘Peru clade I/European path’, affects nearly all Gp‐Rbp‐1 sequences from Peruvian clade I and Europe. It can be linked to the recent introduction of G. pallida from a Peruvian clade I population into Europe during the 19th century. It has been established that all the European populations of G. pallida come from a single restricted area on the north shore of Lake Titicaca (Plantard ). Therefore, it is not surprising that the most frequent European Gp‐Rbp‐1 haplotypes are also very common in clade I Peruvian populations. Such a new European environment with only one host species (S. tuberosum ssp. tuberosum), long photoperiod and different climatic conditions could explain the emergence of another evolutionary pathway that has affected Gp‐Rbp‐1 evolution. Furthermore, five rare Gp‐Rbp‐1 haplotypes (haplotypes 14, 21, 22, 23 and 32) found in Europe appear to be absent from South America. The last evolutionary pathway, the ‘Chilean path’, includes all Gp‐Rbp‐1 sequences obtained for P309 and Tierra del Fuego. These sequences have diverged more than the European sequences from the clade I sequences. Thus, it is probable that this evolutionary pathway, like the ‘Northern Peru path’, occurred during geological time.
In a similar manner, the Andes orogeny also occurred southward from Lake Titicaca (towards Tierra del Fuego), and our results reveal that the evolution of Gp‐Rbp‐1 in Chile does not mirror the pattern observed in northern Peru. It seems that a distinct evolutionary trajectory occurred in Chile, probably because of the presence of more contrasted ecosystems and/or a more fragmented distribution of wild potatoes in this area (Hijmans and Spooner, 2001). This Chilean evolutionary pathway needs to be confirmed with other Chilean or Argentinean populations and sequences. Surprisingly, Gp‐Rbp‐1 sequences isolated in P309, a population sampled in the south of Peru, appear to be genetically closer to the Chilean Gp‐Rbp‐1 sequences than to Peruvian clade I Gp‐Rbp‐1 sequences. It is thus highly likely that P309 represents an anthropogenic introduction of a Chilean population into the south of Peru, rather than a native Peruvian G. pallida population. We have shown that the evolution of Gp‐Rbp‐1 has not strictly followed, at least in Peru, the phylogeographical history of the nematode population. Gp‐Rbp‐1 diversity seems rather to have been influenced by passive and anthropogenic dispersion of G. pallida in the American and European continents as well as climate variations that have influenced the nature and richness of wild host species. This evolutionary pattern is not surprising considering that SPRYSEC proteins are thought to be able to interact with a range of host proteins (Jones ; Rehman ). Thus, we can assume that the evolution of other such effectors was also strongly impacted by environmental changes in terms of host richness, as observed for the evolution of the Gp‐Rbp‐1 gene. Our analysis of the selection pressures acting on Gp‐Rbp‐1 sequences has revealed eight sites under positive selection: six are described for the first time in this study (the remaining two were found previously by Sacco ).
Clearly, we were able to detect more sites under positive selection because we used more divergent populations, and thus circumvented one of the major limitations of the dN/dS approaches that have been developed and optimized to study interspecies rather than intraspecies divergence (Goldman and Yang, 1994; Kimura, 1977; Muse and Gaut, 1994). Five of the eight sites predicted to be under positive selection were found in the B30.2 domain and mapped to predicted loops that shape surface A of the SPRY domain based on a comparison with SPRYSEC 19 (Murrin and Talbot, 2007). Similar results were obtained in a SPRYSEC protein of Globodera rostochiensis, where seven sites under positive selection were also found in these extended loops (Rehman ). In Phytophthora sp., it has been shown that most effector genes reveal hallmarks of positive selection and that this kind of selection is implicated in adaptation to the host (Raffaele ). Thus, we hypothesized that the eight sites in Gp‐Rbp‐1 and, more specifically, the five sites in the loops shaping the hypervariable surface A represent key positions that may influence the outcome of the interaction with host proteins, such as the resistance protein GPA2. This view is supported by the fact that P/S187, which was identified by all the methods used, has been shown previously to be an important variation that determines whether or not Gp‐RBP‐1 is recognized by GPA2 in tobacco. However, none of these other sites appear to confer the ability to trigger an HR to the S‐187 Gp‐Rbp‐1 variants when expressed with Gpa2. The P/S polymorphism observed at position 187 therefore remains the only variation which plays a significant and direct role in the Gp‐Rbp‐1–Gpa2 interaction. It is still possible that the seven other sites identified in this study quantitatively affect the Gpa2–P‐187 Gp‐Rbp‐1 interaction, but they may also simply be involved in interactions with other plant host proteins.
Interestingly, as for the D383 population, only avirulent Gp‐Rbp‐1 variants (P‐187) were recovered from the European population Luffness and the New Zealand population Pukekoe. This result is surprising because potatoes harbouring the Gpa2 resistance gene are susceptible to these G. pallida populations (E. Grenier, unpublished results). Thus, Gpa2‐mediated resistance in potato to G. pallida nematodes is not only explained by the nature of the amino acid site at position 187 in Gp‐Rbp‐1. The absence of virulent Gp‐Rbp‐1 variants (S‐187) in the Luffness and Pukekoe populations in our dataset could be explained by: (i) the number of sequences, which may not have been sufficiently large to detect the full extent of sequence mixtures within each of these two populations; or (ii) the presence of another effector in the Luffness and Pukekoe populations (but which is absent from D383) which suppresses the effector‐triggered immunity induced when Gpa2 recognizes an avirulent Gp‐Rbp‐1 variant.

EXPERIMENTAL PROCEDURES

Nematode populations

Twenty‐two G. pallida populations were analysed: eight from Europe (Duddingston, Pukekoe, Rookmaker, Guiclan, Luffness, Ouessant, Chavornay and D383); 13 from Peru [following Picard's classification (Picard ): clade I: P286, P297, P298, P308; clade II: P252, P235; clade III: P205, P212; clade IV: P38; clade V: Chocon, Otuzco, Huamachuco; and unclassified: P309]; and one from Chile (Tierra del Fuego).

Gp‐Rbp‐1 complementary DNAs

All of the populations used in this study come from sampling carried out in cultivated potato fields. For 19 populations (not including Rookmaker, Guiclan and D383), juveniles (J2s) were obtained by soaking cysts in potato root diffusates (of the potato cultivar Désirée) in the dark at room temperature for 1 week. Ten J2s were crushed on a glass slide, using a Pasteur pipette with a sealed end, in 10 µL of RNase‐free water and frozen in liquid nitrogen. Reverse transcription (RT) was performed using the ‘SuperScript™ III One‐Step RT‐PCR System with Platinum® Taq High Fidelity’ kit (Invitrogen, Glasgow, UK). One microlitre of 3′ rapid amplification of cDNA ends (RACE) adapter primer at 10 µM and 1 µL of deoxynucleoside triphosphate (dNTP) at 10 mM were added to each sample. Samples were then heated at 65 °C for 5 min and cooled immediately on ice. A mixture of 4 µL of 5× polymerase chain reaction (PCR) buffer, 2 µL of dithiothreitol (DTT) at 0.1 M, 2 µL of MgCl2 at 25 mM and 1 µL of RNasin was heated at 42 °C and added to each sample. The reaction was incubated at 42 °C for 2 min and 1 µL of Superscript III was added, followed by a final incubation step at 42 °C for 1 h. Gp‐Rbp‐1 sequences were amplified by PCR using the RT reactions as a template. PCRs were carried out in a final volume of 25 µL using 3 µL of the RT reaction, 1 µM of the IC5‐4 forward primer (5′‐TTTTATTTGCCTCAAAATGCGC‐3′), 1 µM of the IC5‐11 reverse primer (5′‐ACAGCAAACCCATCATAAATTCTC‐3′), 0.2 mM of each dNTP, 1 unit of Taq polymerase (Promega, Madison, WI, USA), 2 mM MgCl2 and 5 µL of the provided 10× Ex Taq buffer. The PCR protocol was 5 min at 98 °C, followed by 35 cycles of 10 s at 98 °C, 30 s at 51 °C and 1 min 45 s at 72 °C, and a final step of 10 min at 72 °C.

Gp‐Rbp‐1 cloning, sequencing and sequence alignments

PCR products were cloned using the Strataclone™ PCR cloning kit (Agilent Technologies, Palo Alto, CA, USA). Eight to twelve transformed colonies per cloning were picked and used to amplify the insert. PCRs were performed in a final volume of 30 µL, containing 0.5 µM of M13 forward and reverse primers, 0.2 mM of each dNTP, 0.75 unit of Taq polymerase (Promega), 1.5 mM MgCl2 and 6 µL of the provided buffer (Promega). The PCR protocol started with 5 min at 95 °C, followed by 30 amplification cycles of 1 min at 92 °C, 1 min at 55 °C and 3 min at 72 °C, and a final step of 5 min at 72 °C. PCR products were sequenced by Genoscreen using the M13‐20 forward primer as sequencing primer. When possible, ambiguous nucleotides were manually corrected based on the chromatograms. New Gp‐Rbp‐1 sequences analysed in this study were deposited in GenBank/EMBL (GenBank: JF933771–JF933897, Table S1). We completed our dataset with 33 sequences already available in GenBank/EMBL databases: 12 sequences from the three populations D383, Rookmaker and Guiclan [GenBank: EF423893–EF423896 (D383), EF423897–EF423902 (Rookmaker), FJ882986.1 and EU982199.1 (Guiclan)] and 21 sequences from seven populations already represented in our dataset [AM491352–AM491355 (Chavornay), FJ392678.1 and FJ392677.1 (Chavornay), FJ882983.1–FJ882985.1 (Pukekoe), EU982198.1 (Duddingston), FJ882983, FJ882997 and FJ882998 (Huamachuco), FJ882987.1–FJ882989.1 (P286), FJ882990.1–FJ882992.1 (P38), FJ882993.1 and FJ882995.1 (Otuzco)]. As such, the final dataset comprised DNA sequences from 22 G. pallida populations. DNA sequences, once translated into protein sequences, were aligned using Clustal W. The alignment of the different length repeats in the PRY motif was manually corrected. Motif detection in translated DNA sequences was performed using SMART (Simple Modular Architecture Research Tool) (Schultz ) and InterProscan (Zdobnov and Apweiler, 2001).

Phylogenetic and evolutionary analyses

Phylogenetic analysis was performed using the DARwin program (Perrier and Jacquemoud‐Collet, 2006). A dissimilarity matrix was calculated using the Kimura model (Kimura, 1980), with an 80% minimal proportion of valid data required for each unit pair and 1000‐replicate bootstrapping. A factorial analysis was carried out based on the previously calculated dissimilarity matrix. The diversity indicator π and its standard deviation for each G. pallida Peruvian population were estimated with DnaSP v5 software (Librado and Rozas, 2009). This indicator, corresponding to the average number of nucleotide differences per site between two sequences, was used to compare nematode population diversity according to latitude. Evaluation of the intragroup and intergroup/clade variability was performed by an AMOVA using Arlequin 3.1 software (Excoffier ). The distance matrix between the different clades/groups was estimated with MEGA version 4 software (Tamura ) using a Kimura two‐parameter model (Kimura, 1980) with a gamma parameter of 0.45. Nucleotide sites with more than 20% gap/missing data were not considered in this calculation. To evaluate the selective pressures on Gp‐Rbp‐1, we used the ratio of nonsynonymous to synonymous substitution rates per site (ω = dN/dS), estimated by site‐specific models implemented in the PAML package version 3.14 (Yang, 1997). The codeml program of PAML assigns a likelihood score to models for selection. A significantly higher likelihood for a model incorporating positive selection (M2 or M8) than for the corresponding null model without positive selection (M1 or M7) is evidence for positive selection. We assessed significance by comparing twice the log‐likelihood difference between the positive selection model and the null model (2Δl) with a χ² distribution [likelihood ratio test (LRT); Table 2].
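The two sequence statistics described above can be illustrated with a minimal Python sketch. It computes the average number of nucleotide differences per site between pairs of aligned sequences and the plain Kimura two‐parameter distance. This is not the DnaSP or MEGA implementation: the function names are hypothetical, gaps and ambiguous bases are assumed to be skipped pair by pair, and the gamma correction (parameter 0.45) used in the study is deliberately omitted.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def pairwise_counts(seq_a, seq_b):
    """Return (compared sites, transitions, transversions) for two aligned sequences.

    Assumption: columns with a gap or ambiguous base in either sequence are skipped.
    """
    valid = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        valid += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if valid == 0:
        raise ValueError("no comparable sites between the two sequences")
    return valid, transitions, transversions


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance without gamma correction (simplification)."""
    n, ts, tv = pairwise_counts(seq_a, seq_b)
    p, q = ts / n, tv / n
    # The formula is undefined for very divergent pairs (log of a non-positive value).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def nucleotide_diversity(seqs):
    """Average proportion of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = 0.0
    for a, b in pairs:
        n, ts, tv = pairwise_counts(a, b)
        total += (ts + tv) / n
    return total / len(pairs)


if __name__ == "__main__":
    # Toy sequences for illustration only, not taken from the dataset.
    toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC"]
    print(nucleotide_diversity(toy))
    print(k2p_distance(toy[0], toy[1]))
```

Skipping gapped columns pair by pair keeps the sketch short. A complete‐deletion scheme, or the 20% missing‐data threshold described above, would require filtering alignment columns before these functions are called.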
To identify the sites under positive selection, we used different methods, such as Bayes empirical Bayes implemented in codeml, which calculates the posterior probabilities that each site falls into a different ω class, SLAC (Kosakovsky Pond and Frost, 2005) and FEL (Kosakovsky Pond and Frost, 2005), both available through the DataMonkey web interface (Delport ).
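The likelihood ratio test used to compare the codeml site models can be sketched as follows. The sketch assumes two degrees of freedom, which is the usual choice for the M1 versus M2 and M7 versus M8 comparisons; with two degrees of freedom the χ² tail probability has the closed form exp(−x/2), so no statistics library is needed. The function name and the log‐likelihood values are hypothetical and are not taken from Table 2.

```python
import math


def lrt_two_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by two parameters.

    lnl_null: log-likelihood of the null model (e.g. M1 or M7).
    lnl_alt:  log-likelihood of the positive selection model (e.g. M2 or M8).
    Returns the statistic 2*(lnl_alt - lnl_null) and its chi-square p-value (df = 2).
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    # Numerical noise can make the statistic slightly negative; clamp it at zero.
    statistic = max(statistic, 0.0)
    # Survival function of the chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p = lrt_two_df(lnl_null=-1000.0, lnl_alt=-990.0)
    print(f"2*delta_lnL = {stat:.2f}, p = {p:.3g}")
```

A small p‐value indicates that the model allowing a class of sites with ω > 1 fits significantly better than its null model. The individual sites are then identified separately, for instance from the Bayes empirical Bayes posterior probabilities.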

Plasmid construction and transient expression

For the generation of Gp‐Rbp‐1 expression clones, the different inserts were ligated into the 5′XbaI and 3′BamHI sites of the pBIN61 binary vector series. For this, Gp‐Rbp‐1 clones Otuz‐1‐JC (JF933857), P309‐5‐JC (JF933824), P308‐1‐JC (JF933893), Choc‐10‐JC (JF933850), Guic‐4‐DF (EU982199.1), Choc‐7‐JC (JF933852), Oues‐11‐JC (JF933791), Huam‐16‐DF (FJ882996.1) and Choc‐9‐JC (JF933854) were amplified with the primers GpaRBPMforXba (5′‐CTCTAGACCATGGAGTCGCCAAAACCAAAC‐3′) and Rbp1stopRev (5′‐GGATCCGCAAACCCATCATAAATTCTCG‐3′) in order to remove the SP domain and to add XbaI and BamHI restriction sites at the 5′ and 3′ ends of the PCR product, respectively. After purification on agarose gel, PCR products were first ligated into a pSC‐A‐amp/kan vector (Strataclone PCR Cloning kit), digested with XbaI and BamHI restriction enzymes and then ligated into a pBIN61 vector. Nicotiana benthamiana plants were germinated and grown in a growth chamber maintained at 20 °C. For the transient expression of proteins, the plants were infiltrated by syringe with Agrobacterium tumefaciens strain C58C1 carrying the virulence plasmid pCH32 and the appropriate pBIN61 binary expression vector. Agrobacterium cultures were diluted to an optical density at 600 nm (OD600) = 1 and co‐infiltrated at a final OD600 = 0.5. Plants were transferred to a growth chamber maintained with 16‐h light and 8‐h darkness at 20 °C for 3–5 days. All experiments were repeated twice on at least two leaves of two different plants.

Fig. S1 Agrobacterium transient expression assay using nine S‐187 Gp‐Rbp‐1 haplotypes. Gp‐Rbp‐1 variants cloned into pBIN61 were transiently expressed via agro‐infiltration in Nicotiana benthamiana leaves in combination with either the empty vector or the Gpa2 potato resistance gene. Hypersensitive response (HR) phenotypes obtained 4 days after agro‐infiltration are shown. The Gp‐Rbp‐1 variant Rook‐6 (EF423902.1), co‐expressed with Gpa2, was used as a positive control. Numbers 0–9 represent the Gp‐Rbp‐1 variants Rook‐6, Otuz‐1‐JC, Choc‐10‐JC, P309‐5‐JC, P308‐1‐JC, Guic‐4‐DF, Choc‐7‐JC, Choc‐9‐JC, Huam‐16‐DF and Oues‐11‐JC, respectively; (A) and (EV) indicate whether the Gpa2 potato resistance gene or the empty vector, respectively, was co‐infiltrated.

Table S1 Haplotypes and GenBank accession numbers of the 157 Gp‐Rbp‐1 sequences. Haplotypes were defined solely on the basis of the amino acid combination at the eight sites found to be under positive selection. Each line corresponds to one sequence with a specific amino acid at positions 37, 100, 102, 103, 119, 184, 187 and 248. Haplotype 1 was the most frequent haplotype and haplotype 33 was the rarest haplotype.
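The haplotype definition used for Table S1 (the amino acid combination at the eight positively selected sites) can be sketched in Python as below. The sketch assumes that the listed positions are 1‐based columns of the protein alignment. The function names and the toy sequences are hypothetical and serve only to illustrate the grouping.

```python
# Positions of the eight sites under positive selection, as listed for Table S1.
POSITIVELY_SELECTED_SITES = (37, 100, 102, 103, 119, 184, 187, 248)


def haplotype_key(aligned_protein, sites=POSITIVELY_SELECTED_SITES):
    """Return the residues found at the selected sites as a short string.

    Assumption: positions are 1-based columns of the aligned protein sequence.
    """
    return "".join(aligned_protein[pos - 1] for pos in sites)


def group_by_haplotype(records):
    """Group sequence names by their residue combination at the selected sites."""
    groups = {}
    for name, sequence in records.items():
        groups.setdefault(haplotype_key(sequence), []).append(name)
    return groups


if __name__ == "__main__":
    # Toy aligned sequences (all alanine except position 187), for illustration only.
    base = ["A"] * 250
    avirulent = base.copy()
    avirulent[186] = "P"  # P-187 variant
    virulent = base.copy()
    virulent[186] = "S"   # S-187 variant
    records = {
        "toy_seq_1": "".join(avirulent),
        "toy_seq_2": "".join(virulent),
        "toy_seq_3": "".join(avirulent),
    }
    for key, names in group_by_haplotype(records).items():
        print(key, names)
```

Sorting the resulting groups by size would reproduce a frequency ranking of the kind used to number the haplotypes, with the most frequent combination first.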

27 in total

1. Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology.

Authors: Wayne Delport; Art F Y Poon; Simon D W Frost; Sergei L Kosakovsky Pond
Journal: Bioinformatics Date: 2010-07-29 Impact factor: 6.937

Review 2. Alternative splicing in C. elegans.

Authors: Alan M Zahler
Journal: WormBook Date: 2005-09-26

3. Origin and genetic diversity of Western European populations of the potato cyst nematode (Globodera pallida) inferred from mitochondrial sequences and microsatellite loci.

Authors: O Plantard; D Picard; S Valette; M Scurrah; E Grenier; D Mugniéry
Journal: Mol Ecol Date: 2008-04-10 Impact factor: 6.185

4. SMART, a simple modular architecture research tool: identification of signaling domains.

Authors: J Schultz; F Milpetz; P Bork; C P Ponting
Journal: Proc Natl Acad Sci U S A Date: 1998-05-26 Impact factor: 11.205

5. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome.

Authors: S V Muse; B S Gaut
Journal: Mol Biol Evol Date: 1994-09 Impact factor: 16.240

6. A codon-based model of nucleotide substitution for protein-coding DNA sequences.

Authors: N Goldman; Z Yang
Journal: Mol Biol Evol Date: 1994-09 Impact factor: 16.240

7. Genome evolution following host jumps in the Irish potato famine pathogen lineage.

Authors: Sylvain Raffaele; Rhys A Farrer; Liliana M Cano; David J Studholme; Daniel MacLean; Marco Thines; Rays H Y Jiang; Michael C Zody; Sridhara G Kunjeti; Nicole M Donofrio; Blake C Meyers; Chad Nusbaum; Sophien Kamoun
Journal: Science Date: 2010-12-10 Impact factor: 47.728

8. Identification and functional characterization of effectors in expressed sequence tags from various life cycle stages of the potato cyst nematode Globodera pallida.

Authors: John T Jones; Amar Kumar; Liliya A Pylypenko; Amarnath Thirugnanasambandam; Lydia Castelli; Sean Chapman; Peter J A Cock; Eric Grenier; Catherine J Lilley; Mark S Phillips; Vivian C Blok
Journal: Mol Plant Pathol Date: 2009-11 Impact factor: 5.663

Review 9. Parasitism proteins in nematode-plant interactions.

Authors: Eric L Davis; Richard S Hussey; Melissa G Mitchum; Thomas J Baum
Journal: Curr Opin Plant Biol Date: 2008-05-20 Impact factor: 7.834

10. The cyst nematode SPRYSEC protein RBP-1 elicits Gpa2- and RanGAP2-dependent plant cell death.

Authors: Melanie Ann Sacco; Kamila Koropacka; Eric Grenier; Marianne J Jaubert; Alexandra Blanchard; Aska Goverse; Geert Smant; Peter Moffett
Journal: PLoS Pathog Date: 2009-08-28 Impact factor: 6.823
