Distinctively variable sequence-based nuclear DNA markers for multilocus phylogeography of the soybean- and rice-infecting fungal pathogen Rhizoctonia solani AG-1 IA.

Maisa B Ciampi, Liane Rosewich Gale, Eliana G de Macedo Lemos, Paulo C Ceresini.

Abstract

A series of multilocus sequence-based nuclear DNA markers was developed to infer the phylogeographical history of the basidiomycetous fungal pathogen Rhizoctonia solani AG-1 IA, which infects rice and soybean worldwide. The strategy was based on sequencing cloned genomic DNA fragments (previously used as RFLP probes) and subsequently screening fungal isolates to detect single nucleotide polymorphisms (SNPs). Ten primer pairs were designed based on these sequences, which resulted in PCR amplification of products of 200-320 bp and polymorphic sequences in all markers analyzed. By direct sequencing we identified both homokaryon and heterokaryon (i.e. dikaryon) isolates at each marker. Cloning the PCR products effectively resolved the allelic phase of heterokaryotic isolates. Information content varied among markers from 0.5 to 5.9 mutations per 100 bp. Thus, the former codominant RFLP probes were successfully converted into six distinctively variable sequence-based nuclear DNA markers. Rather than discarding loci with low polymorphism, the combination of these distinctively variable anonymous nuclear markers would constitute an asset for the unbiased estimation of phylogeographical parameters such as population sizes and divergence times, providing a more reliable history of the processes that shaped the current population structure of R. solani AG-1 IA.

Keywords:  allelic discrimination; multilocus genotyping; polymorphisms; primer design

Year:  2009        PMID: 21637462      PMCID: PMC3036909          DOI: 10.1590/S1415-47572009005000063

Source DB:  PubMed          Journal:  Genet Mol Biol        ISSN: 1415-4757            Impact factor:   1.771


Classical analyses of the distribution of genetic diversity within and among populations have been used to identify patterns of migration and to reveal cryptic recombination in Rhizoctonia solani (Rosewich ; Ceresini , 2007; Ciampi , 2008; Linde ; Bernardes de Assis et al., 2008), but information on global phylogeography does not exist for any Rhizoctonia pathosystem. Phylogeography is the study of the historical processes responsible for the contemporary geographic distributions of individuals. Past events that can be inferred include population expansion, population bottlenecks, vicariance and migration (Karl and Avise, 1993; Avise, 2000). Classical population genetic studies have used molecular markers such as RAPD, ISSR, RFLP, and more recently microsatellite loci. However, these molecular markers are not fully suitable for studying the phylogeography of the fungus. A suitable marker would enable the implementation of the genealogical approach and the application of coalescent and phylogenetic tools to population-level questions (Brito and Edwards, 2008). Sequence variation from several stretches of anonymous DNA regions has been suggested as the marker of choice for inferring the phylogeographical history of species, because such regions contain multiple linked single nucleotide polymorphisms (SNPs), which are essential for constructing gene genealogies (Karl and Avise, 1993; Brito and Edwards, 2008). SNPs have simple patterns of variation, the potential for automated detection, low mutation rates (about 10⁻⁸ to 10⁻⁹), and thus low levels of homoplasy (Brito and Edwards, 2008). In addition, many more tests for elucidating population parameters and historical demography (e.g., calculating deviations from neutrality, population size changes, divergence times, and recombination) exist for data derived from sequence-based markers than for any other marker (Brumfield ).
As the cost of high-throughput sequencing continues to fall, analysis of nuclear DNA sequence variation is becoming more convenient and appropriate for phylogeography, population genetics, and phylogenetic studies (Zhang and Hewitt, 2003; Hayashi ). The aim of this study was to develop a series of anonymous nuclear DNA sequence-based markers, based on original RFLP loci (Rosewich ), that detect multiple SNPs and are suitable for phylogeographic studies of the rice- and soybean-infecting fungus R. solani AG-1 IA. Our hypothesis was that these anonymous nuclear markers are distinctively variable, and that their combination would constitute an asset for the unbiased estimation of phylogeographical parameters such as population sizes and divergence times. We sampled 14 soybean-infecting and four rice-infecting R. solani AG-1 IA isolates (Table 1), whose anastomosis grouping and pathogenicity were determined previously (Fenille, 2001; Meyer, 2002; Costa-Souza ). These isolates represent distinct ITS-5.8S rDNA haplotypes detected in Brazil (Ciampi ). We developed seven sequencing markers based on seven pUC18 cloning vectors containing genomic DNA fragments previously used as RFLP probes (Rosewich ). These probes were considered suitable for genotyping R. solani AG-1 IA populations in the United States because they were polymorphic and also allowed allelic discrimination in heterokaryons (Rosewich ). Plasmids containing the fungal genomic sequences were sequenced with M13 vector primers. Chromatograms were assembled with SEQUENCHER v. 4.6 (Gene Codes Corporation) and a consensus sequence for each probe was computed from both forward and reverse sequences. Based on the consensus sequences, ten primer pairs were designed (primers ranging from 20 to 22 bp, Table 2) to amplify each specific locus, to sequence multiple loci, and to screen isolates for SNPs at each locus. Using the PRIMER3 RELEASE 1.0 software (Rozen and Skaletsky, 2000), all primers were designed to generate PCR products of 200-320 bp.
Primers named "L" were designed to amplify a fragment from the 5'-end of the respective clone sequence and primers named "R" to amplify a fragment from the 3'-end.
Table 1

Rhizoctonia solani AG-1 IA isolates used in this study.

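| Isolate | Host | Source | Origin | ITS haplotype¹ | GenBank accession number |
|---|---|---|---|---|---|
| 3F1 | rice cv. Epagri 108 | A.S. Prabhu | Lagoa da Confusão, TO | 5 | DQ173049.1 |
| 3F6 | rice cv. Rio Formoso | | | 5 | DQ173050.1 |
| 4F1 | rice cv. Epagri 108 | | | 5 | DQ173051.1 |
| 9F1 | | | | 5 | DQ173052.1 |
| SJ13 | soybean cv. Garça Branca | R.C. Fenille | Lucas do Rio Verde, MT | 22 | DQ173053.1 |
| SJ15 | | | | 20 | DQ173055.1 |
| SJ16 | | | | 14 | AY270010.1 |
| SJ19 | | | | 12 | AY270013.1 |
| SJ28 | soybean cv. Xingu | | | 23 | AY270006.1 |
| SJ31 | | | | 1 | DQ173058.1 |
| SJ34 | soybean cv. FT-108 | | | 19 | AY270007.1 |
| SJ36 | | | | 13 | DQ173060.1 |
| SJ40 | | | | 10 | DQ173061.1 |
| SJ44 | | | | 2 | DQ173062.1 |
| SJ47 | | | | 9 | DQ173063.1 |
| SJ53 | | | | 17 | DQ173065.1 |
| SJ93 | soybean | M.C. Meyer | Pedro Afonso, TO | 18 | DQ173068.1 |
| SJ129 | soybean cv. Sambaiba | | Balsas, MA | 16 | DQ173071.1 |

¹ ITS-5.8S haplotypes characterized by Ciampi .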

Table 2

Characteristics of ten nuclear DNA sequencing markers developed for Rhizoctonia solani AG-1 IA.

| Locus | Product size (bp) | Primer pair | Primer sequence (5'-3') | Tm | GC% |
|---|---|---|---|---|---|
| R44L | 303 | R44LL | AGACGTACTCTGTCCAGACCAA | 58.9 | 50.0 |
| | | R44LR | GAATAGGTTTCTGCCCTCTTCG | 61.4 | 50.0 |
| R61L | 281 | R61LL | GGACCTTGGCTTAGGAAAGAAG | 60.6 | 50.0 |
| | | R61LR | AGTGACGCTTGCTCAGACTAGG | 61.1 | 54.6 |
| R61R | 300 | R61RL | ATCGCAAGAAACCAGACTGC | 60.4 | 50.0 |
| | | R61RR | CGAATATCGCCCATCGTACT | 59.9 | 50.0 |
| R68L | 303 | R68LL | AGACTGTTGACTGGTGTGATCG | 60.2 | 50.0 |
| | | R68LR | CAGCGCTGCGTACTACAGCTA | 61.8 | 57.1 |
| R78L | 195 | R78LL | ATATGGCACCTGACCTCGAC | 60 | 55.0 |
| | | R78LR | CGAGTTTGCCCATACTTGGT | 60 | 50.0 |
| R111R | 241 | R111RL | GTGAGCGCCAGACAAGAGATA | 60.6 | 52.4 |
| | | R111RR | ATTCCCAAGTCAGCAGCAGT | 59.9 | 50.0 |
| R116L | 314 | R116LL | CACAGATCCAGAGGTTGTGC | 59.3 | 55.0 |
| | | R116LR | TGCTTCCAGCGTACATTCTG | 60 | 50.0 |
| R116R | 223 | R116RL | CGTTAGTATCGAGGTAGCCACA | 59.3 | 50.0 |
| | | R116RR | GACCGTAGACAGGAGAAGATCG | 60.3 | 54.6 |
| R148L | 320 | R148LL | CCGTCCGTTATCCGACTTACTA | 60.3 | 50.0 |
| | | R148LR | CCGTCCGTTATCCGACTTACTA | 60.4 | 50.0 |
| R148R | 201 | R148RL | AGCAGCATGCCGAGTTGATA | 61.9 | 50.0 |
| | | R148RR | GTCGGTATGTCACAGACGAATG | 60.4 | 50.0 |
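The GC% column of Table 2 can be re-derived directly from the primer sequences. The following is a minimal sketch of that calculation; the function name `gc_percent` is an illustrative choice, not part of PRIMER3, and the two sequences are copied from Table 2.

```python
def gc_percent(primer: str) -> float:
    """Return the GC content of a primer as a percentage, rounded to one decimal."""
    seq = primer.upper()
    gc = sum(1 for base in seq if base in "GC")
    return round(100.0 * gc / len(seq), 1)


# Two primers copied from Table 2.
primers = {
    "R44LL": "AGACGTACTCTGTCCAGACCAA",
    "R68LR": "CAGCGCTGCGTACTACAGCTA",
}

for name, seq in primers.items():
    # Prints the primer name, its length in bases, and its GC percentage.
    print(name, len(seq), gc_percent(seq))
```

The same function can be run over the remaining primers to check their lengths (20 to 22 bases) and GC values against the table.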
A preliminary study to assess the efficacy of the new primers in PCR amplification was carried out on a sub-sample of three soybean-infecting R. solani AG-1 IA isolates (SJ13, SJ19, and SJ36) and one rice-infecting isolate (3F6) (Table 1). Each primer pair was also tested on the original plasmid clone. PCR amplifications were performed separately for each locus in a 20 μL final volume. The reaction mixture contained 5 to 15 ng genomic DNA, 2 μL 10x PCR buffer, 0.4 mM dNTP mixture, 10 pmol of each primer of the locus-specific pair, and 1 U of Taq polymerase. The initial denaturation step was done at 96 °C for 2 min, followed by 35 cycles of 96 °C for 1 min, 60 °C for 1 min, and 72 °C for 1 min, with a final elongation step at 72 °C for 5 min. The amplicons were then sequenced and surveyed for SNPs among the four isolates. Markers with adequate amplification efficacy for all four initial isolates were selected to amplify all 18 fungal isolates listed in Table 1, using the PCR conditions described above. In this manner, a set of markers for genotyping R. solani AG-1 IA isolates was developed by multilocus sequencing. To separate distinct alleles within heterokaryons, PCR products showing one or more double peaks in both sequencing directions were cloned into a plasmid vector using the TOPO TA® cloning kit (Invitrogen). For each locus, eight clones were recovered per isolate. Plasmid DNA was extracted following a standard protocol (Sambrook ), amplified and sequenced using M13 primers. Chromatograms were assembled and analyzed with SEQUENCHER v. 4.6 (Gene Codes Corporation), generating consensus sequences in FASTA format. We searched for homologous sequences in NCBI GenBank (Benson ) using BLASTn and BLASTx (Altschul ). The sequences of each locus were aligned using the CLUSTALX program (Thompson ).
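The logic of separating alleles by cloning can be illustrated with a small sketch. It assumes aligned, gap-free sequences of equal length; the function names and the toy sequences are hypothetical and are not taken from the study. A double peak in a direct read is represented here by the IUPAC ambiguity code for the two bases involved.

```python
from collections import Counter

# IUPAC codes for two-base ambiguities (double peaks in a direct read).
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def direct_read(allele_a: str, allele_b: str) -> str:
    """Simulate direct sequencing of a heterokaryon carrying two alleles."""
    if len(allele_a) != len(allele_b):
        raise ValueError("alleles must be aligned to equal length")
    return "".join(
        a if a == b else IUPAC[frozenset((a, b))]
        for a, b in zip(allele_a, allele_b)
    )


def distinct_alleles(clones):
    """Collapse cloned sequences into distinct alleles with clone counts."""
    return Counter(clones).most_common()


# Toy data: eight clones from one hypothetical heterokaryotic isolate.
clones = ["ACGTAC"] * 5 + ["ACATAT"] * 3
print(distinct_alleles(clones))        # two alleles, with 5 and 3 clones
print(direct_read("ACGTAC", "ACATAT"))  # ambiguous read before cloning
```

The direct read shows two ambiguous sites but cannot say which bases travel together; the cloned sequences recover that phase. A real pipeline would also need to handle indels and occasional polymerase errors in single clones, which this sketch ignores.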
SNP identification and characterization were performed with the CLOURE program (Kohli and Bachhawat, 2003), which highlights only the nucleotides that differ from the first sequence of the alignment. Identification of haplotypes (and of the isolates sharing each one), as well as the number and position of polymorphic sites, was done with the SNAP WORKBENCH program (Price and Carbone, 2005). Haplotype diversity (Hd) and the respective sample standard deviations were calculated according to Nei (1987). Nucleotide diversity, or the average number of differences per site between two homologous sequences (π), was also calculated according to Nei (1987). For each marker, π values were estimated as an average among all comparisons. The average number of nucleotide differences among sequences was calculated according to Tajima (1983). All measures were estimated using the program DNASP v. 4.5 (Rozas ). The consensus sequences of the seven R. solani AG-1 IA clones (probes) from Rosewich ranged from 543 to 1023 bp in size, and their respective GenBank accession numbers are EU907366-EU907372. Comparisons between these sequences and DNA sequences from NCBI GenBank did not result in any significant matches using the BLASTn tool (Altschul ). However, BLASTx (Altschul ) comparisons resulted in partial identity of most probe sequences with protein coding sequences of basidiomycetes, such as Laccaria bicolor, Coprinopsis cinerea, and Cryptococcus neoformans, and of some ascomycetes, such as Pichia guilliermondii and Phaeosphaeria nodorum. Only probe R68 showed no similarity with any sequence from GenBank (comparisons done on 2009-02-08). Of the primer combinations preliminarily tested on both the four R. solani AG-1 IA isolates and the respective clones, eight primer pairs yielded PCR products.
Even though successful PCR amplification was obtained for markers R61L, R78L, R111R, and R116R using the initial sample of four isolates, positive amplifications were not obtained when the sample was increased to 18 isolates. New primers will probably need to be designed for these loci. We identified SNPs in all loci surveyed, with polymorphism levels varying from one to 18 polymorphic sites. The highest number of SNPs was detected for marker R68L, with 18 mutations along 303 bp, while 0.5 to 4.2 mutations per 100 bp were detected in the other markers (Table 3). This locus showed both the highest nucleotide diversity (π = 0.025, versus 0.003-0.013 for the other markers) and the highest average number of nucleotide differences (k = 7.686, versus 0.545-4.176 for the others). Haplotype diversity (Hd) differed considerably among markers, varying from 0.55 (for locus R148R) to 0.94 (for locus R44L). Among the remaining markers, nucleotide diversity (π) varied from 0.003 for locus R148R to 0.013 for locus R116L. The average number of nucleotide differences (k) among the analyzed sequences was also markedly distinct, with the lowest value of 0.55 for locus R148R and the highest of 7.69 for locus R68L, which was the most polymorphic locus among the nuclear markers developed.
Table 3

Descriptive analysis of molecular variation within six nuclear DNA sequencing markers from Rhizoctonia solani AG-1 IA isolates.

| Locus | Product size (bp) | Number of isolates surveyed | Number and proportion of heterokaryotic isolates | Number of sequences analyzed¹ | Number of haplotypes detected | Number of polymorphic sites | Indels | Number of mutations/100 bp | Hd² | π³ | k⁴ | NCBI-GenBank accession number |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R44L | 303 | 16 | 14 = 0.88 | 30 | 17 | 10 | 0 | 3.3 | 0.938 ± 0.025 | 0.011 | 3.267 | EU907373-EU907402 |
| R61R | 300 | 16 | 5 = 0.31 | 21 | 10 | 10 | 1 | 3.3 | 0.900 ± 0.039 | 0.010 | 2.848 | EU907408-EU907428 |
| R68L | 303 | 16 | 4 = 0.25 | 21 | 11 | 18 | 0 | 5.9 | 0.857 ± 0.057 | 0.025 | 7.686 | EU907471-EU907491 |
| R116L | 313 | 15 | 3 = 0.20 | 18 | 12 | 13 | 2 | 4.2 | 0.922 ± 0.047 | 0.013 | 4.176 | EU907435-EU907452 |
| R148L | 320 | 4 | 3 = 0.75 | 7 | 5 | 8 | 2 | 2.5 | 0.857 ± 0.137 | 0.008 | 2.667 | EU907453-EU907459 |
| R148R | 200 | 11 | 0 | 11 | 2 | 1 | 2 | 0.5 | 0.545 ± 0.072 | 0.003 | 0.545 | EU907460-EU907470 |

¹ The total number of sequences analyzed is higher than the number of isolates surveyed because most of the individuals were heterokaryons, requiring proper separation of alleles from each heterogeneous sequence by cloning.

² Haplotype diversity (Hd) ± standard deviation, calculated according to Nei (1987).

³ Nucleotide diversity (π), or the average number of differences per site between two sequences, calculated according to Nei (1987), Eq. (10.5). For each marker, π values were estimated as the average among all comparisons.

⁴ The average number of nucleotide differences (Tajima 1983, Eq. (A3)).
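The three summary statistics in Table 3 can be sketched in a few lines. This is a simplified illustration, not the DNASP implementation: it assumes aligned sequences of equal length without gaps, uses the standard sample estimator Hd = n/(n-1) · (1 - Σ pᵢ²), takes k as the mean number of differences over all sequence pairs, and takes π as k divided by the number of sites. The toy alignment (six copies of one haplotype and five of another, differing at one site) is an assumption chosen for illustration; the actual haplotype frequencies are given in Table S1.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def mean_pairwise_differences(seqs):
    """k: average number of nucleotide differences over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """pi: k divided by the number of aligned sites (no gaps assumed)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Toy alignment: 11 sequences, 2 haplotypes, 1 polymorphic site.
toy = ["ACGTACGTAC"] * 6 + ["ACGTTCGTAC"] * 5
print(round(haplotype_diversity(toy), 3))
print(round(mean_pairwise_differences(toy), 3))
print(round(nucleotide_diversity(toy), 4))
```

With two haplotypes that differ at a single site, Hd and k coincide, which is the pattern seen in the R148R row of Table 3. The π of the toy alignment is much larger than in the table only because the toy sequences are 10 bp long rather than about 200 bp.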

Except for two loci (R44L and R68L), indels were found in all loci. Heterokaryotic isolates were detected for most loci, varying from 20% (R116L) to 88% (R44L) of the total isolates; the only exception was locus R148R, for which no heterokaryotic isolate was detected. Cloning and sequencing the PCR products proved effective in resolving the base ambiguities caused by the distinct alleles within heterokaryons. We sequenced eight clones of each isolate for each marker, and this strategy seemed to be sufficient to cover the whole allelic variation present in the fungal sample tested. According to the variation detected in isolates of the preliminary sub-sample, six markers were selected for sequencing the total sample of R. solani AG-1 IA isolates listed in Table 1. A general description of marker variation is presented in Table 3, and additional information, such as haplotype frequencies, the identification of isolates sharing each haplotype, and the polymorphic positions within the sequences, is presented in a supplementary file (Table S1, available as online content). Thus, the seven codominant RFLP probes were successfully converted into six distinctively variable sequence-based nuclear DNA markers. The BLASTx tool from NCBI detected only partial identity between the sequences of the seven RFLP probes and protein coding sequences from a few basidiomycete species. The low levels of identity (reflected by similarity with only very short fragments of such protein coding genes) suggest that these nuclear DNA fragments constitute uncharacterized anonymous regions, probably associated with non-coding regions of the R. solani AG-1 IA genome. Up to now, only five complete genomes of basidiomycetes are available: Coprinopsis cinerea (accession number NW_001885114), Phanerochaete chrysosporium (AADS00000000), Cryptococcus neoformans (AAEY00000000), Ustilago maydis (AACP00000000), and Laccaria bicolor (ABFE01000000).
The scarce genomic information for basidiomycetes in general and the current lack of public information from any Rhizoctonia genome would explain the low similarity found between the sequences of these R. solani AG-1 IA probes and the genes characterized until now. In fact, the first genome of an R. solani anastomosis group (the potato-infecting AG-3) was completed in 2008 by the J. Craig Venter Institute and North Carolina State University (funded by the US Department of Agriculture), but it is not yet publicly available for comparisons. We subsequently surveyed the frequency of multiple SNPs in each of these six sequence-based nuclear DNA markers. DNA sequence analyses of distinct R. solani AG-1 IA isolates revealed variable levels of polymorphism among markers (Table 3). We also detected variable DNA base ambiguities, typical of heterokaryons, which were efficiently resolved by cloning and sequencing the fragments amplified by PCR. In comparison to a prior multilocus genotyping system using ten microsatellite loci (Zala ), the new set of sequence-based nuclear DNA markers displayed greater power for allele discrimination in R. solani AG-1 IA. The microsatellite genotyping system indicated the occurrence of four to 10 alleles per locus in 232 soybean-infecting isolates (Ciampi ), while up to 17 alleles (haplotypes; Table 3) were identified using our sequence-based markers in the considerably smaller sample of 16 isolates used in this study. These six new sequence-based loci could therefore be employed as a source of codominant and highly polymorphic SNP markers for investigating further questions on the population structure of this important plant pathogen. The chances of finding multiple SNPs are usually highest in non-coding and intergenic regions of the genome, because they are expected to be under less stringent selection than coding regions (van Tienderen ).
The use of anonymous loci allows markers to be selected without reference to their polymorphism, a feature that some workers consider essential for providing an unbiased description of genomic variation (Brumfield ). Loci are often chosen by virtue of their polymorphism content, in part because higher polymorphism implies greater power for inferring population parameters (Epperson, 2005). SNPs might rapidly become the marker of choice for many applications in population ecology, evolution and conservation genetics, because of their potential for higher genotyping efficiency, data quality, genome-wide coverage and analytical simplicity (e.g. in modeling mutational dynamics) (Morin ). Furthermore, SNPs evolve in a manner well described by simple mutational models, such as the infinite allele sites model (Kimura and Crow, 1964). Despite the particular importance of SNPs as population genetic markers, our main goal with this research was to develop a set of sequence-based markers that could be useful and informative for studying the phylogeography of R. solani AG-1 IA, as in several recent studies that have successfully used anonymous regions to infer phylogeographic history (Dettman ; Carstens and Knowles, 2007; Ceresini ). Up to now only very few sequence-based markers were available for such purposes: ribosomal DNA genes and intergenic regions [such as the ITS-rDNA, commonly used for phylogenetics (Gonzalez ; Fenille ) and evolutionary analyses (Ciampi )], and the beta-tubulin gene (Gonzalez ). Only recently, two anonymous sequence-based nuclear DNA loci were developed from former PCR-RFLP markers (pP42F and pP89) and used for a phylogeographic study of the Solanaceae-infecting R. solani AG-3 (Ceresini ). Large-scale SNP surveys have shown considerable promise for revealing fine-scale population history, assisted by new sequencing technologies that will certainly make these markers a more viable option for studies of natural populations (Brito and Edwards, 2008).
To illustrate the application of the new markers to phylogeographical studies, we performed a nested clade analysis (NCA) for locus R44L on the haplotype network of R. solani AG-1 IA isolates, which was constructed using the statistical parsimony algorithm (Templeton ) implemented in TCS (Clement ) and is presented in Figure 1. This network was converted into a nested design following the rules of Templeton (1987) and tested for geographical association of haplotypes with GeoDis (Posada ). The analysis shows clades defined by sample origin and/or host: clade 2-1 groups only haplotypes of soybean-infecting isolates from either Mato Grosso or Maranhão State; clade 2-2 groups soybean-infecting haplotypes from Mato Grosso State; and clade 2-3 groups rice-infecting haplotypes from Tocantins State (Figure 1). Based on the NCA, a contiguous range expansion was suggested for the geographical association of clades, which is consistent with the historical dissemination of the pathogen following the expansion of rice and soybean crop areas.
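As a rough intuition for how a haplotype network is assembled, the sketch below links haplotypes that differ at exactly one site. This is a deliberate simplification and not the statistical parsimony procedure of TCS, which also computes a parsimony connection limit and infers unsampled intermediate haplotypes. The haplotype names and sequences are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of sites at which two aligned haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def one_step_edges(haplotypes: dict) -> list:
    """Return sorted pairs of haplotype names separated by one mutation."""
    return sorted(
        (name_a, name_b)
        for (name_a, seq_a), (name_b, seq_b) in combinations(haplotypes.items(), 2)
        if hamming(seq_a, seq_b) == 1
    )


# Hypothetical haplotypes forming a simple chain H1 - H2 - H3 - H4.
toy = {"H1": "ACGTAC", "H2": "ACGTAT", "H3": "ACATAT", "H4": "TCATAT"}
print(one_step_edges(toy))
```

Haplotypes separated by more than one step would remain unconnected in this sketch, whereas a statistical parsimony network would join them through inferred intermediates, drawn as unnamed dots in Figure 1.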
Figure 1

Haplotype network of Rhizoctonia solani AG-1 IA for locus R44L, constructed using the statistical parsimony algorithm (Templeton ) implemented by TCS (Clement ), where haplotypes (H1-H13) form groups represented by circles; the area of each circle refers to the relative frequency of those haplotypes in the population, and the gray tones represent their geographical origin, as shown in the legend. A dot without denomination along the network indicates a putative haplotype not sampled from the population. Probable recombinant haplotypes, identified by sequence homoplasy, were removed from the network. Squares represent the nesting design following the rules proposed by Templeton (1987), which was used to test the geographical association of haplotypes, and was implemented by GeoDis (Posada ).

Phylogeographic studies combine information about genetics and population biology, phylogenetics, molecular evolution and historical biogeography to characterize the geographic distribution of pathogen genealogical lineages (referred to as phylogeographic patterns), inferring the biogeographic, demographic, and evolutionary processes that have shaped these current patterns (Avise, 2000; Knowles and Maddison, 2002; Knowles, 2004). To construct a robust phylogeographic history based on genealogical data, genomic DNA sequences from several independent loci are needed (Knowles, 2004), considering that each DNA sequence has its own genealogy, and that the evolutionary history of an organism is the sum of multiple different gene genealogies, composing a mosaic of genealogical patterns in response to the environment (Hare, 2001; Emerson and Hewitt, 2005). We postulate that the six distinctively variable anonymous DNA regions developed in our study contain the multiple linked single nucleotide polymorphisms (SNPs) essential for constructing and comparing the multilocus gene genealogies required in any phylogeographic study.
Phylogeographic studies using genealogical data from these independent loci would provide a more reliable species history containing the phylogeographic patterns that shaped the current population structure of R. solani AG-1 IA.

Supplementary Material

The following online material is available for this article:

Table S1

Detailed description of molecular variation within six nuclear DNA sequence-based markers from Rhizoctonia solani AG-1 IA isolates. This material is available as part of the online article from http://www.scielo.br/gmb.
References (10 of 29 listed)

1. Posada D, Crandall KA, Templeton AR. GeoDis: a program for the cladistic nested analysis of the geographical distribution of genetic haplotypes. Mol Ecol. 2000-04.

2. Hayashi K, Hashimoto N, Daigen M, Ashikawa I. Development of PCR-based SNP markers for rice blast resistance genes at the Piz locus. Theor Appl Genet. 2004-01-23.

3. Kimura M, Crow JF. The number of alleles that can be maintained in a finite population. Genetics. 1964-04.

4. Price EW, Carbone I. SNAP: workbench management tool for evolutionary population genetic analysis. Bioinformatics. 2004-09-07.

5. Knowles LL. The burgeoning field of statistical phylogeography. J Evol Biol. 2004-01.

6. Carstens BC, Knowles LL. Shifting distributions and speciation: species divergence during rapid climate change. Mol Ecol. 2007-02.

7. Brito PH, Edwards SV. Multilocus phylogeography and phylogenetics using sequence-based markers. Genetica. 2008-07-24.

8. Zala M, McDonald BA, Bernardes de Assis J, Ciampi MB, Storari M, Peyer P, Ceresini PC. Highly polymorphic microsatellite loci in the rice- and maize-infecting fungal pathogen Rhizoctonia solani anastomosis group 1 IA. Mol Ecol Resour. 2008-05.

9. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997-12-15.

10. González D, Cubeta MA, Vilgalys R. Phylogenetic utility of indels within ribosomal DNA and beta-tubulin sequences from fungi in the Rhizoctonia solani species complex. Mol Phylogenet Evol. 2006-05-02.
