Identification of massive molecular markers in Echinochloa phyllopogon using a restriction-site associated DNA approach.

Guoqi Chen1,2, Wei Zhang1,2, Jiapeng Fang1,2, Liyao Dong1,2.   

Abstract

Echinochloa phyllopogon proliferation seriously threatens rice production worldwide. We combined a restriction-site associated DNA (RAD) approach with Illumina DNA sequencing for rapid, large-scale discovery of simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers for E. phyllopogon. RAD tags were generated from the genomic DNA of two E. phyllopogon plants, and sequenced to produce 5197.7 Mb and 5242.9 Mb of high quality sequence, respectively. The GC content of E. phyllopogon was 45.8%, which is high for monocots. In total, 4710 putative SSRs were identified in 4132 contigs, which permitted the design of PCR primers for E. phyllopogon. Most repeat motifs among the SSRs identified were dinucleotide (>82%), and most of these SSRs had four motif repeats (>75%). The most frequent motif was AT, accounting for 36.3%–37.2% of SSRs, followed by AG and AC. In total, 78 putative polymorphic SSR loci were found. A total of 49,179 SNPs were discovered between the two samples of E. phyllopogon, 67.1% of which were transversions and 32.9% transitions. We used eight SSRs to study the genetic diversity of four E. phyllopogon populations collected from rice fields in China, and all eight loci tested were polymorphic.

Keywords:  Echinochloa phyllopogon; Polymorphic; RAD sequencing; SNP; SSR

Year:  2017        PMID: 30159521      PMCID: PMC6112297          DOI: 10.1016/j.pld.2017.08.004

Source DB:  PubMed          Journal:  Plant Divers        ISSN: 2468-2659


Introduction

Echinochloa phyllopogon (= Echinochloa oryzicola) proliferation seriously threatens rice production worldwide. As a C4-photosynthetic weed, E. phyllopogon is highly adapted to rice (C3-photosynthesis type) planting environments, where it causes significant rice yield loss (Holm et al., 1979, Rao et al., 2007, Yamasue, 2001). Furthermore, E. phyllopogon has evolved resistance to various herbicides in different areas (Heap, 2015). Understanding the genetic diversity of agricultural pests, such as E. phyllopogon, is important for both evolutionary and population biology, and critical for agricultural management (Sun et al., 2015). Microsatellite markers (simple sequence repeats, SSRs) and single-nucleotide polymorphisms (SNPs) are useful tools for studying genetic diversity and evolution (Zhang et al., 2011), and for developing high density genetic maps (Zhang et al., 2012). SSRs are short tandem repetitive sequences that are co-dominant, abundant, multi-allelic and uniformly distributed, and that can be detected by simple reproducible assays (Wang et al., 2015). SNPs are usually bi-allelic and characterized by low mutation rates; thus, SNPs are stable from generation to generation across the genome (Kruglyak, 1997). This stability, coupled with the abundance of SNPs, makes them very useful for both linkage and genetic diversity studies (Talukder et al., 2014). To date, there are only eight SSR markers available for E. phyllopogon (Osuna et al., 2011, Lee et al., 2015), and an even more limited number of SNPs. One promising approach to reduced-representation genomics is restriction site-associated DNA (RAD) sequencing, which sequences short DNA fragments flanking restriction enzyme cut sites, allowing orthologous sequences to be targeted across multiple samples to identify and score thousands of genetic markers (Miller et al., 2007). RAD sequencing has been used successfully to identify genome-wide SSRs (Gupta et al., 2015, Orjuela et al., 2010) and SNPs (Baird et al., 2008, Talukder et al., 2014, Vandepitte et al., 2013) in different species. In this study, we describe the generation of genomic RAD tags from E. phyllopogon plants. The RAD tags were sequenced using the Illumina platform and then annotated and categorized. These data allowed the discovery of a large number of SSR and SNP markers.

Material and methods

DNA isolation

Seeds from E. phyllopogon individuals were collected and cultivated to fruiting stage in a greenhouse at Nanjing Agricultural University. Two E. phyllopogon plants with typical characteristics were used for SSR identification. Total genomic DNA was extracted from young leaves using DNeasy Plant Mini Kits (Qiagen, USA) according to the manufacturer's protocol.

RAD library preparation, sequencing and assembly

The RAD library was constructed at Hengchuang Inc. (China), according to the protocol described by Baird et al. (2008). Briefly, genomic DNA (300 ng) was digested for 60 min at 37 °C in a 50 μL reaction containing 20 U each of SgrAI and PstI (New England Biolabs, Beverly MA, USA). Reactions were stopped by incubating at 65 °C for 20 min. The P1 adapter (a modified Illumina adapter, see Baird et al., 2008) was ligated to the products of the restriction reaction, and the “barcoding” of the various samples was achieved with a set of index nucleotides in the P1 adapter sequence. A 2.5 μL aliquot of 100 nM P1 adapter was added to each sample, along with 1 μL 10 mM ATP (Promega), 1 μL 10× NEBBuffer4, 1 μL (equivalent to 1000 U) T4 DNA ligase (Enzymatics, Inc) and 5 μL water, then incubated at room temperature for 20 min before heat inactivation (20 min at 65 °C). The reactions were then pooled and the products randomly sheared to a mean size of 500 bp using a Bioruptor (Diagenode). The material was electrophoresed through a 1.5% agarose gel, and the DNA in the range 300–800 bp was isolated using a MinElute Gel Extraction Kit (Qiagen). dsDNA ends were treated with end blunting enzymes (Enzymatics, Inc) to remove overhangs, and the samples were purified using a MinElute column (Qiagen). 3′-adenine overhangs were then added by the addition of 15 U Klenow exo− (Enzymatics), followed by incubation at 37 °C for 10 min. Following re-purification, 1 μL 10 μM P2 adapter (a modified Illumina adapter, see Baird et al., 2008) was ligated, as described above for P1. The samples were then purified as above, and eluted in a volume of 50 μL. Following quantification (Qubit fluorimeter), 20 ng were taken as the template for a 100 μL PCR containing 20 μL Phusion Master Mix (NEB), 5 μL 10 μM P1 adapter primer (Illumina), 5 μL 10 μM P2 adapter primer (Illumina) and water. The Phusion PCR settings followed product guidelines (NEB) over 18 cycles. The amplicons were gel purified: the 300–700 bp size range was excised from the gel, and the DNA concentration was adjusted to 3 ng/μL. The constructed RAD libraries were sequenced on an Illumina platform (PE150) at Hengchuang Inc. (China), following the manufacturer's protocol. To obtain clean, high quality reads, we discarded low quality raw sequences with adapter contamination or an N content >10%. We used Stacks software for RAD tag clustering for each sample (ustacks). The paired reads (Read1 and Read2) at the same restriction-enzyme RAD locus were then assembled using the ABYSS software (Catchen et al., 2011).
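The N-content criterion used for read cleaning can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names are hypothetical, the 10% limit is applied to each mate separately as an assumption, and adapter screening is not shown.

```python
def n_fraction(read):
    """Fraction of bases in `read` called as N (case-insensitive)."""
    if not read:
        return 1.0  # assumption: an empty read is treated as uninformative
    return read.upper().count("N") / len(read)


def keep_pair(read1, read2, max_n=0.10):
    """Keep a read pair only if neither mate has an N content above max_n.

    max_n=0.10 mirrors the ">10%" discard rule stated in the text; applying
    it per mate (rather than per pair) is an assumption of this sketch.
    """
    return n_fraction(read1) <= max_n and n_fraction(read2) <= max_n


# Example with made-up 10-base reads: one N in ten bases is exactly 10%.
example = keep_pair("ACGTACGTAC", "ACGTNCGTAC")
```

A pair is retained when both mates sit at or below the limit, so a read with exactly 10% N is kept and only reads above 10% are discarded.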

SSR identification

SSR motifs were identified using SSRIT software (http://www.gramene.org/db/markers/ssrtool) with default parameters (Temnykh et al., 2001). Both perfect and imperfect di-, tri-, tetra-, penta- and hexa-nucleotide motifs were targeted. Di-nucleotide motifs with at least 4 repeats and other motifs with at least 3 repeats were selected. We used Primer3 software (http://sourceforge.net/projects/primer3/) to design primers in the flanking regions of the SSR sequences (the SSR sequences themselves were not contained in the primers); duplicated primers were removed, and unique primers and their corresponding loci were retained. To analyze the frequency of SSR motifs, SSRs were first standardized (Wang et al., 2015). For example, SSRs with motifs of AT and TA were analyzed as AT, and motifs of ATG, TGA, GAT, TAC, ACT and CAT were analyzed as ATG.
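The repeat thresholds and the motif standardization described above can be sketched in Python. This is an illustrative stand-in, not SSRIT: it finds perfect repeats only, and the function names are hypothetical. The standardization shown groups a motif with its rotations and the rotations of its reverse complement and labels the group by the alphabetically first member; this is a common convention assumed here, so a group label may differ from the one used in this paper (for example, the ATG group is labelled ATC under this rule).

```python
import re

# Minimum repeat counts from the text: >= 4 for dinucleotides, >= 3 otherwise.
MIN_REPEATS = {2: 4, 3: 3, 4: 3, 5: 3, 6: 3}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_primitive(motif):
    """True if the motif is not a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k)
                   for k in range(1, n))


def find_perfect_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-base unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


def canonical_motif(motif):
    """Alphabetically first rotation of the motif or its reverse complement."""
    motif = motif.upper()
    revcomp = motif.translate(COMPLEMENT)[::-1]
    return min(s[i:] + s[:i] for s in (motif, revcomp) for i in range(len(s)))


# Made-up sequence containing (AT)4 and (AAG)3.
ssrs = find_perfect_ssrs("GGCATATATATCCAAGAAGAAGTT")
labels = [canonical_motif(motif) for _, motif, _ in ssrs]
```

Filtering out non-primitive motifs keeps a long dinucleotide run from also being reported as a tetra- or hexanucleotide SSR.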

Sequence annotation

For the contigs with SSR loci, sequence annotation and Gene Ontology analyses were further conducted. BlastN searches were performed against the Gene Ontology database (http://www.geneontology.org/), using 90% identity and a minimum alignment of 100 bp as cut-off parameters. A threshold E-value of e−15 was adopted for each annotation. The annotated sequences were assigned a function based on the Gene Ontology database (http://www.geneontology.org/); GO terms were determined with respect to cellular component, biological process and molecular function (Barchi et al., 2011).
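The cut-offs above can be expressed as a simple filter over BLAST hits. The sketch below assumes the hits are available in the standard 12-column BLAST tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), reads "e−15" as 1e-15, and uses a hypothetical function name; the source does not state how the filtering was actually carried out.

```python
import csv
import io


def filter_blast_hits(tabular_text, min_identity=90.0, min_length=100,
                      max_evalue=1e-15):
    """Keep hits meeting the identity, alignment-length and E-value cut-offs.

    Assumes 12-column tab-separated BLAST output; column positions below
    follow that layout (pident = 2, length = 3, evalue = 10).
    """
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        pident, length, evalue = float(row[2]), int(row[3]), float(row[10])
        if (pident >= min_identity and length >= min_length
                and evalue <= max_evalue):
            kept.append((row[0], row[1], pident, length, evalue))
    return kept
```

Each retained tuple holds the query contig, the subject identifier and the three values that were tested, which is enough to carry the hit forward to GO term assignment.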

SNP discovery

SNPs were detected with the Stacks pipeline: ustacks was used to build loci, cstacks to create a catalog of loci, and sstacks to match samples back against the catalog (Catchen et al., 2011). Default settings were used in Stacks.

Microsatellites amplification

To test the validity of the SSRs identified by RAD sequencing here, we used eight SSRs (Table 1) to study the genetic diversity of four E. phyllopogon populations collected from rice fields in China. We extracted total genomic DNA from four-leaf stage plants using a DNeasy Plant Mini Kit (Tiangen Biotech, Beijing, China) according to the manufacturer's instructions. The concentration and relative purity of the isolated DNA were checked using a Nanodrop ND-1000 (Thermo Scientific), and the concentration was adjusted to 30–40 ng/μL. Forward primers of SSRs were labeled with fluorescent tags (Table 1). PCR amplification was conducted in a total volume of 10 μL. The PCR mixture contained 0.2 μL of DNA, 0.4 μL of each primer (10 μM), 5 μL of 2× PCR Taq Mix (Dongsheng Biotech, China), and ddH2O to a final volume of 10 μL. The amplifications were performed using the following cycling program: initial denaturation at 94 °C for 4 min, followed by 35 cycles of 94 °C for 30 s, the locus-specific annealing temperature (Table 1) for 30 s, and 72 °C for 1 min, with a final extension step at 72 °C for 10 min. The amplification products were combined with formamide and the GeneScan-500 LIZ size standard (Applied Biosystems, Foster City, California, USA), and separated on a 3730 ABI automated sequencer (Applied Biosystems). Sample profiles were scored manually using GeneMarker v. 2.4 (Applied Biosystems).
Table 1

Characteristics of the eight primers tested for E. phyllopogon genotyping: locus name, forward (F) and reverse (R) primer sequences, motif, annealing temperature (Tm), fluorescent dye used (Fl. dye), allele size range (ASR), number of alleles amplified per sample, and number of alleles amplified among the plants of four populations sampled (Allele. total).

Marker | Sequence | Motif | ASR (bp) | Fl. dye | Tm (°C) | No. of alleles per sample: Min. | Mean | Max. | Allele. total
EG_1 | F: GCTCCTGAACTGTGTACATTCTTGC; R: TCGATTCACCCTTCAGCTTCTC | TG | 123–153 | TAM | 49 | 0 | 0.7 | 2 | 5
EG_2 | F: CATCGGATTCAGATTGAAAGGG; R: GGTCGTAGGTCTATAGTCCGTAGAGTCA | TA | 131–159 | FAM | 51.5 | 1 | 1.7 | 3 | 7
EG_301 | F: GCGTCGTCAAGTCGTTCTTCTA; R: TGTATTCAGCTGTCGTGCATGT | AT | 147–173 | TAM | 57 | 0 | 2.4 | 3 | 8
EG_302 | F: ATTCGAACACCCATCAACCAAC; R: GAAACAGAAGGGAGGTGTGCTG | ATTT | 133–293 | FAM | 57 | 1 | 2.8 | 5 | 12
EG_305 | F: AGCCGTTCCTCTAGTCGGATTTCT; R: TATTCAGCTGCCGTGCATGTAGTA | AT | 100–162 | ROX | 57 | 3 | 4.1 | 6 | 14
EG_306 | F: TAAAACAAAACGACCGGCGTAA; R: TCAATCATTTCAGCCTTCGGAT | CT | 146–167 | HEX | 57 | 1 | 1.25 | 2 | 7
EG_307 | F: AACATTGTCATCACAAATATCATCATCA; R: AATCAAGGAAGCCCCTTCACTC | ATC | 108–134 | TAM | 57 | 2 | 3.5 | 5 | 8
EG_320 | F: CAACTCATAAGACAATTCAAAGGGTTT; R: GCATCATTTAAGCATCAAAATGACA | TA | 136–153 | FAM | 57 | 2 | 3.0 | 4 | 5

FAM: 6-carboxyfluorescein, HEX: hexachloro-fluoresceine, ROX: carboxy-X-rhodamine, and TAM: 5-TAMRA (5-Carboxytetramethylrhodamine).

Data analysis

The multilocus data were transformed to a binary matrix of presence/absence of each allele for each individual, which was used for further analysis with GenAlEx 6.5 (Peakall and Smouse, 2012, Teixeira et al., 2014). The total number of alleles and the number of private alleles for each population were determined using GenAlEx 6.5, and genetic diversity was determined using GenoDive2.0b23 (Teixeira et al., 2014), according to the tutorials (www.patrickmeirmans.com/software/GenoDive.html). GenoDive allows the analysis of polyploids with unknown allele dosage (Meirmans and Van Tienderen, 2004).
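The transformation to a presence/absence matrix, and the counting of private alleles (alleles observed in a single population only), can be sketched as follows. This is an illustration rather than the GenAlEx procedure; the data layout, the function names and the example individuals, loci and allele sizes are all hypothetical.

```python
def to_binary_matrix(genotypes):
    """Convert {individual: {locus: set of allele sizes}} to a 0/1 matrix.

    Returns (columns, rows): columns is a sorted list of (locus, allele)
    pairs; rows maps each individual to a list of 0/1 values per column.
    """
    columns = sorted({(locus, allele)
                      for calls in genotypes.values()
                      for locus, alleles in calls.items()
                      for allele in alleles})
    rows = {ind: [int(allele in calls.get(locus, ()))
                  for locus, allele in columns]
            for ind, calls in genotypes.items()}
    return columns, rows


def private_alleles(genotypes, population_of):
    """Return {population: sorted (locus, allele) pairs found only there}."""
    seen = {}
    for ind, calls in genotypes.items():
        for locus, alleles in calls.items():
            for allele in alleles:
                seen.setdefault((locus, allele), set()).add(population_of[ind])
    private = {}
    for key, pops in seen.items():
        if len(pops) == 1:
            private.setdefault(next(iter(pops)), []).append(key)
    return {pop: sorted(keys) for pop, keys in private.items()}


# Hypothetical scored profiles: three individuals, two loci.
genotypes = {
    "a1": {"L1": {120, 124}, "L2": {200}},
    "a2": {"L1": {120}, "L2": {200, 204}},
    "b1": {"L1": {124, 128}, "L2": {200}},
}
population_of = {"a1": "A", "a2": "A", "b1": "B"}
columns, rows = to_binary_matrix(genotypes)
```

Scoring each allele as present or absent avoids having to know how many copies of an allele a polyploid individual carries.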

Results

Sequencing and contig assembly

The sequencing procedure generated 71.45 million reads for the two E. phyllopogon samples (Table 2). After editing/trimming, 10,440.6 Mb of high quality sequences were available, which were assembled into 37,662 contigs. Average contig lengths for the two samples were 334 and 346 bp. The GC content of E. phyllopogon was 45.8%.
Table 2

Summary statistics of the RAD tags sequencing via Illumina for E. phyllopogon.

Feature | Total
Illumina reads (million) | 71.45
Total base (million) | 10,440.6
GC% | 45.8%
Q20 (%) | 94.0%
No. of contigs | 37,662
Total length (bp) | 12,789,629
Contig length range (bp) | 200–588
Average contig length (bp) | 339.5

Identification of SSRs

A screen of the dataset identified 4710 putative SSRs that permitted PCR primer design for E. phyllopogon. Tables S1 and S2 show the motifs, number of repeats, 5′- and 3′-flanking sequences, primer sequences and annealing temperatures, PCR product sequences and potentially related genes for each SSR locus. The majority of motifs among the RAD SSRs were dinucleotide (>82%) for both samples, and 14%–15% of the SSR motifs were trinucleotide (Table 3). The majority of SSRs had four motif repeats. The abundance of SSRs decreased significantly (P < 0.01) with increasing numbers of motif repeats for E. phyllopogon (Fig. 1).
Table 3

Length distributions of SSR motifs identified for the two samples of E. phyllopogon tested.

Motif length1304
Dinucleotide | 1908 (83.0%) | 1998 (82.4%)
Trinucleotide | 329 (14.3%) | 360 (14.9%)
Tetranucleotide | 40 (1.7%) | 50 (2.1%)
Pentanucleotide | 15 (0.7%) | 13 (0.5%)
Hexanucleotide | 6 (0.3%) | 3 (0.1%)
Total | 2298 | 2424
Fig. 1

SSR motifs with different repeat numbers for the two samples of E. phyllopogon.

Nearly all (97.3%) E. phyllopogon SSR motifs consisted of dinucleotide plus trinucleotide repeats. Thus, we further analyzed dinucleotide and trinucleotide motifs. Before the analysis, SSRs were standardized. For example, SSRs with motifs of AT and TA were analyzed as AT, and motifs of ATG, TGA, GAT, TAC, ACT and CAT were analyzed as ATG. AT was the most frequent motif, accounting for 36.3%–37.2% of SSRs, followed by AG and AC (Table 4). Among the four kinds of dinucleotide motifs, CG repeats represented the lowest percentage of all SSRs (<6%). CCG was the most frequent trinucleotide motif for both samples (Table 4), accounting for about 4% of the total SSRs for E. phyllopogon. The predicted lengths of PCR products amplified by the SSR primers designed in this study are shown in Table 4.
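The percentages quoted for the motif-length classes can be recomputed from the counts in Table 3. The short Python check below is only a worked arithmetic example; the keys "sample_a" and "sample_b" are placeholder names for the first and second data columns of Table 3, not the sample labels used in the study.

```python
# Counts per motif-length class, copied from Table 3.
COUNTS = {
    "sample_a": {"di": 1908, "tri": 329, "tetra": 40, "penta": 15, "hexa": 6},
    "sample_b": {"di": 1998, "tri": 360, "tetra": 50, "penta": 13, "hexa": 3},
}


def class_percentages(counts):
    """Percentage of SSRs in each motif-length class, to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


def di_plus_tri_share(all_counts):
    """Pooled percentage of di- plus trinucleotide SSRs across samples."""
    di_tri = sum(c["di"] + c["tri"] for c in all_counts.values())
    total = sum(sum(c.values()) for c in all_counts.values())
    return round(100 * di_tri / total, 1)


per_sample = {name: class_percentages(c) for name, c in COUNTS.items()}
pooled_share = di_plus_tri_share(COUNTS)
```

Pooling both columns gives 4595 di- and trinucleotide SSRs out of 4722, which is the 97.3% stated in the text.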
Table 4

SSR motifs with a frequency > 0.5% and the ranges of PCR product length (mean length) of the relative motifs for the two samples tested for E. phyllopogon.

MotifCount (% of total SSRs)
PCR product length (average length, bp)
13041304
AT | 854 (37.2) | 880 (36.3) | 80–234 (133.5) | 80–239 (131.0)
AG | 562 (24.5) | 617 (25.5) | 80–208 (126.5) | 80–208 (127.1)
AC | 372 (16.2) | 395 (16.3) | 80–225 (130.3) | 80–234 (126.6)
CG | 120 (5.2) | 106 (4.4) | 80–204 (131.6) | 80–237 (124.1)
CCG | 99 (4.3) | 103 (4.2) | 80–172 (132.0) | 80–160 (126.9)
AAG | 45 (2.0) | 43 (1.9) | 85–159 (128.8) | 81–153 (121.0)
AAT | 28 (1.2) | 30 (1.2) | 80–160 (130.3) | 80–160 (127.5)
ACC | 27 (1.2) | 14 (0.6) | 80–157 (122.9) | 80–220 (134.2)
AAC | 25 (1.1) | 47 (1.9) | 85–155 (120.3) | 122–155 (136.9)
AGG | 24 (1.0) | 25 (1.0) | 81–188 (128.3) | 80–159 (132.3)
AGC | 23 (1.0) | 29 (1.2) | 80–157 (122.6) | 89–159 (134.2)
ACG | 22 (1.0) | 15 (0.6) | 86–160 (136.1) | 83–159 (121.4)
AGT | 22 (1.0) | 32 (1.3) | 80–160 (133.9) | 81–160 (130.8)
ATG | 14 (0.6) | 22 (0.9) | 91–160 (134.5) | 87–159 (127.9)

Note: motifs with dinucleotide plus trinucleotide contributed to 97.3% of the total SSRs for both samples. Thus motifs with length >3 were not shown in this table.

In total, 78 putative polymorphic SSR loci were found by RAD sequencing (Table 5). These 78 SSRs include 65 SSRs with dinucleotide motifs, 10 SSRs with trinucleotide motifs, two with tetranucleotide motifs and one with a pentanucleotide motif. The AT dinucleotide repeat, which accounts for 49.4% of all motifs, was the most frequent kind.
Table 5

The 78 putative polymorphic SSR loci found by RAD sequencing.

MarkerMotifPrimer_FPrimer_RMarkerMotifPrimer_FPrimer_R
EG_1TGgctcctgaactgtgtacattcttgctcgattcacccttcagcttctcEG_40GAAaacagacaaaatacaaaagaaagcacagtttttcagcatcatcctgtgg
EG_2TAcatcggattcagattgaaagggggtcgtaggtctatagtccgtagagtcaEG_41ATtcactacgaaattatcgtttatggacaagcccgctccgtgtttagattat
EG_3TAttgctttctgcaatgccaattagtccatgtggagtcagggagttEG_42TAatgggcgacaagcaagtatgatgacggacgaaggtttgaagattt
EG_4TAccgttgatgattaactcgttgattttgatggtagctacaagcgttggEG_43GAcatcctctggctgcttctctctgaatgtgagaatctccgctgct
EG_5TAttcactatgctgaaccagcagcctgagtccggtatcgctccttaEG_44GAacacctttctccatcctctggcccgctgctgctactactcttgg
EG_6ATccatggtcaagtcactttgtctgtctggatctcccaaattcatgtcEG_45TGttgtacaagcttctgagataacctgaatttcagaaactgtttgaattaggattt
EG_7AAGcatttcttaccgtcccatctgccctttttcagggagaagccactEG_46TAaaatggatatggcaaacgcatcccaagtccatcatgccaagttt
EG_8ATttttgtaggcctaacctgttgtggtttttgctatgcatgtgtctactcgEG_47ATtttgggattgtttatgaggtttgacacacggcaaaatgaccaata
EG_9TGtataacatccctttcgttgccatctgcaatgaaattcagatattcggacEG_48AAGtgctatgcatgaggagatgcagccttataccttggaggctcgct
EG_10AAGtaaattgcccaaacaagaaagaggatcggagtcccactcaacaaagtaEG_49TAaattctagtttgcgacgggttattttgagtgaatgggatcgaaaaa
EG_11TGCAagccggtgcaggaagacagaagaagggaaaaggtagtcgttggEG_50TCaaggacaaagtcgcagcgtttatgggatttggttttggcttct
EG_12CTCTCtttgaagccttttcggtcttgaaacaagcagtggaagacgaaggEG_51GCgccgggtgattaacggattagtagactagctagccagcgggttg
EG_13ATggcccaatataatatccatgccctatcaagggcagctatttgggEG_52ATaattcaacacaaccaaaggtaaaaatcaatgccatattgattctccc
EG_14ATggtggtgtgtcctgatgtgtgttgtttccttttgtttttgttttgtttcEG_53TGtcaaatggcaaagtatggaactcatcattttctcaagaagcagtggtc
EG_15TAcatgaactgttctgactccaacaacaagcattgcagctctgtcttgtEG_54ATaatattaacgtacccttgacaaatgaatttttgttggtacgtaagataaacaatc
EG_16TAtcagttgagctccatcatttgttttcactggctgttctttaccgtactEG_55AGccaagaaaccaactaagagccaaaatttgtgcatgatgtgctttgc
EG_17CGGgatagcgactcgagcgtggttctcgagcatggggagagacEG_56AGagcaagaaaccaactaagagccaaaattcgtgcatgatgtgctttg
EG_18ATGagccatattgccttgtgaccaattttccttgcgcaatttttcatEG_57TCtgaaaagccagtggacagtcaggagttcctcctgatggcaagaa
EG_19ACccttcagctgatgtaatcttggtaagtccatctctcagcacctgaaaaEG_58TCtctccctccaaactttactattcaccgctcaaaagatttgtctcgtcg
EG_20ATgaaggtcgtgcactatggtgagagcaagttgaagcaatccaaggEG_59ATcgtcaagtcgttcctctagtcgtgtattcagctgtcgtgcatgt
EG_21ATcgccgtcaagtcattcctctatcagctgccgtgcatgtagtaEG_60ATtgccagacagtccaacaagctaggccgactctatattcatattagctgac
EG_22CTcacatgatacatccgttgcgtcatcggaggagggggaagagEG_61TAaatgcagtcaggcccttgtttagcacgggcacatttcctagt
EG_23AGaaaacgccgcaaaaacaaaagcccctctaggattctcgctgttEG_62TCcttcttcctcgcctccaattcaaacaagttattacccggcgct
EG_24TAacgagcacccattatgttttggcgagatcccagagcaaagctacEG_63TAcgattgcttaagggaataaatggcaacattttactggtaatcctttcttg
EG_25CTatcaaaccccctcgaattcctgagggagagaaagctgacaggcEG_64GAtcttggctgaaaaatctatttggGacctctcccacttgaagaagca
EG_26TAttcaaaaattcgatctttgctgcaaccttttccgtggcctacctEG_65ATcccctgagcaaatttcaatcatagggacagggaaggatcttgac
EG_27GAgctcagcatctccaacgaacttcaaaccaattctgaatcgaaaagcEG_66ATttcatagaggtggtgtgtcctgatggtttccttttgtttttattatgtttc
EG_28TAgatgacgtggctagcttgcatacgtaggacgaaggatgaaaacgEG_67ATcgcacactggctgtaattggtaccgagctttcagatttactcctca
EG_29CTcctccttcctttgctgagcCctgcagcatgccctttctatttEG_68TAaatgcaaaataggacaccacggggaacccatgaataagctgcaa
EG_30GAaggtcgtgcatgggctagagcggagtagcttcacgcttcagtEG_69ATggaaattgcatctgcatcaactcccatgcagcatactaatgtgaa
EG_31TCTttgagatgatgatgcattcacttgtgggaagccatgaagaatatggEG_70ATttcgttcatttcgctctcatcattggcaatagttttcaatcttgcat
EG_32TAgtgggctcataccttaatgcccggggagccatctctcttctcatEG_71GAaggaagaaaagagaagtgaggcGcgagcacctcctctaggaatca
EG_33ATgccgtcaagtcgttcctctagtcagctgccgtgcatgtaatactEG_72TActgcgggtgacatttgtacagtgtctgaacacgttaccacaccg
EG_34TCTgatgatgatgcattcacttgagttgtggatgatgtgagaggtgatggEG_301ATgcgtcgtcaagtcgttcttctatgtattcagctgtcgtgcatgt
EG_35ATtcctctagtcggatttcttaatttgctgtattcagctgtcgtgcatgtEG_302ATTTattcgaacacccatcaaccaacgaaacagaagggaggtgtgctg
EG_36AGcatgaccatcaggcatcatctcatgaagaagctactccgccgatEG_305ATagccgttcctctagtcggatttcttattcagctgccgtgcatgtagta
EG_37TCTtcagaaacaatatgttcctcatcatcacaaatgggtcacaagacgagaaEG_306CTtaaaacaaaacgaccggcgtaatcaatcatttcagccttcggat
EG_38TGggagctggagaaactgaaggaagcacttcgttgagggctcgatagEG_307ATCaacattgtcatcacaaatatcatcatcaaatcaaggaagccccttcactc
EG_39CAgtggcatgtgaattgtttccctcaatcttacctcccaccttcccEG_320TAcaactcataagacaattcaaagggtttgcatcatttaagcatcaaaatgaca
To test the validity of the SSRs identified by RAD sequencing here, we used eight SSRs to study the genetic diversity of four E. phyllopogon populations collected from rice fields in China. We amplified 66 alleles from the eight microsatellite loci. The EG_305 primer pair amplified 14 alleles, EG_302 amplified 12 alleles, and EG_320 and EG_1 each amplified five alleles (Table 1). EG_305 amplified three to six alleles per sample, while EG_307 and EG_320 amplified two to five and two to four alleles per sample, respectively. Moreover, EG_305 amplified the most alleles on average (4.1). On average, 3.1–4.8 alleles were amplified per locus in each population (Table 6). All four populations showed private alleles, among which the populations EP13 and EP50 showed 13 and eight private alleles, respectively. The heterozygosity values of these populations ranged from 0.064 to 0.091, and their Shannon's information indices ranged from 0.087 to 0.381. Analysis of molecular variance (AMOVA) indicated that 39% of the diversity occurred among populations and 61% within populations (Table 7).
Table 6

Diversity of four populations of E. phyllopogon using eight nuclear microsatellite loci.

Population | EP13 | EP14 | EP53 | EP50 | Total
No. of alleles | 39 | 34 | 25 | 37 | 66
No. of alleles per locus | 4.875 | 4.25 | 3.125 | 4.625 | 8.25
No. of private alleles | 13 | 1 | 2 | 8 | /
Heterozygosity | 0.086 | 0.082 | 0.064 | 0.091 | 0.081
Shannon's information index | 0.381 | 0.21 | 0.087 | 0.222 | 0.225
Table 7

Analysis of molecular variance (AMOVA) showing the partitioning of genetic variation within and between regions of E. phyllopogon.

Source | df | SS | MS | Est. var. | % | P
Among Pops | 3 | 135.563 | 45.188 | 3.328 | 39 | <0.01
Within Pops | 44 | 231.250 | 5.256 | 5.256 | 61 | <0.01

df = degrees of freedom, SS = sum of squares, MS = mean squares, Est. var. = estimate of variance, % = percentage of total variation; the P-value is based on 9999 permutations.
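The entries of Table 7 are linked by simple arithmetic, which the Python sketch below works through. The relations MS = SS/df and % = component/total follow directly from the table; the among-population variance formula (MS among − MS within)/n is the standard one for a balanced design, and n = 12 individuals per population is an assumption made here (48 individuals in total follows from the within-population df of 44 with four populations, but equal sample sizes are not stated in the text). The function name is hypothetical.

```python
def amova_components(ss_among, df_among, ss_within, df_within, n_per_pop):
    """Recompute mean squares, variance components and percentages.

    n_per_pop is the (assumed equal) number of individuals per population.
    """
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within
    var_within = ms_within
    var_among = (ms_among - ms_within) / n_per_pop
    total = var_among + var_within
    return {
        "ms_among": ms_among,
        "ms_within": ms_within,
        "var_among": var_among,
        "var_within": var_within,
        "pct_among": 100 * var_among / total,
        "pct_within": 100 * var_within / total,
    }


# Values from Table 7; n_per_pop=12 is an assumption of this sketch.
result = amova_components(135.563, 3, 231.250, 44, n_per_pop=12)
```

Under the equal-size assumption the recomputed components agree with the tabulated values to within rounding, and the percentages round to 39% and 61%.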

Annotation of contigs with SSR loci

Using two E. phyllopogon individuals, we identified 4710 SSR loci in 4132 contigs, and annotated 643 contigs (Table S2). For these 643 contigs, 8631 annotations, potentially referring to 2155 unigenes, were retrieved (a given gene product can be associated with more than one annotation). Annotated E. phyllopogon sequences with SSR loci were functionally assigned and arranged into Gene Ontology (GO) slim categories (Fig. 2). GO analyses suggested that contigs with SSR loci were mostly related to metabolic processes (12.1% of the total 2155 unigenes) and cellular processes (10.5%) among biological processes; cell (9.8%), cell part (9.7%) and organelle (8.6%) among cellular components; and binding (12.6%) and catalytic activity (8.7%) among molecular functions.
Fig. 2

Functional annotation of assembled sequences with SSR loci for the two samples of E. phyllopogon based on gene ontology (GO) terms.

In total, 49,179 SNPs were discovered between the two samples of E. phyllopogon. Table S3 shows the kind, sequence and location of these SNPs. Among them, transversions (67.1% of total SNPs) were much more frequent than transitions (Fig. 3).
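The distinction between transitions and transversions can be made explicit with a small Python sketch: a transition exchanges two purines (A and G) or two pyrimidines (C and T), and every other substitution is a transversion. The function names and the example SNP list are hypothetical and are not taken from Table S3.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(allele1, allele2):
    """Return 'transition' or 'transversion' for a bi-allelic SNP."""
    a, b = allele1.upper(), allele2.upper()
    if a == b or not {a, b} <= BASES:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def summarize(snps):
    """Percentage of transitions and transversions in a list of allele pairs."""
    counts = {"transition": 0, "transversion": 0}
    for allele1, allele2 in snps:
        counts[classify_snp(allele1, allele2)] += 1
    total = sum(counts.values())
    return {kind: 100 * n / total for kind, n in counts.items()}


# Made-up example: one transition (A/G) and two transversions (A/C, G/T).
shares = summarize([("A", "G"), ("A", "C"), ("G", "T")])
```

Applied to a full SNP table, the same tally yields the kind of split summarized in Fig. 3.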
Fig. 3

Transitions and transversions occurring within a set of 49,179 E. phyllopogon SNPs.

Discussion

High GC content of E. phyllopogon genome

Higher GC content in plant genomes possibly contributes to an increased ability to adapt to various arable lands that are mainly maintained and regulated by human disturbance. Šmarda et al. (2014) studied GC content in 239 different plant genomes, finding that the GC content of monocots varied between 33.6% and 48.9%, and increased GC content was documented in species able to grow in seasonally cold and/or dry climates, which possibly indicates GC-rich DNA may confer more stability during cell freezing and desiccation. The GC content of E. phyllopogon was higher than those of many monocots such as Juncus inflexus (33.7%), Luzula badia (33.6%), Carex acutiformis (35.6%), Schoenoplectus lacustris (35.8%), Canna indica (39.7%), Oryza sativa (43.6%) and Triticum aestivum (44.7%); and only lower than those of a few Poaceae species such as Stipa calamagrostis (47.5%) and Zea mays (47.4%) (Raats et al., 2013).

Characteristics of SSR motifs of E. phyllopogon

The majority of RAD SSR motifs were dinucleotide, and the majority had four motif repeats. Gupta et al. (2015) identified SSR motifs in peanut (Arachis hypogaea) through RAD sequencing, and found that 67.6% of the motifs were dinucleotide, 14.6% were trinucleotide, 12.5% were tetranucleotide, 3.2% were pentanucleotide and 2.2% were hexanucleotide. In eggplant (Solanum melongena), by contrast, dinucleotide, trinucleotide, tetranucleotide, pentanucleotide and hexanucleotide motifs accounted for 20.4%, 37.9%, 12.8%, 18.1% and 10.9%, respectively, of all motifs with two to six nucleotides (Barchi et al., 2011). Using RAD sequencing in eggplant, Barchi et al. (2011) found that AAC was the most frequent kind of motif, accounting for 19.0% of the total SSRs, followed by AT (9.6%). Wang et al. (2015) analyzed the genomes of nine plant species from the Poaceae family, and found that among the genome SSRs of O. sativa ssp. indica, O. sativa ssp. japonica, Phyllostachys heterocycla, Sorghum bicolor and Z. mays, AT was the most frequent motif; AT was also very frequent in other Poaceae plants. To test the validity of the SSRs identified by RAD sequencing here, we used eight SSRs to study the genetic diversity of four E. phyllopogon populations collected from rice fields in China. All eight loci were polymorphic, notably so in comparison with the five SSRs that have been used for Echinochloa since 2002 (Danquah et al., 2002; Nozawa et al., 2006; Lee et al., 2015).

Potential usage of the SSRs and SNPs identified

A great number of Echinochloa species are aggressive invaders and managing crop lands requires unique strategies for each (Holm et al., 1979, Tabacchi et al., 2006). Thus, correctly identifying Echinochloa spp. is of agronomical and economic importance. The genus Echinochloa contains about 35 species that are widespread in both tropical and temperate regions and in dry or water-flooded soils (Flora of China, 2015). The taxonomy of this genus is complex, and Echinochloa species show wide variability in morphological, biological and physiological features (Danquah et al., 2002, Tabacchi et al., 2006, Vidotto et al., 2007). Conventionally, the identification of Echinochloa species has been attempted taxonomically using morphological assessment of plants, which has frequently been found to be difficult and uncertain (Tabacchi et al., 2006). Moreover, there are different taxonomic key systems for Echinochloa species, which may lead to misidentification (Flora of China, 2015, Tabacchi et al., 2006). Molecular identification of the Echinochloa species is not yet reliable and requires further study (Danquah et al., 2002, Kaya et al., 2014, Tabacchi et al., 2006). In addition, molecular markers may be very useful in studying the origin and distribution of herbicide-resistant populations (Okada et al., 2013, Osuna et al., 2011). SNPs and SSRs are ideal molecular tools for gene location and molecular breeding (Danquah et al., 2002, Gupta et al., 2015, Vandepitte et al., 2013, Zhang et al., 2011).
References

1.  SNP discovery using Paired-End RAD-tag sequencing on pooled genomic DNA of Sisymbrium austriacum (Brassicaceae).

Authors:  K Vandepitte; O Honnay; J Mergeay; P Breyne; I Roldán-Ruiz; T De Meyer
Journal:  Mol Ecol Resour       Date:  2012-12-11       Impact factor: 7.090

2.  Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential.

Authors:  S Temnykh; G DeClerck; A Lukashova; L Lipovich; S Cartinhour; S McCouch
Journal:  Genome Res       Date:  2001-08       Impact factor: 9.043

3.  Post-glacial evolution of Panicum virgatum: centers of diversity and gene pools revealed by SSR markers and cpDNA sequences.

Authors:  Yunwei Zhang; Juan E Zalapa; Andrew R Jakubowski; David L Price; Ananta Acharya; Yanling Wei; E Charles Brummer; Shawn M Kaeppler; Michael D Casler
Journal:  Genetica       Date:  2011-07-23       Impact factor: 1.082

4.  GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research--an update.

Authors:  Rod Peakall; Peter E Smouse
Journal:  Bioinformatics       Date:  2012-07-20       Impact factor: 6.937

5.  Identification of SNP and SSR markers in eggplant using RAD tag sequencing.

Authors:  Lorenzo Barchi; Sergio Lanteri; Ezio Portis; Alberto Acquadro; Giampiero Valè; Laura Toppino; Giuseppe Leonardo Rotino
Journal:  BMC Genomics       Date:  2011-06-10       Impact factor: 3.969

6.  Stacks: building and genotyping Loci de novo from short-read sequences.

Authors:  Julian M Catchen; Angel Amores; Paul Hohenlohe; William Cresko; John H Postlethwait
Journal:  G3 (Bethesda)       Date:  2011-08-01       Impact factor: 3.154

7.  Evidence for high dispersal ability and mito-nuclear discordance in the small brown planthopper, Laodelphax striatellus.

Authors:  Jing-Tao Sun; Man-Man Wang; Yan-Kai Zhang; Marie-Pierre Chapuis; Xin-Yu Jiang; Gao Hu; Xian-Ming Yang; Cheng Ge; Xiao-Feng Xue; Xiao-Yue Hong
Journal:  Sci Rep       Date:  2015-01-27       Impact factor: 4.379

8.  Genome-wide distribution comparative and composition analysis of the SSRs in Poaceae.

Authors:  Yi Wang; Chao Yang; Qiaojun Jin; Dongjie Zhou; Shuangshuang Wang; Yuanjie Yu; Long Yang
Journal:  BMC Genet       Date:  2015-02-15       Impact factor: 2.797

9.  Rapid SNP discovery and genetic mapping using sequenced RAD markers.

Authors:  Nathan A Baird; Paul D Etter; Tressa S Atwood; Mark C Currey; Anthony L Shiver; Zachary A Lewis; Eric U Selker; William A Cresko; Eric A Johnson
Journal:  PLoS One       Date:  2008-10-13       Impact factor: 3.240

10.  Evolution and spread of glyphosate resistance in Conyza canadensis in California.

Authors:  Miki Okada; Bradley D Hanson; Kurt J Hembree; Yanhui Peng; Anil Shrestha; Charles Neal Stewart; Steven D Wright; Marie Jasieniuk
Journal:  Evol Appl       Date:  2013-03-11       Impact factor: 5.183
