Yue Zhang, Xue Zhang, Yue-Hua Wang, Shi-Kang Shen.
Abstract
Transcriptome sequences generated by next-generation sequencing (NGS) technologies can be used to rapidly detect and characterize large numbers of gene-based microsatellites in plants. Rhododendron rex Lévl., a perennial woody species of the family Ericaceae, is an endangered plant of high ornamental value endemic to Southwestern China. Nevertheless, genetic and genomic information for R. rex remains unknown. In this study, we performed transcriptome sequencing of R. rex leaf samples and generated a large set of transcript sequences for functional characterization and for the development of gene-associated SSR markers. A total of 164,242 unigenes were assembled, of which 115,089 (70.07%) were successfully annotated in public databases. In addition, 15,314 potential EST-SSRs were identified; the frequency of SSRs in the R. rex unigenes was 9.32%, with an average of one EST-SSR per 5.65 kb. Di-nucleotide repeats were the most abundant type (54.63%), followed by mono- (26.03%) and tri-nucleotide (18.51%) repeats. From the SSR-containing sequences, 100 primer pairs were randomly selected, synthesized, and assessed for polymorphism. Thirty-six primer pairs revealed polymorphism among 20 individuals from four R. rex populations. A total of 197 alleles were identified, with an average of 5.472 alleles per locus. The polymorphism information content (PIC) ranged from 0.154 to 0.870, with a mean of 0.482. The newly developed EST-SSR markers exhibited high transferability (58.33-83.33%) among the six subgenera. These novel EST-SSR markers therefore provide valuable sequence resources for population structure analysis, genetic diversity analysis, and genetic resource assessment of R. rex and its related species.
Keywords: EST-SSRs; Rhododendron rex; evolutionary adaptation; genomics; transcriptome sequencing
Year: 2017 PMID: 29018469 PMCID: PMC5622969 DOI: 10.3389/fpls.2017.01664
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
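The abstract describes mining EST-SSRs from assembled unigenes but this record does not name the tool or the search thresholds. The following is a minimal sketch of perfect-repeat detection with a regular expression, not the authors' pipeline. The function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration.

```python
import re

# Assumed minimum repeat counts per motif length (1 = mono ... 6 = hexa).
# Illustrative values only; the record does not state the criteria used.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-base motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    hits.sort()
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the R. rex assembly.
    toy = "GG" + "AT" * 6 + "CC" + "A" * 10 + "T" + "CGC" * 5
    for start, motif, count in find_ssrs(toy):
        print(start, "(%s)%d" % (motif, count))
```

The primitive-motif check keeps a mono-nucleotide run such as A12 from being reported a second time as (AA)6.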
Summary of the analysis of de novo assembled EST-SSRs for R. rex.
| Raw reads | Total raw reads | 56989279 |
| Clean reads | Total clean reads | 55612495 |
| | Total clean nucleotides (nt) | 8238889450 |
| | Q20 percentage | 98.22% |
| | Q30 percentage | 94.69% |
| | GC percentage | 4% |
| Unigenes | Total sequence number | 164242 |
| | Total sequence base | 86512813 |
| | Largest | 61022 |
| | Smallest | 201 |
| | Average | 526.74 |
| | N50 (bp) | 752 |
| | N90 (bp) | 238 |
| EST-SSR | Total number of examined sequences | 164242 |
| | Total size of examined sequences (bp) | 86512813 |
| | Total number of identified SSRs | 15314 |
| | Number of SSR-containing sequences | 12188 |
| | Number of sequences containing more than one SSR | 2483 |
| | Number of SSRs present in compound formation | 1045 |
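The headline figures in the abstract can be reproduced from the counts in the summary table. The sketch below assumes that the SSR frequency is the number of SSRs divided by the number of unigenes and that the density is total unigene bases per SSR; the record does not define either quantity explicitly. The annotated-unigene count comes from the abstract, not from the table, and the variable names are illustrative.

```python
# Counts copied from the summary table and the abstract.
total_unigenes = 164_242
total_bases = 86_512_813
total_ssrs = 15_314
annotated_unigenes = 115_089  # from the abstract

# Mean unigene length in bp.
mean_length = round(total_bases / total_unigenes, 2)            # 526.74

# SSR frequency, assumed here to be SSRs per 100 unigenes.
ssr_frequency = round(100 * total_ssrs / total_unigenes, 2)     # 9.32

# Average spacing, assumed here to be kb of unigene sequence per SSR.
kb_per_ssr = round(total_bases / total_ssrs / 1000, 2)          # 5.65

# Share of unigenes annotated in public databases.
annotated_share = round(100 * annotated_unigenes / total_unigenes, 2)  # 70.07

print(mean_length, ssr_frequency, kb_per_ssr, annotated_share)
```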
Figure 1. Length distribution of all unigenes in R. rex. The x-axis represents the size of all unigenes, and the y-axis represents the number of all unigenes with a certain length.
Figure 2. Summary of GO analysis of the unigene sequences of R. rex. The y-axis on the right indicates the number of genes in a category. The y-axis on the left indicates the percentage of a specific category of genes in that main category.
Figure 3. COG analysis of the unigene sequences of R. rex. The y-axis indicates the number of unigenes in a specific functional cluster. The x-axis indicates the function class.
Figure 4. KEGG metabolic pathways of R. rex. The y-axis lists the KEGG metabolic pathways, and the x-axis shows the ratio of the number of genes in each pathway to the total number of annotated genes.
Length distribution of the EST-SSRs of R. rex based on the number of nucleotide repeat units.
| 5 | 1,777 | 69 | 18 | 7 | 1,871 | 12.21 | ||
| 6 | 1,794 | 715 | 18 | 6 | 2,533 | 16.54 | ||
| 7 | 1,566 | 317 | 2 | 2 | 1,887 | 12.32 | ||
| 8 | 1,940 | 21 | 1 | 1,962 | 12.81 | |||
| 9 | 2,086 | 1 | 1 | 2,088 | 13.63 | |||
| 10 | 1,418 | 853 | 1 | 2,272 | 14.84 | |||
| 11 | 787 | 121 | 1 | 909 | 5.94 | |||
| 12 | 509 | 6 | 515 | 3.36 | ||||
| 13 | 351 | 351 | 2.29 | |||||
| 14 | 259 | 259 | 1.69 | |||||
| 15 | 197 | 197 | 1.29 | |||||
| 16 | 120 | 120 | 0.78 | |||||
| 17 | 108 | 108 | 0.71 | |||||
| 18 | 75 | 2 | 77 | 0.5 | ||||
| 19 | 47 | 1 | 48 | 0.31 | ||||
| 20 | 42 | 1 | 43 | 0.28 | ||||
| 21 | 38 | 38 | 0.25 | |||||
| 22 | 26 | 26 | 0.17 | |||||
| 23 | 8 | 8 | 0.05 | |||||
| 24 | 1 | 1 | 2 | 0.01 | ||||
| Total | 3,986 | 8,366 | 2,835 | 89 | 20 | 18 | 15,314 | |
| Percentage (%) | 26.03 | 54.63 | 18.51 | 0.58 | 0.13 | 0.12 |
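The Percentage row follows directly from the Total row. As a quick check, the sketch below recomputes each share from the six per-type totals. The abstract confirms the mono-, di-, and tri-nucleotide labels; labelling the last three columns tetra-, penta-, and hexa-nucleotide is an assumption based on the usual ordering, because the table header did not survive.

```python
# Per-type totals from the Total row. The last three labels are assumed.
motif_totals = {
    "mono": 3986,
    "di": 8366,
    "tri": 2835,
    "tetra": 89,
    "penta": 20,
    "hexa": 18,
}

# The per-type totals should add up to the overall SSR count (15,314).
grand_total = sum(motif_totals.values())

# Share of each repeat type, in percent, rounded as in the table.
shares = {
    name: round(100 * count / grand_total, 2)
    for name, count in motif_totals.items()
}

print(grand_total)
print(shares)
```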
Characteristics of the 36 novel EST-SSR markers in R. rex.
| Rho-1 | ACCGAGTCACAGCACTCCTT | CACATTCATCCTCCCCAATC | 52 | (CGC)5 | 3 | 0.050 | 0.165 | 0.412 | 0.0003 | 0.676 | 0.331 |
| Rho-4 | TTGACGAACTGCACCAACTC | TTCAGTCACAAACCTGCATCA | 55 | (TG)6 | 3 | 0.083 | 0.153 | 0.166 | 0.0001 | 0.323 | 0.231 |
| Rho-10 | GGGAGAGAGAGGTCCTACCG | CTGCCCTTGTTTGACGATTT | 60 | (GA)7 | 11 | 0.433 | 0.685 | 0.870 | 0.0000 | 0.221 | 0.265 |
| Rho-11 | GAGGACGAGGGTGGTACAAA | AAGCCATGGAGTTGATACGG | 60 | (GCA)7 | 3 | 0.350 | 0.299 | 0.338 | 0.0000 | 0.222 | 0.135 |
| Rho-13 | AGCCCTTCCTCTGTCTCCTC | TTCGAATGGATCAAATGGGT | 58 | (CTC)5 | 4 | 0.200 | 0.160 | 0.208 | 0.9998 | 0.141 | 0 |
| Rho-14 | GAGATTCTCAACCCAACCCA | TCCAAACAGACATCCGATCA | 51 | (GA)6 | 5 | 0.263 | 0.342 | 0.490 | 0.0000 | 0.390 | 0.226 |
| Rho-15 | CAATCAAGGGGCTACCATGT | CGAAAGTTGGTGGTATCCGT | 57 | (AGG)6 | 5 | 0.700 | 0.580 | 0.618 | 0.1278 | 0.134 | 0.241 |
| Rho-17 | AGTGGACAGTGAGGTCACCC | TCGGATGAATTGCGTTGTAA | 60 | (TGT)5 | 4 | 0.588 | 0.391 | 0.520 | 0.8759ns | 0.364 | 0.578 |
| Rho-20 | TCCTGGGTCCATAACACACC | GGTCACGTGTCTGAGCGTAA | 60 | (AC)9 | 2 | 0.200 | 0.120 | 0.352 | 0.0325 | 0.736 | 0.485 |
| Rho-26 | CCTGAATCCATCCTGTCCTG | GCTGAGGGATCACCAGACAT | 56 | (GA)8 | 9 | 0.700 | 0.583 | 0.798 | 0.0024 | 0.303 | 0.176 |
| Rho-27 | GGGGTAATACCGGAGGGTAA | GTTCCCTGAAGACATGGTGG | 56 | (GA)7 | 3 | 0.200 | 0.295 | 0.406 | 0.0012 | 0.312 | 0.391 |
| Rho-30 | GGAAGTTTCGGCAGCAGTAG | CCTTCTCCCAACTCCCTTTC | 56 | (GGT)5 | 7 | 0.650 | 0.498 | 0.564 | 0.2730ns | 0.169 | 0.11 |
| Rho-33 | CCACCCTTCCCTTATCCTTG | GAGAAGTGTGGCTTTGAGGG | 56 | (CT)7 | 9 | 0.450 | 0.571 | 0.737 | 0.5345ns | 0.237 | 0.224 |
| Rho-34 | AGTGGCCTTAGGGGAAAGAA | CAACCCTTACCCACCACATC | 56 | (GGC)5 | 2 | 0.050 | 0.045 | 0.053 | 1.0000ns | 0.077 | 0.972 |
| Rho-35 | TGAATCCACCACAACAAGGA | ACTCCCCTCTCGGAAATTGT | 55 | (TA)8 | 4 | 0.333 | 0.334 | 0.473 | 0.1908ns | 0.392 | 0.161 |
| Rho-37 | CATGGAGAAGACCCACTGGT | ACCCACGCATTAACTTCAGG | 56 | (CT)8 | 9 | 0.450 | 0.698 | 0.820 | 0.0000 | 0.169 | 0.324 |
| Rho-39 | GTGGCTCAAAATACAGGGGA | CAGATGAAGGCGATGTGAGA | 58 | (GCT)5 | 5 | 0.150 | 0.355 | 0.368 | 0.0129 | 0.123 | 0.209 |
| Rho-45 | CTATGGCGGGCCTATCTGT | CATACGAAGGACGAGGTGGT | 60 | (TCC)6 | 8 | 0.400 | 0.670 | 0.700 | 0.0000 | 0.092 | 0.238 |
| Rho-48 | CTGCGTCTTTGGGTTTCTTC | CAATCCAACCCACCATTTTC | 55 | (GAG)5 | 5 | 0.113 | 0.313 | 0.333 | 0.0000 | 0.167 | 0.238 |
| Rho-51 | AAATATGTTCACCCCCACCA | GCCTGGACTGTTGGAATGTT | 58 | (CAC)5 | 5 | 0.163 | 0.262 | 0.271 | 0.0058 | 0.097 | 0.175 |
| Rho-53 | AACCAGTACAGGACGCCAAG | CTCCCGAGAAGATCAAGCAG | 56 | (GCG)5 | 3 | 0.050 | 0.345 | 0.345 | 0.0001 | 0.161 | 0.481 |
| Rho-54 | GAAATACGGAAACGGACGAA | CCTCTCTCTCTCCGCACATT | 52 | (TGTA)5 | 3 | 0.392 | 0.332 | 0.517 | 0.1424ns | 0.435 | 0.26 |
| Rho-56 | CCTCCTCTCGCATCATTCTC | CACTGCCATCTCTCACTCCA | 56 | (TCG)5 | 4 | 0.213 | 0.207 | 0.237 | 0.0001 | 0.169 | 0.11 |
| Rho-59 | AAACCTCTCGCTCTCTTCCC | GGTGTCGGTCTTCATGGTTT | 56 | (CT)7 | 9 | 0.725 | 0.623 | 0.821 | 0.1762ns | 0.260 | 0.096 |
| Rho-62 | ATATGTTGCGCGGGAGAAT | TCTCGAAGGCAAAACAGCTT | 54 | (CT)6 | 5 | 0.250 | 0.345 | 0.365 | 0.0000 | 0.101 | 0.195 |
| Rho-67 | GGTGGATCAGAAGGGACTGA | ACATGAAGATCATGGGCGAT | 55 | (CTT)5 | 8 | 0.375 | 0.501 | 0.579 | 0.0000 | 0.169 | 0.196 |
| Rho-69 | CGAATCCTCCATCAAAGCAT | ATGCAAAACTGTGACCTCCC | 57 | (TTC)5 | 5 | 0.100 | 0.355 | 0.651 | 0.0000 | 0.459 | 0.425 |
| Rho-70 | GGCTGTGAGGGAGTCAAAGA | TCTCCATTGTCGAAACCTCC | 56 | (GA)6 | 7 | 0.450 | 0.383 | 0.410 | 0.0246 | 0.162 | 0 |
| Rho-74 | ATAACGCGCAAACTAGCGTT | ATGAGGAGGAGCGCACTTTA | 58 | (CTG)5 | 6 | 0.400 | 0.455 | 0.671 | 0.0001 | 0.366 | 0.502 |
| Rho-75 | GTAAATGGGCCCGTATTCCT | CTCCATTGAGAAACCCTCCA | 54 | (TTG)6 | 4 | 0.150 | 0.120 | 0.154 | 0.9999ns | 0.158 | 0 |
| Rho-79 | TGGTTCTGTTCTCTGGCCTC | TTCCAGGATAGTGCTCCTGC | 58 | (TC)7 | 9 | 0.400 | 0.415 | 0.684 | 0.0000 | 0.428 | 0.22 |
| Rho-81 | ATGGATCGTTCTGGACGAAG | AAGGCCACTAGAAGAAGCCC | 60 | (CT)8 | 10 | 0.613 | 0.601 | 0.801 | 0.0011 | 0.275 | 0.232 |
| Rho-84 | TGACGGACTTGTGCTGGATA | GAGAAAAGGGAAGAAGGACACA | 58 | (TC)7 | 2 | 0.050 | 0.310 | 0.372 | 0.0002 | 0.373 | 0.642 |
| Rho-93 | TGCAGAGTAAAACCCTGCTTG | TAAAGTTGAGGCGGCAAAGT | 58 | (CA)7 | 3 | 0.050 | 0.085 | 0.099 | 0.0000 | 0.117 | 0.153 |
| Rho-95 | GGGGTAGGGGGATACTTTGA | GTCGACGACTTTGGTCCAGT | 55 | (GCT)6 | 8 | 0.450 | 0.395 | 0.509 | 0.0001 | 0.249 | 0.084 |
| Rho-97 | TTTGCGGTGGTGTCTGAATA | AAATCCAATGATCCATCCCA | 55 | (GA)7 | 5 | 0.425 | 0.404 | 0.631 | 0.0016 | 0.412 | 0.159 |
| Mean | | | | | 5.472 | 0.323 | 0.372 | 0.482 | | 0.268 | 0.263 |
ns, not significant; p < 0.05, significant difference; p < 0.01, highly significant difference; p < 0.001, most significant difference; r, null allele frequency.
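The record reports polymorphism information content (PIC) and the mean number of alleles per locus but does not say how they were computed. The sketch below uses the standard definitions, expected heterozygosity $H_e = 1 - \sum_i p_i^2$ and $PIC = 1 - \sum_i p_i^2 - \sum_{i<j} 2 p_i^2 p_j^2$, as an assumption about the method. The function names and the allele frequencies in the usage lines are hypothetical and are not data from the study.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies that sum to 1."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygous = sum(p * p for p in freqs)
    cross = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


# Mean alleles per locus from the abstract: 197 alleles over 36 loci.
mean_alleles = round(197 / 36, 3)  # 5.472

if __name__ == "__main__":
    # Hypothetical three-allele locus, for illustration only.
    example = [0.6, 0.3, 0.1]
    print(expected_heterozygosity(example), pic(example), mean_alleles)
```

Under these definitions PIC never exceeds expected heterozygosity for the same allele frequencies.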