Benjun Hou, Suping Feng, Yaoting Wu.
Abstract
This study aimed to systematically identify and preliminarily validate simple sequence repeats (SSRs) in Hevea brasiliensis expressed sequence tag (EST) data and to provide evidence for the further development of SSR molecular markers. Defining the general SSR features of the assembled (spliced) Hevea EST sequences and developing SSR primers lays the basis for diversity analysis and variety identification in Hevea tree resources. A total of 1134 SSR loci were identified in the EST splicing sequences, distributed across 840 unigenes. The occurrence rate of SSR loci was 23.9%, and the average distance between EST-SSRs was 2.59 kb. Mononucleotide repeats were the most common motif type, accounting for 38.89%, followed by dinucleotide repeats (36.95%) and trinucleotide repeats (18.17%); all other motif types together accounted for only 5.99%. The dominant motifs among mononucleotide, dinucleotide, and trinucleotide repeats were A/T, AG/CT, and AAG/CTT, respectively. A total of 739 primer pairs were designed for the 1134 SSR loci. PCR amplification was performed on Hevea Reyan5-11, Reyan87-6-47, and PR107, and 180 primer pairs that amplified polymorphic bands were selected.
Year: 2017 PMID: 28232872 PMCID: PMC5292370 DOI: 10.1155/2017/6590902
Source DB: PubMed Journal: J Nucleic Acids ISSN: 2090-0201
Figure 1. Scheme used for data exploration and development of EST-SSR markers from Hevea brasiliensis ESTs.
Results of microsatellite search.
| Designation | Numbers |
|---|---|
| Total number of sequences examined | 3519 |
| Total size of examined sequences (bp) | 2942162 |
| Number of SSR containing sequences | 840 |
| Number of sequences containing 1 SSR | 620 |
| Number of sequences containing more than 1 SSR | 220 |
| Number of SSRs present in compound formation | 136 |
| Total number of identified SSRs | 1134 |
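The two summary figures quoted in the abstract appear to follow directly from this table. The short check below is a sketch under that assumption: it treats the occurrence rate as the share of examined sequences that contain at least one SSR, and the average distance as total examined length divided by the number of SSRs. The variable names are illustrative only.

```python
# Values copied from the table above.
total_sequences = 3519      # sequences examined
total_bp = 2_942_162        # total size of examined sequences (bp)
ssr_sequences = 840         # sequences containing at least one SSR
total_ssrs = 1134           # SSRs identified

# Assumed definitions (not spelled out in the source):
occurrence_rate = 100 * ssr_sequences / total_sequences  # percent of sequences with an SSR
mean_spacing_kb = total_bp / total_ssrs / 1000           # kb of sequence per SSR

print(f"Occurrence rate: {occurrence_rate:.1f}%")
print(f"Average distance: {mean_spacing_kb:.2f} kb per SSR")
```

Under these definitions the results round to the 23.9% and 2.59 kb reported in the abstract.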
Distribution of SSRs across repeat type classes.
| Motif type | Number of SSRs | Share (%) | Total SSR length (bp) | Average SSR length (bp) |
|---|---|---|---|---|
| Mononucleotide | 441 | 38.89 | 6249 | 14.17 |
| Dinucleotide | 419 | 36.95 | 13022 | 31.08 |
| Trinucleotide | 206 | 18.17 | 4044 | 19.63 |
| Tetranucleotide | 33 | 2.91 | 568 | 17.21 |
| Pentanucleotide | 9 | 0.79 | 190 | 21.11 |
| Hexanucleotide | 26 | 2.29 | 786 | 30.23 |
| Total | 1134 | 100 | 24859 | 21.92 |
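The share and average-length columns can be recomputed from the count and length columns. The sketch below does so as a consistency check; the dictionary layout is only an illustration, and the numbers are copied from the table.

```python
# (number of SSRs, total SSR length in bp) per motif class, copied from the table above.
classes = {
    "Mononucleotide": (441, 6249),
    "Dinucleotide": (419, 13022),
    "Trinucleotide": (206, 4044),
    "Tetranucleotide": (33, 568),
    "Pentanucleotide": (9, 190),
    "Hexanucleotide": (26, 786),
}

total_count = sum(count for count, _ in classes.values())
total_length = sum(length for _, length in classes.values())

for name, (count, length) in classes.items():
    share = 100 * count / total_count   # percent of all SSRs
    mean_len = length / count           # average SSR length in bp
    print(f"{name:<16} {share:6.2f} %  {mean_len:6.2f} bp")

print(f"{'Total':<16} {100.0:6.2f} %  {total_length / total_count:6.2f} bp")
```

The recomputed totals (1134 SSRs, 24859 bp) and the per-class averages agree with the tabulated values to two decimal places.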
Frequency of classified repeat types (considering sequence complementarity) in the 840 analysed splicing sequences. Columns 4 to >15 give the number of repeat units.
| Repeat motif | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | >15 | Total repeats | Frequency of repeats (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A/T | — | — | — | — | — | — | 119 | 66 | 45 | 28 | 23 | 18 | 137 | 436 | 38.45 |
| C/G | — | — | — | — | — | — | 4 | — | — | — | — | — | 1 | 5 | 0.44 |
| AC/GT | — | — | 3 | 2 | 2 | — | — | — | 1 | — | — | — | 1 | 9 | 0.79 |
| AG/CT | — | — | 52 | 22 | 25 | 23 | 20 | 16 | 15 | 9 | 13 | 13 | 141 | 349 | 30.78 |
| AT/AT | — | — | 15 | 6 | 5 | 4 | 2 | 3 | 1 | 4 | 4 | — | 17 | 61 | 5.38 |
| AAC/GTT | — | 2 | 1 | — | — | — | — | 1 | — | — | — | — | — | 4 | 0.35 |
| AAG/CTT | — | 28 | 15 | 6 | 9 | 11 | 4 | 3 | — | 1 | — | — | 1 | 78 | 6.88 |
| AAT/ATT | — | 15 | 5 | 1 | 2 | — | — | 2 | — | 1 | — | — | 2 | 28 | 2.47 |
| ACC/GGT | — | 9 | 2 | 2 | 1 | — | 2 | — | — | — | — | — | — | 16 | 1.41 |
| ACG/CGT | — | 3 | — | — | — | — | 1 | 1 | — | — | — | — | — | 5 | 0.44 |
| ACT/AGT | — | — | 1 | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AGC/CTG | — | 14 | 6 | 2 | 2 | — | — | — | — | — | — | — | — | 24 | 2.12 |
| AGG/CCT | — | 16 | 4 | 6 | — | 1 | — | — | — | — | — | — | — | 27 | 2.38 |
| ATC/ATG | — | 10 | 4 | 3 | 3 | — | 1 | — | 1 | — | — | — | — | 22 | 1.94 |
| CCG/CGG | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AAAC/GTTT | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AAAG/CTTT | 7 | 1 | 1 | — | — | — | — | — | — | — | — | — | — | 9 | 0.79 |
| AAAT/ATTT | 6 | 1 | 2 | — | — | — | — | — | — | — | — | — | — | 9 | 0.79 |
| AACC/GGTT | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AATG/ATTC | 2 | — | — | — | — | — | — | — | — | — | — | — | — | 2 | 0.18 |
| AATT/AATT | 4 | — | — | — | — | — | — | — | — | — | — | — | — | 4 | 0.35 |
| ACAG/CTGT | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AGAT/ATCT | 1 | 1 | — | — | — | — | — | — | — | — | — | — | — | 2 | 0.18 |
| AGCG/CGCT | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AGCT/AGCT | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AGGC/CCTG | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| ATGC/ATGC | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AAAAC/GTTTT | 2 | — | — | — | — | — | — | — | — | — | — | — | — | 2 | 0.18 |
| AAAAG/CTTTT | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AAAAT/ATTTT | 1 | 2 | — | — | — | — | — | — | — | — | — | — | — | 3 | 0.26 |
| AACAG/CTGTT | 2 | — | — | — | — | — | — | — | — | — | — | — | — | 2 | 0.18 |
| AAGAG/CTCTT | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AAAAAG/CTTTTT | 3 | — | — | — | — | — | — | — | — | — | — | — | — | 3 | 0.26 |
| AAAACG/CGTTTT | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AACAGC/CTGTTG | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AACCCT/AGGGTT | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AACCGC/CGGTTG | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AACCTG/AGGTTC | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AACGAG/CGTTCT | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AACGGG/CCCGTT | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AAGACG/CGTCTT | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AAGCCT/AGGCTT | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AAGCTG/AGCTTC | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AAGGCC/CCTTGG | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| ACCAGC/CTGGTG | — | — | — | — | — | — | 1 | — | — | — | — | — | — | 1 | 0.09 |
| ACCATC/ATGGTG | — | — | 1 | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| ACCGCC/CGGTGG | 2 | — | — | — | — | — | — | — | — | — | — | — | — | 2 | 0.18 |
| ACCGGC/CCGGTG | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| ACCTCC/AGGTGG | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| ACCTGC/AGGTGC | — | — | — | — | — | 1 | — | — | — | — | — | — | — | 1 | 0.09 |
| ACTGAG/AGTCTC | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| ACTGCC/AGTGGC | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AGCAGG/CCTGCT | — | — | — | — | 1 | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AGCGGC/CCGCTG | — | 1 | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| AGGGGC/CCCCTG | 1 | — | — | — | — | — | — | — | — | — | — | — | — | 1 | 0.09 |
| Total | 45 | 114 | 112 | 50 | 50 | 40 | 154 | 92 | 63 | 43 | 40 | 31 | 300 | 1134 | 100 |
| Share of total (%) | 3.97 | 10.05 | 9.88 | 4.41 | 4.41 | 3.53 | 13.58 | 8.11 | 5.56 | 3.79 | 3.53 | 2.73 | 26.46 | | |
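The table groups each motif with its rotations and reverse complement (for example AG/CT) and starts counting at a different repeat number for each motif length. The sketch below illustrates that kind of classification with a simple regular-expression search. It is not the tool used by the authors: the minimum repeat counts are inferred from the first populated column of each motif class in the table, and the names `MIN_REPEATS`, `canonical_motif`, and `find_ssrs` are hypothetical.

```python
import re

# Minimum repeat counts per motif length. ASSUMPTION: inferred from the first
# populated column of each motif class in the table, not a stated setting.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Return the alphabetically smallest rotation of the motif or of its
    reverse complement, so that AG, GA, CT and TC all map to "AG"."""
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    rotations = [s[i:] + s[:i] for s in (motif, revcomp) for i in range(len(s))]
    return min(rotations)


def find_ssrs(seq):
    """Return sorted (start, canonical motif, repeat count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are repeats of a shorter motif (e.g. "AA", "CTCT");
            # the shorter motif length reports those runs.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            repeats = len(match.group(0)) // unit_len
            hits.append((match.start(), canonical_motif(unit), repeats))
    return sorted(hits)


# Toy sequence with a (CT)8, a (T)12 and an (AAG)5 run separated by short spacers.
example = "CCGA" + "CT" * 8 + "GGAC" + "T" * 12 + "GCT" + "AAG" * 5 + "C"
for start, motif, repeats in find_ssrs(example):
    print(start, motif, repeats)
```

Compound SSRs and imperfect repeats, which the first table counts separately, are outside the scope of this sketch.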
Figure 2. Mononucleotide repeats of Unigene.
Figure 3. Dinucleotide repeats of Unigene.
Figure 4. Trinucleotide repeats of Unigene.
Figure 5. Tetranucleotide repeats of Unigene.
Figure 6. Pentanucleotide repeats of Unigene.
Figure 7. Hexanucleotide repeats of Unigene.
SSR locus length.
| SSR locus length (bp) | Number of SSRs | Share of total (%) |
|---|---|---|
| 10–20 | 763 | 67.28 |
| 21–30 | 190 | 16.76 |
| 31–40 | 66 | 5.82 |
| 41–50 | 39 | 3.44 |
| 51–60 | 31 | 2.73 |
| 61–70 | 21 | 1.85 |
| 71–80 | 11 | 0.97 |
| 81–90 | 7 | 0.62 |
| 91–100 | 6 | 0.53 |
Length distribution of the 840 SSR-containing sequences.
| Sequence length (bp) | Number of sequences | Share of total (%) |
|---|---|---|
| 146–500 | 55 | 6.55 |
| 501–1000 | 503 | 59.88 |
| 1001–1500 | 192 | 22.86 |
| 1501–2000 | 60 | 7.14 |
| 2001–2500 | 23 | 2.74 |
| 2501–3000 | 6 | 0.71 |
| more than 3000 | 1 | 0.12 |