Zhimin Gao, Jie Wu, Zheng'an Liu, Liangsheng Wang, Hongxu Ren, Qingyan Shu.
Abstract
BACKGROUND: Microsatellites are ubiquitous in genomes of various organisms. With the realization that they play roles in developmental and physiological processes, rather than exist as 'junk' DNA, microsatellites are receiving increasing attention. Next-generation sequencing allows acquisition of large-scale microsatellite information, and is especially useful for plants without reference genome sequences.
Year: 2013 PMID: 24341681 PMCID: PMC3878651 DOI: 10.1186/1471-2164-14-886
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Length distribution of 454 sequencing reads. The x-axis shows sequence length (bp) and the y-axis shows the number of sequences of a given length.
Occurrence of microsatellites in the surveyed tree peony genome
| Total number of sequences examined | 675221 |
| Total size of examined sequences (bp) | 240672018 |
| Total number of identified SSRs | 237134 |
| Number of SSR-containing sequences | 164043 |
| Number of sequences containing more than 1 SSR | 44362 |
| Number of SSRs present in compound formation | 70570 |
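The counts in this table can be turned into a few summary ratios. The sketch below only does arithmetic on the six values listed above; the derived quantities (percentages, SSRs per Mb, mean read length) and the variable names are illustrative and are not figures reported in the table itself.

```python
# Values copied from the occurrence table above.
total_sequences = 675_221
total_bp = 240_672_018
total_ssrs = 237_134
ssr_sequences = 164_043
multi_ssr_sequences = 44_362
compound_ssrs = 70_570


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of reads carrying at least one SSR.
ssr_sequence_pct = percent(ssr_sequences, total_sequences)
# Share of SSR-containing reads that carry more than one SSR.
multi_ssr_pct = percent(multi_ssr_sequences, ssr_sequences)
# Share of all SSRs that occur in compound formation.
compound_pct = percent(compound_ssrs, total_ssrs)
# SSR density per megabase of examined sequence.
ssrs_per_mb = total_ssrs / (total_bp / 1_000_000)
# Average length of an examined sequence in bp.
mean_read_length = total_bp / total_sequences

print(f"{ssr_sequence_pct:.1f}% of sequences contain an SSR")
print(f"{ssrs_per_mb:.0f} SSRs per Mb, mean read length {mean_read_length:.0f} bp")
```

Under these definitions, roughly a quarter of the examined sequences contain at least one SSR, and the density is a little under 1,000 SSRs per Mb of examined sequence.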
Figure 2. Distribution of SSR start positions from the 5′-terminus of the cloned library insert. The x-axis indicates the number of bp from the 5′ terminus of a sequence to the SSR start site. The y-axis corresponds to the number of SSRs beginning at that start position.
Microsatellite motif length distribution
| Species | Mono- | Di- | Tri- | Tetra- | Penta- | Hexa- | Total | Percent (%) |
|---|---|---|---|---|---|---|---|---|
| B. distachyon | 30573 | 9407 | 10625 | 990 | 196 | 84 | 51875 | 2.45 |
| S. bicolor | 55906 | 38138 | 28480 | 5368 | 946 | 726 | 129564 | 5.4 |
| O. sativa | 64734 | 37282 | 29189 | 2565 | 604 | 261 | 135265 | 2.54 |
| A. thaliana | 34843 | 9386 | 5596 | 169 | 41 | 57 | 50092 | 0.53 |
| M. truncatula | 120383 | 20999 | 9647 | 1079 | 216 | 137 | 152461 | 0.94 |
| P. trichocarpa | 194557 | 54304 | 25130 | 3178 | 772 | 665 | 278606 | 1.66 |
Data for B. distachyon, S. bicolor, O. sativa, A. thaliana, M. truncatula, and P. trichocarpa were obtained from Sonah et al. 2011 (PLoS One, 6: 1–9).
--: the percentage was not calculated because the genome of P. suffruticosa was unknown.
Figure 3. Total numbers of each repeat motif type. The x-axis indicates the repeat motif type. The y-axis indicates the number of SSRs of each type. 1: mono-nucleotide repeats; 2: di-nucleotide repeats; 3: tri-nucleotide repeats; 4: tetra-nucleotide repeats; 5: penta-nucleotide repeats; 6: hexa-nucleotide repeats.
Frequency of mono-, di-, tri-, and tetra-nucleotide repeat motifs in the tree peony genome
| A/T | 3956 |
| C/G | 604 |
| AC/GT | 124208 |
| AG/CT | 59711 |
| AT/AT | 1868 |
| CG/CG | 124 |
| AAC/GTT | 17890 |
| AAG/CTT | 5394 |
| AAT/ATT | 106 |
| ACC/GGT | 1756 |
| ACG/CGT | 369 |
| ACT/AGT | 606 |
| AGC/CTG | 579 |
| AGG/CCT | 268 |
| ATC/ATG | 262 |
| CCG/CGG | 5 |
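The paired labels in this table (for example AC/GT) appear to pool each motif with its reverse complement, so that repeats read from either strand, and in any reading frame, are counted together. A minimal sketch of that grouping is shown below. The function names are hypothetical, and the choice of the alphabetically smallest rotation as the representative is an assumption made for illustration, not a rule stated in the source.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def rotations(motif):
    """Return every cyclic shift of the motif (all reading frames)."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical_motif(motif):
    """Pick one representative for a motif, its shifts, and both strands."""
    motif = motif.upper()
    candidates = rotations(motif) + rotations(reverse_complement(motif))
    return min(candidates)


def motif_class(motif):
    """Format a motif as 'forward/reverse', e.g. 'AC/GT'."""
    forward = canonical_motif(motif)
    reverse = min(rotations(reverse_complement(forward)))
    return f"{forward}/{reverse}"


for observed in ["TG", "CA", "GA", "TGG", "TTG"]:
    print(observed, "->", motif_class(observed))
```

With this convention, TG and CA repeats both fall into the AC/GT class, and a TGG repeat falls into ACC/GGT.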
Figure 4. SSR length distribution. The x-axis indicates SSR length (bp). The y-axis indicates the number of SSRs of each length.
Figure 5. Distribution of compound SSR interruption distances. The x-axis indicates the length of the interruption between SSRs (bp). The y-axis indicates the number of SSRs with each interruption length.
Figure 6. Distribution of SSR reads mapping onto the Arabidopsis, poplar, and grape genomes. The x-axis shows the three reference plants (Arabidopsis, grape, and poplar). The y-axis indicates the number of SSR-containing sequences mapped to each position within the genes/genomes of the reference plants. UTR = untranslated region; CDS = coding DNA sequence.
Figure 7. Distribution of SSR reads with various repeat motifs mapping onto the Arabidopsis genome. The x-axis indicates the position within the genes/genome of Arabidopsis. The y-axis indicates the relative number of SSRs of each repeat motif type mapped to each position. Abbreviations: p1 = mono-nucleotide repeats; p2 = di-nucleotide repeats; p3 = tri-nucleotide repeats; p4 = tetra-nucleotide repeats; p5 = penta-nucleotide repeats; p6 = hexa-nucleotide repeats; c* = compound SSR without an interruption between two motifs; c = compound SSR with an interruption between two motifs; UTR = untranslated region; CDS = coding DNA sequence.
Microsatellite distribution in different genomic regions of tree peony using the Arabidopsis genome as a reference
| Mapping type | c | c* | p1 | p2 | p3 | p4 | p5 | p6 |
| Intergenic | 8173 | 727 | 203 | 18797 | 410 | 754 | 2 | 10 |
| Unmapped | 10464 | 218 | 3560 | 25867 | 5039 | 766 | 24 | 80 |
| 3′ UTR | 915 | 180 | 0 | 1534 | 14 | 0 | 0 | 0 |
| CDS | 1755 | 397 | 15 | 4674 | 7440 | 0 | 0 | 9 |
| Intron | 12760 | 1051 | 1 | 28034 | 130 | 135 | 0 | 0 |
| 5′ UTR | 921 | 284 | 0 | 10409 | 59 | 0 | 0 | 1 |
| Multi-mapped | 3963 | 1024 | 0 | 12700 | 544 | 0 | 0 | 0 |
Compound SSRs are designated as follows: c* = no interruption between two motifs; c = interruption between two motifs. p1–p6 refer to the repeat motif length (e.g., mono-nucleotide, di-nucleotide, etc.).
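The p1–p6, c, and c* labels can be illustrated with a small classifier. The sketch below is not the detection pipeline used in the study: the minimum repeat counts in `MIN_REPEATS` and the maximum gap in `MAX_INTERRUPTION` are assumed thresholds chosen only for illustration, and the function names are hypothetical.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative thresholds).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
# Assumed maximum interruption (bp) between motifs of one compound SSR.
MAX_INTERRUPTION = 100


def find_perfect_ssrs(seq):
    """Return sorted (start, end, motif, motif_length) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ACAC".
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, k))
    return sorted(hits)


def classify_ssrs(seq):
    """Label each locus as p1..p6 (single motif), c* (adjacent) or c (interrupted)."""
    ssrs = find_perfect_ssrs(seq)
    loci, i = [], 0
    while i < len(ssrs):
        group, gaps = [ssrs[i]], []
        # Chain neighbouring SSRs while the gap stays within the allowed interruption.
        while i + 1 < len(ssrs):
            gap = ssrs[i + 1][0] - group[-1][1]
            if gap > MAX_INTERRUPTION:
                break
            gaps.append(max(gap, 0))
            group.append(ssrs[i + 1])
            i += 1
        if len(group) == 1:
            label = "p%d" % group[0][3]
        else:
            label = "c*" if all(g == 0 for g in gaps) else "c"
        loci.append((label, group[0][0], group[-1][1]))
        i += 1
    return loci


print(classify_ssrs("CA" * 16))
print(classify_ssrs("CT" * 9 + "CA" * 14))
print(classify_ssrs("TGG" * 6 + "gctttggccggttcg" + "CTT" * 5))
```

The three test strings are built from repeat structures listed in the primer table of this article (CA16, (CT)9(CA)14, and (TGG)6gctttggccggttcg(CTT)5); under the assumed thresholds they are labelled p2, c*, and c respectively.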
Figure 8. GO classification of SSRs in coding regions, including the number/percentage of genes putatively involved in different subcellular functions. The x-axis shows the functional classes. The y-axis indicates the percentage (left) or number (right) of SSR-containing genes in each functional class.
SSR loci amplified from 23 accessions of tree peony
| Locus | Repeat motif | Forward primer | Reverse primer | Size (bp) | Ta (°C) | Na | He | Ho |
|---|---|---|---|---|---|---|---|---|
| 2A | TGG6 | AACTGCGCTAGTCGTCCCCATAAAC | AAAGCCGCCTACAGAGGATGTTCAT | 268 | 57 | 3 | 0.4647 | 0.0435 |
| 19A | CA16 | TAACATCTCACTACCACTCAGGCGA | CATAAGGGTGATGATCATGTGGTTG | 164 | 54.5 | 3 | 0.6493 | 0.0000 |
| 25A | TGT10 | CAATCCCTTTTGTAATGCCCCTTTC | CAGGCTGTACTAGCAAAGGCTTCCA | 215 | 54.5 | 3 | 0.5565 | 0.0000 |
| 26A | TTG7 | TGGGCCCTACAAGTGATGATATTCC | ATGGAATCCAGGTTTGTGAATGTGA | 245 | 54.5 | 3 | 0.559 | 0.0000 |
| 30A | CA13 | TGTCATACCGACTTCGGCTAGGCTA | AAGGGTGATCGTGTGGTTGATGTTT | 265 | 54.5 | 4 | 0.7275 | 1.0000 |
| 31A | CT11 | AGCGCGTTTAATTGCTCTTACCTTG | CTCCCTCCTCTAACTCCATGCTTGC | 303 | 54.5 | 3 | 0.6261 | 0.0000 |
| 36A | (TGG)6gctttggccggttcg(CTT)5 | GACTGTAGTGATGGTGGTGGATTGG | AGCTTATGAACCCTGATGATGACGC | 261 | 57 | 3 | 0.5797 | 0.0000 |
| 48A | CAG5 | ACAGCGTCAGCAGACAGGAAGTACC | AAGAGTACCTGTCACCCCATCCAAA | 364 | 57 | 4 | 0.5913 | 0.0000 |
| 49A | TGC5 | TCTGGGTGATAGGTGGAGCTGGTGC | GGAAGACGCCCACAATGAAATCACA | 314 | 57 | 4 | 0.6696 | 0.0000 |
| 50A | CA13 | CACGGCTTTAAAATGCGTCTCAACT | AGGCTGGTGATAGTGTTGTTGATGC | 252 | 54.5 | 4 | 0.5295 | 0.0000 |
| 53A | TCC5 | CTCTTGTCAACCCCCACTGCCTCCT | GAAGGGACTTTCGCTGGAATCTGGC | 353 | 59 | 4 | 0.6802 | 0.0000 |
| 54A | (CT)9(CA)14 | TGTCGGGCGGTAAGTTTAGGGAAGA | CCACTTGGGTTCTGTTGGAGACTCG | 388 | 59 | 3 | 0.5034 | 0.0435 |
| 56A | AC15 | CAGGTGGCATTTTTGGCTTCTCTCT | TTGGCCCAATCACATGTAATCCCTC | 388 | 57 | 3 | 0.5217 | 0.0000 |
| 58A | GCA6 | TAGGATGACAAAGTGCAGGAAACCC | TGCTCAAACTCATCCTCAAGCTGTG | 318 | 57 | 2 | 0.085 | 0.0000 |
| 59A | AC18 | TACAACACTTCTCGCCTAACGCACC | AGACATGGTGCAAGTATGGGAGACG | 270 | 59 | 3 | 0.4908 | 0.0000 |
| 63A | (TC)9(AC)17 | CACCGCATATCTCCAACCTCACCTC | TTGGGTAGAGATAGGAGGTTGGGGC | 277 | 59 | 3 | 0.6609 | 0.0000 |
| 65A | TGG5 | CATACCTCCATCATGATGCTGCTGT | ATGAAGGCTCAGTAAGAACCTCGGA | 355 | 57 | 3 | 0.3053 | 0.0000 |
| 73A | CAG5 | CCATCTCAGGGTCAGGGTTCTCGTA | TAGAGTGTACCTTCACCCCCATCGG | 375 | 59 | 4 | 0.6928 | 0.1379 |
| 78A | AC16 | TATCAAATGGGGATGGTCTCCTCTT | AATTCTGCCACTATGAGCTCGATCT | 314 | 54.5 | 5 | 0.6899 | 0.1579 |
| 79A | GCA5 | AGAGGAAGTTTGAGGCCATCAGTCG | CAACTGTAGCCTTCTGTTCCTGCCC | 367 | 57 | 2 | 0.4638 | 0.0000 |
| 80A | GTG5 | AAGGTTATGGTGGCAGTGAAGATGA | ACCGTCGTACTACCACTTACAGCCG | 207 | 54.5 | 4 | 0.6773 | 0.3043 |
| 87A | TG15 | TGTAATCGATCGAGTTTCTTGGGTC | CCTAACACTCCACCACTAAGTCGCT | 188 | 56 | 3 | 0.6261 | 0.0000 |
| 91A | (GT)9ttgta(TG)16 | TCAGCCCCTAGCATAGAAGAATCCA | TCTCACTACCACCTACGCGATGTTC | 384 | 60 | 3 | 0.6032 | 0.0000 |
Size = size of cloned allele; Ta = annealing temperature; Na = number of alleles; He = expected heterozygosity; Ho = observed heterozygosity.
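For readers unfamiliar with the Na, He, and Ho columns, the sketch below shows one common way to compute them from diploid genotypes at a single locus. It assumes the basic gene-diversity form He = 1 − Σ pᵢ² without a small-sample correction; the source does not state which estimator or software was used, and the function name and genotype data are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Na, He, Ho) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid accession.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Count every allele copy across all accessions.
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    # Expected heterozygosity: 1 minus the sum of squared allele frequencies.
    he = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    # Observed heterozygosity: fraction of accessions with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    return len(allele_counts), he, ho


# Hypothetical allele sizes (bp) for 23 accessions: 22 homozygotes, 1 heterozygote.
example = [(268, 268)] * 12 + [(271, 271)] * 10 + [(268, 274)]
na, he, ho = heterozygosity(example)
print(na, round(he, 4), round(ho, 4))
```

In this made-up example a single heterozygote among 23 accessions gives Ho = 1/23 ≈ 0.0435, which shows how a low Ho can sit alongside a moderate He when most accessions are homozygous for different alleles.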
Figure 9. UPGMA dendrogram constructed from 23 tree peony accessions using the markers developed in this study. Numbers at leaf tips refer to accession code numbers listed in Additional file 2: Table S2.
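UPGMA builds a dendrogram by repeatedly merging the two closest clusters and averaging distances, weighted by cluster size. The sketch below is a minimal pure-Python version of that procedure. The distance matrix and accession names are hypothetical, and how the genetic distances behind Figure 9 were calculated is not described in this excerpt.

```python
def upgma(names, dist):
    """Cluster items by UPGMA.

    names: list of leaf labels.
    dist: symmetric distance matrix (list of lists) in the same order.
    Returns a nested tuple (left, right, height); leaves are plain labels.
    """
    # Active clusters: id -> (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(names)
    while len(clusters) > 1:
        # Find the closest pair of active clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        # Size-weighted average distance from the merged cluster to each other cluster.
        for c in clusters:
            dac = d.pop((min(a, c), max(a, c)))
            dbc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        del d[(a, b)]
        # The merge height is half the distance between the joined clusters.
        clusters[next_id] = ((tree_a, tree_b, dab / 2), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical pairwise distances among three accessions.
labels = ["acc1", "acc2", "acc3"]
matrix = [
    [0, 2, 6],
    [2, 0, 6],
    [6, 6, 0],
]
tree = upgma(labels, matrix)
print(tree)
```

In the toy matrix, acc1 and acc2 are the closest pair and are joined first; acc3 then joins that cluster at a greater height.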