Zhenkang Xu, Laura Gutierrez, Matthew Hitchens, Steve Scherer, Amy K Sater, Dan E Wells.
Abstract
Our bioinformatic analysis identified over 91,000 di-, tri-, and tetranucleotide microsatellites in a survey of 25% of the X. tropicalis genome, suggesting that the entire genome may contain over 360,000. Within the X. tropicalis genome, dinucleotide microsatellites (78.7%) vastly outnumbered tri- and tetranucleotide microsatellites. Similarly, AT-rich repeats are overwhelmingly dominant: the four AT-only motifs (AT, AAT, AAAT, and AATT) account for 51,858 of the 91,304 microsatellites found. Individually, AT was the most common repeat, representing over half of all di-, tri-, and tetranucleotide microsatellites. This contrasts with data from other studies, which show that AC is the most frequent microsatellite in vertebrate genomes (Toth et al. 2000). In addition, we determined the rate of polymorphism for 5,128 nonredundant microsatellites embedded in unique sequences. Interestingly, this subgroup had significantly longer repeats than genomic microsatellites as a whole. Microsatellite loci with tandem repeat lengths of more than 30 bp exhibited a significantly higher degree of polymorphism than other loci. Pairwise comparisons show that tetranucleotide microsatellites have the highest polymorphism rates. Among individual motifs, AAT and ATC showed significantly higher polymorphism than other trinucleotide microsatellites, while AGAT and AAAG were significantly more polymorphic than other tetranucleotide microsatellites.
Keywords: Xenopus genome; microsatellite; polymorphism
Year: 2008 PMID: 19812773 PMCID: PMC2735965 DOI: 10.4137/bbi.s561
Source DB: PubMed Journal: Bioinform Biol Insights ISSN: 1177-9322
Core groupings of microsatellite motifs.
| Dinucleotides | Trinucleotides | Tetranucleotides |
|---|---|---|
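One common way to collapse motifs into core groups is to treat every rotation of a motif, on either strand, as the same repeat and to name the group by its alphabetically first member. The sketch below assumes that convention; the source does not state how its groups were defined, and the function names (`core_motif`, `core_groups`) are illustrative only.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif: str) -> str:
    """Return the reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def rotations(motif: str) -> list[str]:
    """Return every circular rotation of the motif."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def core_motif(motif: str) -> str:
    """Alphabetically first rotation of the motif or of its reverse complement."""
    motif = motif.upper()
    return min(rotations(motif) + rotations(reverse_complement(motif)))


def is_primitive(motif: str) -> bool:
    """True if the motif is not a repeat of a shorter unit (e.g. AAA or ACAC)."""
    n = len(motif)
    return all(motif != motif[:k] * (n // k) for k in range(1, n) if n % k == 0)


def core_groups(size: int) -> list[str]:
    """List the distinct core motifs for a given repeat unit size."""
    motifs = ("".join(p) for p in product("ACGT", repeat=size))
    return sorted({core_motif(m) for m in motifs if is_primitive(m)})
```

Under this assumed convention, `core_groups(2)` yields the four dinucleotide groups AC, AG, AT, and CG, and `core_motif("TCTA")` maps to AGAT.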
Distribution of microsatellites in 25% of the X. tropicalis genome.
| Repeat type | Motif | Number of loci | % of total loci | % of repeat type loci | Loci per Mbp | Mean interval between loci (kbp) |
|---|---|---|---|---|---|---|
| Di- | AT | 46488 | 50.92 | 64.72 | 104.47 | 9.57 |
| | AC | 17221 | 18.86 | 23.98 | 38.70 | 25.84 |
| | AG | 7851 | 8.60 | 10.93 | 17.64 | 56.68 |
| | CG | 267 | 0.29 | 0.37 | 0.60 | 1666.56 |
| | Total | 71827 | 78.67 | 100.00 | 161.42 | 6.20 |
| Tri- | AAT | 5080 | 5.56 | 68.35 | 11.42 | 87.59 |
| | ATC | 580 | 0.64 | 7.80 | 1.30 | 767.19 |
| | AAG | 409 | 0.45 | 5.50 | 0.92 | 1087.95 |
| | AGC | 344 | 0.38 | 4.63 | 0.77 | 1293.52 |
| | AGG | 292 | 0.32 | 3.93 | 0.66 | 1523.87 |
| | AAC | 272 | 0.30 | 3.66 | 0.61 | 1635.92 |
| | ACT | 245 | 0.27 | 3.30 | 0.55 | 1816.21 |
| | ACC | 102 | 0.11 | 1.37 | 0.23 | 4362.46 |
| | ACG | 59 | 0.06 | 0.79 | 0.13 | 7541.88 |
| | CCG | 49 | 0.05 | 0.66 | 0.11 | 9081.04 |
| | Total | 7432 | 8.14 | 100.00 | 16.70 | 59.87 |
| Tetra- | AGAT | 8973 | 9.83 | 74.50 | 20.17 | 49.59 |
| | ACAT | 1677 | 1.84 | 13.92 | 3.77 | 265.34 |
| | ACAG | 441 | 0.48 | 3.41 | 0.99 | 1009.00 |
| | AAAT | 272 | 0.30 | 2.26 | 0.61 | 1635.92 |
| | AAAG | 255 | 0.28 | 2.12 | 0.57 | 1744.98 |
| | AAGG | 90 | 0.10 | 0.75 | 0.20 | 4944.12 |
| | AAAC | 64 | 0.07 | 0.53 | 0.14 | 6952.67 |
| | AACT | 31 | 0.03 | 0.26 | 0.07 | 14353.90 |
| | AGGC | 29 | 0.03 | 0.24 | 0.07 | 15343.82 |
| | AGGG | 27 | 0.03 | 0.22 | 0.06 | 16480.40 |
| | AATC | 26 | 0.03 | 0.22 | 0.06 | 17114.26 |
| | AATG | 26 | 0.03 | 0.22 | 0.06 | 17114.26 |
| | AATT | 18 | 0.02 | 0.15 | 0.04 | 24720.60 |
| | AAGT | 16 | 0.02 | 0.13 | 0.04 | 27810.67 |
| | ATCC | 16 | 0.02 | 0.13 | 0.04 | 27810.67 |
| | ACGT | 15 | 0.02 | 0.12 | 0.03 | 29664.72 |
| | ACTG | 14 | 0.02 | 0.12 | 0.03 | 31783.63 |
| | ACTC | 13 | 0.01 | 0.11 | 0.03 | 34228.52 |
| | ACCT | 12 | 0.01 | 0.10 | 0.03 | 37080.90 |
| | AACC | 8 | 0.01 | 0.07 | 0.02 | 55621.35 |
| | AACG | 7 | 0.01 | 0.06 | 0.02 | 63567.26 |
| | ACCC | 5 | 0.01 | 0.04 | 0.01 | 88994.16 |
| | AAGC | 4 | 0.00 | 0.03 | 0.01 | 111242.70 |
| | AGCT | 3 | 0.00 | 0.02 | 0.01 | 148323.60 |
| | ACGC | 2 | 0.00 | 0.02 | 0.00 | 222485.39 |
| | ATCG | 1 | 0.00 | 0.01 | 0.00 | 444970.79 |
| | Total | 12045 | 13.19 | 100.00 | 27.07 | 36.94 |
| Total | | 91304 | 100 | | 205.1914 | 4.8735 |
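The density and interval columns follow directly from the locus counts and the surveyed sequence length of 444,970,789 bp reported in the Figure 1 caption. The sketch below shows that arithmetic, along with the genome-wide extrapolation from a 25% survey; the function names are illustrative and not taken from the source.

```python
SURVEYED_BP = 444_970_789   # surveyed sequence length, from the Figure 1 caption
SURVEYED_FRACTION = 0.25    # approximate share of the genome surveyed


def loci_per_mbp(n_loci: int) -> float:
    """Number of loci per million base pairs of surveyed sequence."""
    return n_loci / (SURVEYED_BP / 1_000_000)


def interval_kbp(n_loci: int) -> float:
    """Mean distance between consecutive loci, in kilobase pairs."""
    return (SURVEYED_BP / 1_000) / n_loci


def genome_wide_estimate(n_loci: int) -> float:
    """Scale a surveyed count up to the whole genome, assuming uniform density."""
    return n_loci / SURVEYED_FRACTION


if __name__ == "__main__":
    for motif, count in [("AT", 46488), ("AGAT", 8973), ("All", 91304)]:
        print(motif, round(loci_per_mbp(count), 2), round(interval_kbp(count), 2))
    print("Estimated genome-wide total:", int(genome_wide_estimate(91304)))
```

For the AT row this gives about 104.47 loci per Mbp and a 9.57 kbp interval, and scaling the total of 91,304 loci by four gives 365,216, consistent with the abstract's estimate of over 360,000.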
Figure 1. Mean tandem repeat number of microsatellite motifs in genomic DNA. Mean repeat numbers were determined for each di-, tri-, and tetranucleotide microsatellite containing a minimum of five perfect tandem repeats. Numbers for the entire genome were estimated from a survey of 444,970,789 base pairs (~25%) of the X. tropicalis genome. Only the four most prevalent motifs for each size class are shown. The AGAT tetranucleotide motif was significantly more common than other tetranucleotide motifs (p < 0.001). Similarly, the AT dinucleotide motif was significantly more common than other dinucleotide motifs (p < 0.001). Standard errors are shown.
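The selection criterion in the caption, a minimum of five perfect tandem repeats of a di-, tri-, or tetranucleotide unit, can be illustrated with a regular expression. This is a minimal sketch, not the authors' pipeline: the function name, the return format, and the choice to report the motif as it appears in the sequence are assumptions.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 5):
    """Find perfect tandem repeats of 2-4 bp units.

    Returns a list of (unit, repeat_count, start) tuples, with the unit
    reported as it appears in the sequence and a 0-based start position.
    """
    # Lazy {2,4}? tries the shortest unit first, so ACACAC... is reported
    # as a dinucleotide (AC) rather than a tetranucleotide (ACAC).
    pattern = re.compile(
        r"([ACGT]{2,4}?)\1{" + str(min_repeats - 1) + r",}"
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip mononucleotide runs such as TTTTTTTT
        repeats = len(match.group(0)) // len(unit)
        hits.append((unit, repeats, match.start()))
    return hits


if __name__ == "__main__":
    demo = "GG" + "AT" * 6 + "CCGC" + "AGAT" * 5 + "T" * 12
    print(find_microsatellites(demo))
```

Because the scan runs left to right, a repeat can be reported in a rotated phase (for example TAGA rather than AGAT) depending on the flanking base, so real pipelines usually normalize the reported unit to a core motif afterwards.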
Distribution of nonredundant microsatellites in X. tropicalis.
| Repeat type | Motif | Number | Abundance (% of repeat type) |
|---|---|---|---|
| Di- | AT | 1722 | 90.30 |
| | AC | 100 | 5.24 |
| | AG | 85 | 4.46 |
| | Total | 1907 | 100 |
| Tri- | AAT | 686 | 73.52 |
| | ATC | 104 | 11.15 |
| | AAG | 53 | 5.68 |
| | AGG | 38 | 4.07 |
| | ACT | 27 | 2.89 |
| | AGC | 11 | 1.18 |
| | AAC | 6 | 0.64 |
| | ACG | 5 | 0.54 |
| | ACC | 3 | 0.32 |
| | Total | 933 | 100 |
| Tetra- | AGAT | 1361 | 59.48 |
| | ACAT | 603 | 26.35 |
| | AAAG | 86 | 3.76 |
| | AAAT | 62 | 2.71 |
| | ACAG | 36 | 1.57 |
| | AAAC | 16 | 0.70 |
| | AAGG | 14 | 0.61 |
| | AACT | 13 | 0.57 |
| | ACGT | 12 | 0.52 |
| | AGGC | 11 | 0.48 |
| | AATT | 10 | 0.44 |
| | AATG | 10 | 0.44 |
| | ACCT | 10 | 0.44 |
| | AAGT | 9 | 0.39 |
| | ACTC | 8 | 0.35 |
| | AATC | 6 | 0.26 |
| | AACG | 5 | 0.22 |
| | ATCC | 4 | 0.17 |
| | AACC | 4 | 0.17 |
| | AGGG | 3 | 0.13 |
| | ACGC | 2 | 0.09 |
| | AAGC | 1 | 0.04 |
| | ACTG | 1 | 0.04 |
| | ACGG | 1 | 0.04 |
| | Total | 2288 | 100 |
| Total | | 5128 | |
Figure 2. Relative abundance in genomic and nonredundant DNA of each motif within each of the three microsatellite repeat size classes analyzed. The abundance of each motif within both the genomic sample and the nonredundant sample is plotted as a percentage of the abundance of the entire size class. AT, ATT, and AGAT were statistically more abundant than other members of their respective size classes in both genomic and nonredundant samples. Only the most prevalent motifs for each size class are shown. Nonredundant results are shown in black and genomic results in gray.
Comparison in mean repeat size between genomic and nonredundant microsatellites.
| Repeat type | Genomic (# of repeat units) | S.E. | Nonredundant (# of repeat units) | S.E. |
|---|---|---|---|---|
| Di- | 8.33 | 0.27 | 23.96 | 0.06 |
| Tri- | 6.28 | 0.16 | 9.58 | 0.07 |
| Tetra- | 7.43 | 0.07 | 7.99 | 0.07 |
The nonredundant microsatellites have significantly longer repeats than their genomic counterparts (Student's t-tests: for dinucleotides, t = 57.02, df = 2958, p < 0.001; for trinucleotides, t = 18.80, df = 1686, p < 0.001; for tetranucleotides, t = 5.67, df = 4941, p < 0.001).
Figure 3. Mean tandem repeat number in nonredundant DNA for each microsatellite motif. Mean repeat numbers were determined for each di-, tri-, and tetranucleotide contained in our nonredundant microsatellite sample (see methods). Nonredundant results are shown in black, and genomic results are shown in gray. Only the most prevalent motifs for each size class are shown (no GC microsatellites were seen in our nonredundant sample). Standard errors are shown.
Figure 4. Polymorphism rate for repeat length classes of each nonredundant microsatellite motif. Each microsatellite motif was subdivided into seven groups based on the length of its core repeat sequence. The total number of loci analyzed is shown in each length class.
Comparison of polymorphic and non-polymorphic markers by repeat size.
| Repeat type | Status | Class 1 (≤30 bp) | Class 2 (>30 bp) | Total |
|---|---|---|---|---|
| Di | Not Polymorphic | 254 (43.1%) | 496 (37.66%) | 750 (39.33%) |
| | Polymorphic | 336 (56.9%) | 821 (62.34%) | 1157 (60.67%) |
| | Total | 590 | 1317 | 1907 |
| Tri | Not Polymorphic | 324 (45.8%) | 70 (31.0%) | 394 (42.2%) |
| | Polymorphic | 383 (54.2%) | 156 (69.0%) | 539 (57.8%) |
| | Total | 707 | 226 | 933 |
| Tetra | Not Polymorphic | 481 (39.8%) | 290 (26.90%) | 873 (35.09%) |
| | Polymorphic | 728 (60.2%) | 789 (73.10%) | 1615 (64.91%) |
| | Total | 1209 | 1079 | 2288 |
| All | Not Polymorphic | 1191 (43.11%) | 856 (32.66%) | 2047 (38.02%) |
| | Polymorphic | 1572 (56.89%) | 1765 (67.34%) | 3337 (61.98%) |
| | Total | 2506 | 2622 | 5128 |
| | | + | ++ | ++ |
Comparisons of polymorphism rates between Class 1 and Class 2 within each microsatellite repeat group (di-, tri-, and tetra-): * means the two size classes are significantly different (p < 0.05); ** means they are highly significantly different (p < 0.01). Comparisons of polymorphism rates among the three microsatellite repeat groups (di-, tri-, and tetra-): + means a significant difference (p < 0.05); ++ means a highly significant difference (p < 0.01) (see text).
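A comparison of polymorphism rates between the two length classes can be framed as a test of independence on a 2×2 table of counts. The sketch below uses Pearson's chi-square test without continuity correction, which is an assumption: this section does not state which test the authors applied. It takes the per-class cell counts from the table above, and the function name is illustrative.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (1 degree of freedom) for the table [[a, b], [c, d]].

    Returns (chi2, p_value). No continuity correction is applied.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Rows: not polymorphic, polymorphic. Columns: Class 1 (<=30 bp), Class 2 (>30 bp).
    tables = {
        "Di": (254, 496, 336, 821),
        "Tri": (324, 70, 383, 156),
        "Tetra": (481, 290, 728, 789),
    }
    for group, cells in tables.items():
        chi2, p = chi_square_2x2(*cells)
        print(f"{group}: chi2 = {chi2:.2f}, p = {p:.2g}")
```

With these counts the dinucleotide comparison falls between p = 0.01 and p = 0.05, while the tri- and tetranucleotide comparisons fall below p = 0.01. A library routine such as SciPy's contingency-table test would be the usual choice in practice; the closed form is used here only to keep the example dependency-free.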
Figure 5. Comparison of polymorphic rates among different microsatellite motifs. Nonredundant microsatellites were tested for polymorphism as described in the methods section. *AAT and ATC show significantly higher polymorphism than AAG and AGG (p < 0.001). *AGAT and AAAG are significantly more polymorphic than ACAT and AAAT (p < 0.001).