
Distribution of polymorphic and non-polymorphic microsatellite repeats in Xenopus tropicalis.

Zhenkang Xu, Laura Gutierrez, Matthew Hitchens, Steve Scherer, Amy K Sater, Dan E Wells.

Abstract

Our bioinformatics analysis identified over 91,000 di-, tri-, and tetranucleotide microsatellites in a survey of 25% of the X. tropicalis genome, suggesting there may be over 360,000 within the entire genome. Within the X. tropicalis genome, dinucleotide microsatellites (78.7%) vastly outnumbered tri- and tetranucleotide microsatellites. Similarly, AT-rich repeats are overwhelmingly dominant. The four AT-only motifs (AT, AAT, AAAT, and AATT) account for 51,858 of the 91,304 microsatellites found. Individually, AT microsatellites were the most common repeat found, representing over half of all di-, tri-, and tetranucleotide microsatellites. This contrasts with data from other studies, which show that AC is the most frequent microsatellite in vertebrate genomes (Toth et al. 2000). In addition, we have determined the rate of polymorphism for 5,128 non-redundant microsatellites embedded in unique sequences. Interestingly, this subgroup of microsatellites was found to have significantly longer repeats than genomic microsatellites as a whole. Microsatellite loci with tandem repeat lengths of more than 30 bp exhibited a significantly higher degree of polymorphism than other loci. Pairwise comparisons show that tetranucleotide microsatellites have the highest polymorphic rates. Finally, AAT and ATC showed significantly higher polymorphism than other trinucleotide microsatellites, while AGAT and AAAG were significantly more polymorphic than other tetranucleotide microsatellites.

Keywords:  Xenopus genome; microsatellite; polymorphism

Year:  2008        PMID: 19812773      PMCID: PMC2735965          DOI: 10.4137/bbi.s561

Source DB:  PubMed          Journal:  Bioinform Biol Insights        ISSN: 1177-9322


Introduction

Microsatellites are short tandem repeats of a DNA sequence that are highly abundant in the genomes of eukaryotes (Hearne et al. 1992; Tautz 1993; Schlotterer, 2000). The high levels of allelic variation, codominant inheritance, and ease of analysis have made these markers attractive for population genetics, genome mapping, pedigree studies, and forensic analyses (Wright and Bentzen, 1994; Ellegren, 2000). In spite of the promising aspects of microsatellites as useful molecular markers, little is known about their origin, evolution, organization, dynamics, and roles in genomes. Recently, with the exponential increase in the number of genomic sequences available for different organisms, bioinformatic approaches have been used to investigate the distribution and frequencies of different types of microsatellites (Toth et al. 2000; Katti et al. 2001; Subramanian et al. 2003; La Rota et al. 2005). Comparisons in the frequency and distribution of microsatellites among different eukaryotic genomes have revealed the most dominant microsatellite types vary across taxa (Toth et al. 2000). Xenopus laevis and its diploid sister species X. tropicalis are among the major model systems for the fields of molecular, cell, and developmental biology. In the past several years, the genomic information on Xenopus has accumulated rapidly, and NCBI now carries over 1.25 million EST sequences for X. tropicalis. The Joint Genome Institute (JGI) has released the assembly version 4.1 of the X. tropicalis whole genome shotgun reads at a coverage of 7.65X (http://genome.jgi-psf.org/Xentr4/Xentr4.info.html). The present study represents part of our efforts to generate a genetic map for X. tropicalis using microsatellites as markers. One of our initial steps in generation of the genetic map was to develop a large set of “nonredundant” microsatellite markers. 
In this context we define our nonredundant microsatellite markers as di-, tri-, and tetranucleotide microsatellites containing a minimum of five non-interrupted tandem repeats, which are embedded in single copy flanking sequences and thus (with proper primer design) can amplify a unique genomic location. The purpose of this manuscript is to investigate: (1) the distribution and frequency of perfect di-, tri-, and tetranucleotide microsatellites in the X. tropicalis genome; (2) the relative abundance of different repeat classes and motifs in nonredundant microsatellites; and (3) the variations in the rate of polymorphism within nonredundant microsatellites along with the factors, such as tandem repeat length and base composition, which affect these variations.

Materials and Methods

Animals

DNA samples from two unrelated X. tropicalis frogs from each of the two major inbred strains, Nigerian and Ivory Coast, were used for polymorphic analysis. Frogs and/or DNA samples were generously provided by R. Grainger, U. Va., and R. Harland, UC Berkeley. The JGI sequence data was obtained from a single female Nigerian frog.

Estimation of frequencies of genomic microsatellites

Xenopus tropicalis genome assembly 4.1, generated by the Joint Genome Institute (JGI), Department of Energy (DOE) was used to estimate the distribution and frequencies of di-, tri-, and tetranucleotide microsatellites. For this study, all non-interrupted di-, tri-, and tetranucleotide microsatellites with 5 or more tandem repeats were analyzed. A total of 445 million bases, representing about 25% of the Xenopus tropicalis genome, was analyzed using a Perl script SSRIT (Temnykh et al. 2001). So as not to skew for microsatellites present only on long scaffolds, we analyzed 256 scaffolds ranging in size from 23,997 bp (Scaffold-2010) to 7,817,814 bp (Scaffold-1). The repeat motifs of di-, tri-, and tetranucleotide microsatellites were compressed into core groups in which different reading frames and complementary strand sequence were merged (Table 1). The results from output tables of SSRIT were analyzed using Microsoft Excel.
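The search criterion described above (non-interrupted di-, tri-, and tetranucleotide repeats with 5 or more tandem units) can be illustrated with a short sketch. This is not the SSRIT script used in the study; the function names `is_primitive` and `find_perfect_ssrs` are hypothetical, and the handling of overlapping or compound repeats is a simplifying assumption.

```python
import re

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT' or 'AA')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif[:d] * (n // d) == motif for d in range(1, n)
    )

def find_perfect_ssrs(seq, min_units=5):
    """Return (start, motif, copies) for perfect 2-4 bp repeats with >= min_units copies.

    Illustrative only: each unit length is scanned separately with a
    back-referencing regular expression, and non-primitive motifs are skipped
    so that (AT)n is not also reported as (ATAT)n.
    """
    seq = seq.upper()
    hits = []
    for unit_len in (2, 3, 4):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_units - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                copies = len(match.group(0)) // unit_len
                hits.append((match.start(), motif, copies))
    return sorted(hits)

# Toy sequence: (AT)6 and (AGAT)5 qualify; (AAC)4 is below the 5-unit minimum.
example = "GGC" + "AT" * 6 + "CCG" + "AGAT" * 5 + "TTT" + "AAC" * 4
hits = find_perfect_ssrs(example)
```

On the toy sequence, only the (AT)6 and (AGAT)5 tracts are reported, since the (AAC)4 tract falls below the five-unit threshold.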
Table 1

Core groupings of microsatellite motifs.

| Dinucleotides | Trinucleotides | Tetranucleotides |
|---|---|---|
| AC (CA, GT, TG) | AAC (CAA, ACA, TTG, TGT, GTT) | AAAC (AACA, ACAA, CAAA, TTTG, TTGT, TGTT, GTTT) |
| AG (GA, CT, TC) | AAG (AGA, GAA, CTT, TTC, TCT) | AAAG (AAGA, AGAA, GAAA, TTTC, TTCT, TCTT, CTTT) |
| AT (TA) | AAT (ATA, TAA, ATT, TTA, TAT) | AAAT (ATAA, AATA, TAAA, TTTA, TTAT, TATT, ATTT) |
| CG (GC) | ACC (CAC, CCA, TGG, GGT, GTG) | AACC (CAAC, CCAA, ACCA, TTGG, GTTG, GGTT, TGGT) |
| | ACG (TCG, CGT, GAC, GTC, CGA) | AACG (GAAC, CGAA, ACGA, TTCG, TCGT, CGTT, GTTC) |
| | ACT (CTA, TAC, AGT, GTA, TAG) | AACT (ACTA, CTAA, TAAC, AGTT, GTTA, TTAG, TAGT) |
| | AGC (GCA, CAG, GCT, CTG, TGC) | AAGC (CTTG, TTGC, TGCT, GCTT, AGCA, GCAA, CAAG) |
| | AGG (GGA, GAG, CCT, CTC, TCC) | AAGG (AGGA, GGAA, GAAG, CCTT, CTTC, TTCC, TCCT) |
| | ATC (TCA, CAT, GAT, ATG, TGA) | AAGT (ACTT, CTTA, TTAC, TACT, TAAG, GTAA, AGTA) |
| | CCG (GCG, CGG, GCC, GGC, CGC) | AATC (TCAA, CAAT, ATCA, TTGA, TGAT, GATT, ATTG) |
| | | AATG (ATGA, TGAA, GAAT, CATT, ATTC, TTCA, TCAT) |
| | | AATT (ATTA, TTAA, TAAT) |
| | | ACAG (CAGA, AGAC, GACA, CTGT, TGTC, GTCT, TCTG) |
| | | ACAT (CATA, ATAC, TACA, ATGT, TGTA, GTAT, TATG) |
| | | ACCC (CCCA, CACC, CCAC, GGTG, GGGT, TGGG, GTGG) |
| | | ACCG (CGAC, GACC, CCGA, TCGG, CGGT, GGTC, GTCG) |
| | | ACCT (GGTA, GTAG, TAGG, AGGT, CCTA, CTAC, TACC) |
| | | ACGC (GCAC, CACG, CGCA, TGCG, GCGT, CGTG, GTGC) |
| | | ACGG (CGGA, GGAC, GACG, CCGT, CGTC, GTCC, TCCG) |
| | | ACGT (CGTA, GTAC, TACG) |
| | | ACTC (CTCA, TCAC, CACT, GAGT, AGTG, GTGA, TGAG) |
| | | ACTG (CTGA, TGAC, GACT) |
| | | AGAT (GATA, ATAG, TAGA, ATCT, TCTA, CTAT, TATC) |
| | | AGCT (GCTA, CTAG, TAGC) |
| | | AGGC (GGCA, GCAG, CAGG, GCCT, CCTG, CTGC, TGCC) |
| | | AGGG (GGGA, GGAG, GAGG, CCCT, CCTC, CTCC, TCCC) |
| | | ATCC (CATC, TCCA, CCAT, GATG, ATGG, TGGA, GGAT) |
| | | ATCG (GATC, TCGA, CGAT) |
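The merging of reading frames and complementary strands into core groups can be sketched as follows. The rule used here, taking the alphabetically smallest string among all rotations of the motif and of its reverse complement, is an assumption chosen for illustration; the study does not state how the group labels were selected, although this rule reproduces the labels for the motifs checked below. The function name `core_motif` is hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def core_motif(motif):
    """Map a repeat unit to a single representative of its core group.

    All cyclic rotations (reading frames) of the motif and of its reverse
    complement (opposite strand) are generated, and the alphabetically
    smallest one is returned. The choice of the smallest string as the
    label is an illustrative assumption.
    """
    motif = motif.upper()
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    candidates = []
    for strand in (motif, reverse_complement):
        for i in range(len(strand)):
            candidates.append(strand[i:] + strand[:i])
    return min(candidates)

# A few members of the groups listed in Table 1.
examples = {m: core_motif(m) for m in ("TG", "GC", "TTA", "CTT", "GATA", "TATC")}
```

With this rule, GATA and TATC both collapse to AGAT, and TG collapses to AC, matching the groupings in Table 1.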

Selection of non-redundant microsatellites and polymorphism testing

The term “nonredundant microsatellites” refers to di-, tri-, and tetranucleotide microsatellites containing a minimum of five non-interrupted tandem repeats that are embedded in single copy sequences. These microsatellites were identified by a bioinformatics screen from the Xenopus tropicalis genome assembly 2.0. The data mining script was based on the publicly available computer program, Tandem Repeats Finder (TRF) (Benson, 1999), and modified to find di-, tri-, and tetranucleotide microsatellites with more than 5 repeats embedded in unique flanking sequences suitable for primer design. Initially, nonredundant di-, tri-, and tetranucleotide microsatellites were identified randomly from the entire genome. Subsequently, identification of nonredundant microsatellites was targeted to underrepresented scaffolds. Once nonredundant tri- and tetranucleotide repeat sequences had been identified from all scaffolds that include them, the data mining script was further modified to identify primarily dinucleotide repeats. Primer pairs with an annealing temperature of 58 °C (± 2 °C) were designed and initially tested on agarose gels to confirm their amplification under standard conditions (58 °C, 1.5 mM Mg2+, and 30 cycles). All primer pairs that amplified single bands were tested for polymorphisms between Nigerian and Ivory Coast strains. Polymerase chain reaction conditions consisted of 10 ng DNA, 0.5 μM of forward and reverse primers, 1.5 mM MgCl2, 0.2 mM of dGTP, dCTP, and dTTP, 0.02 mM of dATP, 0.05 U/μl of Taq, 1X buffer, and 0.07 μCi/μl of 35S dATP. The PCR amplification profile was 94 °C for 4 min, followed by 30 cycles of 94 °C for 1 min, 58 °C for 1 min, and 72 °C for 2 min, with a final elongation of 30 min at 72 °C. Amplified products were electrophoresed in polyacrylamide gels and visualized by autoradiography. The known sequence of the pGEM-3Zf(+) vector was used as a ladder to establish the size of the microsatellites.

Statistical analyses

Significance of the differences in length of di-, tri-, and tetranucleotide microsatellites and the mean copy number of different motifs was determined by ANOVA. This step was followed by a post-test using the GraphPad Prism software, which employs the Bonferroni correction to adjust for multiple comparisons. Comparisons in average copy numbers between genomic and nonredundant microsatellites were carried out by Student’s t-tests. Contingency tables were used to compare the polymorphism among microsatellites with different lengths, different types of microsatellites, and different motifs.

Results

Distribution and frequencies of di-, tri-, and tetranucleotide microsatellites

A total of 91,304 perfect di-, tri-, and tetranucleotide microsatellites with a minimum of five tandem repeat units were identified in 444,970,789 bp (~ 25%) of the X. tropicalis genome (Table 2). The total length of perfect di-, tri-, and tetranucleotide sequence represented in this sample is 1,705,957 bp, representing 0.38% of the total DNA analyzed. Dinucleotide microsatellites account for 78.7% of identified microsatellites and significantly outnumber tri- and tetranucleotide microsatellites (p< 0.001). The average distance between two trinucleotide microsatellites (59.9 kb) is almost 10 times that of dinucleotide microsatellites (6.2 kb). Our analysis suggests that in every one million base pairs of genomic sequence, there are an average of 161 dinucleotide, 27 tetranucleotide, and 17 trinucleotide microsatellites.
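The density and spacing figures quoted above follow directly from the locus counts and the number of bases surveyed. The sketch below reproduces that arithmetic; the variable and function names are illustrative, and the genome-wide extrapolation simply assumes the surveyed 25% is representative.

```python
SURVEYED_BP = 444_970_789          # bases analyzed (~25% of the genome)
REPEAT_BP = 1_705_957              # total length of perfect repeat sequence
LOCI = {"di": 71_827, "tri": 7_432, "tetra": 12_045}

def loci_per_mbp(n_loci, surveyed_bp=SURVEYED_BP):
    """Number of loci per million base pairs of surveyed sequence."""
    return n_loci / (surveyed_bp / 1e6)

def interval_kbp(n_loci, surveyed_bp=SURVEYED_BP):
    """Average distance between loci, in kilobase pairs."""
    return surveyed_bp / n_loci / 1e3

density = {k: loci_per_mbp(n) for k, n in LOCI.items()}
spacing = {k: interval_kbp(n) for k, n in LOCI.items()}

# Fraction of surveyed DNA made up of perfect repeats (about 0.38%).
repeat_fraction = REPEAT_BP / SURVEYED_BP

# Naive genome-wide extrapolation, assuming the 25% sample is representative.
genome_estimate = sum(LOCI.values()) / 0.25
```

Rounded, these give roughly 161, 17, and 27 loci per Mbp for di-, tri-, and tetranucleotide repeats, and an extrapolated total of about 365,000 loci, in line with the "over 360,000" estimate.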
Table 2

Distribution of microsatellites in 25% of the X. tropicalis genome.

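| Repeat type | Motif | Number of loci | % of total loci | % of repeat type loci | Number of loci/Mbp | Loci interval distance (Kbp) |
|---|---|---|---|---|---|---|
| Di- | AT | 46488 | 50.92 | 64.72 | 104.47 | 9.57 |
| | AC | 17221 | 18.86 | 23.98 | 38.70 | 25.84 |
| | AG | 7851 | 8.60 | 10.93 | 17.64 | 56.68 |
| | CG | 267 | 0.29 | 0.37 | 0.60 | 1666.56 |
| | Total | 71827 | 78.67 | 100.00 | 161.42 | 6.20 |
| Tri- | AAT | 5080 | 5.56 | 68.35 | 11.42 | 87.59 |
| | ATC | 580 | 0.64 | 7.80 | 1.30 | 767.19 |
| | AAG | 409 | 0.45 | 5.50 | 0.92 | 1087.95 |
| | AGC | 344 | 0.38 | 4.63 | 0.77 | 1293.52 |
| | AGG | 292 | 0.32 | 3.93 | 0.66 | 1523.87 |
| | AAC | 272 | 0.30 | 3.66 | 0.61 | 1635.92 |
| | ACT | 245 | 0.27 | 3.30 | 0.55 | 1816.21 |
| | ACC | 102 | 0.11 | 1.37 | 0.23 | 4362.46 |
| | ACG | 59 | 0.06 | 0.79 | 0.13 | 7541.88 |
| | CCG | 49 | 0.05 | 0.66 | 0.11 | 9081.04 |
| | Total | 7432 | 8.14 | 100.00 | 16.70 | 59.87 |
| Tetra- | AGAT | 8973 | 9.83 | 74.50 | 20.17 | 49.59 |
| | ACAT | 1677 | 1.84 | 13.92 | 3.77 | 265.34 |
| | ACAG | 441 | 0.48 | 3.41 | 0.99 | 1009.00 |
| | AAAT | 272 | 0.30 | 2.26 | 0.61 | 1635.92 |
| | AAAG | 255 | 0.28 | 2.12 | 0.57 | 1744.98 |
| | AAGG | 90 | 0.10 | 0.75 | 0.20 | 4944.12 |
| | AAAC | 64 | 0.07 | 0.53 | 0.14 | 6952.67 |
| | AACT | 31 | 0.03 | 0.26 | 0.07 | 14353.90 |
| | AGGC | 29 | 0.03 | 0.24 | 0.07 | 15343.82 |
| | AGGG | 27 | 0.03 | 0.22 | 0.06 | 16480.40 |
| | AATC | 26 | 0.03 | 0.22 | 0.06 | 17114.26 |
| | AATG | 26 | 0.03 | 0.22 | 0.06 | 17114.26 |
| | AATT | 18 | 0.02 | 0.15 | 0.04 | 24720.60 |
| | AAGT | 16 | 0.02 | 0.13 | 0.04 | 27810.67 |
| | ATCC | 16 | 0.02 | 0.13 | 0.04 | 27810.67 |
| | ACGT | 15 | 0.02 | 0.12 | 0.03 | 29664.72 |
| | ACTG | 14 | 0.02 | 0.12 | 0.03 | 31783.63 |
| | ACTC | 13 | 0.01 | 0.11 | 0.03 | 34228.52 |
| | ACCT | 12 | 0.01 | 0.10 | 0.03 | 37080.90 |
| | AACC | 8 | 0.01 | 0.07 | 0.02 | 55621.35 |
| | AACG | 7 | 0.01 | 0.06 | 0.02 | 63567.26 |
| | ACCC | 5 | 0.01 | 0.04 | 0.01 | 88994.16 |
| | AAGC | 4 | 0.00 | 0.03 | 0.01 | 111242.70 |
| | AGCT | 3 | 0.00 | 0.02 | 0.01 | 148323.60 |
| | ACGC | 2 | 0.00 | 0.02 | 0.00 | 222485.39 |
| | ATCG | 1 | 0.00 | 0.01 | 0.00 | 444970.79 |
| | Total | 12045 | 13.19 | 100.00 | 27.07 | 36.94 |
| Total | | 91304 | 100 | | 205.1914 | 4.8735 |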
Among the di-, tri-, and tetranucleotide repeat classes of microsatellites, the most abundant repeat motifs are AT, AAT, and AGAT respectively (Table 2). These three repeat motifs account for more than 66% of the microsatellites present in the X. tropicalis genome, with the AT microsatellite alone representing over 50% of the total microsatellites in the genome. Figure 1 graphically shows the mean number of tandem repeats present in each of the four most abundant microsatellite motifs for each repeat class. Interestingly, for both the dinucleotide and tetranucleotide repeat classes, the most abundant motif also contained the highest number of tandem repeats, that is, both the AT and AGAT repeats were significantly longer than other di-, and tetranucleotide repeats (p < 0.001). However, this trend was not seen in the trinucleotide repeat class, as the ATC repeat class is not significantly more prevalent than the AAT repeat class.
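The motif shares quoted above can be checked against the locus counts in Table 2. The following is a small arithmetic sketch; the dictionary and function names are illustrative, and only the motifs needed for the check are included.

```python
TOTAL_LOCI = 91_304

# Locus counts taken from Table 2 (only the motifs needed here).
COUNTS = {"AT": 46_488, "AAT": 5_080, "AGAT": 8_973, "AAAT": 272, "AATT": 18}

def share(motifs):
    """Fraction of all surveyed loci contributed by the given motifs."""
    return sum(COUNTS[m] for m in motifs) / TOTAL_LOCI

top_three = share(["AT", "AAT", "AGAT"])          # most abundant motif per class
at_alone = share(["AT"])                          # AT dinucleotide only
at_only = share(["AT", "AAT", "AAAT", "AATT"])    # motifs composed solely of A and T

at_only_loci = sum(COUNTS[m] for m in ("AT", "AAT", "AAAT", "AATT"))
```

The three most abundant motifs together come to just over 66% of loci and AT alone to just under 51%. The four AT-only motifs sum to 51,858 loci, the figure given elsewhere in the paper.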
Figure 1

Mean tandem repeat number of microsatellite motifs in genomic DNA. Mean repeat numbers were determined for each di-, tri-, and tetranucleotide microsatellite containing a minimum of five perfect tandem repeats. Numbers for the entire genome were estimated from a survey of 444,970,789 base pairs (~25%) of the X. tropicalis genome. Only the four most prevalent motifs for each size class are shown. The AGAT tetranucleotide motif was significantly more common than other tetranucleotide motifs (p < 0.001). Similarly, the AT dinucleotide motif was significantly more common than other dinucleotide motifs (p < 0.001). Standard errors are shown.

Relative abundance of nonredundant di-, tri-, and tetranucleotide microsatellites

As part of an ongoing effort to identify PCR amplifiable markers for use in developing a genetic map of X. tropicalis, data mining strategies were developed to identify microsatellites embedded in unique sequences suitable for unique genomic localization. To this end, we identified 5,128 non-redundant microsatellites, which were subsequently analyzed elsewhere for polymorphisms (see methods). The distribution and relative abundance of these nonredundant di-, tri-, and tetranucleotide microsatellites is shown in Table 3 and Figure 2. As was seen in the genomic survey, AT, AAT, and AGAT are also the most abundant nonredundant motifs, accounting for 90.30%, 73.52% and 59.48% of di-, tri-, and tetranucleotide motifs respectively (Table 3). Likewise, AC, ATC, and ACAT are the second most abundant motifs in their respective repeat classes. CG repeats, which were found in low numbers in the genomic survey, were absent from our set of nonredundant microsatellites.
Table 3

Distribution of nonredundant microsatellites in X. tropicalis.

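| Repeat type | Motif | Number | Abundance (% of repeat type) |
|---|---|---|---|
| Di- | AT | 1722 | 90.30 |
| | AC | 100 | 5.24 |
| | AG | 85 | 4.46 |
| | Total | 1907 | 100 |
| Tri- | AAT | 686 | 73.52 |
| | ATC | 104 | 11.15 |
| | AAG | 53 | 5.68 |
| | AGG | 38 | 4.07 |
| | ACT | 27 | 2.89 |
| | AGC | 11 | 1.18 |
| | AAC | 6 | 0.64 |
| | ACG | 5 | 0.54 |
| | ACC | 3 | 0.32 |
| | Total | 933 | 100 |
| Tetra- | AGAT | 1361 | 59.48 |
| | ACAT | 603 | 26.35 |
| | AAAG | 86 | 3.76 |
| | AAAT | 62 | 2.71 |
| | ACAG | 36 | 1.57 |
| | AAAC | 16 | 0.70 |
| | AAGG | 14 | 0.61 |
| | AACT | 13 | 0.57 |
| | ACGT | 12 | 0.52 |
| | AGGC | 11 | 0.48 |
| | AATT | 10 | 0.44 |
| | AATG | 10 | 0.44 |
| | ACCT | 10 | 0.44 |
| | AAGT | 9 | 0.39 |
| | ACTC | 8 | 0.35 |
| | AATC | 6 | 0.26 |
| | AACG | 5 | 0.22 |
| | ATCC | 4 | 0.17 |
| | AACC | 4 | 0.17 |
| | AGGG | 3 | 0.13 |
| | ACGC | 2 | 0.09 |
| | AAGC | 1 | 0.04 |
| | ACTG | 1 | 0.04 |
| | ACGG | 1 | 0.04 |
| | Total | 2288 | 100 |
| Total | | 5128 | |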
Figure 2

Relative abundance in genomic and nonredundant DNA of each motif within each of the three microsatellite repeat size classes analyzed. The abundance of each motif within both the genomic sample and the nonredundant sample is plotted as a percentage of the abundance of the entire size class. AT, AAT, and AGAT were statistically more abundant than other members of their respective size class in both genomic and nonredundant samples. Only the most prevalent motifs for each size class are shown. Nonredundant results are shown in black, and genomic results are shown in gray.

Table 4 shows a comparison in average number of repeat units between genomic and nonredundant di-, tri-, and tetranucleotide microsatellite repeat classes. In all cases, the nonredundant microsatellites have significantly longer repeats than their genomic counterparts (p < 0.001). This trend is also seen for most individual repeat motifs and is most pronounced for the dinucleotide motifs (Fig. 3).
Table 4

Comparison in mean repeat size between genomic and nonredundant microsatellites.

| | Genomic (# of repeat units) | S.E. | Nonredundant (# of repeat units) | S.E. |
|---|---|---|---|---|
| Di- | 8.33 | 0.27 | 23.96* | 0.06 |
| Tri- | 6.28 | 0.16 | 9.58* | 0.07 |
| Tetra- | 7.43 | 0.07 | 7.99* | 0.07 |

* The nonredundant microsatellites have significantly longer repeats than their genomic counterparts (Student's t-tests: for dinucleotides t = 57.02, df = 2958, p < 0.001; for trinucleotides t = 18.80, df = 1686, p < 0.001; for tetranucleotides t = 5.67, df = 4941, p < 0.001).
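As a rough consistency check, the reported t statistics can be approximated from the rounded means and standard errors in Table 4 alone. The sketch below uses the unpooled form t = (difference of means) / sqrt(SE1² + SE2²). This is an assumption made for illustration, not necessarily the exact Student's t computation used in the study, so small differences from the published values are expected.

```python
import math

def t_from_summary(mean_a, se_a, mean_b, se_b):
    """Approximate t statistic from two means and their standard errors.

    Uses the unpooled standard error of the difference, sqrt(se_a^2 + se_b^2).
    """
    return (mean_b - mean_a) / math.sqrt(se_a ** 2 + se_b ** 2)

# (genomic mean, genomic S.E., nonredundant mean, nonredundant S.E.) from Table 4.
TABLE_4 = {
    "di": (8.33, 0.27, 23.96, 0.06),
    "tri": (6.28, 0.16, 9.58, 0.07),
    "tetra": (7.43, 0.07, 7.99, 0.07),
}

t_values = {repeat: t_from_summary(*row) for repeat, row in TABLE_4.items()}
```

The approximations come out close to the published values of 57.02, 18.80, and 5.67, which suggests the tabulated means and standard errors are consistent with the reported tests.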

Figure 3

Mean tandem repeat number in nonredundant DNA for each microsatellite motif. Mean repeat numbers were determined for each di-, tri-, and tetranucleotide contained in our nonredundant microsatellite sample (see methods). Nonredundant results are shown in black, and genomic results are shown in gray. Only the most prevalent motifs for each size class are shown (no GC microsatellites were seen in our nonredundant sample). Standard errors are shown.

Polymorphism of di-, tri-, and tetranucleotide microsatellites

Effects of repeat length on the degree of polymorphism within microsatellites

To examine the relationship between repeat length and degree of polymorphism, microsatellite loci were classified into seven groups based on the length of their core repeat sequences. The percent of each group that is polymorphic is displayed graphically in Figure 4 for each repeat length group. Clear trends can be observed for the tri- and tetranucleotide microsatellites showing a correlation between repeat length and degree of polymorphism. To determine if these trends were statistically significant, each microsatellite motif was divided into two length classes. Loci with a motif length 30 bp or less were designated as Class I markers, while those more than 30 bp were designated as Class II markers. Analysis of these groups revealed the Class II markers exhibited a significantly higher degree of polymorphism than Class I markers for all the three microsatellite repeat classes (di-, tri-, and tetranucleotide) (Table 5). This strongly suggests that repeat length does affect the degree of polymorphism for microsatellites.
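The Class I / Class II split can be sketched as below. It assumes that "motif length" in the text means the total length of the repeat tract, that is, the repeat unit length multiplied by the number of tandem copies; the function names are hypothetical.

```python
CUTOFF_BP = 30  # Class I: tract length <= 30 bp; Class II: tract length > 30 bp

def tract_length_bp(motif, copies):
    """Total length of the repeat tract, assuming unit length times copy number."""
    return len(motif) * copies

def length_class(motif, copies, cutoff_bp=CUTOFF_BP):
    """Assign a locus to Class I or Class II by its repeat tract length."""
    return "I" if tract_length_bp(motif, copies) <= cutoff_bp else "II"

def min_copies_for_class_ii(unit_len, cutoff_bp=CUTOFF_BP):
    """Smallest copy number that pushes a repeat of this unit length into Class II."""
    return cutoff_bp // unit_len + 1

thresholds = {k: min_copies_for_class_ii(k) for k in (2, 3, 4)}
```

Under this reading, a dinucleotide locus needs at least 16 copies, a trinucleotide locus 11, and a tetranucleotide locus 8 to count as Class II.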
Figure 4

Polymorphism rate for repeat length classes of each nonredundant microsatellite motif. Each microsatellite motif was subdivided into seven groups based on the length of their core repeat sequences. The total number of loci analyzed is shown in each length class.

Table 5

Comparison of polymorphic and non-polymorphic markers by repeat size.

| | | Class 1 (≤30 bp) | Class 2 (>30 bp) | Total |
|---|---|---|---|---|
| Di* | Not Polymorphic | 254 (43.1%) | 496 (37.66%) | 750 (39.33%) |
| | Polymorphic | 336 (56.9%) | 821 (62.34%) | 1157 (60.67%) |
| | Total | 590 | 1317 | 1907 |
| Tri** | Not Polymorphic | 324 (45.8%) | 70 (31.0%) | 394 (42.2%) |
| | Polymorphic | 383 (54.2%) | 156 (69.0%) | 539 (57.8%) |
| | Total | 707 | 226 | 933 |
| Tetra** | Not Polymorphic | 481 (39.8%) | 290 (26.90%) | 873 (35.09%) |
| | Polymorphic | 728 (60.2%) | 789 (73.10%) | 1615 (64.91%) |
| | Total | 1209 | 1079 | 2288 |
| All** | Not Polymorphic | 1191 (43.11%) | 856 (32.66%) | 2047 (38.02%) |
| | Polymorphic | 1572 (56.89%) | 1765 (67.34%) | 3337 (61.98%) |
| | Total | 2506 | 2622 | 5128 |
| | | + | ++ | ++ |

Comparing polymorphism rates between Class 1 and 2 for each microsatellite repeat group (di-, tri-, and tetra-), * means the 2 size classes are significantly different (p < 0.05), and ** means highly significantly different (p < 0.01). Comparing polymorphism rates among the three microsatellite repeat groups (di-, tri-, and tetra-), + means significant differences (p < 0.05), and ++ means highly significant differences (p < 0.01) (see text).
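The Class 1 versus Class 2 comparisons in Table 5 can be illustrated with a 2 × 2 contingency test. The sketch below assumes a Pearson chi-square test without continuity correction; the study states only that contingency tables were used, so the exact test applied is an assumption. Only the di- and trinucleotide counts are used, and the function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p). With 1 degree of freedom the upper-tail probability
    of the chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: not polymorphic, polymorphic. Columns: Class 1, Class 2 (counts from Table 5).
di_chi2, di_p = chi_square_2x2(254, 496, 336, 821)
tri_chi2, tri_p = chi_square_2x2(324, 70, 383, 156)
```

Under these assumptions the dinucleotide comparison gives a p value between 0.01 and 0.05 and the trinucleotide comparison a p value below 0.01, matching the * and ** annotations in Table 5.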

Variations in polymorphism among different types of microsatellites

Statistical analysis further indicates the polymorphic rates of the three repeat classes of microsatellites analyzed are significantly different (Table 5). Here, “polymorphism rate” refers to the proportion of microsatellites in a given class that were shown to be polymorphic among individuals from the two strains of X. tropicalis. The pairwise comparisons show that tetranucleotide microsatellites have the highest polymorphic rates, significantly higher than dinucleotide and trinucleotide microsatellites (p < 0.01). Within the Class II markers, tetranucleotide microsatellites also exhibit the highest rate of polymorphism; however, this difference is significant only relative to dinucleotide microsatellites (p < 0.01), and not relative to trinucleotide microsatellites. In Class I markers, tetranucleotide microsatellites also exhibit a significantly higher polymorphism rate than trinucleotide loci (p < 0.05). However, the polymorphism rate for tetranucleotide microsatellites was not significantly higher than that of dinucleotide microsatellites (p = 0.21).

Variations in polymorphism among different motifs of microsatellites

Figure 5 shows the rate of polymorphism for the most common microsatellite motifs. Although there was no significant difference in the rate of polymorphism among the three dinucleotide motifs (AT, AC, and AG), among the four most abundant trinucleotide motifs, AAT and ATC show significantly higher polymorphism than AAG and AGG (p < 0.01). Likewise, among the most abundant tetranucleotide motifs, AGAT and AAAG are significantly more polymorphic than ACAT and AAAT (p < 0.01). The higher polymorphism of microsatellites with motifs of AAT, ATC, AGAT, and AAAG seems to be correlated with their relatively longer repeat length (Fig. 3).
Figure 5

Comparison of polymorphic rates among different microsatellite motifs. Nonredundant microsatellites were tested for polymorphism as described in the methods section. *AAT and ATC show significantly higher polymorphism than AAG and AGG (p < 0.001). *AGAT and AAAG are significantly more polymorphic than ACAT and AAAT (p < 0.001).

Discussion

Characteristics of X. tropicalis genome and the distribution of microsatellites

Our bioinformatics analysis found over 91,000 di-, tri-, and tetranucleotide microsatellites in ~25% of the X. tropicalis genome, suggesting there may be over 360,000 within the entire genome. Within the X. tropicalis genome, dinucleotide (78.7%) microsatellites vastly outnumber tri- (8.1%) and tetranucleotide (13.2%) microsatellites. Although there is some variation in the literature, these observations generally agree with data from other vertebrates (Toth et al. 2000). In the present study, the trinucleotide repeats are the least abundant of the microsatellites, which is consistent with studies in other vertebrates as well. Trinucleotide repeats, however, are more prevalent in protein coding regions, while di- and tetranucleotide repeats are scarce in exons (Li et al. 2002, 2004; Morgante et al. 2002; Toth et al. 2000; Dieringer and Schlotterer, 2003). The latter is probably the result of negative selection against frameshift mutations, which limits the expansion of microsatellites in coding sequences (Metzgar et al. 2000). During our analysis of the three types of microsatellites in scaffolds from the Xenopus tropicalis genome assembly 4.1, we noticed trinucleotide repeats were over-represented in some scaffolds and underrepresented in others. This could enable us to distinguish exon-rich scaffolds from those scaffolds containing primarily intergenic regions. In the X. tropicalis genome the AT-rich repeats are overwhelmingly dominant. The most abundant motif in each of the three types of microsatellites (AT, AAT, and AGAT) is AT-rich (Table 2). Among all the di-, tri-, and tetranucleotide repeats identified, 51,858 out of 91,304 repeats (56.8%) are 100% AT repeats (i.e. AT, AAT, AAAT, and AATT), while 90,128 (99%) repeats have an AT content not less than 50%. The high abundance of the AT-rich repeats in X. tropicalis could be partly attributable to the low melting temperature of AT-rich fragments and high mutation rates in poly (A/T) tracts (Prasad et al. 2005).
However, these factors cannot explain why different taxa have different abundant repeat motifs. Although exceptions exist (Schug et al. 1998), AC repeats have been reported as the most common dinucleotide repeats in most animals, including humans (Beckmann and Weber, 1992; Nadir et al. 1996; Katti, 2001), primates (Jurka and Pethiyagoda, 1995; Toth et al. 2000), rodents (Beckmann and Weber, 1992; Toth et al. 2000), chickens (Moran, 1993), Fugu (Edwards et al. 1998), bivalves (Cruz et al. 2005), and Drosophila (Schug et al. 1998; Bachtrog et al. 1999). In contrast, AG repeats are found to be the most abundant dinucleotide repeats in honey bees (Estoup et al. 1993) and yellowjacket wasps (Thoren et al. 1995), while AT repeats dominate the dinucleotide microsatellites in silkworms (Prasad et al. 2005) and yeast (Toth et al. 2000). Significantly, the predominance of AT repeats in the X. tropicalis genome found in the present study is the first such report in vertebrates. Interestingly, our results differ from those of Toth et al. (2000) who found that AC repeats are the most abundant repeats in vertebrates, occurring more than twice as frequently as AT repeats. In their study, 12.15% of the vertebrate taxonomic group was represented by Xenopus laevis, sister species of X. tropicalis. Further analysis is needed to determine whether the distribution of repeat motifs observed in X. tropicalis is characteristic of Xenopus laevis or other closely related frog species. In contrast with dinucleotide abundance levels, the most prevalent tri- and tetranucleotide repeats of X. tropicalis, AAT and AGAT, are consistent with the results in some other vertebrates including X. laevis, although differing from those seen in some mammalian species (Edwards et al. 1998; Toth et al. 2000). 
Schlotterer (2000) has suggested that taxon-specific predominance of different repeat motifs could be influenced by a different base composition in the genome as well as differences in the DNA mismatch repair systems. In addition, Prasad et al. (2005) have suggested there is a potential relationship between distribution of repeat motifs and higher-order chromatin structure. Tetranucleotide microsatellites containing the AGAT (GATA) motif are known to be associated with the sex chromosome in humans and to play a role in higher order chromatin organization and function (Singh et al. 1994; Zhao et al. 1995; Subramanian et al. 2003). X. tropicalis certainly provides a unique opportunity for comparative studies on the role of AGAT repeats because of the predominance of AGAT repeats in its genome.

Comparisons between nonredundant and genomic microsatellites

The nonredundant di-, tri-, and tetranucleotide microsatellites, which were used as candidate markers for our linkage map, were independently identified from the X. tropicalis genome. The criteria for identifying nonredundant markers are that they have unique flanking sequences that are long enough and have sufficient complexity to enable the design of unique PCR primers (Sharrocks, 1994). Among the three repeat types of nonredundant microsatellites analyzed, the distribution pattern of different motifs is generally consistent with that of genomic repeats, in that the most abundant di-, tri-, and tetranucleotide nonredundant motifs are AT, AAT, and AGAT. Although this subset of microsatellite loci is similar to those identified in the entire X. tropicalis genome, the relative abundance of different motifs within di-, tri-, or tetranucleotide microsatellites shows some divergence between nonredundant and genomic repeats. For example, the AT repeats account for 64.7% of the total dinucleotide genomic loci, but 90.3% of the nonredundant dinucleotide loci, suggesting a smaller proportion of AC and AG repeats are embedded in unique sequences with long enough flanking sequences for useful primer design. The discrepancies between the abundance of specific motifs in non-redundant loci versus genomic microsatellites may result from the appearance of AC or AG repeat strings embedded within more complex repetitive sequences. Large complex minisatellite repeats comprise over 1% of the X. tropicalis genome, with our initial surveys suggesting sequences containing AC repeats appear in very high copy numbers in these minisatellites. Inclusion of AC or AG repeat strings in larger, more complex minisatellite sequences could skew the distribution of repeat motifs among genomic microsatellites.

Factors affecting microsatellite variation

It is well known that microsatellites are hot spots for genome mutation and variation (Weber, 1990; Ellegren, 2004). The variability seen in microsatellites is primarily due to sequence length polymorphisms caused by variable numbers of tandem repeats (Ellegren, 2000, 2004; Neff and Gross, 2001). In the present study, we compared the percentage of polymorphic loci in two different size classes (class I: length ≤ 30 bp; class II: length >30 bp). We found class II markers are significantly more polymorphic than class I markers for all three microsatellite repeat types. This suggests loci with larger numbers of repeats are more prone to mutation/expansion than those with fewer repeats. This result is consistent with other observations based on pedigree analyses (Brinkmann et al. 1998; Schug et al. 1998; Bachtrog et al. 2000; Kayser et al. 2000) and population genetics studies (Goldstein and Clark, 1995). The correlation between repeats length and the variability of microsatellites is understandable according to the replication slippage model, which is one of the widely accepted mutation mechanisms (Levinson and Gutman, 1987), as the longer the repeats, the more chances exist for the slipped-strand mispairing to occur. Repeat type is another factor that has been found to affect stability of microsatellites (Schlotterer, 2000; Ellegren, 2004). Our study compared the polymorphism rate of 1,907 di-, 933 tri-, and 2,288 tetranucleotide microsatellites. The results indicate the tetranucleotide microsatellites have the highest rate of polymorphism while the dinucleotide microsatellites are the least polymorphic. Our results agree with Weber and Wong’s observation (1993) that the mutation rate for tetranucleotides is almost four times higher than that of dinucleotide repeats. Sia et al. (1997) reported similar mutation rates for tetranucleotide and dinucleotides repeats. However, two subsequent studies using different methodologies (Chakraborty et al. 1997; Lee et al. 
1999) reached the conflicting conclusion that dinucleotide microsatellites have higher mutation rates than tetranucleotide microsatellites. However, the discrepancies between these studies may have resulted from insufficient data. It is worth noting that all three studies used only a small number of loci: Weber and Wong used 19 loci, Sia et al. used one di- and one tetranucleotide loci, Chakraborty et al. used 30 loci, and Lee et al. used two loci. Additional analysis is required to clarify the effects of repeat type on the polymorphism of microsatellites. It has also been reported that the base composition of the repeat motifs may play a role in the variations of microsatellites. When they compared slippage rates between different microsatellites with different base compositions in Drosophila using an in vitro replication system, Schlotterer and Tautz (1992) found that sequences with high AT content mutate faster than those with high GC content. In contrast, Bachtrog et al. (2000) found that GT/CA-containing microsatellites of D. melanogaster had the highest mutation rate, while the AT-containing microsatellites had the lowest. Still another study showed that the CA and GA repeats of similar length in Escherichia coli genome exhibit similar mutability (Eckert and Yan, 2000). Although, our results indicate there are no differences in the polymorphic rate among the three dinucleotide motifs (AT, AC, and AG), among those most predominant tri- and tetranucleotide microsatellites, AAT and ATC exhibit a higher rate of polymorphism than AAG and AGG, and AGAT and AAAG are more frequently polymorphic than ACAT and AAAT. It remains unclear if the higher variability in AAT, ATC, AGAT, and AAAG microsatellites is a universal or species-specific phenomenon. In humans, an AAAG tetranucleotide locus has also demonstrated hypermutability (Talbot et al. 1995). 
It is worth noting that the four tri- and tetranucleotide motifs showing the highest rates of polymorphism all have a higher number of repeat units per locus than their less polymorphic counterparts. However, this trend does not hold for the dinucleotide loci: AT loci have significantly more tandem repeats than either AC or AG loci, yet the rate of polymorphism of AT does not differ significantly from that of the other two.
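The size-class comparison described in this discussion (class I: length ≤ 30 bp; class II: length > 30 bp) can be illustrated with a short sketch. It assigns each locus to a class, tabulates polymorphic versus monomorphic counts, and applies a Pearson chi-square test to the resulting 2×2 table. The choice of test is an assumption, because this section does not name the statistical test used. The locus records and counts below are invented for illustration and are not data from the study.

```python
import math
from dataclasses import dataclass


@dataclass
class Locus:
    motif: str
    repeat_length_bp: int  # total length of the tandem repeat tract
    polymorphic: bool


def size_class(locus: Locus, threshold_bp: int = 30) -> str:
    """Class I: length <= threshold; class II: length > threshold."""
    return "II" if locus.repeat_length_bp > threshold_bp else "I"


def polymorphism_table(loci):
    """Count [polymorphic, monomorphic] loci in each size class."""
    table = {"I": [0, 0], "II": [0, 0]}
    for locus in loci:
        table[size_class(locus)][0 if locus.polymorphic else 1] += 1
    return table


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical loci (NOT data from the study): 100 per size class.
loci = (
    [Locus("AC", 24, True)] * 40 + [Locus("AC", 24, False)] * 60
    + [Locus("AC", 42, True)] * 70 + [Locus("AC", 42, False)] * 30
)

table = polymorphism_table(loci)
chi2, p_value = chi_square_2x2(*table["I"], *table["II"])
for cls, (poly, mono) in table.items():
    print(f"class {cls}: {poly / (poly + mono):.1%} polymorphic")
print(f"chi-square = {chi2:.2f}, p = {p_value:.2g}")
```

Running the same tabulation separately for di-, tri-, and tetranucleotide loci would mirror the per-repeat-type comparison described in the text.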
  45 in total

1.  Slippage synthesis of simple sequence DNA.

Authors:  C Schlötterer; D Tautz
Journal:  Nucleic Acids Res       Date:  1992-01-25       Impact factor: 16.971

2.  Mutation rate in human microsatellites: influence of the structure and length of the tandem repeat.

Authors:  B Brinkmann; M Klintschar; F Neuhuber; J Hühne; B Rolf
Journal:  Am J Hum Genet       Date:  1998-06       Impact factor: 11.025

3.  CENSOR--a program for identification and elimination of repetitive elements from DNA sequences.

Authors:  J Jurka; P Klonowski; V Dagman; P Pelton
Journal:  Comput Chem       Date:  1996-03

Review 4.  Notes on the definition and nomenclature of tandemly repetitive DNA sequences.

Authors:  D Tautz
Journal:  EXS       Date:  1993

5.  Mutation of human short tandem repeats.

Authors:  J L Weber; C Wong
Journal:  Hum Mol Genet       Date:  1993-08       Impact factor: 6.150

6.  Relative stabilities of dinucleotide and tetranucleotide repeats in cultured mammalian cells.

Authors:  J S Lee; M G Hanford; J L Genova; R A Farber
Journal:  Hum Mol Genet       Date:  1999-12       Impact factor: 6.150

Review 7.  Evolutionary dynamics of microsatellite DNA.

Authors:  C Schlötterer
Journal:  Chromosoma       Date:  2000-09       Impact factor: 4.316

8.  Characterization of (GT)n and (CT)n microsatellites in two insect species: Apis mellifera and Bombus terrestris.

Authors:  A Estoup; M Solignac; M Harry; J M Cornuet
Journal:  Nucleic Acids Res       Date:  1993-03-25       Impact factor: 16.971

9.  Microsatellite repeats in pig (Sus domestica) and chicken (Gallus domesticus) genomes.

Authors:  C Moran
Journal:  J Hered       Date:  1993 Jul-Aug       Impact factor: 2.645

10.  Genome-wide analysis of microsatellite repeats in humans: their abundance and density in specific genomic regions.

Authors:  Subbaya Subramanian; Rakesh K Mishra; Lalji Singh
Journal:  Genome Biol       Date:  2003-01-23       Impact factor: 13.583

