
Patterns of evolutionary conservation of microsatellites (SSRs) suggest a faster rate of genome evolution in Hymenoptera than in Diptera.

Eckart Stolle, Jonathan H Kidner, Robin F A Moritz.

Abstract

Microsatellites, or simple sequence repeats (SSRs), are common and widespread DNA elements in the genomes of many organisms. However, their dynamics in genome evolution are unclear, although they are thought to evolve neutrally. The growing number of available genome sequences, along with dated phylogenies, allows the evolution of these repetitive DNA elements to be studied along evolutionary time scales. This could be used to compare rates of genome evolution. We show that SSRs in insects can be retained for several hundred million years. Some types of microsatellites seem to be retained longer than others. By comparing Dipteran with Hymenopteran species, we found very similar patterns of SSR loss during their evolution, but both taxa differ profoundly in the rate. Relative to divergence time, Diptera lost SSRs twice as fast as Hymenoptera. The loss of SSRs on the Drosophila melanogaster X-chromosome was higher than on the other chromosomes. However, accounting for generation time, the Diptera show an 8.5-fold slower rate of SSR loss than the Hymenoptera, which, in contrast to previous studies, suggests a faster genome evolution in the latter. This shows that generation time differences can have a profound effect. A faster genome evolution in these insects could be facilitated by several factors very different from those in Diptera, which is discussed in light of our results on the haplodiploid D. melanogaster X-chromosome. Furthermore, large numbers of SSRs can be found to be in synteny and thus could be exploited as a tool to investigate genome structure and evolution.

Year:  2013        PMID: 23292136      PMCID: PMC3595035          DOI: 10.1093/gbe/evs133

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

Simple sequence repeats (SSRs), also called short tandem repeats (STRs) or microsatellites, are a common feature of eukaryotic genomes and can account for up to 4% of a genome (Ellegren 2004; Schlötterer 2004; Molnar et al. 2012). These repeats occur throughout genomes, mostly in noncoding regions, but they can also be found in protein-coding sequences. Numerous studies showed apparent differences regarding their density, distribution, and composition (Tóth et al. 2000; Katti et al. 2001; Ross et al. 2003; Lim et al. 2004; Buschiazzo and Gemmell 2006; Galindo et al. 2009; Mayer et al. 2010; Pannebakker et al. 2010). Because of high levels of polymorphism in the number of repeats, SSRs are widely used as molecular markers in a large diversity of studies. The high degree of polymorphism has been attributed to DNA slippage mutation during replication (Leclercq et al. 2010), but the process may be more complex and is still not fully understood (Li et al. 2002, 2004; Ellegren 2004; Buschiazzo and Gemmell 2006; Eckert and Hile 2009; Bhargava and Fuentes 2010; Kelkar et al. 2010; Leclercq et al. 2010). Frequent repeat number variation in SSRs, at a rate of 10^-2 to 10^-6 per locus per generation (Schlötterer 2000), often follows a regular pattern, which can be used as a short-term molecular clock (Sun et al. 2009) and for the inference of phylogeny (Buschiazzo and Gemmell 2009). Traditionally, SSRs are regarded as nonfunctional and hence neutrally evolving. Consequently, these genetic elements have a higher mutation rate compared with functional or coding sequences, which are more conserved in response to selection (Schlötterer 2000). This, in combination with the polymorphic nature of SSRs, leads to the expectation of a highly dynamic system of gain, change, and loss of SSRs in genomes within natural populations. Nevertheless, there have been several reports of highly conserved SSRs within and across taxa. 
Interspecies amplification of SSR loci reveals that many SSRs are shared between closely related species (Blanquer-Maumont and Crouauroy 1995; Primmer et al. 1996; Green et al. 2001; Reber Funk et al. 2006; Barbará et al. 2007; Katada et al. 2007; Meglécz et al. 2007; Paxton et al. 2009; Stolle et al. 2009) and, for a few loci, even between species with a phylogenetic split of more than 100 Myr (Vaiman et al. 1994; FitzSimmons et al. 1995; Rico et al. 1996; Ezenwa et al. 1998; Moore et al. 1998; Green et al. 2001; Barbará et al. 2007; Buschiazzo and Gemmell 2009). Recently, Buschiazzo and Gemmell (2010) showed that a significant fraction of SSRs in vertebrates have been conserved for up to 450 Myr, but the mechanisms underlying this conservation over long evolutionary times are unknown. Some SSRs possess biological function regarding chromosome stability, RNA folding, amino acid repeats or relations to human diseases, recombination hotspots, or transposable elements (Goldstein and Schlötterer 1999; Li et al. 2002, 2004; Brandström et al. 2008; Thomou et al. 2009; Bonen et al. 2010; Grover and Sharma 2011; Wang et al. 2012). However, the large majority of SSR repeats are located in regions without a known biological function. Nevertheless, on the basis of the flanking regions adjacent to SSRs, Stolle et al. (2011) reported a high structural conservation of the chromosomes in the honeybee Apis mellifera and the bumblebee Bombus terrestris, which diverged approximately 100 Ma. Indeed genomes of Hymenoptera have been reported to be slowly evolving compared with those of Dipteran flies or various other animal groups (Weinstock et al. 2006; Stolle et al. 2011). However, the disparate life histories within the Insecta have a considerable impact when comparing evolutionary time scales across taxa. For example, generation time and effective population size may differ by several orders of magnitude. 
Social insects typically have very long-lived sexual females but with a relatively small effective population size, as per generation, only one or few individuals are responsible for reproduction. In addition, other particular characteristics such as haplodiploidy, multiple mating, worker reproduction, longevity of individuals, and colonies may further obscure the actual rates of evolutionary change over generations. Here, we investigate SSR conservation across different insect groups. Our expectation, based on the polymorphic and neutral nature of SSRs, was a fast decay of SSR loci in both Hymenoptera and Diptera. Our data suggest that high proportions of SSRs can be conserved between species. Some even can be retained for hundreds of millions of years of divergent evolution. Comparing the insect groups of Hymenoptera and Diptera, the degree of conservation differs markedly, depending upon SSR types and motif lengths, but the overall pattern is surprisingly similar. Using species with well-established phylogenies and robust divergence time estimates, we compare the rates of evolution accounting for the effect of generation time.

Materials and Methods

Genome Sequences and SSR Identification

Whole-genome sequences of 12 Drosophila, 3 mosquito, and 11 Hymenopteran species (fig. 1) were retrieved via GenBank (National Center for Biotechnology Information [NCBI]) and FlyBase (January 2011) and scanned for SSRs using the Phobos software (version 3.3.11, Mayer 2006–2010) with the following settings: imperfect search with minimum thresholds of 70% repeat perfection, four repeat units of 2–5 bp motifs, and 10 bp total length, and extraction of 350 bp of flanking sequence on both sides. We chose these repeats because they typically account for the majority of SSRs. Further, we left out mononucleotide repeats to avoid a bias due to differential representation in different genomes caused by the problems of sequencing homopolymers.
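The search thresholds described above can be illustrated with a minimal Python sketch. It is not Phobos and does not reproduce its imperfect search: as a stand-in technique, it detects only perfect repeats with a regular expression. The function name `find_perfect_ssrs` and its parameters are hypothetical and chosen for illustration.

```python
import re

# A 2-5 bp motif followed by at least three further copies (>= 4 units in total).
# The lazy quantifier tries the shortest motif first, so ATATATATAT is reported
# as an AT repeat rather than as an ATAT repeat.
SSR_PATTERN = re.compile(r"([ACGT]{2,5}?)\1{3,}")


def find_perfect_ssrs(seq, min_length=10, flank=350):
    """Return perfect SSRs with their flanking sequence (illustrative only)."""
    seq = seq.upper()
    hits = []
    for match in SSR_PATTERN.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer runs (mononucleotide repeats) are left out
        if match.end() - match.start() < min_length:
            continue  # below the 10 bp total-length threshold
        hits.append({
            "motif": motif,
            "start": match.start(),
            "end": match.end(),
            "left_flank": seq[max(0, match.start() - flank):match.start()],
            "right_flank": seq[match.end():match.end() + flank],
        })
    return hits
```

Calling `find_perfect_ssrs(sequence)` on a scaffold sequence would return one dictionary per repeat; imperfect repeats at 70% perfection, as used in the study, would require a dedicated tool such as Phobos.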
Fig. 1. Species overview. Summary data for species used in this study. Their phylogenetic relationships are shown on the left with divergence times at the nodes; species names with respective genome size and generation time are given in the middle; and SSR counts and densities for each species, with species abbreviations, are given on the right.

The output, with standardized SSR motifs (e.g., GA, TC, and CT are defined as AG, automatically done by Phobos), was then filtered for potential double entries, for example, if a specific imperfect SSR was found as the dinucleotide repeat AT and the trinucleotide AAT. Therefore, SSRs with a distance of 15 bp or closer to the start or end of the following SSR were discarded. This yielded initial information about the composition and genome-wide distribution of these SSRs for each species (fig. 1).
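The motif standardization mentioned above (GA, TC, and CT all reported as AG) can be sketched as follows. This assumes the common convention of taking the alphabetically first sequence among all cyclic rotations of a motif and of its reverse complement; whether Phobos applies exactly this rule internally is an assumption here, and `standardize_motif` is a hypothetical helper name.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def standardize_motif(motif):
    """Return the alphabetically first rotation of the motif or its reverse complement."""
    motif = motif.upper()
    candidates = []
    for strand in (motif, reverse_complement(motif)):
        # every cyclic rotation describes the same tandem repeat
        candidates.extend(strand[i:] + strand[:i] for i in range(len(strand)))
    return min(candidates)


print(standardize_motif("GA"), standardize_motif("TC"), standardize_motif("CT"))
# prints: AG AG AG
```

Under this convention the dinucleotide motifs collapse to the four classes AC, AG, AT, and CG that appear in the Results.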

BLAST Analyses and Filtering

Libraries of SSRs flanked by 350 bp of sequence were then used in pairwise Basic Local Alignment Search Tool (BLAST) analyses (NCBI BLAST 2.2.25+ [Altschul et al. 1990]), using one library (species A) as query and another library (species B) as reference. The analyses were performed using a custom-made Perl script, with the SSR sequences themselves being masked as “N.” For each query sequence, the four highest-scoring BLAST hits within the reference sequences were recorded. The resulting BLAST hits were then processed with a second custom-made Perl script. First, those BLAST hits where the SSR motif of the query did not match that of the reference were discarded. Second, if a query sequence yielded multiple BLAST hits on the identical reference sequence, for example, due to the gap caused by the masked SSR, the scores of these BLAST hits were summed. Third, BLAST hits shorter than 100 bp and those with 70% or less sequence identity were excluded from further analyses. Each query sequence, representing an SSR of species A, that passed these thresholds was then assigned to a single sequence within the reference, representing an SSR of species B. If more than one BLAST hit within the reference sequences remained for a query sequence after the filtering steps, the assignment was made by choosing the BLAST hit with the highest score. In cases where two or more BLAST hits had exactly the same score, these entries were discarded because they could not be matched unambiguously, even if this score was the highest among the recorded BLAST hits. Similarly, we searched for multiple matches to a reference sequence. If more than one query sequence was assigned to the same reference sequence, all were discarded except the one that gave the highest BLAST score with the respective reference. Again, we excluded those entries where two or more sequences had exactly the same BLAST score, even if this score represented the highest BLAST score. 
Hence, the final data set contained only pairs of unique query sequences assigned to unique reference sequences, both having the same SSR motif irrespective of the number of repeat units or level of perfection. For each final data set, that is, the result of the pairwise comparison between a query and a reference, the number of detected SSR loci was related to the number of SSR loci in the respective reference. Each query SSR locus detected in the reference is defined as a conserved SSR, although we cannot rule out the possibility that an SSR was lost during evolution within a species or lineage and that a new, nonhomologous SSR with the same motif independently arose at the same or a very similar position. The conserved SSR loci were determined for each analyzed species pair, both in total and for each individual SSR motif.
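The filtering and assignment steps described above can be summarized in a short Python sketch. It is not the authors' Perl script. The input format (a list of dictionaries with the keys `query`, `ref`, `query_motif`, `ref_motif`, `length`, `identity`, and `score`) is hypothetical, and several details are assumptions: the length and identity thresholds are applied to the merged hits of a query/reference pair, identity is length-weighted, and only ties for the highest score cause an entry to be discarded.

```python
from collections import defaultdict


def assign_conserved_ssrs(hits, min_length=100, min_identity=70.0):
    """Return {query_id: ref_id} for unambiguous one-to-one assignments."""
    # Steps 1 and 2: require matching motifs, then merge multiple hits of the
    # same query on the same reference by summing scores and aligned lengths.
    merged = {}
    for h in hits:
        if h["query_motif"] != h["ref_motif"]:
            continue
        entry = merged.setdefault((h["query"], h["ref"]),
                                  {"score": 0.0, "length": 0, "matches": 0.0})
        entry["score"] += h["score"]
        entry["length"] += h["length"]
        entry["matches"] += h["length"] * h["identity"] / 100.0

    # Step 3: drop pairs that are too short or not similar enough (assumed rule).
    by_query = defaultdict(list)
    for (query, ref), entry in merged.items():
        identity = 100.0 * entry["matches"] / entry["length"]
        if entry["length"] < min_length or identity <= min_identity:
            continue
        by_query[query].append((entry["score"], ref))

    # Best reference per query; a tie for the top score is ambiguous.
    by_ref = defaultdict(list)
    for query, candidates in by_query.items():
        best = max(score for score, _ in candidates)
        top = [ref for score, ref in candidates if score == best]
        if len(top) == 1:
            by_ref[top[0]].append((best, query))

    # Best query per reference; again, ties are discarded.
    pairs = {}
    for ref, candidates in by_ref.items():
        best = max(score for score, _ in candidates)
        top = [query for score, query in candidates if score == best]
        if len(top) == 1:
            pairs[top[0]] = ref
    return pairs
```

The proportion of conserved SSRs for a species pair would then be `len(pairs)` divided by the number of SSR loci in the reference library.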

Validation of the Method

We validated our method by comparing the SSR libraries of Drosophila melanogaster, A. mellifera, Solenopsis invicta, Atta cephalotes, and Nasonia vitripennis with themselves. The expectation was a correct recovery of each detected SSR after applying the very same thresholds, filtering, and processing steps. The result of this test is a benchmark of our approach and allows for the determination of the false-positive error rate by simply detecting erroneously assigned SSRs in the final data set. Furthermore, we evaluated the Muller element B (chromosome 2L) of the D. melanogaster genome for synteny between D. melanogaster and D. simulans. To test the assumption that the BLAST analysis gives the same result irrespective of which species in a species pair is used as query and which as reference, we conducted selected reciprocal runs for the species pairs Dmel–Dsim, Dmel–Dpse, Dmel–Dvir, Amel–Soli, and Acep–Soli (for abbreviations see fig. 1).

Divergence Time and Generation Time

The generation time (here the number of generations produced per year, fig. 1) was estimated from data from the literature. The Dipteran species used in this study typically have a short generation time, and in particular, the tropical species can produce many generations per year (>20 [Keightley 2000]). For most Drosophila species, we assumed 10 generations per year (Li and Nei 1977; Laayouni et al. 2003; Hutter et al. 2007; Cutter 2008; Barker 2011). Some Drosophila species from mountainous areas or from colder climates, or species with more extended life cycles (Begon 1976; Keightley 2000; Jennings et al. 2011), are known to have fewer generations per year, similar to D. willistoni and the Hawaiian D. grimshawi, for which we assumed five generations per year. The Hymenopteran Nasonia species are nonsocial parasites and have been reported to reproduce four to five times a year in the wild (Werren J, personal communication) (Raychoudhury et al. 2010; Powell et al. 2011). The generation times for the other, eusocial, species are typically much longer. For A. mellifera, Linepithema humile, and Harpegnathos saltator, sexual offspring are typically produced once a year. The ant species Camponotus floridanus, S. invicta, A. cephalotes, Acromyrmex echinatior, and Pogonomyrmex rugosus, which have larger colonies, have longer-lived queens, and sexual offspring are only produced every 2–3 years (Hölldobler and Wilson 1990; Taber 1998, 2000; Bekkevold and Boomsma 2000; Peeters and Liebig 2000; Gadau et al. 2012). Divergence time estimates were obtained from several phylogenetic studies based on both the fossil record and molecular clocks (Rasnitsyn and Quicke 2002; Tamura et al. 2004; Grimaldi and Engel 2005; Moreau et al. 2006; O’Grady and Desalle 2008; Werren et al. 2010; Gadau et al. 2012). 
On the basis of the divergence time (in million years before present) and the number of generations, we obtained an estimate of how many generations had passed from the separation of lineage or species until the present (fig. 1 and table 1).
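As a worked illustration of this calculation, the following Python sketch multiplies the divergence time by the mean number of generations per year of the two species, which is how the note to table 1 describes the generation time of a species pair. The function name `million_generations` is hypothetical.

```python
def million_generations(divergence_ma, gens_per_year_a, gens_per_year_b):
    """Millions of generations since divergence for a species pair.

    divergence_ma is the split time in million years; the two rates are the
    generations per year assumed for each species and are averaged.
    """
    mean_rate = (gens_per_year_a + gens_per_year_b) / 2
    return divergence_ma * mean_rate


# D. grimshawi (5 per year) vs. D. melanogaster (10 per year), split 62.9 Ma:
# mean rate 7.5, giving roughly 471.75 million generations as listed in table 1.
dgri_dmel = million_generations(62.9, 5, 10)
```

This assumes constant generation times along both lineages since the split, which is a simplification.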
Table 1. Pairwise Comparisons for SSR Conservation

Query  Reference  Conserved SSRs (n)  SSRs in Reference (n)  Conserved (%)  Divergence Time (Ma)  Generations per Year  Generations (Million)
Dper   Dpse       232,926             353,383                65.91          0.85                  10                    8.5
Dsec   Dsim       129,022             201,053                64.17          0.93                  10                    9.3
Dsim   Dmel       115,053             246,106                46.75          5.4                   10                    54
Dsec   Dmel       117,217             246,106                47.63          5.4                   10                    54
Dere   Dyak       93,489              256,427                36.46          10.4                  10                    104
Dere   Dmel       88,213              246,106                35.84          12.6                  10                    126
Dyak   Dmel       90,661              246,106                36.84          12.6                  10                    126
Dmoj   Dvir       104,551             456,107                22.92          40                    10                    400
Dgri   Dvir       86,405              456,107                18.94          42.9                  7.5                   321.75
Dana   Dyak       36,375              256,427                14.19          44.2                  10                    442
Dana   Dmel       36,511              246,106                14.84          44.2                  10                    442
Dana   Dsim       34,477              201,053                17.15          44.2                  10                    442
Dper   Dmel       26,608              246,106                10.81          54.9                  10                    549
Dpse   Dmel       27,504              246,106                11.18          54.9                  10                    549
Dwil   Dmel       14,855              246,106                6.04           62.2                  7.5                   466.5
Dwil   Dsim       13,619              201,053                6.77           62.2                  7.5                   466.5
Dwil   Dvir       22,130              456,107                4.85           62.9                  7.5                   471.75
Dgri   Dmel       13,618              246,106                5.53           62.9                  7.5                   471.75
Dmoj   Dmel       14,099              246,106                5.73           62.9                  10                    629
Dvir   Dmel       14,742              246,106                5.99           62.9                  10                    629
Aedes  Culex      8,689               561,135                1.55           205                   21                    4,305
Agam   Culex      4,311               561,135                0.77           217                   16                    3,472
Pogo   Soli       104,911             671,437                15.62          85                    0.42                  35.42
Acro   Soli       127,832             671,437                19.04          90                    0.42                  37.5
Acep   Soli       120,761             671,437                17.99          90                    0.42                  37.5
Cflo   Soli       77,896              671,437                11.6           110                   0.5                   55
Lhum   Soli       73,464              671,437                10.94          140                   0.75                  105
Hsal   Soli       66,510              671,437                9.91           160                   0.75                  120
Amel   Soli       19,323              671,437                2.88           168                   0.75                  126
Nvit   Soli       5,933               671,437                0.88           185                   2.25                  416.25
Acro   Acep       285,148             603,455                47.25          10                    0.33                  3.33
Lhum   Acep       66,203              603,455                10.97          140                   0.67                  93.33
Hsal   Cflo       59,205              562,525                10.52          160                   0.75                  120
Cflo   Amel       26,633              704,546                3.78           168                   0.75                  126
Hsal   Amel       20,134              704,546                2.86           168                   1                     168
Nvit   Amel       6,393               704,546                0.91           185                   2                     370
Nlon   Ngir       322,594             426,704                75.6           0.41                  5                     2.05
Nvit   Ngir       317,256             426,704                74.35          1                     5                     5

Note: The analyzed species pairs (query vs. reference) are shown with the detected number of SSRs (conserved between both species), the number of used SSRs (number of SSRs in the reference), the proportion found to be conserved, the time when both species split (divergence time), the generation time as the average of the number of generations produced per year by each species in this pair, and the number of millions of generations potentially produced since divergence.

Conservation of SSR Loci in Genomes of Species Pairs

Each node in the phylogeny represents the time at which the most recent common ancestor species separated into two different lineages or species. Drosophila melanogaster (Dmel) was selected as the reference genome because it is an intensely studied model species. Hence, all other Drosophila species were compared with Dmel. In addition, some further pairwise comparisons were chosen to cover nodes that provided additional phylogenetic time points (e.g., D. sechellia–D. simulans or D. mojavensis–D. virilis). We proceeded analogously within the Hymenoptera, with S. invicta (Soli) as the main reference genome to cover most nodes on the phylogenetic tree.

Rate of Decay of SSR Loci

An exponential decay function was fitted to our data to determine the rate of decay of SSR conservation using R (R Development Core Team 2011). This was achieved by minimizing the sum of squared deviations of our data points from the decay function, searching the parameter space under the assumption of a constant rate of decay.
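A least-squares fit of this kind can be sketched in Python as follows. This is not the authors' R code, and the exact parameterization behind table 2 is not stated in the text, so the sketch assumes the simple form conserved = a * exp(-b * t) and is not expected to reproduce the values in table 2. The function names and the search bounds `b_min` and `b_max` are hypothetical.

```python
import math


def _profile_sse(b, times, values):
    """Sum of squared errors for a fixed rate b, with amplitude a in closed form."""
    e = [math.exp(-b * t) for t in times]
    denom = sum(ei * ei for ei in e)
    if denom == 0.0:
        return float("inf"), 0.0  # every term underflowed; b is far too large
    a = sum(y * ei for y, ei in zip(values, e)) / denom
    sse = sum((y - a * ei) ** 2 for y, ei in zip(values, e))
    return sse, a


def fit_exponential_decay(times, values, b_min=1e-6, b_max=1.0, grid=400, refine=100):
    """Fit values ~ a * exp(-b * times) by minimizing squared deviations."""
    # Coarse search of the rate on a log-spaced grid.
    step = (math.log(b_max) - math.log(b_min)) / (grid - 1)
    rates = [math.exp(math.log(b_min) + i * step) for i in range(grid)]
    best = min(range(grid), key=lambda i: _profile_sse(rates[i], times, values)[0])
    lo, hi = rates[max(best - 1, 0)], rates[min(best + 1, grid - 1)]
    # Golden-section refinement between the neighbouring grid points.
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(refine):
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if _profile_sse(c, times, values)[0] <= _profile_sse(d, times, values)[0]:
            hi = d
        else:
            lo = c
    b = (lo + hi) / 2
    sse, a = _profile_sse(b, times, values)
    return a, b, sse


# Example input: the eight comparisons against Soli listed in table 1
# (divergence time in Ma, conserved SSRs in percent).
soli_ma = [85, 90, 90, 110, 140, 160, 168, 185]
soli_pct = [15.62, 19.04, 17.99, 11.6, 10.94, 9.91, 2.88, 0.88]
a, b, sse = fit_exponential_decay(soli_ma, soli_pct)
```

For a fixed rate the best amplitude has a closed form, so only the rate needs to be searched; the same function can be applied to generations instead of divergence time by passing a different `times` list and suitable bounds.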

Conservation of SSR Types and Motifs

Pairwise comparisons were used to analyze the conservation of specific SSR types (di-, tri-, tetra-, and pentanucleotide repeats) and their motifs. First, counts for each SSR type and motif were determined in the reference species. The same was done for the data set resulting from the pairwise comparison, that is, the conserved SSR loci. The relationship between the total numbers of SSRs shared between both species represents the total decay of the SSRs or the proportion of all SSRs that are conserved between both species. This analysis was repeated for each of the SSR types and motifs. The decay of each type of SSR, defined by repeat motif length (di-, tri-, tetra-, and pentanucleotide repeats), between both species was related to the decay of the total number of SSRs. By comparing the four SSR types, we can determine whether the decay of a specific type of SSR is slower (less decay) than the overall decay of all SSRs. Analogously, the specific SSR sequence motifs were analyzed within each SSR type, that is, the decay of a certain dinucleotide repeat motif was compared with the decay of all dinucleotide repeats. Therefore, if certain motifs decay more slowly than others, this implies that they are more stable than others over evolutionary time scales. Differences across motifs and types of SSRs were tested by comparing within (including correction for multiple testing) and between the Hymenoptera and Diptera using a two-tailed Mann–Whitney U test.

Results

Genomic SSR Content

SSRs with repeat units of two to five base pairs were identified in 12 Drosophila, 3 mosquitoes, 3 Nasonia, 1 bee, and 7 ant genomes. The total numbers, the density, and the composition vary among the genomes of different species, sometimes even between closely related species (fig. 1 and supplementary file S1, Supplementary Material online). There was a positive linear relation of genome size and SSR count (supplementary file S1, Supplementary Material online).

Conservation of SSRs between Pairs of Species

Each pairwise comparison of the SSR libraries with BLAST identifies potentially homologous SSR loci between species, which were retained since divergence of both species from a common ancestor. As expected, SSR conservation decreases over phylogenetic time scales (table 1 and supplementary file S2, Supplementary Material online). Species that separated within the last 1 Myr retained more than 60% of the SSR loci. The Drosophila species of the subgenus Sophophora still retained more than 5% of the SSR loci during their more than 60 Myr of separate evolution; the ants and the honeybee retained approximately 3% since 185 Myr and Aedes and Culex more than 1.5% since more than 200 Myr. Even between the Diptera and the Hymenoptera, separated for approximately 300 Myr (Grimaldi and Engel 2005), approximately 0.1% of the SSR loci were conserved. As a benchmark of our method, we compared the genomes of several species with themselves, using identical processing and filtering. For D. melanogaster, we detected 80.84%, for A. mellifera 88.06%, for S. invicta 84.36%, for A. cephalotes 91.49%, and for N. vitripennis 83%. When checked for the correct assignment of the identical SSRs, we found 0.8% of the SSRs in D. melanogaster to be incorrectly assigned. This measure represents the rate of false positives detected with our method and filtering thresholds. For A. mellifera, this rate was 1.87%, for S. invicta 2.46%, for A. cephalotes 1.1%, and for N. vitripennis 1.29%, giving an average of 1.68% for the tested Hymenoptera. Approximately a quarter of these false positives are SSRs close to the correct SSR, within the 350 bp flanking sequence and with the same motif; thus, this fraction could potentially be corrected by manual inspection. Another indication of the validity of our approach is the comparison of genome structure between the closely related D. melanogaster and D. simulans using the detected conserved SSRs. 
Using more than 19,000 SSRs from Muller element B (chromosome 2L) from both species, we found this element to be highly similar in terms of the order and distances of the SSRs, which indicated that the majority of this chromosome is in synteny. This agrees largely with the previous findings using gene locations (Bhutkar et al. 2008). The syntenic relationship of the first 9,030 SSRs, corresponding to the first 10 Mbp of Muller element B, is visualized with AutoGRAPH (Derrien et al. 2007) (supplementary file S3, Supplementary Material online). Reciprocal BLAST analysis in some selected species pairs yielded very similar numbers of conserved SSRs. The differences in the proportion of conserved SSRs caused by slightly different absolute numbers in reciprocal runs are 0% for Dmel–Dsim, 0.14% for Dmel–Dvir, 0.24% for Dmel–Dpse, 0.34% for Amel–Soli, and 1.01% for Acep–Soli, and thus negligible in our analysis.

Rate of Decay of SSR Loci

Fitting an exponential decay function to the proportion of conserved SSR loci, we were able to determine the rate of decay for both Dipteran and Hymenopteran SSRs (table 2). The decay rates were related to the time of divergence between two species (fig. 2) and to the estimated number of generations passed since then (fig. 3). In both cases, this fit was highly significant with low standard errors. Although the Dipteran SSR decay rate is two times faster than in the Hymenoptera, the Hymenoptera show an 8.5 times faster decay of SSR loci than the Diptera in relation to the number of generations.
Table 2. Comparison of SSR Decay in Hymenoptera and Diptera in Relation to Divergence Time or Generation Time

Comparison                    Decay (Slope) Estimate  Decay (Slope) SE  Origin (Intercept) Estimate  Origin (Intercept) SE  F         P
Hymenoptera, divergence time  1.59                    0.0183            70.64                        2.4658                 185.7523  8.33E-11
Diptera, divergence time      3.32                    0.0143            60.9                         1.0954                 285.2023  6.77E-15
Hymenoptera, generations      3.24                    0.0167            72.4                         2.6891                 62.5389   1.05E-07
Diptera, generations          0.38                    9.45E-04          62.14                        1.0857                 211.7773  1.03E-13

Note: Comparison between the decay of SSRs in Hymenoptera and Diptera in relation to divergence time (split in Ma) and to the estimated number of million generations passed using an exponential decay function. Score and P value from general regression statistics (F test) are given, as well as standard errors (SE) for the slope estimate (decay) and the intercept.
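The two rate comparisons quoted in the text follow directly from the decay (slope) estimates in table 2, as the short Python check below illustrates. The dictionary layout and variable names are hypothetical; only the four slope values are taken from the table.

```python
# Decay (slope) estimates from table 2.
decay = {
    ("Hymenoptera", "divergence time"): 1.59,
    ("Diptera", "divergence time"): 3.32,
    ("Hymenoptera", "generations"): 3.24,
    ("Diptera", "generations"): 0.38,
}

# Relative to divergence time, Diptera lose SSRs about twice as fast.
ratio_by_time = (decay[("Diptera", "divergence time")]
                 / decay[("Hymenoptera", "divergence time")])

# Relative to generations, Hymenoptera lose SSRs about 8.5 times faster.
ratio_by_generation = (decay[("Hymenoptera", "generations")]
                       / decay[("Diptera", "generations")])

print(f"{ratio_by_time:.2f}")        # prints 2.09
print(f"{ratio_by_generation:.2f}")  # prints 8.53
```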

Fig. 2. Conserved SSR proportions by divergence time. Proportions of SSRs conserved in species pairs of Hymenoptera and Diptera relative to their phylogenetic divergence time (split in Ma, log scale).

Fig. 3. Conserved SSR proportions by generation time. Proportions of SSRs conserved in species pairs of Hymenoptera and Diptera relative to the estimated number of million generations since their divergence (log scale).

A more stringent analysis, in which Dmel or Soli SSRs were only considered to be conserved if they were found in species from subsequent branches in the phylogeny, gave much lower proportions of conserved SSRs but showed essentially the very same pattern of decay (supplementary file S4, Supplementary Material online). An additional analysis was performed using only those SSRs that are located on the Dmel X-chromosome in comparison to the other Dmel chromosomes. The haplodiploid X-chromosome showed a slightly faster loss of SSR loci compared with the diploid chromosomes (supplementary file S5, Supplementary Material online, Wilcoxon matched pairs test: P = 0.0077).

Conservation of SSR Types and Motifs

From each pairwise comparison, we separately analyzed the different types of SSRs (di-, tri-, tetra-, and pentanucleotide repeats) and their motifs. For the Hymenoptera, we found a distinct relationship between the length of the repeat motif and its conservation. Dinucleotide repeats were found to decay more slowly than the overall rate (set to zero), indicated by a positive value of relative SSR loss, trinucleotide repeats slightly faster, and tetra- and pentanucleotide repeats significantly faster, indicated by a negative value of relative SSR loss (fig. 4 and supplementary file S2, Supplementary Material online). In Diptera, the pattern is similar, but trinucleotide repeats decayed more slowly than SSRs in general.
Fig. 4. Relative loss of SSRs by their motif length. The loss of di-, tri-, tetra-, and pentanucleotide SSRs compared with the loss of all SSRs (y = 0, indicated by a black line). The Diptera are shown in white and the Hymenoptera with gray filling. The black bar within the box shows the median. Black dots represent outliers. All groups are significantly different.

Dinucleotide repeats, although decaying more slowly than SSRs altogether, show significant differences among their four motifs (fig. 5 and supplementary file S2, Supplementary Material online). In Hymenoptera, AC and AT repeats are very similar and decay slightly faster than dinucleotide repeats altogether, whereas AG and CG repeats similarly decay more slowly. In Diptera, by contrast, AC repeats decay slowest of all the dinucleotide repeats, and AG and CG repeats decay slightly faster.
Fig. 5. Relative loss of 2 nt SSRs by their motif sequence. The loss of the different 2 nt SSRs compared with the loss of all 2 nt SSRs (y = 0, indicated by a black line). The Diptera are shown in white, the Hymenoptera with gray filling. The black bar within the box shows the median. Black dots represent outliers. All groups are significantly different, except those indicated with “NS.” NS, not significant.

Trinucleotide repeats show significant differences in both groups as well as within the groups (fig. 6 and supplementary file S2, Supplementary Material online). In Hymenoptera, a slower decay was detected for ACC, ACG, CCG, and especially AGC repeats and a faster decay for AAG and especially ACT repeats; the other motifs are close to zero, so their decay is very similar to the overall decay of 3 nt SSRs. In Diptera, AAC, ATC, and especially AGC decay more slowly than the trinucleotide repeats altogether, and ACG, AGG, and CCG are close to zero. The remaining motifs, and especially ACT, were found to have a faster decay. So despite some variance, the strongest deviation from the overall decay of all trinucleotide repeats in both insect orders was found for AGC and ACT repeats (fig. 6 and supplementary file S2, Supplementary Material online).
Fig. 6. Relative loss of 3 nt SSRs by their motif sequence. The loss of the different 3 nt SSRs compared with the loss of all 3 nt SSRs (y = 0, indicated by a black line). The Diptera are shown in white and the Hymenoptera with gray filling. The black bar within the box shows the median; outliers are not shown.

The numbers of tetra- and pentanucleotide repeats and the proportion detected as conserved were much lower than in the previous SSR types. Therefore, the data show higher variability (supplementary file S2, Supplementary Material online). Consistently, ACTG is the slowest decaying motif in both insect orders. In contrast, ACCT is lost slowest in Hymenoptera but relatively rapidly in Diptera. Between closely related species, the relative losses of specific SSRs were usually very similar. Interestingly, in the AT-rich genomes of the Hymenoptera, AT-rich SSRs are common (AT as well as AAT, AAAT, AATT, AAAAT, and AATAT). Similarly, frequencies of AG might be somewhat correlated with the similar motifs AAG, AAAG, and AAAAG; CG with CCG, CCCG, and CCGG; and AC with AAC, AAAC, and AAAAC. In the Dipteran genomes, such potential correlations apparently do not occur except for AT with AAT, AAAT, and AAAAT (supplementary file S1, Supplementary Material online).

Discussion

We show that SSRs can be conserved for many millions of years in the genomes of Hymenoptera and Diptera. Unlike previous work on vertebrates (Buschiazzo and Gemmell 2010), our data are not based on whole-genome alignments and subsequent selection of homologous regions to extract conserved SSRs. Instead, we used a BLAST-based approach to find a homologous SSR in pairwise genome comparisons. Both approaches carry a risk of erroneously identifying an SSR in the other genome as a homolog because of its proximity to the correct locus, which might itself have lost the SSR during evolution. We tested our method by analyzing a genome against itself. Overall, our methodology correctly recovered 99.2% and 98.3% of loci in Drosophila and the Hymenoptera, respectively. The erroneously assigned SSRs were mainly located toward the ends of chromosomes or scaffolds. Although some studies show that SSR loci related to transposable elements can influence and bias SSR detection (Smýkal et al. 2009; Tay et al. 2010), our high recovery rates and low error rates suggest that these cases are not relevant at the phylogenetic level. Another advantage of our approach is that it does not depend on any previous alignment of homologous regions conserved across many species, which might introduce a bias toward more conserved loci and result in a reduced sample size. In our method, each locus is analyzed independently for each pairwise comparison; in this way, we can include many more SSRs regardless of possible differences in chromosome structure. Furthermore, the analysis is independent of the quality of the assembly in terms of misassembled sequences or assembly gaps.

As predicted, we found that the number of shared SSRs between two species decreases with increasing phylogenetic distance. Nevertheless, high numbers of conserved SSRs are still present many millions of years after the divergence of two species.
In support of vertebrate data (Buschiazzo and Gemmell 2010), a very small fraction (below 0.1%) of SSRs was retained even over more than 300 Myr of separate evolution of Diptera and Hymenoptera. Interestingly, Janes et al. (2011) discovered additional noncoding DNA sequences that were retained for long times and in differential proportions in both reptiles and mammals. This suggests that, in general, noncoding DNA elements can be conserved for many millions of years and/or generations. There might be a balance between SSR length and the probability of a mutation event: the longer the SSR, the greater the probability that it will be "broken" by a point mutation, which might impair further slippage mutation. Thus, a higher rate of decay would be expected if the mutation rate is high. This point of view is also supported by Sun et al. (2009).

Under the assumption that the majority of SSRs do not exhibit any relevant function and are thus neutrally evolving, SSR decay could be interpreted as a measure of the rate of genome evolution. Over absolute time, we detected slower rates of genome evolution in bees, wasps, and ants relative to the flies. This supports earlier reports in which a high degree of conservation of structural chromosomal organization was observed between the bumble bee B. terrestris and the honeybee A. mellifera despite their divergence approximately 100 Ma (Stolle et al. 2011), or in which higher sequence identities in orthologous genes were found in A. mellifera than in other insects (Weinstock et al. 2006). However, estimating rates of evolution solely based on mutations over time has been repeatedly criticized (Kimura 1983; Easteal 1985). Two organisms being compared may differ in many characteristics, so that sequence differences can accumulate over very different time scales, potentially leading to false conclusions regarding relative rates of evolution.
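The proposed balance between SSR length and the chance of an interrupting point mutation can be made concrete with a small calculation. The sketch below is illustrative only: it assumes that every nucleotide in the tract mutates independently with the same per-site, per-generation probability, and the function name `interruption_probability`, the rate `MU`, and the time span `GENERATIONS` are hypothetical placeholders, not values from this study.

```python
def interruption_probability(repeat_units: int, motif_length: int,
                             per_site_rate: float, generations: int = 1) -> float:
    """Probability that at least one point mutation hits the repeat tract.

    Assumes independent, identical mutation probabilities per site and
    per generation (a simplification, not the authors' model).
    """
    sites = repeat_units * motif_length
    # Chance that no site mutates in any generation, then take the complement.
    return 1.0 - (1.0 - per_site_rate) ** (sites * generations)


# Hypothetical per-site rate and time span, chosen only for illustration.
MU = 1e-9
GENERATIONS = 10_000_000

for units in (5, 10, 20, 40):
    p = interruption_probability(units, motif_length=2, per_site_rate=MU,
                                 generations=GENERATIONS)
    print(f"{units:>2} dinucleotide units: P(interrupted) = {p:.3f}")
```

Under these assumptions, the probability of at least one interrupting mutation rises with tract length, which is the direction of the effect argued above. The real process also involves slippage and repair, which this sketch ignores.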
Mutation rates can be affected by life history traits such as metabolism or body size (Mooers and Harvey 1994; Bromham et al. 1996) and can be linked to diversification rate or environmental energy (Davies et al. 2004; Lanfear et al. 2010). Furthermore, population structure can be an important factor, especially effective population size (Kimura and Ohta 1971; Woolfit and Bromham 2005), which determines the level of genetic drift. Traits such as fecundity, longevity, or ploidy can also covary with rates of molecular evolution and could influence population genetic structure. The comparison of SSR decay in our study showed a 2-fold slower decay over phylogenetic time in the Hymenopterans than in the Dipterans. Numerous studies in plants and vertebrates have highlighted the importance of generation time for the rate of evolution (Sarich and Wilson 1973; Kimura 1983; Easteal 1985; Laroche and Bousquet 1999; Gissi et al. 2000; Andreasen and Baldwin 2001; Nabholz et al. 2008; Welch et al. 2008). Species that produce more generations per unit time tend to have faster evolutionary rates, presumably due to more meiotic DNA replication errors, as observed within the invertebrates (Thomas et al. 2010). The species used in our study differ in the number of generations produced per year. Some social Hymenoptera produce reproductive individuals only after several years (Hölldobler and Wilson 1990), whereas the Drosophila species have many generations each year (Keightley 2000). When we corrected for this discrepancy by relating our data to the number of generations, the Hymenopteran SSRs decayed 8.5 times faster than the Dipteran SSRs. This striking difference might be explained by several factors. Examining the population sizes of the species studied here reveals differences of several orders of magnitude.
Compared with Drosophila and mosquitoes, the Hymenopteran species represented in this study are parasitic or social, and both groups have very small effective population sizes (Moran 1984; Owen and Owen 1989; Peeters and Liebig 2000; Zayed 2004; Nolte and Schlötterer 2008; Petit and Barbadilla 2009; Alves et al. 2010; Elias et al. 2010; Jaffé et al. 2010; Andolfatto et al. 2011), although some species have reproductive females with very high fecundity and longevity (Nabholz et al. 2008; Welch et al. 2008). Small effective population sizes enhance the loss of genetic diversity through drift and hence could cause lower SSR polymorphism. Furthermore, social Hymenoptera have been shown to have a much higher genomic recombination rate (11.15 cM/Mb) than Drosophila (1.59 cM/Mb) (Wilfert et al. 2007; Lattorff and Moritz 2008; Stolle et al. 2011). There are insufficient data for all species to investigate this relationship further, but recombination rate could influence genome evolution and thus SSR loss. Finally, in Hymenopterans, males are haploid, whereas in Dipterans both sexes are diploid. The haploid male sex further decreases the effective population size and thus could have some influence on the rates of evolution. Interestingly, we found a slightly faster rate of SSR loss for the haplodiploid D. melanogaster X-chromosome than for the diploid D. melanogaster chromosomes. The X-chromosome has only 75% of the effective population size of the other chromosomes (males are haploid for the X-chromosome, which corresponds to 50% of the effective population size, whereas females are diploid for the X-chromosome, which corresponds to 100%). Because of stronger genetic drift, one could expect a lower degree of polymorphism, which was confirmed by previous studies (Begun and Whitley 2000; Betancourt et al. 2002; Andolfatto et al. 2011).
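The 75% figure for the X-chromosome follows from counting chromosome copies in a breeding pair. The short sketch below reproduces that arithmetic; it assumes an equal sex ratio and ignores differences in reproductive variance between the sexes, and the function name `relative_copy_number` is a hypothetical helper introduced here for illustration.

```python
from fractions import Fraction


def relative_copy_number(female_copies: int, male_copies: int,
                         reference_copies_per_individual: int = 2) -> Fraction:
    """Chromosome copies per breeding pair, relative to a diploid autosome.

    Assumes one female and one male per pair (equal sex ratio); this is a
    simplification used only to illustrate the 75% value in the text.
    """
    pair_copies = female_copies + male_copies
    reference_pair_copies = 2 * reference_copies_per_individual
    return Fraction(pair_copies, reference_pair_copies)


# X-chromosome: two copies in females, one copy in males.
x_chromosome = relative_copy_number(female_copies=2, male_copies=1)
print(f"X-chromosome relative to autosomes: {x_chromosome} ({float(x_chromosome):.0%})")
```

The same count applies to every chromosome of a haplodiploid species, which is why the X-chromosome is used above as a within-genome analogue of the Hymenopteran situation.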
However, because a loss of polymorphism due to genetic drift probably has no influence on mutation rate as such, differences in effective population size might have little effect on the pattern we found in the Hymenoptera and Diptera. A possible explanation could be differences between the sexes in the number of germ-cell divisions, although this difference, where detected, was found to be weak in D. melanogaster (Bauer and Aquadro 1997). However, if D. melanogaster females reproduce early in their life, the weak female bias in the number of germ-cell divisions could enhance the SSR turnover in the X-chromosome and thus cause a slightly faster SSR loss. If such differing numbers of germ-cell divisions between sexes play a role in other species as well, this might explain a faster loss of SSRs in the Hymenoptera, in which all chromosomes are haplodiploid. This could be further enhanced by the longevity of queens of the social Hymenoptera in comparison to the short-lived males. On the other hand, this should be detectable as enhanced evolutionary rates, for which previous studies (Bauer and Aquadro 1997; Begun and Whitley 2000; Betancourt et al. 2002) found no evidence in Drosophila, and it is also opposed by the finding of a faster mutation rate on the male Y chromosome versus the X-chromosome (Bachtrog 2008).

Although distinct patterns relating to motif composition within and between insect orders are lacking, differences in the frequency and conservation of particular motifs were observed between Hymenoptera and Diptera. These differences could indicate that some motifs are more stable than others or might even be under some form of selection. Our data suggest at least a constraint on the length of a motif, which might be related to the probability of point mutations disrupting the slippage-mutational process.
There might also be a relationship between the frequency and conservation of a motif and the frequencies of related motifs, which could give some indication of the turnover (birth and death rate) of specific motifs. However, other conclusions about the different patterns within and between each insect order, especially for specific repeat motifs, are hard to draw, especially as the process of birth and death of an SSR, potentially from SSRs changed by mutations, is poorly understood. The functional implications of the conservation or frequency of SSRs, if there are any, unfortunately must also remain unclear at this stage. Contrary to the general view of SSRs as functionless DNA elements, some SSRs could play functional roles, although this would not explain the whole pattern of the large number of SSRs. Palindromic repeats, such as AT and CG, could be involved in the formation of DNA hairpin structures; some trinucleotide repeats could be constrained by functions within coding regions or at the chromosomal level. Thus far, only a few specific SSRs are known to be involved in biological processes (for further reading see Goldstein and Schlötterer 1999; Li et al. 2002, 2004; Buschiazzo and Gemmell 2010; Grover and Sharma 2011) or to have other relevant impacts (Auer et al. 2001; Kerrest et al. 2009; Blackwood et al. 2010; Bonen et al. 2010; Mueller et al. 2011). Some SSRs were also related to recombination hotspots (Brandström et al. 2008) and transposable elements (Smýkal et al. 2009; Tay et al. 2010).

Irrespective of the actual mechanisms that drive the evolutionary changes in SSRs, we show that they allow for a comparison of rates of genome evolution. We find that the rate of decay of SSRs, and, therefore, the rate of genome evolution, is not 2-fold slower in the Hymenoptera compared with the Diptera, as indicated by absolute substitution rates, but is 8.5 times faster when correcting for generation time. Thus, previous studies on structural conservation (Stolle et al. 2011) and sequence similarity (Weinstock et al. 2006) based on absolute time should be re-evaluated with regard to generation time, and future studies need to account for it. Conserved SSRs can potentially also be exploited for a rapid, cost-efficient, and yet comprehensive development of markers for arrays of even distantly related species. They can also be a powerful tool to investigate genome structure and synteny between genomic regions with a resolution that can be orders of magnitude higher than when using genes.
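The two headline figures (2-fold slower per unit time, 8.5-fold faster per generation) can be related with simple arithmetic, since a rate per generation equals the rate per year multiplied by the generation time in years. The sketch below is a back-of-the-envelope consistency check under that single assumption and is not necessarily how the authors performed the correction; the function names are hypothetical, and the resulting generation-time ratio is an implied value, not one reported in the text.

```python
def per_generation_ratio(per_time_ratio: float,
                         generation_time_ratio: float) -> float:
    """Convert a rate ratio per unit time into a rate ratio per generation.

    Both ratios are Hymenoptera / Diptera. Assumes
    rate per generation = rate per year * years per generation.
    """
    return per_time_ratio * generation_time_ratio


def implied_generation_time_ratio(per_time_ratio: float,
                                  per_generation: float) -> float:
    """Generation-time ratio that reconciles the two reported rate ratios."""
    return per_generation / per_time_ratio


# Figures taken from the text: 2-fold slower per unit time (ratio 0.5),
# 8.5-fold faster per generation (ratio 8.5).
ratio = implied_generation_time_ratio(per_time_ratio=0.5, per_generation=8.5)
print(f"Implied Hymenoptera/Diptera generation-time ratio: {ratio:g}")
```

Read this way, the reversal of the conclusion depends entirely on how much longer the Hymenopteran generation times are taken to be, which is why the choice of generation-time estimates matters for any re-evaluation of earlier results.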

Supplementary Material

Supplementary files S1–S5 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).
  87 in total

1.  Evolution of the mitochondrial rps3 intron in perennial and annual angiosperms and homology to nad5 intron 1.

Authors:  J Laroche; J Bousquet
Journal:  Mol Biol Evol       Date:  1999-04       Impact factor: 16.240

2.  Effective population size of natural populations of Drosophila buzzatii, with a comparative evaluation of nine methods of estimation.

Authors:  J S F Barker
Journal:  Mol Ecol       Date:  2011-09-27       Impact factor: 6.185

3.  Phylogeny of the ants: diversification in the age of angiosperms.

Authors:  Corrie S Moreau; Charles D Bell; Roger Vila; S Bruce Archibald; Naomi E Pierce
Journal:  Science       Date:  2006-04-07       Impact factor: 47.728

4.  Metabolic rate, generation time, and the rate of molecular evolution in birds.

Authors:  A O Mooers; P H Harvey
Journal:  Mol Phylogenet Evol       Date:  1994-12       Impact factor: 4.286

5.  Role for CCG-trinucleotide repeats in the pathogenesis of chronic lymphocytic leukemia.

Authors:  R L Auer; C Jones; R A Mullenbach; D Syndercombe-Court; D W Milligan; C D Fegan; F E Cotter
Journal:  Blood       Date:  2001-01-15       Impact factor: 22.113

6.  Polymorphism of CAG repeats in androgen receptor of carnivores.

Authors:  Qin Wang; Xiuyue Zhang; Xiaofang Wang; Bo Zeng; Xiaodong Jia; Rong Hou; Bisong Yue
Journal:  Mol Biol Rep       Date:  2011-06-04       Impact factor: 2.316

7.  470 million years of conservation of microsatellite loci among fish species.

Authors:  C Rico; I Rico; G Hewitt
Journal:  Proc Biol Sci       Date:  1996-05-22       Impact factor: 5.349

8.  Nonrecombining genes in a recombination environment: the Drosophila "dot" chromosome.

Authors:  Jeffrey R Powell; Kirstin Dion; Montserrat Papaceit; Montserrat Aguadé; Saverio Vicario; Ryan C Garrick
Journal:  Mol Biol Evol       Date:  2010-10-12       Impact factor: 16.240

9.  African Drosophila melanogaster and D. simulans populations have similar levels of sequence variability, suggesting comparable effective population sizes.

Authors:  Viola Nolte; Christian Schlötterer
Journal:  Genetics       Date:  2008-01       Impact factor: 4.562

10.  Temporal patterns of fruit fly (Drosophila) evolution revealed by mutation clocks.

Authors:  Koichiro Tamura; Sankar Subramanian; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2003-08-29       Impact factor: 16.240
