Emmanuel Buschiazzo, Neil J Gemmell.
Abstract
The sequencing and comparison of vertebrate genomes have enabled the identification of widely conserved genomic elements. Chief among these are genes and cis-regulatory regions, which are often under selective constraints that promote their retention in related organisms. The conservation of elements that either lack function or whose functions are yet to be ascribed has been relatively little investigated. In particular, microsatellites, a class of highly polymorphic repetitive sequences considered by most to be neutrally evolving junk DNA that is too labile to be maintained in distant species, have not been comprehensively studied in a comparative genomic framework. Here, we used the UCSC alignment of the human genome against those of 11 mammalian and five nonmammalian vertebrates to identify and examine the extent of conservation of human microsatellites in vertebrate genomes. Out of 696,016 microsatellites found in human sequences, 85.39% were conserved in at least one other species, whereas 28.65% and 5.98% were found in at least one and three nonprimate species, respectively. An exponential decline of microsatellite conservation with increasing evolutionary time, a comparable distribution of conserved versus nonconserved microsatellites in the human genome, and a positive correlation between microsatellite conservation and overall sequence conservation, all suggest that most microsatellites are only maintained in genomes by chance, although exceptionally conserved human microsatellites were also found in distant mammals and other vertebrates. Our findings provide the first comprehensive survey of microsatellite conservation across deep evolutionary timescales, in this case 450 Myr of vertebrate evolution, and provide new tools for the identification of functional conserved microsatellites, the development of cross-species microsatellite markers and the study of microsatellite evolution above the species level.
Keywords: comparative genomics; mammals; multiple alignment; tandem repeats; vertebrates
Year: 2010 PMID: 20333231 PMCID: PMC2839350 DOI: 10.1093/gbe/evq007
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure: Species-specific microsatellite enrichment. (A) Alignability to the human genome and conservation of human microsatellites in vertebrate species. (B) Scatter plot showing the ratio (rp) of percentage of microsatellite conservation to percentage of alignment relative to human. Dotted lines represent a 5% significance threshold. Species are arranged from left to right by increasing distance (substitution rate) from human (Miller et al. 2007).
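The ratio rp described in panel (B) can be illustrated with a minimal sketch. The function name, the species labels, and all percentages below are hypothetical placeholders chosen for illustration; they are not values or code from the study, and the 5% significance threshold is not modeled here.

```python
def conservation_to_alignment_ratio(pct_conserved, pct_aligned):
    """Return rp = (% of human microsatellites conserved) / (% of human genome aligned)."""
    if pct_aligned <= 0:
        raise ValueError("pct_aligned must be positive")
    return pct_conserved / pct_aligned


# Hypothetical percentages for two placeholder species (NOT values from the paper).
toy_species = {
    "species_a": {"pct_conserved": 30.0, "pct_aligned": 40.0},
    "species_b": {"pct_conserved": 2.0, "pct_aligned": 4.0},
}

# One ratio per species; values below 1 mean that microsatellites are
# conserved less often than the surrounding sequence is aligned.
ratios = {
    name: conservation_to_alignment_ratio(**values)
    for name, values in toy_species.items()
}
```

Under this reading, a species whose rp departs strongly from that of its neighbors would be a candidate for species-specific enrichment or depletion, which is what the scatter plot is meant to reveal.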
Figure: Phylogenetic extent of conservation of human microsatellites. (A) Decay of conservation in different genomic locations as a function of phylogenetic distance from human (Miller et al. 2007). Microsatellite conservation is measured as the fraction of human microsatellites identified in the aligned portion of the human genome that is found conserved in at least one other species. Scatter plots for total microsatellites and microsatellites in introns and intergenic regions (IGR) overlap. (B) Conservation profiles of human microsatellites in vertebrate genomes. Each profile is a proportional distribution of the range of conservation of human microsatellites conserved in at least each of the species, from exclusive (1 species, leftmost bar) to wide (12 species, rightmost bar). Each bar thus represents a percentage of human microsatellites that fall in each range category, for each species, and bars of identical range add up to 100%. No microsatellite was found in 13 species and only one in all 14 species. Species are arranged from left to right by increasing branch length from human (Miller et al. 2007). Primates were excluded to allow the observation of differences among species distantly related to human.
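The decay in panel (A) is described in the abstract as exponential. The source does not state how the curve was fitted, so the following is only a sketch of one common approach: ordinary least squares on the logarithm of the conserved fraction. The function name and the model form `fraction = a * exp(-k * distance)` are assumptions made for illustration.

```python
import math


def fit_exponential_decay(distances, fractions):
    """Fit fraction ~ a * exp(-k * distance) by least squares on log(fraction).

    Returns (a, k). All fractions must be strictly positive, and at least
    two distinct distances are required.
    """
    logs = [math.log(f) for f in fractions]
    n = len(distances)
    mean_d = sum(distances) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    slope = sxy / sxx                  # slope of log(fraction) against distance
    intercept = mean_l - slope * mean_d
    return math.exp(intercept), -slope


# Synthetic, noise-free points generated from a known curve (NOT data from the paper).
toy_distances = [0.1, 0.5, 1.0, 1.5]
toy_fractions = [0.9 * math.exp(-2.0 * d) for d in toy_distances]
a_hat, k_hat = fit_exponential_decay(toy_distances, toy_fractions)
```

Fitting in log space weights small fractions more heavily than a direct nonlinear fit would, so with real, noisy data the two approaches can give somewhat different decay constants.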
Figure: Distribution of human microsatellites conserved in nonprimate species. The number of species is color coded as indicated in the legend.
Covariation between Human Microsatellites and Other Genomic Features
| | HM | G + C | Gene | SINE | LINE | LTR | Rrecomb | SNP | cIND | tfbs |
| HM | — | n.s. | −0.22*** | 0.11*** | −0.37*** | −0.26*** | −0.33*** | −0.05** | 0.41*** | 0.40*** |
| HCM | 0.98*** | −0.07*** | −0.25*** | 0.14*** | −0.32*** | −0.24*** | −0.31*** | −0.09*** | 0.45*** | 0.42*** |
| PSM | 0.90*** | −0.11*** | −0.27** | 0.17*** | −0.28*** | −0.12*** | −0.26*** | n.s. | 0.16*** | 0.15*** |
| NPM | 0.84*** | n.s. | −0.19*** | 0.08*** | −0.27*** | −0.29*** | −0.28*** | −0.16*** | 0.67*** | 0.61*** |
| NP3M | 0.63*** | 0.17*** | 0.04* | 0.15*** | −0.35*** | −0.41*** | 0.25*** | −0.23*** | 0.74*** | 0.75*** |
| A + T-rich | — | −0.33*** | −0.38*** | 0.24*** | −0.04* | −0.14*** | −0.13*** | −0.21*** | 0.61*** | 0.48*** |
| G + C-rich | — | 0.65*** | 0.45*** | 0.51*** | −0.62*** | −0.54*** | 0.35*** | −0.08*** | 0.41*** | 0.57*** |
| AT = GC | — | n.s. | −0.24*** | 0.17*** | −0.20*** | −0.17*** | −0.32*** | −0.07*** | 0.54*** | 0.46*** |
NOTE. Left to right: Density of microsatellites in aligned sequences (HM) and conserved in at least one species (HCM), primates only (PSM), and at least 1 (NPM) and 3 (NP3M) nonprimate species; NPMs are also differentiated as A + T-rich (motif G + C content <50%), G + C-rich (>50%), and AT = GC (=50%) (see Materials and Methods); G + C content; gene density; SINE, LINE, and LTR coverage; average recombination rate (Rrecomb); SNP density; indel-purified sequence coverage (cIND); and density of tfbsCons (tfbs). Source: UCSC Genome Browser. Values are Spearman’s rank correlation coefficients (ρ); P value significance: 0 < *** < 0.001 < ** < 0.01 < * < 0.05 < not significant (n.s.).
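To make the table's statistic and star notation concrete, here is a minimal pure-Python sketch of Spearman's ρ (Pearson correlation of average ranks) together with a helper that maps a P value to the labels used above. The function names are illustrative, the handling of P values exactly equal to a threshold is an assumption (the note does not specify it), and the sketch does not compute P values or reproduce the authors' analysis pipeline.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def significance_label(p_value):
    """Map a P value to the table's notation (boundary handling is assumed)."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "n.s."
```

For example, `spearman_rho` applied to two per-window density vectors (say, microsatellite density and SINE coverage in the same genomic windows) would yield one cell of the table, and `significance_label` would supply its asterisks once a P value is available from a statistics package.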