Svitlana Tyekucheva, Kateryna D Makova, John E Karro, Ross C Hardison, Webb Miller, Francesca Chiaromonte.
Abstract
BACKGROUND: The evolutionary distance between human and macaque is particularly attractive for investigating local variation in neutral substitution rates, because substitutions can be inferred more reliably than in comparisons with rodents and are less influenced by the effects of current and ancient diversity than in comparisons with closer primates. Here we investigate the human-macaque neutral substitution rate as a function of a number of genomic parameters.
Year: 2008 PMID: 18447906 PMCID: PMC2643947 DOI: 10.1186/gb-2008-9-4-r76
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Regression results for neutral substitution rates estimated from non-CpG and all sites
| | Non-CpG sites | | | | All sites | | | |
| Predictors | t value* | Significance† | VIF‡ | Variability explained§ | t value* | Significance† | VIF‡ | Variability explained§ |
| X chromosome/autosome indicator | 13.94 | <10⁻⁴ | 1.2 | 0.08 | 15.25 | <10⁻⁴ | 1.3 | 0.09 |
| GC content | | | | | | | | |
| Linear term | -10.34 | <10⁻⁴ | 3.7 | 0.12 | -5.08 | <10⁻⁴ | 3.4 | 0.14 |
| Quadratic term | 15.85 | <10⁻⁴ | 1.3 | | 18.78 | <10⁻⁴ | 1.2 | |
| Exon density | -7.03 | <10⁻⁴ | 2.4 | 0.02 | -9.37 | <10⁻⁴ | 2.4 | 0.03 |
| SNP density | 6.25 | <10⁻⁴ | 1.2 | 0.02 | 6.85 | <10⁻⁴ | 1.2 | 0.02 |
| Male recombination rate | 3.69 | 0.003 | 1.6 | 0.01 | 4.46 | <10⁻⁴ | 1.6 | 0.01 |
| Female recombination rate | NS | NS | NS | NS | NS | NS | NS | NS |
| Distance to telomere | | | | | | | | |
| Linear term | -12.33 | <10⁻⁴ | 2.5 | 0.06 | -16.78 | <10⁻⁴ | 2.5 | 0.11 |
| Quadratic term | 7.63 | <10⁻⁴ | 2.0 | | 10.77 | <10⁻⁴ | 2.0 | |
| Mouse-rat orthologous neutral rate | 7.95 | <10⁻⁴ | 1.8 | 0.09 | 6.64 | <10⁻⁴ | 1.4 | 0.07 |
| Dog-cow orthologous neutral rate | 10.56 | <10⁻⁴ | 1.3 | | 10.41 | <10⁻⁴ | 1.4 | |
| Multiple R² | 0.52 | | | | 0.53 | | | |
| Adjusted R² | 0.52 | | | | 0.52 | | | |
Non-CpG and all sites were taken in ancestral repeats orthologous to mouse, rat, dog and cow for each of 2,270 windows of size 1 Mb. *t value, test statistic of null hypothesis that each predictor's coefficient is equal to zero; †p-values adjusted for multiple tests (using Bonferroni correction); ‡VIF, variance inflation factor; §relative contribution to explained variability computed for each predictor. NS, non-significant
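The footnote refers to Bonferroni-adjusted p-values and variance inflation factors. A minimal sketch of both quantities follows, using their standard textbook definitions; the function names and the raw p-values are hypothetical, and this is not necessarily how the authors computed them.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each raw p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def vif_from_r2(r_squared):
    """VIF of a predictor, given the R^2 from regressing it on all the other predictors."""
    return 1.0 / (1.0 - r_squared)


# Hypothetical raw p-values for three tests (illustrative only, not from the paper).
raw = [0.0005, 0.02, 0.4]
adjusted = bonferroni(raw)

# Inverting the definition: a VIF of 3.7 (the value listed for the linear GC term
# at non-CpG sites) implies that roughly 73% of that predictor's variance is
# explained by the remaining predictors.
implied_r2 = 1.0 - 1.0 / 3.7
```

Under this definition, a VIF close to 1 indicates a predictor that is nearly uncorrelated with the others, and larger values indicate growing collinearity.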
Figure 1. Neutral rates, GC and distance to telomeres. (a-d) Scatter plots of human-macaque neutral substitution rates from non-CpG and all sites in ancestral repeats against human GC content ((a) and (b), respectively) and distance to telomeres ((c) and (d), respectively). Each point represents one of 2,270 windows of size 1 Mb. Lowess smoothers are superimposed on the plots to help visualize the relationships. These non-parametric fits reveal some curvature in the way GC content and distance to telomeres are related to neutral substitutions, which is consistent with the significant quadratic terms in our regression fits.
Correlations between neutral substitution rates in orthologous regions
| | All sites | | | Non-CpG sites | | |
| | Human-macaque | Mouse-rat | Dog-cow | Human-macaque | Mouse-rat | Dog-cow |
| All sites | | | | | | |
| Human-macaque | | 0.28 | 0.42 | 0.9 | 0.28 | 0.48 |
| Mouse-rat | <10⁻⁴ | | 0.05 | 0.37 | 0.89 | 0.22 |
| Dog-cow | <10⁻⁴ | 0.02 | | 0.27 | -0.13 | 0.87 |
| Non-CpG sites | | | | | | |
| Human-macaque | <10⁻⁴ | <10⁻⁴ | <10⁻⁴ | | 0.44 | 0.45 |
| Mouse-rat | <10⁻⁴ | <10⁻⁴ | 0.51 | <10⁻⁴ | | 0.26 |
| Dog-cow | <10⁻⁴ | <10⁻⁴ | <10⁻⁴ | <10⁻⁴ | <10⁻⁴ | |
Upper-right off-diagonal: pairwise Pearson's correlation coefficients between human-macaque, mouse-rat and dog-cow orthologous substitution rates estimated from non-CpG and all sites in ancestral repeats orthologous to mouse, rat, dog and cow for each of 2,270 windows of size 1 Mb. Lower-left off-diagonal: p-values expressing significance of the correlation coefficients.
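As an illustration of how a correlation coefficient and its p-value relate at this sample size, the sketch below computes Pearson's r and an approximate two-sided p-value. It assumes the usual t transformation of r and, because n = 2,270 is large, replaces the t distribution with a normal approximation; the function names are hypothetical and the authors' exact test is not stated here.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_two_sided_p(r, n):
    """Approximate two-sided p-value for H0: correlation = 0.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and a normal approximation to the
    t distribution, which is reasonable only for large n.
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


# r = 0.05 over 2,270 windows (the all-sites mouse-rat vs dog-cow entry above).
p_weak = approx_two_sided_p(0.05, 2270)
```

With these inputs the approximation gives a p-value of roughly 0.017, in line with the 0.02 listed for that pair, which shows that even a very weak correlation can reach nominal significance with this many windows.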
Regression results for neutral substitution rates estimated from CpG sites
| | CpG sites | | | |
| Predictors | t value* | Significance† | VIF‡ | Variability explained§ |
| X chromosome/autosome indicator | 13.99 | <10⁻⁴ | 1.1 | 0.02 |
| GC content | | | | |
| Linear term | -57.37 | <10⁻⁴ | 2.7 | 0.32 |
| Quadratic term | 5.73 | <10⁻⁴ | 1.2 | |
| Exon density | -6.28 | <10⁻⁴ | 2.3 | 0.003 |
| SNP density | NS | NS | NS | NS |
| Male recombination rate | NS | NS | NS | NS |
| Female recombination rate | NS | NS | NS | NS |
| Distance to telomeres | | | | |
| Linear term | NS | NS | NS | NS |
| Quadratic term | NS | NS | NS | |
| Multiple R² | 0.82 | | | |
| Adjusted R² | 0.82 | | | |
CpG sites were taken in ancestral repeats (without requiring orthology to mouse, rat, dog and cow) for each of 2,270 windows of size 1 Mb. *t value, test statistic of null hypothesis that each predictor's coefficient is equal to zero; †p-values adjusted for multiple tests (using Bonferroni correction); ‡VIF, variance inflation factor; §relative contribution to explained variability computed for each predictor. NS, non-significant.
Figure 2. Neutral rates, GC and CpG content. Scatter plots of (a) human-macaque JC neutral substitution rates against GC content, for CpG sites (triangles), non-CpG sites (circles), and 'union' sites (crosses), and (b) fraction of CpG sites against GC content. Each point represents one of 2,270 windows of size 1 Mb. Lowess smoothers are superimposed on the plots to help visualize the relationships. Note the different scales on the truncated y-axis for (a).
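Assuming "JC" in the caption refers to the Jukes-Cantor model, the rate for a window can be sketched as the standard Jukes-Cantor correction of the observed fraction of differing sites. The function name and the counts below are hypothetical and only illustrate the formula d = -(3/4) ln(1 - (4/3) p).

```python
import math


def jukes_cantor_distance(mismatches, aligned_sites):
    """Jukes-Cantor distance from the observed proportion of differing sites."""
    p = mismatches / aligned_sites
    if p >= 0.75:
        # The correction is undefined once p reaches the saturation value 3/4.
        raise ValueError("observed divergence too high for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical window: 60 differing sites out of 1,000 aligned neutral sites.
d = jukes_cantor_distance(60, 1000)
```

For small divergences the corrected distance is only slightly larger than the raw proportion (here about 0.0625 versus 0.06), because few sites are expected to have been hit more than once.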
Associations between human-macaque neutral substitution rates and frequencies of various classes of functional elements
| Class of elements | Short description | Conservation based | Reference | Correlation coefficient | Partial correlation coefficient |
| phyloHMM (P) | Predicted functional elements; highly conserved non-exonic sequences identified by phyloHMM | Yes (17 vertebrate species) | [52] | -0.32 | -0.38 |
| ESPERR-RP (P) | Predicted regulatory elements; non-exonic sequences with high regulatory potential, as measured by the ESPERR-RP score | Yes (7 mammalian species) | [39] | -0.24 | -0.30 |
| Enhancers (P) | Predicted enhancers; non-exonic sequences under strong constraint in human-rodent comparisons | Yes (human, mouse, rat) | [38,48] | -0.06 | -0.22 |
| CTCF-binding sites (P) | Predicted CTCF binding sites; identified by single sequence motif finding methods | No | [41] | -0.12 | -0.10 |
| CTCF-binding sites (E) | Experimentally mapped CTCF binding sites | No | [41] | -0.20 | -0.08 |
| ER binding sites (E) | Experimentally mapped estrogen receptor binding sites | No | [42] | -0.14 | -0.09 |
| RNA polymerase II binding sites (E) | Experimentally mapped RNA polymerase II binding sites | No | [42] | -0.11 | 0.01 |
Pearson's correlation and partial correlation coefficients. The substitution rates are estimated from all sites in ancestral repeats (without requiring orthology to mouse, rat, dog and cow) for each of 2,270 windows of size 1 Mb.
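To show how a partial correlation can differ from the plain correlation, the sketch below uses the first-order formula for a single control variable z. This is a simplification: the set of variables actually controlled for in the table is not given here, and the function name and input values are hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of one variable z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Hypothetical inputs: a weak raw correlation between x and y, with z positively
# correlated with both. Controlling for z makes the association more negative.
r_partial = partial_correlation(-0.06, 0.5, 0.3)
```

With these made-up inputs the partial correlation is about -0.25, so a confounder can mask an association in the raw coefficient; the table shows the same qualitative pattern for predicted enhancers (-0.06 raw versus -0.22 partial).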