Frédéric J J Chain, Dora Ilieva, Ben J Evans.
Abstract
BACKGROUND: The mechanism by which duplicate genes originate - whether by duplication of a whole genome or of a genomic segment - influences their genetic fates. To study events that trigger duplicate gene persistence after whole genome duplication in vertebrates, we have analyzed molecular evolution and expression of hundreds of persistent duplicate gene pairs in allopolyploid clawed frogs (Xenopus and Silurana). We collected comparative data that allowed us to tease apart the molecular events that occurred soon after duplication from those that occurred later on. We also quantified expression profile divergence of hundreds of paralogs during development and in different tissues.
Year: 2008 PMID: 18261230 PMCID: PMC2275784 DOI: 10.1186/1471-2148-8-43
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. Phylogenetic and genealogical relationships of species and paralogs in this study. Phylogenetic relationships are depicted among species, orthologs, and paralogs of a diploid with 20 chromosomes, S. tropicalis (ST), two tetraploids with 40 chromosomes, S. epitropicalis (EP) and S. new tetraploid (NT), and four tetraploids with 36 chromosomes, Xenopus laevis (XL), X. borealis (XB), X. gilli (XG), and X. muelleri (XM). (A) Clawed frogs speciate by allopolyploidization and by regular speciation without a change in genome size. Allotetraploidization occurred independently in Xenopus and in Silurana and produced two paralogs in the resulting tetraploid ancestor (α and β), indicated as brown and green lineages, respectively. After allopolyploidization, some of the diploid lineages probably went extinct; these are indicated by a dagger. As a result of these extinctions, the portion of some paralogous lineages that evolved in a diploid, indicated as dashed lines, cannot be separated from the portion that evolved in an allopolyploid. Numbered nodes indicate (0) divergence of the genera Xenopus and Silurana, (1) divergence of the diploid (2n = 18) ancestors of Xenopus, (2) allotetraploidization in Xenopus, (3) the first speciation event of the tetraploid ancestor of extant Xenopus, (4 and 5) more recent speciation events of Xenopus tetraploids, (6) divergence of the diploid (2n = 20) ancestors of Silurana, (7) allotetraploidization in Silurana, and (8) speciation of a tetraploid Silurana without change in genome size. Sequences from individual paralogs were used to construct genealogies in order to compare (B) an early to a later stage of evolution after WGD in XLα, (C) an early to a later stage of evolution after WGD in EPα, and (D) an intermediate to a later stage after WGD in XLα. Depending on the paralog for which data were obtained, sometimes NTα was considered in (C) or XBα was considered in (D).
Table 1. Comparison of alternatively parameterized models of evolution in Fig. 1 indicates no significant difference in the Ka/Ks ratio at an early and a later stage of duplicate gene evolution.
| Comparison | # base pairs | -lnL Ho | -lnL Ha | P value | Ka/Ks combined early and late | Ka/Ks ratio early | Ka/Ks ratio late | Ka/Ks diploid |
|---|---|---|---|---|---|---|---|---|
| Fig. 1B | 80856 | -165602.720 | -165602.386 | 0.414 | 0.164 | 0.158 | 0.169 | 0.126 |
| Fig. 1C | 9717 | -15699.366 | -15697.250 | 1.000 | 0.208 | 0.124 | 0.346 | 0.198 |
| Fig. 1D | 6966 | -13187.865 | -13186.872 | 0.160 | 0.126 | 0.187 | 0.105 | NA |
| Fig. 1B (partitioned) | 80856 | -160085.863 | -159889.926 | 1.000 | NL | Af2 | Af2 | NL |
| Fig. 1C (partitioned) | 9717 | -15400.349 | -15393.089 | 0.888 | NL | Af2 | Af2 | NL |
| Fig. 1D (partitioned) | 6966 | -12983.343 | -12978.034 | 0.807 | NL | Af2 | Af2 | NA |
Indicated for comparisons depicted in Fig. 1B, C and D are likelihoods of the null model (early and later Ka/Ks are the same) and the alternative model (early and later Ka/Ks are not the same), the one-sided probability of the Ka/Ks ratio being higher in the early stage, and the Ka/Ks ratios estimated from each of these models. For the first two tests, the Ka/Ks ratio of the diploid lineage was estimated using a different model where a unique Ka/Ks ratio was estimated for each branch (a free ratio model). Also listed are the joint likelihoods of these models from an analysis partitioned by gene fragment. For the partitioned analyses, Ka/Ks ratios for each fragment are either listed in Additional file 2 (Af2), not listed (NL), or not applicable (NA).
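The comparison of a null and an alternative model described above is a likelihood ratio test. The following is a minimal sketch of that calculation, assuming one extra free parameter in the alternative model and a standard chi-square approximation with one degree of freedom; the function name `lrt_p_value` is hypothetical, and the source does not state exactly how its one-sided P values were derived, so this sketch applies no one-sided adjustment.

```python
import math


def lrt_p_value(lnl_null, lnl_alt):
    """Likelihood ratio test assuming 1 degree of freedom (an assumption).

    Returns the test statistic 2 * (lnL_alt - lnL_null) and the chi-square
    tail probability for that statistic.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For 1 d.f., the chi-square survival function equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Log-likelihoods copied from the Fig. 1B row of the table above.
stat, p = lrt_p_value(-165602.720, -165602.386)
print(f"2*delta lnL = {stat:.3f}, P = {p:.3f}")
```

Under these assumptions the Fig. 1B row gives a small test statistic and a non-significant P value, in line with the conclusion that the early and later Ka/Ks ratios do not differ significantly.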
Table 2. Comparison of alternatively parameterized models of evolution indicates significant departure from neutrality at an early stage of duplicate gene evolution.
| Comparison | # bp | -lnL Ho | -lnL Ha | P value | Fixed Ka/Ks ratio in early lineage in null model | Estimated Ka/Ks ratio in other lineages in null model | Estimated Ka/Ks ratio in early lineage in alternative model | Estimated Ka/Ks ratio in other lineages in alternative model |
|---|---|---|---|---|---|---|---|---|
| Fig. 1B | 80856 | -166032.2641 | -165608.1273 | 0.0000 | 1 | 0.1322 | 0.158 | 0.1415 |
| Fig. 1C | 9717 | -15716.3195 | -15698.0601 | 0.0000 | 1 | 0.2261 | 0.1242 | 0.228 |
| Fig. 1D | 6966 | -13235.97398 | -13187.03582 | 0.0000 | 1 | 0.141 | 0.1052 | 0.1557 |
| Fig. 1B (partitioned) | 80856 | -160755.0615 | -160071.0687 | 0.0000 | 1 | NL | NL | NL |
| Fig. 1C (partitioned) | 9717 | -15436.44329 | -15413.5068 | 0.0000 | 1 | NL | NL | NL |
| Fig. 1D (partitioned) | 6966 | -13077.6404 | -13016.5679 | 0.0000 | 1 | NL | NL | NL |
Likelihoods of a null model with the Ka/Ks ratio fixed at one at an early stage of duplicate gene evolution and an alternative model with this ratio estimated are indicated. Species acronyms are the same as in Fig. 1 and abbreviations are the same as in Table 1.
Figure 2. Functional constraints are similar in early and later stages of duplicate gene evolution in . (A) Binned Ka/Ks of early (blue) and later (red) stages of duplicate gene evolution. (B) Regression of Ka/Ks versus Ks in the early and later stages indicates that selection (relaxed purifying + positive) is not more common in the early stage of duplicate gene evolution (blue dots) than in the later stage (red dots). The Y-intercept of these regression lines was set to zero, and Ka/Ks ratios greater than 2 (including undefined ratios) were given a value of 2. In (A) and (B), a dashed line indicates the neutral expectation. Fragments with Ka/Ks > 2 are, on average, half the size of those with Ka/Ks < 2. Ka/Ks ratios above 2 may therefore be attributable in part to stochastic variance in Ks [43].
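The capping rule and zero-intercept regression described in this caption can be sketched as follows. This is an illustration only: the function names and the (Ka, Ks) values are hypothetical and are not data from the study, and the least-squares slope through the origin is an assumed reading of "Y-intercept set to zero".

```python
CAP = 2.0  # ratios above 2, and undefined ratios, are set to 2 per the caption


def capped_ratio(ka, ks):
    """Ka/Ks with undefined (Ks == 0) or large values replaced by CAP."""
    if ks == 0:
        return CAP
    return min(ka / ks, CAP)


def slope_through_origin(xs, ys):
    """Least-squares slope of y = b * x with the intercept fixed at zero."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical (Ka, Ks) values for four gene fragments, for illustration only.
fragments = [(0.01, 0.10), (0.02, 0.08), (0.03, 0.0), (0.05, 0.02)]
ks_values = [ks for _, ks in fragments]
ratios = [capped_ratio(ka, ks) for ka, ks in fragments]
print(slope_through_origin(ks_values, ratios))
```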
Figure 3. Expression of both paralogs is generally detected in the same treatments, irrespective of the probe specificity (the degree to which each probe matches one but not the other paralog) or the detection threshold (the minimum raw intensity scored as expressed). These data are based on (A) "Standard" and (B) "Conservative" threshold levels for detection of expression, and three probe specificities, labeled low, medium, and high, were compared (see Methods). We report paralogous profiles whose presence/absence scores in all five treatments were identical in the medium and high specificity analyses (shaded in gray on the left of each chart). With the standard and conservative thresholds, 1789 and 1462 genes, respectively, had consistent present/absent expression profiles in the medium and high specificity analyses. These sets of genes included 841 and 632 paralogous pairs, respectively. The tables on the right compare paralogous profiles by tabulating whether both paralogs are present and absent in the same treatments (identical), whether the expression profile of one overlaps entirely with that of the other (overlap), or whether each duplicate has a unique component (distinct).
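The three categories in this caption (identical, overlap, distinct) can be expressed as set comparisons on presence/absence profiles. The sketch below is a hypothetical illustration, not the authors' code; `classify_pair` and the example profiles are assumptions, with each profile given as five present (1) or absent (0) scores, one per treatment.

```python
def classify_pair(profile_a, profile_b):
    """Classify two presence/absence profiles scored over the same treatments."""
    # Collect the indices of treatments in which each paralog is detected.
    a = {i for i, present in enumerate(profile_a) if present}
    b = {i for i, present in enumerate(profile_b) if present}
    if a == b:
        return "identical"  # present and absent in the same treatments
    if a <= b or b <= a:
        return "overlap"    # one profile lies entirely within the other
    return "distinct"       # each paralog has a unique component


# Hypothetical profiles over five treatments.
print(classify_pair((1, 1, 0, 0, 1), (1, 1, 0, 0, 1)))
print(classify_pair((1, 1, 0, 0, 0), (1, 1, 1, 0, 0)))
print(classify_pair((1, 0, 0, 0, 1), (0, 1, 0, 0, 1)))
```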
Figure 4. Binned expression profile correlations between 841 pairs of paralogs over five developmental stages or adult tissue types in the medium specificity analysis. Bars show the proportion of Pearson correlation coefficients between non-paralogous expression profiles (white bars) and between paralogous expression profiles (black bars). Ninety percent of the non-paralogous expression profiles have a Pearson correlation coefficient that is greater than -0.861 but less than 0.865. The Pearson correlation coefficients of 62% of the paralogous expression profiles are less than 0.865, and 0.3% of them are less than -0.861.
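The comparison in this caption rests on a Pearson correlation between two expression profiles measured over the same five treatments, judged against the central 90% range of the non-paralogous distribution. A minimal sketch follows, assuming the bounds quoted in the caption; the function names and the expression values are hypothetical and not taken from the study.

```python
import math

# Bounds of the central 90% of non-paralogous correlations, from the caption.
LOWER, UPPER = -0.861, 0.865


def pearson(xs, ys):
    """Pearson correlation of two equal-length profiles.

    Raises ZeroDivisionError if either profile has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def outside_background(r):
    """True if r falls outside the central 90% non-paralogous range."""
    return r <= LOWER or r >= UPPER


# Hypothetical intensities for two paralogs over five treatments.
paralog_a = [120.0, 340.0, 560.0, 90.0, 410.0]
paralog_b = [100.0, 300.0, 610.0, 70.0, 380.0]
r = pearson(paralog_a, paralog_b)
print(r, outside_background(r))
```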