Joan Ho-Huu, Joëlle Ronfort, Stéphane De Mita, Thomas Bataillon, Isabelle Hochu, Audrey Weber, Nathalie Chantret.
Abstract
BACKGROUND: Gene duplications are a molecular mechanism potentially mediating the generation of functional novelty. However, the probabilities of maintenance and functional divergence of duplicated genes are shaped by the selective pressures acting on gene copies immediately after the duplication event. The ratio of non-synonymous to synonymous substitution rates in protein-coding sequences provides a means to investigate selective pressures based on genic sequences. Three molecular signatures can reveal early stages of functional divergence between gene copies: a change in the level of purifying selection between paralogous genes, the occurrence of positive selection, and transient relaxed purifying selection following gene duplication. We studied three pairs of genes that are known to be involved in an interaction with symbiotic bacteria and were recently duplicated in the history of the Medicago genus (Fabaceae). We sequenced two pairs of polygalacturonase genes (Pg11-Pg3 and Pg11a-Pg11c) and one pair of auxin transporter-like genes (Lax2-Lax4) in 17 species belonging to the Medicago genus, and searched for molecular signatures of differentiation between copies.
Year: 2012 PMID: 23025552 PMCID: PMC3517903 DOI: 10.1186/1471-2148-12-195
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. Phylogenetic trees for Lax and Pg genes. Phylogenetic trees obtained for Lax2 and Lax4 (a) and for Pg3, Pg11a and Pg11c genes (b). Bootstrap values (% of 100 re-sampled data sets) are indicated for each branch. For presentation convenience, the branch leading to the outgroup was shortened (as indicated by an inclined double segment). Sequences underlined in light grey represent instances where a paralog is placed in the wrong clade when considering the copy-specific primers used for its amplification.
Figure 2. Schematic representation of the codon models used. (a) Models allowing dN/dS variation along lineages. Arrows indicate the questions addressed by the comparison between models (in (a), the arrows correspond to hierarchical relationships). (b) Models allowing dN/dS variation along the gene. In M8A, ω follows a β distribution discretized into 10 categories of similar frequency (0 < ω1–10 < 1) and an additional category of ω is fixed at 1 (ωad = 1, accounting for neutral sites); M8 differs from M8A only in the additional category of ω, which is constrained to be greater than 1 (ωad > 1) to account for sites under positive selection. (c) The phylogenetic tree harbours two clades, one for each paralog (the outgroup is not represented here). Models either specify identical categories of dN/dS in both clades (M3), or allow one category to take a different value in each clade (MD).
Branch models: estimated parameters and log-likelihood ratio tests
| Model | logL | np | ω estimates (OG first) | Comparison | LRT | P |
|---|---|---|---|---|---|---|
| M0 | −2847.12 | 58 | OG 0.05; 0.08 | | | |
| MP | −2846.50 | 59 | OG 0.05; 0.07; 0.10 | vs. M0 | 1.24 | 0.26 |
| MA | −2840.62 | 59 | OG 0.05; 0.14; 0.05 | vs. M0 | 13.0** | 0.00031 |
| MPA | −2839.60 | 61 | OG 0.05; 0.19; 0.11; 0.04; 0.06 | vs. MP | 13.8** | 0.001 |
| | | | | vs. MA | 2.04 | 0.36 |
| M0 | −4127.20 | 62 | OG 0.06; 0.33 | | | |
| MP | −4125.06 | 63 | OG 0.06; 0.25; 0.41 | vs. M0 | 4.28* | 0.04 |
| MA | −4124.78 | 63 | OG 0.06; 0.22; 0.38 | vs. M0 | 4.84* | 0.03 |
| MPA | −4123.22 | 65 | OG 0.06; 0.24; 0.21; 0.29; 0.44 | vs. MP | 3.68 | 0.16 |
| | | | | vs. MA | 3.12 | 0.21 |
| M0 | −3604.36 | 40 | OG 0.27; 0.44 | | | |
| MP | −3603.75 | 41 | OG 0.27; 0.38; 0.50 | vs. M0 | 1.22 | 0.27 |
| MA | −3604.19 | 41 | OG 0.27; 0.50; 0.42 | vs. M0 | 0.34 | 0.57 |
| MPA | −3603.39 | 43 | OG 0.27; 0.50; 0.54; 0.35; 0.49 | vs. MP | 0.72 | 0.70 |
| | | | | vs. MA | 1.60 | 0.45 |
Note. Models: for M0, ω is allowed to take a different value only in the branch of the outgroup (OG); for MP, MA and MPA, ω is allowed to take different values according to the tested effect, i.e. “paralogs”, “age” or both combined, as explained in Figure 2 (a). np: number of free parameters; logL: log-likelihood; LRT: likelihood ratio test statistic between the indicated models; one (respectively two) asterisk(s) indicates that the probability of observing such an LRT or higher under the compared model is <0.05 (respectively <0.01), assuming that the LRT follows a χ² distribution with the difference in free parameters between the compared models as the number of degrees of freedom.
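The test described in the note can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `chi2_sf` and `lrt` are hypothetical, the statistic is taken as twice the log-likelihood difference between nested models, and the χ² tail probability is computed from closed-form series for integer degrees of freedom rather than with the software used by the authors.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k!
        term = total = 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction series
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2))
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2 * phi * total

def lrt(logl_null, logl_alt, np_null, np_alt):
    """Return (LRT statistic, degrees of freedom, P value) for nested models."""
    stat = 2.0 * (logl_alt - logl_null)
    df = np_alt - np_null  # difference in number of free parameters
    return stat, df, chi2_sf(stat, df)

# MA vs. M0, using the logL and np values of the first block of the table
stat, df, p = lrt(-2847.12, -2840.62, 58, 59)
print(f"LRT = {stat:.2f}, df = {df}, P = {p:.5f}")
```

With the log-likelihoods of the first MA vs. M0 comparison, the sketch gives an LRT of about 13.0 with one degree of freedom, in line with the tabulated value.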
Site models results: estimated parameters and log-likelihood ratio tests
| Model | np | logL | β parameters | Additional class | Comparison | LRT | P |
|---|---|---|---|---|---|---|---|
| M8A | 35 | −1600.22 | p = 0.01, q = 2.86 | ωad = 1, pad = 0.03 | | | |
| M8 | 36 | −1600.19 | p = 0.01, q = 3.00 | ωad = 1.10, pad = 0.03 | vs. M8A | 0.04 | 0.84 |
| M8A | 23 | −1391.38 | p = 6.30, q = 99.00 | ωad = 1, pad = 0.00 | | | |
| M8 | 24 | −1391.38 | p = 6.30, q = 99.00 | ωad = 1.00, pad = 0.00 | vs. M8A | 0.00 | 1 |
| M8A | 23 | −1616.24 | p = 4.89, q = 99.0 | ωad = 1.00, pad = 0.24 | | | |
| M8 | 24 | −1615.18 | p = 0.13, q = 0.40 | ωad = 4.88, pad = 0.01 | vs. M8A | 2.11 | 0.15 |
| M8A | 39 | −2221.99 | p = 2.16, q = 99.00 | ωad = 1, pad = 0.36 | | | |
| M8 | 40 | −2210.38 | p = 0.01, q = 20.01 | ωad = 6.30, pad = 0.04 | vs. M8A | 23.22** | 1.44 × 10⁻⁶ |
| M8A | 19 | −1388.97 | p = 2.32, q = 89.36 | ωad = 1.00, pad = 0.35 | | | |
| M8 | 20 | −1385.65 | p = 0.47, q = 0.96 | ωad = 11.61, pad = 0.02 | vs. M8A | 6.64** | 9.99 × 10⁻³ |
| M8A | 21 | −1519.41 | p = 0.01, q = 2.54 | ωad = 1.00, pad = 0.32 | | | |
| M8 | 22 | −1512.06 | p = 0.01, q = 0.05 | ωad = 4.45, pad = 0.10 | vs. M8A | 14.69** | 1.26 × 10⁻⁴ |
Note. Models: in M8A, ω follows a β distribution discretized into 10 categories of similar frequency (0 < ω1–10 < 1), plus an additional category fixed at ωad = 1 to account for neutral sites; M8 differs from M8A only in the additional category of ω, which is constrained to be greater than 1 (ωad > 1) to account for sites under positive selection.
Parameters: p and q are the parameters of the β distribution; pad and ωad are the frequency and value of the additional class of ω (ωad is fixed to 1 in M8A). np: number of free parameters; logL: log-likelihood; LRT: likelihood ratio test statistic between the indicated models; one (respectively two) asterisk(s) indicates that the probability of observing such an LRT or higher under the compared model is <0.05 (respectively <0.01), assuming that the LRT follows a χ² distribution with the difference in free parameters between the compared models as the number of degrees of freedom.
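To make the "β distribution discretized into 10 categories of similar frequency" concrete, the following Python sketch approximates one representative ω per category for given p and q. Several choices here are assumptions rather than the authors' procedure: the function name `beta_class_medians` is hypothetical, each class is represented by its median (codon-model software may use class means instead), and the integration is a plain midpoint rule that is only suitable for well-behaved shapes (p and q of at least 1).

```python
import math

def beta_class_medians(p, q, k=10, grid=200_000):
    """Approximate the median of each of k equal-probability classes of Beta(p, q).

    Midpoint-rule integration on a regular grid; assumes p, q >= 1 so the
    density does not blow up at 0 or 1.
    """
    log_norm = math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q)
    step = 1.0 / grid
    xs = [(j + 0.5) * step for j in range(grid)]
    weights = [
        math.exp(log_norm + (p - 1) * math.log(x) + (q - 1) * math.log1p(-x)) * step
        for x in xs
    ]
    total = sum(weights)  # close to 1; renormalises the integration error
    # Class i covers cumulative probability [i/k, (i+1)/k]; its median sits mid-way
    targets = [(2 * i + 1) / (2 * k) for i in range(k)]
    medians, cum, i = [], 0.0, 0
    for x, w in zip(xs, weights):
        cum += w / total
        while i < k and cum >= targets[i]:
            medians.append(x)
            i += 1
    return medians

# Example with one of the estimated parameter pairs from the table (p = 6.30, q = 99.00)
omegas = beta_class_medians(6.30, 99.00)
print([round(w, 3) for w in omegas])
```

Under M8A or M8, each of these ten classes would carry a frequency of (1 − pad)/10, with the remaining pad assigned to the additional class ωad.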
Branch-site models: estimated parameters and log-likelihood ratio tests
| Model | np | logL | Comparison | LRT | P | Site-class proportion | ω (a / b: values in the two paralog clades) |
|---|---|---|---|---|---|---|---|
| M3 (2) | 58 | −2375.94 | | | | | |
| MD (2) | 59 | −2369.54 | vs. M3 (2) | 12.8** | 3.47 × 10⁻⁴ | 0.76 | 0.00 |
| | | | | | | 0.24 | 0.15 / 0.52 |
| M3 (3) | 60 | −2375.94 | | | | | |
| MD (3) | 61 | −2361.85 | vs. M3 (3) | 28.17** | 1.11 × 10⁻⁷ | 0.77 | 0.005 |
| | | | | | | 0.03 | 0.97 |
| | | | | | | 0.20 | 0.55 / 0.97 |
| M3 (2) | 61 | −3411.03 | | | | | |
| MD (2) | 62 | −3407.95 | vs. M3 (2) | 6.16* | 0.01 | 0.68 | 0.09 |
| | | | | | | 0.32 | 0.74 / 1.35 |
| M3 (3) | 63 | −3400.58 | | | | | |
| MD (3) | 64 | −3399.59 | vs. M3 (3) | 1.98 | 0.16 | | |
| M3 (2) | 40 | −2216.23 | | | | | |
| MD (2) | 41 | −2214.01 | vs. M3 (2) | 4.45* | 0.03 | 0.79 | 0.12 |
| | | | | | | 0.21 | 1.54 / 3.13 |
| M3 (3) | 42 | −2210.38 | | | | | |
| MD (3) | 43 | −2209.37 | vs. M3 (3) | 2.01 | 0.16 | | |
Note. Models: for M3 (discrete model), ω is free to take the number of values indicated in brackets (k); these values are homogeneous across all branches of the tree; for MD, as explained in Figure 2 (c), one category of ω is allowed to differ between the two paralogous gene clades of the tree [33]. np: number of free parameters; logL: log-likelihood; LRT: likelihood ratio test statistic between the indicated models; one (respectively two) asterisk(s) indicates that the probability of observing such an LRT or higher under the compared model is <0.05 (respectively <0.01), assuming that the LRT follows a χ² distribution with the difference in free parameters between the compared models as the number of degrees of freedom.