Evgeniy S Balakirev, Maria Anisimova, Vladimir A Pavlyuchkov, Francisco J Ayala.
Abstract
BACKGROUND: The sperm gene bindin encodes a gamete recognition protein that plays an important role in conspecific fertilization and reproductive isolation of sea urchins. Molecular evolution of the gene has been extensively investigated, with attention focused on the protein-coding regions. Intron evolution has been investigated to a much lesser extent. We have studied nucleotide variability in the complete bindin locus, including two exons and one intron, in the sea urchin Strongylocentrotus intermedius, which is represented by two morphological forms. We have also analyzed all available bindin sequences for two other sea urchin species, S. pallidus and S. droebachiensis.
Year: 2016 PMID: 27176219 PMCID: PMC4866015 DOI: 10.1186/s12863-016-0374-5
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Fig. 1 DNA polymorphism in the bindin gene of the sea urchin Strongylocentrotus intermedius. The numbers above the top sequence represent the position of segregating sites and the start of a deletion or insertion. Nucleotides are numbered from the beginning of our sequence (position 877 in [59], starting the mature bindin protein). The coding nucleotides are in bold. Amino acid replacement polymorphisms are marked with asterisks. Dots indicate the same nucleotide as the reference sequence. The hyphens represent deleted nucleotides. ▲ denotes a deletion; † denotes the absence of a deletion; ▼ denotes an insertion; ‡ denotes the absence of an insertion. The recombinant sequence U-7 is in bold; the putative conversion tract is underlined. The exon-intron coordinates are: 1–237: exon I; 238–1191: intron; 1192–1518: exon II
Fig. 2 Maximum likelihood tree of the bindin sequences of strongylocentrotid sea urchins. The ML tree is based on the Kimura 2-parameter (K2P) model, the best-fitting model of substitution under the maximum likelihood criterion [66]. The numbers at the nodes are bootstrap percentages based on 1,000 replications. The bindin sequences of Hemicentrotus pulcherrimus (AF077318 and AF077319) are used as outgroups. The specimens of S. intermedius are marked with the letters “G” and “U”. DRO = S. droebachiensis; PAL = S. pallidus; POL = S. polyacanthus; PUR = S. purpuratus; PUL = Hemicentrotus pulcherrimus
Table 1 Nucleotide diversity and divergence in the bindin gene of the sea urchins Strongylocentrotus intermedius, S. pallidus, and S. droebachiensis
| | Exon I Syn | Exon I Nsyn | Exon I Total | Exon II Syn | Exon II Nsyn | Exon II Total | Exon I + Exon II Syn | Exon I + Exon II Nsyn | Exon I + Exon II Total | Intron | Silent | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| INT (25) | | | | | | | | | | | | |
| N | 57 | 180 | 237 | 78 | 246 | 324 | 135 | 426 | 561 | 871 | 1009 | 1435 |
| S | 2 | 2 | 4 | 3 | 2 | 5 | 5 | 4 | 9 | 39 | 44 | 48 |
| π | 0.0111 | 0.0025 | 0.0046 | 0.0050 | 0.0018 | 0.0026 | 0.0076 | 0.0021 | 0.0034 | 0.0078 | 0.0077 | 0.0060 |
| θ | 0.0092 | 0.0030 | 0.0045 | 0.0136 | 0.0022 | 0.0050 | 0.0117 | 0.0025 | 0.0047 | 0.0119 | 0.0118 | 0.0090 |
| K int-pul | 0.1925 | 0. | 0.1165 | 0.1465 | 0. | 0.0702 | 0.1654 | 0. | 0.0892 | 0.0739 | 0.0851 | 0.0796 |
| PAL (11) | | | | | | | | | | | | |
| N | 57 | 180 | 237 | 75 | 234 | 309 | 132 | 414 | 546 | 934 | 1069 | 1483 |
| S | 4 | 1 | 5 | 1 | 4 | 5 | 5 | 5 | 10 | 53 | 58 | 63 |
| π | 0.0271 | 0.0046 | 0.0100 | 0.0022 | 0.0073 | 0.0061 | 0.0131 | 0.0061 | 0.0078 | 0.0205 | 0.0195 | 0.0158 |
| θ | 0.0253 | n.a. | 0.0072 | 0.0046 | 0.0058 | 0.0055 | 0.0131 | n.a. | 0.0063 | 0.0194 | n.a. | 0.0145 |
| K pal-pul | 0.1408 | 0. | 0.1042 | 0.1704 | 0. | 0.0755 | 0.1569 | 0. | 0.0883 | 0.0736 | 0.0825 | 0.0786 |
| DRO (16) | | | | | | | | | | | | |
| N | 56 | 180 | 237 | 75 | 240 | 315 | 131 | 420 | 552 | 919 | 1054 | 1474 |
| S | 1 | 3 | 4 | 4 | 4 | 8 | 5 | 7 | 11 | 27 | 32 | 38 |
| π | 0.0041 | 0.0062 | 0.0057 | 0.0192 | 0.0077 | 0.0105 | 0.0128 | 0.0071 | 0.0084 | 0.0080 | 0.0086 | 0.0081 |
| θ | 0.0053 | 0.0050 | 0.0051 | 0.0160 | 0.0050 | 0.0077 | 0.0114 | 0.0050 | 0.0066 | 0.0089 | 0.0092 | 0.0080 |
| K dro-pul | 0.1790 | 0. | 0.1042 | 0.1630 | 0. | 0.0773 | 0.1699 | 0. | 0.0890 | 0.0671 | 0.0784 | 0.0748 |
The region analyzed represents the full mature bindin gene, which includes exon I (excluding the signal peptide), intron, and exon II (excluding the repeat region)
INT = S. intermedius, DRO = S. droebachiensis, PAL = S. pallidus; the number of sequences sampled is given in parentheses after each species code; N, number of sites (indels are excluded); S, number of polymorphic sites; π, average number of nucleotide differences per site among all pairs of sequences ([134], p. 256), obtained for the silent, synonymous, nonsynonymous, and total number of sites; θ, average number of segregating nucleotide sites among all sequences, based on the expected distribution of neutral variants in a panmictic population at equilibrium [135]; K int-pul, K pal-pul, and K dro-pul are the average proportions of nucleotide differences between S. intermedius, S. pallidus, and S. droebachiensis, respectively, and Hemicentrotus pulcherrimus (AF077318 and AF077319), corrected according to [136]; Syn, synonymous sites; Nsyn, nonsynonymous sites; Silent, silent sites (synonymous and noncoding intronic sites). The segregating sites associated with indels are excluded from the π, θ, and K calculations
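The definitions of π and θ above can be illustrated with a minimal Python sketch. It assumes that θ is Watterson's estimator, S / (a_n · N) with a_n = Σ 1/i for i = 1 … n − 1; the source only cites [135] for θ, so this identification is an assumption. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi) for aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(1 for a, b in zip(s1, s2) if a != b) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def watterson_theta(num_segregating, num_sequences, num_sites):
    """Per-site theta, assumed here to be Watterson's estimator S / (a_n * N)."""
    a_n = sum(1.0 / i for i in range(1, num_sequences))
    return num_segregating / (a_n * num_sites)


# Toy alignment (hypothetical data, not from the study).
toy = ["AAAA", "AAAT", "AATT"]
print(round(nucleotide_diversity(toy), 4))

# Intron of S. intermedius from the table above: S = 39, n = 25, N = 871.
print(round(watterson_theta(39, 25, 871), 4))
```

With the intron counts for S. intermedius, the sketch returns about 0.0119, in line with the tabulated θ. Some other cells do not reproduce exactly from S alone, so the published values may rest on slightly different counts, such as the total number of mutations.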
Fig. 3 Sliding-window plots of polymorphism (π, thin line) and divergence (K, thick line) along the bindin gene of S. intermedius (INT), S. pallidus (PAL), and S. droebachiensis (DRO). The bindin sequence of S. polyacanthus (AF077317) was used for the K calculations. Window sizes are 100 nucleotides with 1-nucleotide increments. A schematic representation of the bindin gene is at the bottom. Vertical arrows indicate the locations of the regions with high within-species polymorphism and low between-species divergence
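A sliding-window scan of this kind can be sketched as follows. This is a minimal illustration, assuming gap-free aligned sequences of equal length; the function names are hypothetical, and only the window size (100) and step (1) come from the caption.

```python
from itertools import combinations


def pi_per_site(seqs):
    """Average pairwise differences per site for a block of aligned sequences."""
    n_pairs = 0
    diffs = 0
    for s1, s2 in combinations(seqs, 2):
        n_pairs += 1
        diffs += sum(1 for a, b in zip(s1, s2) if a != b)
    return diffs / (n_pairs * len(seqs[0]))


def sliding_window_pi(seqs, window=100, step=1):
    """Return (window midpoint, pi) pairs along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start + window / 2, pi_per_site(chunk)))
    return results


# Toy example with a 3-nucleotide window (hypothetical data).
toy = ["AAAAAA", "AAATAA", "AAATAT"]
for midpoint, pi in sliding_window_pi(toy, window=3, step=1):
    print(midpoint, round(pi, 4))
```

The divergence curve K would be computed the same way, with each window compared against the outgroup sequence and a multiple-hit correction applied; that step is not shown here.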
McDonald tests
| | Gmax DRO | Gmax PAL | Runs DRO | Runs PAL | K.-S. DRO | K.-S. PAL | Gavg DRO | Gavg PAL |
|---|---|---|---|---|---|---|---|---|
| INT | 22.220 | 17.264 | 34 | 36 | 0.090 | 0.062 | 5.666 | 6.306 |
| PAL | 17.750 | 16.326 | 30 | 38 | 0.089 | 0.080 | 6.896 | 7.494 |
| DRO | -- | 27.985 | -- | 25 | -- | 0.116 | -- | 9.278 |

The P values listed beneath each species row are mostly missing from this copy; the only surviving entries are 0.216 (INT) and 0.491 (PAL).
Gmax, Runs, Kolmogorov-Smirnov (K.-S.), and Gavg are test statistics (see [73, 74]). Marginally significant and significant P values are in bold. The bindin sequences of S. droebachiensis (AF133796) and S. polyacanthus (AF077317) were used in interspecific comparisons. For other comments, see Table 1
Fig. 4 Sliding-window plots of the polymorphism-to-divergence ratio and the average sliding G value along the bindin genes of S. intermedius, S. pallidus, and S. droebachiensis. The bindin sequence of S. polyacanthus (AF077317) was used as an outgroup. Window size is 10 variable substitutions for the polymorphism-to-divergence ratio and 12 variable substitutions for the average sliding G value. For other comments, see Fig. 3
McDonald-Kreitman test
| | INT Fixed | INT Polymorphic | PAL Fixed | PAL Polymorphic | DRO Fixed | DRO Polymorphic |
|---|---|---|---|---|---|---|
| Silent | 24 | 45 | 17 | 59 | 20 | 33 |
| Replacement | 14 | 5 | 13 | 5 | 12 | 5 |
| % Replacement | 50.0 | 10.0 | 43.3 | 7.8 | 37.5 | 13.2 |
| Fisher’s test | | | | | | |
| G test | | | | | | |
The bindin sequence of S. polyacanthus (AF077317) was used for interspecific comparisons
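The McDonald-Kreitman comparison reduces to a test of independence on a 2 × 2 table of fixed versus polymorphic and silent versus replacement counts. The sketch below is a plain-Python illustration using the middle pair of Fixed/Polymorphic columns from the table above (17, 59, 13, 5). The function names are hypothetical, the G test is shown without any continuity or Williams correction, and the P values it prints are not claimed to match the published ones, which did not survive in this copy.

```python
from math import comb, erfc, log, sqrt


def percent_replacement(silent, replacement):
    """Share of replacement changes within one column, in percent."""
    return 100.0 * replacement / (silent + replacement)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum all tables at least as extreme (no more probable) than the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


def g_test(a, b, c, d):
    """Uncorrected G test of independence; P from chi-square with 1 df."""
    observed = [a, b, c, d]
    total = sum(observed)
    rows = [a + b, a + b, c + d, c + d]
    cols = [a + c, b + d, a + c, b + d]
    g = 2.0 * sum(
        o * log(o * total / (r * k))
        for o, r, k in zip(observed, rows, cols)
        if o > 0
    )
    return g, erfc(sqrt(g / 2.0))


# Silent fixed, silent polymorphic, replacement fixed, replacement polymorphic.
a, b, c, d = 17, 59, 13, 5
print(round(percent_replacement(a, c), 1), round(percent_replacement(b, d), 1))
print(fisher_exact_two_sided(a, b, c, d))
print(g_test(a, b, c, d))
```

The two percentages reproduce the "% Replacement" entries for that column pair (43.3 and 7.8). An excess of fixed replacements relative to polymorphic ones is the pattern usually read as evidence of positive selection.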
Fig. 5 Sliding-window plots for Wall’s (1999) B neutrality test statistic along the bindin gene region in S. intermedius (INT), S. pallidus (PAL), and S. droebachiensis (DRO). The total length of the sequence is 1398 bp with indels excluded. Window sizes are 50 nucleotides with five-nucleotide increments. Thin horizontal lines indicate expected values of the B neutrality test statistic with P = 0.01 (lower line) and P = 0.001 (upper line) obtained by coalescent simulations conditioned on the number of polymorphic sites without recombination. For other comments, see Fig. 3
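For orientation, Wall's B is commonly defined as B′ / (S − 1), where B′ counts the pairs of adjacent segregating sites that are congruent, meaning the two sites together show only two haplotypes in the sample. The sketch below follows that definition under a simplifying assumption: only biallelic sites are used. The function name and toy data are hypothetical, and the coalescent simulations used for significance thresholds are not shown.

```python
def walls_b(seqs):
    """Wall's B: fraction of adjacent segregating-site pairs that are congruent.

    Returns None when fewer than two biallelic segregating sites are present.
    """
    length = len(seqs[0])
    # Collect alignment columns that carry exactly two alleles.
    columns = []
    for i in range(length):
        col = tuple(s[i] for s in seqs)
        if len(set(col)) == 2:
            columns.append(col)
    if len(columns) < 2:
        return None
    congruent = 0
    for left, right in zip(columns, columns[1:]):
        # Congruent pair: only two distinct two-site haplotypes in the sample.
        if len(set(zip(left, right))) == 2:
            congruent += 1
    return congruent / (len(columns) - 1)


# Toy sample of four sequences with four segregating sites (hypothetical data).
toy = ["ACGT", "ACGA", "TGGA", "TGCA"]
print(walls_b(toy))
```

In the toy sample, only the first pair of adjacent sites is congruent, so B is one third. Applied in 50-nucleotide windows with five-nucleotide steps, high B values point to stretches of strong linkage disequilibrium.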
Sites inferred under positive selection using the Bayesian prediction based on codon models M2a and M8
| Data set | Positively selected sites | M2a NEB | M2a BEB | M8 NEB | M8 BEB |
|---|---|---|---|---|---|
| All species | 2 P/V/G | 0.941 | 0.945 | 0.962* | 0.974* |
| | 35 G/R/A | 0.825 | 0.842 | 0.862 | 0.901 |
| | 74 I/F/V/I/T | 0.642 | 0.704 | 0.704 | 0.793 |
| | 160 F/L | 0.745 | 0.747 | 0.763 | 0.801 |
| | 188 V/G/S/A | 1.000** | 1.000** | 1.000** | 1.000** |
| | 199 L/Q/R | 0.993** | 0.991* | 0.996** | 0.996** |
| [label missing] | 66 I/L | 1.000** | 0.668 | 1.000** | 0.777 |
| | 162 P/V | 1.000** | 0.816 | 1.000** | 0.904 |
| | 188 A/G | 1.000** | NA | 1.000** | 0.580 |
| | 199 L/I | 1.000** | NA | 1.000** | 0.582 |
Site numbers are mapped to the full species alignment after the intron sites were removed
*: P > 95 %; **: P > 99 %. See Additional file 5: Figure S5 for the bindin amino acid alignment
Models of variable ζ among sites
| Model | Free parameters | Site classes | Proportions of sites from a corresponding class |
|---|---|---|---|
| Neutral | ζ0, p0 | ζ0 < 1, ζ1 = 1 | p0, p1 = 1 − p0 |
| Two-category | ζ0, ζ1, p0 | ζ0 < 1, ζ1 ≥ 1 | p0, p1 = 1 − p0 |
Parameter estimates for ζ-models
| Data | Model | κ | ω | ζ-model parameters (a) | Log-likelihood values |
|---|---|---|---|---|---|
| INT | Neutral | 3.30 | 0.49 | ζ0 = 0.99, | −2886.217945 |
| | Two-category | 2.62 | 0.53 | ζ0 = 0.69, | −2870.740681 |
| PAL | Neutral | 3.12 | 0.27 | ζ0 = 1.00, | −2533.273804 |
| | Two-category | 2.87 | 2.12 | ζ0 = 0.00, | −2522.600909 |
| DRO | Neutral | 1.14 | 0.31 | ζ0 = 0.00, | −2279.474923 |
| | Two-category | 1.17 | 0.41 | ζ0 = 0.00, | −2277.438710 |

(a) Values in square brackets are fixed (ζ1 in neutral model) or calculated from estimates (p1 = 1 − p0)
See Table 1 for the species designation
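One common way to compare nested models such as the neutral and two-category ζ-models is a likelihood ratio test on the tabulated log-likelihoods. The sketch below computes 2ΔlnL for each species. The P values assume a chi-square reference distribution with one degree of freedom, which is an assumption made for illustration and not a procedure stated in this record; boundary constraints on ζ1 may call for a different reference distribution. The function names are hypothetical.

```python
from math import erfc, sqrt


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df1(x):
    """Upper tail of the chi-square distribution with 1 degree of freedom."""
    return erfc(sqrt(x / 2.0))


# (neutral, two-category) log-likelihoods copied from the table above.
log_likelihoods = {
    "INT": (-2886.217945, -2870.740681),
    "PAL": (-2533.273804, -2522.600909),
    "DRO": (-2279.474923, -2277.438710),
}

for species, (lnl_neutral, lnl_two_cat) in log_likelihoods.items():
    stat = lrt_statistic(lnl_neutral, lnl_two_cat)
    # df = 1 is an assumption; see the note above.
    print(species, round(stat, 3), chi2_sf_df1(stat))
```

The statistic is about 30.95 for INT, 21.35 for PAL, and 4.07 for DRO, so the improvement from the two-category model is largest in S. intermedius and smallest in S. droebachiensis.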