
Evaluation of six methods for estimating synonymous and nonsynonymous substitution rates.

Abstract

Methods for estimating synonymous and nonsynonymous substitution rates among protein-coding sequences adopt different mutation (substitution) models with subtle yet significant differences, which lead to different estimates of evolutionary information. Little attention has been devoted to comparing these methods, partly because the amount of sequence variation within targeted datasets is unpredictable, and to our knowledge little information about their evaluation is available in the literature. In this study, we compared six widely used methods and provide evaluation results based on simulated sequences. The results indicate that incorporating sequence features (such as transition/transversion bias and nucleotide/codon frequency bias) into a method can yield better performance. We recommend that conclusions related to or derived from Ka and Ks analyses not be drawn from the results of a single method.


Substances: Codon

Year: 2006 PMID: 17127215 PMCID: PMC5054070 DOI: 10.1016/S1672-0229(06)60030-2

Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691

Introduction

In molecular evolution, estimating nonsynonymous (amino-acid replacing) and synonymous (silent) substitution rates among protein-coding sequences, termed Ka and Ks, respectively, is a powerful tool for understanding the mechanisms of DNA sequence evolution, reconstructing phylogenetic trees, and identifying protein-coding exons (refs. 1–5). Ka is the number of nonsynonymous substitutions per nonsynonymous site, and Ks is the number of synonymous substitutions per synonymous site. The Ka/Ks ratio (denoted as ω) is widely used as an estimator of selective strength for DNA sequence evolution, with ω > 1 indicating positive selection, ω < 1 indicating purifying (negative) selection, and ω close to 1 indicating neutral mutation.

Over the past two decades, several methods have been developed for Ka and Ks estimation. Although these methods consider different features of sequence evolution, they fall into two classes: approximate methods and maximum-likelihood methods. Approximate methods normally involve three steps. First, count the numbers of synonymous (S) and nonsynonymous (N) sites (the sum of S and N is scaled to the length of the sequences compared). Second, calculate the numbers of synonymous (Sd) and nonsynonymous (Nd) substitutions (the sum of Sd and Nd equals the number of substitutions between the two sequences). Third, correct for multiple substitutions, because the observed number of substitutions underestimates the real number as sequences diverge over time. In contrast, maximum-likelihood methods use probability theory to carry out all three steps in one go. The symbols used in Ka and Ks estimation are defined in Table 1. In addition, these methods can be classified as nucleotide-based or codon-based according to the mutation models they adopt. In this study, we focus on six of them: the Nei-Gojobori method (NG), the Li-Wu-Luo method (LWL), the Li-Pamilo-Bianchi method (LPB; refs. 10, 11), the Goldman-Yang method (GY), the Yang-Nielsen method (YN), and the modified Yang-Nielsen method (MYN). Among them, only GY is a maximum-likelihood method.
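The first step of an approximate method, counting synonymous and nonsynonymous sites, can be sketched as follows. This is a minimal illustration in the spirit of NG (all nucleotide changes equally likely), not the implementation evaluated in this study. The function names are hypothetical, and the treatment of changes to stop codons (counted here as nonsynonymous) is an assumption, since conventions differ between programs.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Number of synonymous sites (between 0 and 3) in one sense codon."""
    aa = CODE[codon]
    if aa == "*":
        raise ValueError("stop codons are not counted")
    s = 0.0
    for pos in range(3):
        neighbours = [codon[:pos] + b + codon[pos + 1:]
                      for b in BASES if b != codon[pos]]
        # Assumption: a change to a stop codon counts as nonsynonymous.
        s += sum(CODE[n] == aa for n in neighbours) / 3.0
    return s


def count_sites(seq):
    """Return (S, N) for one coding sequence; S + N is its length in nucleotides."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    S = sum(synonymous_sites(c) for c in codons)
    return S, 3 * len(codons) - S


if __name__ == "__main__":
    print(count_sites("ATGGGG"))  # one Met codon and one Gly codon
```

For a pair of sequences, the site counts of the two sequences would typically be averaged before computing Ka and Ks.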

Table 1

Definitions of Symbols Used in Ka and Ks Estimations

Symbol	Definition
S	Number of synonymous sites
N	Number of nonsynonymous sites
S_d	Number of synonymous substitutions
N_d	Number of nonsynonymous substitutions
Ks	Synonymous substitution rate
Ka	Nonsynonymous substitution rate
ω	Estimator of selective strength, ω = Ka/Ks
t	Divergence time between two sequences, measured as the expected number of nucleotide substitutions per codon, t = (Ks × 3S + Ka × 3N)/(S + N)
κ_R	Ratio of the transitional rate between purines to the transversional rate
κ_Y	Ratio of the transitional rate between pyrimidines to the transversional rate
κ	Ratio of the transitional rate to the transversional rate

It should be noted that different methods adopt different mutation models (refs. 12, 15–18) with subtle yet significant differences, which lead to diverse estimates of evolutionary distance. Because Ka, Ks, and ω are broadly applied in molecular evolution, it is necessary to evaluate the accuracy of these methods so that evolutionary information among compared sequences can be captured reliably. To our knowledge, few studies have comprehensively evaluated these six widely used methods. We therefore compared and evaluated them using computer simulations and empirical data. We also recommend that methods for estimating Ka and Ks be used cautiously, and that conclusions related to or derived from Ka and Ks analyses not be drawn from the results of a single method.

Results

Comparative results

Effects of codon frequency bias and transition/transversion bias

We performed simulations to generate long sequences for consistency analysis. Because the ratios of the transitional rate between purines (κR) and between pyrimidines (κY) to the transversional rate often vary from 1.5 to 5, we considered 3.75 a “typical value”. We therefore fixed one of them at 3.75 and let the other vary from 1 to 10. We plotted the estimates of ω calculated with the six methods against κR (fixing κY = 3.75) under three different sets of codon frequencies (Figure 1A–I). Similar results were obtained for fixed κR and variable κY (data not shown).

Fig. 1

Estimated ω with six methods when κY = 3.75, considering κR varying from 1 to 10. Three sets of codon frequencies are used: equal (A, D, G), human (B, E, H) calculated from human protein-coding genes, and rice (C, F, I) derived from rice protein-coding genes. ω = 0.3 (A–C), ω = 1 (D–F), and ω = 3 (G–I) are considered as typical values for purifying selection, neutral mutation, and positive selection, respectively.

According to the results, codon frequencies have an obvious influence on NG, LWL, and LPB, but only a minor influence on GY, YN, and MYN (Figure 1A–I). Although LWL is more biased than NG in ω estimation, both show nearly parallel trends as κR increases and tend to underestimate ω for most of the parameter combinations examined. These results are in substantial agreement with previous studies (refs. 13, 19). LPB, which was proposed as a modification of LWL, sometimes gives close estimates for neutral mutation (Figure 1D–F), but it performs unsteadily as κR increases: it overestimates ω for purifying selection (Figure 1A–C) and underestimates ω for positive selection (Figure 1G–I). Taking the human codon frequencies as an example, when κR = 4 and 10, the estimates of ω given by LPB are 0.316 and 0.345 for ω = 0.3, 0.944 and 0.991 for ω = 1, and 2.380 and 2.406 for ω = 3, respectively. As a whole, LPB performs better than NG and LWL.

GY and YN give similar estimates of ω, primarily because both take account of major features of DNA sequence evolution (transition/transversion rate bias and nucleotide/codon frequency bias). Because they ignore the difference between κR and κY, GY and YN produce close estimates only when κR ≈ κY = 3.75. For instance, when κR = 4, the estimates of ω given by GY and YN under equal codon frequencies are 0.303 and 0.297 for ω = 0.3, 1.010 and 1.012 for ω = 1, and 3.036 and 3.049 for ω = 3, respectively. GY and YN tend to underestimate ω when κR < κY and to overestimate ω when κR > κY, and their biases become more serious as κR approaches either extreme. Compared with NG, LWL, and LPB, GY and YN perform better for most of the parameter combinations tested, which is attributable to their consideration of more evolutionary features.

MYN, a modified YN method, allows for two different ratios of the transitional rate between purines (κR) and between pyrimidines (κY) to the transversional rate, as well as unequal nucleotide/codon frequencies. It becomes equivalent to YN when κR = κY, so the two methods give similar results in that case. For example, when κR = 4, the estimates of ω by YN and MYN under human codon frequencies are 0.306 and 0.307 for ω = 0.3, 1.025 and 1.026 for ω = 1, and 3.024 and 3.023 for ω = 3, respectively. When κR ≠ κY, MYN sometimes yields biased estimates, but it performs better than the other methods for most of the parameter combinations tested.

We also examined Ks estimation and plotted the percentage errors of estimated Ks against κR (Figure 2A–I; see Materials and Methods). NG and LWL tend to overestimate Ks for most of the parameter settings examined, and the bias of LWL is more serious than that of NG, which is consistent with the ω estimates. LPB gives close estimates of Ks only for neutral mutation (Figure 2D–F). It tends to give negative percentage errors of Ks for purifying selection (Figure 2A–C) and positive percentage errors of Ks for positive selection (Figure 2G–I). These results also agree well with the ω estimates.

Fig. 2

Percentage errors of estimated Ks with six methods when κY = 3.75, considering κR varying from 1 to 10. Three sets of codon frequencies are used: equal (A, D, G), human (B, E, H) calculated from human protein-coding genes, and rice (C, F, I) derived from rice protein-coding genes. ω = 0.3 (A–C), ω = 1 (D–F), and ω = 3 (G–I) are considered as typical values for purifying selection, neutral mutation, and positive selection, respectively.

GY and YN produce similar Ks results for most of the parameter combinations: close estimates only when κR ≈ κY = 3.75, overestimation when κR < κY (not apparent in Figure 2B), and underestimation when κR > κY. Compared with NG, LWL, and LPB, which do not fully allow for transition/transversion bias or nucleotide/codon frequency bias, GY and YN both perform better in Ks estimation. MYN gives estimates of Ks similar to those of GY and YN when κR ≈ κY, in agreement with the ω estimates. Taking the human codon frequencies as an example, when κR = 4, the percentage errors of estimated Ks calculated with GY, YN, and MYN are −3.957%, −1.991%, and −2.114% for ω = 0.3, −2.184%, −3.770%, and −3.778% for ω = 1, and −0.709%, −2.614%, and −2.558% for ω = 3, respectively. MYN yields biased estimates when κR < κY and closer estimates when κR > κY. For instance, when κR = 1, the percentage errors of estimated Ks calculated with GY, YN, and MYN are −3.397%, 0.246%, and −9.156% for ω = 0.3, 1.582%, 1.001%, and −7.544% for ω = 1, and 5.779%, 4.383%, and −4.053% for ω = 3, respectively. Similarly, when κR = 10, those with GY, YN, and MYN are −13.949%, −14.498%, and −7.114% for ω = 0.3, −9.991%, −11.787%, and −5.367% for ω = 1, and −6.386%, −8.213%, and −2.006% for ω = 3, respectively. As a whole, MYN is less biased than the other methods for most of the parameter combinations examined.

Estimates of Ka with these six methods were also tested (Figure 3A–I). NG and LWL underestimate Ka, which is consistent with their Ks and ω estimates. NG gives slightly better estimates of Ka than LWL, and both are more biased than the other methods. LPB tends to overestimate Ka for purifying selection (Figure 3A–C; this is not apparent because Ka and Ks are both slightly underestimated owing to a loss of about 4% of sites to mutations leading to stop codons) and to underestimate Ka for neutral mutation and positive selection (Figure 3D–I). GY, YN, and MYN perform similarly and show less bias in Ka estimation. Taking the human codon frequencies as an example, the percentage errors of estimated Ka calculated with GY, YN, and MYN for the expected ω = 3 (Figure 3H) are −6.379%, −7.134%, and −3.466% when κR = 1, −0.749%, −1.851%, and −1.802% when κR = 4, and −1.269%, −2.826%, and −5.638% when κR = 10, respectively. Overall, the biases in Ka given by these methods are smaller than those in the Ks and ω estimates.

Fig. 3

Percentage errors of estimated Ka with six methods when κY = 3.75, considering κR varying from 1 to 10. Three sets of codon frequencies are used: equal (A, D, G), human (B, E, H) calculated from human protein-coding genes, and rice (C, F, I) derived from rice protein-coding genes. ω = 0.3 (A–C), ω = 1 (D–F), and ω = 3 (G–I) are considered as typical values for purifying selection, neutral mutation, and positive selection, respectively.

Effects of divergence time

Since the amount of sequence variations reflected in divergence time (t) is always unpredictable within targeted datasets, we performed simulations to examine its effect under the human codon frequencies. Three different combinations of κR and κY were tested: κR (=1) < κY (=10), κR (=10) > κY (=1), and κR = κY (= 3.75). We plotted estimates of ω against t varying from 0.1 to 1 for the expected ω = 0.3, 1, and 3, respectively (Figure 4A–I).

Fig. 4

Estimates of ω with six methods considering divergence time (t) varying from 0.1 to 1. Sequences are simulated with the human codon frequencies derived from human protein-coding genes. Three different combinations of κR and κY are examined: κR = 1, κY = 10 (A, D, G); κR = 10, κY = 1 (B, E, H); κR = κY = 3.75 (C, F, I). ω = 0.3 (A–C), ω = 1 (D–F), and ω = 3 (G–I) are considered as typical values for purifying selection, neutral mutation, and positive selection, respectively.

As t increases, NG and LWL tend to give better estimates of ω for purifying selection and more biased estimates for positive selection, whereas t has no obvious influence on them for neutral mutation. LPB overestimates ω for purifying selection and underestimates ω for positive selection, consistent with the results above. GY and YN show a similar trend as t increases: they underestimate ω when κR < κY, overestimate ω when κR > κY, and yield close estimates when κR = κY. For MYN, the bias of estimated ω becomes more serious as t increases when κR ≠ κY, whereas t has only a minor influence and close estimates are obtained when κR = κY.

Evaluation results

NG and LWL

NG considers all possible evolutionary pathways among compared DNA sequences and assumes that each nucleotide is substituted by any other at an equal rate (κR = κY = 1) when counting sites and substitutions. It adopts the Jukes-Cantor one-parameter formula to correct for multiple substitutions. Since transitions are more likely to occur than transversions, NG often underestimates the transition/transversion rate ratio (κ) and thus the number of synonymous sites (S), which results in overestimation of Ks and underestimation of ω. This phenomenon can be observed in our simulation results and was also found by Yang and Nielsen.

LWL classifies sites as i-fold degenerate (i = 0, 2, 4); the three-fold degenerate sites of ATT, ATC, and ATA are treated as two-fold ones. It considers unequal rates between transitional and transversional changes only when counting substitutions, and equal rates when counting sites. In detail, LWL assumes that two-fold degenerate sites are one-third synonymous and two-thirds nonsynonymous, with

$$K_a = \frac{L_0 K_0 + L_2 B_2}{L_0 + 2L_2/3} \qquad (1)$$

$$K_s = \frac{L_2 A_2 + L_4 K_4}{L_2/3 + L_4} \qquad (2)$$

where Li is the number of i-fold degenerate sites, and Ai and Bi are the numbers of transitional and transversional substitutions per i-fold degenerate site (i = 0, 2, 4), respectively. Hence, the total number (Ki) of substitutions per i-fold degenerate site is Ki = Ai + Bi. Although Kimura's two-parameter formulas are used to correct for multiple substitutions, LWL performs similarly to NG: the number of substitutions is in most cases smaller than the number of sites, so the influence of κ on substitutions is no stronger than its influence on sites (LWL considers the transition/transversion bias only when counting substitutions).

Interestingly, as t increases, NG and LWL tend to give better estimates for purifying selection but more biased estimates for positive selection, whereas t has no influence on them for neutral mutation (Figure 4). To explain this result, we derived an approximate formula, ω = Ka/Ks ≈ (Nd/N)/(Sd/S) = (S/N) × (Nd/Sd), where the “≈” reflects the absence of correction for multiple substitutions. Thus ω is composed of two parts: S/N, which is always underestimated because NG and LWL assume κR = κY = 1 when counting sites, and Nd/Sd, which depends on t because an increase in t leads to more substitutions. For purifying selection, synonymous substitutions are more likely to occur than nonsynonymous ones. Small t therefore tends to give rise to synonymous substitutions with only one difference, whereas an increase in t may result in more differences between two compared codons and thus produce not only synonymous but also nonsynonymous substitutions along different evolutionary pathways. Hence, for purifying selection, Nd/Sd rises as t increases, which can cancel the underestimation of S/N and lead to better estimates of ω for large t than for small t. Similarly, Nd/Sd for positive selection falls as t increases, which leads to the underestimation of ω. For neutral mutation, Nd is close to Sd since synonymous and nonsynonymous substitutions per site occur with equal frequency, so ω is nearly constant. These theoretical results agree well with the data in Figure 4.
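The last step of NG and the uncorrected approximation ω ≈ (S/N) × (Nd/Sd) can be sketched as below. This is an illustrative sketch only: the function names and the site and substitution counts in the usage example are hypothetical, and the Jukes-Cantor correction is written in its standard form, d = −(3/4) ln(1 − 4p/3).

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ng_estimates(S, N, Sd, Nd):
    """Return (Ka, Ks, omega, uncorrected omega) from site and substitution counts."""
    Ks = jukes_cantor(Sd / S)   # synonymous substitutions per synonymous site
    Ka = jukes_cantor(Nd / N)   # nonsynonymous substitutions per nonsynonymous site
    # Approximation from the text, without correction for multiple substitutions.
    uncorrected = (S / N) * (Nd / Sd)
    return Ka, Ks, Ka / Ks, uncorrected


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    Ka, Ks, omega, uncorrected = ng_estimates(S=250.0, N=750.0, Sd=50.0, Nd=45.0)
    print(f"Ka={Ka:.4f} Ks={Ks:.4f} omega={omega:.4f} uncorrected={uncorrected:.4f}")
```

Because the correction grows faster for the larger proportion of differences, the corrected ω in this example is smaller than the uncorrected ratio.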

LPB

LPB, proposed as a modification of LWL, corrects for the bias in counting sites by using different formulas for Ka and Ks (the only difference between LWL and LPB):

$$K_a = A_0 + \frac{L_0 B_0 + L_2 B_2}{L_0 + L_2} \qquad (3)$$

$$K_s = B_4 + \frac{L_2 A_2 + L_4 A_4}{L_2 + L_4} \qquad (4)$$

LPB considers that Ka comprises two parts: the transitional nonsynonymous substitution rate A0 and the transversional nonsynonymous substitution rate (L0B0+L2B2)/(L0+L2). Likewise, Ks comprises the transitional synonymous substitution rate (L2A2+L4A4)/(L2+L4) and the transversional synonymous substitution rate B4. With these modifications, LPB improves on the performance of LWL for most of the parameter combinations observed in simulations.

Our comparative results show that LPB tends to overestimate ω for purifying selection and to underestimate ω for positive selection. This can be explained by assuming that Ka = K0 and Ks = K4 in the ideal case, since substitutions at nondegenerate sites are all nonsynonymous and those at four-fold degenerate sites are all synonymous. Reformulating the equations shows that Ka is the weighted average of K0 and A0+B2 over 0- and 2-fold degenerate sites (Equation 3), and Ks is the weighted average of K4 and A2+B4 over 2- and 4-fold degenerate sites (Equation 4). Let us first examine purifying selection. Transversions at two-fold degenerate sites can lead to synonymous substitutions (for example, CGG to AGG), whereas those at nondegenerate sites cannot. Since synonymous substitutions are more likely to occur than nonsynonymous ones under purifying selection, B0 is less than B2, which gives K0 = (A0 + B0) < (A0 + B2). In addition, A4 is greater than A2 because synonymous substitutions occur with higher probability at four-fold degenerate sites than at two-fold ones. As a result, K4 = (A4 + B4) > (A2 + B4). We can therefore conclude that LPB overestimates ω for purifying selection, because Ka > K0 and Ks < K4. For positive selection, LPB underestimates Ka and overestimates Ks in a similar way, resulting in the underestimation of ω. This theoretical conclusion agrees well with the simulation results for most of the parameter combinations examined.
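The argument above can be checked numerically with a small sketch. It assumes the commonly cited forms of the LWL and LPB estimators (two-fold sites counted as one-third synonymous in LWL; Ka = A0 + (L0B0+L2B2)/(L0+L2) and Ks = B4 + (L2A2+L4A4)/(L2+L4) in LPB). The function names and the input values are hypothetical; the inputs are chosen only to mimic purifying selection (B0 < B2 and A4 > A2).

```python
def lwl(L, A, B):
    """LWL estimates (Ka, Ks); L, A, B are dicts keyed by degeneracy 0, 2, 4."""
    K0, K4 = A[0] + B[0], A[4] + B[4]
    Ka = (L[0] * K0 + L[2] * B[2]) / (L[0] + 2.0 * L[2] / 3.0)
    Ks = (L[2] * A[2] + L[4] * K4) / (L[2] / 3.0 + L[4])
    return Ka, Ks


def lpb(L, A, B):
    """LPB estimates (Ka, Ks) as transitional plus transversional components."""
    Ka = A[0] + (L[0] * B[0] + L[2] * B[2]) / (L[0] + L[2])
    Ks = B[4] + (L[2] * A[2] + L[4] * A[4]) / (L[2] + L[4])
    return Ka, Ks


def lpb_weighted(L, A, B):
    """The same LPB estimates rewritten as weighted averages, as in the text."""
    K0, K4 = A[0] + B[0], A[4] + B[4]
    Ka = (L[0] * K0 + L[2] * (A[0] + B[2])) / (L[0] + L[2])
    Ks = (L[2] * (A[2] + B[4]) + L[4] * K4) / (L[2] + L[4])
    return Ka, Ks


if __name__ == "__main__":
    # Hypothetical per-site values mimicking purifying selection.
    L = {0: 600.0, 2: 200.0, 4: 100.0}
    A = {0: 0.02, 2: 0.15, 4: 0.20}   # transitions per i-fold degenerate site
    B = {0: 0.01, 2: 0.03, 4: 0.10}   # transversions per i-fold degenerate site
    print("LWL:", lwl(L, A, B))
    print("LPB:", lpb(L, A, B))
    print("K0, K4:", A[0] + B[0], A[4] + B[4])
```

With these inputs the LPB value of Ka exceeds K0 and the LPB value of Ks falls below K4, which is the direction of bias described for purifying selection.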

GY, YN, and MYN

GY, a codon-based method, takes account of more features of DNA sequence evolution (such as transition/transversion rate bias and nucleotide/codon frequency bias) and calculates Ka and Ks by maximum-likelihood estimation. YN, a simplified version of GY, adopts the Hasegawa-Kishino-Yano model, which also considers these evolutionary features, and thus gives a close approximation of the maximum-likelihood method. To allow for more features of sequence evolution, MYN exploits the Tamura-Nei model and uses two different ratios of the transitional rate between purines (κR) and between pyrimidines (κY) to the transversional rate when counting sites and substitutions. As a whole, these three methods perform better than NG, LWL, and LPB, and MYN improves on the performance of YN for most of the parameter combinations. However, we cannot conclude which of them is the most accurate, since our simulations are merely approximate and every method may give biased results for at least some parameter settings. The above analyses of the six methods are summarized in Table 2.

In addition, there are other methods (refs. 21–24) that are not included in our study. For example, a method similar to GY was proposed by Muse and Gaut, and some modified versions of LWL or LPB have been developed by subdividing two-fold degenerate sites and substitutions, taking account of transition/transversion rate bias in counting sites, correcting for Arginine (ATT, ATC, and ATA), and so on.

Table 2

Mutation Models and Evolutionary Features in Different Methods

Method	Mutation model	Transition/transversion, counting sites (#1)	Transition/transversion, counting substitutions (#2)	Codon/nucleotide frequency
NG	Jukes-Cantor	κ_R = κ_Y = 1	κ_R = κ_Y = 1	equal
LWL	Kimura	κ_R = κ_Y = 1	κ_R = κ_Y	equal
LPB	Kimura	–*	–*	equal
GY	Codon-based	κ_R = κ_Y	κ_R = κ_Y	unequal
YN	Hasegawa-Kishino-Yano	κ_R = κ_Y	κ_R = κ_Y	unequal
MYN	Tamura-Nei	κ_R ≠ κ_Y	κ_R ≠ κ_Y	unequal

#1, #2: κR and κY as assumed by each method in the step of counting sites (#1) and in the step of counting substitutions (#2).

*LPB has no specific definition of synonymous and nonsynonymous sites or substitutions.
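The transition/transversion assumptions in Table 2 can be illustrated with a nucleotide rate matrix. The sketch below builds an unnormalized Tamura-Nei matrix with the transversion rate scaled to 1, so that κR and κY appear directly as multipliers. Setting κR = κY gives the Hasegawa-Kishino-Yano form, and additionally using equal frequencies with κ = 1 gives the Jukes-Cantor form. The function names are hypothetical, and the sketch covers nucleotide models only, not the codon-based model used by GY.

```python
BASES = "TCAG"
PURINES = {"A", "G"}


def tn_rate_matrix(pi, kappa_r, kappa_y):
    """Unnormalized Tamura-Nei rate matrix as a dict of dicts.

    pi maps each base to its equilibrium frequency; the rate of change
    to base b is pi[b] times kappa_r (purine transition), kappa_y
    (pyrimidine transition), or 1 (transversion).
    """
    Q = {a: {} for a in BASES}
    for a in BASES:
        for b in BASES:
            if a == b:
                continue
            if a in PURINES and b in PURINES:
                rate = kappa_r          # A <-> G
            elif a not in PURINES and b not in PURINES:
                rate = kappa_y          # T <-> C
            else:
                rate = 1.0              # transversion
            Q[a][b] = rate * pi[b]
        # Diagonal entry makes each row sum to zero.
        Q[a][a] = -sum(Q[a].values())
    return Q


def hky_rate_matrix(pi, kappa):
    """Hasegawa-Kishino-Yano matrix: the special case kappa_r == kappa_y."""
    return tn_rate_matrix(pi, kappa, kappa)


if __name__ == "__main__":
    equal = {b: 0.25 for b in BASES}
    Q = tn_rate_matrix(equal, kappa_r=4.0, kappa_y=3.75)
    for a in BASES:
        print(a, [round(Q[a][b], 4) for b in BASES])
```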

Discussion

Our results show that incorporating more features of sequence evolution (such as transition/transversion bias and nucleotide/codon frequency bias) into Ka and Ks estimation yields more reliable estimates among protein-coding sequences. Although it remains hard to balance the trade-off between considering more parameters (evolutionary features) and avoiding over-parameterization, and although simple methods (such as NG) are more suitable for short sequences, methods that take more evolutionary features into account should be the first choice for obtaining accurate estimates. However, all methods may give biased results for at least some parameter combinations.

How can we obtain the most reliable estimates of Ka and Ks? As mentioned above, these methods adopt different nucleotide substitution or mutation models, leading to diverse estimates of evolutionary distance. In addition, since the amount and degree of sequence substitution vary among datasets, a single model or a single method is not adequate for Ka and Ks calculations. Consequently, model selection, that is, choosing the best-fit model for the compared sequences when estimating Ka and Ks, becomes critical for capturing appropriate evolutionary information. Implementing different mutation models in a maximum-likelihood framework could therefore help include as many features as needed in Ka and Ks estimation, with the Akaike Information Criterion or the Bayesian Information Criterion used to measure the fit between models and data.
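As a rough illustration of such model selection, the two criteria can be computed from the maximized log-likelihood of each candidate model using their standard definitions, AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. Everything specific in the sketch below is hypothetical: the function names, the log-likelihood values, the parameter counts, and the sample size are placeholders, not results from this study.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion for k free parameters."""
    return 2.0 * k - 2.0 * log_lik


def bic(log_lik, k, n):
    """Bayesian Information Criterion for k free parameters and n observations."""
    return k * math.log(n) - 2.0 * log_lik


def best_model(fits, n, criterion="aic"):
    """Return the model name with the lowest criterion value.

    fits maps a model name to (maximized log-likelihood, number of parameters).
    """
    def score(name):
        log_lik, k = fits[name]
        return aic(log_lik, k) if criterion == "aic" else bic(log_lik, k, n)
    return min(fits, key=score)


if __name__ == "__main__":
    # Hypothetical fits of three models to the same alignment of n = 300 sites.
    fits = {"JC": (-1050.0, 1), "HKY": (-1040.0, 5), "TN": (-1039.5, 6)}
    print("AIC picks:", best_model(fits, n=300, criterion="aic"))
    print("BIC picks:", best_model(fits, n=300, criterion="bic"))
```

The two criteria need not agree, because BIC penalizes additional parameters more heavily once n is moderately large.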

Materials and Methods

Simulated sequences were generated from hypothetical common ancestral sequences. Each codon of a common ancestral sequence was randomly chosen from the 64 codons (excluding stop codons) according to codon frequencies. We considered three sets of codon frequencies: (1) equal codon frequencies, that is, each sense codon has frequency 1/(64 − the number of stop codons); (2) human codon frequencies deduced from 39,420 human protein-coding genes from the ENSEMBL database (Release 35); and (3) rice codon frequencies retrieved from 19,079 rice protein-coding genes. In addition to codon frequencies, other parameters were set in the simulations, including sequence length, divergence time (t), the two ratios of the transitional rate between purines (κR) and between pyrimidines (κY) to the transversional rate, and the selective strength (ω = Ka/Ks). Although ω varies from gene to gene, ω = 0.3, 1, and 3 can be regarded as “typical values” for negative selection, neutral mutation, and positive selection, respectively (refs. 3, 13, 30), as observed in real datasets. To examine the effect of one parameter accurately and avoid stochastic errors arising from other factors, we simulated sequences of 2,000,000 codons.

To compare the accuracy of Ka and Ks estimation among methods, we obtained the expected Ks and Ka values by counting the numbers of synonymous and nonsynonymous sites of the ancestral sequences and then, following the definition of t in Table 1, using the formulas Ks = (S+N)×t/[3×(S+ω×N)] and Ka = ω×Ks. Because simulated sequences have different expected values of Ka and Ks, we used the formula 100% × [(estimated value) − (expected value)]/(expected value) to calculate percentage errors, which gives a better display of the relative bias between estimated and expected values. For Ka and Ks estimation, we used six methods: NG, LWL, LPB, GY, YN, and MYN, all of which have been implemented in our software KaKs_Calculator (prepared for submission). For error-checking, we also compared the results estimated by KaKs_Calculator with those from other tools, such as PAML, in which NG, GY, and YN are implemented.
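The expected values and percentage errors can be sketched as follows. The sketch takes the definition of t in Table 1, t = (Ks × 3S + Ka × 3N)/(S + N), and solves it for Ks with Ka = ω × Ks; this derivation is an assumption of the sketch rather than a transcription of the authors' code. The function names and the numbers in the usage example are hypothetical.

```python
def expected_rates(S, N, t, omega):
    """Expected (Ka, Ks) implied by the Table 1 definition of t and Ka = omega * Ks."""
    Ks = (S + N) * t / (3.0 * (S + omega * N))
    return omega * Ks, Ks


def divergence_time(S, N, Ka, Ks):
    """Table 1 definition: expected number of nucleotide substitutions per codon."""
    return (Ks * 3.0 * S + Ka * 3.0 * N) / (S + N)


def percentage_error(estimated, expected):
    """100% x (estimated - expected) / expected."""
    return 100.0 * (estimated - expected) / expected


if __name__ == "__main__":
    # Hypothetical site counts for an ancestral sequence of 2,000 nucleotides.
    Ka, Ks = expected_rates(S=500.0, N=1500.0, t=0.6, omega=0.3)
    print(f"expected Ka={Ka:.4f}, Ks={Ks:.4f}")
    # A hypothetical estimate of Ks, compared against the expected value.
    print(f"percentage error: {percentage_error(0.95 * Ks, Ks):.2f}%")
```

Under neutral evolution (ω = 1) this reduces to Ks = Ka = t/3, as expected for a codon of three sites.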

Authors’ contributions

ZZ performed computer simulations to generate sequences, carried out the comparative analysis, and drafted the manuscript. JY supervised the research and revised the manuscript. Both authors read and approved the final manuscript.

Competing interests

The authors have declared that no competing interests exist.

26 in total

1. Episodic adaptive evolution of primate lysozymes.

Authors: W Messier; C B Stewart
Journal: Nature Date: 1997-01-09 Impact factor: 49.962

2. Unbiased estimation of the rates of synonymous and nonsynonymous substitution.

Authors: W H Li
Journal: J Mol Evol Date: 1993-01 Impact factor: 2.395

3. A method for estimating the numbers of synonymous and nonsynonymous substitutions per site.

Authors: J M Comeron
Journal: J Mol Evol Date: 1995-12 Impact factor: 2.395

4. Estimating synonymous and nonsynonymous substitution rates.

Authors: S V Muse
Journal: Mol Biol Evol Date: 1996-01 Impact factor: 16.240

5. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome.

Authors: S V Muse; B S Gaut
Journal: Mol Biol Evol Date: 1994-09 Impact factor: 16.240

6. A codon-based model of nucleotide substitution for protein-coding DNA sequences.

Authors: N Goldman; Z Yang
Journal: Mol Biol Evol Date: 1994-09 Impact factor: 16.240

7. A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes.

Authors: W H Li; C I Wu; C C Luo
Journal: Mol Biol Evol Date: 1985-03 Impact factor: 16.240

8. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees.

Authors: K Tamura; M Nei
Journal: Mol Biol Evol Date: 1993-05 Impact factor: 16.240

9. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA.

Authors: M Hasegawa; H Kishino; T Yano
Journal: J Mol Evol Date: 1985 Impact factor: 2.395

10. Statistical methods for detecting molecular adaptation.

Authors:
Journal: Trends Ecol Evol Date: 2000-12-01 Impact factor: 17.712
