Laurent Gatto, Daniele Catanzaro, Michel C Milinkovitch.
Abstract
The General Time Reversible (GTR) model of nucleotide substitution is at the core of many distance-based and character-based phylogeny inference methods. The procedure described by Waddell and Steel (1997) for estimating distances and instantaneous substitution rate matrices, R, under the GTR model is known to be inapplicable under some conditions, i.e., it leads to the inapplicability of the GTR model. Here, we simulate the evolution of DNA sequences along 12 trees characterized by different combinations of tree length, (non-)homogeneity of the substitution rate matrix R, and sequence length. We then evaluate both the frequency of the GTR model inapplicability for estimating distances and the accuracy of inferred alignments. Our results indicate that inapplicability of Waddell and Steel's procedure can be considered a real practical issue, and illustrate that the probability of this inapplicability is a function of substitution rates and sequence length. We also discuss the implications of our results for current implementations of maximum likelihood and Bayesian methods.
Keywords: GTR model; homogeneity; nucleotide substitution; phylogeny inference; simulations
Year: 2007 PMID: 19455208 PMCID: PMC2674669
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Figure 1. Tree topology along which the sequences have been simulated. Four different tree lengths have been analyzed. The trees are described by giving the lengths of branches 1 to 6: tree T0 = {1, 1, 2, 2, 2, 2}; tree T1 = {2, 2, 4, 4, 4, 4}; tree T2 = {2, 2, 8, 4, 8, 4}; and tree T3 = {5, 4, 8, 12, 10, 15}.
Frequency (ne) of observing at least one negative eigenvalue for each pairwise sequence comparison (i vs j) and across all sequence comparisons (total). Values are color-coded as follows: ne = 0, 0 < ne < 0.5, 0.5 ≤ ne < 0.8 and 0.8 ≤ ne.
| | Comparison | 200, T0 | 200, T1 | 200, T2 | 200, T3 | 200 extracted, T0 | 200 extracted, T1 | 200 extracted, T2 | 200 extracted, T3 | 1000, T0 | 1000, T1 | 1000, T2 | 1000, T3 | 200 with indel, T0 | 200 with indel, T1 | 200 with indel, T2 | 200 with indel, T3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S1 | 3 vs 4 | 0 | 0.05 | 0.30 | 0.62 | 0 | 0.01 | 0.37 | 0.62 | 0 | 0 | 0.06 | 0.43 | 0 | 0.03 | 0.31 | 0.65 |
| S1 | 3 vs 5 | 0.02 | 0.33 | 0.69 | 0.78 | 0 | 0.25 | 0.65 | 0.78 | 0 | 0.07 | 0.43 | 0.74 | 0 | 0.27 | 0.61 | 0.71 |
| S1 | 3 vs 6 | 0 | 0.26 | 0.53 | 0.75 | 0 | 0.28 | 0.51 | 0.78 | 0 | 0.06 | 0.21 | 0.72 | 0 | 0.35 | 0.53 | 0.68 |
| S1 | 4 vs 5 | 0 | 0.32 | 0.49 | 0.79 | 0.01 | 0.23 | 0.53 | 0.76 | 0 | 0.02 | 0.31 | 0.72 | 0 | 0.29 | 0.47 | 0.83 |
| S1 | 4 vs 6 | 0 | 0.26 | 0.28 | 0.79 | 0 | 0.19 | 0.34 | 0.76 | 0 | 0.05 | 0.04 | 0.75 | 0 | 0.27 | 0.23 | 0.74 |
| S1 | 5 vs 6 | 0 | 0.02 | 0.26 | 0.71 | 0 | 0.04 | 0.29 | 0.70 | 0 | 0 | 0.01 | 0.62 | 0 | 0.03 | 0.24 | 0.78 |
| S2 | 3 vs 4 | 0 | 0.07 | 0.28 | 0.65 | 0 | 0.01 | 0.26 | 0.64 | 0 | 0 | 0.03 | 0.44 | 0 | 0.06 | 0.42 | 0.65 |
| S2 | 3 vs 5 | 0 | 0.29 | 0.73 | 0.79 | 0.02 | 0.38 | 0.59 | 0.74 | 0 | 0.09 | 0.43 | 0.70 | 0 | 0.26 | 0.70 | 0.77 |
| S2 | 3 vs 6 | 0 | 0.24 | 0.56 | 0.79 | 0 | 0.33 | 0.50 | 0.87 | 0 | 0.05 | 0.27 | 0.69 | 0 | 0.35 | 0.59 | 0.82 |
| S2 | 4 vs 5 | 0 | 0.23 | 0.55 | 0.76 | 0 | 0.27 | 0.49 | 0.74 | 0 | 0.02 | 0.19 | 0.69 | 0.01 | 0.20 | 0.45 | 0.77 |
| S2 | 4 vs 6 | 0 | 0.31 | 0.23 | 0.82 | 0 | 0.32 | 0.35 | 0.84 | 0 | 0.03 | 0.02 | 0.85 | 0 | 0.21 | 0.27 | 0.79 |
| S2 | 5 vs 6 | 0 | 0.03 | 0.24 | 0.69 | 0 | 0.03 | 0.33 | 0.76 | 0 | 0 | 0.04 | 0.64 | 0 | 0.02 | 0.33 | 0.76 |
| S3 | 3 vs 4 | 0 | 0.04 | 0.32 | 0.67 | 0 | 0.03 | 0.26 | 0.64 | 0 | 0 | 0.06 | 0.43 | 0 | 0.06 | 0.35 | 0.69 |
| S3 | 3 vs 5 | 0 | 0.37 | 0.68 | 0.74 | 0.01 | 0.34 | 0.65 | 0.82 | 0 | 0.05 | 0.48 | 0.68 | 0.01 | 0.24 | 0.66 | 0.77 |
| S3 | 3 vs 6 | 0 | 0.28 | 0.58 | 0.99 | 0 | 0.29 | 0.62 | 0.99 | 0 | 0.03 | 0.32 | 0.99 | 0 | 0.24 | 0.61 | 0.98 |
| S3 | 4 vs 5 | 0 | 0.32 | 0.55 | 0.79 | 0.01 | 0.36 | 0.57 | 0.83 | 0 | 0.03 | 0.23 | 0.83 | 0 | 0.26 | 0.49 | 0.81 |
| S3 | 4 vs 6 | 0 | 0.31 | 0.26 | 1 | 0 | 0.25 | 0.22 | 0.98 | 0 | 0.02 | 0 | 1 | 0 | 0.22 | 0.23 | 0.98 |
| S3 | 5 vs 6 | 0 | 0.04 | 0.29 | 0.99 | 0 | 0 | 0.15 | 0.99 | 0 | 0 | 0.02 | 0.96 | 0 | 0.02 | 0.25 | 0.96 |
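The negative eigenvalues counted here arise when the GTR distance is computed for a pair of sequences. The following is a minimal sketch of that computation, assuming the commonly stated form of the GTR distance, d = -tr(Π log(Π⁻¹F)), where F is the symmetrized matrix of observed site-pattern frequencies and Π is the diagonal matrix of base frequencies. The function name `gtr_distance` and its return convention are hypothetical and are not taken from the authors' code. Because the real matrix logarithm requires strictly positive eigenvalues, the sketch reports the distance as inapplicable (`None`) whenever an eigenvalue is zero or negative.

```python
import numpy as np

def gtr_distance(seq_a, seq_b, alphabet="ACGT"):
    """Return (distance, eigenvalues); distance is None when inapplicable.

    Illustrative sketch only: assumes d = -tr(Pi * log(Pi^-1 F)).
    """
    index = {base: k for k, base in enumerate(alphabet)}
    counts = np.zeros((4, 4))
    for a, b in zip(seq_a, seq_b):
        if a in index and b in index:  # skip gaps and ambiguity codes
            counts[index[a], index[b]] += 1
    if counts.sum() == 0:
        raise ValueError("no comparable sites")

    # Symmetrized divergence matrix F and base frequencies pi (row sums of F).
    F = (counts + counts.T) / (2.0 * counts.sum())
    pi = F.sum(axis=1)
    if np.any(pi <= 0):
        raise ValueError("every base must occur at least once")

    # Pi^-1 F is similar to the symmetric matrix S = Pi^-1/2 F Pi^-1/2,
    # so both share the same real eigenvalues.
    scale = 1.0 / np.sqrt(pi)
    S = F * np.outer(scale, scale)
    eigenvalues, U = np.linalg.eigh(S)

    # A non-positive eigenvalue means the real logarithm does not exist.
    if np.any(eigenvalues <= 0):
        return None, eigenvalues

    log_S = (U * np.log(eigenvalues)) @ U.T
    # tr(Pi log(Pi^-1 F)) equals sum_i pi_i * log_S[i, i] by cyclicity of the trace.
    distance = -float(np.sum(pi * np.diag(log_S)))
    return distance, eigenvalues
```

With uniform base frequencies and equal off-diagonal entries in F, this expression reduces to the Jukes-Cantor distance, which gives a convenient sanity check. Short or highly divergent sequence pairs are the ones most likely to produce a non-positive eigenvalue by chance.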
Percentage and standard deviation of mean pairwise divergence among tip sequences.
| | Dataset | T0 | T1 | T2 | T3 |
|---|---|---|---|---|---|
| S1 | 200 | 31 | 43 | 46 | 51 |
| S1 | 200 ext | 32 | 42 | 47 | 51 |
| S1 | 1000 | 31 | 43 | 47 | 51 |
| S2 | 200 | 30 | 43 | 47 | 52 |
| S2 | 200 ext | 31 | 43 | 47 | 52 |
| S2 | 1000 | 31 | 43 | 47 | 52 |
| S3 | 200 | 32 | 47 | 50 | 61 |
| S3 | 200 ext | 33 | 46 | 50 | 61 |
| S3 | 1000 | 33 | 46 | 51 | 61 |
| average | | 31.7 | 43.9 | 47.9 | 54.6 |
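As an illustration of how a mean pairwise divergence of this kind can be obtained, here is a minimal sketch. It assumes that divergence means the uncorrected proportion of differing sites (p-distance) between two aligned tip sequences, and that the standard deviation is taken across pairs. Neither assumption is stated in the source, and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev

def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites, ignoring columns with a gap (assumed definition)."""
    sites = [(x, y) for x, y in zip(seq_a, seq_b) if x != gap and y != gap]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)

def mean_pairwise_divergence(sequences):
    """Return (mean, standard deviation) of pairwise divergence, in percent."""
    values = [100.0 * p_distance(a, b) for a, b in combinations(sequences, 2)]
    if not values:
        raise ValueError("need at least two sequences")
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread
```

For example, `mean_pairwise_divergence(["AAAA", "AAAT", "AATT"])` averages the three pairwise values 25, 50 and 25 percent.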
Accuracy of sequence alignments using ClustalW (Thompson et al. 1994). Frequency (f) of wrong alignments, mean and standard deviation (SD) of the CS scores (100 simulations). Values are color-coded as follows: f = 0, 0 < f ≤ 0.9 and 0.9 < f.
| | Dataset | T0 f | T0 mean CS | T0 SD CS | T1 f | T1 mean CS | T1 SD CS | T2 f | T2 mean CS | T2 SD CS | T3 f | T3 mean CS | T3 SD CS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S1 | 200 | 0 | 1.000 | 0.000 | 0 | 1.000 | 0.000 | 0.02 | 0.999 | 0.006 | 0.02 | 0.999 | 0.005 |
| S1 | 200ext | 0 | 1.000 | 0.000 | 0 | 1.000 | 0.000 | 0.01 | 1.000 | 0.004 | 0.05 | 0.997 | 0.013 |
| S1 | 1000 | 0 | 1.000 | 0.000 | 0 | 1.000 | 0.000 | 0 | 1.000 | 0.000 | 0.08 | 0.998 | 0.008 |
| S1 | 200ID | 0.17 | 0.985 | 0.011 | 1 | 0.952 | 0.028 | 0.99 | 0.958 | 0.025 | 1 | 0.928 | 0.033 |
| S2 | 200 | 0 | 1.000 | 0.000 | 0 | 1.000 | 0.000 | 0 | 1.000 | 0.000 | 0.03 | 0.998 | 0.010 |
| S2 | 200ext | 0 | 1.000 | 0.000 | 0.01 | 0.999 | 0.009 | 0 | 1.000 | 0.000 | 0.04 | 0.998 | 0.010 |
| S2 | 1000 | 0 | 1.000 | 0.000 | 0 | 1.000 | 0.000 | 0 | 1.000 | 0.000 | 0.07 | 0.999 | 0.005 |
| S2 | 200ID | 0.22 | 0.988 | 0.009 | 0.98 | 0.966 | 0.020 | 0.97 | 0.956 | 0.026 | 1 | 0.907 | 0.044 |
| S3 | 200 | 0 | 1.000 | 0.000 | 0.1 | 0.991 | 0.031 | 0.05 | 0.996 | 0.019 | 1 | 0.127 | 0.175 |
| S3 | 200ext | 0 | 1.000 | 0.000 | 0.09 | 0.995 | 0.016 | 0.07 | 0.996 | 0.018 | 1 | 0.147 | 0.215 |
| S3 | 1000 | 0 | 1.000 | 0.000 | 0.18 | 0.996 | 0.010 | 0.18 | 0.996 | 0.010 | 1 | 0.069 | 0.083 |
| S3 | 200ID | 0.07 | 0.974 | 0.016 | 0.9 | 0.969 | 0.033 | 1 | 0.914 | 0.058 | 1 | 0.123 | 0.160 |
Percentage and standard deviation of identical columns in the multiple alignments.
| | Dataset | T0 | T1 | T2 | T3 |
|---|---|---|---|---|---|
| S1 | 200 | 45 | 25 | 20 | 15 |
| S1 | 200 ext | 43 | 26 | 20 | 15 |
| S1 | 1000 | 44 | 26 | 20 | 15 |
| S2 | 200 | 46 | 26 | 19 | 15 |
| S2 | 200 ext | 44 | 26 | 20 | 15 |
| S2 | 1000 | 44 | 26 | 20 | 15 |
| S3 | 200 | 44 | 23 | 18 | 13 |
| S3 | 200 ext | 43 | 24 | 18 | 13 |
| S3 | 1000 | 43 | 24 | 18 | 13 |
| average | | 44 | 25 | 19 | 14 |
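The percentage of identical columns can be sketched as follows, assuming that a column counts as identical when every sequence carries the same nucleotide at that position and that columns containing a gap never count. The source does not spell out this definition, and the function name `percent_identical_columns` is hypothetical.

```python
def percent_identical_columns(alignment, gap="-"):
    """Percentage of alignment columns in which all sequences agree (assumed definition)."""
    if not alignment:
        raise ValueError("empty alignment")
    length = len(alignment[0])
    if length == 0 or any(len(row) != length for row in alignment):
        raise ValueError("aligned sequences must share one non-zero length")
    identical = sum(
        1
        for column in zip(*alignment)
        if gap not in column and len(set(column)) == 1
    )
    return 100.0 * identical / length
```

For instance, the three-sequence alignment `["ACGT", "ACGA", "ACTT"]` agrees in its first two columns only, so the function returns 50.0.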