Maximilian P Nesnidal, Martin Helmkampf, Iris Bruchhaus, Bernhard Hausdorf.
Abstract
BACKGROUND: The phylogenetic relationships of the lophophorate lineages, ectoprocts, brachiopods and phoronids, within Lophotrochozoa are still controversial. We sequenced an additional mitochondrial genome of the most species-rich lophophorate lineage, the ectoprocts. Although it is known that there are large differences in the nucleotide composition of mitochondrial sequences of different lineages as well as in the amino acid composition of the encoded proteins, this bias is often not considered in phylogenetic analyses. We applied several approaches for reducing compositional bias and saturation in the phylogenetic analyses of the mitochondrial sequences.
Year: 2011 PMID: 22111761 PMCID: PMC3285623 DOI: 10.1186/1471-2164-12-572
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Structure of the mitochondrial genome of (GenBank accession number JQ061319). The arrows indicate the direction of transcription. Numbers indicate noncoding nucleotides between genes (negative values refer to gene overlaps). The tRNA genes are named using single-letter amino acid abbreviations. Those coding for leucine, serine and tryptophan are named L1 for the tRNALeu(CUN) (anticodon UAG) gene, L2 for the tRNALeu(UUR) (anticodon UAA) gene, S1 for the tRNASer(AGN) (anticodon UCU) gene, S2 for the tRNASer(UCN) (anticodon UGA) gene, and W1 for the tRNATrp(UGR) (anticodon UCA) gene and W2 for the tRNATrp(UGR) (anticodon UCA) gene. The genomic features are described in the table on the right. a: Start and end positions of rRNA genes and MNCR determined by boundaries of adjacent genes. b: Incomplete termination codon, which is probably extended by post-transcriptional adenylation.
Figure 2. Putative secondary structures of the 23 tRNAs identified in the mitochondrial genome of . Bars indicate Watson-Crick base pairings, and crosses between G and U pairs mark canonical base pairings appearing in RNA.
Figure 3. Comparison of the arrangement of the mitochondrial genes of representatives of ectoprocts, entoprocts, brachiopods, phoronids, and molluscs. The arrows indicate the direction of transcription. Gene and genome size are not to scale.
Breakpoint distance matrix between orders of mitochondrial protein coding genes and rDNAs of representatives of ectoprocts, entoprocts, brachiopods, phoronids, and molluscs.
| Taxa | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | 0 | 12 | 12 | 13 | 12 | 12 | 12 | 14 | 12 | 12 |
| | 12 | 0 | 12 | 12 | 10 | 9 | 14 | 13 | 9 | 9 |
| | 12 | 12 | 0 | 13 | 14 | 14 | 14 | 15 | 14 | 14 |
| | 13 | 12 | 13 | 0 | 13 | 13 | 14 | 15 | 13 | 13 |
| | 12 | 10 | 14 | 13 | 0 | 5 | 14 | 13 | 7 | 4 |
| | 12 | 9 | 14 | 13 | 5 | 0 | 15 | 13 | 4 | 2 |
| | 12 | 14 | 14 | 14 | 14 | 15 | 0 | 15 | 15 | 15 |
| | 14 | 13 | 15 | 15 | 13 | 13 | 15 | 0 | 14 | 13 |
| | 12 | 9 | 14 | 13 | 7 | 4 | 15 | 14 | 0 | 3 |
| | 12 | 9 | 14 | 13 | 4 | 2 | 15 | 13 | 3 | 0 |
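To illustrate what a breakpoint distance between two gene orders measures, here is a minimal Python sketch. It assumes the common definition for circular, signed gene orders: a breakpoint is an adjacency of two genes in one genome that does not occur in the other, where reading a pair on the opposite strand counts as the same adjacency. The software and exact definition used for the matrix above are not stated in this record, so the function names (`adjacencies`, `breakpoint_distance`) and the toy gene orders are hypothetical.

```python
def _flip(gene: str) -> str:
    """Return the gene on the opposite strand ('-' prefix marks the minus strand)."""
    return gene[1:] if gene.startswith("-") else "-" + gene


def adjacencies(order: list[str]) -> set[tuple[str, str]]:
    """Collect strand-independent adjacencies of a circular signed gene order."""
    result = set()
    n = len(order)
    for i in range(n):
        a, b = order[i], order[(i + 1) % n]  # wrap around: the genome is circular
        # (a, b) on one strand is the same adjacency as (-b, -a) on the other;
        # store whichever form sorts first so both readings map to one key.
        result.add(min((a, b), (_flip(b), _flip(a))))
    return result


def breakpoint_distance(order1: list[str], order2: list[str]) -> int:
    """Count adjacencies of order1 that are absent from order2."""
    genes1 = sorted(g.lstrip("-") for g in order1)
    genes2 = sorted(g.lstrip("-") for g in order2)
    if genes1 != genes2:
        raise ValueError("both gene orders must contain the same genes")
    return len(adjacencies(order1) - adjacencies(order2))


# Hypothetical toy gene orders, not taken from the article.
reference = ["cox1", "cox2", "atp8", "atp6", "cox3"]
rearranged = ["cox1", "atp8", "cox2", "atp6", "cox3"]
print(breakpoint_distance(reference, rearranged))
```

Under this definition the distance of a genome to itself is zero and the measure is symmetric when both genomes contain the same genes, which is consistent with the zero diagonal and the symmetry of the matrix above.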
Nucleotide composition and AT- and GC-skews of mitochondrial genomes.
| Taxon | Length (bp) | A | C | G | T | AT% | AT skew | GC skew |
|---|---|---|---|---|---|---|---|---|
| | 18338 | 0.251 | 0.137 | 0.242 | 0.370 | 62.0% | -0.192 | 0.277 |
| | 17443 | 0.269 | 0.169 | 0.212 | 0.349 | 61.9% | -0.129 | 0.112 |
| | 15708 | 0.251 | 0.314 | 0.171 | 0.264 | 51.4% | -0.026 | -0.295 |
| | 15719 | 0.295 | 0.205 | 0.170 | 0.330 | 62.5% | -0.057 | -0.091 |
| | 16005 | 0.264 | 0.116 | 0.156 | 0.464 | 72.8% | -0.274 | 0.149 |
| | 16569 | 0.309 | 0.313 | 0.131 | 0.247 | 55.6% | 0.112 | -0.410 |
| | 17553 | 0.331 | 0.235 | 0.135 | 0.300 | 63.0% | 0.049 | -0.270 |
| | 11423 | 0.394 | 0.147 | 0.125 | 0.334 | 72.8% | 0.081 | -0.082 |
| | 11905 | 0.364 | 0.182 | 0.167 | 0.286 | 65.0% | 0.120 | -0.044 |
| | 14919 | 0.303 | 0.144 | 0.165 | 0.388 | 69.1% | -0.123 | 0.068 |
| | 13794 | 0.314 | 0.089 | 0.149 | 0.448 | 76.2% | -0.175 | 0.253 |
| | 16706 | 0.405 | 0.230 | 0.097 | 0.265 | 67.0% | 0.209 | -0.405 |
| | 14411 | 0.320 | 0.086 | 0.173 | 0.421 | 74.1% | -0.135 | 0.334 |
| | 14985 | 0.375 | 0.227 | 0.097 | 0.301 | 67.6% | 0.111 | -0.399 |
| | 14215 | 0.416 | 0.172 | 0.106 | 0.369 | 78.6% | 0.059 | -0.235 |
| | 14747 | 0.298 | 0.199 | 0.180 | 0.323 | 62.1% | -0.041 | -0.049 |
| | 15695 | 0.369 | 0.204 | 0.117 | 0.310 | 67.9% | 0.087 | -0.269 |
| | 15101 | 0.358 | 0.182 | 0.131 | 0.330 | 68.8% | 0.041 | -0.163 |
| | 15984 | 0.353 | 0.167 | 0.127 | 0.354 | 70.6% | -0.001 | -0.136 |
| | 15205 | 0.348 | 0.246 | 0.130 | 0.276 | 62.4% | 0.114 | -0.308 |
| | 15881 | 0.398 | 0.185 | 0.098 | 0.319 | 71.7% | 0.109 | -0.305 |
| | 14407 | 0.293 | 0.097 | 0.197 | 0.411 | 70.4% | -0.166 | 0.341 |
| | 13588 | 0.191 | 0.080 | 0.250 | 0.479 | 67.1% | -0.430 | 0.515 |
| | 14085 | 0.249 | 0.084 | 0.206 | 0.461 | 71.0% | -0.299 | 0.422 |
| | 14862 | 0.412 | 0.148 | 0.118 | 0.322 | 73.4% | 0.123 | -0.111 |
| | 15323 | 0.392 | 0.163 | 0.131 | 0.314 | 70.6% | 0.110 | -0.108 |
| | 13026 | 0.271 | 0.235 | 0.176 | 0.318 | 58.9% | -0.079 | -0.142 |
| | 14144 | 0.364 | 0.163 | 0.131 | 0.342 | 70.6% | 0.030 | -0.108 |
| | 15433 | 0.377 | 0.176 | 0.124 | 0.323 | 70.0% | 0.078 | -0.173 |
| | 16089 | 0.248 | 0.114 | 0.222 | 0.417 | 66.5% | -0.254 | 0.321 |
| | 14018 | 0.334 | 0.168 | 0.166 | 0.332 | 66.6% | 0.002 | -0.005 |
| | 28818 | 0.261 | 0.161 | 0.219 | 0.359 | 62.0% | -0.158 | 0.153 |
| | 15451 | 0.295 | 0.277 | 0.151 | 0.277 | 57.2% | 0.033 | -0.294 |
| | 14017 | 0.208 | 0.151 | 0.265 | 0.375 | 58.4% | -0.286 | 0.272 |
| | 14291 | 0.199 | 0.134 | 0.275 | 0.392 | 59.1% | -0.328 | 0.344 |
| | 16296 | 0.275 | 0.102 | 0.148 | 0.474 | 74.9% | -0.266 | 0.182 |
| | 15388 | 0.213 | 0.119 | 0.224 | 0.445 | 65.7% | -0.352 | 0.306 |
| | 15502 | 0.268 | 0.297 | 0.161 | 0.274 | 54.2% | -0.013 | -0.297 |
| | 15538 | 0.330 | 0.195 | 0.133 | 0.343 | 67.2% | -0.020 | -0.188 |
| | 15113 | 0.315 | 0.235 | 0.144 | 0.305 | 62.0% | 0.016 | -0.240 |
| | 15619 | 0.312 | 0.204 | 0.154 | 0.329 | 64.1% | -0.026 | -0.141 |
| | 14998 | 0.298 | 0.225 | 0.158 | 0.318 | 61.6% | -0.031 | -0.176 |
| | 15532 | 0.314 | 0.119 | 0.186 | 0.380 | 69.4% | -0.095 | 0.220 |
| | 14492 | 0.370 | 0.132 | 0.127 | 0.371 | 74.1% | -0.002 | -0.021 |
| | 16258 | 0.337 | 0.285 | 0.119 | 0.258 | 59.6% | 0.133 | -0.412 |
| | 17211 | 0.388 | 0.195 | 0.092 | 0.325 | 71.3% | 0.089 | -0.358 |
| | 15744 | 0.411 | 0.176 | 0.076 | 0.337 | 74.9% | 0.099 | -0.397 |
| | 14189 | 0.274 | 0.183 | 0.205 | 0.337 | 61.1% | -0.103 | 0.056 |
| | 14117 | 0.286 | 0.154 | 0.182 | 0.377 | 66.3% | -0.137 | 0.085 |
| | 13670 | 0.331 | 0.113 | 0.141 | 0.416 | 74.6% | -0.114 | 0.110 |
AT skew = (A%-T%)/(A%+T%); GC skew = (G%-C%)/(C%+G%); a partial; b repetitive
Nucleotide composition and AT- and GC-skews of the mitochondrial protein-encoding and ribosomal RNA genes and the entire Flustra foliacea genome.
| Gene | A (proportion) | G (proportion) | C (proportion) | T (proportion) | AT% | AT skew | GC skew |
|---|---|---|---|---|---|---|---|
| | 0.213 | 0.225 | 0.123 | 0.439 | 65.2 | -0.347 | 0.293 |
| | 0.306 | 0.189 | 0.099 | 0.405 | 71.1 | -0.139 | 0.313 |
| | 0.227 | 0.219 | 0.135 | 0.419 | 64.6 | -0.297 | 0.237 |
| | 0.225 | 0.237 | 0.124 | 0.414 | 63.9 | -0.296 | 0.313 |
| | 0.196 | 0.266 | 0.110 | 0.426 | 62.2 | -0.370 | 0.415 |
| | 0.225 | 0.214 | 0.130 | 0.430 | 65.5 | -0.313 | 0.244 |
| | 0.226 | 0.217 | 0.103 | 0.454 | 68.0 | -0.335 | 0.356 |
| | 0.246 | 0.217 | 0.104 | 0.434 | 68.0 | -0.276 | 0.352 |
| | 0.177 | 0.234 | 0.105 | 0.484 | 66.1 | -0.464 | 0.381 |
| | 0.214 | 0.219 | 0.106 | 0.462 | 67.6 | -0.367 | 0.348 |
| | 0.212 | 0.242 | 0.072 | 0.474 | 68.6 | -0.382 | 0.541 |
| | 0.217 | 0.222 | 0.116 | 0.445 | 66.2 | -0.344 | 0.314 |
| | 0.187 | 0.224 | 0.085 | 0.503 | 69.0 | -0.458 | 0.450 |
| | 0.336 | 0.215 | 0.142 | 0.306 | 64.2 | 0.047 | 0.204 |
| | 0.357 | 0.197 | 0.115 | 0.331 | 68.8 | 0.038 | 0.263 |
| Entire genome | 0.248 | 0.222 | 0.114 | 0.417 | 66.5 | -0.254 | 0.321 |
| Protein coding sequences | 0.219 | 0.224 | 0.114 | 0.442 | 66.1 | -0.337 | 0.325 |
| 1st codon position | 0.27 | 0.257 | 0.117 | 0.358 | 62.8 | -0.140 | 0.374 |
| 2nd codon position | 0.169 | 0.183 | 0.186 | 0.462 | 63.1 | -0.464 | -0.008 |
| 3rd codon position | 0.218 | 0.233 | 0.042 | 0.506 | 72.4 | -0.398 | 0.695 |
AT skew = (A%-T%)/(A%+T%); GC skew = (G%-C%)/(C%+G%)
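The skew definitions in the footnote can be checked numerically. The following is a minimal Python sketch, not code from the article; the function names `at_skew`, `gc_skew` and `composition_summary` are illustrative. The inputs are the nucleotide proportions reported for the entire genome in the table above.

```python
def at_skew(a: float, t: float) -> float:
    """AT skew = (A - T) / (A + T)."""
    return (a - t) / (a + t)


def gc_skew(g: float, c: float) -> float:
    """GC skew = (G - C) / (G + C)."""
    return (g - c) / (g + c)


def composition_summary(a: float, g: float, c: float, t: float) -> dict[str, float]:
    """Summarise AT content (percent) and both skews, rounded as in the tables."""
    return {
        "AT%": round(100 * (a + t), 1),
        "AT skew": round(at_skew(a, t), 3),
        "GC skew": round(gc_skew(g, c), 3),
    }


# Proportions reported for the entire genome (A, G, C, T).
entire_genome = composition_summary(a=0.248, g=0.222, c=0.114, t=0.417)
print(entire_genome)
```

With these inputs the sketch reproduces the tabulated values for the entire genome (AT% 66.5, AT skew -0.254, GC skew 0.321). Keyword arguments are used because the two composition tables list the nucleotide columns in different orders.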
Figure 4. Comparison of codon family usage in ectoproct mtDNAs.
Phylogenetic relationships of ectoprocts, brachiopods and phoronids according to different phylogenetic analyses (only sister group relationships with one other phylum; more complex relationships are not considered).
| Method | Data set | Tree Figure | Ectoprocta+Phoronida | Ectoprocta+Entoprocta | Ectoprocta+Annelida | Ectoprocta+Gastropoda | Brachiopoda+Annelida | Phoronida+Nemertea | Phoronida+Entoprocta |
|---|---|---|---|---|---|---|---|---|---|
| Maximum-likelihood (MtZoa+F model) | Amino acid data set, with | Additional file | <50 | ||||||
| Maximum-likelihood (GTR model) | Nucleotide data set | Additional file | <50 | 86 | <50 | ||||
| Maximum-likelihood (GTR model) | Nucleotide data set (Gblocks edited) | Additional file | <50 | ||||||
| Maximum-likelihood (GTR model) | Nucleotide data set (direct nucleotide alignment) | Additional file | 99 | ||||||
| Maximum-likelihood (MtZoa+F model) | Amino acid data set | Additional file | 52 | 52 | |||||
| Maximum-likelihood (MtZoa+F model) | Amino acid data set (Gblocks edited) | Additional file | <50 | <50 | |||||
| Maximum-likelihood (GTR model) | 1st and 2nd codon positions | 5B | <50 | <50 | <50 | ||||
| nhPhyML | Nucleotide data set; starting tree GTR tree | Additional file | x | x | |||||
| nhPhyML | Nucleotide data set; starting tree CAT tree | Additional file | x | ||||||
| Bayesian (CAT model) | Amino acid data set | 5A | 0.84 | ||||||
| Bayesian (CAT model) | Amino acid data set; 10 taxa with the most strongly differing amino acid composition excluded | Additional file | 0.78 | 0.58 | |||||
| Maximum-likelihood (MtZoa+F model) | Amino acid data set; 10 taxa with the most strongly differing amino acid composition excluded | Additional file | <50 | <50 | |||||
| Bayesian (CAT model) | Amino acid data set recoded using 9 minmax chi-squared bins | Additional file | 0.92 | ||||||
| Maximum-likelihood (MULTIGAMMA model) | Amino acid data set recoded using 9 minmax chi-squared bins | Additional file | 60 | ||||||
| Bayesian (CAT model) | Amino acid data set recoded using 6 minmax chi-squared bins | Additional file | 0.96 | ||||||
| Maximum-likelihood (MULTIGAMMA model) | Amino acid data set recoded using 6 minmax chi-squared bins | Additional file | <50 | <50 | |||||
| Bayesian (CAT model) | Amino acid data set recoded using Dayhoff groups | Additional file | |||||||
| Maximum-likelihood (MULTIGAMMA model) | Amino acid data set recoded using Dayhoff groups | Additional file | <50 | ||||||
| Bayesian (CAT+BP model) | Amino acid data set | Additional file | 0.63 | ||||||
| Maximum-likelihood (GTR model) | Nucleotide data set, 20% of the alignment positions with highest sitewise rates removed | Additional file | 98 | ||||||
| Maximum-likelihood (MtZoa+F model) | Amino acid data set; 10% of the alignment positions with highest sitewise rates removed | Additional file | <50 | 58 |
Unless noted otherwise, the analyses are based on alignments edited with ALISCORE, and the nucleotide alignments are derived from the amino acid alignments. If a group is monophyletic, the posterior probability or the bootstrap support, respectively, is given.
Figure 5. Metazoan phylogeny based on mitochondrial sequences of 49 taxa. (A) Bayesian inference reconstructions calculated with the CAT model based on 2,729 amino acid positions. Bayesian posterior probabilities are shown to the right of the nodes; posterior probabilities equal to 1.0 are indicated by black circles. (B) Maximum likelihood tree calculated with the GTR model based on 7,537 nucleotides from first and second codon positions. Bootstrap support values larger than 50% are shown to the right of the nodes; 100% bootstrap values are indicated by black circles.
Results of posterior predictive tests indicating the ability of different approaches to reduce compositional bias in mitochondrial amino acid data sets.
| Approach | Remaining taxa | | | Number of taxa with significantly deviating amino acid composition |
|---|---|---|---|---|
| Original data set | 49 | 8.657 | 0.000 | 40 |
| Exclusion of the 10 taxa with the most strongly differing amino acid composition | 39 | 7.308 | 0.000 | 32 |
| Recoding using 9 minmax chi-squared bins | 49 | 8.690 | 0.003 | 38 |
| Recoding using 6 minmax chi-squared bins | 49 | 7.196 | 0.005 | 21 |
| Recoding using Dayhoff groups | 49 | 11.285 | 0.000 | 30 |
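Recoding with Dayhoff groups, one of the approaches listed in the table above, replaces each amino acid with the label of a group of biochemically similar residues, so that substitutions within a group become invisible to the analysis. The sketch below assumes the widely used six-group scheme (AGPST, DENQ, HKR, ILMV, FWY, C). Whether this is exactly the grouping used in the analyses is an assumption, and the digit labels and the function name `recode_dayhoff` are arbitrary choices for illustration.

```python
# Assumed six Dayhoff groups; the digit assigned to each group is arbitrary.
DAYHOFF_GROUPS = {
    "AGPST": "0",
    "DENQ": "1",
    "HKR": "2",
    "ILMV": "3",
    "FWY": "4",
    "C": "5",
}

# Flatten the group table into a per-residue lookup.
RESIDUE_TO_GROUP = {
    residue: label
    for residues, label in DAYHOFF_GROUPS.items()
    for residue in residues
}


def recode_dayhoff(sequence: str) -> str:
    """Replace each amino acid by its group label.

    Characters outside the 20 standard amino acids (gaps, 'X', '?')
    are passed through unchanged.
    """
    return "".join(RESIDUE_TO_GROUP.get(ch, ch) for ch in sequence.upper())


# Hypothetical aligned fragment, not taken from the article.
print(recode_dayhoff("MKT-C"))
```

The recoded alignment has a six-state alphabet, so it has to be analysed with a substitution model that accepts that number of states rather than a standard 20-state amino acid matrix.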