David H Ardell, Siv G E Andersson.
Abstract
We present TFAM, an automated, statistical method to classify the identity of tRNAs. TFAM, currently optimized for bacteria, classifies initiator tRNAs and predicts the charging identity of both typical and atypical tRNAs such as suppressors with high confidence. We show statistical evidence for extensive variation in tRNA identity determinants among bacterial genomes due to variation in overall tDNA base content. With TFAM we have detected the first case of eukaryotic-like tRNA identity rules in bacteria. An alpha-proteobacterial clade encompassing Rhizobiales, Caulobacter crescentus and Silicibacter pomeroyi, unlike a sister clade containing the Rickettsiales, Zymomonas mobilis and Gluconobacter oxydans, uses the eukaryotic identity element A73 instead of the highly conserved prokaryotic element C73. We confirm divergence of bacterial histidylation rules by demonstrating perfect covariation of alpha-proteobacterial tRNA(His) acceptor stems and residues in the motif IIb tRNA-binding pocket of their histidyl-tRNA synthetases (HisRS). Phylogenomic analysis supports lateral transfer of a eukaryotic-like HisRS into the alpha-proteobacteria followed by in situ adaptation of the bacterial tDNA(His) and identity rule divergence. Our results demonstrate that TFAM is an effective tool for the bioinformatics, comparative genomics and evolutionary study of tRNA identity.
Year: 2006 PMID: 16473847 PMCID: PMC1363771 DOI: 10.1093/nar/gkj449
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Resubstitution analysis of TFAM performance with the MSDB
| Model | N (a) | Sn (b) | Sp (c) | Seq1 (d) | Seq2 (e) |
|---|---|---|---|---|---|
| A | 52 | 100 | 91 | A: 33 [48,5] | V: −3 [21,−26] |
| A w/o ac | 52 | 98 | 91 | A: 29 [44,3] | V: −1 [22,−24] |
| C | 13 | 100 | 92 | C: 24 [37,0] | S: −12 [5,−34] |
| C w/o ac | 13 | 100 | 92 | C: 21 [33,−3] | S: −12 [8,−35] |
| D | 21 | 100 | 100 | D: 20 [37,4] | E: −3 [7,−11] |
| D w/o ac | 21 | 100 | 100 | D: 17 [34,1] | E: −4 [6,−12] |
| E | 23 | 100 | 100 | E: 38 [54,15] | D: −11 [3,−37] |
| E w/o ac | 23 | 100 | 100 | E: 34 [51,11] | D: −11 [3,−37] |
| F | 22 | 100 | 100 | F: 23 [30,11] | C: −16 [−1,−33] |
| F w/o ac | 22 | 100 | 100 | F: 19 [26,7] | K: −11 [−1,−23] |
| G | 45 | 100 | 100 | G: 26 [46,6] | W: −14 [−2,−30] |
| G w/o ac | 45 | 100 | 100 | G: 22 [42,2] | H: −11 [0,−19] |
| H | 15 | 100 | 100 | H: 23 [36,12] | Q: −14 [−7,−22] |
| H w/o ac | 15 | 100 | 100 | H: 19 [32,8] | Q: −15 [−8,−23] |
| I | 61 | 96 | 95 | I: 27 [38,−25] | K: −10 [0,−35] |
| I w/o ac | 61 | 95 | 95 | I: 24 [35,−27] | R: −6 [8,−37] |
| K | 22 | 100 | 95 | K: 16 [28,7] | N: −10 [−1,−22] |
| K w/o ac | 22 | 100 | 95 | K: 13 [25,4] | F: −7 [0,−20] |
| L | 63 | 100 | 100 | L: 61 [79,30] | Y: −13 [4,−54] |
| L w/o ac | 63 | 100 | 100 | L: 59 [77,28] | Y: −10 [7,−51] |
| M | 14 | 92 | 100 | M: 13 [18,−7] | X: −10 [−4,−14] |
| M w/o ac | 14 | 78 | 100 | M: 10 [15,−10] | X: −13 [−7,−17] |
| N | 21 | 100 | 100 | N: 20 [32,1] | K: −12 [−4,−23] |
| N w/o ac | 21 | 95 | 95 | N: 16 [29,−2] | K: −12 [−5,−23] |
| P | 28 | 100 | 100 | P: 29 [36,6] | X: −12 [−4,−23] |
| P w/o ac | 28 | 100 | 100 | P: 25 [34,4] | X: −7 [0,−17] |
| Q | 19 | 100 | 100 | Q: 30 [39,0] | H: −16 [−5,−33] |
| Q w/o ac | 19 | 100 | 100 | Q: 27 [36,−2] | C: −15 [−6,−29] |
| R | 48 | 100 | 100 | R: 20 [32,5] | N: −8 [−2,−29] |
| R w/o ac | 48 | 100 | 96 | R: 18 [25,3] | N: −4 [1,−25] |
| S | 53 | 100 | 100 | S: 77 [104,44] | Y: 0 [18,−23] |
| S w/o ac | 53 | 100 | 100 | S: 74 [101,44] | Y: 0 [19,−22] |
| T | 43 | 100 | 100 | T: 17 [24,6] | N: −8 [4,−14] |
| T w/o ac | 43 | 100 | 93 | T: 14 [21,4] | N: −7 [5,−13] |
| V | 26 | 73 | 100 | V: 14 [25,0] | A: 4 [14,−13] |
| V w/o ac | 26 | 69 | 94 | V: 11 [22,−1] | A: 4 [14,−13] |
| W | 18 | 100 | 100 | W: 23 [30,1] | G: −17 [−2,−33] |
| W w/o ac | 18 | 94 | 100 | W: 20 [26,−2] | G: −17 [−1,−30] |
| X | 28 | 100 | 100 | X: 46 [51,29] | M: −28 [−8,−48] |
| X w/o ac | 28 | 100 | 100 | X: 42 [47,25] | P: −23 [−9,−54] |
| Y | 20 | 100 | 100 | Y: 45 [64,31] | S: 7 [33,−11] |
| Y w/o ac | 20 | 100 | 100 | Y: 41 [60,28] | S: 9 [35,−8] |
| Average | 655 | 98.1 | 98.7 | | |
| Average w/o ac | 655 | 96.6 | 97.7 | | |
a Training set size.
b Model percent sensitivity.
c Model percent specificity.
d Sequence class with highest median score against the model and its distribution: median and range.
e Sequence class with second-highest median score against the model and its distribution: median and range.
f Sequences are scored against the model including the anticodon positions.
g w/o ac = without anticodon. Sequences are scored against the model excluding the anticodon positions.
h tRNAiMet.
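The Sn and Sp columns can be illustrated with a small calculation. The sketch below is not the authors' code: it assumes that sensitivity is TP/(TP+FN) and that specificity is TP/(TP+FP), the convention common in gene-prediction work (the table footnotes do not define either term). The function name and the toy class labels are hypothetical.

```python
from collections import Counter

def sensitivity_specificity(pairs):
    """Per-class percent sensitivity and specificity.

    pairs: iterable of (true_class, predicted_class).
    Assumed definitions (not stated in the source):
      Sn = TP / (TP + FN), Sp = TP / (TP + FP).
    """
    pairs = list(pairs)
    tp = Counter(t for t, p in pairs if t == p)      # correct calls per class
    actual = Counter(t for t, _ in pairs)            # TP + FN per class
    predicted = Counter(p for _, p in pairs)         # TP + FP per class
    result = {}
    for cls in sorted(actual):
        sn = 100.0 * tp[cls] / actual[cls]
        sp = 100.0 * tp[cls] / predicted[cls] if predicted[cls] else float("nan")
        result[cls] = (sn, sp)
    return result

# Hypothetical toy data: four A sequences all called A,
# four V sequences of which one is miscalled as A.
toy = [("A", "A")] * 4 + [("V", "V")] * 3 + [("V", "A")]
stats = sensitivity_specificity(toy)
for cls, (sn, sp) in stats.items():
    print(f"{cls}: Sn={sn:.0f} Sp={sp:.0f}")
```

In a resubstitution analysis the sequences scored are the same ones used to train the models, so figures computed this way describe fit to the training set rather than performance on unseen data.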

Box-and-whisker plots of tFAM scores by eubacterial taxon, after standardization of TFAM scores by their tDNA identity class. Boxes show the interquartile range and median. Whiskers extend to 1.5 times the interquartile range. Plusses (+) show outliers.
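The quantities drawn in such a plot can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: standardization is taken to be a z-score within each identity class, quartiles use Python's inclusive method, and whiskers are taken to end at the most extreme data point within 1.5 interquartile ranges of the box. All function names are hypothetical.

```python
import statistics

def standardize_by_class(scores, classes):
    """z-score each score within its identity class (assumed standardization)."""
    by_class = {}
    for s, c in zip(scores, classes):
        by_class.setdefault(c, []).append(s)
    params = {c: (statistics.mean(v), statistics.pstdev(v))
              for c, v in by_class.items()}
    out = []
    for s, c in zip(scores, classes):
        mean, sd = params[c]
        # A class with zero spread carries no information; map it to 0.
        out.append((s - mean) / sd if sd > 0 else 0.0)
    return out

def box_stats(values):
    """Median, quartiles, whisker ends and outliers for one box."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if lo_fence <= v <= hi_fence]
    outliers = [v for v in values if v < lo_fence or v > hi_fence]
    return {"q1": q1, "median": med, "q3": q3,
            "whisker_low": min(inside), "whisker_high": max(inside),
            "outliers": outliers}

# Hypothetical scores for two identity classes.
z = standardize_by_class([10, 20, 1, 3], ["H", "H", "Q", "Q"])
box = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(z, box)
```

Standardizing within each class first matters because raw scores differ widely between models, so pooling them by taxon without rescaling would mostly reflect class composition.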
Scores of eubacterial tDNAs (60 genomes) with MSDB TFAMs
| Taxa | tRNA | N | His-tFAM (a) | Other tFAM (b) |
|---|---|---|---|---|
| RC (c) | His | 7 | −10 [−8, −21] | −4 [−2, −12] |
| Other | His | 62 | 15 [28, −11] | −10 [−1, −18] |
| Test (d) | | | *** | NS |
| RC (c) | Other (e) | 340 | −34 [−11, −58] | 22 [90, −8] |
| Other | Other (e) | 2786 | −34 [−3, −55] | 24 [110, −16] |
| Test (d) | | | NS | * |
a Score distribution of tRNAs against the His-tFAM (median and range).
b Median and range of maximum scores against all other tFAMs but the His-tFAM.
c Rhizobiales + Caulobacter.
d Mann–Whitney test, ***P < 0.001, *P < 0.05, NS, not significant.
e All tDNAs except His, Sel-Cys, Pseudo and Undet (according to tRNAscan-SE).
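The group comparison behind the significance marks can be sketched with a hand-written Mann–Whitney U test. This is a minimal illustration, not the authors' analysis: it uses the normal approximation without tie or continuity correction, which is rough for small groups, and the score lists are hypothetical because the table reports only medians and ranges.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test.

    Returns (U, p) where U is the smaller of the two U statistics and
    p comes from the normal approximation (no tie or continuity correction).
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1          # average 1-based rank of the run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided tail probability
    return u, p

# Hypothetical His-tFAM scores for two groups of tRNAHis genes.
rc_scores = [-21, -15, -12, -10, -10, -9, -8]
other_scores = [-11, 2, 9, 14, 15, 16, 20, 28]
u, p = mann_whitney_u(rc_scores, other_scores)
print(u, p)
```

For real data an exact or tie-corrected implementation from a statistics library would be the safer choice.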

tRNAHis acceptor stems (with adjacent 5′ leader bases) showing the different identity elements in the α-proteobacteria. The phylogeny is taken from (35,72). Also shown is a parsimony-based reconstruction of ancestral stems and leaders. The most obvious functionally significant differences are outlined in blue for the RCS-clade and in pink for the RZG-clade, to correspond to the covariation of HisRS shown in Figures 3 and 4. Abbreviations are as follows: Bh, Bartonella henselae; Bm, Brucella melitensis; Ml, Mesorhizobium loti; Sm, Sinorhizobium meliloti; At, Agrobacterium tumefaciens; Rp, Rhodopseudomonas palustris; Bj, Bradyrhizobium japonicum; Cc, Caulobacter crescentus; Sp, Silicibacter pomeroyi; Zm, Zymomonas mobilis; Go, Gluconobacter oxydans; Rc, Rickettsia conorii; Rp, Rickettsia prowazekii; wMel, Wolbachia strain wMel; wBm, Wolbachia strain wBm; Am, Anaplasma marginale; Er, Ehrlichia ruminantium.

Sequence logos of the motif IIb loop in bacterial HisRS. The cladogram at left indicates source and number of sequences. Within α-proteobacteria, Q224 covaries perfectly with tRNAHis C73 in Figure 2.
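Letter heights in a sequence logo are conventionally derived from the information content of each alignment column. The sketch below shows that standard calculation for a protein alignment, assuming a 20-letter alphabet, no gaps and no small-sample correction; it is not the tool used to draw the figure, and the toy alignment is hypothetical.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=20):
    """Information content in bits: log2(alphabet size) minus Shannon entropy."""
    counts = Counter(column)
    n = sum(counts.values())
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return math.log2(alphabet_size) - entropy

def logo_heights(alignment, alphabet_size=20):
    """Per-column letter heights: frequency times column information."""
    heights = []
    for col in zip(*alignment):              # iterate over alignment columns
        info = column_information(col, alphabet_size)
        counts = Counter(col)
        n = len(col)
        heights.append({res: info * c / n for res, c in counts.items()})
    return heights

# Hypothetical two-column alignment: a fully conserved Q and an A/G mix.
heights = logo_heights(["QA", "QG", "QA", "QG"])
print(heights)
```

A fully conserved column reaches the maximum height of log2(20), about 4.32 bits, which is why an invariant residue stands out in a logo.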

An unrooted protein likelihood HisRS/HisZ phylogram. Sequences are colored by taxonomic source: α-proteobacteria in red, other bacteria in green, eukaryotes in gold and archaea in blue. HisZ sequences are in black. The RCS-clade is shaded in blue and the RZG-clade is shaded in pink, to correspond to the covariation with tRNAHis acceptor stems shown in Figure 2. The tree topology shows the consensus of 100 bootstrapped PHYML likelihood trees, in which edges with less than 70% likelihood bootstrap support were collapsed. Splits with two support values show likelihood and BIONJ percent bootstrap values, in that order. Single split support values indicate support by likelihood only. Units of branch length are substitutions per site.