L Aravind, Lakshminarayan M Iyer, Vivek Anantharaman.
Abstract
BACKGROUND: There is considerable heterogeneity in the phyletic patterns of major chromosomal DNA-binding proteins in archaea. Alba is a well-characterized chromosomal protein from the crenarchaeal genus Sulfolobus. While Alba has been detected in most archaea and some eukaryotic taxa, its exact functions in these taxa are not clear. Here we use comparative genomics and sequence profile analysis to predict potential alternative functions of the Alba proteins.
Year: 2003 PMID: 14519199 PMCID: PMC328453 DOI: 10.1186/gb-2003-4-10-r64
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. The multiple sequence alignment of the Alba family was constructed using the T-Coffee program after parsing high-scoring pairs from PSI-BLAST search results. The alignment of the Alba superfamily is superimposed on the structure-based alignments with members of the IF3-C and YhbY superfamilies. The secondary structure, derived from the representatives of these superfamilies with available structures, is shown above the alignment, with E representing a β-strand and H an α-helix. The box shows the shared loop, which is typically bounded by small residues in Alba and YhbY. The 95% consensus shown below each family was derived using the following amino acid classes: hydrophobic (h: ALICVMYFW, yellow shading) and their aliphatic subset (l: ILV, yellow shading); aromatic (a: FHWY, yellow shading); small (s: ACDGNPSTV, green) and their tiny subset (u: GAS, green shading); charged (c: DEHKR, pink) and their basic subset (+: HKR, pink) and acidic subset (-: DE, pink); and polar (p: CDEHKNQRST, blue) and their alcoholic subset (o: ST, blue). A 'G', 'K' or 'E' shows the completely conserved amino acid in that group. The position corresponding to the lysine that is acetylated in Alba is shown in red. The limits of the domains are indicated by the residue positions at each end of the sequence. The 'R' to the right of the sequence denotes those members that have the RNA-binding 'RGG' repeats at the C terminus. The numbers within the alignment represent non-conserved inserts that are not shown. The sequences are denoted by their gene name followed by the species abbreviation and GenBank identifier.
Species abbreviations are as follows: Af, Archaeoglobus fulgidus; Aga, Anopheles gambiae; Ana, Anabaena sp.; Ap, Aeropyrum pernix; At, Arabidopsis thaliana; Bsub, Bacillus subtilis; Cace, Clostridium acetobutylicum; Ce, Caenorhabditis elegans; Dm, Drosophila melanogaster; Ec, Escherichia coli; Fac, Ferroplasma acidarmanus; Gste, Geobacillus stearothermophilus; Hs, Homo sapiens; Linf, Leishmania infantum; Mjan, Methanococcus jannaschii; Mka, Methanopyrus kandleri; Mm, Mus musculus; Mta, Methanothermobacter thermautotrophicus; Nm, Neisseria meningitidis; Osa, Oryza sativa; Pa, Pyrococcus abyssi; Pfa, Plasmodium falciparum; Pfu, Pyrococcus furiosus; Ph, Pyrococcus horikoshii; Pyae, Pyrobaculum aerophilum; Pyo, Plasmodium yoelii; Rn, Rattus norvegicus; Sc, Saccharomyces cerevisiae; Slem, Stylonychia lemnae; Sp, Schizosaccharomyces pombe; Ssh, Sulfolobus shibatae; Sso, Sulfolobus solfataricus; Sst, Sulfolobus tokodaii; Tac, Thermoplasma acidophilum.
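The class-based 95% consensus described in the Figure 1 legend can be illustrated with a minimal Python sketch. The class symbols and memberships are taken from the legend; everything else is an assumption made for illustration and is not the authors' procedure: the function names, the rule that the smallest qualifying class wins, the treatment of gaps as non-members, the use of '.' for columns with no consensus, and the use of the same threshold for single conserved residues.

```python
from collections import Counter

# Amino acid classes exactly as listed in the Figure 1 legend.
CLASSES = {
    "h": "ALICVMYFW",   # hydrophobic
    "l": "ILV",         # aliphatic
    "a": "FHWY",        # aromatic
    "s": "ACDGNPSTV",   # small
    "u": "GAS",         # tiny
    "c": "DEHKR",       # charged
    "+": "HKR",         # basic
    "-": "DE",          # acidic
    "p": "CDEHKNQRST",  # polar
    "o": "ST",          # alcoholic
}

# Assumption: prefer the most specific (smallest) class that satisfies
# the threshold. The legend does not state how ties are resolved; here
# the stable sort keeps the listing order above for equal-sized classes.
ORDERED_CLASSES = sorted(CLASSES.items(), key=lambda item: len(item[1]))


def column_consensus(column, threshold=0.95):
    """Return a residue letter, a class symbol, or '.' for one column."""
    total = len(column)
    counts = Counter(column)

    # A single residue that meets the threshold is reported as itself.
    # Gap characters ('-') are never reported as a conserved residue.
    for residue, count in counts.items():
        if residue != "-" and count / total >= threshold:
            return residue

    # Otherwise report the smallest class that covers enough of the column.
    # Gaps belong to no class, so they count against every class.
    for symbol, members in ORDERED_CLASSES:
        covered = sum(counts[residue] for residue in members)
        if covered / total >= threshold:
            return symbol

    return "."  # assumption: '.' marks a column with no consensus


def consensus(alignment, threshold=0.95):
    """Compute the consensus string for a list of equal-length sequences."""
    return "".join(column_consensus(col, threshold) for col in zip(*alignment))


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from the paper.
    toy_alignment = ["GKIA", "GRLS", "GKVG", "GHIT"]
    print(consensus(toy_alignment))
```

In this toy alignment the first column is an invariant glycine, the second contains only basic residues, the third only aliphatic residues, and the fourth only small residues, so each column resolves to a progressively broader class.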
Figure 2. Phylogenetic trees of the Alba and YhbY superfamilies and an evolutionary scheme of the various superfamilies within the IF3-C fold. (a) Phylogenetic tree of the Alba superfamily. The tree shown here was constructed using the maximum-likelihood optimization as described in the Materials and methods section. The black circles indicate nodes with Rell-BP support of 80% or greater. The proteins are named as described in the legend to Figure 1. The divergent versions of Sso10b from the archaeal family of the Alba superfamily are indicated by a red circle. (b) Phylogenetic tree of the YhbY superfamily. Species abbreviations are as in the legend to Figure 1; additional species abbreviations are as follows: Blo, Bifidobacterium longum; Cau, Chloroflexus aurantiacus; Fnu, Fusobacterium nucleatum; Gme, Geobacter metallireducens; Hi, Haemophilus influenzae; Hsp, Halobacterium sp.; Lla, Lactococcus lactis; Pae, Pseudomonas aeruginosa; Spy, Streptococcus pyogenes; Vch, Vibrio cholerae; Xfa, Xylella fastidiosa. (c) An evolutionary scheme of the various superfamilies within the IF3-C fold. The horizontal lines indicate temporal epochs corresponding to certain major transitions in evolution, such as the last common ancestor of extant cellular life forms (LUCA), the divergence between the archaeo-eukaryotic lineage and the bacterial lineage and, finally, the emergence of the extant eukaryotes. The known or clearly predicted biochemical functions of the various superfamilies of this fold and their phyletic patterns are indicated along with their names. R stands for RNA binding and D for DNA binding. The '>' in the phyletic patterns, as in B>E, indicates an ancient transfer from bacteria to eukaryotes. The overall topology of the phylogram was derived using clustering based on DALI Z-scores and specific shared derived characters. The YhhP family was first identified and predicted to function as a redox regulator in reference [52].
The red lineages are those which can be confidently traced to the LUCA, the black lineages are exclusively archaeo-eukaryotic and the green lineages are mainly bacterial with transfers to eukaryotes.
Figure 3. A ribbon representation of the IF3-C fold in the Alba, YhbY and IF3 proteins. The shared loop seen in YhbY and Alba (Figure 1) is in orange. The long carboxy-terminal β-hairpin is unique to Alba and is prone to great variability.