Proposed Role for KaiC-Like ATPases as Major Signal Transduction Hubs in Archaea.

Kira S Makarova, Michael Y Galperin, Eugene V Koonin.

Abstract

All organisms must adapt to ever-changing environmental conditions and accordingly have evolved diverse signal transduction systems. In bacteria, the most abundant networks are built around the two-component signal transduction systems that include histidine kinases and receiver domains. In contrast, eukaryotic signal transduction is dominated by serine/threonine/tyrosine protein kinases. Both of these systems are also found in archaea, but they are not as common and diversified as their bacterial and eukaryotic counterparts, suggesting the possibility that archaea have evolved other, still uncharacterized signal transduction networks. Here we propose a role for KaiC family ATPases, known to be key components of the circadian clock in cyanobacteria, in archaeal signal transduction. The KaiC family is notably expanded in most archaeal genomes, and although most of these ATPases remain poorly characterized, members of the KaiC family have been shown to control archaellum assembly and have been found to be a stable component of the gas vesicle system in Halobacteria. Computational analyses described here suggest that KaiC-like ATPases and their homologues with inactivated ATPase domains are involved in many other archaeal signal transduction pathways and comprise major hubs of complex regulatory networks. We predict numerous input and output domains that are linked to KaiC-like proteins, including putative homologues of eukaryotic DEATH domains that could function as adapters in archaeal signaling networks. We further address the relationships of the archaeal family of KaiC homologues to the bona fide KaiC of cyanobacteria and implications for the existence of a KaiC-based circadian clock apparatus in archaea.

IMPORTANCE

Little is currently known about signal transduction pathways in Archaea. Recent studies indicate that KaiC-like ATPases, known as key components of the circadian clock apparatus in cyanobacteria, are involved in the regulation of archaellum assembly and, likely, type IV pili and the gas vesicle system in Archaea. We performed comprehensive comparative genomic analyses of the KaiC family. A vast protein interaction network was revealed, with KaiC family proteins as hubs for numerous input and output components, many of which are shared with two-component signal transduction systems. Putative KaiC-based signal transduction systems are predicted to regulate the activities of membrane-associated complexes and individual proteins, such as signal recognition particle and membrane transporters, and also could be important for oxidative stress response regulation. KaiC-centered signal transduction networks are predicted to play major roles in archaeal physiology, and this work is expected to stimulate their experimental characterization.

Keywords:  ATPase; Archaea; KaiC; circadian clock; signal transduction

Year:  2017        PMID: 29208747      PMCID: PMC5717392          DOI: 10.1128/mBio.01959-17

Source DB:  PubMed          Journal:  mBio            Impact factor:   7.867


INTRODUCTION

Signal transduction systems are essential components of all forms of life and serve as information channels between the organism and the environment. The general organization of a signal transduction system includes at least three components, namely, a sensor (input), a transmitter, and an effector (output). Bacterial signal transduction is dominated by two-component systems that typically transmit information through histidine kinases, whereas eukaryotic signal transduction systems are dominated by Ser/Thr/Tyr protein kinases (1–6). Archaeal signal transduction systems have not been studied to a comparable extent (7), but studies support the existence of some archaea-encoded, complex signal transduction networks that mimic systems employed in bacteria and eukarya. Two-component systems homologous to bacterial counterparts have been experimentally characterized in Halobacteria that regulate complex protein-protein interaction networks that influence chemotaxis, phototaxis, and archaellum (archaeal rotary motor) activity (8). In the methanogenic archaeon Methanosaeta harundinacea, the histidine kinase FilI controls cell morphology and affects methane production (9). Phylogenetic analysis suggests that archaea acquired two-component signal transduction system components from bacteria on multiple occasions (10). This conclusion is compatible with the results of the reconstruction of the last archaeal common ancestor (LACA) on the basis of the arCOG (archaeal clusters of orthologous genes) database that estimated the probability of histidine kinases being present in the LACA at <0.2 (11). Notably, the two-component systems are mostly found in mesophilic archaea that appear to have captured numerous bacterial genes via horizontal gene transfer (HGT) (9, 10, 12–17) (Table 1). 
Protein phosphorylation, mostly attributed to Ser/Thr/Tyr (here, S/T) protein kinases, apparently plays an important role in archaea, but details about the specific roles of protein phosphorylation in signal transduction are scarce (7). Unlike histidine kinases, three S/T protein kinase families, RIO1, RIO2, and SPS1 (corresponding to COG1718, COG0478, and COG0515, respectively, in the Clusters of Orthologous Genes [COG] database [18]), have been traced back to the LACA (11). At least some of these kinases appear to be key regulators of the archaeal cell cycle, motility, and membrane remodeling (19). However, S/T kinases are not particularly prone to expansion in archaea (Table 1) and probably comprise only a limited part of the archaeal signal transduction networks.
TABLE 1

Three major protein superfamilies involved in signal transduction in selected archaea and bacteria

No. of proteins in the family^a
Genome | Serine/threonine protein kinases^b | KaiC-like ATPases^c | Sensor histidine kinases^d
Archaea
Aeropyrum pernix K1 | 3 | 3 | 0
Desulfurococcus kamchatkensis 1221n | 3 | 3 | 0
Ignicoccus hospitalis KIN4 I | 5 | 3 | 0
Hyperthermus butylicus DSM 5456 | 4 | 5 | 0
Pyrolobus fumarii 1A | 6 | 2 | 0
Sulfolobus acidocaldarius DSM 639 | 9 | 2 | 0
Pyrobaculum aerophilum IM2 | 5 | 12 | 0
Archaeoglobus fulgidus DSM 4304 | 3 | 14 | 14
Halobacterium salinarum R1 | 6 | 10 | 13
Haloferax volcanii DS2 | 6 | 11 | 23
Methanothermobacter thermautotrophicus Delta H | 2 | 6 | 16
Methanocaldococcus jannaschii DSM 2661 | 3 | 2 | 0
Methanococcus maripaludis S2 | 3 | 1 | 3
Methanocella conradii HZ254 | 3 | 10 | 19
Methanosarcina acetivorans C2A | 4 | 7 | 53
Methanosarcina mazei Go1 | 3 | 7 | 34
Pyrococcus furiosus COM1 | 4 | 21 | 0
Thermococcus kodakarensis KOD1 | 3 | 32 | 1
Thermoplasma acidophilum DSM 1728 | 3 | 3 | 0
Candidatus Korarchaeum cryptofilum OPF8 | 2 | 3 | 0
Nanoarchaeum equitans Kin4M | 2 | 1 | 0
Nitrosoarchaeum koreensis MY1 | 3 | 2 | 12
"Candidatus Caldiarchaeum subterraneum" | 2 | 5 | 0
Bacteria
Escherichia coli K-12 MG1655 | 2 | 0 | 30
Bacillus subtilis subsp. subtilis 168 | 5 | 0 | 36
Nostoc sp. strain PCC 7120 | 54 | 2 | 139
Thermus thermophilus HB8 | 5 | 0 | 11

^a The numbers of respective proteins were taken from previous publications (5, 73) and/or retrieved from recent updates of the COG (18) and arCOG (66) databases. The data were verified by using PSI-BLAST searches against the complete-genome database (as of March 2016).
^b COG0478, COG0515, COG0661, COG1718, COG2112, and COG2766.
^c COG0467.
^d COG0642, COG0643, COG2205, COG2972, COG3275, COG3290, and COG3920.

Given the relative paucity of identifiable signal transduction systems in Archaea, a search for new, perhaps archaea-specific, signal transduction systems is an important goal. Our previous analyses of type IV pili systems and the archaellum identified several KaiC-like ATPases (members of the COG0467 family) that appear to be involved in the regulation of these systems (20). These observations prompted us to undertake a comprehensive analysis of the KaiC family in archaea. Because of its high similarity to the eukaryotic recombinase component Rad55 (homologue of bacterial RecA and archaeal RadA), the COG0467 family (18) in archaea was, until recently, implicated in DNA recombination pathways (21). One of the archaeal proteins, namely, the SSO2452 protein of Sulfolobus solfataricus, has been experimentally studied in this context and was shown not to be an active recombinase, although it could bind single-stranded DNA and inhibit D-loop formation by RadA (22). However, outside the Archaea, the best-studied protein in this family is the cyanobacterial circadian clock ATPase KaiC, which does not appear to be involved in DNA recombination (23, 24). The cyanobacterial circadian clock system, an ATP-dependent, posttranslational molecular oscillator, has been thoroughly characterized biochemically, structurally, and functionally (25–31). Typically, the system consists of three protein components, KaiA, KaiB, and KaiC (Fig. 1A).
The KaiA protein forms a homodimer that interacts directly with the C-terminal ATPase domain (CII) of KaiC and promotes its phosphorylation. Structurally, KaiA is a two-domain protein with an N-terminal four-helix bundle domain and a C-terminal OmpR-like winged helix-turn-helix (HTH) DNA-binding domain. KaiB has the thioredoxin fold and interacts with the N-terminal (CI) domain of KaiC, promoting dissociation of KaiA and dephosphorylation of KaiC. The cyanobacterium Prochlorococcus marinus encodes a minimal circadian system that lacks KaiA but nevertheless shows some features of an autonomous oscillator that, however, does not persist long under constant-light conditions, so that the system apparently requires a reset each diel cycle (26, 32). However, even when all three components are present, this is not always sufficient to reproduce all of the canonical properties of a circadian clock, as is the case in the purple nonsulfur bacterium Rhodopseudomonas palustris, which only poorly maintains rhythmicity under constant conditions (33). Multiple input and output components have been shown to interact with the cyanobacterial circadian clock system, forming a complex, interconnected network that includes transcriptional regulators, receiver (REC) domains, and sensory histidine kinases, as well as light-sensitive redox molecules such as quinones (26). Some of the input and output proteins contain KaiA- or KaiB-like domains and directly interact with KaiC.
FIG 1 

Overview of the KaiC family. (A) Organization of the cyanobacterial circadian clock system. (B) Scheme of relationships of the KaiC family with other RecA-like ATPase families. (C) Known archaeal systems associated with KaiC-like proteins. KaiC family protein N- and C-terminal ATPase domains are red and pink, respectively. Genes are represented by arrows. For archaeal systems, arCOG numbers are shown below the arrows. Homologous genes are color coded. Models show interactions between subunits in the respective complexes (see the text for details and discussion).

Phylogenetic analysis indicates that the COG0467 family forms a separate clade within the RecA ATPase superfamily (34) (Fig. 1B), implying a separate function that does not involve DNA transactions. It has been hypothesized that, given the wide spread and major expansion of this family in archaea, which contrasts with its patchy distribution in bacteria, the KaiC component of the cyanobacterial circadian clock was acquired by HGT from archaea (34). Recently, the structure of the FlaH protein (COG2874 family), which is always encoded within the archaellum operon (Fig. 1C), has been solved and shown to be similar to the C-terminal domain of KaiC (35, 36). FlaH has been shown to form a hexamer and interact with the archaellum subunit FlaI, the motor ATPase, and in crenarchaea, with the FlaX ring (35, 36). The KaiC family ATPase GvpD was found to be involved in the regulation of Halobacterium-specific gas vesicles (37) (Fig. 1C). Several halobacterial KaiC-like proteins have been studied with respect to their potential involvement in light-dependent gene expression (38). Very recently, KaiC proteins from the hyperthermophiles Thermococcus litoralis and Pyrococcus horikoshii were shown to be capable of KaiA-independent autophosphorylation at both 30°C and 75°C (34, 39).
Finally, structural analysis of a distinct family of archaea-specific uncharacterized proteins (DUF835, PF05763 in the Pfam database [40]) has shown that these proteins are inactivated ATPases that are most closely related to KaiC (41). Thus, currently, at least two additional protein families can be included in the archaeal KaiC group (Fig. 1B). Evolutionary reconstructions suggest that KaiC-like ATPases from arCOG01171, arCOG01174, and arCOG04148 (FlaH) were likely present already in the LACA (11). Prompted by the above observations and the extraordinary diversity of the KaiC ATPases in archaea, we performed a comprehensive phylogenomic analysis of this protein family. The results strongly suggest that the KaiC family ATPases and their homologues with inactivated ATPase domains are key components of the archaeal signaling network(s).

RESULTS

Genomic census of the KaiC ATPase family in archaea and bacteria.

To perform a comprehensive phylogenomic analysis of the KaiC ATPase superfamily, 2,635 sequences from the three KaiC subfamilies (COG0467, COG2874, and pfam05763) and related arCOGs (see Table S1 in the supplemental material) were extracted from the data set of complete archaeal and bacterial genomes. Genomic loci (five genes upstream and downstream from each kaiC-like gene) were retrieved for the genomic neighborhood analysis (Table S2). These loci were annotated by using PSI-BLAST and the CDD (Conserved Domain Database) collection of multiple sequence alignments, and the archaeal proteins were assigned to arCOGs (see Materials and Methods for details). Notably, members of the KaiC superfamily are present even in the archaea with the smallest genomes, such as Nanoarchaeota, and various KaiC families are expanded in many archaeal lineages, especially in Thermococci and Thermoproteales (Table S1).
TABLE S1 Phyletic patterns of KaiC-like and associated arCOGs. XLSX file, 0.1 MB.
TABLE S2 KaiC-encoding genomic loci. XLSX file, 2.5 MB.
From this collection of KaiC-like protein sequences, we selected a nonredundant set of proteins that could be expected to contain at least one full-sized ATPase domain (~200 amino acid residues). This nonredundant set was used to build a dendrogram by using a combination of the FastTree method and the unweighted pair group method using average linkages (UPGMA) (Text S1; see Materials and Methods for details). The resulting tree topology was largely consistent with results of previous phylogenetic analyses (34, 39).
TEXT S1 KaiC tree, Newick format. TXT file, 0.1 MB.
Despite the considerable overrepresentation of bacterial compared to archaeal genomes in the database, archaeal (and cyanobacterial) proteins dominate the KaiC family, in agreement with the previous conclusion that this family originated in Archaea (42). A phylogenetic tree was built for a nonredundant subset of KaiC family members (Fig. 2A).
The tree contains 28 distinct strongly supported archaeal branches (A1 to A28) and 6 predominantly bacterial branches (B1 to B6). Bacterial sequences are mostly scattered over the tree, suggesting frequent HGT from archaea to bacteria. The large, mostly bacterial clade combining branches B2 and B3 corresponds to cyanobacterial KaiC components of the circadian clock (B3) and KaiC-like sequences (B2) including experimentally studied proteins of Rhodopseudomonas and Legionella (33, 43) (Fig. 2A). The strongly supported (95%) B2 clade contains several archaeal proteins, in addition to bacterial ones, all from different methanogens (branches A5a and A5b), which indicates likely HGT from bacteria to archaea. In Rhodopseudomonas (branch B2), involvement of the KaiC homologues in clock-like gene expression has been demonstrated, whereas in Legionella (branch B2), these proteins are implicated in oxidative and sodium stress resistance and do not appear to be components of an oscillator. This clade is deeply nested among diverse archaeal branches, in accord with the scenario in which the ancestral components of the circadian clock were transferred from archaea to bacteria (Fig. 2A). Proteins containing two ATPase domains and those with a single ATPase domain are interspersed in the tree, suggesting that multiple gene fusions and gene fissions occurred during the evolution of this family in Archaea. Furthermore, active and inactivated (as determined from the disruption of the Walker A and B signature motifs of the P-loop domain) ATPases are also interspersed, indicating multiple independent ATPase inactivations (Fig. 2A; Table 2). Here, we collectively refer to all groups of the KaiC homologues with inactivated ATPase domains as iKaiC; clearly, despite the abrogation of the ATPase activity, iKaiC could perform other functions, as discussed below. 
Archaeal branch A9 consists of KaiC-like proteins that are well represented in both Euryarchaeota and the TACK (Thaumarchaeota, Aigarchaeota, Crenarchaeota, Korarchaeota) superphylum, and thus appear to be ancestral (Table 2). Although the support of this branch is not very strong (44), all of these proteins belong to the same cluster, arCOG01171, and have a single ATPase domain, so two independent approaches to sequence clustering give similar results. The same considerations apply to branch A3, which includes KaiC-like proteins with two active ATPase domains. The third branch (A17) that appears to be ancestral consists of FlaH proteins, essential archaellum components (36, 45). The remaining tree branches are either lineage specific or include only a few archaeal lineages (Table 2; Table S3). Thus, this analysis supports the previous conclusions that at least three KaiC families could be represented in the LACA (11). The multiple long branches and inactivation of the ATPase domain imply frequent subfunctionalization of the KaiC family proteins, especially in Thermococci and Thermoproteaceae and to a lesser extent in Aciduliprofundum and Archaeoglobi. This evolutionary trend resulted in the appearance of numerous subfamilies of highly diverged iKaiC proteins (Table S1).
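The neighborhood retrieval described above (five genes upstream and downstream of each kaiC-like gene) can be illustrated with a minimal sketch. The input layout, locus tags, and the function name `neighborhoods` are hypothetical and chosen only for illustration; the sketch also treats each replicon as linear and does not wrap around circular chromosomes, which is a simplifying assumption rather than a statement about the procedure actually used.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical input: genes of one replicon in chromosomal order,
# each given as a (locus_tag, family) pair. Tags and layout are illustrative.
Gene = Tuple[str, str]


def neighborhoods(replicon: List[Gene],
                  target_families: Set[str],
                  flank: int = 5) -> Dict[str, List[Gene]]:
    """For every gene assigned to a target family, return the genes located
    within `flank` positions on either side (the target gene included)."""
    result: Dict[str, List[Gene]] = {}
    for i, (tag, family) in enumerate(replicon):
        if family in target_families:
            start = max(0, i - flank)                 # clip at replicon start
            end = min(len(replicon), i + flank + 1)   # clip at replicon end
            result[tag] = replicon[start:end]
    return result


# Toy replicon of 12 genes; g07 is annotated as a COG0467 (KaiC-like) member.
toy: List[Gene] = [(f"g{i:02d}", "other") for i in range(1, 13)]
toy[6] = ("g07", "COG0467")

loci = neighborhoods(toy, {"COG0467"})
print([tag for tag, _ in loci["g07"]])
```

In this toy case, the window is clipped only at the downstream end, so the locus spans g02 through g12.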
FIG 2 

Phylogeny and conserved gene neighborhoods of the KaiC family. (A) The dendrogram reflecting the relationships between archaeal and bacterial representatives of the KaiC protein family was constructed as described in Materials and Methods. Major distinct branches are collapsed and shown as triangles numbered A1 to A28 for the archaeal branches and B1 to B6 for the bacterial branches. Bootstrap values calculated by the FastTree program are shown for several key nodes, and values for the major, well-supported branches are shown in red. Each sequence in the tree is described by the locus tag number and species name. Colors: green, bacterial genes; orange, archaeal genes. (B) For each branch, a conserved gene arrangement (if detected) is shown. Genes are shown as arrows. An arCOG number is shown for each gene. Functionally linked or homologous genes are represented as follows: KaiC-like genes, red; two-component signal transduction system genes, brown; type IV pili, dark blue; membrane transporters, angled grid; uncharacterized genes, white. Other domains are colored according to their descriptions provided above the domain icon. Abbreviations: V4R, V4R small-molecule-binding domain; FlhG, FlhG/MinD/FleN family ATPase, antiactivator of flagellar biosynthesis. For the complete tree, see Text S1.

TABLE 2

Descriptions of the major archaeal branches shown in Fig. 2

Branch | Phyletic distribution | Comment^a
A1 | Aciduliprofundum and several Methanomicrobia genomes | Mostly 2-domain ATPases; both domains are active
A2 | Methanocella only | 2-domain ATPase; second domain is inactivated and diverged
A3 | Patchy distribution in most archaeal lineages; three paralogs in Thermoproteales | 2 active ATPase domains; possibly an ancestral group
A4 | Few different archaea | Single active ATPase domain
A5 | Several methanomicrobia (A5a) and several Methanothermobacteriales (A5b) | 2 active ATPase domains, most closely related to bona fide cyanobacterial KaiC, likely lateral transfer from bacteria
A6 | Many euryarchaeal lineages but with patchy distribution | Single active ATPase domain; all belong to arCOG01173; ATPase is often fused to a large low-complexity N-terminal domain
A7 | Several euryarchaeal lineages but with patchy distribution | Single active ATPase domain
A8 | Patchy distribution in Halobacteria, Methanocella, and Nitrosopumilus, present in a small genome of archaeon_GW2011_AR10 | Single active ATPase domain
A9 | Most archaeal lineages, including Nanoarchaeota | Single active ATPase domain; possibly an ancestral group
A10 | Most euryarchaeal lineages; duplication in Thermococci | Single active ATPase domain
A11 | Most archaeal lineages | Both a single active ATPase domain and 2 active ATPase domains (fused); includes bacterial branch B5, all with 2 ATPase domains
A12 | Thermoproteales only, 2 paralogs | Single active ATPase domain
A13 | Methanomicrobia and Methanothermobacteriales, 2 paralogs in Methanosarcinales | Single active ATPase domain
A14 | Most crenarchaeal lineages and Korarchaeum | Single active ATPase domain
A15 | A few different archaea | Single active ATPase domain
A16 | Methanothermobacteriales | Single active ATPase domain
A17 | Most archaeal lineages | Single active ATPase domain; archaellum-associated protein FlaH; possibly an ancestral group
A18 | Thermoproteaceae only | Single active ATPase domain; arCOG05482 monophyletic
A19 | Several Halobacteria | Likely an active ATPase fused to metallochaperone-like domain (TRASH)
A20 | Patchy distribution in Archaeoglobi, Methanomicrobiales and Aciduliprofundum; present in most Halobacteria | Single inactivated ATPase domain; arCOG01172 monophyletic
A21 | Archaeoglobi only | Single inactivated ATPase domain
A22 | Patchy distribution in Methanomicrobiales and Aciduliprofundum | Single inactivated ATPase domain; most sequences belong to arCOG01175
A23 | Thermococci only | Single active ATPase domain
A24 | Archaeoglobi only | Single active ATPase domain; group with several bacteria
A25 | Several Halobacteria, all Methanosaeta and all Aciduliprofundum genomes | 2 ATPase domains; second domain is inactivated and diverged; in Halobacteria, it is GvpD, a component and regulator of a gas vesicle system
A26 | Thermoproteaceae only; 2 paralogs | Single active ATPase domain
A27 | Thermoproteaceae only | 2 ATPase domains; second domain is inactivated and diverged
A28 | Thermoproteaceae only | 2 ATPase domains; second domain is inactivated and diverged

^a ATPase domains are denoted active if they have intact Walker A and B motifs.
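The active/inactivated distinction used in Table 2 rests on whether the Walker A and B motifs are intact. A rough sketch of such a check is given below. The regular expressions encode commonly cited consensus forms (Walker A as [AG]-x(4)-G-K-[ST]; Walker B as four hydrophobic residues followed by D or E) and are assumptions made for illustration. The criteria actually applied in this analysis are not specified in this section and were presumably alignment based. Both sequences are invented toy strings, not real proteins.

```python
import re

# Simplified consensus patterns (assumptions for illustration only).
WALKER_A = re.compile(r"[AG].{4}GK[ST]")        # P-loop: [AG]-x(4)-G-K-[ST]
WALKER_B = re.compile(r"[AVILMFYW]{4}[DE]")     # hhhh[DE], h = hydrophobic


def classify_atpase_domain(seq: str) -> str:
    """Return 'active' if a Walker A match is followed downstream by a
    Walker B match, and 'inactivated' otherwise."""
    seq = seq.upper()
    a = WALKER_A.search(seq)
    # Walker B is searched only downstream of the Walker A match.
    b = WALKER_B.search(seq, a.end()) if a else None
    return "active" if (a and b) else "inactivated"


# Invented toy sequences: the second carries a K-to-A change in the P-loop.
TOY_ACTIVE = "MSTGIPGLDELLGGGLPRGSVTLISGPPGSGKTTFALQFLANIVLVVIDSLTAL"
TOY_INACTIVE = TOY_ACTIVE.replace("GSGKT", "GSGAT")

for name, seq in [("toy_active", TOY_ACTIVE), ("toy_inactive", TOY_INACTIVE)]:
    print(name, classify_atpase_domain(seq))
```

Because the Walker B pattern is short and matches readily by chance, a pattern scan of this kind can only flag candidates; positions would need to be confirmed against a multiple alignment of the domain.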

TABLE S3 KaiC tree branches. XLSX file, 0.1 MB.

Predicted interaction partners of KaiC proteins in Archaea.

Analysis of conserved gene neighborhoods (Fig. 2B) and domain fusions (Fig. 3) revealed a complex and diverse set of proteins and domains that can be predicted to interact with KaiC family members.
FIG 3 

KaiC protein fusions. Individual domains are shown as rectangles. KaiC-related domains are designated by either arCOG numbers or Pfam identifiers. Species names and the respective protein IDs are shown on the right. Homologous domains are color coded. Abbreviations: HAMP, PAS, REC, HisKA_7TM, and GAF, known domains shared with two-component signal transduction systems; HHH, triple-helix DNA-binding domain; TRASH, metal-binding domain predicted to be involved in heavy-metal sensing; ATPase_N, AAA ATPase N-terminal region; ATPase_C, AAA ATPase C-terminal region; TM, transmembrane segment; V4R, V4R small-molecule-binding domain.

The three most common contextual themes involving the KaiC family are (i) type IV pilus systems and other membrane-associated complexes such as the signal recognition particle (SRP) GTPase Ffh or a FlgN-like flagellar biosynthesis/secretory pathway chaperone (20), (ii) signal-transducing and sensory proteins that are typically associated with histidine kinases in bacteria, and (iii) membrane transporters (Fig. 2B). Specifically, we can confidently predict the interacting partners for two ancestral KaiC families, in addition to FlaH, for which such partners are already known. Branch A9 KaiC proteins are predicted to interact with uncharacterized proteins from arCOG00921 (COG1318, predicted DNA-binding transcriptional regulators of the GlpR family) (Fig. 2B). The data in Table S1 indicate that arCOG00921 proteins and proteins from the A9 KaiC branch are always present in the same genome and often adjacently, even in the smallest archaeal genome of Nanoarchaeum equitans (NEQ174 and NEQ534, respectively). The coincidence of retention suggests that both components are involved in the same important cellular process(es). A third component also could be linked to this system, namely, a protein of the poorly characterized DUF77 family (pfam01910/COG0011) that is present in most archaea (arCOG04373) and appears to descend from the LACA (11).
The structure of a protein from this family has been solved, revealing a ferredoxin fold, and it has been shown to form homotetramers and bind thiamine; in Thermotoga, the expression of the gene for this protein is upregulated under oxidizing conditions (46). Accordingly, it has been proposed that the protein is involved in an oxidative stress response mechanism (11). Additionally, arCOG007764 (a paralog of arCOG00921) is associated with KaiC-like ATPases of branch A24, whereas arCOG04373 is also associated with KaiC-like ATPases of branch A14, reinforcing the functional linkage of these three protein families (Fig. 2B). Proteins with two ATPase domains, which most closely resemble the bona fide cyanobacterial KaiC protein, are typically associated with a small protein, either KaiB (branches A5a and A5b) or a member of an uncharacterized protein family (e.g., arCOG07117, arCOG03757, arCOG03758, arCOG11224, and arCOG10037) in ancestral branch A3 (Fig. 2B). The structure of one protein of this family has been solved (PDB code 2p9x), revealing a four-helix bundle fold. Structural comparison by using VAST (47) shows that the best match for this protein is the eukaryotic DEATH domain (a domain named for death, meaning its involvement in apoptosis, also often referred to as DD) with a root mean square deviation of 0.97 Å from the DD of the human RAIDD (DD-containing protein; the abbreviation is complex and is explained in reference 48) protein (PDB code 2O71) (49, 50) (Fig. S1). The DDs and related α-helical adapter domains are key components of eukaryotic signal transduction pathways, particularly those involved in programmed cell death (apoptosis), where these domains mediate connections between different components through homotypic interactions (i.e., different DD-related adapter domains interact with one another) (49, 50). 
Exceptions to this association are the two-domain ATPases from the halobacteria-specific clade of branch A11 and from mostly methanomicrobial branch A1, for which no small partner protein encoded in the same locus could be identified. Bacterial branch B1 lies within an archaeal subtree that includes branches A1 to A4; several internal branches within this subtree are strongly supported (>90%) (Fig. 2A), suggesting horizontal transfer from archaea to bacteria. The majority of the kaiC genes associated with branch B1 are located next to genes related to two-component systems, suggesting that archaeal KaiC of branch A1 could interact with the analogous components encoded in other loci in the respective archaeal genomes. DD-like domains are specifically expanded in the class Thermococci and several members of the phylum Thaumarchaeota (Fig. S1B; Table S1). Some of them are fused to a diverged iKaiC domain, REC domain, or ferritin domain, further linking these proteins to KaiC. Moreover, genes encoding DD-like domain proteins are found in several conserved neighborhoods together with other uncharacterized genes, suggesting that additional components could be linked to the KaiC-based signal transduction network (Fig. S1B). Taking these observations into account, we predict that the DD-like domains also serve as modulators of the autophosphorylation activity of KaiC.
FIG S1 DD-like family in archaea. PDF file, 1.2 MB.
Single-domain KaiC-like ATPases are often encoded as doublets of paralogs, of which some are active and others are inactivated, suggesting that they might form heterodimers, recapitulating the organization of the two-domain KaiC-like ATPases (Fig. 2B). The fusions of KaiC with other domains are also informative, showing either the same trend as that observed for the conserved neighborhoods or suggesting the involvement of KaiC-like domains in more complex signal transduction pathways (Fig. 3).
Many of these fusions (e.g., to the TRASH [trafficking, resistance, and sensing of heavy metals] sensory domain, rubredoxins, and ferritins) point to an involvement in oxidative stress. Most often, we observe ferritin domains both fused to KaiC or DUF835 and found in the respective neighborhoods (Fig. 3; Table S2). Ferritins are iron-binding proteins whose role in the oxidative stress response is well established (51). The association with the SRP GTPase Ffh and the regulatory GTPase Srp102/FtsY suggests that KaiC-like proteins might regulate the targeting of nascent secreted or membrane proteins from the ribosome to the membrane through the SRP (44, 52). The iKaiC domains of the DUF835 family are often found in multidomain proteins (Fig. 3). Many of these contain sensory and signal-transducing domains that have been thoroughly studied in the context of bacterial two-component signal transduction systems (53, 54). This connection suggests that DUF835 proteins are involved in signal transduction pathways. Many proteins of this family are membrane associated, presumably interacting with other membrane proteins, some of which are fused to the DUF835 domain (e.g., the Na+/proline symporter-like domain) (Fig. 3; Tables S1 and S2). Fusions with other regulatory and signal transduction proteins, such as AAA ATPases containing tetratricopeptide repeats and cyclases, in particular (Fig. 3), suggest that KaiC family proteins are involved in highly complex pathways, which include cross-talk with other signal transduction systems. Finally, the previously described MEDS (methanogen/methylotroph DcmR sensory) domain (arCOG03567, pfam14417) shows a clear affinity for the KaiC family. 
PSI-BLAST searches initiated with any of the MEDS domain sequences against the arCOG database reveal significant sequence similarity of this domain with the members of KaiC-like arCOG01171 (E value of 4e-05 in the second iteration), although the MEDS domain is unlikely to be an active ATPase because of the lack of catalytic residues in the Walker A and B ATPase motifs. The MEDS domain has been described previously (55) both as a stand-alone domain, often encoded in the genomic neighborhoods of other components of signal transduction systems, and in a fusion with sensory histidine kinases along with other sensory domains (55). Here we also identified fusions of MEDS and DD-like domains (Fig. S1). Taken together, these observations suggest that the MEDS domain could be functionally similar to the DUF835 domain described above. Many domains and genes linked to iKaiC proteins remain uncharacterized. There is an expansion of two protein families in Halobacteria that are associated with iKaiC of arCOG02452. One of these has been discussed previously in the context of signal transduction systems and is called HalX (arCOG02601/pfam08663) (56). The HalX domain is often fused to the REC domain and is found in the context of genes associated with a two-component signal transduction system (Fig. 2; Tables S1 and S2). The second expanded domain has not been previously described. In halobacterial genomes, it is represented by multiple paralogs that belong to five arCOGs, arCOG08928, arCOG08103, arCOG08989, arCOG09008, and arCOG08980. Among these, only arCOG08928 is often located next to an iKaiC of arCOG02452, and a few arCOG08980 members are fused to iKaiC (Fig. 2 and 3). Both domains might function as input domains for the respective KaiC-like proteins.

Models of KaiC-based signal transduction systems.

The multiple lines of evidence discussed above indicate that the KaiC family is likely a major hub of a versatile and complex archaeal signaling network that so far has largely escaped attention. Nevertheless, the available experimental data on a halobacterial circadian clock (24, 57) and the recent progress in the study of the functions of FlaH in the archaellum (35, 36, 45) allow us to propose two models of the roles of KaiC-like proteins in signal transduction (Fig. 4). The first model is essentially identical to the circadian clock mechanism and postulates the formation of either a homohexameric ring of KaiC proteins containing two ATPase domains or heterohexamers of interacting KaiC proteins, each containing a single ATPase domain. Both domains can be active ATPases, or alternatively, one of the domains could be inactivated, such as one of the multiple DUF835 domains, which would pass the signal from an input domain to the active KaiC-like domain (Fig. 4). Each of the hexameric ATPase rings would interact with multiple partners and, as with other signal transduction systems, such partners can be roughly classified into input and output components (Fig. 4). In addition, the KaiC rings could interact with modulators of the ATPase activity, such as KaiB in the circadian clock, which might compete for binding with other output proteins.
FIG 4 

Models of protein complex architectures and putative functions of the components of KaiC-based signal transduction pathways in archaea. KaiC pathway protein components are shown as colored shapes. Below the scheme of predicted protein-protein interaction, selected input, modulator, and output components are listed inside the oval borders, which are colored according to the predicted functions of these components. Each protein family name is shown next to a circle of the same color used for this component in Fig. 2 and 3.

The second model postulates interaction of a single-domain KaiC-like ATPase homohexamer directly with an output domain, similar to the potential interaction between FlaH and FlaI in the archaellum (36, 45) (Fig. 4). Many of the predicted components remain uncharacterized, and thus no specific functions can be predicted for them at this time. Furthermore, the KaiC-centered signaling systems could be interconnected with other signal transduction pathways, in particular, with two-component systems, via shared domains of input proteins (Fig. 4), and with Ras-like GTPases, either directly through the interaction with Srp102/FtsY or through Roadblock family proteins as described for both bacteria and eukaryotes (58, 59). The mode of signaling apparently can be modified with relative ease. For example, a two-domain KaiC-like ATPase and a DD-like protein are encoded in the type IV pilus loci in Thermoproteales, whereas in Euryarchaeota, these loci contain a gene coding for a single-domain KaiC (arCOG01175) linked to a FlhG-like secretion chaperone. Thus, two distinct models could apply to the same process of regulation of type IV pilus (or archaellum) assembly in different archaea (Fig. 4). It can be predicted that many KaiC-like proteins lack autophosphorylation activity but could bind and/or hydrolyze ATP to transduce the signal.
Indeed, many of these proteins lack the pair of serine/threonine residues that are conserved among the bona fide KaiC proteins and are autophosphorylated in the circadian clock system (30). However, several archaeal KaiC subfamilies, especially the two-domain ATPases, retain this motif or at least one of the two hydroxy amino acids and could be active autokinases.

Implications for the archaeal circadian clock.

Among archaea, diurnal gene expression has been demonstrated only in Halobacteria (24, 57, 60). It has been shown that KaiC-like proteins undergo cyclic expression, and deletion of most of them affected the expression of the others, suggesting that Halobacteria indeed might have a bona fide KaiC-based circadian mechanism (38). Similarly to cyanobacteria, Halobacteria adjust their metabolism to light conditions via rhodopsin-based proton pumps that generate a proton gradient and sensory rhodopsins that control phototaxis (61). Halobacteria encode two-domain KaiC-like ATPases (both within branch A11 in Fig. 2A), which do not group with KaiC from cyanobacteria. Furthermore, neither KaiB nor KaiA nor any potential analogue of these KaiC interactors could be identified in the genomic neighborhoods of the halobacterial KaiC-like ATPases. Moreover, there was only a weak correlation, if any, between the presence of two-domain KaiC ATPases and rhodopsin-like proteins in halobacterial genomes (Table S1). Accordingly, the functions of these proteins in Halobacteria remain unclear. To the best of our knowledge, no evidence of a circadian clock in any other archaea has been reported and no rhodopsins have been identified. A putative minimal circadian clock system consisting of KaiC from branch A5 and KaiB is present in some methanogens (Fig. 2; Table S3). However, as is the case in Legionella, this system could be involved in regulatory pathways distinct from the circadian clock. Apart from Halobacteria, there seems to be no indication that archaea can sense light and modulate their metabolism accordingly. It thus appears unlikely that most archaea possess circadian clocks similar to those of photosynthetic bacteria.
However, if indications of clock mechanisms in archaea (other than Halobacteria) were found, the best candidates would be the systems containing the two-domain KaiC-like ATPases of branch A5, which is associated with KaiB, and those of branch A3, associated with a DEATH-like domain, a potential analogue of KaiB (Fig. 2 and 4).

Concluding remarks.

The striking proliferation and diversification of the KaiC-like ATPase family in archaea imply that these proteins comprise the core of diverse, unexplored, and apparently, archaea-specific signal transduction networks. These signal transduction systems are likely involved in the regulation of membrane-associated complexes and individual proteins, such as the archaellum, type IV pili, SRP, and membrane transporters. Additionally, the KaiC-centered signal transduction machinery can be predicted to regulate a response to oxidative stress. However, it appears unlikely that archaea, apart, maybe, from Halobacteria, possess cyanobacterial-type, KaiC-centered circadian clocks. The KaiC-based signaling mechanisms appear to be ancestral in Archaea, with at least three KaiC paralogs projected to the LACA. One ancestral KaiC subfamily that includes a protein containing an HTH domain (arCOG00921) might be involved in as-yet-uncharacterized global response pathways because it is encoded even in the minimal genome of the Nanoarchaeota. The predicted KaiC-based signal transduction system appears to be interconnected with two-component signal transduction systems through iKaiC of the DUF835 family and MEDS domains that are predicted to interact with active KaiC ATPases. In contrast, we could not identify any connections between the KaiC-centered network and genes involved in the S/T kinase pathway. Additionally, inspection of the available data on archaeal phosphoproteomes yielded no indications of extensive phosphorylation of KaiC pathway-related genes (62–64). Thus, the KaiC network appears to be largely disjointed from the S/T kinase-mediated regulatory pathways in Archaea. The phylogenomic analysis reported here can produce only crude models of archaeal signal transduction. Nevertheless, these observations expose multiple experimental directions that can be expected to shed light on key aspects of archaeal cell biology.

MATERIALS AND METHODS

Archaeal and bacterial complete genome sequences were downloaded from the NCBI FTP site (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/) in March 2016. Altogether, the database includes 4,961 completely sequenced and assembled genomes. These genomes were assigned to COGs and Pfam families by using the PSI-BLAST program with an E-value cutoff of 1e-4 and low-complexity filtering turned off against a collection of multiple sequence alignments (profiles) from the CDD database (65) derived from COGs, Pfam, and CDD itself. The same approach was used to assign archaeal proteins to arCOGs as described previously (66). All proteins that were assigned to any of the three groups (COG0467/pfam06745, COG2874, pfam05763) or to arCOGs associated with the KaiC family were retrieved. Genomic loci containing five genes upstream and downstream of all kaiC-like genes were extracted for neighborhood analysis. KaiC-like sequences were clustered by using BLASTCLUST (ftp://ftp.ncbi.nih.gov/blast/documents/blastclust.html) with a length coverage of 90% and a sequence identity threshold of 90% to obtain a nonredundant set of sequences. Among those, readily alignable groups of predominantly active ATPase sequences were selected for phylogenetic analysis (several inactivated ATPases aligned poorly and were not included in this analysis; also, protein fragments in the nonredundant set were discarded). The final set used for tree reconstruction included 1,011 sequences. Tree reconstruction was performed by two approaches, (i) a combination of FastTree and UPGMA for full-length sequences and (ii) the default FastTree method for the N-terminal ATPase domain only. For the first approach, initial sequence clusters were obtained by using UCLUST (67) with a sequence similarity threshold of 0.5; the sequences were aligned within clusters by using MUSCLE (68). Cluster-to-cluster similarity scores were then obtained by using HHsearch (69) (including trivial clusters consisting of a single sequence each). 
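The neighborhood extraction step mentioned above (five genes upstream and downstream of each kaiC-like gene) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `extract_neighborhoods`, the `(replicon_id, start, gene_id)` input format, and the treatment of every replicon as linear (no wrap-around on circular chromosomes) are assumptions made here for illustration only.

```python
from collections import defaultdict


def extract_neighborhoods(genes, targets, flank=5):
    """Return the genes within `flank` positions of each target gene.

    genes   -- iterable of (replicon_id, start, gene_id) tuples (hypothetical format)
    targets -- set of gene_ids of interest, e.g. kaiC-like genes
    flank   -- number of genes to take on each side (5 in the text)

    Assumption: replicons are treated as linear, so windows are truncated
    at replicon ends rather than wrapped around.
    """
    by_replicon = defaultdict(list)
    for replicon, start, gene_id in genes:
        by_replicon[replicon].append((start, gene_id))

    neighborhoods = {}
    for entries in by_replicon.values():
        entries.sort()  # order genes by start coordinate
        ordered = [gene_id for _, gene_id in entries]
        for i, gene_id in enumerate(ordered):
            if gene_id in targets:
                lo = max(0, i - flank)
                neighborhoods[gene_id] = ordered[lo:i + flank + 1]
    return neighborhoods


# Toy example: 20 genes on one replicon plus one gene on a plasmid.
toy_genes = [("chr1", i * 1000, f"g{i}") for i in range(20)]
toy_genes.append(("plasmid", 500, "p0"))
hoods = extract_neighborhoods(toy_genes, {"g0", "g10"})
print(len(hoods["g10"]), len(hoods["g0"]))
```

A gene in the middle of a replicon yields a window of 11 genes (the target plus five on each side), whereas a gene at a replicon end yields a shorter window under the linear-replicon assumption.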
A UPGMA dendrogram was constructed from the pairwise similarity scores. Highly similar clusters (pairwise-score to self-score ratio, >0.1) were aligned with each other by using HHALIGN (69). This procedure was repeated iteratively. At the last step, sequence-based trees were reconstructed from the cluster alignments by using FastTree (70) as described below and rooted by midpoint; these trees were grafted onto the tips of the profile similarity-based UPGMA dendrogram. Sites with gap character fraction values of >0.5 and homogeneity values of <0.1 were removed from the alignment (71). In both cases, the FastTree program (70) was executed with the WAG evolutionary model and the discrete gamma model with 20 rate categories. The same program was used to compute SH (Shimodaira-Hasegawa)-like node support values. To identify remote sequence similarity, HHpred with default parameters (69) and CD search (72) with an E value cutoff of 10 and composition-based statistics adjustment turned off were used. In addition, web-based, manually curated PSI-BLAST searches were run with and without the composition-based statistics adjustment and with low-complexity filtering turned off. Inclusion E-value thresholds of 0.1 to 1e-8, depending on sequence length and content, were used, and some searches were run against the archaeal subset of the NCBI nonredundant protein database. The VAST program (47) was used with default parameters for structural comparison.
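As an illustration of the alignment filtering step described above, the following Python sketch removes alignment columns whose gap fraction exceeds 0.5. It is a simplified stand-in, not the procedure of reference 71: the homogeneity criterion (<0.1) is deliberately omitted because its definition is given only in that reference, and the function name `drop_gappy_columns` and the choice of `-` and `.` as gap characters are assumptions.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.5, gap_chars="-."):
    """Remove columns whose gap fraction is greater than max_gap_fraction.

    alignment -- list of equal-length aligned sequences (strings)

    Note: only the gap-fraction criterion is applied here; the homogeneity
    criterion mentioned in the text is not implemented in this sketch.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(alignment)
    keep = [
        col
        for col in range(length)
        if sum(row[col] in gap_chars for row in alignment) / n_seqs <= max_gap_fraction
    ]
    return ["".join(row[col] for col in keep) for row in alignment]


# Toy alignment: column 2 is 75% gaps (dropped), column 3 is 50% gaps (kept).
toy = ["A--G", "A--G", "A-TG", "ACTG"]
print(drop_gappy_columns(toy))
```

With the threshold stated in the text, a column is removed only when strictly more than half of its characters are gaps, so a column with exactly 50% gaps is retained.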