
Stealth proteins: in silico identification of a novel protein family rendering bacterial pathogens invisible to host immune defense.

Peter Sperisen, Christoph D Schmid, Philipp Bucher, Olav Zilian.

Abstract

There are a variety of bacterial defense strategies to survive in a hostile environment. Generation of extracellular polysaccharides has proved to be a simple but effective strategy against the host's innate immune system. A comparative genomics approach led us to identify a new protein family termed Stealth, most likely involved in the synthesis of extracellular polysaccharides. This protein family is characterized by a series of domains conserved across phylogeny from bacteria to eukaryotes. In bacteria, Stealth (previously characterized as SacB, XcbA, or WefC) is encoded by subsets of strains mainly colonizing multicellular organisms, with evidence for a protective effect against the host innate immune defense. More specifically, integrating all the available information about Stealth proteins in bacteria, we propose that Stealth is a D-hexose-1-phosphoryl transferase involved in the synthesis of polysaccharides. In the animal kingdom, Stealth is strongly conserved across evolution from social amoebas to simple and complex multicellular organisms, such as Dictyostelium discoideum, hydra, and human. Based on the occurrence of Stealth in most Eukaryotes and a subset of Prokaryotes together with its potential role in extracellular polysaccharide synthesis, we propose that metazoan Stealth functions to regulate the innate immune system. Moreover, there is good reason to speculate that the acquisition and spread of Stealth could be responsible for future epidemic outbreaks of infectious diseases caused by a large variety of eubacterial pathogens. Our in silico identification of a homologous protein in the human host will help to elucidate the causes of Stealth-dependent virulence. At a more basic level, the characterization of the molecular and cellular function of Stealth proteins may shed light on fundamental mechanisms of innate immune defense against microbial invasion.

Year:  2005        PMID: 16299590      PMCID: PMC1285062          DOI: 10.1371/journal.pcbi.0010063

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


Introduction

Colonization of hosts by microorganisms is a complex process that determines if the microorganism will coexist with the host as commensal, become an invasive pathogen, or be efficiently eliminated by the host's immune defense [1,2]. Consequently, microorganisms have developed a variety of measures to cope with the increasingly sophisticated defense strategies of the host's immune system [3-7]. Amongst them, the generation of an extracellular coat made of polysaccharides has proved to be a simple but effective strategy. Bacterial surface polysaccharides can be either amorphous exopolysaccharides, anchored in the lipid layer (lipopolysaccharides, another known regulator of the immune system), or organized as a capsule (capsule polysaccharides [CPSs]). The latter have been shown to mediate adherence to cells and, more importantly, protection against the host's innate immune system [8-11]. Different strategies to escape host immune surveillance have evolved through vertical evolution but also through horizontal gene transfer [12-15]. Though a subject of long-standing controversy, there is increasing evidence suggesting that horizontal gene transfer also occurs from eukaryotes to prokaryotes [16]. Even though the recombined bacteria seemed to have preferentially retained individual domains of proteins [16], a first example was recently reported in which certain bacterial strains kept an entire open reading frame [17]. Here we describe a novel protein family named “Stealth.” Based on a comparative genomics approach, we propose a biological function and an evolutionary scenario for this new protein family.

Results/Discussion

Identification of Stealth

In a screen of the human genome for Notch-related proteins, a novel protein containing two copies of Lin-12/Notch repeats was identified. The protein also showed strong sequence similarity to a number of animal and bacterial proteins, including several virulence factors of human pathogens published under different names. This previously unknown protein family was named “Stealth” because experimentally characterized members of this family appear to render bacterial and protozoan invaders invisible to the host's immune surveillance system. Stealth proteins are characterized by four conserved regions (CRs) referred to as CR1 to CR4 (Figure 1). The N-terminal CR1 consists of a short but strongly conserved sequence motif, IDVVYTF or very similar. The second region, CR2, is approximately 100 residues long and constitutes the most conserved part of this protein family. A standard BLAST search [18] with any CR2 domain identifies all other members of the Stealth family in the current database with highly significant E-values. CR3 is about 50 residues long but less well conserved. Finally, the C-terminal CR4 includes an almost universally conserved tetrapeptide, CLND or CIND. Adjacent to and between these domains are divergent sequence regions of variable length that may contain additional domains (Figures 1 and 2A).
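
The two short motifs named above lend themselves to a simple pattern scan. The following is a minimal sketch, not the method used in this study (which relied on BLAST searches and multiple alignments): it looks for exact matches of the CR1 heptapeptide IDVVYTF and of the CR4 tetrapeptide CLND or CIND in a protein sequence. The function name `find_stealth_motifs` and the toy sequence are hypothetical, and exact matching will miss the "very similar" CR1 variants mentioned in the text.

```python
import re

# Patterns taken from the text: CR1 core motif and the CR4 tetrapeptide.
CR1_PATTERN = re.compile(r"IDVVYTF")
CR4_PATTERN = re.compile(r"C[LI]ND")


def find_stealth_motifs(protein_seq):
    """Return 0-based start positions of exact CR1 and CR4 motif matches."""
    seq = protein_seq.upper()
    return {
        "CR1": [m.start() for m in CR1_PATTERN.finditer(seq)],
        "CR4": [m.start() for m in CR4_PATTERN.finditer(seq)],
    }


# Made-up toy sequence for illustration only; not a real Stealth protein.
toy_seq = "MKKIDVVYTFAAGGSSCLNDKK"
hits = find_stealth_motifs(toy_seq)
print(hits)
```

A hit for both motifs in the expected N-terminal to C-terminal order would only be a first hint; the CR4 tetrapeptide in particular is short enough to occur by chance in unrelated proteins.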
Figure 1

Multiple Alignments of CRs

Multiple alignments of the four CRs for a representative set of protein sequences (>15% dissimilarity over all four CRs) are shown. Sequences are identified by a species code (see Table 1), protein name (from literature as proposed in this paper), and database accession number, where available. The lengths of the sequences omitted between or within CRs are indicated in square brackets. The last row shows the secondary structure prediction obtained by jnetpred [65] for the human Stealth protein, where H stands for helices and E for beta-sheets. The color scheme used is the ClustalX default scheme, with the colors for conserved amino acids being more intense than those for nonconserved ones.

Figure 2

Domain Architecture and Genome Structure

(A) CR1 to CR4, found through multiple alignments, are represented by rectangles ranging from light blue (CR1) to dark blue (CR4). Other motifs are represented as follows: predicted signal peptides as magenta rectangles, transmembrane regions as orange rectangles, Lin-12/Notch repeats as red pentagons, and EF-hands as green circles.

(B) The genome structure of the human and fly Stealth homologs is represented, with the exons depicted as green rectangles separated by introns of indicated size.

(C) Two splice variants lead to different N-terminal sequences, as supported by mouse EST sequences. Splicing reconstructs a codon for tyrosine (Y). Both proteins contain a predicted signal peptide.

Table 1

Summary of All Species Containing Stealth Proteins


Taxonomic Distribution

Stealth proteins are encoded in the genomes of chordates, echinoderms, hydras, fungi, and flies but appear to be absent from nematodes and plants. Interestingly, a few organisms contain multiple Stealth genes (Table 1). Stealth proteins also occur in the protist genomes of Dictyostelium, Giardia, Leishmania, Entamoeba, and Phytophthora, and among the hitherto sequenced bacteria, they are found in the following phyla: alpha-, beta-, and gamma-proteobacteria (mostly pathogens), firmicutes (mostly commensals), and actinobacteria (some animal pathogens) (Table 1; Figure S1). It is noteworthy that the large majority of completely sequenced bacterial genomes do not harbor Stealth. The species that do contain a member of this family are not necessarily closely related, and include Gram-positive as well as Gram-negative bacteria.

Stealth in Bacteria

Several of the documented bacterial Stealth genes belong to capsule group II biosynthesis operons generating carbohydrate-phosphodiester-containing CPSs [19-24]. In the case of Stealth-expressing bacteria, these CPSs turned out to inhibit complement-mediated lysis, as shown for serogroups A and X of Neisseria meningitidis [23,24], and to correlate with serum and phagocyte survival abilities, as shown for Aeromonas hydrophila [25]. The majority of Stealth-expressing bacteria that have been analyzed so far for the composition of their exopolysaccharides turned out to build phosphoglycans consisting of phosphodiester-linked hexose mono- or disaccharide building blocks [26-29]. On the other hand, certain bacteria living in a biofilm community contain CPSs consisting of phosphodiester-linked hexa- or heptasaccharide repeating units [30,31]. These carbohydrates, also called receptor polysaccharides, are synthesized by a series of different glycosyltransferases, with Stealth amongst them [22]. Strains encoding Stealth carry a hexose phosphodiester linker [31] in their receptor polysaccharides, whereas strains lacking Stealth build receptor polysaccharides with a pentose phosphodiester linker. Definite proof for an essential function of Stealth in CPS biosynthesis was provided in N. meningitidis serogroup A by selective deletion of the gene sacB (i.e., Stealth), giving rise to virtually unencapsulated mutants [23], and by deletion of part of the gene xcbA (i.e., Stealth), together with flanking open reading frames in a serogroup X strain, which resulted in complement-sensitive mutants [24]. Moreover, when the gene cps1A (i.e., Stealth) was deleted in Actinobacillus pleuropneumoniae, the resulting strains lost their pathogenicity in pigs [20]. Taken together, all of the above data suggest that Stealth is a D-hexose-1-phosphoryl transferase that generates interglycosidic phosphate diester linkages.

Characteristics of Metazoan Stealth

Unlike the bacterial Stealth proteins, the vertebrate members of this family are not properly represented in current protein databases. We have manually reconstructed the gene and protein sequences for a number of species with the aid of EST sequences and cross-genome comparisons (Table 1). The human gene consists of 21 exons (Figure 2B), and the translated protein sequence is identical to the RefSeq entry NP_077288. The intron–exon structures of genes found in other vertebrates are essentially the same. In the mouse, however, there is a facultative intron near the start codon spliced out predominantly in transcripts from dendritic cells. This alternative splicing leads to two protein variants with different N-termini (Figure 2C). The hypothetical Drosophila melanogaster and D. yakuba Stealth genes, however, have a completely different intron–exon structure (Figure 2B). Finally, pieces of Stealth-encoding sequences were also found in the preliminary genomes or ESTs of other mammals (Table 1). Metazoan Stealth proteins are characterized by additional domains. There is a predicted signal peptide and, near the C-terminus, a transmembrane helix. One or two Notch/Lin-12 repeats [32] are inserted between CR2 and CR3, and an EF-hand domain [33] appears between CR3 and CR4. So far, all reconstructed Stealth proteins contain these domains, and in some of the cases where only pieces of sequences are available one can identify these motifs. The strong conservation of the Stealth domain architecture suggests that this protein plays an essential role. No experimental knowledge is available about the function of metazoan Stealth proteins today (note, however, that Stealth-deficient mice have been generated by O. Z. and coworkers and will be made available upon request). 
In view of the high degree of sequence similarity to their bacterial homologs, it is reasonable to speculate that they have a similar molecular function and thus are also implicated in exopolysaccharide synthesis. Public expression profiles derived from SAGE experiments indicate a rather broad tissue distribution. The Stealth-dependent polysaccharides could be host-specific structural surface elements exploited by the immune system for self-recognition. In this case, the Stealth-dependent resistance of human pathogens to complement-mediated lysis and other host defense mechanisms would be a straightforward case of molecular mimicry. Alternatively, host-encoded Stealth proteins may play an active role in down-regulating the immune response. The presence of Stealth in both insects and urochordates further suggests that this protein interferes with processes related to innate rather than adaptive immunity [34,35].

Stealth and Protists

Although higher eukaryotes have not yet been investigated for the presence of phosphoglycan structures similar to the CPSs, such structures have been identified in D. discoideum and in Leishmania species. In D. discoideum such polysaccharides were found on lysosomal cysteine proteinases and spore coat proteins [36,37]. The lysosomal enzymes of D. discoideum have two types of carbohydrate modifications [38,39] found in two separate sets of lysosomal vesicles [40,41]. The major component of Leishmania lipophosphoglycan is a heteropolymer of 10–40 phosphodiester-linked disaccharide units, depending on species and developmental stage [42]. Lipophosphoglycan is predominantly expressed by promastigotes, is essential for intracellular survival in macrophages and for the virulence of Leishmania major and L. donovani, and disappears when the pathogen intracellularly differentiates into amastigotes within host phagolysosomes [43-47]. The genes encoding these hexose-phosphoryl transferases have been identified neither in D. discoideum nor in Leishmania. Given, however, Stealth's presumed enzymatic activity and the comparative biochemical characterization of this transferase activity from three different Leishmania species using synthetic acceptor substrate analogs [48], the two Stealth proteins found in Leishmania and those found in D. discoideum are good candidates for this function.

Evolution of Stealth

The peculiar taxonomic distribution of Stealth (Figure 3) could be the outcome of two different evolutionary scenarios: (i) differential loss of an ancient protein already present in an ancestral form of life, or (ii) horizontal gene transfer between eukaryotes and eubacteria. The second hypothesis appears to be the more plausible, but the direction of the transfer is more difficult to assess. Overall, the protein tree largely follows species phylogeny, at least with regard to the higher level taxonomic groups. This indicates that transfer between eukaryotes and prokaryotes must have been an ancient event. However, several observations suggest that Stealth proteins continue to be horizontally transferred within and between certain bacterial groups. In Gram-negative bacteria, Stealth is inserted into group II capsule operons, which exhibit strong sequence similarity across many species, thus facilitating horizontal gene transfer via homologous recombination [49,50]. Moreover, certain Stealth genes have significantly lower G+C content than the remaining part of the genome [19,21,24,51], which is indicative of a recent acquisition from another species, and some of these genes are flanked by recombination-promoting IS insertion elements or residual fragments thereof [21,24].
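
The G+C argument above rests on a simple base-composition comparison. The following is a minimal sketch under stated assumptions, not the analysis performed in the cited studies: it computes the G+C fraction of a gene and of its host genome and returns the difference. The function names and the toy sequences are hypothetical, and a real analysis would judge the difference against the spread of G+C values across all genes of the genome rather than read a raw difference in isolation.

```python
def gc_content(dna_seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = dna_seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return gc / total


def gc_difference(gene_seq, genome_seq):
    """G+C fraction of the gene minus that of the genome."""
    return gc_content(gene_seq) - gc_content(genome_seq)


# Made-up toy sequences for illustration only.
toy_gene = "ATATATGCAT"      # 2 of 10 bases are G or C
toy_genome = "GCGCATGCGCAT"  # 8 of 12 bases are G or C
diff = gc_difference(toy_gene, toy_genome)
print(f"gene {gc_content(toy_gene):.2f}, genome {gc_content(toy_genome):.2f}")
```

A clearly negative difference corresponds to the situation described in the text, in which a gene is poorer in G+C than the genome that carries it.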
Figure 3

Phylogenetic Tree

Trees were calculated from amino acid sequence alignments of the four CRs. As in Figure 1, sequences are identified by a species code (see Table 1), protein name (from literature as proposed in this paper), and database accession number, and are color-coded. Dissimilarities are represented by the length of the branches (all with posterior probabilities above 0.95).

Materials and Methods

Sequence analysis.

Multiple amino acid sequence alignments of the four CRs were generated using T-Coffee [52]. The signal peptides were predicted with SignalP v2.0 using the combined NN/HMM-based method [53,54], the transmembrane predictions were made using TMHMM v2.0 [55,56], and the Lin-12/Notch repeats were identified using the profile PS50258 in PROSITE [57]. The EF-hand domains were detected using the Pfam HMM PF00036 [58]. The human and the fly gene structures were constructed with the aid of the trome database [59-61].

Sequence database searches.

Other members of the Stealth protein family were identified by searching with either the human or the Streptomyces coelicolor CR2 using BLAST [18] on either nucleic acid or protein databases.

Calculation of sequence trees.

For each CR a separate multiple amino acid sequence alignment was generated. These multiple alignments were concatenated, resulting in a multiple alignment that represents the four CRs. CRs that are absent in certain species are represented as gaps in the multiple alignment. Processed alignments were used to derive tree topologies using Bayesian inference of phylogeny as implemented by MrBayes v3.0 [62,63]. MrBayes was used with four heated chains over 200,000 generations, sampling every 20th generation. The likelihoods of these trees were examined to estimate the length of the burn-in phase, and all trees sampled 20,000 generations later than this point were used to create a consensus tree using the 50% majority rule. MrBayes was used with the mixed model of amino acid substitution, assuming the presence of invariant sites and using a gamma distribution approximated by four different rate categories to model rate variation between sites, estimating amino acid frequencies from the alignment. The consensus tree was displayed using DRAWGRAM of the PHYLIP package [64].
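
The concatenation step described above can be sketched as follows. This is an illustrative sketch only, not the authors' script: per-region alignments are joined column-wise, and a species that lacks a region receives a run of gap characters of that region's width. The function name `concatenate_alignments`, the dictionary layout, and the toy rows are assumptions made for this example.

```python
def concatenate_alignments(region_alignments, species):
    """Join per-region alignments column-wise, gap-filling absent regions.

    region_alignments: list of dicts, one per CR, mapping species -> aligned row.
    species: ordered list of species identifiers to include.
    """
    concatenated = {name: "" for name in species}
    for alignment in region_alignments:
        widths = {len(row) for row in alignment.values()}
        if len(widths) != 1:
            raise ValueError("rows within one region must have equal length")
        width = widths.pop()
        for name in species:
            # A species lacking this CR contributes a run of gap characters.
            concatenated[name] += alignment.get(name, "-" * width)
    return concatenated


# Made-up toy rows for illustration only; "sp2" lacks the second region.
toy_regions = [
    {"sp1": "IDVVYTF", "sp2": "IDVVYSF"},
    {"sp1": "CLND"},
]
merged = concatenate_alignments(toy_regions, ["sp1", "sp2"])
print(merged)
```

Every row of the result has the same length, which is what a tree-building program expects of its input alignment.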

Taxonomic Distribution of Stealth in Bacteria


References (10 of 63 shown)

1.  Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes.

Authors:  A Krogh; B Larsson; G von Heijne; E L Sonnhammer
Journal:  J Mol Biol       Date:  2001-01-19       Impact factor: 5.469

Review 2.  Breaching the mucosal barrier by stealth: an emerging pathogenic mechanism for enteroadherent bacterial pathogens.

Authors:  J M Fleckenstein; D J Kopecko
Journal:  J Clin Invest       Date:  2001-01       Impact factor: 14.808

3.  Multiple independent horizontal transfers of informational genes from bacteria to plasmids and phages: implications for the origin of bacterial replication machinery.

Authors:  D Moreira
Journal:  Mol Microbiol       Date:  2000-01       Impact factor: 3.501

4.  Impaired recruitment of the small GTPase rab7 correlates with the inhibition of phagosome maturation by Leishmania donovani promastigotes.

Authors:  S Scianimanico; M Desrosiers; J F Dermine; S Méresse; A Descoteaux; M Desjardins
Journal:  Cell Microbiol       Date:  1999-07       Impact factor: 3.715

5.  Evaluation and improvement of multiple sequence methods for protein secondary structure prediction.

Authors:  J A Cuff; G J Barton
Journal:  Proteins       Date:  1999-03-01

6.  T-Coffee: A novel method for fast and accurate multiple sequence alignment.

Authors:  C Notredame; D G Higgins; J Heringa
Journal:  J Mol Biol       Date:  2000-09-08       Impact factor: 5.469

7.  Complete genome sequence of Neisseria meningitidis serogroup B strain MC58.

Authors:  H Tettelin; N J Saunders; J Heidelberg; A C Jeffries; K E Nelson; J A Eisen; K A Ketchum; D W Hood; J F Peden; R J Dodson; W C Nelson; M L Gwinn; R DeBoy; J D Peterson; E K Hickey; D H Haft; S L Salzberg; O White; R D Fleischmann; B A Dougherty; T Mason; A Ciecko; D S Parksey; E Blair; H Cittone; E B Clark; M D Cotton; T R Utterback; H Khouri; H Qin; J Vamathevan; J Gill; V Scarlato; V Masignani; M Pizza; G Grandi; L Sun; H O Smith; C M Fraser; E R Moxon; R Rappuoli; J C Venter
Journal:  Science       Date:  2000-03-10       Impact factor: 47.728

8.  Characterization of the elongating alpha-D-mannosyl phosphate transferase from three species of Leishmania using synthetic acceptor substrate analogues.

Authors:  F H Routier; A P Higson; I A Ivanova; A J Ross; Y E Tsvetkov; D V Yashunsky; P A Bates; A V Nikolaev; M A Ferguson
Journal:  Biochemistry       Date:  2000-07-11       Impact factor: 3.162

9.  The complete cps gene cluster from Streptococcus thermophilus NCFB 2393 involved in the biosynthesis of a new exopolysaccharide.

Authors:  E Almirón-Roig; F Mulholland; M J Gasson; A M Griffin
Journal:  Microbiology       Date:  2000-11       Impact factor: 2.777

10.  Multiple O-glycoforms on the spore coat protein SP96 in Dictyostelium discoideum. Fuc(alpha1-3)GlcNAc-alpha-1-P-Ser is the major modification.

Authors:  M Mreyen; A Champion; S Srinivasan; P Karuso; K L Williams; N H Packer
Journal:  J Biol Chem       Date:  2000-04-21       Impact factor: 5.157
