Peter Evseev, Mikhail Shneider, Konstantin Miroshnikov.
Abstract
Sheath proteins are part of the contractile molecular machinery present in bacteriophages with myoviral morphology, contractile injection systems, and the type VI secretion system (T6SS) found in many Gram-negative bacteria. Previous research has demonstrated that sheath proteins share common structural features, even though they vary in size and primary sequence. In this study, 112 contractile phage tail sheath proteins (TShPs) representing different groups of bacteriophages and archaeal viruses with myoviral morphology were modelled with the machine learning software AlphaFold 2. The resulting structures were analysed, and conserved and variable protein parts and domains were identified. The common core domain of all studied sheath proteins, including viral and T6SS proteins, comprised both N-terminal and C-terminal parts, whereas the other parts consisted of one or several moderately conserved domains, presumably added during phage evolution. The conserved core appears to be responsible for interaction with the tail tube protein and assembly of the phage tail. Additional domains may have evolved to maintain the stability of the virion or for adsorption to the host cell. Evolutionary relations between TShPs representing distinct viral groups have been proposed using a phylogenetic analysis based on overall structural similarity and other analyses.
Keywords: phage tail assembly; sheath protein; tail contraction
Year: 2022 PMID: 35746620 PMCID: PMC9230969 DOI: 10.3390/v14061148
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.818
List of experimentally determined structures for sheath proteins and related structures.
| PDB Code | Description | Organism | Resolution | Method | References |
|---|---|---|---|---|---|
| 3FO8 | Crystal structure of the bacteriophage T4 tail sheath protein, protease-resistant fragment gp18PR | | 1.8 Å | X-ray diffraction | |
| 3FOA | Crystal structure of the bacteriophage T4 tail sheath protein, deletion mutant gp18M | | 3.5 Å | X-ray diffraction | |
| 3HXL | Crystal structure of the sheath tail protein (DSY3957) from Desulfitobacterium hafniense | | 1.90 Å | X-ray diffraction | |
| 3LML | Crystal structure of the sheath tail protein Lin1278 from Listeria innocua | | 3.3 Å | X-ray diffraction | |
| 3SPE | Crystal structure of the tail sheath protein protease-resistant fragment from bacteriophage phiKZ | | 2.4 Å | X-ray diffraction | |
| 5LI4 | Bacteriophage phi812K1-420 tail sheath protein after contraction | | 4.2 Å | Electron microscopy | |
| 6GKW | Crystal structure of the R-type bacteriocin (diffocin) sheath protein CD1363 from Clostridium difficile in the pre-assembled state | | 1.9 Å | X-ray diffraction | |
| 6PYT | CryoEM structure of precontracted pyocin R2 trunk from Pseudomonas aeruginosa | | 2.9 Å | Electron microscopy | |
| 3J9O | CryoEM structure of a type VI secretion system from Francisella tularensis subsp. novicida | | 3.70 Å | Electron microscopy | |
| 5N8N | CryoEM structure of contracted sheath of a Pseudomonas aeruginosa type VI secretion system | | 3.28 Å | Electron microscopy | |
| 3J9G | Atomic model of the VipA/VipB, the type VI secretion system contractile sheath of Vibrio cholerae | | 3.5 Å | Electron microscopy | |
| 6RAO | Cryo-EM structure of the anti-feeding prophage (AFP) baseplate for Serratia entomophila | | 3.1 Å | Electron microscopy | |
| 6J0B | Cryo-EM structure of an extracellular contractile injection system (CIS), PVC sheath-tube complex in extended state from Photorhabdus asymbiotica | | 2.9 Å | Electron microscopy | |
| 7AE0 | Cryo-EM structure of an extracellular contractile injection system from the marine bacterium | | 2.4 Å | Electron microscopy | |
| 7B5I | Cryo-EM structure of the contractile injection system cap complex from | | 2.8 Å | Electron microscopy | |
Figure 1. RCSB Protein Data Bank structures depicted with PyMOL. (a) 3FOA, crystal structure of the bacteriophage T4 tail sheath protein, deletion mutant gp18M; 3HXL, crystal structure of the sheath tail protein (DSY3957) from Desulfitobacterium hafniense; 3LML, crystal structure of the sheath tail protein Lin1278 from Listeria innocua; 5LI4, bacteriophage phi812K1-420 tail sheath protein after contraction; 6GKW, crystal structure of the R-type bacteriocin sheath protein CD1363 from Clostridium difficile in the pre-assembled state; 6PYT, cryoEM structure of precontracted pyocin R2 trunk from Pseudomonas aeruginosa. (b) 3J9G, sheath protein (VipB) from the type VI secretion system of Vibrio cholerae; 3J9O, sheath protein (IglB) from the type VI secretion system of Francisella tularensis subsp. novicida; 5N8N, sheath protein (TssC) from the type VI secretion system of Pseudomonas aeruginosa; 6RAO_E, 6RBN_C, 6RBN_D, three sheath proteins of the anti-feeding prophage (AFP) of Serratia entomophila. The models are coloured using a rainbow gradient scheme, where the N-terminus of the polypeptide chain is coloured blue and the C-terminus is coloured red.
Figure 2. (a) Visualisation of the structural alignment made with mTM-align for fifteen experimentally determined sheath proteins (deletion mutant of Escherichia phage T4 TShP; Desulfitobacterium hafniense prophage TShP; Listeria innocua prophage TShP; Staphylococcus phage 812 TShP; R-type bacteriocin sheath protein from Peptoclostridium difficile; pyocin R2 sheath protein from Pseudomonas aeruginosa; sheath proteins of the type VI secretion system from Francisella tularensis subsp. novicida, Pseudomonas aeruginosa, and Vibrio cholerae; anti-feeding prophage sheaths from Serratia entomophila and Photorhabdus asymbiotica). The proteins are depicted as ribbons. The parts with a maximum pairwise residue distance of less than 4 Å are coloured magenta. (b) Visualisation of the structural alignment of the fifteen modelled sheath proteins obtained by the translation of genes encoding the proteins used for the experimentally determined structures listed in Figure 2a. (c) Structural alignment of the fifteen modelled sheath proteins obtained by the translation of genes encoding the proteins used for the experimentally determined structures listed in Figure 2a. Columns in magenta have a maximum pairwise residue distance of less than 4 Å. The insertions interrupting the conserved domains of phages T4 and 812 are coloured blue. (d) The 3D model of the TShP of Staphylococcus phage 812, coloured according to a rainbow gradient scheme, where the N-terminus of the polypeptide chain is coloured blue, the C-terminus is coloured red, and the model superimposed with the “common core” of the experimentally determined sheath is coloured magenta.
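The 4 Å criterion behind the magenta colouring in Figure 2 can be illustrated with a minimal Python sketch. It assumes that the structures are already superposed and that each alignment column holds one Cα coordinate per structure, with `None` marking a gap. The function name `conserved_columns`, the decision to treat gapped columns as non-conserved, and the toy coordinates are assumptions made for illustration; they are not taken from mTM-align or from the authors' workflow.

```python
from itertools import combinations
from math import dist


def conserved_columns(columns, cutoff=4.0):
    """Return indices of alignment columns whose maximum pairwise
    residue distance is below `cutoff` (in angstroms).

    `columns` is a list of alignment columns. Each column is a list with
    one entry per structure: an (x, y, z) tuple for the superposed
    C-alpha atom, or None for a gap. Columns containing any gap are
    skipped (an assumption of this sketch).
    """
    conserved = []
    for index, column in enumerate(columns):
        if any(coord is None for coord in column):
            continue
        # Largest distance over all pairs of structures in this column.
        max_distance = max(
            (dist(a, b) for a, b in combinations(column, 2)),
            default=0.0,
        )
        if max_distance < cutoff:
            conserved.append(index)
    return conserved


# Toy data: three structures, three alignment columns (hypothetical coordinates).
toy_columns = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],  # tightly clustered
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 1.0, 0.0)],  # one pair is 5 A apart
    [(3.0, 3.0, 3.0), None, (3.5, 3.0, 3.0)],             # contains a gap
]
print(conserved_columns(toy_columns))
```

Only the first toy column passes the cutoff, because the second has a pair separated by 5 Å and the third contains a gap.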
Figure 3. Protein topology graphs based on PDB structures constructed using the PTGL database for secondary structure-based protein topologies. Structural elements are depicted as geometric figures according to the legends. Connections between structural elements are shown as lines coloured according to the legends.
Figure 4. (a) Structure of the tail tube hexamer (coloured grey) and the model of the tail sheath protein (coloured white) fitted into the cryo-EM reconstruction of the T4 tail [56] in the extended (3J2M) and contracted (3J2N) states. (b) The same as Figure 4a but superimposed with the AlphaFold 2 model of the T4 tail sheath. The AlphaFold 2 model is coloured using a rainbow gradient scheme, where the N-terminus of the polypeptide chain is coloured blue and the C-terminus is coloured red. The conserved core is circled in red. TT, tail tube proteins; TShP, tail sheath proteins.
Figure 5. Examples of the structural architecture of the modelled contractile phage sheath proteins. The TShPs consisting of two or more domains are superimposed with the modelled structure of the Burkholderia phage BEK tail sheath protein, depicted in red. The schemes on the left show the structural architecture of the proteins. The main domain is depicted as a circle, and the additional domains are represented as squares with rounded corners. The direction of the polypeptide chain from the N- to the C-terminus is shown with arrows.
Figure 6. (a) The 3D model of the TShP of Burkholderia phage BEK (left) coloured according to a rainbow gradient scheme, where the N-terminus of the polypeptide chain is coloured blue, the C-terminus is coloured red, and the model superimposed with the “common core” of the experimentally determined sheath proteins shown in Figure 2. (b) The model of the TShP of Halomonas phage HAP1 (yellow orange) superimposed with Burkholderia phage BEK TShP (salmon). (c) The model of the putative sheath protein found in the genome assembly attributed as Candidatus Bathyarchaeota archaeon isolate Bin-L-2 (light blue) superimposed with Burkholderia phage BEK TShP (salmon). (d) The model of the putative tail sheath protein found in the genome of Erwinia phage ENT90 (slate) superimposed with Burkholderia phage BEK TShP (salmon). (e) The model of the putative tail sheath protein found in the genome of Flavobacterium phage FPSV-S1 (green) superimposed with Burkholderia phage BEK TShP (salmon). (f) The model of the putative tail sheath protein found in the genome of Ralstonia phage RSY1 (blue) superimposed with Burkholderia phage BEK TShP (salmon). (g) The model of the putative sheath protein found in the genome of Vibrio phage vB_VpaM_MAR (yellow) superimposed with Burkholderia phage BEK TShP (salmon).
List of 153 contractile sheath proteins and homologous sequences for which the tertiary structures have been modelled.
| # | Organism Name (AFP, Anti-Feeding Prophage; BCN, Bacteriocin; CHR, Chromosome or Genome Assembly; PMD, Plasmid; T6SS, Type VI Secretion System) | NCBI Taxonomy | Length of Sheath Protein, Amino Acid Residues | Number of Domains in the Modelled Structure |
|---|---|---|---|---|
| 1 | | | 487 | 2 |
| 2 | | | 472 | 2 |
| 3 | | | 370 | 1 |
| 4 | | | 355 | 1 |
| 5 | | | 440 | 1 |
| 6 | | | 424 | 1 |
| 7 | | | 417 | 1 |
| 8 | | | 354 | 1 |
| 9 | | | 451 | 1 |
| 10 | | | 838 | 5+ |
| 11 | | | 1086 | 5+ |
| 12 | | | 987 | 3 |
| 13 | | | 568 | 3 |
| 14 | | | 571 | 3 |
| 15 | | | 579 | 3 |
| 16 | | | 987 | 3 |
| 17 | | | 571 | 3 |
| 18 | | | 494 | 2 |
| 19 | | | 568 | 3 |
| 20 | | | 727 | 4 |
| 21 | | | 356 | 1 |
| 22 | | | 386 | 1 |
| 23 | | | 437 | 2 |
| 24 | | | 437 | 2 |
| 25 | | | 437 | 2 |
| 26 | | | 342 | 1 |
| 27 | | | 391 | 1 |
| 28 | | | 477 | 2 |
| 29 | | | 397 | 1 |
| 30 | | | 636 | 3 |
| 31 | | | 688 | 3 |
| 32 | | | 508 | 2 |
| 33 | | | 446 | 2 |
| 34 | | | 460 | 2 |
| 35 | | | 410 | 1 |
| 36 | | | 321 | 1 |
| 37 | | | 369 | 1 |
| 38 | | | 521 | 2 |
| 39 | | | 634 | 3 |
| 40 | | | 577 | 2 |
| 41 | | | 523 | 2 |
| 42 | | | 805 | 4 |
| 43 | | | 574 | 3 |
| 44 | | | 343 | 1 |
| 45 | | | 540 | 2 |
| 46 | | | 509 | 2 |
| 47 | | | 520 | 2 |
| 48 | | | 508 | 2 |
| 49 | | | 348 | 1 |
| 50 | | | 478 | 2 |
| 51 | | | 474 | 2 |
| 52 | | | 476 | 2 |
| 53 | | | 474 | 2 |
| 54 | | | 436 | 2 |
| 55 | | | 452 | 2 |
| 56 | | | 658 | 3 |
| 57 | | | 355 | 1 |
| 58 | | | 436 | 2 |
| 59 | | | 355 | 1 |
| 60 | | | 375 | 1 |
| 61 | | | 635 | 3 |
| 62 | | | 632 | 3 |
| 63 | | | 632 | 3 |
| 64 | | | 558 | 2 |
| 65 | | | 477 | 2 |
| 66 | | | 681 | 2 |
| 67 | | | 498 | 2 |
| 68 | | | 569 | 2 |
| 69 | | | 569 | 3 |
| 70 | | | 389 | 1 |
| 71 | | | 680 | 2 |
| 72 | | | 563 | 2 |
| 73 | | | 695 | 2 |
| 74 | | | 681 | 2 |
| 75 | | | 713 | 2 |
| 76 | | | 458 | 2 |
| 77 | | | 512 | 2 |
| 78 | | | 495 | 2 |
| 79 | | | 396 | 1 |
| 80 | | | 631 | 3 |
| 81 | | | 887 | 4 |
| 82 | | | 659 | 3 |
| 83 | | | 659 | 3 |
| 84 | | | 475 | 2 |
| 85 | | | 482 | 2 |
| 86 | | | 481 | 2 |
| 87 | | | 384 | 1 |
| 88 | | | 390 | 1 |
| 89 | | | 482 | 2 |
| 90 | | | 437 | 2 |
| 91 | | | 430 | 2 |
| 92 | | | 432 | 2 |
| 93 | | | 434 | 2 |
| 94 | | | 681 | 2 |
| 95 | | | 430 | 2 |
| 96 | | | 388 | 1 |
| 97 | | | 430 | 2 |
| 98 | | | 430 | 2 |
| 99 | | | 431 | 2 |
| 100 | | | 430 | 2 |
| 101 | | | 657 | 3 |
| 102 | | | 663 | 3 |
| 103 | | | 888 | 4 |
| 104 | | | 660 | 3 |
| 105 | | | 612 | 3 |
| 106 | | | 562 | 3 |
| 107 | | | 562 | 3 |
| 108 | | | 472 | 2 |
| 109 | | | 774 | 4 |
| 110 | | | 581 | 3 |
| 111 | | | 581 | 3 |
| 112 | | | 482 | 2 |
| 113 | | | 426 | 2 |
| 114 | | | 478 | 2 |
| 115 | | | 483 | 2 |
| 116 | | | 1283 | 5+ |
| 117 | | | 1248 | 5+ |
| 118 | | | 881 | 5+ |
| 119 | | | 814 | 4 |
| 120 | | | 539 | 2 |
| 121 | | | 669 | 3 |
| 122 | | | 547 | 2 |
| 123 | | | 695 | 2 |
| 124 | | | 427 | 2 |
| 125 | | | 648 | 3 |
| 126 | | | 826 | 5+ |
| 127 | | | 391 | 1 |
| 128 | | | 955 | 4 |
| 129 | | | 636 | 3 |
| 130 | | | 663 | 3 |
| 131 | | | 493 | 2 |
| 132 | | | 838 | 5+ |
| 133 | | | 587 | 3 |
| 134 | | | 587 | 3 |
| 135 | | | 586 | 3 |
| 136 | | | 731 | 3 |
| 137 | | | 432 | 1 |
| 138 | | | 506 | 1 |
| 139 | | | 498 | 1 |
| 140 | | | 493 | 1 |
| 141 | | | 499 | 1 |
| 142 | | | 491 | 1 |
| 143 | | | 493 | 1 |
| 144 | | | 509 | 1 |
| 145 | | | 1032 | 3 |
| 146 | | | 648 | 3 |
| 147 | | | 486 | 2 |
| 148 | | | 378 | 1 |
| 149 | | | 682 | 2 |
| 150 | | | 386 | 1 |
| 151 | | | 756 | 4 |
| 152 | | | 383 | 1 |
| 153 | | | 714 | 3 |
Figure 7. The models of the type 2 sheath proteins listed below superimposed with Burkholderia phage BEK TShP (coloured salmon): (a) Mycobacterium phage Phabba, (b) Brevibacillus phage Jimmer2, (c) genome assembly attributed as Thermoprotei archaeon B19_G17, (d) Erwinia phage vB_EamM_RisingSun, (e) Escherichia phage Mu, (f) Halobacterium virus ChaoS9, (g) Cellulophaga phage phi38:2, (h) Faecalibacterium phage FP_Mushu, (i) Gordonia phage GMA6, (j) Halocynthia phage JM-2012, (k) genome assembly attributed as Thermoplasmata archaeon isolate B28_Guay1, (l) Vibrio phage BONAISHI.
Figure 8. The models of the type 3 sheath proteins listed below superimposed with Burkholderia phage BEK TShP (coloured salmon): (a) Agrobacterium phage Atu_ph07, (b) Bacillus phage BC01, (c) prophage TShP of Halovivax ruber XH-70, (d) genome assembly attributed as Crenarchaeota archaeon isolate__LB_CRA_1, (e) Salicola phage SCTP-2, (f) Serratia phage phiMAM1, (g) Klebsiella phage Miro, (h) Kosakonia phage Kc304, (i) Klebsiella phage vB_KleM_RaK2, (j) genome assembly attributed as phage Mad1_20_16, (k) Ralstonia phage RSP15.
Figure 9. The 3D model of the TShP of Bacillus phage PBS1 (violet) superimposed with the Burkholderia phage BEK TShP (salmon).
Figure 10. Circular tree constructed with 153 sheath proteins based on structural similarity assessed by mTM-align and clustered by BioNJ.
Figure 11. Best-scoring maximum likelihood (ML) phylogenetic tree constructed with 90 amino acid sequences of phage TShPs aligned with the mTM-align structural alignment algorithm and trimmed to the conserved “main” domain. The NCBI taxonomy is shown to the right of the phage name. The total number of domains in the modelled structures is shown in the boxes to the right of the taxonomic assignment. The next column of bars indicates the phage genome length, as given in the NCBI phage GenBank database, and the numbers to the right correspond to the genome length. The genome of Campylobacter phage CAM-P21 seems to be incomplete, and the corresponding prophage sequences were used for some analyses. The numbers near the tree branches indicate the fraction of the bootstrap trees supporting the branch. The total number of bootstrap trees was 1000. The scale bar shows 0.5 estimated substitutions per site, and the tree was rooted at the midpoint. The abbreviation “M” stands for Myoviridae.
Figure 12. Best-scoring ML phylogenetic tree constructed with 90 amino acid sequences of phage major capsid proteins aligned with MAFFT. The NCBI taxonomy is shown to the right of the phage name. The total number of domains in the modelled structures of the corresponding tail sheath proteins is shown in boxes to the right of the taxonomic assignment. The numbers near the tree branches indicate the fraction of the bootstrap trees supporting the branch. The total number of bootstrap trees was 1000. The scale bar shows 0.5 estimated substitutions per site, and the tree was rooted at the midpoint. The abbreviation “M” stands for Myoviridae.
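The branch labels in Figures 11 and 12 are described as the fraction of bootstrap trees supporting each branch. The following minimal Python sketch shows how such a fraction can be computed when every tree is represented as a collection of bipartitions (splits) of the taxon set. The function names, the split representation, and the five-taxon toy data are assumptions for illustration only; the study used 1000 bootstrap trees, and the tree-building software normally reports these values itself.

```python
def normalise_split(side, all_taxa, reference):
    """Represent a bipartition by the side that excludes `reference`,
    so that both ways of writing the same split compare equal."""
    side = frozenset(side)
    return frozenset(all_taxa) - side if reference in side else side


def bootstrap_support(best_tree_splits, bootstrap_trees, all_taxa):
    """Return {split: fraction of bootstrap trees containing that split}.

    `best_tree_splits` is an iterable of taxon sets (one side of each
    internal branch of the best-scoring tree). `bootstrap_trees` is a
    list of such iterables, one per bootstrap replicate.
    """
    if not bootstrap_trees:
        raise ValueError("at least one bootstrap tree is required")
    reference = min(all_taxa)  # any fixed taxon works as the reference
    normalised = [
        {normalise_split(s, all_taxa, reference) for s in tree}
        for tree in bootstrap_trees
    ]
    support = {}
    for split in best_tree_splits:
        key = normalise_split(split, all_taxa, reference)
        hits = sum(key in tree for tree in normalised)
        support[key] = hits / len(normalised)
    return support


# Toy example with five hypothetical taxa and four bootstrap replicates.
taxa = {"A", "B", "C", "D", "E"}
best = [{"A", "B"}, {"D", "E"}]
replicates = [
    [{"A", "B"}, {"D", "E"}],
    [{"C", "D", "E"}, {"D", "E"}],  # {"C","D","E"} is the same split as {"A","B"}
    [{"A", "C"}, {"B", "D"}],
    [{"A", "B"}, {"C", "E"}],
]
support = bootstrap_support(best, replicates, taxa)
for split, fraction in sorted(support.items(), key=lambda item: sorted(item[0])):
    print(sorted(split), fraction)
```

In the toy data the split separating A and B from the rest appears in three of four replicates (0.75), and the split grouping D with E appears in two of four (0.5).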