Laura Gomez-Valero1,2, Alvaro Chiner-Oms3, Iñaki Comas4,5, Carmen Buchrieser1,2.
Abstract
The Dot/Icm type IVB secretion system of Legionella pneumophila is essential for its pathogenesis by delivering >300 effector proteins into the host cell. However, the precise mechanism of their secretion and which components interact with the host cell are only partly understood. Here, we undertook evolutionary analyses of the Dot/Icm system of 58 Legionella species to identify those components that interact with the host and/or the substrates. We show that high recombination rates are acting on DotA, DotG, and IcmX, supporting exposure of these proteins to the host. Specific amino acids under positive selection in the periplasmic region of DotF and the cytoplasmic domain of DotM support a role of these regions in substrate binding. Diversifying selection acting on the signal peptide of DotC suggests its interaction with the host after cleavage. Positive selection acts on IcmR, IcmQ, and DotL, indicating that these components probably participate in effector recognition and/or translocation. Furthermore, our results predict that DotV and IcmF participate in host/effector interaction. In contrast, DotB, DotO, most of the core subcomplex elements, and the chaperones IcmS-W show a high degree of conservation and no signs of recombination or positive selection, suggesting that these proteins are under strong structural constraints and have an important role in maintaining the architecture/function of the system. Thus, our analyses of recombination and positive selection acting on the Dot/Icm secretion system predict specific Dot/Icm components and regions implicated in host interaction and/or substrate recognition and translocation, which will guide further functional analyses.
Keywords: Legionella; Dot/Icm system; diversifying-selection; evolution; negative-selection; positive-selection
Year: 2019 PMID: 31504472 PMCID: PMC6761968 DOI: 10.1093/gbe/evz186
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1.—Schematic representation of the Dot/Icm secretion apparatus and the gene loci encoding the different Dot/Icm components. (a) The representation of the core transmembrane subcomplex is based mainly on the work of Ghosal et al. (2017). The representation of the coupling protein subcomplex is based on studies reported earlier (Kwak et al. 2017; Chetrit et al. 2018; Meir et al. 2018). DotC, for which the structure is not known, is drawn as a circle. DotH and IcmX are represented in the shapes reported from densities seen in the subtomogram averages or difference maps (Ghosal et al. 2017). (b) Genes coding the different Dot/Icm components represented according to their size, position, and orientation in the Legionella pneumophila Paris genome, and colored according to the schematic presentation shown in (a). A double arrow above the gene indicates positive selection acting on it, either on specific sites and/or specific nodes of the phylogeny. If positive selection was detected only with aBSREL and only on one node of the phylogeny, it is indicated only when P values are <0.01. An “R” above a gene indicates that more than one sequence was affected by recombination. OM, outer membrane; IN, inner membrane; PG, peptidoglycan layer.
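The arrow rule stated in the caption can be read as a small decision function. The sketch below is only one interpretation: the function name, its arguments, and the choice to treat any site-level detection as sufficient are assumptions made for illustration, not the authors' procedure. Only the 0.01 cutoff for a single aBSREL node comes from the caption.

```python
def shows_selection_arrow(site_hits, absrel_pvalues, single_node_cutoff=0.01):
    """Decide whether a gene would carry the positive-selection arrow.

    site_hits: number of codons reported by any site-level method
               (assumption: one site-level detection is enough).
    absrel_pvalues: P values of the nodes reported by aBSREL for the gene.
    single_node_cutoff: threshold from the caption for the single-node case.
    """
    if site_hits > 0:
        return True
    if len(absrel_pvalues) == 1:
        # Only aBSREL, only one node: require the stricter P value.
        return absrel_pvalues[0] < single_node_cutoff
    # No site hits: flag only if aBSREL reported several nodes.
    return len(absrel_pvalues) > 1


# Values taken from aBSREL cells of the selection table in this document.
print(shows_selection_arrow(0, [0.02037]))          # single node, P above cutoff
print(shows_selection_arrow(0, [0.00005]))          # single node, P below cutoff
print(shows_selection_arrow(0, [0.0168, 0.03696]))  # two nodes
```

Under this reading, a gene whose only signal is one aBSREL node with P between 0.01 and 0.05 is listed in the table but gets no arrow in the figure.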
General Features of Each Dot/Icm Component in the 80 Legionella Strains Analyzed in This Study
| Protein Name | Protein Label | No. of Sequences | Amino Acid Identity (%) | Gene Length | Alignment Length | Average Identical Nucleotides | % Change | Recombination |
|---|---|---|---|---|---|---|---|---|
| IcmT | Lpp0507 | 80 | 78–100 (87%) | 261 | 249 | 196.4 | 20.5 | No |
| IcmS | Lpp0508 | 80 | 74–100 (91%) | 345 | 333 | 263.6 | 20.4 | No |
| IcmR | Lpp0509 | 16 | 48–99 (94%) | 363 | 357 | 337.0 | 6.6 | No |
| IcmQ | Lpp0510 | 80 | 49–99 (70%) | 576 | 504 | 364.7 | 27.1 | No |
| IcmP/DotM | Lpp0511 | 80 | 64–99 (79%) | 1,131 | 1,098 | 834.8 | 23.5 | Yes (3) |
| IcmO/DotL | Lpp0512 | 80 | 77–100 (89%) | 2,352 | 2,325 | 1,854.6 | 19.9 | Yes (10) |
| IcmN/DotK | Lpp0513 | 80 | 50–99 (69%) | 570 | 486 | 341.1 | 29.0 | No |
| IcmM/DotJ | Lpp0514 | 80 | 33–99 (63%) | 285 | 291 | 188.6 | 35.0 | No |
| IcmL/DotI | Lpp0515 | 80 | 69–100 (90%) | 639 | 633 | 504.7 | 19.7 | No |
| IcmK/DotH | Lpp0516 | 80 | 65–99 (79%) | 1,083 | 867 | 672.6 | 21.9 | No |
| IcmE/DotG | Lpp0517 | 80 | 49–100 (71%) | 3,147 | 2,706 | 1,904.1 | 27.8 | Yes (43) |
| IcmG/DotF | Lpp0518 | 80 | 37–99 (57%) | 810 | 504 | 362.1 | 27.4 | No |
| IcmC/DotE | Lpp0519 | 80 | 57–100 (76%) | 585 | 525 | 384.7 | 25.5 | No |
| IcmD/DotP | Lpp0520 | 80 | 57–99 (78%) | 399 | 315 | 242.0 | 22.5 | No |
| IcmJ/DotN | Lpp0521 | 80 | 72–100 (87%) | 627 | 603 | 482.8 | 19.5 | No |
| IcmB/DotO | Lpp0522 | 80 | 80–100 (88%) | 3,030 | 3,006 | 2,366.4 | 20.6 | Yes (1) |
| IcmF | Lpp0524 | 80 | 34–100 (70%) | 2,922 | 2,811 | 1,985.7 | 28.2 | No |
| IcmH/DotU | Lpp0525 | 80 | 36–100 (71%) | 786 | 747 | 537.1 | 27.5 | No |
| DotV | Lpp0537 | 79 | 44–100 (69%) | 543 | 462 | 332.1 | 28.1 | Yes (6) |
| LvgA | Lpp0590 | 80 | 42–100 (69%) | 627 | 528 | 380.5 | 27.3 | Yes (1) |
| DotD | Lpp2728 | 80 | 69–100 (84%) | 492 | 465 | 364.2 | 21.0 | No |
| DotC | Lpp2729 | 80 | 69–100 (81%) | 912 | 810 | 634.4 | 20.9 | No |
| DotB | Lpp2730 | 80 | 86–100 (94%) | 1,134 | 1,101 | 878.1 | 19.7 | No |
| DotA | Lpp2740 | 80 | 47–99 (60%) | 3,108 | 2,025 | 1,419.0 | 28.5 | Yes (20) |
| IcmV | Lpp2741 | 80 | 49–99 (67%) | 456 | 429 | 300.6 | 29.4 | No |
| IcmW | Lpp2742 | 80 | 71–100 (89%) | 456 | 453 | 361.0 | 19.8 | Yes (1) |
| IcmX | Lpp2743 | 78 | 36–89 (51%) | 1,419 | 711 | 464.6 | 34.1 | Yes (4) |
Note.—Amino acid (aa) identity gives the minimum, maximum, and average values (average in parentheses) of aa identity between each Dot/Icm protein from L. pneumophila strain Paris and its orthologs in the other species, calculated with BLASTp. Alignment length, average number of identical nucleotides, and % change are calculated from the multiple alignment of each protein after cleaning with Gblocks, taking the genome of L. pneumophila Paris as the reference for identity and % change.
L. pneumophila strain Paris.
With respect to L. pneumophila Paris sequence.
After cleaning with Gblocks.
Number of recombinant sequences.
Identity value based on the comparison of 16 strains (15 belonging to the same species L. pneumophila and only one to a different species L. norrlandica).
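As an illustration of the "average identical nucleotides" column, the following sketch counts, for each aligned sequence, the positions that match a reference sequence and averages those counts. It is a minimal reading of the note above, not the authors' script: the function name and the toy sequences are hypothetical, the reference is assumed to be excluded from the average, and the alignment is assumed to be already trimmed to equal length.

```python
def average_identical_nucleotides(reference, others):
    """Mean number of alignment columns identical to the reference.

    reference: aligned reference sequence (here standing in for strain Paris).
    others: aligned sequences from the other strains, same length as reference.
    """
    if not others:
        raise ValueError("need at least one sequence to compare")
    for seq in others:
        if len(seq) != len(reference):
            raise ValueError("all sequences must have the alignment length")
    # Count matching columns for each sequence, then average the counts.
    counts = [sum(a == b for a, b in zip(reference, seq)) for seq in others]
    return sum(counts) / len(counts)


# Toy alignment of length 9 (hypothetical data, not from the study).
ref = "ATGAAACCC"
aligned = ["ATGAAACCT", "ATGAGACCC", "ATGAAACCC"]
print(len(ref), average_identical_nucleotides(ref, aligned))
```

The exact formula behind the "% Change" column is not stated in the note, so it is deliberately left out of this sketch.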
Results of Negative/Positive Selection Acting on Dot/Icm Genes at Interspecific Level
| Gene Name | Gene Label* | No. of Sequences | FEL: No. of Codons Tested | FEL: No. of Codons Under Neg. Select. | FEL: Codon Under Pos. Select. | MEME (codon under positive selection) | FUBAR (codon under positive selection) | aBSREL (nodes under positive selection) |
|---|---|---|---|---|---|---|---|---|
| IcmT | Lpp0507 | 58 | 83 | 76 | 0 | 0 | 0 | 0 |
| IcmS | Lpp0508 | 58 | 111 | 109 | 0 | 0 | 0 | 0 |
| IcmR | Lpp0509 | 2 | — | — | — | — | — | — |
| IcmQ | Lpp0510 | 58 | 168 | 161 | 1 (0.0558) | 1 (0.0756) | 0 | 0 |
| IcmP/DotM | Lpp0511 | 56 | 366 | 340 | 0 | 22 (0.0153); 362 (0.0594) | 0 | 74 (0.03936) |
| IcmO/DotL | Lpp0512 | 56 | 777 | 753 | 0 | 0 | 0 | 2 (0.0168); 94 (0.03696) |
| IcmN/DotK | Lpp0513 | 58 | 162 | 150 | 0 | 0 | 0 | 0 |
| IcmM/DotJ | Lpp0514 | 57 | 74 | 66 | 0 | 0 | 0 | 0 |
| IcmL/DotI | Lpp0515 | 58 | 211 | 203 | 0 | 0 | 0 | 0 |
| IcmK/DotH | Lpp0516 | 58 | 285 | 266 | 0 | 0 | 0 | 0 |
| IcmE/DotG | Lpp0517 | 35 | 902 | 873 | 0 | 0 | 0 | 1 (0.00376); 2 (0.00459); 14 (0.01272) |
| IcmG/DotF | Lpp0518 | 58 | 168 | 159 | 124 (0.0152) | 124 (0.0238); 127 (0.0348) | 124 (0.9729) | 0 |
| IcmC/DotE | Lpp0519 | 58 | 175 | 171 | 0 | 0 | 0 | 0 |
| IcmD/DotP | Lpp0520 | 58 | 105 | 101 | 0 | 0 | 0 | 6 (0.02037) |
| IcmJ/DotN | Lpp0521 | 58 | 201 | 194 | 0 | 0 | 0 | 0 |
| IcmB/DotO | Lpp0522 | 58 | 1002 | 937 | 0 | 0 | 0 | 0 |
| IcmF | Lpp0524 | 58 | 937 | 908 | 0 | 434 (0.0314) | 0 | 4 (0.00270); 69 (0.00397); 95 (0.00504) |
| IcmH/DotU | Lpp0525 | 58 | 248 | 240 | 0 | 0 | 0 | 0 |
| DotV | Lpp0537 | 52 | 155 | 146 | 0 | 43 (0.0065) | 0 | 0 |
| LvgA | Lpp0590 | 57 | 176 | 172 | 0 | 0 | 0 | 6 (0.03112) |
| DotD | Lpp2728 | 58 | 155 | 146 | 0 | 0 | 0 | 0 |
| DotC | Lpp2729 | 58 | 270 | 251 | 2 (0.0004); 5 (0.0630); 6 (0.0017) | 2 (0.00090); 5 (0.0840); 6 (0.0032); 250 (0.0678) | 2 (0.9978); 5 (0.9314); 6 (0.9975) | 4 (0.00039); 36 (0.02733) |
| DotB | Lpp2730 | 58 | 367 | 360 | 0 | 0 | 0 | 15 (0.01046) |
| DotA | Lpp2740 | 48 | 675 | 644 | 0 | 0 | 0 | 50 (0.00047); 34 (0.00173); 63 (0.01155); 1 (0.02190); 29 (0.02565) |
| IcmV | Lpp2741 | 58 | 143 | 127 | 0 | 0 | 0 | 0 |
| IcmW | Lpp2742 | 57 | 150 | 143 | 0 | 0 | 0 | 0 |
| IcmX | Lpp2743 | 53 | 237 | 229 | 0 | 0 | 0 | 17 (0.00005) |
Note.—Numbers in parentheses indicate either the P value or the posterior probability (FUBAR method) associated with the codon/branch inferred to be under positive selection. Gray areas highlight those genes where positive selection was found.
Fig. 2.—Amino acids/codons under positive selection (detected by at least two different methods) in the proteins IcmG/DotF, DotC, and IcmQ. The fragment/s of the alignment containing amino acids/codons under positive selection in IcmG/DotF, DotC, or IcmQ are indicated for each species, ordered according to their position in the phylogenetic tree depicted on the left-hand side of the figure. The lines with circle heads point to the amino acids that are under positive selection, whereas the corresponding codons are framed with a black line. The numbers above highlighted codons correspond to the codon number in the analyzed alignment, whereas the corresponding codon number in the sequence from Legionella pneumophila strain Paris is shown in brackets. The main domains of the protein and the position of the codons under positive selection are shown at the bottom of the figure, taking the L. pneumophila Paris sequence as reference.
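The "detected by at least two different methods" criterion can be sketched as a set count over the site-level columns (FEL, MEME, FUBAR) of the selection table. The helper names below are hypothetical, and the sketch assumes that every codon listed in a cell counts as a detection by that method, without reapplying any significance threshold. The cell strings are copied from two rows of the table in this document.

```python
import re
from collections import Counter

# Matches entries such as "124 (0.0238)": codon number, then value in parentheses.
CELL_ENTRY = re.compile(r"(\d+)\s*\(\s*([0-9.]+)\s*\)")


def parse_cell(cell):
    """Parse a table cell into {codon: P value or posterior probability}."""
    return {int(codon): float(value) for codon, value in CELL_ENTRY.findall(cell)}


def codons_in_at_least(cells_by_method, min_methods=2):
    """Return codons reported by at least `min_methods` site-level methods."""
    counts = Counter()
    for cell in cells_by_method.values():
        counts.update(parse_cell(cell).keys())
    return sorted(codon for codon, n in counts.items() if n >= min_methods)


row_a = {"FEL": "124 (0.0152 )",
         "MEME": "124 (0.0238) 127 (0.0348)",
         "FUBAR": "124 (0.9729)"}
row_b = {"FEL": "2 (0.0004) 5 (0.0630) 6 (0.0017)",
         "MEME": "2 (0.00090) 5 (0.0840) 6 (0.0032) 250 (0.0678)",
         "FUBAR": "2 (0.9978) 5 (0.9314) 6 (0.9975)"}

print(codons_in_at_least(row_a))  # [124]
print(codons_in_at_least(row_b))  # [2, 5, 6]
```

Codons reported by a single method only (127 in the first row, 250 in the second) drop out. A cell containing just `0` parses to an empty dictionary, so rows without detections return an empty list.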