Yang Yang, Jiayuan Zhao, Robyn L Morgan, Wenbo Ma, Tao Jiang.
Abstract
BACKGROUND: The type III secretion system (T3SS) is a specialized protein delivery system in gram-negative bacteria that injects proteins (called effectors) directly into the eukaryotic host cytosol and facilitates bacterial infection. For many plant and animal pathogens, the T3SS is indispensable for disease development. Recently, the T3SS has also been found in rhizobia, where it plays a crucial role in the nodulation process. Although a great deal of effort has been devoted to understanding type III secretion, the precise mechanism underlying the secretion and translocation process is not fully understood. In particular, no defined secretion or translocation signals have been identified in type III secreted effectors (T3SEs), which makes the identification of these important virulence factors notoriously challenging. The availability of a large number of sequenced genomes for plant- and animal-associated bacteria calls for efficient and effective bioinformatics methods for predicting T3SEs.
Year: 2010 PMID: 20122221 PMCID: PMC3009519 DOI: 10.1186/1471-2105-11-S1-S47
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The T3SS apparatus in Pseudomonas syringae.
Figure 2. The composition of a typical effector protein.
Positive and negative sample numbers in the two data sets. Set I is the redundant data set and set II the non-redundant data set.
| Data | Positive | Negative | Total |
|---|---|---|---|
| I | 283 | 3779 | 4062 |
| II | 108 | 3424 | 3532 |
Cross validation results on the two data sets
| Data | |||
|---|---|---|---|
| I | 99.0 | 94.1 | 85.4 |
| II | 98.6 | 90.8 | 64.8 |
Presence of the statistical biases in confirmed type III effectors in rhizobia. Feature 1 means at least 10% Ser residues within the first 50 amino acids.
| Features | |||||
|---|---|---|---|---|---|
| Species | Effector | GI number | 1 | 2 | 3 |
| Sino | NopA | 55668600 | 0 | 1 | 1 |
| NopP | 63103266 | 1 | 0 | 0 | |
| NolB | 19749321 | 1 | 1 | 1 | |
| NolX | 52631913 | 1 | 0 | 1 | |
| NopC* | 255767012 | 1 | 1 | 1 | |
| NopL* | 2182720 | 1 | 1 | 0 | |
| NopP* | 2182742 | 1 | 0 | 0 | |
| NopB* | 2182730 | 1 | 1 | 1 | |
| NopX* | 2182728 | 1 | 0 | 1 | |
| Meso | NopB* | 13475298 | 1 | 1 | 1 |
| NolX* | 13475296 | 1 | 1 | 1 | |
| Brady | NodN* | 27379070 | 0 | 1 | 0 |
| NolB* | 27376923 | 1 | 0 | 1 | |
Feature 2 means that an Ile, Leu, Val, or Pro is located at the third or fourth residue of the protein. Feature 3 means no Asp or Glu residues within the first 12 amino acids. The matrix is boolean, i.e., 1 means true and 0 means false. The involved rhizobial species are abbreviated as Sino for Sinorhizobium, Meso for Mesorhizobium, and Brady for Bradyrhizobium. The effectors marked by * are from the four strains considered in this study. The third column lists GI numbers from NCBI GenBank.
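The three feature definitions above can be checked mechanically for any protein sequence. The following is a minimal sketch, not the authors' code: the function name `nterminal_features` and the toy sequence are hypothetical, residue 1 is assumed to be the first character of the sequence (including the initiator Met), and for sequences shorter than 50 residues the Ser fraction is assumed to be taken over the available residues.

```python
def nterminal_features(seq: str) -> tuple[int, int, int]:
    """Return features 1-3 as 0/1 flags for a one-letter amino acid sequence."""
    seq = seq.upper()
    first50 = seq[:50]
    # Feature 1: at least 10% Ser residues within the first 50 amino acids.
    # Integer arithmetic avoids floating-point edge cases at exactly 10%.
    # Assumption: shorter sequences use their own length as the denominator.
    f1 = bool(first50) and first50.count("S") * 10 >= len(first50)
    # Feature 2: Ile, Leu, Val, or Pro at the third or fourth residue
    # (1-based positions 3 and 4 are 0-based indices 2 and 3).
    f2 = any(res in "ILVP" for res in seq[2:4])
    # Feature 3: no Asp or Glu within the first 12 amino acids.
    f3 = not any(res in "DE" for res in seq[:12])
    return int(f1), int(f2), int(f3)


# Hypothetical toy sequence, not a real effector: 5 Ser in the first 50
# residues, Ile at position 3, and no Asp/Glu in the first 12 residues.
toy = "MKISSSSSAAGG" + "A" * 38
print(nterminal_features(toy))  # (1, 1, 1)
```

Each returned tuple corresponds to one row of the boolean matrix in the table, so a real sequence retrieved by its GI number could be compared against the listed values.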
Number of sequences in the rhizobia data set and prediction results.
| Strain | Original # | # Seq. with tts motif | Predicted # | Unconfirmed # |
|---|---|---|---|---|
| WSM419 | 6213 | 160 | 9 | 9 |
| MAFF303099 | 7272 | 142 | 12 | 8 |
| USDA110 | 8317 | 279 | 30 | 23 |
| NGR234 | 418 | 375 | 6 | 0 |
| Total | 22220 | 956 | 57 | 40 |
The strains are abbreviated as WSM419 for Sinorhizobium medicae WSM419, MAFF303099 for Mesorhizobium loti MAFF303099, USDA110 for Bradyrhizobium japonicum USDA110, and NGR234 for Sinorhizobium sp. NGR234. The original number means the number of proteins collected from the rhizobial strains. For MAFF303099 and NGR234, the numbers are the total numbers of proteins on both the chromosome and plasmids. The third column lists the numbers of sequences that have the tts motif in their promoters. The fourth column records the numbers of candidate effectors predicted by the SVM.
Experimentally confirmed secreted proteins in Bradyrhizobium japonicum USDA110 and Mesorhizobium loti MAFF303099.
| Strain | Effector | Source |
|---|---|---|
| USDA110 | nodulation protein NolB | NCBI GenBank |
| USDA110 | bll1862 | Ref. [ |
| USDA110 | blr1904 | Ref. [ |
| USDA110 | blr2058 | Ref. [ |
| USDA110 | blr2140 | Ref. [ |
| USDA110 | bll8201 | Ref. [ |
| USDA110 | bll8244 | This study |
| MAFF303099 | nodulation protein NolX | NCBI GenBank |
| MAFF303099 | mlr8763 ( | Ref. [ |
| MAFF303099 | mlr6361 | Ref. [ |
| MAFF303099 | mlr6358 | Ref. [ |
Predicted secreted proteins in rhizobia that have not been confirmed experimentally.
| Gene ID | Annotation | Position of tts box | Motif e-value | SVM probability |
|---|---|---|---|---|
| blr1704 | hypothetical protein | -67 ~ -31 | 2.30E-06 | 0.92 |
| bll1648 | hypothetical protein | -260 ~ -224 | 9.60E-03 | 0.88 |
| blr1854 | hypothetical protein | -66 ~ -30 | 6.10E-07 | 0.86 |
| mlr5875 | hypothetical protein | -157 ~ -121 | 1.00E-02 | 0.86 |
| mlr6331 | hypothetical protein | -81 ~ -45 | 2.10E-03 | 0.69 |
| Smed_1170 | biotin-regulated protein | -107 ~ -71 | 8.30E-03 | 0.68 |
| blr5999 | hypothetical protein | -693 ~ -657 | 7.00E-03 | 0.67 |
| bll1840 | hypothetical protein | -74 ~ -38 | 5.60E-05 | 0.64 |
| Smed_5711 | hypothetical protein | -606 ~ -570 | 4.70E-03 | 0.55 |
| bll1796 | hypothetical protein | -930 ~ -894 | 1.40E-06 | 0.54 |
| bll1804 | hypothetical protein | -102 ~ -66 | 7.50E-10 | 0.51 |
| bll8244 | hypothetical protein | -188 ~ -152 | 9.80E-06 | 0.51 |
| bll1636 | hypothetical protein | -657 ~ -621 | 3.10E-03 | 0.50 |
| Smed_4857 | hypothetical protein | -826 ~ -790 | 6.80E-03 | 0.49 |
| Smed_1856 | putative signal peptide protein | -299 ~ -263 | 2.80E-03 | 0.48 |
| Smed_4485 | hypothetical protein | -637 ~ -601 | 5.00E-03 | 0.40 |
| bll0275 | hypothetical protein | -395 ~ -361 | 8.50E-03 | 0.38 |
| bsr1999 | hypothetical protein | -264 ~ -227 | 4.20E-04 | 0.37 |
| mlr3881 | hypothetical protein | -483 ~ -447 | 9.80E-03 | 0.36 |
| blr0325 | hypothetical protein | -490 ~ -454 | 5.40E-03 | 0.35 |
| mll5027 | hypothetical protein | -377 ~ -340 | 9.20E-03 | 0.35 |
| bll1848 | hypothetical protein | -300 ~ -264 | 9.00E-08 | 0.34 |
| bll5481 | hypothetical protein | -128 ~ -92 | 6.50E-03 | 0.33 |
| mlr0825 | hypothetical protein | -535 ~ -499 | 5.70E-03 | 0.32 |
| bsr8005 | hypothetical protein | -89 ~ -53 | 5.60E-03 | 0.31 |
| mlr1025 | * | -764 ~ -728 | 8.60E-04 | 0.29 |
| mlr7808 | hypothetical protein | -906 ~ -869 | 6.70E-03 | 0.29 |
| Smed_0887 | hypothetical protein | -585 ~ -549 | 5.50E-03 | 0.27 |
| blr6167 | hypothetical protein | -250 ~ -214 | 9.50E-03 | 0.27 |
| msl5783 | hypothetical protein | -710 ~ -673 | 8.40E-03 | 0.25 |
| Smed_1171 | peptidase M23B | -993 ~ -957 | 8.30E-03 | 0.23 |
| bll5622 | hypothetical protein | -152 ~ -116 | 9.50E-03 | 0.19 |
| bll1877 | hypothetical protein | -101 ~ -65 | 1.80E-08 | 0.13 |
| Smed_5269 | hypothetical protein | -610 ~ -574 | 1.40E-04 | 0.12 |
| blr1869 | hypothetical protein | -147 ~ -111 | 3.50E-08 | 0.10 |
| Smed_0286 | hypothetical protein | -133 ~ -97 | 1.50E-04 | 0.09 |
| blr0354 | hypothetical protein | -482 ~ -446 | 5.80E-03 | 0.09 |
| bll1810 | hypothetical protein | -246 ~ -210 | 1.90E-07 | 0.08 |
| bll1798 | hypothetical protein | -90 ~ -54 | 1.40E-06 | 0.08 |
| bll1797 | hypothetical protein | -533 ~ -497 | 1.40E-06 | 0.04 |
Here, the position of the tts box in a promoter region is given as its distance from the respective start codon; the negative sign indicates that the promoter region is upstream of the start codon. The annotation * indicates a transcriptional regulatory protein that is also a nodulation competitiveness determinant. Genes that contain "bll" and "blr" in their IDs are from Bradyrhizobium japonicum USDA110, genes that contain "mll" and "mlr" are from Mesorhizobium loti MAFF303099, and genes that contain "Smed" are from Sinorhizobium medicae WSM419.
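The conventions in this note (signed offsets relative to the start codon, strain encoded in the gene ID) are easy to apply programmatically when working with the table. The sketch below is illustrative only: the helper names `strain_of`, `parse_position`, and `distance_to_start_codon` are hypothetical, the mapping covers only the prefixes named in the note and is matched at the start of the ID (an assumption), and other prefixes that appear in the table, such as "bsr" or "msl", are deliberately left unmapped.

```python
import re

# Prefix-to-strain mapping exactly as stated in the table note.
STRAIN_BY_PREFIX = {
    "bll": "Bradyrhizobium japonicum USDA110",
    "blr": "Bradyrhizobium japonicum USDA110",
    "mll": "Mesorhizobium loti MAFF303099",
    "mlr": "Mesorhizobium loti MAFF303099",
    "Smed": "Sinorhizobium medicae WSM419",
}


def strain_of(gene_id: str):
    """Return the strain for a gene ID, or None if the prefix is not listed."""
    for prefix, strain in STRAIN_BY_PREFIX.items():
        if gene_id.startswith(prefix):
            return strain
    return None


def parse_position(text: str) -> tuple[int, int]:
    """Parse a span such as '-67 ~ -31' into (start, end) signed offsets."""
    match = re.fullmatch(r"\s*(-?\d+)\s*~\s*(-?\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized position span: {text!r}")
    return int(match.group(1)), int(match.group(2))


def distance_to_start_codon(text: str) -> int:
    """Distance from the nearest end of the tts box to the start codon."""
    start, end = parse_position(text)
    return min(abs(start), abs(end))


print(strain_of("blr1704"))                  # Bradyrhizobium japonicum USDA110
print(parse_position("-67 ~ -31"))           # (-67, -31)
print(distance_to_start_codon("-67 ~ -31"))  # 31
```

Returning `None` for unlisted prefixes keeps the sketch from asserting a strain assignment that the note does not state.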