Simon Houston, Karen Vivien Lithgow, Kara Krista Osbak, Chris Richard Kenyon, Caroline E Cameron.
Abstract
BACKGROUND: Syphilis continues to be a major global health threat, with 11 million new infections each year and a global burden of 36 million cases. The causative agent of syphilis, Treponema pallidum subspecies pallidum, is a highly virulent bacterium; however, the molecular mechanisms underlying T. pallidum pathogenesis remain to be definitively identified. This is because T. pallidum is currently uncultivatable, inherently fragile and thus difficult to work with, and phylogenetically distinct, with no conventional virulence factor homologs found in other pathogens. In fact, approximately 30% of its predicted protein-coding genes have no known orthologs or assigned functions. Here we employed a structural bioinformatics approach using Phyre2-based tertiary structure modeling to improve our understanding of T. pallidum protein function on a proteome-wide scale.
Keywords: Functional annotation; Proteome; Structural modeling; Syphilis; Treponema pallidum; Virulence factors
Year: 2018 PMID: 29769048 PMCID: PMC5956850 DOI: 10.1186/s12900-018-0086-3
Source DB: PubMed Journal: BMC Struct Biol ISSN: 1472-6807
Fig. 1 Pipeline for proteome-wide tertiary structure modeling of T. pallidum. a Dataset: All 978 proteins from T. pallidum subspecies pallidum (Nichols) were used for whole-proteome tertiary structure modeling and functional predictions. b Modeling: Complete amino acid sequences corresponding to the 978 protein-coding genes were submitted in “batch-mode” to the protein tertiary structure modeling server, Phyre2, and the 20 top-ranked template structural matches for each protein were obtained. Only those T. pallidum proteins that were modeled with a Phyre2 confidence score of at least 90% and an alignment coverage of at least 10% were analyzed further. c Validation: To help validate our approach and increase confidence in previous genomic annotations, the predicted functions (based on tertiary structure model template information) of 605 T. pallidum proteins were compared with their corresponding functional annotations derived from genome sequencing. Proteins were then categorized as having either the same, related, or different functions. d To gain insight into the potential function(s) of 175 uncharacterized proteins, and to identify potential virulence factors, the functions of the 20 top-ranked tertiary structure model templates for each protein were analyzed using the same confidence and alignment coverage cut-off scores as described above.
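The cut-offs described in panel b can be illustrated with a short sketch. This is not the authors' code: the `Hit` record, its field names, and the sample values are hypothetical, and the sketch assumes that Phyre2 results have already been parsed into a per-protein list ordered by rank.

```python
from dataclasses import dataclass

MIN_CONFIDENCE = 90.0  # percent; threshold stated in the caption
MIN_COVERAGE = 10.0    # percent; threshold stated in the caption
TOP_N = 20             # number of top-ranked templates kept per protein


@dataclass
class Hit:
    template: str      # hypothetical template identifier
    confidence: float  # Phyre2 confidence score, in percent
    coverage: float    # alignment coverage, in percent


def confident_hits(hits):
    """Keep hits among the top-ranked TOP_N that pass both thresholds."""
    return [
        h for h in hits[:TOP_N]
        if h.confidence >= MIN_CONFIDENCE and h.coverage >= MIN_COVERAGE
    ]


# Made-up values for illustration only (rank order = list order).
example = [
    Hit("templateA", 99.8, 85.0),  # passes both thresholds
    Hit("templateB", 95.0, 6.0),   # fails the coverage threshold
    Hit("templateC", 42.0, 70.0),  # fails the confidence threshold
]
kept = confident_hits(example)
print([h.template for h in kept])
```

A protein would be carried forward for further analysis only if `confident_hits` returns at least one template for it.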
Fig. 2 Distribution of confidence and coverage scores from T. pallidum proteome-wide structural modeling. Amino acid sequences of 978 protein-coding genes from T. pallidum subspecies pallidum were submitted to Phyre2 for structural modeling. a Pie chart indicating the distribution of the number of T. pallidum proteins with tertiary structure model predictions within five Phyre2 confidence score ranges. A confidence score of at least 90% was used for all subsequent analyses. b Pie chart indicating the distribution of the number of T. pallidum proteins modeled by Phyre2 with ≥90% confidence scores within 10% alignment coverage categories (0–100%). An alignment coverage of at least 10% was used for all subsequent analyses.
Fig. 3 Comparison of T. pallidum primary and tertiary structure annotations. Predicted functions of 605 T. pallidum proteins derived from tertiary structure models with high confidence were compared with functional annotations from genome sequencing. a Distribution of T. pallidum proteins modeled by Phyre2 (≥90% confidence, ≥10% alignment coverage) predicted to have the same, related (same PDB functional group classification and/or related PDB molecule/template function), or different functions (including unknown functions) compared to the published T. pallidum (Nichols) genome annotations. This analysis used only the function of the top-ranking tertiary structure model template. b Distribution of T. pallidum proteins as outlined above after genome-annotated functions were compared to all confident (≥90% confidence, ≥10% alignment coverage) top 20-ranking templates used to model each protein. A functional match was assigned when the genome-annotated function matched at least one protein tertiary structure model template function. c Distribution of T. pallidum proteins as outlined above according to their PDB functional classification using the top-ranking template only, or (d) using all confident tertiary structure model top 20-ranking templates for each protein function comparison. Note that in (c) and (d), the protein templates used to model a small number of T. pallidum proteins in the current study were categorized as “unknown function” by the PDB classification system. However, these proteins were ascribed functions in the current study based on their PDB molecule function annotation and/or PDB structure title, which allowed for comparisons with their genomic protein annotations.
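The difference between the top-ranked-only comparison (panel a) and the any-of-the-confident-templates comparison (panel b) can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' procedure: function labels are compared by normalized string equality, the helper names and sample labels are hypothetical, and the "related" category, which depends on curated judgement, is not modeled.

```python
def normalize(label):
    """Lower-case a label and collapse whitespace so trivial differences are ignored."""
    return " ".join(label.lower().split())


def match_top_ranked(annotation, template_functions):
    """Panel a style: compare only against the first-ranked template function."""
    if not template_functions:
        return False
    return normalize(template_functions[0]) == normalize(annotation)


def match_any_confident(annotation, template_functions):
    """Panel b style: a match with at least one confident template is enough."""
    target = normalize(annotation)
    return any(normalize(f) == target for f in template_functions)


# Hypothetical labels for illustration only.
annotation = "Peptidylprolyl isomerase"
templates = ["Chaperone", "peptidylprolyl  isomerase", "Transport protein"]

print(match_top_ranked(annotation, templates))
print(match_any_confident(annotation, templates))
```

Because the second rule accepts a match at any rank among the confident templates, it can only increase the number of proteins assigned to the "same function" category relative to the first.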
Fig. 4 Functional annotation of T. pallidum hypothetical proteins using Phyre2. The potential functions of 175 uncharacterized proteins were predicted by analyzing the functions of the template proteins Phyre2 used for tertiary structure modeling. a Distribution of PDB functional classes within the T. pallidum proteome based on Phyre2 modeling using the top-ranking tertiary structure model template only, or (b) by comparing the genome-annotated functions to all confident (≥90% confidence, ≥10% alignment coverage) tertiary structure model top 20-ranking templates that were used to model each protein.
Summary comparison of virulence factor candidates from whole genome sequencing and corresponding Phyre2-modeled proteins
| Protein | Genome Annotation [ | Phyre2 Model Templates: 1st rank model and potential virulence model(s) |
|---|---|---|
| TPANIC_0027 | Putative hemolysin (HlyC) | 1st rank: CorC Magnesium/Cobalt efflux protein ( |
| TPANIC_0028 | Putative hemolysin (HlyC) | 1st rank: CorC Magnesium/Cobalt efflux protein ( |
| TPANIC_0399 | Type 3 (virulence-related) | 1st rank: PrgH ( |
| TPANIC_0401 | Type 3 (virulence-related) | 1st rank: V-type proton ATPase subunit E (yeast) |
| TPANIC_0402 | Type 3 (virulence-related) | 1st rank: Flagellar type 3 ATPase FliI ( |
| TPANIC_0649 | Putative hemolysin (TlyC) | 1st rank: CorC Magnesium/Cobalt efflux protein ( |
| TPANIC_0714 | Type 3 (virulence-related) | 1st rank: Flagellar biosynthesis protein FlhA ( |
| TPANIC_0715 | Type 3 (virulence-related) | 1st rank: YscU ( |
| TPANIC_0936 | Putative hemolysin | 1st rank: CorC Magnesium/Cobalt efflux protein ( |
| TPANIC_1037 | Putative hemolysin III (HlyIII) | 1st rank: Human adiponectin receptor 1 |
Genome sequencing annotated proteins with potential novel virulence-related functions identified by Phyre2 modeling
| Protein | Genome Annotation | Phyre2 Models: 1st rank model and potential virulence model(s) |
|---|---|---|
| TPANIC_0262 | Cyclic nucleotide-binding protein | 1st rank: PrfA ( |
| TPANIC_0862 | Peptidylprolyl isomerase (FklB) | 1st rank: Mip ( |
| TPANIC_1033 | Patatin family phospholipase | 1st rank: VipD ( |
Summary of T. pallidum uncharacterized proteins identified by structural modeling with potential roles in virulence
| Protein | Phyre2 Virulence Model Templates (% confidence / % coverage) | Roles in virulence | Expression: cDNA/DNA | Expression: Protein | MSC |
|---|---|---|---|---|---|
| TPANIC_0020 | • TgMIC2 ( | • Promotes active invasion [ | 2.206 | E [ | - |
| | • TRAP protein ( | • Cell adhesion & invasion [ | | | |
| TPANIC_0126 | • Outer membrane protein W ( | • Phagocytosis resistance [ | 1.115 | E [ | + |
| | • Outer membrane protein A ( | • Host adhesion, invasion and immune evasion [ | | | |
| | • Outer membrane protein OprG ( | • Cytotoxicity [ | | | |
| | • Outer membrane protein F ( | • Host adhesion [ | | | |
| | • NspA (Neisseria surface protein A) (90.4 / 64.1) | • Factor H-binding and complement resistance [ | | | |
| TPANIC_0134 | • Bacterial sialidases/neuraminidases ( | • | 3.586 | ND | + |
| | | • | | | |
| TPANIC_0225 | • Leucine-rich repeat surface proteins ( | • Host adhesion [ | 2.27 | ND | - |
| | • PcpA ( | • Host adhesion [ | | | |
| TPANIC_0246 | • TRAP protein ( | • Cell recognition & invasion [ | 0.096 | E [ | - |
| | • TRAP protein ( | • Cell recognition & invasion [ | | | |
| | • TgMIC2 ( | • Active invasion [ | | | |
| TPANIC_0421 | • PknD ( | • Adhesion and invasion of brain endothelia [ | 0.686 | E [ | - |
| TPANIC_0544 | • SmcL ( | • Cytotoxicity [ | 0.811 | E [ | - |
| | • Beta-hemolysin toxin ( | • Cytotoxicity [ | | | |
| | • Cytolethal distending toxin protein B ( | • Cytotoxicity [ | | | |
| TPANIC_0579 | • YenC2 ( | • Cytotoxicity [ | 0.268 | E [ | - |
| TPANIC_0594 | • HP1028 ( | • Host colonization and persistence [ | 2.238 | ND | + |
| TPANIC_0598 | • BamB ( | • BAM complex; assembly and insertion of beta-barrel proteins in outer membrane [ | 0.272 | E [ | - |
| TPANIC_0625 | • BamD (Beta barrel assembly machinery protein) ( | • Essential BAM complex protein; assembly and insertion of beta-barrel proteins in outer membrane [ | 0.461 | E [ | - |
| TPANIC_0733 | • NspA (Neisseria surface protein A) (97.8 / 58.4) | • Factor H-binding and complement resistance [ | 1.735 | ND | + |
| | • Ail ( | • Host cell attachment, invasion, and complement resistance [ | | | |
| TPANIC_0783 | • BamB ( | • BAM complex; assembly and insertion of beta-barrel proteins in outer membrane [ | 0.402 | ND | - |
| TPANIC_0789 | • LolA ( | • Translocation of lipoproteins to the outer membrane [ | 1.283 | E [ | - |
| | • LprG ( | • TLR2-agonist; inhibits primary human macrophage MHC-II Ag processing [ | | | |
| TPANIC_0854 | • Bacterial sialidases/neuraminidases ( | • | 0.221 | E [ | - |
| | | • | | | |
| TPANIC_0911 | • EscU ( | • Type 3 effector translocation into host cells [ | 0.701 | ND | - |
| | • SpaS ( | • Invasion and secretion of invasion plasmid antigens [ | | | |
| TPANIC_0928 | • SurA ( | • Folding and assembly of outer membrane proteins [ | 1.049 | E [ | - |
| TPANIC_0966 | • TolC ( | • Type 1 secretion and drug efflux [ | 0.856 | ND | - |
| TPANIC_0967 | • TolC ( | • Type 1 secretion and drug efflux [ | 2.844 | E [ | + |
| TPANIC_0968 | • TolC ( | • Type 1 secretion and drug efflux [ | 3.256 | E [ | + |
| TPANIC_0969 | • TolC ( | • Type 1 secretion and drug efflux [ | 2.751 | E [ | + |
cDNA/DNA ratios indicate transcript expression levels from a previous rabbit infection study, where a value of 1.0 represents the mean transcript expression level for all T. pallidum genes in the study [45]. E: proteins known to be expressed during rabbit infection; ND: proteins not detected [47, 48]. MSC: T. pallidum subspecies pallidum proteins that contain major sequence changes compared to orthologs from subspecies pertenue and T. paraluiscuniculi [55, 61].
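As a reading aid for the columns defined above, the following sketch selects candidates whose transcript level exceeds the study mean (cDNA/DNA > 1.0) and that carry major sequence changes (MSC "+"). It is illustrative only: the tuple layout and function name are hypothetical, the four rows are copied from the table above, and combining these two criteria is an example of how the table might be queried, not a selection rule stated by the authors.

```python
MEAN_RATIO = 1.0  # a cDNA/DNA value of 1.0 is the mean transcript level in the cited study

# (protein, cDNA/DNA ratio, protein detection: "E" or "ND", MSC flag)
rows = [
    ("TPANIC_0020", 2.206, "E", False),
    ("TPANIC_0126", 1.115, "E", True),
    ("TPANIC_0134", 3.586, "ND", True),
    ("TPANIC_0246", 0.096, "E", False),
]


def above_mean_with_msc(table):
    """Return proteins with above-mean transcript levels and major sequence changes."""
    return [
        name for name, ratio, _detected, msc in table
        if ratio > MEAN_RATIO and msc
    ]


print(above_mean_with_msc(rows))  # ['TPANIC_0126', 'TPANIC_0134']
```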