Dickson Kinyanyi1, George Obiero2, George F O Obiero1, Peris Amwayi1, Stephen Mwaniki1, Mark Wamalwa3.
Abstract
African swine fever virus (ASFV) is the etiological agent of ASF, a fatal hemorrhagic fever that affects domestic pigs. There is currently no vaccine against ASFV, making it a significant threat to the pork industry. The ASFV genome sequence has been published; however, about half of ASFV open reading frames have not been characterized in terms of their structure and function despite being essential for our understanding of ASFV pathogenicity. The present study reports the three-dimensional structure and function of an uncharacterized protein, pB263R (NP_042780.1), an open reading frame found in all ASFV strains. Sequence-based profiling and hidden Markov model search methods were used to identify remote pB263R homologs. Iterative Threading ASSEmbly Refinement (I-TASSER) was used to model the three-dimensional structure of pB263R. The posterior probability of fold family assignment was calculated using TM-fold, and biological function was assigned using TM-site, RaptorXBinding, Gene Ontology, and TM-align. Our results suggest that pB263R has the features of a TATA-binding protein and is thus likely to be involved in viral gene transcription.
Keywords: African swine fever transcription factor; African swine fever virus; BA71V pB263R; I-TASSER; In silico characterization; Sequence homology search; TATA-binding protein
Year: 2018 PMID: 29492339 PMCID: PMC5825884 DOI: 10.7717/peerj.4396
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
pB263R homologs identified by HHpred.
| Rank | PDB ID | Protein domain | Probability | E-value | Identity | Query HMM region | Columns |
|---|---|---|---|---|---|---|---|
| 1 | 2z8uA | TBP | 100% | 7.6 × 10⁻³⁹ | 15% | 44–253 | 176 |
| 2 | 1aisA | TBP | 100% | 1.3 × 10⁻³⁸ | 15% | 40–251 | 178 |
| 3 | 1ytbA | TBP | 100% | 2.1 × 10⁻³⁸ | 15% | 44–253 | 175 |
| 4 | 1mp9A | TBP | 100% | 3.7 × 10⁻³⁸ | 12% | 44–263 | 185 |
| 5 | 3eikA | TBP | 100% | 1.0 × 10⁻³⁷ | 13% | 34–253 | 185 |
Notes.
Probability: probability of the template being a true positive.
Identity: percent sequence identity between the template and query sequences.
Columns: number of aligned HMM-HMM match-match columns.
HMM, hidden Markov model; PDB ID, Protein Data Bank identification; TBP, TATA-binding protein.
Figure 1. I-TASSER-modeled structure of the complete pB263R protein sequence (residues 1–263).
A ribbon-style representation of the predicted 3D model (pB263R model 1a) is shown, with a TM score of 0.52 ± 0.15 and Cs of −1.61. The β strands and α helices form a pseudo-dyad structure consisting of two similar subdomains, each with five β strands (S1–S5 on subdomain 1 and S1′–S5′ on subdomain 2) and two α helices (H1/H2 and H1′/H2′). N and C indicate the N and C termini, respectively, of pB263R model 1a.
Figure 2. Model of pB263R residues 45 (A1) to 263 (D219) predicted with I-TASSER.
The structure was consistent with typical TBP topology but with an asymmetric number of anti-parallel β sheets. The truncated model had a higher TM score (0.57) and Cs (−1.15) than the full-length model (0.52 and −1.61, respectively), indicating that the loop from residues 1 to 44 contained a possible species-specific intrinsically disordered region.
Figure 3. TM-align structural superposition of pB263R model 1a (blue ribbon) and experimentally solved 4b0aA.
The TM score was 0.711. The figure was generated using the Molsoft ICM Browser (http://www.molsoft.com/icm_browser.html).
Figure 4. Validation of the in silico predicted pB263R structure. The predicted structure was validated using ProSA-web and RAMPAGE.
(A) ProSA-web plot showing the Z-score of the predicted pB263R structure (−4.65; black dot) relative to the Z-scores of similarly sized protein structures solved using NMR and X-ray crystallography. (B) Ramachandran plot obtained from RAMPAGE showing that 85.4% of residues lie in the most favoured region, 9.6% in allowed regions, and 4.6% in the outlier region.
Top 10 experimentally solved structural analogs of pB263R model 1a identified by TM-align.
| Rank | PDB hit | SCOPe classification | Non-normalized TM score | RMSD (Å) | IDEN | Coverage | Posterior probability in SCOPe | Posterior probability in CATH |
|---|---|---|---|---|---|---|---|---|
| 1 | 4b0aA | Not manually classified in SCOP 2.06 | 0.711 | 1.49 | 0.162 | 0.715 | 0.9881 | 0.9832 |
| 2 | 1vokA | d.129.1.1 | 0.670 | 1.64 | 0.135 | 0.696 | 0.9918 | 0.9966 |
| 3 | 1jfiC | d.129.1.1 | 0.664 | 1.45 | 0.115 | 0.684 | 0.9958 | 0.9954 |
| 4 | 1mp9A | d.129.1.1 | 0.657 | 2.09 | 0.125 | 0.677 | 0.9958 | 0.9910 |
| 5 | 1nh2A | d.129.1.1 | 0.656 | 1.31 | 0.167 | 0.684 | 0.9920 | 0.9914 |
| 6 | 3eikA | d.129.1.0 | 0.645 | 1.43 | 0.146 | 0.688 | 0.9920 | 0.9369 |
| 7 | 1pczA | d.129.1.1 | 0.643 | 1.88 | 0.158 | 0.665 | 0.9918 | 0.9964 |
| 8 | 2z8uB | d.129.1.1 | 0.618 | 1.68 | 0.161 | 0.677 | 0.9919 | 0.9098 |
| 9 | 2glsL | d.15.9.1 | 0.486 | 5.48 | 0.061 | 0.681 | 0.2902 | 0.1236 |
| 10 | 3ng0A | d.15.9.0 | 0.481 | 5.66 | 0.057 | 0.707 | 0.2736 | 0.1150 |
Notes.
Rank: rank of PDB structures based on the TM score of the structural alignment between pB263R model 1a and known structures in the PDB library.
RMSD: root mean square deviation between residues structurally aligned by TM-align.
IDEN: sequence identity in the structurally aligned regions.
Coverage: alignment coverage by TM-align, equivalent to the number of structurally aligned residues divided by the length of the queried protein sequence.
Posterior probability in SCOPe: probability of the two protein structures being in the same fold family in SCOPe.
Posterior probability in CATH: probability of the two protein structures being in the same fold family in CATH.
CATH, Class, Architecture, Topology, and Homology; PDB, Protein Data Bank; SCOP, Structural Classification of Proteins.
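The coverage definition in the notes above can be inverted to recover an approximate count of aligned residues, and the TM scores can be screened against a fold-similarity cutoff. The following is a minimal sketch, not part of the authors' pipeline. It assumes that coverage is normalized by the 263 residues of the full-length model 1a, and it uses a TM score above 0.5 as a rule of thumb for "same fold", which is a common convention and is not stated in the table. The `Hit` class and both helper functions are hypothetical names, and only four rows of the table are copied in for brevity.

```python
from dataclasses import dataclass

QUERY_LENGTH = 263  # assumption: full-length pB263R model 1a (residues 1-263)
SAME_FOLD_TM = 0.5  # assumption: conventional rule-of-thumb cutoff, not from the table


@dataclass(frozen=True)
class Hit:
    pdb: str
    tm_score: float
    rmsd: float
    coverage: float
    p_scope: float
    p_cath: float


# Values copied from ranks 1, 2, 9, and 10 of the table above.
HITS = [
    Hit("4b0aA", 0.711, 1.49, 0.715, 0.9881, 0.9832),
    Hit("1vokA", 0.670, 1.64, 0.696, 0.9918, 0.9966),
    Hit("2glsL", 0.486, 5.48, 0.681, 0.2902, 0.1236),
    Hit("3ng0A", 0.481, 5.66, 0.707, 0.2736, 0.1150),
]


def aligned_residues(hit, query_length=QUERY_LENGTH):
    """Invert coverage = aligned residues / query length (rounded)."""
    return round(hit.coverage * query_length)


def likely_same_fold(hits, threshold=SAME_FOLD_TM):
    """Return PDB hits whose TM score exceeds the chosen cutoff."""
    return [h.pdb for h in hits if h.tm_score > threshold]


if __name__ == "__main__":
    for h in HITS:
        print(h.pdb, aligned_residues(h), h.tm_score > SAME_FOLD_TM)
```

Under these assumptions, the two lowest-ranked hits fall below the cutoff, which agrees with their low posterior probabilities in both SCOPe and CATH.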
Top five predicted binding site residues for pB263R as determined by TM-site.
| Rank | CSt | Cluster size | Representative template | TATA box sequence | Ligands | Predicted binding site residues in pB263R model 1a |
|---|---|---|---|---|---|---|
| 1 | 0.39 | 32 | 1qn4A_BS01_NUC | TGCC[CATTTATA]GC (TATAAATG) | Nucleic acid (32) | N52, N54, F90, N91, K108, F110, T113, E115, Q117, I157, Q158, N160, E200, D201, S205, F207, R217, N219, L229, and N232 |
| 2 | 0.37 | 27 | 1qnbA_BS02_NUC | TGCC[CATTTATA]GC (TATAAATG) | Nucleic acid (27) | K49, A50, N52, C89, F90, E95, I98, M104, K106, K108, P119, G120, N160, D201, S202, N219, F221, K223, K225, and N227 |
| 3 | 0.19 | 3 | 5fmfQ_BS02_NUC | C[CTTTTATA]G (TATAAAAG) | Nucleic acid (3) | K49, A50, T88, C89, F90, N91, A93, E95, S97, M104, K106, K108, F110, P119, G120, I122, D201, S202, F221, K223, and K225 |
| 4 | 0.16 | 1 | 5fz5O_BS01_NUC | C[CTTTTATA]G (TATAAAAG) | Nucleic acid (1) | N52, N54, L60, F90, F110, S112, E115, Q117, Q158, D201, and L229 |
| 5 | 0.16 | 1 | 5fz5O_BS02_NUC | T[ATTATATA]CA (TATATAAT) | Nucleic acid (1) | K49, N50, N52, C89, F90, E95, L99, K106, K108, P119, and S202 |
Notes.
CSt: confidence score of the TM-site prediction (range: 0–1).
Cluster size: total number of templates in a cluster.
Representative template: single complex structure with the most representative ligand in the cluster.
Ligands: name of the identified ligand.
Figure 5. Multiple sequence alignment of TBP sequences of the top 10 closest TBP structural homologs of pB263R.
4b0aA, S. cerevisiae TBP transcription initiation factor II D subunit 1; 3eikA, E. cuniculi TBP; 2z8uB, M. jannaschii TBP; 1vokA, A. thaliana TBP; 1rm1A, S. cerevisiae TBP; 1pczA, P. woesei TBP; 1nh2A, S. cerevisiae TBP CTD; 1mp9A, S. acidocaldarius TBP; 1jfiC, Homo sapiens TBP; 1aisA, P. woesei TBP. Amino acid residues F90, F110, and F221 are indicated as highly conserved among TBPs.
Figure 6. Structural superimposition of pB263R model 1a (blue coil) and experimentally solved 1qn4A (red ribbon).
The predicted conserved TATA box element includes intercalating residues F90, F110, and F221 of pB263R model 1a on the underside of concave β sheets.
Binding pockets predicted by RaptorX Binding based on the pB263R sequence.
| Pocket | Multiplicity | Ligand | Binding residues |
|---|---|---|---|
| 1 | 146 | DT | D201, K203, S205, N219, and L229 |
| 2 | 141 | DT | C89, F90, S97, K106, K108, and Q117 |
| 3 | 110 | DA | N52, N54, F110, E115, Q117, and I157 |
| 4 | 104 | DA | N52, I157, M158, N160, R217, L229, and G230 |
| 5 | 96 | DT | K49, A50, N52, L99, K106, P119, G120, and N160 |
| 6 | 94 | DA | K49, N160, K162, F221, K225, and N227 |
| 7 | 94 | DA | D201, S202, F221, K223, and K225 |
| 8 | 82 | DG | F90, F110, T113, and E115 |
Notes.
Multiplicity: frequency with which the predicted pocket was found in the query protein template structure.
DA, 2′-deoxyadenosine-5′-monophosphate; DG, 2′-deoxyguanosine-5′-monophosphate; DT, thymidine-5′-monophosphate.
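One way to read the pocket table is to ask which residues recur across several predicted pockets. The sketch below tallies this with a simple counter over the residue lists copied from the table. It is an illustrative post-processing step only, not a feature of RaptorX Binding, and the names `POCKETS`, `residue_recurrence`, and `recurrent` are hypothetical.

```python
from collections import Counter

# Binding residues per pocket, copied from the table above.
POCKETS = {
    1: ["D201", "K203", "S205", "N219", "L229"],
    2: ["C89", "F90", "S97", "K106", "K108", "Q117"],
    3: ["N52", "N54", "F110", "E115", "Q117", "I157"],
    4: ["N52", "I157", "M158", "N160", "R217", "L229", "G230"],
    5: ["K49", "A50", "N52", "L99", "K106", "P119", "G120", "N160"],
    6: ["K49", "N160", "K162", "F221", "K225", "N227"],
    7: ["D201", "S202", "F221", "K223", "K225"],
    8: ["F90", "F110", "T113", "E115"],
}


def residue_recurrence(pockets):
    """Count the number of pockets in which each residue appears."""
    return Counter(res for residues in pockets.values() for res in set(residues))


def recurrent(pockets, min_pockets=2):
    """Residues found in at least `min_pockets` pockets, sorted by position."""
    counts = residue_recurrence(pockets)
    hits = [res for res, n in counts.items() if n >= min_pockets]
    return sorted(hits, key=lambda res: int(res[1:]))


if __name__ == "__main__":
    counts = residue_recurrence(POCKETS)
    for res in recurrent(POCKETS):
        print(res, counts[res])
```

In this tally, the three phenylalanines highlighted in the Figure 6 caption (F90, F110, and F221) each appear in two of the eight pockets.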
Figure 7. Bootstrap phylogenetic tree of pB263R with archaeal, eukaryotic, and prokaryotic TBPs and related factors.
Light green indicates NCLDV TBPs, while red indicates BA71V pB263R. Values at nodes indicate maximum likelihood bootstrap percentages. The placement of pB263R (red) relative to the NCLDV TBPs (light green) suggests a closer evolutionary relatedness. Phylogenetic analyses were performed with MEGA6.