Bernardo Ochoa-Montaño, Nishita Mohan, Tom L Blundell.
Abstract
Tuberculosis kills more than a million people annually and presents increasingly high levels of resistance against current first-line drugs. Structural information about Mycobacterium tuberculosis (Mtb) proteins is a valuable asset for the development of novel drugs and for understanding the biology of the bacterium; however, only about 10% of the ∼4000 proteins have had their structures determined experimentally. The CHOPIN database assigns structural domains and generates homology models for 2911 sequences, corresponding to ∼73% of the proteome. A sophisticated pipeline allows multiple models to be created using conformational states characteristic of different oligomeric states and ligand binding, such that the models reflect various functional states of the proteins. Additionally, CHOPIN includes structural analyses of mutations potentially associated with drug resistance. Results are made available at the web interface, which also serves as an automatically updated repository of all published Mtb experimental structures. Its RESTful interface allows direct and flexible access to structures and metadata via intuitive URLs, enabling easy programmatic use of the models.
Year: 2015 PMID: 25833954 PMCID: PMC4381106 DOI: 10.1093/database/bav026
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Overview of the CHOPIN modelling pipeline.
Figure 2. Distribution of the best FUGUE Z-Scores for all sequences of the Mtb proteome. Blue (Z-Score ≥ 15), green (6 < Z-Score < 15) and yellow (4 < Z-Score < 6) correspond to very high, high and reasonable confidence matches, respectively, whereas red indicates non-significant hits.
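The confidence classes named in the Figure 2 caption can be expressed as a small lookup. The following Python sketch is illustrative only and is not part of CHOPIN: the function name `fugue_confidence` is hypothetical, and because the caption leaves the boundary values 4 and 6 unassigned, the sketch assumes each lower bound is inclusive.

```python
def fugue_confidence(z_score: float) -> str:
    """Map a FUGUE Z-Score to the confidence classes named in Figure 2."""
    if z_score >= 15:
        return "very high"
    if z_score >= 6:  # assumption: exactly 6 is counted as "high"
        return "high"
    if z_score >= 4:  # assumption: exactly 4 is counted as "reasonable"
        return "reasonable"
    return "non-significant"


if __name__ == "__main__":
    # Example scores chosen for illustration, not taken from the database.
    for score in (22.5, 9.1, 4.7, 2.0):
        print(score, fugue_confidence(score))
```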
General statistics of CHOPIN pipeline results
| Category | Count |
|---|---|
| Sequences w/ FUGUE | 1832 |
| Sequences w/ FUGUE | 759 |
| Sequences w/ FUGUE | 157 |
| Sequences w/ FUGUE | 759 |
| Sequences without FUGUE hits | 157 |
| Number of significant hits ( | 5268 |
| Unique TOCCATA profiles among hits | 2009 |
| Number of multi-domain hits | 523 |
| Number of alignments | 16 420 |
| Number of unique alignments | 13 169 |
| Alignments w/apo-form templates | 6071 |
| Alignments w/liganded templates | 5133 |
| Alignments w/complexed templates | 6365 |
| Alignments w/monomeric templates | 4839 |
| Alignments w/templates in any state | 5216 |
| Average template PID (%) | 24.21 |
| Total number of models | 49 218 |
| Top models w/ ‘great’ quality rating (=4) | 7026 |
| Top models w/ ‘good’ quality rating (≥3, <4) | 3187 |
| Top models w/ ‘fair’ quality rating (≥2, <3) | 2269 |
| Top models w/ ‘poor’ quality rating (<2) | 3931 |
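As a rough aid to reading the quality rows above, the sketch below bins a numeric rating into the four labels and computes each label's share of the top models. It is a minimal illustration, not the pipeline's own code: the names `quality_label` and `quality_shares` are hypothetical, the counts are copied from the table, and the sketch assumes that 4 is the maximum rating.

```python
# Top-model counts per quality label, copied from the table above.
TOP_MODEL_COUNTS = {"great": 7026, "good": 3187, "fair": 2269, "poor": 3931}


def quality_label(rating: float) -> str:
    """Bin a numeric quality rating using the thresholds listed in the table."""
    if rating >= 4:  # assumption: 4 is the highest possible rating
        return "great"
    if rating >= 3:
        return "good"
    if rating >= 2:
        return "fair"
    return "poor"


def quality_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each label's percentage of the total number of top models."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    for label, share in quality_shares(TOP_MODEL_COUNTS).items():
        print(f"{label}: {share:.1f}%")
```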
Figure 3. Venn diagram of number of alignments according to conformational state of templates. Alignments in apo-form are in yellow tones; liganded in blue tones; monomeric in teal and complexed in magenta. State-free alignments, where templates can be in any state, are shown in the white centre.
Mutations predicted to be deleterious to protein stability according to SDM and mCSM
| Sequence ID | Mutation | Strain/Source | Sequence Description | SDM ΔΔG (kJ/mol) | mCSM ΔΔG (kJ/mol) |
|---|---|---|---|---|---|
| Rv0006 | A74S | FLQ | DNA gyrase subunit A gyrA | −2.29 | −1.15 |
| Rv0006 | D94A | FLQ | DNA gyrase subunit A gyrA | 2.04 | −0.79 |
| Rv0006 | G247S | DS,MDR,XDR | DNA gyrase subunit A gyrA | −3.28 | −1.29 |
| Rv0237 | A240V | DS,MDR,XDR | Lipoprotein lpqI | 2.18 | −0.71 |
| Rv0319 | G69D | DS,MDR,XDR | Pyrrolidone-carboxylate peptidase pcp | −1.57 | −2.31 |
| Rv0404 | P478H | DS,MDR,XDR | Fatty-acid-CoA ligase fadD30 | 1.38 | −2.10 |
| Rv0655 | V144A | DS,MDR,XDR | Ribonucleotide transport ATP-binding protein ABC transporter mkl | −1.53 | −2.38 |
| Rv0667 | L456S | DS,MDR,XDR | DNA-directed RNA polymerase beta subunit rpoB | −4.11 | −2.66 |
| Rv0667 | I1112T | XDR | DNA-directed RNA polymerase beta subunit rpoB | −4.53 | −2.43 |
| Rv0721 | A105V | DS,MDR,XDR | 30S ribosomal protein S5 rpsE | 2.18 | −0.25 |
| Rv0790c | F83S | DS,MDR,XDR | Hypothetical protein | −2.20 | −2.66 |
| Rv1001 | T281M | DS,MDR,XDR | Arginine deiminase arcA | 2.39 | −0.31 |
| Rv1039c | A67T | DS,MDR,XDR | PPE family protein | −2.48 | −0.92 |
| Rv1240 | G306R | DS,MDR,XDR | Malate dehydrogenase mdh | 3.41 | −0.97 |
| Rv1276c | Q79E | DS,MDR,XDR | Hypothetical protein | −0.31 | −2.48 |
| Rv1569 | A171G | DS,MDR,XDR | 8-Amino-7-oxononanoate synthase bioF1 | −2.24 | −1.39 |
| Rv1600 | S271A | DS,MDR,XDR | Histidinol-phosphate aminotransferase hisC1 | 2.85 | −0.50 |
| Rv1605 | G145V | DS,MDR,XDR | Cyclase hisF | 2.55 | −0.41 |
| Rv1638 | S908I | DS,MDR,XDR | Excinuclease ABC subunit A (DNA-binding ATPase) uvrA | 3.02 | 0.11 |
| Rv1825 | P181S | DS,MDR,XDR | Hypothetical protein | −0.81 | −2.03 |
| Rv1870c | D123G | DS,MDR,XDR | Hypothetical protein | 2.51 | −0.38 |
| Rv1878 | S296F | DS,MDR,XDR | Glutamine synthetase glnA3 | 3.03 | −0.90 |
| Rv1933c | V196A | MDR,XDR | Acyl-CoA dehydrogenase fadE18 | −2.73 | −2.53 |
| Rv2000 | L275P | XDR | Hypothetical protein | −6.18 | −0.95 |
| Rv2043c | A3P | PZA | Pyrazinamidase/Nicotinamidase PncA (PZase) | −3.35 | −0.51 |
| Rv2043c | Q10P | PZA | Pyrazinamidase/Nicotinamidase PncA (PZase) | −2.32 | −0.49 |
| Rv2043c | C14H | PZA | Pyrazinamidase/Nicotinamidase PncA (PZase) | −4.49 | −1.44 |
| Rv2043c | C14R | PZA | Pyrazinamidase/Nicotinamidase PncA (PZase) | −3.76 | −0.63 |
| Rv2043c | L19P | PZA | Pyrazinamidase/Nicotinamidase PncA (PZase) | −2.48 | −1.46 |
| Rv2043c | V21G | PZA | Pyrazinamidase/Nicotinamidase PncA (PZase) | −4.20 | −1.60 |
| Rv2043c | Y34S | PZA | Pyrazinamidase/Nicotinamidase PncA (PZase) | −2.47 | −2.96 |
| Rv2122c | A88D | DS,MDR,XDR | Phosphoribosyl-ATP pyrophosphohydrolase hisE | −2.70 | −0.82 |
| Rv2161c | G105A | DS,MDR,XDR | Hypothetical protein | 2.23 | −0.47 |
| Rv2197c | P112S | DS,MDR,XDR | Conserved transmembrane protein | 2.77 | −0.56 |
| Rv2250c | A119T | DS,MDR,XDR | Hypothetical transcriptional regulatory protein | −2.02 | −0.68 |
| Rv2464c | A99T | DS,MDR,XDR | Hypothetical DNA glycosylase | −2.84 | −1.35 |
| Rv2886c | V153A | DS,MDR,XDR | Hypothetical resolvase | −2.73 | −2.48 |
| Rv2887 | S2G | DS,MDR,XDR | Hypothetical transcriptional regulatory protein | 2.58 | −0.24 |
| Rv3032 | Q310L | DS,MDR,XDR | Hypothetical transferase | 3.07 | −0.33 |
| Rv3174 | L42R | DS,MDR,XDR | Hypothetical short-chain type dehydrogenase/reductase | −2.32 | −1.56 |
| Rv3545c | I359T | DS,MDR,XDR | Cytochrome P450 125 cyp125 | −2.20 | −2.79 |
| Rv3591c | F30S | DS,MDR,XDR | Hypothetical hydrolase | −3.05 | −1.96 |
| Rv3606c | L172P | DS,MDR,XDR | 2-Amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase folK | −2.74 | −1.45 |
| Rv3719 | R310T | DS,MDR,XDR | Hypothetical protein | −2.20 | −1.80 |
DS (Drug Sensitive), MDR (Multiple Drug Resistant) and XDR (eXtensively Drug Resistant) refer to the KwaZulu-Natal strains sequenced by the Broad Institute, with residue numbers given relative to the F11 reference strain. PZA and FLQ indicate various high-confidence pyrazinamide- or fluoroquinolone-resistant strains, respectively, as identified on TBDreaMDB, with residue numbers relative to the H37Rv strain.
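For readers who want to process the mutation table programmatically, the following Python sketch parses the mutation codes (wild-type residue, position, mutant residue) and the ΔΔG columns. It is an illustration under stated assumptions rather than anything provided by CHOPIN: all names are hypothetical, the three sample rows are copied from the table, and the check for agreement between predictors only compares signs without applying any stability threshold.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    wild_type: str
    position: int
    mutant: str


# One-letter residue, residue number, one-letter residue (e.g. "A74S").
_MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code: str) -> Mutation:
    """Split a code such as 'I1112T' into wild type, position and mutant."""
    match = _MUTATION_RE.match(code)
    if match is None:
        raise ValueError(f"unrecognised mutation code: {code!r}")
    wild_type, position, mutant = match.groups()
    return Mutation(wild_type, int(position), mutant)


def to_float(text: str) -> float:
    """Convert a table value, accepting the Unicode minus sign (U+2212)."""
    return float(text.strip().replace("\u2212", "-"))


def both_negative(sdm: str, mcsm: str) -> bool:
    """True when both predictors report a negative value for the mutation."""
    return to_float(sdm) < 0 and to_float(mcsm) < 0


if __name__ == "__main__":
    # Sample rows copied from the table: (sequence ID, mutation, SDM, mCSM).
    rows = [
        ("Rv0006", "A74S", "\u22122.29", "\u22121.15"),
        ("Rv0006", "D94A", "2.04", "\u22120.79"),
        ("Rv1638", "S908I", "3.02", "0.11"),
    ]
    for seq_id, code, sdm, mcsm in rows:
        print(seq_id, parse_mutation(code), both_negative(sdm, mcsm))
```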