Muse Oke, Lester G Carter, Kenneth A Johnson, Huanting Liu, Stephen A McMahon, Xuan Yan, Melina Kerou, Nadine D Weikart, Nadia Kadi, Md Arif Sheikh, Stefan Schmelz, Mark Dorward, Michal Zawadzki, Christopher Cozens, Helen Falconer, Helen Powers, Ian M Overton, C A Johannes van Niekerk, Xu Peng, Prakash Patel, Roger A Garrett, David Prangishvili, Catherine H Botting, Peter J Coote, David T F Dryden, Geoffrey J Barton, Ulrich Schwarz-Linek, Gregory L Challis, Garry L Taylor, Malcolm F White, James H Naismith.
Abstract
The Scottish Structural Proteomics Facility was funded to develop a laboratory-scale approach to high-throughput structure determination. The effort was successful in that over 40 structures were determined. These structures and the methods harnessed to obtain them are reported here. This report reflects on the value of automation but also on the continued requirement for a high degree of scientific and technical expertise. The efficiency of the process poses challenges to the current paradigm of structural analysis and publication. In the 5-year period we published ten peer-reviewed papers reporting structural data arising from the pipeline. Nevertheless, the number of structures solved exceeded our ability to analyse and publish each new finding. By reporting the experimental details and depositing the structures we hope to maximize the impact of the project by allowing others to follow up the relevant biology.
Year: 2010 PMID: 20419351 PMCID: PMC2883930 DOI: 10.1007/s10969-010-9090-y
Source DB: PubMed Journal: J Struct Funct Genomics ISSN: 1345-711X
CMR target selection summary
| Filter summary | No. of proteins post filtering |
|---|---|
| All CMR proteins | 657,391 |
| OB-Score ≥ 5 | 197,000 |
| PSIBLAST match (a) to Ensembl human protein | 55,405 |
| No PSIBLAST match (a) to PDB | 4,461 |
| Assign UniProt ID (b) | 4,306 |
| No TMHMM transmembrane regions & ≤20% SEG low-complexity, sequence length 60–600 amino acids | 3,508 |
| Infer Pfam family | 3,008 (344 families) |
| No structure in Pfam family | 1,329 (143 families) |

(a) Matches defined by Rost curve [14]

(b) UniProt identifiers inferred by perfect sequence match or BLASTP (1E-6, 90% query coverage, 90% identity)
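
The sequence-level step in the table above (no TMHMM transmembrane regions, at most 20% SEG low-complexity, length 60–600 amino acids) can be sketched as a simple predicate. This is a minimal illustration only, not the facility's actual code: the `Candidate` record, its field names, and the `HYP_` identifiers are hypothetical, and the sketch assumes the TMHMM and SEG results have already been computed for each protein.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record holding precomputed annotations for one protein."""
    protein_id: str
    length: int                 # sequence length in amino acids
    tm_regions: int             # number of TMHMM-predicted transmembrane regions
    low_complexity_frac: float  # fraction of residues flagged by SEG (0.0 to 1.0)


def passes_sequence_filter(c: Candidate) -> bool:
    """Apply the three criteria listed in the table row (bounds inclusive)."""
    return (
        c.tm_regions == 0
        and c.low_complexity_frac <= 0.20
        and 60 <= c.length <= 600
    )


# Made-up example records, for illustration only.
candidates = [
    Candidate("HYP_0001", 250, 0, 0.05),
    Candidate("HYP_0002", 45, 0, 0.00),   # too short
    Candidate("HYP_0003", 300, 2, 0.10),  # predicted transmembrane regions
    Candidate("HYP_0004", 600, 0, 0.20),  # exactly on both upper bounds
    Candidate("HYP_0005", 400, 0, 0.35),  # too much low-complexity sequence
]

selected = [c.protein_id for c in candidates if passes_sequence_filter(c)]
print(selected)
```

Treating the bounds as inclusive is an assumption based on the "≤20%" and "60–600" wording in the table.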
Fig. 1 Bar presentation of primers used for gene amplifications at SSPF. The 5′ end common primer (co) is a double-stranded primer generated by PCR and used for cloning genes with the N-terminal TEV protease cleavable 6× His tag. The 5′ end custom (gene-specific) primer (cu) contains a 30 bp overlap sequence with the common primer and a 23 bp gene-specific sequence. The 3′ end custom (gene-specific) primer (cu) contains a 23 bp gene-specific sequence and the 30 bp attB2 recombination site. AttB: BP recombination sites, RBS: ribosome binding site, ATG: start codon, 6× His: six histidine tag, TEV site: sequence coding for the TEV protease cleavage site and the spacer. Numbers indicate the length of the primers (bp)
Fig. 2 Protein purification in the SPoRT laboratory. a Schematic representation of the fully automated purification. b Chromatogram of fully automated purification showing the Ni-NTA, desalting and gel filtration peaks
SPoRT laboratory pipeline scoreboard
| Targets | Selected | Cloned | Work stopped (a) | Expression trials | Expressed | Soluble | Purified | Crystals | Structure |
|---|---|---|---|---|---|---|---|---|---|
| Bacteria | 185 | 185 | 32 | 153 | 148 | 120 | 99 | 32 | 22 |
| Archaea | 70 | 70 | 9 | 61 | 60 | 29 | 27 | 11 | 9 |
| Archaea viruses and bacteriophages | 92 | 92 | 18 | 70 | 49 | 32 | 32 | 16 | 9 |
| Eukaryotes (b) | 9 | 7 | – | 7 | 7 | 9 | 9 | 2 | 2 |
| Total | 356 | 354 | 59 | 291 | 264 | 190 | 167 | 61 | 42 |

(a) Number of targets that were cloned but were not submitted for expression trials

(b) Two targets were provided as purified proteins by our collaborators
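
To make the attrition in the scoreboard easier to read, the stage-to-stage yields can be computed from the "Total" row. The sketch below is an illustration using only the totals printed in the table; the function name `step_yields` is hypothetical, and the "Work stopped" column is left out because it counts targets withdrawn rather than a pipeline stage.

```python
# Stage names and totals copied from the "Total" row of the scoreboard.
stages = [
    "Selected", "Cloned", "Expression trials", "Expressed",
    "Soluble", "Purified", "Crystals", "Structure",
]
totals = [356, 354, 291, 264, 190, 167, 61, 42]


def step_yields(counts):
    """Percentage of targets surviving each consecutive pair of stages."""
    return [round(100 * after / before, 1)
            for before, after in zip(counts, counts[1:])]


yields = step_yields(totals)
overall = round(100 * totals[-1] / totals[0], 1)

for (src, dst), pct in zip(zip(stages, stages[1:]), yields):
    print(f"{src} -> {dst}: {pct}%")
print(f"Selected -> Structure overall: {overall}%")
```

On these totals the largest single drop is from purified protein to crystals (61 of 167 targets).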
Summary of all SPoRT laboratory crystal structures showing phasing methods, origins and functions of proteins, and retrospective analysis of all structurally characterized targets using three crystallization predictors
| Structures | Phasing method | Origin | Function/comments | Retrospective analysis: XtalPred | Retrospective analysis: ParCrys | Retrospective analysis: Cluster | PDB code [reference] |
|---|---|---|---|---|---|---|---|
| Bacteria | | | | | | | |
| pqsE | MR | | Quinolone signal response protein | 4 | High-scoring | A | 2VW8 |
| pqsL | Sm | | Probable FAD-dependent monooxygenase | 1 | High-scoring | A | 2X3N |
| PA4511 | MR | | Uncharacterized | 1 | High-scoring | A | 2X5E |
| PA4631 | Lead | | Nucleoside-diphosphate-sugar epimerase | 4 | Recalcitrant | – | 2X4G |
| PA4715 | MR | | Probable aminotransferase | 3 | Amenable | B | 2X5D |
| PA0856 | Se-SAD | | Uncharacterized | 4 | Recalcitrant | – | 2X3O |
| FabH | MR | | 3-Oxoacyl-[acyl-carrier-protein] synthase III | 2 | Recalcitrant | A | 2X3E |
| AcsD | Se-SAD | | Achromobactin synthetase | 4 | High-scoring | A | 3FFE [ |
| AlcC | SIRAS | | Alcaligin biosynthesis protein C | 3 | High-scoring | A | 2X0O |
| DesE | Br | | Putative ferric-siderophore receptor protein | 4 | High-scoring | A | 2X4L |
| Fbabb | Pt | | Fibronectin binding protein | 3 | Recalcitrant | A | 2X5P |
| MRSA677 (Sar2028) | MR | Methicillin-resistant | Asp/Tyr/Phe pyridoxal-5′-phosphate-dependent aminotransferase | 4 | High-scoring | A | 2X5F |
| MRSA681 (Sar2676) | MR | MRSA | Pantothenate synthetase | 1 | High-scoring | A | 2X3F |
| MVAK (QGJ78) | Sm & Pt | MRSA | Mevalonate kinase | 1 | Amenable | A | 2X7I |
| SAR0482 | Se-SAD | MRSA | Orn/Lys/Arg decarboxylase family protein | 2 | High-scoring | A | 2X3L |
| SAR1376 | Zn | MRSA | Putative 4-oxalocrotonate tautomerase | 5 | Amenable | – | 2X4K |
| PPFK (Q6GIU3) | MR | MRSA | Putative phosphofructokinase | 1 | Amenable | A | 2JG5 |
| TAG (Q6GG41) | Zn-SAD | MRSA | DNA-3-methyladenine glycosylase I | 1 | High-scoring | A | 2JG6 |
| SPT | MR | | Serine palmitoyl transferase | 2 | Amenable | A | 2JG2 [ |
| ArdA | Pt | Transposon Tn916 | Antirestriction protein | 3 | Recalcitrant | – | 2W82 [ |
| ArdB | Pt | | Antirestriction protein | 4 | Recalcitrant | A | 2WJ9 |
| VC1805 | MIRAS | | Hypothetical protein | 4 | High-scoring | A | 2V1L [ |
| Archaea | | | | | | | |
| SSo1986 | K and Pb | | Uncharacterized | 1 | Amenable | – | 2X5Q |
| SSo2273 | Fe-SAD | | Uncharacterized | 3 | Amenable | – | 2X4H |
| SSo2452 | MR | | ATPase | 4 | Amenable | – | 2W0M [ |
| SSo6206 | Se-SAD | | Uncharacterized | 3 | Recalcitrant | A | 2X3D |
| PCNA | MR | | DNA processivity factor | 3 | Amenable | A | 2IX2 [ |
| XPD | Se-SAD | | DNA repair helicase | 3 | Amenable | – | 2VL7 [ |
| SSo2462 | MR | | DNA repair helicase | 4 | Amenable | – | 2VA8 [ |
| SSo1404 | MR | | Uncharacterized | 3 | Amenable | – | 2IVY |
| Ard1 | | | N-terminal acetylase | 3 | Recalcitrant | – | 2X7B |
| Archaea viruses and bacteriophages | | | | | | | |
| SIRV-ORF114 (CAG38848) | MR | | Uncharacterized | 2 | High-scoring | A | 2X4I |
| SIRV-ORF131 (CAG38830) | Se-SAD | SIRV | Uncharacterized | 4 | Recalcitrant | – | Sm-2X5G; Dm-2X5H |
| SIRV-ORF55 (CAG38821) | Zn-SAD | SIRV | Uncharacterized | 5 | Recalcitrant | – | 2X48 |
| SIRV-ORF119 (CAG38829) | Se-SAD | SIRV | Uncharacterized | 5 | Recalcitrant | – | 2X3G |
| PSV-ORF131 | S-SAD | | Uncharacterized | 4 | Recalcitrant | – | 2X5C |
| PSV-ORF137 | Se-SAD | PSV | Uncharacterized | 3 | Recalcitrant | – | 2X4J |
| PSV-ORF165a | Se-SAD | PSV | Uncharacterized | 4 | Recalcitrant | – | 2VXZ |
| PSV-ORF239 | Sulphur-SAD | PSV | Uncharacterized | 5 | High-scoring | A | 2X3M |
| PSV-ORF126 | Zn-SAD | PSV | Uncharacterized | 3 | Recalcitrant | B | 2X5R |
| Eukaryotes | | | | | | | |
| Ranasmurfin | Zn-SAD | | Uncharacterized | 4 | Amenable | A | 2VH3 [ |
| Cathepsin-L mutant | MR | Human | Silica polymerization | 3 | Amenable | A | 2VHS [ |
The PDB code for each structure is indicated