Greg L Hura, Angeli L Menon, Michal Hammel, Robert P Rambo, Farris L Poole, Susan E Tsutakawa, Francis E Jenney, Scott Classen, Kenneth A Frankel, Robert C Hopkins, Sung-Jae Yang, Joseph W Scott, Bret D Dillard, Michael W W Adams, John A Tainer.
Abstract
We present an efficient pipeline enabling high-throughput analysis of protein structure in solution with small angle X-ray scattering (SAXS). Our SAXS pipeline combines automated sample handling of microliter volumes, temperature and anaerobic control, rapid data collection and data analysis, and couples structural analysis with automated archiving. We subjected 50 representative proteins, mostly from Pyrococcus furiosus, to this pipeline and found that 30 were multimeric structures in solution. SAXS analysis allowed us to distinguish aggregated and unfolded proteins, define global structural parameters and oligomeric states for most samples, identify shapes and similar structures for 25 unknown structures, and determine envelopes for 41 proteins. We believe that high-throughput SAXS is an enabling technology that may change the way that structural genomics research is done.
Year: 2009 PMID: 19620974 PMCID: PMC3094553 DOI: 10.1038/nmeth.1353
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
SAXS characterizations for 34 Pfu samples including structural information.
| Sample | | | New Structural Results | | | |
|---|---|---|---|---|---|---|
| Class | Gene | Ortholog | RG (Å) | Dmax (Å) | Assemblies | Envelope |
| Aggregates | PF0418 | ATPase | | | separable | No |
| | PF1733 | Cons. hypothetical | | | inseparable | No |
| | PF1951 | Aspartate-ammonia ligase | | | inseparable | No |
| Mixtures of Oligomers of Unknown Structure | PF0230 | ArsR transcription regulator | | | Mostly 2-mer | Yes |
| | PF0259 | Cons. hypothetical | 26.0 | 84 | Mostly 8-mer | Yes |
| | PF0741 | Thioredoxin-related | 20 | >80 | Mostly 1-mer | Yes |
| | PF1548 | Cons. hypothetical | | | rings | ~Yes |
| | PF1605 | Molybdopterin synthase | 21 | >90 | Mostly 1-mer | Yes |
| Mixtures of Oligomers of Known Structure | PF0094 | Glutaredoxin-like | 28 | 110 | 92% 2-mer/8% 4-mer | Yes |
| | PF0380 | Cons. hypothetical | 21 | 125 | 68% 2-mer/32% 1-mer | Yes |
| | PF0939 | Isopropylmalate dehydratase | 23.1 | 82 | 73% 2-mer/27% 1-mer | Yes |
| | PF1909 | Ferredoxin | 13.0 | 38 | 40% 2-mer/60% 1-mer | Yes |
| Matching PDB Model | PF0863 | Adenylyl cyclase CyaB | 27.4 | 87 | Matching 1-mer | Yes |
| | PF1061 | Ferredoxin β-grasp fold | 17.7 | 78 | Matching 1-mer | Yes |
| | PF1281 | Superoxide reductase | 22.1 | 80 | Matching 4-mer | Yes |
| | PF1282 | Rubredoxin | 11.0 | 29 | Matching 1-mer | Yes |
| | PF0619 | Cons. hypothetical | 23.1 | 73 | Matching 3-mer | Yes |
| Crystal Structure of Homolog | PF1026 | Malic enzyme, NAD-binding | 31.8 | 110 | Matching 1-mer | Yes |
| | PF1033 | Thioredoxin-like fold | 51.2 | 150 | Matching 10-mer | Yes |
| | PF1528 | SNO glutamine amidotransferase | 19.9 | 80 | Matching 1-mer | Yes |
| | PF1674 | Tyrosine/serine phosphatase | 16.7 | 58 | Matching 1-mer | Yes |
| | PF1787 | Acetyl-CoA synthetase | 33.9 | 98 | Novel 3-mer | Yes |
| Proteins of Unknown Structure | PF0014/0015 | Cons. hypothetical | 55.0 | 165 | >8-mer | Yes |
| | PF0553 | Tyrosine phosphatase | 19.2 | 110 | 1-mer | Yes |
| | PF0706.1 | Zinc finger | 18.6 | 80 | 1-mer | Unfolded |
| | PF0699 | Conserved | 23.7 | 74 | 2-mer | Yes |
| | PF0715 | NADH oxidase | 23.1 | 96 | 2-mer | Yes |
| | PF0965/0966/0967/0971 | Pyruvate ferredoxin oxidoreductase | 36.9 | 120 | 244 kDa | Yes |
| | PF1282/1205 | Nucleotide binding protein | 24.3 | 95 | 1-mer | Unfolded |
| | PF1291 | Phosphoesterase | 35.6 | 110 | 4-mer | Yes |
| | PF1372 | Cons. hypothetical | 23 | 75 | 4-mer | Yes |
| | PF1911 | Ferredoxin NADP reductase | 30.9 | 101 | 2-mer | Yes |
| | PF1950 | Phosphoribosyl transferase | 25.3 | 100 | 2-mer | Yes |
| | PF2047.1 | Cons. hypothetical | 29.7 | 155 | 1-mer | Unfolded |
PF number is the Pfu genome ORF number [30]; protein names (orthologs) are based on bioinformatics analyses [www.ebi.ac.uk/interpro/]. We divide SAXS samples into three general classes: non-ideal samples, samples with existing atomic structural information, and proteins of unknown structure. Non-ideal samples are either aggregated or mixtures of multimers that cannot be modeled with available atomic resolution results. Samples with atomic resolution results, either direct or from a sequence homolog, provide improved resolution even when a mixture is present.
The abbreviation Cons. hypothetical denotes conserved hypothetical proteins, which have unknown function.
Proteins which have been crystallized.
Proteins with models determined from NMR.
PF0014 and PF0015 were tandemly expressed in E. coli and form a complex.
PF0965, PF0966, PF0967 and PF0971 form pyruvate ferredoxin oxidoreductase and were purified from native biomass.
The PF1282/1205 recombinant fusion protein has the rubredoxin (PF1282) of Pfu as the ‘tag’ to an unrelated putative nucleotide binding protein (PF1205). While the PF1282 portion is folded, as evidenced by its red color (iron), PF1205 is unfolded as shown below.
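The percentages in the Assemblies column (for example, 92% 2-mer/8% 4-mer for PF0094) describe a mixture of two oligomeric species. The record does not say how those fractions were fitted, so the following is only a minimal sketch of one common approach: a least-squares weight for a two-component linear combination of calculated scattering curves. The function name `two_state_fraction` and all curve values are hypothetical, and whether the fitted weight corresponds to a volume or mass fraction depends on how the component curves are normalized.

```python
def two_state_fraction(i_exp, i_a, i_b):
    """Least-squares weight f in i_exp ~ f*i_a + (1 - f)*i_b, clamped to [0, 1].

    i_exp: experimental intensities; i_a, i_b: calculated intensities of the
    two candidate species sampled at the same q values (all hypothetical here).
    """
    num = sum((e - b) * (a - b) for e, a, b in zip(i_exp, i_a, i_b))
    den = sum((a - b) ** 2 for a, b in zip(i_a, i_b))
    if den == 0:
        raise ValueError("component curves are identical; fraction is undefined")
    return min(1.0, max(0.0, num / den))


# Synthetic, noise-free illustration (not data from the study).
curve_a = [10.0, 8.0, 5.0, 2.0]    # placeholder curve for species A
curve_b = [20.0, 12.0, 6.0, 2.5]   # placeholder curve for species B
mixture = [0.92 * a + 0.08 * b for a, b in zip(curve_a, curve_b)]
fraction_a = two_state_fraction(mixture, curve_a, curve_b)
```

With real data the residual of the fit, not just the weight, would indicate whether the two chosen species explain the curve at all.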
Figure 1. High-throughput SAXS pipeline. (a) Configuration of the SAXS endstation shows X-ray beam path, sample position, pipetting robot, and area detector. (b) Schematic of the sample area showing how the sample is loaded by the robot into a temperature-controlled cell. Positive helium pressure reduces air scatter and oxidative damage. (c) SAXS analysis tree for rapid and robust data processing and analysis. Proteins are first categorized as aggregated (using either the scattering curve itself or dynamic light scattering (DLS)), mixtures (based on native gel electrophoresis or multi-angle light scattering (MALS)), or monodisperse samples. For monodisperse samples, SAXS data next define the global solution structural parameters: radius of gyration, maximum dimension, and calculated mass. A sequence-based homology search discovers existing structures that can be used to analyze both mixtures and monodisperse samples. Approximate time scales are noted in each step. Perl scripts are used to collect information and begin processes for dashed paths. Both primary data and derived shapes are stored at the BioIsis internet-accessible utility.
Figure 2. SAXS analysis provides feedback on challenging samples that are polydisperse or inhomogeneous. (a) PF0230 and PF1548 were mixtures by native gel electrophoresis. Overlaying the SAXS-predicted PF0230 envelope with a close homolog (PDB 2CWE) revealed consistency with the homolog dimer, with additional density indicating a larger species. (b) SAXS results directly discerned aggregation based on low-angle Guinier regions (inset) for three protein samples: PF0418 (red), PF1733 (blue) and PF1281 (green). Features (oscillations) in the SAXS scattering curves for PF0418 and PF1281 suggest that small adjustments in sample preparation may yield workable data; for example, PF1281 was markedly improved after passing through a filter (purple). (c) Probable multimers may be identified when atomic resolution results are available for the protein or a homolog. Here, multimers in crystal lattices (PF0094 homolog PDB 1J08, PF0380 PDB 1VK1, PF0939 homolog PDB 1V7L, and PF1909 PDB 1SJ1) are used to identify a best fit to the SAXS data.
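The low-angle Guinier analysis mentioned in panel (b) rests on the standard Guinier approximation, ln I(q) = ln I(0) - (RG² q²)/3, which holds only at small q (roughly q·RG below about 1.3 for globular particles). A minimal sketch of such a fit follows; the function name `guinier_fit`, the units (q in Å⁻¹, RG in Å), and the synthetic curve are assumptions for illustration, not the pipeline's own code. Aggregated samples typically show an upturn at the lowest angles, so a straight line fits them poorly.

```python
import math


def guinier_fit(q, intensity):
    """Fit ln I(q) = ln I(0) - (Rg^2 / 3) * q^2 by ordinary least squares.

    Returns (rg, i0). Only points inside the Guinier region should be passed in.
    """
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope: data do not follow the Guinier law")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic, noise-free curve with Rg = 20 and I(0) = 100 (illustrative only).
q_values = [0.010 + 0.005 * k for k in range(11)]          # q * Rg stays <= 1.2
i_values = [100.0 * math.exp(-(20.0 * qi) ** 2 / 3.0) for qi in q_values]
rg_fit, i0_fit = guinier_fit(q_values, i_values)
```

In practice the fit range would be chosen iteratively so that q·RG stays within the validity limit for the fitted RG.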
Figure 3. SAXS provides accurate shape and assembly in solution for most samples. (a) For the ten proteins with structural homologs or existing structures, the experimental scattering data (colors) were compared with the scattering curve calculated for the matching structure (black). (b) For monodisperse samples, the envelope determinations (colored as in a) were overlaid with the existing structures (ribbons). All monomeric units had a seven amino-acid His-tag attached. (c) For the nine proteins with no pre-existing structural information, envelope predictions from two independent programs were compared and generally agree. The DAMMIN results (black mesh) were generated without symmetry. The GASBOR results used 2-fold symmetry for PF0014/0015, PF0965/0966/0967/0971, PF1911 (dimer), PF0715 (dimer), PF0699 (dimer) and PF1950 (dimer). Four-fold symmetry was imposed on tetrameric PF1291 and PF1372. (d) Plotting the SAXS data as I·q² vs. q (Kratky plot) highlights proteins with large unfolded regions. The Kratky plot of PF0715, a folded protein, is shown for comparison and has the characteristic parabolic behavior at wide angles. In contrast, PF0706.1, PF2047.1, and PF1282/1205 have SAXS data consistent with unfolded regions, as reflected in their non-parabolic wide-angle behavior.
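The Kratky analysis in panel (d) can be illustrated with the conventional transform, I(q)·q² plotted against q. The sketch below uses two textbook stand-ins rather than data from the study: a Guinier-type Gaussian falloff as a rough proxy for a compact particle, and the Debye function for a Gaussian chain as a proxy for an unfolded one. All function names and the RG value are hypothetical.

```python
import math


def kratky(q, intensity):
    """Return the Kratky transform q^2 * I(q) for each point."""
    return [qi * qi * ii for qi, ii in zip(q, intensity)]


def compact_model(q, rg):
    """Guinier-type falloff exp(-(q*Rg)^2 / 3); a crude compact-particle proxy."""
    return [math.exp(-(qi * rg) ** 2 / 3.0) for qi in q]


def debye_chain(q, rg):
    """Debye form factor of a Gaussian chain, 2*(exp(-x) + x - 1) / x^2, x = (q*Rg)^2."""
    out = []
    for qi in q:
        x = (qi * rg) ** 2
        out.append(2.0 * (math.exp(-x) + x - 1.0) / (x * x))
    return out


q_values = [0.005 * k for k in range(1, 61)]   # 0.005 to 0.300, q > 0 throughout
compact_k = kratky(q_values, compact_model(q_values, 20.0))
chain_k = kratky(q_values, debye_chain(q_values, 20.0))

# Ratio of the wide-angle value to the curve maximum:
# near 0 for a bell-shaped (folded-like) curve, near 1 for a plateau (unfolded-like).
compact_ratio = compact_k[-1] / max(compact_k)
chain_ratio = chain_k[-1] / max(chain_k)
```

The ratio used here is only an illustrative summary; the figure itself relies on visual inspection of the curve shape.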
Figure 4. SAXS determines accurate assembly state in solution, as shown for acetyl-CoA synthetase subunit (PF1787). The experimental scattering curve for PF1787 (black) is shown with calculated scattering curves for monomeric (magenta dots) and dimeric (green dashes) atomic resolution structures of homologs. The best fit (red) to the experimental SAXS data is calculated from a 3-fold symmetric trimer derived from a monomeric homolog (PDB 1WR2). The trimeric form of PF1787 was confirmed using I(0), the extrapolated intensity at zero scattering angle, normalized for concentration (inset). Protein standards lysozyme (Lys), xylanase (Xyl), PF1281, bovine serum albumin (BSA) and glucose isomerase (GI) were used to place the data on a relative scale. Relevant structures from analysis of PF1787 are shown on the right. The crystallographic dimer (green) is a flexibly linked 2-domain protein. Models with 3-fold symmetry enforced (blue) match the SAXS results.
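The I(0) check described in this caption relies on the fact that, for proteins of similar scattering contrast, I(0) divided by mass concentration scales with molecular mass, so standards of known mass define a relative scale. A minimal sketch under that assumption follows. The function names, the calibration through the origin, and every number are hypothetical placeholders, not measurements from the study.

```python
def calibrate(standards):
    """Least-squares slope k (through the origin) for mass = k * (I0 / c).

    standards: list of (mass_kda, i0_over_c) pairs for proteins of known mass.
    """
    num = sum(mass * ratio for mass, ratio in standards)
    den = sum(ratio * ratio for _, ratio in standards)
    return num / den


def oligomer_state(i0, conc, monomer_mass_kda, k):
    """Estimate the particle mass and the nearest integer number of subunits."""
    mass = k * i0 / conc
    return mass, round(mass / monomer_mass_kda)


# Placeholder calibration points (mass in kDa, I0/c in arbitrary units).
placeholder_standards = [(14.0, 0.70), (66.0, 3.30), (170.0, 8.50)]
k_scale = calibrate(placeholder_standards)

# Placeholder sample: I0 = 7.8, concentration = 2.0, monomer mass = 26 kDa.
est_mass, n_subunits = oligomer_state(7.8, 2.0, 26.0, k_scale)
```

Because the estimate depends directly on the concentration measurement, an error in concentration translates one-to-one into an error in the inferred mass.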
Figure 5. SAXS defines accurate shape and assembly in solution for unknown structures and can uncover unsuspected structural similarity. Experimental scattering curves for proteins with no known structural homolog (left, color) were compared with calculated scattering (black curves on left) from PDB structures identified by DARA [26], a database of scattering curves calculated from the PDB. Results from the shape reconstruction program GASBOR (colored envelopes) are overlaid onto the structures identified by DARA (ribbon models, right). In addition, PF1674 and PF1281, which have known structures, show a limitation in the DARA search (see text) and the need for better comparative algorithms.