Gideon K Gogovi, Fahad Almsned, Nicole Bracci, Kylene Kehn-Hall, Amarda Shehu, Estela Blaisten-Barojas.
Abstract
Tertiary structure governs, to a great extent, the biological activity of a protein in the living cell and is consequently a central focus of numerous studies aiming to shed light on cellular processes important to human health. Here, we aim to elucidate the structure of the Rift Valley fever virus (RVFV) L protein using a combination of in silico techniques. Due to its large size and multiple domains, elucidation of the tertiary structure of the L protein has so far challenged both dry and wet laboratories. In this work, we leverage complementary perspectives and tools from the computational molecular biology and bioinformatics domains to construct, refine, and evaluate several physically realistic atomistic structural models of the L protein. All computed models have very flexible termini of about 200 amino acids each and a high proportion of helical regions. Properties such as potential energy, radius of gyration, hydrodynamic radius, flexibility coefficient, and solvent-accessible surface are reported. Structural characterization of the L protein enables our laboratories to better understand viral replication and transcription via further studies of L protein-mediated protein-protein interactions. While the results presented focus on the RVFV L protein, the workflow is a more general modeling protocol for discovering the tertiary structure of multidomain proteins consisting of thousands of amino acids.
Keywords: Rift Valley fever virus; computational structure determination; multidomain protein; tertiary structure
Year: 2019 PMID: 31067727 PMCID: PMC6539450 DOI: 10.3390/molecules24091768
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Workflow for determining the protein domains, domain boundaries, domain structure, domain assembly, evaluation and refinement of the Rift Valley fever virus (RVFV) L protein containing 2092 amino acids.
Figure 2. Tertiary structure obtained via RAPTORX for the entire 2092 amino acid sequence of the RVFV L protein. Rendering was performed with Visual Molecular Dynamics (VMD) software [17].
Figure 3. Three domains and boundary determination of RVFV L protein structural models.
Table 1. Structural models considered for the L1, L2, and L3 templates provided to I-TASSER [19]. C-score values in bold are the best for domains L1 and L2.
| PDB id | Organism | C-Score | PDB id | Organism | C-Score | Model | Description | C-Score |
|---|---|---|---|---|---|---|---|---|
| L1-nt | - | −4.32 | L2-nt | - | −0.09 | L3-nt | I-TASSER model | −1.68 |
| L1-4miw | Lassa virus | −1.55 | L2-5amq | La Crosse | | L3-nt-MD | L3-nt with MD aa 1861–2092 | −1.95 |
| L1-5ize | Hantaan virus | | L2-5amr | La Crosse | −0.09 | L3-AIDA | L3-nt-MD, AIDA | - |
| L1-5hsb | Andes virus | | L2-1yuy | Hepatitis C | −0.05 | L3-Chimera | L3-nt-MD, Chimera | - |
| L1-5j1n | Lassa virus | −1.45 | L2-4xhi | Thosea Asigna | | | | |
| L1-MD | - | - | L2-4ucy | Metapneumovirus | | | | |
Table 2. Properties of the molecular-dynamics (MD)-optimized structures of domains L1 and L3: potential energy per atom (PE), radius of gyration, end-to-end distance, and maximum radius from center of mass.
| Domain Segment | PE (kJ/mol) | | | |
|---|---|---|---|---|
| L1-MD | −8.62 | 6.66 | 3.05 | 6.03 |
| L3 | −7.57 | 7.23 | 2.45 | 5.47 |
| L3-MD | −8.70 | 3.89 | 2.97 | 5.11 |
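The size descriptors named in this caption can be computed directly from atomic coordinates. The following is a minimal NumPy sketch under stated assumptions, not the authors' code: the function name `size_descriptors`, the optional mass weighting, and the use of the first and last atoms as the chain ends are all illustrative choices.

```python
import numpy as np

def size_descriptors(coords, masses=None):
    """Return (radius of gyration, end-to-end distance, max radius from COM).

    coords : (N, 3) array of atomic positions, ordered along the chain.
    masses : optional (N,) array; equal masses are assumed when omitted.
    """
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))
    masses = np.asarray(masses, dtype=float)

    # Center of mass and distance of every atom from it
    com = np.average(coords, axis=0, weights=masses)
    dist = np.linalg.norm(coords - com, axis=1)

    # Mass-weighted root-mean-square distance from the center of mass
    r_gyr = np.sqrt(np.average(dist**2, weights=masses))
    # Distance between the first and last atoms (assumed chain ends)
    r_ee = np.linalg.norm(coords[-1] - coords[0])
    # Largest distance of any atom from the center of mass
    r_max = dist.max()
    return r_gyr, r_ee, r_max

# Toy example: three equal-mass atoms on a line, one length unit apart
r_gyr, r_ee, r_max = size_descriptors([[0, 0, 0], [1, 0, 0], [2, 0, 0]])
print(r_gyr, r_ee, r_max)
```

The results carry the same length unit as the input coordinates. In practice the coordinates and masses would come from a structure or trajectory file, read with whichever parser is in use.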
Table 3. MolProbity evaluation of full-length L protein structures refined with 3DRefine, before and after structural relaxation. The best MP-scores are given in bold.
| Model | MP-Score (before min) | MP-Score (after min) | Clash-Score (before min) | Clash-Score (after min) | Rot-Out (before min) | Rot-Out (after min) | Ram-Out (before min) | Ram-Out (after min) | Ram-fv (before min) | Ram-fv (after min) |
|---|---|---|---|---|---|---|---|---|---|---|
| L1-5ize + L2-4xhi + L3-nt | 3.75 | | 74.1 | 1.89 | 7.91 | 8.52 | 8.42 | 7.22 | 80.53 | 72.16 |
| L1-5ize + L2-4xhi + L3-MD | | 2.52 | 73.34 | 1.95 | 7.53 | 9.68 | 7.85 | 6.96 | 82.54 | 78.54 |
| L1-5ize + L2-4xhi + L3-AIDA | 3.79 | | 73.36 | 1.71 | 8.82 | 9.18 | 9.76 | 8.40 | 80.05 | 71.19 |
| L1-5ize + L2-4xhi + L3-Chimera | 3.91 | 2.60 | 88.81 | 2.04 | 9.84 | 10.95 | 10.07 | 7.47 | 79.58 | 70.82 |
| L1-5hsb + L2-4xhi + L3-nt | 3.76 | 2.64 | 77.25 | 2.61 | 8.02 | 10.12 | 8.18 | 7.27 | 81.15 | 72.94 |
| L1-5hsb + L2-4xhi + L3-MD | | | 74.62 | 1.98 | 8.23 | 9.07 | 7.70 | 7.22 | 83.16 | 73.20 |
| L1-5hsb + L2-4xhi + L3-AIDA | 3.78 | 2.56 | 77.23 | 2.16 | 8.23 | 9.40 | 9.33 | 6.96 | 80.19 | 71.39 |
| L1-5hsb + L2-4xhi + L3-Chimera | 3.90 | 2.56 | 89.62 | 2.13 | 9.36 | 9.13 | 9.73 | 7.89 | 79.29 | 69.43 |
| L1-MD + L2-4xhi + L3-nt | | 2.60 | 72.28 | 2.16 | 8.02 | 10.90 | 7.94 | 6.24 | 81.53 | 72.37 |
| L1-MD + L2-4xhi + L3-MD | | | 76.02 | 1.86 | 7.42 | 9.18 | 7.42 | 6.86 | 83.43 | 72.89 |
| L1-MD + L2-4xhi + L3-AIDA | 3.77 | 2.53 | 72.81 | 1.89 | 8.66 | 9.79 | 9.04 | 7.68 | 80.57 | 71.75 |
| L1-MD + L2-4xhi + L3-Chimera | 3.90 | 2.57 | 90.22 | 2.28 | 9.63 | 9.07 | 10.16 | 8.45 | 80.30 | 70.36 |
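The MP-score combines the clash score, the percentage of rotamer outliers, and the percentage of Ramachandran-favored residues into a single number. The sketch below uses the commonly published form of the MolProbity score. It assumes that Rot-Out and Ram-fv in the table are percentages, and the function name `mp_score` is illustrative. Under those assumptions it gives values close to the tabulated 3.91 and 2.60 for the L1-5ize + L2-4xhi + L3-Chimera model.

```python
import math

def mp_score(clash_score, rot_out_pct, ram_fav_pct):
    """MolProbity score in its commonly published form (lower is better).

    clash_score : serious clashes per 1000 atoms
    rot_out_pct : percentage of rotamer outliers
    ram_fav_pct : percentage of Ramachandran-favored residues
    """
    # Percentage of residues outside the Ramachandran-favored region
    ram_not_fav = 100.0 - ram_fav_pct
    return (
        0.426 * math.log(1.0 + clash_score)
        + 0.33 * math.log(1.0 + max(0.0, rot_out_pct - 1.0))
        + 0.25 * math.log(1.0 + max(0.0, ram_not_fav - 2.0))
        + 0.5
    )

# L1-5ize + L2-4xhi + L3-Chimera, before and after minimization
before = mp_score(clash_score=88.81, rot_out_pct=9.84, ram_fav_pct=79.58)
after = mp_score(clash_score=2.04, rot_out_pct=10.95, ram_fav_pct=70.82)
print(round(before, 2), round(after, 2))
```

The logarithmic terms explain why the MP-score improves after minimization even though Rot-Out and Ram-fv get somewhat worse: the large drop in the clash score dominates the sum.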
Figure 4. Relaxed structures corresponding to the MP score before and after relaxation in Table 3. L1, L2, and L3 domains depicted in blue, green, and red, respectively. Drawn with Chimera [22].
Table 4. Potential energy per atom (PE), radius of gyration, hydrodynamic radius, flexibility coefficient, end-to-end distance, and solvent-accessible surface area (SASA) of RVFV L protein structural models. Values correspond to relaxed structures after minimization. Models provided in decreasing PE order. Last column lists the root-mean-squared deviation (RMSD) of each model with respect to the most energetically stable model, L1-MD + L2-4xhi + L3-Chimera. Bold values indicate the structures of lowest PE.
| No. | Model | PE (kJ/mol) | Radius of gyration | Hydrodynamic radius | Flexibility coefficient | End-to-end distance | SASA (nm²) | RMSD (nm) |
|---|---|---|---|---|---|---|---|---|
| 1 | L1-MD + L2-4xhi + L3-nt | −6.664 | 5.10 | 9.49 | 0.05 ± 0.01 | 2.58 | 845.0 | 3.15 |
| 2 | L1-5ize + L2-4xhi + L3-nt | −6.710 | 5.29 | 9.51 | 2.05 ± 0.23 | 16.27 | 832.4 | 3.55 |
| 3 | L1-5hsb + L2-4xhi + L3-nt | −6.751 | 4.34 | 8.67 | 0.25 ± 0.03 | 5.64 | 816.0 | 3.37 |
| 4 | L1-5ize + L2-4xhi + L3-AIDA | −6.762 | 4.91 | 9.27 | 1.61 ± 0.18 | 14.33 | 802.8 | 3.74 |
| 5 | L1-MD + L2-4xhi + L3-MD | −6.781 | 4.78 | 9.15 | 0.17 ± 0.02 | 4.71 | 826.7 | 3.14 |
| 6 | L1-5ize + L2-4xhi + L3-MD | −6.791 | 5.11 | 9.44 | 0.48 ± 0.05 | 7.88 | 840.0 | 3.49 |
| 7 | L1-5hsb + L2-4xhi + L3-MD | −6.792 | 4.87 | 9.12 | 1.78 ± 0.19 | 15.04 | 801.5 | 3.55 |
| 8 | L1-5ize + L2-4xhi + L3-Chimera | −6.799 | 4.66 | 8.84 | 0.24 ± 0.03 | 5.60 | 759.8 | 1.85 |
| 9 | L1-5hsb + L2-4xhi + L3-Chimera | | 4.71 | 8.99 | 0.54 ± 0.06 | 8.41 | 742.8 | 1.50 |
| 10 | L1-5hsb + L2-4xhi + L3-AIDA | | 4.72 | 8.89 | 1.72 ± 0.18 | 14.80 | 771.0 | 3.31 |
| 11 | L1-MD + L2-4xhi + L3-AIDA | | 4.43 | 8.71 | 0.66 ± 0.08 | 9.26 | 787.3 | 2.54 |
| 12 | L1-MD + L2-4xhi + L3-Chimera | | 4.81 | 8.91 | 0.22 ± 0.02 | 5.28 | 747.6 | 0.00 |
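An RMSD between two structural models is usually reported after optimal rigid-body superposition. The sketch below implements the standard Kabsch algorithm with NumPy. It is an illustration only: the source does not state how the RMSD values were obtained, the function name `kabsch_rmsd` is hypothetical, and both inputs are assumed to list the same atoms in the same order.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    P and Q must contain the same atoms in the same order.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)

    # Remove translation by centering both sets on their centroids
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Covariance matrix and its singular value decomposition
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Correct for a possible reflection so that R is a proper rotation
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the remaining deviation
    P_rot = Pc @ R.T
    return np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1)))

# A rotated and shifted copy of a structure should give an RMSD near zero
rng = np.random.default_rng(0)
P = rng.normal(size=(50, 3))
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
Q = P @ Rz.T + np.array([1.0, -2.0, 3.0])
print(kabsch_rmsd(P, Q))
```

Comparing every model against a single reference, as in the last column of the table, amounts to calling such a function once per model with the reference coordinates as `Q`.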
Figure 5. Best four relaxed structures based on L-protein energetics reported in Table 4. L1, L2, and L3 domains depicted in blue, green, and red, respectively. Drawn with Chimera [22].
Figure 6. Interaction energy between contiguous domains L1–L2 (magenta) and L2–L3 (green) within the full L protein. Structural models numbered by the order they appear in Table 4.
Figure 7. Ratio versus radius of gyration of the 12 structural models.