Jeffrey K Holden1, Ryan Pavlovicz2, Alberto Gobbi3, Yifan Song2, Christian N Cunningham1.
Abstract
Technologies for discovering peptides as potential therapeutics have rapidly advanced in recent years with significant interest from both academic and pharmaceutical labs. These advancements in turn drive the need for new computational tools to design peptides for purposes of advancing lead molecules into the clinic. Here we report the development and application of a new automated tool, AutoRotLib, for parameterizing a diverse set of non-canonical amino acids (NCAAs), N-methyl, or peptoid residues for use with the computational design program Rosetta. In addition, we developed a protocol for designing thioether-cyclized macrocycles within Rosetta, due to their common application in mRNA display using the RaPID platform. To evaluate the utility of these new computational tools, we screened a library of canonical and NCAAs on both a linear peptide and a thioether macrocycle, allowing us to quickly identify mutations that affect peptide binding and subsequently measure our results against previously published data. We anticipate in silico screening of peptides against a diverse chemical space will be a fundamental component for peptide design and optimization, as more amino acids can be explored in a single in silico screen than an in vitro screen. As such, these tools will enable maturation of peptide affinity for protein targets of interest and optimization of peptide pharmacokinetics for therapeutic applications.
Keywords: design; macrocycle; noncanonical; peptide; rosetta
Year: 2022 PMID: 35495632 PMCID: PMC9047896 DOI: 10.3389/fmolb.2022.848689
Source DB: PubMed Journal: Front Mol Biosci ISSN: 2296-889X
FIGURE 2. Parameterization and application of thioether linker. (A) Representative molecule used for thioether parameterization. The angle scanned for dihedral parameterization is highlighted by larger spheres. (B) C-C-S-C torsional profiles by the MMFF94S force field (black) and Rosetta (blue) after dihedral constraints were optimized. (C) C-C-S-C angle distributions for 100,000 thioether-linked 8-mer (AAAAAAAC) macrocycles generated without additional dihedral constraints (top) and with dihedral constraints applied (bottom). The red dashed lines represent the theoretical maxima of the dihedral energy landscape, while the green lines represent the theoretical minima for the thioether bond. Note that each thioether bond has two C-C-S-C torsions.
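The C-C-S-C torsion discussed in the caption above is an ordinary four-atom dihedral angle. The following is a minimal sketch of how such an angle can be computed from Cartesian coordinates; it is not taken from the authors' protocol, and the function name `dihedral_deg` and the example coordinates are hypothetical.

```python
import math

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four 3-D points (e.g. C, C, S, C)."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    # Bond vectors along the four-atom chain.
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    # Normals of the two planes that share the central bond b1.
    n1, n2 = cross(b0, b1), cross(b1, b2)
    # Unit vector along the central bond, used to build an in-plane reference.
    b1_len = math.sqrt(dot(b1, b1))
    m1 = cross(n1, tuple(c / b1_len for c in b1))
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))

# Hypothetical coordinates for illustration only (not from any structure).
example_angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1))
```

Because each thioether bond has two C-C-S-C torsions, a function like this would be called twice per macrocycle, once for each four-atom path through the sulfur.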
FIGURE 1. AutoRotLib was developed to parameterize chemically diverse NCAAs for Rosetta. (A) NCAAs that can be parameterized with our automated tools include exotic R groups, N-methyl amino acids, and peptoids with ≤4 heavy atom torsion angles. (B) To parameterize an NCAA using the automated tools, only a SMILES string and the residue charge are required to generate the files needed for use in Rosetta. (C) Parameterization of the non-canonical β-(2-naphthyl)-L-alanine (2Np) using AutoRotLib requires capping the termini with acetyl and N-methyl groups (colored blue) to produce rotameric positions of the side chain for a given backbone position. Example rotamers for phi = −60°, psi = −40°; 2-D and 3-D representation (top) and 2-D representation of χ angles (bottom) for 2Np.
Rotamer recovery of canonical amino acids after repacking with the Rosetta standard libraries (dun10 with Shapovalov’s corrections) and with libraries generated by the MakeRotLib and AutoRotLib protocols, using Rosetta 3.10 and the REF2015 score function. Rotamer recovery is reported as the average ± standard deviation of three separate calculations.
| Amino acid | Rosetta baseline (%) | MakeRotLib (%) | AutoRotLib (%) |
|---|---|---|---|
| L-cysteine | 96.6 ± 0.0 | 83.6 ± 0.8 | 80.2 ± 0.8 |
| L-serine | 97.7 ± 0.0 | 90.8 ± 0.2 | 91.1 ± 0.3 |
| L-threonine | 98.7 ± 0.1 | 93.5 ± 0.3 | 94.5 ± 0.3 |
| L-valine | 99.6 ± 0.0 | 97.7 ± 0.2 | 99.5 ± 0.1 |
| L-leucine | 96.3 ± 0.1 | 89.2 ± 0.4 | 93.4 ± 0.2 |
| L-isoleucine | 98.2 ± 0.0 | 95.2 ± 0.1 | 95.8 ± 0.2 |
| L-methionine | 84.8 ± 0.2 | 50.5 ± 0.4 | 73.1 ± 0.5 |
| L-arginine | 63.5 ± 0.4 | 39.8 ± 0.5 | 45.7 ± 0.6 |
| L-lysine | 86.3 ± 0.4 | 76.2 ± 0.3 | 76.6 ± 0.2 |
| L-proline | 99.8 ± 0.0 | 99.8 ± 0.0 ᵃ | 99.4 ± 0.1 |
| L-asparagine | 94.2 ± 0.1 | 80.3 ± 0.3 | 81.4 ± 0.1 |
| L-aspartic acid | 90.8 ± 0.3 | 75.4 ± 0.1 | 78.7 ± 0.4 |
| L-glutamine | 81.7 ± 0.4 | 67.4 ± 0.5 | 77.3 ± 0.1 |
| L-glutamic acid | 79.1 ± 0.2 | 66.6 ± 0.9 | 72.8 ± 0.4 |
| L-histidine | 91.5 ± 0.0 | 71.6 ± 0.0 | 71.0 ± 0.6 |
| L-phenylalanine | 94.1 ± 0.1 | 88.1 ± 0.7 | 91.0 ± 0.7 |
| L-tryptophan | 86.8 ± 0.3 | 75.8 ± 0.6 | 81.4 ± 1.1 |
| L-tyrosine | 87.6 ± 0.2 | 77.0 ± 0.8 | 84.6 ± 0.4 |
ᵃ Proline uses the Dunbrack rotamer library in the MakeRotLib test due to difficulty parameterizing the cyclic amino acid.
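The table above reports rotamer recovery as an average ± standard deviation over three calculations. As a rough illustration of how such a number could be assembled, the sketch below counts a residue as recovered when every χ angle of the repacked side chain lies within a tolerance of the native value. The 20° tolerance, the use of the sample standard deviation, and all function names are assumptions made for this example; the source does not state the criterion the authors used.

```python
from statistics import mean, stdev

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def rotamer_recovery(native_chis, repacked_chis, tol=20.0):
    """Percent of residues whose chi angles all fall within tol of native.

    tol=20.0 is an assumed threshold, not a value taken from the source.
    Each argument is a list with one list of chi angles per residue.
    """
    hits = sum(
        all(angle_diff(n, r) <= tol for n, r in zip(nat, rep))
        for nat, rep in zip(native_chis, repacked_chis)
    )
    return 100.0 * hits / len(native_chis)

def summarize(recoveries):
    """Mean and sample standard deviation over repeated calculations."""
    return mean(recoveries), stdev(recoveries)

# Hypothetical chi angles for three residues (illustration only).
native = [[60.0, 180.0], [-60.0], [175.0]]
repacked = [[65.0, 170.0], [-100.0], [-178.0]]
recovery = rotamer_recovery(native, repacked)
```

Taking the difference modulo 360° matters for the third residue, where 175° and −178° are only 7° apart even though their raw difference is 353°.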
NCAAs generated by AutoRotLib were scored using an MM force field and rescored using QM.
| Noncanonical | phi (°) | psi (°) | MM/QM rotamer agreement (87% of library) | MM/QM rotamer agreement (95% of library) |
|---|---|---|---|---|
| β-(2-Naphthyl)-L-alanine (2Np) | −60 | −40 | 14 of 16 | 19 of 19 |
| β-(2-Naphthyl)-L-alanine (2Np) | −110 | 130 | 14 of 17 | 19 of 21 |
| L-2-thienyl-Ala (2Th) | −60 | −40 | 9 of 10 | 10 of 11 |
| L-2-thienyl-Ala (2Th) | −110 | 130 | 9 of 10 | 10 of 11 |
| N-α-Methyl-L-phenylalanine (MeF) | −60 | −40 | 7 of 7 | 7 of 8 |
| N-α-Methyl-L-phenylalanine (MeF) | −110 | 130 | 6 of 7 | 9 of 9 |
| N-α-Methyl-L-histidine (MeH) | −60 | −40 | 19 of 19 | 21 of 21 |
| N-α-Methyl-L-histidine (MeH) | −110 | 130 | 17 of 17 | 21 of 21 |
| N-(2-Phenylethyl)-glycine (PeG) | −60 | −40 | 13 of 14 | 16 of 16 |
| N-(2-Phenylethyl)-glycine (PeG) | −110 | 130 | 11 of 12 | 15 of 16 |
| Cyclopropyl-methyl-glycine (CpG) | −60 | −40 | 5 of 5 | 6 of 6 |
| Cyclopropyl-methyl-glycine (CpG) | −110 | 130 | 5 of 5 | 6 of 6 |
FIGURE 3. Mutational scanning of canonical amino acids and NCAAs on the peptide PUMA bound to MCL-1. (A) NCAAs shown in stick format were evaluated and parameterized with AutoRotLib. The NCAA DAI was previously parameterized in Rosetta as DAL. (B) PUMA peptide depicted in cartoon format and colored blue is bound to MCL-1 (PDB 2ROC). (C) Heatmap analysis of Δddg_calculated (ddg_wt − ddg_design) for individual residues. Native residues that are >75% solvent exposed are indicated by an asterisk (*). (D) Agreement between Δddg_calculated and ΔΔG_binding in classifying point mutations on the PUMA peptide as stabilizing or destabilizing. Grey tiles mark point mutations on which the two datasets agree; white tiles mark point mutations on which they differ. Tiles that are marked with an X represent the native residue.
FIGURE 4. Site saturation mutagenesis on macrocycle CP2 bound to target KDM4. (A) The crystal structure of CP2-KDM4 (PDB 5LY1) was used as the initial model for site saturation mutagenesis. (B) Heat map of Δddg_calculated values for all mutations evaluated. Tiles that are colored black were found to have a Δddg_calculated > 5. Native residues that are >75% solvent exposed are indicated by an asterisk (*). (C) Agreement between the free energy of mutation calculated for designs (Δddg_calculated) and previously reported ΔΔG_binding values (Rogers et al., 2018). Grey tiles mark mutations on which the two values agree; white tiles mark mutations on which they differ. Tiles that are marked with an X represent the native residue.
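The agreement panels in Figures 3 and 4 compare a calculated quantity, defined in the captions as the wild-type ddg minus the design ddg, against measured ΔΔG of binding. The sketch below shows one way such a grey/white tile map could be derived. It assumes that a positive calculated value means stabilizing (the design scores lower than wild type) and that a negative measured ΔΔG of binding means stabilizing; neither sign convention is stated in the source, and the mutation labels and numbers are made up for illustration.

```python
def delta_ddg_calculated(ddg_wt, ddg_design):
    """Calculated change as defined in the captions: ddg_wt - ddg_design."""
    return ddg_wt - ddg_design

def agreement_map(calculated, measured):
    """Map each mutation to True (agree, grey tile) or False (differ, white tile).

    calculated: mutation -> (ddg_wt, ddg_design)
    measured:   mutation -> experimental ddG of binding
    Assumed conventions (not from the source): a positive calculated value
    and a negative measured value are both read as stabilizing.
    """
    result = {}
    for mutation, (ddg_wt, ddg_design) in calculated.items():
        if mutation not in measured:
            continue  # no experimental value to compare against
        calc_stabilizing = delta_ddg_calculated(ddg_wt, ddg_design) > 0
        meas_stabilizing = measured[mutation] < 0
        result[mutation] = calc_stabilizing == meas_stabilizing
    return result

# Hypothetical values for illustration only.
calc = {"A1W": (-30.0, -32.0), "L5G": (-30.0, -25.0), "D3K": (-30.0, -31.0)}
meas = {"A1W": -0.5, "L5G": 1.2, "D3K": 0.8}
tiles = agreement_map(calc, meas)
```

Comparing only the predicted direction of the effect, rather than its magnitude, mirrors the binary grey/white presentation in the figures and avoids putting Rosetta score units and experimental free energies on a common scale.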