George A Khoury, James Smadbeck, Phanourios Tamamis, Andrew C Vandris, Chris A Kieslich, Christodoulos A Floudas.
Abstract
We describe the development and testing of ab initio derived, AMBER ff03 compatible charge parameters for a large library of 147 noncanonical amino acids, including β- and N-methylated amino acids, for use in applications such as protein structure prediction and de novo protein design. The charge parameters were derived using the RESP fitting approach. We assessed the suitability of the derived charge parameters for discriminating between active and inactive analogs among 63 analogs of the complement inhibitor Compstatin, on the basis of previously published experimental IC50 data and a screening procedure involving short simulations and binding free energy calculations. We found that both the approximate binding affinity (K*) and the binding free energy calculated through MM-GBSA are capable of discriminating between active and inactive Compstatin analogs, with MM-GBSA performing significantly better. Key interactions formed by the most potent Compstatin analog that contains a noncanonical amino acid are presented and compared to those of the most potent analog containing only natural amino acids and of native Compstatin. We make the derived parameters, and an associated web interface that is capable of performing modifications on proteins using Forcefield_NCAA and outputting AMBER-ready topology and parameter files, freely available for academic use at http://selene.princeton.edu/FFNCAA . The forcefield allows one to incorporate these customized amino acids into design applications with control over size, van der Waals, and electrostatic interactions.
Keywords: AMBER partial charges; Compstatin; complement; inhibitors; molecular dynamics; noncanonical amino acids; unnatural amino acids
Year: 2014 PMID: 24932669 PMCID: PMC4277759 DOI: 10.1021/sb400168u
Source DB: PubMed Journal: ACS Synth Biol ISSN: 2161-5063 Impact factor: 5.110
Examples of Non-canonical Amino Acids Incorporated into Therapeutic Peptide Agonists and Antagonists Targeting Various Diseases
| non-canonical amino acid | therapeutic peptide | diseases targeted | source |
|---|---|---|---|
| biphenylalanine, 2′-Et-4′-OMe-biphenylalanine, 2-naphthylalanine | truncated variant of GLP1-peptide | diabetes | Mapelli et al. |
| | Rosetta designed peptides | Alzheimer’s | Sievers et al. |
| 2-indanylglycine | oxytocin variants | inhibition of uterine motor activity | Bakos et al. |
| 2-naphthylalanine | T140 variants | CXCR4/HIV-1 | Tamamura et al. |
| O-methyltyrosine | carbetocin | prevention of uterine atony, induction, and control of postpartum bleeding or hemorrhage | Vlieghe et al. |
| O-ethyltyrosine | atosiban | delaying the birth in case of premature birth | Vlieghe et al. |
| 1-naphthylalanine | angiotensin I variants | hypertension | Kokubu et al. |
| 1-methyltryptophan, 5-methyltryptophan | Compstatin variants | stroke, heart attack, Alzheimer’s, asthma, rheumatoid arthritis, systemic lupus | Mallik et al. |
| cyclohexylalanine | | | |
| phosphotyrosine | | | |
| aminoisobutyric acid | | | |
| N-methylated amino acids | | | Qu et al. |
Table of Modified Amino Acids for Which Charge Parameters Are Presented in This Work, Grouped by Scaffold Amino Acid (a)
| α-aminoisobutyric acid (AIB) | (R)-α-methyl-phenylalanine (MPH) | N4-methyl-asparagine (MEN) | N-methylaspartic acid (NMD) | (R)- |
| 2-aminobutyric acid (ABA) | 2-ethyl-4-O-methyl-biphenylalanine (TEF) | (2s,4s)-2,5-diamino-4-hydroxy-5-oxopentanoic acid (GHG) | 2-amino-propanedioic acid (FGL) | cysteine acetamide (YCM) |
| adamantane (ADA) | 3-methyl-biphenylalanine (TMB) | glutamine hydroxamate (HGA) | 3-methyl-aspartic acid (2AS) | N-methylcysteine (NMC) |
| 2-aminoheptanoic acid (AHP) | 3-O-methyl-biphenylalanine (TOM) | N-methyl-asparagine (NMN) | 2-amino-6-oxopimelic acid (26P) | carboxymethylated cysteine (CCS) |
| 3-cyclopentylalanine (CP3) | 2-ethyl-biphenylalanine (EBP) | β-asparagine (NBA) | β-aspartic acid (DBA) | benzylcysteine (BCS) |
| diethylalanine (DLE) | 2-methyl-4-O-methyl-biphenylalanine (MFO) | s-(2-hydroxyethyl)-cysteine (OCY) | ||
| R(+)-α-Allylalanine (AAL) | 2-methyl-biphenylalanine (MBP) | 5-methyltryptophan (MTR) | 2-indanyl-glycine (IGL) | s-acetonylcysteine (CSA) |
| (R)-α-ethyl alanine (REA) | biphenylalanine (BFA) | 1-methyltryptophan (OMW) | vinylglycine (LVG) | β-cysteine (CBA) |
| (S)-α-ethyl alanine (SEA) | 2-methylphenylalanine (MH2) | N-methyltryptophan (NMW) | phenylglycine (004) | |
| cyclohexylalanine (ALC) | 3-methylphenylalanine (APD) | 2-hydroxytryptophan (TRO) | 4-hydroxyphenylglycine (D4P) | hydroxyl-methionine (ME0) |
| 1-naphthylalanine (ALN) | 4-methylphenylalanine (4PH) | 4-amino-tryptophan (4IN) | (2s)-amino(3,5-dihydroxyphenyl)-ethanoic acid (3FG) | ethionine (ESC) |
| 2-naphthylalanine (NAL) | 4-tert-butyl-phenylalanine (TP4) | 6-methyltryptophan (TR6) | N-methylglycine (NMG) | N-methyl-methionine (MME) |
| 5-hydroxy-1-naphthalene (NO1) | 4-amino-phenylalanine (HOX) | 5-methoxytryptophan (MT5) | 2-allyl-glycine (2AG) | β-methionine (MBA) |
| 6-hydroxy-2-naphthalene (NO2) | 4-methoxy-phenylalanine (0A1) | β-hydroxy-tryptophan (HTR) | β-glycine (GBA) | |
| 3-(9-anthryl)-alanine (ANT) | 5-hydroxytryptophan (HRP) | |||
| 3-(2-pyridyl)-alanine (PY2) | 4-carbamimidoyl-phenylalanine (0BN) | β-tryptophan (WBA) | (R)-(+)-α-methylvaline (MVL) | norleucine (NLE) |
| 3-(3-pyridyl)-alanine (PY3) | 4-hydroxymethyl-phenylalanine (4HP) | norvaline (NVA) | 5-oxo-norleucine (ONL) | |
| 3-(4-pyridyl)-alanine (PY4) | 3-ethyl-phenylalanine (DMP) | O-methyltyrosine (OMY) | N-methyl-valine (MVA) | N-methyl-leucine (MLE) |
| 3-(2-quinolyl)-alanine (Q32) | 3,4-dimethylphenylalanine (D34) | O-ethyltyrosine (OEY) | β-valine (VBA) | homoleucine (HLE) |
| 3-(3-quinolyl)-alanine (Q33) | phenylserine (BB8) | O-allyltyrosine (OAY) | (R)-α-methylleucine (RML) | |
| 3-(4-quinolyl)-alanine (Q34) | homophenylalanine (HPE) | N-methyltyrosine (NMY) | N-methylthreonine (NMT) | β-hydroxyleucine (HLU) |
| 3-(5-quinolyl)-alanine (Q35) | 3,3-diphenylalanine (DIF) | O-tyrosine (OTY) | hydroxynorvaline (VAH) | |
| 3-(6-quinolyl)-alanine (Q36) | kynurenine (KYN) | 3-amino-tyrosine (TY2) | β-threonine (TBA) | β-leucine (LBA) |
| 3-(8-hydroxyquinolin-3-yl)-alanine (HQA) | N-methyl-phenylalanine (MEA) | 3-amino-6-hydroxy-tyrosine (TYQ) | ||
| N-methylalanine (NMA) | β-phenylalanine (FBA) | (β-R)-β-hydroxy-tyrosine (OMX) | (R)-α-methylornithine (RMO) | homoserine (HSE) |
| 1-pyrenylalanine (PAL) | β-tyrosine (YBA) | (S)-α-methylornithine (SMO) | 2-amino-5-hydroxypentanoic acid (LDO) | |
| (R)-2-(2′-propenyl)-alanine (PRP) | N-methylisoleucine (NMI) | 2,3-diaminopropanoic acid (DPP) | 6-hydroxy-norleucine (AA4) | |
| (R)-2-(4′-pentenyl)-alanine (PEN) | allo-isoleucine (IIL) | N-methylglutamic acid (NME) | diaminobutyric acid (DAB) | N-methyl-serine (NMS) |
| (R)-2-(7′-octenyl)-alanine (OCT) | 3-methyl-alloisoleucine (I2M) | (3r)-3-methyl-glutamic acid (LME) | (2s)-2,8-diaminooctanoic acid (HHK) | β-serine (SBA) |
| β-alanine (AAB) | β-isoleucine (IBA) | (3s)-3-methyl-glutamic acid (MEG) | N-methyl-lysine (NMK) | |
| 2s,4r-4-methylglutamate (SYM) | β-lysine (KBA) | N-methylhistidine (NMH) | ||
| N-methylglutamine (NMQ) | N-methylarginine (NMR) | 5- | ||
| N5-methyl-glutamine (MEQ) | ||||
| 3-methyl glutamine (LMQ) | ||||
| β-glutamine (QBA) |
(a) Corresponding three-letter codes are listed in parentheses following each amino acid.
Data Set Used for Testing the Optimized Charges Introduced in Forcefield_NCAA (a)
| analog | source | sequence | SeqID | IC50 (μM) |
|---|---|---|---|---|
| 1 | pharmacophore | Ac-ICV(PTR)QDWGAHRCI-NH2 | 28 | 9.60 |
| 2 | pharmacophore | Ac-RCVVQDWGHHRCT-NH2 | 17 | 8.00 |
| 3 | pharmacophore | Ac-LCVVQDWGWHRCG-NH2 | 15 | 5.40 |
| 4 | pharmacophore | Ac-ICVWQDWGWHRCT-NH2 | 24 | 3.10 |
| 5 | pharmacophore | Ac-ICVVNDWGHHRCT-NH2 | 3 | 4.20 |
| 6 | structurekinetic | Ac-ICV(OMY)QDWGAHRCT-NH2 | 5 | 1.30 |
| 7 | pharmacophore | Ac-MCVHQDWGGHRCF-NH2 | 16 | 85.20 |
| 8 | pharmacophore | Ac-ICVWQDWGHHRCT-NH2 | 2 | 2.20 |
| 9 | structurekinetic | Ac-ICV(MTR)QDWGAHRCT-NH2 | 3 | 0.87 |
| 10 | novel analogues | Ac-ICVYQDWGAHRC(NMT)-NH2 | 12 | 1.90 |
| 11 | pharmacophore | Ac-ICV(OMW)QDWGAHRCT-NH2 | 1 | 0.21 |
| 12 | pharmacophore | Ac-ICVSQDWGHHRCT-NH2 | 20 | 50.90 |
| 13 | pharmacophore | Ac-ICVVQDWGHHSCT-NH2 | 10 | 25.00 |
| 14 | pharmacophore | Ac-ICVVQDWGHHRCI-NH2 | 13 | 3.20 |
| 15 | structurekinetic | Ac-ICVWQDWG(AIB)HRCT-NH2 | 12 | 1.50 |
| 16 | pharmacophore | Ac-ICVWQDWGAHRCT | 25 | 2.00 |
| 17 | pharmacophore | Ac-ICVVNDWGHHACT-NH2 | 11 | 60.00 |
| 18 | novel analogues | Ac-ICVYQDWGAHR(NMC)T-NH2 | 11 | 154.00 |
| 19 | structurekinetic | Ac-ICV(PAL)QDWGAHRCT-NH2 | 9 | 1.20 |
| 20 | pharmacophore | Ac-ICV(ALC)QDWGAHRCT | 27 | 53.60 |
| 21 | pharmacophore | Ac-ICVHQDWGHHRCT-NH2 | 21 | 10.50 |
| 22 | pharmacophore | Ac-ICVVQDWGAHACT-NH2 | 12 | 9.90 |
| 23 | structurekinetic | Ac-ICVWQDWGAHRCT-NH2 | 0 | 1.20 |
| 24 | pharmacophore | Ac-ICVWQD(OMW)GAHRCT-NH2 | 4 | 1000.00 |
| 25 | pharmacophore | Ac-ICLVQDWGHHRCT-NH2 | 8 | 10.00 |
| 26 | pharmacophore | Ac-ICVYQDWGAHRCT-NH2 | 23 | 3.80 |
| 27 | structurekinetic | Ac-ICVYQDWGAHRCT-NH2 | 4 | 2.40 |
| 28 | pharmacophore | Ac-ICVWQDWG(AIB)HRCT-NH2 | 29 | 1.50 |
| 29 | structurekinetic | Ac-ICVVQDWGHHRCT-NH2 | 15 | 4.50 |
| 30 | pharmacophore | Ac-ICVAQDWGAHRCI-NH2 | 7 | 12.00 |
| 31 | pharmacophore | Ac-ICLVNDWGHHRCT-NH2 | 9 | 8.30 |
| 32 | novel analogues | Ac-ICVYQD(NMW)GAHRCT-NH2 | 6 | 25.00 |
| 33 | pharmacophore | Ac-ICV(ALN)QDWGAHRCT | 31 | 1.80 |
| 34 | pharmacophore | Ac-ICVTQDWGHHRCT-NH2 | 19 | 68.30 |
| 35 | novel analogues | Ac-ICVYQ(NMD)WGAHRCT-NH2 | 5 | 44.00 |
| 36 | novel analogues | Ac-ICVYQDW(NMG)AHRCT-NH2 | 7 | 584.47 |
| 37 | novel analogues | Ac-ICVY(NMQ)DWGAHRCT-NH2 | 4 | 33.00 |
| 38 | novel analogues | Ac-ICVYQDWGAHRCT-NH2 | 0 | 2.40 |
| 39 | pharmacophore | Ac-LCVWQDWGRHQCF-NH2 | 14 | 131.00 |
| 40 | pharmacophore | Ac-ICVFQDWGHHRCT-NH2 | 22 | 10.20 |
| 41 | novel analogues | Ac-ICVYQDWGAH(NMR)CT-NH2 | 10 | 32.00 |
| 42 | novel analogues | Ac-ICVYQDWGA(NMH)RCT-NH2 | 9 | 94.00 |
| 43 | structurekinetic | Ac-ICV(OMW)QDWGPHRCT-NH2 | 14 | 0.54 |
| 44 | pharmacophore | Ac-DCVVQDWGHHRCT-NH2 | 18 | 22.00 |
| 45 | structurekinetic | Ac-ICV(OEY)QDWGAHRCT-NH2 | 6 | 1.30 |
| 46 | novel analogues | Ac-ICVYQDWG(NMA)HRCT-NH2 | 8 | 1000.00 |
| 47 | pharmacophore | CVVQDWGHHRC-NH2 | del1 | 33.00 |
| 48 | pharmacophore | CVVQDWGHC-NH2 | del9 | 600.00 |
| 49 | novel analogues | Ac-I(NMC)VYQDWGAHRCT-NH2 | 1 | 7.50 |
| 50 | novel analogues | Ac-ICV(NMY)QDWGAHRCT-NH2 | 3 | 1000.00 |
| 51 | pharmacophore | Ac-ICVVGDWGHHRCT-NH2 | 6 | 567.00 |
| 52 | pharmacophore | CVVQDWGHHRCT-NH2 | del0 | 25.00 |
| 53 | pharmacophore | ICVVQDWGHHRCT | 0 | 12.00 |
| 54 | pharmacophore | IAVVQDWGHHRAT | 5 (Linear) | 600.00 |
| 55 | pharmacophore | CVVQDWC-NH2 | del8 | 600.00 |
| 56 | pharmacophore | CAVQDWGHHRC | del10 | 1200.00 |
| 57 | pharmacophore | CWGHHRCT-NH2 | del4 | 600.00 |
| 58 | pharmacophore | CVVQDWAHHRC | del11 | 1200.00 |
| 59 | pharmacophore | CVQDWGHHRCT-NH2 | del7 | 600.00 |
| 60 | pharmacophore | CQDWGHHRCT-NH2 | del6 | 600.00 |
| 61 | pharmacophore | CDWGHHRCT-NH2 | del5 | 600.00 |
| 62 | pharmacophore | CHHRCT-NH2 | del2 | 600.00 |
| 63 | pharmacophore | CGHHRCT-NH2 | del3 | 600.00 |
(a) The noncanonical amino acids studied are phosphotyrosine (PTR), O-methyltyrosine (OMY), N-methylthreonine (NMT), 5-methyltryptophan (MTR), 1-methyltryptophan (OMW), α-aminoisobutyric acid (AIB), N-methylcysteine (NMC), 1-pyrenylalanine (PAL), cyclohexylalanine (ALC), N-methyltryptophan (NMW), 1-naphthylalanine (ALN), N-methylaspartic acid (NMD), N-methylglycine (NMG), N-methylglutamine (NMQ), N-methylarginine (NMR), N-methylhistidine (NMH), O-ethyltyrosine (OEY), N-methylalanine (NMA), and N-methyltyrosine (NMY). Several of these noncanonical amino acids were substituted at different positions in the Compstatin sequence. Ac (ACE) and NH2 denote the N-terminal acetyl and C-terminal amide blocking groups, respectively, which keep the termini neutrally charged.
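The sequence notation used in the table above (one-letter codes for canonical residues, parenthesized three-letter codes for noncanonical ones, optional `Ac-` and `-NH2` caps) can be expanded into a residue list before building a structure. The following is only a minimal sketch of such a parser; the function name `parse_analog` and its return format are illustrative assumptions and are not part of Forcefield_NCAA or its web interface.

```python
import re

# One-letter to three-letter codes for the 20 canonical amino acids.
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}

# A token is either a parenthesized three-character code, e.g. (OMW),
# or a single capital letter for a canonical residue.
TOKEN = re.compile(r"\(([A-Z0-9]{3})\)|([A-Z])")


def parse_analog(sequence):
    """Hypothetical helper: return (has_acetyl, residues, has_amide)."""
    seq = sequence.strip().replace("\u2013", "-")  # normalize en dashes
    has_acetyl = seq.startswith("Ac-")
    if has_acetyl:
        seq = seq[len("Ac-"):]
    has_amide = seq.endswith("-NH2")
    if has_amide:
        seq = seq[:-len("-NH2")]

    residues = []
    pos = 0
    for match in TOKEN.finditer(seq):
        if match.start() != pos:  # unparsable characters between tokens
            raise ValueError(f"unexpected text at position {pos}: {seq!r}")
        if match.group(1):        # noncanonical three-letter code
            residues.append(match.group(1))
        else:                     # canonical one-letter code
            code = ONE_TO_THREE.get(match.group(2))
            if code is None:
                raise ValueError(f"unknown residue {match.group(2)!r}")
            residues.append(code)
        pos = match.end()
    if pos != len(seq):
        raise ValueError(f"unexpected trailing text in {seq!r}")
    return has_acetyl, residues, has_amide


if __name__ == "__main__":
    # Analog 11 from the table above.
    print(parse_analog("Ac-ICV(OMW)QDWGAHRCT-NH2"))
```

Entries without caps, such as `ICVVQDWGHHRCT`, parse the same way and simply report both cap flags as false.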
Figure 1. Receiver operating characteristic (ROC) curves constructed from rank-ordered lists of Compstatin variants’ binding metrics. ROC curves for the list rank-ordered by K*, with an active IC50 cutoff of <20 μM (A) and <200 μM (B). ROC curves for the list rank-ordered by $\Delta G^{\circ,\mathrm{GBSA}}_{\mathrm{Bind,Solv}}$, with an active IC50 cutoff of <20 μM (C) and <200 μM (D).
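As an illustration of how a ROC curve can be built from a rank-ordered list and an IC50 activity cutoff, the sketch below labels each analog as active when its IC50 is below the cutoff, walks down the ranking, and integrates the curve with the trapezoidal rule. It is not the authors' analysis code: the function names are hypothetical, ties in the score are broken arbitrarily, and the score values in the usage example are made up for demonstration (only the four IC50 values are taken from the data set table). The sketch assumes that a lower (more negative) binding free energy ranks better, whereas for K* a higher value is assumed to rank better (`lower_is_better=False`).

```python
def roc_points(scores, ic50s_um, cutoff_um, lower_is_better=True):
    """Return ROC points [(FPR, TPR), ...] for analogs ranked by score.

    An analog is labeled active when its IC50 (in uM) is below cutoff_um.
    """
    actives = [ic50 < cutoff_um for ic50 in ic50s_um]
    n_pos = sum(actives)
    n_neg = len(actives) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one active and one inactive analog")

    # Best-ranked analog first.
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=not lower_is_better)

    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if actives[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    # IC50 values (uM) of analogs 11, 9, 7, and 24 from the data set table.
    ic50s = [0.21, 0.87, 85.20, 1000.00]
    # Made-up binding free energies (kcal/mol), for illustration only.
    fake_dg = [-40.0, -35.0, -20.0, -25.0]
    pts = roc_points(fake_dg, ic50s, cutoff_um=20.0)
    print(pts, auc(pts))
```

An area of 0.5 corresponds to a random ranking, and values approaching 1.0 indicate that actives are ranked ahead of inactives.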
Figure 2. Correlation between IC50 and calculated binding free energies using MM-GBSA for 63 Compstatin analogs. The blue bands correspond to the 95% confidence interval for the regression line. The red bands correspond to the 95% prediction interval, within which a new value is expected to lie. Error bars are ±1 standard deviation from the mean calculated binding free energy.
Figure 3. (A) Nonpolar and (B) polar interaction maps for analog W4(OMW)A9. (C) Nonpolar and (D) polar interaction maps for analog W4A9. (E) Nonpolar and (F) polar interaction maps for native Compstatin. The color bar represents the interaction free energy between the corresponding residue–residue pairs in kcal/mol. The color bar was scaled to be the same for the nonpolar and polar interaction free energy contributions so that the different analogues’ energetic contributions can be directly compared.
Figure 4. Web interface for the dissemination of Forcefield_NCAA. The web interface has static links to download and use Forcefield_NCAA in AMBER locally, as well as an interactive interface to make noncanonical amino acid substitutions and mutations to an input PDB structure. Screenshot taken April 25, 2013.
Figure 5. Automated framework for AMBER partial charge parametrization for noncanonical, α,α-disubstituted, β- and N-methylated amino acids. Adapted with permission from ref (50). Copyright 2013, American Chemical Society.
Figure 6. (A) Crystal structure of Compstatin Variant E1 bound to Complement component C3c (PDB: 2QKI). Each macroglobulin domain (MG) is denoted by color. (B) The region where modified amino acids are to be substituted is shown in the inset. This is the interface that is being designed when making natural or noncanonical amino acid substitutions. A disulfide bridge cyclizes Compstatin between residues 2 and 12. (C) The full region on which all calculations are performed, spanning F335 of MG4 to D535 of MG5, as labeled. The images are visualized in PyMOL.[78]
Figure 7. Thermodynamic cycle used to calculate the binding free energies. Ideally, one would calculate the binding free energy for the association [A] + [B] ⇌ [AB] directly in solvent, but this calculation is expensive and noisy because of the contribution of the solvent. Therefore, a thermodynamic cycle was used instead, which yields the same free energy difference from the solvation free energies of the protein (receptor), peptide (ligand), and complex, together with the binding energy calculated in vacuo.[87]
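The cycle described in this caption can be written out explicitly. The expression below is the form commonly used in MM-GBSA calculations; the subscript labels are notation chosen here for illustration and may differ from those used in the paper.

```latex
\Delta G^{\circ}_{\mathrm{Bind,Solv}}
  = \Delta G^{\circ}_{\mathrm{Bind,Vacuum}}
  + \Delta G^{\circ}_{\mathrm{Solv,Complex}}
  - \left( \Delta G^{\circ}_{\mathrm{Solv,Receptor}}
         + \Delta G^{\circ}_{\mathrm{Solv,Ligand}} \right)
```

Here the vacuum term is the binding energy of receptor and ligand computed in vacuo, and the three solvation terms are the free energies of transferring the complex, the receptor, and the ligand from vacuum into solvent.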