Kannan Sankar, Kam Hon Hoi, Yizhou Yin, Prasanna Ramachandran, Nisana Andersen, Amy Hilderbrand, Paul McDonald, Christoph Spiess, Qing Zhang.
Abstract
Monoclonal antibodies (mAbs) have become a major class of protein therapeutics that target a spectrum of diseases ranging from cancers to infectious diseases. Similar to any protein molecule, mAbs are susceptible to chemical modifications during the manufacturing process, long-term storage, and in vivo circulation that can impair their potency. One such modification is the oxidation of methionine residues. Chemical modifications that occur in the complementarity-determining regions (CDRs) of mAbs can lead to the abrogation of antigen binding and reduce the drug's potency and efficacy. Thus, it is highly desirable to identify and eliminate any chemically unstable residues in the CDRs during the therapeutic antibody discovery process. To provide increased throughput over experimental methods, we extracted features from the mAbs' sequences, structures, and dynamics, used random forests to identify the important features, and developed a quantitative and highly predictive in silico methionine oxidation model.
Keywords: Chemical stability; QSPR; algorithm; computer aided drug design; elastic network model; in silico modeling; mass spectrometry; molecular modeling; protein structure; structure property relationship
Year: 2018 PMID: 30252602 PMCID: PMC6284603 DOI: 10.1080/19420862.2018.1518887
Source DB: PubMed Journal: MAbs ISSN: 1942-0862 Impact factor: 5.857
List of descriptors investigated in this study.
| No. | Descriptor Name | Explanation | Source | In Final Model? |
|---|---|---|---|---|
| 1 | NoverlRes<sup>a</sup> | Number of overlaps between atoms of Met residue with spatial neighbors | Structure | Yes |
| 2 | TotSasaRes<sup>a</sup> | Total solvent accessible surface area of Met residue | Structure | Yes |
| 3 | anmFluc<sup>a</sup> | Mean square fluctuation of Met Cα atom based on Anisotropic Network Model | Dynamics | Yes |
| 4 | hnmFluc<sup>a</sup> | Mean square fluctuation of Met Cα atom based on Hinsen’s Network Model | Dynamics | Yes |
| 5 | PhobSasaRes<sup>a</sup> | Hydrophobic partition of the solvent accessible surface area of Met residue | Structure | No |
| 6 | PhilSasaRes<sup>a</sup> | Hydrophilic partition of the solvent accessible surface area of Met residue | Structure | No |
| 7 | cdrLength | Length of CDR in which Met is located | Sequence | No |
| 8 | Centeredness | Location of Met with respect to center of CDR | Sequence | No |
| 9 | cdrLocation | CDR in which Met is located (CDR-H1/H2/H3/L1/L2/L3) | Sequence | No |
| 10 | IgGType | IgG type of the antibody | Sequence | No |
| 11 | lcFramework | Germline family of the light chain | Sequence | No |
| 12 | hcFramework | Germline family of the heavy chain | Sequence | No |
| 13 | QSasaRes<sup>a</sup> | Ratio of exposed-to-total solvent accessible surface area of Met residue | Structure | No |
| 14 | dipoleMoment | Magnitude of the dipole moment of the mAb | Structure | No |
| 15 | energyInt | Energy of interaction between VH and VL | Structure | No |
| 16 | protpI3D | 3D structure-based pI of the protein | Structure | No |
| 17 | chargeAtpH5 | Net charge of the mAb at pH 5.0 | Structure | No |
| 18 | chargeAtpH7 | Net charge of the mAb at pH 7.0 | Structure | No |
<sup>a</sup> These descriptors were also calculated for the (N-1)th and (N+1)th residues, where N is the index of the Met residue, but those values were not identified to be useful.
Figure 1. Schematic workflow of the methodology. Antibody sequences are obtained from an in-house database, and the Fv region of each antibody is modeled using MOE protocols. Features are extracted from the sequence, structure, and dynamics of the mAb Fv regions and used to implement a random forest-based predictor in R. Model performance is assessed using the standard metrics of correlation and root mean square error for the regressor model, and accuracy, precision, sensitivity, and specificity for the implicit classifier model.
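The predictor in this workflow is a random forest built in R. As a rough illustration of the bagging idea behind such a model, the following Python sketch trains one-split regression "stumps" on bootstrap samples of a single feature and averages, for each data point, only the stumps that never saw that point (the out-of-bag prediction). This is a deliberately simplified stand-in and not the authors' implementation: a real random forest grows full trees and samples features at each split. The function names and the toy values are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression stump on one feature by least squares."""
    overall = mean(ys)
    best = (float("inf"), None, overall, overall)  # (sse, threshold, left, right)
    for t in sorted(set(xs))[:-1]:  # every unique value except the largest
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    if threshold is None:  # bootstrap sample had a single unique x value
        return left_mean
    return left_mean if x <= threshold else right_mean


def oob_predictions(xs, ys, n_trees=200, seed=0):
    """Average, per point, the predictions of stumps trained without that point."""
    rng = random.Random(seed)
    n = len(xs)
    sums, counts = [0.0] * n, [0] * n
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        stump = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # point i is out-of-bag for this stump
                sums[i] += predict_stump(stump, xs[i])
                counts[i] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical toy data: one surface-area-like feature and a % oxidation response.
xs = [10, 20, 30, 40, 80, 90, 100, 110]
ys = [2, 4, 3, 5, 40, 55, 60, 48]
oob = oob_predictions(xs, ys)
```

Because every prediction comes from models that did not train on the point in question, the resulting error estimate behaves like a built-in cross-validation.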
Figure 2. Scatterplot showing the predicted vs. experimental % change in oxidized species for 172 Met residues. The abscissa represents the experimentally measured % change in oxidized species upon AAPH treatment, whereas the ordinate represents the value predicted by the random forest regressor model. Residues with a relative change < 25% (‘Non-liable’) as identified by experiment are colored blue, while liable residues are colored red. Outliers (having a prediction error > 3) that are mispredicted according to the classifier scheme are shown with a ‘+’ sign, and correctly predicted outliers as hollow circles. Non-outliers that are correctly predicted are shown as filled circles, and mispredicted ones with a ⊕ sign. The line of best fit (excluding outliers) is shown in green.
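The regressor metrics and the implicit classifier can be sketched in a few lines of Python. The 25% cutoff comes from the caption; applying the same cutoff to the predicted values is an assumption made here for illustration, and the numbers below are invented, not taken from the study.

```python
from math import sqrt

LIABLE_THRESHOLD = 25.0  # % change in oxidized species (cutoff from the caption)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(x, y):
    """Root mean square error between measured and predicted values."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def label(value, threshold=LIABLE_THRESHOLD):
    """Turn a continuous % change into a class label (assumed: same cutoff for predictions)."""
    return "Liable" if value >= threshold else "Non-liable"


# Hypothetical measurements and predictions for four Met residues.
experimental = [5.0, 10.0, 30.0, 60.0]
predicted = [8.0, 12.0, 20.0, 55.0]

r = pearson_r(experimental, predicted)
error = rmse(experimental, predicted)
exp_labels = [label(v) for v in experimental]
pred_labels = [label(v) for v in predicted]
```

In this toy case the third residue is measured as liable but predicted as non-liable, which is the kind of misprediction the caption marks with a ‘+’ or ⊕ sign.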
Performance measures of the random forest classifier on the training dataset of 172 Met residues.
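| | Experimental ‘Liable’ | Experimental ‘Non-liable’ | |
|---|---|---|---|
| Predicted ‘Liable’ | TP | FP | Precision = TP / (TP + FP) |
| Predicted ‘Non-liable’ | FN | TN | |
| | Recall/Sensitivity = TP / (TP + FN) | Specificity = TN / (TN + FP) | Accuracy = (TP + TN) / (TP + TN + FP + FN) |
| Matthews Correlation Coefficient (MCC) = (TP × TN − FP × FN) / √[(TP + FP)(TP + FN)(TN + FP)(TN + FN)] | | | |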
TP = True Positive, TN = True Negative, FP = False Positive, FN = False Negative
All predictions are ‘out-of-bag’ (OOB); that is, the prediction for each data point was made using only the trees that were not trained on that point.
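For reference, the performance measures named in this table follow from the four confusion-matrix counts using their standard definitions. The sketch below is a minimal Python version; the function names are hypothetical and the counts in the usage example are invented for illustration, not the values reported in the study.

```python
from math import sqrt


def confusion_counts(experimental, predicted, positive="Liable"):
    """Count TP, FP, FN, TN from paired experimental and predicted labels."""
    tp = fp = fn = tn = 0
    for exp, pred in zip(experimental, predicted):
        if pred == positive and exp == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif exp == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classifier_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),  # also called sensitivity
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts, for illustration only.
metrics = classifier_metrics(tp=8, fp=2, fn=1, tn=9)
```

MCC is useful alongside accuracy when the two classes are unbalanced, since it uses all four cells of the confusion matrix.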
Figure 3. Scatterplot showing the distribution of important features for liable versus non-liable Met residues. The number of overlaps of the Met residue with atoms of spatial neighbors (the feature ‘NoverlRes’) is shown along the x-axis and the total solvent accessible surface area of the residue (the feature ‘TotSasaRes’) along the y-axis. Liable Met residues are shown in red and non-liable Met residues in blue. Outliers (having a prediction error > 3) that are mispredicted according to the classifier scheme are shown with a ‘+’ sign, and correctly predicted outliers as hollow circles in their respective colors. Non-outliers that are correctly predicted are shown as filled circles, and mispredicted ones with a ⊕ sign.
Figure 4. Comparison of receiver operating characteristic (ROC) curves for different methods on the benchmark clinical mAb dataset. Plot of the true positive rate (TPR) against the false positive rate (FPR) for our random forest-based prediction model (green) in comparison with that of Adimab (red) and MOPM (blue).
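A ROC curve of this kind is obtained by sweeping a decision threshold over the predicted scores and recording the TPR and FPR at each step. The Python sketch below shows one common way to do this, with the area under the curve computed by the trapezoidal rule. The labels and scores are hypothetical and unrelated to the benchmark dataset.

```python
def roc_curve(labels, scores):
    """Return (FPR, TPR) points; labels are 1 (liable) or 0, higher score = more liable."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Process all items sharing this score together so ties give one point.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels and predicted scores for six Met residues.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_curve(labels, scores)
area = auc(curve)
```

An area of 1.0 corresponds to a perfect ranking of liable above non-liable residues, and 0.5 to a random ranking, which is how curves from different methods can be compared on one plot.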
Figure 5. Map of liable and non-liable Met residues on the variable region of the antibodies. Histogram showing the frequencies of Met in the experimental dataset of 122 mAbs at various positions identified to be liable (red bars) and non-liable (green bars) based on Kabat numbering in different complementarity-determining regions of the heavy (left panel) and light chains (right panel). M100b is observed to be liable in one mAb and non-liable in another and therefore shows a green + red bar.
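The per-position tally behind such a histogram can be sketched as follows in Python. Only the M100b case (liable in one mAb, non-liable in another) comes from the caption; the other position labels and all counts are hypothetical placeholders, as are the function names.

```python
from collections import Counter


def tally_by_position(observations):
    """Count liable and non-liable Met observations per Kabat position."""
    liable, non_liable = Counter(), Counter()
    for position, is_liable in observations:
        (liable if is_liable else non_liable)[position] += 1
    return liable, non_liable


def mixed_positions(liable, non_liable):
    """Positions observed as both liable and non-liable (stacked bars in the plot)."""
    return sorted(set(liable) & set(non_liable))


# One (position, is_liable) entry per Met residue per mAb.
observations = [
    ("M100b", True),    # liable in one mAb (from the caption)
    ("M100b", False),   # non-liable in another mAb (from the caption)
    ("M_x", False),     # hypothetical placeholder position
    ("M_x", False),
    ("M_y", True),      # hypothetical placeholder position
]

liable, non_liable = tally_by_position(observations)
mixed = mixed_positions(liable, non_liable)
```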