Bifang He, Bowen Li, Xue Chen, Qianyue Zhang, Chunying Lu, Shanshan Yang, Jinjin Long, Lin Ning, Heng Chen, Jian Huang.
Abstract
Monoclonal antibody drugs targeting the PD-1/PD-L1 pathway have shown efficacy in the treatment of cancer patients; however, they have many intrinsic limitations and inevitable drawbacks. Peptide inhibitors might serve as alternatives that compensate for the drawbacks of current PD-1/PD-L1 interaction blockers. Identifying PD-L1 binding peptides by random peptide library screening is a time-consuming and labor-intensive process. Machine learning-based computational models enable rapid discovery of peptide candidates targeting the PD-1/PD-L1 pathway. In this study, we first employed next-generation phage display (NGPD) biopanning to isolate PD-L1 binding peptides. Different peptide descriptors, feature selection methods, and machine learning methods were then combined to build predictive models of PD-L1 binding. Finally, we proposed PDL1Binder, an ensemble computational model for efficiently obtaining PD-L1 binding peptides. Our results suggest that predictive models of PD-L1 binding can be learned from deep sequencing data and provide a new path to discover PD-L1 binding peptides. A web server was implemented for PDL1Binder, which is freely available at http://i.uestc.edu.cn/pdl1binder/cgi-bin/PDL1Binder.pl.
Keywords: PD-1/PD-L1 pathway; PD-L1 binding peptides; machine learning; next-generation phage display (NGPD) biopanning; support vector machine (SVM)
Year: 2022 PMID: 35910615 PMCID: PMC9335124 DOI: 10.3389/fmicb.2022.928774
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 6.064
Number of PD-L1 and non-PD-L1 binding peptides in each dataset.
| Dataset | Number of PD-L1 binding peptides | Number of non-PD-L1 binding peptides |
| Training dataset | 80 | 80/80/80/80/80/80/80/80/80/80 |
| TestDataset_1 | 30 | / |
| TestDataset_2 | / | 221405 |
For the training dataset, each negative sub-dataset with 80 non-PD-L1 binding peptides was paired with the positive training dataset composed of 80 PD-L1 binding peptides.
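The pairing described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `build_balanced_training_sets`, the random disjoint sampling of negatives, the size of the negative pool, and the placeholder peptide identifiers are all assumptions, since the source only states that each of the 10 negative sub-datasets (80 peptides) was paired with the same 80 positives.

```python
import random


def build_balanced_training_sets(positives, negative_pool, n_subsets=10, seed=0):
    """Pair one positive set with n_subsets equally sized negative subsets.

    Assumption: negatives are drawn at random without overlap; the source
    does not state how the negative sub-datasets were chosen.
    """
    size = len(positives)
    needed = n_subsets * size
    if len(negative_pool) < needed:
        raise ValueError("negative pool is too small for disjoint sub-datasets")
    rng = random.Random(seed)
    sampled = rng.sample(negative_pool, needed)
    training_sets = []
    for i in range(n_subsets):
        negatives = sampled[i * size:(i + 1) * size]
        peptides = list(positives) + negatives
        labels = [1] * size + [0] * size  # 1 = binder, 0 = non-binder
        training_sets.append((peptides, labels))
    return training_sets


# Placeholder identifiers stand in for real peptide sequences.
positives = [f"pos_{i}" for i in range(80)]
negative_pool = [f"neg_{i}" for i in range(1000)]
training_sets = build_balanced_training_sets(positives, negative_pool)
print(len(training_sets), len(training_sets[0][0]))
```

Each resulting set is balanced (80 positives, 80 negatives), so one submodel can be trained per set without class-weight adjustments.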
FIGURE 1. Overview of this study.
FIGURE 2. Deep sequencing of the output of all selection rounds and the control experiments identified peptide sequences that exhibited high normalized abundance in R2 and low normalized abundance in R0, R1, and the control screens R2-DB and R2-UF. Twenty-nine sequences from the deep sequencing results were clustered into five groups. Rep, replicate; R0, the library before round 1; R1, the first round of panning against PD-L1 ECD; R2-DB, panning the enriched Ph.D.-12 library from R1 against the Dynabeads; R2-UF, panning the enriched Ph.D.-12 library from R1 against an unrelated anti-FLAG M2 monoclonal antibody; R2, panning the enriched Ph.D.-12 library from R1 against PD-L1 ECD.
List of 519 features.
| Peptide descriptor | Feature dimension |
| Amino acid composition (AAC) | 20 |
| Dipeptide composition (DPC) | 400 |
| Pseudo amino acid composition (PseAAC) | 24 |
| Composition of k-spaced amino acid group pairs (CKSAAGP) | 75 |
| Total (AAC, DPC, PseAAC, CKSAAGP) | 519 |
Each peptide was represented by four types of peptide descriptors, which were concatenated into a single feature vector with 519 dimensions (20 + 400 + 24 + 75).
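To make the feature layout concrete, the sketch below computes the two simplest descriptors, AAC (20 values) and DPC (400 values), and concatenates them. It is illustrative only: the function names and the example 12-mer are hypothetical, the normalization (frequencies rather than raw counts) is the common convention but is assumed here, and PseAAC and CKSAAGP are omitted because their parameters are not given in this record.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def aac(seq):
    """Amino acid composition: frequency of each of the 20 residues."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: frequency of each of the 400 adjacent pairs."""
    total = len(seq) - 1  # number of adjacent residue pairs
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
    return [counts[d] / total for d in DIPEPTIDES]


# Hypothetical 12-mer, chosen only to exercise the functions.
peptide = "ACDEFGHIKLMN"
features = aac(peptide) + dpc(peptide)
print(len(features))  # 20 + 400 entries; PseAAC (24) and CKSAAGP (75) would follow
```

Appending the 24 PseAAC and 75 CKSAAGP values to this 420-dimensional vector would give the 519 dimensions listed in the table.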
FIGURE 3. The performance metrics of each submodel. All data were expressed as mean ± standard deviation. SVM, Support vector machine; LR, Logistic regression; SGD, Stochastic gradient descent; NaïveBayes, Naïve Bayes; Pearson, Pearson’s correlation coefficient; CHI2, Chi-square test; IG, Information gain; FScore, F-score value; MIC, Mutual information.
FIGURE 4. Framework of the proposed scheme for PD-L1 binding peptide prediction.
FIGURE 5. Webpage of PDL1Binder. (A) Input interface of PDL1Binder. Users can submit query sequences in FASTA or plain text format. The threshold tp can be set by users, ranging from 0 to 1. (B) Output interface of PDL1Binder. PDL1Binder outputs the number of SVM-based submodels that identify the query peptide as a PD-L1 binding peptide and the probability that the query sequence is a PD-L1 binding peptide. This output probability is obtained by averaging the probability values of the 10 SVM-based submodels.
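The output described in the Figure 5 caption can be sketched as below. The function name `ensemble_predict` and the example probabilities are hypothetical. Two details are assumptions rather than facts from the source: that each submodel's vote is counted by comparing its own probability with tp, and that the final call compares the averaged probability with tp using `>=`.

```python
def ensemble_predict(submodel_probs, tp=0.5):
    """Combine per-submodel probabilities into a vote count and a mean probability.

    Assumptions: a submodel votes "binder" when its probability is >= tp, and
    the ensemble call is made by comparing the mean probability with tp.
    """
    if not submodel_probs:
        raise ValueError("at least one submodel probability is required")
    mean_prob = sum(submodel_probs) / len(submodel_probs)
    votes = sum(p >= tp for p in submodel_probs)
    return {"votes": votes, "probability": mean_prob, "is_binder": mean_prob >= tp}


# Hypothetical probabilities from 10 SVM-based submodels for one query peptide.
probs = [0.9, 0.8, 0.7, 0.6, 0.6, 0.5, 0.4, 0.4, 0.3, 0.3]
result = ensemble_predict(probs, tp=0.5)
print(result["votes"], round(result["probability"], 2), result["is_binder"])
```

Averaging probabilities rather than taking a majority vote keeps the output continuous, which is what allows a user-adjustable tp to trade sensitivity against specificity.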
Performance of PDL1Binder in two independent testing datasets under different tp-values.
| tp | 0.10 | 0.15 | 0.20 | 0.25 | 0.30 | 0.35 |
| TestDataset_1 | 100.00% | 100.00% | 100.00% | 96.67% | 93.33% | 93.33% |
| TestDataset_2 | 0.80% | 2.79% | 5.03% | 8.92% | 15.18% | 20.29% |
| tp | 0.40 | 0.45 | 0.50 | 0.55 | 0.60 | 0.65 |
| TestDataset_1 | 93.33% | 86.67% | 83.33% | | 63.33% | 53.33% |
| TestDataset_2 | 27.52% | 37.39% | 44.31% | | 62.34% | 71.17% |
| tp | 0.70 | 0.75 | 0.80 | 0.85 | 0.90 | 0.95 |
| TestDataset_1 | 43.33% | 43.33% | 30.00% | 10.00% | 3.33% | 0.00% |
| TestDataset_2 | 80.84% | 86.43% | 92.26% | 96.52% | 99.09% | 100.00% |
tp, threshold of probability value to differentiate between predicted positives and negatives. Performance metric is the predictive accuracy of PD-L1 binding for TestDataset_1 and that of non-PD-L1 binding for TestDataset_2. Bold: The predictive accuracy of PD-L1 binding and that of non-PD-L1 binding have reached their maximum under tp = 0.55.
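A threshold sweep of this kind can be sketched as follows. The function name and the probability lists are hypothetical toy values, not data from the study, and the `>=` comparison at the threshold is an assumption. For a positives-only set the metric is the fraction predicted as binders; for a negatives-only set it is the fraction predicted as non-binders.

```python
def accuracy_by_threshold(pos_probs, neg_probs, thresholds):
    """Return (tp, accuracy on positives, accuracy on negatives) per threshold."""
    rows = []
    for tp in thresholds:
        pos_acc = sum(p >= tp for p in pos_probs) / len(pos_probs)
        neg_acc = sum(p < tp for p in neg_probs) / len(neg_probs)
        rows.append((tp, pos_acc, neg_acc))
    return rows


# Toy ensemble probabilities, invented for illustration only.
pos_probs = [0.92, 0.81, 0.77, 0.64, 0.58, 0.41]
neg_probs = [0.05, 0.12, 0.33, 0.47, 0.52, 0.71]
thresholds = [round(0.05 * i, 2) for i in range(2, 20)]  # 0.10 to 0.95

rows = accuracy_by_threshold(pos_probs, neg_probs, thresholds)
for tp, pos_acc, neg_acc in rows:
    print(f"tp={tp:.2f}  positives={pos_acc:.2%}  negatives={neg_acc:.2%}")
```

As in the table, raising tp can only lower the accuracy on binders and raise the accuracy on non-binders, so the choice of tp is a trade-off between the two.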