Nicolas Borisov1, Anna Sergeeva2, Maria Suntsova3,4, Mikhail Raevskiy1, Nurshat Gaifullin5, Larisa Mendeleeva2, Alexander Gudkov3, Maria Nareiko2, Andrew Garazha6,7, Victor Tkachev6,7, Xinmin Li8, Maxim Sorokin3,6,7, Vadim Surin2, Anton Buzdin3,4,6.
Abstract
Multiple myeloma (MM) affects ~500,000 people and results in ~100,000 deaths annually, and is currently considered treatable but incurable. There are several MM chemotherapy treatment regimens, eleven of which include bortezomib, a proteasome-targeted drug. MM patients respond differently to bortezomib, and new prognostic biomarkers are needed to personalize treatments. However, there is a shortage of clinically annotated MM molecular data that could be used to establish novel molecular diagnostics. We report new RNA sequencing profiles for 53 MM patients annotated with responses to two similar chemotherapy regimens: bortezomib, doxorubicin, dexamethasone (PAD), and bortezomib, cyclophosphamide, dexamethasone (VCD), or with responses to their combinations. Fourteen patients received both PAD and VCD, six received only PAD, and 33 received only VCD. We compared profiles for the good and poor responders and found five genes commonly regulated here and in the previous datasets for other bortezomib regimens (all upregulated in the good responders): FGFR3, MAF, IGHA2, IGHV1-69, and GRB14. Four of these genes are linked with known immunoglobulin locus rearrangements. We then used five machine learning (ML) methods to build a classifier distinguishing good and poor responders for two cohorts: PAD + VCD (53 patients), and separately VCD (47 patients). We showed that the application of FloWPS dynamic data trimming was beneficial for all ML methods tested in both cohorts, and also in the previous MM bortezomib datasets. However, the ML models built for the different datasets did not allow cross-transferring, which may be due to different treatment regimens, experimental profiling methods, and MM heterogeneity.
Keywords: PAD; VCD; bortezomib; fibroblast growth factor receptor 3; gene expression; machine learning; multiple myeloma; treatment response
Year: 2021 PMID: 33937058 PMCID: PMC8083158 DOI: 10.3389/fonc.2021.652063
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Bortezomib containing chemotherapy regimens currently in use for the first-line treatment of multiple myeloma.
| Chemotherapy regimen | Primary therapy for transplant candidates | Primary therapy for non-transplant candidates |
|---|---|---|
| Bortezomib + Doxorubicin + Dexamethasone (PAD) | yes ( | no |
| Bortezomib + Cyclophosphamide + Dexamethasone (VCD) | yes ( | yes ( |
| Bortezomib + Lenalidomide + Dexamethasone | yes ( | yes ( |
| Daratumumab + Bortezomib + Lenalidomide + Dexamethasone | yes ( | no |
| Bortezomib + Thalidomide + Dexamethasone | yes ( | no |
| Daratumumab + Bortezomib + Cyclophosphamide + Dexamethasone | yes ( | no |
| Daratumumab + Bortezomib + Thalidomide + Dexamethasone | yes ( | no |
| Bortezomib + Thalidomide + Dexamethasone + Cisplatin + Doxorubicin + Cyclophosphamide + Etoposide (VTD-PACE) | yes ( | no |
| Daratumumab + Bortezomib + Melphalan + Prednisone | no | yes ( |
| Daratumumab + Bortezomib + Cyclophosphamide + Dexamethasone | no | yes ( |
| Bortezomib + Dexamethasone | no | yes ( |
Figure 1. The hierarchical clustering dendrogram of all experimental RNA sequencing profiles of the control and multiple myeloma samples. Gene expression data were used to calculate Euclidean distances between the samples. Color indicates the sample type. The lower scale indicates the number of uniquely mapped reads. QC denotes the quality control threshold of 2.5 million uniquely mapped reads.
Figure 2. (A) The hierarchical clustering dendrogram of QC-passed experimental RNA sequencing profiles of the control and multiple myeloma samples. Gene expression data were used to calculate Euclidean distances between the samples. The color markers indicate the sample types. The lower scale indicates the number of uniquely mapped reads. ‘QC’ denotes the quality control threshold of 2.5 million uniquely mapped reads. (B) PCA for QC-passed experimental RNA sequencing profiles of the control and multiple myeloma samples. The color markers indicate the sample types. (C) The hierarchical clustering dendrogram of QC-passed experimental RNA sequencing profiles of the multiple myeloma samples. Gene expression data were used to calculate Euclidean distances between the samples. The color markers indicate the response. The lower scale indicates the number of uniquely mapped reads. ‘QC’ denotes the quality control threshold of 2.5 million uniquely mapped reads. (D) PCA for QC-passed experimental RNA sequencing profiles of the multiple myeloma samples. The color markers indicate the response.
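The QC step and the distance calculation described in the captions of Figures 1 and 2 can be illustrated with a minimal sketch. It is not the authors' pipeline: the sample names, read counts, and expression vectors are hypothetical toy values, and treating the 2.5 million threshold as inclusive is an assumption.

```python
import math

# Threshold stated in the figure captions; whether it is inclusive is an assumption.
QC_MIN_UNIQUELY_MAPPED_READS = 2_500_000


def qc_filter(samples, min_reads=QC_MIN_UNIQUELY_MAPPED_READS):
    """Keep only samples whose uniquely mapped read count reaches the threshold."""
    return {
        name: sample
        for name, sample in samples.items()
        if sample["uniquely_mapped_reads"] >= min_reads
    }


def euclidean_distance_matrix(samples):
    """Return sorted sample names and the pairwise Euclidean distance matrix."""
    names = sorted(samples)
    matrix = [
        [math.dist(samples[a]["expression"], samples[b]["expression"]) for b in names]
        for a in names
    ]
    return names, matrix


# Hypothetical toy data (three samples, three genes each).
samples = {
    "MM_A": {"uniquely_mapped_reads": 4_100_000, "expression": [1.0, 2.0, 3.0]},
    "MM_B": {"uniquely_mapped_reads": 1_200_000, "expression": [0.0, 0.0, 0.0]},
    "MM_C": {"uniquely_mapped_reads": 3_000_000, "expression": [1.0, 2.0, 7.0]},
}

passed = qc_filter(samples)                      # MM_B falls below the threshold
names, dist = euclidean_distance_matrix(passed)
print(names, dist)
```

A distance matrix of this kind is the usual input to a hierarchical clustering routine, which would then produce the dendrogram.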
Figure 3. Gene expression levels of genes ARPC2 (A), EIF4BP8 (B), KLHDC7B (C), OSR2 (D), RPL21P1 (E), SETP4 (F), TRIM9 (G), and TSSC4 (H) in the full cohorts of MM responders and poor responders to PAD/VCD therapy. For every gene, paired t-test p-values and AUC values are shown. Each dot on the graph represents a single MM sample. Grey indicates good treatment responders; black indicates poor responders.
Figure 4. Gene expression levels of genes ARPC2 (A), CDHR4 (B), EFCAB8 (C), EIF4BP8 (D), OSR2 (E), SETP4 (F), SLC25A6P3 (G), TOGARAM1 (H), TRIM9 (I), and TSSC4 (J) in the cohorts of MM responders and poor responders to VCD therapy. For every gene, paired t-test p-values and AUC values are shown. Each dot on the graph represents a single MM sample. Grey indicates good treatment responders; black indicates poor responders.
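A per-gene AUC of the kind reported in Figures 3 and 4 can be computed directly from two groups of expression values. The sketch below uses the rank-based definition of AUC: the probability that a randomly chosen responder has a higher expression than a randomly chosen poor responder, with ties counted as one half. The function name and the expression values are hypothetical, and the authors' exact procedure may differ.

```python
def single_gene_auc(responders, non_responders):
    """AUC of one gene as a responder-vs-non-responder discriminator.

    Counts, over all responder/non-responder pairs, how often the responder
    value is higher (ties count 0.5), and divides by the number of pairs.
    """
    if not responders or not non_responders:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for r in responders:
        for n in non_responders:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(responders) * len(non_responders))


# Hypothetical normalized expression values for one gene.
good = [5.1, 6.3, 7.0]
poor = [4.0, 5.5, 3.2]
auc = single_gene_auc(good, poor)
print(round(auc, 3))
```

For a gene that is downregulated in responders this quantity falls below 0.5; in that case 1 − AUC measures the same separation in the opposite direction.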
Core marker genes for the current PAD/VCD MM dataset (full cohort/VCD cohort).
| Gene ID¹ | Cohort | Regulation in responders/non-responders² | Molecular function³ |
|---|---|---|---|
| ARPC2 | Full, VCD | Up/down | control of actin polymerization |
| CDHR4 | VCD | Down/up | cell adhesion protein; sorting of heterogeneous cell types |
| EFCAB8 | VCD | Down/up | calcium ion binding |
| EIF4BP8 | Full, VCD | Down/up | eukaryotic translation initiation factor 4B pseudogene 8 |
| KLHDC7B | Full | Down/up | Kelch domain-containing protein 7B |
| OSR2 | Full, VCD | Down/up | transcription factor |
| RPL21P1 | Full | Down/up | ribosomal protein L21 pseudogene 1 |
| SETP4 | Full, VCD | Down/up | SET pseudogene 4 |
| SLC25A6P3 | VCD | Down/up | mitochondrial carrier; adenine nucleotide translocator, member 6 pseudogene 3 |
| TOGARAM1 | VCD | Down/up | microtubule binding |
| TRIM9 | Full, VCD | Down/up | includes TRIM motif with three zinc-binding domains, a RING, a B-box type 1 and a B-box type 2, and a coiled-coil region |
| TSSC4 | Full, VCD | Down/up | tumor suppressing subtransferable candidate 4 |
¹ Gene name according to HGNC nomenclature (63).
² Up/downregulation of marker genes in the treatment responders and non-responders, respectively.
³ Gene function according to GeneCards knowledgebase (64).
Figure 5. Area under the receiver operating characteristic curve (AUC), sensitivity (Sn), and specificity (Sp) for five ML methods, (A) linear SVM, (B) RF, (C) RR, (D) BNB, and (E) MLP, during classification of response to PAD/VCD treatment of 53 MM patients (full cohort). Parameter B is the false positive vs. false negative balance factor.
Figure 6. Area under the receiver operating characteristic curve (AUC), sensitivity (Sn), and specificity (Sp) for five ML methods, (A) linear SVM, (B) RF, (C) RR, (D) BNB, and (E) MLP, during classification of response to VCD treatment of 47 MM patients (VCD cohort). Parameter B is the false positive vs. false negative balance factor.
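Sensitivity and specificity, as plotted in Figures 5 and 6, follow from the confusion counts at a chosen decision threshold. The sketch below is illustrative only: labels, scores, and the threshold are hypothetical, and the balance factor B used by the authors is not modeled here because its definition is not given in this section.

```python
def sensitivity_specificity(y_true, scores, threshold):
    """Return (Sn, Sp) for binary labels (1 = responder) at a score threshold.

    A sample is predicted positive when its score is >= threshold.
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if label and predicted_positive:
            tp += 1
        elif label:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical classifier outputs for seven patients.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
sn, sp = sensitivity_specificity(y_true, scores, threshold=0.5)
print(sn, sp)
```

Moving the threshold trades sensitivity against specificity, which is the trade-off a false positive vs. false negative balance parameter would control.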
General characteristics of bortezomib chemotherapy response-annotated MM datasets.
| Reference | Dataset ID | Therapy | Experimental platform | Number | Number of |
|---|---|---|---|---|---|
| | Current study (full cohort) | Bortezomib, doxorubicin, dexamethasone (PAD) AND/OR bortezomib, cyclophosphamide, dexamethasone (VCD) | RNA sequencing, Illumina HiSeq 3000 | 53 (28 R: | 8 |
| | Current_Study_VCD | Bortezomib, cyclophosphamide, dexamethasone (VCD) | RNA sequencing, Illumina HiSeq 3000 | 47 (24 R: | 10 |
| ( | GSE9782 | Bortezomib monotherapy | Affymetrix Human Genome U133 Array | 169 (85 R: | 18 |
| ( | GSE68871 | Bortezomib-thalidomide-dexamethasone | Affymetrix Human Genome U133 Plus | 118 (69 R: | 12 |
| ( | GSE55145 | Bortezomib followed by ASCT | Affymetrix Human Exon 1.0 ST Array | 61 (33 R: | 14 |
| ( | GSE19784_1 | Bortezomib, doxorubicin, dexamethasone (PAD) | Affymetrix Human Genome U133 Plus 2.0 Array | 61 with ISS stage I [32 R, 29 NR ( | 7 |
| ( | GSE19784_2 | Bortezomib, doxorubicin, dexamethasone (PAD) | Affymetrix Human Genome U133 Plus 2.0 Array | 51 with ISS stage II [33 R, 18 NR ( | 12 |
| ( | GSE19784_3 | Bortezomib, doxorubicin, dexamethasone (PAD) | Affymetrix Human Genome U133 Plus 2.0 Array | 41 with ISS stage III [29 R, 12 NR ( | 11 |
| ( | GSE2658 | Bortezomib, doxorubicin, dexamethasone (PAD) | Affymetrix Human Genome U133 Plus 2.0 Array | 208 [55 R, 153 NR ( | 16 |
Core marker genes identified for bortezomib chemotherapy response-annotated MM datasets; genes that are overexpressed in the treatment responders are marked by (+), downregulated in the responders by (−).
| Current study full cohort | Current study VCD cohort | GSE9782 | GSE68871 | GSE55145 | GSE19784_1 | GSE19784_2 | GSE19784_3 | GSE2658 |
|---|---|---|---|---|---|---|---|---|
| ARPC2 (+) | ARPC2 (+) | ABHD14A (−) | BORCS8 (−) | AKNA (−) | ACE2 (+) | DCUN1D2 (−) | ANKRD11 (−) | BCOR (+) |
| EIF4BP8 (−) | CDHR4 (−) | ATP2B4 (+)¹ | BTG1 (−) | CATSPER3 (+) | C22orf24 (−) | GTF2H5 (+) | FN1 (+) | CENPE (+) |
| KLHDC7B (−) | EFCAB8 (−) | ATP5S (−) | CCND1 (−) | EMP3 (+) | FAM132A (−) | HAUS8 (−) | HDAC2 (−) | COX6C (+) |
| OSR2 (−) | EIF4BP8 (−) | ATP5SL (−) | CSTB (−) | MYH9 (−) | GPR124 (+) | PPIEL (−) | HS3ST5 (+) | FH (+) |
| RPL21P1 (−) | OSR2 (−) | ATP6V0D1 (+) | CTCFL (+) | NDRG2 (+) | GTF2A1L (+) | PSPC1 (−) | KRT35 (−) | FUNDC1 (+) |
| SETP4 (−) | SETP4 (−) | B2M (+) | GAS6_AS1 (−) | NUCB2 (+) | PPP4R4 (+) | RAD52 (−) | LINC00511 (−) | G2E3 (+) |
| TRIM9 (−) | SLC25A6P3 (−) | BCL2L11 (−) | NOMO3 (+) | PASK (−) | RP11.680G24.5 (−) | RBP5 (+) | MFSD4 (−) | HMGB3 (+) |
| TSSC4 (−) | TOGARAM1 (−) | C7orf26 (−) | ORAI1 (−) | RAPGEF1 (−) | | RP5.1098D14.1 (−) | NOXO1 (−) | HMGB3P1 (+) |
| | TRIM9 (−) | CCNB1IP1 (−) | PLOD3 (+) | RRP7BP (−) | | SMARCA2 (+) | REM1 (−) | MEI1 (−) |
| | TSSC4 (−) | COX7C (−) | SCN9A (+) | TMEM131L (−) | | STK32A (−) | RP11.960B9.2 (−) | NEDD9 (−) |
| | | DLST (−) | STK33 (+) | TMEM99 (+) | | TIAM1 (−) | SNCAIP (−) | PAM (−) |
| | | DLSTP1 (−) | SYBU (+) | TRAF3 (−) | | TMEM57 (+) | | RP11.164P12.4 (−) |
| | | FAM106A (+) | | TRAF4 (−) | | | | S100PBP (+) |
| | | GFER (−) | | ZNF286A (+) | | | | SHCBP1 (+) |
| | | NDUFB1 (−) | | | | | | SMC4 (+) |
| | | PATZ1 (−) | | | | | | UNC13C (+) |
| | | RPS7 (−) | | | | | | |
| | | TCP11L1 (−) | | | | | | |
Best ROC AUC and AUPR (precision-recall AUC) values obtained for good versus poor responder classifiers built using different ML methods without/with FloWPS for different MM annotated expression datasets.
| Dataset | SVM | RF | RR | BNB | MLP |
|---|---|---|---|---|---|
| | 0.80/0.82 | 0.76/0.82 | 0.86/0.87 | 0.79/0.84 | 0.81/0.83 |
| | 0.79/0.82 | 0.78/0.79 | 0.88/0.90 | 0.78/0.83 | 0.79/0.81 |
| | 0.82/0.86 | 0.74/0.83 | 0.86/0.87 | 0.78/0.88 | 0.84/0.89 |
| | 0.82/0.86 | 0.76/0.86 | 0.86/0.84 | 0.79/0.92 | 0.83/0.88 |
| | 0.68/0.72 | 0.68/0.80 | 0.77/0.77 | 0.73/0.76 | 0.72/0.76 |
| | 0.65/0.70 | 0.70/0.80 | 0.77/0.77 | 0.69/0.76 | 0.69/0.74 |
| | 0.68/0.77 | 0.73/0.83 | 0.78/0.77 | 0.74/0.84 | 0.70/0.80 |
| | 0.64/0.76 | 0.73/0.83 | 0.79/0.77 | 0.71/0.80 | 0.69/0.76 |
| | 0.78/0.82 | 0.77/0.90 | 0.87/0.84 | 0.82/0.87 | 0.80/0.85 |
| | 0.72/0.84 | 0.72/0.84 | 0.88/0.85 | 0.83/0.82 | 0.83/0.82 |
| | 0.65/0.82 | 0.74/0.77 | 0.84/0.84 | 0.74/0.84 | 0.72/0.81 |
| | 0.64/0.77 | 0.71/0.77 | 0.86/0.84 | 0.72/0.84 | 0.69/0.79 |
| | 0.83/0.87 | 0.75/0.82 | 0.92/0.94 | 0.88/0.94 | 0.86/0.87 |
| | 0.85/0.91 | 0.79/0.86 | 0.96/0.97 | 0.92/0.97 | 0.88/0.89 |
| | 0.84/0.94 | 0.84/0.86 | 0.95/0.95 | 0.86/0.96 | 0.91/0.94 |
| | 0.89/0.95 | 0.89/0.90 | 0.98/0.98 | 0.92/0.98 | 0.95/0.96 |
| | 0.72/0.77 | 0.67/0.79 | 0.79/0.79 | 0.76/0.78 | 0.63/0.72 |
| | 0.45/0.55 | 0.51/0.61 | 0.58/0.61 | 0.49/0.54 | 0.42/0.48 |
Figure 7. (A) Intersection analysis for differentially expressed genes (DEGs) distinguishing good and poor treatment responders in eight bortezomib MM datasets. (B) Observed vs. expected (under the hypothesis of random DEG distribution) numbers of intersection events in all possible pairwise comparisons.
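The observed-versus-expected comparison in Figure 7B can be illustrated with a short sketch. It assumes, as one plausible reading of "random DEG distribution", that each DEG list is an independent random draw from a common pool of genes, so the expected overlap is the hypergeometric mean a·b/N. The gene names, list sizes, and pool size below are hypothetical, and the authors' exact null model may differ.

```python
def observed_overlap(degs_a, degs_b):
    """Number of genes shared by two DEG lists."""
    return len(set(degs_a) & set(degs_b))


def expected_overlap(n_a, n_b, n_total):
    """Expected overlap of two independent random gene sets.

    Assumes set sizes n_a and n_b drawn without replacement from the same
    pool of n_total genes (mean of the hypergeometric distribution).
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_a * n_b / n_total


# Hypothetical DEG lists and gene pool size.
degs_dataset_1 = {"GENE_A", "GENE_B", "GENE_C"}
degs_dataset_2 = {"GENE_B", "GENE_C", "GENE_D"}

shared = observed_overlap(degs_dataset_1, degs_dataset_2)
expected = expected_overlap(n_a=200, n_b=300, n_total=20_000)
print(shared, expected)
```

An observed overlap well above the expected value suggests that two datasets share DEGs more often than chance alone would explain.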
Common differentially expressed genes (DEGs) in the current experimental dataset (full or VCD only cohorts) and in seven previously published MM datasets.
| Dataset | GSE9782 | GSE68871 | GSE55145 | GSE19784_1 | GSE19784_2 | GSE19784_3 | GSE2658 |
|---|---|---|---|---|---|---|---|
| Common DEGs with full cohort | | | | | | | |
| Common DEGs with VCD cohort | | | | | | | |
(+) marks genes overexpressed in the good treatment responders.
Normalized expression levels of bortezomib targeting genes in MM patients before and after PAD/VCD treatment.
| Patient ID | Best response status | Sample status | | |
|---|---|---|---|---|
| MM111 | Partial response | Pretreatment | 880.4 | 669.9 |
| | | Relapse | 553 | 483.5 |
| MM115 | Minimal response | Pretreatment | 814 | 456.3 |
| | | Relapse | 604.8 | 282.8 |
¹ Normalized counts of gene-mapped RNA sequencing reads.