Hossein Sharifi-Noghabi1,2, Shuman Peng1, Olga Zolotareva3, Colin C Collins2,4, Martin Ester1,2.
Abstract
MOTIVATION: The goal of pharmacogenomics is to predict drug response in patients using their single- or multi-omics data. A major challenge is that clinical data (i.e. patient data) with drug response outcomes are very limited, creating a need for transfer learning to bridge the gap between large pre-clinical pharmacogenomics datasets (e.g. cancer cell lines), as a source domain, and clinical datasets, as a target domain. Two major discrepancies exist between pre-clinical and clinical datasets: (i) in the input space, the gene expression data differ because of differences in the basic biology, and (ii) in the output space, drug response is measured differently. Therefore, training a computational model on cell lines and testing it on patients violates the i.i.d. assumption that train and test data are drawn from the same distribution.
Year: 2020 PMID: 32657371 PMCID: PMC7355265 DOI: 10.1093/bioinformatics/btaa442
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Schematic overview of AITL: first, the feature extractor receives source and target samples and maps them to a feature space in lower dimensions. Then, the multi-task subnetwork uses these features to make predictions for the source and target samples and also assigns cross-domain labels to the source samples. The multi-task subnetwork addresses the discrepancy in the output space. Finally, to address the input space discrepancy, global- and class-wise discriminators receive the extracted features and regularize the feature extractor to learn domain-invariant features. The feature extractor has one fully connected layer. The multi-task subnetwork has one fully connected shared layer followed by two fully connected layers for the regression task and one fully connected layer for the classification task. All the discriminators are single-layered fully connected subnetworks.
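The layer layout in the caption can be sketched as a forward pass. This is a minimal, dependency-free illustration rather than the authors' implementation: the layer widths, the ReLU and sigmoid activations, the use of one class-wise discriminator per response class, and every name in the code (`make_layer`, `apply_layer`, `aitl_forward`) are assumptions made for the example. Only the number of layers per component comes from the caption, and training (including the adversarial regularization) is not shown.

```python
import math
import random

rng = random.Random(0)

def make_layer(n_in, n_out):
    """One fully connected layer: a weight matrix and a bias vector."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def apply_layer(layer, x, activation=None):
    """Compute activation(W x + b) for a single sample x."""
    weights, bias = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    if activation == "relu":
        return [max(0.0, v) for v in out]
    if activation == "sigmoid":
        return [1.0 / (1.0 + math.exp(-v)) for v in out]
    return out

# Hypothetical widths: the caption gives layer counts, not layer sizes.
N_GENES, N_FEATURES, N_SHARED, N_HIDDEN = 20, 8, 6, 4

feature_extractor = make_layer(N_GENES, N_FEATURES)       # one layer
shared_layer = make_layer(N_FEATURES, N_SHARED)           # one shared layer
regression_tower = [make_layer(N_SHARED, N_HIDDEN),       # two layers
                    make_layer(N_HIDDEN, 1)]
classifier = make_layer(N_SHARED, 1)                      # one layer
global_discriminator = make_layer(N_FEATURES, 1)          # single layer
# Assumption: one single-layer discriminator per response class (0 or 1).
class_discriminators = {0: make_layer(N_FEATURES, 1),
                        1: make_layer(N_FEATURES, 1)}

def aitl_forward(expression, class_label):
    """Run one sample through every component and collect the outputs."""
    features = apply_layer(feature_extractor, expression, "relu")
    hidden = apply_layer(shared_layer, features, "relu")
    reg_hidden = apply_layer(regression_tower[0], hidden, "relu")
    return {
        "features": features,
        "regression": apply_layer(regression_tower[1], reg_hidden)[0],
        "class_prob": apply_layer(classifier, hidden, "sigmoid")[0],
        "global_domain_prob":
            apply_layer(global_discriminator, features, "sigmoid")[0],
        "class_domain_prob":
            apply_layer(class_discriminators[class_label], features, "sigmoid")[0],
    }

sample = [rng.random() for _ in range(N_GENES)]   # toy expression profile
outputs = aitl_forward(sample, class_label=1)
```

In this sketch the discriminators read the extracted features directly, while the regression and classification heads read the shared layer, mirroring the split between input-space and output-space components described in the caption.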
Characteristics of the datasets
| Dataset | Resource | Drug | Type | Domain | Sample size | Number of genes |
|---|---|---|---|---|---|---|
| GSE55145 | Clinical trial | Bortezomib | Targeted | Target | 67 | 11 609 |
| GSE9782-GPL96 | Clinical trial | Bortezomib | Targeted | Target | 169 | 11 609 |
| GDSC | Cell line | Bortezomib | Targeted | Source | 391 | 11 609 |
| GSE18864 | Clinical trial | Cisplatin | Chemotherapy | Target | 24 | 11 768 |
| GSE23554 | Clinical trial | Cisplatin | Chemotherapy | Target | 28 | 11 768 |
| TCGA | Patient | Cisplatin | Chemotherapy | Target | 66 | 11 768 |
| GDSC | Cell line | Cisplatin | Chemotherapy | Source | 829 | 11 768 |
| GSE25065 | Clinical trial | Docetaxel | Chemotherapy | Target | 49 | 8119 |
| GSE28796 | Clinical trial | Docetaxel | Chemotherapy | Target | 12 | 8119 |
| GSE6434 | Clinical trial | Docetaxel | Chemotherapy | Target | 24 | 8119 |
| TCGA | Patient | Docetaxel | Chemotherapy | Target | 16 | 8119 |
| GDSC | Cell line | Docetaxel | Chemotherapy | Source | 829 | 8119 |
| GSE15622 | Clinical trial | Paclitaxel | Chemotherapy | Target | 20 | 11 731 |
| GSE22513 | Clinical trial | Paclitaxel | Chemotherapy | Target | 14 | 11 731 |
| GSE25065 | Clinical trial | Paclitaxel | Chemotherapy | Target | 84 | 11 731 |
| PDX | Animal (mouse) | Paclitaxel | Chemotherapy | Target | 43 | 11 731 |
| TCGA | Patient | Paclitaxel | Chemotherapy | Target | 35 | 11 731 |
| GDSC | Cell line | Paclitaxel | Chemotherapy | Source | 389 | 11 731 |
Number of genes in common between the source and all of the target data for each drug.
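The gene counts in the table are intersections: a gene is kept for a drug only if it appears in the source dataset and in every target dataset. A minimal sketch of that step follows, assuming each dataset is available as a list of gene symbols. The function name `common_genes` and the toy gene lists are hypothetical and are not taken from the paper's data.

```python
def common_genes(source_genes, target_gene_sets):
    """Return the sorted genes present in the source and in every target set."""
    shared = set(source_genes)
    for genes in target_gene_sets:
        shared &= set(genes)  # keep only genes also measured in this target
    return sorted(shared)

# Toy gene lists for illustration only.
source = ["TP53", "BRCA1", "EGFR", "MYC"]
targets = [
    ["TP53", "BRCA1", "MYC"],
    ["BRCA1", "TP53", "KRAS"],
]
shared_genes = common_genes(source, targets)
print(len(shared_genes), shared_genes)
```

Restricting every dataset to the same gene list gives the source and target samples the same input dimensionality, which a shared feature extractor requires.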
Performance of AITL and the baselines in terms of the prediction AUROC
| Method/drug | Bortezomib | Cisplatin | Docetaxel | Paclitaxel |
|---|---|---|---|---|
| | 0.48 | 0.58 | 0.55 | 0.53 |
| MOLI | 0.57 | 0.54 | 0.54 | 0.53 |
| PRECISE | 0.54 | 0.59 | 0.52 | 0.56 |
| | 0.54 ± 0.07 | 0.60 ± 0.14 | 0.52 ± 0.02 | 0.58 ± 0.04 |
| ADDA | 0.51 ± 0.06 | 0.56 ± 0.06 | 0.48 ± 0.06 | Did not converge |
| ProtoNet | 0.49 ± 0.01 | 0.40 ± 0.003 | 0.40 ± 0.01 | Did not converge |
| AITL- | 0.69 ± 0.03 | 0.57 ± 0.03 | 0.57 ± 0.05 | 0.58 ± 0.01 |
| AITL-D | 0.69 ± 0.04 | 0.62 ± 0.1 | 0.48 ± 0.03 | |
| AITL- | 0.69 ± 0.03 | 0.54 ± 0.1 | 0.59 ± 0.07 | 0.59 ± 0.03 |
| AITL | | | | 0.61 ± 0.04 |
AITL with only the multi-task subnetwork (no AD).
AITL with only class-wise discriminators and the multi-task subnetwork (no global discriminator).
AITL with only the global discriminator and the multi-task subnetwork (no class-wise discriminator). Boldface in the table indicates the best-performing method for the corresponding drug.
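The table reports AUROC, which can be read as the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder. The sketch below computes it directly from that pairwise definition, counting ties as half. The function name `auroc` and the toy labels and scores are assumptions for illustration and are not results from the paper.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy example: 1 = responder, 0 = non-responder.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8]
print(round(auroc(labels, scores), 3))  # 5 of 6 pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking, which is the reference point for reading the scores in the table.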
Fig. 2. Performance of AITL and the baselines in terms of the prediction AUPR