Jun He1, Haidan Yan1, Hao Cai1, Xiangyu Li1, Qingzhou Guan1, Weicheng Zheng1, Rou Chen1, Huaping Liu1, Kai Song2, Zheng Guo3,4, Xianlong Wang5,6.
Abstract
BACKGROUND: The Connectivity Map (CMAP) database, an important public data source for drug repositioning, archives gene expression profiles from cancer cell lines treated with and without bioactive small molecules. However, there are only one or two technical replicates for each cell line under one treatment condition. For such small-scale data, current fold-change-based methods lack statistical control in identifying differentially expressed genes (DEGs) in treated cells. In particular, a one-to-one comparison may yield too many drug-irrelevant DEGs because of random experimental factors. To tackle this problem, CMAP adopts a pattern-matching strategy to build "connections" between disease signatures and the gene expression changes associated with drug treatments. However, many drug-irrelevant genes may blur the "connection" if all genes are used instead of pre-selected DEGs induced by drug treatments.
Keywords: Differentially expressed genes; Drug repositioning; Metformin; Phenformin; The Connectivity Map
Year: 2017 PMID: 28962576 PMCID: PMC5622488 DOI: 10.1186/s12967-017-1302-9
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 5.531
Fig. 1 The control samples of the HepG2, HCT116 and MCF7 cell lines collected from different laboratories. Samples for each cell line were divided into two groups, referred to as group1 and group2. The blue pie represents the stable gene pairs identified in group1 (stablepairs1); the red pie represents the stable gene pairs identified in group2 (stablepairs2). The overlap between the pies represents the gene pairs common to stablepairs1 and stablepairs2, and the number in brackets is the consistency score, which denotes the percentage of common gene pairs in stablepairs1 and stablepairs2 that display the same relative expression ordering (REO) patterns
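The consistency score described in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `stable_pairs` and `consistency_score`, the toy expression values, and the rule that a pair is "stable" only when its ordering is identical in every sample of a group are all assumptions made for illustration, since the record does not state the stability threshold.

```python
from itertools import combinations


def stable_pairs(samples, genes):
    """Map each stable gene pair (a, b) to True if a > b in every sample,
    or False if a < b in every sample. Pairs with mixed or tied orderings
    are treated as unstable and left out (an assumption of this sketch)."""
    stable = {}
    for a, b in combinations(sorted(genes), 2):
        if all(s[a] > s[b] for s in samples):
            stable[(a, b)] = True
        elif all(s[a] < s[b] for s in samples):
            stable[(a, b)] = False
    return stable


def consistency_score(pairs1, pairs2):
    """Return (number of common pairs, % of them with the same ordering)."""
    common = pairs1.keys() & pairs2.keys()
    if not common:
        return 0, float("nan")
    same = sum(pairs1[p] == pairs2[p] for p in common)
    return len(common), 100.0 * same / len(common)


# Hypothetical expression values, one dict per control sample.
genes = ["A", "B", "C"]
group1 = [{"A": 5, "B": 3, "C": 1}, {"A": 6, "B": 2, "C": 1}]
group2 = [{"A": 4, "B": 2, "C": 3}, {"A": 7, "B": 1, "C": 2}]

n_common, score = consistency_score(
    stable_pairs(group1, genes), stable_pairs(group2, genes)
)
print(n_common, round(score, 2))  # prints: 3 66.67
```

In this toy case all three pairs are stable in both groups, but the pair (B, C) has opposite orderings in group1 and group2, so two of the three common pairs agree.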
Details of the three datasets used in the OneComp performance evaluation
| Dataset | Cell line | Phenotype | Sub-dataset | Sample accessions (control vs treated) |
|---|---|---|---|---|
| GSE41326 | HepG2 | Liver | Sub 1 | GSM1014792 vs GSM1014795 |
| GSE41326 | HepG2 | Liver | Sub 2 | GSM1014793 vs GSM1014796 |
| GSE41326 | HepG2 | Liver | Sub 3 | GSM1014794 vs GSM1014797 |
| GSE7161 | HCT116 | Colon | Sub 1 | GSM172453 vs GSM172451 |
| GSE7161 | HCT116 | Colon | Sub 2 | GSM172454 vs GSM172452 |
| GSE7161 | HCT116 | Colon | Sub 3 | GSM172457 vs GSM172455 |
| GSE7161 | HCT116 | Colon | Sub 4 | GSM172458 vs GSM172456 |
| GSE37820 | MCF7 | Breast | Sub 1 | GSM928442 vs GSM928445 |
| GSE37820 | MCF7 | Breast | Sub 2 | GSM928443 vs GSM928446 |
| GSE37820 | MCF7 | Breast | Sub 3 | GSM928444 vs GSM928447 |
Description of the datasets used in the drug repositioning
| Source | Platform | Normal/control sample size | Cancer/treated sample size | Tissues/cell type |
|---|---|---|---|---|
| GSE7670 | GPL96 | 26 | 26 | Lung tissue |
| GSE10072 | GPL96 | 49 | 58 | Lung tissue |
| CMAP | GPL96 | 46 | / | MCF7 |
| CMAP | GPL3921 | 130 | / | MCF7 |
| CMAP | GPL3921 | 125 | / | HL60 |
| CMAP | GPL3921 | 116 | / | PC3 |
Details of the phenformin and metformin treated cells in the CMAP datasets
| Drug | Instance id | Batch id | Concentration (M) | Duration (h) | Cell type | Platform |
|---|---|---|---|---|---|---|
| Phenformin | 21 | 2 | 0.00001 | 6 | MCF7 | HG-U133A |
| Phenformin | 2350 | 618 | 0.0000166 | 6 | HL60 | HT_HG-U133A |
| Phenformin | 2312 | 642 | 0.0000166 | 6 | MCF7 | HT_HG-U133A |
| Phenformin | 3622 | 685 | 0.0000166 | 6 | MCF7 | HT_HG-U133A |
| Phenformin | 4747 | 700 | 0.0000166 | 6 | MCF7 | HT_HG-U133A |
| Phenformin | 3725 | 681 | 0.0000166 | 6 | PC3 | HT_HG-U133A |
| Phenformin | 4283 | 701 | 0.0000166 | 6 | PC3 | HT_HG-U133A |
| Metformin | 1 | 1 | 0.00001 | 6 | MCF7 | HG-U133A |
| Metformin | 2 | 1 | 0.00001 | 6 | MCF7 | HG-U133A |
| Metformin | 3 | 1 | 0.0000001 | 6 | MCF7 | HG-U133A |
| Metformin | 4 | 1 | 0.001 | 6 | MCF7 | HG-U133A |
| Metformin | 61 | 2a | 0.00001 | 6 | MCF7 | HG-U133A |
| Metformin | 1858 | 629 | 0.0000242 | 6 | HL60 | HT_HG-U133A |
| Metformin | 1694 | 627 | 0.0000242 | 6 | MCF7 | HT_HG-U133A |
| Metformin | 5487 | 737 | 0.0000242 | 6 | MCF7 | HT_HG-U133A |
| Metformin | 1816 | 628 | 0.0000242 | 6 | PC3 | HT_HG-U133A |
| Metformin | 5068 | 718 | 0.0000242 | 6 | PC3 | HT_HG-U133A |
Fig. 2 Performance of RankComp and OneComp in cell data. a Performance of RankComp in cell data with only one technical replicate. b Influence of sample size on the performance of OneComp via background filtering and building
Overlap and consistency of DEGs detected by OneComp and SAM (FDR < 5%)
| Dataset | DEGs by SAM | Sub-dataset | DEGs by OneComp | Overlap | POG (%) | Consistency (%) |
|---|---|---|---|---|---|---|
| GSE41326 | 1896 | Sub 1 | 5084 | 1069 | 56.33 | 99.91 |
| GSE41326 | 1896 | Sub 2 | 5217 | 1092 | 57.49 | 99.82 |
| GSE41326 | 1896 | Sub 3 | 5440 | 1075 | 56.43 | 99.53 |
| GSE7161 | 1280 | Sub 1 | 4770 | 917 | 71.64 | 100.00 |
| GSE7161 | 1280 | Sub 2 | 4595 | 932 | 72.81 | 100.00 |
| GSE7161 | 1280 | Sub 3 | 4568 | 890 | 69.53 | 100.00 |
| GSE7161 | 1280 | Sub 4 | 6973 | 909 | 70.94 | 99.89 |
| GSE37820 | 633 | Sub 1 | 3734 | 314 | 49.76 | 100.00 |
| GSE37820 | 633 | Sub 2 | 3657 | 327 | 51.82 | 100.00 |
| GSE37820 | 633 | Sub 3 | 3950 | 321 | 50.87 | 100.00 |
Sub 1, 2, 3 and 4 denote the paired control and treated technical replicates 1, 2, 3 and 4 within each dataset. Overlap denotes the DEGs detected both by OneComp in a given pair and by SAM in the large dataset. Consistency denotes the percentage of overlapping DEGs that display the same deregulation direction (up- or down-regulation) in OneComp and SAM (FDR < 5%). P denotes the significance of the consistency (binomial test). POG denotes the percentage of the DEGs identified by SAM (FDR < 5%) that are consistently detected by OneComp (FDR < 5%)
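The Overlap, Consistency and POG definitions above can be expressed as a short Python sketch. This is an illustration under stated assumptions, not the authors' code: the function names `compare_deg_lists` and `binomial_p`, the toy gene labels, and the choice of a one-sided binomial test against a 50% chance of agreement are all hypothetical, because the record names the binomial test but not its null hypothesis.

```python
from math import comb


def compare_deg_lists(reference, candidate):
    """Compare two DEG lists given as {gene: "up" | "down"} dicts.

    reference plays the role of the SAM result on the full dataset,
    candidate the role of the OneComp result on one sample pair."""
    overlap = reference.keys() & candidate.keys()
    consistent = sum(reference[g] == candidate[g] for g in overlap)
    return {
        "overlap": len(overlap),
        "consistent": consistent,
        # % of overlapping DEGs with the same direction
        "consistency": 100.0 * consistent / len(overlap) if overlap else float("nan"),
        # % of reference DEGs recovered with the same direction
        "pog": 100.0 * consistent / len(reference) if reference else float("nan"),
    }


def binomial_p(consistent, n):
    """One-sided P(X >= consistent) for X ~ Binomial(n, 0.5).

    Exact integer arithmetic avoids float overflow for large n.
    The 0.5 null is an assumption of this sketch."""
    return sum(comb(n, k) for k in range(consistent, n + 1)) / 2**n


# Hypothetical toy DEG lists.
sam = {"G1": "up", "G2": "down", "G3": "up", "G4": "down"}
onecomp = {"G1": "up", "G2": "down", "G3": "down", "G5": "up"}

result = compare_deg_lists(sam, onecomp)
print(result["overlap"], result["pog"], round(result["consistency"], 2))
print(binomial_p(result["consistent"], result["overlap"]))
```

The tabulated values agree with POG being computed from the direction-consistent overlap: for GSE41326 Sub 1, 1069 overlapping DEGs at 99.91% consistency correspond to about 1068 genes, and 1068/1896 ≈ 56.33%.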
Results of the drug repositioning for phenformin and metformin
| Methods | CMAP name | Overlap genes | Reversal score | N | |
|---|---|---|---|---|---|
| Approach recommended by CMAP | Phenformin_HL60 | /a | −0.7880 | 1 | 1 |
| Approach recommended by CMAP | Phenformin_MCF7 | /a | 0.5070 | 4 | 1 |
| Approach recommended by CMAP | Phenformin_PC3 | /a | −0.2880 | 2 | 0.9887 |
| Approach recommended by CMAP | Metformin_HL60 | /a | −0.6700 | 1 | 1 |
| Approach recommended by CMAP | Metformin_MCF7 | /a | −0.2910 | 7 | 0.5039 |
| Approach recommended by CMAP | Metformin_PC3 | /a | 0.4890 | 2 | 1 |
| Approach based on OneComp | Phenformin_HL60 | 1180 | 0.4271 | 1 | 1 |
| Approach based on OneComp | Phenformin_MCF7 | 489 | 0.6094 | 4 | < 0.0001 |
| Approach based on OneComp | Phenformin_PC3 | 190 | 0.6053 | 2 | 0.0023 |
| Approach based on OneComp | Metformin_HL60 | 1028 | 0.5564 | 1 | 0.0002 |
| Approach based on OneComp | Metformin_MCF7 | 1296 | 0.5872 | 7 | < 0.0001 |
| Approach based on OneComp | Metformin_PC3 | 122 | 0.6148 | 2 | 0.0071 |
N represents the number of cell samples treated with phenformin or metformin at different doses
a The CMAP approach does not provide these data
Fig. 3 Genes irrelevant to drug treatment may blur the “connection” between a drug and a disease. Blue dots and bars represent the up-regulated genes; red dots and bars represent the down-regulated genes; black bars represent the non-regulated genes
Fig. 4 The PPI links between the NSCLC signature DEGs that could be reversed by phenformin treatment (a) or metformin treatment (b)