Lu Han1, Guangcun Shan2, Bingfeng Chu3, Hongyu Wang2,4, Zhongjian Wang4, Shengqiao Gao1, Wenxia Zhou1.
Abstract
The novel coronavirus disease, COVID-19, has spread rapidly worldwide. Methods that identify the therapeutic activity of drugs from phenotypic data can improve the efficiency of drug development. Here, a state-of-the-art machine-learning method was used to identify drug mechanisms of action (MoAs) from the cell image features of 1105 drugs in the LINCS database. Because the multi-dimensional features of cell images are affected by non-experimental factors, the features of similar drugs vary considerably, and the resulting noise makes it difficult to identify drug MoAs reliably. The supervised information-theoretic metric-learning (ITML) algorithm was applied to learn a linear transformation under which drugs with the same MoA aggregate. By clustering drugs into communities and performing enrichment analysis, we found that the transformed image features were more conducive to the recognition of drug MoAs. Image feature analysis showed that different features play important roles in identifying different drug functions. Drugs that significantly affect cell survival or proliferation, such as cyclin-dependent kinase inhibitors, were more likely to be enriched in communities, whereas other drugs tended to be dispersed across communities. Chloroquine and clomiphene, which block viral entry, were clustered into the same community, indicating that a similar MoA can be reflected in the cell image. Overall, the findings of the present study lay the foundation for discovering the MoAs of new drugs from image data and provide a new method of drug repurposing for COVID-19.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11571-021-09727-5.
Keywords: Cell image feature; Coronavirus; Drug repurposing; LINCS; Machine learning
Year: 2021 PMID: 34777628 PMCID: PMC8570398 DOI: 10.1007/s11571-021-09727-5
Source DB: PubMed Journal: Cogn Neurodyn ISSN: 1871-4080 Impact factor: 5.082
Fig. 1 Data distribution plots. a Original data distribution, b normalized data distribution, and c relevant data distribution. The pie chart indicates the MoA. The top 10 MoA types are presented in the bar chart
Fig. 2 Overview of the study approach for drug repurposing analysis. First, we obtained the image data from LINCS (1105 drugs × 812 dimensions). Then, we processed the data with PCA and metric learning (ITML). Next, we clustered the processed data with the AP algorithm. Finally, we analyzed and compared the results
Fig. 3 t-Distributed Stochastic Neighbor Embedding (t-SNE) plot of a original data and b ITML-processed data
Fig. 4 Data clustering for a original data, b ITML-processed data, and c PCA data
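
The "ITML-processed data" in the figures above are image features after a learned linear transformation. As a minimal sketch of that idea (not the authors' code), the snippet below assumes that metric learning yields a symmetric positive-definite matrix `M` and shows that the Mahalanobis distance under `M` equals the ordinary Euclidean distance after mapping each feature vector through a matrix `L` with `M = LᵀL`. The toy matrix `M`, the random features `X`, and the helper names are all hypothetical stand-ins; no ITML fitting is performed here.

```python
import numpy as np

def transform_from_metric(M):
    """Return L such that M = L.T @ L (M must be symmetric positive definite)."""
    # np.linalg.cholesky returns a lower-triangular C with M = C @ C.T,
    # so taking L = C.T gives L.T @ L = M.
    return np.linalg.cholesky(M).T

def mahalanobis_sq(a, b, M):
    """Squared Mahalanobis distance (a - b)^T M (a - b)."""
    d = a - b
    return float(d @ M @ d)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))        # toy stand-in for (drugs x image features)
B = rng.normal(size=(4, 4))
M = B.T @ B + 0.1 * np.eye(4)      # toy SPD matrix standing in for a learned metric

L = transform_from_metric(M)
Z = X @ L.T                        # each row is z_i = L @ x_i

d_metric = mahalanobis_sq(X[0], X[1], M)          # distance under M, original space
d_euclid = float(np.sum((Z[0] - Z[1]) ** 2))      # Euclidean distance, transformed space
print(d_metric, d_euclid)
```

Because the two distances agree, any Euclidean-based step applied to `Z` (such as a t-SNE embedding or clustering) implicitly uses the learned metric.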
The comparison data for clustering and enrichment
| | Clustering Number | Enrichment Number | Enrichment Expectation (Enrichment Number / Clustering Number) |
|---|---|---|---|
| Original Data | 57 | 26 | 0.4561 |
| PCA | 48 | 24 | 0.5000 |
| ITML | 39 | 35 | 0.8974 |
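
The last column of the table is the ratio of enriched communities to total communities. A short check, using only the counts reported in the table (the variable and function names are illustrative), could look like this:

```python
# (clustering number, enrichment number) per representation, taken from the table
counts = {
    "Original Data": (57, 26),
    "PCA": (48, 24),
    "ITML": (39, 35),
}

def enrichment_expectation(n_clusters, n_enriched):
    """Fraction of communities that are enriched for at least one MoA."""
    return n_enriched / n_clusters

ratios = {
    name: round(enrichment_expectation(n_clusters, n_enriched), 4)
    for name, (n_clusters, n_enriched) in counts.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
```

The ratios reproduce the tabulated values: ITML yields fewer communities than the original or PCA data, but a much larger share of them are enriched.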