Alexey A Lagunin, Varvara I Dubovskaja, Anastasia V Rudik, Pavel V Pogodin, Dmitry S Druzhilovskiy, Tatyana A Gloriozova, Dmitry A Filimonov, Narahari G Sastry, Vladimir V Poroikov.
Abstract
In silico methods of phenotypic screening are needed to reduce the time and cost of experimental in vivo screening of anticancer agents across tens of millions of natural and synthetic chemical compounds. We used the previously developed PASS (Prediction of Activity Spectra for Substances) algorithm to create and validate classification SAR models that predict the cytotoxicity of chemicals against different types of human cell lines using ChEMBL experimental data. A training set of 59,882 compound structures was created from the experimental data (GI50, IC50, and % inhibition values) in ChEMBL. The average prediction accuracy (AUC) calculated during training by leave-one-out and 20-fold cross-validation was 0.930 and 0.927, respectively, for 278 cancer cell lines, and 0.948 and 0.947, respectively, for 27 normal cell lines. Using these SAR models, we developed a freely available web-service (CLC-Pred: Cell-Line Cytotoxicity Predictor) that predicts a cell-line cytotoxicity profile from a compound's structural formula: http://way2drug.com/Cell-line/.
Year: 2018 PMID: 29370280 PMCID: PMC5784992 DOI: 10.1371/journal.pone.0191838
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
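The abstract reports accuracy as AUC under leave-one-out and 20-fold cross-validation. The sketch below illustrates how such a figure can be computed in general; it is not the PASS implementation. It assumes AUC is taken as the probability that a held-out active compound outscores a held-out inactive one, and that held-out scores are pooled before the AUC is computed. All function names, the toy class-mean scorer, and the descriptor values are hypothetical.

```python
from itertools import product


def auc(active_scores, inactive_scores):
    """Chance that a random active outscores a random inactive; ties count 0.5."""
    pairs = list(product(active_scores, inactive_scores))
    wins = sum(1.0 if a > i else 0.5 if a == i else 0.0 for a, i in pairs)
    return wins / len(pairs)


def fold_indices(n_items, n_folds):
    """Held-out index groups; n_folds == n_items gives leave-one-out."""
    return [list(range(f, n_items, n_folds)) for f in range(n_folds)]


def cross_validated_auc(features, labels, n_folds, fit, score):
    """Score each item with a model fitted without its fold, then pool into one AUC."""
    scores = [None] * len(features)
    for held_out in fold_indices(len(features), n_folds):
        train = [i for i in range(len(features)) if i not in held_out]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        for i in held_out:
            scores[i] = score(model, features[i])
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    return auc(actives, inactives)


# Toy stand-in for a real SAR model (NOT PASS): class means of one made-up descriptor.
def fit_means(xs, ys):
    act = [x for x, y in zip(xs, ys) if y == 1]
    ina = [x for x, y in zip(xs, ys) if y == 0]
    return sum(act) / len(act), sum(ina) / len(ina)


def score_means(model, x):
    mean_active, mean_inactive = model
    # Higher value = closer to the active class mean than to the inactive one.
    return abs(x - mean_inactive) - abs(x - mean_active)


toy_x = [0.9, 0.8, 0.7, 0.1, 0.2, 0.3]  # invented descriptor values
toy_y = [1, 1, 1, 0, 0, 0]              # 1 = active, 0 = inactive
auc_loo = cross_validated_auc(toy_x, toy_y, len(toy_x), fit_means, score_means)
auc_3fold = cross_validated_auc(toy_x, toy_y, 3, fit_means, score_means)
print(auc_loo, auc_3fold)
```

Leave-one-out is the special case in which the number of folds equals the number of compounds, which is why the same routine serves both procedures.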
Normal cell lines with prediction accuracy calculated by leave-one-out cross-validation (AUC LOO CV) and 20-fold cross-validation (AUC 20-fold CV) procedures.
| No | Cell line | Type of cell line | Tissue/organ | N | AUC LOO CV | AUC 20-fold CV |
|---|---|---|---|---|---|---|
| 1 | AG1523 | Fibroblast | Fibroblast | 25 | 0.971 | 0.971 |
| 2 | BJ | Foreskin fibroblast | Foreskin | 37 | 0.889 | 0.862 |
| 3 | CRL-7065 | Fibroblast | Skin | 9 | 0.926 | 0.927 |
| 4 | Detroit 551 | Embryonic skin | Skin | 30 | 0.962 | 0.962 |
| 5 | HaCaT | Keratinocyte | Skin | 218 | 0.978 | 0.978 |
| 6 | HASMC | Aortic smooth muscle | Muscle | 26 | 0.999 | 0.999 |
| 7 | HEK293 | Embryonic kidney fibroblast | Kidney | 711 | 0.922 | 0.921 |
| 8 | HEL 299 | Fibroblast | Lung | 3 | 0.889 | 0.891 |
| 9 | HFF | Foreskin fibroblast | Skin | 171 | 0.974 | 0.974 |
| 10 | HFL1 | Human foetal lung fibroblast | Lung | 3 | 1.000 | 1.000 |
| 11 | HMEC | Microvascular endothelial cell | Breast | 64 | 0.948 | 0.950 |
| 12 | HS27 | Fibroblast | Skin | 40 | 0.971 | 0.972 |
| 13 | HUVEC | Umbilical vein endothelial cell | Endothelium | 999 | 0.958 | 0.958 |
| 14 | IMR-90 | Embryonic lung fibroblast | Lung | 14 | 0.860 | 0.862 |
| 15 | MRC5 | Embryonic lung fibroblast | Lung | 392 | 0.921 | 0.920 |
| 16 | MT2 | Lymphocyte (HTLV-1 producing cell line) | Blood | 93 | 0.968 | 0.969 |
| 17 | NFF | Fibroblast | Skin | 57 | 0.978 | 0.978 |
| 18 | NHDF | Fibroblast | Skin | 51 | 0.947 | 0.941 |
| 19 | PBMC | Peripheral blood mononuclear cell | Blood | 1194 | 0.973 | 0.972 |
| 20 | PrEC | Prostate epithelial cell | Prostate | 4 | 0.802 | 0.804 |
| 21 | RPTEC | Renal proximal tubule epithelial cells | Kidney | 8 | 0.998 | 0.998 |
| 22 | SKW 6.4 | B lymphocyte; Epstein-Barr virus (EBV) transformed | Haematopoietic, lymphoid tissue | 39 | 1.000 | 1.000 |
| 23 | TERT-RPE1 | Retinal pigmented epithelial cell | Retina | 10 | 0.903 | 0.904 |
| 24 | WI-38 | Embryonic lung fibroblast | Lung | 150 | 0.939 | 0.939 |
| 25 | WI-38 VA13 | Embryonic lung fibroblast | Lung | 6 | 0.965 | 0.965 |
| 26 | WIL2 | Lymphoblastoid cell | Haematopoietic, lymphoid tissue | 31 | 1.000 | 1.000 |
| 27 | WIL2-NS | Lymphoblastoid cell | Haematopoietic, lymphoid tissue | 44 | 0.961 | 0.953 |
N: number of active compounds in the training set.
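As a quick consistency check, the per-cell-line values in the table above can be averaged and compared with the figures quoted in the abstract for the 27 normal cell lines. The sketch assumes a plain unweighted mean over cell lines; the variable names are illustrative, and the numbers are copied from the table.

```python
# (cell line, AUC LOO CV, AUC 20-fold CV), copied from the table above
normal_lines = [
    ("AG1523", 0.971, 0.971), ("BJ", 0.889, 0.862), ("CRL-7065", 0.926, 0.927),
    ("Detroit 551", 0.962, 0.962), ("HaCaT", 0.978, 0.978), ("HASMC", 0.999, 0.999),
    ("HEK293", 0.922, 0.921), ("HEL 299", 0.889, 0.891), ("HFF", 0.974, 0.974),
    ("HFL1", 1.000, 1.000), ("HMEC", 0.948, 0.950), ("HS27", 0.971, 0.972),
    ("HUVEC", 0.958, 0.958), ("IMR-90", 0.860, 0.862), ("MRC5", 0.921, 0.920),
    ("MT2", 0.968, 0.969), ("NFF", 0.978, 0.978), ("NHDF", 0.947, 0.941),
    ("PBMC", 0.973, 0.972), ("PrEC", 0.802, 0.804), ("RPTEC", 0.998, 0.998),
    ("SKW 6.4", 1.000, 1.000), ("TERT-RPE1", 0.903, 0.904), ("WI-38", 0.939, 0.939),
    ("WI-38 VA13", 0.965, 0.965), ("WIL2", 1.000, 1.000), ("WIL2-NS", 0.961, 0.953),
]

# Unweighted means over cell lines (assumption: this is how the abstract averages).
mean_loo = sum(loo for _, loo, _ in normal_lines) / len(normal_lines)
mean_20fold = sum(k20 for _, _, k20 in normal_lines) / len(normal_lines)

# Cell line where the two validation procedures disagree the most.
worst_name, worst_gap = max(
    ((name, abs(loo - k20)) for name, loo, k20 in normal_lines),
    key=lambda item: item[1],
)

print(f"mean AUC LOO CV: {mean_loo:.3f}, mean AUC 20-fold CV: {mean_20fold:.3f}")
print(f"largest LOO vs 20-fold gap: {worst_name} ({worst_gap:.3f})")
```

Under that assumption the means round to 0.948 and 0.947, matching the abstract, and the two procedures differ by more than 0.01 only for BJ.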
Distribution of cancer cell lines across organs and tissue types, with the mean prediction accuracy calculated by leave-one-out cross-validation (AUC LOO CV) and 20-fold cross-validation (AUC 20-fold CV) procedures for the cell lines from each organ/tissue.
| No | Organ/tissue | Number of cell lines | N | AUC LOO CV | AUC 20-fold CV |
|---|---|---|---|---|---|
| 1 | Adrenal cortex | 1 | 11 | 0.844 | 0.846 |
| 2 | Blood | 26 | 7232 | 0.950 | 0.950 |
| 3 | Bone | 8 | 443 | 0.902 | 0.901 |
| 4 | Brain | 15 | 3534 | 0.941 | 0.940 |
| 5 | Breast | 16 | 15716 | 0.915 | 0.914 |
| 6 | Cervix | 3 | 4425 | 0.935 | 0.934 |
| 7 | Colon | 26 | 18423 | 0.948 | 0.947 |
| 8 | Germ cell, fibroblast | 1 | 58 | 0.993 | 0.992 |
| 9 | Haematopoietic and lymphoid tissue | 16 | 9540 | 0.914 | 0.912 |
| 10 | Head and neck | 4 | 82 | 0.989 | 0.989 |
| 11 | Kidney | 11 | 4678 | 0.904 | 0.902 |
| 12 | Large intestine | 1 | 564 | 0.962 | 0.961 |
| 13 | Liver | 8 | 3165 | 0.960 | 0.959 |
| 14 | Lung | 38 | 14439 | 0.915 | 0.911 |
| 15 | Nervous system | 3 | 599 | 0.920 | 0.921 |
| 16 | Ovarium | 24 | 7408 | 0.942 | 0.941 |
| 17 | Pancreas | 14 | 1417 | 0.921 | 0.919 |
| 18 | Prostate | 7 | 7286 | 0.935 | 0.933 |
| 19 | Skin | 26 | 7386 | 0.910 | 0.908 |
| 20 | Small intestine | 1 | 8 | 1.000 | 1.000 |
| 21 | Soft tissue | 1 | 338 | 0.903 | 0.899 |
| 22 | Stomach | 14 | 1884 | 0.948 | 0.948 |
| 23 | Testicle | 1 | 16 | 0.986 | 0.986 |
| 24 | Thyroid | 2 | 166 | 0.919 | 0.913 |
| 25 | Upper aerodigestive tract | 2 | 64 | 0.981 | 0.792 |
| 26 | Urinary tract | 6 | 583 | 0.913 | 0.908 |
| 27 | Uterus | 3 | 138 | 0.954 | 0.954 |
N: number of active compounds in the training set.
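The per-tissue rows above can be rolled up to an overall figure for all cancer cell lines. The sketch assumes that each tissue's AUC is the unweighted mean over its cell lines, so weighting each row by its number of cell lines approximates the overall mean. Because the table values are rounded to three decimals, the result is approximate; the variable names are illustrative.

```python
# (organ/tissue, number of cell lines, AUC LOO CV, AUC 20-fold CV), from the table above
tissues = [
    ("Adrenal cortex", 1, 0.844, 0.846), ("Blood", 26, 0.950, 0.950),
    ("Bone", 8, 0.902, 0.901), ("Brain", 15, 0.941, 0.940),
    ("Breast", 16, 0.915, 0.914), ("Cervix", 3, 0.935, 0.934),
    ("Colon", 26, 0.948, 0.947), ("Germ cell, fibroblast", 1, 0.993, 0.992),
    ("Haematopoietic and lymphoid tissue", 16, 0.914, 0.912),
    ("Head and neck", 4, 0.989, 0.989), ("Kidney", 11, 0.904, 0.902),
    ("Large intestine", 1, 0.962, 0.961), ("Liver", 8, 0.960, 0.959),
    ("Lung", 38, 0.915, 0.911), ("Nervous system", 3, 0.920, 0.921),
    ("Ovarium", 24, 0.942, 0.941), ("Pancreas", 14, 0.921, 0.919),
    ("Prostate", 7, 0.935, 0.933), ("Skin", 26, 0.910, 0.908),
    ("Small intestine", 1, 1.000, 1.000), ("Soft tissue", 1, 0.903, 0.899),
    ("Stomach", 14, 0.948, 0.948), ("Testicle", 1, 0.986, 0.986),
    ("Thyroid", 2, 0.919, 0.913), ("Upper aerodigestive tract", 2, 0.981, 0.792),
    ("Urinary tract", 6, 0.913, 0.908), ("Uterus", 3, 0.954, 0.954),
]

total_cell_lines = sum(n for _, n, _, _ in tissues)

# Weight each tissue mean by how many cell lines it summarises.
weighted_loo = sum(n * loo for _, n, loo, _ in tissues) / total_cell_lines
weighted_20fold = sum(n * k20 for _, n, _, k20 in tissues) / total_cell_lines

print(f"cell lines: {total_cell_lines}")
print(f"weighted mean AUC LOO CV: {weighted_loo:.3f}")
print(f"weighted mean AUC 20-fold CV: {weighted_20fold:.3f}")
```

The cell-line counts sum to 278, and the weighted means come out close to the 0.930 and 0.927 reported in the abstract.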
Fig 1. The prediction results for Sorafenib with the web-service.
Known and new predicted applications for drugs launched for the treatment of breast cancer.
| Name | Known therapeutic groups of application | Phases of study | Possible new applications based on prediction of cancer cell line cytotoxicity |
|---|---|---|---|
| Doxorubicin | cancer: non-small cell lung | | bone cancer, stomach cancer, kidney cancer, skin cancer, tumours of haematopoietic and lymphoid tissue |
| Gemcitabine | cancer: small cell lung | | osteosarcoma |
| Raloxifene | cancer: breast | | acute T-lymphoblastic leukemia |
| Vinorelbine | cancer: multiple myeloma | | small cell lung carcinoma, colon carcinoma, osteosarcoma, childhood acute myeloid leukaemia with maturation |
Italic font: phase of drug development; *: correct prediction by the CLC-Pred web-service.