Fateme Shaabanpour Aghamaleki, Behrouz Mollashahi, Mokhtar Nosrati, Afshin Moradi, Mojgan Sheikhpour, Abolfazl Movafagh.
Abstract
Introduction: Chronic lymphocytic leukemia (CLL) is one of the most common types of leukemia, and early diagnosis is closely tied to proper treatment and patient survival. Late diagnosis or inadequate treatment may lead to harmful outcomes. Several methods can be used to diagnose leukemia, including complete blood count (CBC), immunophenotyping, lymph node biopsy, chest X-ray, computerized tomography (CT) scan, and ultrasound. Most of these methods are time-consuming, and more than one usually has to be applied to obtain the intended result. This underscores the need for rapid and accurate leukemia diagnosis based on clinical and medical findings. An artificial neural network (ANN) was therefore applied to identify a molecular biomarker for rapid leukemia diagnosis from blood samples and to evaluate its potential for detecting cancer.

Materials & methods: The independent-samples t-test was applied, using the Statistical Package for the Social Sciences (SPSS; IBM Corp, Armonk, NY, US), to the microarray gene expression data of the Gene Expression Omnibus (GEO) dataset GSE22529. The 12 genes showing the largest differences (among those with a p-value below 0.01) were selected for further ANN analysis. The selected genes from 53 patients were used to train the network, with a learning rate of 0.1.

Results: The output of the trained network agreed closely with the test data. The area under the receiver operating characteristic (ROC) curve was 0.991, which supports the precision of the model and the identification of Gelsolin as a potential biomarker.

Conclusions: These results indicate that a trained ANN could be applied to rapid CLL diagnosis and to finding a potential biomarker. The method could also be used to diagnose other forms of cancer in order to obtain rapid and reliable outcomes.
Keywords: artificial neural network; biomarkers; chronic lymphocytic leukemia; diagnosis
Year: 2019 PMID: 31001458 PMCID: PMC6450593 DOI: 10.7759/cureus.4004
Source DB: PubMed Journal: Cureus ISSN: 2168-8184
A list of the twelve selected genes based on the t-test result.
These twelve genes had the lowest p-values in the t-test. For each gene, the microarray probe ID, gene symbol, and gene name are listed.
| Probe ID | Gene symbol | Species | Gene name |
| 200666_s_at | DNAJB1 | Homo sapiens | DnaJ heat shock protein family (Hsp40) member B1 |
| 200627_at | PTGES3 | Homo sapiens | Prostaglandin E Synthase 3 |
| 200664_s_at | DNAJB1 | Homo sapiens | DnaJ heat shock protein family (Hsp40) member B1 |
| 200701_at | NPC2 | Homo sapiens | NPC intracellular cholesterol transporter 2 |
| 200675_at | CD81 | Homo sapiens | CD81 molecule |
| 200028_s_at | STARD7 | Homo sapiens | StAR related lipid transfer domain 7 |
| 200634_at | PFN1 | Homo sapiens | Profilin 1 |
| 200709_at | FKBP1A | Homo sapiens | FK506 binding protein 1A |
| 200022_at | RPL18 | Homo sapiens | Ribosomal Protein L18 |
| 200696_s_at | GSN | Homo sapiens | Gelsolin |
| 200657_at | SLC25A5 | Homo sapiens | Solute Carrier Family 25 Member 5 |
| 200650_s_at | LDHA | Homo sapiens | Lactate Dehydrogenase A |
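The selection step described above (an independent-samples t-test per probe, keeping the most significant ones) can be sketched roughly as follows. This is a minimal illustration, not the authors' SPSS workflow: the probe names, expression values, and the `top_probes` helper are hypothetical. It ranks probes by the absolute pooled-variance t statistic, which gives the same ordering as ranking by p-value when every probe is measured on the same samples (the degrees of freedom are then identical). Applying an actual p < 0.01 cut-off would additionally need the t distribution, for example via `scipy.stats.ttest_ind`.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Student's independent two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


def top_probes(expr, is_patient, k):
    """Return the k probes with the largest |t| between patients and controls.

    expr maps a probe ID to a list of expression values, one per sample;
    is_patient is a parallel list of booleans marking the patient samples.
    """
    scores = {}
    for probe, values in expr.items():
        cases = [v for v, p in zip(values, is_patient) if p]
        controls = [v for v, p in zip(values, is_patient) if not p]
        scores[probe] = abs(pooled_t(cases, controls))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: three probes, three patients, three controls.
is_patient = [True, True, True, False, False, False]
expr = {
    "probe_A": [8.1, 7.9, 8.3, 5.0, 5.2, 4.9],
    "probe_B": [6.0, 6.4, 5.8, 6.1, 5.9, 6.3],
    "probe_C": [4.0, 4.6, 4.3, 5.1, 5.5, 4.9],
}
selected = top_probes(expr, is_patient, k=2)
print(selected)
```

In the study itself the same idea would be applied with k = 12 across all probes of the dataset.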
Results of three algorithms for the twelve genes across the two groups (patients and healthy controls).
AUC = Area under the curve in the ROC analysis, CA = Classification accuracy, F1 = F1 score (harmonic mean of precision and recall)
| Method | AUC | CA | F1 | Precision | Recall |
| SVM | 0.985 | 0.952 | 0.953 | 0.955 | 0.952 |
| Random Forest | 0.969 | 0.936 | 0.936 | 0.936 | 0.936 |
| Neural network | 0.991 | 0.969 | 0.969 | 0.970 | 0.969 |
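For readers unfamiliar with the column headers, the sketch below shows how CA, precision, recall, and F1 are conventionally computed from true and predicted labels for one positive class. The labels are made-up toy values, not study data, and the source does not say which averaging scheme the reported figures use, so this is only the basic binary definition.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute CA, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    ca = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"CA": ca, "Precision": precision, "Recall": recall, "F1": f1}


# Hypothetical labels: 1 = CLL patient, 0 = healthy control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```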
The results of the neural network algorithm for the twelve genes between the two groups.
This algorithm was applied to the 12 selected genes to assess the value of the ANN in classification.
ANN = Artificial neural network, AUC = Area under the curve in the ROC analysis, CA = Classification accuracy, F1 = F1 score (harmonic mean of precision and recall)
| Method | AUC | CA | F1 | Precision | Recall |
| Neural network | 0.991 | 0.981 | 0.980 | 0.981 | 0.981 |
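The abstract reports a learning rate of 0.1 for network training. As a rough illustration of what that parameter controls, the sketch below trains a single sigmoid neuron by batch gradient descent. This is a deliberate simplification and an assumption on our part: the source does not describe the network architecture, and the two-feature toy data and all names here are hypothetical.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def train_neuron(X, y, learning_rate=0.1, epochs=200):
    """Fit one sigmoid unit with batch gradient descent on log-loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj / n
            grad_b += err / n
        # The learning rate scales how far each update moves the weights.
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, X):
    return [int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5)
            for xi in X]


# Hypothetical standardized expression values for two genes.
X = [[2.0, 1.0], [1.5, 1.2], [1.8, 0.8],
     [-1.0, -0.5], [-1.5, -1.0], [-0.8, -1.2]]
y = [1, 1, 1, 0, 0, 0]
w, b = train_neuron(X, y, learning_rate=0.1)
predictions = predict(w, b, X)
print(predictions)
```

A real ANN would add hidden layers and evaluate on held-out samples rather than on the training data used here.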
Results of receiver operating characteristic curve analysis of the twelve genes.
ROC curve analysis of the 12 genes. GSN has the highest AUC, suggesting that it can distinguish CLL samples from healthy samples.
AUC = Area under the curve in the ROC analysis, ROC = Receiver operating characteristic, GSN = Gelsolin, CLL = Chronic lymphocytic leukemia
| Probe ID | Gene symbol | Specificity | Sensitivity | AUC |
| 200022_at | RPL18 | 0.756097561 | 1 | 0.911308204 |
| 200028_s_at | STARD7 | 0.818181818 | 0.926829268 | 0.88691796 |
| 200627_at | PTGES3 | 0.727272727 | 0.951219512 | 0.840354767 |
| 200650_s_at | LDHA | 0.731707317 | 1 | 0.922394678 |
| 200657_at | SLC25A5 | 0.804878049 | 1 | 0.953436807 |
| 200664_s_at | DNAJB1 | 0.902439024 | 0.909090909 | 0.931263858 |
| 200666_s_at | DNAJB1 | 0.853658537 | 1 | 0.922394678 |
| 200675_at | CD81 | 0.951219512 | 0.909090909 | 0.911308204 |
| 200701_at | NPC2 | 0.902439024 | 0.909090909 | 0.940133038 |
| 200709_at | FKBP1A | 0.829268293 | 1 | 0.968957871 |
| 200696_s_at | GSN | 0.902439024 | 1 | 0.971175166 |
| 200634_at | PFN1 | 0.926829268 | 0.818181818 | 0.89578714 |
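The per-gene columns above can be reproduced in principle with a few lines of code. The sketch below computes the AUC as the probability that a randomly chosen patient scores higher than a randomly chosen control (the rank-based definition, with ties counted as one half), plus sensitivity and specificity at a single cut-off. The scores, labels, and the 0.5 threshold are hypothetical; the source does not state how its cut-offs were chosen.

```python
def roc_auc(scores, labels):
    """AUC as P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Call a sample positive when its score is at or above the threshold."""
    tp = sum(s >= threshold for s, y in zip(scores, labels) if y)
    fn = sum(s < threshold for s, y in zip(scores, labels) if y)
    tn = sum(s < threshold for s, y in zip(scores, labels) if not y)
    fp = sum(s >= threshold for s, y in zip(scores, labels) if not y)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores (e.g. a scaled expression level); 1 = CLL, 0 = healthy.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
print(auc, sens, spec)
```

For a gene that is lower in patients than in controls, the raw AUC falls below 0.5; negating the scores before the call gives the equivalent value above 0.5.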
Figure 1. The receiver operating characteristic curves of the twelve genes.
The curve for each gene is shown in a distinct color. The horizontal axis is specificity and the vertical axis is sensitivity.
Figure 2. The receiver operating characteristic curve of Gelsolin.
The ROC curve of GSN, which has the highest AUC; the AUROC is 0.991, higher than the AUROC of the other genes.
ROC = Receiver operating characteristic, GSN = Gelsolin, AUC = Area under the curve, AUROC = Area under the receiver operating characteristic curve