Tatdow Pansombut, Siripen Wikaisuksakul, Kittiya Khongkraphan, Aniruth Phon-On.
Abstract
This paper addresses the recognition of acute lymphoblastic leukaemia (ALL) subtypes under the WHO classification. The two ALL subtypes considered are T-lymphoblastic leukaemia (pre-T) and B-lymphoblastic leukaemia (pre-B). Their varied characteristics make it difficult to distinguish the subtypes from their mature cells, lymphocytes. In the common approach, handcrafted features must be carefully designed for this complex, domain-specific problem. A deep learning approach can eliminate handcrafted feature engineering, because the multilayer architecture of a convolutional neural network (CNN) automates this task. In this work, we implement a CNN classifier to explore the feasibility of a deep learning approach for identifying lymphocytes and ALL subtypes, and we benchmark it against the dominant approach of support vector machines (SVMs) with handcrafted feature engineering. Two additional traditional machine learning classifiers, a multilayer perceptron (MLP) and a random forest, are also included in the comparison. The experiments show that our CNN classifier performs better at identifying normal lymphocytes and pre-B cells. This indicates strong potential for image classification without the multiple preprocessing steps that feature engineering requires.
Year: 2019 PMID: 31281337 PMCID: PMC6589284 DOI: 10.1155/2019/7519603
Source DB: PubMed Journal: Comput Intell Neurosci
Sample images of the considered white blood cells: lymphocyte, pre-T, and pre-B lymphoblasts. (Panels: lymphocyte, pre-B, pre-T.)
Figure 1: Architecture of ConVNet. The CNN consists of seven layers. Layers 1, 2, and 3 extract features from the cell images. Layer 4 flattens the 64 extracted feature maps into a one-dimensional array of size 65536. Layer 5 maps the 65536 inputs to 64 outputs. Layer 6 randomly drops 50 percent of the 64 inputs. Layer 7 classifies the image into one of the 3 cell types (lymphocyte, pre-T, or pre-B).
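The flattened size of 65536 in the caption can be checked with a short shape walk. This is a minimal sketch, not the authors' code: it assumes a 256 × 256 single-channel input, convolutions that preserve the spatial size, and a 2 × 2 max pooling step after each of the three convolutional layers. The names `CONV_BLOCKS` and `flattened_size` are hypothetical.

```python
# Assumed layout: three 3x3 convolution blocks, each followed by 2x2 max
# pooling, applied to a 256 x 256 single-channel image.
CONV_BLOCKS = [(1, 32), (32, 32), (32, 64)]  # (input channels, number of filters)


def flattened_size(side, blocks, pool=2):
    """Return the length of the flattened feature vector after all blocks."""
    channels = blocks[0][0]
    for in_channels, n_filters in blocks:
        if in_channels != channels:
            raise ValueError("block input must match the previous block's output")
        channels = n_filters  # size-preserving convolution changes only the depth
        side //= pool         # 2x2 max pooling halves each spatial side
    return channels * side * side


FLAT = flattened_size(256, CONV_BLOCKS)
print(FLAT)  # 65536
```

Under these assumptions the spatial side shrinks from 256 to 128, 64, and 32, so the last block yields 64 maps of 32 × 32 values, which matches the array size stated for layer 4.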
Figure 2: Image classification using ConVNet.
ConVNet's total convolutional operations.
| Layer | Input channels | Filter size | Number of filters | Image size | Number of convolutional operations in the layer |
|---|---|---|---|---|---|
| 1 | 1 | 3 | 32 | 256 | 1 × 3² × 32 × 256² = 18,874,368 |
| 2 | 32 | 3 | 32 | 128 | 32 × 3² × 32 × 128² = 150,994,944 |
| 3 | 32 | 3 | 64 | 64 | 32 × 3² × 64 × 64² = 75,497,472 |
| Total convolutional operations | | | | | 245,366,784 |
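The per-layer counts and the total can be reproduced with a few lines of arithmetic. The sketch below reads the five numeric columns as layer index, input channels, filter size, number of filters, and image side length; that reading is an assumption, chosen because it reproduces every figure in the table. The function name `conv_operations` is hypothetical.

```python
# (layer, input channels, filter size, number of filters, image side)
LAYERS = [
    (1, 1, 3, 32, 256),
    (2, 32, 3, 32, 128),
    (3, 32, 3, 64, 64),
]


def conv_operations(in_channels, filter_size, n_filters, side):
    """Operations for one layer: channels x filter area x filters x image area."""
    return in_channels * filter_size ** 2 * n_filters * side ** 2


PER_LAYER = {layer: conv_operations(c, k, f, s) for layer, c, k, f, s in LAYERS}
TOTAL = sum(PER_LAYER.values())

for layer, ops in PER_LAYER.items():
    print(f"layer {layer}: {ops:,}")
print(f"total: {TOTAL:,}")
```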
Summary of features extracted using image processing.
| No. | Feature | ROI | Type | Description |
|---|---|---|---|---|
| 1 | N/C ratio | Cellular | Geometric | Ratio of number of pixels in nucleus to those in cytoplasm |
| 2 | Form factor | Nucleus | Geometric | Ratio of number of pixels in nucleus to its perimeter |
| 3 | Roundness | Nucleus | Geometric | Measurement of how close the nucleus shape is to a circle |
| 4 | Eccentricity | Nucleus | Geometric | Ratio of major axis to minor axis |
| 5 | Compactness | Nucleus | Geometric | Degree to which a shape is compact |
| 6 | Symmetry | Nucleus | Geometric | Ratio between two parts around the nucleus major axis |
| 7 | Hand-mirror | Cellular | Geometric | Measurement of how the hand-mirror part of the cell forms |
| 8 | Fractal geometry | Nucleus | Geometric | Degree to which the nucleus boundary is irregular by calculating Hausdorff dimension |
| 9–11 | Contour | Nucleus | Geometric | Variance, skewness, and kurtosis of distances between centroid and contour points along the nucleus boundary |
| 12 | Fractal geometry | Cellular | Geometric | Degree to which the cellular boundary is irregular by calculating Hausdorff dimension |
| 13–15 | Contour | Cellular | Geometric | Variance, skewness, and kurtosis of distances between centroid and contour points along the cellular boundary |
| 16–18 | Haar wavelet | Nucleus | Texture | Mean of |
| 19–21 | Haar wavelet | Nucleus | Texture | Variance of |
| 22–26 | Haralick | Nucleus | Texture | Contrast, correlation, homogeneity, energy, and entropy of Haralick's texture feature values |
| 27–34 | Fourier descriptors | Nucleus | Texture | Mean, standard deviation, skewness, and kurtosis of the frequency components obtained from discrete forward (27–30) and inverse (31–34) Fourier transforms |
| 35–37 | Color in RGB | Nucleus | Color | Mean color intensity of red, green, and blue in a nucleus area |
| 38–40 | Color in HSV | Nucleus | Color | Mean color intensity of hue, saturation, and value in a nucleus area |
| 41–43 | Color in RGB | Cytoplasm | Color | Mean color intensity of red, green, and blue in a cytoplasm area |
| 44–46 | Color in HSV | Cytoplasm | Color | Mean color intensity of hue, saturation, and value in a cytoplasm area |
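As an illustration of the contour features (9–11 and 13–15), the sketch below computes the variance, skewness, and kurtosis of centroid-to-contour distances. It is a hedged example rather than the authors' implementation: it assumes the centroid is the mean of the contour points, uses population (divide-by-n) moments, and reports non-excess kurtosis. The function name `contour_distance_stats` is hypothetical.

```python
import math


def contour_distance_stats(contour):
    """Return (variance, skewness, kurtosis) of centroid-to-contour distances.

    `contour` is a list of (x, y) boundary points. The centroid is taken as
    the mean of those points, which is an assumption of this sketch.
    """
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in contour]
    mean = sum(dists) / n
    m2 = sum((d - mean) ** 2 for d in dists) / n
    if m2 < 1e-12:
        # All distances are (numerically) equal, e.g. a perfect circle.
        return 0.0, 0.0, 0.0
    m3 = sum((d - mean) ** 3 for d in dists) / n
    m4 = sum((d - mean) ** 4 for d in dists) / n
    return m2, m3 / m2 ** 1.5, m4 / m2 ** 2


# Four points of an elongated shape: distances from the centroid are 3, 1, 3, 1.
STATS = contour_distance_stats([(3, 0), (0, 1), (-3, 0), (0, -1)])
print(STATS)  # (1.0, 0.0, 1.0)
```

A near-circular boundary gives a variance close to zero, while an irregular or elongated boundary raises it, which is why these statistics can serve as shape descriptors.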
Figure 3: Identification of hand-mirror morphology measured by the proportion (a + c)/(b + c).
Figure 4: The chromosome consists of three parts: the SVM parameters C and γ, and the feature mask f.
Figure 5: Feature selection and parameter optimization using the GA-based technique from [33].
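To make the three-part chromosome of Figure 4 concrete, the sketch below decodes a bit list into C, γ, and a list of selected feature numbers. The binary encoding, the bit lengths (`n_c`, `n_gamma`), and the value ranges are illustrative assumptions and are not taken from the source or from [33].

```python
def bits_to_real(bits, low, high):
    """Map a list of 0/1 bits to a real value in [low, high] by linear scaling."""
    as_int = int("".join(str(b) for b in bits), 2)
    return low + (high - low) * as_int / (2 ** len(bits) - 1)


def decode(chromosome, n_c=8, n_gamma=8,
           c_range=(0.1, 100.0), gamma_range=(1e-4, 1.0)):
    """Split a chromosome into C, gamma, and 1-based selected feature numbers.

    Bit lengths and ranges are hypothetical defaults chosen for illustration.
    """
    c_bits = chromosome[:n_c]
    gamma_bits = chromosome[n_c:n_c + n_gamma]
    mask = chromosome[n_c + n_gamma:]
    c_value = bits_to_real(c_bits, *c_range)
    gamma_value = bits_to_real(gamma_bits, *gamma_range)
    selected = [i + 1 for i, bit in enumerate(mask) if bit == 1]
    return c_value, gamma_value, selected


# Toy chromosome: maximal C, minimal gamma, and a 3-bit feature mask.
EXAMPLE = [1] * 8 + [0] * 8 + [1, 0, 1]
C, GAMMA, SELECTED = decode(EXAMPLE)
print(C, GAMMA, SELECTED)
```

With the 46 features summarized earlier, the mask part would carry one bit per feature, so a single chromosome describes both which features the SVM sees and how the SVM is parameterized.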
Summary of parameter settings for the ConVNet, SVM-GA, MLP, and random forest.
| Methods | Parameters | Setting |
|---|---|---|
| ConVNet | Filter size (convolutional layer) | 3 × 3 |
| ConVNet | Filter size (max pooling layer) | 2 × 2 |
| ConVNet | Batch size | 121 |
| ConVNet | Epochs | 50 |
| ConVNet | Learning rate | 0.001 |
| SVM-GA | Population size | 100 |
| SVM-GA | Number of generations | 100 |
| SVM-GA | Probability of crossover | 0.80 |
| SVM-GA | Probability of mutation | 0.06 |
| SVM-GA | Rate of elitism | 0.05 |
| SVM-GA | | 20 |
| SVM-GA | | 42 |
| MLP | Number of hidden layers | 1 |
| MLP | Number of neurons in hidden layer | 69 |
| MLP | Activation function | Logistic |
| MLP | Batch size | 121 |
| MLP | Epochs | 10000 |
| MLP | Learning rate | 0.001 |
| MLP | Momentum | 0.7 |
| Random forest | Number of classifiers | 100 |
| Random forest | Maximum depth | 2 |
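The SVM-GA settings (population size 100, 100 generations, crossover probability 0.80, mutation probability 0.06, elitism rate 0.05) can be read as the knobs of a standard generational genetic algorithm. The loop below is a minimal sketch under several assumptions that the source does not confirm: binary tournament selection, one-point crossover, and per-bit mutation. The fitness function is supplied by the caller; in the SVM-GA setting it would be a classifier score, but here any callable works.

```python
import random

POP_SIZE, GENERATIONS = 100, 100
P_CROSSOVER, P_MUTATION, ELITISM_RATE = 0.80, 0.06, 0.05


def evolve(fitness, n_bits, rng):
    """Run a simple generational GA over bit lists and return the best one."""
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(POP_SIZE)]
    n_elite = max(1, int(ELITISM_RATE * POP_SIZE))
    for _ in range(GENERATIONS):
        ranked = sorted(population, key=fitness, reverse=True)
        # Elitism: copy the top individuals unchanged into the next generation.
        next_gen = [ind[:] for ind in ranked[:n_elite]]
        while len(next_gen) < POP_SIZE:
            # Binary tournament selection (an assumption of this sketch).
            p1 = max(rng.sample(ranked, 2), key=fitness)
            p2 = max(rng.sample(ranked, 2), key=fitness)
            child = p1[:]
            if rng.random() < P_CROSSOVER:
                cut = rng.randrange(1, n_bits)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            # Per-bit mutation: flip each bit with probability P_MUTATION.
            child = [1 - b if rng.random() < P_MUTATION else b for b in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


# Toy usage: maximize the number of 1 bits in a 20-bit string.
best = evolve(sum, 20, random.Random(0))
print(sum(best))
```

Because the elite individuals are copied unchanged, the best fitness in the population never decreases from one generation to the next.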
Accuracy (%) of ConVNet, SVM-GA, MLP, and random forest in identifying lymphocytes, pre-T, and pre-B cells over ten test sets.
| Test set | ConVNet | SVM-GA | MLP | Random forest |
|---|---|---|---|---|
| 1 | 82.64 | 81.82 | 74.38 | 78.51 |
| 2 | 80.17 | 80.99 | 75.21 | 72.73 |
| 3 | 85.12 | 80.17 | 75.21 | 79.34 |
| 4 | 80.17 | 80.17 | 76.86 | 80.99 |
| 5 | 78.51 | 79.34 | 76.03 | 78.51 |
| 6 | 78.51 | 81.82 | 73.55 | 76.86 |
| 7 | 83.47 | 80.99 | 78.51 | 79.34 |
| 8 | 85.95 | 81.82 | 81.82 | 80.99 |
| 9 | 83.47 | 86.78 | 76.03 | 79.34 |
| 10 | 79.34 | 82.64 | 73.55 | 77.69 |
| Average | 81.74 ± 2.74 | 81.65 ± 2.05 | 76.12 ± 2.51 | 78.43 ± 2.38 |
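The "Average" row can be recomputed from the per-test-set values. The sketch below does this for the ConVNet and SVM-GA columns, assuming the ± term is the sample standard deviation (n − 1 in the denominator); the reported figures appear consistent with that choice, but the source does not state it.

```python
from statistics import mean, stdev

# Accuracy (%) per test set, copied from the table.
ACCURACY = {
    "ConVNet": [82.64, 80.17, 85.12, 80.17, 78.51, 78.51, 83.47, 85.95, 83.47, 79.34],
    "SVM-GA": [81.82, 80.99, 80.17, 80.17, 79.34, 81.82, 80.99, 81.82, 86.78, 82.64],
}


def summarize(values):
    """Return (mean, sample standard deviation) of a list of accuracies."""
    return mean(values), stdev(values)


SUMMARY = {name: summarize(values) for name, values in ACCURACY.items()}
for name, (avg, sd) in SUMMARY.items():
    print(f"{name}: {avg:.2f} ± {sd:.2f}")
```

The two means differ by less than 0.1 percentage points while each standard deviation exceeds 2 points, so the overall accuracy of the two methods is hard to separate on these ten test sets alone.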
Sensitivity (%) of ConVNet and SVM-GA in identifying lymphocytes and ALL subtypes over ten test sets.
| Test set | ConVNet: Lymphocyte | ConVNet: Pre-T | ConVNet: Pre-B | SVM-GA: Lymphocyte | SVM-GA: Pre-T | SVM-GA: Pre-B |
|---|---|---|---|---|---|---|
| 1 | 100.00 | 71.11 | 82.22 | 100.00 | 77.78 | 73.33 |
| 2 | 100.00 | 66.67 | 80.00 | 93.55 | 80.00 | 73.33 |
| 3 | 100.00 | 75.56 | 84.44 | 93.55 | 66.67 | 84.44 |
| 4 | 100.00 | 64.44 | 82.22 | 100.00 | 77.78 | 73.33 |
| 5 | 100.00 | 62.22 | 80.00 | 90.32 | 60.00 | 91.11 |
| 6 | 100.00 | 57.78 | 84.44 | 83.87 | 80.00 | 82.22 |
| 7 | 100.00 | 80.00 | 75.56 | 96.77 | 75.56 | 75.56 |
| 8 | 100.00 | 82.22 | 80.00 | 90.32 | 77.78 | 80.00 |
| 9 | 96.77 | 71.43 | 80.00 | 96.77 | 84.44 | 82.22 |
| 10 | 100.00 | 57.78 | 86.67 | 100.00 | 68.89 | 84.44 |
| Average | 99.68 ± 1.02 | 68.92 ± 8.65 | 81.56 ± 3.15 | 94.52 ± 5.28 | 74.89 ± 7.41 | 80.00 ± 6.02 |
Sensitivity (%) of MLP and random forest in identifying lymphocytes and ALL subtypes over ten test sets.
| Test set | MLP: Lymphocyte | MLP: Pre-T | MLP: Pre-B | Random forest: Lymphocyte | Random forest: Pre-T | Random forest: Pre-B |
|---|---|---|---|---|---|---|
| 1 | 100.00 | 68.89 | 62.22 | 100.00 | 73.33 | 68.89 |
| 2 | 100.00 | 68.89 | 64.44 | 100.00 | 64.44 | 62.22 |
| 3 | 100.00 | 62.22 | 71.11 | 100.00 | 64.44 | 80.00 |
| 4 | 100.00 | 68.89 | 68.89 | 100.00 | 71.11 | 77.78 |
| 5 | 96.77 | 64.44 | 73.33 | 100.00 | 62.22 | 80.00 |
| 6 | 100.00 | 64.44 | 64.44 | 100.00 | 71.11 | 66.67 |
| 7 | 96.77 | 77.78 | 66.67 | 100.00 | 73.33 | 71.11 |
| 8 | 100.00 | 80.00 | 71.11 | 96.77 | 80.00 | 71.11 |
| 9 | 100.00 | 75.56 | 60.00 | 100.00 | 75.56 | 68.89 |
| 10 | 100.00 | 57.78 | 71.11 | 96.77 | 66.67 | 75.56 |
| Average | 99.35 ± 1.36 | 68.89 ± 7.10 | 67.33 ± 4.45 | 99.35 ± 1.36 | 70.22 ± 5.66 | 72.22 ± 5.95 |
Confusion matrix of the classification from the worst results produced by ConVNet and SVM-GA.
| Method | Class | Lymphocyte | Pre-T | Pre-B |
|---|---|---|---|---|
| ConVNet | Lymphocyte | 31 | 0 | 0 |
| ConVNet | Pre-T | 0 | 28 | 17 |
| ConVNet | Pre-B | 0 | 9 | 36 |
| SVM-GA | Lymphocyte | 28 | 0 | 3 |
| SVM-GA | Pre-T | 0 | 27 | 18 |
| SVM-GA | Pre-B | 0 | 4 | 41 |
Confusion matrix of the classification from the best results produced by ConVNet and SVM-GA.
| Method | Class | Lymphocyte | Pre-T | Pre-B |
|---|---|---|---|---|
| ConVNet | Lymphocyte | 31 | 0 | 0 |
| ConVNet | Pre-T | 0 | 37 | 8 |
| ConVNet | Pre-B | 0 | 9 | 36 |
| SVM-GA | Lymphocyte | 30 | 1 | 0 |
| SVM-GA | Pre-T | 0 | 38 | 7 |
| SVM-GA | Pre-B | 0 | 8 | 37 |
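Sensitivity and accuracy both follow directly from a confusion matrix whose rows are the true classes. The sketch below uses the off-diagonal counts of the best ConVNet result, with diagonal counts chosen to be consistent with the sensitivities reported for test set 8. This pairing of tables, and the implied test set composition of 31 lymphocyte, 45 pre-T, and 45 pre-B images, is an assumption of the example.

```python
CLASSES = ["Lymphocyte", "Pre-T", "Pre-B"]

# Rows: true class; columns: predicted class (same order as CLASSES).
# Diagonal counts are inferred from the reported sensitivities (assumption).
CONFUSION = [
    [31, 0, 0],
    [0, 37, 8],
    [0, 9, 36],
]


def sensitivities(matrix):
    """Per-class sensitivity (%): correct predictions over true-class total."""
    return [100 * row[i] / sum(row) for i, row in enumerate(matrix)]


def accuracy(matrix):
    """Overall accuracy (%): sum of the diagonal over all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100 * correct / total


SENS = sensitivities(CONFUSION)
ACC = accuracy(CONFUSION)
for name, value in zip(CLASSES, SENS):
    print(f"{name}: {value:.2f}")
print(f"accuracy: {ACC:.2f}")
```

Under these assumptions the matrix gives sensitivities of 100.00, 82.22, and 80.00 and an accuracy of 85.95, matching the ConVNet entries for test set 8 in the accuracy and sensitivity tables.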