Literature DB >> 28400990

Classifications of Multispectral Colorectal Cancer Tissues Using Convolution Neural Network.

Hawraa Haj-Hassan1, Ahmad Chaddad2, Youssef Harkouss3, Christian Desrosiers4, Matthew Toews4, Camel Tanougast5.   

Abstract

BACKGROUND: Colorectal cancer (CRC) is the third most common cancer among men and women. Its diagnosis in early stages, typically done through the analysis of colon biopsy images, can greatly improve the chances of a successful treatment. This paper proposes to use convolution neural networks (CNNs) to predict three tissue types related to the progression of CRC: benign hyperplasia (BH), intraepithelial neoplasia (IN), and carcinoma (Ca).
METHODS: Multispectral biopsy images of thirty CRC patients were retrospectively analyzed. Images of tissue samples were divided into three groups, based on their type (10 BH, 10 IN, and 10 Ca). An active contour model was used to segment image regions containing pathological tissues. Tissue samples were classified using a CNN containing convolution, max-pooling, and fully-connected layers. Available tissue samples were split into a training set, for learning the CNN parameters, and a test set, for evaluating its performance.
RESULTS: An accuracy of 99.17% was obtained from segmented image regions, outperforming existing approaches based on traditional feature extraction and classification techniques.
CONCLUSIONS: Experimental results demonstrate the effectiveness of CNN for the classification of CRC tissue types, in particular when using presegmented regions of interest.

Keywords:  Active contour segmentation; colorectal cancer; convolution neural networks; multispectral optical microscopy

Year:  2017        PMID: 28400990      PMCID: PMC5360018          DOI: 10.4103/jpi.jpi_47_16

Source DB:  PubMed          Journal:  J Pathol Inform


Introduction

Colorectal cancer (CRC), also known as colon cancer, is caused by the abnormal growth and proliferation of cells in the colon.[1] The American Cancer Society estimated that 136,830 people would be diagnosed with CRC and 50,310 would die from it in 2016,[2] and that the average lifetime risk of developing this type of cancer is one in 20 (5%). As with most types of cancer, the early detection of CRC is key to improving the chances of a successful treatment. CRC is typically diagnosed through the microscopic analysis of colon biopsy images. However, this process can be time consuming and subjective, often leading to significant inter- and intraobserver variability. As a result, many efforts have been made toward the development of reliable techniques for the automated detection of CRC.

A number of studies have investigated automated methods for the assessment and classification of CRC tissue. In Fu et al.,[3] a computer-aided diagnostic system was developed to classify colorectal polyp types using sequential image feature selection and support vector machine (SVM) classification. A processing pipeline including microscopic image segmentation, feature extraction, and classification was also proposed in Kumar et al.[4] for the automated detection of cancer from biopsy images. In Jass,[5] a study based on clinical, morphological, and molecular features showed the usefulness of such features for the diagnosis and treatment of CRC. A combination of geometric, morphological, texture, and scale-invariant features was also investigated in Rathore et al.,[6] classifying colon biopsy images with an accuracy of 99.18%. In Rathore et al.,[7] a similar set of hybrid features was used with an ensemble classifier to enhance the classification accuracy. In Rathore et al.,[8] structural features based on the white run length and percentage cluster area were also shown to be useful for the classification of biopsy images.
A few studies have also focused on the detection of CRC using multispectral microscopy images. One such work, presented in Chaddad et al.,[9] uses three-dimensional gray-level co-occurrence matrix features to classify CRC tissue types in multispectral biopsy images. Recently, convolution neural network (CNN) models have achieved state-of-the-art performance on a broad range of computer vision tasks such as face recognition,[10] large-scale object classification,[11] and document analysis.[12] Unlike methods based on handcrafted features, such models have the ability to build high-level features from low-level ones in a data-driven fashion.[13] In medical image analysis, CNNs have shown great potential for various applications such as medical image pattern recognition,[14] abnormal tissue detection,[15] and tissue classification.[16,17]

In this paper, we propose a new approach for assessing CRC progression that applies CNNs to multispectral biopsy images. In this approach, the progression of CRC is modeled using three types of pathological tissues: (1) benign hyperplasia (BH), representing an abnormal increase in the number of noncancerous cells; (2) intraepithelial neoplasia (IN), corresponding to an abnormal growth of tissue that can form a mass (tumor); and (3) carcinoma (Ca), in which the abnormal tissue develops into cancer. A CNN is used to determine the tissue type of biopsy images acquired with an optical microscope at different wavelengths. By identifying this tissue type, our approach can determine and track the progression of CRC, thereby facilitating the selection of an optimal treatment plan. To the best of our knowledge, this work is the first to use CNNs to model the progression of CRC tissues based on multispectral microscopy images.

The rest of this paper is organized as follows: Section 2 describes the data used in this study, the preprocessing steps, and the proposed CNN-based model.
Section 3 then presents the experimental methodology and results, highlighting the performance of our model. In section 4, we discuss the main results and limitations of this study. Finally, we conclude by summarizing the contributions of this work and proposing some potential extensions.

Methods

Figure 1 presents the pipeline of the proposed approach. Multispectral CRC biopsy images are first acquired using an optical microscope system with a charge-coupled device (CCD) camera and a liquid crystal tunable filter (LCTF). These images are then used as input to the CNN classifier, both for training the model and classifying the tissue type of a new image. Alternatively, a segmentation technique based on active contours can be used to extract regions of interest corresponding to pathological tissues, before the classification step. In section 3, we show that presegmenting images can improve the classification accuracy of the proposed approach. Individual steps of the pipeline are detailed in the following subsections.
Figure 1

Flowchart of the proposed pipeline. Convolutional neural network classification of colorectal cancer tissues based on multispectral biopsy images


Data acquisition

Histological CRC data were obtained from the anatomical pathology department of the CHU Nancy-Brabois Hospital. Part of these data was used in a previous study classifying abnormal pathological tissues from texture features.[9] Tissue samples were obtained from sequential colon resections of thirty patients with CRC. Sections of 5 μm thickness were extracted and stained using hematoxylin and eosin to reduce image processing requirements. Multispectral images of 512 × 512 pixels were then acquired using a CCD camera integrated with an LCTF in an optical microscopy system.[18] For each tissue sample, the LCTF was used to provide 16 multispectral images sampled uniformly across the wavelength range of 500–650 nm.[19] Since multispectral imaging considers a broader range of wavelengths, it can capture physiological characteristics of tissues beyond those provided by standard grayscale or trichromatic photography. As mentioned before, the progression of CRC was modeled by considering three types of pathological tissues: BH, IN, and Ca. Each of the thirty biopsy samples used in this study was labeled by a senior histopathologist, for a total of ten BH samples, ten IN samples, and ten Ca samples.
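As a quick illustration of the acquisition geometry, the centers of 16 bands sampled uniformly over 500–650 nm can be computed as follows (a sketch assuming inclusive endpoints; the exact instrument sampling may differ):

```python
import numpy as np

# Centers of the 16 bands sampled uniformly across 500-650 nm
# (assumed inclusive endpoints -> a 10 nm step between bands).
bands = np.linspace(500, 650, 16)
print(bands)  # 500, 510, 520, ..., 650
```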

Pathological tissue segmentation

Although unsegmented images can also be used directly, a segmentation step is included in the proposed pipeline to isolate CRC tissues from nonrelevant tissues and structures such as the lumen. For this step, we used a semi-automatic segmentation technique based on the active contour algorithm, which can accurately delineate the boundaries of irregular shapes and has been shown to perform well for tissue segmentation.[20] Briefly, the active contour model can be described as a self-adaptive search for a minimal energy state E_AC, defined as the sum of an internal energy E_int and an external energy E_ext.[21] The segmentation contour in an image I(x, y) is represented as a parametric function p(s) = (x(s), y(s)), s ∈ [0, 1], and is updated iteratively to minimize E_AC. The internal energy is defined as

E_int = ∫₀¹ [ α |p′(s)|² + β |p″(s)|² ] ds,

where p′(s) and p″(s) represent the first and second derivatives of p(s), respectively, and α and β are constants weighting the relative importance of the derivatives. Intuitively, the internal energy favors short and smooth contour configurations. Likewise, the external energy is defined as

E_ext = −∫₀¹ |∇(G_σ ∗ I)(p(s))|² ds,

where G_σ is the 2D Gaussian function with standard deviation σ, ∗ denotes convolution, and ∇ is the gradient operator. The external energy ensures agreement between the contour and image gradient information arising from tissue boundaries. Segmentation contours were initialized as centered rectangles of 508 × 508 pixels. As shown in Figure 2, the segmentation algorithm partitions the image into two nonoverlapping regions, containing the pixels on either side of the segmentation contour. In a postprocessing step, a trained user (e.g., a pathologist) selects the region of interest from the two segmented regions [Figure 2c]. Although outside the scope of this paper, a supervised learning model such as an SVM could also be trained to select the region of interest automatically.
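The two energy terms can be sketched with simple finite differences (a minimal numpy illustration of the definitions above, not the implementation used in the paper; the Gaussian smoothing G_σ is omitted for brevity):

```python
import numpy as np

def internal_energy(p, alpha=0.1, beta=0.05):
    """Discrete internal energy of a closed contour p (N x 2 array of points):
    sum of alpha*|p'(s)|^2 + beta*|p''(s)|^2 using finite differences."""
    d1 = np.roll(p, -1, axis=0) - p                              # ~ p'(s)
    d2 = np.roll(p, -1, axis=0) - 2 * p + np.roll(p, 1, axis=0)  # ~ p''(s)
    return float(np.sum(alpha * (d1 ** 2).sum(1) + beta * (d2 ** 2).sum(1)))

def external_energy(image, p):
    """Discrete external energy: negative squared gradient magnitude of the
    image, sampled at the (rounded) contour points."""
    gy, gx = np.gradient(image.astype(float))
    iy = np.clip(np.rint(p[:, 1]).astype(int), 0, image.shape[0] - 1)
    ix = np.clip(np.rint(p[:, 0]).astype(int), 0, image.shape[1] - 1)
    return float(-np.sum(gx[iy, ix] ** 2 + gy[iy, ix] ** 2))
```

Minimizing the sum of the two terms thus shrinks and smooths the contour (internal term) while pulling it toward strong image gradients, i.e., tissue boundaries (external term).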
Figure 2

Example of tissue segmentation. (a) Original image, (b) segmentation obtained by the active contour model, (c) selected region of interest

To evaluate the performance of our segmentation model, we used the manually annotated ground truth provided by the CHU Nancy-Brabois Hospital. The Jaccard similarity coefficient (JSC), Dice similarity coefficient (DSC),[22,23] false positive rate (FPR),[24] and false negative rate (FNR)[25] were considered as performance metrics. The JSC and DSC metrics evaluate the degree of correspondence between two segmentations (i.e., the segmentation output and the ground truth) and are defined as

JSC = |A ∩ B| / |A ∪ B| = TP / (TP + FP + FN),
DSC = 2|A ∩ B| / (|A| + |B|) = 2TP / (2TP + FP + FN),

where A and B represent the compared segmentations, TP/TN is the number of correctly classified foreground/background pixels (i.e., true positives/negatives), and FP/FN is the number of incorrectly labeled foreground/background pixels (i.e., false positives/negatives). Moreover, FPR/FNR is the ratio between the number of pixels incorrectly labeled as foreground/background and the total number of background/foreground pixels:

FPR = FP / (FP + TN),    FNR = FN / (FN + TP).

We compared the performance of our active contour model with two standard segmentation approaches: Otsu's thresholding method[26] and edge detection.[27]
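The four metrics above follow directly from the pixel-wise confusion counts of two binary masks, as in this short sketch:

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """JSC, DSC, FPR, FNR from two binary masks (1 = foreground)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = int(np.sum(pred & truth))    # foreground pixels correctly labeled
    fp = int(np.sum(pred & ~truth))   # background labeled as foreground
    fn = int(np.sum(~pred & truth))   # foreground labeled as background
    tn = int(np.sum(~pred & ~truth))  # background correctly labeled
    jsc = tp / (tp + fp + fn)          # |A n B| / |A u B|
    dsc = 2 * tp / (2 * tp + fp + fn)  # 2|A n B| / (|A| + |B|)
    fpr = fp / (fp + tn)
    fnr = fn / (fn + tp)
    return jsc, dsc, fpr, fnr
```

Note that a perfect segmentation gives JSC = DSC = 1 and FPR = FNR = 0.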

Convolutional neural network-based classification

As in most CNN-based classification approaches, we adopted an architecture consisting of three types of layers: convolution layers, subsampling (max-pooling) layers, and a fully-connected output layer.[28] These types of layer can be described as follows:

Convolution layer

This layer type receives as input either the image to classify or the output of the previous layer, and applies a set of Nl convolution filters to this input. The output of the layer corresponds to Nl feature maps, each one the result of a convolution filter and an additive bias. The parameters learned during training are the convolution filter weights and biases. Note that the convolution process trims output maps by a border of Ml − 1 pixels, where Ml × Ml is the size of the convolution filters.
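The Ml − 1 trimming corresponds to "valid" convolution, which can be illustrated with a naive numpy implementation (a sketch for a single channel and filter; real CNN layers vectorize this and add the bias):

```python
import numpy as np

def conv_valid(x, k):
    """'Valid' 2D convolution: an H x W input and an M x M filter yield an
    (H - M + 1) x (W - M + 1) output, i.e. the map shrinks by M - 1 pixels."""
    m = k.shape[0]
    kf = k[::-1, ::-1]  # flip the kernel (true convolution, not correlation)
    h, w = x.shape[0] - m + 1, x.shape[1] - m + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i + m, j:j + m] * kf)
    return out
```

For example, a 60 × 60 input and a 5 × 5 filter yield a 56 × 56 feature map.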

Subsampling (pooling) layer

This parameter-less type of layer reduces the size of input feature maps through subsampling, thereby supporting local spatial invariance. It divides the input maps into nonoverlapping subregions and applies a specific pooling function to each one of them. In our architecture, we considered the max-pooling strategy, which outputs the maximum value of each subregion. Note that the pooling process reduces the feature maps by a factor of Ml, where Ml × Ml is the size of pooling subregions.
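Non-overlapping max-pooling admits a compact numpy sketch (a minimal illustration of the operation described above):

```python
import numpy as np

def max_pool(fmap, m):
    """Max-pooling over non-overlapping m x m subregions; each spatial
    dimension of the feature map shrinks by a factor of m."""
    h, w = fmap.shape
    assert h % m == 0 and w % m == 0, "map size must be divisible by pool size"
    # Group pixels into m x m blocks, then take the maximum of each block.
    return fmap.reshape(h // m, m, w // m, m).max(axis=(1, 3))
```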

Output (fully-connected) layer

This type of layer captures the relationship between the final-layer feature maps and the class labels. The output of the layer is a vector of K elements, each one representing the score of a class (e.g., K = 3 in our network). Fully-connected layers can be seen as convolution operations in which the filters have the same size as their input maps. Figure 3 shows the proposed CNN architecture. This architecture is composed of two convolution layers (C1 and C3), each one followed by a max-pooling layer (S2 and S4), and an output layer (F5). Although not represented in the figure, a layer of rectified linear units (ReLUs) is added after each max-pooling layer to improve the convergence of the learning process, which is based on the stochastic gradient descent algorithm. Further details on implementing CNNs may be found in the literature.[28]
Figure 3

Proposed convolutional neural network architecture with two convolution layers (C1 and C3), two max-pooling layers (S2 and S4), and one fully-connected layer (F5). For each layer, the filter size and number of output features are given

For training and evaluating the CNN, the data were split by patient. We randomly selected the data of 21 patients (7 BH, 7 IN, and 7 Ca) for training and used the data of the remaining 9 patients (3 BH, 3 IN, and 3 Ca) for testing. Furthermore, the biopsy images of three training patients were held out as a validation set and used to determine the optimal network architecture and number of training epochs. For each patient in the training, validation, and testing sets, we obtained multiple examples by running a 60 × 60 × 16 sliding window across the original 512 × 512 × 16 multispectral images. These examples were given the same label as the original image and used as input to the CNN. For segmented images, only examples whose center lies within the region of interest were kept; training examples obtained from segmented images thus contain more relevant information for the target classification problem. Accuracy, the percentage of correctly classified examples, was used as the measure of classification performance.
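The sliding-window example extraction can be sketched as follows (the stride is an assumption for illustration; the paper does not state the window step):

```python
import numpy as np

def extract_patches(volume, size=60, stride=60):
    """Slide a size x size window over the spatial axes of an (H, W, bands)
    multispectral volume; each patch keeps all spectral bands and inherits
    the label of the source image."""
    h, w, _ = volume.shape
    patches = [volume[i:i + size, j:j + size, :]
               for i in range(0, h - size + 1, stride)
               for j in range(0, w - size + 1, stride)]
    return np.stack(patches)
```

With a non-overlapping stride of 60, a single 512 × 512 × 16 image yields 8 × 8 = 64 labeled 60 × 60 × 16 examples; a smaller stride would produce overlapping windows and many more examples per image.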

Results

In this section, we evaluate the performance of our CRC tissue classification method. Since this method uses segmentation as a preprocessing step, we first assess the ability of the proposed segmentation algorithm to extract regions of interest corresponding to CRC tissues. Table 1 gives the average performance of the three tested segmentation methods (Otsu's thresholding, edge detection, and active contour) on tissue samples corresponding to BH, IN, and Ca. We observe that our active contour method outperforms the other two approaches for all performance metrics and tissue types (P < 0.01 in a paired t-test). With respect to tissue types, the best performance of our method is obtained for Ca, with a JSC of 0.86, compared to 0.80 and 0.82 for BH and IN, respectively. Examples of segmentation results for each tissue type are shown in Figure 4. We see that the active contour method finds more consistent regions that better delineate the CRC tissues in the image.
Table 1

Average performance obtained by three tissue segmentation methods on BH, IN and Ca tissue samples

Figure 4

Examples of results obtained by the segmentation methods for the benign hyperplasia, intraepithelial neoplasia, and carcinoma tissue types. (a) Original image, (b) Otsu's thresholding, (c) edge detection, (d) active contour

Classification results are summarized in Table 2, which reports the accuracy obtained on test-set examples by our proposed CNN model, with and without the image segmentation step. For comparison, we also report accuracy values obtained by various tissue classification approaches on the same data. These approaches are categorized according to the type of texture or shape features used (e.g., gray-level co-occurrence matrix [GLCM] and statistical moments), the classifier model (e.g., SVM and nearest-neighbors), and whether image segmentation is required. Note that these approaches quantify a region (segmented or the whole image) with generic features and use these features as input to a classifier. In contrast, our proposed CNN method learns the features from training data, providing a better representation of the different CRC tissue types.
Table 2

Comparison of tissue classification methods on the same data

From these results, we observe that extracting regions of interest through segmentation enhances the accuracy of our method: while an accuracy of 79.23% is obtained without segmentation, the accuracy reaches 99.17% on presegmented images. Although other factors may contribute, extracting regions corresponding to CRC tissues provides more discriminative examples for training the CNN. In comparison to other tissue classification approaches, our CNN method with segmentation provides the highest accuracy (i.e., 99.17% vs. 98.92% for Chaddad et al.). Although relatively small, such an improvement in accuracy can have a significant impact considering that the problem is cancer detection. To illustrate the convergence of the parameter optimization phase (i.e., stochastic gradient optimization), Figure 5 shows the mean squared error (MSE) measured after each training epoch, corresponding to the mean squared difference between the network output Y′ᵢ and the target label vector Yᵢ:

MSE = (1/N) Σᵢ ‖Y′ᵢ − Yᵢ‖².
Figure 5

Variation of mean squared error across training epochs

We see that the optimization converges after 500 epochs and that the MSE upon convergence is nearly 0. This shows that the proposed architecture is complex enough to learn a discriminative representation of tissue types and that the learning rate is adequate. In practice, the best number of epochs is selected based on the validation accuracy. On the eight segmented images tested, the final MSE and accuracy values were 0.0001 and 99.168%, respectively.
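The MSE reported here can be computed from the network outputs and one-hot target vectors as in this sketch (averaging over examples is one common convention; the paper does not specify its exact normalization):

```python
import numpy as np

def mse(outputs, targets):
    """Mean squared error between network outputs Y' and one-hot targets Y,
    averaged over the N examples."""
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(np.mean(np.sum((outputs - targets) ** 2, axis=1)))
```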

Discussion

The CRC tissue classification approaches presented in Table 2 are based on the assumption that such tissues can be effectively described using generic texture or morphological features.[4,7,9,29,30] For example, Chaddad et al. used texture features derived from GLCMs, discrete wavelet transforms, and Laplacian of Gaussian filters, computed on presegmented regions, to classify the same images with an accuracy of 98.92%.[9] Likewise, Peyret et al. computed texture features such as local binary and local intensity order patterns on unsegmented images, obtaining an accuracy of 91.3%.[29] In contrast, we proposed a data-driven method, based on CNNs, to learn an optimal representation of tissues from training data. Our experiments showed that this method outperforms existing approaches, even with a small number of tissue samples, achieving an accuracy of 99.17%. Results have also shown the usefulness of presegmented images, which significantly improve the accuracy by focusing computation on relevant tissue regions within the image. While results are promising, this study also has several limitations. First, it is based on a single small cohort of thirty patients. Having a larger set of biopsy images from different patients would help capture the full variability of tissues in the progression of CRC. Moreover, to obtain an optimal accuracy, our method currently requires the pathologist to select the region of interest from a segmented image. To have a fully automated pipeline, this step should be replaced by a supervised learning model which would determine the region of interest from training data.

Conclusions

We have presented a method for the classification of CRC tissues from multispectral biopsy images, based on active contour segmentation and CNNs. Unlike traditional approaches, which extract generic texture or shape features from the image, our method learns a discriminative representation directly from the data. Experiments on multispectral images of thirty patients show our method to outperform traditional approaches when using presegmented images. In future work, we will extend this study by including a larger number of patients and using a fully automated segmentation step.

Financial support and sponsorship

All the work of this research project was supported by the University of Lorraine.

Conflicts of interest

There are no conflicts of interest.
References (14 in total)

Review 1.  Current methods in medical image segmentation.

Authors:  D L Pham; C Xu; J L Prince
Journal:  Annu Rev Biomed Eng       Date:  2000       Impact factor: 9.590

Review 2.  Classification of colorectal cancer based on correlation of clinical, morphological and molecular features.

Authors:  J R Jass
Journal:  Histopathology       Date:  2007-01       Impact factor: 5.087

3.  Artificial convolution neural network techniques and applications for lung nodule detection.

Authors:  S B Lo; S A Lou; J S Lin; M T Freedman; M V Chien; S K Mun
Journal:  IEEE Trans Med Imaging       Date:  1995       Impact factor: 10.048

4.  Classification of mass and normal breast tissue: a convolution neural network classifier with spatial domain and texture images.

Authors:  B Sahiner; H P Chan; N Petrick; D Wei; M A Helvie; D D Adler; M M Goodsitt
Journal:  IEEE Trans Med Imaging       Date:  1996       Impact factor: 10.048

5.  Multi Texture Analysis of Colorectal Cancer Continuum Using Multispectral Imagery.

Authors:  Ahmad Chaddad; Christian Desrosiers; Ahmed Bouridane; Matthew Toews; Lama Hassan; Camel Tanougast
Journal:  PLoS One       Date:  2016-02-22       Impact factor: 3.240

6.  Ensemble classification of colon biopsy images based on information rich hybrid features.

Authors:  Saima Rathore; Mutawarra Hussain; Muhammad Aksam Iftikhar; Abdul Jalil
Journal:  Comput Biol Med       Date:  2014-01-08       Impact factor: 4.589

7.  Texture analysis for colorectal tumour biopsies using multispectral imagery.

Authors:  Remy Peyret; Ahmed Bouridane; Somaya Ali Al-Maadeed; Suchithra Kunhoth; Fouad Khelifi
Journal:  Conf Proc IEEE Eng Med Biol Soc       Date:  2015-08

8.  Measurement of the false positive rate in a screening program for human immunodeficiency virus infections.

Authors:  D S Burke; J F Brundage; R R Redfield; J J Damato; C A Schable; P Putman; R Visintine; H I Kim
Journal:  N Engl J Med       Date:  1988-10-13       Impact factor: 91.245

9.  Feature extraction and pattern classification of colorectal polyps in colonoscopic imaging.

Authors:  Jachih J C Fu; Ya-Wen Yu; Hong-Mau Lin; Jyh-Wen Chai; Clayton Chi-Chang Chen
Journal:  Comput Med Imaging Graph       Date:  2014-01-02       Impact factor: 4.790

Review 10.  Reducing inequities in colorectal cancer screening in North America.

Authors:  Kathleen M Decker; Harminder Singh
Journal:  J Carcinog       Date:  2014-11-14
