Literature DB >> 25474260

Staining pattern classification of antinuclear autoantibodies based on block segmentation in indirect immunofluorescence images.

Jiaqian Li¹, Kuo-Kun Tseng¹, Zu Yi Hsieh², Ching Wen Yang³, Huang-Nan Huang⁴.

Abstract

Indirect immunofluorescence based on HEp-2 cell substrate is the most commonly used staining method for antinuclear autoantibodies associated with different types of autoimmune pathologies. The aim of this paper is to design an automatic system to identify the staining patterns based on block segmentation compared to the cell segmentation most used in previous research. Various feature descriptors and classifiers are tested and compared in the classification of the staining pattern of blocks and it is found that the technique of the combination of the local binary pattern and the k-nearest neighbor algorithm achieve the best performance. Relying on the results of block pattern classification, experiments on the whole images show that classifier fusion rules are able to identify the staining patterns of the whole well (specimen image) with a total accuracy of about 94.62%.

Entities: CellLine Chemical Disease Gene Species

Mesh：

Substances：
Antibodies, Antinuclear

Year: 2014 PMID： 25474260 PMCID： PMC4256175 DOI： 10.1371/journal.pone.0113132

Source DB: PubMed Journal: PLoS One ISSN： 1932-6203 Impact factor: 3.240

Introduction

Autoimmune diseases, such as rheumatoid arthritis, primary biliary cirrhosis and dermatomyositis, are individually rare in contrast with other kinds of diseases, but together they affect the health of many people worldwide. They are a fascinating but poorly understood group of diseases [1]. Antinuclear autoantibodies are a serological hallmark of most autoimmune diseases, and serve as diagnostic biomarkers and classification criteria for a number of these diseases [2]. Although the role of autoantibodies is still not clear, growing evidence shows that most autoimmune diseases are confirmed to be in connection with the occurrence of specific auto-antibodies, such as primary biliary cirrhosis [3]. However, antinuclear antibodies are also detectable in approximately 50% of subjects with primary biliary cirrhosis. Several ANAs are associated with primary biliary cirrhosis, so the connection of a specific ANA to the pathogenesis of primary biliary cirrhosis is not known [3]. This demonstrates that the relationship between autoimmune diseases and autoantibodies is not a single correspondence. Although there are many tests for the detection of ANAs, such as indirect immunofluorescence (IIF) and enzyme-linked immunosorbent assay (ELISA), IIF based on HEp-2 cell substrate during the serological hallmark is the most commonly used staining method for antinuclear autoantibodies. Usually, the immunofluorescence patterns are manually identified by the physician visually inspecting the slides under a microscope. Since IIF diagnosis requires both the estimation of fluorescence intensity and the description of staining patterns, adequately trained persons are not always available for these tasks, so this procedure still needs highly specialized and experienced physicians to make the diagnoses. As ANA testing becomes more used in clinics, an automatic inspection system for pattern categories is in great demand [4]. Before the classification of staining patterns, relevant patterns (see Figure 1) related to the most recurrent ANAs should be considered [5], [6] in the experimental dataset.

Figure 1

ANA patterns in the experimental dataset: (a) coarse speckled (b) fine speckled (c) nucleolar (d) peripheral.

Coarse Speckled: this pattern is characterized by coarse granular nuclear staining of the interphase cell nuclei; Fine Speckled: this pattern is characterized by fine granular nuclear staining of the interphase cell nuclei; Peripheral: this group is characterized by solid staining, primarily around the outer region of the nucleus, with weaker staining toward the centre of the nucleus; Nucleolar: this pattern is characterized by large coarse speckled staining within the nucleus, less than six in number per cell. The aim of this paper is to design an automatic system with a two-layer classification model, block pattern recognition and well pattern recognition, to identify the staining patterns of the whole well based on block segmentation. In particular, the following points will be investigated in the present study: Block segmentation. In contrast to the previous cell segmentation used for ANA classification, block segmentation is significantly easier to implement and more applicable due to the erroneous conditions of cell segmentation. Block pattern classification. Various image features (local binary pattern (LBP), linear discrimination analysis (LDA), scale-invariant feature transform (SIFT) and grey-level co-occurrence matrix (GLCM) and classifiers K-nearest neighbour (KNN), Back Propagation Neural Network (BPNN) and support vector machine (SVM) are compared in this step to seek the best characteristic and classifier for ANA classification. Well pattern classification. Based on the results of the block pattern classification, classifier fusion rules are used to identify the staining patterns of the whole well. Meanwhile, a kind of cell pattern classification is regarded as the control group. The rest of this paper comprises four parts. In Section 2, we introduce some related studies on ANA patterns including segmentation, feature extraction and classification. Section 3 presents the proposed method consisting of four steps: block segmentation, feature extraction, block pattern classification and well pattern classification. Section 4 provides the experimental results and comparison. Finally Section 5 is the conclusion and discussion.

Related Studies

2.1 Image Segmentation

The previous research on ANA image segmentation has mainly focused on cell segmentation and the criteria for recognition of cell patterns, but a more applicable method of block segmentation for ANA pattern classification has so far not been developed. Many competitions and conferences research cell classification and cell segmentation, for example, the competition on cell classification by fluorescent image analysis hosted by the 20th IEEE International Conference on Image Processing (ICIP) and The 1st Workshop on Pattern Recognition Techniques for Indirect Immunofluorescence Images. Creemers et al. [7] repeatedly used image processing techniques, including morphological opening and Otsu thresholding, to cut out the needed region of interest. It was found that this method has the capability to split connected regions into individual cells with an average accuracy of 89.57%. Huang et al. [8] proposed an efficient method for automatically detecting the outlines of fluorescent cells in IIF images. This method first classified an IIF image into two cases, sparse and mass cell regions, based on the number of connected regions in an image. Depending on the cell types of the images, different colour spaces and processing techniques were adopted for cell segmentation. For images with sparse region cells, HSB colour space, anisotropic diffusion, canny edge detection and morphological smoothing are applied sequentially to detect the cell outline, while for images with mass region cells, CMY colour space, anisotropic diffusion, Otsu's thresholding and morphological processing are used. Hsieh et al. and Huang et al. [9], [10] also presented a reliable region-based method of two-staged watershed segmentation to solve a wide range of difficult problems of ANA image segmentation, i.e. over-segmentation and sensitivity to noise and contrast in the image. Region merging and region elimination were utilized for the first stage watershed algorithm [11] to obtain the cell boundaries and in the second stage the similarity-based watershed algorithm acted as the marker to prevent over-segmentation. It was proved that the segmentation performance achieved an overall sensitivity of 94.7%.

2.2 Image Feature Extraction

Numerous features utilized in ANA pattern classification were investigated, including texture features and shape features, as shown in Figure 2. Since the same object may have a variety of different colours but a similar shape, many queries may arise as to the shape of the image instead of the colour of the image. There are two methods of presenting shape features: contour feature and regional characteristics. However, shape features lack a model, and have high computation and storage requirements. In [12], the shape measurement of a single feature vector, with greater weight by far given to texture, is used to identify the cytoplasmatic class and the shape feature (calculated as the area divided by the square of the perimeter) is able to recognize most samples of this category based on a single parameter. In [4], four shape features, area, perimeter, inside area and perimeter area, in the feature vector are utilized as the inputs for a self-organizing map (SOM) model to determine the similarity of the cells.

Figure 2

ANA pattern classification methods: segmentation, feature extraction and classification.

Texture feature is the most commonly used feature for pattern classification, describing the surface nature of the scene corresponding to a specified image or image area. Texture feature is not the sort of feature based on pixels, which need statistical calculations of more than one pixel belonging to the region. As a statistical feature, texture feature, often with rotation-invariant characteristics, has a strong capability to resist noise. However, it also has its drawbacks, and one obvious drawback is that changing the image resolution, may result in larger deviations in the calculated texture feature. Moreover, the feature may sometimes be affected by light and reflection. Texture feature extraction methods can be divided into statistical methods, structure methods and spectrum methods. Giulio et al. [5] used the well-known SIFT [13], [14] descriptor to extract concise and informative local characteristic from HEp-2 images. The SIFT algorithm has proved to be one of the most effective in the object-recognition field because of its invariance to common image transformations, illumination changes and noise. Discrete cosine transform (DCT) [15] is always used to extract relevant textural information for image compression and classification. In [16], 328 DCT coefficients, which represent different patterns of image variation and directional information of the texture, are calculated through two-dimensional DCT of the normalized images. In [17], 48 DCT features, including the DC component, mean value and standard variance, are extracted for HEp-2 cell pattern classification. The LBP descriptor [18]–[20] is a robust and computationally efficient means of texture description, which derives from a general definition of texture in a local neighbourhood, incorporates both statistical and structural information and has shown effectiveness in many applications. Kuan et al. [17] extracted 42 features from each HEp-2 cell image through the multi-resolution LBP descriptors. Ersoy et al. [21] used a uniform rotation-invariant LBP consisting of 18 unique patterns for HEp-2 cell classification. GLCM [22] is a powerful technique that extracts texture characteristics from the spatial relationship among intensity values at specified offsets and reports the distribution of co-occurring values among local pixels based on different distances and angles. In [23], only four GLCM features (intensity, standard deviation, entropy and range) are calculated as a part of the final feature vector, while in [16], a total number of 44 features, represented by the mean and the range value over the 22 statistical measures (e.g. correlation, cluster prominence, cluster shade, energy, entropy, variance, homogeneity, maximum probability, etc. [22]), are extracted from four GLCMs. There are many other texture features used for HEp-2 cell classification, such as Gabor transform [17], histogram of oriented gradients (HOG) [24], independent component analysis (ICA) [25], codebook based descriptors [26], speeded-up robust features (SURF) [24], grey-level size zone matrix (GLSZM) [27] etc. However, most studies combine several of the image features mentioned above into a feature vector to recognize the patterns of the HEp-2 cell instead of using a single characteristic since a combination of several features is able to extract more image information on texture, shape and space than a single feature. In [24], Ghosh and Chaudhary. explored various features like SURF in a bag of words (BoW) model, texture-based features from the GLCMs and region of interest (ROI)-based features and HOG features, using one or several of them to create a composite feature vector to investigate the performance of a classification based on various features. A total of 372 features containing 44 GLCM features and 328 DCT coefficients in [16] were used to characterize each HEp-2 image. Moreover, in [9] a total of eight practical features (standard deviation, uniformity/entropy, block variation of local correlation coefficients, spatial grey-level dependence matrices, grey-level difference matrix, neighbourhood grey-tone difference matrix, fractal dimension and image coarse degrees) obtained from an IIF cell image were utilized to identify fluorescence patterns.

2.3 ANA Pattern Classification

In the past decade, there have been many studies on the detection of ANA patterns and many classification methods, mainly including the KNN algorithm [28], [29], artificial neural networks (ANNs), expert systems (ESs) and SVM etc., have been utilized for pattern recognition in HEp-2 cells and have achieved positive final performance. Multi-class SVMs with different kernels were investigated and used in [12], [17], [24]. Soda has always used multi-expert systems (MESs) [30]–[33] to explore the problem of HEp-2 cell pattern classification. Cordelli and Soda [34] test four popular classifiers belonging to different paradigms: a multi-layer perceptron (MLP) as a neural network, a KNN as a statistical classifier, an SVM as a kernel machine, and AdaBoost as an ensemble of classifiers. In [31], the authors selected two ANN-based classifiers based on the MLPs and the radial basis function (RBF) network architecture to separate intrinsically dubious samples whose error tolerance can be flexibly set. In [9], learning vector quantization (LVQ), which is a prototype-based supervised classification algorithm and can be understood as a special case of an ANN, utilized the normalized feature vector to differentiate the autoantibody fluorescence patterns.

Methods

The proposed classification architecture consists of four sequential steps: block segmentation, feature extraction, block pattern classification and well pattern classification. To our knowledge, block segmentation has never been used to process HEp-2 cell images. Various combinations of features and classifiers are utilized to identify the patterns of the blocks to explore the best features and classifiers suitable for this application. But not all combinations are used in the block pattern classification; for example, the VLFeat Package has its own classifier for the SIFT feature, so we just use this combination. As to the fusion rules, the weighted sum rule (WSR) is only defined in the KNN classifier, so in other classifiers, we just aggregate the block patterns to classify the staining pattern of the specimen image with WR and weighted majority rule (WMR) rules.

3.1 Block Segmentation

As mentioned above, the previous research into HEp-2 cell image segmentation mainly concentrated on cell segmentation, which also has drawbacks affecting the final sensitivity of the segmentation. For example, Otsu's thresholding method can choose the threshold to minimize the intra-class variance of the black and white pixels automatically, but due to the variety of ANA patterns, Otsu's algorithm always failed to segment cells of discrete speckled and nucleolar patterns and resulted in over-segmentation [8]. Even though two-stage watershed segmentation [10] uses two watershed transformations to avoid over-segmentation, it may occur in generating erroneous outlines of IIF cells because of noise and speckles in IIF images. Since the two-stage watershed segmentation uses a great number of morphological techniques, including pre-processing, Otsu's thresholding, region merging and region elimination, in the expectation of better segmentation performance, its time complexity and space complexity significantly enlarge in contrast to Otsu's segmentation method. The methods [8], [10], [35] proposed in the previous works for the segmentation utilized various techniques to eliminate over-segmentation and overlap problems, which have no effects on the performance of block segmentation although there are overlap areas between different blocks. Block segmentation is much easier to implement than cell segmentation and does not have the same problems as cell segmentation. As is shown in Figure 3, first the RGB image is converted into a binary image and morphological erosion with a disk mask is performed; then the connected regions, which determine the position of candidate blocks, are located. The centre of the connected region is regarded as the centre of the block with a fixed size, such as and (the set depending on the size of the well image). The centre of the connected region is defined as (Figure 4a)where denotes the location of the centre and and denote the maximum and minimum x axes of the connected region. Similarly, and are the maximum and minimum y axes of the connected region.

Figure 3

Flowchart of block segmentation.

Figure 4

Overlap in the block segmentation: (a) calculate the centre of the connected regions; (b) overlap problem; (c) the overlapping rates with the size increasing.

Sometimes, the overlapping area occupies a large part of the total area, here using to present the rate of overlap areas between two blockswhere denotes the area of the overlapping area, denotes the area of one block and denotes the area of another block overlapping with the former block as shown in Figure 4b. Depending on the overlapping rates, the block with the maximum overlapping relationships is removed first and then the second maximum, then the third, and so on, until there is no overlapping rate which is larger than the threshold. Here the overlapping relationship means the mapping from one block to another if they have overlaps. A block may overlap with many other blocks, so the block with the most overlapping relationships is removed first. The preliminary experiments show that the average overlapping rate of two blocks increases as the block enlarges (Figure 4c). In order to decrease the number of blocks separated from an image and the influence of block overlap, one of the two overlapping blocks will be removed if the overlap rate is larger than 0.5. But if the block is too small, for example, and (Figure 4b), then most of the blocks are retained resulting in enormous experimental complexity and the blocks are too large to obtain sufficient blocks with a low overlapping rate, e.g. the average overlapping rate exceeds 80% when the size is or . Moreover, the investigation and experiments demonstrate that the block of is the most suitable for block segmentation and classification. The overlap problem in block segmentation has no effect on the final well pattern classification as the blocks with overlapping relationships are all either in the training set or in the test set.

3.2 Feature Extraction

In this section, in total four practical features, LBP, SIFT, LDA and GLCM, are solely explored to identify fluorescence patterns. These features are briefly described as follows.

LBP Features

The original LBP operator, introduced by Ojala et al. [36], is a powerful means of texture description. Here, we use the notation for the LBP operator (Figure 5a, 5b, 5c). The subscript represents using the operator in a (P, R) neighbourhood. A histogram of the labelled image can be defined as:where n is the number of different labels produced by the LBP operator and

Figure 5

LBP descriptors: (a) ; (b) ; (c) ; (d) division for retaining spatial information.

This histogram contains information about the distribution of the local micro-patterns, such as edges, spots and flat areas, over the whole image. For efficient image feature representation, spatial information should be retained. So the image is divided into regions (see Figure 5d, here m = 16) and the spatially enhanced histogram is defined as [18] In this experiment, the best LBP descriptor is , and the dimension of the feature vector is 256.

SIFT Features

Lowe [13] summed up the existing feature detection method based on invariants technology in 2004, and formally proposed the SIFT algorithm invariant to common image transformations (image scaling, rotation, even affine transformation), illumination changes and noise. The SIFT algorithm first undertakes feature detection in scale space and defines the key point positions and the scale of the key points, and then uses the main direction of the neighbourhood gradient of the key points as the direction features of the points in order to achieve the operator independent of scale and direction. The MATALB Code S1 of SIFT we used is from http://www.vlfeat.org/index.html [37]. Different images have feature vectors with different dimensions, but each element has a direction parameter with 128 dimensions.

LDA Features

Principal component analysis (PCA) and LDA [38]–[40] are two powerful tools used for dimensionality reduction and feature extraction in most pattern recognition applications. Due to the number of blocks being so numerous that pattern classification only based on PCA characteristics may waste too much time, LDA features with five dimensions based on PCA characteristics are utilized to identify HEp-2 cells, achieving better performance than that only using PCA.

GLCM Features

GLCM [22] is a powerful technique that extracts texture characteristics from the spatial relationship between intensity values at specified offsets and reports the distribution of co-occurring values between local pixels based on different distances and angles. Here we extract 44 features, represented by the mean and the range value over the four GLCMs for each of the 22 statistical measures (e.g. correlation, cluster prominence, cluster shade, energy, entropy, variance, homogeneity and maximum probability, etc. [22]) in [16].

3.3 Block Pattern Classification

In this procedure, three commonly used classifiers, i.e. KNN using Euclidean distance, common BPNN with sigmoid units and SVM with linear kernel function, are used with different features for block recognition. Several different patterns may appear in a single image, but the segmentation method proposed here only considers images with a unique pattern, which implies that blocks separated from an image are all marked with the pattern of the same well image. The KNN classification algorithm, presented by Cover and Hart [28] in 1967, is a more mature approach in theory, but also one of the simplest machine learning algorithms. This decision rule provides a simple non-parametric procedure for the assignment of a class label to the input pattern based on the class labels represented by the k-closest neighbours of the vectors. BPNN learning methods provide a robust approach to approximating real-valued, discrete-valued, and vector-valued target functions. For certain types of problems, such as learning to interpret complex real-world sensor data, ANNs are among the most effective learning methods currently known. SVM is a powerful machine learning method successfully used in many applications and the classification is based on the implicit mapping of data to a higher dimensional space via a kernel function and on the identification of the maximum-margin hyperplane that separates the given training instances in this high-dimensional space [16].

3.4 Well Pattern Classification

To classify the screening patterns of the whole image into one of the basic classes mentioned in Section 1, first blocks should be segmented from the well image and then the set of features extracted; second the staining patterns of blocks labelled by the pattern of the original image are classified, and finally the staining pattern of the whole well is distinguished based on the results of the classification of its cells (Figure 6).

Figure 6

Architecture of well pattern classification.

In fact, such an approach based on classification of individual blocks cannot detect the occurrence of multiple patterns since there may be cells with different patterns in a block marked with one pattern. But it is acceptable that most cells in a block belong to the class of the block. Furthermore, this approach is greatly tolerant and robust to misclassifications in block recognition since the final label of the whole image is aggregated by the classification information of all the blocks segmented from the image. Indeed, if enough blocks per well are available, it is reasonable that block misclassification, if limited, does not affect the final well pattern classification. Typical fusion techniques, including majority rule (MR), WMR [41] and WSR [42], [43] (see Figure 7), will be used in this section to combine the results of block recognition. However, a critical point of these fusion rules is that different blocks belonging to the same well should be included in either the training set or the testing set, which guarantees that the final well pattern is determined by all the blocks belonging to this well image. So we randomly subdivided all the well images into two equal partitions and different blocks belonging to the same well were all in one partition. In the following, we briefly describe these fusion rules.

Figure 7

Method flow in the classification architecture.

First, a conceptual formula [43] is given as follows:in which indicates the weighted parameter of the input block b in the block set B and is defined as follows: The index of the final class of the well staining pattern is the class for which is maximum. If , the rule is MR; if indicates the number of cells in block b, the fusion rule is defined as WMR and if indicates the classification reliability of the input block b, this rule becomes WSR, which is only used in the KNN classification. The classification reliability for the KNN [42] classifier is given bywhere denotes the smallest distance of b from a reference sample belonging to the same pattern of b; implies the highest among the values of obtained for samples in a set disjointed from the reference set and the test set and is the distance between x and the reference sample with the second smallest distance from x among all the reference set samples belonging to a class which is different from that of .

Experimental Results

Note that the proposed method to identify the staining pattern of the HEp-2 cell image here only considers images with a unique staining pattern; implying that blocks separated from an image are all marked with the pattern of the same well image. Not only the direct classification of the whole image, but also the staining pattern classification of the well image, based on cell segmentation as the control group, abides by this principle. Therefore, block segmentation is equal to cell segmentation in the problem to be solved.

4.1 Dataset

In this study, the IIF images were collected based on HEp-2 substrate at a serum dilution of 1∶80. A physician takes images of slides with an acquisition unit consisting of the fluorescence microscope coupled with a commonly used fluorescence microscope (Axioskop 2, CarlZeiss, Jena, Germany) at 640-fold magnification. The IIF images were taken by an operator with a colour digital camera (E-330, Olympus, Tokyo, Japan). The digitized images were of 8-bit photometric resolution for each RGB colour channel with a resolution of 3136×2352 pixels [9]. This image database contains 260 samples belonging to four different patterns, i.e. coarse speckled (CS), fine speckled (FS), nucleolar (NU) and peripheral (PE). The number of samples in each pattern were 167 (CS), 20 (FS), 38 (NU) and 35 (PE), and the odd-numbered half of them were selected to belong to the training set, and the remainder were the test set (Table 1). If ANA testing detects any of the four patterns, the patients may have specific systemic autoimmune diseases. For example, if the test detected the CS pattern, the patients may have systemic lupus erythematosus (SLE), mixed connective tissue disease (MCTD), progressive systemic sclerosis (PSS) or cryoglobulinemia. Experiments have shown that the best features for ANA classification are features, which are shown in Figure 8.

Table 1

The samples in the training set and in the testing set.

IIF patterns	Samples in the training set	Samples in the testing set
Coarse speckled	84	83
Fine speckled	10	10
Nucleolar	19	19
Peripheral	17	18
Total	130	130

Figure 8

Four staining patterns and the corresponding descriptors: (a) coarse speckled (CS); (b) fine speckled (FS); (c) nucleolar (NU); (d) peripheral (PE).

The dataset is from the third party, Taichung Veterans General Hospital. The data is available upon request to the corresponding author. Moreover, the Code S1 of this experiment is uploaded in the online version for the convenience of testing our proposed method on other datasets, please refer to Supporting Information: Instruction S1.

4.2 Direct Whole Image Pattern Classification

LBP descriptors of the HEp-2 images without cell segmentation and block segmentation were directly extracted, and then classified on the test set. It was found that the KNN classifier with the descriptor just achieved the best performance with a total accuracy of 83.08%, as depicted in Table 2; in particular, only half of the samples with peripheral patterns were classified into the right class, demonstrating that global classification is not applicable in some patterns.

Table 2

Classification results based on the whole image.

	CS	FS	NU	PE	Accuracy
CS	75	0	1	7	90.36%
FS	0	9	0	1	90.00%
NU	3	0	15	1	78.95%
PE	8	0	1	9	50.00%
Total					83.08%

4.3 Classification Based on Cell Segmentation

HEp-2 cell images were separated by Otsu's thresholding method and all cells divided from an image belonging to the training set or the test set were still regarded as the training set or the test set respectively. Subsequently, various combinations of classifiers and features were applied to the HEp-2 cells dataset, and then suitable fusion rules were used to aggregate the results of cell pattern classification into well pattern classification. The experiments showed that when utilizing a combination of KNN, LBP and MR, the total accuracy of the four distinct patterns is 90.77% (see Table 3), which is better than that in the direct classification. Compared with direct classification, the accuracy of the CS pattern increased to 100%, while that of the FS and NU patterns slightly decreased.

Table 3

Classification results based on cell segmentation with KNN classifier, LBP feature and Majority Rule.

	CS	FS	NU	PE	Accuracy
CS	83	0	0	0	100.0%
FS	2	5	3	0	70.00%
NU	3	0	14	2	73.68%
PE	2	0	0	16	77.78%
Total					90.77%

4.4 Classification Based on Block Segmentation

To better illustrate the advantages of block segmentation, such comparisons are described as follows: (a) the block segmentation considered is significantly different from cell segmentation, that is, the complexity of block segmentation is significantly lower than that of cell segmentation with numerous morphological techniques, and the block segmentation method just depends on connected regions; (b) in contrast to direct whole image classification, the classification based on block segmentation has a robust tolerance to misclassifications in the block recognition since the final label of the whole image is aggregated by classification information of all blocks segmented from the image. However, that the size of the block requires a great number of explorations to determine, since there has been no regularity so far, is a problem. In this experiment, various combinations of classifier, feature and fusion rule were utilized to evaluate the performance of the staining pattern recognition of the HEp-2 cell image. Figure 9 presents the accuracies of 10 combinations mainly focusing on the LBP feature and KNN classifier, with some passive combinations omitted, such as LDA feature and BPNN classifier, LDA feature and SVM classifier etc. LBP+BPNN+MR and LBP+KNN+WSR achieve the same accuracy, 94.62%, and the classification results based on the LBP feature is more positive than other features with a maximum accuracy of 76.15% using GLCM+KNN+MR (Figure 9). This indicates that the LBP feature is the most suitable characteristic to identify ANA patterns.

Figure 9

Accuracies of different combinations of classifier, feature and fusion rule: from right to left sequentially GLCM+SVM+WMR, GLCM+KNN+MR, LBP+BPNN+MR, LBP+BPNN+WMR, LBP+KNN+MR, LBP+KNN+WMR, LBP+KNN+WSR, LDA+KNN+MR, SIFT(vlfeat)+MR and SIFT(vlfeat)+WMR.

Two methods have the same accuracy, LBP+BPNN+MR and LBP+KNN+WSR, but their individual results for cell pattern classification are different. The latter, with a total accuracy of 82.21%, is slightly better than the former with 79.95%, as shown in Table 4. We used different fusion rules to aggregate the different classification results of the block pattern, but achieved the same positive performance, demonstrating the robust tolerance to misclassifications of well pattern classification based on the results of block pattern classification.

Table 4

Classification results of block pattern classification of LBP+KNN and LBP+BPNN.

Block patterns	Ntrain	Ntest	Nc (LBP+BPNN)	Nc (LBP+KNN)
CS	1714	1584	1417(89.46%)	1458(92.05%)
FS	130	133	106(79.70%)	127(95.49%)
NU	419	440	239(54.32%)	321(72.95%)
PE	373	401	283(70.57%)	197(49.13%)
Total	2636	2558	79.95%	82.21%

Ntrain: number of blocks in the training set.

Ntest: number of blocks in the testing set.

Nc: number of correct classification of blocks.

Ntrain: number of blocks in the training set. Ntest: number of blocks in the testing set. Nc: number of correct classification of blocks. Even though only approximately half of the blocks marked with the peripheral pattern are correctly distinguished (Table 4) in the block pattern classification with the LBP characteristic and KNN classifier, there are still 12 samples among the test set of well images (18 samples) correctly classified (Table 5). Even though the accuracy of block pattern classification with the LBP feature and BPNN classifier is no more than 80%, the final well pattern classification based on it achieved positive performance with a total accuracy of 94.62% (Figure 9). In contrast with the classification based on cell segmentation (Table 3), the accuracies of FS and NU patterns both in well classification based on block segmentation reached 100% while those of CS and PE patterns slightly decreased. Consequently, the performance of well classification based on block segmentation is a little better than that based on cell segmentation.

Table 5

Classification results based on block segmentation with KNN classifier, LBP feature and Weighted Sum Rule.

	CS	FS	NU	PE	Accuracy
CS	82	0	1	0	98.80%
FS	0	10	0	0	100.0%
NU	0	0	19	0	100.0%
PE	5	0	1	12	66.67%
Total					94.62%

Moreover, mean class accuracy (MCA) is commonly used as the evaluation criteria in cell level classification. So here we use it to measure the performance of cell segmentation and block segmentation under the same circumstance, that is, feature, classifier and fusion rule. MCA can be defined as follows:where is the correct classification rate for class k and K is equal to the number of classes. Table 6 reports the classification results of the proposed approach and some previous methods. The result of the classification based on block segmentation with MCA of 91.37% is significantly better than that of the others. Furthermore, some previous approaches, such as HOG and SVM, GLCM and SVM, are distinctly inapplicable in this dataset, achieving passive and biased accuracies.

Table 6

Comparison of direct whole image classification, classification based on cell segmentation and classification based on block segmentation.

	Total accuracy	Mean class accuracy
Direct whole image classification	83.08%	77.33%
Cordelli E [34] (LBP and KNN) and WSR	90.77%	78.14%
Cordelli E [34] (LBP and KNN) and MR	90.00%	79.03%
Cordelli E [34] (LBP and SVM) and MR	63.85%	25.00%
Ghosh S [24] (Texture feature, HOG and SVM) and MR	58.46%	22.89%
Di Cataldo S [16] (GLCM and SVM) and MR	59.23%	23.19%
Classification based on block segmentation	94.62%	91.37%

Conclusion and Discussion

In this study, in contrast to cell segmentation a new block segmentation method never used was proposed to process the original HEp-2 images and then classification of the block patterns was undertaken based on various selected features (GLCM, LBP, SIFT and LDA) and classifiers (KNN, BPNN and SVM), commonly used in the previous studies of cell pattern classification. Subsequently, fusion rules (MR, WMR, and WSR) were used to aggregate the results of the block pattern classification to identify the staining patterns of the whole images. The results show that the proposed method can classify the well images correctly with an accuracy of 94.62% depending on the combination of LBP, KNN and MSR or the combination of LBP, BPNN and MR, which is better than pattern classification with a total accuracy of 90.77% based on cell segmentation and direct whole image classification with a total accuracy of 83.08%. The block segmentation considered is significantly different from cell segmentation, that is, the complexity of block segmentation is significantly lower than that of cell segmentation with numerous morphological techniques, and the block segmentation method just depends on connected regions. In contrast to direct whole image classification, the classification based on block segmentation has a robust tolerance to misclassifications in block recognition since the final label of the whole image is aggregated by classification information of all the blocks segmented from the image. However, that the size of the block requires a great number of explorations to determine, since there has been no regularity so far, is a problem. If the block size is too large, block segmentation will waste too much memory compared with cell segmentation. Moreover, well pattern classification based on the classification of individual blocks cannot detect the occurrence of multiple patterns since there may be cells with different patterns in a block marked with one pattern. However, this approach is greatly tolerant and robust to misclassifications in block recognition. If enough blocks per well are available, it is reasonable that block misclassification does not affect the final well pattern classification. (PDF) Click here for additional data file. (ZIP) Click here for additional data file.

4 in total

Review 1. Autoimmune diseases.

Authors: A Davidson; B Diamond
Journal: N Engl J Med Date: 2001-08-02 Impact factor: 91.245

1 in total

1. Correction: Staining Pattern Classification of Antinuclear Autoantibodies Based on Block Segmentation in Indirect Immunofluorescence Images.

Authors: Jiaqian Li; Kuo-Kun Tseng; Zu Yi Hsieh; Ching Wen Yang; Huang-Nan Huang
Journal: PLoS One Date: 2020-07-29 Impact factor: 3.240