
Ensemble Deep Learning Model for Multicenter Classification of Thyroid Nodules on Ultrasound Images.

Xi Wei1, Ming Gao2, Ruiguo Yu3, Zhiqiang Liu3, Qing Gu4, Xun Liu5, Zhiming Zheng6, Xiangqian Zheng2, Jialin Zhu1, Sheng Zhang1.   

Abstract

BACKGROUND Thyroid nodules are extremely common, and ultrasound is typically used to determine whether they are benign or malignant. Imaging diagnosis assisted by artificial intelligence has attracted much attention in recent years. The aim of our study was to build an ensemble deep learning classification model to accurately differentiate benign and malignant thyroid nodules. MATERIAL AND METHODS Building on current image segmentation and classification algorithms, we proposed an ensemble deep learning classification model for thyroid nodules (EDLC-TN) that classifies nodules after precise localization. We compared its diagnostic performance with that of four other state-of-the-art deep learning algorithms and of three ultrasound radiologists who followed ACR TI-RADS criteria. Finally, we demonstrated the general applicability of EDLC-TN for diagnosing thyroid cancer using ultrasound images from multiple medical centers. RESULTS The proposed method was trained and tested on a thyroid ultrasound image dataset containing 26 541 images, and its accuracy reached 98.51%. EDLC-TN demonstrated the highest area under the curve, sensitivity, specificity, and accuracy among the five state-of-the-art algorithms. Combining EDLC-TN with the radiologists’ readings could improve diagnostic accuracy. EDLC-TN achieved excellent diagnostic performance when applied to ultrasound images from another independent hospital. CONCLUSIONS The proposed ensemble deep learning approach is superior to other similar existing methods of thyroid nodule classification, as well as to ultrasound radiologists. Moreover, our network represents a generalized platform that can potentially be applied to medical images from multiple medical centers.

Year:  2020        PMID: 32555130      PMCID: PMC7325553          DOI: 10.12659/MSM.926096

Source DB:  PubMed          Journal:  Med Sci Monit        ISSN: 1234-1010


Background

Thyroid nodules are common clinically, and with application of high-frequency ultrasound, their incidence has increased. Ultrasound diagnosis of benign and malignant nodules is mainly performed under guidelines from the American College of Radiology (ACR) [1] and the ultrasound section of the American Thyroid Association (ATA) [2], both of which have been increasingly improved in recent years. But there still remain some defects, and diagnostic accuracy is not consistent due to differing levels of experience among radiologists performing ultrasound [3]. With gradual development of machine learning in recent years, intelligent medical image diagnosis has become available. Deep learning can reveal subtler and more abstract information embedded in images along with the deepening of the network layers. In addition, use of artificial intelligence (AI) for medical or auxiliary medical care can lighten the burden of doctors and optimize medical treatments. Medical image processing is one of the breakthroughs in this field. Both deep learning and AI have achieved high accuracy for classification of skin cancer and detection of pneumonia [4,5], even exceeding that of physicians. This is also true for diagnosis of thyroid nodules [6-9]. In 2008, Lim KJ et al. [10] were the first to apply a neural network to differentiation of benign and malignant thyroid nodules. Ma J et al. [11] were the first to use a convolutional neural network in this field in 2017. They separately trained two networks in the ImageNet database. Then, by concatenating feature images, they used the softmax classifier to diagnose thyroid nodules with an accuracy of 83.02%±0.72%. Imaging diagnosis assisted by AI has attracted much attention in the past several years. If the diagnostic effectiveness of AI – including accuracy, sensitivity, and specificity – is found to be comparable to that of an experienced radiologist performing ultrasound, it will have a tremendous impact on the imaging diagnosis. 
However, if ultrasound images are directly used as inputs to a neural network, the shape information from thyroid nodules may be lost. Thus, two different AI models were trained on the basis of ensemble learning [12]. To accurately diagnose thyroid nodules, we calculated the mean output of these two types of models and determined whether the thyroid nodules were benign or malignant using a new model: EDLC-TN (ensemble deep learning-based classifier for thyroid nodules). The aim of our research was to use the deep learning method to differentiate benign and malignant thyroid nodules, thereby improving the accuracy of lesion identification.

Material and Methods

Study cohort and datasets

We used four independent ultrasound datasets from four different hospitals to develop and evaluate EDLC-TN: Tianjin Medical University Cancer Institute and Hospital (Center 1), Jilin Integrated Traditional Chinese and Western Medicine Hospital (Center 2), Cangzhou Hospital of Integrated Traditional Chinese and Western Medicine of Hebei Province (Center 3), and Peking University BinHai Hospital (Center 4). Consecutive patients who underwent diagnostic thyroid ultrasound examination and subsequent surgery at these four medical centers between January 2015 and December 2017 were included in the study. Exclusion criteria were: (1) images from anatomical sites that were judged as not having tumor according to postoperative pathology; (2) nodules with incomplete or low-quality ultrasound images; and (3) cases with incomplete clinicopathological information. Finally, three datasets from Centers 1 to 3, including a total of 25 509 thyroid ultrasound images, were used to train and test the model; of these, 15 255 were malignant and 10 254 were benign (confirmed by postoperative pathological diagnosis). Images from Center 4 (n=1 032) differed greatly from those of the other three centers in terms of style, clarity, and machine types. Therefore, the dataset from Center 4 was used only as an external validation set for verifying the generalizability of the model. Data from each of Centers 1 to 3 were randomly divided into training and testing sets at a ratio of approximately 7:3 (Table 1). In all settings, testing data did not include any images used in training.
Table 1

Number of training and testing images from four datasets.

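| | Center 1 (23504) | Center 2 (530) | Center 3 (1475) | Center 4 (1032) | Total of all (26541) |
|---|---|---|---|---|---|
| Training dataset | | | | | |
| Benign | 6464 | 205 | 522 | | 7191 |
| Malignant | 10090 | 164 | 414 | | 10668 |
| Total for training | 16554 | 369 | 936 | | 17859 |
| Testing dataset | | | | | |
| Benign | 2620 | 84 | 359 | 502 | 3565 |
| Malignant | 4330 | 77 | 180 | 530 | 5117 |
| Total for testing | 6950 | 161 | 539 | 1032 | 8682 |
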
This study was approved by the Tianjin Medical University Cancer Institute and Hospital ethics committee. Informed consent from patients was waived due to the retrospective nature. In training and test datasets, ultrasound images were collected and stored by various brands of ultrasonic equipment, such as PHILIPS, GE, Siemens, Mindray, and TOSHIBA. In addition, the images were acquired with superficial probes.

Experimental pathways

Our experimental pathway comprised three parts (Figure 1): segmentation of nodules, ensemble learning for classification, and testing of the model’s diagnostic performance. The segmentation model was trained only to locate the nodule automatically. To verify that the algorithm was effective, we manually checked 500 images, reaching a relevance ratio of more than 98%. Using the segmentation model, the region of interest (ROI) containing the nodule was first segmented and then classified. The classification results were evaluated quantitatively as a comprehensive assessment of the two processes. The classification model was based on an improved DenseNet [13] and used a multistep cascade pathway, as shown in Figure 2. The classification result was determined by combining three weak models using the average method and the voting method. Finally, we compared the diagnostic performance of EDLC-TN with that of ultrasonographers and four advanced deep learning models, and conducted an external test.
Figure 1

Pathways of experiments. Our experimental pathways mainly included three parts. (A) Data desensitization, removal of the sections of the patient’s personal information in the images. (B) Training and validation of ensemble learning for classification of thyroid nodules. In the segmentation part, the nodule area was manually marked and used to train the segmentation model. ROI and mask were extracted by the segmentation model. Then, three weak models were trained and combined to obtain an advanced classification model. (C) Comparison experiments with radiologists and other deep learning models, and external validation experiment. We then compared performance of the classification model with that of three ultrasound radiologists and four state-of-the-art deep learning models. Finally, we conducted an external validation using an independent dataset.

Figure 2

The multistep cascade experiment pathway of EDLC-TN. (A) The process of extracting ROI and mask. First, the boundary was cut off (a). Second, the nodule area was segregated (b). Then, the mask image of the thyroid nodule was depicted (c). Finally, ROI was segmented (d). (B) The process of classifying images by ensemble learning model. After obtaining the ROI and its corresponding mask, three classification models were trained and combined to obtain an advanced classification model. ROI was put into models and got the final classification result through the voting method.

EDLC-TN model

A multistep cascade experiment pathway was adopted, as shown in Figure 2. First, the image border containing annotations was cropped off for data cleaning. Then, the nodule and the surrounding area of the image (region of interest, ROI) were extracted. We used a semiautomatic method to achieve this goal: we carefully annotated the boundaries of thyroid nodules in 3000 images by hand and used these marked images to train a nodule segmentation model, which then segmented all remaining images. The structure of the segmentation model is shown in Supplementary Table 2, and the method of converting the segmentation results to an ROI is shown in Supplementary Table 1. Through this process, each image yielded a three-channel ROI R and a one-channel mask M. We used these data to train nodule classification models based on the structure shown in Supplementary Table 3. For better performance, we trained multiple models and combined them through two ensemble learning methods, namely the average and voting methods. The average method calculates the mean value of all base model results. In the voting method, each base model votes on the category of the image, and the final result is the category with more votes. The Adam optimizer was used during training. The learning rate was initialized to 0.1. After 60 epochs, it was decreased to 0.01, and it was then divided by 10 after every 200 epochs. The batch size was set to the maximum allowed by the computer memory. We trained our models on an NVIDIA TITAN XP GPU using the TensorFlow framework.
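The learning-rate schedule described above can be written as a small step function. This is a minimal sketch, not the authors' training code: the function name `learning_rate` is hypothetical, epochs are assumed to be 0-based, and the 200-epoch intervals are assumed to be counted from epoch 60, which the text does not state explicitly.

```python
def learning_rate(epoch: int) -> float:
    """Step schedule sketched from the text (epoch is 0-based).

    Assumption: the "every 200 epochs" intervals start at epoch 60.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if epoch < 60:
        return 0.1  # initial learning rate
    # 0.01 from epoch 60, then divided by 10 for each further 200 epochs.
    drops = (epoch - 60) // 200
    return 0.01 / (10 ** drops)


if __name__ == "__main__":
    for e in (0, 59, 60, 259, 260, 460):
        print(e, learning_rate(e))
```

A function of this shape could be passed to a per-epoch scheduler callback such as `tf.keras.callbacks.LearningRateScheduler`, which accepts a schedule function of the epoch index.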

Radiologist evaluation and comparison

To assess the predictive performance of the deep learning algorithm, three radiologists (W.X., Z.J.L., and Z.S.) evaluated 1000 ultrasound images (11.52%, 1000/8 682) randomly selected from the test set, and their accuracy in differentiating between benign and malignant thyroid nodules was compared with the predictions of the deep learning models. The radiologists assessed nodules according to ACR TI-RADS criteria [1] and predicted whether each nodule was benign or malignant. Each radiologist judged and labeled every ultrasound image independently and in a blinded manner, and postoperative pathological results were used as the reference standard for statistical analysis. Accuracy was calculated for each radiologist, together with the average across the three. The radiologists involved in the evaluation were attending doctors or associate professors. The first reader (W.X.) had 13 years of experience, the second reader (Z.J.L.) had 8 years of experience, and the third reader (Z.S.) had more than 30 years of experience in diagnosing thyroid nodules.

Comparison with four state-of-the-art deep learning models

We compared the diagnostic performance of our model with the four machine learning algorithms which are currently most popular and advanced, including ResNeXt [14], SE_Inception_v4 [15], SE_Net [16] and Xception [17]. These models are widely used in the field of AI of medical images [18,19]. The 3000 ultrasound images randomly selected from the test set in Center 1 were used for this part of the study. The area under the receiver operating characteristic (ROC) curve with a 95% confidence interval (CI), accuracy, sensitivity, and specificity were calculated to compare capability for diagnosing thyroid cancer on ultrasound.

General applicability test

In this section, we aimed to investigate the general applicability of our AI system for diagnosing thyroid cancer. We did so by testing our network on a dataset of ultrasound images (n=1032) from Peking University BinHai Hospital, including 502 benign nodule images and 530 malignant nodule images (Table 1).

Statistical analysis

Data are shown as the means and standard deviations for continuous variables. The number of patients and images were analyzed for categorical variables. Diagnostic performance of the EDLC-TN and the radiologists was evaluated by calculating sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy. To determine whether the diagnostic performance of our models significantly differed, the AUCs between the EDLC-TN and the other four models were compared using the Z test. The intraclass correlation coefficient (ICC) and Kappa value were used to assess test-retest reliability and inter-reader agreement for different radiologists. All statistical results shown were calculated using MedCalc for Windows v15.8 (MedCalc Software, Ostend, Belgium), and P<0.05 was considered statistically significant.
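The diagnostic measures named above follow directly from the counts of a 2×2 confusion table, with malignant treated as the positive class. The sketch below uses the standard definitions; the class and function names are hypothetical, and the example counts are illustrative only, not data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # malignant nodules called malignant
    fn: int  # malignant nodules called benign
    tn: int  # benign nodules called benign
    fp: int  # benign nodules called malignant


def diagnostic_metrics(c: Confusion) -> dict:
    """Standard definitions; assumes every denominator is non-zero."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


if __name__ == "__main__":
    # Illustrative counts only.
    print(diagnostic_metrics(Confusion(tp=90, fn=10, tn=80, fp=20)))
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always lies between the two when all three are computed on the same set of images.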

Results

Four image datasets and study population

The total number of ultrasound images in this work was 26 541, including 10 756 benign nodule images and 15 785 malignant nodule images. Of these, 17 859 (67.29%) images from Centers 1 to 3 were used for training, and 7 650 (28.82%) images from Centers 1 to 3 were used for internal testing. The dataset from Center 4, containing 1 032 (3.89%) images, was not used in training and served only as an external test set for verifying the generalizability of the model. Table 1 summarizes the number of images used in our training and testing datasets. A total of 11 865 patients who underwent ultrasound examination and surgery between January 2015 and December 2017 at one of these four centers were included in this research. Demographic data and image information for all patients from the four medical centers are shown in Table 2.
Table 2

Demographic data and image information for all patients from four medical centers.

| | Center 1 | Center 2 | Center 3 | Center 4 |
|---|---|---|---|---|
| No. of patients | 10993 | 151 | 460 | 261 |
| Sex | | | | |
| Female (n) | 8379 (76.22%) | 117 (77.48%) | 369 (80.22%) | 197 (75.48%) |
| Male (n) | 2614 (23.78%) | 34 (22.52%) | 91 (19.78%) | 64 (24.52%) |
| Age (years) | 46 (18–84) | 49 (21–67) | 51 (18–73) | 52 (23–70) |
| Position of nodules | | | | |
| Left lobe | 5063 (46.06%) | 79 (52.31%) | 214 (46.52%) | 127 (48.66%) |
| Right lobe | 5652 (51.41%) | 70 (46.36%) | 228 (49.57%) | 128 (27.77%) |
| Isthmus | 278 (2.53%) | 2 (1.32%) | 18 (3.91%) | 6 (2.61%) |
| Size (cm) | 1.25 (0.38–7.80) | 1.72 (0.58–6.25) | 1.53 (0.30–6.91) | 2.12 (0.49–7.21) |
| Postoperative pathology | | | | |
| Benign nodules | 3996 (36.35%) | 82 (54.30%) | 213 (46.30%) | 153 (58.62%) |
| Nodular goiter | 2745 | 71 | 166 | 127 |
| Adenomatous goiter | 551 | 11 | 45 | 26 |
| Thyroid granuloma | 518 | | | |
| Follicular adenoma | 182 | | 2 | |
| Malignant nodules | 6997 (63.65%) | 69 (45.70%) | 247 (53.70%) | 108 (41.38%) |
| PTC | 6910 | 69 | 246 | 108 |
| MTC | 65 | | 1 | |
| FTC | 20 | | | |
| ATC | 2 | | | |
| Total images | 23504 | 530 | 1475 | 1032 |
| Benign | 9084 | 289 | 881 | 502 |
| Malignant | 14420 | 241 | 594 | 530 |
Types of machinePhilips EPIQ 5Philips IU22Philips EPIQ 7GE LOGIQ E9
Philips IU ElitePhilips IU22Mindray DC-8
Philips IU22Siemens Acuson Qxana 2Siemens Acuson S2000
Philips HD11Siemens Acuson S2000
TOSHIBA Aplio 500
TOSHIBA Aplio 400

PTC – papillary thyroid carcinoma; MTC – medullary thyroid carcinoma; FTC – follicular thyroid carcinoma; ATC – anaplastic thyroid carcinoma.

Classification by EDLC-TN

In this paper, accuracy, specificity, and sensitivity were the main evaluation criteria for classification. The two models were similar in structure, so we analyzed the experimental results of one of them, Classifier1, as the main model. The results of ensemble learning using different combination strategies are shown in Table 3. Because the voting method requires at least three weak models, two instances of weak classifier 1 were used.
Table 3

Comparison of the diagnostic performance of EDLC-TN with radiologists.

| | Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| EDLC-TN | 93.70% | 93.19% | 94.01% |
| Radiologist 1 | 91.55% | 91.45% | 91.71% |
| Radiologist 2 | 87.26% | 96.34% | 72.19% |
| Radiologist 3 | 93.07% | 92.56% | 93.92% |
| Average of radiologists | 90.63% | 93.45% | 85.94% |
| Radiologists and EDLC-TN | 96.54% | 97.11% | 95.58% |

EDLC-TN – ensemble deep learning classification model of thyroid nodules.

The accuracy rate of the two weak models was already high. Strong classifier 1 and strong classifier 2 were both obtained by combining two models. Three combination methods were used: the averaging method takes the arithmetic mean of the results obtained from the two models; the competition method takes the result with the higher confidence level as the predicted value; and the voting method combines the results of multiple (at least three) models, each of which votes for benign or malignant, with the majority of votes serving as the final result. We found that the strong classifiers had higher accuracy than each weak classifier. The test results for weak and strong classifiers in diagnosis of thyroid nodules are shown in Supplementary Table 4. The model proposed in this paper has a “classification after segmentation” structure. The performance of ensemble learning is shown in Figure 3A. Accuracy, specificity, and sensitivity vary with the decision threshold. At a threshold of about 0.54, accuracy, sensitivity, and specificity were all high (93.70%, 93.19%, and 94.01%, respectively).
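The three combination strategies can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions, not the authors' implementation: each model is assumed to output a probability of malignancy, "confidence" in the competition method is interpreted as distance from 0.5, the function names are hypothetical, and the default threshold of 0.5 is a placeholder (the text reports an operating point near 0.54).

```python
from typing import Sequence


def average_method(probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Malignant if the mean malignancy probability reaches the threshold."""
    return sum(probs) / len(probs) >= threshold


def competition_method(probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Follow the single most confident model (assumed: farthest from 0.5)."""
    most_confident = max(probs, key=lambda p: abs(p - 0.5))
    return most_confident >= threshold


def voting_method(probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Malignant if a strict majority of models vote malignant.

    With an even number of models a tie is resolved as benign here,
    which is why an odd number (at least three) is the natural choice.
    """
    votes = sum(p >= threshold for p in probs)
    return votes * 2 > len(probs)


if __name__ == "__main__":
    outputs = [0.9, 0.4, 0.45]  # illustrative model outputs
    print(average_method(outputs), competition_method(outputs), voting_method(outputs))
```

The example outputs show how the strategies can disagree: one very confident model pulls the average and the competition result toward malignant, while two of the three models vote benign.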
Figure 3

Performance of the EDLC-TN in identification of thyroid cancer in different datasets. (A) Performance of the EDLC-TN on the training dataset. The accuracy, sensitivity and specificity were 93.70%, 93.19%, and 94.01%, respectively. (B) Diagnostic performance of the EDLC-TN and four other state-of-the-art machine learning algorithms. The EDLC-TN demonstrated the highest value for AUC (0.941, 95% CI: 0.935–0.946), sensitivity (93.77%), specificity (94.44%), and accuracy (98.51%). (C) The performance of EDLC-TN on the external validation dataset. The EDLC-TN achieved an accuracy of 95.76%, with a sensitivity of 95.88%, a specificity of 93.75% and an AUC of 0.979 (95% CI: 0.958–0.992).

EDLC-TN vs. radiologists

In this experiment, three radiologists specializing in thyroid disease were randomly selected at the hospital to independently evaluate and annotate thyroid ultrasound images as benign or malignant (the same test dataset used for the deep learning model). The accuracy of each radiologist and the average values are shown in Table 3. These results indicate that the deep learning model proposed in this paper was more accurate than any individual radiologist. We also carried out experiments with cooperative multi-expert diagnosis, in which the three radiologists each judged the same ultrasound image as benign or malignant and the majority vote was taken as the final result. In the comparison of a single model with a single radiologist, the highest accuracy of the model was 93.70%. However, the joint consultation of the three radiologists was more accurate than the model, with an accuracy of 95.43%. Finally, when the analyses of the model and the radiologists were combined, accuracy was 96.54%, which was higher than that for independent diagnosis by either (Table 3). The ICC and Kappa value were used to assess test-retest reliability and inter-reader agreement for the three radiologists. The ICC of the diagnostic results from the three radiologists was 0.7052 (95% CI: 0.6836–0.7260). The Kappa values for Radiologist 1 vs. 2, Radiologist 2 vs. 3, and Radiologist 1 vs. 3 were 0.649 (95% CI: 0.609–0.689), 0.656 (95% CI: 0.616–0.696), and 0.774 (95% CI: 0.741–0.808), respectively.
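For readers unfamiliar with the Kappa statistic used for pairwise reader agreement, the sketch below computes Cohen's kappa for two readers from the standard definition, (observed agreement − chance agreement) / (1 − chance agreement). The function name and the label sequences are illustrative assumptions; the study itself reports values computed with MedCalc.

```python
from typing import Hashable, Sequence


def cohens_kappa(reader_a: Sequence[Hashable], reader_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two readers labeling the same items."""
    a, b = list(reader_a), list(reader_b)
    if len(a) != len(b) or not a:
        raise ValueError("readers must label the same non-empty set of items")
    n = len(a)
    # Observed proportion of items on which the readers agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each reader's label frequencies.
    labels = set(a) | set(b)
    expected = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    if expected == 1.0:
        return 1.0  # both readers used one identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # 1 = malignant, 0 = benign; illustrative labels only.
    r1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    r2 = [1, 1, 0, 0, 0, 0, 1, 1, 1, 0]
    print(cohens_kappa(r1, r2))
```

In this toy example the readers agree on 8 of 10 images and chance agreement is 0.5, which gives a kappa of about 0.6.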

EDLC-TN vs. other four AI models

The diagnostic performance of EDLC-TN and the four other machine learning algorithms is shown in Table 4 and Figure 3B. The EDLC-TN model demonstrated the highest AUC (0.941, 95% CI: 0.935–0.946), which was significantly higher than that of the other four models (P<0.0001). The EDLC-TN model also had the highest values for sensitivity (93.77%), specificity (94.44%), and accuracy (98.51%).
Table 4

Comparison of the diagnostic performance of EDLC-TN with other four state-of-the-art algorithms.

| | AUC (95% CI) | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|---|
| EDLC-TN | 0.941 (0.936–0.946) | 93.77 | 94.44 | 98.51 |
| ResNeXt | 0.882 (0.875–0.889)* | 85.53 | 90.86 | 82.83 |
| SE_Inception_v4 | 0.874 (0.866–0.881)* | 90.33 | 84.38 | 97.12 |
| SE_Net | 0.840 (0.832–0.848)* | 88.64 | 79.35 | 96.52 |
| Xception | 0.880 (0.872–0.887)* | 84.68 | 91.26 | 93.84 |

EDLC-TN – ensemble deep learning classification model of thyroid nodules; AUC – area under the ROC curve. AUCs of EDLC-TN and the other four models were calculated by the method of DeLong et al. The difference in AUC between EDLC-TN and each of the other four models was compared by Z-test; * P<0.05.

Generalizability of EDLC-TN

To investigate the generalizability of EDLC-TN in diagnosis of thyroid cancer, we applied the same deep learning framework to ultrasound images from Peking University BinHai Hospital (Center 4), which were not contained in the training set (Table 1). In this test, the EDLC-TN achieved an accuracy of 95.76%, with a sensitivity of 95.88% and a specificity of 93.75% in differentiating between benign and malignant thyroid nodules. The ROC curve is shown in Figure 3C and the area under the ROC curve of EDLC-TN for diagnosing thyroid cancer was 0.979 (95% CI: 0.958–0.992).

Discussion

Many researchers have made significant contributions to the field of deep learning models for differentiating between benign and malignant thyroid lesions. Xia J et al. [20] proposed an extreme learning machine (ELM) based on ultrasound features, such as composition, echogenicity, margin, shape, and calcification, to classify malignant and benign thyroid nodules and it achieved 87.72% diagnostic accuracy. Liu T et al. [21] used the CNN model learned from ImageNet as a pretrained feature extractor for an ultrasound image dataset. Their experimental results with 1 037 images demonstrated an accuracy of 93.1%. Li et al. [6] also structured an ensemble model for diagnosis of thyroid cancer based on ResNet 50 and Darknet 19. However, the diagnostic accuracy was only 85.7% to 88.9% because the types of two sub-models were similar. In this study, we proposed a new ensemble deep learning classification model called EDLC-TN for classifying benign and malignant thyroid nodules by ultrasound with evidence from multiple centers. The strengths of EDLC-TN model are fourfold. The core of this method is performing deep learning model training on the basis of segmenting the ROI, which is the area where the thyroid nodule is located. The accuracy of this model is the highest among the state-of-the-art algorithms and other models mentioned above. The accuracy of our model in diagnosing benign and malignant thyroid nodules was higher than that of a single radiologist and the model could help improve the diagnostic accuracy of radiologists. This model represents a generalized platform that can be universally applied to ultrasound images from different medical centers. Moreover, remarkable progress has been made with deep learning in the field of image processing, resulting in mature models of segmentation, localization, and classification for natural images. We used ensemble learning methods to connect the results of multiple models of deep learning. 
With that method, it was possible to distinguish between malignant and benign nodules with higher accuracy than the other advanced deep learning models. The diagnostic performance of radiologists in diagnosing thyroid cancer can be significantly improved when combined with EDLC-TN, so the model could benefit radiologists to a large extent. Furthermore, our network is a general platform that can be applied to ultrasound images from different medical centers. When applied to ultrasound images from a hospital with entirely different types of ultrasound equipment, EDLC-TN achieved excellent accuracy, sensitivity, and specificity. The model also has advantages compared with a radiologist’s performance. The high accuracy of the model in our study suggests that EDLC-TN has the potential to learn effectively from different types of medical images with a high degree of generalization. This could benefit screening programs and produce more efficient referral systems in all medical fields, particularly in low-resource or remote areas. The result might have a wide-ranging impact on both clinical care and public health.
There are several limitations to this study. Our datasets contained a high percentage of malignant nodules and nodular goiters, which may have introduced bias. Only three senior radiologists were chosen as the comparison group, which may also have contributed to bias. The model did not analyze an extensive range of pathological types of thyroid nodules; these will be assessed in future studies. Our algorithm only gives a classification result and does not provide a classification standard or texture analysis. In medicine, a good predictive algorithm is often insufficient; what is needed is the ability to explain an algorithm’s decisions and thereby increase the credibility of diagnostic results [22]. We do not know whether this model can be applied to other types of medical images. These limitations will be addressed by expanding the ultrasound image datasets with various image types.

Conclusions

In this work, we proposed an ensemble deep learning classification model called EDLC-TN for distinguishing between benign and malignant thyroid nodules in ultrasound images. In addition, our network represents a generalized platform that can potentially be applied to different medical centers to assist radiologists.
Supplementary Table 1

The algorithm for finding the upper and lower boundaries of a nodule.

Algorithm 1. Detector for the upper and lower boundaries of a given nodule.
Input: mask: binary image in which each pixel is labeled 1 if it belongs to the nodule and 0 otherwise; threshold: minimum row sum for a row to count as part of the nodule.
Output: up_bound: upper boundary of the nodule; low_bound: lower boundary of the nodule.
RS = sum(mask, axis=1) // The sum of each row of the mask.
RS.append(0) // Sentinel so that a run reaching the last row is also closed.
start, maxLen, curLen = 0, 0, 0
for i, v in enumerate(RS) do
  if v > threshold then
    curLen += 1
  else
    if curLen > maxLen then
      start = i - curLen
      maxLen = curLen
    end if
    curLen = 0 // Reset the run length once the current run ends.
  end if
end for
up_bound = start, low_bound = start + maxLen
return up_bound, low_bound
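A runnable version of Algorithm 1 is sketched below in plain Python. It finds the longest run of consecutive mask rows whose row sum exceeds a threshold and returns that run's first row and the row just past its end. The function name and the default `threshold=0` are assumptions, since the source does not give a threshold value, and the run length is reset whenever a run ends so that separate runs are not merged.

```python
from typing import Sequence, Tuple


def nodule_row_bounds(mask: Sequence[Sequence[int]], threshold: int = 0) -> Tuple[int, int]:
    """Return (up_bound, low_bound) of the longest vertical run of nodule rows.

    mask holds 0/1 labels per pixel; low_bound is exclusive.
    threshold must be >= 0 so that the sentinel row closes the last run.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    row_sums = [sum(row) for row in mask]
    row_sums.append(0)  # sentinel: closes a run that reaches the last row
    start = max_len = cur_len = 0
    for i, value in enumerate(row_sums):
        if value > threshold:
            cur_len += 1
        else:
            if cur_len > max_len:
                start = i - cur_len
                max_len = cur_len
            cur_len = 0  # the run has ended; start counting afresh
    return start, start + max_len


if __name__ == "__main__":
    toy_mask = [
        [0, 0, 0],
        [1, 1, 0],  # short isolated run (1 row)
        [0, 0, 0],
        [1, 1, 1],
        [1, 1, 1],
        [0, 1, 0],  # longest run covers rows 3 to 5
        [0, 0, 0],
    ]
    print(nodule_row_bounds(toy_mask))
```

Applying the same function to the transposed mask would give the left and right boundaries, which is one plausible way to obtain a full bounding box for the ROI.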
Supplementary Table 2

ROI extraction algorithm structure.

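| Processing | Layer | Output size | Activation |
|---|---|---|---|
| Down-sampling | conv1_1 | 224×224 | ReLU |
| | conv1_2 | 224×224 | ReLU |
| | pool1 | 112×112 | |
| | conv2_1 | 112×112 | ReLU |
| | conv2_2 | 112×112 | ReLU |
| | pool2 | 56×56 | |
| | conv3_1 | 56×56 | ReLU |
| | conv3_2 | 56×56 | ReLU |
| | conv3_3 | 56×56 | ReLU |
| | conv3_4 | 56×56 | ReLU |
| | pool3 | 28×28 | |
| | conv4_1 | 28×28 | ReLU |
| | conv4_2 | 28×28 | ReLU |
| | conv4_3 | 28×28 | ReLU |
| | conv4_4 | 28×28 | ReLU |
| | pool4 | 14×14 | |
| | conv5_1 | 14×14 | ReLU |
| | conv5_2 | 14×14 | ReLU |
| | conv5_3 | 14×14 | ReLU |
| | conv5_4 | 14×14 | ReLU |
| | pool5 | 7×7 | |
| | conv6 | 7×7 | ReLU |
| | conv7 | 7×7 | ReLU |
| | conv8 | 7×7 | ReLU |
| Up-sampling | deconv1 | 14×14 | |
| | deconv2 | 28×28 | |
| | deconv3 | 224×224 | |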
Supplementary Table 3

Classification algorithm structure.

| Layer | Detail | Output size |
|---|---|---|
| Convolution | 3×3 conv | 64×64×16 |
| Dense Block1 | {3×3 conv}×17 | 64×64×220 |
| Transition Layer1 | 1×1 conv, 2×2 avg pool | 32×32×220 |
| Dense Block2 | {3×3 conv}×17 | 32×32×424 |
| Transition Layer2 | 1×1 conv, 2×2 avg pool | 16×16×424 |
| Dense Block3 | {3×3 conv}×17 | 16×16×628 |
| Transition Layer3 | 1×1 conv, 2×2 avg pool | 8×8×628 |
| Dense Block4 | {3×3 conv}×17 | 8×8×832 |
| Transition Layer4 | 1×1 conv, 2×2 avg pool | 4×4×832 |
| Dense Block5 | {3×3 conv}×17 | 4×4×1036 |
| Batch Normalization | | 4×4×1036 |
| ReLU | | 4×4×1036 |
| Pooling | 4×4 avg pool | 1×1×1036 |
| Fully Connection | | 1036 |
| Fully Connection | | 2 |
| Softmax | | 2 |
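The channel counts in Supplementary Table 3 are consistent with standard DenseNet bookkeeping, in which every layer of a dense block adds a fixed number of feature maps (the growth rate) to its input. The sketch below reproduces the table's output sizes; the growth rate of 12 is not stated in the source and is inferred here from (220 − 16) / 17, and the function names are hypothetical.

```python
from typing import List


def dense_block_channels(initial: int = 16, layers_per_block: int = 17,
                         growth_rate: int = 12, blocks: int = 5) -> List[int]:
    """Channel count after each dense block.

    Assumes each layer concatenates growth_rate new feature maps and that
    transition layers keep the channel count, as the table suggests.
    """
    channels = initial
    result = []
    for _ in range(blocks):
        channels += layers_per_block * growth_rate
        result.append(channels)
    return result


def block_spatial_sizes(input_size: int = 64, blocks: int = 5) -> List[int]:
    """Side length seen by each dense block; halved by each 2x2 avg pool."""
    return [input_size // (2 ** i) for i in range(blocks)]


if __name__ == "__main__":
    for size, ch in zip(block_spatial_sizes(), dense_block_channels()):
        print(f"{size}x{size}x{ch}")
```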
Supplementary Table 4

Test results of weak and strong classifiers in the diagnosis of thyroid nodules.

| Model | Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| Weak Model 1 | 92.24% | 95.22% | 87.29% |
| Weak Model 2 | 92.31% | 91.89% | 93.00% |
| Weak Model 3 | 91.89% | 91.00% | 92.26% |
| Strong Model | 93.70% | 93.19% | 94.01% |
  16 in total

1.  Computer-aided diagnosis for the differentiation of malignant from benign thyroid nodules on ultrasonography.

Authors:  Kyoung Ja Lim; Chul Soon Choi; Dae Young Yoon; Suk Ki Chang; Kwang Ki Kim; Heon Han; Sam Soo Kim; Jiwon Lee; Yong Hwan Jeon
Journal:  Acad Radiol       Date:  2008-07       Impact factor: 3.173

2.  Ultrasound-based differentiation of malignant and benign thyroid Nodules: An extreme learning machine approach.

Authors:  Jianfu Xia; Huiling Chen; Qiang Li; Minda Zhou; Limin Chen; Zhennao Cai; Yang Fang; Hong Zhou
Journal:  Comput Methods Programs Biomed       Date:  2017-06-23       Impact factor: 5.428

3.  Management of Thyroid Nodules Seen on US Images: Deep Learning May Match Performance of Radiologists.

Authors:  Mateusz Buda; Benjamin Wildman-Tobriner; Jenny K Hoang; David Thayer; Franklin N Tessler; William D Middleton; Maciej A Mazurowski
Journal:  Radiology       Date:  2019-07-09       Impact factor: 11.105

4.  A pre-trained convolutional neural network based method for thyroid nodule diagnosis.

Authors:  Jinlian Ma; Fa Wu; Jiang Zhu; Dong Xu; Dexing Kong
Journal:  Ultrasonics       Date:  2016-09-12       Impact factor: 2.890

Review 5.  2015 American Thyroid Association Management Guidelines for Adult Patients with Thyroid Nodules and Differentiated Thyroid Cancer: The American Thyroid Association Guidelines Task Force on Thyroid Nodules and Differentiated Thyroid Cancer.

Authors:  Bryan R Haugen; Erik K Alexander; Keith C Bible; Gerard M Doherty; Susan J Mandel; Yuri E Nikiforov; Furio Pacini; Gregory W Randolph; Anna M Sawka; Martin Schlumberger; Kathryn G Schuff; Steven I Sherman; Julie Ann Sosa; David L Steward; R Michael Tuttle; Leonard Wartofsky
Journal:  Thyroid       Date:  2016-01       Impact factor: 6.568

Review 6.  Texture analysis and machine learning to characterize suspected thyroid nodules and differentiated thyroid cancer: Where do we stand?

Authors:  Martina Sollini; Luca Cozzi; Arturo Chiti; Margarita Kirienko
Journal:  Eur J Radiol       Date:  2017-12-07       Impact factor: 3.528

7.  Machine Learning-Assisted System for Thyroid Nodule Diagnosis.

Authors:  Bin Zhang; Jie Tian; Shufang Pei; Yubing Chen; Xin He; Yuhao Dong; Lu Zhang; Xiaokai Mo; Wenhui Huang; Shuzhen Cong; Shuixing Zhang
Journal:  Thyroid       Date:  2019-04-27       Impact factor: 6.568

8.  Reduction in Thyroid Nodule Biopsies and Improved Accuracy with American College of Radiology Thyroid Imaging Reporting and Data System.

Authors:  Jenny K Hoang; William D Middleton; Alfredo E Farjat; Jill E Langer; Carl C Reading; Sharlene A Teefey; Nicole Abinanti; Fernando J Boschini; Abraham J Bronner; Nirvikar Dahiya; Barbara S Hertzberg; Justin R Newman; Daniel Scanga; Robert C Vogler; Franklin N Tessler
Journal:  Radiology       Date:  2018-03-02       Impact factor: 11.105

9.  Dermatologist-level classification of skin cancer with deep neural networks.

Authors:  Andre Esteva; Brett Kuprel; Roberto A Novoa; Justin Ko; Susan M Swetter; Helen M Blau; Sebastian Thrun
Journal:  Nature       Date:  2017-01-25       Impact factor: 49.962

10.  Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study.

Authors:  John R Zech; Marcus A Badgeley; Manway Liu; Anthony B Costa; Joseph J Titano; Eric Karl Oermann
Journal:  PLoS Med       Date:  2018-11-06       Impact factor: 11.069
