Literature DB >> 32617330

Differentiate cavernous hemangioma from schwannoma with artificial intelligence (AI).

Shaowei Bi1, Rongxin Chen1, Kai Zhang1,2, Yifan Xiang1, Ruixin Wang1, Haotian Lin1,3, Huasheng Yang1.   

Abstract

BACKGROUND: Cavernous hemangioma and schwannoma are tumors that both occur in the orbit. Because the treatment strategies for these two tumors differ, it is necessary to distinguish them at treatment initiation. Magnetic resonance imaging (MRI) is typically used to differentiate these two tumor types; however, they present similar features in MRI images, which increases the difficulty of differential diagnosis. This study aims to develop an artificial intelligence (AI) framework that automatically distinguishes cavernous hemangioma from schwannoma, thereby improving the accuracy of clinicians' diagnoses and enabling more effective treatment decisions.
METHODS: Material: We chose MRI images as the study materials; the images represented patients from diverse areas of China who had been referred to our center from more than 45 different hospitals. All images were initially acquired on films, which we scanned into digital versions and recut. Finally, 11,489 images of cavernous hemangioma (from 33 different hospitals) and 3,478 images of schwannoma (from 16 different hospitals) were collected. Labeling: All images were labeled using standard anatomical knowledge and pathological diagnosis. Training: Three types of models were trained in sequence (a total of 96 models), with each model including a specific improvement. The first two model groups were eye- and tumor-positioning models designed to reduce the identification scope, while the third model group consisted of classification models trained to make the final diagnosis.
RESULTS: First, internal four-fold cross-validation was conducted for all the models. During the validation of the first group, the 32 eye-positioning models were able to localize the eyes with an average precision of 100%. In the second group, the 28 tumor-positioning models reached an average precision above 90%. Subsequently, using the third group, the accuracy of all 32 tumor classification models reached nearly 90%. Next, external validation of the 32 tumor classification models was conducted. The results showed that the accuracy of the transverse T1-weighted contrast-enhanced sequence model reached 91.13%; the accuracy of the remaining models was significantly lower than in internal validation.
CONCLUSIONS: The findings of this retrospective study show that an artificial intelligence framework can achieve high accuracy, sensitivity, and specificity in automated differential diagnosis between cavernous hemangioma and schwannoma in a real-world setting, which can help doctors determine appropriate treatments. 2020 Annals of Translational Medicine. All rights reserved.

Entities:  

Keywords:  Artificial intelligence (AI); differential diagnosis; multicenter

Year:  2020        PMID: 32617330      PMCID: PMC7327353          DOI: 10.21037/atm.2020.03.150

Source DB:  PubMed          Journal:  Ann Transl Med        ISSN: 2305-5839


Introduction

Cavernous hemangioma is one of the most common primary tumors that occur in the orbit, accounting for 3% of all orbital lesions (1-3), while schwannoma is a benign orbital tumor with a prevalence of less than 1% among all orbital lesions (1). It is necessary to distinguish these two tumors at treatment onset because they have different treatment strategies (4-6): complete removal is the treatment goal for cavernous hemangioma, while for schwannoma, the goal is to ensure that no capsule remains. Moreover, clear differentiation provides useful information that fosters better vessel management (2). If the wrong surgical regimen is chosen, the tumor will recur, and the patient will need to undergo an additional operation. As in the diagnosis of many other tumors, imaging techniques are the predominant methods used to diagnose these two tumors. Magnetic resonance imaging (MRI) is the most commonly used approach because of its high resolution, which clearly depicts the tissues and helps determine the appropriate surgical approach (5-7). However, because it manifests similarly to cavernous hemangioma, especially in MRI images, schwannoma often evokes an improper diagnosis (2,7,8); even highly experienced ophthalmologists or radiologists can make inaccurate diagnoses (9). In recent years, the application of artificial intelligence (AI) in medicine has achieved physician-equivalent classification accuracy in the diagnosis of many diseases, including diabetic retinopathy (10-13), lung diseases (14), cardiovascular disease (15), liver disease, skin cancer (16), and thyroid cancer (17), among others. Therefore, the goal of this project was to develop an AI framework that uses MRI image sets from 45 hospitals in China as input to automate the differential diagnosis between cavernous hemangioma and schwannoma with high accuracy, sensitivity and specificity.

Methods

Overall architecture

Considering the current dominance of MRI in the differential diagnosis of the two studied tumor types, we selected MRI images as the research materials in this study. The research framework consisted of three types of functional models. Each type consisted of eight groups of models covering the different combinations of slice orientations (coronal and transverse) and weighted sequences (T1-weighted, T1-weighted contrast-enhanced, T2-weighted and T2-weighted fat suppression). Each group comprised four models trained according to the principle of four-fold cross-validation. In summary, a total of 96 models were obtained (3×8×4=96) (Figure 1).
Figure 1

Branching diagram of all 96 models.

As mentioned above, we established three types of functional models to achieve the goal of distinguishing cavernous hemangioma from schwannoma. First, to reduce interference from unnecessary information, eye-positioning models were designed to identify the eye area within the complete images. Then, to further narrow the recognition range, tumor-positioning models were created to locate tumors within the identified eye area. Finally, tumor classification models were trained to classify the tumors. As shown in Figure 2, when an MRI image is input, the framework first delineates the eye area from the whole image; then it localizes the tumor scope within the eye area; and finally, it classifies the tumor. The eye-positioning and tumor-positioning models were trained using the Faster-RCNN algorithm, while the tumor classification models used the ResNet-101 algorithm.
Figure 2

Work flow of the AI framework. AI, artificial intelligence.

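The three-stage cascade described above can be sketched as follows. This is a hypothetical illustration, not the authors' code: the two Faster-RCNN detectors and the ResNet-101 classifier are replaced with stub functions, and images are represented as nested lists.

```python
# Sketch of the cascade: each stage narrows the region the next stage sees.
# In the paper, locate_eye and locate_tumor are Faster-RCNN models and
# classify_tumor is a ResNet-101; here they are stubs.

def crop(image, box):
    """Crop a 2-D image (list of rows) to a (top, left, bottom, right) box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def diagnose(image, locate_eye, locate_tumor, classify_tumor):
    """Cascade: full image -> eye region -> tumor region -> diagnosis."""
    eye_region = crop(image, locate_eye(image))
    tumor_region = crop(eye_region, locate_tumor(eye_region))
    return classify_tumor(tumor_region)

# Toy stand-ins for the trained models:
image = [[0] * 8 for _ in range(8)]
label = diagnose(
    image,
    locate_eye=lambda img: (2, 2, 7, 7),                # stub detector 1
    locate_tumor=lambda img: (1, 1, 4, 4),              # stub detector 2
    classify_tumor=lambda img: "cavernous hemangioma",  # stub classifier
)
print(label)  # -> cavernous hemangioma
```

The point of the cascade is that each model operates on a progressively smaller, less noisy input than the last.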

Data set

The data set consisted of digital data scanned from MRI films representing patients from all over the country (most were from Southern China) who came to Sun Yat-sen University Zhongshan Ophthalmic Center (one of the best-known ophthalmic hospitals in China) for treatment. For all these patients, the diagnostic conclusions were supported by pathology and reviewed by members of our team. First, the MRI films brought by the patients from 45 different hospitals were scanned into a digital format and then screened, rotated and cropped. After this step, we obtained 6,507 images of cavernous hemangioma (from 33 different hospitals, Table 1) and 2,993 images of schwannoma (from 16 different hospitals, Table 2). Then, to form the training and validation sets, we used the image-processing software LabelImg [Tzutalin. LabelImg. Git code (2015). https://github.com/tzutalin/labelImg] to interpret and manually label all the images. The purpose of interpretation was to generate coordinates that delineate the extents of the eyes and tumors according to anatomical knowledge. The labels included eye, cavernous hemangioma and schwannoma, supported by pathological diagnosis. Next, all these processed data were randomly divided into two parts: a training set and a validation set. The training set included 6,669 images for the eye-positioning models, 3,367 images for the tumor-positioning models and 3,131 images (2,059 images of cavernous hemangioma and 1,072 images of schwannoma) for the classification models. The validation set included 468 images of cavernous hemangioma and 217 images of schwannoma (Table 3).
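LabelImg stores each annotation as a Pascal VOC-style XML file, so the labels and bounding-box coordinates can be read back with the standard library. A minimal sketch (the annotation content below is a made-up example, not from the actual data set):

```python
import xml.etree.ElementTree as ET

# A made-up Pascal VOC annotation of the kind LabelImg writes out.
VOC_XML = """
<annotation>
  <filename>mri_0001.jpg</filename>
  <object>
    <name>cavernous hemangioma</name>
    <bndbox><xmin>120</xmin><ymin>88</ymin><xmax>190</xmax><ymax>152</ymax></bndbox>
  </object>
</annotation>
"""

def read_boxes(xml_text):
    """Return (label, (xmin, ymin, xmax, ymax)) pairs from a VOC annotation."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bb = obj.find("bndbox")
        coords = tuple(int(bb.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, coords))
    return boxes

print(read_boxes(VOC_XML))
# -> [('cavernous hemangioma', (120, 88, 190, 152))]
```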
Table 1

Data-set sources: MRI images of cavernous hemangioma

Serial number | Hospitals | MRI images of cavernous hemangioma
1 | Renai Hospital of Guangzhou | 2,796
2 | Guang Kong Hou Qin Hospital | 930
3 | The First Affiliated Hospital, Sun Yat-sen University | 351
4 | Guangzhou Panyu Central Hospital | 186
5 | Jiangmen Central Hospital | 153
6 | Unknown | 147
7 | Guangzhou General Hospital of PLA | 146
8 | Foshan Second People’s Hospital | 117
9 | The 458 PLA Hospital | 129
10 | Zhongshanyi Town Health Centre | 120
11 | Tianjin Huaxing Hospital | 106
12 | The Fifth Affiliated Hospital, Sun Yat-sen University | 98
13 | Shenzhen People’s Hospital | 93
14 | Jiangxi Ji’an Central Hospital | 89
15 | Huizhou City People’s Hospital | 85
16 | Anhui Yijishan Hospital of Wannan Medical College | 77
17 | Hainan General Hospital | 72
18 | Meizhou People’s Hospital | 72
19 | Jiangmen Wuyi TCM Hospital | 71
20 | Hengyang Central Hospital | 63
21 | Liupanshui Mineral Bureau Hospital | 62
22 | Guangzhou Huaxing Kangfu Hospital | 56
23 | Affiliated Hospital of Xiangnan University | 54
24 | The Second Affiliated Hospital of Guangzhou Medical University | 50
25 | Guangzhou TCM No. 1 Hospital | 50
26 | Hainan Province Nongken Sanya Hospital | 48
27 | Jinshazhou Hospital of Guangzhou University of Chinese Medicine | 47
28 | Dongguan SDBRM Hospital | 41
29 | Maoming TCM Hospital | 39
30 | Jiangxi TCM Hospital | 36
31 | Maoming Nongken Hospital | 35
32 | Liuzhou City Worker Hospital | 34
33 | Beijing Boren Hospital | 32
34 | Armed Police Chengdu Hospital | 22
Total | | 6,507

MRI, magnetic resonance imaging.

Table 2

Data-set sources: MRI images of schwannoma

Serial number | Hospitals | MRI images of schwannoma
1 | Renai Hospital of Guangzhou | 1,609
2 | Guang Kong Hou Qin Hospital | 225
3 | Unknown | 148
4 | Jiangxi People’s Hospital | 144
5 | Guangdong Hospital of TCM | 99
6 | Shenzhen Hengsheng Hospital | 95
7 | Xinhui People’s Hospital | 87
8 | Shenzhen Longgang Central Hospital | 83
9 | Guangdong Second TCM Hospital | 80
10 | Jiangsu Subei People’s Hospital | 79
11 | Shenzhen Shekou Hospital | 73
12 | Foshan Hospital of TCM | 72
13 | Sanya City People Hospital | 56
14 | Huizhou Boluo People’s Hospital | 53
15 | Guangzhou Huaxing Kangfu Hospital | 41
16 | Hunan Chenzhou First Hospital | 30
17 | Hainan Province Nongken Sanya Hospital | 19
Total | | 2,993

MRI, magnetic resonance imaging.

Table 3

Components of the training and validation sets

Slice orientation | Sequence | Training set: eye positioning | Training set: tumor positioning | Training set: classification (cavernous hemangioma) | Training set: classification (schwannoma) | Validation set: classification (cavernous hemangioma) | Validation set: classification (schwannoma)
Coronal | T1-weighted | 1,224 | 544 | 341 | 176 | 52 | 30
Coronal | T1-weighted contrast-enhanced | 511 | 256 | 129 | 112 | 59 | 30
Coronal | T2-weighted | 238 | 135 | 93 | 41 | 7 | 0
Coronal | T2-weighted fat suppression | 185 | 108 | 57 | 45 | 22 | 0
Transverse | T1-weighted | 1,276 | 623 | 368 | 203 | 81 | 43
Transverse | T1-weighted contrast-enhanced | 1,211 | 612 | 397 | 171 | 86 | 38
Transverse | T2-weighted | 1,016 | 530 | 326 | 150 | 79 | 39
Transverse | T2-weighted fat suppression | 1,008 | 559 | 348 | 174 | 82 | 37
Total | | 6,669 | 3,367 | 2,059 | 1,072 | 468 | 217
MRI, magnetic resonance imaging.

Experimental settings

The settings of this study were based on Caffe (18), the Berkeley Vision and Learning Center (BVLC) deep-learning framework, and TensorFlow (19). All the models were trained in parallel on three NVIDIA Tesla P40 GPUs. For the classification problem, the key performance evaluation metrics were estimated as follows (20):

Accuracy = (Σ_{i=1}^{k} P_i) / N [1]
Sensitivity = TP / (TP + FN) [2]
Specificity = TN / (TN + FP) [3]

where N represents the total number of samples; P_i represents the number of correctly classified samples within the ith class; k denotes the number of classes in this specific classification problem; TP indicates the number of correctly classified samples within the ith class; FP denotes the number of samples wrongly recognized as the ith class; FN denotes the number of ith-class samples wrongly classified into the jth class (j ≠ i); and TN denotes the number of samples correctly recognized as not belonging to the ith class. All these parameters can be integrated into a confusion matrix. Additionally, the receiver operating characteristic (ROC) curves (21), which indicate how many samples of the ith class were recognized conditioned on a specific number of jth-class (j ≠ i) samples classified as the ith class, together with the area under the curve (AUC), were adopted to assess the performance. The performance evaluation parameters (accuracy, sensitivity, specificity, and ROC curve with AUC) are applicable only to binary classification problems; the accuracy and confusion matrix were applied to evaluate multiclass classification problems. For the object-positioning problem, interpolated average precision (AP) was adopted for the performance evaluation (22). The interpolated AP is computed from the precision-recall (PR) curve as shown in Eq. [4]:

AP = (1/11) Σ_{r ∈ {0, 0.1, …, 1}} max_{r̃ ≥ r} p(r̃) [4]

where p(r̃) represents the measured precision at a specific recall value r̃. We adopted four-fold cross-validation for the performance evaluation to assess all the classification and positioning problems.
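The classification metrics and the 11-point interpolated AP can be computed directly from confusion-matrix counts and PR-curve samples. A minimal sketch (the counts and PR points below are toy values, not the study's data):

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity and specificity from binary confusion counts."""
    n = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / n,      # Eq. [1], binary case
        "sensitivity": tp / (tp + fn),  # Eq. [2]
        "specificity": tn / (tn + fp),  # Eq. [3]
    }

def interpolated_ap(pr_points):
    """11-point interpolated AP (Eq. [4]): mean over recall levels
    0.0, 0.1, ..., 1.0 of the highest precision at recall >= each level."""
    ap = 0.0
    for i in range(11):
        r = i / 10
        candidates = [p for rec, p in pr_points if rec >= r]
        ap += max(candidates, default=0.0)
    return ap / 11

m = binary_metrics(tp=9, fp=2, fn=1, tn=8)  # toy counts
print(m["accuracy"], m["sensitivity"], m["specificity"])  # 0.85 0.9 0.8
```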

Results

First, we conducted an internal four-fold cross-validation. The results showed that all the eye-positioning models achieved an AP of 100% and that the AP of the 28 tumor-positioning models exceeded 90% (Table 4). Similarly, the accuracy, sensitivity and specificity of almost all 32 tumor classification models exceeded 90%, as shown in Table 5.
Table 4

AP of the eye-positioning models and tumor-positioning models

Slice orientation | Sequence | AP of eye-positioning models (%) | AP of tumor-positioning models (%)
Coronal | T1-weighted | 100 | 100
Coronal | T1-weighted contrast-enhanced | 100 | 0
Coronal | T2-weighted | 100 | 100
Coronal | T2-weighted fat suppression | 100 | 100
Transverse | T1-weighted | 100 | 100
Transverse | T1-weighted contrast-enhanced | 100 | 100
Transverse | T2-weighted | 100 | 91
Transverse | T2-weighted fat suppression | 100 | 100

AP, average precision.

Table 5

Performances of the tumor classification models

Slice orientation | Sequence | Internal: accuracy (%) | Internal: sensitivity (%) | Internal: specificity (%) | External: accuracy (%) | External: sensitivity (%) | External: specificity (%)
Coronal | T1-weighted | 89.76 | 80.49 | 94.19 | 69.51 | 66.67 | 71.15
Coronal | T1-weighted contrast-enhanced | 92.74 | 91.67 | 93.75 | 60.67 | 93.33 | 44.07
Coronal | T2-weighted | 94.44 | 90.91 | 96.00 | — | — | —
Coronal | T2-weighted fat suppression | 96.00 | 90.91 | 100.00 | — | — | —
Transverse | T1-weighted | 93.01 | 90.20 | 94.57 | 69.57 | 67.44 | 71.43
Transverse | T1-weighted contrast-enhanced | 95.07 | 88.37 | 97.98 | 91.13 | 86.84 | 93.02
Transverse | T2-weighted | 94.07 | 89.19 | 96.30 | 77.12 | 53.85 | 88.61
Transverse | T2-weighted fat suppression | 93.02 | 79.07 | 100.00 | 64.71 | 86.49 | 54.88
Next, we used the validation set for external validation. Because the tumor classification models are the ones most directly related to the differential diagnosis of cavernous hemangioma and schwannoma, the external validation of these models was of primary significance. The results showed that the transverse T1-weighted contrast-enhanced sequence model reached an accuracy of 91.13%, a sensitivity of 86.84%, a specificity of 93.02%, and an AUC of 0.9535. In contrast, the remaining models had significantly reduced performance compared with the internal validation results (see Table 5 and Figure 3).
Figure 3

Performance of the tumor classification model trained by the transverse T1-weighted contrast-enhanced sequence images.


Discussion

Good performance in a real-world setting

Based on clinical experience, T1-weighted contrast-enhanced sequences can highlight the blood vessels. Progressive filling from center to periphery on enhancement is typical of cavernous hemangioma, while the enhancement pattern of schwannoma is partial and uneven (5,6) (see Figure 4). Therefore, these sequences are considered the most significant reference among all slice types in the differential diagnosis of the two studied tumor types (23,24). The tumor classification model trained on the transverse T1-weighted contrast-enhanced sequence images and tested on the external validation sets achieved high accuracy, sensitivity, and specificity in the automated differential diagnosis of cavernous hemangioma and schwannoma in a real-world setting that is completely consistent with the clinical environment.
Figure 4

Manifestations of cavernous hemangioma and schwannoma in T1-weighted contrast-enhanced sequences.

Our results showed that the performance of the tumor classification model trained by transverse T1-weighted contrast-enhanced sequence images reached an accuracy of 91.13%, a sensitivity of 86.84%, a specificity of 93.02% and an AUC of 0.9535. These results suggested that this model’s performance quality meets the primary need for clinical application and that the goal of distinguishing cavernous hemangioma from schwannoma is achievable using this type of model.

A multicenter data-set

Owing to the prominence of our ophthalmology center in China, patients from all over the country come here for treatment; thus, we were able to obtain these valuable images. In this study, we included data from more than 45 different hospitals in China to reach the current data volume. Moreover, because the equipment and operators varied among the source hospitals, the data collection techniques were diverse, which enhances the generalizability of our diagnostic model.

Applying scanned versions rather than using DICOM

In previous AI studies, researchers have typically preferred raw data (11-13,15-17,25,26), such as the DICOM format, generated directly from the imaging equipment, because the DICOM format both preserves all the original data and allows convenient collection. However, the scanned format was chosen for this study because the resultant AI framework needs to be useful for doctors in remote areas. The information technology level of hospitals in remote areas is limited, and they often lack comprehensive medical record management systems (27,28). Because most clinicians there rely on film images instead of computerized interfaces, models trained on scanned film images are more suitable for this type of situation.

Three steps to reach the final goal

In previous studies, researchers commonly input entire MRI images for training (25,26). Here, we progressively designed three different types of models to achieve the goal of distinguishing cavernous hemangioma from schwannoma. First, because the eye area occupies only a small proportion of the entire MRI image, inputting the entire image into the model directly would introduce considerable irrelevant information. To reduce the interference from such unnecessary information, we constructed an eye-positioning model that identifies the eye range within the full image; subsequent processing can then focus only on this range. Second, we established a tumor-positioning model to further narrow the scope for the final classifier and improve its precision. Third, we built a classification model to differentiate the located tumors, achieving the goal of automatically differentiating cavernous hemangioma from schwannoma.

Further subdividing the training sets instead of combining them

According to conventional wisdom, a sufficient data volume is the foundation of training current AI techniques (11-17). The most fundamental and effective way to improve the accuracy, sensitivity and specificity of a model is to augment the data in the training set. However, the MRI images for training showed remarkable variations across different weighted sequences and slice orientations. If these images were blindly combined while ignoring these variations, the resultant incompatibilities would inevitably confuse the system, and its performance would deviate from the original intention. Therefore, we divided all the images into eight groups for training based on their weighted sequences and slice orientations. The final result supported our conjecture: the performance of the transverse T1-weighted contrast-enhanced sequence model was outstanding compared with that of the other models. If all the training sets had been combined, the accuracy of this model would have been well below 91.13%.
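The subdivision described above amounts to partitioning the images by their (slice orientation, weighted sequence) pair. A small sketch with hypothetical metadata records (the field names `orientation` and `sequence` are illustrative, not from the study):

```python
from collections import defaultdict

def split_by_protocol(images):
    """Partition image records into per-(orientation, sequence) training groups."""
    groups = defaultdict(list)
    for img in images:
        groups[(img["orientation"], img["sequence"])].append(img)
    return dict(groups)

# Hypothetical records; the real study yields 2 orientations x 4 sequences = 8 groups.
records = [
    {"id": 1, "orientation": "transverse", "sequence": "T1-CE"},
    {"id": 2, "orientation": "coronal", "sequence": "T2"},
    {"id": 3, "orientation": "transverse", "sequence": "T1-CE"},
]
groups = split_by_protocol(records)
print(len(groups[("transverse", "T1-CE")]))  # -> 2
```

Each resulting group then trains its own set of four cross-validation models, rather than one model seeing incompatible sequences mixed together.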

Web-based automatic diagnostic system

Early in our research, our team built a cloud platform for congenital cataract diagnosis (29); we will implement the models in this study on that platform at the appropriate time. In China, an objective technological gap exists between urban and rural areas, and this imbalance is particularly evident with regard to medical resources (30-32). The establishment of this AI cloud platform for disease diagnosis is an economical and practical approach to alleviate the problem of the uneven distribution of medical resources.

Proper algorithms

Localization method

Faster-RCNN is a widely used algorithm for addressing positioning problems because of its practicability and efficiency. Evolving from RCNN and Fast-RCNN (33), Faster-RCNN generates region proposals quickly by using an anchor mechanism rather than a superpixel segmentation algorithm. Through two-stage training, the bounding-box regressor and classifier are refined. In the first stage, Faster-RCNN generates region proposals, judges the authenticity of the proposals, and regresses coarse coordinates for each object. In the second stage, the class of each object is evaluated, and each object is regressed again to obtain its final coordinates. We adopted a pretrained Zeiler and Fergus (ZF) network (34) to reduce the training time.
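The anchor mechanism mentioned above can be illustrated with a short sketch: at each feature-map location, a fixed grid of candidate boxes of several scales and aspect ratios is generated, and the network only has to score and refine them. The scales and ratios below mirror Faster-RCNN's common defaults but are assumptions for illustration:

```python
def anchors_at(cx, cy, base=16, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Generate len(scales) * len(ratios) anchor boxes centred at (cx, cy).

    Each anchor keeps the area (base * scale)^2 while its height/width
    ratio is set by `ratios`, mimicking Faster-RCNN's anchor grid."""
    boxes = []
    for s in scales:
        area = float(base * s) ** 2
        for r in ratios:
            w = (area / r) ** 0.5
            h = w * r
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

print(len(anchors_at(100, 100)))  # -> 9
```

Because the candidate boxes come from this fixed grid rather than from superpixel grouping, proposal generation is a cheap, fully learned step.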

Convolutional neural network (CNN)

The CNN is the most popular AI model used in medicine. In this study, we adopted ResNet, a deep CNN architecture that includes numerous cross-layer (shortcut) connections and is suitable for coarse classification tasks. Transforming the objective function to fit the residual function yields a significant increase in efficacy, and we adopted a LogSoftMax loss function with class weights. The ResNet selected for this study has 101 layers, a depth sufficient to address the classification problems (20).
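The cross-layer connection is the defining feature of ResNet: each block learns a residual F(x) that is added back onto its input, so the block outputs y = F(x) + x. A minimal numeric sketch (a toy residual function on a plain feature vector, not an actual network layer):

```python
def residual_block(x, residual_fn):
    """y = F(x) + x: apply the learned residual and add the identity shortcut."""
    return [fx + xi for fx, xi in zip(residual_fn(x), x)]

# Toy residual function that scales the features by 0.1:
features = [1.0, 2.0, 3.0]
out = residual_block(features, lambda v: [0.1 * vi for vi in v])
```

If the residual function outputs zeros, the block reduces to the identity, which is what makes very deep stacks such as ResNet-101 trainable.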

Limitations of our study

The most important deficiency of this study is that we simply chose the model that achieved good efficacy rather than also considering the other models. Although the model trained on the group containing the transverse T1-weighted contrast-enhanced sequence images achieved particularly remarkable performance and is already sufficiently robust to help doctors in clinical work, the other seven groups may also contain useful information for feature extraction. Thus, the diagnostic efficiency of the model should be improvable to some extent if we made rational use of the other seven groups of data. Such an approach would require multimodal machine learning (35-37), because MRI images with different weighted sequences should be processed as separate modes. Upon alignment, the models could be integrated under the joint-representation principle. Our team will continue to investigate this aspect of the problem in future studies.

Conclusions

The findings of our retrospective study show that the designed AI framework, tested on external validation sets, can achieve high accuracy, sensitivity, and specificity in the automated differential diagnosis of cavernous hemangioma and schwannoma in real-world settings, which will contribute to the selection of appropriate treatments. Although an accuracy of over 90% was partially achieved with the current data volume, AI algorithms can never have too much data. Thus, we plan to continue collecting additional cases to optimize the model by cooperating with hospitals in Shanghai to collect data from the eastern part of China, thereby supplementing our training set and enhancing model generalizability. Furthermore, at the appropriate time, we will design a web-based automatic diagnostic system to help solve the problem of obtaining advanced medical care in remote areas. In terms of algorithms, we will first investigate multimodal machine learning to take full advantage of these invaluable data. Overall, the results show that further investigation of AI approaches is clearly a worthwhile effort that should be tested in prospective clinical trials.
References (10 of 29 shown)

1.  Optic Nerve Meningioma Mimicking Cavernous Hemangioma.

Authors:  Alexia Savignac; Augustin Lecler
Journal:  World Neurosurg       Date:  2017-11-28       Impact factor: 2.104

Review 2.  Orbital schwannoma and neurofibroma: role of imaging.

Authors:  Rashmi Kapur; Mahmood F Mafee; Reema Lamba; Deepak P Edward
Journal:  Neuroimaging Clin N Am       Date:  2005-02       Impact factor: 2.264

3.  Classification and incidence of space-occupying lesions of the orbit. A survey of 645 biopsies.

Authors:  J A Shields; B Bakewell; J J Augsburger; J C Flanagan
Journal:  Arch Ophthalmol       Date:  1984-11

4.  Dermatologist-level classification of skin cancer with deep neural networks.

Authors:  Andre Esteva; Brett Kuprel; Roberto A Novoa; Justin Ko; Susan M Swetter; Helen M Blau; Sebastian Thrun
Journal:  Nature       Date:  2017-01-25       Impact factor: 49.962

Review 5.  Breast cancer in China.

Authors:  Lei Fan; Kathrin Strasser-Weippl; Jun-Jie Li; Jessica St Louis; Dianne M Finkelstein; Ke-Da Yu; Wan-Qing Chen; Zhi-Ming Shao; Paul E Goss
Journal:  Lancet Oncol       Date:  2014-06       Impact factor: 41.316

6.  China's human resources for health: quantity, quality, and distribution.

Authors:  Sudhir Anand; Victoria Y Fan; Junhua Zhang; Lingling Zhang; Yang Ke; Zhe Dong; Lincoln C Chen
Journal:  Lancet       Date:  2008-10-17       Impact factor: 79.321

Review 7.  Radiological Analysis of Orbital Cavernous Hemangiomas: A Review and Comparison Between Computed Tomography and Magnetic Resonance Imaging.

Authors:  Stephanie Ming Young; Yoon-Duck Kim; Jung Hye Lee; Kyung In Woo
Journal:  J Craniofac Surg       Date:  2018-05       Impact factor: 1.046

8.  Survey of 1264 patients with orbital tumors and simulating lesions: The 2002 Montgomery Lecture, part 1.

Authors:  Jerry A Shields; Carol L Shields; Richard Scartozzi
Journal:  Ophthalmology       Date:  2004-05       Impact factor: 12.079

9.  Deep Learning with Convolutional Neural Network for Differentiation of Liver Masses at Dynamic Contrast-enhanced CT: A Preliminary Study.

Authors:  Koichiro Yasaka; Hiroyuki Akai; Osamu Abe; Shigeru Kiryu
Journal:  Radiology       Date:  2017-10-23       Impact factor: 11.105

10.  Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning.

Authors:  Nicolas Coudray; Paolo Santiago Ocampo; Theodore Sakellaropoulos; Navneet Narula; Matija Snuderl; David Fenyö; Andre L Moreira; Narges Razavian; Aristotelis Tsirigos
Journal:  Nat Med       Date:  2018-09-17       Impact factor: 53.440

