
Radiomic Detection of Malignancy within Thyroid Nodules Using Ultrasonography-A Systematic Review and Meta-Analysis.

Eoin F Cleere1,2, Matthew G Davey1, Shane O'Neill3, Mel Corbett2, John P O'Donnell4, Sean Hacking5, Ivan J Keogh2, Aoife J Lowery1,3, Michael J Kerin1,3.   

Abstract

BACKGROUND: Despite investigation, 95% of thyroid nodules are ultimately benign. Radiomics is a field that uses radiological features to inform individualized patient care. We aimed to evaluate the diagnostic utility of radiomics in classifying undetermined thyroid nodules into benign and malignant using ultrasonography (US).
METHODS: A diagnostic test accuracy systematic review and meta-analysis was performed in accordance with PRISMA guidelines. Sensitivity, specificity, and area under curve (AUC) delineating benign and malignant lesions were recorded.
RESULTS: Seventy-five studies including 26,373 patients and 46,175 thyroid nodules met inclusion criteria. Males accounted for 24.6% of patients, while 75.4% of patients were female. Radiomics provided a pooled sensitivity of 0.87 (95% CI: 0.86-0.87) and a pooled specificity of 0.84 (95% CI: 0.84-0.85) for characterizing benign and malignant lesions. Using convolutional neural network (CNN) methods, pooled sensitivity was 0.85 (95% CI: 0.84-0.86) and pooled specificity was 0.82 (95% CI: 0.82-0.83), both significantly lower than in studies using non-CNN methods: sensitivity 0.90 (95% CI: 0.89-0.90) and specificity 0.88 (95% CI: 0.87-0.89) (p < 0.05). The diagnostic ability of radiologists and radiomics was comparable for both sensitivity (OR 0.98) and specificity (OR 0.95).
CONCLUSIONS: Radiomic analysis using US provides a reproducible, reliable evaluation of undetermined thyroid nodules when compared to current best practice.

Keywords:  personalized medicine; radiogenomics; radiomics; thyroid nodules; ultrasound

Year:  2022        PMID: 35453841      PMCID: PMC9027085          DOI: 10.3390/diagnostics12040794

Source DB:  PubMed          Journal:  Diagnostics (Basel)        ISSN: 2075-4418


1. Introduction

Thyroid nodules occur commonly within the general population, with studies suggesting a prevalence of 20–67% and an increased propensity in females and the elderly [1,2]. Increased access to healthcare and availability of modern imaging techniques such as ultrasonography (US) have led to the markedly increased detection of thyroid nodules [3]. The American Thyroid Association (ATA), British Thyroid Association (BTA), and European Society for Medical Oncology (ESMO) guidelines recommend US as the primary imaging modality for the assessment of thyroid nodules [4,5,6]. Several classification systems (e.g., ATA, BTA, and Thyroid Imaging Reporting and Data System (TIRADS)) are utilized by radiologists to stratify the risk of malignancy for each thyroid nodule based on US features [4,5,7]. These systems classify lesions on a scale ranging from benign to malignant based on sonographic parameters such as size, echogenicity, degree of margin irregularity, nodule height-to-width ratio, extranodular extension, and calcification [4,5,7]. Suspicious nodules then proceed to fine-needle aspiration cytology (FNAC), where they are reassessed and graded as non-diagnostic, benign, atypical, suspicious for malignancy, or definitively malignant, using cytological reporting systems such as the Bethesda or Thy classification systems [8,9]. At present, patients with undetermined nodules following FNAC undergo surgery in order to obtain a definitive histological diagnosis, with 95% of nodules subsequently being stratified as “benign” [10]. This suggests that the present paradigm exposes patients to unnecessary investigation and overtreatment. Thus, it is vital for translational research efforts to focus on means of accurately screening and substratifying detected thyroid nodules into benign and malignant categories [11].
Precision medicine builds on the mantra that every patient, cancer, and disease process possesses its own characteristics with individualized diagnoses, prognoses, and management strategies. The clinical application of artificial intelligence (AI) has advanced the field of precision medicine through the exploration of hypotheses in large data sets [12]. Radiomics (or radiogenomics) is a rapidly evolving field that uses AI to extract vast quantities of data from medical imaging [13]. It represents a quantitative approach to medical imaging through mathematical extraction of the spatial distribution of signal intensities and pixel interrelationships, quantifying textural information by using AI analysis methods. Various radiomic methods exist at present, including radiomic AI, machine learning (ML), convolutional neural networks (CNN), and other deep-learning techniques. The radiomic process involves numerous steps incorporating image acquisition, image segmentation, quantitative feature extraction, computational analysis, and finally, computational modeling [14]. Through this use of vast amounts of data, radiomics provides a quick, reproducible, and objective analysis that can inform individualized diagnostics, substratification, prognostication, and future management of disease [13,14]. Due to the ever-increasing number of thyroid nodules detected, there is significant interest within the literature in developing novel strategies to inform diagnostics within the clinical workup of thyroid nodules [15]. Current data suggest that radiomic imaging analysis may be capable of accurately stratifying thyroid lesions into benign and malignant based on data captured using sonographic imaging. Accordingly, the aim of the present study was to determine whether radiomic imaging analysis can provide an accurate evaluation of thyroid nodules undergoing diagnostic US evaluation.
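To make the idea of quantifying "pixel interrelationships" more concrete, the following sketch computes a gray-level co-occurrence matrix (GLCM) and a contrast value for a tiny image. The GLCM is one common texture descriptor and is used here purely as an illustration; the included studies used a variety of feature sets. The function names `glcm` and `contrast` and the toy image are assumptions made for this example and do not come from any included study.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix: p[i][j] is the share of pixel pairs
    where gray level i has gray level j at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """GLCM contrast: large when neighbouring pixels differ strongly."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Hypothetical 4x4 image with 4 gray levels (0-3); not real ultrasound data.
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(toy_image, levels=4)
print(round(contrast(p), 4))
```

In a real pipeline such features would be computed on the segmented nodule region and passed to a classifier.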

2. Materials and Methods

A systematic review was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [16] and in accordance with the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy [17]. Local institutional ethical approval was not required.

2.1. Population, Intervention, Comparison, Outcomes (PICO)

Population: Patients who have undergone preoperative US and definitive thyroid nodule diagnosis as benign or malignant. Intervention: Radiomic analyses applied to preoperative US to determine whether thyroid nodules are benign or malignant. Comparison: The discriminative ability of radiomics compared to confirmation of benign and malignant nodules. Nodules were determined as benign by either cytological or histological means, while malignancy was confirmed by histological analysis only. Outcomes: The primary outcome was the clinical utility of preoperative US imaging in stratifying thyroid nodules as either benign or malignant, represented by pooled sensitivity, specificity, and receiver operating characteristic (ROC) curve analyses. Secondary outcomes were comparisons of the ability of different radiomic methods to differentiate such nodules and of radiologists versus radiomics in correctly discriminating benign from malignant thyroid nodules.

2.2. Search Strategy

An electronic search of the PubMed Medline, EMBASE, and Scopus databases was performed on 16 January 2021 for relevant studies. This search was performed using the following headings: (Thyroid Cancer) and (Radiomics), linked using the Boolean operator “AND”. Included studies were limited to those published in the English language and were not restricted based on the year of publication. All duplicate studies were manually removed before titles were screened, and studies deemed appropriate had their abstracts reviewed. Studies remaining had their full texts reviewed for eligibility.

2.3. Inclusion and Exclusion Criteria

Studies meeting the following inclusion criteria were included: (1) studies with thyroid nodules confirmed as benign or malignant following US imaging; (2) studies in which imaging of tumors had been performed pre-diagnosis; (3) studies that either stated numbers of true positives, true negatives, false positives, and false negatives, or sensitivity, specificity, or accuracy data in relation to radiomic tests, or that allowed these figures to be calculated from study data. In some cases, sensitivity and specificity were calculated from ROC curve analyses. Studies comparing the diagnostic ability of radiologists with radiomics were also included. Studies meeting any of the following exclusion criteria were excluded from this study: (1) studies not providing radiomic validation or “test” data, (2) studies outlining the diagnostic ability of radiomics differentiating benign and malignant lesions in other cancers (e.g., breast carcinoma, skin cancers, etc.), (3) studies with no full English text, (4) review articles, (5) studies including fewer than five patients in their series or case reports, and (6) editorial articles.
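As an illustration of how the figures named in inclusion criterion (3) relate to one another, the sketch below derives sensitivity, specificity, and accuracy from a 2×2 table of true/false positives and negatives. The function name `diagnostic_metrics` and the counts are hypothetical and chosen only for the example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Derive standard accuracy measures from 2x2 confusion-matrix counts.

    tp/fn: malignant nodules called malignant/benign by the test.
    tn/fp: benign nodules called benign/malignant by the test.
    """
    return {
        "sensitivity": tp / (tp + fn),            # share of malignant nodules detected
        "specificity": tn / (tn + fp),            # share of benign nodules correctly cleared
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical study: 100 malignant and 100 benign nodules.
metrics = diagnostic_metrics(tp=87, fp=16, fn=13, tn=84)
print(metrics)
```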

2.4. Data Extraction and Quality Assessment

This literature search was performed by two independent reviewers (E.F.C. and S.O.) using the aforementioned search strategy. Where discrepancies in opinion occurred between the reviewers, a third reviewer was asked to arbitrate (M.G.D.). As described, duplicate studies were removed. Both reviewers reviewed all retrieved manuscripts to ensure all inclusion criteria were met before extracting the following data: (1) first author name, (2) year of publication, (3) study design, (4) country, (5) level of evidence, (6) study title, (7) number of patients, (8) number of benign and malignant nodules confirmed through cytologic or histopathologic analysis, (9) sensitivity, specificity, and area under curve (AUC) scores from the ROC curve analyses obtained from radiomic “test” data, and (10) sensitivity, specificity, and AUC scores from the ROC analyses from radiologists within studies where available. Sensitivity and specificity were directly extracted from tables and study text. When not provided as discrete data in tables or the text, specific estimates of sensitivity and specificity were calculated from ROC curves with the most accurate and appropriate sensitivity prioritized. Where studies tested the diagnostic ability of multiple radiomic methods (i.e., CNN, ML, etc.), only data for the best performing radiomic method within that study were extracted. Similarly, where studies detailed data on multiple radiologists’ ability to discriminate benign versus malignant nodules, data from the best performing radiologist from that particular study were included. Appraisal of the quality of each study was performed using the radiomics quality score (RQS), as outlined previously by Lambin et al. [18].

2.5. Statistical Analysis

Statistical analysis was performed according to the Cochrane guidelines. Pooled sensitivity and specificity and summary ROC analysis were calculated for included studies to convey the diagnostic test performance of radiomics in differentiating malignant thyroid nodules from benign thyroid nodules. We then performed a comparison between studies using CNNs (incorporating both CNNs and other deep learning methods) versus those using either ML or Radiomic AI analyses (together termed non-CNNs). For comparing radiologist and radiomic diagnostic test accuracy, sensitivity and specificity data were expressed as dichotomous data and reported as odds ratios (ORs) with 95% confidence intervals (CIs) following estimation using the Mantel–Haenszel method with random effects. The symmetry of funnel plots was used to assess publication bias. Statistical heterogeneity was determined using I² statistics. Statistical significance was determined to be p < 0.05. Statistical analysis was performed using Review Manager (RevMan), Version 5.4 (Nordic Cochrane Centre, Copenhagen, Denmark).
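For readers unfamiliar with the heterogeneity statistic mentioned above, the sketch below shows the usual textbook definition of I² from Cochran's Q under inverse-variance weighting. It is not RevMan's code; the function name `i_squared` and the effect sizes and variances are hypothetical.

```python
def i_squared(effects, variances):
    """Return (pooled effect, Cochran's Q, I^2 in percent).

    effects:   per-study effect estimates (e.g., log odds ratios)
    variances: their within-study variances
    """
    weights = [1.0 / v for v in variances]          # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2 is the share of total variation attributed to between-study
    # heterogeneity; it is truncated at zero when Q <= df.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical log odds ratios from three studies with equal variances.
pooled, q, i2 = i_squared([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(round(pooled, 3), round(q, 3), round(i2, 1))
```

Values of I² near 0% suggest that the studies agree apart from sampling error, while high values suggest substantial between-study heterogeneity.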

3. Results

3.1. Literature Search

The initial search of PubMed, Scopus, and EMBASE resulted in a total of 537 studies identified. Following the removal of duplicates, 488 studies remained. These studies were then screened by title and abstract for relevance, after which 119 studies remained, all of which had their full text analyzed for eligibility. Finally, 75 studies remained for inclusion in the analysis, as depicted in Figure 1 [19–93].
Figure 1

PRISMA flow diagram detailing the systematic search process.

3.2. Study Characteristics

Overall, 75 studies arising from 15 different countries met inclusion and exclusion criteria [19–93]. Eight studies were prospective in nature [25,31,38,55,57,81,83,84], while the remaining 67 studies were retrospective. Of the included studies, 46 used convolutional neural networks (CNN) to analyze thyroid nodule US images [19,25,26,29,30,33,36–38,40–45,47,48,50–55,57,59,63–65,67–71,73,74,77,79,81–84,89–93], and 29 studies used non-CNN methods [20–24,27,28,31,32,34,35,39,46,49,56,58,60–62,66,72,75,76,78,80,85–88] (Table 1).
Table 1

Study characteristics and demographics.

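Author | Year | Study Type (LOE) | Radiomics | Country | US Device Brand | N Patients | Male | Female | Mean Age (years)
Zhou | 2020 | RC (III) | CNN | China | Esaote/Philips | 105 | 25 | 80 | 47.9
Nguyen | 2019 | RC (III) | CNN | Korea | NS | 61 | NS | NS | NS
Wei | 2020 | RC (III) | CNN | China | NS | 2489 | 614 | 1875 | 45.3
Park | 2019 | PC (II) | CNN | Korea | Samsung | 265 | 52 | 213 | 47.1
Thomas | 2020 | RC (III) | CNN | USA | 4 brands | 103 | NS | NS | NS
Wei (2) | 2020 | RC (III) | Non-CNN | China | 5 brands | NS | NS | NS | 47
Liu | 2019 | RC (III) | CNN | China | Vinno | 131 | 54 | 77 | 46.7
Stib | 2020 | RC (III) | CNN | USA | Siemens/GE/Philips | 571 | 234 | 337 | 52.9
Ye | 2020 | RC (III) | CNN | China | 5 brands | 166 | 46 | 100 | 44.6
Ma | 2020 | RC (III) | CNN | China | NS | 211 | 34 | 177 | NS
Koh | 2020 | RC (III) | CNN | Korea | 11 brands | 200 | 49 | 151 | 49.6
Kwon | 2020 | RC (III) | CNN | Korea | Philips/Hitachi | 762 | NS | NS | NS
Kim | 2019 | RC (III) | Non-CNN | Korea | Samsung | 106 | 29 | 77 | 48
Zhao | 2020 | RC (III) | Non-CNN | China | Philips/Hitachi | 174 | 44 | 130 | 45
Qin | 2019 | RC (III) | CNN | China | NS | 233 | NS | NS | NS
Zhu | 2021 | RC (III) | CNN | China | 4 brands | 102 | 0 | 102 | 54.8
Liu (2) | 2019 | RC (III) | CNN | China | GE | 376 | NS | NS | NS
Xia | 2019 | PC (II) | CNN | China | Samsung | 171 | 32 | 139 | 47.2
Zhao | 2021 | RC (III) | Non-CNN | China | SuperSonic | 102 | 25 | 77 | 50.6
Lee | 2019 | RC (III) | CNN | Korea | Philips/Hitachi | 519 | 93 | 426 | 47.5
Ataide | 2020 | RC (III) | Non-CNN | Germany | NS | 99 | NS | NS | NS
Chen | 2020 | RC (III) | Non-CNN | China | GE/Hitachi | 1480 | 302 | 1178 | 45.6
Zhu (2) | 2021 | RC (III) | CNN | China | Philips/GE/Toshiba | 261 | 64 | 197 | 52
Shi | 2020 | RC (III) | CNN | China | Esaote/Hitachi/Toshiba | NS | NS | NS | NS
Barczyński | 2020 | PC (II) | CNN | Poland | Samsung | 50 | 9 | 41 | 47.5
Zhang | 2020 | RC (III) | Non-CNN | Korea | Siemens | 303 | 59 | 244 | 46.4
Wei (3) | 2020 | RC (III) | Non-CNN | China | Samsung | 181 | 35 | 146 | 46
Colakoglu | 2019 | RC (III) | Non-CNN | Turkey | GE | 198 | 48 | 150 | 44.5
Park | 2021 | RC (III) | Non-CNN | Korea | Philips | 325 | 61 | 264 | 50.1
Nguyen | 2020 | RC (III) | CNN | Korea | NS | 61 | NS | NS | NS
Sun | 2020 | RC (III) | CNN | China | GE | 338 | 134 | 416 | 43.8
Peng | 2021 | PC (II) | CNN | China | 13 brands | 2775 | 726 | 2049 | 42.2
Liu | 2021 | RC (III) | CNN | China | Siemens | 163 | 48 | 115 | 44.3
Han | 2021 | RC (III) | CNN | Korea | Samsung | 372 | NS | NS | NS
Shin | 2020 | RC (III) | CNN | Korea | Samsung | 340 | 79 | 261 | 47.2
Wang | 2019 | RC (III) | CNN | China | GE/Philips | 276 | 53 | 223 | 46.3
Zhu | 2019 | RC (III) | CNN | China | 4 brands | 467 | 97 | 370 | 45.3
Zhang (2) | 2019 | RC (III) | Non-CNN | China | Hitachi | 2032 | 695 | 1337 | 42.3
Ko | 2019 | RC (III) | CNN | Korea | Philips/Hitachi | 150 | 23 | 127 | 49.7
Song | 2019 | RC (III) | CNN | Korea | Toshiba | 100 | NS | NS | NS
Li | 2019 | RC (III) | CNN | China | Philips/Toshiba/GE | 154 | 34 | 120 | 51
Wildman-Tobriner | 2019 | RC (III) | Non-CNN | UK | Siemens/GE/Philips | 94 | 21 | 73 | 52.6
Yu | 2017 | PC (II) | CNN | China | Philips/Siemens | 50 | 9 | 41 | 48.4
Buda | 2019 | RC (III) | CNN | USA | Siemens/GE/Philips | 91 | NS | NS | 52.3
Raghavendra | 2018 | RC (III) | Non-CNN | India | 4 brands | 344 | NS | NS | 44.1
Li | 2018 | RC (III) | CNN | China | NS | 300 | 53 | 247 | NS
Ma | 2017 | RC (III) | CNN | China | 7 brands | 4782 | NS | NS | 52
Raghavendra | 2017 | RC (III) | Non-CNN | India | GE | 242 | 63 | 179 | 44.1
Zhu | 2013 | RC (III) | CNN | China | Siemens | 618 | 161 | 528 | 47.7
Choi | 2017 | PC (II) | Non-CNN | Korea | Samsung | 89 | 18 | 71 | 43.5
Gao | 2017 | RC (III) | CNN | China | Philips/GE | 342 | 70 | 272 | 44.8
Choi | 2015 | RC (III) | CNN | Korea | Philips | 85 | 24 | 61 | 52
Chi | 2017 | RC (III) | CNN | Canada | Toshiba | 61 | NS | NS | NS
Jeong | 2019 | PC (II) | CNN | Korea | Samsung | 76 | NS | NS | NS
Acharya | 2012 | RC (III) | Non-CNN | Singapore | NS | 20 | 10 | 10 | NS
Liang | 2018 | RC (III) | Non-CNN | China | Philips | 95 | 20 | 75 | 43.2
Prochazka (2) | 2019 | RC (III) | Non-CNN | Czechia | Philips/GE | 60 | 11 | 49 | 55.7
Song | 2015 | RC (III) | Non-CNN | China | GE | 147 | 32 | 115 | NS
Ardakani | 2015 | RC (III) | Non-CNN | Iran | Medison | 60 | NS | NS | NS
Guan | 2019 | RC (III) | CNN | China | NS | 399 | NS | NS | NS
Xia | 2017 | RC (III) | Non-CNN | China | Siemens | 187 | 36 | 151 | 50.8
Yoo | 2018 | PC (II) | CNN | Korea | Samsung | 50 | 10 | 40 | 43.2
Tsantis | 2009 | RC (III) | Non-CNN | Greece | Philips | 85 | NS | NS | NS
Liu | 2008 | RC (III) | Non-CNN | USA | NS | 37 | NS | NS | NS
Acharya | 2013 | RC (III) | Non-CNN | Italy | Esaote | 20 | 10 | 10 | 52.8
Acharya (2) | 2012 | RC (III) | Non-CNN | Italy | Esaote | 20 | 10 | 10 | 52.8
Acharya | 2011 | RC (III) | Non-CNN | Italy | Esaote | 20 | 10 | 10 | 52.8
Kweon Seo | 2017 | RC (III) | CNN | Korea | NS | 230 | 51 | 179 | 48.7
Ardakani (2) | 2015 | RC (III) | CNN | Iran | Medison | 60 | NS | NS | NS
Wu | 2016 | RC (III) | CNN | China | Philips | 970 | 214 | 756 | 46.7
Cao | 2019 | RC (III) | Non-CNN | China | NS | 120 | NS | NS | NS
Wang (2) | 2020 | RC (III) | CNN | China | NS | 1040 | NS | NS | NS
Sun (2) | 2020 | RC (III) | CNN | China | NS | 245 | NS | NS | NS
Reverter | 2019 | RC (III) | Non-CNN | Spain | GE | 300 | 45 | 255 | 55.5
Gitto | 2019 | RC (III) | Non-CNN | Italy | Samsung | 62 | 12 | 50 | 60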

NS: not specified, LOE: level of evidence, RC: retrospective cohort, PC: prospective cohort, CNN: convolutional neural network, non-CNN: analysis performed using a method other than a convolutional neural network, GE: General Electric.

3.3. Clinicopathological Characteristics

Overall, there were 28,373 patients with 46,175 thyroid nodules included from the 75 studies. Males accounted for 24.6% of patients, while 75.4% of patients were female. There were 51 studies reporting mean patient age; within these studies, mean patient age was 48.3 years (range: 42.2–69.0 years) (Table 1). Overall, 22,814 (49.4%) nodules were benign, while 23,361 (50.6%) were malignant. Of the included studies, 35 reported mean nodule size; mean nodule size in these studies was 19.7 mm (range: 8.3–31.7 mm). A breakdown of malignant nodules by subtype was provided by 34 studies. Papillary thyroid carcinoma (PTC) was the most prevalent subtype of malignant thyroid nodule within these studies, representing 94.7% of malignant thyroid nodules (Table 2).
Table 2

Study characteristics and demographics.

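Author | Year | N Nodules | Mean Nodule Size (mm) | N Benign Nodules | N Malignant Nodules | Papillary Ca | Follicular Ca | Medullary Ca | Other Thyroid Ca
Zhou | 2020 | 105 | NS | 75 | 30 | NS | NS | NS | NS
Nguyen | 2019 | 61 | NS | 11 | 50 | NS | NS | NS | NS
Wei | 2020 | 2489 | NS | 1021 | 1468 | 1442 | 11 | 15 | 0
Park | 2019 | 286 | 16.2 | 130 | 156 | 149 | 6 | 1 | 0
Thomas | 2020 | 103 | NS | 70 | 33 | 24 | 3 | 2 | 4
Wei (2) | 2020 | 7560 | NS | 3063 | 4587 | NS | NS | NS | NS
Liu | 2019 | 131 | 16.1 | 59 | 72 | 72 | 0 | 0 | 0
Stib | 2020 | 651 | NS | 500 | 151 | NS | NS | NS | NS
Ye | 2020 | 209 | NS | 109 | 100 | NS | NS | NS | NS
Ma | 2020 | 846 | NS | 360 | 486 | NS | NS | NS | NS
Koh | 2020 | 200 | 22.4 | 102 | 98 | 97 | 0 | 0 | 1
Kwon | 2020 | 762 | NS | 325 | 437 | 437 | 0 | 0 | 0
Kim | 2019 | 218 | 12 | 132 | 86 | 86 | 0 | 0 | 0
Zhao | 2020 | 177 | 21.8 | 96 | 81 | 81 | 0 | 0 | 0
Qin | 2019 | 248 | NS | 115 | 133 | NS | NS | NS | NS
Zhu | 2021 | NS | NS | 57 | 45 | NS | NS | NS | NS
Liu (2) | 2019 | 450 | NS | 128 | 322 | NS | NS | NS | NS
Xia | 2019 | 180 | 10.3 | 85 | 95 | 91 | 4 | 0 | 0
Zhao | 2021 | 106 | 17.3 | 73 | 33 | NS | NS | NS | NS
Lee | 2019 | 589 | 12.9 | 193 | 396 | 395 | 1 | 0 | 0
Ataide | 2020 | 99 | NS | 17 | 82 | NS | NS | NS | NS
Chen | 2020 | 1558 | NS | 347 | 1211 | NS | NS | NS | NS
Zhu (2) | 2021 | 1032 | NS | 502 | 530 | NS | NS | NS | NS
Shi | 2020 | 1937 | NS | 1032 | 905 | NS | NS | NS | NS
Barczyński | 2020 | NS | 30.5 | 40 | 10 | 10 | 0 | 0 | 0
Zhang | 2020 | 365 | 18.3 | 179 | 186 | 168 | 11 | 7 | 0
Wei (3) | 2020 | 204 | 15 | 112 | 92 | 90 | 1 | 0 | 1
Colakoglu | 2019 | 235 | NS | 133 | 102 | 102 | 0 | 0 | 0
Park | 2021 | 325 | 21 | 257 | 68 | NS | NS | NS | NS
Nguyen | 2020 | NS | NS | 11 | 50 | NS | NS | NS | NS
Sun | 2020 | 550 | 14 | 128 | 422 | NS | NS | NS | NS
Peng | 2021 | 2775 | NS | 2472 | 303 | 299 | 4 | 0 | 0
Liu | 2021 | 175 | 11.9 | 67 | 108 | 103 | 5 | 0 | 0
Han | 2021 | 454 | 17.8 | 287 | 167 | 161 | 4 | 2 | 0
Shin | 2020 | 348 | 31 | 252 | 96 | 0 | 96 | 0 | 0
Wang | 2019 | NS | 18.5 | 95 | 181 | NS | NS | NS | NS
Zhu | 2019 | 467 | 8.3 | 128 | 339 | NS | NS | NS | NS
Zhang (2) | 2019 | 2064 | NS | 1314 | 750 | NS | NS | NS | NS
Ko | 2019 | 150 | 12.9 | 50 | 100 | NS | NS | NS | NS
Song | 2019 | 100 | NS | 50 | 50 | NS | NS | NS | NS
Li | 2019 | 154 | NS | 70 | 84 | NS | NS | NS | NS
Wildman-Tobriner | 2019 | 100 | 27.1 | 85 | 15 | NS | NS | NS | NS
Yu | 2017 | 50 | NS | 33 | 17 | 16 | 0 | 1 | 0
Buda | 2019 | 99 | 27 | 84 | 15 | NS | NS | NS | NS
Raghavendra | 2018 | 344 | NS | 288 | 56 | NS | NS | NS | NS
Li | 2018 | NS | NS | 50 | 250 | 250 | 0 | 0 | 0
Ma | 2017 | 8148 | 25 | 4126 | 4022 | NS | NS | NS | NS
Raghavendra | 2017 | 242 | NS | 211 | 31 | NS | NS | NS | NS
Zhu | 2013 | 689 | 13.3 | 265 | 465 | NS | NS | NS | NS
Choi | 2017 | 102 | 12 | 59 | 43 | 43 | 0 | 0 | 0
Gao | 2017 | 342 | 12.1 | 103 | 239 | NS | NS | NS | NS
Choi | 2015 | 99 | NS | 21 | 78 | 77 | 1 | 0 | 9
Chi | 2017 | NS | NS | 11 | 50 | NS | NS | NS | NS
Jeong | 2019 | 100 | 17 | 56 | 44 | 43 | 1 | 0 | 0
Acharya | 2012 | 20 | NS | 10 | 10 | 7 | 1 | 0 | 2
Liang | 2018 | 95 | 16 | 43 | 52 | 51 | 1 | 0 | 0
Prochazka (2) | 2019 | 60 | NS | 40 | 20 | NS | NS | NS | NS
Song | 2015 | 155 | NS | 76 | 79 | NS | NS | NS | NS
Ardakani | 2015 | 60 | NS | 26 | 34 | NS | NS | NS | NS
Guan | 2019 | 399 | NS | 190 | 209 | 209 | 0 | 0 | 0
Xia | 2017 | 203 | 24.8 | 114 | 89 | NS | NS | NS | NS
Yoo | 2018 | 117 | 15 | 67 | 50 | 50 | 0 | 0 | 0
Tsantis | 2009 | 85 | NS | 54 | 31 | NS | NS | NS | NS
Liu | 2008 | 41 | NS | 21 | 20 | 18 | 0 | 0 | 2
Acharya | 2013 | 20 | 31.7 | 10 | 10 | 7 | 1 | 0 | 2
Acharya (2) | 2012 | 20 | 31.7 | 10 | 10 | 7 | 1 | 0 | 2
Acharya | 2011 | 20 | 31.7 | 10 | 10 | 7 | 1 | 0 | 2
Kweon Seo | 2017 | 230 | 29.4 | 191 | 39 | 0 | 39 | 0 | 0
Ardakani (2) | 2015 | 60 | NS | 26 | 34 | NS | NS | NS | NS
Wu | 2016 | 970 | NS | 463 | 507 | 487 | 12 | 4 | 4
Cao | 2019 | 120 | NS | 73 | 47 | NS | NS | NS | NS
Wang (2) | 2020 | 3120 | NS | 1393 | 1841 | NS | NS | NS | NS
Sun (2) | 2020 | 245 | NS | 145 | 100 | NS | NS | NS | NS
Reverter | 2019 | 300 | 29.8 | 165 | 135 | 112 | 15 | 3 | 5
Gitto | 2019 | 62 | 18 | 48 | 14 | NS | NS | NS | NS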

NS: not specified, Ca: cancer.

3.4. Diagnostic Ability of Radiomics

The mean AUC calculated from independent ROC curve analyses within included studies was 0.88 (range: 0.61–1.00). Individual study sensitivity and specificity for determining malignant versus benign thyroid nodules are shown in Figure 2A. Pooled sensitivity for radiomics in distinguishing thyroid nodules was 0.87 (95% CI: 0.86–0.87). Pooled specificity for radiomics in distinguishing thyroid nodules was 0.84 (95% CI: 0.84–0.85). A combined ROC curve for radiomics of thyroid nodules by ultrasonography is shown in Figure 2B.
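The text does not spell out the pooling formula, so the sketch below shows only one simple possibility: summing true positives and diseased cases across studies and attaching a Wilson score interval. It is an assumption for illustration, not a reconstruction of the analysis performed here, and the study counts and function names (`wilson_ci`, `pooled_proportion`) are hypothetical.

```python
import math


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (95% by default)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def pooled_proportion(pairs):
    """Pool (successes, total) pairs by summing counts across studies."""
    successes = sum(s for s, _ in pairs)
    total = sum(t for _, t in pairs)
    return successes / total, wilson_ci(successes, total)


# Hypothetical per-study (true positives, malignant nodules) pairs.
studies = [(45, 50), (80, 100), (130, 150)]
estimate, (low, high) = pooled_proportion(studies)
print(round(estimate, 3), round(low, 3), round(high, 3))
```

Pooling by summed counts gives larger studies more weight and produces narrow intervals when the total number of nodules is large; bivariate or hierarchical models are common alternatives.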
Figure 2

(A) Overall sensitivity and specificity of radiomics. (B) Receiver operating characteristic (ROC) curve of malignant versus benign thyroid nodules based on radiomic analyses.

3.5. Comparison of CNN versus Non-CNN Radiomics

For studies using CNN, pooled sensitivity was 0.85 (95% CI: 0.84–0.86) and pooled specificity was 0.82 (95% CI: 0.82–0.83). Pooled sensitivity (0.90, 95% CI: 0.89–0.90) and specificity (0.88, 95% CI: 0.87–0.89) were significantly higher in studies using non-CNN radiomics (p < 0.05) (Figure 3A,B). ROC curve comparison between CNN and non-CNN methods is outlined in Figure 3C.
Figure 3

(A) Pooled sensitivity and specificity of convolutional neural network (CNN) analyses. (B) Pooled sensitivity and specificity of non-CNN analyses. (C) Receiver operating characteristic (ROC) curves for CNN analyses (black) versus non-CNN analyses (red).

3.6. Comparison of Radiomic Analysis of Thyroid Nodule US versus Radiologists' Analysis of Thyroid Nodule US

Within the studies included in the meta-analysis, 35 studies provided a comparison between radiologists and radiomics in differentiating malignant versus benign thyroid nodules using thyroid US. Radiomics demonstrated similar sensitivity for detection of malignancy within a given thyroid nodule (OR 0.98, 95% CI 0.76–1.26) when compared with radiologists (Figure 4A). Radiomics also demonstrated similar specificity (OR 0.93, 95% CI 0.72–1.20) when compared with radiologists for this purpose (Figure 4B).
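To illustrate how an odds ratio and its confidence interval are read in this kind of comparison, the sketch below computes an OR with a 95% CI for a single 2×2 table using the standard log (Woolf) method. This is a simplification: the analysis here pooled studies with the Mantel–Haenszel method under random effects, which this sketch does not implement. The counts, and the choice of radiomics as the numerator group, are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and CI for a 2x2 table.

    a, b: correct / incorrect calls in group 1 (here, radiomics)
    c, d: correct / incorrect calls in group 2 (here, radiologists)
    """
    or_value = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of log(OR)
    low = math.exp(math.log(or_value) - z * se_log)
    high = math.exp(math.log(or_value) + z * se_log)
    return or_value, low, high


# Hypothetical: of 100 malignant nodules, radiomics detects 87 and
# radiologists detect 88.
or_value, low, high = odds_ratio_ci(87, 13, 88, 12)
print(round(or_value, 2), round(low, 2), round(high, 2))
```

An interval that contains 1, as in this toy example and in the pooled estimates reported above, indicates no statistically significant difference between the two groups.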
Figure 4

(A) Sensitivity comparison between radiologists and radiomics. (B) Specificity comparison between radiologists and radiomics.

4. Discussion

To the best of our knowledge, the current systematic review and meta-analysis is the first to evaluate the diagnostic test accuracy of radiomic imaging analysis in differentiating malignant from benign thyroid nodules using US. Due to the increasing prevalence of thyroid nodules now detected within the general population and the rising incidence of thyroid malignancy (which has tripled since 1975), accurate risk stratification is paramount to the enhancement of clinical outcomes [3]. The most important finding in this analysis of over 28,000 patients with over 46,000 thyroid nodules is the data supporting the utility of radiomic analysis in correctly stratifying undetermined thyroid nodules into benign and malignant lesions (sensitivity: 0.87, specificity: 0.84). This is promising as we look to enhance diagnostics in this field of oncology while promoting minimally invasive techniques in order to reduce morbidity and mortality for prospective patients. These results are timely given the promotion of precision oncology as a rapidly evolving field, which uses individual patient, cancer, or disease process characteristics to develop personalized diagnosis, prognosis, and treatment strategies [12]. Data from this analysis support radiomic imaging analysis using US as a means of quantifying malignancy risk in thyroid nodules without exposing patients to the risks associated with invasive FNAC sampling or surgical specimen assessment. For some patients, the use of radiomics could possibly circumvent the need for FNAC and surgical resection, providing a potentially more cost- and time-efficient assessment of thyroid nodules than current practice [20,94]. Results of this analysis indicate that radiomics is a novel avenue worth exploring in the differentiation of benign and malignant thyroid lesions.
CNN provided a pooled sensitivity of 85% and specificity of 82% compared to a pooled sensitivity of 90% and pooled specificity of 88% in non-CNN. A CNN is designed to learn spatial hierarchies of features automatically and adaptively through backpropagation, using multiple building blocks: convolution layers, pooling layers, and fully connected layers [95]. CNNs have powerful pattern recognition capabilities because they can approximate any continuous function, given an appropriate network structure [96]. In neural networks, high variance gives a network the ability to learn complex patterns, although it also runs the risk of overfitting, since the model will learn peculiarities, or noise, from a data set [96]. This noise introduces features into the model that are not generalizable outside of the training set [95], so the model appears to perform well in training but fails to perform in a true clinical environment. Such overfitting in the setting of CNN has been noted in studies evaluating papillary thyroid nodules on US [44]. The margin between benign thyroidal tissue and malignant tissue may be unclear or blurry on US imaging, with significant overlap between cancerous and normal or benign regions. It is therefore challenging for the CNN model to perform accurate textural feature extraction of the malignant tissue, possibly contributing to poor model performance [44]. Ideally, a CNN should have a large training set to mitigate the risk of overfitting, but this is not always feasible due to cost, time, and other factors limiting available data [95]. Non-CNN incorporates a number of methods such as support vector machines (SVM), random forest (RF), k-nearest neighbor (k-nn), and Bayesian classifiers [97]. Each method has its own strengths and weaknesses. For example, SVM classifiers are based on decision planes that define decision boundaries.
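Two of the CNN building blocks named above, convolution and pooling, can be sketched in a few lines of plain Python. This is a toy forward pass only: there is no backpropagation, no fully connected layer, and no learned weights. The hand-picked edge kernel, the toy image, and the function names are assumptions for illustration. As in most deep-learning libraries, the "convolution" is implemented as cross-correlation.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation of an image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Elementwise rectified linear activation."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


# Toy 5x5 "image" with a vertical boundary between dark (0) and bright (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]      # responds to dark-to-bright transitions

pooled = max_pool(relu(conv2d(image, edge_kernel)))
print(pooled)
```

In a trained CNN the kernel values are learned from data rather than chosen by hand, which is where the large-training-set requirement discussed above comes from.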
SVM is often used for the principle of structural risk minimization, which allows robust analysis of test data without the need for a large training set through margin maximization [98]. Another popular ML method is RF, which consists of a large network of individual decision trees that allows for ensemble learning, providing the benefit of human-readable data and the ability to adjust the classifiers’ decision trees where appropriate [97]. Ultimately, the randomness of this model makes it robust, generalizable, and less prone to overfitting, although large numbers of decision trees make this approach more time-consuming. Within our analysis, small data sets and overfitting within individual studies may have contributed to the overall worse performance of CNN versus non-CNN. Based on the results of this meta-analysis, non-CNN radiomics should be the preferred method for evaluating the risk of malignancy in an undetermined thyroid nodule using US. For detecting malignancy within a given thyroid nodule, radiomic methods had similar sensitivity (OR 0.98, 95% CI 0.76–1.26) and specificity (OR 0.93, 95% CI 0.72–1.20) when compared with radiologists. However, the strengths maintained by radiologists compared to radiomics should be acknowledged. At present, radiomic models are dependent on high-quality image acquisition and segmentation by radiologists. Without good imaging data to analyze, the radiomic model is unable to correctly stratify nodules [99]. Radiologists also maintain the innate ability to incorporate the global context of patients and the ability to maintain subjective associations based on experience, which current radiomic models are unable to perform. Radiomics can face issues with model fitting, poor input data, and subsequent suboptimal performance [100]. However, human assessment of medical imaging and, in particular, US suffers from significant inter-observer variability [101,102].
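Returning to the RF classifiers discussed earlier in this section, the ensemble idea of many simple trees, each fitted to a random resample and combined by majority vote, can be sketched as follows. This is deliberately reduced to bagged one-split "stumps" on a single feature, so it is not a full random forest (there is no feature subsampling and no deep trees). The feature values, labels, and function names are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Find the threshold/direction on one feature with fewest training errors."""
    best = None
    for t in sorted(set(xs)):
        for sign in (1, -1):
            preds = [1 if sign * (x - t) >= 0 else 0 for x in xs]
            errors = sum(p != y for p, y in zip(preds, ys))
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    return best[1], best[2]


def predict_stump(stump, x):
    t, sign = stump
    return 1 if sign * (x - t) >= 0 else 0


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Fit each stump on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Majority vote across all stumps (1 = malignant, 0 = benign)."""
    votes = sum(predict_stump(s, x) for s in forest)
    return 1 if votes * 2 > len(forest) else 0


# Hypothetical texture-feature values with benign (0) / malignant (1) labels.
feature = [1, 2, 3, 4, 6, 7, 8, 9]
label = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(feature, label)
print(predict_forest(forest, 0.5), predict_forest(forest, 9.5))
```

Because each stump sees a slightly different resample, individual thresholds vary, and the vote averages out that variation; this is the source of the robustness attributed to RF above.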
Radiomics, on the other hand, provides the benefit of an objective, quick, and reproducible analysis with the ability to analyze features of the nodule that are both visible to the radiologist and textural features occult to human perception [13,14]. Studies have attempted to blend the strengths of both radiologists and radiomic models to form computer-assisted diagnosis (CAD) tools. While CAD was not evaluated within the confines of the current meta-analysis, CAD has been shown to be of benefit in the evaluation of thyroid nodules within the literature [39]. The present analysis is subject to a number of limitations. Firstly, radiomics involves a broad spectrum of analysis methods, ranging from radiomic AI methods to deep-learning techniques; we have included all of these under the umbrella term “radiomics” despite variance in the reproducibility of their data [103]. Secondly, the authors wish to highlight the inter-user variability of US, as image acquisition is operator dependent. Radiomic analyses are dependent on high-quality images of thyroid nodules being obtained and nodules being correctly selected by ultrasonographers. Thirdly, when extracting data, we selected the highest performing radiomic method within any given study. This may have led to over-estimation of overall sensitivity and specificity for radiomic evaluation of thyroid nodules on US as a whole. To combat this potential bias when comparing radiomics to radiologists, we selected data for the highest performing radiologist. Finally, prospective validation evaluating the utility of AI in the field of radiological diagnostics typically necessitates buy-in from large, international corporations in order to finance developing the evidence base in this field.

5. Conclusions

In conclusion, this meta-analysis of current evidence demonstrates an almost 90% reliability of radiomic imaging analyses applied to US in detecting malignancy within undetermined thyroid nodules. At present, radiomic analyses demonstrate comparable diagnostic sensitivity and specificity in identifying malignant lesions when compared to radiologists. Within the field of radiomics, at present, non-CNN methods may be considered the preferred radiomic means of classifying thyroid nodules. Based on this meta-analysis, AI offers promising results as an avenue to be explored as we look to enhance the diagnostic accuracy and risk stratification of thyroid nodules in the era of personalized medical and oncological patient care. We advocate for rigorous experimentation in this field, given the potential for this technology to bolster diagnostic workflows, enhance clinical outcomes, and minimize patient morbidity, all while mitigating associated healthcare costs.