
Machine Learning-Based Gray-Level Co-Occurrence Matrix (GLCM) Models for Predicting the Depth of Myometrial Invasion in Patients with Stage I Endometrial Cancer.

Xiaoyuan Qian¹, Du He², Li Qin³, Lin Lai², Hongli Wang⁴, Yukun Zhang².

Abstract

Purpose: Deep myometrial invasion (DMI) is an independent high-risk factor for lymph node metastasis and a prognostic risk factor in early-stage endometrial cancer (EC-I) patients. Thus, we developed a machine learning (ML) assistant model, which can accurately help define the surgical area.
Methods: 348 consecutive EC-I patients with the pathological diagnosis were recruited in the tertiary medical centre between January 1, 2012, and October 31, 2021. Five ML-assisted models were developed using two-step estimation methods from the candidate gray level co-occurrence matrix (GLCM). Receiver operating characteristic curve (ROC), decision curve analysis (DCA), and clinical impact curve (CIC) were prepared to evaluate the robustness and clinical practicality of each model.
Results: Our analysis identified several significant differences between the stage IA and IB groups. The top seven candidate factors were correlation all direction offset1, correlation angle0 offset1, correlation angle45 offset1, correlation angle90 offset1, ID moment all direction offset1, ID moment angle0 offset1, and ID moment angle45 offset1. The areas under the ROC curve (AUCs) of the random forest classifier (RFC), support vector machine (SVM), eXtreme gradient boosting (XGBoost), artificial neural network (ANN), and decision tree (DT) models ranged from 0.765 to 0.877 in the training set and from 0.716 to 0.862 in the testing set. Among the five ML algorithms, RFC obtained the optimal prediction efficiency using correlation angle0 offset1, correlation angle45 offset1, correlation angle90 offset1, correlation all direction offset1, ID moment angle0 offset1, ID moment angle45 offset1, and ID moment angle90 offset1.
Conclusion: Our ML-based prediction model combined with GLCM parameters assessed the risk of DMI in EC-I patients, especially RFC, which helped distinguish stage IA and IB EC patients. This new predictive model based on supervised learning can be used to establish personalized treatment strategies.


Keywords: endometrial cancer; gray level co-occurrence matrix; machine learning; myometrial invasion; prediction

Year: 2022 PMID: 35795827 PMCID: PMC9252192 DOI: 10.2147/CMAR.S370477

Source DB: PubMed Journal: Cancer Manag Res ISSN: 1179-1322 Impact factor: 3.602

Introduction

Endometrial cancer (EC) is the most common gynaecological cancer worldwide.1 According to the Global Burden of Disease (GBD) statistics, EC’s incidence and prevalence rates are growing globally, whereas the death rate and disability-adjusted life years decreased over the past 30 years.2 However, the increase in disease mortality and new diagnoses makes EC an essential consideration for women’s health. Our understanding of EC biology has progressed with continuous breakthroughs in diagnosis and treatment technology, but many aspects of treatment are still controversial. Pelvic lymph nodes are the most common sites of extrauterine tumor spread in early-stage EC.3,4 Deep myometrial invasion is an independent prognostic factor and a high-risk factor for lymph node metastasis.5 According to the International Federation of Gynecology and Obstetrics (FIGO), evidence of the survival benefit of lymphadenectomy in FIGO stage I EC patients is still lacking.3 The treatment methods for stage I superficial myometrial infiltration (stage IA) and deep myometrial infiltration (stage IB) are different.6 Some studies suggest that pelvic lymph node dissection should be performed for deep but not for superficial myometrial infiltration.7 The risk of lymph node metastasis and tumor recurrence is higher in EC with deep myometrial invasion (compared to 5% in shallow myometrial invasion).8 Consequently, it is crucial to judge the depth of myometrial infiltration in stage I EC. MRI can assess whether EC shows myometrial invasion, the depth of that invasion, cervical canal invasion, and lymph node metastasis. Sagittal T2WI is a classic MR sequence for showing the uterine wall and uterine cavity. However, MR differentiation of EC myometrial invasion depth depends on physician experience, which is likely to lead to biased results.
Fortunately, texture analysis is an objective quantitative method, and the gray level co-occurrence matrix (GLCM) is the most commonly used approach for second-order texture analysis.9 In addition, the widely used machine learning (ML) approaches have outperformed conventional statistical analyses, which might be beneficial in the medical field because of their high accuracy.10,11 So far, no attempts have been made to select the “optimal GLCM level” to predict myometrial invasion in EC-I patients, particularly in combination with an ML algorithm. Considering the superior ability of ML-based algorithms to improve the accuracy of predicting muscular invasion, we applied an ML-assisted decision-support model that uses preoperative GLCM parameters to evaluate DMI risk and guide clinical decision-making before treatment decisions are made.

Patients and Methods

Patients Selection

Patients diagnosed with FIGO stage I EC (EC-I) at tertiary medical centres (The Central Hospital of Enshi Tujia and Miao Autonomous Prefecture, Hu Bei Cancer Hospital, and Tongji Hospital) between January 1, 2012, and October 31, 2021, were enrolled in the study. The inclusion criteria were as follows: (i) endometrial carcinoma was diagnosed by pathology; (ii) imaging examination was performed; (iii) complete clinical information was available. The exclusion criteria were as follows: (i) presence of essential carcinogens measuring less than 1 cm in diameter; (ii) severe organ injuries or incomplete imaging findings in the medical records; (iii) presence of benign diseases affecting texture feature extraction or analysis. The study protocol complied with the provisions of the Helsinki Declaration (2013 revision). It was approved by the Institutional Review Committee of Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology (TJ-IRB20210631). All patients’ information was strictly confidential, and informed consent was waived due to its traceability. The workflow for EC-I patient selection and model construction is summarized in Figure 1.

Figure 1

Flow chart of patient selection and data processing.

Image Selection and Segmentation

The analysis was conducted jointly by two experienced doctors. In case of a difference of opinion, agreement was reached through consultation. An MRI 750 W plain scan combined with an enhanced image was used to locate the tumor invading the myometrium. The region of interest (ROI) of the affected multi-layered myometrium was manually delineated with ITK-SNAP software on the sagittal fat-suppressed T2WI sequence (delineated layer by layer, ≥ 3 layers). The ROI was placed on the lesion invading the myometrium, including the entire uterine wall myometrium, with a length of about 1 cm. The intraclass correlation coefficient (ICC) was defined as the between-individual variability divided by the total variability. The ICC for the ROI ranges from 0 to 1.0, where 0 indicates no reliability and 1 indicates complete reliability. An ICC lower than 0.4 indicates poor reliability, and an ICC greater than 0.75 indicates good reliability.
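The reliability check described above can be illustrated with a short sketch. The text does not state which form of the ICC was used, so the one-way random-effects estimator ICC(1,1) below is an assumption, and the function names and the two-reader feature values are hypothetical.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    ratings: one list per subject (ROI), each holding the value measured
    by every reader; all subjects must have the same number of readers.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of readers per subject
    grand_mean = sum(sum(r) for r in ratings) / (n * k)
    subject_means = [sum(r) / k for r in ratings]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(ratings, subject_means) for x in r
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def reliability_label(icc):
    """Thresholds quoted in the text: < 0.4 poor, > 0.75 good."""
    if icc < 0.4:
        return "poor"
    if icc > 0.75:
        return "good"
    return "intermediate"  # label for the middle band is an assumption


# Hypothetical texture values measured by two readers on four ROIs.
two_readers = [[0.98, 1.00], [2.46, 2.40], [1.03, 1.05], [2.19, 2.25]]
icc_value = icc_oneway(two_readers)
print(round(icc_value, 3), reliability_label(icc_value))
```

Other ICC forms (two-way, consistency or agreement) give different values on the same data, so the form should be matched to the study design before reuse.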

Data Collection and Quality Assessment

The following data were collected for all patients: age, body mass index, histology, FIGO stage, histologic grade, and imaging record. Missing values were typically imputed with the median of the variable. If ≥10% of values were missing for a given variable, that variable was excluded from variable screening for the final model.
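The missing-data rule above can be sketched in a few lines. This is a minimal illustration, assuming missing entries are stored as `None`; the function name, column names, and values are hypothetical and are not taken from the study data.

```python
from statistics import median


def screen_and_impute(table, max_missing=0.10):
    """Drop variables with >= max_missing missing values; median-impute the rest.

    table: dict mapping a variable name to a list of values, None = missing.
    """
    cleaned = {}
    for name, values in table.items():
        observed = [v for v in values if v is not None]
        n_missing = len(values) - len(observed)
        if n_missing / len(values) >= max_missing:
            continue  # excluded from variable screening
        fill = median(observed)
        cleaned[name] = [fill if v is None else v for v in values]
    return cleaned


# Hypothetical records for 12 patients.
records = {
    "age": [55, None, 60, 53, 44, 65, 61, 58, 49, 52, 57, 63],  # 1/12 missing
    "bmi": [22.9, None, 24.6, None, 23.5, 22.6, 25.1, 20.6,
            23.8, 21.4, 25.7, 22.8],                            # 2/12 missing
}
cleaned = screen_and_impute(records)
print(sorted(cleaned))     # variables that survive screening
print(cleaned["age"][1])   # imputed value for the missing age
```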

Development and Validation of ML-Based Models

The data were randomly divided into a training set (70%) and a validation set (30%) to verify the prediction model. Variables were screened following the inclusion principles reported in previous studies. The model variables (ie, characteristic variables) were screened according to the principle of “OOB error”,12 using the Gini index: $\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} p_k^2$, where $p_k$ is the proportion of samples in set $D$ that belong to class $k$ and $K$ is the number of classes. The smaller the Gini index, the lower the probability that two samples drawn at random from the set belong to different classes, that is, the higher the purity of the set; conversely, a larger Gini index indicates a more impure set. If all the samples in the set are of the same class, the Gini index equals zero.
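As a worked illustration of the Gini index, the sketch below assumes the standard definition, one minus the sum of squared class proportions. The function name is hypothetical; the last example uses the Yes/No group sizes from the training columns of Table 1 (28 and 215).

```python
from collections import Counter


def gini_impurity(labels):
    """Gini index of a set: 1 - sum over classes of (class proportion)^2."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


pure_set = ["IA"] * 5                  # a single class
balanced_set = ["IA", "IB"] * 3        # two equally frequent classes
training_like = [1] * 28 + [0] * 215   # group sizes from the training columns of Table 1

print(gini_impurity(pure_set))
print(gini_impurity(balanced_set))
print(round(gini_impurity(training_like), 4))
```

A pure set gives zero, and a balanced two-class set gives the two-class maximum of 0.5, which matches the purity interpretation given in the text.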

Prediction Efficiency Evaluation of ML-Based Models

The optimal subset of modelling variables was obtained from the intersection of the variable sets. The receiver operating characteristic (ROC) curve was used to evaluate the prediction accuracy of the model in the training and validation sets. The discrimination ability of each model was quantified by the area under the ROC curve (AUC), and clinical usefulness was assessed by decision curve analysis (DCA) and the clinical impact curve (CIC).
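Decision curve analysis plots net benefit against the threshold probability. The sketch below assumes the usual net-benefit formula, TP/n − (FP/n) × pt/(1 − pt); the function names, outcomes, and predicted probabilities are hypothetical and do not reproduce the curves in Figure 5.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: every patient is treated as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = DMI) and model probabilities for ten patients.
outcomes = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
probabilities = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6, 0.3, 0.4, 0.05, 0.15]

for pt in (0.2, 0.5):
    print(pt, net_benefit(outcomes, probabilities, pt),
          treat_all_net_benefit(outcomes, pt))
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).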

Statistical Analysis

For descriptive analysis, continuous variables were summarized as median (IQR) and categorical variables as frequencies (%). Bonferroni-corrected probability values were used to compare the qualitative data.13 The Wilcoxon rank-sum test or chi-square test was used to compare differences between groups. During variable selection, each node was split using the best of a randomly selected subset of explanatory variables (features), and the class predictions generated by each tree were collected. Finally, the candidate variables of the prediction model were determined according to their Gini-index weights. All analyses were performed using the Python programming language (version 3.9.2, Python Software Foundation) and the R Project for Statistical Computing (version 4.0.4). All P values were two-tailed, and P <0.05 was considered statistically significant.

Results

Baseline Characteristics of the Study Population

The detailed clinical characteristics and pathologic baseline data of the 348 patients with EC-I are shown in Table 1. For internal validation of the model, patients were randomly divided into a training set (70%, N=243) and a validation set (30%, N=105) using the caret package. Three parameters (13 texture features) exhibited significant differences (P < 0.05), namely correlation (full angle, 0°, 45°, 90°), inertia (full angle, full angle SD, 0°, 45°, 90°), and inverse difference moment (ID moment; full angle, 0°, 45°, 90°). There was no significant difference in cluster prominence, cluster shade, energy, entropy, and Haralick.
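To make the texture parameters concrete, the sketch below builds a symmetric, normalised GLCM for one angle at offset 1 and computes correlation, inertia, and ID moment from it. It assumes the textbook Haralick definitions, which may be normalised differently from the software used in the study; the function names, the angle-to-offset mapping, and the 4 × 4 four-level image are illustrative only.

```python
import math

# Pixel displacement (dx, dy) for each angle at offset 1; rows grow downward.
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1)}


def glcm(image, dx, dy, levels):
    """Symmetric co-occurrence matrix, normalised to sum to 1."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Correlation, inertia (contrast) and inverse difference moment.

    Correlation is undefined for a constant image (zero variance).
    """
    size = range(len(p))
    cells = [(i, j, p[i][j]) for i in size for j in size]
    mu_i = sum(i * v for i, j, v in cells)
    mu_j = sum(j * v for i, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, j, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for i, j, v in cells))
    return {
        "correlation": sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
        / (sd_i * sd_j),
        "inertia": sum((i - j) ** 2 * v for i, j, v in cells),
        "id_moment": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Illustrative image with gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
dx, dy = OFFSETS[0]
p0 = glcm(image, dx, dy, levels=4)
features0 = texture_features(p0)
print(features0)
```

An "all direction" value would typically be obtained by averaging the per-angle results, although the text does not say how the study's software aggregates them.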

Table 1

Baseline Demographic and Clinicopathological Characteristics of Patients

Variables	Training Set				Testing Set
Variables	Overall(N=243)	Yes(N=28)	No(N=215)	P-value	Overall(N=105)	Yes(N=20)	No(N=85)	P-value
Age (median [IQR]),year	55.00 [44.00, 65.50]	60.50 [45.50, 65.00]	55.00 [43.00, 66.00]	0.306	53.00 [43.00, 64.00]	53.50 [43.00, 63.25]	53.00 [41.00, 65.00]	0.838
BMI (median [IQR]),kg/m²	22.90 [20.60, 25.10]	24.60 [22.82, 26.40]	22.60 [20.25, 25.00]	0.005	23.50 [21.40, 25.70]	22.85 [22.25, 27.05]	23.80 [21.30, 25.60]	0.315
Smoking (%)
Yes	130 (53.5)	13 (46.4)	117 (54.4)	0.551	49 (46.7)	7 (35.0)	42 (49.4)	0.361
No	113 (46.5)	15 (53.6)	98 (45.6)		56 (53.3)	13 (65.0)	43 (50.6)
Drinking (%)
Yes	129 (53.1)	13 (46.4)	116 (54.0)	0.583	50 (47.6)	6 (30.0)	44 (51.8)	0.132
No	114 (46.9)	15 (53.6)	99 (46.0)		55 (52.4)	14 (70.0)	41 (48.2)
Hypertension (%)
Yes	120 (49.4)	14 (50.0)	106 (49.3)	1	54 (51.4)	12 (60.0)	42 (49.4)	0.546
No	123 (50.6)	14 (50.0)	109 (50.7)		51 (48.6)	8 (40.0)	43 (50.6)
Diabetes (%)
Yes	119 (49.0)	19 (67.9)	100 (46.5)	0.054	52 (49.5)	8 (40.0)	44 (51.8)	0.485
No	124 (51.0)	9 (32.1)	115 (53.5)		53 (50.5)	12 (60.0)	41 (48.2)
Menopause (%)
Yes	190 (78.2)	19 (67.9)	171 (79.5)	0.244	74 (70.5)	16 (80.0)	58 (68.2)	0.444
No	53 (21.8)	9 (32.1)	44 (20.5)		31 (29.5)	4 (20.0)	27 (31.8)
CA199 (median [IQR]),U/mL	32.70 [27.80, 38.25]	43.45 [37.15, 51.00]	32.00 [27.10, 36.80]	<0.001	34.20 [28.10, 38.30]	44.60 [32.80, 54.05]	33.10 [27.20, 37.10]	<0.001
CA125 (median [IQR]),U/mL	42.30 [37.45, 51.80]	58.50 [48.90, 70.90]	41.60 [36.90, 50.05]	<0.001	48.20 [41.00, 53.90]	62.75 [50.55, 68.62]	47.50 [39.80, 52.20]	<0.001
Pathological type (%)
Adenocarcinoma	175 (72.0)	26 (92.9)	149 (69.3)	0.017	71 (67.6)	18 (90.0)	53 (62.4)	0.035
Non-adenocarcinoma	68 (28.0)	2 (7.1)	66 (30.7)		34 (32.4)	2 (10.0)	32 (37.6)
FIGO stage (%)
I	215 (88.5)	0 (0.0)	215 (100.0)	<0.001	85 (81.0)	0 (0.0)	85 (100.0)	<0.001
IA	16 (6.6)	16 (57.1)	0 (0.0)		10 (9.5)	10 (50.0)	0 (0.0)
IB	12 (4.9)	12 (42.9)	0 (0.0)		10 (9.5)	10 (50.0)	0 (0.0)
Correlation_AllDirection_offset1 (median [IQR])	0.98 [0.93, 1.04]	2.46 [2.17, 2.88]	0.96 [0.92, 1.02]	<0.001	0.99 [0.95, 1.05]	2.35 [2.21, 2.93]	0.98 [0.94, 1.02]	<0.001
Correlation_angle0_offset1 (median [IQR])	1.03 [0.98, 1.06]	2.47 [1.94, 3.14]	1.01 [0.98, 1.05]	<0.001	1.04 [1.01, 1.08]	2.75 [1.91, 3.80]	1.03 [1.00, 1.06]	<0.001
Correlation_angle45_offset1 (median [IQR])	1.09 [1.04, 1.13]	2.19 [1.70, 2.65]	1.08 [1.04, 1.11]	<0.001	1.09 [1.04, 1.13]	2.38 [1.90, 2.78]	1.07 [1.04, 1.10]	<0.001
Correlation_angle90_offset1 (median [IQR])	1.14 [1.08, 1.21]	2.60 [2.11, 2.97]	1.12 [1.07, 1.18]	<0.001	1.16 [1.08, 1.22]	2.92 [2.13, 3.40]	1.13 [1.06, 1.18]	<0.001
Inertia_AllDirection_offset1 (median [IQR])	230.00 [207.50, 253.50]	183.00 [159.75, 201.25]	233.00 [213.00, 256.00]	<0.001	228.00 [206.00, 253.00]	182.00 [152.00, 197.25]	235.00 [218.00, 259.00]	<0.001
Inertia_AllDirection_offset1_SD (median [IQR])	4860.00 [3365.50, 6333.50]	3365.50 [2079.00, 4945.25]	5052.00 [3638.00, 6367.00]	0.003	4437.00 [2839.00, 6103.00]	3673.50 [2344.00, 5724.50]	4548.00 [3024.00, 6133.00]	0.133
Inertia_angle0_offset1 (median [IQR])	199.00 [175.00, 231.50]	126.00 [95.75, 158.50]	205.00 [180.50, 235.50]	<0.001	190.00 [168.00, 217.00]	138.00 [114.50, 181.00]	196.00 [175.00, 226.00]	<0.001
Inertia_angle45_offset1 (median [IQR])	282.00 [247.00, 314.50]	209.00 [166.75, 282.00]	287.00 [252.00, 319.00]	<0.001	283.00 [238.00, 320.00]	193.00 [164.50, 250.50]	301.00 [246.00, 326.00]	<0.001
Inertia_angle90_offset1 (median [IQR])	133.00 [111.00, 154.00]	130.50 [111.25, 139.25]	135.00 [111.00, 155.00]	0.229	132.00 [117.00, 156.00]	123.00 [111.50, 136.00]	135.00 [120.00, 156.00]	0.057
IDMoment_AllDirection_offset1 (median [IQR])	0.08 [0.07, 0.08]	0.10 [0.10, 0.12]	0.08 [0.07, 0.08]	<0.001	0.08 [0.07, 0.08]	0.10 [0.09, 0.12]	0.08 [0.07, 0.08]	<0.001
IDMoment_angle0_offset1 (median [IQR])	0.08 [0.07, 0.08]	0.11 [0.09, 0.12]	0.08 [0.07, 0.08]	<0.001	0.08 [0.08, 0.08]	0.12 [0.10, 0.14]	0.08 [0.08, 0.08]	<0.001
IDMoment_angle45_offset1 (median [IQR])	0.06 [0.05, 0.06]	0.10 [0.07, 0.12]	0.06 [0.05, 0.06]	<0.001	0.06 [0.05, 0.07]	0.10 [0.09, 0.10]	0.06 [0.05, 0.06]	<0.001
IDMoment_angle90_offset1 (median [IQR])	0.07 [0.05, 0.08]	0.11 [0.10, 0.13]	0.06 [0.05, 0.07]	<0.001	0.07 [0.06, 0.08]	0.13 [0.11, 0.14]	0.07 [0.06, 0.08]	<0.001

Abbreviations: IQR, inter-quartile range; BMI, body mass index; CA199, carbohydrate antigen 199; CA125, carbohydrate antigen 125; FIGO, International Federation of Gynecology and Obstetrics.


Selection of Candidate Variables

Feature selection is the area of machine learning that focuses on selecting candidate variables.14 Iterative analysis was used to screen the candidate covariates of each algorithm. We examined 23 variables using Pearson correlation analysis. The correlation matrix revealed that DMI significantly correlated with image factors and some clinical variables (Figure 2A). Additionally, each meaningful candidate variable, including correlation all direction offset1, correlation angle0 offset1, correlation angle45 offset1, correlation angle90 offset1, ID moment all direction offset1, ID moment angle0 offset1, and ID moment angle45 offset1, contributed to the ML-based model (Figure 2B). Therefore, consistent with the results of the correlation analysis, these seven variables were the top-ranked predictors.
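The correlation screening step can be illustrated with a small Pearson correlation matrix. This is a sketch only: the function names and the toy values for eight patients are hypothetical, chosen to mimic the direction of the group differences in Table 1.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict: matrix[a][b] is the Pearson r between columns a and b."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical values for eight patients (DMI: 1 = present, 0 = absent).
toy = {
    "DMI": [0, 0, 0, 1, 1, 0, 1, 0],
    "Correlation_angle0_offset1": [0.98, 1.01, 0.95, 2.40, 2.50, 1.02, 2.20, 0.97],
    "Inertia_angle0_offset1": [233, 240, 228, 183, 176, 236, 190, 231],
}
matrix = correlation_matrix(toy)
for name, r in matrix["DMI"].items():
    print(name, round(r, 3))
```

With a binary outcome such as DMI, the Pearson coefficient is the point-biserial correlation, so its sign shows whether a feature is higher or lower in the DMI group.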

Figure 2

Variable screening and weight allocation. (A) Correlation matrix analysis of candidate features. (B) The weight distribution of the candidate variables of each ML-based model.

Construction of ML-Based DMI Predictive Model

For the training data, each patient’s result (positive or negative) was entered, and the final judgment was output. Node impurity was measured with the Gini index, $\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} p_k^2$, where $p_k$ is the proportion of samples in node $D$ that belong to class $k$. The RFC algorithm represents a computational method for effectively navigating the free parameter space to obtain a robust model (Figure 3A). The Gini index of each variable in the RFC model was also depicted. Consistent with the predicted results, the top seven candidate variables were correlation all direction offset1, correlation angle0 offset1, correlation angle45 offset1, correlation angle90 offset1, ID moment all direction offset1, ID moment angle0 offset1, and ID moment angle45 offset1. Additionally, data mining through the DT model, using the equivalent impurity measure $\mathrm{Gini}(p) = \sum_{k=1}^{K} p_k (1 - p_k)$, was very advantageous. With the addition of inflammatory factor indicators, correlation angle0 offset1, correlation angle45 offset1, and ID moment all direction offset1 carried irreplaceable weight at the branches of the DT (Figure 3B). Meanwhile, the ANN model showed more robust prediction efficiency than the other models but was inferior to the RFC (Figure 4).
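The way a tree node uses Gini impurity to choose a branch can be sketched as a search for the threshold with the largest impurity decrease. The code below assumes binary labels and midpoint thresholds; it is not the authors' implementation, and the function names and feature values are hypothetical.

```python
def gini(labels):
    """Gini impurity for binary labels (1 = DMI, 0 = no DMI)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def best_split(values, labels):
    """Return (threshold, impurity decrease) of the best single split."""
    pairs = sorted(zip(values, labels))
    parent = gini([y for _, y in pairs])
    best_threshold, best_decrease = None, 0.0
    for k in range(1, len(pairs)):
        if pairs[k][0] == pairs[k - 1][0]:
            continue  # cannot split between identical feature values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        decrease = parent - weighted
        if decrease > best_decrease:
            best_threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best_decrease = decrease
    return best_threshold, best_decrease


# Hypothetical correlation_angle0_offset1 values and DMI labels.
feature = [0.98, 1.01, 0.95, 2.40, 2.50, 1.02, 2.20, 0.97]
dmi = [0, 0, 0, 1, 1, 0, 1, 0]
threshold, decrease = best_split(feature, dmi)
print(threshold, decrease)
```

In a random forest, impurity decreases of this kind are summed per variable over all trees, which is one common way to obtain the variable weights the text refers to.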

Figure 3

Predictive model visualization based on ML-based algorithm. (A) RFC model: the candidate factors associated with myometrial invasion were ordered via the RFC algorithm. (B) DT model: prediction nodes and weights were allocated via the DT algorithm.

Figure 4

Predictive model visualization based on ANN algorithm. The candidate factors associated with myometrial invasion were ordered via the ANN algorithm. Red represents the positive weight, and blue represents the negative weight.

Comparison Among ML-Based Models

To explore whether ML-based models can enhance the prediction performance, we used five supervised learning models for DMI assessment. DCA showed that the RFC model had a robust prediction performance in the training and validation cohorts (Figure 5). Additionally, the AUC of the RFC model reached a plateau when the 7 variables were introduced, followed by ANN, DT, SVM, and XGBoost (Table 2). RFC also showed a superior prediction efficiency to the generalized linear model. Therefore, both RFC and DT (machine learning-assisted decision-support) models were used to guide DMI prediction using the iterative algorithm analysis of supervised learning.

Figure 5

Prediction performance of candidate models based on ML-based algorithm. (A) DCA for five ML-based models in the training set. (B) DCA for five ML-based models in the testing set.

Table 2

The ROC Curve Analyses for Predicting Myometrial Invasion in Each ML-Based Model

Model	Training Set			Testing Set
Model	AUC Mean	AUC 95% CI	Variables^&	AUC Mean	AUC 95% CI	Variables^&
RFC	0.877	0.316–1.438	7	0.862	0.301–1.423	7
SVM	0.765	0.191–1.339	8	0.716	0.142–1.290	8
DT	0.787	0.264–1.310	7	0.739	0.212–1.266	7
ANN	0.842	0.295–1.389	7	0.804	0.257–1.351	7
XGboost	0.768	0.204–1.332	8	0.715	0.151–1.279	8
Radiologists	0.835	0.267–1.403	0	0.816	0.248–1.384	0

Notes: &Variables included in the model.

Abbreviations: RFC, random forest classifier; SVM, support vector machine; DT, decision tree; ANN, artificial neural network; XGboost, eXtreme gradient boosting; AUC, area under curve; 95% CI, 95% confidence interval.
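The intervals in Table 2 are symmetric about the mean AUC and extend above 1, a value an AUC cannot exceed. A percentile bootstrap is one common way to obtain an interval that stays within [0, 1]. The sketch below is an assumption about how such an interval could be computed, not the procedure used in the study; the function names, outcomes, and scores are hypothetical.

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval; needs both classes present in y_true."""
    rng = random.Random(seed)
    n = len(y_true)
    replicates = []
    while len(replicates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_y = [y_true[i] for i in idx]
        if 0 < sum(resampled_y) < n:  # skip resamples with only one class
            replicates.append(auc(resampled_y, [scores[i] for i in idx]))
    replicates.sort()
    lower = replicates[round(alpha / 2 * n_boot)]
    upper = replicates[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical outcomes (1 = DMI) and model scores for ten patients.
y = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
s = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6, 0.3, 0.4, 0.05, 0.15]
point_estimate = auc(y, s)
lower, upper = bootstrap_auc_ci(y, s)
print(round(point_estimate, 3), lower, upper)
```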


Internal Validation of the Optimal Predictive Model

To further validate the predictive performance of the RFC model, we also used CIC to evaluate its accuracy. CIC demonstrated that the stratification of DMI could be achieved in the training cohort. These results were consistent with those of the validation cohort, indicating that RFC had the best performance across the metrics of discrimination, calibration, and overall performance, especially the candidate systemic inflammation markers that were highly relevant to DMI.

Discussion

In 2020, the ESGO/ESTRO/ESP guidelines combined the molecular characteristics of The Cancer Genome Atlas (TCGA) with pathological factors to stratify the prognosis of endometrial cancer (EC) patients.15,16 For instance, the Proactive Molecular Risk Classifier for Endometrial Cancer (ProMisE) has been validated as a new classifier, which can be used to classify EC in clinical practice.16–18 However, integrating these molecular groups with other prognostic histological factors into the treatment of EC is still underway. Nowadays, minimally invasive surgery has become the standard treatment for EC patients. Women with early-stage EC only require a total hysterectomy and bilateral salpingo-oophorectomy.19,20 The standard treatment for EC in young women of childbearing age is hysterectomy and bilateral salpingo-oophorectomy, with or without lymphadenectomy. However, it is not ideal for women interested in future fertility.20–22 Moreover, the limited accuracy of gross examination of DMI makes it challenging to determine whether lymph node resection is needed. Although surgical staging remains the most accurate method of determining the extent of disease, the therapeutic value of pelvic lymphadenectomy has not been established.19 In addition, little is known about the prognostic independence of DMI from imaging-based stratification. Therefore, it is necessary to evaluate the myometrial infiltration accurately. Our study revealed two significant findings. First, accurate risk stratification for EC-I patients who should undergo additional surgery depends on the added value of GLCM. Second, novel ML-based predictive models can identify whether EC-I patients have myometrial invasion. To some extent, our model is superior to the traditional prediction algorithm model. Inappropriate risk stratification may affect treatment planning, patient selection, and decision-making processes.
Given the excellent performance of ML-based algorithms, RFC, DT, ANN, XGBoost, and SVM were employed in our study to evaluate DMI. We developed and validated a new predictive model to determine which EC-I patients are at high risk of myometrial invasion before treatment decisions are made. According to the Gynecologic Oncology Group (GOG) and the International Federation of Gynecology and Obstetrics (FIGO), the most important prognostic factors of lymph node metastasis in EC patients are tumor grade and myometrial invasion depth, and the risk of involvement is less than 1%.23 However, being a grade 1 tumor, the overall survival rate is 90%, and the 5-year progression-free survival rate is 95%.24 Traditionally, imaging by experts through MRI or transvaginal ultrasound detects possible myometrial invasion and excludes synchronous ovarian tumors or suspicious lymph node lesions.25,26 GLCM analysis is a contemporary and innovative computational method to assess the textural patterns applicable to most areas of microscopy.27 To the best of our knowledge, this is the first study to create a relatively sensitive GLCM-based ML model for evaluating DMI. Based on the obtained GLCM data, we applied five ML approaches. These models also showed a relatively high level of sensitivity and specificity and an excellent discriminatory power. ML-based classification is among the most critical computing developments of recent years.14 Mature supervised learning classifiers, including support vector machine, random forest, convolutional neural network, and decision tree, have been gradually applied to clinical practice.28 As an essential branch of supervised learning, the RFC model has been successfully applied to high-dimensional and multi-source data reduction in many scientific fields.29 Our study referred to the subsampling method to estimate the variance of the variable importance extracted from the underlying grayscale images and to construct confidence intervals.
Interestingly, GLCM parameters could be an independent risk factor for DMI and show stable and robust performance. Using the ML iteration algorithm, the effective parameters in GLCM were obtained, including correlation all direction offset1, correlation angle0 offset1, correlation angle45 offset1, correlation angle90 offset1, ID moment all direction offset1, ID moment angle0 offset1, and ID moment angle45 offset1. The ML-based algorithm was distinctly more accurate than traditional clinical variables. Consistent with previous research reports, we found the RFC model more advantageous than the traditional linear regression model in feature selection and classification. Our results showed that a combination of ML-based algorithms and GLCM could serve as a robust tool to improve accuracy in predicting DMI. Although the clinical significance of the GLCM-based prediction model in EC-I patients is promising, some limitations should be acknowledged. First, all of the samples in this study were retrospective, and a prospective multicentre cohort study should be performed to validate this model. Second, optical images were used for feature extraction, and positive results were obtained in various recognition, which may have had an artificial selection bias. Further, it is necessary to construct a high-throughput platform to enhance the contrast and entropy of images for the analysis. Third, several other categories of functional MRI, such as diffusion tensor imaging, were not performed in this study. We believe GLCM parameters obtained from different types of functional MRI may enhance the diagnostic capability of our formulas.

Conclusion

The textural features extracted from the grayscale images show great potential to predict DMI. A combination of GLCM parameters and an ML-based algorithm may aid clinicians in individual risk assessment for DMI. The RFC model achieved the highest predictive accuracy among the five models, and further studies are needed to develop these models for use in daily practice.

29 in total

1. Machine Learning-Based Model for Prediction of Outcomes in Acute Stroke.

Authors: JoonNyung Heo; Jihoon G Yoon; Hyungjong Park; Young Dae Kim; Hyo Suk Nam; Ji Hoe Heo
Journal: Stroke Date: 2019-05 Impact factor: 7.914

Review 2. When to use the Bonferroni correction.

Authors: Richard A Armstrong
Journal: Ophthalmic Physiol Opt Date: 2014-04-02 Impact factor: 3.117

Review 3. Automated machine learning: Review of the state-of-the-art and opportunities for healthcare.

Authors: Jonathan Waring; Charlotta Lindvall; Renato Umeton
Journal: Artif Intell Med Date: 2020-02-21 Impact factor: 5.326

4. Endometrial Cancer: Combined MR Volumetry and Diffusion-weighted Imaging for Assessment of Myometrial and Lymphovascular Invasion and Tumor Grade.

Authors: Stephanie Nougaret; Caroline Reinhold; Shaza S Alsharif; Helen Addley; Jocelyne Arceneau; Nicolas Molinari; Boris Guiu; Evis Sala
Journal: Radiology Date: 2015-04-30 Impact factor: 11.105

Review 5. Endometrial cancer.

Authors: Joel I Sorosky
Journal: Obstet Gynecol Date: 2012-08 Impact factor: 7.661

6. Prediction of deep myometrial invasion in patients with endometrial cancer: clinical utility of contrast-enhanced MR imaging-a meta-analysis and Bayesian analysis.

Authors: K A Frei; K Kinkel; H M Bonél; Y Lu; C Zaloudek; H Hricak
Journal: Radiology Date: 2000-08 Impact factor: 11.105

Review 7. Machine Learning in Medicine.

Authors: Rahul C Deo
Journal: Circulation Date: 2015-11-17 Impact factor: 29.690

8. Standard errors and confidence intervals for variable importance in random forest regression, classification, and survival.

Authors: Hemant Ishwaran; Min Lu
Journal: Stat Med Date: 2018-06-04 Impact factor: 2.373

9. Conservative treatment in early stage endometrial cancer: a review.

Authors: Giuseppe Trojano; Claudiana Olivieri; Raffaele Tinelli; Gianluca Raffaello Damiani; Antonio Pellegrino; Ettore Cicinelli
Journal: Acta Biomed Date: 2019-12-23

10. Predicting factors for survival of breast cancer patients using machine learning techniques.

Authors: Mogana Darshini Ganggayah; Nur Aishah Taib; Yip Cheng Har; Pietro Lio; Sarinder Kaur Dhillon
Journal: BMC Med Inform Decis Mak Date: 2019-03-22 Impact factor: 2.796