Noninvasively predict the micro-vascular invasion and histopathological grade of hepatocellular carcinoma with CT-derived radiomics.

Abstract

Objectives: This research aims to predict the micro-vascular invasion (MVI) and histopathologic grade of hepatocellular carcinoma (HCC) with CT-derived radiomics.
Methods: The clinical and image data of 82 patients were accessed from the TCGA-LIHC collection in The Cancer Imaging Archive. Radiomics features were then extracted from the CT images. To obtain an appropriate feature subset, redundant features were removed by means of intra-class agreement analysis, the Student t test, and either LASSO regression or support vector machine recursive feature elimination (SVM-RFE). Several machine-learning classifiers, including SVM and random forest (RF), were then established. To evaluate tumor grade and MVI by integrating radiomics with clinical information, nomogram-based clinical models were constructed. Diagnostic performance was evaluated with ROC analysis.
Results: For identifying tumor grade, 7 and 10 radiomics features were selected via LASSO regression and SVM-RFE, respectively; for evaluating MVI, 13 and 10 features were selected via LASSO regression and SVM-RFE, respectively. The combination of the RF classifier and SVM-RFE selection yielded the best performance for grading HCC (AUC: 0.898). In contrast, the combination of the RF classifier and LASSO regression resulted in the best performance for identifying MVI (AUC: 0.876). Finally, two nomograms were constructed with the radiomics score (Rscore) and clinical risk factors, which showed excellent predictive value for both tumor grade (AUC: 0.928) and MVI (AUC: 0.945).
Conclusion: CT-derived radiomics was valuable for noninvasively assessing the micro-vascular invasion and histopathologic grade of hepatocellular carcinoma.

Keywords: CT-Radiomics; Hepatocellular carcinoma; Machine Learning; Prognostic factors

Year: 2022 PMID: 35600083 PMCID: PMC9120240 DOI: 10.1016/j.ejro.2022.100424

Source DB: PubMed Journal: Eur J Radiol Open ISSN: 2352-0477

Introduction

Hepatocellular carcinoma (HCC) has long been a major threat to human health because of its high incidence and high mortality [1]. At present, the first-line treatment options for different types of HCC include surgical resection, radiofrequency ablation, transcatheter arterial chemoembolization (TACE) and so forth [2], [3], [4]. However, poor prognosis remains a major challenge. Accurate prognostic prediction and evaluation may, to some extent, guide the clinical management of HCC [5]. Currently, evaluating prognosis-related histopathological markers is accepted as an effective approach for prognostic prediction. For example, high histopathological grade and the presence of micro-vascular invasion (MVI) indicate a high probability of recurrence, lymphatic metastasis and strong tumor invasiveness [6], [7], [8], [9]. The assessment of prognosis-related histopathological factors such as tumor grade [10], pathological stage [11], micro-vascular invasion [9], and the expression of histopathological markers including Ki67 [12] and CK-19 [13] has drawn considerable attention. Nevertheless, the current gold standard for evaluating these prognostic markers, histopathological examination, has several disadvantages, including invasiveness, time consumption and potential sampling bias. Novel approaches with complementary advantages are urgently required. During the past few years, growing attention has been paid to image-based prognostic prediction. Various prognosis-related histopathological markers of HCC, including tumor grade, micro-vascular invasion, capsule formation, and the expression of Ki67 and CK-19, have been broadly assessed by exploring representative image predictors of cancer [6], [9], [14].
Built on the core idea that images are more than pictures and are in fact data, radiomics has paved an unprecedented way for deriving diagnostic markers and models from images [15]. Additionally, the integration of high-throughput radiomics features with robust artificial-intelligence-based models has been widely reported to yield extra clinical benefits in lesion discrimination, disease diagnosis and treatment efficacy prediction [16], [17], [18]. CT-based radiomics has shown great value in evaluating the prognostic markers of HCC [19], [20]. Several previous studies applied CT-derived radiomics to characterize the histopathological grade, micro-vascular invasion or other pathological markers of HCC [19], [21], [22]. However, only a limited number of studies have applied CT-derived radiomics to predict multiple prognostic markers of HCC simultaneously. Besides, previous results varied across studies and cohorts, which indicates that further investigation is needed. Therefore, this research aims to extract radiomics features from CT images and then establish machine-learning-based diagnostic models for identifying the histopathological grade and micro-vascular invasion of HCC.

Methods

Patient cohort

Both the image data and the other clinical data were accessed from the TCGA-LIHC collection (https://wiki.cancerimagingarchive.net/display/Public/TCGA-LIHC) in The Cancer Imaging Archive (TCIA). Local ethical approval (20-1574AB) and the written informed consent of all patients were obtained, as declared in the data source. In brief, The Cancer Genome Atlas Liver Hepatocellular Carcinoma (TCGA-LIHC) data collection provides a convenient resource for researchers to investigate hepatocellular carcinoma using radiological findings, pathology, clinical outcome and genotype. A detailed description of the multi-institutional TCGA-LIHC data can be found in previous research [23]. In total, a dataset of 97 subjects was downloaded from the TCGA-LIHC collection. The available data were included according to the following inclusion and exclusion criteria:

Inclusion criteria

Pathologically confirmed as HCC without preoperative treatment.

Exclusion criteria

The absence of CT image data. The absence of pathological results regarding the micro-vascular invasion or histopathological grade. The poor image quality of CT images.

All subjects included in this research underwent abdominal multiphasic dynamic contrast-enhanced CT on multi-detector row CT (MDCT) units (GE LightSpeed QX/I, GE Healthcare, USA or Siemens Sensation 16, Siemens, Germany). The imaging parameters were as follows: 120 kV, auto tube current, field of view (FOV): 320–500 mm × 320–500 mm, scanning matrix: 512 × 512, reconstruction kernel: standard, scan type: helical, slice thickness: 5 mm, slice gap: 5 mm, reconstructed section thickness: 2 mm. The arterial-phase (AP), venous-phase (VP) and delay-phase (DP) CT scans were performed at 30–35 s, 65–70 s and 150–180 s after intravenous injection of contrast agent (Ultravist 370, Bayer Schering Pharma, Berlin, Germany; dose: 1.5 mL/kg, injection rate: 3.0 mL/s).

Pathological and clinical characteristics were likewise accessed from the TCGA-LIHC collection. Gender was coded as 0 for female and 1 for male. HCC was pathologically staged as IA (1), IB (2), II (3), IIIA (4), IIIB (5), IVA (6) and IVB (7). Percutaneous fine-needle aspiration biopsy of the liver lesion was performed under local infiltration anesthesia. Formalin-fixed, paraffin-embedded biopsy sections were then stained with hematoxylin and eosin (H&E) for histopathological evaluation. All HCCs were classified into four grades (ES-1, ES-2, ES-3 and ES-4; Low Grade: ES-1 & ES-2, High Grade: ES-3 & ES-4) according to the Edmondson-Steiner grading guideline [24].

Radiomics and the diagnostic models

The schematic flowchart of this study is shown in Fig. 1. The detailed steps are as follows:

Fig. 1

Schematic flowchart.

Tumor segmentation

The entire tumor in the CT images was segmented by two abdominal radiologists with 17 and 13 years of experience, respectively, using the open-source software ITK-SNAP (http://www.itksnap.org/pmwiki/pmwiki.php). The lesions were segmented on the venous-phase (VP) CT images and the volumes of interest were then copied to the other phases.

Feature extraction

The CT-derived radiomics features were extracted via an open-source Python-based software package named Pyradiomics (https://pypi.python.org/pypi/pynetdicom). Radiomics features of 8 categories were extracted: First Order Statistics (19 features), Shape-based (3D) (16 features), Shape-based (2D) (10 features), Gray Level Co-occurrence Matrix (24 features), Gray Level Run Length Matrix (16 features), Gray Level Size Zone Matrix (16 features), Neighbouring Gray Tone Difference Matrix (5 features) and Gray Level Dependence Matrix (14 features). In total, 321 image features were obtained from the CT images (arterial phase, venous phase, delayed phase) of each patient.
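To make the First Order Statistics category concrete, the following is a minimal sketch of how a few such features could be computed from the voxel intensities inside one volume of interest and tagged by contrast phase. It is not the Pyradiomics implementation: the function names, the toy intensities and the `AP_` naming prefix are assumptions made for illustration, and the definitions used here (energy as the sum of squared intensities, non-excess kurtosis) are stated in the comments.

```python
import statistics


def first_order_features(voxels):
    """Compute a few first-order statistics for one VOI.

    `voxels` is a flat list of intensities inside the segmented volume
    (hypothetical input; a real pipeline would read them from the CT
    image and the segmentation mask).
    """
    n = len(voxels)
    mean = sum(voxels) / n
    # Central moments in population form (divided by n).
    m2 = sum((v - mean) ** 2 for v in voxels) / n
    m4 = sum((v - mean) ** 4 for v in voxels) / n
    return {
        "Mean": mean,
        "Median": statistics.median(voxels),
        # Energy is taken here as the sum of squared intensities.
        "Energy": sum(v ** 2 for v in voxels),
        # Non-excess kurtosis: a normal distribution gives about 3.
        "Kurtosis": m4 / (m2 ** 2) if m2 > 0 else float("nan"),
    }


def prefix_by_phase(features, phase):
    """Tag feature names with the phase (AP, VP or DP) they came from."""
    return {f"{phase}_{name}": value for name, value in features.items()}


# Toy arterial-phase intensities (illustrative values, not study data).
toy_voi = [40, 52, 55, 61, 70]
ap_features = prefix_by_phase(first_order_features(toy_voi), "AP")
```

Repeating the same extraction for the venous and delayed phases and merging the three dictionaries would give one feature vector per patient, which is the shape of input the later selection steps assume.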

Feature reduction

According to statistical and algorithmic guidelines [25], redundant or meaningless radiomics features unnecessarily increase model complexity and carry a risk of overfitting, meaning that the established models perform well in training cohorts but poorly in validation cohorts. Therefore, it has been widely reported that well-designed feature selection strategies are necessary for establishing robust models [26], [27]. For this study, the image features were selected according to the following steps: (1) The intraclass correlation coefficient (ICC) of each image feature was calculated to quantify agreement and reproducibility. Image features with an ICC of less than 0.8 were removed. (2) The Student t test was used to screen for image features with significant differences between subgroups (with micro-vascular invasion vs without micro-vascular invasion, high-grade HCC vs low-grade HCC). (3) Next, Least Absolute Shrinkage and Selection Operator (LASSO) regression or Support Vector Machine-Recursive Feature Elimination (SVM-RFE) was carried out to determine the final feature subset.
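Step (1) can be sketched as follows. The text does not state which ICC form was used, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1); the feature names and values are toy data, and only the 0.8 cut-off comes from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) for one feature.

    `ratings` is a list of rows, one per patient, each row holding the
    feature value obtained from each observer's segmentation.
    """
    n = len(ratings)          # number of patients
    k = len(ratings[0])       # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-patient mean square
    msc = ss_cols / (k - 1)                 # between-observer mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square

    denominator = msr + (k - 1) * mse + k * (msc - mse) / n
    if denominator == 0:
        return float("nan")  # constant feature: agreement is undefined
    return (msr - mse) / denominator


def reproducible_features(values_by_feature, threshold=0.8):
    """Keep the features whose ICC is at least `threshold`."""
    return [
        name
        for name, ratings in values_by_feature.items()
        if icc_2_1(ratings) >= threshold
    ]


# Toy values from two observers for four patients (not study data).
toy_features = {
    "feature_a": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2]],
    "feature_b": [[1.0, 3.0], [2.0, 1.0], [3.0, 2.0], [4.0, 1.5]],
}
kept = reproducible_features(toy_features)
```

In this toy data the two observers nearly agree on `feature_a` and disagree strongly on `feature_b`, so only the former survives the filter.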

Radiomics-based diagnostic models

For predicting both micro-vascular invasion and high-grade HCC, machine-learning classifiers including random forest (RF) and support vector machine (SVM) were established on each selected feature subset using the R packages randomForest and e1071. Consequently, a total of 8 models were constructed. Five-fold cross-validation was then used to select the best radiomics-based model, defined as the one with the highest area under the receiver operating characteristic (ROC) curve (AUC). To avoid sampling bias, stratified sampling was performed in this study.
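The splitting and the model grid described above can be sketched in a few lines. This is an illustration only: the study used R, the helper name `stratified_folds` and the random seed are assumptions, and the label vector merely reproduces the class sizes reported in Table 1 (53 low-grade, 29 high-grade).

```python
import itertools
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Split sample indices into folds that roughly preserve class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Grade labels with the cohort's class sizes: 0 = low grade, 1 = high grade.
grade_labels = [0] * 53 + [1] * 29
folds = stratified_folds(grade_labels)

# Two targets x two selection strategies x two classifiers = 8 models.
model_grid = list(
    itertools.product(["Grade", "MVI"], ["LASSO", "SVM-RFE"], ["SVM", "RF"])
)
```

Each fold serves once as the validation set while the other four are used for training; reusing the same `folds` for every entry of `model_grid` keeps the comparison between models on equal footing.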

Nomogram-based predictor

A binary logistic regression model was first used to screen for independent clinical risk factors and to establish the clinical model. The predicted probability of the best radiomics model was taken as the radiomics score (Rscore). Then, the nomogram-based predictors were constructed with the Rscore and the independent clinical risk factors.
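A nomogram of this kind is a graphical form of a logistic model, so its output can be sketched as a logistic function of the Rscore and the clinical factors. The coefficients, intercept, variable names and patient values below are hypothetical placeholders chosen for illustration; they are not the fitted values of this study.

```python
import math


def predicted_probability(intercept, coefficients, patient):
    """Logistic model: probability = 1 / (1 + exp(-linear predictor))."""
    linear = intercept + sum(
        coefficients[name] * patient[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients (NOT taken from the study).
coefficients = {"rscore": 4.0, "age_decades": 0.2, "male": 0.5}
intercept = -4.0

# Two hypothetical patients who differ only in their radiomics score.
patient_low = {"rscore": 0.2, "age_decades": 6.0, "male": 1}
patient_high = {"rscore": 0.9, "age_decades": 6.0, "male": 1}

p_low = predicted_probability(intercept, coefficients, patient_low)
p_high = predicted_probability(intercept, coefficients, patient_high)
```

With a positive Rscore coefficient, the patient with the higher Rscore receives the higher predicted probability, which is the behaviour the nomogram's point scale expresses graphically.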

Statistical analysis

The ICC was calculated to quantify the intraclass agreement of the feature values given by the two observers. The Student t test was performed to identify image features showing significant differences between subgroups. The diagnostic performance of the different models was evaluated by receiver operating characteristic (ROC) curve analysis. The indexes of diagnostic performance included the sensitivity, specificity, area under the curve (AUC) and Youden index. It should be noted that, to obtain reliable statistical results, the establishment, evaluation and comparison of the radiomics model, clinical model and nomogram predictor were based on the same 5-fold data split. P values of less than 0.05 were regarded as statistically significant. All statistical analyses were conducted with SPSS 26.0 (SPSS, Chicago, IL, USA), R (R language 4.0.3, R Core Team, 2020) and MedCalc (MedCalc Software, Belgium).
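The performance indexes named above can be illustrated with a short sketch. It computes the AUC as the fraction of positive/negative pairs ranked correctly (ties counted as one half) and the Youden index as sensitivity + specificity − 1, maximized over candidate thresholds. The scores and labels are toy values, and the function names are assumptions; the study itself used SPSS, R and MedCalc.

```python
def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def best_youden(scores, labels):
    """Return (J, threshold, sensitivity, specificity) with the largest J.

    A case is called positive when its score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, threshold, sensitivity, specificity)
    return best


# Toy model outputs and ground-truth labels (not study data).
toy_scores = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]
toy_labels = [0, 0, 1, 0, 1, 1]

toy_auc = auc(toy_scores, toy_labels)
toy_j, toy_threshold, toy_sens, toy_spec = best_youden(toy_scores, toy_labels)
```

In the toy data, 8 of the 9 positive/negative pairs are ordered correctly, and the best operating point reaches a Youden index of 2/3.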

Results

Patient cohort

A total of 97 patients were accessed from the TCGA-LIHC collection (https://wiki.cancerimagingarchive.net/display/Public/TCGA-LIHC) in The Cancer Imaging Archive (TCIA). Ten patients were excluded because complete CT images were absent, 3 because histopathological results were unavailable, and 2 because the images were of poor quality or incomplete. Ultimately, 82 patients (Male: 54, Female: 28; Age: 61.8 ± 14.0, Min: 20, Max: 85) were included. Detailed baseline clinical characteristics are listed in Table 1.

Table 1

Patient characteristics.

Characteristics	Values	p Values (p^£/p^§)
Mean age (years)	61.8 ± 14.0 (Min: 20, Max: 85)	0.043^£/0.021^§
Gender	Men (54/65.9%)/Women (28/34.1%)	0.039^£/0.044^§
Mean Height (cm)	167.1 ± 13.5 (Min: 64.0, Max: 188.0)	0.872^£/0.451^§
Mean Weight (kg)	78.0 ± 20.9 (Min: 47.0, Max: 129.0)	0.542^£/0.770^§
Race	NA (6/7.3%)/Black (3/3.7%)/White (50/61.0%)/Asian (23/28.0%)	0.890^£/0.718^§
Liver Cirrhosis	With (59/72.0%)/Without (23/28.0%)	0.072^£/0.088^§
Tumor Burden	Unifocal (68/82.9%)/ Multifocal (14/17.1%)	0.121^£/0.205^§
No. of Lesions	2.9 ± 2.3 (Min: 1.0, Max: 8.0)	0.100^£/0.151^§
Diameter (cm)	5.9 ± 2.1 (Min: 1.2, Max: 14.2)	0.080^£/0.720^§
Segment Location	I-II (3/3.7%)/III (7/8.5%)/IV (18/22.0%)/V (13/15.9%)/VI (22/26.8%)/VII-VIII (19/23.2%)	0.388^£/0.659^§
Etiology	Hepatitis B virus (25/30.5%)/Hepatitis C virus (34/41.5%)/Alcohol and other (23/28.0%)	0.062^£/0.215^§
MVI	Negative (25/30.5%)/ Positive (57/69.5%)	0.040^£
Grade	Low (53/64.6%)/High (29/35.4%)	0.033^§
Alpha Fetoprotein	7498.0 ± 13,787.0 (Min: 1.0, Max: 103,900)	0.022^£/0.045^§
T Stage	T1 (35/42.7%)/T2 (16/19.5%)/T3 (7/8.5%)/T3a (9/11.0%)/T3b (8/9.8%)/T4 (7/8.5%)	0.004^£/0.002^§
N Stage	N0 (63/76.8%)/N1(3/3.7%)/NX (16/19.5%)	0.101^£/0.171^§
M Stage	M0 (68/82.9%)/M1 (4/4.9%)/MX (10/12.2%)	0.235^£/0.188^§
Child-Pugh classification	NA (12/14.6%)/A (53/64.6%)/B (17/20.7%)	0.098^£/0.297^§
ECOG Score	0 (41/50.0%)/1 (32/39.0%)/2 (9/11.0%)	0.132^£/0.007^§

Note: 1) NA indicates the results are unknown or unavailable. 2) p£ and p§ represent the p values for histopathologic grade and micro-vascular invasion, respectively. The p values were calculated according to the Fisher's exact test for categorical variables and the p values were calculated according to the Mann-Whitney U test for continuous variables.

Radiomics features and models

A total of 321 radiomics features were extracted from the arterial-, venous- and delayed-phase CT images of each included patient. 39 features were removed because of a low intraclass correlation coefficient (ICC < 0.8). For identifying the histopathological grade and MVI, 217 and 251 features, respectively, were then removed because they showed no significant differences between the subgroups (high grade vs low grade and MVI (+) vs MVI (-)). Next, two feature selection strategies, LASSO regression and SVM-RFE, were each applied to further eliminate redundant features. As Table 2 and Table 3 show, 7 and 10 radiomics features were selected via LASSO regression and SVM-RFE, respectively, for assessing tumor grade, and 13 and 10 features were selected via LASSO regression and SVM-RFE, respectively, for evaluating MVI. Pairing each of the two feature selection methods with each of the two machine-learning classifiers then resulted in four radiomics-based predictive models for evaluating grade and four for evaluating MVI. The 5-fold diagnostic performance of these eight models is shown in Table 4, which suggests that the combination of SVM-RFE and RF had the best performance for grading HCC (AUC values: Fold 1: 0.906, Fold 2: 0.917, Fold 3: 0.857, Fold 4: 0.938, Fold 5: 0.875, Mean: 0.898). In contrast, the combination of LASSO regression and RF had the best performance for identifying MVI (AUC values: Fold 1: 0.889, Fold 2: 0.926, Fold 3: 0.889, Fold 4: 0.800, Fold 5: 0.875, Mean: 0.876). The ROC curves of the two best models are shown in Fig. 2. Consequently, the 10 features selected via SVM-RFE and the 13 features selected via LASSO regression served as the best feature subsets for evaluating grade and MVI, respectively. Fig. 3 shows the categorical distribution of these two feature subsets.
Within the best feature subset for grading HCC, the numbers of first-order, GLCM, GLDM, GLRLM, GLSZM, NGTDM and shape features were 1 (10%), 2 (20%), 4 (40%), 0, 0, 1 (10%) and 2 (20%), respectively. Within the best feature subset for assessing MVI, the numbers of first-order, GLCM, GLDM, GLRLM, GLSZM, NGTDM and shape features were 2 (15.4%), 3 (23.1%), 1 (7.7%), 1 (7.7%), 2 (15.4%), 1 (7.7%) and 3 (23.1%), respectively. Fig. 4 displays the value distribution of the selected features in the different subgroups (high grade, low grade, MVI (+), MVI (-)).

Table 2

Selected features for grading HCC.

Selection strategy	Radiomics features	Category	Enhanced phase
LASSO Regression (7 features)	Median	FO	VP
	Autocorrelation	GLCM	DP
	Contrast	GLCM	AP
	Joint Entropy	GLCM	AP
	Large Dependence High Gray Level Emphasis	GLCM	VP
	Low Gray Level Zone Emphasis (LGLZE)	GLSZM	VP
	Elongation	Shape	DP
SVM-RFE (Top 10 features)	Median	FO	AP
	Autocorrelation	GLCM	VP
	Contrast	GLCM	VP
	Low Gray Level Emphasis (LGLE)	GLDM	DP
	Dependence Entropy (DE)	GLDM	AP
	Dependence Non-Uniformity (DN)	GLDM	VP
	Large Dependence Emphasis (LDE)	GLDM	DP
	Coarseness	NGTDM	AP
	Elongation	Shape	DP
	Flatness	Shape	DP

Notes: 1) FO is the abbreviation of First-Order. 2) Enhanced phase indicates which phase of CT images the corresponding features were extracted from. 3) AP, VP, and DP represent the arterial-phase (AP), venous-phase (VP) and delay-phase (DP), respectively.

Table 3

Selected features for identifying the MVI status.

Selection strategy	Radiomics features	Category	Enhanced phase
LASSO Regression (13 features)	Kurtosis	FO	AP
	Total Energy	FO	VP
	Autocorrelation	GLCM	AP
	Contrast	GLCM	DP
	Difference Entropy	GLCM	VP
	Low Gray Level Emphasis (LGLE)	GLDM	VP
	Gray Level Non-Uniformity (GLN)	GLRLM	VP
	Low Gray Level Zone Emphasis (LGLZE)	GLSZM	AP
	Small Area Low Gray Level Emphasis (SALGLE)	GLSZM	VP
	Coarseness	NGTDM	DP
	Elongation	Shape	VP
	Flatness	Shape	AP
	Spherical Disproportion	Shape	DP
SVM-RFE (Top 10 features)	Mean	FO	VP
	Joint Average	GLCM	AP
	Autocorrelation	GLCM	DP
	Cluster Shade	GLCM	DP
	Difference Entropy	GLCM	VP
	Small Dependence Emphasis (SDE)	GLDM	VP
	Gray Level Non-Uniformity (GLN)	GLRLM	AP
	Low Gray Level Zone Emphasis (LGLZE)	GLSZM	VP
	Zone Percentage (ZP)	GLSZM	DP
	Spherical Disproportion	Shape	AP

Table 4

Diagnostic performance of radiomics-based models.

	FeatureSelection	Classifier	Fold1	Fold2	Fold3	Fold4	Fold5	Mean
Grade	LASSO	SVM	0.625	0.625	0.686	0.667	0.586	0.638
Grade	SVM-RFE	SVM	0.800	0.667	0.700	0.778	0.729	0.735
Grade	LASSO	RF	0.806	0.686	0.815	0.778	0.639	0.745
Grade	SVM-RFE	RF	0.906	0.917	0.857	0.938	0.875	0.898
MVI	LASSO	SVM	0.611	0.667	0.667	0.833	0.625	0.681
MVI	SVM-RFE	SVM	0.833	0.833	0.833	0.833	0.750	0.817
MVI	LASSO	RF	0.889	0.926	0.889	0.800	0.875	0.876
MVI	SVM-RFE	RF	0.624	0.762	0.715	0.780	0.812	0.721

Note: The values are the AUC values.
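The model selection rule, choosing for each target the pairing with the highest mean AUC over the five folds, can be sketched directly from the per-fold values in Table 4. The dictionary layout and the helper name `best_model` are assumptions for illustration; the means are recomputed from the per-fold values, so they need not match the table's Mean column exactly.

```python
from statistics import mean

# Per-fold AUC values copied from Table 4, keyed by
# (target, feature selection strategy, classifier).
fold_auc = {
    ("Grade", "LASSO", "SVM"): [0.625, 0.625, 0.686, 0.667, 0.586],
    ("Grade", "SVM-RFE", "SVM"): [0.800, 0.667, 0.700, 0.778, 0.729],
    ("Grade", "LASSO", "RF"): [0.806, 0.686, 0.815, 0.778, 0.639],
    ("Grade", "SVM-RFE", "RF"): [0.906, 0.917, 0.857, 0.938, 0.875],
    ("MVI", "LASSO", "SVM"): [0.611, 0.667, 0.667, 0.833, 0.625],
    ("MVI", "SVM-RFE", "SVM"): [0.833, 0.833, 0.833, 0.833, 0.750],
    ("MVI", "LASSO", "RF"): [0.889, 0.926, 0.889, 0.800, 0.875],
    ("MVI", "SVM-RFE", "RF"): [0.624, 0.762, 0.715, 0.780, 0.812],
}


def best_model(target):
    """Return the (target, selector, classifier) key with the highest mean AUC."""
    candidates = {
        key: mean(values) for key, values in fold_auc.items() if key[0] == target
    }
    return max(candidates, key=candidates.get)


best_for_grade = best_model("Grade")
best_for_mvi = best_model("MVI")
```

Under this rule the selected pairings are SVM-RFE with RF for grade and LASSO with RF for MVI, in line with the text.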

Fig. 2

Diagnostic performance of best radiomics-based models for evaluating the tumor grade and MVI.

Fig. 3

Categorical distribution of best feature subset for assessing tumor grade (first row) and MVI (second row).

Fig. 4

Value distribution of selected features in different subgroups (High Grade, Low Grade, MVI (+), MVI (-)). Note: 1) A grid in the longitudinal direction represents a patient. A grid on the horizontal represents a feature. 2) Only the best feature subsets for assessing the tumor grade and MVI are displayed.

Construction of nomograms integrating clinical risk factors and the radiomics score (Rscore)

In the binary logistic regression with the clinical factors as independent variables and grade or MVI status as the dependent variable, age, gender, alpha-fetoprotein (AFP) and tumor stage were identified as independent risk factors of tumor grade (p < 0.05), and age, gender, AFP, tumor stage and Eastern Cooperative Oncology Group (ECOG) score were identified as independent risk factors of MVI (p < 0.05) (Table 5). Therefore, these risk factors and the Rscore were used to construct the nomograms. The nomograms for assessing HCC grade and MVI status are displayed in Fig. 5 and Fig. 6. In addition, Fig. 5 and Fig. 6 show the diagnostic performance of the different models: the clinical models established with clinical factors, the radiomics models established with radiomics features, and the nomogram predictors. The nomogram predictor had the best performance for predicting tumor grade (AUC: 0.928), followed by the radiomics model (AUC: 0.876) and the clinical model (AUC: 0.731). Similarly, the nomogram predictor also had the best performance for identifying MVI status (AUC: 0.945), followed by the radiomics model (AUC: 0.890) and the clinical model (AUC: 0.716) (Fig. 5, Fig. 6 and Table 6). As shown in Table 6, for predicting both HCC grade and MVI status, the diagnostic efficacy of the radiomics model was significantly higher than that of the clinical model. Furthermore, the diagnostic performance of the nomogram predictors for evaluating grade and MVI was significantly better than that of both the clinical models and the radiomics models (p < 0.05).

Table 5

Independent clinical risk factors for histopathologic grade and micro-vascular invasion.

Histopathologic Grade
Clinical Variables	Coefficients	SD	P values
Age	1.256	0.413	0.045
Gender	0.523	0.348	0.037
AFP	0.274	0.080	0.029
Tumor Stage	1.767	0.692	0.002
MVI

Clinical Variables	Coefficients	SD	P values
Age	1.075	0.621	0.030
Gender	0.481	0.256	0.041
AFP	0.188	0.092	0.048
Tumor Stage	1.583	0.871	0.011
ECOG Score	1.989	0.674	0.005

Fig. 5

Nomogram-based predictor for grading HCC and the diagnostic performance comparison.

Fig. 6

Nomogram-based predictor for identifying MVI status and the diagnostic performance comparison.

Table 6

Diagnostic performance evaluation and comparison.

Model Evaluation
	Sensitivity (%)	Specificity (%)	AUC	Youden index
Grade
Clinical Model	85.0	61.0	0.731	0.460
Radiomics Model	90.0	80.5	0.876	0.705
Nomogram Predictor	80.0	95.1	0.928	0.751
MVI
Clinical Model	95.0	56.1	0.716	0.511
Radiomics Model	95.0	70.3	0.890	0.657
Nomogram Predictor	95.0	80.5	0.945	0.755
Model Comparison
Grade			P Values
Radiomics Model vs Clinical Model			0.005
Nomogram predictor vs Clinical Model			0.002
Nomogram predictor vs Radiomics Model			0.038
MVI
Radiomics Model vs Clinical Model			0.002
Nomogram predictor vs Clinical Model			0.001
Nomogram predictor vs Radiomics Model			0.045

Note: The model highlighted by red color is the better model compared to the other model.

Discussion

The highlights of this research are as follows: (1) CT-derived radiomics features were used to construct diagnostic models for predicting two prognostic markers, histopathological grade and MVI. Compared with many previously reported studies that evaluated a single prognostic factor, evaluating several prognostic indicators provides a more comprehensive characterization of the tumor during clinical management. (2) The patients included in this study came from multiple centers. Besides, as displayed in Table 1, the patients belonged to multiple races. Such a data source helps to support the applicability of the strategy proposed in this research. (3) By integrating clinical risk factors and radiomics features derived from CT images, the nomogram predictors showed excellent diagnostic efficacy for evaluating histopathological grade and MVI. In this research, most features selected for predicting tumor grade and MVI status were high-order texture features rather than the widely used first-order features. First-order features accounted for only 10.0% and 15.4% of the best feature subsets, which is similar to many previous studies [28], [29], [30]. In daily clinical practice, CT-based diagnostic conclusions are usually drawn by visual inspection. Such diagnostic insights are essentially based on first-order features such as overall attenuation (mean or median value). Although invisible to the naked eye, many high-order texture features are of great importance for clinical application [31], [32], [33]. On the one hand, texture features can serve as quantitative image markers for biomedical application. On the other hand, texture features can be used to construct diagnostic models for various clinical applications such as tumor diagnosis, treatment efficacy evaluation and prognostic prediction.
Moreover, the rapid development of artificial intelligence technology, including machine learning, deep learning, reinforcement learning and transfer learning, also brings new possibilities for radiomics [34], [35]. These points further indicate the advantage of extracting CT-derived radiomics features for biomedical application. In detail, the best feature subset for grading HCC contained 10 features: Median, Autocorrelation, Contrast, Low Gray Level Emphasis (LGLE), Dependence Entropy (DE), Dependence Non-Uniformity (DN), Large Dependence Emphasis (LDE), Coarseness, Elongation and Flatness. Similarly, the best feature subset for identifying MVI status contained 13 features: Kurtosis, Total Energy, Autocorrelation, Contrast, Difference Entropy, Low Gray Level Emphasis (LGLE), Gray Level Non-Uniformity (GLN), Low Gray Level Zone Emphasis (LGLZE), Small Area Low Gray Level Emphasis (SALGLE), Coarseness, Elongation, Flatness and Spherical Disproportion (Table 3). These features characterize tumor micro-structural heterogeneity in terms of gray level distribution, inhomogeneity of signal intensity, morphological differences and so forth. For example, Autocorrelation quantifies the magnitude of the fineness and coarseness of texture. Tumors with high heterogeneity tend to have a coarser texture [36]. Contrast quantifies the variation of local signal intensity [37]. Elongation serves as a measure of the irregularity of the ROI shape [38]. With the assistance of high-order features that are not visible to the naked eye, different clinical models can be established to achieve different clinical goals. In this research, good diagnostic performance for evaluating tumor grade and MVI was achieved with the radiomics-based predictive models. The potential reasons are as follows: (1) The feature selection strategy was carefully designed. Feature selection in this study comprised three main steps.
Unstable and meaningless features were first removed according to the ICC and the Student t test. Then, LASSO regression and SVM-RFE were each performed. LASSO regression and SVM-RFE are two machine-learning-based feature selection strategies that have shown great potential in constructing clinical predictive models [39], [40]. (2) Two machine-learning classifiers, SVM and RF, were then established. Compared with conventional linear classifiers such as regression-based models, SVM applies a nonlinear transformation to a high-dimensional feature space and constructs a discriminant function in that space to classify the samples, while avoiding the curse of dimensionality [41]. By integrating multiple classification trees, the random forest can achieve higher classification accuracy. In addition, owing to the randomness it introduces, it has a certain robustness to noise [42]. (3) Rather than using a single feature selection approach and a single classifier to construct a single model, this research paired two feature selection approaches (LASSO regression and SVM-RFE) with two classifiers (SVM and RF) for each of the two prediction targets, yielding 8 predictive models in total, which helped to obtain the model with the best performance. To obtain more powerful predictors, nomogram-based predictors were constructed with the clinical risk factors and the radiomics model. Age, gender, tumor stage and AFP were screened as independent risk factors of HCC grade, and age, gender, AFP, tumor stage and ECOG score were selected as independent risk factors of MVI. High tumor stage and higher AFP levels were more common in patients with MVI and high-grade HCC. Furthermore, a higher ECOG score, in this study, also indicated a high probability of MVI. These results are consistent with many previous studies [43], [44], [45].
Besides, our results demonstrated that age and gender were also associated with the histopathologic grade and MVI of HCC, which corresponds to previous findings in which age and gender served as independent risk factors and were incorporated into nomogram-based predictors [46], [47], [48]. Importantly, this study suggests that the integration of clinical indicators and radiomics resulted in strong diagnostic power for assessing tumor grade and MVI (AUC > 0.900). The diagnostic efficacy of the nomogram-based predictors was significantly better than that of either the radiomics model or the clinical model. These results reveal that CT-based radiomics can be applied to predict multiple important prognostic markers simultaneously, which has clinical potential not only in hepatic diseases but also in other cancers. The excellent predictive power may result from the following factors: (1) The combination of clinical risk factors and radiomics led to a comprehensive characterization of HCC from multiple perspectives. (2) The radiomics-based model laid a solid foundation for the performance of the nomogram-based predictors. Several limitations of this study should be acknowledged. Firstly, although approaches such as cross-validation were carried out, the sample size of this study is not very large, which carries a potential risk of statistical bias. In subsequent studies, efforts should be made to enroll more patients and strengthen the evidence. Secondly, no patient cohort was used as an external validation group. Thirdly, only two prognostic factors, MVI and grade, were incorporated as predictive targets. More markers should be incorporated to evaluate the feasibility of applying the radiomics-based model to predict multiple markers.

Conclusion

This research indicated that CT-derived high-throughput radiomics features can serve as quantitative biomarkers for characterizing hepatocellular carcinoma. Furthermore, with the assistance of machine learning, accurate and non-invasive prediction of histopathological grade and micro-vascular invasion can be achieved, which holds great potential for guiding clinical management and predicting the prognosis of patients with HCC.

Ethics approval and consent to participate

Local ethical approval (20-1574AB) was obtained from the Institutional Review Board of Qiqihar Medical University. Written informed consent was obtained from all patients, as declared in the data source.

Funding

This project was supported by the Project of Heilongjiang Provincial Health Commission (Grant No. 2020-441).

CRediT authorship contribution statement

Jing Li: Conceptualization, Writing – review & editing. Xu Tong: Software, Data curation, Visualization, Investigation, Validation, Supervision, Writing – original draft, Methodology.

44 in total

1. Primary carcinoma of the liver: a study of 100 cases among 48,900 necropsies.

Authors: H A EDMONDSON; P E STEINER
Journal: Cancer Date: 1954-05 Impact factor: 6.860

2. Diffusion-weighted imaging (DWI) of hepatocellular carcinomas: a retrospective analysis of the correlation between qualitative and quantitative DWI and tumour grade.

Authors: T Jiang; J H Xu; Y Zou; R Chen; L R Peng; Z D Zhou; M Yang
Journal: Clin Radiol Date: 2017-01-19 Impact factor: 2.350

3. Correlation between CT based radiomics features and gene expression data in non-small cell lung cancer.

Authors: Ting Wang; Jing Gong; Hui-Hong Duan; Li-Jia Wang; Xiao-Dan Ye; Sheng-Dong Nie
Journal: J Xray Sci Technol Date: 2019 Impact factor: 1.535

4. Clear cell renal cell carcinoma: CT-based radiomics features for the prediction of Fuhrman grade.

Authors: Jun Shu; Yongqiang Tang; Jingjing Cui; Ruwu Yang; Xiaoli Meng; Zhengting Cai; Jingsong Zhang; Wanni Xu; Didi Wen; Hong Yin
Journal: Eur J Radiol Date: 2018-10-05 Impact factor: 3.528

5. Radiomic analysis of contrast-enhanced CT predicts microvascular invasion and outcome in hepatocellular carcinoma.

Authors: Xun Xu; Hai-Long Zhang; Qiu-Ping Liu; Shu-Wen Sun; Jing Zhang; Fei-Peng Zhu; Guang Yang; Xu Yan; Yu-Dong Zhang; Xi-Sheng Liu
Journal: J Hepatol Date: 2019-03-13 Impact factor: 25.083

6. Development and Validation of a Radiomics Nomogram for Preoperative Prediction of Lymph Node Metastasis in Colorectal Cancer.

Authors: Yan-Qi Huang; Chang-Hong Liang; Lan He; Jie Tian; Cui-Shan Liang; Xin Chen; Ze-Lan Ma; Zai-Yi Liu
Journal: J Clin Oncol Date: 2016-05-02 Impact factor: 44.544

7. Role of baseline volumetric functional MRI in predicting histopathologic grade and patients' survival in hepatocellular carcinoma.

Authors: Sanaz Ameli; Mohammadreza Shaghaghi; Mounes Aliyari Ghasabeh; Pallavi Pandey; Bita Hazhirkarzar; Maryam Ghadimi; Roya Rezvani Habibabadi; Pegah Khoshpouri; Ankur Pandey; Robert A Anders; Ihab R Kamel
Journal: Eur Radiol Date: 2020-03-06 Impact factor: 5.315

8. CK19 and Glypican 3 Expression Profiling in the Prognostic Indication for Patients with HCC after Surgical Resection.

Authors: Jiliang Feng; Ruidong Zhu; Chun Chang; Lu Yu; Fang Cao; Guohua Zhu; Feng Chen; Hui Xia; Fudong Lv; Shijie Zhang; Lin Sun
Journal: PLoS One Date: 2016-03-15 Impact factor: 3.240

9. Radiomics: Images Are More than Pictures, They Are Data.

Authors: Robert J Gillies; Paul E Kinahan; Hedvig Hricak
Journal: Radiology Date: 2015-11-18 Impact factor: 11.105

10. A model combining TNM stage and tumor size shows utility in predicting recurrence among patients with hepatocellular carcinoma after resection.

Authors: Yu Zhang; Shu-Wei Chen; Li-Li Liu; Xia Yang; Shao-Hang Cai; Jing-Ping Yun
Journal: Cancer Manag Res Date: 2018-09-20 Impact factor: 3.989