Jiliang Ren (1), Ying Yuan (1), Meng Qi (2), Xiaofeng Tao (3)
1. Department of Radiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, No. 639 Zhizaoju Road, Shanghai, 200010, China.
2. Department of Radiology, Eye & ENT Hospital of Shanghai Medical School, Fudan University, No. 83 Fenyang Road, Shanghai, 200030, China.
3. Department of Radiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, No. 639 Zhizaoju Road, Shanghai, 200010, China. cjr.taoxiaofeng@vip.163.com.
Abstract
OBJECTIVE: To compare the CT texture feature reproducibility of 2D and 3D segmentations and the performance of their machine learning (ML)-based classifications for predicting human papilloma virus (HPV) status in oropharyngeal squamous cell carcinoma (OPSCC).
MATERIALS AND METHODS: Data from 47 patients with pathologically confirmed OPSCC (15 HPV positive and 32 HPV negative) were collected from a public database. Using 2D and 3D manual segmentations, 1032 texture features were extracted from contrast-enhanced CT images. Intraclass correlation coefficients (ICCs) were calculated to evaluate intraobserver and interobserver reproducibility. Collinearity analysis and a wrapper-based subset search algorithm were used for feature selection. Models were created using k-nearest neighbors (k-NN), logistic regression (LR), and random forest (RF), each alone and with a synthetic minority oversampling technique (SMOTE). Classifier performance was assessed using 10-fold cross-validation.
RESULTS: 3D segmentation yielded more texture features with reliable reproducibility (good to excellent in both intraobserver and interobserver analyses) than 2D segmentation (576 of 1032, 55.8% vs. 468 of 1032, 45.3%; p < 0.001). With or without SMOTE, the RF and k-NN classifiers did not perform better with 3D features than with 2D features. The best models for 2D and 3D segmentations were both created using RF. RF alone achieved areas under the curve (AUCs) of 0.880 (2D) and 0.847 (3D); with SMOTE, the AUCs were 0.953 (2D) and 0.920 (3D).
CONCLUSIONS: Three-dimensional segmentation had better CT texture feature reproducibility, but 2D segmentation showed better classification performance. Considering the cost, 2D segmentation is recommended over 3D segmentation for ML-based classification of HPV status in OPSCC.
KEY POINTS:
• Three-dimensional segmentation had better CT texture feature reproducibility than 2D segmentation.
• Despite yielding more features with reliable reproducibility, 3D segmentation did not provide better classification performance than 2D segmentation for predicting HPV status of oropharyngeal squamous cell carcinoma.
• The best models for 2D and 3D segmentations were both created using a random forest classifier.
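The reproducibility screening described in the abstract (ICCs per feature, keeping features rated good to excellent in both intraobserver and interobserver analyses) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which ICC form or cutoffs were used, so the two-way random effects, absolute agreement, single measurement form ICC(2,1) and the 0.50 / 0.75 / 0.90 cutoffs are assumptions, and the function names `icc_2_1`, `reliability_label`, and `is_reliable` are hypothetical.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ASSUMPTION: the abstract does not name the ICC form; this one is illustrative.
    `ratings` has one row per patient and one column per segmentation
    session (intraobserver) or per reader (interobserver), for ONE feature.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with >= 2 rows and >= 2 columns")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)              # between-patient mean square
    ms_cols = ss_cols / (k - 1)              # between-session/reader mean square
    ms_err = ss_err / ((n - 1) * (k - 1))    # residual mean square

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_err) / denom


def reliability_label(icc: float) -> str:
    """Map an ICC to a verbal grade. ASSUMPTION: cutoffs are a common convention."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


def is_reliable(intra_icc: float, inter_icc: float) -> bool:
    """A feature counts as reliable only if BOTH analyses are good or excellent."""
    acceptable = {"good", "excellent"}
    return (reliability_label(intra_icc) in acceptable
            and reliability_label(inter_icc) in acceptable)


if __name__ == "__main__":
    # Hypothetical values of one texture feature for five patients.
    intra = [[0.41, 0.43], [0.77, 0.75], [0.58, 0.60], [0.92, 0.90], [0.33, 0.36]]
    inter = [[0.41, 0.52], [0.77, 0.70], [0.58, 0.49], [0.92, 0.95], [0.33, 0.45]]
    a, b = icc_2_1(intra), icc_2_1(inter)
    print(f"intraobserver ICC={a:.3f}, interobserver ICC={b:.3f}, keep={is_reliable(a, b)}")
```

Running such a screen once per feature and per segmentation type, then counting the features for which `is_reliable` is true, would give counts analogous to the 468 of 1032 (2D) and 576 of 1032 (3D) reported above.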
Keywords:
Human papilloma virus; Machine learning; Multidetector computed tomography; Squamous cell carcinoma of head and neck