Jianfang Liu1, Wei Guo1, Piaoe Zeng1, Yayuan Geng2, Yan Liu3, Hanqiang Ouyang4, Ning Lang1, Huishu Yuan5. 1. Department of Radiology, Peking University Third Hospital, 49 North Garden Road, Haidian District, Beijing, 100191, People's Republic of China. 2. Huiying Medical Technology Co., Ltd., Dongsheng Science and Technology Park, HaiDian District, Beijing, 100192, People's Republic of China. 3. Department of Hematology, Lymphoma Research Center, Peking University Third Hospital, 49 North Garden Road, Haidian District, Beijing, 100191, People's Republic of China. 4. Department of Orthopedic, Peking University Third Hospital, 49 North Garden Road, Haidian District, Beijing, 100191, People's Republic of China. 5. Department of Radiology, Peking University Third Hospital, 49 North Garden Road, Haidian District, Beijing, 100191, People's Republic of China. huishuy677@163.com.
Abstract
OBJECTIVES: This study aimed to establish a vertebral MRI-based radiomics model that uses the most frequently selected features to differentiate multiple myeloma (MM) from metastases, and to compare model performance across different numbers of features.
METHODS: We retrospectively analyzed conventional MRI (T1WI and fat-suppression T2WI) of 103 MM patients and 138 patients with metastases. The feature selection process included four steps. The first three steps, defined as conventional feature selection (CFS), were carried out 50 times (ten repetitions of 5-fold cross-validation) and comprised variance thresholding, SelectKBest, and the least absolute shrinkage and selection operator (LASSO). In the last step, the features selected most frequently across these runs were fixed for modeling. The number of events per independent variable (EPV) is the number of patients in the smaller subgroup divided by the number of radiomics features considered in developing the prediction model. The EPV values considered were 5, 10, 15, and 20; accordingly, we constructed four models using the top 16, 8, 6, and 4 most frequent features, respectively. Models constructed with features selected by CFS were also compared.
RESULTS: The AUCs of the 20EPV-Model, 15EPV-Model, and CFS-Model (AUC = 0.71, 0.81, and 0.78, respectively) were lower than that of the 10EPV-Model (AUC = 0.84, p < 0.001). The AUC of the 10EPV-Model was comparable to that of the 5EPV-Model (AUC = 0.85, p = 0.480).
CONCLUSIONS: A radiomics model constructed with an appropriately small number of the most frequent features could distinguish metastases from MM well on conventional vertebral MRI. Based on our results, we recommend an EPV of 10 as the rule of thumb for feature selection.
KEY POINTS:
• The developed radiomics model could distinguish metastases from multiple myeloma based on conventional vertebral MRI.
• An accurate model based on just a handful of the most frequent features could be constructed by utilizing multiple feature reduction techniques.
• An events per independent variable value of 10 is recommended as a rule of thumb for feature selection in modeling.
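The EPV definition in the Methods can be turned around to give a feature budget: the number of features allowed is the size of the smaller subgroup divided by the target EPV. The sketch below illustrates that arithmetic only. The function name, the rounding-down rule, and the subgroup size of 80 are assumptions made for illustration; the abstract does not state which subgroup size was used, so the output need not match the 16, 8, 6, and 4 features reported.

```python
import math


def max_features_for_epv(n_smaller_group: int, epv: float) -> int:
    """Largest number of features that keeps events-per-variable >= epv.

    EPV = (patients in the smaller subgroup) / (number of features),
    so the feature budget is n_smaller_group / epv, rounded down here
    (rounding down is an assumption, not a rule stated in the abstract).
    """
    if n_smaller_group <= 0 or epv <= 0:
        raise ValueError("n_smaller_group and epv must be positive")
    return math.floor(n_smaller_group / epv)


# Hypothetical smaller-subgroup size, chosen only to illustrate the arithmetic.
ASSUMED_SMALLER_GROUP = 80

feature_budget = {
    epv: max_features_for_epv(ASSUMED_SMALLER_GROUP, epv)
    for epv in (5, 10, 15, 20)
}
print(feature_budget)
```

A stricter EPV target always yields a smaller feature budget, which is why the 20 EPV model uses the fewest features.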
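The last selection step described in the Methods, keeping the features chosen most often across the repeated selection runs, can be sketched as a simple frequency count. This is a minimal illustration, not the authors' implementation: it assumes the first three steps (variance threshold, SelectKBest, LASSO) have already produced one list of selected feature names per run, and the function name, the alphabetical tie-breaking rule, and the feature names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def most_frequent_features(runs: Iterable[Sequence[str]], k: int) -> List[str]:
    """Return the k feature names that were selected in the most runs."""
    counts: Counter = Counter()
    for selected in runs:
        # Count each feature at most once per run.
        counts.update(set(selected))
    # Sort by descending frequency, then by name so ties break deterministically
    # (the tie-breaking rule is an assumption).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical output of four selection runs; the study describes 50 runs.
example_runs = [
    ["t1_entropy", "t2_skewness", "t1_energy"],
    ["t1_entropy", "t2_skewness"],
    ["t1_entropy", "t2_kurtosis"],
    ["t2_skewness", "t1_entropy", "t2_kurtosis"],
]

top_two = most_frequent_features(example_runs, k=2)
print(top_two)
```

Here `k` would be set to the feature budget implied by the chosen EPV, so the same ranking can serve all four models by truncating it at different lengths.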