Guohui Wei (1), He Ma (2), Wei Qian (3), Min Qiu (4)
1. Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang 110819, China.
2. Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang 110819, China and Key Laboratory of Medical Image Computing, Ministry of Education, Northeastern University, Shenyang 110819, China.
3. Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang 110819, China and College of Engineering, University of Texas at El Paso, El Paso, Texas 79968.
4. Affiliated Hospital of Jining Medical University, Jining 272029, China.
Abstract
PURPOSE: To develop a new algorithm that measures the similarity between a query lung mass and the masses in a reference data set for content-based medical image retrieval (CBMIR). METHODS: A lung mass data set of 746 mass regions of interest (ROIs) was assembled, of which 375 ROIs depicted malignant lesions and 371 depicted benign lesions. Each mass ROI was represented by a vector of 26 texture features. A kernel function was used to map the original data from the input space to a feature space. In this feature space, a semisupervised distance metric was learned, using a differential scatter discriminant criterion to represent semantic relevance and a regularization term to represent visual similarity. The learned distance metric measures the similarity between the query mass and the masses in the reference data set. Clustering accuracy was used to configure the parameters. Retrieval accuracy and classification accuracy were used as the performance assessment indices. RESULTS: After the parameters were configured, a mean clustering accuracy of 0.87 was achieved. For retrieval accuracy, the proposed algorithm outperformed other state-of-the-art retrieval algorithms under leave-one-out validation on the testing data set. For classification accuracy, the area under the ROC curve of the proposed algorithm was 0.941 ± 0.006. The running times of 346 query images with the proposed algorithm are 5.399 and 6.0 s, respectively. CONCLUSIONS: The results demonstrate that the proposed algorithm outperforms the compared algorithms when both semantic relevance and visual similarity are taken into account in kernel space. The algorithm can be used in a CBMIR system to retrieve masses similar to a query mass, which can help doctors make better decisions.
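The retrieval step described in METHODS can be illustrated with a minimal sketch. It is not the learned semisupervised metric from the study: as a stand-in, it uses the plain distance induced by a Gaussian (RBF) kernel, and the feature vectors are synthetic rather than real texture features. The function names, the `gamma` and `k` parameters, and the top-k precision score are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] - 2.0 * A @ B.T + (B ** 2).sum(1)[None, :]
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_sq_distance(A, B, gamma):
    """Squared distance in the kernel-induced feature space:
    d^2(a, b) = k(a, a) - 2 k(a, b) + k(b, b), where k(a, a) = 1 for RBF."""
    return 2.0 - 2.0 * rbf_kernel(A, B, gamma)

def retrieve(query, refs, k, gamma):
    """Indices of the k reference vectors closest to the query."""
    d = kernel_sq_distance(query[None, :], refs, gamma)[0]
    return np.argsort(d, kind="stable")[:k]

def leave_one_out_precision(X, y, k, gamma):
    """Each sample queries all the others; score is the mean fraction of
    the k retrieved samples that share the query's label."""
    n = len(X)
    hits = 0
    for i in range(n):
        mask = np.arange(n) != i
        idx = retrieve(X[i], X[mask], k, gamma)
        hits += int(np.sum(y[mask][idx] == y[i]))
    return hits / (n * k)

# Synthetic stand-in data (assumption): two well-separated classes,
# 26 features per sample to mirror the 26 texture features in the text.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 26)),
               rng.normal(4.0, 1.0, (20, 26))])
y = np.array([0] * 20 + [1] * 20)

precision = leave_one_out_precision(X, y, k=5, gamma=1.0 / X.shape[1])
print(f"leave-one-out top-5 precision: {precision:.3f}")
```

A learned metric would replace `kernel_sq_distance` while leaving the ranking and leave-one-out loop unchanged, which is why the distance is kept in its own function here.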