U Rajendra Acharya, S Vinitha Sree, Sanjeev Kulshreshtha, Filippo Molinari, Joel En Wei Koh, Luca Saba, Jasjit S Suri.
Abstract
Ovarian cancer is the fifth highest cause of cancer in women and the leading cause of death from gynecological cancers. Accurate diagnosis of ovarian cancer from acquired images depends on the expertise and experience of ultrasonographers or physicians and is therefore subject to inter-observer variability. Computer Aided Diagnostic (CAD) techniques use a number of data mining techniques to automatically predict the presence or absence of cancer, and are therefore more reliable and accurate. A review of published literature in the field of CAD-based ovarian cancer detection indicates that many studies use ultrasound images as the basis for analysis. The key objective of this work is to propose an effective adjunct CAD technique called GyneScan for ovarian tumor detection in ultrasound images. In our proposed data mining framework, we extract several texture features based on first-order statistics, the Gray Level Co-occurrence Matrix (GLCM), and the run length matrix. The significant features, selected using a t-test, are then used to train and test several supervised learning classifiers such as Probabilistic Neural Networks (PNN), Support Vector Machine (SVM), Decision Tree (DT), k-Nearest Neighbor (KNN), and Naive Bayes (NB). We evaluated the developed framework using 1300 benign and 1300 malignant images. Using 11 significant features in KNN/PNN classifiers, we were able to achieve 100% classification accuracy, sensitivity, specificity, and positive predictive value in detecting ovarian tumor. Even though more validation using larger databases would better establish the robustness of our technique, the preliminary results are promising. This technique could be used as a reliable adjunct method to existing imaging modalities to provide a more confident second opinion on the presence or absence of ovarian tumor.
Year: 2013 PMID: 24325128 PMCID: PMC4527478 DOI: 10.7785/tcrtexpress.2013.600273
Source DB: PubMed Journal: Technol Cancer Res Treat ISSN: 1533-0338
Figure 1: Sample ultrasound images of (A) benign ovarian tumor (upper panels) and (B) malignant ovarian tumor (bottom panels).
Figure 2: Block diagram of the proposed system GyneScan™ for ovarian tumor detection.
Definition of first order statistical features.
| S. No. | Features | Description |
|---|---|---|
| 1 | Mean (m) | |
| 2 | Variance (σ²) | |
| 3 | Skewness (Sk) | |
| 4 | Kurtosis (Kt) | |
| 5 | Energy (E) | |
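
The five first-order features listed above can be sketched as follows. This is a minimal illustration using common textbook definitions, not necessarily the exact formulas used in the paper: the function name `first_order_features` is hypothetical, kurtosis is taken as the non-excess fourth standardized moment, and energy is assumed to be the sum of squared gray-level probabilities. The image is assumed to be non-constant (non-zero standard deviation).

```python
import numpy as np

def first_order_features(image):
    """First-order statistics of the pixel intensities of a grayscale image.

    Assumes common textbook definitions; the paper's exact formulas
    (in particular for energy) may differ.
    """
    x = np.asarray(image, dtype=np.float64).ravel()
    m = x.mean()
    var = x.var()                            # population variance
    sd = np.sqrt(var)                        # assumed non-zero
    skew = np.mean((x - m) ** 3) / sd ** 3   # third standardized moment
    kurt = np.mean((x - m) ** 4) / sd ** 4   # non-excess kurtosis (3 for a Gaussian)
    # Energy (assumption): sum of squared gray-level probabilities.
    _, counts = np.unique(x, return_counts=True)
    p = counts / x.size
    energy = float(np.sum(p ** 2))
    return {
        "mean": float(m),
        "variance": float(var),
        "skewness": float(skew),
        "kurtosis": float(kurt),
        "energy": energy,
    }
```

Calling `first_order_features(img)` on a 2-D array of gray levels returns a dictionary with one entry per feature in the table.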
Description of GLCM based textural features.
| S. No. | Haralick feature | Description |
|---|---|---|
| 1 | Contrast | |
| 2 | Autocorrelation | |
| 3 | Maximum probability | |
| 4 | Dissimilarity | |
| 5 | Homogeneity | |
| 6 | Entropy | |
| 7 | Energy | |
| 8 | Correlation | |
| 9 | Cluster shade | |
| 10 | Variance | |
| 11 | Sum average | |
| 12 | Sum entropy | |
| 13 | Sum variance | |
| 14 | Difference variance | |
| 15 | Difference entropy | |
| 16 | Information correlation measure 1 | |
| 17 | Information correlation measure 2 | |
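
As a rough illustration of how GLCM features of this kind are computed, the sketch below builds a normalized, symmetric co-occurrence matrix for a single pixel offset and derives a few of the listed features from it. The function names `glcm` and `glcm_features` are hypothetical, the image is assumed to be already quantized to integer levels `0..levels-1`, and the formulas are common textbook variants (for example, homogeneity with a squared-difference denominator and entropy with the natural logarithm) that may differ from those used in the paper.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix for the offset (dx, dy)."""
    img = np.asarray(image, dtype=np.intp)
    rows, cols = img.shape
    P = np.zeros((levels, levels), dtype=np.float64)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                P[img[r, c], img[r2, c2]] += 1
    if symmetric:
        P = P + P.T              # count each pair in both directions
    return P / P.sum()

def glcm_features(P):
    """A subset of Haralick-style features (textbook definitions, assumed)."""
    i, j = np.indices(P.shape)
    nz = P[P > 0]                # avoid log(0) in the entropy term
    return {
        "contrast": float(np.sum((i - j) ** 2 * P)),
        "dissimilarity": float(np.sum(np.abs(i - j) * P)),
        "homogeneity": float(np.sum(P / (1.0 + (i - j) ** 2))),
        "energy": float(np.sum(P ** 2)),
        "entropy": float(-np.sum(nz * np.log(nz))),
        "max_probability": float(P.max()),
    }
```

Direction-specific features would be obtained by changing the offset, for instance `dx=1, dy=0` for 0° and `dx=0, dy=-1` for 90° under one common convention.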
Description of run length matrix based textural features.
| S.No | Feature | Description |
|---|---|---|
| 1 | Short Run Emphasis (SRE) | |
| 2 | Long Run Emphasis (LRE) | |
| 3 | Gray-level Non-uniformity (GLNU) | |
| 4 | Run length Non-uniformity (RLNU) | |
| 5 | Run Percentage (RP) | |
| 6 | Low Gray-level Run Emphasis (LGRE) | |
| 7 | High Gray-level Run Emphasis (HGRE) | |
| 8 | Short Run Low Gray-level Run Emphasis (SRLGE) | |
| 9 | Short Run High Gray-level Run Emphasis (SRHGE) | |
| 10 | Long Run Low Gray-level Run Emphasis (LRLGE) | |
| 11 | Long Run High Gray-level Run Emphasis (LRHGE) | |
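
The run length matrix idea can be sketched in a similar way. The example below counts horizontal runs only and computes the first five features of the table using commonly cited definitions. The names `run_length_matrix` and `run_length_features` are hypothetical, the image is assumed to be quantized to integer levels `0..levels-1`, and the paper may use other directions or normalizations.

```python
import numpy as np

def run_length_matrix(image, levels):
    """Horizontal run-length matrix: R[g, l-1] = number of runs of gray level g with length l."""
    img = np.asarray(image, dtype=np.intp)
    rows, cols = img.shape
    R = np.zeros((levels, cols), dtype=np.float64)
    for row in img:
        start = 0
        for c in range(1, cols + 1):
            # A run ends at the row boundary or when the gray level changes.
            if c == cols or row[c] != row[start]:
                R[row[start], c - start - 1] += 1
                start = c
    return R

def run_length_features(R, n_pixels):
    """SRE, LRE, GLNU, RLNU and RP under commonly cited definitions (assumed)."""
    lengths = np.arange(1, R.shape[1] + 1, dtype=np.float64)
    n_runs = R.sum()
    runs_per_length = R.sum(axis=0)
    runs_per_level = R.sum(axis=1)
    return {
        "SRE": float(np.sum(runs_per_length / lengths ** 2) / n_runs),
        "LRE": float(np.sum(runs_per_length * lengths ** 2) / n_runs),
        "GLNU": float(np.sum(runs_per_level ** 2) / n_runs),
        "RLNU": float(np.sum(runs_per_length ** 2) / n_runs),
        "RP": float(n_runs / n_pixels),
    }
```

The gray-level-weighted emphases (LGRE, HGRE and the four combined measures) follow the same pattern, with weights that depend on the gray level as well as the run length.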
Mean ± SD of the extracted features for benign and malignant images, with mRMR-MIQ rank and p-value.
| Feature | Rank of feature using mRMR-MIQ | Benign (Mean ± SD) | Malignant (Mean ± SD) | p-value |
|---|---|---|---|---|
| Autocorrelation | 1 | 18.962 ± 4.132 | 18.030 ± 3.516 | <0.0001 |
| Homogeneity 90 | 27 | 0.705 ± 0.054 | 0.726 ± 0.066 | <0.0001 |
| Dissimilarity | 3 | 0.799 ± 0.180 | 0.720 ± 0.209 | <0.0001 |
| Max probability | 2 | 0.151 ± 0.122 | 0.179 ± 0.150 | <0.0001 |
| Contrast 0 | 13 | 0.930 ± 0.313 | 0.813 ± 0.301 | <0.0001 |
| Information correlation measure 2 | 12 | 0.801 ± 0.065 | 0.829 ± 0.068 | <0.0001 |
| Sum variance | 8 | 43.727 ± 9.655 | 41.503 ± 7.589 | <0.0001 |
| Cluster shade | 5 | 12.815 ± 19.309 | 20.148 ± 29.866 | <0.0001 |
| Correlation 90 | 19 | 0.799 ± 0.080 | 0.831 ± 0.087 | <0.0001 |
| Energy 0 | 21 | 0.076 ± 0.051 | 0.092 ± 0.090 | <0.0001 |
| Energy 135 | 24 | 0.061 ± 0.050 | 0.079 ± 0.091 | <0.0001 |
| Energy 90 | 23 | 0.067 ± 0.050 | 0.084 ± 0.091 | <0.0001 |
| Skewness | 29 | 0.264 ± 0.326 | 0.333 ± 0.362 | <0.0001 |
| Homogeneity 45 | 26 | 0.661 ± 0.063 | 0.687 ± 0.076 | <0.0001 |
| Energy 45 | 22 | 0.061 ± 0.050 | 0.079 ± 0.091 | <0.0001 |
| Run length non-uniformity | 35 | 3538.049 ± 981.493 | 3056.297 ± 1039.805 | <0.0001 |
| Short run low gray-level run emphasis | 38 | 0.103 ± 0.117 | 0.111 ± 0.102 | 0.047 |
| Variance | 31 | 4600.694 ± 613.812 | 4817.160 ± 714.229 | <0.0001 |
| Kurtosis | 30 | 2.240 ± 0.413 | 2.292 ± 0.457 | 0.002 |
| Long run high gray-level run emphasis | 40 | 6240082.394 ± 10506429.767 | 12769818.857 ± 12309279.552 | <0.0001 |
| Gray-level non-uniformity | 33 | 14966.417 ± 10250.377 | 24261.805 ± 15036.827 | <0.0001 |
| Run percentage | 34 | 0.629 ± 0.148 | 0.567 ± 0.148 | <0.0001 |
| High gray-level run emphasis | 37 | 3369.684 ± 2831.230 | 5109.719 ± 2934.456 | <0.0001 |
| Low gray-level run emphasis | 36 | 3.714 ± 7.851 | 5.168 ± 6.819 | <0.0001 |
| Short run emphasis | 32 | 0.768 ± 0.050 | 0.747 ± 0.065 | <0.0001 |
| Long run low gray-level run emphasis | 39 | 9904.270 ± 27160.338 | 14878.838 ± 23762.561 | <0.0001 |
| Entropy | 4 | 3.320 ± 0.323 | 3.212 ± 0.432 | <0.0001 |
| Sum average | 6 | 7.877 ± 1.231 | 7.580 ± 1.145 | <0.0001 |
| Sum entropy | 7 | 2.520 ± 0.162 | 2.487 ± 0.253 | <0.0001 |
| Difference variance | 9 | 1.550 ± 0.487 | 1.330 ± 0.535 | <0.0001 |
| Difference entropy | 10 | 1.158 ± 0.135 | 1.088 ± 0.181 | <0.0001 |
| Information correlation measure 1 | 11 | -0.292 ± 0.092 | -0.336 ± 0.111 | <0.0001 |
| Contrast 45 | 14 | 1.886 ± 0.600 | 1.615 ± 0.659 | <0.0001 |
| Contrast 90 | 15 | 1.485 ± 0.488 | 1.273 ± 0.546 | <0.0001 |
| Contrast 135 | 16 | 1.899 ± 0.593 | 1.619 ± 0.655 | <0.0001 |
| Correlation 0 | 17 | 0.875 ± 0.049 | 0.893 ± 0.050 | <0.0001 |
| Correlation 45 | 18 | 0.745 ± 0.098 | 0.786 ± 0.106 | <0.0001 |
| Correlation 135 | 20 | 0.744 ± 0.097 | 0.785 ± 0.106 | <0.0001 |
| Homogeneity 0 | 25 | 0.753 ± 0.049 | 0.772 ± 0.057 | <0.0001 |
| Homogeneity 135 | 28 | 0.661 ± 0.062 | 0.687 ± 0.075 | <0.0001 |
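
To show how a p-value of this kind relates to the tabulated means and standard deviations, the sketch below computes a two-sample t statistic from summary statistics alone. It assumes 1300 images per class (the figure given in the abstract) and approximates the t distribution by the standard normal, which is reasonable for samples this large. The function name `welch_t_from_summary` is hypothetical, and the source does not state which t-test variant the authors used. With equal group sizes, the Welch and pooled-variance statistics coincide.

```python
from math import sqrt
from statistics import NormalDist

def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic and two-sided p-value from summary statistics.

    The p-value uses a normal approximation to the t distribution,
    which is adequate only for large samples.
    """
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # standard error of the difference
    t = (mean1 - mean2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p

# Kurtosis row of the table: benign 2.240 ± 0.413, malignant 2.292 ± 0.457.
t, p = welch_t_from_summary(2.240, 0.413, 1300, 2.292, 0.457, 1300)
print(f"t = {t:.3f}, p = {p:.4f}")
```

Because the table values are rounded to three decimals, p-values recomputed this way can differ slightly from the reported ones, especially for rows close to the significance threshold.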
Results of average accuracy, sensitivity, specificity and PPV for various classifiers.
| Classifiers | No. of features | Accuracy (%) | PPV (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|
| SVM, RBF | 31 | 100.00 | 100.00 | 100.00 | 100.00 |
| SVM, linear | 40 | 84.73 | 87.59 | 81.00 | 88.46 |
| SVM, quadratic | 38 | 100.00 | 100.00 | 100.00 | 100.00 |
| SVM, poly3 | 15 | 100.00 | 100.00 | 100.00 | 100.00 |
| Decision tree | 22 | 98.54 | 98.92 | 98.15 | 98.92 |
| KNN | 11 | 100.00 | 100.00 | 100.00 | 100.00 |
| Naïve Bayes | 3 | 67.35 | 69.93 | 60.62 | 74.08 |
| PNN | 11 | 100.00 | 100.00 | 100.00 | 100.00 |
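
The four performance measures in this table follow from the counts of a binary confusion matrix, with malignant taken as the positive class. The sketch below uses their standard definitions. The function name `classification_metrics` and the counts in the usage lines are illustrative and are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, PPV, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),            # positive predictive value (precision)
        "sensitivity": tp / (tp + fn),    # true positive rate (recall)
        "specificity": tn / (tn + fp),    # true negative rate
    }

# Illustrative counts only (hypothetical, not from the study).
metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

When results are averaged over several train/test splits, as the caption suggests, each measure would be computed per split and then averaged, so the averaged values need not correspond exactly to a single confusion matrix.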
Summary of results of CAD based studies for ovarian tumor classification.
| Literature | No. of samples | Features | Classifier | Performance |
|---|---|---|---|---|
| Renz | Benign, early stage and late stage cancers (55 cases) | Blood test data and age | Multilayer perceptron | Accuracy: 92.9% |
| Assareh and Moradi (48) | Dataset 1: | Three significant biomarkers from protein mass spectra | Two fuzzy linguistic rules | Dataset 1: Accuracy: 100% |
| Tan | 24 normal, 30 cancers | DNA micro-array, blood test, and proteomics data | Complementary Learning Fuzzy Neural Network | Accuracy: 84.72% |
| Tang et al. (27) | 95 normal, 121 cancers | Four statistical moments (mean, variance, skewness and kurtosis) obtained from SELDI-TOF mass spectroscopy data | Kernel partial least square classifier | Accuracy: 99.35% |
| Petricoin (28) | 66 benign, 50 cancers | Proteomic spectra | Genetic algorithm with self organizing cluster analysis | Sensitivity: 100% |
| Tailor | 52 benign, 15 cancers | Clinical and ultrasound based variables from TVUS images | Back propagation neural network | Sensitivity: 100% |
| Biagiotti | 175 benign, 51 cancers | Age and parameters from TVUS images | Three layer back propagation network | Sensitivity: 96% |
| Zimmer | - | B-scan ultrasound images | Morphological Analysis | Accuracy: 70% |
| Lucidarme | 234 benign, 141 cancers | Quantification of tissue disorganization in backscattered ultrasound (3D TVUS) | Ovarian HistoScanning (OHS) system | Sensitivity: 98% |
| Acharya | 1000 benign, 1000 cancers | Local Binary Pattern + Laws’ Mask Energy | SVM classifier | Sensitivity: 100% |
| Acharya | 1300 benign, 1300 cancers | Hu’s invariant moments + Gabor wavelet features + Entropies | PNN classifier, tuned with genetic algorithm | Sensitivity: 99.2% |
| Acharya | 1000 benign, 1000 cancers | Texture and higher-order spectra based features | DT classifier | Sensitivity: 94.3% |
| Proposed method | 1300 benign, 1300 cancers | Features based on first order statistics, GLCM and run length matrix | KNN/PNN classifiers | Sensitivity: 100% |