Hamid Reza Marateb, Marjan Mansourian, Peyman Adibi, Dario Farina.
Abstract
BACKGROUND: Selecting the correct statistical test and data mining method depends strongly on the measurement scale of the data, the type of variables, and the purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are illustrated with several medical examples. We present two clustering examples for ordinal variables, which are among the more challenging variable types to analyze, using the Wisconsin Breast Cancer Data (WBCD). ORDINAL-TO-INTERVAL SCALE CONVERSION EXAMPLE: A breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold-standard groups of malignant and benign cases that had been identified by clinical tests.
Keywords: Biostatistics; breast cancer; cluster analysis; data mining; research design
Year: 2014 PMID: 24672565 PMCID: PMC3963323
Source DB: PubMed Journal: J Res Med Sci ISSN: 1735-1995 Impact factor: 1.852
Selecting the appropriate test for comparisons between two or more than two groups based on different scales
Selecting the appropriate test or modeling for different categories of dependent and independent variables
Figure 1. An example of calculating the distance between two objects of ordinal variables, using the simple dissimilarity measure
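To make the idea of a distance between two ordinal objects concrete, the sketch below uses one common convention: each rank difference is divided by the largest possible difference (number of levels minus one), and the results are averaged over the variables. This is an assumption for illustration and may differ from the exact measure drawn in Figure 1. The function name `ordinal_dissimilarity` and the two patient vectors are hypothetical.

```python
def ordinal_dissimilarity(a, b, n_levels=10):
    """Mean normalized rank difference between two ordinal objects.

    Assumed convention (not taken from the paper): ranks run from 1 to
    n_levels, each absolute rank difference is divided by n_levels - 1,
    and the result is averaged over all variables, giving a value in [0, 1].
    """
    if len(a) != len(b):
        raise ValueError("both objects must have the same number of variables")
    if not a:
        raise ValueError("objects must contain at least one variable")
    span = n_levels - 1  # largest possible rank difference
    total = sum(abs(x - y) / span for x, y in zip(a, b))
    return total / len(a)


# Two hypothetical patients, each described by nine 10-level ordinal variables.
patient_a = [5, 1, 1, 1, 2, 1, 3, 1, 1]
patient_b = [8, 10, 10, 8, 7, 10, 9, 7, 1]

distance = ordinal_dissimilarity(patient_a, patient_b)
print(round(distance, 4))
```

A value of 0 means the two objects have identical ranks on every variable, and a value of 1 means they sit at opposite ends of the scale on every variable.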
The ordinal-to-interval conversion matrix for nine ordinal variables (columns) with 10 ranks (rows) studied on the WBCD using the clustering method #1
The performance of the clustering methods studied on the WBCD
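Comparing cluster assignments with the gold-standard malignant and benign groups can be sketched as a confusion-matrix calculation. The example below is a minimal illustration, assuming each cluster has already been mapped to one of the two diagnostic labels. The metrics shown (sensitivity, specificity, accuracy) are a common choice and are not necessarily the ones reported in the paper. The function name `binary_performance` and the toy labels are hypothetical and are not WBCD results.

```python
def binary_performance(predicted, gold, positive="malignant"):
    """Sensitivity, specificity, and accuracy against gold-standard labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    tp = fp = tn = fn = 0
    for p, g in zip(predicted, gold):
        if g == positive:
            if p == positive:
                tp += 1  # malignant case placed in the malignant cluster
            else:
                fn += 1  # malignant case placed in the benign cluster
        else:
            if p == positive:
                fp += 1  # benign case placed in the malignant cluster
            else:
                tn += 1  # benign case placed in the benign cluster
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("gold standard must contain both classes")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(gold),
    }


# Toy labels for illustration only.
cluster_labels = ["malignant", "malignant", "benign", "benign", "benign", "malignant"]
gold_labels = ["malignant", "malignant", "malignant", "benign", "benign", "benign"]

print(binary_performance(cluster_labels, gold_labels))
```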
Figure 2. The clustering structures of WBCD, found by the second ordinal-variable clustering method. Each major valley (local minimum) of the reachability distance plot (RD-plot) corresponds to a possible cluster. In this example, the first cluster is the malignant group while the second one is the benign group.
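One simple way to read clusters off a reachability plot is to cut it at a fixed threshold: consecutive points whose reachability distance stays below the threshold form one valley, and points above it act as separators. The sketch below assumes this simplified threshold cut, which is not necessarily the extraction procedure used by the authors. The function name `clusters_from_reachability`, the threshold, and the reachability values are hypothetical.

```python
def clusters_from_reachability(reachability, threshold):
    """Group positions in a cluster ordering into valleys below a threshold.

    Simplification: every point whose reachability exceeds the threshold is
    treated as a separator and is not assigned to any cluster.
    """
    clusters = []
    current = []
    for index, value in enumerate(reachability):
        if value > threshold:
            # A peak ends the current valley, if one is open.
            if current:
                clusters.append(current)
            current = []
        else:
            current.append(index)
    if current:
        clusters.append(current)
    return clusters


# Hypothetical reachability distances in cluster order. The first point of an
# ordering has no predecessor, so its reachability is undefined (infinite).
rd_plot = [float("inf"), 0.2, 0.25, 0.2, 0.9, 0.3, 0.28, 0.3]

print(clusters_from_reachability(rd_plot, threshold=0.5))
```

With these toy values the cut yields two valleys, positions 1 to 3 and positions 5 to 7, separated by the peak at position 4.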