Aslı Suner, Gökhan Karakülah, Özgün Koşaner, Oğuz Dicle.
Abstract
The improper use of statistical methods is common in analyzing and interpreting research data in the biological and medical sciences. The objective of this study was to develop a decision support tool covering the statistical tests commonly used in biomedical research, by combining and updating existing decision trees for appropriate statistical test selection. First, the decision trees in textbooks, published articles, and online resources were scrutinized, and a more comprehensive unified tree was devised by integrating 10 distinct decision trees. The questions at the decision steps were also revised, simplified, and enriched with examples. Our decision tree was then implemented in a web environment as a tool titled StatXFinder. Finally, usability and satisfaction questionnaires were administered to users of the tool, and StatXFinder was reorganized in line with the feedback obtained from these questionnaires. StatXFinder provides users with decision support in the selection of 85 distinct parametric and non-parametric statistical tests through 44 different yes-no questions. The accuracy rate of the statistical test recommendations obtained by 36 participants with the applied cases was 83.3 % for "difficult" tests and 88.9 % for "easy" tests. The mean system usability score of the tool was 87.43 ± 10.01 (minimum: 70, maximum: 100). No statistically significant difference was found between the total system usability score and participants' attributes (p value >0.05). The user satisfaction questionnaire showed that 97.2 % of the participants appreciated the tool, and almost all of them (35 of 36) would recommend it to others. In conclusion, StatXFinder can be utilized as an instructional and guiding tool for biomedical researchers with limited statistics knowledge. StatXFinder is freely available at http://webb.deu.edu.tr/tb/statxfinder.
Keywords: Biomedical research; Decision support; Non-parametric statistical tests; Parametric statistical tests; Statistical test selection
Year: 2015 PMID: 26543767 PMCID: PMC4627976 DOI: 10.1186/s40064-015-1421-9
Source DB: PubMed Journal: Springerplus ISSN: 2193-1801
Fig. 1 Schematic diagram of our study workflow
Statistical test selection approaches included in the study after manual search, and their corresponding tests
| Reference | No of tests | Suggested statistical tests |
|---|---|---|
| Jaykaran | 25 | One sample t test, unpaired t test, paired t test, one-way ANOVA, repeated measures ANOVA, Pearson correlation, simple linear regression, simple logistic regression, multiple linear regression, multiple nonlinear regression, multiple logistic regression, statistics for one group description (mean, standard deviation, median, interquartile range, proportion), Wilcoxon rank sum test, one sample binomial test with exact methods, Mann–Whitney U test, Fisher’s exact test, Wilcoxon signed rank test, McNemar’s test, Kruskal–Wallis H test, Chi-square test of independence, Friedman test, Cochran’s Q test, Spearman’s correlation, contingency coefficients, nonparametric regression |
| Twycross and Shields | 11 | Unpaired (independent) t test, paired (dependent) t test, one-way ANOVA, repeated measures ANOVA, Mann–Whitney U test, Wilcoxon signed rank test, Kruskal–Wallis H test, Chi-square test of independence, Friedman test, Spearman’s correlation, Kendall’s coefficient of concordance |
| Gunawardena | 14 | Unpaired (independent) t test, paired (dependent) t test, one-way ANOVA, repeated measures ANOVA, Pearson correlation, multiple linear regression, Mann–Whitney U test, Wilcoxon signed rank test, McNemar’s test, Kruskal–Wallis H test, Chi-square test of independence, Friedman test, Spearman’s correlation, contingency coefficients |
| Marusteri and Bacarea | 11 | One sample t test, unpaired (independent) t test, paired (dependent) t test, one-way ANOVA, repeated measures ANOVA, Welch’s corrected unpaired t test, Wilcoxon rank sum test, Mann–Whitney U test, Wilcoxon signed rank test, Kruskal–Wallis H test, Friedman test |
| Gaddis and Gaddis | 9 | Mann–Whitney U test, Fisher’s exact test, Wilcoxon signed rank test, Chi-square test of independence, Kruskal–Wallis H test, Friedman test, Chi-square goodness of fit test, RxC (Rows by Columns) test, Kolmogorov–Smirnov test |
| Nayak and Hazra | 30 | Unpaired (independent) t test, paired (dependent) t test, one-way ANOVA, repeated measures ANOVA, Pearson correlation, Wilcoxon rank sum test, Mann–Whitney U test, Fisher’s exact test, Wilcoxon signed rank test, McNemar’s test, Kruskal–Wallis H test, Friedman test, Cochran’s Q test, Spearman’s correlation, Kendall’s coefficient of concordance, RxC test, Tukey’s HSD test, Newman–Keuls test, Bonferroni’s test, Dunnett’s test, Scheffé’s test, Dunn’s test, risk ratio, odds ratio, Chi-square test for trend, logistic regression, intraclass correlation coefficient, Bland–Altman plot, Cohen’s kappa statistic, Chi-square test for 2 × 2 table |
| McCrum-Gardner | 13 | Unpaired (independent) t test, one-way ANOVA, repeated measures ANOVA, Mann–Whitney U test, paired (dependent) t test, Wilcoxon signed rank test, McNemar’s test, Kruskal–Wallis H test, Friedman test, Cochran’s Q test, RxC test, Chi-square test for 2 × 2 table, Chi-square test for 2 × C table |
| UCLA | 31 | One sample t test, unpaired (independent) t test, paired (dependent) t test, one-way ANOVA, repeated measures ANOVA, Pearson correlation, simple linear regression, simple logistic regression, multiple linear regression, multiple logistic regression, repeated measures logistic regression, factorial ANOVA, ordered logistic regression, factorial logistic regression, one-way ANCOVA, one sample binomial test with exact methods, Mann–Whitney U test, Fisher’s exact test, Wilcoxon signed rank test, Kruskal–Wallis H test, Chi-square test of independence, Friedman test, one-sample median, Chi-square goodness of fit test, McNemar’s test, Spearman’s correlation, one-way MANOVA, multivariate multiple linear regression, factor analysis, canonical correlation, discriminant analysis |
| Mertler | 17 | Pearson correlation, simple linear regression, multiple linear regression, path analysis, unpaired (independent) t test, one-way ANOVA, one-way ANCOVA, factorial ANOVA, factorial ANCOVA, one-way MANOVA, one-way MANCOVA, factorial MANOVA, factorial MANCOVA, simple logistic regression, discriminant analysis, factor analysis, principal components analysis |
| Rosner | 36 | One sample t test, one sample binomial test with exact methods, paired (dependent) t test, unpaired (independent) t test, one-way ANOVA, Pearson correlation, simple linear regression, multiple linear regression, multiple logistic regression, Welch’s corrected unpaired t test, one-way ANCOVA, one sample z-test, one sample binomial test with normal theory methods, one sample Poisson test, two-sample F test to compare variances, nonparametric methods for two-sample problem, two-way ANOVA, two-way ANCOVA, higher-way ANOVA, higher-way ANCOVA, two sample test for comparison of incidence rates, one-sample test for incidence rates, test of trend for incidence rates, Fisher’s exact test, McNemar’s test, Kruskal–Wallis H test, Spearman’s correlation, Cohen’s kappa statistic, Chi-square test for 2 × 2 table, Chi-square test for 2 × C table, nonparametric methods for one-sample problem, nonparametric methods for more than two samples problem, Chi-square test for trends, log-rank test, Cox proportional hazards model, Chi-square test for heterogeneity for R × C tables |
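All of the approaches above can be expressed as a yes-no decision tree that terminates at a test recommendation. As a minimal sketch of that data structure (a hypothetical two-question fragment for illustration, not the tool's actual 44-question tree):

```python
class Node:
    """A decision node: a yes/no question with two subtrees, or a leaf."""
    def __init__(self, text, yes=None, no=None):
        self.text = text  # question (internal node) or recommendation (leaf)
        self.yes = yes
        self.no = no

    def is_leaf(self):
        return self.yes is None and self.no is None


def traverse(node, answers):
    """Follow a sequence of 'yes'/'no' answers down to a recommendation."""
    it = iter(answers)
    while not node.is_leaf():
        node = node.yes if next(it) == "yes" else node.no
    return node.text


# A tiny illustrative fragment: two-sample problem, normality, independence.
tree = Node(
    "Does your data appear to be normally distributed?",
    yes=Node("Are your samples independent?",
             yes=Node("Unpaired (independent) t test"),
             no=Node("Paired (dependent) t test")),
    no=Node("Are your samples independent?",
            yes=Node("Mann–Whitney U test"),
            no=Node("Wilcoxon signed rank test")),
)

print(traverse(tree, ["no", "no"]))  # -> Wilcoxon signed rank test
```

Each answer selects a subtree, so a user reaches a recommendation after at most one question per tree level, mirroring how StatXFinder walks its unified tree.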
Fig. 2 An example of a decision tree for appropriate statistical test selection
Fig. 3 The screenshots of StatXFinder. a StatXFinder optionally provides explanations for each question. b If there is any statistical term in the question, the explanation of that term can be viewed by moving the cursor over that term. c By giving the required answers at each decision step, StatXFinder ends the decision process by offering a recommendation
The questions asked, in a particular order, for an example hypothesis test
| Step | Questions | The statistical terms observed in the questions* | Definitions | Answers |
|---|---|---|---|---|
| 1 | Does your data set have only one variable? | Variablea | Since the example only addresses the values that a single variable, systolic blood pressure (mmHg), can take, the answer to this question should be “yes” | Yes |
| 2 | Is it a “one sample” problem? | Sampleb | Since the experimenter obtained measurement values from the patient before and after administering drugs, there are two groups of data in question | No |
| 3 | Is it a “two samples” problem? | Sampleb | The experimenter has two groups of data such as pre and post administering | Yes |
| 4 | Does your data appear to be normally distributed (bell-shaped curve)? | Normally distributedc | The answer is given as “no” since the analysis was conducted with the assumption that the measurement values were not normally distributed and the sample size is smaller than 30 | No |
| 5 | Does your data have a binomial distribution? | Binomial distributiond | Since systolic blood pressure (values) do not have two possible outcomes as “success” and “failure”, it does not show a binomial distribution and the answer for this question should be “no” | No |
| 6 | Do you have person-time data? | Person-time datae | Since the blood pressure measurement value is not a variable which is observed over time such as some individuals who developed lung cancer over a year (time) this question was answered as “no” | No |
| 7 | Are your samples (=groups) independent? | Sampleb, independentf | Since two measurements were conducted on the same group before and after administering drugs to the patients (pre-treatment vs. post-treatment), the systolic blood pressure values measured in these two groups were dependent on each other; therefore, the question was answered as “no” | No |
| – | Recommendation: use Wilcoxon signed rank test or sign test | N/A | – | – |
aA characteristic that consists of two or more categories or values, and that differs from subject to subject or from time to time. Categories such as occupation or nationality, or values such as age or intelligence score, are examples of variables. The opposite of variable is constant. The term variable is often used as a shortened form of random variable
bA subset of cases drawn or selected, according to some specified criteria, from a larger set or population of cases, with the purpose of estimating characteristics of the larger set or population, drawing inferences about these characteristics, and generalizing results from the sample to the population. A sample should be representative of the population from which it is drawn in order to be useful. For instance, to find out the relationship between drug abuse and mental health, it would be more practical to investigate this relationship by taking a sample of the population. Doing so makes it possible to determine to what extent this relationship is likely to be found in the population
cA theoretical distribution which shows the frequency or probability of all the possible values that a continuous variable can take. This distribution is bell shaped. The horizontal axis of the distribution represents all possible values of the variable while the vertical axis represents the frequency or probability of those values. In any normal distribution: (1) 68 % of the observations fall within σ of the mean μ, (2) 95 % of the observations fall within 2σ of μ, and (3) 99.7 % of the observations fall within 3σ of μ. This is known as the 68–95–99.7 rule. It is also called the Gaussian distribution
dThe probability distribution of the number of successes in n independent Bernoulli trials, such as a person passing or failing or being a woman or a man, where each trial has two outcomes (conveniently labeled success and failure), and the probability of success p is the same for each trial
eData referring to a measurement obtained by combining person data and time data. It is obtained as the sum of individual units of time that the subjects in the study population have been exposed to certain risk. It can also be obtained as the number of persons at risk of the event of interest multiplied by the average length of the study period
fIndependence is a characteristic of observations or random events. Essentially, the term is used to describe the property of independence of events or sample observations. It is an assumption required by many statistical tests. Independent variable is the variable in an experiment that is under the control of, and may be manipulated by, the experimenter. In regression analysis it is the variable being used to regress or predict the value of the dependent variable. It is also commonly known as regressor, predictor, or explanatory variable
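The walkthrough above ends with a Wilcoxon signed rank test recommendation. A self-contained sketch of carrying out that test on hypothetical pre/post systolic blood pressure values (the data are invented for illustration; in practice a library routine such as scipy.stats.wilcoxon would typically be used) could look like this:

```python
from itertools import product


def wilcoxon_signed_rank(pre, post):
    """Exact two-sided Wilcoxon signed rank test for small paired samples.

    Returns (W, p), where W is the smaller of the positive and negative
    rank sums and p is obtained by enumerating all 2**n sign assignments
    (feasible only for small n)."""
    d = [b - a for a, b in zip(pre, post) if b != a]  # drop zero differences
    n = len(d)
    # Rank the absolute differences, averaging ranks across ties.
    abs_sorted = sorted(abs(x) for x in d)
    rank = {}
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs_sorted[j + 1] == abs_sorted[i]:
            j += 1
        rank[abs_sorted[i]] = (i + j) / 2 + 1  # average of ranks i+1..j+1
        i = j + 1
    w_plus = sum(rank[abs(x)] for x in d if x > 0)
    w_minus = sum(rank[abs(x)] for x in d if x < 0)
    w = min(w_plus, w_minus)
    # Exact p-value: fraction of sign assignments at least as extreme as W.
    ranks = [rank[abs(x)] for x in d]
    hits = sum(
        1 for signs in product((0, 1), repeat=n)
        if min(sum(r for r, s in zip(ranks, signs) if s),
               sum(r for r, s in zip(ranks, signs) if not s)) <= w
    )
    return w, hits / 2 ** n


# Hypothetical systolic blood pressure (mmHg) before/after treatment.
pre = [152, 148, 160, 155, 162, 158, 149, 151]
post = [140, 142, 150, 156, 145, 147, 138, 144]
w, p = wilcoxon_signed_rank(pre, post)
print(w, p)  # -> 1.0 0.015625
```

With these invented values, p < 0.05, so the null hypothesis of no pre/post difference would be rejected; the test uses only the ranks of the differences, matching the "not normally distributed, dependent samples" path of the table.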
The usability testing results of StatXFinder
| System Usability Scale items | Mean ± SD (n = 36) |
|---|---|
| 1. I think that I would like to use the StatXFinder frequently | 4.30 ± 0.86 |
| 2. I found the StatXFinder unnecessarily complex | 1.44 ± 0.94 |
| 3. I thought the StatXFinder was easy to use | 4.72 ± 0.57 |
| 4. I think that I would need the support of a technical person to be able to use the StatXFinder | 1.61 ± 1.10 |
| 5. I found the various functions in the StatXFinder were well integrated | 4.25 ± 0.99 |
| 6. I thought there was too much inconsistency in the StatXFinder | 1.25 ± 0.86 |
| 7. I would imagine that most people would learn to use the StatXFinder very quickly | 4.39 ± 0.80 |
| 8. I found the StatXFinder very cumbersome to use | 1.19 ± 0.47 |
| 9. I felt very confident using the StatXFinder | 4.61 ± 0.55 |
| 10. I needed to learn a lot of things before I could get going with the StatXFinder | 1.81 ± 0.92 |
| Total System Usability Scale score (0–100; Higher score means more user friendly tool) | 87.43 ± 10.01 |
SD standard deviation
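The total score in the table follows the standard SUS scoring rule: odd-numbered (positively worded) items contribute (response - 1) points, even-numbered (negatively worded) items contribute (5 - response), and the 0–40 sum is scaled by 2.5 onto a 0–100 range. A minimal sketch:

```python
def sus_score(responses):
    """System Usability Scale score (0-100) from ten 1-5 Likert
    responses given in questionnaire order (items 1-10).

    Odd items (index 0, 2, ...) contribute (response - 1);
    even items contribute (5 - response); the 0-40 sum is scaled by 2.5."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item responses")
    total = sum(r - 1 if i % 2 == 0 else 5 - r
                for i, r in enumerate(responses))
    return total * 2.5


# Applying the formula to the per-item means reported in the table
# reproduces the study's total score: 87.425, i.e. 87.43 after rounding.
item_means = [4.30, 1.44, 4.72, 1.61, 4.25, 1.25, 4.39, 1.19, 4.61, 1.81]
print(sus_score(item_means))
```

The function is normally applied to one participant's integer responses; feeding it the item means works here because the scoring rule is linear, and it serves as a consistency check against the reported total of 87.43 ± 10.01.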
The frequency table and descriptive statistics of total SUS score for users’ attributes
| Question | Answer | Frequency (%) | Total SUS score, Mean ± SD (Min–Max) | P-value |
|---|---|---|---|---|
| Academic title | MSc. student | 16 (44.4 %) | 88.28 ± 11.09 (70–100) | 0.766a |
| | MSc. | 2 (5.6 %) | 90 ± 0 (90–90) | |
| | PhD student | 2 (5.6 %) | 90 ± 0 (90–90) | |
| | PhD | 11 (30.6 %) | 87.95 ± 9.34 (72.5–97.5) | |
| | MD | 5 (13.9 %) | 81.50 ± 11.94 (70–100) | |
| Gender | Female | 22 (61.1 %) | 86.82 ± 11.11 (70–100) | 0.961b |
| | Male | 14 (38.9 %) | 88.39 ± 8.29 (70–97.5) | |
| Age (years) | Mean ± SD (Min–Max): 33.47 ± 9.83 (23–58) | – | – | – |
| Level of computer use skills | Expert | 7 (19.4 %) | 90.71 ± 10.10 (75–100) | 0.530a |
| | Advanced | 26 (72.2 %) | 87.02 ± 10.61 (70–100) | |
| | Average | 3 (8.3 %) | 83.33 ± 8.29 (70–97.5) | |
| | Elementary | – | – | |
| | Beginner | – | – | |
| Level of English proficiency | Proficient | 6 (16.7 %) | 95.71 ± 3.16 (92.5–100) | 0.110a |
| | Advanced | 22 (61.1 %) | 85.45 ± 10.11 (72.5–100) | |
| | Intermediate | 8 (22.2 %) | 87.19 ± 11.26 (70–97.5) | |
| | Elementary | – | – | |
| | Beginner | – | – | |
| Level of statistics knowledge | Expert | 3 (8.3 %) | 90.83 ± 1.44 (90–92.5) | 0.077a |
| | Advanced | 11 (30.6 %) | 92.50 ± 8.94 (75–100) | |
| | Average | 16 (44.4 %) | 82.81 ± 10.64 (70–100) | |
| | Elementary | 6 (16.7 %) | 88.75 ± 8.18 (72.5–95) | |
| | Beginner | – | – | |
| Using a statistical software package | Yes | 28 (77.8 %) | 87.41 ± 10.01 (70–100) | 0.924b |
| | No | 8 (22.2 %) | 87.5 ± 10.69 (70–100) | |
SD standard deviation
aKruskal–Wallis test
bMann–Whitney U test
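As footnotes a and b indicate, the between-group comparisons of the total SUS score used the Kruskal–Wallis and Mann–Whitney U tests. For reference, a minimal pure-Python sketch of the Mann–Whitney U statistic (midranks for ties; the sample data below are invented, and for a full analysis with p-values a routine such as scipy.stats.mannwhitneyu would typically be used):

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic computed from rank sums,
    using midranks (average ranks) for tied values.
    Returns the conventional U = min(U_x, U_y)."""
    combined = sorted(x + y)

    def midrank(v):
        # Average 1-based rank of value v in the combined sample.
        lo = combined.index(v)
        hi = len(combined) - 1 - combined[::-1].index(v)
        return (lo + hi) / 2 + 1

    r_x = sum(midrank(v) for v in x)          # rank sum of group x
    u_x = r_x - len(x) * (len(x) + 1) / 2     # U for group x
    u_y = len(x) * len(y) - u_x               # U_x + U_y = n_x * n_y
    return min(u_x, u_y)


# Invented SUS scores for two small groups.
group_a = [70.0, 87.5, 90.0, 95.0]
group_b = [72.5, 85.0, 92.5]
print(mann_whitney_u(group_a, group_b))  # -> 5.0
```

The statistic counts, in effect, how often values of one group precede values of the other in the combined ranking, which is why it suits ordinal outcomes like SUS scores without a normality assumption.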