Abstract
The purpose of this article is to provide an accessible introduction to foundational statistical procedures and to present the steps of data analysis needed to address research questions and meet standards for scientific rigour. It is aimed at individuals who are new to research and less familiar with statistics, and at anyone interested in reviewing basic statistics. After a brief overview of foundational statistical techniques (for example, the differences between descriptive and inferential statistics), the article illustrates 10 steps in conducting statistical analysis, with examples of each. The general steps for statistical analysis are: (1) formulate a hypothesis, (2) select an appropriate statistical test, (3) conduct a power analysis, (4) prepare data for analysis, (5) start with descriptive statistics, (6) check assumptions of tests, (7) run the analysis, (8) examine the statistical model, (9) report the results and (10) evaluate threats to validity of the statistical analysis. Researchers in family medicine and community health can follow these steps to ensure a systematic and rigorous analysis.
Year: 2019 PMID: 31218217 PMCID: PMC6583801 DOI: 10.1136/fmch-2018-000067
Source DB: PubMed Journal: Fam Med Community Health ISSN: 2305-6983
Descriptive statistics
| Category | Statistic | Description of calculation | Intent |
| Measures of central tendency | Mean | Total of values divided by the number of values. | Describe all responses with the average value. |
| | Median | Arrange all values in order and determine the halfway point. | Determine the middle value among all values, which is important when dealing with extreme outliers. |
| | Mode | Examine all values and determine which one appears most frequently. | Describe the most common value. |
| Measures of variability | Variance | Calculate the difference of each value from the mean, square this difference score, sum all of the squared difference scores and divide by the number of values minus 1. | Provide an indicator of spread. |
| | Standard deviation | Square root of variance. | Give an indicator of spread by reporting on average how much values differ from the mean. |
| | Range | The difference between the maximum and minimum value. | Give a very general indicator of spread. |
| Frequencies | Frequencies | Count the number of occurrences of each value. | Provide a distribution of how many times each value occurs. |
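The calculations described in the table above can be sketched with Python's standard library. This is only an illustration: the readings are invented for the example and the variable names are assumptions, not part of the article.

```python
import statistics
from collections import Counter

# Hypothetical sample, invented for illustration:
# systolic blood pressure readings in mmHg.
values = [118, 122, 122, 130, 135, 141, 160]

# Measures of central tendency
mean = statistics.mean(values)      # total of values / number of values
median = statistics.median(values)  # middle value after sorting -> 130
mode = statistics.mode(values)      # most frequent value -> 122

# Measures of variability
variance = statistics.variance(values)   # sample variance: divides by n - 1
std_dev = statistics.stdev(values)       # square root of the sample variance
value_range = max(values) - min(values)  # maximum minus minimum -> 42

# Frequencies: how many times each value occurs
frequencies = Counter(values)

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"variance={variance:.2f} sd={std_dev:.2f} range={value_range}")
print(dict(frequencies))
```

Note that the single reading of 160 pulls the mean above the median, which is why the table recommends the median when extreme values are present.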
Inferential statistics
| Statistic | Intent |
| t tests | Compare groups to examine whether the difference in means between two groups is statistically significant. |
| Analysis of variance | Compare groups to examine whether the differences in means among two or more groups are statistically significant. |
| Correlation | Examine whether there is a relationship or association between two or more variables. |
| Regression | Examine how one or more variables predict another variable. |
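To make two of the statistics above concrete, here is a minimal sketch of an independent-samples t statistic and a Pearson correlation coefficient, computed by hand with the standard library. The function names and the data are hypothetical, and the t statistic assumes equal variances in the two groups (Student's pooled-variance form). The sketch stops at the test statistic; obtaining a significance level requires the t distribution, for which a statistics package such as SciPy (`scipy.stats.ttest_ind`, `scipy.stats.pearsonr`) would normally be used.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's sample variance by its df.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    ss_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_xx = sum((a - mean_x) ** 2 for a in x)
    ss_yy = sum((b - mean_y) ** 2 for b in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical scores, invented for illustration.
treatment = [3, 4, 5, 6, 7]
control = [1, 2, 3, 4, 5]
t_stat, df = independent_t(treatment, control)
print(f"t({df}) = {t_stat:.2f}")

hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
print(f"r = {pearson_r(hours, score):.3f}")
```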
Choosing and interpreting statistics for studies common in primary care
| I want to | Statistical choice | Independent variable | Dependent variable | How to interpret |
| Examine trends or distributions. | Descriptive statistics | Categorical or continuous | Categorical or continuous | Report the statistic as is to describe the data set. |
| Compare group means. | t tests | Categorical with two levels (ie, two groups) | Continuous | Examine the t statistic and significance level. |
| Compare group means. | Analysis of variance | Categorical with two or more levels (ie, two or more groups) | Continuous | Examine the F statistic and significance level. |
| Examine whether variables are associated. | Correlation | Continuous | Continuous | Examine the r statistic and significance level. |
| Gain a detailed understanding of the association of variables and use one or more variables to predict another. | Regression | Continuous or categorical, may have more than one independent variable in multiple regression | Continuous | Examine the |
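As a hedged illustration of the last two rows, the sketch below computes a one-way analysis of variance F ratio and a simple (one-predictor) least-squares regression by hand. The function names and data are assumptions made for the example. The F ratio and R squared are the quantities conventionally reported for these models; the sketch does not compute significance levels, which require the F and t distributions.

```python
import statistics

def one_way_anova_f(*groups):
    """F ratio with (between, within) degrees of freedom for k groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group variability: group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    # Within-group variability: values around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

def simple_regression(x, y):
    """Least-squares slope, intercept and R squared for one predictor."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    ss_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_xx = sum((a - mean_x) ** 2 for a in x)
    slope = ss_xy / ss_xx
    intercept = mean_y - slope * mean_x
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return slope, intercept, 1 - ss_resid / ss_total

# Hypothetical data, invented for illustration.
f_ratio, df1, df2 = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F({df1}, {df2}) = {f_ratio:.2f}")

slope, intercept, r_squared = simple_regression([1, 2, 3, 4, 5],
                                                [2, 4, 5, 4, 5])
print(f"y = {intercept:.2f} + {slope:.2f}x, R^2 = {r_squared:.2f}")
```

With exactly two groups, the F ratio equals the square of the pooled-variance t statistic, which is consistent with the table listing both tests for comparing group means.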
Threats to statistical conclusion validity
| Threat | Description |
| Low statistical power (see step 3) | The sample size is not adequate to detect an effect. |
| Violated assumptions of statistical tests (see step 6) | The data violate assumptions needed for the test, such as normality. |
| Fishing and error rates | Repeated tests of the same data (eg, multiple comparisons) increase chances of errors in conclusions. |
| Unreliability of measures | Error in measurement or instruments can artificially inflate or decrease apparent relationships among variables. |
| Restricted range | Statistics can be biased by limited outcome values (eg, high/low only) or by floor or ceiling effects, in which participants' scores are clustered around high or low values. |
| Unreliability of treatment implementation | In experiments, unstandardised or inconsistent implementation affects conclusions about correlation. |
| Extraneous variance in an experiment | The setting of a study can introduce error. |
| Heterogeneity of units | As participants differ within conditions, standard deviation can increase and introduce error, making it harder to detect effects. |
| Inaccurate effect size estimation | Outliers or incorrect effect size calculations (eg, a continuous measure for a dichotomous dependent variable) can skew measures of effect. |
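Two of the threats above, fishing and low statistical power, can be made concrete with a little arithmetic. The sketch below is an illustration under stated assumptions rather than a procedure from the article. The familywise error rate formula assumes the tests are independent, the Bonferroni adjustment is one common (conservative) correction, and the sample-size formula is a normal approximation for a two-sided, two-group comparison of means, so exact t-based calculations give slightly larger numbers. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def approx_n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation).

    effect_size_d is the standardised mean difference (Cohen's d).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)

# Ten tests at alpha = 0.05 give roughly a 40% chance of a false positive.
print(f"{familywise_error_rate(0.05, 10):.3f}")
print(f"{bonferroni_alpha(0.05, 10):.4f}")

# A medium effect (d = 0.5) at 80% power needs about 63 per group.
print(approx_n_per_group(0.5))
```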