Karin Binder, Stefan Krauss, Patrick Wiesner.
Abstract
In teaching statistics in secondary schools and at university, two visualizations are primarily used when situations with two dichotomous characteristics are represented: 2 × 2 tables and tree diagrams. Both visualizations can be depicted either with probabilities or with frequencies. Visualizations with frequencies have been shown to help students significantly more in Bayesian reasoning problems than probability visualizations do. Because tree diagrams or double-trees (which are largely unknown in school) are node-branch structures, these two visualizations (in contrast to the 2 × 2 table) can even simultaneously display probabilities on branches and frequencies inside the nodes. This is a teaching advantage as it allows the frequency concept to be used to better understand probabilities. However, 2 × 2 tables and (double-)trees have a decisive disadvantage: 2 × 2 tables represent joint probabilities [e.g., P(A∩B)] but no conditional probabilities [e.g., P(A|B)], whereas it is exactly the other way around with (double-)trees. Therefore, a visualization that is equally suitable for the representation of joint probabilities and conditional probabilities is desirable. In this article, we present a new visualization, the frequency net, in which all absolute frequencies and all types of probabilities can be depicted. In addition to a detailed theoretical analysis of the frequency net, we report the results of a study with 249 university students that shows that "net diagrams" can improve reasoning without previous instruction to a similar extent as 2 × 2 tables and double-trees. Regarding questions about conditional probabilities, frequency visualizations (2 × 2 table, double-tree, or net diagram with absolute frequencies) are consistently superior to probability visualizations, and the frequency net performs as well as the frequency double-tree.
Only the 2 × 2 table with frequencies (the one visualization that participants were already familiar with) led to higher performance rates. If, on the other hand, a question about a joint probability had to be answered, all implemented visualizations clearly supported participants' performance, but no uniform format effect was visible. Here, participants reached the highest performance in the versions with probability 2 × 2 tables and probability net diagrams. Furthermore, after conducting a detailed error analysis, we report interesting error shifts between the two information formats and the different visualizations and give recommendations for teaching probability.
Keywords: Bayesian reasoning; conditional probabilities; frequency net; joint probabilities; natural frequencies
Year: 2020 PMID: 32528335 PMCID: PMC7264419 DOI: 10.3389/fpsyg.2020.00750
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
FIGURE 1. 2 × 2 tables, tree diagrams, and double-trees (left in probabilities, right in frequencies) for the mammography problem.
Advantages and disadvantages of 2 × 2 tables, trees, double-trees, and net diagrams.
| Advantage | 2 × 2 table | Tree diagram | Double-tree | Net diagram |
| All joint probabilities can be displayed directly | ✓ | | | ✓ |
| All conditional probabilities can be displayed directly | | (Only 4 out of 8) | ✓ | ✓ |
| Probabilities and frequencies can be presented simultaneously | | ✓ | ✓ | ✓ |
| Both “reading directions” are equally evident | ✓ | | ✓ | ✓ |
FIGURE 2. Schematic net diagram for two abstract events A and B and their counter-events Ā and B̄, representing four marginal probabilities, four joint probabilities, and eight conditional probabilities.
FIGURE 3. Net diagram with probabilities (top), frequencies (middle), or both information formats simultaneously for the mammography problem.
FIGURE 4. Both possible tree diagrams and the 2 × 2 table are included in the net diagram. (A) Net diagram with highlighted tree A; (B) Net diagram with highlighted tree B; (C) Net diagram with highlighted frequency 2 × 2 table; (D) Net diagram with highlighted probability 2 × 2 table.
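The claim that a net diagram carries four marginal, four joint, and eight conditional probabilities can be checked with a minimal sketch. The function `frequency_net` and its argument names are hypothetical and not taken from the article; the sketch assumes the four cell counts of a 2 × 2 table as input and that no marginal count is zero. The example counts are the mammography frequencies (A = breast cancer, B = positive test result).

```python
from fractions import Fraction

def frequency_net(n_ab, n_a_notb, n_nota_b, n_nota_notb):
    """Derive every probability of a net diagram from the four joint counts.

    Hypothetical helper for illustration. Keys of `conditional` are
    (x, y) pairs meaning P(x | y).
    """
    joint_counts = {
        ("A", "B"): n_ab,
        ("A", "not B"): n_a_notb,
        ("not A", "B"): n_nota_b,
        ("not A", "not B"): n_nota_notb,
    }
    total = sum(joint_counts.values())

    # Marginal counts: sum the joint counts over the other characteristic.
    marginal_counts = {}
    for (a, b), n in joint_counts.items():
        marginal_counts[a] = marginal_counts.get(a, 0) + n
        marginal_counts[b] = marginal_counts.get(b, 0) + n

    marginal = {e: Fraction(n, total) for e, n in marginal_counts.items()}
    joint = {pair: Fraction(n, total) for pair, n in joint_counts.items()}

    # Each joint count yields two conditional probabilities,
    # one per "reading direction".
    conditional = {}
    for (a, b), n in joint_counts.items():
        conditional[(b, a)] = Fraction(n, marginal_counts[a])  # P(b | a)
        conditional[(a, b)] = Fraction(n, marginal_counts[b])  # P(a | b)
    return marginal, joint, conditional

# Mammography counts: 160 and 40 of 200 women with cancer,
# 980 and 8,820 of 9,800 women without cancer.
marginal, joint, conditional = frequency_net(160, 40, 980, 8820)
print(len(marginal), len(joint), len(conditional))
print(conditional[("A", "B")])  # P(cancer | positive)
```

Because both reading directions are derived from the same four counts, the two tree diagrams and the 2 × 2 table are all contained in the returned dictionaries.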
Correct solution and typical incorrect Bayesian strategies with regard to the correct solution “F out of D” in a typical Bayesian reasoning task (according to Gigerenzer and Hoffrage, 1995; Steckelberg et al., 2004; Zhu and Gigerenzer, 2006; Días and Batanero, 2009; Eichler and Böcherer-Linder, 2018; Bruckmaier et al., 2019).
| | Probabilities (with b, c, d, etc.) | Frequencies (with A, B, C, etc.) |
| Correct solution (Bayesian) | k = f/d = b⋅j/(b⋅j + m⋅c) | F out of D = F out of (F + G) |
| Joint occurrence | f = b⋅j = d⋅k | F out of A |
| Fisherian/Representative thinking/Transposed conditional | j = f/b | F out of B |
| Base rate only/Conservatism | b | B out of A |
| Evidence only | d = f + g = b⋅j + c⋅m | D out of A = (F + G) out of A |
| Likelihood subtraction | j – m = f/b – g/c | (F out of B) – (G out of C) |
| Pre-Bayes | Not applicable | B out of D = B out of (F + G) |
| Correct positive rate/false positive rate | j/m | Not applicable |
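To make the strategies in this table concrete, the following sketch evaluates each of them for the mammography counts. The reading of the letters is an assumption inferred from the formulas in the table: A is the whole sample, B and C are the women with and without breast cancer, D the women with a positive result, and F and G the positive results among B and C. The dictionary `strategies` is a hypothetical name used only for illustration.

```python
from fractions import Fraction

# Mammography counts (natural frequency version); letter meanings are
# assumptions inferred from the formulas in the table.
A = 10_000          # whole sample
B = 200             # breast cancer
C = A - B           # no breast cancer
F = 160             # breast cancer and positive result
G = 980             # no breast cancer and positive result
D = F + G           # positive result

# Corresponding probabilities in the table's lowercase notation.
b = Fraction(B, A)  # base rate
c = Fraction(C, A)
j = Fraction(F, B)  # P(positive | cancer)
m = Fraction(G, C)  # P(positive | no cancer)
f = b * j           # joint probability of cancer and positive
g = c * m
d = f + g           # P(positive)
k = f / d           # correct Bayesian answer

strategies = {
    "Correct solution (Bayesian)": k,
    "Joint occurrence": f,
    "Fisherian": j,
    "Base rate only": b,
    "Evidence only": d,
    "Likelihood subtraction": j - m,
    "Pre-Bayes": Fraction(B, D),  # defined for frequencies only
    "Correct positive rate/false positive rate": j / m,  # probabilities only
}

for name, value in strategies.items():
    print(f"{name}: {value} ({float(value):.3f})")
```

With these numbers the correct answer is 160 out of 1,140 (about 14%), whereas the Fisherian strategy yields 80% and the joint-occurrence strategy 1.6%, which shows how far the typical errors lie from the Bayesian solution.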
FIGURE 5. Schematic representation of 2 × 2 tables, double-trees, and net diagrams (left in probabilities, right in frequencies).
Problem formulations.
| Mammography problem | Economics problem | |||
| Probability version | Natural frequency version | Probability version | Natural frequency version | |
| Imagine you are a reporter for a women’s magazine and you want to write an article about breast cancer. As a part of your research, you focus on mammography as an indicator of breast cancer. You are especially interested in the question of what it means when a woman has a positive result (which indicates breast cancer) in such a medical test. A physician explains the situation with the following information: | Imagine you are interested in the question of whether career-oriented students are more likely to attend an economics course. Therefore the school psychological service evaluates the correlations between personality characteristics and choice of courses for you. The following information is available: | |||
| • Text only (no visualization): The probability of breast cancer is 2% for a woman who participates in routine screening. If a woman who participates in routine screening has breast cancer, the probability is 80% that she will have a positive test result. If a woman who participates in routine screening does not have breast cancer, the probability is 10% that she will have a positive test result. | • Text only (no visualization): 200 out of 10,000 women who participate in routine screening have breast cancer. Out of 200 women who participate in routine screening and have breast cancer, 160 will have a positive result. Out of 9,800 women who participate in routine screening and have no breast cancer, 980 will also have a positive result. | • Text only (no visualization): The probability that a student attends the economics course is 32%. If a student attends the economics course, the probability that he is career-oriented is 64%. If a student does not attend the economics course, the probability that he is still career-oriented is 60%. | • Text only (no visualization): 320 out of 1,000 students attend the economics course. Out of 320 students who attend the economics course, 205 are career-oriented. Out of 680 students who do not attend the economics course, 408 are still career-oriented. | |
| • 2 × 2 table (prob.), or • double-tree (prob.), or • net diagram (prob.) | • 2 × 2 table (nat. freq.), or • double-tree (nat. freq.), or • net diagram (nat. freq.) | • 2 × 2 table (prob.), or • double-tree (prob.), or • net diagram (prob.) | • 2 × 2 table (nat. freq.), or • double-tree (nat. freq.), or • net diagram (nat. freq.) | |
| What is the probability that a woman who participates in routine screening and receives a positive test result has breast cancer? | How many of the women who participate in routine screening and receive a positive test result have breast cancer? | What is the probability that a student attends the economics course if he is career-oriented? | How many of the students who are career-oriented attend the economics course? | |
| Answer: ____ out of ____ | Answer: _______ | Answer: ___ out of ____ | Answer: _______ | |
| What is the probability that a woman who participates in routine screening receives a negative test result | How many of the women who participate in routine screening receive a negative test result | What is the probability that a student attends the economics course | How many of the students are not career-oriented | |
| Answer: _______ | Answer: ____ out of ____ | Answer: _______ | Answer: ____ out of ____ | |
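The conditional-probability questions in this table can be worked through in both information formats. The sketch below is illustrative only; the function and parameter names are hypothetical. It applies Bayes' rule to the probability versions and simply combines two counts for the natural frequency versions.

```python
def posterior_from_probabilities(base_rate, hit_rate, false_alarm_rate):
    """Bayes' rule: P(H | E) from P(H), P(E | H) and P(E | not H)."""
    joint_h = base_rate * hit_rate
    joint_not_h = (1 - base_rate) * false_alarm_rate
    return joint_h / (joint_h + joint_not_h)

def posterior_from_frequencies(hits, false_alarms):
    """Natural frequencies: answer as a pair 'hits out of all with the evidence'."""
    return hits, hits + false_alarms

# Mammography problem: 2% base rate, 80% and 10% positive rates.
p_mammo = posterior_from_probabilities(0.02, 0.80, 0.10)
f_mammo = posterior_from_frequencies(160, 980)

# Economics problem: 32% attend the course, 64% and 60% career-oriented.
p_econ = posterior_from_probabilities(0.32, 0.64, 0.60)
f_econ = posterior_from_frequencies(205, 408)

print(f"Mammography: {p_mammo:.4f} vs. {f_mammo[0]} out of {f_mammo[1]}")
print(f"Economics:   {p_econ:.4f} vs. {f_econ[0]} out of {f_econ[1]}")
```

In the mammography problem both formats give the same value (160 out of 1,140). In the economics problem the two formats differ slightly in the fourth decimal place, because 64% of 320 students is 204.8 and the frequency version rounds this to 205.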
Mental steps that are necessary for answering each question.
| | Required for answering | |
| Visualization | Question for a conditional probability/frequency | Question for a joint probability/frequency |
| Text only (prob.) | Genuine inference necessary | Genuine inference necessary |
| 2 × 2 table (prob.) | Genuine inference necessary | Choose a number (probability) |
| Double-tree (prob.) | Choose a number (probability) | Genuine inference necessary |
| Net diagram (prob.) | Choose a number (probability) | Choose a number (probability) |
| Text only (nat. freq.) | Genuine inference necessary | Genuine inference necessary |
| 2 × 2 table (nat. freq.) | Choose a pair of numbers (frequencies) | Choose a pair of numbers (frequencies) |
| Double-tree (nat. freq.) | Choose a pair of numbers (frequencies) | Choose a pair of numbers (frequencies) |
| Net diagram (nat. freq.) | Choose a pair of numbers (frequencies) | Choose a pair of numbers (frequencies) |
Design of the 16 tested problem versions.
| Context | |||
| Mammography problem | Economics problem | ||
| • Bayesian text | • Bayesian text | ||
| • Bayesian text | • Bayesian text | ||
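The count of 16 problem versions is consistent with fully crossing the factors named in the problem formulations. The sketch below assumes exactly that design (two contexts, two information formats, and four presentation types including the text-only version); the variable names are hypothetical.

```python
from itertools import product

# Assumed factors, taken from the problem formulations:
# 2 contexts x 2 information formats x 4 presentation types.
contexts = ["mammography problem", "economics problem"]
formats = ["probabilities", "natural frequencies"]
visualizations = ["text only", "2 × 2 table", "double-tree", "net diagram"]

versions = list(product(contexts, formats, visualizations))

for number, (context, info_format, visualization) in enumerate(versions, start=1):
    print(f"{number:2d}. {context} | {info_format} | {visualization}")
```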
FIGURE 6. Percentages of correct inferences in the question for a conditional probability, separated for information format and visualization type (across both contexts).
FIGURE 7. Percentages of correct inferences in the question for a joint probability, separated for information format and visualization type (across both contexts).
FIGURE 8. Typical errors on the question for a conditional probability, separated for information format and visualization type (across both contexts). In particular, the two errors Fisherian and joint-occurrence could be observed.
FIGURE 9. Typical errors on the question for a joint probability, separated for information format and visualization (across both contexts). In the versions with frequencies, two main errors can be observed: the confusion of the joint probability either with the conditional probability p or the conditional probability q. In the versions with probabilities, on the other hand, more diverse error patterns appear: Specific errors are provoked by the pure text version with probabilities and the probability double-trees.