Domenico Viganola, Grant Buckles, Yiling Chen, Pablo Diego-Rosell, Magnus Johannesson, Brian A Nosek, Thomas Pfeiffer, Adam Siegel, Anna Dreber.
Abstract
There is evidence that prediction markets are useful tools for aggregating information on researchers' beliefs about scientific results, including the outcomes of replications. In this study, we use prediction markets to forecast the results of novel experimental designs that test established theories. We set up prediction markets for hypotheses tested in the Defense Advanced Research Projects Agency's (DARPA) Next Generation Social Science (NGS2) programme. Researchers were invited to bet on whether 22 hypotheses would be supported or not. We define support as a test result in the same direction as hypothesized, with a Bayes factor of at least 10 (i.e. the observed data are at least 10 times more likely under the tested hypothesis than under the null hypothesis). In addition to betting on this binary outcome, we asked participants to bet on the expected effect size (in Cohen's d) for each hypothesis. Our goal was to have at least 50 participants sign up for these markets. Although this goal was met, only 39 participants ended up trading. Participants also completed a survey on both the binary result and the effect size. We find that neither prediction markets nor surveys performed well in predicting outcomes for NGS2.
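The support criterion stated in the abstract can be written as a small check. This is a minimal sketch, not the authors' analysis code: the function name `is_supported` and its parameters are hypothetical, and it assumes the Bayes factor has already been computed elsewhere.

```python
def is_supported(effect_sign: int, hypothesized_sign: int,
                 bayes_factor: float, threshold: float = 10.0) -> bool:
    """Apply the abstract's rule: same direction as hypothesized AND Bayes factor >= 10.

    effect_sign / hypothesized_sign: +1 or -1 (hypothetical encoding of direction).
    bayes_factor: evidence for the tested hypothesis relative to the null,
                  assumed to be computed by a separate analysis step.
    """
    same_direction = effect_sign != 0 and effect_sign == hypothesized_sign
    return same_direction and bayes_factor >= threshold


# A strong Bayes factor in the wrong direction does not count as support.
print(is_supported(+1, +1, 12.0))   # True
print(is_supported(-1, +1, 50.0))   # False
print(is_supported(+1, +1, 9.9))    # False
```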
Keywords: hypothesis; peer beliefs; prediction markets
Year: 2021 PMID: 34295507 PMCID: PMC8278038 DOI: 10.1098/rsos.181308
Source DB: PubMed Journal: R Soc Open Sci ISSN: 2054-5703 Impact factor: 2.963
Prediction market studies to forecast the outcomes of systematic replication studies. Traders were endowed with US$100 [9] or US$50 [10–12], and bet on the binary outcome of whether a study would replicate or not by trading contracts worth $1 in case of successful replication, and $0 otherwise. (Successful replication was mainly defined as an effect in the same direction as the original study with a p-value less than 0.05.) ML2 additionally elicited predictions on the effect sizes in the replications, and SSRP had two treatments to adjust to the design used for the replications. Trading periods ranged from 10 days to two weeks, and traders were given detailed information about the original studies and the replications.
| project | abbreviation | number of replications in market | participants in the market | references | key characteristics |
|---|---|---|---|---|---|
| reproducibility project: psychology | RPP | 23 (Period 1) | 47 (Period 1) | replication results: [ | Studies were selected from top psychology journals in 2008. Prediction markets were conducted in two periods on a subset of all studies replicated in RPP. Traders were recruited through the email lists of the Open Science Framework (OSF) and the RPP collaboration. |
| | | 21 (Period 2) (results for 41 out of 44 ready and analysed) | 45 (Period 2) | market: [ | |
| experimental economics replication project | EERP | 18 | 97 | [ | Experimental economics studies were selected from top economics journals 2011–2014. Traders were recruited mainly through the Economic Science Association email list. |
| social science replication project | SSRP | 21 | 114 (Treatment 1) | [ | Experimental social science studies were selected from |
| | | | 92 (Treatment 2) | | |
| ManyLabs2 | ML2 | 28 (results for 24 analysed) | 78 | [ | Twenty-eight new and classic effects in psychology were selected. Traders were recruited mainly through the email list of the Open Science Framework (OSF). |
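The binary contracts described in the table caption pay $1 if the study replicates and $0 otherwise. The sketch below illustrates that payoff rule only; the function names are hypothetical, and reading a price as a probability is the usual interpretation of such markets rather than something stated in the caption. It does not model the trading mechanism used in these studies.

```python
def contract_profit(price: float, n_contracts: int, replicated: bool) -> float:
    """Realized profit in dollars from buying n_contracts at `price` each.

    Each contract pays $1 if the study replicates and $0 otherwise.
    """
    if not 0.0 <= price <= 1.0:
        raise ValueError("price of a $1 binary contract must lie in [0, 1]")
    payoff = 1.0 if replicated else 0.0
    return n_contracts * (payoff - price)


def expected_profit(price: float, belief: float, n_contracts: int = 1) -> float:
    """Expected profit for a trader who believes replication has probability `belief`."""
    return n_contracts * (belief - price)


# Buying 10 contracts at $0.60 each costs $6.00.
print(contract_profit(0.60, 10, replicated=True))    # gains about $4
print(contract_profit(0.60, 10, replicated=False))   # loses the $6 stake
```

Under this payoff rule, a risk-neutral trader has a positive expected profit from buying whenever their own probability of replication exceeds the current price, which is why the price is commonly read as an aggregate forecast.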
Forecasts and observed outcomes.
| hypothesis | observed outcome: binary | observed outcome: Cohen's d | survey forecast: binary | survey forecast: Cohen's d | market forecast: binary | market forecast: Cohen's d |
|---|---|---|---|---|---|---|
| H1 | 0 | 0.01 | 0.62 | 0.28 | 0.58 | 0.14 |
| H2 | 1 | 0.30 | 0.60 | 0.27 | 0.72 | 0.04 |
| H3 | 1 | –1.40 | 0.72 | 0.18 | 0.89 | –0.08 |
| H4 | 0 | 0.07 | 0.54 | 0.13 | 0.60 | 0.07 |
| H5 | 0 | 0.07 | 0.54 | 0.18 | 0.47 | 0.16 |
| H6 | 1 | 0.03 | 0.42 | 0.01 | 0.48 | 0.05 |
| H7 | 0 | –0.02 | 0.42 | 0.08 | 0.53 | –0.06 |
| H8 | 0 | 0.05 | 0.36 | –0.01 | 0.38 | 0.15 |
| H9 | 0 | –0.05 | 0.68 | 0.32 | 0.69 | 0.20 |
| H10 | 0 | –0.15 | 0.53 | 0.17 | 0.42 | 0.18 |
| H11 | 1 | –0.27 | 0.47 | 0.11 | 0.61 | 0.10 |
| H12 | 0 | 0.01 | 0.59 | 0.15 | 0.45 | 0.10 |
| H13 | 0 | –0.01 | 0.65 | 0.24 | 0.74 | 0.35 |
| H14 | 0 | 0.00 | 0.46 | 0.12 | 0.43 | 0.80 |
| H15 | 0 | 0.01 | 0.47 | 0.07 | 0.60 | –0.18 |
| H16 | 0 | –0.01 | 0.49 | 0.06 | 0.56 | –0.04 |
| H17 | 0 | 0.04 | 0.69 | 0.29 | 0.81 | 0.50 |
| H18 | 1 | 2.57 | 0.50 | 0.13 | 0.55 | 0.28 |
| H19 | 0 | –0.15 | 0.49 | 0.14 | 0.43 | 0.22 |
| H20 | 0 | 0.05 | 0.44 | 0.00 | 0.68 | 0.30 |
| H21 | 0 | 0.00 | 0.51 | 0.08 | 0.56 | –0.03 |
| H22 | 1 | 1.46 | 0.76 | 0.44 | 0.67 | 0.56 |
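One common way to summarize how well probability forecasts match binary outcomes is the Brier score, the mean squared difference between forecast and outcome (lower is better). The sketch below applies it to the binary columns of the table above. It is an illustration only: the choice of the Brier score and the name `brier_score` are assumptions made here, not necessarily the evaluation used in the paper.

```python
# (hypothesis, observed binary outcome, survey forecast, market forecast),
# transcribed from the binary columns of the table above.
ROWS = [
    ("H1", 0, 0.62, 0.58), ("H2", 1, 0.60, 0.72), ("H3", 1, 0.72, 0.89),
    ("H4", 0, 0.54, 0.60), ("H5", 0, 0.54, 0.47), ("H6", 1, 0.42, 0.48),
    ("H7", 0, 0.42, 0.53), ("H8", 0, 0.36, 0.38), ("H9", 0, 0.68, 0.69),
    ("H10", 0, 0.53, 0.42), ("H11", 1, 0.47, 0.61), ("H12", 0, 0.59, 0.45),
    ("H13", 0, 0.65, 0.74), ("H14", 0, 0.46, 0.43), ("H15", 0, 0.47, 0.60),
    ("H16", 0, 0.49, 0.56), ("H17", 0, 0.69, 0.81), ("H18", 1, 0.50, 0.55),
    ("H19", 0, 0.49, 0.43), ("H20", 0, 0.44, 0.68), ("H21", 0, 0.51, 0.56),
    ("H22", 1, 0.76, 0.67),
]


def brier_score(forecasts, outcomes):
    """Mean squared difference between probability forecasts and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


outcomes = [row[1] for row in ROWS]
survey = [row[2] for row in ROWS]
market = [row[3] for row in ROWS]

print(f"supported: {sum(outcomes)} of {len(outcomes)}")  # supported: 6 of 22
print(f"survey Brier score: {brier_score(survey, outcomes):.3f}")
print(f"market Brier score: {brier_score(market, outcomes):.3f}")
# An uninformative forecast of 0.5 for every hypothesis always scores 0.25.
print(f"constant-0.5 baseline: {brier_score([0.5] * len(outcomes), outcomes):.3f}")
```

Comparing each score with the 0.25 baseline gives a quick sense of whether a set of forecasts carries more information than assigning every hypothesis a probability of one half.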
Figure 1. Forecasts and experimental outcomes. Green (red) dots indicate hypotheses that were supported (not supported) by the experimental data. The identity line is shown in grey. (a) Market forecasts versus survey-based forecasts for the binary outcomes. (b) Smoothed market forecasts versus survey-based forecasts for the binary outcomes. (c) Market forecasts versus observed outcomes for the effect sizes. Note that the observed effect sizes cover a much larger range than the forecasted effect sizes.