Eric W Schoon, David Melamed, Ronald L Breiger, Eunsung Yoon, Christopher Kleps.
Abstract
Forecasting extremely rare events is a pressing problem, but efforts to model such outcomes are often limited by the presence of multiple causes within classes of events, insufficient observations of the outcome to assess fit, and biased estimates due to insufficient observations of the outcome. We introduce a novel approach for analyzing rare event data that addresses these challenges by turning attention to the conditions under which rare outcomes do not occur. We detail how configurational methods can be used to identify conditions or sets of conditions that would preclude the occurrence of a rare outcome. Results from Monte Carlo experiments show that our approach can be used to systematically preclude up to 78.6% of observations, and application to ground-truth data coupled with a bootstrap inferential test illustrates how our approach can also yield novel substantive insights that are obscured by standard statistical analyses.
Year: 2019 PMID: 31600272 PMCID: PMC6786560 DOI: 10.1371/journal.pone.0223239
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Truth table example.
| X1 | X2 | N | Observed N of Outcome | Assigned Outcome Value | Consistency |
|---|---|---|---|---|---|
| 1 | 0 | 2 | 2 | 1 | 1 |
| 0 | 1 | 5 | 4 | 1 | 0.8 |
| 1 | 1 | 3 | 2 | 0 | 0.67 |
| 0 | 0 | 4 | 2 | 0 | 0.5 |
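The consistency column in the truth table above can be reproduced as the observed number of outcomes divided by the number of cases in each row. The sketch below is illustrative only: the 0.8 cutoff for assigning an outcome value of 1 is an assumption inferred from the four rows shown, and the names `truth_rows` and `CONSISTENCY_THRESHOLD` are hypothetical.

```python
# Rows copied from the truth table example: (X1, X2, N, observed N of outcome).
truth_rows = [
    (1, 0, 2, 2),
    (0, 1, 5, 4),
    (1, 1, 3, 2),
    (0, 0, 4, 2),
]

# Assumed cutoff: it matches the assigned values in the table,
# but the source does not state the threshold explicitly.
CONSISTENCY_THRESHOLD = 0.8


def score_row(n_cases, n_outcome, threshold=CONSISTENCY_THRESHOLD):
    """Return (consistency, assigned outcome value) for one configuration."""
    consistency = n_outcome / n_cases
    assigned = 1 if consistency >= threshold else 0
    return round(consistency, 2), assigned


scored = [score_row(n, k) for (_x1, _x2, n, k) in truth_rows]

for (x1, x2, n, k), (consistency, assigned) in zip(truth_rows, scored):
    print(f"X1={x1} X2={x2} N={n} outcome={k} "
          f"assigned={assigned} consistency={consistency}")
```

With these inputs the computed consistency and assigned values agree with the last two columns of the table.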
Fig 1. Proportion of cases removed by rarity of outcome.
Error bars denote 95% confidence interval around the means.
Fig 2. Proportion of cases removed by sample size.
Error bars denote 95% confidence interval around the means.
Logistic regression of violent infighting.
| Variable | Coefficient (standard error) |
|---|---|
| Excluded population | -0.088 (0.204) |
| Center segmentation | 0.334** (0.117) |
| Imperial past | 3.035 (1.759) |
| Linguistic fractionalization | 0.980 (1.762) |
| GDP per capita | 0.139 (0.083) |
| Population size | -0.292 (0.143) |
| Ongoing war | (omitted) |
| Year | 0.047 (0.016) |
| Peace years | 0.080 (0.358) |
| Spline 1 | 0.003 (0.011) |
| Spline 2 | -0.001 (0.003) |
| Spline 3 | 0.000 (0.001) |
| Constant | -99.135 (31.928) |
Note: Standard errors are in parentheses.
*p< 0.05;
**p< 0.01;
***p< 0.001.
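As a reading aid for the logistic regression table, the sketch below converts the reported Center segmentation coefficient (0.334, standard error 0.117) into an odds ratio and a two-sided p-value. It assumes a standard Wald z-test against a normal reference distribution; the source does not say which test or standard-error correction was used, so the result is only a rough check on the `**` marker.

```python
import math

# Values taken from the Center segmentation row of the table.
coef = 0.334
se = 0.117

# A logit coefficient exponentiates to an odds ratio.
odds_ratio = math.exp(coef)

# Wald statistic (assumed test, not confirmed by the source).
z = coef / se

# Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"odds ratio = {odds_ratio:.3f}, z = {z:.2f}, p = {p_two_sided:.4f}")
```

Under these assumptions the p-value falls between 0.001 and 0.01, which is consistent with the two-star marker on that coefficient.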
Fig 3. Configurational results for the absence of infighting.
Bars denote proportion of cases removed overall, and for each configuration with a minimum of 10% coverage.
Fig 4. Distribution of coverage scores from bootstrap inferential test of the configurational analysis for the absence of infighting.
Results indicate the coverage scores for the configurations presented in Fig 4 across 10,000 resamplings with replacement.
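The resampling idea behind the bootstrap test can be sketched as follows. This is not the authors' implementation: the toy cases are invented for illustration, coverage is assumed to mean the share of outcome-absent cases that match a configuration, and the percentile summary at the end is likewise an assumption. The function names `coverage` and `bootstrap_coverage` are hypothetical.

```python
import random


def coverage(cases, configuration, outcome_key="outcome"):
    """Share of outcome-absent cases that match every condition in `configuration`."""
    absent = [c for c in cases if c[outcome_key] == 0]
    if not absent:
        return 0.0
    covered = [c for c in absent
               if all(c[k] == v for k, v in configuration.items())]
    return len(covered) / len(absent)


def bootstrap_coverage(cases, configuration, n_resamples=10_000, seed=0):
    """Recompute coverage on resamples drawn with replacement."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_resamples):
        resample = rng.choices(cases, k=len(cases))
        scores.append(coverage(resample, configuration))
    return scores


# Hypothetical toy data, not taken from the study.
toy_cases = (
    [{"X1": 0, "X2": 0, "outcome": 0}] * 6
    + [{"X1": 1, "X2": 0, "outcome": 0}] * 3
    + [{"X1": 1, "X2": 1, "outcome": 1}] * 1
)
config = {"X1": 0, "X2": 0}

observed = coverage(toy_cases, config)
scores = sorted(bootstrap_coverage(toy_cases, config, n_resamples=1000))

# Simple percentile bounds on the resampled coverage scores (assumed summary).
lower = scores[int(0.025 * len(scores))]
upper = scores[int(0.975 * len(scores)) - 1]
print(f"observed coverage = {observed:.3f}, 95% range = [{lower:.3f}, {upper:.3f}]")
```

Fixing the seed keeps the resampled distribution reproducible between runs; the default of 10,000 resamples mirrors the number reported in the caption.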