Jerker Denrell, Balázs Kovács.
Abstract
Most studies of fashion and fads focus on objects and practices that were once popular. We argue that limiting the sample to such trajectories generates a selection bias that obscures the underlying process and yields biased estimates. Through simulations and an analysis of the New York Times text archive, a data set not previously used to study the rise and fall of cultural practices, we show that studying the whole range of cultural objects, both popular and less popular, is essential for understanding the drivers of popularity. In particular, we show that estimates of statistical models of the drivers of popularity will be biased if researchers use only the trajectories of practices that were once popular.
Year: 2015 PMID: 25886158 PMCID: PMC4401772 DOI: 10.1371/journal.pone.0123471
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. A few illustrative word trajectories from the New York Times archive.
Fig 2. (a) The distribution of word counts. (b) Scatter plot of the maximum each trajectory reaches against the minimum it reaches after its peak.
Table 1. Estimates of Poisson models using simulated data.
| | True coefficients | Estimates: all trajectories | Estimates: max > 27 & min after peak < 4 | Effect of selection bias |
|---|---|---|---|---|
| | 1.00 | .998 (.0018) | 1.560 (.018) | Overestimated |
| | 0.60 | .598 (.0015) | .531 (.012) | Underestimated |
| | 0.1 | .101 (.0010) | .122 (.006) | Overestimated |
| | -0.05 | -.049 (.0003) | -.047 (.003) | Overestimated |
| | -0.1 | -.099 (.0003) | -.130 (.002) | Underestimated |
| | 0.3 | .300 (.0013) | .272 (.009) | Underestimated |
| N | | 750000 | 7080 | |
| Pseudo R² | | .569 | .623 | |
| Log-likelihood | | -1153324.4 | -16875.587 | |
All parameter estimates are significant at the 1% level.
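The selection rule in the table header above (a maximum above 27 and a post-peak minimum below 4) can be illustrated with a short sketch. This is not the authors' code: the function name `passes_selection`, the choice of the first occurrence of the maximum as the peak, and the toy trajectories are all assumptions made for illustration.

```python
from typing import Sequence


def passes_selection(counts: Sequence[int],
                     max_threshold: int = 27,
                     min_after_threshold: int = 4) -> bool:
    """Return True if a trajectory peaks above max_threshold and later
    falls below min_after_threshold (hypothetical helper)."""
    if not counts:
        return False
    peak = max(counts)
    # Assumption: the peak is the first time the maximum is reached.
    peak_index = counts.index(peak)
    after_peak = counts[peak_index + 1:]
    if not after_peak:
        # The series ends at its peak, so no decline is observed.
        return False
    return peak > max_threshold and min(after_peak) < min_after_threshold


# Toy trajectories (invented for illustration only).
trajectories = {
    "fad": [1, 5, 30, 12, 2],             # rises above 27, then falls below 4
    "steady": [3, 4, 5, 4, 3],            # never becomes popular
    "still_popular": [2, 10, 30, 28, 25],  # peaks but does not collapse
}
selected = [name for name, counts in trajectories.items()
            if passes_selection(counts)]
print(selected)  # ['fad']
```

Only the trajectory that both rose and fell survives the filter, which is the kind of sample restriction the abstract argues leads to biased estimates.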
Table 2. Estimates of Poisson models with previous level and sign of change, Abrahamson (1996 [2]) and David and Strang (2006 [19]) data.
| | Abrahamson (1996)’s data | David and Strang (2006)’s data |
|---|---|---|
| | 2.215 (.315) | 2.650 (.0261) |
| | .559 (.202) | .534 (.137) |
| | .118 (.139) | .182 (.006) |
| | -.189 (.081) | -.078 (.017) |
| | .144 (.178) | .074 (.117) |
| N | 8 | 11 |
| Pseudo R² | .480 | .779 |
| Log-likelihood | -25.655 | -82.437 |
* Starred coefficient estimates are significantly different from zero at the 5% level.
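Each table reports a log-likelihood and a pseudo R² for the fitted Poisson models. The sketch below shows how such quantities could be computed from observed counts and fitted means. It assumes McFadden's definition of pseudo R² (one minus the ratio of the model log-likelihood to that of an intercept-only model); the source does not state which definition was used, and the function names and example numbers are hypothetical.

```python
import math
from typing import Sequence


def poisson_log_likelihood(counts: Sequence[int],
                           means: Sequence[float]) -> float:
    """Sum of log P(y | mu) over independent Poisson observations."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1)
               for y, mu in zip(counts, means))


def mcfadden_pseudo_r2(counts: Sequence[int],
                       fitted_means: Sequence[float]) -> float:
    """1 - LL(model) / LL(null), where the null model predicts the
    sample mean for every observation (assumed definition)."""
    null_mean = sum(counts) / len(counts)
    ll_null = poisson_log_likelihood(counts, [null_mean] * len(counts))
    ll_model = poisson_log_likelihood(counts, fitted_means)
    return 1.0 - ll_model / ll_null


# Invented counts and fitted means, for illustration only.
observed = [2, 4, 9, 15]
fitted = [2.5, 4.5, 8.0, 15.0]
print(poisson_log_likelihood(observed, fitted))
print(mcfadden_pseudo_r2(observed, fitted))
```

Under this definition a model that predicts no better than the sample mean scores 0, and values closer to 1 indicate a larger improvement in log-likelihood over that baseline.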
Fig 3. Abrahamson (1996)’s data: Comparison of the observed data and the predicted values from Poisson models using parameters from Table 2.
Fig 4. David and Strang (2006)’s data: Comparison of the observed data and the predicted values from Poisson models using parameters from Table 2.
Table 3. Estimates of Poisson models using the New York Times data.
| | Estimates: all trajectories | Estimates: max > 54 & min after peak < 7 | Effect of selection bias |
|---|---|---|---|
| | -265.975 (4.603) | -65.167 (23.596) | Overestimated |
| | .714 (.015) | .339 (.031) | Underestimated |
| | -.238 (.0033) | .111 (.060) | Overestimated |
| | -.210 (.012) | -.058 (.027) | Overestimated |
| | -.060 (.004) | -.071 (.011) | Underestimated |
| | .203 (.018) | .023 (.028) | Underestimated |
| Year | .133 (.002) | .034 (.012) | |
| N | 11,268,754 | 16,609 | |
| Pseudo R² | .364 | .258 | |
| Log-likelihood | -8645497.1 | -328537.85 | |
*** significant at the 1% level.
** significant at the 5% level.
* significant at the 10% level (robust standard errors clustered by trajectories).