J Nathan Matias, Kevin Munger, Marianne Aubin Le Quere, Charles Ebersole.
Abstract
The pursuit of audience attention online has led organizations to conduct thousands of behavioral experiments each year in media, politics, activism, and digital technology. One pioneer of A/B tests was Upworthy.com, a U.S. media publisher that conducted a randomized trial for every article they published. Each experiment tested variations in a headline and image "package," recording how many randomly-assigned viewers selected each variation. While none of these tests were designed to answer scientific questions, scientists can advance knowledge by meta-analyzing and data-mining the tens of thousands of experiments Upworthy conducted. This archive records the stimuli and outcome for every A/B test fielded by Upworthy between January 24, 2013 and April 30, 2015. In total, the archive includes 32,487 experiments, 150,817 experiment arms, and 538,272,878 participant assignments. The open access dataset is organized to support exploratory and confirmatory research, as well as meta-scientific research on ways that scientists make use of the archive.
Year: 2021; PMID: 34341340; PMCID: PMC8329003; DOI: 10.1038/s41597-021-00934-7
Source DB: PubMed; Journal: Sci Data; ISSN: 2052-4463; Impact factor: 6.444
Fig. 1. Reconstruction of one test from 2013, composed of several tested packages. In this particular test, 3 different headlines were compared and the image was kept constant.
Fig. 2. An article page on the Upworthy website from December 2013. This figure demonstrates how the “Headline” and “Eyecatcher ID” fields from our dataset would have been shown to visitors. On this page, one of the image-headline combinations in the right sidebar was likely a package being tested.
Fig. 3. Editor view of packages in Upworthy’s testing system once testing was underway. This package (D) was the fourth arm in an experiment. This reproduction of the Farm software, from 2018, had an entry for p-value, which was not computed during the period covered by the archive.
Columns in the Upworthy Research Archive.
| Column name | Description |
|---|---|
| created_at | Time the package was created (timezone unknown) |
| test_week | Week the package was created, a variable constructed by the archive creators for stratified random sampling |
| clickability_test_id | The test ID. Viewers were randomly assigned to packages with the same test ID |
| impressions | The number of viewers who were assigned to this package. The total number of participants for a given test is the sum of impressions for all packages that share the same clickability_test_id |
| headline | The headline being tested |
| eyecatcher_id | Image ID. Image files are not available. Packages that shared the same image have the same eyecatcher_id |
| clicks | The number of viewers (impressions) that clicked on the package. The clickrate for a given package is the number of clicks divided by the number of impressions |
| excerpt | Article excerpt |
| lede | The opening sentence or paragraph of the story |
| slug | Internal name for the web address |
| share_text | Summary for display on social media when the article is shared. This was not shown in tests, since tests were conducted on the Upworthy website |
| square | When used, part of the same social media sharing suggestion as the share text |
| significance | NOT an estimate of statistical significance; a complex, inconsistent calculation that compared the clicks on a package to the clicks on all previous packages that were fielded on the same pages |
| first_place | Along with significance, shown to editors to guide decisions about what test to choose |
| winner | Whether a package was selected by editors to be used on the Upworthy site after the test |
| updated_at | The last time the package was updated in the Upworthy system |

Dataset metadata.
| Metadata field | Value |
|---|---|
| Measurement(s) | behavior |
| Technology Type(s) | Web Site |
| Factor Type(s) | experimental group |
| Sample Characteristic - Organism | Homo sapiens |
| Sample Characteristic - Environment | internet |
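
The column descriptions above define two derived quantities: the click rate of a package (clicks divided by impressions) and the number of participants in a test (the sum of impressions over all packages that share a `clickability_test_id`). The following is a minimal sketch of both calculations in plain Python. The rows are hypothetical values made up for illustration, not records from the archive, and the helper names `click_rate` and `participants_per_test` are assumptions rather than part of any published tooling.

```python
from collections import defaultdict

# Hypothetical rows for illustration only; these are NOT values from the archive.
# The keys mirror the column names in the table above.
packages = [
    {"clickability_test_id": "test_a", "headline": "Headline one", "impressions": 1000, "clicks": 25},
    {"clickability_test_id": "test_a", "headline": "Headline two", "impressions": 1000, "clicks": 40},
    {"clickability_test_id": "test_b", "headline": "Headline three", "impressions": 500, "clicks": 5},
]


def click_rate(package):
    """Clicks divided by impressions, following the `clicks` column description."""
    if package["impressions"] == 0:
        return None  # the rate is undefined when no viewer was assigned
    return package["clicks"] / package["impressions"]


def participants_per_test(rows):
    """Sum impressions over all packages that share a clickability_test_id."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["clickability_test_id"]] += row["impressions"]
    return dict(totals)


rates = [click_rate(p) for p in packages]
totals = participants_per_test(packages)

for package, rate in zip(packages, rates):
    print(package["clickability_test_id"], package["headline"], rate)
print(totals)
```

Because viewers were randomized only among packages with the same test ID, click rates are most safely compared within a test rather than across tests. The `significance` column should not stand in for such a comparison, since the table notes it is not an estimate of statistical significance.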