Martha Bailey, Connor Cole, Catherine Massey.
Abstract
New large-scale linked data are revolutionizing quantitative history and demography. This paper proposes two complementary strategies for improving inference with linked historical data: the use of validation variables to identify higher-quality links, and a simple, regression-based weighting procedure to increase the representativeness of custom research samples. We demonstrate the potential value of these strategies using the 1850-1930 Integrated Public Use Microdata Series Linked Representative Samples (IPUMS-LRS), a high-quality, publicly available linked historical dataset. We show that, while incorrect linking rates appear low in the IPUMS-LRS, researchers can reduce error rates further using validation variables. We also show how researchers can reweight linked samples to balance observed characteristics in the linked sample with those in a reference population using a simple regression-based procedure.
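The abstract does not spell out the regression used for reweighting, so the following is only a minimal sketch of one common approach, inverse probability weighting, and should not be read as the authors' exact procedure. It assumes a logistic regression of link status on observed characteristics; the data are synthetic, and the names `fit_logit`, `inverse_probability_weights`, `age_std`, and `literate` are hypothetical.

```python
import numpy as np


def fit_logit(X, y, n_iter=50, ridge=1e-8):
    """Fit a logistic regression by Newton-Raphson (X includes an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)
        # Information matrix; the tiny ridge term only guards against singularity.
        info = (X * (p * (1.0 - p))[:, None]).T @ X + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(info, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta


def inverse_probability_weights(covariates, linked):
    """Weights for the linked records: 1 / estimated P(linked | covariates)."""
    X = np.column_stack([np.ones(len(covariates)), covariates])
    beta = fit_logit(X, linked.astype(float))
    p_hat = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = 1.0 / p_hat[linked]
    return w / w.mean()  # normalize so the weights average to one


# Synthetic reference population (assumption: not IPUMS-LRS data).
rng = np.random.default_rng(0)
n = 20_000
age_std = rng.normal(size=n)        # hypothetical standardized age
literate = rng.random(n) < 0.6      # hypothetical literacy indicator
covariates = np.column_stack([age_std, literate.astype(float)])

# Linking is made more likely for older and literate people, so the
# unweighted linked sample over-represents them.
p_link = 1.0 / (1.0 + np.exp(-(-1.0 + 0.7 * age_std + 1.0 * literate)))
linked = rng.random(n) < p_link

weights = inverse_probability_weights(covariates, linked)

print("population means:     ", covariates.mean(axis=0))
print("linked, unweighted:   ", covariates[linked].mean(axis=0))
print("linked, reweighted:   ",
      np.average(covariates[linked], axis=0, weights=weights))
```

In this sketch the reweighted means of the linked records should sit closer to the reference population means than the unweighted ones. Balance is only achieved on the characteristics included in the regression, and only to the extent that the assumed model for link status is reasonable.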
Year: 2019 PMID: 33005066 PMCID: PMC7523567 DOI: 10.1080/01615440.2019.1630343
Source DB: PubMed Journal: Hist Methods ISSN: 0161-5440