Lee Fiorio, Emilio Zagheni, Guy Abel, Johnathan Hill, Gabriel Pestre, Emmanuel Letouzé, Jixuan Cai.
Abstract
Georeferenced digital trace data offer unprecedented flexibility in migration estimation. Because of their high temporal granularity, many migration estimates can be generated from the same data set by changing the definition parameters. Yet despite the growing application of digital trace data to migration research, strategies for taking advantage of their temporal granularity remain largely underdeveloped. In this paper, we provide a general framework for converting digital trace data into estimates of migration transitions and for systematically analyzing their variation along a quasi-continuous time scale, analogous to a survival function. From migration theory, we develop two simple hypotheses regarding how we expect our estimated migration transition functions to behave. We then test our hypotheses on simulated data and empirical data from three platforms in two internal migration contexts: geotagged Tweets and Gowalla check-ins in the United States, and cell-phone call detail records in Senegal. Our results demonstrate the need for evaluating the internal consistency of migration estimates derived from digital trace data before using them in substantive research. At the same time, however, common patterns across our three empirical data sets point to an emergent research agenda using digital trace data to study the specific functional relationship between estimates of migration and time and how this relationship varies by geography and population characteristics.
Keywords: Big data; Methods; Migration; Mobility
Year: 2021 PMID: 33834241 PMCID: PMC8055474 DOI: 10.1215/00703370-8917630
Source DB: PubMed Journal: Demography ISSN: 0070-3370
Fig. 1 Three interlocking but distinct dimensions of migration measurement: start, buffer, and interval. By changing one while holding the other two fixed, we can assess how migration estimates are affected by seasonality (start), residency criteria (buffer), and cumulative exposure to migration risk (interval).
Fig. 2 Migration rates derived from simulated data. Each line tracks the rate of individuals classified as migrants as the interval increases from a specific reference point, while the buffer is held fixed.
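The start, buffer, and interval dimensions described in the captions above can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: each user's trace is a list of (date, location) pairs, residence in a window is the most frequently observed location, and a user counts as a migrant when residence differs between the buffer window beginning at the start and the buffer window beginning one interval later. The function names (`modal_location`, `is_migrant`, `migration_rate`) are hypothetical, and the authors' actual residency rules may differ.

```python
from collections import Counter
from datetime import date, timedelta


def modal_location(records, window_start, window_days):
    """Most frequent location in [window_start, window_start + window_days).

    `records` is a list of (date, location) pairs for one user (assumed format).
    Returns None when the user has no records in the window. Ties go to the
    location seen first, which is an arbitrary choice made for this sketch.
    """
    window_end = window_start + timedelta(days=window_days)
    counts = Counter(loc for day, loc in records if window_start <= day < window_end)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def is_migrant(records, start, buffer_days, interval_days):
    """Compare residence at `start` with residence `interval_days` later.

    Returns True or False, or None when residence cannot be assigned in
    either buffer window (assumed rule: such users are left out of the rate).
    """
    origin = modal_location(records, start, buffer_days)
    later = start + timedelta(days=interval_days)
    destination = modal_location(records, later, buffer_days)
    if origin is None or destination is None:
        return None
    return origin != destination


def migration_rate(users, start, buffer_days, interval_days):
    """Share of classifiable users flagged as migrants, or None if there are none.

    `users` maps a user id to that user's list of (date, location) records.
    """
    flags = [is_migrant(recs, start, buffer_days, interval_days)
             for recs in users.values()]
    observed = [flag for flag in flags if flag is not None]
    if not observed:
        return None
    return sum(observed) / len(observed)


def rate_by_interval(users, start, buffer_days, intervals):
    """Migration rate for each interval, with start and buffer held fixed."""
    return {days: migration_rate(users, start, buffer_days, days)
            for days in intervals}
```

Calling `rate_by_interval` with a fixed start and buffer and a range of interval lengths yields one curve of the kind described in Fig. 2. Repeating the call with a different start or buffer varies one of the other two dimensions.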
Fig. 3 Counts of unique users by month in the Orange-Sonatel, Twitter, and Gowalla data sets.
Fig. 4 Estimates from the three empirical data sets of change in the rate of migration with increases in the interval from a specific start, while the buffer is held fixed at 1 week, 4 weeks, and 12 weeks.
Fig. 5 Each dot represents a bilateral flow or corridor, such as New England to the Mountain West (in the U.S. context). The migration rate corresponding to each bilateral flow is estimated at two different time scales: one-month-to-one-month and six-month-to-six-month. The plot shows the correlation of these two temporal specifications for each bilateral flow/corridor. The findings presented here provide fertile ground for future research and suggest that (easier to measure) short-term flows may be useful in modeling (harder to measure) long-term flows.