Hao Dong, Cameron Campbell, Satomi Kurosu, Wenshan Yang, James Z Lee.
Abstract
Comparison and comparability lie at the heart of any comparative social science. Still, precise comparison is virtually impossible without using similar methods and similar data. In recent decades, social demographers, historians, and economic historians have compiled and made available a large number of micro-level data sets of historical populations for North America and Europe. Studies using these data have already made important contributions to many academic disciplines. In a similar spirit, we introduce five new micro-level historical panel data sets from East Asia, including the China Multi-Generational Panel Dataset-Liaoning (CMGPD-LN) 1749-1909, the China Multi-Generational Panel Dataset-Shuangcheng (CMGPD-SC) 1866-1913, the Japanese Ninbetsu-Aratame-Cho Population Register Database-Shimomoriya and Niita (NAC-SN) 1716-1870, the Korea Multi-Generational Panel Dataset-Tansung (KMGPD-TS) 1678-1888, and the Colonial Taiwan Household Registration Database (CTHRD) 1906-1945. These data sets in total contain more than 3.7 million linked observations of 610,000 individuals and are the first such Asian data to be made available online or by application. We discuss the key features and historical institutions that originally collected these data; the subsequent processes by which the data were reconstructed into individual-level panels; their particular data limitations and strengths; and their potential for comparative social scientific research.
Year: 2015 PMID: 26001625 PMCID: PMC4789298 DOI: 10.1007/s13524-015-0397-y
Source DB: PubMed Journal: Demography ISSN: 0070-3370
Fig. 1 Google Scholar citations generated by comparative “big” social science data
Map 1 EAP II study populations
Available information in the five EAP II data sets
| | CMGPD-LN | CMGPD-SC | KMGPD-TS | NAC-SN | CTHRD |
|---|---|---|---|---|---|
| Data Set Information | | | | | |
| Period | 1749–1909 | 1866–1913 | 1678–1888 | 1716–1870 | 1906–1945 |
| Frequency of update | Triennial | Annual [a] | Triennial | Annual | Continuous |
| No. of observations | 1,513,357 | 1,346,826 | 275,042 | 118,879 | 481,383 |
| No. of individuals | 266,091 | 107,551 | 136,690 | 6,257 | 103,151 |
| Demographic Information | | | | | |
| Sex | Recorded | Recorded | Recorded | Recorded | Recorded |
| Age [b] | Yes | Inferred by birthdate | Yes | Yes | Inferred by birthdate |
| Timing of birth | Year-Month-Date-Hour [c] | Inferred [c] | Year | Year-Month | Year-Month-Date |
| Physical disability [d] | Males | Males | Males and females | Males and females | Males and females |
| Timing of death | Three-year period | Year | Three-year period | Year-Month | Year-Month-Date |
| Marriage | Recorded | Recorded | Recorded | Recorded | Recorded |
| Residential location | Village | Village | Village | Village | Village |
| Migration [e] | Tracked within the area | Entrance and exits | Entrance and exits | Entrance and exits | Entrance and exits |
| Timing of migration | Three-year period | Year | Year | Year | Year-Month-Date |
| Socioeconomic Information | | | | | |
| Relationship to household head | Yes [f] | Yes | Yes | Yes | Yes |
| Administrative status [g] | Regular/special | Metropolitan/rural/floating | Yangban/sangmin/nobi | Honbyakusho/mizunomi | No |
| Occupation | Males | Males | Males | Males | Household heads |
| Civil service | Males | Males | Males | No | No |
| Household landholding | No | Yes | No | Yes | Partial |
[a] Most Shuangcheng settler registers, those for metropolitan and rural bannermen, were compiled annually. A small set of registers for floating-labor bannermen was compiled triennially; these account for 10% of the observations in CMGPD-SC.
[b] Ages are reckoned in sui (Chinese), sai (Japanese), or se (Korean), the traditional East Asian way of counting age: a person is aged 1 sui/sai/se at birth and becomes one year older at each lunar new year.
[c] In Shuangcheng, year of birth is calculated from recorded age. In Liaoning, birthdate is recorded reliably only in the early registers.
[d] Information on physical disability and disease is not recorded systematically for the whole population and covers only certain limited types.
[e] In CMGPD-LN, individuals can be observed continuously before and after a legal migration. In the other data sets, information is available only for the period before out-migration, whether legal or illegal, so observation of such migrants ends once they leave, although their planned destinations are sometimes reported in their last records.
[f] From 1789 onward, CMGPD-LN records each individual's relationship to the household head. Before 1789, only the relationship to the head of the lineage is available.
[g] In the CMGPD-LN, these population categories are based on separate sets of population registers that reflect differences in social and political status and entitlement rights. In the CMGPD-SC, all three categories are rural residents: the metropolitan banner population consists of immigrants originally from Beijing and Rehe, who were eligible for the largest land allocations from the Shuangcheng government; the rural banner population consists of immigrants from Liaoning, who were officially allocated less land and were expected to work as tenants for metropolitan bannermen; and the floating banner population had no right to claim ownership of official land because they migrated to Shuangcheng on their own rather than at the government's command. In the KMGPD-TS, yangban refers to the high-status nobility, sangmin to middle-status commoners, and nobi to the low-status servile population. In NAC-SN, status is generally based on the land-tax system. In addition to the major categories of titled peasants (honbyakusho) and tenant peasants without landholding (mizunomi), there are other categories such as hereditary servants (nago), housing renters (tanagari), and those associated with Buddhist temples, Shinto shrines, and mountain ascetics (yamabushi/shugen). In CTHRD, the category is constructed by grouping the household head's specific occupation and the amount of household land tax.
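The sui/sai/se rule in the notes above, and the inference of birth year from recorded age described for Shuangcheng, reduce to simple arithmetic once years are expressed as lunar-calendar years. The following is a minimal sketch under that assumption; the function names `sui_age` and `birth_year_from_sui` are hypothetical and do not come from the data sets or their documentation.

```python
def sui_age(birth_lunar_year: int, observation_lunar_year: int) -> int:
    """Age in sui/sai/se: 1 at birth, plus 1 at each later lunar new year.

    Assumption: both arguments are lunar-calendar years given as integers.
    """
    if observation_lunar_year < birth_lunar_year:
        raise ValueError("observation year precedes birth year")
    return observation_lunar_year - birth_lunar_year + 1


def birth_year_from_sui(register_lunar_year: int, sui: int) -> int:
    """Invert sui_age: infer the lunar year of birth from a recorded age."""
    if sui < 1:
        raise ValueError("sui ages start at 1")
    return register_lunar_year - sui + 1


# A child born in lunar year 1800 is 1 sui in that year and 11 sui in 1810.
print(sui_age(1800, 1800))
print(sui_age(1800, 1810))
# A person recorded as 5 sui in an 1870 register was born in lunar year 1866.
print(birth_year_from_sui(1870, 5))
```

Converting between lunar years and Gregorian dates requires a table of lunar new year dates, which this sketch deliberately leaves out.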
Fig. 2 Proportion of observations linked to the subsequent registers
Individuals by number of years of observation
| Years Under Observation | CMGPD-LN Freq. | CMGPD-LN % | CMGPD-LN Cum. % | CMGPD-SC Freq. | CMGPD-SC % | CMGPD-SC Cum. % | KMGPD-TS Freq. | KMGPD-TS % | KMGPD-TS Cum. % | NAC-SN Freq. | NAC-SN % | NAC-SN Cum. % | CTHRD Freq. | CTHRD % | CTHRD Cum. % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 22+ | 115,948 | 43.57 | 43.57 | 40,824 | 37.77 | 37.79 | 9,845 | 7.21 | 7.21 | 2,167 | 34.69 | 34.60 | 24,321 | 23.57 | 23.58 |
| 19–21 | 11,738 | 4.41 | 47.99 | 6,435 | 5.95 | 43.74 | 1,856 | 1.36 | 8.57 | 244 | 3.90 | 38.50 | 6,138 | 5.95 | 29.53 |
| 16–18 | 7,768 | 2.92 | 50.91 | 8,151 | 7.55 | 51.29 | 3,980 | 2.91 | 11.48 | 300 | 4.80 | 43.30 | 5,610 | 5.44 | 34.97 |
| 13–15 | 8,140 | 3.06 | 53.96 | 7,086 | 6.57 | 57.86 | 3,724 | 2.72 | 14.20 | 349 | 5.59 | 48.89 | 4,181 | 4.05 | 39.02 |
| 10–12 | 8,979 | 3.37 | 57.34 | 8,915 | 8.25 | 66.11 | 5,100 | 3.73 | 17.93 | 363 | 5.81 | 54.70 | 4,248 | 4.12 | 43.14 |
| 7–9 | 8,390 | 3.15 | 60.49 | 7,793 | 7.21 | 73.32 | 7,661 | 5.60 | 23.53 | 365 | 5.83 | 60.53 | 4,870 | 4.72 | 47.86 |
| 4–6 | 38,861 | 14.60 | 75.10 | 10,530 | 9.75 | 83.07 | 6,632 | 4.85 | 28.38 | 469 | 7.50 | 68.03 | 6,358 | 6.16 | 54.02 |
| 2–3 | 24,727 | 9.29 | 84.39 | 10,550 | 9.77 | 92.84 | 26,322 | 19.26 | 47.64 | 1,163 | 18.59 | 86.62 | 14,520 | 14.08 | 68.10 |
| 1 | 41,540 | 15.61 | 100.00 | 7,736 | 7.16 | 100.00 | 71,570 | 52.36 | 100.00 | 837 | 13.38 | 100.00 | 32,905 | 31.90 | 100.00 |
| Total Individuals | 266,091 | | | 108,020 | | | 136,690 | | | 6,257 | | | 103,151 | | |
Number of individuals by linked previous generations
| Generations | CMGPD-LN Freq. | CMGPD-LN % | CMGPD-LN Cum. % | CMGPD-SC Freq. | CMGPD-SC % | CMGPD-SC Cum. % | KMGPD-TS Freq. | KMGPD-TS % | KMGPD-TS Cum. % | NAC-SN Freq. | NAC-SN % | NAC-SN Cum. % | CTHRD Freq. | CTHRD % | CTHRD Cum. % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6+ | 55,243 | 20.76 | 20.76 | 449 | 0.42 | 0.43 | 1,043 | 0.77 | 0.76 | 487 | 7.78 | 7.78 | | | |
| 5 | 40,673 | 15.29 | 36.05 | 922 | 0.85 | 1.28 | 1,457 | 1.07 | 1.83 | 414 | 6.62 | 14.40 | 353 | 0.34 | 0.34 |
| 4 | 44,923 | 16.88 | 52.93 | 10,396 | 9.62 | 10.90 | 4,090 | 2.99 | 4.82 | 564 | 9.01 | 23.41 | 6,865 | 6.66 | 7.00 |
| 3 | 45,206 | 16.99 | 69.92 | 32,135 | 29.75 | 40.65 | 10,012 | 7.32 | 12.14 | 832 | 13.30 | 36.71 | 26,939 | 26.12 | 33.11 |
| 2 | 41,345 | 15.54 | 85.46 | 37,368 | 34.59 | 75.24 | 27,443 | 20.08 | 32.22 | 1,075 | 17.18 | 53.89 | 31,314 | 30.36 | 63.47 |
| 1 | 38,701 | 14.54 | 100.00 | 26,750 | 24.76 | 100.00 | 92,645 | 67.78 | 100.00 | 2,885 | 46.11 | 100.00 | 37,680 | 36.53 | 100.00 |
| Total Individuals | 266,091 | | | 108,020 | | | 136,690 | | | 6,257 | | | 103,151 | | |
Fig. 3 Proportion of individuals without an exit annotation by reasons
Fig. 4 Observation pyramids of the CMGPD-LN, CMGPD-SC, CTHRD, KMGPD-TS, and NAC-SN
Fig. 5 Predicted probability of death by next year by age (left: female; right: male)
Fig. 6 Proportion of observations with identified spouse by age (left: female; right: male)
Characteristics of EAP II and major Western large-scale micro-level historical demographic data
| | Individual | Individual SES | Kinship/Relationship Between Individuals | Intergenerational Linkage | Household Composition Recorded Continuously | Family/ | Complete |
|---|---|---|---|---|---|---|---|
| CMGPD-LN | Yes | Yes | Yes | Yes | Yes | No | Yes |
| CMGPD-SC | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| CTHRD | Yes | Yes | Yes | Yes | Yes | No | Yes |
| KMGPD-TS | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| NAC-SN | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| BALSAC | Yes | Yes | Yes | Yes | No | No | Yes |
| IPUMS- | Yes | Yes | Yes | No | No | Yes | No |
| HSN | Yes | Yes | Yes | No | Yes | No | No |
| SEDD | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| UPDB | Yes | Yes | Yes | Yes | No | Yes | Yes |