Konrad Turek, Matthijs Kalmijn, Thomas Leopold.
Abstract
The Comparative Panel File (CPF) harmonizes the world's largest and longest-running household panel surveys from seven countries: Australia (HILDA), Germany (SOEP), United Kingdom (BHPS and UKHLS), South Korea (KLIPS), Russia (RLMS), Switzerland (SHP), and the United States (PSID). The project aims to support the social science community in the analysis of comparative life course data. The CPF builds on the Cross-National Equivalent File but offers a larger range of variables, larger and more recent samples, an easier and more flexible workflow, and an open science platform for development. The CPF is not a data product but open-source code that integrates individual and household panel data from all seven surveys into a harmonized three-level data structure. The CPF allows users to analyse individual trajectories, time trends, contextual effects, and country differences. The project is organized as an open science platform. The CPF version 1.0 contains 2.7 million observations from 360,000 respondents, covering the period from 1968 to 2019 and up to 40 panel waves per respondent. In this data brief, we present the background, design, and content of the CPF.
Year: 2021 PMID: 34054348 PMCID: PMC8152217 DOI: 10.1093/esr/jcab006
Source DB: PubMed Journal: Eur Sociol Rev ISSN: 0266-7215
Variables available in the CPF version 1.0
| Group of variables | Description | Main variables |
|---|---|---|
| Technical | Respondent identifiers, information about wave and interview, and other technical information | Country; Personal and household identification numbers; Wave’s number and year; Interview status; Year and month of interview; Sample identifiers |
| Demographic | Basic demographic characteristics | Gender; Age; Year of birth |
| Education | Education level is harmonized using the ISCED classification in four different versions with three, four, and five levels. For example, the three levels are (0–2) low, (3–4) medium, and (5–8) high. Variables also include years of education, participation in training, and self-assessment of qualifications | Education: 3/4/5 levels; Participation in training in the past 12 months; Work-education skill fit; Qualifications for job |
| Marital and relationship status | CPF distinguishes between formal marital status and partnership living-status, which also accounts for living with the partner. Additionally, it includes a less precise primary partnership status equivalent to the one used in CNEF. It also provides indicators for specific statuses (e.g., divorced) and for being never married | Formal marital status; Partnership living-status; Primary partnership status; Living with the partner; Never married; Widowed; Divorced; Separated |
| Number of children and household members | There are several children-related variables to account for differences between questionnaires in: (1) the definition of children, e.g., own-born, adopted, of other family members, any children; (2) the situation of children, e.g., living currently in the household, living elsewhere, children ever had; (3) the age of children, e.g., any age, below 18, and below 15 years old | Number of children in household (aged 0–15, 0–17); Number of children ever had; Has own children (yes/no); Number of people in household |
| Labour market situation and employment | An important goal of the CPF is to provide a comprehensive view of individuals' labour market situation. The variables cover the following areas: | |
| | Labour market situation: employed, unemployed, retired or disabled, in education, not active, employed but on leave. CPF also identifies maternity leave | Labour market situation (5/6 categories); Currently working (self-reported); Working in the previous year (based on reported working hours); Being on maternity leave; Never worked |
| | Level of employment: full- or part-time, number of working hours (several versions, including actual and contracted hours) | Full- or part-time work (based on working hours/self-reported); Number of working hours (per year, month, week, day); Work hours per week: contracted |
| | Occupation, classified according to the International Standard Classification of Occupations (ISCO). KLIPS and PSID use classifications other than ISCO; for these surveys, crosswalk algorithms were developed. ISCO levels 1 and 2 are harmonized for all countries, but if available, CPF provides a more detailed classification in versions ISCO-88 or ISCO-08 at 3- or 4-digit levels | Occupation: ISCO level 1 (1 digit, 10 categories); Occupation: ISCO level 2 (2 digits, 50+ categories); Additionally, ISCO-08/ISCO-88 with 3 or 4 digits; Supervisory position |
| | Characteristics of the employee’s organization | Industry: 3 major, 10 sub-major and 17 minor groups; Sector (public); Size of organization |
| | More precise and specific identification of actively unemployed, self-employed, entrepreneurs (with employees), and retirees. These indicators are built on information from several variables. For example, individuals are classified as retired when they are not working and meet any of the following criteria: (1) self-categorization as retired and age 50+; (2) receives old-age pension and age 50+; (3) age 65+ | Unemployed: actively looking for work; Self-employed; Entrepreneur (including or not including farmers); Retired fully; Receiving old-age pension |
| | Labour market experience measured as years of employment/work | Total labour market experience (total/full time/part time); Tenure with current employer |
| | Perception of job security: whether the respondent is worried about job security (in two versions) | Secure/insecure; Secure/insecure/hard to say |
| Incomes | Incomes of individuals and households. Depending on the origin data, information on individual income is included in several variables based on: source of income (total income from jobs and benefits, from all jobs, from the main job); type of income (gross, net); reference period for income (year, month, per hour). This approach results in multiple variables but provides clear definitions. For analytical purposes, users can combine particular variables using the nominal values or relative values (e.g., percentiles). CPF provides values as they are included in the source data, without any additional cleaning, imputation, conversion, or inflation adjustments. Values are in local currency. Depending on the type of monthly household income in the origin data, information is provided in two versions: before taxes and deductions (gross, pre) and after taxes and transfers (net, post). Some datasets provide a negative household income indicating a loss or debit (e.g., PSID since 1994). Values are in local currencies | Individual income (all types): year, net; month, net. Individual labour earnings (all jobs): year, gross; year, net; month, net; month, gross. Salary from the main job: year, net; year, gross; month, gross; month, net; per hour, gross. Household income (month): gross; net |
| Health and wellbeing | Self-rated health status is based on the standard 5-point scale. There are three versions of disability-related questions. The variable for chronic diseases is a working version: it is not fully harmonized and should be modified by users according to their specific conceptual framework (e.g., defining chronic conditions). CPF provides several dimensions of subjective wellbeing, which can be harmonized for at least several countries. We include two versions of each variable due to differences in original answer scales: a 5-point scale (1–5 range) and an 11-point scale (0–10 range). If required, the original values were rescaled | Self-rated health; Receiving disability pension; Disability: any type (physical, mental, or nervous condition); Disability: min. category 2 or greater than 30 per cent; Chronic diseases (yes/no); Satisfaction with: life, work, financial situation of household, individual income, family, health |
| Parental background | Parents’ education level is coded in 3- and 4-category variables, similarly to the respondent’s education level | Mother’s/father’s education: 3/4 levels |
| Socio-economic position | Socio-economic position scales are based on respondents’ work status and the ISCO code of their occupation | International Socio-Economic Index of occupational status (ISEI); Treiman's international prestige scale (SIOPS); German Magnitude Prestige Scale (MPS) |
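
Three of the rules stated in the table above are simple enough to sketch in code: the three-level education grouping, the hierarchical ISCO levels, and the retirement classification. The following is a minimal Python illustration of those rules as described here, not the CPF's own implementation. All function and parameter names are hypothetical, and the sketch assumes ISCO codes are stored as 4-digit strings, which may not match how a given survey stores them.

```python
from typing import Tuple


def isced_to_three_levels(isced_level: int) -> str:
    """Collapse an ISCED level (0-8) into the three groups named in the table."""
    if not 0 <= isced_level <= 8:
        raise ValueError(f"ISCED level out of range: {isced_level}")
    if isced_level <= 2:
        return "low"      # ISCED 0-2
    if isced_level <= 4:
        return "medium"   # ISCED 3-4
    return "high"         # ISCED 5-8


def isco_levels(isco_code: str) -> Tuple[str, str]:
    """Return the 1-digit and 2-digit groups of a 4-digit ISCO code.

    Assumption: the code is a zero-padded string such as "2310".
    """
    if len(isco_code) != 4 or not isco_code.isdigit():
        raise ValueError(f"Expected a 4-digit ISCO code, got: {isco_code!r}")
    return isco_code[0], isco_code[:2]


def is_retired(
    working: bool,
    age: int,
    says_retired: bool,
    receives_old_age_pension: bool,
) -> bool:
    """Apply the retirement rule from the table.

    A person counts as retired when not working and any one of these holds:
    self-categorized as retired and aged 50+, receiving an old-age pension
    and aged 50+, or aged 65+.
    """
    if working:
        return False
    return (
        (says_retired and age >= 50)
        or (receives_old_age_pension and age >= 50)
        or age >= 65
    )
```

For example, `is_retired(False, 55, True, False)` returns `True`, while the same person at age 45 would not be classified as retired. The CPF itself works from survey-specific source variables, so the inputs here stand in for values that would first have to be derived from each survey.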
Figure 1. Number of waves in which individuals participated: exact number by survey (left axis) and minimum number for the total sample (right axis)
Figure 2. Timeline of the data and number of observations by wave
Number of waves, observations, and respondents
| Country | Survey | First wave | No. of waves | Observations (N) | Observations (%) | Unique respondents (N) | Unique respondents (%) |
|---|---|---|---|---|---|---|---|
| [1] Australia | HILDA | 2001 | 18 | 257,418 | 9.6 | 30,576 | 8.5 |
| [2] Korea | KLIPS | 1998 | 21 | 257,495 | 9.6 | 23,535 | 6.5 |
| [3] United States | PSID | 1968 | 40 | 457,638 | 17.0 | 42,219 | 11.7 |
| [4] Russia | RLMS | 1994 | 23 | 274,914 | 10.2 | 44,559 | 12.4 |
| [5] Switzerland | SHP | 1999 | 20 | 146,765 | 5.4 | 21,900 | 6.1 |
| [6] Germany | SOEP | 1984 | 35 | 675,693 | 25.1 | 94,525 | 26.3 |
| [7] United Kingdom | BHPS/UKHLS | 1991 | 27 | 626,787 | 23.2 | 102,605 | 28.5 |
| Total | | | | 2,696,710 | 100 | 359,919 | 100 |
BHPS: 1991–2008, 18 waves and UKHLS: from 2009, 9 waves.
Figure 3. Structure of the CPF syntax
Figure 4. A step-by-step guide through using the CPF code
Figure 5. Distribution of birth cohorts (year of birth) by survey
Frequency by age groups and birth cohort
| Age group | 1920s and earlier | 1930s | 1940s | 1950s | 1960s | 1970s | 1980s | 1990s and later |
|---|---|---|---|---|---|---|---|---|
| Australia | | | | | | | | |
| 18–29 | | | | | | 7,511 | 30,531 | 20,235 |
| 30–39 | | | | | 9,802 | 23,432 | 12,898 | |
| 40–49 | | | | 9,781 | 26,255 | 11,800 | | |
| 50–59 | | | 7,233 | 22,881 | 12,034 | | | |
| 60–69 | | 4,685 | 17,340 | 9,738 | | | | |
| 70–79 | 3,597 | 10,427 | 6,716 | | | | | |
| 80+ | 7,227 | 3,295 | | | | | | |
| Korea | | | | | | | | |
| 18–29 | | | | | 293 | 15,300 | 21,128 | 10,119 |
| 30–39 | | | | 318 | 15,988 | 27,590 | 10,440 | |
| 40–49 | | | 204 | 14,560 | 24,494 | 12,833 | | |
| 50–59 | | 176 | 9,703 | 21,296 | 11,032 | | | |
| 60–69 | 87 | 7,479 | 15,622 | 9,231 | | | | |
| 70–79 | 3,317 | 10,989 | 6,871 | | | | | |
| 80+ | 4,963 | 3,462 | | | | | | |
| United States | | | | | | | | |
| 18–29 | | 269 | 15,509 | 36,014 | 27,610 | 14,140 | 12,825 | 3,211 |
| 30–39 | 366 | 10,497 | 22,926 | 40,908 | 21,534 | 14,896 | 6,934 | |
| 40–49 | 11,567 | 14,378 | 22,789 | 21,942 | 13,102 | 4,601 | | |
| 50–59 | 22,230 | 13,738 | 10,912 | 12,772 | 4,477 | | | |
| 60–69 | 27,206 | 6,540 | 6,483 | 3,878 | | | | |
| 70–79 | 19,253 | 2,924 | 1,413 | | | | | |
| 80+ | 9,157 | 637 | | | | | | |
| Russia | | | | | | | | |
| 18–29 | | | | | 1,864 | 14,424 | 31,611 | 14,526 |
| 30–39 | | | | 2,304 | 11,775 | 26,064 | 13,839 | |
| 40–49 | | | 1,831 | 13,137 | 21,304 | 12,164 | | |
| 50–59 | | 2,065 | 8,065 | 22,676 | 10,682 | | | |
| 60–69 | 1,537 | 9,306 | 13,367 | 10,046 | | | | |
| 70–79 | 5,800 | 12,393 | 5,225 | | | | | |
| 80+ | 5,924 | 2,985 | | | | | | |
| Switzerland | | | | | | | | |
| 18–29 | | | | | | 4,398 | 10,389 | 8,210 |
| 30–39 | | | | | 8,222 | 9,445 | 4,453 | |
| 40–49 | | | | 7,638 | 16,440 | 6,411 | | |
| 50–59 | | | 5,928 | 13,888 | 8,811 | | | |
| 60–69 | | 3,885 | 11,182 | 7,283 | | | | |
| 70–79 | 2,176 | 6,525 | 5,796 | | | | | |
| 80+ | 3,138 | 2,547 | | | | | | |
| Germany | | | | | | | | |
| 18–29 | | | | 3,367 | 27,133 | 31,543 | 39,541 | 29,239 |
| 30–39 | | | 3,344 | 21,400 | 38,970 | 43,479 | 23,469 | |
| 40–49 | | 3,563 | 19,398 | 33,950 | 54,142 | 27,963 | | |
| 50–59 | 2,376 | 17,570 | 27,350 | 40,066 | 24,461 | | | |
| 60–69 | 11,823 | 24,385 | 33,642 | 16,600 | | | | |
| 70–79 | 18,807 | 23,840 | 13,242 | | | | | |
| 80+ | 15,545 | 5,485 | | | | | | |
| United Kingdom | | | | | | | | |
| 18–29 | | | | | 7,187 | 25,143 | 50,972 | 34,755 |
| 30–39 | | | | 6,689 | 27,367 | 54,372 | 22,801 | |
| 40–49 | | | 6,595 | 22,559 | 61,806 | 27,506 | | |
| 50–59 | | 4,157 | 20,823 | 50,456 | 27,749 | | | |
| 60–69 | 3,928 | 14,453 | 46,103 | 22,066 | | | | |
| 70–79 | 14,884 | 28,190 | 17,335 | | | | | |
| 80+ | 20,544 | 8,347 | | | | | | |
Basic characteristics of the sample (column percentages)
| | Australia | Korea | United States | Russia | Switzerland | Germany | United Kingdom | Total |
|---|---|---|---|---|---|---|---|---|
| Gender | | | | | | | | |
| Male | 47.2 | 47.9 | 44.7 | 42.1 | 44.6 | 47.6 | 45.9 | 46.0 |
| Female | 52.8 | 52.1 | 55.3 | 57.9 | 55.4 | 52.4 | 54.1 | 54.0 |
| N | 257,418 | 257,495 | 457,637 | 274,916 | 146,765 | 675,693 | 626,166 | 2,696,090 |
| Education: 3 levels | | | | | | | | |
| (0–2) Low | 29.0 | 33.3 | 25.7 | 19.3 | 9.8 | 19.6 | 29.4 | 24.5 |
| (3–4) Medium | 38.1 | 48.5 | 56.1 | 53.3 | 57.3 | 57.0 | 37.3 | 49.3 |
| (5–8) High | 32.9 | 18.2 | 18.2 | 27.4 | 33.0 | 23.5 | 33.3 | 26.1 |
| N | 257,277 | 257,452 | 452,265 | 274,331 | 146,763 | 663,515 | 603,507 | 2,655,110 |
| Formal marital status | | | | | | | | |
| Married/registered | 50.4 | 66.3 | 70.3 | 54.4 | 58.5 | 61.4 | 57.5 | 60.6 |
| Never married | 35.2 | 21.2 | 12.1 | 24.4 | 25.6 | 23.2 | 27.9 | 23.6 |
| Widowed | 5.2 | 8.7 | 5.7 | 12.5 | 5.5 | 6.0 | 6.8 | 7.0 |
| Divorced | 6.3 | 3.0 | 7.9 | 8.3 | 9.0 | 7.0 | 6.0 | 6.7 |
| Separated | 2.9 | 0.7 | 4.0 | 0.4 | 1.5 | 2.4 | 1.8 | 2.2 |
| N | 257,374 | 257,441 | 457,607 | 274,454 | 146,761 | 669,286 | 626,412 | 2,689,335 |
| Employment status | | | | | | | | |
| Employed | 64.4 | 57.6 | 67.7 | 57.9 | 68.8 | 58.8 | 57.7 | 60.9 |
| Unemployed | 3.5 | 2.9 | 5.0 | 7.4 | 2.0 | 7.4 | 4.6 | 5.2 |
| Retired, disabled | 21.8 | 11.2 | 13.9 | 25.2 | 16.6 | 20.9 | 26.6 | 20.5 |
| Not active/home | 8.9 | 23.3 | 12.1 | 6.4 | 8.6 | 9.0 | 7.0 | 10.1 |
| In education | 1.5 | 5.0 | 1.3 | 3.1 | 4.0 | 3.9 | 4.2 | 3.4 |
| N | 257,418 | 257,493 | 425,355 | 274,893 | 146,765 | 675,685 | 626,126 | 2,663,735 |
| Occupation (ISCO level 1) of employed | | | | | | | | |
| [1] Managers | 12.1 | 1.5 | 11.1 | 6.7 | 9.2 | 5.8 | 14.1 | 9.2 |
| [2] Professionals | 20.8 | 11.1 | 14.6 | 17.6 | 21.3 | 16.7 | 14.3 | 16.1 |
| [3] Technicians | 16.1 | 8.9 | 15.0 | 17.3 | 25.4 | 21.8 | 14.4 | 16.9 |
| [4] Clerical support | 12.5 | 14.9 | 10.7 | 5.3 | 11.9 | 11.1 | 13.8 | 11.6 |
| [5] Services and sale | 12.9 | 21.0 | 13.5 | 16.9 | 12.4 | 11.3 | 17.8 | 14.8 |
| [6] Skilled agricult. | 2.9 | 7.3 | 0.5 | 0.4 | 3.4 | 1.3 | 1.1 | 1.9 |
| [7] Craft and related | 9.3 | 11.8 | 10.6 | 13.8 | 9.2 | 15.9 | 8.5 | 11.6 |
| [8] Plant and machine | 6.0 | 12.3 | 9.2 | 14.6 | 2.5 | 8.1 | 6.8 | 8.5 |
| [9] Elementary occup. | 7.4 | 10.9 | 13.8 | 7.4 | 4.6 | 7.6 | 9.1 | 9.2 |
| N | 165,720 | 146,852 | 301,351 | 157,866 | 99,086 | 377,193 | 354,208 | 1,602,276 |
Note: Missing values were removed. ‘Armed forces’ not shown in occupations due to low frequency.
Figure 6. The structure and tools of the CPF’s Open Science Framework