Hongbo Lin1, Xun Tang2, Peng Shen1, Dudan Zhang2, Jinguo Wu3, Jingyi Zhang3, Ping Lu3, Yaqin Si2, Pei Gao2.
Abstract
INTRODUCTION: Data based on electronic health records (EHRs) are rich with individual-level longitudinal measurement information and are becoming an increasingly common data source for clinical risk prediction worldwide. However, few EHR-based cohort studies are available in China. Harnessing EHRs for research requires a full understanding of data linkages, management, and data quality in large data sets, which presents unique analytical opportunities and challenges. The purpose of this study is to provide a framework to establish a uniquely integrated EHR database in China for scientific research.

METHODS AND ANALYSIS: The CHinese Electronic health Records Research in Yinzhou (CHERRY) Study will extract individual participant data within the regional health information system of an eastern coastal area of China to establish a longitudinal population-based ambispective cohort study for cardiovascular care and outcomes research. A total of 1 053 565 Chinese adults aged over 18 years were registered in the health information system in 2009, and there were 23 394 deaths from 1 January 2009 to 31 December 2015. The study will include information from multiple epidemiological surveys; EHRs for chronic disease management; and health administrative, clinical, laboratory, drug and electronic medical record (EMR) databases. Follow-up of fatal and non-fatal clinical events is achieved through records linkage to the regional system of disease surveillance, chronic disease management and EMRs (based on diagnostic codes from the International Classification of Diseases, tenth revision). The CHERRY Study will provide a unique platform and serve as a valuable big data resource for cardiovascular risk prediction and population management, for primary and secondary prevention of cardiovascular events in China.

ETHICS AND DISSEMINATION: The CHERRY Study was approved by the Peking University Institutional Review Board (IRB00001052-16011) in April 2016. Results of the study will be disseminated through published journal articles, conferences and seminar presentations, and on the study website (http://www.cherry-study.org).

© Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Keywords: Chinese; cardiovascular diseases; electronic health records
Year: 2018 PMID: 29440217 PMCID: PMC5829949 DOI: 10.1136/bmjopen-2017-019698
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692
Figure 1. Study location for the CHinese Electronic health Records Research in Yinzhou (CHERRY) study.
Figure 2. Data sources for establishing the CHinese Electronic health Records Research in Yinzhou (CHERRY) cohort. Notes: Although the main focus of the CHERRY Study is on adults, data sources for infants, children and pregnant women are included in the health information system, for example, birth weight from birth certificates. However, birth records are generally not available for adults who were already over 18 years old in 2009. Maternal exposures from antenatal examination records, that is, gestational hypertension or diabetes, are recorded in the system and can be potential risk factors for cardiovascular disease (CVD) prediction in women. However, a further ethics review process is required to extract maternal information in CHERRY.
List of core risk factors for cardiovascular disease (CVD) in the CHinese Electronic health Records Research in Yinzhou (CHERRY) Study
| | Population census and registered health insurance database | Health checks database | Disease surveillance and management database | Outpatient/inpatient EMR database (including laboratory testing) | Charge and claims database | Environmental monitoring database |
| Individual-level measurements | ||||||
| Date of entry of first registration | * | |||||
| Date of the measurements | † | † | † | † | ||
| Date of birth, sex and ethnic groups | * | † | † | † | ||
| Marital status, education, occupation and socioeconomic status (household income, living space, etc) | * | † | ||||
| Smoking and alcohol use (current/former/never; amount/duration, etc) | * | † | † | |||
| Physical activity | * | † | † | |||
| Weight, height, waist and hip circumference | * | † | † | |||
| History of hypertension and diabetes mellitus | * | † | † | † | ||
| Prior history of coronary heart disease (in particular myocardial infarction and angina), stroke, transient ischaemic attack (TIA), and peripheral vascular disease (PVD) | * | † | † | † | ||
| Family history of diseases | * | † | ||||
| Systolic and diastolic blood pressure | * | † | † | |||
| Metabolic factors (including fasting glucose, postload glucose and glycosylated haemoglobin) | † | † | † | |||
| Lipid profiles: total, high-density and low-density lipoprotein cholesterol; triglycerides (including information about fasting status at time blood samples were taken) | † | † | † | |||
| Blood urea nitrogen, creatinine, uric acid | † | † | † | |||
| Use of cardiovascular medications (including antihypertensive drugs, ‘statins’, fibrates) and other medications (eg, hypoglycaemic agents, hormone replacement therapy) | † | † | † | |||
| Inflammatory markers (including homocysteine, C reactive protein, fibrinogen, albumin, interleukin 6 and the leucocyte count) | † | † | ||||
| Haemostatic factors (including von Willebrand factor and fibrin D-dimer) | † | † | ||||
| Novel CVD-related markers (eg, NT-proBNP) | † | |||||
| ECG | † | † | ||||
| Cardiovascular imaging information (eg, progression of coronary artery calcium, carotid intima media thickness) | † | † | ||||
| Urine albumin to creatinine ratio (UACR) | † | † | † | |||
| Cost of outpatient or inpatient admission (including fees for diagnosis, prescription, laboratory test, surgery, etc) | † | † | ||||
| Environmental and ecological data | ||||||
| Date and region of surveillance | | | | | | † |
| Air temperature and precipitation | | | | | | † |
| Particles with aerodynamic diameter <2.5 µm (PM2.5) | | | | | | † |
| Heavy metal concentration in water (lead, cadmium, mercury and arsenic) | | | | | | † |
* indicates the database has only one record for the measurements.
† indicates the database has multiple records for the measurements.
The health checks database comprises the health checks for the new rural cooperative medical scheme, for elderly people, and for adults with hypertension and diabetes shown in figure 2.
EMR, electronic medical record; NT-proBNP, N-terminal pro B-type natriuretic peptide.
Definitions of major outcomes in the CHinese Electronic health Records Research in Yinzhou (CHERRY) Study
| Events of interest | ICD-10 code |
| Primary events of interest | |
| Death due to | |
| Ischaemic heart disease | I20-I25 |
| Cerebrovascular disease | I60-I69 |
| Major cardiovascular disease | I00-I78 |
| All-cause mortality | |
| Hospitalisation with main diagnosis of | |
| Myocardial infarction | I21, I22 |
| Stroke | I60, I61, I63 (excluding I63.6), I64, H34.1 |
| Congestive heart failure | I50 |
| Outpatient visit with main diagnosis of | |
| Hypertension* | I10, I11, I12, I13, I15 |
| Diabetes mellitus† | E10, E11, E13, E14 |
| Cardiovascular disease | I00-I99 |
| Secondary events of interest (fatal and non-fatal) | |
| ST-elevation myocardial infarction (STEMI) | I21.0, I21.1, I21.2, I21.3, I22.0, I22.1, I22.8 |
| Non-ST segment elevation myocardial infarction (NSTEMI) | I21.4 |
| Ischaemic stroke | I63 (excluding I63.6), I64, H34.1 |
| Haemorrhagic stroke | I60, I61 |
| Transient ischaemic attack | G45 (excluding G45.4), H34.0 |
| Atrial fibrillation | I48 |
| Aortic aneurysm/aortic dissection | I71 |
| Peripheral artery disease | I70.2, I73.9, I74.3, I74.4 |
*Hypertension is defined as any of the following: registration in the hypertension management system, self-reported history of hypertension during health checks, or hospital admission with a diagnosis of hypertension.
†Diabetes mellitus is defined as any of the following: registration in the diabetes management system, self-reported history of diabetes during health checks, or hospital admission with a diagnosis of diabetes.
The majority of ICD-10 codes were selected according to the published study: ‘The Cardiovascular Health in Ambulatory Care Research Team (CANHEART): using big data to measure and improve cardiovascular health and healthcare services.’ by Tu JV, et al, 2015, Circulation: Cardiovascular Quality and Outcomes, 8, p. 208. Copyright 2015 by the American Heart Association. Adapted with permission (License Number: 4252490297412). However, the codes for STEMI and NSTEMI were modified based on a study in China (reference 34).
ICD-10, International Classification of Diseases, tenth revision.
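The outcome definitions above combine included ICD-10 codes with explicit exclusions (for example, stroke includes I63 but excludes I63.6). A minimal sketch of how such rules could be applied to a diagnosis code is given below. It is illustrative only and is not the CHERRY Study's own implementation: the `OUTCOMES` dictionary, the `classify` function, and the choice of prefix matching are assumptions, and only a few rows of the table are encoded.

```python
# Hypothetical encoding of a few rows of the outcome table.
# Names and the prefix-matching rule are assumptions for illustration.
OUTCOMES = {
    "myocardial_infarction": {"include": ["I21", "I22"], "exclude": []},
    "stroke": {
        "include": ["I60", "I61", "I63", "I64", "H34.1"],
        "exclude": ["I63.6"],
    },
    "congestive_heart_failure": {"include": ["I50"], "exclude": []},
    "transient_ischaemic_attack": {
        "include": ["G45", "H34.0"],
        "exclude": ["G45.4"],
    },
}


def classify(code: str) -> list[str]:
    """Return the names of all encoded outcomes that a diagnosis code matches."""
    code = code.strip().upper()
    hits = []
    for name, rule in OUTCOMES.items():
        # A code matches when it starts with an included prefix...
        included = any(code.startswith(p) for p in rule["include"])
        # ...and does not start with any excluded prefix.
        excluded = any(code.startswith(p) for p in rule["exclude"])
        if included and not excluded:
            hits.append(name)
    return hits


if __name__ == "__main__":
    for example in ["I63.0", "I63.6", "I21.4", "G45.4", "H34.0"]:
        print(example, classify(example))
```

Prefix matching assumes codes are stored in dotted form (such as `I63.6`); a real extraction would need to check how the source databases format their codes.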
Characteristics of participants in the CHinese Electronic health Records Research in Yinzhou (CHERRY) Study, by sex
| | Men | | Women | | Overall | |
| | n | % having at least one measurement | n | % having at least one measurement | n=1 053 565 | % having at least one measurement |
| Age at first registration (years) | 40.17±15.17 | 100% | 39.24±15.56 | 100% | 39.69±15.38 | 100% |
| Region (n (%)) | | 98.93% | | 98.45% | | 98.68% |
| Rural | 354 524 (69.99%) | | 364 120 (68.30%) | | 718 644 (69.12%) | |
| Urban | 152 038 (30.01%) | | 169 017 (31.70%) | | 321 055 (30.88%) | |
| Education (n (%)) | | 79.05% | | 79.10% | | 79.07% |
| Junior high school or lower | 324 438 (80.16%) | | 352 534 (82.30%) | | 676 972 (81.26%) | |
| Senior high school or higher | 80 307 (19.84%) | | 75 801 (17.70%) | | 156 108 (18.74%) | |
| Smoking status (n (%)) | | 88.79% | | 85.60% | | 87.15% |
| Never smoker | 271 870 (59.80%) | | 440 834 (95.11%) | | 712 704 (77.62%) | |
| Former smoker | 23 099 (5.08%) | | 2828 (0.61%) | | 25 927 (2.82%) | |
| Current smoker | 159 664 (35.12%) | | 19 861 (4.28%) | | 179 525 (19.55%) | |
| Alcohol use (n (%)) | | 88.74% | | 85.57% | | 87.11% |
| ≥3 days per week | 87 064 (19.16%) | | 3608 (0.78%) | | 90 672 (9.88%) | |
| <3 days per week | 43 811 (9.64%) | | 3713 (0.80%) | | 47 524 (5.18%) | |
| Never drinker | 323 518 (71.20%) | | 456 059 (98.42%) | | 779 577 (84.94%) | |
| Body mass index (BMI) (kg/m2) | 22.71±2.32 | 86.97% | 22.25±2.57 | 84.06% | 22.47±2.46 | 85.47% |
| Systolic blood pressure (mm Hg) | 128.09±14.56 | 41.37% | 126.16±15.11 | 40.53% | 127.11±14.87 | 40.94% |
| Diastolic blood pressure (mm Hg) | 79.89±9.76 | 41.35% | 78.13±10.02 | 40.52% | 79.00±9.93 | 40.93% |
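The two kinds of percentage in this table can be checked against each other. Using the overall region counts as a worked example, the figures appear consistent with the following reading: "% having at least one measurement" is the number of participants with a recorded value divided by the full cohort, while the percentage beside each category uses only participants with a recorded value as the denominator. The sketch below rests on that inferred reading; the function names are hypothetical and the source does not describe its calculation.

```python
# Worked check of the overall "Region" rows, using counts from the table.
# The denominators used here are an inference from the published figures.
COHORT_TOTAL = 1_053_565
REGION_COUNTS = {"rural": 718_644, "urban": 321_055}


def coverage_pct(counts: dict[str, int], total: int) -> float:
    """Share of the full cohort with at least one recorded value."""
    return 100 * sum(counts.values()) / total


def category_pct(counts: dict[str, int], category: str) -> float:
    """Share of one category among participants with a recorded value."""
    return 100 * counts[category] / sum(counts.values())


if __name__ == "__main__":
    print(f"coverage: {coverage_pct(REGION_COUNTS, COHORT_TOTAL):.2f}%")  # 98.68%
    print(f"rural:    {category_pct(REGION_COUNTS, 'rural'):.2f}%")       # 69.12%
    print(f"urban:    {category_pct(REGION_COUNTS, 'urban'):.2f}%")       # 30.88%
```

Both results reproduce the published values (98.68% coverage; 69.12% rural and 30.88% urban), which supports this interpretation of the columns.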