
Existing Data Sources for Clinical Epidemiology: Database of the National Hospital Organization in Japan.

Natsuko Kanazawa, Takuaki Tani, Shinobu Imai, Hiromasa Horiguchi, Kiyohide Fushimi, Norihiko Inoue.

Abstract

This review introduces the National Hospital Organization (NHO) database in Japan. The NHO maintains two databases that collect data from its 140 hospitals. The National Hospital Organization Clinical Data Archives (NCDA) has collected clinical information in real time from electronic medical records since January 2016, and the Medical Information Analysis (MIA) databank has collected daily insurance claims data since April 2010. Together, they cover more than 8 million patients throughout Japan. The database contains patient profiles, hospital admission and discharge records, diagnoses with ICD-10 codes, text data from medical charts, daily health insurance claims for medical procedures, medications, and surgeries, and vital signs and laboratory data, among other items. It includes a wide variety of diseases and settings: acute, chronic, and intractable diseases; emergency medical services; disaster medicine; response to emerging infectious disease outbreaks; policy-based medical care in areas such as psychiatry, tuberculosis, and muscular dystrophy; and health systems in sparsely populated non-urban areas. For several common diseases, the age distribution in the database is representative when compared with the Patient Survey 2017 by the Ministry of Health, Labour and Welfare. Researchers interested in using the NHO database can contact the NHO database division (700-dbproject@mail.hosp.go.jp) for more information.
© 2022 Kanazawa et al.

Keywords:  DPC; National Hospital Organization in Japan; big data; database; diagnosis procedure combination; linkage; real-world data; validation

Year:  2022        PMID: 35615723      PMCID: PMC9126156          DOI: 10.2147/CLEP.S359072

Source DB:  PubMed          Journal:  Clin Epidemiol        ISSN: 1179-1349            Impact factor:   5.814


Introduction

National Hospital Organization

The National Hospital Organization (NHO) operates 140 hospitals throughout Japan with approximately 52,000 hospital beds (Figure 1). The NHO provides ordinary acute care and emergency medical services, as well as disaster medicine, response to emerging infectious disease outbreaks, policy-based medicine such as psychiatry, tuberculosis, and muscular dystrophy care, and health systems in sparsely populated non-urban areas. NHO hospitals function as core hospitals that provide primary to advanced medical care in each region, whether urban or suburban. The NHO also treats intractable diseases from which private hospitals find it difficult to make a profit, thus serving as a safeguard for the health of residents in each region.
Figure 1

Distribution of the 140 National Hospital Organization hospitals across Japan. The locations of several representative cities with populations of more than one million are also indicated on the map.


National Hospital Organization Clinical Data Archives (NCDA) and Medical Information Analysis (MIA) Data Bank

The NHO Headquarters has established and maintains two databases. The first is the National Hospital Organization Clinical Data Archives (NCDA), which collects real-time clinical information from the electronic medical records of NHO hospitals. The NCDA has collected information since January 2016, including information on clinical settings obtained from Standardized Structured Medical Information eXchange (SS-MIX) standardized storage.1 The second is the Medical Information Analysis (MIA) data bank, which collects daily insurance claims information. The MIA data bank started in April 2010 and includes data from the Diagnosis Procedure Combination/Per-Diem Payment System (DPC/PDPS), the Japanese case-mix payment system comparable to diagnosis-related groups (DRGs).2–7 DPC data have been used in hundreds of clinical epidemiological studies in Japan across a wide range of fields, such as internal medicine,8–12 surgery,13 emergency medicine,14–16 psychiatry,17 and geriatrics.18 The two databases can be linked seamlessly at the patient level and are used not only for clinical epidemiological research but also for other purposes, such as clinical quality indicators and analyses of hospital care or financial management. During a COVID-19 or influenza pandemic, the databases are used to provide timely information to the Ministry of Health, Labour and Welfare (MHLW) and the National Institute of Infectious Diseases.19

Data Coverage

As of March 2021, the NHO database held data from 140 hospitals on more than 8 million patients, with up to 10 years of follow-up (Table 1). Data quality checks and database construction are performed by the information technology department within the NHO. The multicentre, long-term data support detailed epidemiological investigation, which is especially important for the study of rare exposures and diseases. Patients who visited NHO hospitals consistently for more than 1 year have a median follow-up of 4.17 years and a maximum follow-up of 10 years, allowing for the study of diseases with long latency periods and of long-term outcomes. Japan has a universal health insurance system in which all residents, both Japanese and foreign nationals, are required to enroll. This provides equal access to medical care regardless of region or income, so the NHO database includes a broad range of patients regardless of their social background.
Table 1

Patient Demographics in the NHO Database

| | NHO Database, 2010–2019, N (%) | NHO Database, 2017, N (%) | Patient Survey 2017†, N (%) |
|---|---|---|---|
| Total patients | 8,341,977 | 2,098,183 | 63,727,000 |
| Men | 3,944,721 (47.3%) | 973,923 (46.4%) | 26,645,000 (41.9%) |
| Women | 4,397,256 (52.7%) | 1,124,260 (53.6%) | 36,918,000 (58.1%) |
| Age in 2017 | | | |
| <19 | | 292,588 (13.9%) | 4,960,000 (7.8%) |
| 19–39 | | 225,913 (10.8%) | 5,677,000 (8.9%) |
| 40–74 | | 1,006,392 (48.0%) | 32,221,000 (50.6%) |
| 75 and more | | 573,290 (27.3%) | 20,869,000 (32.7%) |
| Major diseases | | | |
| Ischemic heart diseases | 581,988 (7.0%) | 155,459 (7.4%) | 722,000 (1.1%) |
| Cerebrovascular diseases | 543,665 (6.5%) | 153,961 (7.3%) | 1,115,000 (1.7%) |
| Malignant neoplasms | 925,934 (11.1%) | 317,025 (15.1%) | 1,782,000 (2.8%) |
| Diabetes mellitus | 939,548 (11.3%) | 303,010 (14.4%) | 3,289,000 (5.2%) |
| Dementia | 257,454 (3.1%) | 57,551 (2.7%) | 704,000 (1.1%) |

Notes: †The Patient Survey 2017 by the Ministry of Health, Labour and Welfare in Japan publishes the number of patients rounded to the nearest thousand.

Data Governance, Practice, and Patient Confidentiality

The NHO strives to operate within Japanese laws and guidelines on the protection of confidentiality. To protect patient confidentiality, patient identifiers are stored in a secure room. Engineers extract the data in the secure room, anonymize them, and hand them over to the researchers. Researchers cannot access the database directly. These processes ensure the anonymity and security of the information.

Funding Sources of NHO Database

The NHO database is developed and operated by the National Hospital Organization with its own funds.

Ethics for Research Using the NHO Database

Researchers who wish to use and analyze individual patient-level data in the NHO database must submit a research proposal for review by the Institutional Review Board (IRB) at the NHO Headquarters or at each hospital. The IRB includes external experts and covers a wide range of disciplines, including law, humanities, social sciences, statistics, and medicine, to ensure a multidisciplinary review of ethical validity. Although individual informed consent is not required because the data are anonymized, the research plan is published on the NHO website and patients are guaranteed the opportunity to opt out. In addition, the detailed dataset definition for the research must be reviewed by the Review Board for Data Utilization at the NHO Headquarters to confirm that the data content is appropriate for the study protocol.

Details of NHO Database

Data Types and Contents

The NHO database contains timestamped insurance claims and clinical information covering many types of facility-level and patient-level characteristics (Table 2): diagnoses, medical procedures, medications, laboratory tests, culture tests, hospital admission and discharge, severity of diseases, etc. The most significant feature of the NHO database is that information entered in the electronic medical record is collected in real time, allowing for analysis of the latest hospital and patient information.
Table 2

Summary of Data Collected in the NHO Database

| Category | Data Element |
|---|---|
| Facility information | Number of beds, Address, Facility type, etc. |
| Patient profile | Gender, Birth date, Height, Weight, Activity of daily living, Address, Postal code, etc. |
| Hospitalization | Date of hospital admission and discharge, Outcome, Admission path, Discharge destination, Ambulance transport, etc. |
| Diagnosis | Onset date, disease name and codes based on ICD-10 and Japanese insurance claim: Main disease, Disease triggering admission, Comorbidity at hospital admission, Complications during hospitalization, Most and Second most resource-using disease |
| Severity of diseases | JCS, mRS, SOFA score, A-DROP scoring system†, Killip classification, etc. |
| Medical practice | Procedures, Medications, Surgeries, Rehabilitation, etc., with date of creation and performance, quantity, cost, duration, etc. |
| Meal | Date and content |
| Examination | Laboratory test, Blood pressure, Heart rate, Body temperature, Respiratory rate, Oxygen saturation, etc. |
| Medical chart | Chart by physician and nurse, discharge summary |

Notes: †A-DROP scoring system is a modified version of CURB-65 developed by the Japanese Respiratory Society.38,39

Abbreviations: ICD-10, International Statistical Classification of Diseases and Related Health Problems 10th Revision; JCS, Japan Coma Scale; mRS, modified Rankin Scale; SOFA, Sequential Organ Failure Assessment score.

Diagnoses, coded with the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10) and the Japanese claims codes, are accompanied by a record of the disease onset date. Medical procedures defined by the Japanese reimbursement system, such as ICU admission and invasive ventilation, are recorded on a daily basis, so the start date, end date, and duration of each medical practice can be derived. Laboratory tests and culture tests, in particular, are recorded with timestamps for test ordering, specimen submission, test performance, and result reporting. Patient outcomes include a wealth of information, such as death, length of stay, cost, and discharge destination. The NHO database does not contain data that are collected for specific research purposes, such as self-reported survey data or genotype data.
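As an illustration of how start dates, end dates, and durations might be derived from daily procedure records, the following is a minimal sketch. The row layout, field names, and values are hypothetical and do not reflect the actual NHO schema; the rule that a gap of more than one day starts a new episode is also an assumption made for this example.

```python
from datetime import date, timedelta
from itertools import groupby

# Hypothetical daily claim rows: (patient_id, procedure, service_date).
# These names and values are illustrative only, not the NHO schema.
daily_claims = [
    ("P001", "invasive_ventilation", date(2019, 4, 2)),
    ("P001", "invasive_ventilation", date(2019, 4, 3)),
    ("P001", "invasive_ventilation", date(2019, 4, 4)),
    ("P001", "invasive_ventilation", date(2019, 4, 9)),
    ("P002", "icu_admission", date(2019, 5, 1)),
    ("P002", "icu_admission", date(2019, 5, 2)),
]


def summarize_episodes(rows):
    """Collapse daily rows into consecutive-day episodes.

    Returns tuples of (patient_id, procedure, start, end, duration_days).
    Assumption: a gap of more than one day starts a new episode.
    """
    episodes = []
    key = lambda row: (row[0], row[1])
    for (patient, procedure), group in groupby(sorted(rows), key=key):
        days = sorted({row[2] for row in group})
        start = prev = days[0]
        for day in days[1:]:
            if day - prev > timedelta(days=1):
                episodes.append((patient, procedure, start, prev, (prev - start).days + 1))
                start = day
            prev = day
        episodes.append((patient, procedure, start, prev, (prev - start).days + 1))
    return episodes


for episode in summarize_episodes(daily_claims):
    print(episode)
```

In this toy input, the first patient has two ventilation episodes (three days and one day) because of the gap between 4 April and 9 April; a real analysis would need a clinically justified rule for merging or splitting episodes.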

Data Linkage

The NHO data are available at both the patient and hospital levels, depending on the purpose. Because the patient data are anonymized, patient-level linkage with external databases is not possible. However, hospital-level linkage is possible, and linkages have been established with public data such as the Patient Survey and the Survey of Medical Institutions by the Ministry of Health, Labour and Welfare, and with demographic statistics. Location information at the patient or hospital level, including zip codes and routes, can also be linked to broad areas on the map that do not identify individual patients.
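Hospital-level linkage of this kind can be sketched as an aggregate-then-join step: anonymized patient-level rows are first summarized per hospital, and the summary is then joined to an external hospital-level table. The example below is only a sketch; the hospital identifiers, field names, and external attributes are hypothetical and are not taken from the NHO database or any public survey.

```python
from collections import Counter

# Hypothetical anonymized patient-level rows: (hospital_id, diagnosis_group).
patient_rows = [
    ("H01", "diabetes"), ("H01", "diabetes"), ("H01", "stroke"),
    ("H02", "diabetes"), ("H02", "stroke"), ("H02", "stroke"),
]

# Hypothetical external hospital-level table keyed by the same hospital_id.
external_hospital_data = {
    "H01": {"beds": 400, "region": "urban"},
    "H02": {"beds": 150, "region": "non-urban"},
}


def link_at_hospital_level(rows, external):
    """Aggregate patient rows per hospital, then join external attributes.

    Only hospital-level counts leave the aggregation step, so no
    patient-level record is matched against the external source.
    """
    record_counts = Counter(hospital for hospital, _ in rows)
    linked = {}
    for hospital, n_records in record_counts.items():
        if hospital in external:
            linked[hospital] = {"n_records": n_records, **external[hospital]}
    return linked


linked = link_at_hospital_level(patient_rows, external_hospital_data)
print(linked)
```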

Frequency of Data Collection

Data are transmitted from each hospital to the NHO Headquarters. Claims information is collected monthly, and electronic medical record data are collected in real time, via the NHO’s secure internal network. A specialized information department compiles the collected data and makes them available to researchers.

Data Quality

The NHO has specialized engineers who constantly check the data registration status; if omissions or delays occur, they contact the hospital personnel and assist the hospital until the data are registered correctly. In addition, because the claims data are submitted to the insurer or the government, they pass through an error-checking system designed to catch input errors.

Data Resource Use

Validation Study

The NHO database is an invaluable source for clinical epidemiology and observational studies. However, as with data from other hospitals in Japan, the NHO database consists of billing and clinical information obtained from routine practice and was not originally intended for use in clinical research. Misclassification bias can occur when insensitive or nonspecific tests, or incorrect variables for diagnostic criteria, are used. Because the information in the NHO database can be linked to the medical charts in NHO hospitals at the patient level, several validation studies have been conducted, and new studies are ongoing. In a validation study that compared diagnoses, procedures, and laboratory results in the NHO database with chart reviews, the primary diagnosis had a sensitivity of 78.9% and a specificity of 93.2%.20 Imai et al assessed the validity of detecting previously resolved hepatitis B virus infection in the NHO database with chart review as the gold standard and reported a positive predictive value (PPV) of 85.8%.21 In addition, a validation study compared procedure-based and diagnosis-based identification of severe sepsis and disseminated intravascular coagulation in administrative data.22 This study showed that the procedure-based algorithm is more sensitive than the diagnosis-based algorithm and has the potential to improve disease identification from the NHO database. Most recently, a regression model-based event detection algorithm was developed and validated for identifying postsurgical infections from routinely collected DPC data in the NHO database; it achieved a C-statistic of 0.885, with a sensitivity of 92% and specificity of 72% at one cut-off score and a sensitivity of 75% and specificity of 91% at another.23
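The validation measures quoted above follow from a two-by-two comparison of the database classification against chart review. The sketch below shows the standard definitions and how moving a cut-off score trades sensitivity against specificity. All scores, labels, and counts are hypothetical toy values and are not data from the cited studies.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def validation_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and PPV against a chart-review gold standard."""
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true cases that were detected
        "specificity": _ratio(tn, tn + fp),  # non-cases correctly left unflagged
        "ppv": _ratio(tp, tp + fp),          # flagged records that are true cases
    }


def metrics_at_cutoff(scores, chart_labels, cutoff):
    """Flag records whose score is at or above the cut-off, then validate."""
    tp = fp = fn = tn = 0
    for score, has_event in zip(scores, chart_labels):
        flagged = score >= cutoff
        if flagged and has_event:
            tp += 1
        elif flagged:
            fp += 1
        elif has_event:
            fn += 1
        else:
            tn += 1
    return validation_metrics(tp, fp, fn, tn)


# Hypothetical model scores and chart-review labels (True = event confirmed).
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
chart_labels = [False, False, True, False, True, True]

low_cutoff = metrics_at_cutoff(scores, chart_labels, cutoff=0.35)
high_cutoff = metrics_at_cutoff(scores, chart_labels, cutoff=0.65)
print(low_cutoff)
print(high_cutoff)
```

With these toy values, the lower cut-off detects every confirmed event at the cost of a false positive, while the higher cut-off removes the false positive but misses one event, which mirrors the trade-off between the two cut-off scores reported for the postsurgical infection model.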

Published Works

The NHO database has been used for many other clinical epidemiology studies as well as validation studies: predictors of survival in the elderly after cardiopulmonary resuscitation,24 risk stratification for physical morbidity associated with atypical antipsychotic treatment in Parkinson’s disease,25 activities of daily living after hip fracture surgery in elderly patients with spinal anesthesia,26 and a comparison of walking ability between peripheral nerve block and local infiltration analgesia after knee joint replacement surgery.27

Equalization of Medical Practices

To maintain and improve the quality of daily clinical practice equally across all 140 NHO hospitals, the NHO Headquarters provides clinical indicators for various processes and outcomes, calculated from the NHO database on the basis of evidence-based guidelines.28,29 For example, some medical practices, such as the proper use of anticancer drugs or anticoagulation, are supported by evidence and adopted in the guidelines. If the recommendations are not followed in daily clinical practice and treatment is instead carried out at each clinician’s discretion, evidence-based evaluation is not possible. In that case, even if database records are used in observational studies, the results may be difficult to compare with previously published evidence. When a hospital’s clinical indicator value is an outlier, the NHO Headquarters provides supportive information to help the hospital identify the reasons.
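The text does not state how outlying indicator values are identified. Purely as an illustration, the sketch below computes a guideline-adherence rate per hospital and flags values outside Tukey's fences (1.5 times the interquartile range); this rule is an assumption for the example, not the NHO's documented method, and the hospital identifiers and counts are hypothetical.

```python
from statistics import quantiles

# Hypothetical per-hospital counts: (patients treated per guideline, eligible patients).
indicator_counts = {
    "H1": (90, 100),
    "H2": (88, 100),
    "H3": (92, 100),
    "H4": (91, 100),
    "H5": (89, 100),
    "H6": (55, 100),
}


def flag_outlier_hospitals(counts, k=1.5):
    """Return hospitals whose indicator rate lies outside Tukey's fences.

    Assumption: the IQR rule is used only for illustration; any other
    outlier definition could be substituted.
    """
    rates = {hospital: met / eligible for hospital, (met, eligible) in counts.items()}
    q1, _, q3 = quantiles(list(rates.values()), n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return sorted(h for h, rate in rates.items() if rate < lower or rate > upper)


print(flag_outlier_hospitals(indicator_counts))
```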

Strengths and Weaknesses of NHO Database

Strengths

Strengths of the NHO database as a research resource include its coverage of basic patient attributes and diagnoses, medical practices, and detailed laboratory and culture tests, and its multicentre data collection from 140 hospitals on a daily basis.

Breadth of data

As a national hospital network, the NHO provides a uniform medical system throughout Japan. The NHO database covers a wide range of diseases, from general diseases to policy-based medicine, intractable diseases, and rare diseases, including muscular dystrophy, severe multiple disabilities in children, intractable neurological diseases, tuberculosis, and AIDS. In addition, electronic medical record information is linked to insurance claims information, which can be used for research on treatment effects and costs. It is one of the few large, continuous databases that can be used for research at various levels: individual, hospital, regional, and national.

Representativeness and focus of NHO database

As shown in Table 1 and Figure 2, compared with the Patient Survey 2017,30 the NHO database is representative in terms of age distribution but focuses on acute care and severe diseases among the five major diseases defined by the Ministry of Health, Labour and Welfare: ischemic heart diseases (ICD-10 codes I20–I25), cerebrovascular diseases (I60–I69), malignant neoplasms (C00–C97), diabetes mellitus (E10–E14), and dementia, including vascular dementia (F01) and Alzheimer’s disease (F03). When the patient numbers and proportions for the five major diseases were stratified by age, the NHO database and the Patient Survey 2017 showed very similar age distributions (Figure 2). However, the proportion of patients with each disease in 2017 was much higher in the NHO database than in the Patient Survey (Table 1): ischemic heart diseases (NHO, 7.4% vs Patient Survey, 1.1%), cerebrovascular diseases (7.3% vs 1.7%), malignant neoplasms (15.1% vs 2.8%), diabetes mellitus (14.4% vs 5.2%), and dementia (2.7% vs 1.1%). This difference arises because the Patient Survey covers clinics and various types of hospitals throughout Japan and therefore includes many minor illnesses that do not require hospitalization, such as the common cold. The comparison thus shows that the NHO hospitals focus on acute and severe illness. For example, the NHO’s percentage of cancer patients is far greater than that in the Patient Survey because 35 of the NHO’s 140 hospitals (including 3 cancer centers) coordinate cancer care in their regions. The NHO also provides medical care for intractable diseases and treats 68.8% of the muscular dystrophy patients (ICD-10 code G710) who visit hospitals and clinics, as estimated from the Patient Survey 2017.30
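The disease proportions compared above can be reproduced from the counts in Table 1 by dividing each disease count by the corresponding total number of patients. The short sketch below does this arithmetic; the counts are copied from Table 1, while the variable and function names are chosen only for this example.

```python
# Totals and disease counts for 2017, copied from Table 1.
nho_2017_total = 2_098_183
patient_survey_2017_total = 63_727_000

# disease: (NHO database 2017 count, Patient Survey 2017 count)
disease_counts = {
    "Ischemic heart diseases": (155_459, 722_000),
    "Cerebrovascular diseases": (153_961, 1_115_000),
    "Malignant neoplasms": (317_025, 1_782_000),
    "Diabetes mellitus": (303_010, 3_289_000),
    "Dementia": (57_551, 704_000),
}


def percent(count, total):
    """Share of all patients, as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


proportions = {
    disease: (percent(nho, nho_2017_total), percent(survey, patient_survey_2017_total))
    for disease, (nho, survey) in disease_counts.items()
}

for disease, (nho_pct, survey_pct) in proportions.items():
    print(f"{disease}: NHO {nho_pct}% vs Patient Survey {survey_pct}%")
```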
Figure 2

Comparison of age distribution of patients between Patient Survey 2017 and NHO database in 2017 among the five major diseases: (A) Ischemic heart diseases, (B) Cerebrovascular diseases, (C) Malignant neoplasms, (D) Diabetes mellitus, (E) Dementia.


Weaknesses

Missing Data

In the claims data, few values are missing because medical fee billings are carefully checked. However, the claims data sometimes contain erroneous entries for clinical information such as patient profiles or patient status, and codes indicating “unknown” are sometimes selected. Our previous validation study examined the validity of diagnoses, procedures, and laboratory data in the NHO database.20 The primary diagnoses had a sensitivity of 78.9% and a specificity of 93.2%. Among 10 common procedures, specificity exceeded 90% for nine and sensitivity exceeded 90% for six.20 Although the validation study showed 95% agreement between the NHO database and chart reviews,20 vital data may still be incorrectly entered or omitted. When analyzing the data, the proportions and patterns of missing values should be examined carefully before any imputation is performed.
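A first look at the proportions and patterns of missing values could proceed as in the sketch below, which treats both absent values and "unknown" codes as missing. The variable names, the set of "unknown" codes, and the records are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

# Hypothetical codes that stand for "unknown" in coded fields.
UNKNOWN_CODES = {"unknown"}

# Hypothetical patient records; None means the value was never entered.
records = [
    {"height_cm": 162.0, "weight_kg": 55.0, "adl_score": "unknown"},
    {"height_cm": None, "weight_kg": None, "adl_score": "85"},
    {"height_cm": 170.5, "weight_kg": None, "adl_score": "100"},
    {"height_cm": 158.0, "weight_kg": 49.5, "adl_score": "unknown"},
]


def is_missing(value):
    """Treat absent values and 'unknown' codes alike as missing."""
    return value is None or value in UNKNOWN_CODES


def missing_proportions(rows):
    """Proportion of missing values for each variable."""
    variables = list(rows[0])
    return {
        var: sum(is_missing(row[var]) for row in rows) / len(rows)
        for var in variables
    }


def missing_patterns(rows):
    """Count how often each combination of missing variables occurs."""
    variables = list(rows[0])
    return Counter(
        tuple(var for var in variables if is_missing(row[var])) for row in rows
    )


print(missing_proportions(records))
print(missing_patterns(records))
```

Variables that tend to be missing together, such as height and weight in the second toy record, suggest that missingness is related to how the data were recorded, which matters when choosing an imputation approach.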

Facility-Based Data

The data are created in each hospital; therefore, information from outside the hospitals is not recorded. Social aspects related to health, such as social support, family structure, use of over-the-counter medications, income, and education level, are not recorded. Some patient groups are missing, such as uninsured patients and those who receive self-pay treatment or care not covered by insurance. However, the proportion of uninsured patients is very small under Japan’s universal health insurance system.

General Issues of EHR and Challenges in NHO

Epidemiological studies using electronic health record (EHR) data generally have several weaknesses and overarching challenges that are not unique to the NHO database: validity of data, representativeness, data availability and interpretation, and missingness.31–33 For example, as noted in the previous section, hospital EHRs do not contain information on medical care provided outside the hospital. To address this problem, a project has been initiated under the “Next Generation Medical Infrastructure Law” (official name “Act on Anonymized Medical Data That Are Meant to Contribute to Research and Development in the Medical Field”), which came into force in 2018, to expand the data resources by linking EHR data from the NHO database with data from clinics across the country held by the Japan Medical Association. This collaboration will enable more consistent analyses of patient data and is expected to contribute to new medical discoveries.

Conclusion

The NHO database accumulates real-world data from medical care at 140 hospitals across the country and is used for many clinical studies. In recent years, many diverse medical information resources that can be used for research have become available in Japan.34–37 One example of another large-scale data resource is Biobank Japan, which has collected hundreds of thousands of serum samples together with clinical information on diseases.36,37 Each of these medical information databases has a different focus and level of granularity, from the societal to the molecular level. In the future, increased collaboration among data resources with different characteristics can be expected to yield new scientific findings, and it is important that their use be accompanied by ethical protections. In Japan, as in many other countries, many issues concerning laws, social consensus, ethics, procedures, and technology remain to be resolved before diverse data resources can be linked and utilized. As these issues are resolved and data are properly utilized, more diverse scientific outcomes will be achieved, from the individual level, such as tailor-made medicine, to the societal level, such as policy evidence.
  34 in total

1.  SS-MIX: a ministry project to promote standardized healthcare information exchange.

Authors:  M Kimura; K Nakayasu; Y Ohshima; N Fujita; N Nakashima; H Jozaki; T Numano; T Shimizu; M Shimomura; F Sasaki; T Fujiki; T Nakashima; K Toyoda; H Hoshi; T Sakusabe; Y Naito; K Kawaguchi; H Watanabe; S Tani
Journal:  Methods Inf Med       Date:  2011-01-05       Impact factor: 2.176

2.  Case-mix payment in Japanese medical care.

Authors:  Shinichi Okamura; Ryota Kobayashi; Tetsuo Sakamaki
Journal:  Health Policy       Date:  2005-11       Impact factor: 2.980

3.  Volume effect in paediatric brain tumour resection surgery: analysis of data from the Japanese national inpatient database.

Authors:  Daisuke Shinjo; Kimikazu Matsumoto; Keita Terashima; Tetsuya Takimoto; Tetsu Ohnuma; Takashi Noguchi; Kiyohide Fushimi
Journal:  Eur J Cancer       Date:  2019-02-01       Impact factor: 9.162

4.  [Application of the diagnosis procedure combination (DPC) data to clinical studies].

Authors:  Hideo Yasunaga; Hiroki Matsui; Hiromasa Horiguchi; Kiyohide Fushimi; Shinya Matsuda
Journal:  J UOEH       Date:  2014-09-01

5.  Validity of diagnoses, procedures, and laboratory data in Japanese administrative data.

Authors:  Hayato Yamana; Mutsuko Moriwaki; Hiromasa Horiguchi; Mariko Kodan; Kiyohide Fushimi; Hideo Yasunaga
Journal:  J Epidemiol       Date:  2017-01-27       Impact factor: 3.211

6.  Overview of BioBank Japan follow-up data in 32 diseases.

Authors:  Makoto Hirata; Akiko Nagai; Yoichiro Kamatani; Toshiharu Ninomiya; Akiko Tamakoshi; Zentaro Yamagata; Michiaki Kubo; Kaori Muto; Yutaka Kiyohara; Taisei Mushiroda; Yoshinori Murakami; Koichiro Yuji; Yoichi Furukawa; Hitoshi Zembutsu; Toshihiro Tanaka; Yozo Ohnishi; Yusuke Nakamura; Koichi Matsuda
Journal:  J Epidemiol       Date:  2017-02-10       Impact factor: 3.211

7.  Consultation-liaison psychiatry in Japan: a nationwide retrospective observational study.

Authors:  Daisuke Shinjo; Hisateru Tachimori; Keiko Maruyama-Sakurai; Kenji Fujimori; Norihiko Inoue; Kiyohide Fushimi
Journal:  BMC Psychiatry       Date:  2021-05-05       Impact factor: 3.630

8.  Comparison of Procedure-Based and Diagnosis-Based Identifications of Severe Sepsis and Disseminated Intravascular Coagulation in Administrative Data.

Authors:  Hayato Yamana; Hiromasa Horiguchi; Kiyohide Fushimi; Hideo Yasunaga
Journal:  J Epidemiol       Date:  2016-04-09       Impact factor: 3.211

9.  History and Profile of Diagnosis Procedure Combination (DPC): Development of a Real Data Collection System for Acute Inpatient Care in Japan.

Authors:  Kenshi Hayashida; Genki Murakami; Shinya Matsuda; Kiyohide Fushimi
Journal:  J Epidemiol       Date:  2020-11-21       Impact factor: 3.211

