Yong Fan, Yuzhuo Zhao, Peiyao Li, Xiaoli Liu, Lijing Jia, Kaiyuan Li, Cong Feng, Fei Pan, Tanshi Li, Zhengbo Zhang, Desen Cao. Department of Biomedical Engineering and Maintenance Center, Chinese PLA General Hospital, Beijing 100853, China (Fan Y, Li PY, Zhang ZB, Cao DS); Department of Emergency, Chinese PLA General Hospital, Beijing 100853, China (Zhao YZ, Jia LJ, Li KY, Feng C, Pan F, Li TS); Medical Information Center, Chinese PLA General Hospital, Beijing 100853, China (Zhang ZB); School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China (Liu XL). Corresponding author: Zhang Zhengbo, Email: zhengbozhang@126.com.
Abstract
OBJECTIVE: To study the distribution of diseases in the Medical Information Mart for Intensive Care (MIMIC-III) database, in order to provide a reference for clinicians and engineers who use MIMIC-III to solve clinical research problems. METHODS: Exploratory data analysis techniques were used to examine the distribution characteristics of diseases and emergencies among patients (excluding newborns) in the MIMIC-III database; neonatal gestational age, weight, and length of stay in the intensive care unit (ICU) were then analyzed with the same method. RESULTS: In the MIMIC-III database, 46 428 patients were admitted for the first time, and 49 214 ICU records were recorded. There were 26 076 males and 20 352 females; the median age was 60.5 (38.6, 75.6) years, and most patients were between 60 and 80 years old. In the disease spectrum analysis of first diagnoses, circulatory diseases ranked first (32%), followed by injury and poisoning (14%), digestive system diseases (8%), tumors (7%), and respiratory diseases (6%), among others. Patients with ischemic heart disease accounted for the largest proportion of circulatory disease (42%); their proportion increased gradually with age up to 60-70 years and then decreased. In contrast, the proportion of patients with cerebrovascular disease first declined and then increased with age, and cerebrovascular disease was the main cause of death among circulatory system diseases (ICU mortality 22.5%). Injury and poisoning cases decreased significantly with age. Patients with digestive system diseases were younger than the general population (most were aged 50 to 60 years), and non-infectious enteritis and colitis were the main causes of death (ICU mortality 18.3%). Respiratory infections were predominant among infected patients (34%), but circulatory system infections were the main cause of death (ICU mortality 25.6%). In the neonatal care unit, premature infants accounted for the vast majority (82%). As gestational age increased, both the length of ICU stay and mortality decreased. CONCLUSIONS: The MIMIC-III database can provide the disease distribution of its patients, which helps researchers grasp the volume and age distribution of a target patient group in advance and plan the next step of research. The results also point to the important role of exploratory data analysis in electronic health record analysis.