Literature DB >> 32337324

The Data set for Patient Information Based Algorithm to Predict Mortality Cause by COVID-19.

Jing Li1, Lishi Wang1,2, Sumin Guo3, Ning Xie4, Lan Yao5,6, Yanhong Cao6, Sara W Day7, Scott C Howard7, J Carolyn Graff7, Tianshu Gu8, Jiafu Ji9, Weikuan Gu1,10, Dianjun Sun6.   

Abstract

The data of COVID-19 disease in China and then in South Korea were collected daily from several different official websites. The collected data included 33 death cases in Wuhan city of Hubei province during early outbreak as well as confirmed cases and death toll in some specific regions, which were chosen as representatives from the perspective of the coronavirus outbreak in China. Data were copied and pasted onto Excel spreadsheets to perform data analysis. A new methodology, Patient Information Based Algorithm (PIBA) [1], has been adapted to process the data and used to estimate the death rate of COVID-19 in real-time. Assumption is that the number of days from inpatients to death fall into a pattern of normal distribution and the scores in normal distribution can be obtained by observing 33 death cases and analysing the data [2]. We selected 5 scores in normal distribution of these durations as lagging days, which will be used in the following estimation of death rate. We calculated each death rate on accumulative confirmed cases with each lagging day from the current data and then weighted every death rate with its corresponding possibility to obtain the total death rate on each day. While the trendline of these death rate curves meet the curve of current ratio between accumulative death cases and confirmed cases at some points in the near future, we considered that these intersections are within the range of real death rates. Six tables were presented to illustrate the PIBA method using data from China and South Korea. One figure on estimated rate of infection and patients in serious condition and retrospective estimation of initially occurring time of CORID-19 based on PIBA. Published by Elsevier Inc.

Entities:  

Keywords:  COVID-19; Coronavirus; Death Rate; Estimation; Normal distribution; PIBA; Prediction

Year:  2020        PMID: 32337324      PMCID: PMC7180158          DOI: 10.1016/j.dib.2020.105619

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications table

Value of the data

These data provide the scientific community with a new methodology to estimate the death rate and then predict the death cases during an epidemic. Scientific researchers, CDC employees, government officers for disease control and management, and public population, will benefit from these data. These data will be very useful for the studies with the purpose either of disease control management or of related sources preparation to combat against an outbreak. Due to the limited amount of data samples collected in this article, some factors, such as the phases of an outbreak and the measurements issued by the department of disease control that might impact the death rate of an epidemic, could be taken into for further insights and development of experiments with a large amount of data.

Data description

The data of 33 death cases in Table 1 have been collected from the official website of the Health Commission of Hubei Province in China, which include the date that patients have onset of symptoms, the date that patients began to be taken into ICU and the date of decease. With these data, the days both from symptoms appearance to death and from ICU intake to death can be calculated. Following normal distribution, the mean score μ and standard deviation σ can be calculated either. Thus the 5 selected scores (μ, μ ± σ and μ ± 2σ) in normal distribution can be obtain as the basic elements for the following estimation and prediction of death rate, which are respectively 2, 8, 13, 19, 25 days.
Table 1

33 death cases in Wuhan city of Hubei province in China.

Patient No.Genderagesymptoms appearanceICU intakedeceasedays from symptoms appearance to deathdays from ICU intake to death
1M701/16/20201/19/20201/23/202074
2F761/18/20201/24/2020-6
3M721/12/20201/18/20201/23/2020115
4M791/12/20201/17/20201/24/2020127
5M551/9/20201/19/20201/24/2020155
6M871/13/20201/19/20201/23/2020104
7F661/10/20201/19/20201/21/2020112
8M581/4/20201/18/20201/24/2020206
9M661/11/20201/21/2020-10
10M781/14/20201/23/20201/24/2020101
11M651/13/20201/16/20201/23/2020107
12M671/11/20201/15/20201/24/2020139
13M5812/24/20191/1/20201/23/20203022
14F671/6/20201/12/20201/23/20201711
15F821/11/20201/17/20201/23/2020126
16F691/14/20201/22/2020-8
17M361/7/20201/9/20201/23/20201614
18M7312/29/20191/5/20201/22/20202417
19F701/16/20201/18/20201/23/202075
20M811/10/20201/13/20201/21/2020118
21F651/13/20201/15/20201/23/2020108
22F701/13/20201/21/2020-8
23M531/10/20201/20/20201/21/2020111
24M861/9/20201/9/20201/21/20201212
25M651/11/20201/21/2020-10
26M841/7/20201/10/20201/22/20201512
27M811/18/20201/22/20204
28F801/11/20201/18/20201/22/2020114
29F821/12/20201/20/20201/22/2020102
30M661/11/20201/16/20201/20/202094
31M891/13/20201/18/20201/19/202061
32M6912/31/20191/4/20201/15/20201511
33M331/10/20201/12/20202/6/20202725

Total cases2733
Standard deviation5.755.51
The Mean138

Notes: CHD-Coronary heart disease.

33 death cases in Wuhan city of Hubei province in China. Notes: CHD-Coronary heart disease. The disease information in Table 2 has been collected from the public media before we resume data analysing with the same method of death rate estimation and prediction in South Korea as in China [1]. We have collected accumulative confirmed cases and deaths and then new confirmed cases and new deaths in South Korea.
Table 2

Disease information in South Korea.

date2020-03-152020-03-142020-03-132020-03-122020-03-112020-03-102020-03-09
Accumulative confirmed cases8162808679797689775575137478
Accumulative Deaths75726766606053
New confirmed cases7610711011424235165
New deaths3516073
date2020-03-082020-03-072020-03-062020-03-052020-03-042020-03-032020-03-02
Accumulative confirmed cases7313704165936284562151864335
Accumulative Deaths50484342353228
New confirmed cases272448309663435851599
New deaths2517347
date2020-03-012020-02-292020-02-282020-02-272020-02-262020-02-252020-02-24
Accumulative confirmed cases37363150233717661261977833
Accumulative Deaths2117161312118
New confirmed cases586813571505284144231
New deaths4131132
date2020-02-232020-02-222020-02-212020-02-202020-02-192020-02-182020-02-17
Accumulative confirmed cases602436209111583130
Accumulative Deaths6221000
New confirmed cases16622798532711
New deaths4011000
date2020-02-162020-02-152020-02-142020-02-132020-02-122020-02-112020-02-10
Accumulative confirmed cases29280
Accumulative Deaths000
New confirmed cases1280
New deaths000
Disease information in South Korea. According to the analysis result from Table 1, we have selected 5 scores (μ, μ ± σ and μ ± 2σ) in normal distribution which are respectively 2, 8, 13, 19, 25 days. When we calculate the death rate by dividing death cases with confirmed cases, these confirmed cases should be the ones on the 2nd, 8th, 13th, 19th and 25th day before the day of prediction. Death rate 1 is calculated by dividing new death cases with new confirmed cases. Death rate 2 is calculated by dividing accumulative death cases with accumulative confirmed cases. When the death rate came out with a negative value or no value, that means the new confirmed cases might be wrong for some reason or there's no new cases on several days before. We corrected a negative death rate or no value into zero (in red), and then added the new death case to the one of the next days (in green). Each score we selected in normal distribution has a specific possibility when we take them into consideration of representatives in bell curve [1]. When we weighted each death rate on a day with their corresponding possibilities and then sum, the total death rate on each day can be obtained. Each curve consisting of several death rate will have a trendline and thus a formula to describe this trend as well as the current ratio between accumulative death cases and confirmed cases on each day (Table 4).
Table 4

Current ratio between accumulative death cases and confirmed cases.

Date2020-03-152020-03-142020-03-132020-03-122020-03-11
Current ratio between accumulative death cases and confirmed cases0.92%0.89%0.84%0.86%0.77%
The current ratio between accumulative death cases and confirmed cases is calculated by dividing accumulative death cases with accumulative confirmed cases on each day. The trendlines of death rate 1 and death rate 2 tend to intersect with the trendline of the current ratio finally, because the current ratio will be the real death rate at the end of epidemic. We considered that the intersection value of three trendline (death rate1 and 2, current ratio) will drop in the range of real death rate. When we calculated the death rate separately with the corresponding formula of their trendlines, two intersections have been acquired (Table 5-B). We pick the maximum value between them to predict new death cases in the following days (Table 6).
Table 5

Death rate estimation in South Korea.

5-A: Death rate derived from the formula of trendlines
Date3/11/20203/12/20203/13/20203/14/20203/15/20203/16/2020
Death rate 13.95%26.92%30.57%21.26%5.35%
current ratio0.79%0.82%0.85%0.88%0.91%0.94%
Death rate 223.35%20.90%18.45%16.00%13.55%11.10%

Date3/17/20203/18/20203/19/20203/20/20203/21/2020

Death rate 1
current ratio0.97%1.00%1.03%1.06%1.09%
Death rate 28.65%6.20%3.75%1.30%

Table 6

Deaths prediction by PIBA and actual death data in South Korea.

Date3/22/20203/21/20203/20/20203/19/20203/18/20203/17/20203/16/2020
7 lagging day1113023
13 lagging day2353759
19 lagging day9669653

Date3/22/20203/21/20203/20/20203/19/20203/18/20203/17/20203/16/2020

Min1113023
Max9669759
Actual deaths2837360
This table listed the number of deaths from March 16, 2020 to March 22, 2020 based on lagging days of 8, 13, and 19 days. The upper parts are predicted number of deaths based on days of 8, 13, and 19 days of the PIBA method. The lower part list the predicted minimum and maximum number of deaths, and actual reported deaths in each of the seven days. Fig. 1. Rates of 2019-nCoV infection and rate of patients in serious medical condition. Total rate in China (blue color), Hubei (orange color) and rest of country (grey color). Numbers on the vertical axis indicate the percentage of infections. Numbers on the horizontal axis indicate the date. Fig. 1A. The infection rates. Fig. 1B. The rate of patients in serious medical condition. Fig. 1C. Retrospective estimation of start time of disease based on PIBA and known information of patients in Wuhan. *Wr = Based on the rate of Wuhan; Rcr = Based on the rate from the rest of the country; Dt=doubling time [9]; Ir=infection rate; Sr= serious rate; Dr = death rate.
Fig. 1

Estimated rate of infection and patients in serious condition and retrospective estimation of initially occurring time of CORID-19 based on PIBA data.

Estimated rate of infection and patients in serious condition and retrospective estimation of initially occurring time of CORID-19 based on PIBA data.

Experimental design, materials, and methods

Tables are produced based on the Patient Information Based Algorithm (PIBA) [1]. PIBA has been adapted when estimating the death rate of COVID-19 in Real-time with publicly posted data. Following normal distribution, the different durations with different possibilities between symptom appearance and death have been derived from analysing 33 death cases in Wuhan city of Hubei province in China [2]. Based on these results, the total death rate in regions can be calculated specifically by putting in the different death rates with different durations together. While the trendline of these death rate curves meet the curve of current ratio between accumulative death cases and confirmed cases at some points in the near future, we considered that these intersections are within the range of real death rates. The data analysis was all following normal distribution, either in calculating the possibility of every selected score or in estimating the death rate. After collection of data of COVID patients from South Korea, the data was analysed with PIBA method as indicated above (Table 2). The death rate was first estimated (Table 3). The death rate then was calculated (Table 4). Following estimations, the PIBA method then was used to predict the number of deaths in the following week (Table 5). The predicated death numbers then were compared to the real death numbers (Table 6).
Table 3

Death rate estimation in South Korea.

3-A: Death rate analysis in South Korea
Death rate 1 from the date Symptoms2020-03-152020-03-142020-03-132020-03-122020-03-11
Mean-130.50%0.85%0.12%1.05%0.00%
1STDEV-80.67%1.62%0.15%1.38%0.00%
1STDEV-192.08%2.16%0.60%2.64%0.00%
2STDEV-2511.11%500.00%100.00%600.00%0.00%
2STDEV-22.76%0.00%0.41%17.14%0.00%
Death rate2 from the date Symptoms2020-03-152020-03-142020-03-132020-03-122020-03-11
Mean-131.73%1.93%2.13%2.82%3.40%
1STDEV-71.07%1.09%1.07%1.17%1.16%
1STDEV-197.68%8.64%11.13%15.14%28.71%
2STDEV-25129.31%232.26%223.33%227.59%214.29%
2STDEV-10.94%0.94%0.86%0.88%0.80%

Death rate estimation in South Korea. Current ratio between accumulative death cases and confirmed cases. Death rate estimation in South Korea. Deaths prediction by PIBA and actual death data in South Korea. Fig. 1 is produced based on the following procedure. Up to February 25, 2020, the total accumulated number of infected patients in China is 78,064 (data only from mainland China). The number of new cases per day has not increased in the past 9 days. The total accumulated number of people who were in close contact with an infected person is 647,406. Thus, by simply dividing the number of infected persons by the number of contacted persons, the total infection rate is only 12%, considerably lower than expected. Prior expectation has been much higher, based on multiple infectious routes [3], [4]. Using our formula, the results indicate that the current infectious rate is even lower than the rate based on the total numbers (see Fig. 1A). The infectious rate in Hubei province is currently around 4%, although previously the rate was as high as 39%. On average, the infectious rate overall in China is about 4%, while in Hubei it is 10%. In the rest of the country, it is 0.46%. Among the inpatients, the rate in serious medical condition ranges from 10% to 30% (see Fig. 1B), while it averages at 18% in China, 19% in Hubei, and 13% in the rest of country (except Hubei). Based on the estimated death rate, on January 22, there should be a total of 150 to 300 inpatients (see Fig. 5C). Based on the rate of patients who are severely ill among all patients, on January 2, there should be 216 to 315 patients. Based on the effective infection rate and based on the assumption of one week or 14 days from close contact to the onset of symptoms, there might be 2,160 to 68,478 people who were infected around December 20, 2019. If we believe the epidemic doubling time is approximately 6 days, the initial infection source may date back to as early as November or October 2019.
SubjectDeath rate estimation using normal distribution, of mean, standard deviations and formulas.
Specific subject areaThe data estimation focuses on the early estimation of death rate of infectious diseases, in particular, the disease COVID-19 caused by 2019-nCoV.
Type of dataTables, Figures
How data were acquiredData were obtained from official websites of provincial and central government of public health commissions of PR China and South Korea.
Data formatCollected data are formatted on Excel spreadsheets for analysing.
Parameters for data collectionData include the total number of patients, total number of deaths, daily numbers of new patients, daily number of new deaths, from starting data of official report to the presented time, e.g., March 22, 2020.
Description of data collectionData were collected through the cyberlinke of each official websites and copied and pasted the desired data onto Excel spreadsheets.
Data source locationCity/Town/Region: Hubei province, Heilongjiang province.Country: PR China, South Korea
Data accessibilityRaw data are from three official websites which are publically avaialbe.Health Emergency Office of the National Health Commission of the People's Republic of China at http://www.nhc.gov.cn/yjb/new_index.shtml.Hubei Province and Wuhan are from the Health Commission of Hubei Province at http://wjw.hubei.gov.cn/fbjd/dtyw/.Health Commission of Heilongjiang province at http://wsjkw.hlj.gov.cn/index.php/Home/Zwgk/all/typeid/42.
Related research article [1]Lishi Wang, Jing Li, Sumin Guo, Ning Xie, Lan Yao, Yanhong Cao, Sara W. Day, Scott C. Howard, J. Carolyn Graff, Tianshu Gu, Jiafu Ji, Weikuan Gu, Dianjun Sun. Real-time Estimation and Prediction of Mortality Caused by COVID-19 with Patient Information Based Algorithm. Science of the Total Environment. 2020, MS# STOTEN-D-20-06264. in press.
  4 in total

1.  Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China.

Authors:  Chaolin Huang; Yeming Wang; Xingwang Li; Lili Ren; Jianping Zhao; Yi Hu; Li Zhang; Guohui Fan; Jiuyang Xu; Xiaoying Gu; Zhenshun Cheng; Ting Yu; Jiaan Xia; Yuan Wei; Wenjuan Wu; Xuelei Xie; Wen Yin; Hui Li; Min Liu; Yan Xiao; Hong Gao; Li Guo; Jungang Xie; Guangfa Wang; Rongmeng Jiang; Zhancheng Gao; Qi Jin; Jianwei Wang; Bin Cao
Journal:  Lancet       Date:  2020-01-24       Impact factor: 79.321

2.  Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study.

Authors:  Joseph T Wu; Kathy Leung; Gabriel M Leung
Journal:  Lancet       Date:  2020-01-31       Impact factor: 79.321

3.  Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study.

Authors:  Nanshan Chen; Min Zhou; Xuan Dong; Jieming Qu; Fengyun Gong; Yang Han; Yang Qiu; Jingli Wang; Ying Liu; Yuan Wei; Jia'an Xia; Ting Yu; Xinxin Zhang; Li Zhang
Journal:  Lancet       Date:  2020-01-30       Impact factor: 79.321

4.  Real-time estimation and prediction of mortality caused by COVID-19 with patient information based algorithm.

Authors:  Lishi Wang; Jing Li; Sumin Guo; Ning Xie; Lan Yao; Yanhong Cao; Sara W Day; Scott C Howard; J Carolyn Graff; Tianshu Gu; Jiafu Ji; Weikuan Gu; Dianjun Sun
Journal:  Sci Total Environ       Date:  2020-04-08       Impact factor: 7.963

  4 in total
  1 in total

1.  Monitoring Mortality Caused by COVID-19 Using Gamma-Distributed Variables Based on Generalized Multiple Dependent State Sampling.

Authors:  Muhammad Aslam; G Srinivasa Rao; Muhammad Saleem; Rehan Ahmad Khan Sherwani; Chi-Hyuck Jun
Journal:  Comput Math Methods Med       Date:  2021-04-22       Impact factor: 2.238

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.