Literature DB >> 32598290

Using Open-Source Intelligence to Detect Early Signals of COVID-19 in China: Descriptive Study.

Elizabeth Benedict Kpozehouen1, Xin Chen1, Mengyao Zhu2, C Raina Macintyre1.   

Abstract

BACKGROUND: The coronavirus disease (COVID-19) outbreak in China was first reported to the World Health Organization (WHO) on December 31, 2019, and the first cases were officially identified around December 8, 2019. Although the origin of COVID-19 has not been confirmed, approximately half of the early cases were linked to a seafood market in Wuhan. However, the first two documented patients did not visit the seafood market. News reports, social media, and informal sources may provide information about outbreaks prior to formal notification.
OBJECTIVE: The aim of this study was to identify early signals of pneumonia or severe acute respiratory illness (SARI) in China prior to official recognition of the COVID-19 outbreak in December 2019 using open-source data.
METHODS: To capture early reports, we searched an open source epidemic observatory, EpiWatch, for SARI or pneumonia-related illnesses in China from October 1, 2019. The searches were conducted using Google and the Chinese search engine Baidu.
RESULTS: There was an increase in reports following the official notification of COVID-19 to the WHO on December 31, 2019, and a report that appeared on December 26, 2019 was retracted. A report of severe pneumonia on November 22, 2019, in Xiangyang was identified, and a potential index patient was retrospectively identified on November 17.
CONCLUSIONS: The lack of reports of SARI outbreaks prior to December 31, 2019, with a retracted report on December 26, suggests media censorship, given that formal reports indicate that cases began appearing on December 8. However, the findings also support a relatively recent origin of COVID-19 in November 2019. The case reported on November 22 was transferred to Wuhan approximately one incubation period before the first identified cases on December 8; this case should be further investigated, as only half of the early cases were exposed to the seafood market in Wuhan. Another case of COVID-19 has since been retrospectively identified in Hubei on November 17, 2019, suggesting that the infection was present prior to December. ©Elizabeth Benedict Kpozehouen, Xin Chen, Mengyao Zhu, C Raina Macintyre. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 18.09.2020.

Entities:  

Keywords:  COVID-19; biosecurity; epidemiology; infectious disease; surveillance

Mesh:

Year:  2020        PMID: 32598290      PMCID: PMC7505682          DOI: 10.2196/18939

Source DB:  PubMed          Journal:  JMIR Public Health Surveill        ISSN: 2369-2960


Introduction

Background

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a new betacoronavirus that was first reported in Wuhan, China, in December 2019; this virus has caused the worst pandemic of the past 100 years [1-3]. On December 31, 2019, Chinese authorities notified the World Health Organization (WHO) of an outbreak of pneumonia in Wuhan [2]. The WHO declared the coronavirus disease (COVID-19) outbreak to be a public health emergency of international concern on January 30, 2020, and it was declared a pandemic on March 12 [2,4]. It is commonly believed that the outbreak began in early December 2019. Coronaviruses are a large family of viruses that are found in many different species of animals, including camels, cattle, cats, and bats. Zoonotic coronaviruses that have emerged in humans are Middle Eastern respiratory syndrome coronavirus (MERS-CoV), sudden acute respiratory syndrome coronavirus (SARS-CoV), and now SARS-CoV-2. This is the third time in two decades that a zoonotic coronavirus has emerged from animals to infect humans [4]. Of the betacoronaviruses, SARS-CoV-2 is more closely related to SARS-CoV than to MERS-CoV [5]. The origin of SARS-CoV-2, its intermediary animal host, and the mechanism of its species jump to humans are not known [6,7]. Initially, it was believed that the COVID-19 pandemic originated at the Huanan Seafood Wholesale Market located in Wuhan, China, where farm animals, bats, and snakes were also sold [8]; this is still believed by many people. Approximately half of the initial cases were exposed to the seafood market; however, the first two identified cases did not visit the seafood market [9]. Viral RNA was found in environmental samples from the wet market, such as surfaces [10]. Phylogenetic analysis revealed that the viral RNA found in the environmental samples was very closely related to viruses sampled from the earliest Wuhan patients, suggesting that the market played a role in the early spread of the virus [10]. The source of positive environmental samples from the market is unknown, and animal samples from the market are not available. Therefore, it has not been possible to identify an animal source at the market [10]. On March 2020, however, the timeline of the pandemic was questioned when it was determined that the first person infected with the new disease may have been a Hubei resident who was infected on November 17, 2019 [11]. However, official information states that the first patient presented on December 8, 2019, and that the first exposure may have been around December 1 in the Huanan Seafood Wholesale Market [2]. Local health authorities initially failed to report the coronavirus epidemic, resulting in a delay in reporting it to the WHO until December 31, 2019. Emerging infectious diseases are becoming increasingly common [12,13]. The world is increasingly interconnected; therefore, it is essential to identify epidemics early [14]. A disease with true epidemic potential can grow exponentially within weeks or months; thus, each day of delay is a lost opportunity for prevention [15]. Rapid prediction, detection, and surveillance of outbreaks are critical in fighting emerging infectious diseases with epidemic potential [12]. The media may have reported a surge of unknown or undiagnosed cases of severe pneumonia, severe acute respiratory illness (SARI), or other related diseases prior to the official reporting of confirmed COVID-19 confirmed cases in China. Epidemic intelligence from open-source, informal data can provide early warnings of public health emergencies [12-14,16,17]. EpiWatch is a curated epidemic observatory that searches media reports, press releases, official reports, and social media for early detection of outbreaks of infectious diseases; it can be tailored for different languages [18]. EpiWatch provides early outbreak alerts and can be used to detect and monitor early reports of potential COVID-19 outbreaks through publicly available sources in settings with poor disease surveillance or censorship of information [19]. Early reports of unknown pneumonia in Hubei Province in China that appeared prior to official reports can be identified using open-source data, providing insight into whether COVID-19 was present in China before December 2019.

Aim

The aim of our study was to use open-source data to identify early signals of pneumonia and SARI in China prior to official recognition of the COVID-19 outbreak in December 2019.

Methods

EpiWatch is an open-source epidemic observatory that was developed at the University of New South Wales as a management web application enhanced by machine learning; it has been used to collect outbreak data since 2016. The principle of EpiWatch is that cases of infectious diseases or outbreaks may be reported in the news or discussed on social media before official notification by health authorities. EpiWatch mines open-source data to detect early signals, which can be customized for common clinical infectious disease syndromes. Many countries have weak or delayed surveillance systems and poor reporting. In other countries, censorship may prevent notification of serious epidemics. Open-source data can be used to help identify epidemic signals in such circumstances. The system includes three major features. First, reports are gathered from international organizations and news outlets by an intelligent and modular system. An administrator can easily add new sources without requiring further development of the application. The data collected include news reports and social media posts as well as grey literature, such as government reports. If the format in which data is delivered changes for a given source, an administrator can promptly modify the system to adapt to this change. This includes adding or changing the languages used for searching. The system is set up to support a variety of intelligent data gathering elements, such as natural language processing algorithms, regular expression matching, and supervised machine learning algorithms, to process reports and attempt to identify important data points such as outcomes, locations, and diseases mentioned within the gathered data. Second, EpiWatch reports are reviewed by a team of epidemiologists, ensuring a good level of quality control as well as increased accuracy and relevance. The EpiWatch management system is a web application that enables the internal team to log on and review reports and key data points identified by the automated data gathering system. The team can check the data collected by the automated system and correct any mistakes that are present. A machine learning system learns from this human input and corrections and uses that feedback to improve its ability to group reports and identify key information over time. Finally, the EpiWatch management web application consists of two software programs. One is a web application that is built on the Vue framework, and the other is a server-side application built on the NodeJS framework. Both applications are written in JavaScript. The third software program is the data-gathering program, which is also a NodeJS application written in JavaScript. This program is scheduled to run on a regular basis to re-scan sources at intervals chosen by the system administrator. Searches can be tailored for specific languages or regions as well as for specific infectious disease syndromes. The data are stored in a PostgreSQL database. Most of the data is textual in nature and is easily compressed; therefore, the storage requirements are currently very modest (<100 MB). The EpiWatch observatory is managed and funded by the Australian National Health and Medical Research Council (NHMRC) Centre of Research Excellence, Integrated Systems for Epidemic Response (ISER) and is managed by staff at the Biosecurity Program, The Kirby Institute, University of New South Wales Sydney. To capture early reports of SARI or pneumonia-related illnesses in China, searches were performed in the Chinese language using keywords reflecting severe acute respiratory syndrome or pneumonia as well as Wuhan and China as geolocations. We performed searches from October 1, 2019, to February 14, 2020. Any relevant news reports with the keywords pneumonia, SARI and related terms, and coronavirus were extracted. The information before December 31, 2019 (the date on which the WHO was notified of the COVID-19 outbreak) was reviewed for potential early signals of COVID-19. Google and the Chinese search engine Baidu were used [20,21]. Reports in Chinese were retrieved and reviewed by EK, XC, and MZ and translated to English.

Results

Between October 2019 and February 2020, a total of 218 reports were found and included in the study. There were no duplicates. We identified two potentially relevant news reports prior to December 31, 2019. Figure 1 shows the number of pneumonia and/or SARI reports from October 1, 2019, to February 14, 2020. It shows an increase in reports after the official notification to the WHO on December 31, 2019. A report appeared on December 26, 2019, with the heading “One sample is suspected as novel coronavirus”; this report appears to have been retracted, as the link to the news item has become invalid [22]. We found 11 reports of cases of pneumonia between October 1 and December 31, 2019, including a case identified retrospectively in March 2020, which is believed to be an index case. The number of reports in the same period one year prior was determined for comparison; there were 12 reports in 2018. Of the 11 reports in 2019, 3 (27%) were cases of pneumonia of unknown cause, and 7 (64%) had known causes; 2 reports (18%) were related to pulmonary nodules, 3 (27%) were caused by lung cancer, cerebral infarction, or asthma, and 2 cases (18%) were caused by bacterial infection. The information of interest was the single report of unknown serious pneumonia in November 2019 in Hubei, the province in China where the COVID-19 pandemic arose.
Figure 1

Reports of pneumonia, severe acute respiratory illness, or coronavirus from October 1, 2019, to February 14, 2020. COVID-19: coronavirus disease; WHO: World Health Organization.

Reports of pneumonia, severe acute respiratory illness, or coronavirus from October 1, 2019, to February 14, 2020. COVID-19: coronavirus disease; WHO: World Health Organization. On November 22, 2019, a local newspaper, the Wuhan Evening News, reported that a patient with severe pneumonia of unknown cause was taken to Wuhan as an emergency transfer by helicopter from Xiangyang in Hubei Province, 325 kilometers from Wuhan [23]. Figure 2 shows the location of Xiangyang in relation to Wuhan. After November 22, there were no reports of pneumonia in the local media, although it was later confirmed that by December 30, 2019, there were 27 cases of pneumonia of unknown cause in Wuhan.
Figure 2

Location of Xiangyang relative to Wuhan within Hubei Province.

Location of Xiangyang relative to Wuhan within Hubei Province.

Discussion

The origin of the COVID-19 epidemic is unknown. Only half the initial patients were exposed to the Huanan Seafood Wholesale Market [9], and the first two cases in Wuhan did not visit the market. No pneumonia or SARI signals in Wuhan were identified prior to December 31, which supports the relatively recent emergence of COVID-19. However, open-source intelligence identified a case of severe pneumonia in Xiangyang, Hubei Province, 325 km from Wuhan, who was transferred to Wuhan for treatment on November 21, 2019. This case may be part of an early outbreak cluster. In early March, it was reported that the first case of COVID-19, a different case identified retrospectively, may have been observed on November 17 [24-26]. Approximately one COVID-19 incubation period (2 weeks) [9] after November 17 to November 21, the first formally reported cases in Wuhan became symptomatic (around December 1-8). If no definitive diagnosis was made, further diagnostic investigation of the case from Xiangyang and epidemiological investigation is warranted to determine if this case did have COVID-19. There may be a connection between the Xiangyang patient and an unidentified early cluster of COVID-19. From December 31, 2019, through January 3, 2020, a total of 44 case patients with pneumonia of unknown etiology were detected by syndromic surveillance by the China Center for Disease Control and Prevention. Exposure to the Huanan Seafood Wholesale Market was initially suspected to be the origin of the virus, and the market was closed on January 1, 2020. At least 35 environmental samples from the seafood section of the Huanan Seafood Market in Wuhan tested positive for the virus [27,28]. However, the first two cases did not report visiting the seafood market, and there is no epidemiological link between the first patient and later cases [5,28]. This, together with the identification of at least two severe pneumonia cases in November (the one identified in this study and the case on November 17, 2019), suggests that the epidemic originated earlier than December 2019. The absence of news reports in December is curious given that the outbreak appears to have been recognized in early December. It is possible that media reporting was censored; this is supported by what appears to be a retracted news item on December 26. The findings also support the relatively recent origin of COVID-19 in November 2019. The case reported on November 22 was transferred to Wuhan approximately one incubation period before the first cases were reported on December 8. This case should be further investigated, as only half of the early cases were exposed to the Huanan Seafood Market. The Chinese government has been questioned about its failure to identify and report the epidemic early, which resulted in worldwide spread of the disease and led to a pandemic [29]. Surveillance of waste water may also shed light on the origin. A sample of stored waste water in Spain tested positive for SARS-CoV-2 in March 2019, raising questions about whether the infection was present much earlier than December that year [30]. Epidemic diseases grow exponentially and rapidly [31], as seen in China, Europe, and the United States [32]. Early detection and epidemic control can reduce epidemic growth and prevent further spread. Open-source intelligence is a potential tool to aid early detection, especially where formal surveillance data are lacking. Although these data are not validated, once a signal is detected, it can and should be formally investigated, tested, and validated. The use of open-source epidemic intelligence can supplement conventional surveillance to provide early detection of serious emerging epidemics, especially where official disease surveillance reporting is lacking.
  15 in total

1.  Risk assessment strategies for early detection and prediction of infectious disease outbreaks associated with climate change.

Authors:  E E Rees; V Ng; P Gachon; A Mawudeku; D McKenney; J Pedlar; D Yemshanov; J Parmely; J Knox
Journal:  Can Commun Dis Rep       Date:  2019-05-02

2.  Epidemic intelligence needs of stakeholders in the Asia-Pacific region.

Authors:  Aurysia Hii; Abrar Ahmad Chughtai; Tambri Housen; Salanieta Saketa; Mohana Priya Kunasekaran; Feroza Sulaiman; Nk Semara Yanti; Chandini Raina MacIntyre
Journal:  Western Pac Surveill Response J       Date:  2018-12-18

3.  The Role of Augmented Intelligence (AI) in Detecting and Preventing the Spread of Novel Coronavirus.

Authors:  Justin B Long; Jesse M Ehrenfeld
Journal:  J Med Syst       Date:  2020-02-04       Impact factor: 4.460

4.  The next big threat to global health? 2019 novel coronavirus (2019-nCoV): What advice can we give to travellers? - Interim recommendations January 2020, from the Latin-American society for Travel Medicine (SLAMVI).

Authors:  Cristian Biscayart; Patricia Angeleri; Susana Lloveras; Tânia do Socorro Souza Chaves; Patricia Schlagenhauf; Alfonso J Rodríguez-Morales
Journal:  Travel Med Infect Dis       Date:  2020-01-30       Impact factor: 6.211

Review 5.  Perinatal aspects on the covid-19 pandemic: a practical resource for perinatal-neonatal specialists.

Authors:  Francis Mimouni; Satyan Lakshminrusimha; Stephen A Pearlman; Tonse Raju; Patrick G Gallagher; Joseph Mendlovic
Journal:  J Perinatol       Date:  2020-04-10       Impact factor: 2.521

6.  Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China.

Authors:  Chaolin Huang; Yeming Wang; Xingwang Li; Lili Ren; Jianping Zhao; Yi Hu; Li Zhang; Guohui Fan; Jiuyang Xu; Xiaoying Gu; Zhenshun Cheng; Ting Yu; Jiaan Xia; Yuan Wei; Wenjuan Wu; Xuelei Xie; Wen Yin; Hui Li; Min Liu; Yan Xiao; Hong Gao; Li Guo; Jungang Xie; Guangfa Wang; Rongmeng Jiang; Zhancheng Gao; Qi Jin; Jianwei Wang; Bin Cao
Journal:  Lancet       Date:  2020-01-24       Impact factor: 79.321

7.  Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia.

Authors:  Qun Li; Xuhua Guan; Peng Wu; Xiaoye Wang; Lei Zhou; Yeqing Tong; Ruiqi Ren; Kathy S M Leung; Eric H Y Lau; Jessica Y Wong; Xuesen Xing; Nijuan Xiang; Yang Wu; Chao Li; Qi Chen; Dan Li; Tian Liu; Jing Zhao; Man Liu; Wenxiao Tu; Chuding Chen; Lianmei Jin; Rui Yang; Qi Wang; Suhua Zhou; Rui Wang; Hui Liu; Yinbo Luo; Yuan Liu; Ge Shao; Huan Li; Zhongfa Tao; Yang Yang; Zhiqiang Deng; Boxi Liu; Zhitao Ma; Yanping Zhang; Guoqing Shi; Tommy T Y Lam; Joseph T Wu; George F Gao; Benjamin J Cowling; Bo Yang; Gabriel M Leung; Zijian Feng
Journal:  N Engl J Med       Date:  2020-01-29       Impact factor: 176.079

Review 8.  COVID-19 infection: Origin, transmission, and characteristics of human coronaviruses.

Authors:  Muhammad Adnan Shereen; Suliman Khan; Abeer Kazmi; Nadia Bashir; Rabeea Siddique
Journal:  J Adv Res       Date:  2020-03-16       Impact factor: 10.479

9.  Preliminary estimation of the basic reproduction number of novel coronavirus (2019-nCoV) in China, from 2019 to 2020: A data-driven analysis in the early phase of the outbreak.

Authors:  Shi Zhao; Qianyin Lin; Jinjun Ran; Salihu S Musa; Guangpu Yang; Weiming Wang; Yijun Lou; Daozhou Gao; Lin Yang; Daihai He; Maggie H Wang
Journal:  Int J Infect Dis       Date:  2020-01-30       Impact factor: 3.623

10.  SARS-CoV-2 and COVID-19: The most important research questions.

Authors:  Kit-San Yuen; Zi-Wei Ye; Sin-Yee Fung; Chi-Ping Chan; Dong-Yan Jin
Journal:  Cell Biosci       Date:  2020-03-16       Impact factor: 7.133

View more
  5 in total

Review 1.  Mouthrinses and SARS-CoV-2 viral load in saliva: a living systematic review.

Authors:  Akram Hernández-Vásquez; Antonio Barrenechea-Pulache; Daniel Comandé; Diego Azañedo
Journal:  Evid Based Dent       Date:  2022-05-24

Review 2.  Rapid diagnostic methods for SARS-CoV-2 (COVID-19) detection: an evidence-based report.

Authors:  Manish Kumar Verma; Parshant Kumar Sharma; Henu Kumar Verma; Anand Narayan Singh; Desh Deepak Singh; Poonam Verma; Areena Hoda Siddiqui
Journal:  J Med Life       Date:  2021 Jul-Aug

3.  Lifestyle Variations during and after the COVID-19 Pandemic: A Cross-Sectional Study of Diet, Physical Activities, and Weight Gain among the Jordanian Adult Population.

Authors:  Hanan Hammouri; Fidaa Almomani; Ruwa Abdel Muhsen; Aysha Abughazzi; Rawand Daghmash; Alaa Abudayah; Inas Hasan; Eva Alzein
Journal:  Int J Environ Res Public Health       Date:  2022-01-25       Impact factor: 3.390

4.  SARS-CoV-2: International Investigation Under the WHO or BWC.

Authors:  Mirko Himmel; Stefan Frey
Journal:  Front Public Health       Date:  2022-02-03

5.  Using open-source intelligence to identify early signals of COVID-19 in Indonesia.

Authors:  Yoser Thamtono; Aye Moa; Chandini Raina MacIntyre
Journal:  Western Pac Surveill Response J       Date:  2021-02-17
  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.