Yi Zhang1, Xiaojing Cai2,3, Caroline V Fry4, Mengjia Wu1, Caroline S Wagner3.
Abstract
The COVID-19 pandemic presented a challenge to the global research community as scientists rushed to find solutions to the devastating crisis. Drawing expectations from resilience theory, this paper explores how the trajectory of and research community around the coronavirus research was affected by the COVID-19 pandemic. Characterizing epistemic clusters and pathways of knowledge through extracting terms featured in articles in early COVID-19 research, combined with evolutionary pathways and statistical analysis, the results reveal that the pandemic disrupted existing lines of coronavirus research to a large degree. While some communities of coronavirus research are similar pre- and during COVID-19, topics themselves change significantly and there is less cohesion amongst early COVID-19 research compared to that before the pandemic. We find that some lines of research revert to basic research pursued almost a decade earlier, whilst others pursue brand new trajectories. The epidemiology topic is the most resilient among the many subjects related to COVID-19 research. Chinese researchers in particular appear to be driving more novel research approaches in the early months of the pandemic. The findings raise questions about whether shifts are advantageous for global scientific progress, and whether the research community will return to the original equilibrium or reorganize into a different knowledge configuration. © Akadémiai Kiadó, Budapest, Hungary 2021.
Keywords: COVID-19; International collaboration; Research and development; Science; Topic analysis
Year: 2021 PMID: 33776163 PMCID: PMC7980735 DOI: 10.1007/s11192-021-03946-7
Source DB: PubMed Journal: Scientometrics ISSN: 0138-9130 Impact factor: 3.801
Data source and publication data
| Number of publications | ||
|---|---|---|
| Source | Pre-COVID-19 (January 1, 2009 to December 31, 2019) | COVID-19 (January 1, 2020 to April 23, 2020) |
| Scopus | 10,012 | 1714 |
| Web of Science | 7838 | 822 |
| PubMed | 28,484 | 4334 |
| Preprints (BioRxiv/MedRxiv/arXiv) | N/A | 2147 |
| Combined (duplicates dropped) | 30,660 | 6337 |
| Combined, with topic data | 28,543 | 3485 |
| Combined, with topic and affiliation data | 27,424 | 3128 |
We included preprints in the COVID-19 period because the time pressures imposed by the pandemic crisis propelled ready and open sharing of even initial results, which may help us understand the early response of researchers to the COVID-19 crisis.
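The "Combined (duplicates dropped)" row implies merging records from several databases and removing duplicates. The source does not say how duplicates were matched, so the following is only a minimal sketch, assuming records are dictionaries with hypothetical `doi` and `title` fields and that two records are duplicates when they share a DOI or a normalized title.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_records(*sources):
    """Merge several record lists, keeping the first copy of each publication.

    Assumption (not from the source): a record is a duplicate if its DOI or
    its normalized title has already been seen.
    """
    seen = set()
    merged = []
    for records in sources:
        for rec in records:
            keys = {("title", normalize_title(rec["title"]))}
            doi = (rec.get("doi") or "").strip().lower()
            if doi:
                keys.add(("doi", doi))
            if keys & seen:
                continue  # already collected from an earlier source
            seen |= keys
            merged.append(rec)
    return merged
```

Matching on either key catches the common case where a preprint server record lacks the DOI that a later database record carries.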
Fig. 4. Co-term map for the COVID-19 research in 2020 (May–October 2020). Note: In order to assess medium to longer-term trends, publications in May–October 2020 were used
Fig. 1. Research framework
Fig. 3. Co-term map for COVID-19 research in early 2020 (January–April 2020). Note that this version was re-generated based on the data in the source: Fry et al. (2020)
Stepwise term clumping process for identifying core terms on coronavirus-related research
| Step | Description | #Terms |
|---|---|---|
| 1 | Raw terms retrieved by an NLP function integrated in VantagePoint | 601,103 |
| 2 | Remove meaningless terms, e.g., pronouns, prepositions, and conjunctions | 594,116 |
| 3 | Remove common terms in scientific articles, e.g., “methods” | 584,465 |
| 4 | Remove terms starting with non-alphabetic characters, e.g., “step 1” or “1.5 m/s” | 517,502 |
| 5 | Consolidate terms with specific rules, e.g., abbreviations and related full names | 506,283 |
| 6 | Remove terms appearing in only one record | 89,497 |
| 7 | Consolidate terms with the same stem, e.g., “infectious disease” and “infectious diseases” | 81,871 |
| 8 | Remove single-word terms, e.g., “virus” | 68,055 |
| 9 | Consolidate terms based on given topics, e.g., “MERS” and “MERS-COV” | 64,776 |
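
The filtering steps in the table above can be sketched in a few lines. This is an illustrative approximation, not the VantagePoint procedure: the stopword and common-term lists are tiny placeholders, `naive_stem` is a crude stand-in for a real stemmer, and the rule-based consolidations of steps 5 and 9 are omitted because they need a curated thesaurus.

```python
from collections import Counter

STOPWORDS = {"it", "they", "of", "and", "in"}          # step 2 (placeholder list)
COMMON_TERMS = {"methods", "results", "conclusions"}   # step 3 (placeholder list)


def naive_stem(term: str) -> str:
    """Strip a plural 's' from the last word; a rough stand-in for stemming."""
    words = term.split()
    last = words[-1]
    if last.endswith("s") and not last.endswith(("ss", "us", "is")) and len(last) > 3:
        words[-1] = last[:-1]
    return " ".join(words)


def clump(records):
    """Apply steps 2-4 and 6-8 to a list of records (each a list of raw terms)."""
    filtered = []
    for terms in records:
        kept = set()
        for raw in terms:
            term = raw.strip().lower()
            if not term or term in STOPWORDS or term in COMMON_TERMS:
                continue  # steps 2 and 3
            if not term[0].isalpha():
                continue  # step 4: starts with a non-alphabetic character
            kept.add(term)
        filtered.append(kept)

    # Step 6: count records per term and drop terms seen in only one record.
    doc_freq = Counter(term for kept in filtered for term in kept)

    # Step 7: merge variants sharing a stem (record counts are simply summed).
    merged = Counter()
    for term, n in doc_freq.items():
        if n > 1:
            merged[naive_stem(term)] += n

    # Step 8: keep multi-word terms only.
    return {term: n for term, n in merged.items() if " " in term}
```

The order follows the table: singletons are removed before stem consolidation, so a variant that appears in only one record is discarded rather than merged.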
Topic extraction for the pre- and COVID-19 periods
| ID | Topic label | Topic description |
|---|---|---|
| Pre-COVID-19 period (2009–2019) | ||
| 1 | Epidemiology (996) | Host cells (539), United States (463), infected cells (450), spike protein (393), co-infection (357), Central Nervous System (287), influenza-like illness (252), early stage (251), T cells (243), healthcare workers (228), antibody response (225), S protein (217), host response (203), nucleic acid (200), dendritic cells (190), nucleocapsid protein (181), Receptor-Binding Domain (176), cross-sectional study (175), flow cytometry (173), mammalian cells (172) |
| 2 | Viral infection (1482) | Viral replication (714), Saudi Arabia (585), public health (569), viral pathogens (371), viral RNA (368), viral proteins (320), World Health Organization (267), viral genome (264), human health (256), infection control (254), viral load (237), viral entry (228), genetic diversity (220), human infection (195), Intensive Care Unit (191), case report (184), interferon (177), viral diseases (173), PEDV infection (171), health care workers (168) |
| 3 | Infectious diseases (1392) | Fever (480), severe disease (417), cell culture (416), disease control (416), clinical signs (408), infectious agents (259), young children (248), feline infectious peritonitis (241), clinical trials (231), Dromedary Camels (219), clinical features (203), control group (199), developing countries (197), human disease (196), West Africa (193), clinical characteristics (189), Clinical presentation (179), IFN-gamma (159), prevention (155), common cold (152) |
| 4 | Respiratory viruses (1081) | Respiratory syncytial virus (1061), respiratory infections (462), respiratory viral infections (363), respiratory disease (358), respiratory tract infections (354), Middle East (317), acute respiratory infections (300), respiratory syndrome virus (300), respiratory tract (275), respiratory pathogens (249), acute respiratory distress syndrome (210), Feline coronavirus (203), respiratory virus (203), respiratory symptoms (201), coronavirus infection (182), Bovine Coronavirus (180), respiratory illness (164), human coronaviruses (161), human coronavirus (160), Acute respiratory tract infections (147) |
| 5 | SARS-CoV (2370) | Immune response (771), HIV (588), gene expression (321), immune system (321), innate immune response (303), H5N1 (283), South Korea (280), Molecular Mechanisms (270), Monoclonal Antibodies (268), mouse model (256), study period (212), electron microscopy (202), inflammatory response (200), inhibitory effect (183), host immune response (175), molecular characterization (172), adaptive immune responses (166), mathematical model (156), endoplasmic reticulum (150), multiplex PCR (150) |
| 6 | MERS-CoV (2403) | Phylogenetic analysis (683), antiviral activity (508), human metapneumovirus (408), animal models (352), causative agent (324), Hong Kong (301), real-time PCR (282), vaccine development (263), ages (233), human population (230), clinical samples (202), crystal structure (201), high mortality (200), age groups (198), human bocavirus (197), licensee MDPI (193), virus-host interactions (189), antiviral effects (180), fecal samples (179), etiological agent (177), mortality rate (177) |
| 7 | H1N1 (558) | RT-PCR (432), innate immunity (302), Polymerase chain reaction (273), sequence analysis (228), community-acquired pneumonia (197), rapid detection (193), Escherichia coli (187), enzyme-linked immunosorbent assay (182), H7N9 (180), cross-reactivity (173), real-time RT-PCR (171), complete genome sequence (168), complete genome (167), porcine epidemic diarrhea (160), Multiple sclerosis (150), nasopharyngeal aspirates (150), host factors (147), control measures (141), protective immunity (140) |
| 8 | Porcine epidemic diarrhea virus (719) | Infectious bronchitis virus (654), influenza virus (639), virus infection (552), RNA viruses (437), virus replication (421), influenza viruses (379), pandemic influenza (309), Ebola virus (271), transmissible gastroenteritis virus (268), mouse hepatitis virus (239), hepatitis C virus (229), avian influenza (211), virus entry (211), Dengue virus (186), Zika virus (186), enveloped viruses (176), influenza virus infections (161), Ebola virus disease (155), virus detection (152), influenza vaccination (149) |
| COVID-19 period (2020) | ||
| 1 | COVID-19 (2235) | COVID-19 outbreak (230), COVID-19 epidemic (127), clinical characteristics (116), United States (75), clinical features (74), mainland China (52), retrospective study (33), clinical manifestations (32), COVID-19 transmission (23), clinical outcomes (22), severe COVID-19 (22), clinical symptoms (21), Hong Kong (20), COVID-19 spread (18), traditional Chinese medicine (16), travel restrictions (16), Chinese government (15), retrospective cohort study (14), modeling studies (13), Case Study (12) |
| 2 | SARS-CoV-2 (751) | Disease control (41), healthcare workers (34), common symptoms (27), chest CT (26), Saudi Arabia (24), viral pneumonia (24), Intensive Care Unit (23), CT images (21), Informa UK (21), global spread (20), clinical course (19), clinical practice (18), etiological agent (17), Molecular Mechanisms (17), SARS-CoV-2 outbreak (17), intensive care (16), SARS-CoV-2 pandemic (16), C-reactive protein (14), CT findings (14), viral genome (14) |
| 3 | Wuhan (635) | Hubei province (131), fever (99), coronavirus disease (62), confirmed cases (54), mathematical model (50), severe disease (49), coronavirus (41), epidemiological characteristics (39), spike protein (39), phylogenetic analysis (38), immune response (30), personal protective equipment (29), angiotensin-converting enzyme 2 (27), rapid spread (26), porcine epidemic diarrhea virus (21), retrospective analysis (21), severe pneumonia (21), suspected cases (21), severe cases (20), transmission dynamics (20) |
| 4 | SARS-CoV (254) | South Korea (40), incubation period (32), respiratory infections (31), early detection (24), cardiovascular diseases (21), preventive measures (15), Open Access article (14), co-infection (13), online version (13), viral load (13), high morbidity (12), exponential growth (11), cross-infection (10), Pleural effusion (10), acute respiratory infections (8), bacterial infections (8), Chinese General Practice (8), early identification (8), Feline coronavirus (8), medical countermeasures (8) |
| 5 | COVID-19 pandemic (237) | Global pandemic (53), International Concern (51), ongoing outbreak (41), close contact (35), medical staff (33), causative agent (32), median age (30), imported case (24), Coronavirus Pandemic (23), coronavirus Outbreak (22), machine learning (20), healthcare systems (18), mechanical ventilation (17), global concern (14), case definition (13), Monoclonal Antibodies (13), real time (13), age groups (12), illness onset (12), diagnostic tests (11) |
| 6 | MERS-CoV (183) | Early stage (43), case fatality rate (30), respiratory syncytial virus (28), early phase (26), host cells (23), Receptor-Binding Domain (22), mortality rate (21), respiratory illness (20), Cytokine Storm (19), infectious bronchitis virus (17), Shanghai Shangyixun Cultural Communication Co. Ltd (17), genome sequence (14), convalescent plasma (13), decision-making (12), intermediate host (12), adverse effects (11), family Coronaviridae (11), family members (11), John Wiley (11), serial interval (11) |
| 7 | Epidemiology (112) | Infectious diseases (91), ill patients (36), case report (35), urgent need (35), infected patients (31), clinical trials (30), general population (25), influenza virus (25), Clinical presentation (18), immune system (18), cancer patients (15), infected individuals (14), Clinical management (13), influenza viruses (12), Lopinavir/ritonavir (12), severe illness (12), antibody response (11), HIV (11), Northern Italy (11), pediatric patients (11) |
| 8 | World Health Organization (90) | Public health (73), public health emergency (64), viral infection (56), control measures (40), acute respiratory distress syndrome (38), clinical data (33), infection control (30), pregnant women (30), respiratory viruses (30), coronavirus infection (29), global health (28), human-to-human transmission (28), respiratory disease (28), RT-PCR (27), viral replication (27), antiviral activity (26), vaccine development (25), licensee MDPI (24), symptom onset (23), infection prevention (22) |
The number following each term indicates the frequency of the term in the given dataset.
Fig. 2. Co-term map for the coronavirus research between 2018 and 2019. Note that this version was re-generated based on the data in the source: Fry et al. (2020)
Fig. 5. Evolutionary pathways of the coronavirus research from 2009 to 2020. Note: Red dashed circles mark topics to which articles published/uploaded in 2020 are assigned, and the red digits indicate the number of those articles
Descriptive statistics for SEP topics
| | | Max | Min | Average | Std. Dev |
|---|---|---|---|---|---|
| Node | Number of terms | 9483 | 1 | 237.23 | 867.33 |
| | Number of articles | 4837 | 1 | 457.53 | 704.01 |
| Edge | Weight | 0.1272 | 0.0003 | 0.0142 | 0.0162 |
Similarities of 2020 topics with topics in the pre-COVID-19 period
| No | Topic label | Similarity | Community |
|---|---|---|---|
| 1 | Viral vaccines [2020] | 0.0001 | #1 viral infection |
| 2 | Clinical assessment [2020] | 0.0005 | #3 respiratory viruses |
| 3 | Serial interval [2020] | 0.0015 | #1 viral infection |
| 4 | Global scale [2020] | 0.0019 | #1 viral infection |
| 5 | Overall prevalence [2020] | 0.0022 | #4 global health |
| 6 | Health systems [2020] | 0.0091 | #1 viral infection |
| 7 | Pregnant women [2020] | 0.0188 | #2 infectious diseases |
| 8 | Case fatality rate [2020] | 0.0200 | #7 immune response |
| 9 | Non pharmaceutical interventions [2020] | 0.0216 | #2 infectious diseases |
| 10 | Public health emergency [2020] | 0.0246 | #4 global health |
| 11 | Mathematical model [2020] | 0.0247 | #4 global health |
| 12 | Convalescent plasma [2020] | 0.0259 | #1 viral infection |
| 13 | N95 respirators [2020] | 0.0263 | #5 epidemiology |
| 14 | World Health Organization [2020] | 0.0278 | #4 global health |
| 15 | Mitigation strategies [2020] | 0.0294 | #5 epidemiology |
| 16 | Angiotensin converting enzyme 2 [2020] | 0.0305 | #1 viral infection |
| 17 | General public [2020] | 0.0306 | #3 respiratory viruses |
| 18 | Infectious disease outbreaks [2020] | 0.0348 | #2 infectious diseases |
| 19 | Antiviral activity [2020] | 0.0387 | #1 viral infection |
| 20 | Diagnostic tests [2020] | 0.0404 | #7 immune response |
| 21 | Respiratory failure [2020] | 0.0434 | #6 acute respiratory distress syndrome |
| 22 | Clinical features [2020] | 0.0459 | #5 epidemiology |
| 23 | Cross sectional study [2020] | 0.0508 | #4 global health |
| 24 | Control measures [2020] | 0.0524 | #3 respiratory viruses |
| 25 | Wuhan [2020] | 0.0653 | #5 epidemiology |
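
The table above pairs each 2020 topic with its closest pre-COVID-19 community and a similarity score. The similarity measure is not stated in this record, so the sketch below assumes cosine similarity over term-frequency vectors purely for illustration; the topic and community contents are invented examples.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_community(topic: Counter, communities: dict) -> str:
    """Return the name of the community whose term vector is most similar."""
    return max(communities, key=lambda name: cosine(topic, communities[name]))
```

Under this assumption, a score near zero means the 2020 topic shares almost no vocabulary with any earlier community, which is how the very low values at the top of the table would be read.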
Status of sample topics
| No | Label | Status | TF-IDF |
|---|---|---|---|
| 1 | Central Nervous System [2011] | Resurgent | 0.5748 |
| 2 | IFN alpha [2012] | Resurgent | 0.4717 |
| 3 | Phylogenetic analysis [2013] | Resurgent | 0.4851 |
| 4 | Respiratory symptoms [2013] | Resurgent | 0.6230 |
| 5 | Viral replication [2013] | Resurgent | 0.6211 |
| 6 | Global health [2013] | Resurgent | 0.1675 |
| 7 | Acute respiratory distress syndrome [2014] | Resurgent | 0.1721 |
| 8 | Cell culture [2015] | Resurgent | 0.2483 |
| 9 | Fever [2015] | Resurgent | 0.3979 |
| 10 | SARS CoV [2009] | Always alive | 0.5173 |
| 11 | Viral infection [2010] | Always alive | 0.8175 |
| 12 | Infectious diseases [2011] | Always alive | 0.6786 |
| 13 | Respiratory viruses [2012] | Always alive | 0.6649 |
| 14 | MERS CoV [2013] | Always alive | 0.6681 |
| 15 | Porcine epidemic diarrhea virus [2014] | Always alive | 0.4979 |
| 16 | Epidemiology [2014] | Always alive | 0.5101 |
| 17 | Infectious bronchitis virus [2015] | Always alive | 0.4938 |
| 18 | Feline infectious peritonitis [2015] | Always alive | 0.2278 |
| 19 | Immune response [2015] | Always alive | 0.5533 |
| 20 | Public health [2015] | Always alive | 0.3173 |
| 21 | Host response [2015] | Always alive | 0.1614 |
| 22 | Respiratory pathogens [2015] | Always alive | 0.2139 |
| 23 | RNA viruses [2016] | Always alive | 0.3874 |
| 24 | Viral proteins [2016] | Always alive | 0.3138 |
| 25 | Respiratory syncytial virus [2016] | Always alive | 0.4118 |
| 26 | Disease control [2016] | Always alive | 0.2762 |
| 27 | United States [2016] | Always alive | 0.2718 |
| 28 | Viral RNA [2017] | Always alive | 0.2913 |
| 29 | Fecal samples [2017] | Always alive | 0.1640 |
| 30 | Crystal structure [2017] | Always alive | 0.1541 |
| 31 | Hong Kong [2017] | Always alive | 0.1392 |
| 32 | Coronavirus spike protein [2017] | Always alive | 0.0786 |
| 33 | Endoplasmic reticulum [2017] | Always alive | 0.0888 |
| 34 | Amino acids [2017] | Always alive | 0.2088 |
| 35 | Septic shock [2017] | Always alive | 0.0276 |
| 36 | Biological properties [2017] | Always alive | 0.0805 |
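
The TF-IDF column above weights how characteristic a label is. The exact variant used is not given here, so this is a minimal sketch of the textbook form, assuming term frequency is the share of a document's terms and inverse document frequency is the natural log of total documents over documents containing the term.

```python
import math


def tf_idf(term: str, doc: list, corpus: list) -> float:
    """Textbook TF-IDF: (count / doc length) * ln(N / document frequency)."""
    if not doc:
        return 0.0
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)
    if df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)
```

A term that occurs in every document scores zero under this form, whereas a term concentrated in a few documents scores high.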
Logistic regression on the relationship of authorship and topic status in COVID-19
| Location of authors | Always alive | | | | | Resurgent | | | | | Emerging | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) | (13) | (14) | (15) |
| International collaboration | 0.262*** | | | | | −0.335** | | | | | −0.167** | | | | |
| | (0.084) | | | | | (0.167) | | | | | (0.084) | | | | |
| Chinese authorship | | −0.424*** | | | −0.368*** | | 0.153 | | | 0.148 | | 0.366*** | | | 0.307*** |
| | | (0.079) | | | (0.092) | | (0.140) | | | (0.162) | | (0.078) | | | (0.090) |
| US authorship | | | 0.402*** | | 0.280*** | | | −0.181 | | −0.098 | | | −0.345*** | | −0.252** |
| | | | (0.086) | | (0.103) | | | (0.167) | | (0.207) | | | (0.086) | | (0.104) |
| China-US collaboration | | | | 0.061 | 0.075 | | | | −0.153 | −0.160 | | | | −0.020 | −0.015 |
| | | | | (0.163) | (0.201) | | | | (0.318) | (0.387) | | | | (0.161) | (0.198) |
| Mean of the dependent variable | 0.173 | 0.173 | 0.173 | 0.173 | 0.173 | 0.519 | 0.519 | 0.519 | 0.519 | 0.519 | 0.318 | 0.318 | 0.318 | 0.318 | 0.318 |
| Obs | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 |
| Pseudo R2 | 0.026 | 0.025 | 0.025 | 0.024 | 0.025 | 0.026 | 0.025 | 0.025 | 0.024 | 0.025 | 0.034 | 0.039 | 0.037 | 0.033 | 0.041 |
Estimates stem from logistic regression models with dependent variables being dummy variables indicating status of topics (always alive, resurgent, emerging). Publication type and a dummy for whether the article is a preprint are controlled in all models
Robust standard errors in parentheses
***p < 0.01, **p < 0.05, *p < 0.1
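To read the coefficients in the table above, it can help to convert them to odds ratios and check the significance stars against the reported standard errors. The sketch below assumes the reported values are log-odds coefficients (not marginal effects) and that the stars come from a two-sided normal test; both are assumptions, and the helper names are illustrative.

```python
import math


def odds_ratio(coef: float) -> float:
    """Odds ratio implied by a logistic regression log-odds coefficient."""
    return math.exp(coef)


def two_sided_p(coef: float, std_err: float) -> float:
    """Two-sided p-value from a normal approximation to coef / std_err."""
    z = abs(coef / std_err)
    return math.erfc(z / math.sqrt(2))


# Values taken from the table: Chinese authorship 0.366 (0.078) and
# international collaboration -0.167 (0.084).
chinese_or = odds_ratio(0.366)
chinese_p = two_sided_p(0.366, 0.078)
intl_p = two_sided_p(-0.167, 0.084)
```

Under these assumptions, a coefficient of 0.366 corresponds to odds roughly 1.44 times higher, and the two p-values fall below 0.01 and between 0.01 and 0.05 respectively, matching the `***` and `**` markers.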
Fig. 6. Evolutionary pathways of the coronavirus research from 2009 to 2020 resized to show China’s research emphases. Note: The size of nodes indicates the percentage of Chinese articles in global articles
Logistic regression on the relationship between researcher affiliations and topic status in COVID-19
| Researcher type | Always alive | | | | Resurgent | | | | Emerging | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| Academic | −0.047 | | | 0.010 | 0.037 | | | −0.075 | 0.035 | | | 0.010 |
| | (0.134) | | | (0.138) | (0.242) | | | (0.248) | (0.133) | | | (0.136) |
| Industry | | 0.538 | | 0.562 | | −0.487 | | −0.540 | | −0.416 | | −0.423 |
| | | (0.357) | | (0.358) | | (0.731) | | (0.721) | | (0.367) | | (0.367) |
| Government | | | 0.285* | 0.297* | | | −0.676* | −0.705* | | | −0.111 | −0.116 |
| | | | (0.164) | (0.168) | | | (0.370) | (0.382) | | | (0.165) | (0.169) |
| Mean of the dependent variable | 0.384 | 0.384 | 0.384 | 0.384 | 0.074 | 0.074 | 0.074 | 0.074 | 0.541 | 0.541 | 0.541 | 0.541 |
| Obs | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 | 2949 |
| Pseudo R2 | 0.021 | 0.021 | 0.021 | 0.022 | 0.024 | 0.024 | 0.026 | 0.027 | 0.033 | 0.033 | 0.033 | 0.033 |
Estimates stem from logistic regression models with dependent variables being dummy variables indicating status of topics (always alive, resurgent, emerging). Publication type and a dummy for whether the article is a preprint are controlled in all models
Robust standard errors in parentheses
***p < 0.01, **p < 0.05, *p < 0.1
Fig. 7. Distribution of topic status in each community based on COVID-19 articles. Note: Among the 2020 articles in each topic community, the percentages of articles associated with always alive, emerging, and resurgent topics are calculated