
Epidemiology and genetic diversity of SARS-CoV-2 lineages circulating in Africa.

Olayinka Sunday Okoh¹, Nicholas Israel Nii-Trebi², Abdulrokeeb Jakkari³, Tosin Titus Olaniran^4,5, Tosin Yetunde Senbadejo⁶, Anna Aba Kafintu-Kwashie⁷, Emmanuel Oluwatobi Dairo^5,8, Tajudeen Oladunni Ganiyu⁶, Ifiokakaninyene Ekpo Akaninyene^4,5, Louis Odinakaose Ezediuno⁹, Idowu Jesulayomi Adeosun^10,11, Michael Asebake Ockiya¹², Esther Moradeyo Jimah^5,13, David J Spiro¹⁴, Elijah Kolawole Oladipo^5,10, Nídia S Trovão¹⁴.

Abstract

There is a dearth of information on COVID-19 disease dynamics in Africa. To fill this gap, we investigated the epidemiology and genetic diversity of SARS-CoV-2 lineages circulating in the continent. We retrieved 5229 complete genomes collected in 33 African countries from the GISAID database. We investigated the circulating diversity, reconstructed the viral evolutionary divergence and history, and studied the case and death trends in the continent. Almost a fifth (144/782, 18.4%) of Pango lineages found worldwide circulated in Africa, with five different lineages dominating over time. Phylogenetic analysis revealed that African viruses cluster more closely with those from Europe. We also identified two motifs that could function as integrin-binding sites and N-glycosylation domains. These results shed light on the epidemiological and evolutionary dynamics of the circulating viral diversity in Africa. They also emphasize the need to expand surveillance efforts in Africa to help inform and implement better public health measures.


Keywords: Genomics; Phylogenetics; Virology

Year: 2022 PMID: 35156006 PMCID: PMC8817759 DOI: 10.1016/j.isci.2022.103880

Source DB: PubMed Journal: iScience ISSN: 2589-0042

Introduction

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that was first reported in Wuhan, China in December 2019, emerged as a novel virus causing a cluster of unusual pneumonia cases. Its outbreak soon became a worldwide pandemic that resulted in a global public health emergency (Guo et al., 2020). Coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 has affected all seven continents, with Africa currently being the least stricken by the pandemic (Lone and Ahmad, 2020). The first case confirmed in Africa was in Egypt on February 14, 2020, followed by Algeria on February 25, 2020. The first case reported in sub-Saharan Africa was confirmed in Nigeria on February 27, 2020 (NCDC, 2020). The first cases in other African countries were recorded in March 2020 (African Centres for Disease Control, 2020), including in Ghana on March 12, 2020. By September 6, 2021, about 7.9 million confirmed cases and more than 199 thousand deaths were reported by Africa CDC, as part of the more than 219 million confirmed cases and more than 4.5 million deaths reported globally by the World Health Organization (WHO). SARS-CoV-2 is the seventh coronavirus known to infect humans (Corman et al., 2020), and it is the third novel coronavirus known to have caused large-scale outbreaks in the 21st century. The first was the SARS-CoV in 2003 that also emerged in China (Rosling and Rosling, 2003; Alanagreh et al., 2020) followed by the Middle East Respiratory Syndrome Coronavirus (MERS-CoV) (Alanagreh et al., 2020; Cui et al., 2019; Zaki et al., 2012) that emerged in Saudi Arabia in 2012. Unlike SARS-CoV and MERS-CoV that cause severe disease in humans (Kumar et al., 2020), SARS-CoV-2 is more infectious but appears to have a lower case fatality rate (CFR) (Zhang and Holmes, 2020). The genome of SARS-CoV-2 is a positive-sense single-stranded RNA (+ssRNA) of approximately 29.9 kilobases (29,891 nucleotides) encoding 9,860 amino acids (Sapkota, 2020). 
The virus is classified among the Betacoronaviruses (β-CoVs) group under the Coronavirinae subfamily of the Coronaviridae family. The β-CoVs genus is known to infect humans, bats, and other wild animals (Chen et al., 2020). The genome contains two large ORFs that are translated into polyproteins, which are processed post-translationally into 16 non-structural proteins. It also encodes four structural proteins, namely envelope (E), spike (S), membrane (M), and nucleocapsid (N), and at least nine accessory proteins, some of which are unique to SARS-CoV-2 and others conserved among coronaviruses (Wu et al., 2020). The 16 non-structural proteins play a crucial role in viral RNA synthesis and immune evasion (Snijder et al., 2020; Shereen et al., 2020). Of the four structural proteins, the S-protein, which is made up of S1 and S2 subunits, is known to play a unique role in SARS-CoV-2 replication. The protein functions during host cell attachment and entry by primarily mediating binding to the extracellular domains of its receptor, the angiotensin-converting enzyme 2 (ACE2), a transmembrane protein that is also used by SARS-CoV for cell entry (Wan et al., 2020). SARS-CoV-2 binding to ACE2 and fusion with the cellular membrane are facilitated by the S1 receptor-binding domain (RBD) and the S2 subunit, respectively. This unique function makes the spike glycoprotein a target for the development of antibodies, therapeutics, and vaccines. Therefore, the mutational patterns in the S-protein and their circulation trends warrant surveillance for effective interventions against the ongoing COVID-19 pandemic. Protein motifs are small regions of amino acid sequences that facilitate the function of the protein and protein-protein interactions (PPIs). They mediate interactions with cellular proteins and molecular processes within the host cells (Sobhy, 2016). Infection by the virus involves a large number of PPIs between the infective virus and the target host cell (Alguwaizani et al., 2018).
Motifs are nucleotide or amino acid sequences that are significant in the genome structure formation, function, and conserved regions in protein molecules. The conserved amino acid sequence may be responsible for protein-substrate binding, determining the active domain for enzymatic cleavage, binding to transcription factors, and the plasticity of protein, and thus repeated motifs are an essential evolutionary mechanism. Hence, identifying motifs and their repeated patterns is important in determining the binding domain of SARS-CoV-2 and to elucidate the evolutionary relationships among sequences (Luo and Nijveen, 2014). Hence, studying and understanding the functional motifs and repeat patterns of SARS-CoV-2 may aid in the prediction of viral protein characteristics, virus-host protein interactions or other putative roles. RNA viruses, including SARS-CoV-2, commonly generate and accumulate mutations in their genomes during viral replication. In humans, immunological pressure facilitates the accumulation and fixation of mutations as the epidemic persists. Mutations in the S protein constitute a major cause of public health concern, as they have the potential to alter the viral tropism and thereby potentially confer adaptation to new tissues and hosts, influence transmissibility and clinical outcomes, and/or confer resistance to neutralizing antibodies and therapeutics (Sui et al., 2008; Wibmer et al., 2021). In February 2020, a non-synonymous mutation was detected in the S-protein of SARS-CoV-2 viruses sampled from individuals in China and Europe. The mutation caused an amino acid change from aspartic acid to glycine at position 614 (D614G). Experimental and clinical findings associated the G614 variant with a selective advantage over the D614 virus, resulting in higher viral loads and increased infectivity (Korber et al., 2020), but not necessarily increasing the mortality rate (Plante et al., 2021). 
Between late summer and early autumn of 2020, several variants of the SARS-CoV-2 virus emerged. Variant 20B/501Y.V1 202012/01 classified as Pango lineage B.1.1.7, was identified in the United Kingdom (UK) and designated the Alpha Variant of Concern (VOC) (Volz et al., 2021). This variant emerged with an unusually large number of mutations and has since been detected in numerous countries around the globe, including several in Africa. Almost simultaneously, the independent emergence of the VOC 20C/501Y.V2 belonging to Pango lineage B.1.351, or Beta VOC was detected in South Africa (Tegally et al., 2021). Cases attributed to this variant have since been detected outside of South Africa (Volz et al., 2021), disseminating northwards in Africa. Early in 2021, the VOC 20J/501Y.V3 classified as Pango lineage P.1, or Gamma VOC was first identified in Brazil (Faria et al., 2021) and rapidly spread throughout the Americas, Europe, and Oceania. Despite the independent emergence of the 20B/501Y.V1, 20C/501Y.V2 and 20J/501Y.V3, they share a few common mutations. Because most of the current SARS-CoV-2 immunotherapeutic strategies target the RBD of the S-protein to prevent the binding of SARS-CoV-2 with ACE2 (Chan et al., 2020), alterations in the S-protein sequence could potentially affect the efficacy of immune-based therapeutic agents (Wibmer et al., 2021), making surveillance of spike mutations imperative to aid in the development of effective pharmaceutical interventions. As of January 7, 2021, nearly one year since the first case was reported in Africa, a total of 5229 SARS-CoV-2 complete genome sequences from 33 African countries had been deposited in public sequence databases including the Global Initiative on Sharing All Influenza Data (GISAID) (Elbe and Buckland-Merrett, 2017) (Shu and Mccauley, 2017), which can be studied to better understand the ongoing molecular epidemiology of SARS-CoV-2 in the continent (Oladipo et al., 2020). 
Initial genome sequence analysis suggested the importation of multiple SARS-CoV-2 strains, mainly of European origin and partly from China (Tessema et al., 2020). Knowledge of the evolutionary dynamics underlying the viral genome variation allows tracing the ongoing outbreak and informs the development and deployment of diagnostic tests (Wang et al., 2020a), as well as effective antiviral and vaccination strategies (Avise, 2000). For example, a recent genome-wide association study on SARS-CoV-2 genomes found variations at the genomic position 11,083 within the coding region of non-structural protein six to be associated with COVID-19 severity. The study showed that the G11083 variant was more commonly found in symptomatic cases, while the T11083 variant appeared to be more frequently associated with asymptomatic infections (Aiewsakun et al., 2020). Toyoshima and colleagues (2020) also performed a comprehensive investigation of 12,343 SARS-CoV-2 genome sequences isolated from individuals in six geographic areas and found that ORF1ab L4715 and S protein G614 variants showed significant positive correlations with fatality rates, which supports the finding suggesting that SARS-CoV-2 mutations might affect the susceptibility to SARS-CoV-2 infection or severity of COVID-19 (Toyoshima et al., 2020). It is to be noted that most sequences from Africa included in the genome variation analysis described above were mostly from North African countries including Egypt, but not those of sub-Saharan Africa. The epidemiology of SARS-CoV-2 in the African continent calls for a comprehensive study of the genomic and evolutionary patterns of this virus. Comparative analysis of viral genome sequences represents a very useful approach to provide insight into pathogen emergence and evolution. 
This study, therefore, pursues an in-depth investigation into the epidemiology, evolution, and molecular motifs of SARS-CoV-2 in Africa to shed light on the pandemic dynamics, and aid in informing the development and implementation of control measures in the African continent.

Results

Epidemiological trends of SARS-CoV-2 in Africa

Focusing only on sequence data will not give a true representation of the disease dynamics of SARS-CoV-2 in Africa, as only a fraction of the cases in Africa are sequenced. Therefore, we analyzed reported COVID-19 cases using data from OurWorldInData.org downloaded on January 8, 2021. As presented in Figure S1, North America has the highest number of COVID-19 cases (n = 25 million), followed by Europe (n = 19 million) and Asia (n = 18 million). Oceania has the least number of cases reported of the six continents (n = 20,575). The absolute number of cases in Africa (n = 2 million) and South America (n = 6 million) are a very small fraction of the cases in Asia, North America, and Europe. The global average of COVID-19 cases per 100,000 population (hereafter referred to as cases/pop) is 895, represented by the red line in Figure 1. We observed that the average number of COVID-19 cases per 100,000 persons in Oceania, Africa, and Asia are all below the global average. Considering the absolute cases of COVID-19, Asia is more affected than South America. However, taking population into consideration reveals that South America (1,287 cases/pop) has a higher burden of COVID-19 cases than Asia (390 cases/pop).
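The population adjustment used above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the region names and counts are hypothetical, and `per_100k` is a helper name introduced here. It shows how a region with more absolute cases can still carry the lower burden per 100,000 population, and that the pooled rate (total cases over total population, as defined for the red line in Figure 1) differs from the mean of the regional rates.

```python
def per_100k(count: int, population: int) -> float:
    """Return the number of events per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100_000


# Hypothetical regions, chosen only to illustrate the arithmetic.
regions = {
    "Region A": {"cases": 50_000, "population": 10_000_000},
    "Region B": {"cases": 30_000, "population": 2_000_000},
}

# Per-region burden: Region A has more cases but a lower rate.
rates = {name: per_100k(r["cases"], r["population"]) for name, r in regions.items()}

# Pooled rate: total cases divided by total population, times 100,000.
total_cases = sum(r["cases"] for r in regions.values())
total_population = sum(r["population"] for r in regions.values())
pooled_rate = per_100k(total_cases, total_population)

# Unweighted mean of the regional rates, shown only for contrast.
mean_of_rates = sum(rates.values()) / len(rates)

for name, rate in rates.items():
    print(f"{name}: {rate:.0f} cases per 100,000")
print(f"Pooled: {pooled_rate:.0f}; mean of regional rates: {mean_of_rates:.0f}")
```

The pooled figure weights each region by its population, which is why it sits closer to the rate of the larger region.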

Figure 1

Number of COVID-19 reported cases and deaths per 100,000 population in the different continents

The red line represents the average absolute number of COVID-19 cases worldwide per 100,000 population in the world (i.e., (global COVID-19 cases/world population) x 100,000). The orange points represent the average number of deaths per 100,000 population in the different continents with its scale on the left y axis. The orange line represents the number of deaths globally per 100,000 world population.

The number of deaths per 100,000 population (hereafter referred to as deaths/pop) followed the same trend as the number of cases per 100,000. Deaths per 100,000 in Oceania (2 deaths/pop), Africa (4 deaths/pop), and Asia (6 deaths/pop) are far below the global value (19 deaths/pop), while they are above it in South America (39 deaths/pop), Europe (55 deaths/pop), and North America (91 deaths/pop). This may reflect a positive correlation between the number of COVID-19 tests conducted and the number of COVID-19 cases and deaths reported. In Africa, an analysis of COVID-19 cases per 100,000 population (Figure 2) showed that, while the continent had comparatively low case numbers, individual nations had high COVID-19 burdens. South Africa has been the most seriously affected with 1,260 cases/pop; however, Tunisia (678 cases/pop) and Morocco (791 cases/pop) in North Africa also appear to be greatly impacted. Similar patterns were observed for Libya in North Africa (1,076 cases/pop), Namibia (528 cases/pop) and Botswana (347 cases/pop) in Southern Africa, and Gabon in Central Africa (404 cases/pop). Other African countries were more mildly affected. Overall, the number of cases/pop demonstrates that northern and southern countries, together with Gabon in Central Africa, were the most affected in Africa.

Figure 2

COVID-19 cases reported in African countries

Absolute number of cases (left). Number of cases per 100,000 population (right).

To get a deeper insight into the impact of SARS-CoV-2 on African countries, we analyzed the reported deaths as represented in Figure 3. Considering the absolute number of deaths, South Africa remained the worst hit on the continent with 20,241 deaths, distantly followed by Egypt (n = 6,453), and other North African countries. While Ethiopia, Kenya, Nigeria, Sudan, and Libya have reported more than 1,000 deaths each, about 20 African countries have recorded fewer than 500 deaths in total. Considering the number of deaths recorded per 1,000 reported cases (deaths/cases) (Figure 3 - right), the analysis showed Western Sahara as the worst affected with 100 deaths/cases, followed by Sudan (76 deaths/cases), Chad in the West (63 deaths/cases), and Egypt in the North (58 deaths/cases).

Figure 3

Reported deaths from COVID-19 in Africa

Absolute number of total deaths (left) per country. Number of deaths per 1,000 reported cases (right).

South Africa and Morocco recorded the highest absolute numbers of COVID-19 cases in Africa (Figure 2 - left), though this might reflect the large number of tests (5,110,384 and 3,646,330 for South Africa and Morocco, respectively) conducted in these two countries (Figure 4). The third highest number of tests was conducted in Ethiopia (n = 1,562,008), albeit approximately 60% fewer than the number of tests conducted in Morocco. Taking population into consideration (number of tests conducted per 100,000 population; tests/pop), the islands of Mauritius and Cape Verde performed the highest number of tests (22,389 tests/pop and 18,250 tests/pop, respectively), followed by Botswana (13,955 tests/pop) and Gabon (11,860 tests/pop), which carried out more tests, relative to population, than South Africa (8,576 tests/pop) and Morocco (9,835 tests/pop).

Figure 4

Number of SARS-CoV-2 tests conducted in African countries

Absolute number of tests (left). Number of tests per 100,000 population (right).

We analyzed the number of positive tests per 1,000 COVID-19 tests (pos/test) conducted (Figure 5). Mayotte was estimated to have the highest positivity rate (271 pos/test), followed by Guinea (250 pos/test), South Sudan (249 pos/test), and Tunisia (200 pos/test). Other African countries with more than 100 positive cases per 1,000 COVID-19 tests include Libya (198 pos/test), Madagascar (186 pos/test), Gambia (166 pos/test), Cameroon (152 pos/test), Central African Republic (150 pos/test), South Africa (147 pos/test), Reunion (141 pos/test), Sao Tome and Principe (136 pos/test), Swaziland (110 pos/test), Egypt (110 pos/test), and Ivory Coast (102 pos/test). The lowest positivity rates were observed in countries such as Benin (10 pos/test), Rwanda (9 pos/test), and Mauritius (2 pos/test).
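The two ratio metrics used in this section, positives per 1,000 tests and deaths per 1,000 reported cases, share the same arithmetic. The sketch below is illustrative only: the records are hypothetical, the helper `per_thousand` is a name introduced here, and it assumes that reported cases equal positive tests. A missing or zero denominator yields `None`, mirroring the gray "data not available" shading in the maps.

```python
from typing import Optional


def per_thousand(events: int, total: int) -> Optional[float]:
    """Events per 1,000 units of `total`; None if the denominator is missing or zero."""
    if not total:
        return None
    return events / total * 1_000


# Hypothetical country records, for illustration only.
records = [
    {"country": "X", "tests": 40_000, "positives": 8_000, "deaths": 160},
    {"country": "Y", "tests": 500_000, "positives": 5_000, "deaths": 50},
    {"country": "Z", "tests": 0, "positives": 0, "deaths": 0},  # no data reported
]

for rec in records:
    # Positivity rate: positive tests per 1,000 tests conducted (pos/test).
    rec["pos_per_1000_tests"] = per_thousand(rec["positives"], rec["tests"])
    # Deaths per 1,000 reported cases (deaths/cases).
    rec["deaths_per_1000_cases"] = per_thousand(rec["deaths"], rec["positives"])

for rec in records:
    print(rec["country"], rec["pos_per_1000_tests"], rec["deaths_per_1000_cases"])
```

Country X tests far less than country Y yet shows the higher positivity, which is the pattern that makes positivity a useful complement to absolute case counts where testing is limited.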

Figure 5

COVID-19 positivity rate in Africa

Number of positive patients out of every 1000 COVID-19 tests. Gray shades represent countries for which data is not available.


Evolutionary history

Despite the lower number of genetic sequences available compared to epidemiological data, the former allow us to gain insight into the viral diversity circulating in the continent as well as the evolutionary relationships and transmission dynamics of viruses in different regions. Worldwide, we observed 782 Pango lineages, nine GISAID clades and 10 Nextstrain clades. Europe submitted about 65% (n = 208,538) of the SARS-CoV-2 sequences in GISAID, the majority being from the United Kingdom (46%, n = 147,137). Africa submitted 2% of the SARS-CoV-2 sequences (n = 5,229) with most of the sequences (55%, n = 2,882) coming from South Africa. Democratic Republic of the Congo and Gambia followed with 7% (n = 360) each, then Kenya with 290 sequences (6%) and Nigeria with 4% (n = 223) (Figure S2). Even though in absolute terms, South Africa submitted most of the sequences (Figure 6 - left), a closer look at the number of sequences submitted (Figure 6 - right) revealed that the Democratic Republic of the Congo was the largest contributor of genomic data per 1,000 reported COVID-19 cases in Africa.
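The country shares quoted above follow directly from the sequence counts, and the normalization used in the right panel of Figure 6 is a simple ratio. The sketch below recomputes the shares from the counts given in the text; the function names are introduced here for illustration, and the case counts passed to `sequences_per_1000_cases` are placeholders, since per-country case totals are not given in this section.

```python
TOTAL_AFRICA = 5_229  # African genomes retrieved from GISAID, as stated in the text

# Sequence counts per country, as stated in the text.
SEQUENCES = {
    "South Africa": 2_882,
    "Democratic Republic of the Congo": 360,
    "Gambia": 360,
    "Kenya": 290,
    "Nigeria": 223,
}


def share_percent(part: int, whole: int) -> float:
    """Percentage that `part` represents of `whole`."""
    return part / whole * 100


def sequences_per_1000_cases(sequences: int, cases: int) -> float:
    """Genomes available per 1,000 reported cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return sequences / cases * 1_000


shares = {c: round(share_percent(n, TOTAL_AFRICA), 1) for c, n in SEQUENCES.items()}
for country, pct in shares.items():
    print(f"{country}: {pct}%")

# Placeholder inputs: many sequences but many cases, versus fewer of both.
large_outbreak = sequences_per_1000_cases(3_000, 1_000_000)
small_outbreak = sequences_per_1000_cases(300, 15_000)
print(large_outbreak, small_outbreak)
```

With these placeholder inputs the country submitting ten times fewer genomes still sequences a larger fraction of its cases, which is the kind of reversal described for the two panels of Figure 6.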

Figure 6

Sequences from African countries submitted to GISAID

Absolute number of sequences from Africa submitted to GISAID (left). Sequences available in GISAID per 1,000 SARS-CoV-2 cases in Africa (right). Gray shade represents countries for which no sequences were available (n = 27/57 (47.4%) countries).

Using Nextstrain clade nomenclature (Figure S3), we observed that 20A.EU1 was the dominant circulating clade in Europe, followed by 20B and 20A. Clades 20C and 20A were predominant in North America, while 20B was predominant in Oceania. In Africa (Table S3), 20B and 20A were the dominant circulating clades, accounting for about 82% of all sequences available for the continent. For simplicity, we present the most prevalent (top 1%) of the 782 Pango lineages identified worldwide in Figure S2. Europe had the most diverse lineages, with B.1.177 being the most prevalent, followed by B.1.1 and B.1. In Asia, B.1.1 and B.1.1.284 were the dominant lineages, while D.2 was the most prevalent in Oceania. In North America, B.1, B.1.2, and B.1.1 were the most prevalent. In contrast, the top 1% Pango lineages (Figure S4) circulating elsewhere in the world were not observed in sequences from Africa and South America. The top 10% of the lineages circulating in Africa are shown in Figure S5. Pango lineage B.1.5 was the dominant lineage circulating in Africa, representing 11.3% (n = 591) of all diversity. Other prevalent lineages in Africa are B.1 (n = 546, 10.4%), B.1.1 (n = 518, 9.91%), B.1.1.206 (n = 481, 9.20%), B.1.351 (n = 349, 7%), and C.1 (n = 271, 5%). The first SARS-CoV-2 sequence collected from Africa on March 1, 2020 was found to belong to the Pango lineage B.1.5 (Figure 7), while the second, reported on March 8, 2020, belongs to the B.1 lineage.
Pango lineages B.1, B.1.1, and B.1.5 circulated in Africa from March 1, 2020 through June 7, 2020; and were replaced by Pango lineage B.1.1.206 that was first reported in the continent on June 8, 2020 and dominated between June 8 and November 2, 2020. The Pango lineage B.1.351 was first reported in South Africa on October 10, 2020 and it became the most prevalent lineage on November 3, 2020, likely as a consequence of the increased sequencing efforts in the country.
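One simple way to trace lineage replacement of this kind is to count lineage assignments per time window and take the most frequent one. The sketch below is an assumption about how such a tabulation could be done, not the authors' method: the records are toy data (only the lineage names are taken from the text), and calendar months are used as the window.

```python
from collections import Counter, defaultdict
from datetime import date

# Toy (collection_date, Pango lineage) records, for illustration only.
records = [
    (date(2020, 3, 1), "B.1.5"),
    (date(2020, 3, 8), "B.1"),
    (date(2020, 3, 20), "B.1.5"),
    (date(2020, 6, 8), "B.1.1.206"),
    (date(2020, 6, 15), "B.1.1.206"),
    (date(2020, 6, 20), "B.1"),
    (date(2020, 11, 3), "B.1.351"),
]

# Count lineages within each (year, month) window.
counts_by_month = defaultdict(Counter)
for collected, lineage in records:
    counts_by_month[(collected.year, collected.month)][lineage] += 1

# The dominant lineage is the most frequent one in each window.
dominant = {
    month: counts.most_common(1)[0][0]
    for month, counts in sorted(counts_by_month.items())
}

for (year, month), lineage in dominant.items():
    print(f"{year}-{month:02d}: {lineage}")
```

Because the counts depend on which samples were sequenced, a shift in the dominant lineage can reflect a change in sequencing effort as well as a change in circulation, as noted above for B.1.351.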

Figure 7

The incidence of the top five Pango lineages circulating in Africa between March 1, 2020 and January 7, 2021.

Further, we analyzed the phylogenetic relationships among African sequences and those from other parts of the world (Figure 8). The phylogenetic tree demonstrates that African viruses cluster closely with viruses from all continents, but mostly with those from Europe, a source that generated some of the large outbreaks detected in the phylogenetic tree. Of note are the viruses sampled from Uganda in March 2020, which appear in large clusters of European viruses. Interestingly, we also observed a cluster of a Ugandan virus with another from Oceania, both with identical collection dates. Although Europe appeared to seed many African outbreaks, African viruses also frequently clustered with those from Asia, Oceania, and especially South America, which can be seen associated with viruses from South African, Congolese, and Gambian clusters. A similar phenomenon was observed with North America, which had viruses clustering with those from Morocco, South Africa, Egypt, Kenya, and Nigeria. It was also observed that African clusters mostly contained sequences from the same or adjacent countries, which is evident in South African clusters (Data S1 and Data S2). A number of studies have focused on the South African outbreaks in depth (Giandhari et al., 2021; Tegally et al., 2021, Tegally et al., 2021). The current study identified outbreaks (clusters with more than 15 sequences) in Egypt, Democratic Republic of the Congo, and Gambia. Some of these clustered closely with viruses from other African countries, as observed for instance with some of the Gambian outbreaks, which were related to West African neighbors, such as Senegal (Data S2). This phenomenon was also observed for viruses from Kenya and neighboring Uganda.

Figure 8

Maximum likelihood tree colored by continent

Phylogenetic tree inferred for a dataset with genetic sequences from all continents.

We looked specifically at the viral genetic diversity within Africa, as compared to the genetic diversity observed in other continents (Figure 9 and Data S3). Inspection of the continent-specific genetic distance distributions with a Wilcoxon signed-rank test revealed that the viral diversity circulating in Africa is significantly higher (p value < 2.2e-16) than that estimated in Oceania and South America, but significantly lower than that in Asia, Europe, and North America. These findings indicate that the African epidemic is closest to that of South America and farthest from those of Asia and North America (Figure S6).
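The distributions compared above are built from pairwise genetic distances within each continent. The sketch below shows one simple way to obtain such distances, assuming an uncorrected p-distance (the proportion of differing sites) on aligned sequences; the distance model actually used is not stated in this section, and the sequences and group names are toy data.

```python
from itertools import combinations
from statistics import mean


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        differing += x != y
    return differing / compared if compared else 0.0


def pairwise_distances(seqs):
    """All pairwise p-distances within one group of aligned sequences."""
    return [p_distance(a, b) for a, b in combinations(seqs, 2)]


# Toy aligned sequences, for illustration only.
groups = {
    "Group 1": ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC"],
    "Group 2": ["ACGTACGTAC", "ACCTACGAAC", "TCGTACGTAN"],
}

mean_distance = {name: mean(pairwise_distances(seqs)) for name, seqs in groups.items()}
for name, d in mean_distance.items():
    print(f"{name}: mean pairwise distance = {d:.4f}")
```

The per-group lists returned by `pairwise_distances` are what a violin plot or a rank-based test would take as input; note that the number of pairs grows quadratically with the number of sequences.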

Figure 9

Evolutionary divergence of SARS-CoV-2 across continents

Violin plots represent the distribution of pairwise genetic distances between all sequences for isolates in each continent. Vertical lines depict the mean pairwise genetic distance between all samples in each continent.

We also investigated the genetic diversity across countries in the African continent (Figure S7). The lowest viral diversities were observed in Madagascar, Zambia, and Algeria, while the highest viral diversity circulated in Nigeria, Tunisia, and Sierra Leone. Of note, distinct viral populations were estimated among countries in North Africa (namely in Tunisia), West African countries (including Gambia, Ghana, Mali, Nigeria, Senegal, and Sierra Leone), and South Africa. Viruses collected in West Africa were also genetically distant from those in South Africa. Generally, with the exception of countries with overall lower within-country diversities, substantial diversity was observed among African countries.

Identification of repeat patterns and motifs

GLAM2 identifies recurring motifs while allowing for deletions and insertions, and their presence in functional proteins. Using the Tomtom motif comparison tool, we identified the motifs shown in Table 1, as well as their functional class when compared with the motif database (Eukaryotic Linear Motif (ELM) resource - http://elm.eu.org/elms/LIG_IBS_1.html). The p-value is the probability that a random motif of the same width as the target would align to the query with a match score as good as or better than that of the target. The e-value is the expected number of false positives among the matches; a threshold of ten or less was applied.

Table 1

Repeat patterns and motifs in SARS-CoV-2 genomes from Africa

ID	Motif	Accession number	Length	Functional site class	p-value	e-value
1	ILRKGGRTIAFGGCVFSYVGCHNKCAYWVPRASANIGCNHTGVVGEGSEG	ELM: ELME000129 (LIG_IBS_1)	50	Integrin binding sites	3.43E-04	5.63E-02
1	ILRKGGRTIAFGGCVFSYVGCHNKCAYWVPRASANIGCNHTGVVGEGSEG	ELM: ELME000129 (LIG_IBS_1)	50	N-glycosylation site	4.70E-03	7.71E-01
2	ILRKGGRTIAFGGCVFSYVGCHNKCAYWVPRASANIGCNHTGVVGEGSEG	ELM: ELME000316 (LIG_Integrin_isoDGR_1)	50	Integrin binding sites	2.68E-02	2.68E-02

The motifs were matched with the study dataset but were not found in all isolates (data not shown). We observed a position shift in the motifs across the isolates, but it was not clear if the shifts altered the functionality of the motifs. This suggests that the motifs are present but at different positions in the genome, probably due to deletions and substitutions along the genome of African SARS-CoV-2 isolates. Two of the motifs revealed by the analysis were found in the ORF1ab gene (locus GU280_gp01) and were identified as integrin binding sites occurring mainly at positions 396–445 (ID 1) and 3,361–3,409 (ID 2). Our analysis also uncovered a motif that could function as an N-glycosylation site, mostly in positions 396–445 (ID 1) (Table 1).
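The relationship between the p-values and e-values in Table 1 can be checked with a short calculation. The sketch assumes the common convention that the e-value equals the p-value multiplied by the number of target motifs searched; this convention is an assumption here, not something stated in the text, and `implied_targets` is a helper name introduced for illustration.

```python
# (label, p-value, e-value) as listed in Table 1.
rows = [
    ("ID 1, integrin binding site", 3.43e-04, 5.63e-02),
    ("ID 1, N-glycosylation site", 4.70e-03, 7.71e-01),
    ("ID 2, integrin binding site", 2.68e-02, 2.68e-02),
]


def implied_targets(p_value: float, e_value: float) -> float:
    """Number of targets implied by e-value = p-value * number of targets."""
    if p_value <= 0:
        raise ValueError("p-value must be positive")
    return e_value / p_value


implied = {label: implied_targets(p, e) for label, p, e in rows}
for label, n in implied.items():
    print(f"{label}: implied number of targets = {n:.1f}")
```

Under this assumption the two ID 1 rows imply the same number of targets, while the ID 2 row, whose p-value and e-value are identical, implies a single target. Whether that reflects a different search setup or how the values were reported cannot be determined from the text.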

Discussion

Ongoing research on SARS-CoV-2 classical and genomic epidemiology in Africa is crucial for monitoring the circulating genetic diversity of the virus, its clinical presentation, and epidemiological profiles, as well as estimating the magnitude of the pandemic, and informing the development and implementation of effective control measures in the African continent. Mutations in the viral genome may also raise concerns for reliable and effective diagnostic surveillance and for monitoring of SARS-CoV-2 transmission dynamics (Galloway et al., 2021). We assessed the impact of the COVID-19 pandemic in Africa by evaluating the trends of absolute number of cases and deaths, but given the limited testing on the continent, we also relied on the proportion of the population that was screened for COVID-19 and the proportion that tested positive (positivity rate), which were relatively elevated for Mayotte, Guinea, South Sudan, and Tunisia. Compared to other continents, Africa appears to be relatively spared in terms of case fatality rate. Nonetheless, Egypt, Sudan, Chad, and Niger, all of which share borders, were found to have among the highest numbers of COVID-19-related deaths per reported case, and thus further investigation is necessary to uncover the factors that led to this public health burden. We estimated that the impact of SARS-CoV-2 in Africa has been below the global average, both in terms of cases and mortality. However, this is based on the available information associated with the reported metrics on cases, deaths, and number of tests conducted, which appear to be underestimated in Africa. The younger African population might also contribute to keeping the number of severe cases low compared to older populations elsewhere (Lee et al., 2020). The under-reporting calls for policy directions in Africa to be tailored toward expanding screening and improving implementation of measures that curb community spread.
It has also been hypothesized that the apparently lower impact of the disease in Africa might be partly due to the heavy use of chloroquine, and its derivative hydroxychloroquine, to prevent or treat malaria, as well as autoimmune conditions and other diseases (Tönnesmann et al., 2013; Ben-Zvi et al., 2012). These antimalarials have been shown, in vitro and in vivo, to inhibit the pH-dependent steps in the replication of several viruses, including SARS-CoV-2 (Yao et al., 2020; Colson et al., 2020; Gao et al., 2020; Liu et al., 2020, Wang et al., 2020b). Although both hydroxychloroquine and chloroquine, either alone or in combination with azithromycin, are commonly used in several African countries for the treatment of COVID-19 (Abena et al., 2020), the use of these drugs has not been recommended by international health organizations, and findings suggest further studies are warranted to arrive at a conclusive basis for their use (Pastick et al., 2020). Furthermore, it is not certain that control measures that have proved effective in the global north will be equally effective in Africa. For instance, as shown in our findings, lockdown was not only counterproductive in different socio-economic areas, but also ineffective in curbing COVID-19 transmission in Africa. Some reasons that could be assigned include illiteracy, poverty, and cultural norms. Effective pursuance of grassroot education on good public health practices, mass distribution of disposable masks, free access to running water and soap, and availability of sanitizers in various public places, represent important avenues to tackling the current and future outbreaks. Although SARS-CoV-2 sequence data constitute an integral part of the decision making in other continents (Lu et al., 2020; Zhu et al., 2020), Africa has yet to fully employ sequence information to manage its COVID-19 outbreaks. 
This could be attributed to limited technical competencies and infrastructural deficiencies (Jerving, 2020), such as low sequencing capacity, limited bioinformatics and computational skills and pipelines, and limited funding for and access to sequencing reagents. To date, African countries have contributed few SARS-CoV-2 genomic sequences to the global pool of open access repositories, such as the NCBI GenBank and GISAID databases. Molecular epidemiology studies remain imperative as African countries reopen their borders with some level of COVID-19 testing but without real-time genomic surveillance to monitor the emergence of viral variants. To this end, this study pursued detailed phylogenetic inference, comparison of the evolutionary divergence, detection of repeat patterns and motifs, and analysis of the geographical distribution of SARS-CoV-2 trends in Africa. Here, we employed, among other methods, the Pango nomenclature system designed to implement a dynamic classification of SARS-CoV-2 lineages that incorporates both genetic and geographical components (O’Toole et al., 2021, in prep). The Pango nomenclature contains molecular signatures that are helpful for tracking SARS-CoV-2 introduction, emergence and spread (Andersen et al., 2020). We observed differences in the lineages circulating in Africa from those in most parts of the world. Lineage B.1.5 was identified as the dominant genetic lineage circulating in Africa, but most of the top 1% lineages in circulation worldwide were not found. This observation might be a consequence of under-surveillance as there is a relatively low number of African sequences available in the genetic databases, founder effects, or the inefficient implementation of control measures, such as testing of travelers that may contribute to viral introductions and subsequent spread in the community. 
Consequently, the biological significance of the B.1.5 lineage, its epidemiologic features and spatial patterns deserve monitoring and further exploration, as there have been no reports of changes in transmissibility, fatality rates, or vaccine efficacy, in contrast with lineage B.1.351 which dominated in early 2021 (Chen et al., 2021; Luo et al., 2021; Irfan and Chagla, 2021; Abu-Raddad et al., 2021; Shinde et al., 2021; Davies et al., 2021; Jassat et al., 2021). The phylogenetic topological relationships revealed that African genomes tended to cluster with those from Europe, which is in line with the high cultural and business connectivity between these continents. We also observed several noteworthy outbreaks in Egypt, Democratic Republic of the Congo, Gambia, and South Africa, which reflects the higher number of sequences available for these countries. The higher viral diversity in Africa, compared to that in Oceania and South America, can initially be thought to be a reflection of the within-continent and inter-continent connectivity, as well as the travel patterns of individuals in Africa, which includes several major metropolitan areas, such as those in South Africa, Egypt, Ethiopia, and Morocco. However, it might also be a consequence of diversity bottlenecks or the limited genomic surveillance in the other continents. In addition, we also observed that viruses circulating in different African regions are relatively genetically diverse. This might be a consequence of varied sources of introduction of a variety of lineages into the different regions of the continent. It can also be due to easier viral spread among neighboring countries or those that share language or economic ties. This observation highlights the need for more in-depth phylodynamic studies to gain insight into the transmission routes leading to viral introductions and outbreaks throughout the continent, though these have been partially addressed in other studies (Wilkinson et al., 2021). 
We also identified de novo protein motifs that may have functional significance for SARS-CoV-2. Integrins are essential collagen receptors of eukaryotic cells, formed by the noncovalent interaction of two transmembrane glycoprotein subunits that combine into about 24 varieties of heterodimers, which facilitate the binding of cells to the extracellular matrix and junctions. Hence, the integrin-binding domain facilitates cell attachment and cell adhesion (Sigrist et al., 2020). Integrins may be used in place of the ACE-2 receptor because there is an integrin-binding motif (arginine-glycine-aspartate [RGD]) on the spike protein (Beddingfield et al., 2021) that could potentially mediate viral entry into host cells and influence SARS-CoV-2 tissue tropism, viral transmission, and pathogenicity. Therefore, integrins should be further studied in order to gain insight into their roles in viral pathogenicity and transmission. Despite several studies focused on ACE2 (Makowski et al., 2021; Lan et al., 2020), the hypothesis that integrins could serve as alternative SARS-CoV-2 receptors needs to be validated experimentally. In addition, several integrins are believed to be co-receptors of SARS-CoV-2 infections, and thus primary infection assays focusing on integrins should be carried out (Beddingfield et al., 2021). An N-glycosylation site employs a biosynthetic process of high complexity that is responsible for protein maturation along its secretory pathway (Yang et al., 2019; Galbán and Duckett, 2010). Glycosylation is a posttranslational modification in viral proteins that determines protein conformation, function, and host adaptation. It can also act as a defense mechanism for SARS-CoV-2 against the immune cells and antibodies of the host, making it difficult to distinguish, identify, and target the virus for elimination (Watanabe et al., 2019; Grant et al., 2020; Ramírez Hernández et al., 2021). 
This may in turn contribute to the cell infection rate and therefore to disease severity (Reily et al., 2019). We hypothesize that the presence and potential mutation of N-glycosylation domains on the SARS-CoV-2 genome may have implications for the binding affinity as previously described by Zhao et al. (2020). This may account for immune evasion by SARS-CoV-2 through the camouflaging of immunogenic viral protein epitopes, as previously described by Watanabe et al. (2020). Further investigation using animal models would add to the understanding of the impact of mutations in the N-glycosylation domains on the efficacy of ongoing vaccination against SARS-CoV-2 around the world. This work was not without challenges. First, our search for data brought to the fore a seeming lack of transparency in data disclosure and availability, both at the genetic and epidemiological fronts. This is concerning as it may deprive the continent and the global scientific community of useful information for consideration in the fight against the COVID-19 pandemic. Secondly, very few sequences from Africa were available in public and semi-public sequence databases as compared to other continents. Several factors might have contributed to this outcome, key among them the lack of resources and technical proficiencies. Sequencing, bioinformatics, and computational expertise can be greatly improved with capacity-building trainings organized by African entities and other international partners, such as the initiatives by the Fogarty International Center, National Institutes of Health and Johns Hopkins University Applied Physics Laboratory that have been regularly training scientists from low and middle-income countries, particularly during the COVID-19 pandemic. 
These trainings have greatly improved the technical skills of participants toward analyzing the epidemiological and evolutionary trends of SARS-CoV-2 in Africa, as presented in this study. Furthermore, funding from African countries to support African scientists to carry out in-depth research on various aspects of SARS-CoV-2 in Africa is scarce (Oladipo et al., 2020). Local governmental commitment to funding research would allow scientists to be more independent in their research pursuits. In conclusion, this work describes the molecular epidemiology, analyzes the genetic variability of SARS-CoV-2 in Africa, and highlights the need for continuous genomic and epidemiological surveillance, which is imperative for tracing the emergence of genetic variants that can have significant effects on antigenicity, immunity, transmissibility, and potential vaccine escape. This information will also allow investigation of the transmission dynamics and resurgence of waves of infection, as well as optimize public health measures, such as the deployment of vaccine formulations across the continent.

Limitations of the study

Despite being one of the few studies that comprehensively explored the viral genetic diversity, evolutionary history, and functional genome patterns of SARS-CoV-2 in Africa, we understand that our observations might have been conditioned by sampling bias. The limited testing capacity and/or under-reporting of cases and mortality might influence our estimation of the impact of SARS-CoV-2 in Africa, which was found to be below the global average. We note that where cases and deaths were reported, in some of the instances the actual number of tests conducted was not reported, which did not allow estimation of positivity rate. Therefore, it is conceivable that underestimation in the reported cases and mortality metrics could mask the actual incidence and impact of the pandemic in Africa. Another caveat lies in the full genome sequencing technology being also limited in most public health, research, or academic institutions in Africa. The associated cost, especially in less endowed countries, makes genome sequencing an option not considered routinely. Consequently, the number of sequences generated from most of the countries was scarce, and the fact that some countries were not represented at all in our analysis due to lack of sequence data might suggest that our observations may not represent the true state of the situation in the African continent. However, our findings shed light on the state of affairs and may help inform public health policies.

STAR★Methods

Key resources table

Resource availability

Lead contact

Any additional information required to reanalyze the data reported in this paper is available upon request from the lead contact, Dr. Nídia S. Trovão (nidia.trovao@nih.gov; nidiastrovao@gmail.com).

Materials availability

This study did not generate new unique reagents or genetic sequences.

Method details

Compilation of genomic datasets

Assembly of study dataset

SARS-CoV-2 genome sequences collected from Africa were obtained from the GISAID database (https://www.gisaid.org/) (Elbe and Buckland-Merrett, 2017; Shu and Mccauley, 2017) on January 7, 2021, by only selecting complete genomes and excluding those with low coverage. As of January 7, 2021, nearly eleven months since the first case was reported in Africa, a total of 5229 SARS-CoV-2 complete genome sequences from 33 African countries were available in the GISAID database. The sequences were aligned using the online version of the MAFFT (Katoh et al., 2019) multiple sequence alignment tool hosted at https://mafft.cbrc.jp/alignment/software/closelyrelatedviralgenomes.html, with Wuhan-Hu-1 (www.ncbi.nlm.nih.gov/nuccore/MN908947.3) as the reference sequence. The aligned sequences were manually edited and cleaned in AliView version 1.26 (Larsson, 2014), by excluding sequences with fewer than 75% unambiguous bases, and trimming the alignment at the 5’ and 3’ ends. We also removed duplicate sequences, defined as those having identical nucleotide composition and having been collected on the same date and in the same country. This dataset was subjected to multiple iterations of phylogeny reconstruction using IQ-TREE multicore software version 1.6.12 (Nguyen et al., 2015) with parameters -m GTR+G -nt 50. TempEst (Rambaut et al., 2016) was used to exclude outlier sequences whose genetic divergence and sampling date were incongruent, resulting in an alignment of 2,414 sequences spanning 29,796 nucleotide positions.
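
The two filtering rules described above (at least 75% unambiguous bases, and removal of duplicates sharing sequence, collection date, and country) can be sketched in a few lines. This is an illustrative Python sketch only: the authors performed these steps manually in AliView, and the `Record` type, field names, and function names below are assumptions introduced for the example.

```python
from dataclasses import dataclass

UNAMBIGUOUS = set("ACGT")


@dataclass(frozen=True)
class Record:
    # Hypothetical container; field names are assumptions for this sketch.
    name: str
    country: str
    date: str      # collection date, e.g. "2020-06-01"
    sequence: str  # aligned nucleotide sequence


def unambiguous_fraction(sequence: str) -> float:
    """Fraction of positions that are A, C, G, or T (gaps and N count as ambiguous)."""
    if not sequence:
        return 0.0
    return sum(base in UNAMBIGUOUS for base in sequence.upper()) / len(sequence)


def filter_and_deduplicate(records, min_fraction=0.75):
    """Drop low-quality sequences, then keep the first of each duplicate group.

    A duplicate is a record with the same sequence, date, and country as an
    earlier record, mirroring the definition given in the text.
    """
    seen = set()
    kept = []
    for rec in records:
        if unambiguous_fraction(rec.sequence) < min_fraction:
            continue
        key = (rec.sequence.upper(), rec.date, rec.country)
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept
```

Keying duplicates on the full (sequence, date, country) tuple means identical genomes sampled in different countries or on different days are retained, which preserves information about spread.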

Selecting a genomic background dataset

We used the Pango lineage classification available in the metadata associated with the sequences to identify the lineages circulating in Africa, as this nomenclature system is designed to integrate both genetic and geographical information about SARS-CoV-2 dynamics (O’Toole et al., in prep). In total, there were 143 Pango lineages circulating in Africa, as listed in Data S4. On January 7, 2021, we obtained from GISAID all available sequences belonging to lineages that circulate in Africa. A similar approach to that described above (including alignment using MAFFT, manual inspection using AliView, phylogenetic tree reconstruction with IQ-TREE and exclusion of root-to-tip outliers using TempEst) was employed, resulting in a dataset with 5002 sequences with 29,796 nucleotide base pairs.
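
The root-to-tip outlier exclusion mentioned above can be illustrated with a simple regression of genetic divergence on sampling date. The Python sketch below is not TempEst and does not reproduce the authors' choices: the residual cutoff of three standard deviations and the function names are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def root_to_tip_outliers(dates, divergences, threshold=3.0):
    """Indices of tips whose divergence is incongruent with their sampling date.

    dates: sampling times as numbers (e.g. days since the first sample).
    divergences: root-to-tip distances in substitutions per site.
    threshold: assumed cutoff, in residual standard deviations.
    """
    slope, intercept = fit_line(dates, divergences)
    residuals = [y - (slope * x + intercept) for x, y in zip(dates, divergences)]
    n = len(residuals)
    sd = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    if sd == 0:
        return []
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * sd]
```

In practice the flagged tips would be removed and the tree re-estimated, which matches the iterative reconstruction described in the text.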

Phylogenetic inference

We merged the study and background datasets, resulting in a final dataset with 7,416 sequences (Data S5). We computed the phylogeny with ultrafast bootstraps using IQ-TREE v1.6.12 (Hoang et al., 2018; Nguyen et al., 2015) with parameters -m GTR+G -bb 1000 -bnni -nt 50. These analyses were conducted using the high-performance computational capabilities of the Biowulf Linux cluster at the National Institutes of Health (Bethesda, MD, USA) (http://biowulf.nih.gov). Trees were rooted on the Wuhan-Hu-1 (GenBank: MN908947.3) reference genome and visualized in FigTree version 1.4.4.
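
For readers who want to reproduce the tree search, the reported parameters can be assembled into a full IQ-TREE 1.x command line. The Python sketch below only builds and prints the command; the executable name `iqtree` and the alignment file name are assumptions, since the text does not state them.

```python
import shlex


def iqtree_command(alignment, threads=50, bootstrap_replicates=1000):
    """Build the argument list for an IQ-TREE 1.x run with the reported settings."""
    return [
        "iqtree",                          # assumed executable name
        "-s", alignment,                   # input alignment (hypothetical file name)
        "-m", "GTR+G",                     # substitution model reported in the text
        "-bb", str(bootstrap_replicates),  # ultrafast bootstrap replicates
        "-bnni",                           # NNI refinement of bootstrap trees
        "-nt", str(threads),               # number of CPU threads
    ]


command = iqtree_command("africa_plus_background.fasta")
print(shlex.join(command))
```

The list form can be passed directly to `subprocess.run(command, check=True)` on a machine where IQ-TREE is installed.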

Comparison of evolutionary divergence

We estimated the evolutionary divergence of several sequence datasets from each continent. Each continent-specific dataset consisted of sequences from the final dataset described above. The final continent-specific datasets were as follows: Africa (n = 2,414); Asia (n = 1,008); Europe (n = 999); North America (n = 997); Oceania (n = 1,000); South America (n = 998). To estimate the evolutionary divergence, we calculated the pairwise distance (in base substitutions per site) between all pairs of sequences within and between each continent. We conducted the analyses using the Molecular Evolutionary Genetics Analysis software version 10 (MEGA X) (Kumar et al., 2018; Stecher et al., 2020) and applied the maximum composite likelihood model (Tamura et al., 2004). The rate variation among sites was modeled with a gamma distribution (shape parameter = 4), and the differences in the composition bias among sequences were considered in evolutionary comparisons (Tamura and Kumar, 2002). We included 1st+2nd+3rd+non-coding codon positions, and all positions with less than 50% site coverage due to alignment gaps, missing data, and ambiguous bases were eliminated (partial deletion option). The pairwise genetic distances were summarized, plotted, and visualized using scripts written in R 4.0.3 (R Core Team, 2019). The R packages used were rio (Chan et al., 2021), tidyverse (Wickham et al., 2019), readr (Wickham et al., 2016), graphics (R Core Team, 2019), sm (Bowman and Azzalini, 2018), vioplot (Adler and Kelly, 2020), gridExtra (Auguie, 2017), and ggplot2 (Kahle and Wickham, 2013).
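
To make the notion of a within-continent pairwise distance concrete, the Python sketch below computes the simple p-distance (proportion of differing sites) for every pair in a group. It deliberately swaps in a simpler technique: it is not the maximum composite likelihood model with gamma rate variation used in MEGA X, and it skips ambiguous sites pair by pair rather than applying the 50% partial-deletion rule. Function names are assumptions.

```python
from itertools import combinations

VALID = set("ACGT")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among sites unambiguous in both sequences."""
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID and b in VALID:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def within_group_distances(sequences):
    """Distances for all unordered pairs of aligned sequences in one group."""
    return [p_distance(a, b) for a, b in combinations(sequences, 2)]
```

For closely related genomes such as these, p-distances and model-corrected distances are numerically similar, but only the latter corresponds to the values reported in the study.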

Geographical distribution of COVID-19 pandemic in Africa

SARS-CoV-2 genetic data were collected from GISAID as described above, and epidemiological data were obtained from OurWorldInData.org (Roser et al., 2020) on January 8, 2021. One sequence did not have a clade assignment and was excluded from the statistics. We developed scripts for the statistical analysis using R 4.0.3 software (R Core Team, 2019). The R packages used were maptools (Bivand et al., 2021), RColorBrewer (Neuwirth, 2014), maps (Becker et al., 2018), mapdata (Becker et al., 2018), readxl (Wickham and Bryan, 2019), ggplot2 (Kahle and Wickham, 2013; Kassambara, 2019), dplyr (Wickham et al., 2018), gridExtra (Auguie, 2017), ggcorrplot (Kassambara, 2019), ggpubr (Kassambara, 2020), ggmap (Kahle and Wickham, 2013), lubridate (Grolemund and Wickham, 2011), aweek (Kamvar, 2021), and mapproj (Mcilroy et al., 2020). The data and the R script for the analysis can be accessed at https://github.com/Yinkaokoh/updatedSARCoV2_project.
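
The epidemiological indicators discussed in this paper (positivity rate, case fatality rate, and the proportion of the population screened) are simple ratios. The Python sketch below shows one way to compute them while handling countries that did not report test counts; it is not the authors' R script, and the function names and example figures are hypothetical.

```python
from typing import Optional


def positivity_rate(confirmed_cases: int, tests_conducted: Optional[int]) -> Optional[float]:
    """Share of tests that were positive; None when tests were not reported."""
    if not tests_conducted or tests_conducted <= 0:
        return None
    return confirmed_cases / tests_conducted


def case_fatality_rate(deaths: int, confirmed_cases: int) -> Optional[float]:
    """Deaths divided by confirmed cases; None when no cases were reported."""
    if confirmed_cases <= 0:
        return None
    return deaths / confirmed_cases


def screened_proportion(tests_conducted: Optional[int], population: int) -> Optional[float]:
    """Tests per inhabitant, an approximation since one person can be tested twice."""
    if not tests_conducted or population <= 0:
        return None
    return tests_conducted / population


# Hypothetical figures for illustration only; not taken from the study.
example = {"confirmed_cases": 50, "deaths": 2, "tests_conducted": 1000, "population": 100000}
print(positivity_rate(example["confirmed_cases"], example["tests_conducted"]))
```

Returning `None` rather than zero for missing test counts keeps unreported countries out of averages instead of silently biasing them downward.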

Detection of repeat patterns and motifs

The retrieved SARS-CoV-2 sequences from Africa were annotated using GLAM2 (http://meme-suite.org/tools/glam2). GLAM2 is a deletion and motif finding software for either nucleotide or amino acid sequences (Frith et al., 2008). The Wuhan isolate with accession number GenBank: NC_045512.2 was annotated for novel motifs, and the Biostrings R package from Bioconductor (Pagès et al., 2021) was used to find the motifs' appearance in the retrieved African SARS-CoV-2 sequences. The Tomtom tool (http://meme-suite.org/tools/tomtom) (Gupta et al., 2007), which compares one or more motifs against a database of known motifs, was employed to find overlapping positions across the motif database.
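
Locating a known motif in a protein sequence, as done here with Biostrings, amounts to pattern matching. The Python sketch below searches for the RGD integrin-binding tripeptide and for the PROSITE N-glycosylation pattern N-{P}-[ST]-{P} (PS00001) using regular expressions. It is an illustration rather than the authors' workflow, and the toy peptide in the usage line is invented.

```python
import re

# Integrin-binding tripeptide: arginine-glycine-aspartate.
RGD = re.compile(r"RGD")

# PROSITE PS00001 (N-{P}-[ST]-{P}); the lookahead lets overlapping sites be reported.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif_positions(protein, pattern):
    """Return 1-based start positions of every match of pattern in protein."""
    return [match.start() + 1 for match in pattern.finditer(protein.upper())]


# Invented toy peptide, used only to show the call pattern.
toy_peptide = "MNGTARGDNPSKNNTSQ"
print(find_motif_positions(toy_peptide, RGD))
print(find_motif_positions(toy_peptide, N_GLYCOSYLATION))
```

A sequence match only indicates a candidate site; whether it is glycosylated or binds integrin in the folded protein requires experimental confirmation, as the Discussion notes.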

Quantification and statistical analysis

To investigate how the viral genetic diversity circulating in Africa compares to that circulating in other continents, we employed a Wilcoxon signed-rank test on the within-continent pairwise distances, using R 4.0.3 software (R Core Team, 2019). There are 2,912,491 estimates of within-continent pairwise distances for the Africa set, 507,528 for the Asia set, 498,501 for the Europe set, 496,506 for the North America set, 499,500 for the Oceania set, and 497,503 for the South America set. We did not use any methods to determine whether the data met the assumptions of the statistical approach.
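
The number of within-continent distance estimates follows directly from the dataset sizes, since n sequences yield n(n - 1)/2 unordered pairs. The short Python check below reproduces the counts from the continent-specific sample sizes given in the Methods; variable names are arbitrary.

```python
from math import comb

# Continent-specific dataset sizes as reported in the Methods.
sizes = {
    "Africa": 2414,
    "Asia": 1008,
    "Europe": 999,
    "North America": 997,
    "Oceania": 1000,
    "South America": 998,
}

# Number of unordered sequence pairs per continent: n choose 2.
pair_counts = {name: comb(n, 2) for name, n in sizes.items()}

for name, count in pair_counts.items():
    print(f"{name}: {count:,}")
```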

REAGENT or RESOURCE	SOURCE	IDENTIFIER
Software and algorithms

R	R Core Team 2019. R: A Language and Environment for Statistical Computing	https://cran.r-project.org/
Biostrings R package	Pagès et al., 2021	https://bioconductor.org/packages/release/bioc/html/Biostrings.html; R package version 2.62.0
maptools R package	Bivand et al., 2021	https://cran.r-project.org/web/packages/maptools/index.html; Package ‘maptools’
RColorBrewer R package	Neuwirth, 2014	https://cran.r-project.org/web/packages/RColorBrewer/index.html; RColorBrewer: ColorBrewer palettes. R package version 1.1-2.
maps R package	Becker et al., 2018	https://cran.r-project.org/web/packages/maps/index.html; maps: Draw Geographical Maps. R package version 3.3.0.
mapdata R package	Becker et al., 2018	https://cran.r-project.org/web/packages/mapdata/index.html; maps: Draw Geographical Maps. R package version 3.3.0.
readxl R package	Wickham and Bryan, 2019	https://cran.r-project.org/web/packages/readxl/; readxl: Read excel files. R package version, 1.
ggplot2 R package	Kahle and Wickham, 2013	https://cran.r-project.org/web/packages/ggplot2/index.html
dplyr R package	Wickham and Henry, 2020; Wickham et al., 2018	https://cran.r-project.org/web/packages/dplyr/index.html; R package version 0.7.6
gridExtra R package	Auguie, 2017	https://cran.r-project.org/web/packages/gridExtra/index.html; (R package version 2.3)
ggcorrplot R package	Kassambara, 2019	https://cran.r-project.org/web/packages/ggcorrplot/
ggpubr R package	Kassambara, 2020	https://cran.r-project.org/web/packages/ggpubr/index.html
ggmap R package	Kahle and Wickham, 2013	https://cran.r-project.org/web/packages/ggmap/index.html
mapproj R package	Mcilroy et al., 2020	https://cran.r-project.org/web/packages/mapproj/index.html
rio R package	Chan et al., 2021	https://cran.r-project.org/web/packages/rio/index.html
tidyverse R package	Wickham et al., 2019	https://cran.r-project.org/web/packages/tidyverse/index.html
readr R package	Wickham et al., 2016	https://CRAN.R-project.org/package=readr
graphics R package	R Core Team, 2019	https://www.R-project.org/
sm R package	Bowman and Azzalini, 2018	https://cran.r-project.org/web/packages/sm/index.html
lubridate R package	Grolemund and Wickham, 2011	https://cran.r-project.org/web/packages/lubridate/index.html
aweek R package	Kamvar, 2021	https://cran.r-project.org/web/packages/aweek/index.html
vioplot R package	Adler and Kelly, 2020	https://cran.r-project.org/web/packages/vioplot/index.html; https://github.com/TomKellyGenetics/vioplot
MAFFT	Katoh et al., 2019	https://mafft.cbrc.jp/alignment/server/add_fragments.html?frommanualnov6
AliView	Larsson, 2014	https://ormbunkar.se/aliview/
FigTree	Not applicable	https://github.com/rambaut/figtree/releases
TempEst	Rambaut et al., 2016	http://tree.bio.ed.ac.uk/software/tempest/
IQ-TREE	Nguyen et al., 2015; Hoang et al., 2018	http://www.iqtree.org/
GLAM2	Frith et al., 2008	http://meme-suite.org/tools/glam2
Tomtom	Gupta et al., 2007	http://meme-suite.org/tools/tomtom
Data and code	This study	https://github.com/Yinkaokoh/updatedSARCoV2_project; https://doi.org/10.17632/bczg8z7yg2.1

Other

Sequence data from GISAID	Elbe and Buckland-Merrett, 2017; Shu and Mccauley, 2017	https://www.gisaid.org/
GISAID database authors and laboratories	This study	Data S5

76 in total

1. Prospects for inferring very large phylogenies by using the neighbor-joining method.

Authors: Koichiro Tamura; Masatoshi Nei; Sudhir Kumar
Journal: Proc Natl Acad Sci U S A Date: 2004-07-16 Impact factor: 11.205

Review 2. XIAP as a ubiquitin ligase in cellular signaling.

Authors: S Galbán; C S Duckett
Journal: Cell Death Differ Date: 2010-01 Impact factor: 15.828

Review 3. Glycosylation in health and disease.

Authors: Colin Reily; Tyler J Stewart; Matthew B Renfrow; Jan Novak
Journal: Nat Rev Nephrol Date: 2019-06 Impact factor: 42.439

Review 4. Review: Hydroxychloroquine and Chloroquine for Treatment of SARS-CoV-2 (COVID-19).

Authors: Katelyn A Pastick; Elizabeth C Okafor; Fan Wang; Sarah M Lofgren; Caleb P Skipper; Melanie R Nicol; Matthew F Pullen; Radha Rajasingham; Emily G McDonald; Todd C Lee; Ilan S Schwartz; Lauren E Kelly; Sylvain A Lother; Oriol Mitjà; Emili Letang; Mahsa Abassi; David R Boulware
Journal: Open Forum Infect Dis Date: 2020-04-15 Impact factor: 3.835

5. Receptor Recognition by the Novel Coronavirus from Wuhan: an Analysis Based on Decade-Long Structural Studies of SARS Coronavirus.

Authors: Yushun Wan; Jian Shang; Rachel Graham; Ralph S Baric; Fang Li
Journal: J Virol Date: 2020-03-17 Impact factor: 5.103

6. Analysis of the SARS-CoV-2 spike protein glycan shield reveals implications for immune recognition.

Authors: Oliver C Grant; David Montgomery; Keigo Ito; Robert J Woods
Journal: Sci Rep Date: 2020-09-14 Impact factor: 4.379

7. In Vitro Antiviral Activity and Projection of Optimized Dosing Design of Hydroxychloroquine for the Treatment of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2).

Authors: Xueting Yao; Fei Ye; Miao Zhang; Cheng Cui; Baoying Huang; Peihua Niu; Xu Liu; Li Zhao; Erdan Dong; Chunli Song; Siyan Zhan; Roujian Lu; Haiyan Li; Wenjie Tan; Dongyang Liu
Journal: Clin Infect Dis Date: 2020-07-28 Impact factor: 9.079

8. SARS-CoV-2 genomic variations associated with mortality rate of COVID-19.

Authors: Yujiro Toyoshima; Kensaku Nemoto; Saki Matsumoto; Yusuke Nakamura; Kazuma Kiyotani
Journal: J Hum Genet Date: 2020-07-22 Impact factor: 3.172

9. The Integrin Binding Peptide, ATN-161, as a Novel Therapy for SARS-CoV-2 Infection.

Authors: Brandon J Beddingfield; Naoki Iwanaga; Prem P Chapagain; Wenshu Zheng; Chad J Roy; Tony Y Hu; Jay K Kolls; Gregory J Bix
Journal: JACC Basic Transl Sci Date: 2020-10-16