Ning Song, Guang-Lin Cui, Qing-Lei Zeng.
Abstract
Although the COVID-19 epidemic in China was brought under control within a few months, it remains important to infer the time of origin and the genetic diversity of its causative agent, SARS-CoV-2, from whole-genome sequences. However, the whole-genome sequences from China in current public databases are very unevenly distributed with respect to time and place of collection. In particular, only one sequence was available from Henan Province, which is adjacent to Hubei Province, the worst-affected province in China. Herein, we used high-throughput sequencing to obtain 19 whole-genome sequences of SARS-CoV-2 from 18 patients with severe disease admitted to the First Affiliated Hospital of Zhengzhou University, a provincially designated hospital for the treatment of severe COVID-19 cases in Henan Province. The demographic, baseline, and clinical characteristics of these patients are described. To investigate the molecular epidemiology of SARS-CoV-2 in the COVID-19 outbreak in China, 729 genome sequences (including the 19 sequences from this study) sampled from Mainland China were analyzed with a comprehensive set of methods: likelihood-mapping, split-network, maximum-likelihood (ML) phylogenetic, and Bayesian time-scaled phylogenetic analyses. We estimated the evolutionary rate of SARS-CoV-2 from Mainland China at 9.25 × 10⁻⁴ substitutions per site per year (95% BCI: 6.75 × 10⁻⁴ to 1.28 × 10⁻³) and the time to the most recent common ancestor (TMRCA) at October 1, 2019 (95% BCI: August 22, 2019 to November 6, 2019). Our results contribute to the study of the molecular epidemiology and genetic diversity of SARS-CoV-2 over time in Mainland China.
Keywords: Henan Province; Mainland China; SARS-CoV-2; evolutionary rate; tMRCA; whole-genome sequence
Year: 2021 PMID: 34093495 PMCID: PMC8172800 DOI: 10.3389/fmicb.2021.673855
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
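The per-site rate reported in the abstract can be easier to interpret as an expected number of substitutions per genome per year. The following is a minimal sketch of that conversion, assuming a genome length of 29,903 nucleotides (the length of the Wuhan-Hu-1 reference); the function name `per_genome_rate` is illustrative and is not taken from the study.

```python
# Assumption: genome length of the Wuhan-Hu-1 reference, in nucleotides.
GENOME_LENGTH = 29_903


def per_genome_rate(rate_per_site_per_year, genome_length=GENOME_LENGTH):
    """Convert substitutions/site/year into substitutions/genome/year."""
    return rate_per_site_per_year * genome_length


# Point estimate and 95% BCI bounds as reported in the abstract.
mean_rate = per_genome_rate(9.25e-4)
low_rate = per_genome_rate(6.75e-4)
high_rate = per_genome_rate(1.28e-3)

print(f"{mean_rate:.1f} substitutions per genome per year "
      f"({low_rate:.1f} to {high_rate:.1f})")
```

Under this assumption the reported rate corresponds to roughly 28 substitutions per genome per year, or about two per month.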
Detailed information on the SNPs in the 19 sequences.
| Sequence | Clade | | Gaps | Mutation | Type | Gene | Amino acid change |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FAHZU0002 | 19B | 21 | 0 | C8782T | Synonymous | ORF1a | Ser2839Ser |
| | | | | G18598A | Missense | ORF1b | Ala1711Thr |
| | | | | T28144C | Missense | ORF8 | Leu84Ser |
| | | | | C29095T | Synonymous | N | Phe274Phe |
| | | | | C29586T | Missense | ORF10 | Pro10Leu |
| FAHZU0007 | 19A | 1065 | 0 | C21219T | Synonymous | ORF1ab | Phe6985Phe |
| FAHZU0008 | 19A | 68 | 0 | T27432C | Synonymous | ORF7a | Ala13Ala |
| | | | | C28253T | Synonymous | ORF8 | Phe120Phe |
| FAHZU0010 | 19B | 53 | 0 | C8782T | Synonymous | ORF1a | Ser2839Ser |
| | | | | T28144C | Missense | ORF8 | Leu84Ser |
| FAHZU0011 | 19A | 1195 | 0 | G11083T | Missense | ORF1a | Leu3606Phe |
| FAHZU0012 | 19B | 40 | 0 | G29742A | Downstream | S | |
| | | | | C8782T | Synonymous | ORF1a | Ser2839Ser |
| | | | | C1077T | Missense | ORF1a | Pro271Leu |
| | | | | G23438T | Missense | S | Ala626Ser |
| | | | | T28144C | Missense | ORF8 | Leu84Ser |
| | | | | G28878A | Missense | N | Ser202Asn |
| FAHZU0014 | 19B | 53 | 3 | C8782T | Synonymous | ORF1a | Ser2839Ser |
| | | | | C19185T | Synonymous | ORF1b | Cys1906Cys |
| | | | | C16375T | Missense | ORF1b | Pro970Ser |
| | | | | T28144C | Missense | ORF8 | Leu84Ser |
| FAHZU0017 | 19A | 543 | 0 | C29303T | Missense | N | Pro344Ser |
| | | | | C15324T | Synonymous | ORF1b | Asn619Asn |
| FAHZU0018 | 19B | 2406 | 0 | C29095T | Synonymous | N | Phe274Phe |
| | | | | T28144C | Missense | ORF8 | Leu84Ser |
| | | | | G18598A | Missense | ORF1b | Ala1711Thr |
| FAHZU0019 | 19B | 24 | 0 | C8782T | Synonymous | ORF1a | Ser2839Ser |
| | | | | C29095T | Synonymous | N | Phe274Phe |
| | | | | G18598A | Missense | ORF1b | Ala1711Thr |
| | | | | T28144C | Missense | ORF8 | Leu84Ser |
| | | | | C29586T | Missense | ORF10 | Pro10Leu |
| FAHZU0020 | 19A | 117 | 0 | C16596T | Synonymous | ORF1b | Tyr1043Tyr |
| | | | | G26144T | Missense | ORF3a | Gly251Val |
| FAHZU0021 | 19A | 51 | 0 | C21219T | Synonymous | ORF1b | Phe2584Phe |
| | | | | G11083T | Missense | ORF1a | Leu3606Phe |
| FAHZU0022 | 19A | 266 | 0 | C21707T | Missense | S | His49Tyr |
| | | | | C9711T | Missense | ORF1a | Ser3149Phe |
| FAHZU0028 | 19B | 1036 | 0 | C1912T | Synonymous | ORF1a | Ser549Ser |
| | | | | C8782T | Synonymous | ORF1a | Ser2839Ser |
| | | | | T28144C | Missense | ORF8 | Leu84Ser |
| | | | | C16393T | Missense | ORF1b | Pro976Ser |
| | | | | C18570T | Synonymous | ORF1b | Leu1701Leu |
| FAHZU0032 | 19A | 70 | 0 | A21141G | Synonymous | ORF1b | Leu2558Leu |
| | | | | C28657T | Synonymous | N | Asp128Asp |
| | | | | G3483T | Missense | ORF1a | Gly1073Val |
| | | | | G26144T | Missense | ORF3a | Gly251Val |
| | | | | C27928A | Missense | ORF8 | Thr12Asn |
| FAHZU0033 | 19B | 49 | 0 | A1015G | Synonymous | ORF1a | Glu250Glu |
| | | | | C8782T | Synonymous | ORF1a | Ser2839Ser |
| | | | | G17594A | Missense | ORF1b | Ser1376Asn |
| | | | | A26664G | Missense | M | Ile48Val |
| | | | | T28144C | Missense | ORF8 | Leu84Ser |
| FAHZU0034 | 19A | 47 | 0 | C6982T | Synonymous | ORF1a | Cys2239Cys |
| | | | | C19386T | Synonymous | ORF1b | Asp1973Asp |
| | | | | G22081A | Synonymous | S | Gln173Gln |
| | | | | T29483G | Missense | N | Ser404Ala |
| FAHZU0035 | 19B | 33 | 0 | C8782T | Synonymous | ORF1a | Ser2839Ser |
| | | | | T18603C | Synonymous | ORF1b | His1712His |
| | | | | C29095T | Synonymous | N | Phe274Phe |
| | | | | G20683T | Missense | ORF1b | Val2406Phe |
| | | | | C27925T | Missense | ORF8 | Thr11Ile |
| | | | | T28144C | Missense | ORF8 | Leu84Ser |
| FAHZU0036 | 19A | 1270 | 0 | C486T | Missense | ORF1a | Ser74Leu |
| | | | | C18512T | Missense | ORF1b | Pro1682Leu |
| | | | | T18738C | Synonymous | ORF1b | Phe1757Phe |
Mutations are called relative to the reference sequence Wuhan-Hu-1. Unsequenced regions at the 5′ and 3′ ends are not shown. Clade names are assigned by Nextclade.
Three consecutive gaps are inserted at position 21991.
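The relationship between a nucleotide position and the amino-acid number in the table can be checked with simple arithmetic. The sketch below assumes 1-based gene start coordinates on Wuhan-Hu-1 (with ORF1b numbered from position 13468); these coordinates and the helper names `parse_snp` and `codon_number` are illustrative assumptions rather than part of the study's pipeline, although they do reproduce the codon numbers listed above.

```python
import re

# Assumption: 1-based start position of each gene on the Wuhan-Hu-1 reference.
# ORF1b is numbered from position 13468.
GENE_START = {
    "ORF1a": 266,
    "ORF1b": 13468,
    "S": 21563,
    "ORF3a": 25393,
    "M": 26523,
    "ORF7a": 27394,
    "ORF8": 27894,
    "N": 28274,
    "ORF10": 29558,
}


def parse_snp(snp):
    """Split a label such as 'C8782T' into (reference base, position, new base)."""
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", snp)
    if match is None:
        raise ValueError(f"not a single-nucleotide substitution: {snp!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def codon_number(snp, gene):
    """Return the 1-based codon (amino-acid) number within the gene hit by the SNP."""
    _, pos, _ = parse_snp(snp)
    offset = pos - GENE_START[gene]
    if offset < 0:
        raise ValueError(f"position {pos} lies upstream of {gene}")
    return offset // 3 + 1


for snp, gene in [("C8782T", "ORF1a"), ("T28144C", "ORF8"), ("G18598A", "ORF1b")]:
    print(snp, gene, codon_number(snp, gene))
```

For example, C8782T falls in codon 2839 of ORF1a and T28144C in codon 84 of ORF8, matching Ser2839Ser and Leu84Ser in the table.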
Figure 1. Clade assignment of the 19 Henan sequences by Nextclade. Currently, five major clades are defined: 19A and 19B emerged in Wuhan and dominated the early outbreak; 20A emerged from 19A, dominated the European outbreak in March, and has since spread globally; 20B and 20C are large, genetically distinct subclades of 20A. The 19 Henan sequences are highlighted and marked with solid circles at the ends of their branches.
Demographic, baseline, and clinical characteristics of the 18 patients with COVID-19 in Henan Province.
| Mean ± SD | 65.11 ± 16.78 |
| Age group, n (%) | |
| <29 years | 0 (0.00) |
| 30-39 years | 2 (11.11) |
| 40-49 years | 2 (11.11) |
| 50-59 years | 0 (0.00) |
| 60-69 years | 5 (27.78) |
| 70-79 years | 6 (33.33) |
| 80-89 years | 3 (16.67) |
| ≥90 years | 0 (0.00) |
| Female | 4 (22.22) |
| Male | 14 (77.78) |
| Mild type | 0 (0.00) |
| Moderate type | 0 (0.00) |
| Severe type | 6 (33.33) |
| Critically ill type | 12 (66.67) |
| Yes | 7 (38.89) |
| No | 11 (61.11) |
| Yes | 4 (22.22) |
| No | 14 (77.78) |
| Comorbidities, n (%) | 11 (61.11) |
| Cardiovascular and cerebrovascular diseases | 5 (27.78) |
| Endocrine system diseases | 4 (22.22) |
| Respiratory system diseases | 1 (5.55) |
| Malignant tumor | 1 (5.55) |
| Nervous system diseases | 0 (0.00) |
| Chronic kidney disease | 1 (5.55) |
| Chronic liver disease | 1 (5.55) |
| Chronic obstructive pulmonary disease | 1 (5.55) |
| More than 1 comorbidity | 4 (22.22) |
| Fever | 18 (100.00) |
| Cough | 13 (72.22) |
| Shortness of breath | 11 (61.11) |
| Muscle ache | 1 (5.55) |
| Headache and mental disorder symptoms | 2 (11.11) |
| Sore throat | 2 (11.11) |
| Rhinorrhea | 1 (5.55) |
| Chest pain | 0 (0.00) |
| Diarrhea | 1 (5.55) |
| Nausea and vomiting | 0 (0.00) |
| More than 1 sign or symptom | 17 (94.44) |
| Bilateral pneumonia | 18 (100.00) |
| Unilateral pneumonia | 0 (0.00) |
| No abnormal density shadow | 0 (0.00) |
| Antibiotic treatment | 17 (94.44) |
| Antiviral treatment | 18 (100.00) |
| Hormone therapy | 13 (72.22) |
| Intravenous immunoglobulin therapy | 14 (77.78) |
| Mechanical ventilation | 13 (72.22) |
| ECMO | 10 (55.56) |
| Convalescent plasma therapy | 11 (61.11) |
| Traditional Chinese medicine | 9 (50.00) |
| Negative and discharged | 14 (77.78) |
| Died | 4 (22.22) |
CT, computed tomography; ECMO, extracorporeal membrane oxygenation; SD, standard deviation.
Figure 2. Time series and geographic distribution of the 729 SARS-CoV-2 genomes from Mainland China by sampling date. The geographic distribution of the 729 genomes in the present study is shown at the provincial level. Colors indicate the sampling provinces.
Figure 3. Estimated maximum-likelihood phylogenetic tree of SARS-CoV-2 from Mainland China. The maximum-likelihood phylogenetic tree for “dataset_729” is shown. The tree is midpoint rooted. Colors indicate the sampling provinces. The scale bar at the bottom indicates 0.00005 nucleotide substitutions per site.
Figure 4. Root-to-tip genetic divergence plot of SARS-CoV-2 from Mainland China. Root-to-tip genetic divergence for “dataset_729” in the maximum-likelihood tree (shown in Figure 3) is plotted against sampling date. Colors indicate the sampling provinces. The gray line is the linear regression line.
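A root-to-tip plot of this kind is usually read through its regression line: the slope gives a rough evolutionary rate and the x-intercept gives a rough date for the root. The following is a minimal sketch of that ordinary least-squares fit, assuming decimal sampling dates and divergences in substitutions per site. The toy data are hypothetical and are not values from the study, and this simple regression is not the Bayesian analysis used for the reported estimates.

```python
def root_to_tip_regression(dates, divergences):
    """Ordinary least-squares fit of divergence against sampling date.

    Returns (slope, root_date): the slope approximates the rate in
    substitutions/site/year, and root_date is the x-intercept, i.e. the
    date at which the fitted divergence is zero.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, divergences))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    root_date = -intercept / slope
    return slope, root_date


# Hypothetical toy data lying exactly on a line with rate 1e-3 and root 2019.75.
toy_dates = [2019.95, 2020.00, 2020.05, 2020.10, 2020.15]
toy_divergences = [1e-3 * (t - 2019.75) for t in toy_dates]

rate, root_date = root_to_tip_regression(toy_dates, toy_divergences)
print(f"rate = {rate:.2e} substitutions/site/year, root date = {root_date:.2f}")
```

With real data the points scatter around the line, so the fit mainly serves as a check that the data set carries enough temporal signal for a time-scaled analysis.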
Figure 5. Estimated Bayesian time-scaled maximum-clade-credibility phylogenetic tree of SARS-CoV-2 from Mainland China. The circle at each tip is colored according to the sampling province. Note that the x-axis is scaled in decimal dates.
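Decimal dates such as those on the x-axis can be converted to and from calendar dates with a few lines of code. The sketch below assumes the common convention of year plus the elapsed fraction of that year, counted from January 1; phylogenetic tools may use slightly different conventions, and the function names are illustrative.

```python
from datetime import date, timedelta


def to_decimal_date(d):
    """Convert a calendar date to year + elapsed fraction of that year."""
    start = date(d.year, 1, 1)
    days_in_year = (date(d.year + 1, 1, 1) - start).days
    return d.year + (d - start).days / days_in_year


def from_decimal_date(x):
    """Convert a decimal date back to the nearest calendar date."""
    year = int(x)
    start = date(year, 1, 1)
    days_in_year = (date(year + 1, 1, 1) - start).days
    return start + timedelta(days=round((x - year) * days_in_year))


tmrca = date(2019, 10, 1)  # TMRCA point estimate reported in the abstract
decimal_tmrca = to_decimal_date(tmrca)
print(f"{tmrca} -> {decimal_tmrca:.3f} -> {from_decimal_date(decimal_tmrca)}")
```

Under this convention the estimated TMRCA of October 1, 2019 corresponds to approximately 2019.748.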