Manisha Goyal1, Katrien De Bruyne2, Alex van Belkum1, Brian West3.
Abstract
The current pandemic of COVID-19 is caused by the SARS-CoV-2 virus, for which many variants at the Single Nucleotide Polymorphism (SNP) level have now been identified. We show here that different allelic variants among 692 SARS-CoV-2 genome sequences display a statistically significant association with geographic origin (p < 0.000001) and COVID-19 case severity (p = 0.016). Geographic variation in itself is associated with both case severity and allelic variation, especially in strains of Indian origin (p < 0.000001). Using a new alternative bioinformatics approach, we were able to confirm that the presence of the D614G mutation correlates with increased case severity in a sample of 127 sequences from a shared geographic origin in the US (p = 0.018). While leaving open the question of the pathogenic mechanism involved, this suggests that in specific geographic locales certain genotypes of the virus are more pathogenic than others. We show here that viral genome polymorphisms may have an effect on case severity when other factors are controlled for, but that this effect is swamped by those other factors when comparing cases across different geographic regions.
Keywords: COVID-19; Fatality risk; Genotyping; Haplotypes - geno-to-pheno correlation; SARS-CoV-2
Year: 2021 PMID: 33513449 PMCID: PMC7837616 DOI: 10.1016/j.meegid.2021.104730
Source DB: PubMed Journal: Infect Genet Evol ISSN: 1567-1348 Impact factor: 3.342
SARS-CoV-2 amino acid substitutions giving rise to haplotype variation as defined by genomic locus, position, and inferred date.
| Substitution | Locus | Codon # | Date |
|---|---|---|---|
| L → S | ORF8 | 84 | 2020-01-12 |
| D → G | S | 614 | 2020-01-12 |
| P → L | ORF1b | 314 | 2020-01-13 |
| Q → H | ORF3a | 57 | 2020-01-23 |
| T → I | ORF1a | 265 | 2020-02-23 |
| Y → C | ORF1b | 1464 | 2020-02-23 |
| P → L | ORF1b | 1427 | 2020-02-23 |
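The haplotype labels used in the tables that follow (for example `L.GL.YP.HI`) each carry seven residues, one per substitution site listed above. The sketch below decodes such a label into per-site residues. The site order in `ASSUMED_LOCI` is an assumption inferred from the reference residues (L, D, P, Y, P, Q, T) and is not stated in the source; `parse_haplotype` and `spike_614` are hypothetical helper names.

```python
# Assumed order of sites within a haplotype label such as "L.GL.YP.HI".
# This ordering is inferred from the reference residues in the table above;
# the source does not state it explicitly.
ASSUMED_LOCI = [
    ("ORF8", 84),
    ("S", 614),
    ("ORF1b", 314),
    ("ORF1b", 1464),
    ("ORF1b", 1427),
    ("ORF3a", 57),
    ("ORF1a", 265),
]


def parse_haplotype(label):
    """Map a label like 'L.GL.YP.HI' to {(locus, codon): residue}."""
    # Drop the separators (and stray spaces left by text extraction).
    residues = label.replace(".", "").replace(" ", "")
    if len(residues) != len(ASSUMED_LOCI):
        raise ValueError(
            f"expected {len(ASSUMED_LOCI)} residues, got {len(residues)}"
        )
    return dict(zip(ASSUMED_LOCI, residues))


def spike_614(label):
    """Return the residue at spike codon 614 ('D' or 'G') for a label."""
    return parse_haplotype(label)[("S", 614)]


if __name__ == "__main__":
    for name in ("L.DP.YP.QT", "L.GL.YP.HI", "S.DP.CL.QT"):
        print(name, "-> spike 614:", spike_614(name))
```

Under this assumed ordering, collapsing the haplotypes by their second residue yields the D versus G grouping used for the D614G comparison.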
Fig. 1. SARS-CoV-2 haplotype counts among samples included in this study providing adequate patient status assessment.
Patient status transformation into a numerical score of case severity.
| Patient Status | Case Severity |
|---|---|
| Asymptomatic | 1 |
| Mild case/Outpatient/Retirement home/Symptomatic | 2 |
| Alive/Released/Recovered | 3 |
| Hospitalized | 4 |
| Severe/ICU | 5 |
| Deceased | 6 |
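The transformation above amounts to a lookup table. A minimal sketch follows, assuming each metadata entry matches one of the listed status terms after trimming whitespace and lowercasing; the names `SEVERITY_BY_STATUS` and `case_severity` are hypothetical, and real free-text metadata would likely need more normalization than shown here.

```python
# Severity scores copied from the table above, keyed by lowercase status term.
SEVERITY_BY_STATUS = {
    "asymptomatic": 1,
    "mild case": 2,
    "outpatient": 2,
    "retirement home": 2,
    "symptomatic": 2,
    "alive": 3,
    "released": 3,
    "recovered": 3,
    "hospitalized": 4,
    "severe": 5,
    "icu": 5,
    "deceased": 6,
}


def case_severity(status):
    """Return the numerical severity for a status string, or None if unknown."""
    # Assumption: exact match after trimming and lowercasing.
    return SEVERITY_BY_STATUS.get(status.strip().lower())


if __name__ == "__main__":
    for s in ("Asymptomatic", "Hospitalized", " ICU ", "unknown"):
        print(repr(s), "->", case_severity(s))
```

Returning `None` for unrecognized entries makes it easy to drop samples that lack an adequate patient status assessment.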
Fig. 2. Minimum spanning tree for all SARS-CoV-2 genomes included in the present study. Genomes are labeled by haplotype and color-coded by country of origin.
Fig. 3. COVID-19 case severity by haplotype distribution (H = 2.360; p = 0.016743).
Fig. 4. Overview of COVID-19 case severity by country of origin (H = 58.285; p < 0.000001).
Contingency table for haplotype by country, with SARS-CoV-2 sequence counts shown; Chi-square = 597.170, p < 0.000001.
| Country | L.DP.YP.QT | L.GL.YP.HI | L.GL.YP.HT | L.GL.YP.QT | L.GP.YP.HI | L.GP.YP.HT | L.GP.YP.QT | S.DP.CL.QT | S.DP.YP.QT |
|---|---|---|---|---|---|---|---|---|---|
| USA | 8 | 131 | 14 | 22 | 2 | 2 | 1 | 11 | 8 |
| India | 4 | 1 | 62 | 46 | 0 | 1 | 0 | 0 | 1 |
| Spain | 5 | 2 | 1 | 44 | 0 | 0 | 5 | 0 | 33 |
| Italy | 6 | 0 | 0 | 81 | 0 | 0 | 0 | 0 | 0 |
| France | 6 | 26 | 16 | 22 | 0 | 0 | 0 | 0 | 0 |
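For readers who want to see how a chi-square statistic is computed from a contingency table like this one, here is a minimal sketch of the standard Pearson calculation in plain Python. It is not the authors' code; the source does not state which software or corrections were used, so the result need not match the reported value exactly. The function name `chi_square` is hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Counts copied from the table above (rows: USA, India, Spain, Italy, France).
counts = [
    [8, 131, 14, 22, 2, 2, 1, 11, 8],
    [4, 1, 62, 46, 0, 1, 0, 0, 1],
    [5, 2, 1, 44, 0, 0, 5, 0, 33],
    [6, 0, 0, 81, 0, 0, 0, 0, 0],
    [6, 26, 16, 22, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    stat, dof = chi_square(counts)
    print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

Several haplotype columns have very small totals, so many expected counts are low and the chi-square approximation should be read with caution for those cells.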
ANOVA tests on numerical case severity versus SARS-CoV-2 haplotype.
| Country | Statistic | p-value |
|---|---|---|
| USA | H = 11.222 | 0.129228 |
| India | H = 0.557 | 0.756956 |
| France | H = 2.383 | 0.496744 |
| Italy | Sum of ranks: | 0.023739 |
| Spain | H = 14.210 | 0.006653 |
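H is the symbol conventionally used for the Kruskal-Wallis rank statistic, so the sketch below assumes that test. It computes H with the usual tie correction, which matters here because severity scores take only six distinct values. The function names and the example scores are illustrative and are not data from the study.

```python
def midranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of non-empty groups.

    Assumes the pooled values are not all identical.
    """
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = midranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over runs of tied values.
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    ties = sum(t ** 3 - t for t in tie_counts.values())
    return h / (1 - ties / (n ** 3 - n))


if __name__ == "__main__":
    # Made-up severity scores for three hypothetical haplotype groups.
    example = [[2, 3, 3, 4], [3, 4, 6, 6], [1, 2, 2, 3]]
    print("H =", round(kruskal_wallis_h(example), 3))
```

With only two groups, as in the Italy row, the comparison reduces to a two-sample rank-sum test, which is consistent with that row reporting a sum of ranks rather than H.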
SARS-CoV-2 haplotype counts for geographic divisions.
| Geographic division | L.DP.YP.QT | L.GL.YP.HI | L.GL.YP.HT | L.GL.YP.QT | L.GP.YP.HI | L.GP.YP.HT | L.GP.YP.QT | S.DP.CL.QT | S.DP.YP.QT |
|---|---|---|---|---|---|---|---|---|---|
| California | 8 | 76 | 14 | 18 | 2 | 2 | 1 | 6 | 6 |
| Gujarat | 2 | 1 | 62 | 44 | 0 | 1 | 0 | 0 | 0 |
| Ile de France | 6 | 26 | 16 | 22 | 0 | 0 | 0 | 0 | 0 |
| Louisiana | 0 | 40 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Abruzzo | 0 | 0 | 0 | 23 | 0 | 0 | 0 | 0 | 0 |
| Basque Country | 0 | 0 | 0 | 13 | 0 | 0 | 3 | 0 | 5 |
| Lombardy | 0 | 0 | 0 | 19 | 0 | 0 | 0 | 0 | 0 |
| Texas | 0 | 9 | 0 | 3 | 0 | 0 | 0 | 1 | 2 |
| Friuli Venezia Giulia | 0 | 0 | 0 | 12 | 0 | 0 | 0 | 0 | 0 |
| Apulia | 0 | 0 | 0 | 12 | 0 | 0 | 0 | 0 | 0 |
| Andalusia | 2 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 7 |
| Aragon | 1 | 0 | 1 | 7 | 0 | 0 | 0 | 0 | 1 |
| Galicia | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 4 |
| La Rioja | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9 |
| Castilla | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 5 |
| Campania | 0 | 0 | 0 | 7 | 0 | 0 | 0 | 0 | 0 |
| Puerto Rico | 0 | 3 | 0 | 1 | 0 | 0 | 0 | 3 | 0 |
| Lazio | 6 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Veneto | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 0 |
| Melilla | 0 | 1 | 0 | 4 | 0 | 0 | 0 | 0 | 0 |
| Catalunya | 0 | 0 | 0 | 4 | 0 | 0 | 1 | 0 | 0 |
| Madrid | 1 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 |
| Telangana | 2 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 |
| Navarra | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 |
| Comunitat Valenciana | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2 |
| Canarias | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| South Carolina | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Florida | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Marche | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| None | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| Montana | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
Fig. 5. COVID case severity versus haplotype in California, USA (H = 12.514; p = 0.129694).
Fig. 6. COVID case severity versus the D614G mutation (Sum of ranks: G 7913.5, D 997.5; p = 0.031085).
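The "sum of ranks" figures in the Fig. 6 caption are the quantity on which a two-sample rank-sum (Mann-Whitney or Wilcoxon) comparison is built. The sketch below shows how such sums arise from pooled severity scores, assuming tied scores share their average rank; the function names and example scores are illustrative, and the source does not say which exact test variant produced the reported p-value.

```python
def midranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sums(group_a, group_b):
    """Rank the pooled scores and return (sum of ranks in a, sum in b)."""
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    split = len(group_a)
    return sum(ranks[:split]), sum(ranks[split:])


if __name__ == "__main__":
    # Made-up severity scores for hypothetical G and D carriers.
    g_scores = [3, 4, 4, 6, 6]
    d_scores = [2, 3, 3]
    sum_g, sum_d = rank_sums(g_scores, d_scores)
    n = len(g_scores) + len(d_scores)
    # The two sums always add up to n * (n + 1) / 2.
    print(sum_g, sum_d, sum_g + sum_d == n * (n + 1) / 2)
```

The same identity offers a quick consistency check on published figures: the two sums in the caption total 8911, which equals n(n + 1)/2 for n = 133.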
Fig. 7. Minimum spanning tree covering haplotype diversity at the D614G level in association with disease severity. Note that deceased patients are entirely in the G cluster, as are all but one of the still hospitalized patients.