Lue Ping Zhao, Terry P Lybrand, Peter Gilbert, Margaret Madeleine, Thomas H Payne, Seth Cohen, Daniel E Geraghty, Keith R Jerome, Lawrence Corey.
Abstract
Importance: With timely collection of SARS-CoV-2 viral genome sequences, it is important to apply efficient data analytics to detect emerging variants at the earliest time.

Objective: To evaluate the application of a statistical learning strategy (SLS) to improve early detection of novel SARS-CoV-2 variants using viral sequence data from global surveillance.

Design, Setting, and Participants: This case series applied an SLS to viral genomic sequence data collected from 63 686 individuals in Africa and 531 827 individuals in the United States with SARS-CoV-2. Data were collected from January 1, 2020, to December 28, 2021.

Main Outcomes and Measures: The outcome was an indicator of Omicron variant derived from viral sequences. Centering on a temporally collected outcome, the SLS used the generalized additive model to estimate locally averaged Omicron caseload percentages (OCPs) over time to characterize Omicron expansion and to estimate when OCP exceeded 10%, 25%, 50%, and 75% of the caseload. Additionally, an unsupervised learning technique was applied to visualize Omicron expansions, and temporal and spatial distributions of Omicron cases were investigated.
Year: 2022 PMID: 36069983 PMCID: PMC9453543 DOI: 10.1001/jamanetworkopen.2022.30293
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Figure 1. Omicron Haplotype and Polymorphic Variations in US Sequences Detected in 10 or More Viruses
Omicron samples were collected from October 1 to December 24, 2021. Omicron haplotypes were assumed to have at least 10 mutations from the reference and, in the event of missing amino acids, the remaining amino acids were required to match at least 80% of selected Omicron haplotypes. A full set of haplotypes in the United States is listed in eTable 1 in the Supplement, while those in Africa are listed in eTable 7 in the Supplement. A total of 28 polymutants (PMs) were selected, as follows: 1, A67; 2, T95; 3, G339; 4, S371; 5, S373; 6, S375; 7, K417; 8, N440; 9, G446; 10, S477; 11, T478; 12, E484; 13, Q493; 14, G496; 15, Q498; 16, N501; 17, Y505; 18, T547; 19, D614; 20, H655; 21, N679; 22, P681; 23, N764; 24, D796; 25, N856; 26, Q954; 27, N969; and 28, L981. ID indicates identifier.
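The haplotype criteria in the caption can be illustrated with a small sketch. It assumes that the letter preceding each position number (A67, T95, and so on) is the reference residue, so that the 28 letters form a reference string, and that `X` marks a missing amino acid. The names `REFERENCE`, `CORE_OMICRON`, `mutation_count`, and `match_fraction` are hypothetical and are not taken from the authors' software; how the two quantities are combined into a final Omicron call is not spelled out in this record, so the sketch stops at computing them.

```python
# Reference residues at the 28 selected spike positions, read off the caption
# (A67, T95, G339, ..., L981). Treating these letters as the reference is an assumption.
REFERENCE = "ATGSSSKNGSTEQGQNYTDHNPNDNQNL"
# The haplotype that appears most often in the table of first Omicron cases.
CORE_OMICRON = "VIDLPFNKSNKARSRYHKGYKHKYKHKF"
MISSING = "X"  # assumed marker for a missing amino acid


def mutation_count(haplotype, reference=REFERENCE):
    """Count observed residues that differ from the reference; missing residues are skipped."""
    if len(haplotype) != len(reference):
        raise ValueError("haplotype and reference must have the same length")
    return sum(1 for h, r in zip(haplotype, reference) if h != MISSING and h != r)


def match_fraction(haplotype, core=CORE_OMICRON):
    """Fraction of observed (non-missing) residues that equal the core haplotype."""
    if len(haplotype) != len(core):
        raise ValueError("haplotype and core must have the same length")
    observed = [(h, c) for h, c in zip(haplotype, core) if h != MISSING]
    if not observed:
        return 0.0
    return sum(1 for h, c in observed if h == c) / len(observed)


if __name__ == "__main__":
    partial = "VI" + MISSING * 15 + "KGYKHKYNHKF"  # a haplotype with 15 missing residues
    print(mutation_count(CORE_OMICRON))                    # 28
    print(mutation_count("VIDLPFNKSSKEQGQYYKGYKHKYKHKF"))  # 22
    print(mutation_count(partial))                         # 12
    print(round(match_fraction(partial), 3))               # 0.923
```

Under these assumptions, the thresholds named in the caption (at least 10 mutations, at least 80% agreement among observed residues) would be applied to the two returned values.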
First Omicron Cases Detected in Africa and the United States
| ID | Collection date | Location | Gender | Age | Lineage | Clade | Spike haplotype | Mutations, No. |
|---|---|---|---|---|---|---|---|---|
| Africa | | | | | | | | |
| 1 | 12/31/2020 | South Africa, Eastern Cape | Male | 57 | B.1.576 | GRA | VIGSSSNXGSTKQGQYYXGYKHXDXHKF | 12 |
| 2 | 9/30/2021 | South Africa, Gauteng | Female | 52 | BA.1 | G | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 3 | 10/12/2021 | South Africa, Eastern Cape | Female | 16 | BA.1 | GRA | VIDLPFNKSSKEQGQYYKGYKHKYKHKF | 22 |
| 4 | 10/17/2021 | Nigeria, Abuja | Male | 32 | BA.1 | GRA | VIGSSSKKSNKARSRYHKGYKHKYKHKF | 23 |
| 5 | 10/24/2021 | South Africa, Eastern Cape | Female | 22 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 6 | 11/2/2021 | Nigeria, Abuja | Male | 51 | BA.1 | GRA | VIXXXXXXXXXXXXXXXKGYKHKYNHKF | 12 |
| 7 | 11/2/2021 | South Africa, Northern Cape | Female | 28 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 8 | 11/5/2021 | South Africa, Gauteng | Male | 26 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 9 | 11/8/2021 | South Africa, Gauteng | Unknown | Unknown | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 10 | 11/9/2021 | South Africa, Gauteng | Male | 23 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 11 | 11/9/2021 | South Africa, Gauteng | Male | 34 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 12 | 11/9/2021 | South Africa, Gauteng | Unknown | Unknown | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 13 | 11/9/2021 | Senegal/Dakar, Iressef Diamniadio | Female | 42 | B.1.1.529 | G | VIDLPFKKSNKARSRYHKGYKHKYKHKF | 27 |
| United States | | | | | | | | |
| 1 | 11/21/2021 | Maryland | Female | 40 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 2 | 11/22/2021 | New York City | Male | 33 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 3 | 11/24/2021 | Minnesota | Unknown | Unknown | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 4 | 11/24/2021 | New York City | Female | 32 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 5 | 11/24/2021 | Missouri | Female | 25 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 6 | 11/24/2021 | Virginia | Female | 23 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 7 | 11/25/2021 | New York City | Male | 30 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
| 8 | 11/25/2021 | New York | Male | 26 | BA.1 | GRA | VIDLPFNKSNKARSRYHKGYKHKYKHKF | 28 |
Abbreviation: ID, identification.
This Omicron haplotype is imputed to be the same core haplotype from VIDXXXXXXXXXXXXXXKGYKHKYKHKF, in which multiple missing residues were consecutive and were likely due to sequencing quality.
Figure 2. Locally Averaged Omicron Caseload Percentages
A, Omicron variant expansion in African countries. B, Omicron percentages over time across US states, organized by temporal pattern similarities.
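A locally averaged Omicron caseload percentage and its threshold-crossing times can be sketched as follows. The study used a generalized additive model; the sketch below substitutes a much simpler Gaussian-kernel weighted average purely for illustration, and the bandwidth, function names, and synthetic data are all assumptions rather than anything reported by the authors.

```python
import math


def local_average_pct(days, is_omicron, grid, bandwidth=3.0):
    """Gaussian-kernel weighted percentage of Omicron sequences at each grid day.

    This is a stand-in for the generalized additive model, not a reimplementation of it.
    """
    pct = []
    for g in grid:
        weights = [math.exp(-0.5 * ((d - g) / bandwidth) ** 2) for d in days]
        total = sum(weights)
        hits = sum(w for w, y in zip(weights, is_omicron) if y)
        pct.append(100.0 * hits / total if total > 0 else float("nan"))
    return pct


def first_crossing(grid, pct, threshold):
    """First grid day whose smoothed percentage reaches the threshold, else None."""
    for g, p in zip(grid, pct):
        if p >= threshold:
            return g
    return None


def toy_data(n_days=30, per_day=20, midpoint=15.0, scale=2.0):
    """Synthetic sequences: the Omicron share follows a logistic curve over time."""
    days, flags = [], []
    for day in range(n_days):
        share = 1.0 / (1.0 + math.exp(-(day - midpoint) / scale))
        n_pos = round(per_day * share)
        days += [day] * per_day
        flags += [True] * n_pos + [False] * (per_day - n_pos)
    return days, flags


if __name__ == "__main__":
    days, flags = toy_data()
    grid = list(range(30))
    pct = local_average_pct(days, flags, grid)
    for threshold in (10, 25, 50, 75):
        print(threshold, first_crossing(grid, pct, threshold))
```

A threshold that is never reached returns `None`, which corresponds to the NA entries in the tables of estimated times.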
Estimated Time for Omicron Caseload Percentages to Reach the 10%, 25%, 50%, and 75% Thresholds Across 12 African Countries
| African country | Negative frequency | Positive frequency | First detection | Time to 10%, d (95% CI) | Time to 25%, d (95% CI) | Time to 50%, d (95% CI) | Time to 75%, d (95% CI) |
|---|---|---|---|---|---|---|---|
| South Africa | 24 132 | 1908 | 12/31/2020 | 306 (302 to 308) | 311 (309 to 313) | 317 (315 to 318) | 322 (321 to 323) |
| Nigeria | 3626 | 84 | 10/17/2021 | 31 (25 to 38) | 37 (34 to 41) | 43 (41 to 45) | 49 (46 to 51) |
| Senegal | 704 | 29 | 11/9/2021 | 5 (−6 to 19) | 15 (7 to 24) | 25 (17 to 31) | 35 (27 to ∞) |
| Ghana | 2264 | 77 | 11/10/2021 | 0 (−4 to 5) | 5 (1 to 9) | 11 (7 to 15) | 20 (16 to ∞) |
| Botswana | 1781 | 388 | 11/11/2021 | 7 (−1 to 9) | 12 (6 to 14) | 17 (15 to 19) | 21 (20 to 23) |
| Reunion | 5096 | 7 | 11/22/2021 | NA | NA | NA | NA |
| Kenya | 5465 | 35 | 11/26/2021 | −7 (−10 to −4) | −2 (−4 to 1) | 3 (0 to 7) | 8 (4 to 14) |
| Mozambique | 910 | 17 | 11/26/2021 | −3 (−21 to 2) | −2 (−20 to 2) | −1 (−20 to 2) | 1 (−19 to 3) |
| Uganda | 926 | 18 | 11/29/2021 | −9 (−18 to 3) | 11 (2 to ∞) | NA | NA |
| Zambia | 876 | 46 | 11/30/2021 | −46 (−49 to −45) | −44 (−47 to −43) | −42 (−45 to −41) | −41 (−43 to −39) |
| Malawi | 773 | 34 | 12/2/2021 | −20 (−41 to −19) | −19 (−40 to −18) | −18 (−38 to −18) | −18 (−36 to −17) |
| Morocco | 458 | 21 | 12/14/2021 | −8 (−15 to −2) | −3 (−8 to 1) | 2 (−2 to 6) | 7 (3 to ∞) |
Abbreviation: NA, not applicable.
Frequencies of viruses carrying Omicron (positive) or not (negative).
The 95% CIs were obtained from the 2.5th and 97.5th percentiles of the empirical distribution computed from 1000 bootstraps. The time was inestimable if the Omicron caseload had not crossed the targeted percentage. The upper bound of the 95% CI was left as ∞ if it was beyond December 28, 2021. A negative day is an extrapolated value that arises when the model is fitted to sparse data in which the Omicron percentage jumped from 0% to 100% over a short time (see example in eFigures 1 and 2 in the Supplement on Zambia).
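The percentile bootstrap described in this footnote can be sketched in a few lines. The estimator here is deliberately crude (the first day whose raw daily Omicron percentage reaches the threshold) and is only a placeholder for the model-based estimate used in the study. Treating a bootstrap replicate that never crosses the threshold as ∞ is an assumption made so that an unbounded upper limit can appear, and all function names and the synthetic data are hypothetical.

```python
import math
import random
from collections import defaultdict


def crossing_day(records, threshold):
    """First day whose raw daily Omicron percentage reaches the threshold, else None."""
    totals, hits = defaultdict(int), defaultdict(int)
    for day, is_omicron in records:
        totals[day] += 1
        hits[day] += int(is_omicron)
    for day in sorted(totals):
        if 100.0 * hits[day] / totals[day] >= threshold:
            return day
    return None


def bootstrap_ci(records, threshold, n_boot=1000, seed=0):
    """Percentile bootstrap: resample sequences with replacement and re-estimate each time."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(records, k=len(records))
        day = crossing_day(sample, threshold)
        # Assumption: a replicate that never crosses the threshold counts as infinity.
        estimates.append(math.inf if day is None else day)
    estimates.sort()
    lower = estimates[(n_boot * 25) // 1000]                     # about the 2.5th percentile
    upper = estimates[min(n_boot - 1, (n_boot * 975) // 1000)]   # about the 97.5th percentile
    return lower, upper


def toy_records(n_days=20, per_day=30, midpoint=10.0, scale=1.5):
    """Synthetic (day, is_omicron) records with a logistic rise in the Omicron share."""
    records = []
    for day in range(n_days):
        share = 1.0 / (1.0 + math.exp(-(day - midpoint) / scale))
        n_pos = round(per_day * share)
        records += [(day, True)] * n_pos + [(day, False)] * (per_day - n_pos)
    return records


if __name__ == "__main__":
    records = toy_records()
    print(crossing_day(records, 50), bootstrap_ci(records, 50))
```

Sorting the replicate estimates and reading off fixed ranks avoids interpolating between finite values and ∞, which would otherwise produce undefined results.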
Estimated Time for Omicron Caseload Percentages to Reach the 10%, 25%, 50%, and 75% Thresholds Across 38 US States
| State | Negative frequency | Positive frequency | First detection | Time to 25%, d (95% CI) | Time to 50%, d (95% CI) | Time to 75%, d (95% CI) | Cluster group |
|---|---|---|---|---|---|---|---|
| Alabama | 930 | 15 | 12/5/2021 | 9 (−1 to 11) | 12 (10 to ∞) | NA | Middle |
| Arizona | 15 452 | 168 | 12/5/2021 | 11 (11 to 12) | 14 (14 to ∞) | NA | Late |
| California | 78 404 | 3426 | 11/26/2021 | 18 (18 to 18) | 20 (20 to 21) | NA | Middle |
| Colorado | 49 426 | 396 | 11/29/2021 | 13 (12 to 13) | 15 (14 to 16) | NA | Middle |
| Connecticut | 8596 | 162 | 11/28/2021 | 18 (16 to ∞) | NA | NA | Late |
| DC | 1697 | 307 | 11/30/2021 | 10 (9 to 11) | 12 (11 to 13) | 14 (14 to 15) | Early |
| Florida | 9690 | 275 | 11/26/2021 | 14 (14 to 15) | 17 (16 to 17) | 19 (19 to 21) | Early |
| Georgia | 4687 | 274 | 11/30/2021 | 11 (10 to 12) | 13 (12 to 13) | 14 (14 to 15) | Early |
| Hawaii | 1364 | 93 | 11/27/2021 | 7 (6 to 8) | 9 (8 to 10) | 11 (10 to 13) | Early |
| Idaho | 3166 | 21 | 12/5/2021 | 12 (10 to 14) | NA | NA | Late |
| Illinois | 13 425 | 248 | 11/30/2021 | 12 (12 to 14) | NA | NA | Late |
| Indiana | 11 120 | 93 | 12/8/2021 | 8 (7 to 8) | 10 (9 to 11) | NA | Late |
| Iowa | 4365 | 31 | 12/6/2021 | 10 (8 to ∞) | NA | NA | Late |
| Kansas | 4289 | 20 | 12/13/2021 | 6 (5 to ∞) | 8 (7 to ∞) | NA | Late |
| Kentucky | 4168 | 38 | 12/11/2021 | 5 (4 to 6) | 7 (6 to ∞) | NA | Late |
| Louisiana | 1742 | 333 | 12/1/2021 | 5 (4 to 6) | 7 (6 to 8) | 12 (9 to 16) | Early |
| Maryland | 8982 | 395 | 11/21/2021 | 21 (20 to 21) | 25 (24 to 26) | NA | Middle |
| Massachusetts | 40 270 | 288 | 11/27/2021 | 18 (17 to 18) | 19 (19 to 20) | 22 (21 to ∞) | Middle |
| Michigan | 19 568 | 21 | 12/1/2021 | NA | NA | NA | Late |
| Minnesota | 34 004 | 76 | 11/24/2021 | NA | NA | NA | Late |
| Mississippi | 1442 | 13 | 11/29/2021 | 20 (15 to ∞) | NA | NA | Late |
| Nebraska | 5447 | 21 | 11/29/2021 | NA | NA | NA | Late |
| Nevada | 6691 | 23 | 12/8/2021 | NA | NA | NA | Late |
| New Jersey | 11 479 | 203 | 11/26/2021 | 18 (16 to 19) | NA | NA | Late |
| New York | 23 536 | 1975 | 11/22/2021 | 17 (17 to 18) | 21 (21 to 22) | 25 (24 to 25) | Early |
| North Carolina | 10 833 | 145 | 12/2/2021 | 14 (13 to 14) | 16 (15 to 17) | 18 (17 to ∞) | Late |
| Ohio | 9686 | 376 | 11/29/2021 | 14 (13 to 14) | 16 (16 to 17) | 19 (18 to 20) | Middle |
| Oregon | 4935 | 83 | 12/7/2021 | 8 (7 to 9) | 11 (10 to 13) | NA | Late |
| Pennsylvania | 13 210 | 96 | 11/28/2021 | 15 (15 to 17) | NA | NA | Late |
| Rhode Island | 3507 | 15 | 11/30/2021 | NA | NA | NA | Late |
| South Carolina | 2732 | 56 | 12/4/2021 | 11 (9 to 12) | NA | NA | Late |
| Tennessee | 4602 | 35 | 11/26/2021 | 19 (18 to 21) | NA | NA | Late |
| Texas | 16 336 | 1096 | 11/27/2021 | 12 (12 to 13) | 14 (14 to 15) | 17 (16 to 17) | Early |
| Utah | 9338 | 20 | 11/29/2021 | NA | NA | NA | Late |
| Virginia | 5088 | 94 | 11/24/2021 | 20 (19 to 21) | 22 (21 to ∞) | NA | Middle |
| Washington | 13 643 | 688 | 11/27/2021 | 15 (14 to 15) | 18 (18 to 19) | NA | Middle |
| West Virginia | 6415 | 19 | 12/2/2021 | NA | NA | NA | Late |
| Wisconsin | 13 477 | 435 | 11/27/2021 | 18 (18 to 18) | 22 (21 to 22) | NA | Late |
Abbreviation: NA, not applicable.
Frequencies of viruses carrying Omicron (positive) or not (negative).
The 95% CIs were obtained from the 2.5th and 97.5th percentiles of the empirical distribution computed from 1000 bootstraps. The time was inestimable if the Omicron caseload had not crossed the targeted percentage. The upper bound of the 95% CI was left as ∞ if it was beyond December 28, 2021.