
CoV-Spectrum: Analysis of Globally Shared SARS-CoV-2 Data to Identify and Characterize New Variants.

Chaoran Chen1,2, Sarah Nadeau1,2, Michael Yared3, Philippe Voinov3, Ning Xie4, Cornelius Roemer2,5, Tanja Stadler1,2.   

Abstract

SUMMARY: The CoV-Spectrum website supports the identification of new SARS-CoV-2 variants of concern and the tracking of known variants. Its flexible amino acid and nucleotide mutation search allows querying of variants before they are designated by a lineage nomenclature system. The platform brings together SARS-CoV-2 data from different sources and applies analyses. Results include the proportion of different variants over time, their demographic and geographic distributions, common mutations, hospitalization and mortality probabilities, estimates for transmission fitness advantage and insights obtained from wastewater samples.
AVAILABILITY AND IMPLEMENTATION: CoV-Spectrum is available at https://cov-spectrum.ethz.ch. The code is released under the GPL-3.0 license at https://github.com/cevo-public/cov-spectrum-website.
© The Author(s) 2021. Published by Oxford University Press.

Year:  2021        PMID: 34954792      PMCID: PMC8896605          DOI: 10.1093/bioinformatics/btab856

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Most mutations in the SARS-CoV-2 genome do not cause phenotypic changes in the virus. However, some mutations may change the virus such that it (i) is more transmissible, (ii) causes a more severe disease outcome or (iii) can evade immunity acquired through infection or vaccination. A new variant with one of these properties is classified as a variant of concern (VOC; World Health Organization, 2021). It is crucial to rapidly identify and characterize new VOCs so that public health measures can be adapted to emerging threats. Demonstrating in real time that a new variant meets one of the VOC properties (i)–(iii) is an ongoing challenge for public health. In particular, observing a quickly spreading variant does not necessarily imply a transmission advantage. Many factors can influence a variant’s observed spread: geographic biases and different interventions in different regions, demographic biases and different behaviors in different populations, varying contact tracing efforts and varying test regimes, among others. For example, the variant named 20E (EU1) (B.1.177) spread across Europe in summer 2020. However, rather than having a transmission advantage, it appears that superspreading events and travel activities drove this spread (Hodcroft et al., 2021). Therefore, it is important to investigate a broad set of data from different regions to evaluate the risk posed by new variants. This was done for the variant Alpha (B.1.1.7), which showed consistent relative growth in many countries, indicating an intrinsic transmission advantage of 43–90% (Davies et al., 2021).

The CoV-Spectrum website aims to help track known VOCs and facilitate early identification of new ones. It brings together the global public dataset of genomic sequences and additional epidemiological data (Section 2.1) to provide a multifaceted view of a variant.
The website’s variant search feature allows users to track combinations of amino acid and nucleotide mutations, in addition to already designated lineages.
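
As a rough illustration of what such a mutation-based query involves, the following Python sketch selects all sequences that carry every mutation in a query and reports their share per ISO week. The record layout, the function names and the example mutations are assumptions made for this sketch; they do not describe how CoV-Spectrum itself stores or queries sequences.

```python
from collections import defaultdict
from datetime import date

# Hypothetical record layout: (sampling date, mutations found in the sequence).
Sample = tuple[date, frozenset[str]]


def matches(query: set[str], mutations: frozenset[str]) -> bool:
    """Return True if the sequence carries every mutation in the query."""
    return query <= mutations


def weekly_proportion(samples: list[Sample], query: set[str]) -> dict[tuple[int, int], float]:
    """Share of sequences per ISO (year, week) that match the query."""
    totals: dict[tuple[int, int], int] = defaultdict(int)
    hits: dict[tuple[int, int], int] = defaultdict(int)
    for sampled_on, mutations in samples:
        iso = sampled_on.isocalendar()
        week = (iso[0], iso[1])
        totals[week] += 1
        if matches(query, mutations):
            hits[week] += 1
    return {week: hits[week] / totals[week] for week in sorted(totals)}


# Made-up example data: four sequences sampled in two consecutive weeks.
samples: list[Sample] = [
    (date(2021, 1, 4), frozenset({"S:N501Y", "S:A570D"})),
    (date(2021, 1, 5), frozenset({"S:N501Y"})),  # only a partial match
    (date(2021, 1, 11), frozenset({"S:N501Y", "S:A570D", "S:D614G"})),
    (date(2021, 1, 12), frozenset({"S:N501Y", "S:A570D"})),
]
proportions = weekly_proportion(samples, {"S:N501Y", "S:A570D"})
```

Because the query is a plain set of mutations, the same function works for a variant that has not yet received a lineage name.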

2 Materials and methods

2.1 Data sources and data presentation

The primary data presented by CoV-Spectrum are genomic sequences. We currently provide two instances of CoV-Spectrum: one that uses data provided by GISAID (Elbe and Buckland-Merrett, 2017), and another that uses data from NCBI GenBank provided through Nextstrain (Hadfield et al., 2018). These are whole-genome sequences of SARS-CoV-2 from countries across the globe, together with basic metadata such as the sampling date, location (often at the level of national divisions) and, for some sequences, the age and sex of the infected individual. We clean the location data with Nextstrain’s geo location rules (https://github.com/nextstrain/ncov-ingest/blob/master/source-data/gisaid_geoLocationRules.tsv) and run Nextclade (Aksamentov et al., 2021) to obtain aligned nucleotide and amino acid sequences. These data are updated daily.

CoV-Spectrum uses these data to create plots summarizing the raw data and to perform statistical analyses. The plots include the prevalence, estimated number of cases, demographic and geographic distributions and the common mutations of a variant. Some of the plots can further be stratified by geographic division. The stratified plots are presented in a grid, enabling the user to visually check whether the same dynamic is present in different divisions.

For Switzerland, we receive additional metadata from the Swiss Federal Office of Public Health. These metadata are linked to the whole-genome sequences and include, for example, additional demographic, hospitalization and mortality information. CoV-Spectrum uses this unique dataset to compute hospitalization and mortality probabilities for different age groups, and shows a plot comparing these probabilities for confirmed cases infected with a selected variant against those for cases infected with other variants. This enables direct assessment of VOC property (ii) (severe outcome).
With this feature, we can see that the hospitalization probability of cases infected with the Alpha variant is indeed higher in older age groups, as suggested by other studies (Challen et al., 2021; Davies et al., 2021).
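
The comparison described above can be sketched in Python as a crude per-age-group proportion. This is a minimal illustration under stated assumptions: the case record layout, the function name and the example numbers are invented here, and the sketch computes raw proportions without confidence intervals or any adjustment, which may differ from what CoV-Spectrum reports.

```python
from collections import defaultdict
from typing import Optional

# Hypothetical case record: (age group, variant label, hospitalized?).
Case = tuple[str, str, bool]


def hospitalization_probability(
    cases: list[Case], variant: str
) -> dict[str, tuple[Optional[float], Optional[float]]]:
    """Per age group, return (share hospitalized among cases with the variant,
    share hospitalized among cases with any other variant).

    A share is None when the corresponding group contains no cases.
    """
    # Counters per age group: [variant hosp., variant total, other hosp., other total]
    counts: dict[str, list[int]] = defaultdict(lambda: [0, 0, 0, 0])
    for age_group, case_variant, hospitalized in cases:
        offset = 0 if case_variant == variant else 2
        counts[age_group][offset] += int(hospitalized)
        counts[age_group][offset + 1] += 1
    result = {}
    for age_group, (v_hosp, v_total, o_hosp, o_total) in counts.items():
        result[age_group] = (
            v_hosp / v_total if v_total else None,
            o_hosp / o_total if o_total else None,
        )
    return result


# Made-up example data, not real surveillance numbers.
cases: list[Case] = [
    ("60-69", "Alpha", True),
    ("60-69", "Alpha", False),
    ("60-69", "other", True),
    ("60-69", "other", False),
    ("60-69", "other", False),
    ("60-69", "other", False),
    ("20-29", "Alpha", False),
    ("20-29", "Alpha", False),
]
probabilities = hospitalization_probability(cases, "Alpha")
```

Returning None for empty groups keeps a missing comparison visibly distinct from a probability of zero.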

2.2 Statistical analysis

In addition to presenting the raw data, CoV-Spectrum applies statistical analyses to them. First, CoV-Spectrum shows the mutations that occur in sequences of a variant and ranks them by their Jaccard similarity, which helps identify the mutations that are specific to that variant. Second, CoV-Spectrum integrates a model to estimate variant transmission fitness advantages, as described in Chen et al. (2021). Chen et al. (2021) present static results for the Alpha variant in Switzerland, whereas CoV-Spectrum allows users to explore results for any variant and country. This enables assessment of VOC property (i) (increased transmissibility). For Switzerland, we additionally receive estimates of the proportion of different variants in wastewater samples from collaborators. The underlying procedure is described in Jahn et al. (2021) and is currently applied to a selection of variants for which characteristic mutations are manually chosen. This makes it possible to assess whether the spread of mutations identified in clinical data is confirmed by wastewater data.
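
The first two analyses can be illustrated with a short Python sketch. It makes several assumptions that are not taken from the text: the Jaccard similarity is computed between the set of sequences carrying a mutation and the set of sequences assigned to the variant; the growth rate is obtained by an ordinary least-squares fit on the log-odds of the variant proportion, a simplification standing in for the model of Chen et al. (2021); and the growth rate a is converted to a per-generation advantage as exp(a·g) − 1 for an assumed generation time g. All function names, the example data and the value g = 4.8 days are illustrative only.

```python
import math


def jaccard(a: set[str], b: set[str]) -> float:
    """|a ∩ b| / |a ∪ b|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_mutations(variant_ids: set[str], carriers: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Rank mutations by the Jaccard similarity between the sequences that
    carry them and the sequences of the variant, highest first."""
    scored = [(mutation, jaccard(ids, variant_ids)) for mutation, ids in carriers.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def logistic_growth_rate(days: list[float], proportions: list[float]) -> float:
    """Least-squares slope of logit(p) against time.

    Simplified stand-in for a proper binomial fit; every proportion must lie
    strictly between 0 and 1.
    """
    logits = [math.log(p / (1 - p)) for p in proportions]
    mean_t = sum(days) / len(days)
    mean_y = sum(logits) / len(logits)
    numerator = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logits))
    denominator = sum((t - mean_t) ** 2 for t in days)
    return numerator / denominator


def fitness_advantage(growth_rate: float, generation_time: float) -> float:
    """Per-generation advantage exp(a * g) - 1 for growth rate a and generation time g."""
    return math.exp(growth_rate * generation_time) - 1


# Made-up sequence identifiers: the variant comprises s1..s3.
variant_ids = {"s1", "s2", "s3"}
carriers = {
    "S:N501Y": {"s1", "s2", "s3", "s4"},
    "S:D614G": {"s1", "s2", "s3", "s4", "s5", "s6"},
    "S:A570D": {"s1", "s2", "s3"},
}
ranking = rank_mutations(variant_ids, carriers)

# Synthetic proportions generated from a logistic curve with rate 0.1 per day.
days = [0.0, 7.0, 14.0, 21.0]
true_rate = 0.1
proportions = [1 / (1 + math.exp(-true_rate * (t - 14.0))) for t in days]
rate = logistic_growth_rate(days, proportions)
advantage = fitness_advantage(rate, generation_time=4.8)
```

In this toy example the mutation found in exactly the variant's sequences ranks first, while a mutation that is also widespread outside the variant ranks last. On real data the proportions are noisy, so a fit that weights each time point by its sample size would be more appropriate than the unweighted regression used here.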

2.3 Linking other services

Many COVID-19 dashboards and web tools have been developed since the start of the pandemic, each with its own specific use case. CoV-Spectrum can serve as a hub between several of these services: the website directly integrates external services so that users can obtain more information about selected variants at the click of a button. For example, CoV-Spectrum can send a list of sequence identifiers to UShER (Turakhia et al., 2021), which will then place the sequences on a predefined tree. It can also redirect the user to Taxonium (Sanderson, 2021), which highlights the selected variant in a precomputed global tree with millions of nodes. Finally, it links to CoVariants (Hodcroft, 2021), which provides users with curated information about a variant.

2.4 Sharing of results

To promote dissemination of real-time results, CoV-Spectrum’s plots and tables are made available to external websites via iframes. These plots remain interactive and are automatically updated as new data arrive. For example, we used this technique to integrate plots into a dedicated website explaining the spread of the Alpha variant in Switzerland (https://cevo-public.github.io/Quantification-of-the-spread-of-a-SARS-CoV-2-variant/).

2.5 Implementation

The frontend of CoV-Spectrum is a single-page React application written in TypeScript. It retrieves data from two REST APIs. First, CoV-Spectrum’s own server application provides the non-sequence data. Second, our Lightweight API for Sequences (LAPIS; Chen and Stadler, 2021) provides the sequence data. LAPIS is a general API to query sequences that is maintained as a separate project. The servers are written in Kotlin and Java using the Spring Boot framework. The data are stored in a PostgreSQL database.
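
To make the client side of such a REST API concrete, the following Python sketch assembles a query URL for aggregated sequence counts. Everything specific in it is hypothetical: the base URL is a placeholder, and the endpoint path and parameter names (`country`, `aaMutations`, `fields`) are chosen for illustration and should not be read as the documented LAPIS interface.

```python
from urllib.parse import urlencode

# Placeholder base URL for illustration only; not a real service.
BASE_URL = "https://lapis.example.org/v1"


def build_aggregated_query(country: str, mutations: list[str], fields: list[str]) -> str:
    """Build a URL asking for sequence counts filtered by country and mutations.

    The endpoint path and parameter names are hypothetical.
    """
    params = {
        "country": country,
        "aaMutations": ",".join(mutations),
        "fields": ",".join(fields),
    }
    # urlencode percent-encodes reserved characters such as ':' and ','.
    return f"{BASE_URL}/sample/aggregated?{urlencode(params)}"


url = build_aggregated_query("Switzerland", ["S:N501Y", "S:A570D"], ["date"])
```

Keeping the filter in the query string means a given view of the data corresponds to a single URL, which is convenient for sharing and for caching responses.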

3 Conclusion

The CoV-Spectrum website facilitates rapid detection and characterization of circulating SARS-CoV-2 variants around the globe. The website offers users a convenient way to assess the available SARS-CoV-2 sequencing data together with their metadata. It presents timely figures and tables produced from globally shared data. As mentioned, evaluating variants requires careful consideration of potential biases in the sequencing data. CoV-Spectrum aims to help with this task by providing the appropriate geographic and demographic context wherever possible. Users should, however, be aware of possible sampling biases in the raw data, which may carry through to results presented on CoV-Spectrum. Thus, any results should be interpreted and communicated accordingly. CoV-Spectrum is only possible due to the ongoing efforts of the international community to perform sequencing and make the data rapidly and openly available on GISAID and GenBank. However, some crucial tasks, such as assessing VOC properties (ii) (severe outcome) and (iii) (immune/vaccine breakthrough), require additional metadata, such as the severity of infections or vaccination status (Gomez et al., 2021). We call for global sharing of such metadata. The sharing of properly anonymized and aggregated data will facilitate the rapid identification of VOCs. This will be crucial for timely global public health responses.

References

1.  Uncertain effects of the pandemic on respiratory viruses.

Authors:  Gabriela B Gomez; Cedric Mahé; Sandra S Chaves
Journal:  Science       Date:  2021-06-04       Impact factor: 47.728

2.  Quantification of the spread of SARS-CoV-2 variant B.1.1.7 in Switzerland.

Authors:  Chaoran Chen; Sarah Ann Nadeau; Ivan Topolsky; Marc Manceau; Jana S Huisman; Kim Philipp Jablonski; Lara Fuhrmann; David Dreifuss; Katharina Jahn; Christiane Beckmann; Maurice Redondo; Christoph Noppen; Lorenz Risch; Martin Risch; Nadia Wohlwend; Sinem Kas; Thomas Bodmer; Tim Roloff; Madlen Stange; Adrian Egli; Isabella Eckerle; Laurent Kaiser; Rebecca Denes; Mirjam Feldkamp; Ina Nissen; Natascha Santacroce; Elodie Burcklen; Catharine Aquino; Andreia Cabral de Gouvea; Maria Domenica Moccia; Simon Grüter; Timothy Sykes; Lennart Opitz; Griffin White; Laura Neff; Doris Popovic; Andrea Patrignani; Jay Tracy; Ralph Schlapbach; Emmanouil T Dermitzakis; Keith Harshman; Ioannis Xenarios; Henri Pegeot; Lorenzo Cerutti; Deborah Penet; Anthony Blin; Melyssa Elies; Christian L Althaus; Christian Beisel; Niko Beerenwinkel; Martin Ackermann; Tanja Stadler
Journal:  Epidemics       Date:  2021-08-09       Impact factor: 5.324

3.  Spread of a SARS-CoV-2 variant through Europe in the summer of 2020.

Authors:  Tanja Stadler; Richard A Neher; Emma B Hodcroft; Moira Zuber; Sarah Nadeau; Timothy G Vaughan; Katharine H D Crawford; Christian L Althaus; Martina L Reichmuth; John E Bowen; Alexandra C Walls; Davide Corti; Jesse D Bloom; David Veesler; David Mateo; Alberto Hernando; Iñaki Comas; Fernando González Candelas
Journal:  Nature       Date:  2021-06-07       Impact factor: 49.962

4.  Ultrafast Sample placement on Existing tRees (UShER) enables real-time phylogenetics for the SARS-CoV-2 pandemic.

Authors:  Yatish Turakhia; Bryan Thornlow; Angie S Hinrichs; Nicola De Maio; Landen Gozashti; Robert Lanfear; David Haussler; Russell Corbett-Detig
Journal:  Nat Genet       Date:  2021-05-10       Impact factor: 41.307

5.  Nextstrain: real-time tracking of pathogen evolution.

Authors:  James Hadfield; Colin Megill; Sidney M Bell; John Huddleston; Barney Potter; Charlton Callender; Pavel Sagulenko; Trevor Bedford; Richard A Neher
Journal:  Bioinformatics       Date:  2018-12-01       Impact factor: 6.931

6.  Data, disease and diplomacy: GISAID's innovative contribution to global health.

Authors:  Stefan Elbe; Gemma Buckland-Merrett
Journal:  Glob Chall       Date:  2017-01-10

7.  Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England.

Authors:  Sam Abbott; Rosanna C Barnard; Christopher I Jarvis; Adam J Kucharski; James D Munday; Carl A B Pearson; Timothy W Russell; Damien C Tully; Alex D Washburne; Tom Wenseleers; Nicholas G Davies; Amy Gimma; William Waites; Kerry L M Wong; Kevin van Zandvoort; Justin D Silverman; Karla Diaz-Ordaz; Ruth Keogh; Rosalind M Eggo; Sebastian Funk; Mark Jit; Katherine E Atkins; W John Edmunds
Journal:  Science       Date:  2021-03-03       Impact factor: 63.714

8.  Increased mortality in community-tested cases of SARS-CoV-2 lineage B.1.1.7.

Authors:  Karla Diaz-Ordaz; Ruth H Keogh; Nicholas G Davies; Christopher I Jarvis; W John Edmunds; Nicholas P Jewell
Journal:  Nature       Date:  2021-03-15       Impact factor: 69.504

9.  Risk of mortality in patients infected with SARS-CoV-2 variant of concern 202012/1: matched cohort study.

Authors:  Robert Challen; Ellen Brooks-Pollock; Jonathan M Read; Louise Dyson; Krasimira Tsaneva-Atanasova; Leon Danon
Journal:  BMJ       Date:  2021-03-09