Literature DB >> 35294336

Targeted Sanger sequencing to recover key mutations in SARS-CoV-2 variant genome assemblies produced by next-generation sequencing.

Lavanya Singh¹, James E San¹, Houriiyah Tegally¹, Pius M Brzoska², Ugochukwu J Anyaneji¹, Eduan Wilkinson³, Lindsay Clark⁴, Jennifer Giandhari¹, Sureshnee Pillay¹, Richard J Lessells¹, Darren Patrick Martin⁵, Manohar Furtado², Anmol M Kiran^6,7, Tulio de Oliveira^1,3,8.

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is adaptively evolving to ensure its persistence within human hosts. It is therefore necessary to continuously monitor the emergence and prevalence of novel variants that arise. Importantly, some mutations have been associated with both molecular diagnostic failures and reduced or abrogated next-generation sequencing (NGS) read coverage in some genomic regions. Such impacts are particularly problematic when they occur in genomic regions such as those that encode the spike (S) protein, which are crucial for identifying and tracking the prevalence and dissemination dynamics of concerning viral variants. Targeted Sanger sequencing presents a fast and cost-effective means to accurately extend the coverage of whole-genome sequences. We designed a custom set of primers to amplify a 401 bp segment of the receptor-binding domain (RBD) (between positions 22698 and 23098 relative to the Wuhan-Hu-1 reference). We then designed a Sanger sequencing wet-laboratory protocol. We applied the primer set and wet-laboratory protocol to sequence 222 samples that were missing positions with key mutations K417N, E484K, and N501Y due to poor coverage after NGS sequencing. Finally, we developed SeqPatcher, a Python-based computational tool to analyse the trace files yielded by Sanger sequencing to generate consensus sequences, or take preanalysed consensus sequences in fasta format, and merge them with their corresponding whole-genome assemblies. We successfully sequenced 153 samples of 222 (69 %) using Sanger sequencing and confirmed the occurrence of key beta variant mutations (K417N, E484K, N501Y) in the S genes of 142 of 153 (93 %) samples. Additionally, one sample had the Y508F mutation and four samples the S477N. Samples with RT-PCR C t scores ranging from 13.85 to 37.47 (mean=25.70) could be Sanger sequenced efficiently. These results show that our method and pipeline can be used to improve the quality of whole-genome assemblies produced using NGS and can be used with any pairs of the most used NGS and Sanger sequencing platforms.

Entities: Chemical

Keywords: Illumina; S-gene; SARS-CoV-2 spike gene; Sanger; diagnostic failure; mutations; primer binding site; whole-genome sequencing

Mesh：

Year: 2022 PMID： 35294336 PMCID： PMC9176282 DOI： 10.1099/mgen.0.000774

Source DB: PubMed Journal: Microb Genom ISSN： 2057-5858

Impact Statement

Genomic epidemiology has recently come of age as the main tool to inform the introduction, evolution and spread of pathogens. Its dependence on high-quality, whole, or near-whole genomes is, however, affected by the rapid accumulation of mutations in viral pathogens that, among other problems in amplicon-based methods, result in failure to amplify targeted genomic regions when they occur in primer binding sites. In severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the emergence of variants of concern has been associated with primer failures, especially in the Alpha/B.1.1.7, Beta/B.1.351, Delta/B.1.672.2 and Omicron/B.1.1.529 variants. This work presents a quick, cost-effective and easy-to-use method and tools to extend coverage in important regions of near-whole genomes. Our method has been applied successfully to fill gaps in the SARS-CoV-2 receptor binding domain of the Beta variant and can be extended to any gene and to other pathogens, significantly improving the quality of genomic data available to inform public health decisions. Most importantly, we provide an end-to-end solution incorporating both the wet-laboratory and bioinformatics components that can be easily adapted by other laboratories to enhance the utility of already sequenced near-whole genomes.

Data Summary

The Sanger sequencing protocol can be accessed at: https://www.protocols.io/view/targeted-sequencing-by-sanger-to-recover-key-mutat-buhint4e. The SeqPatcher tool is also freely available at: https://github.com/krisp-kwazulu-natal/seqPatcher. All other data have been provided within the article or in the supplementary files including GISAID IDs for the sequences analysed in this study. Raw sequence data have been deposited in the National Center for Biotechnology Information (NCBI) Short Read Archive (SRA) under the project number PRJNA636748.

Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes of high quality and coverage are increasingly needed to support public health efforts to bring the coronavirus disease 2019 (COVID-19) pandemic to an end [1]. These data are needed to detect the emergence of variants of concern (VOCs) and variants of interest (VOIs) and track their prevalence and spread around the world. Of particular interest is the spike (S) gene, which is both a major determinant of viral infectivity [2, 3] and the primary target of virus-induced neutralizing antibodies (Nabs) [4] and, as such, is the site of multiple known mutations that enhance viral transmission and/or enable viral escape from the immune system [5]. As a result of this, the S gene exhibits markedly higher evolution rates than other genes in the SARS-CoV-2 genome [6, 7]. This makes the S gene an important target for viral diversity, transmission, pathogenicity and evolution studies. In addition to the potential immunological and epidemiological implications of mutations in the S gene, such mutations can also affect the binding, and therefore the utility, of the primers and probes that are used to diagnose and sequence SARS-CoV-2 [8]. Continuous assessments of the occurrence and spread of mutations in the S gene are therefore a crucial component of efforts to manage the pandemic. Short-read sequencing technologies such as Illumina can enable the scale and speed of whole-genome sequencing that is required for successful near real-time SARS-CoV-2 genomic surveillance [9]. However, the quality and depth of sequence data that are obtained with short-read sequencing platforms for any given genome can vary considerably between different regions of the genome [10, 11]. For example, the S gene codon 69–70 deletion in the Alpha/B.1.1.7 lineage has been noted to cause spike gene target failures (SGTFs) in diagnostic tests [12, 13]. The presence of mutations, especially minority variants, can also affect the annealing of primer sequences and reduce the efficiency of sequencing [14]. Several options to augment read coverage at target genome loci have been explored, including increasing the amounts of DNA that are used for next-generation sequencing (NGS), the use of DNA capture agents, and optimized DNA purification and amplification protocols, such as those employing DNA concentration steps or the sequencing of smaller amplified fragments [11]. Although effective in many instances, in others the quality and not the quantity of the available DNA is the limiting factor. Post-sequencing computational tools such as GapFiller [15] have also been developed in an attempt to extend the quantity of readable sequence that can be obtained from short reads in silico; however, the high confidence in basecalling needed for genomic surveillance limits the use of imputation-based methods. Altogether, these efforts highlight the extent of a persistent issue with the rapidly accumulating pool of genomic surveillance data; that is, the pervasiveness of significant gaps in near whole-genome sequences that are being deposited in public sequence repositories. Targeted gene-specific Sanger sequencing is a speedy, cost-effective and accurate means of sequencing genomic regions to fill in small gaps in partial whole genomes determined using NGS [11]. Specifically, short fragments (<1000 bp) from particular genome regions of interest, for example the region of the S gene encoding the receptor binding domain (RBD), can be amplified, sequenced and used to ‘patch’ near full-genome assemblies to produce higher quality genome sequences. This is especially useful when regions of low read coverage correspond to key sites of the genome known to harbour mutations of interest, such as the spike E484K and N501Y, which are mutations associated with escape of neutralizing and monoclonal antibodies, respectively [16, 17], and the K417N/T, which has been implicated in increased transmissibility due to elevated angiotensin converting enzyme - 2 (ACE2) receptor binding affinity [18]. Here we describe a targeted Sanger sequencing approach that can be used to sequence sub-regions of any gene to fill in the gaps remaining following whole-genome sequencing by NGS. We demonstrate the utility of the approach by using it to restore sequencing coverage in the RBD region of the S gene of the Beta (501Y.V2/B.1.351) lineage viruses that harbour the signature spike mutations K417N, E484K and N501Y. Specifically, we developed a set of custom primers, a Sanger sequencing protocol and a computational tool to recover mutations occurring in low/missing-coverage regions of whole-genome sequences. We show that we were able to detect the presence of the K417N, E484K and N501Y mutations accurately in 142 samples with C t scores between 13.85–37.47. Our results demonstrate that Sanger sequencing data can be easily integrated with NGS-derived whole-genome sequencing data or used on their own for rapid genotyping.

Methodology

Data

The samples analysed in this study include 222 SARS-CoV-2 nasopharyngeal swab samples from routine surveillance during the second wave of the pandemic in South Africa. Sample collection dates ranged between December 2020 and January 2021 at the peak of the epidemic dominated by the beta variant. All samples analysed were missing coverage in the RBD region of the genome after sequencing with NGS on the Illumina platform.

PCR primers

We designed sequencing primers to target the 401 base pair (bp) region between genome coordinates 22 698 and 23 098 of the SARS-CoV-2 Wuhan-Hu-1 reference (NC_045512.2) genome, which included mutation sites K417N (G22813T), E484K (G23012A) and N501Y (A23063T) of the beta variant of concern (VOC). The primers were designed using primer3 v4.1.3 [19-21] following Brzoska et. al [22].

cDNA synthesis and PCR

Briefly, RNA was extracted from samples using standard techniques; 10 µl of RNA (1 pg–100 ng) was used in a total reaction volume of 20 µl, using the one-step cDNA and PCR protocol (TaqPath 1-step RT-qPCR MasterMix, Thermo Fisher Scientific) and 501Y.V2 forward and reverse primers (fwd: GATCTCTGCTTTACTAATGTCTATGCAGAT, rev: GCTGGTGCATGTAGAAGTTCAAAAG). PCR reactions underwent thermocycling at the following conditions: Uracil-N-glycosylase (UNG) incubation at 25 °C for 2 min, reverse transcriptase (RT) incubation at 50 °C for 15 min and enzyme activation at 95 °C for 2 min, followed by 40 cycles of amplification at 95 °C for 3 s and 60 °C for 30 s.

Cycle sequencing and capillary electrophoresis

Ten microlitres of PCR amplicon was purified (ExoSap IT-express reagent, Thermo Fisher Scientific) and 1 µl of product used for cycle sequencing (BigDye Terminator v3.1 cycle sequencing kit, Thermo Fisher Scientific) using the same 501Y.V2 forward and reverse primers that were used in the amplification step. The products were then purified (BigDye Xterminator kit, Thermo Fisher Scientific) and capillary electrophoresed using a DNA Analyser (Applied Biosystems). The detailed step-by-step wet-laboratory protocol is available on protocols.io at: https://www.protocols.io/view/targeted-sequencing-by-sanger-to-recover-key-mutat-buhint4e.

Bioinformatics analysis and algorithm

Following capillary electrophoresis, the resulting sequence traces were visually inspected in Geneious Prime. The traces were then analysed and merged into the whole genome sequences using our custom tool, SeqPatcher. The tool accepts as input either trace (.ab1) files produced by the Sanger sequencing device or the consensus sequences from preanalysed trace files, a reference sequence of the gene of interest and one or more near-complete genome(s) with missing data, usually represented as Ns in the region of interest. The tool assumes that the gene of interest has only been partially sequenced in the near-complete genome(s), such that the location of the gene of interest can be determined by alignment. If trace files are provided, SeqPatcher checks that the length is greater than 100 bp and computes the median peak height of the bases within the interquartile range of the sequence. A warning is issued for sequences of length lower than 100 bp. Low-quality bases at the ends of the sequence with peaks <10 % of the median height are trimmed. The tool then analyses the remaining peaks to determine the prevalent base at each position and produces a fasta sequence. The sequence is pairwise aligned with the reference gene using muscle [23] to determine the longest contiguous aligned region (Fig. 1a, b). Where paired reads are provided (forward and reverse), the tool calculates a consensus. Mismatches between the overlapping region of the forward and reverse reads as well as ambiguous bases are resolved as shown in Table 1. InDels of length >15 nt or that are not a multiples of three in the Sanger sequences are included or excluded (i.e. insertions removed and deletions replaced with Ns) based on user provided input. The coordinates of the region to be replaced are determined relative to the reference gene. The region matching the reference gene within the whole-genome assembly is then ascertained using blat [24]. The precise location to be modified is calculated as shown in Fig. 1d, e, and the Sanger sequence is inserted into the whole-genome sequence. The final output is a file containing all the genomes. Only genomes that have a corresponding Sanger sequence are edited, relieving the user of the need to manually organize and process files in situations involving large numbers of sequences. The consensus Sanger sequences in fasta format are also output for the user. Additional detail of the algorithm has been provided in the documentation (https://github.com/krisp-kwazulu-natal/seqPatcher).

Fig. 1.

Table 1.

Derivation of the Sanger consensus sequence from the trace files. A fasta sequence is determined for each read and a consensus calculated by alignment of the reads to the reference gene sequence. Base refers to nucleotides A, C, T and G and A/B refer to different bases at that position. A dash (–) represents an indel

Sanger sequence with forward and reverse reads
Reference	Forward read	Reverse read	Final outcome
–	Any base	Any base	Any base
Base A	Base A	Base A	Based on user input, i.e. based on peak heights or ambiguous base (default: ambiguous base)
Base A	Base A	Base B	Based on user input, i.e. based on peak heights or ambiguous base (default: ambiguous base)
Base A	Base B	Base A	Base A
Base A	Base B	Base B	Base B
Base A	Base B	–	Base B
Base A	Base A	–	Base A
Base A	Ambiguous	Base B	Base B
Base A	Base B	Ambiguous	Base B
Sanger sequence with only one read (either forward or reverse)
–	Any base		Any base
Base A	–	na	–
Base A	Base B	na	Base B
Base A	Ambiguous	na	Based on user input, base a/ peak max base/ neighbour base/ base ambi. (Default: base ambi.)

Visual representation of the SeqPatcher workflow. (a) If the input is an .ab1 file then SeqPatcher analyses and infers consensus sequences in fasta format. (b) fasta sequences are aligned to the reference gene and (c) the longest overlap is determined. The location of the overlap relative to the gene is also determined. (d) The reference gene is then aligned to the respective whole genome to determine the location and the precise point of insertion is calculated. (e) Sanger sequence is inserted into the whole genome. SeqPatcher tries to report the correct/highly probable base in cases of conflict and keeps the locations of indels (insertions and deletions) intact during integration. r, ref gene length; rl, ref gene start position; rr, ref gene end position; gl, gene start position in assembly; gr, gene end position in assembly; black solid line, reference gene; dots, missing or gapped region in the sequence. Derivation of the Sanger consensus sequence from the trace files. A fasta sequence is determined for each read and a consensus calculated by alignment of the reads to the reference gene sequence. Base refers to nucleotides A, C, T and G and A/B refer to different bases at that position. A dash (–) represents an indel Sanger sequence with forward and reverse reads Reference Forward read Reverse read Final outcome – Any base Any base Any base Base A Base A Base A Based on user input, i.e. based on peak heights or ambiguous base (default: ambiguous base) Base A Base A Base B Based on user input, i.e. based on peak heights or ambiguous base (default: ambiguous base) Base A Base B Base A Base A Base A Base B Base B Base B Base A Base B – Base B Base A Base A – Base A Base A Ambiguous Base B Base B Base A Base B Ambiguous Base B Sanger sequence with only one read (either forward or reverse) – Any base Any base Base A – na – Base A Base B na Base B Base A Ambiguous na Based on user input, base a/ peak max base/ neighbour base/ base ambi. (Default: base ambi.) Only the near-complete genome sequences with a matching Sanger sequence are merged by SeqPatcher, such that the user does not need to first separate the sequences that need to be merged from an existing dataset of complete and near-complete sequences.

Validation

We analysed 153 sets of trace files from samples that were successfully Sanger sequenced. The trace files were analysed in both Geneious Prime and SeqPatcher to generate consensus sequences. The resulting consensus sequences were analysed in Nextclade (https://clades.nextstrain.org/) and the mutational sets were compared. Nextclade produces a report detailing sequence quality and coverage and identifies mutational changes observed at the nucleotide and protein levels relative to the Wuhan reference. To assess the merging step, we generated a Nextclade report of the whole-genome sequence before merging to determine the detectable mutations present at baseline. Next, we generated a Nextclade report of the Sanger sequences to identify the mutations recovered within these. Finally, we generated a Nextclade report of the mutations present in the consensus genomes after merging. The Nextclade report of the Sanger sequences enabled us to identify instances where merging the Sanger and NGS data yielded a composite genome sequence assembly within which the expected mutation profiles were and were not successfully detected, while the report after merging was used to confirm a successful merge.

Results

Sanger sequencing

We performed Sanger sequencing on 222 samples that, following sequencing with the Illumina platform, had yielded near-complete genome sequences that lacked part of the S gene (Table 1). Of these, 153 samples with a mean C t 25.70 (range: 13.85–37.47) were successfully Sanger sequenced, yielding at least 1 complete sequence (forward or reverse) that spans the region of interest (between 22 698 and 23 098, i.e. 401 bp relative to the Wuhan-Hu-1 reference). Five samples, mean C t 29.26 (19.16–36.07), yielded poor quality Sanger sequence with insufficient coverage of the region of interest and were therefore discarded, while 69 samples with a mean C t of 31.81 (20.78–38.92) did not yield a Sanger sequence and were therefore not analysed further. Fig. 2a shows our optimized Sanger sequencing protocol and Fig. 2b shows a summary of the sequencing results. Additional mutations (Y508F and S477N) associated with the S gene were also detected in five of the samples (Table S1).

Fig. 2.

Sanger sequencing and analysis. (a) The Sanger sequencing workflow. cDNA, complementary DNA; PCR, polymerase chain reaction; BDT, BigDye Terminator; BDX, BigDye Xterminator; CE, capillary electrophoresis. (b) Violin plot showing the distribution of RT-PCR cycle threshold scores for 222 samples; the C t score for each sample was based on the mean C t of the three SARS-CoV-2 targets (S gene, N gene, Orf1ab). Quartiles are represented by different shades within each plot, mean and median C ts are represented by yellow and pink points, respectively. The pink line shows the trend line.

Sequence merging

Using SeqPatcher, we inferred a consensus sequence for all samples that were successfully Sanger sequenced. The consensus sequences were successfully merged into the whole-genome sequences for all 153 of the cases. The success of the merge process was confirmed using Nextclade reports. All sequences with genome coverage >80% were deposited on GISAID (Table S2). Table S3 shows a Nextclade report for all the samples submitted to GISAID. A graphical summary of the workflow to incorporate Sanger sequences into the near-complete NGS-generated genome sequences is presented in Fig. 3.

Fig. 3.

Summary of the workflow for improving the coverage of WGS data using Sanger sequencing. Data from the sequencer are preprocessed to determine coverage. Sequences with gaps in regions of interest are sent for Sanger sequencing. The results from Sanger sequencing are inserted into the NGS whole genomes using SeqPatcher and the improved consensus of these is published to support genomic epidemiology.

Future directions

Future development directions include the implementation of a graphical interface for this analysis and for viewing the merged sequences for users that are unfamiliar with using command-line tools.

Discussion

Sanger sequencing has been suggested to be the most useful technique to sequence short fragments needed to patch near-complete genome assemblies following the failure of diagnostic assays [25], a common phenomenon when these assays target rapidly evolving regions of the genome. In the current study, we incorporated Sanger sequencing using a custom primer set that targets the RBD of the S gene of SARS-CoV-2, a region that commonly has low read coverage during NGS. The successful application of our method demonstrates the utility of Sanger sequencing in improving the coverage and quality of whole genomes derived from short-read sequencing. We performed RT-PCR on the samples to investigate the upper bounds of cycle threshold (C t) values at which Sanger sequencing would be viable. While our results show that C t values below 25 are suitable for the reliable application of Sanger sequencing, we managed to successfully Sanger sequence samples with C t values as high as 37.5. The Sanger sequencing RT-PCR-based protocol described here enabled sequence data to be generated both rapidly and reliably. Specifically, there is a one-step cDNA synthesis and PCR workflow that can be completed in approximately 6 h, which compares very favourably with the >30 h of sample preparation required for short-read sequencing technologies such as Illumina. Sixty-nine of the 222 samples failed Sanger sequencing; this was either due to failure to amplify by PCR and/or the cycle sequencing reaction(s) or because the resultant consensus sequences contained gaps (due to poor primer coverage). While most of these samples typically had high C t scores, others were well within the C t range of those that could be successfully sequenced (mean C t=25.70). The failure of these samples could possibly be attributed to poor sample quality or RNA degradation (due to multiple freeze–thaw events, for example). Key mutations were not found in 11 out of the 153 samples that had been successfully Sanger sequenced. A major challenge with analysing raw Sanger sequence data is the high cost associated with tools such as Geneious and Thermo Fisher’s proprietary sequencing analysis software, SeqA. Additionally, software such as SeqA is computing platform-dependent, which seriously limits its broad utility. SeqPatcher is free and utilizes dependences that are freely available or with functionality that is available entirely as open source. The codebase is implemented in Python, a platform-independent language, implying that it is accessible regardless of the user’s computing platform. Users are free to use as well as modify SeqPatcher to suit their personal needs. Our pipeline also adheres to key principles of scientific software development, such as ‘run as a single command’ and ‘extensive documentation’ to ensure there is no technical barrier to usage [26]. Our method thus encourages a do-it-yourself analytical framework. While SeqPatcher was developed to facilitate the completion of SARS-CoV-2 whole-genome sequences, the software enables the merging of any two related sequences. SeqPatcher also extends the utility of existing sequence analysis tools such as Biopython Bio.SeqIO [27], which provide functionality to analyse .abi formatted trace files through its specialized functions to integrate the Sanger sequence fragments into their corresponding whole genomes. Various tools such as GapFiller [15] have been developed in the past to extend the coverage of short-read whole-genome sequence assemblies in silico. These tools were mostly developed on the assumption of poor assembly, often as a result of complex genomic structural features, such as tandem repeats. However, assembly methods have improved significantly. Advanced assemblers such as metaSpades [28] and megahit [29] are capable of powerful methodologies, including de novo, reference guided and graph methodologies. Assembly applications such as GenomeDetective [30] combine the power of these assemblers with alignment and mapping tools such as aga [31], muscle [23] and Minimap2 [32] to ensure that the most optimal and accurate assembly of sequenced reads is obtained. As such, in many instances, gapped regions in a sequence represent true failure of diagnostics to amplify a targeted region due to primer mismatch, for example, and not the misplacement of reads. On the other hand, the sensitive nature of genomic surveillance requires very high confidence in called mutations, leaving little room for imputation. Employing a hybrid approach combining targeted sequencing with bioinformatic tools thus provides an accurate, feasible and cost/time-effective solution that can be integrated into any sequence generating laboratory to quickly increase the rates at which high-quality whole-genome sequences can be produced.

Conclusion

Targeted Sanger sequencing of short sequence fragments has the potential to increase the number and quality of whole genomes produced in the pursuit of global pathogen genomic surveillance. In the case of rapidly evolving pandemics caused by pathogens such as SARS-CoV-2 and HIV, routine updates of workflows and primer sets will be essential to ensure that degrees of sequencing coverage and quality do not begin to decline as pandemics progress. Click here for additional data file.

27 in total

1. BLAT--the BLAST-like alignment tool.

Authors: W James Kent
Journal: Genome Res Date: 2002-04 Impact factor: 9.043

2. Primer3--new capabilities and interfaces.

Authors: Andreas Untergasser; Ioana Cutcutache; Triinu Koressaar; Jian Ye; Brant C Faircloth; Maido Remm; Steven G Rozen
Journal: Nucleic Acids Res Date: 2012-06-22 Impact factor: 16.971

3. Genome Detective Coronavirus Typing Tool for rapid identification and characterization of novel coronavirus genomes.

Authors: Sara Cleemput; Wim Dumon; Vagner Fonseca; Wasim Abdool Karim; Marta Giovanetti; Luiz Carlos Alcantara; Koen Deforche; Tulio de Oliveira
Journal: Bioinformatics Date: 2020-06-01 Impact factor: 6.937

4. A Comparison of Whole Genome Sequencing of SARS-CoV-2 Using Amplicon-Based Sequencing, Random Hexamers, and Bait Capture.

Authors: Jalees A Nasir; Robert A Kozak; Patryk Aftanas; Amogelang R Raphenya; Kendrick M Smith; Finlay Maguire; Hassaan Maan; Muhannad Alruwaili; Arinjay Banerjee; Hamza Mbareche; Brian P Alcock; Natalie C Knox; Karen Mossman; Bo Wang; Julian A Hiscox; Andrew G McArthur; Samira Mubareka
Journal: Viruses Date: 2020-08-15 Impact factor: 5.048

5. Alternate primers for whole-genome SARS-CoV-2 sequencing.

Authors: Matthew Cotten; Dan Lule Bugembe; Pontiano Kaleebu; My V T Phan
Journal: Virus Evol Date: 2021-02-04

6. MUSCLE: a multiple sequence alignment method with reduced time and space complexity.

Authors: Robert C Edgar
Journal: BMC Bioinformatics Date: 2004-08-19 Impact factor: 3.169

7. The Impact of Mutations in SARS-CoV-2 Spike on Viral Infectivity and Antigenicity.

Authors: Qianqian Li; Jiajing Wu; Jianhui Nie; Li Zhang; Huan Hao; Shuo Liu; Chenyan Zhao; Qi Zhang; Huan Liu; Lingling Nie; Haiyang Qin; Meng Wang; Qiong Lu; Xiaoyu Li; Qiyu Sun; Junkai Liu; Linqi Zhang; Xuguang Li; Weijin Huang; Youchun Wang
Journal: Cell Date: 2020-07-17 Impact factor: 41.582

8. Receptor-binding domain of SARS-CoV spike protein induces highly potent neutralizing antibodies: implication for developing subunit vaccine.

Authors: Yuxian He; Yusen Zhou; Shuwen Liu; Zhihua Kou; Wenhui Li; Michael Farzan; Shibo Jiang
Journal: Biochem Biophys Res Commun Date: 2004-11-12 Impact factor: 3.575

9. Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus.

Authors: Bette Korber; Will M Fischer; Sandrasegaram Gnanakaran; Hyejin Yoon; James Theiler; Werner Abfalterer; Nick Hengartner; Elena E Giorgi; Tanmoy Bhattacharya; Brian Foley; Kathryn M Hastie; Matthew D Parker; David G Partridge; Cariad M Evans; Timothy M Freeman; Thushan I de Silva; Charlene McDanal; Lautaro G Perez; Haili Tang; Alex Moon-Walker; Sean P Whelan; Celia C LaBranche; Erica O Saphire; David C Montefiori
Journal: Cell Date: 2020-07-03 Impact factor: 66.850

1 in total

1. The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance.

Authors: Houriiyah Tegally; Eduan Wilkinson; James E San; Matthew Cotten; Monika Moir; Bryan Tegomoh; Gerald Mboowa; Darren P Martin; Cheryl Baxter; Arnold W Lambisia; Amadou Diallo; Daniel G Amoako; Moussa M Diagne; Abay Sisay; Abdel-Rahman N Zekri; Abdou Salam Gueye; Abdoul K Sangare; Abdoul-Salam Ouedraogo; Abdourahmane Sow; Abdualmoniem O Musa; Abdul K Sesay; Abe G Abias; Adam I Elzagheid; Adamou Lagare; Adedotun-Sulaiman Kemi; Aden Elmi Abar; Adeniji A Johnson; Adeola Fowotade; Adeyemi O Oluwapelumi; Adrienne A Amuri; Agnes Juru; Ahmed Kandeil; Ahmed Mostafa; Ahmed Rebai; Ahmed Sayed; Akano Kazeem; Aladje Balde; Alan Christoffels; Alexander J Trotter; Allan Campbell; Alpha K Keita; Amadou Kone; Amal Bouzid; Amal Souissi; Ambrose Agweyu; Amel Naguib; Ana V Gutierrez; Anatole Nkeshimana; Andrew J Page; Anges Yadouleton; Anika Vinze; Anise N Happi; Anissa Chouikha; Arash Iranzadeh; Arisha Maharaj; Armel L Batchi-Bouyou; Arshad Ismail; Augustina A Sylverken; Augustine Goba; Ayoade Femi; Ayotunde E Sijuwola; Baba Marycelin; Babatunde L Salako; Bamidele S Oderinde; Bankole Bolajoko; Bassirou Diarra; Belinda L Herring; Benjamin Tsofa; Bernard Lekana-Douki; Bernard Mvula; Berthe-Marie Njanpop-Lafourcade; Blessing T Marondera; Bouh Abdi Khaireh; Bourema Kouriba; Bright Adu; Brigitte Pool; Bronwyn McInnis; Cara Brook; Carolyn Williamson; Cassien Nduwimana; Catherine Anscombe; Catherine B Pratt; Cathrine Scheepers; Chantal G Akoua-Koffi; Charles N Agoti; Chastel M Mapanguy; Cheikh Loucoubar; Chika K Onwuamah; Chikwe Ihekweazu; Christian N Malaka; Christophe Peyrefitte; Chukwa Grace; Chukwuma E Omoruyi; Clotaire D Rafaï; Collins M Morang'a; Cyril Erameh; Daniel B Lule; Daniel J Bridges; Daniel Mukadi-Bamuleka; Danny Park; David A Rasmussen; David Baker; David J Nokes; Deogratius Ssemwanga; Derek Tshiabuila; Dominic S Y Amuzu; Dominique Goedhals; Donald S Grant; Donwilliams O Omuoyo; Dorcas Maruapula; Dorcas W Wanjohi; Ebenezer Foster-Nyarko; Eddy K Lusamaki; Edgar Simulundu; Edidah M Ong'era; Edith N Ngabana; Edward O Abworo; Edward Otieno; Edwin Shumba; Edwine Barasa; El Bara Ahmed; Elhadi A Ahmed; Emmanuel Lokilo; Enatha Mukantwari; Eromon Philomena; Essia Belarbi; Etienne Simon-Loriere; Etilé A Anoh; Eusebio Manuel; Fabian Leendertz; Fahn M Taweh; Fares Wasfi; Fatma Abdelmoula; Faustinos T Takawira; Fawzi Derrar; Fehintola V Ajogbasile; Florette Treurnicht; Folarin Onikepe; Francine Ntoumi; Francisca M Muyembe; Frank E Z Ragomzingba; Fred A Dratibi; Fred-Akintunwa Iyanu; Gabriel K Mbunsu; Gaetan Thilliez; Gemma L Kay; George O Akpede; Gert U van Zyl; Gordon A Awandare; Grace S Kpeli; Grit Schubert; Gugu P Maphalala; Hafaliana C Ranaivoson; Hannah E Omunakwe; Harris Onywera; Haruka Abe; Hela Karray; Hellen Nansumba; Henda Triki; Herve Albéric Adje Kadjo; Hesham Elgahzaly; Hlanai Gumbo; Hota Mathieu; Hugo Kavunga-Membo; Ibtihel Smeti; Idowu B Olawoye; Ifedayo M O Adetifa; Ikponmwosa Odia; Ilhem Boutiba Ben Boubaker; Iluoreh Ahmed Muhammad; Isaac Ssewanyana; Isatta Wurie; Iyaloo S Konstantinus; Jacqueline Wemboo Afiwa Halatoko; James Ayei; Janaki Sonoo; Jean-Claude C Makangara; Jean-Jacques M Tamfum; Jean-Michel Heraud; Jeffrey G Shaffer; Jennifer Giandhari; Jennifer Musyoki; Jerome Nkurunziza; Jessica N Uwanibe; Jinal N Bhiman; Jiro Yasuda; Joana Morais; Jocelyn Kiconco; John D Sandi; John Huddleston; John K Odoom; John M Morobe; John O Gyapong; John T Kayiwa; Johnson C Okolie; Joicymara S Xavier; Jones Gyamfi; Joseph F Wamala; Joseph H K Bonney; Joseph Nyandwi; Josie Everatt; Joweria Nakaseegu; Joyce M Ngoi; Joyce Namulondo; Judith U Oguzie; Julia C Andeko; Julius J Lutwama; Juma J H Mogga; Justin O'Grady; Katherine J Siddle; Kathleen Victoir; Kayode T Adeyemi; Kefentse A Tumedi; Kevin S Carvalho; Khadija Said Mohammed; Koussay Dellagi; Kunda G Musonda; Kwabena O Duedu; Lamia Fki-Berrajah; Lavanya Singh; Lenora M Kepler; Leon Biscornet; Leonardo de Oliveira Martins; Lucious Chabuka; Luicer Olubayo; Lul Deng Ojok; Lul Lojok Deng; Lynette I Ochola-Oyier; Lynn Tyers; Madisa Mine; Magalutcheemee Ramuth; Maha Mastouri; Mahmoud ElHefnawi; Maimouna Mbanne; Maitshwarelo I Matsheka; Malebogo Kebabonye; Mamadou Diop; Mambu Momoh; Maria da Luz Lima Mendonça; Marietjie Venter; Marietou F Paye; Martin Faye; Martin M Nyaga; Mathabo Mareka; Matoke-Muhia Damaris; Maureen W Mburu; Maximillian G Mpina; Michael Owusu; Michael R Wiley; Mirabeau Y Tatfeng; Mitoha Ondo'o Ayekaba; Mohamed Abouelhoda; Mohamed Amine Beloufa; Mohamed G Seadawy; Mohamed K Khalifa; Mooko Marethabile Matobo; Mouhamed Kane; Mounerou Salou; Mphaphi B Mbulawa; Mulenga Mwenda; Mushal Allam; My V T Phan; Nabil Abid; Nadine Rujeni; Nadir Abuzaid; Nalia Ismael; Nancy Elguindy; Ndeye Marieme Top; Ndongo Dia; Nédio Mabunda; Nei-Yuan Hsiao; Nelson Boricó Silochi; Ngiambudulu M Francisco; Ngonda Saasa; Nicholas Bbosa; Nickson Murunga; Nicksy Gumede; Nicole Wolter; Nikita Sitharam; Nnaemeka Ndodo; Nnennaya A Ajayi; Noël Tordo; Nokuzola Mbhele; Norosoa H Razanajatovo; Nosamiefan Iguosadolo; Nwando Mba; Ojide C Kingsley; Okogbenin Sylvanus; Oladiji Femi; Olubusuyi M Adewumi; Olumade Testimony; Olusola A Ogunsanya; Oluwatosin Fakayode; Onwe E Ogah; Ope-Ewe Oludayo; Ousmane Faye; Pamela Smith-Lawrence; Pascale Ondoa; Patrice Combe; Patricia Nabisubi; Patrick Semanda; Paul E Oluniyi; Paulo Arnaldo; Peter Kojo Quashie; Peter O Okokhere; Philip Bejon; Philippe Dussart; Phillip A Bester; Placide K Mbala; Pontiano Kaleebu; Priscilla Abechi; Rabeh El-Shesheny; Rageema Joseph; Ramy Karam Aziz; René G Essomba; Reuben Ayivor-Djanie; Richard Njouom; Richard O Phillips; Richmond Gorman; Robert A Kingsley; Rosa Maria D E S A Neto Rodrigues; Rosemary A Audu; Rosina A A Carr; Saba Gargouri; Saber Masmoudi; Sacha Bootsma; Safietou Sankhe; Sahra Isse Mohamed; Saibu Femi; Salma Mhalla; Salome Hosch; Samar Kamal Kassim; Samar Metha; Sameh Trabelsi; Sara Hassan Agwa; Sarah Wambui Mwangi; Seydou Doumbia; Sheila Makiala-Mandanda; Sherihane Aryeetey; Shymaa S Ahmed; Side Mohamed Ahmed; Siham Elhamoumi; Sikhulile Moyo; Silvia Lutucuta; Simani Gaseitsiwe; Simbirie Jalloh; Soa Fy Andriamandimby; Sobajo Oguntope; Solène Grayo; Sonia Lekana-Douki; Sophie Prosolek; Soumeya Ouangraoua; Stephanie van Wyk; Stephen F Schaffner; Stephen Kanyerezi; Steve Ahuka-Mundeke; Steven Rudder; Sureshnee Pillay; Susan Nabadda; Sylvie Behillil; Sylvie L Budiaki; Sylvie van der Werf; Tapfumanei Mashe; Thabo Mohale; Thanh Le-Viet; Thirumalaisamy P Velavan; Tobias Schindler; Tongai G Maponga; Trevor Bedford; Ugochukwu J Anyaneji; Ugwu Chinedu; Upasana Ramphal; Uwem E George; Vincent Enouf; Vishvanath Nene; Vivianne Gorova; Wael H Roshdy; Wasim Abdul Karim; William K Ampofo; Wolfgang Preiser; Wonderful T Choga; Yahaya Ali Ahmed; Yajna Ramphal; Yaw Bediako; Yeshnee Naidoo; Yvan Butera; Zaydah R de Laurent; Ahmed E O Ouma; Anne von Gottberg; George Githinji; Matshidiso Moeti; Oyewale Tomori; Pardis C Sabeti; Amadou A Sall; Samuel O Oyola; Yenew K Tebeje; Sofonias K Tessema; Tulio de Oliveira; Christian Happi; Richard Lessells; John Nkengasong
Journal: Science Date: 2022-10-07 Impact factor: 63.714

1 in total