Heng Li1,2, Xiaoping Hong1, Liping Ding1, Shuhui Meng1, Rui Liao1, Zhenyou Jiang3, Dongzhou Liu1.
Abstract
Detecting severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) requires human samples, which inevitably contain trace amounts of human DNA and RNA. Sequence similarity between viral and human sequences may cause invalid detection results; however, a gene similarity analysis of SARS-CoV-2 and humans is still lacking. All publicly reported complete SARS-CoV-2 genome assemblies in the Entrez genome database were collected for multiple sequence alignment, similarity analysis, and phylogenetic analysis. The complete genomes showed high similarity (>99.88% sequence identity). Phylogenetic analysis divided these viruses into three major clades with significant geographic group effects. Viruses from the United States showed considerable variability. Sequence similarity analysis revealed that SARS-CoV-2 shares 612 similar sequences with the human genome and 100 similar sequences with the human transcriptome. The sequence characteristics and genomic distribution of these similar sequences were confirmed. The sequence similarity and evolutionary mutations provide indispensable references for dynamic updates of SARS-CoV-2 detection primers and methods.
Keywords: COVID-19; SARS-CoV-2 detection; coronavirus; coronavirus-COVID-19; mutation
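As an illustration of the ">99.88% sequence identity" figure quoted in the abstract, the following is a minimal sketch of how percent identity between two already-aligned sequences can be computed. It is not the authors' pipeline: the function name `percent_identity`, the toy sequences, and the choice to exclude gap columns from the denominator are all assumptions made for illustration, and other tools define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Hypothetical helper: both inputs must be rows taken from the same
    alignment, so they have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns with a base in both sequences
    matches = 0   # compared columns where the bases agree
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are excluded
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (made up, not taken from any real genome).
reference = "ATGGCT-ACG"
sample = "ATGACTTACG"
print(f"{percent_identity(reference, sample):.2f}% identity")
```

In the toy example, nine columns are compared (one gap column is skipped) and eight of them match. For full genomes of roughly 30 kb, an identity above 99.88% would correspond to only a few dozen mismatched positions at most under this definition.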
Year: 2022 PMID: 35937998 PMCID: PMC9355506 DOI: 10.3389/fgene.2022.946359
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. The proportion (A) and geographic locations (B) of 92 full-length sequenced SARS-CoV-2 genome assemblies.
FIGURE 2. Sequence alignment of 92 full-length SARS-CoV-2 genomes. The first reported genome (MN908947.3), from Wuhan, was used as the reference. The bar on the right represents the total number of mutated bases for each genome. The bottom row aggregates the mutation sites of all 92 SARS-CoV-2 genomes.
FIGURE 3. The landscape of gene mutations in the analyzed SARS-CoV-2 genomes. (A) The number of mismatched base pairs in each gene. (B) Mismatch rate in each gene.
FIGURE 4. Phylogenetic analysis of full-length SARS-CoV-2 genomes.
FIGURE 5. (A) SARS-CoV-2 sampling and processing flow. (B) Distribution of similar sequences between the SARS-CoV-2 genome and human genome/transcriptome.
FIGURE 6. Sequence characteristics of similar loci between the SARS-CoV-2 and human genomes. (A) Distribution of similar sequences in the human chromosome. (B) Length of similar and consensus sequences. (C) The proportion of identity and gap. (D) Distribution of similar sequences in sense and antisense strands.
FIGURE 7. Sequence characteristics of similar loci between the SARS-CoV-2 genome and human transcriptome. (A) Length of similar and consensus sequences. (B) The proportion of identity and gap.