Yamin Sun1,2, Min Wang2,3, Wenchao Lin2, Wei Dong2, Jianguo Xu1,4,5.
Abstract
Over the past two years, scientists throughout the world have completed more than 6 million SARS-CoV-2 genome sequences. Today, the number of SARS-CoV-2 genomes exceeds the total number of all other viral genomes. These genomes are a record of the evolution of SARS-CoV-2 in the human host, and provide information on the emergence of mutations. In this study, analysis of these sequenced genomes identified 296,728 de novo mutations (DNMs), and found that six types of base substitutions reached saturation in the sequenced genome population. Based on this analysis, a "mutation blacklist" of SARS-CoV-2 was compiled. The loci on the "mutation blacklist" are highly conserved, and these mutations likely have detrimental effects on virus survival, replication, and transmission. This information is valuable for SARS-CoV-2 research on gene function, vaccine design, and drug development. Through association analysis of DNMs and viral transmission rates, we identified 185 DNMs that positively correlated with the SARS-CoV-2 transmission rate, and these DNMs were classified as the "mutation whitelist" of SARS-CoV-2. The mutations on the "mutation whitelist" are beneficial for SARS-CoV-2 transmission and could therefore be used to evaluate the transmissibility of new variants. The occurrence of mutations and the evolution of viruses are dynamic processes. To more effectively monitor the mutations and variants of SARS-CoV-2, we built a SARS-CoV-2 mutation and variant monitoring and pre-warning system (MVMPS), which can monitor the occurrence and development of mutations and variants of SARS-CoV-2, as well as provide pre-warning for the prevention and control of SARS-CoV-2 (https://www.omicx.cn/). Additionally, this system could be used in real time to update the "mutation whitelist" and "mutation blacklist" of SARS-CoV-2.
Keywords: De novo mutations; Mutation saturation; SARS-CoV-2; Transmission
Year: 2022 PMID: 35845149 PMCID: PMC9273572 DOI: 10.1016/j.jobb.2022.06.006
Source DB: PubMed Journal: J Biosaf Biosecur ISSN: 2588-9338
Fig. 1 Mutation saturation curves of 12 types of base substitution. Identified DNMs in the SARS-CoV-2 sequenced genomes were grouped by base substitution type, and the cumulative saturation percentage of C->T and G->A (A), C->A and G->T (B), T->C and A->G (C), A->T and T->A (D), C->G and G->C (E), and T->G and A->C (F) was plotted over time. The curve plateau reflects 100% saturation, which is marked in the figure by a vertical line.
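The saturation percentage plotted in Fig. 1 can be illustrated with a minimal sketch. It assumes that saturation for a substitution type X->Y is the share of reference positions carrying base X at which X->Y has been observed at least once; the record does not state the exact definition used in the study. The function name `saturation_curves` and the `(week, position, alt_base)` input layout are hypothetical.

```python
from collections import Counter, defaultdict

BASES = "ACGT"

def saturation_curves(reference, observations):
    """Cumulative saturation (%) per substitution type over time.

    reference:    reference genome as a string of A/C/G/T
    observations: iterable of (week, position, alt_base), 0-based positions
    Assumed definition: saturation of X->Y = distinct sites where X->Y has
    been seen so far, divided by the number of X bases in the reference.
    """
    base_counts = Counter(reference)        # number of sites per reference base
    seen = set()                            # (position, alt) pairs already counted
    new_per_week = defaultdict(Counter)     # week -> newly seen substitutions

    for week, pos, alt in sorted(observations):
        ref = reference[pos]
        if alt == ref or alt not in BASES or (pos, alt) in seen:
            continue                        # not a new de novo substitution
        seen.add((pos, alt))
        new_per_week[week][(ref, alt)] += 1

    curves = defaultdict(list)
    running = Counter()
    for week in sorted(new_per_week):
        running.update(new_per_week[week])
        for sub, n in running.items():
            curves[sub].append((week, 100.0 * n / base_counts[sub[0]]))
    return dict(curves)

# Toy example on a 4-base "genome"; the values are illustrative, not study data.
toy = saturation_curves("ACCT", [(1, 1, "T"), (2, 2, "T"), (2, 1, "T"), (3, 0, "G")])
```

Under this definition a substitution type reaches the plateau once every eligible site has mutated at least once, so repeated observations of the same mutation do not move the curve.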
Fig. 2 Sequence conservation analysis. A is the sequence conservation analysis of the 5ʹ-UTR. B is the sequence conservation analysis of the intergenic region between the orf1ab and spike genes. In both panels, the x-axis is the position on the genome, and the y-axis is the number of mutations that have never been detected.
Fig. 3 Association between DNMs and transmission rate. The x-axis represents the SARS-CoV-2 genome position, and the y-axis represents the weekly growth slope of each DNM in the SARS-CoV-2 population. Mutations with a growth slope greater than 0.001 are marked in red and the corresponding amino acid mutations are noted.
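One way to read the "weekly growth slope" in Fig. 3 is as the slope of a straight line fitted to a mutation's weekly frequency in the sequenced population. The sketch below takes that reading as an assumption: the record gives only the 0.001 threshold, not the fitting procedure. The function names and the mutation labels `mut_a` and `mut_b` are hypothetical.

```python
def weekly_growth_slope(weekly_frequencies):
    """Ordinary least-squares slope of frequency against week index (0, 1, 2, ...)."""
    n = len(weekly_frequencies)
    if n < 2:
        raise ValueError("need at least two weekly frequencies")
    mean_x = (n - 1) / 2
    mean_y = sum(weekly_frequencies) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(weekly_frequencies))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var

def flag_growing(frequencies_by_mutation, threshold=0.001):
    """Return mutations whose fitted slope exceeds the threshold (0.001 in Fig. 3)."""
    return sorted(
        name
        for name, freqs in frequencies_by_mutation.items()
        if weekly_growth_slope(freqs) > threshold
    )

# Illustrative frequencies only; these are not values from the study.
candidates = flag_growing({
    "mut_a": [0.00, 0.01, 0.02, 0.03],   # rising by one percentage point per week
    "mut_b": [0.05, 0.05, 0.05, 0.05],   # flat
})
```

A linear fit is the simplest choice here; a logistic growth model would be a common alternative when a mutation approaches fixation.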
Fig. 4 Features of the mutation and variant monitoring and pre-warning system (MVMPS). A is the website portal. B is the variant monitoring module, which shows the global distribution of each variant by time frame and geography. C is the mutation monitoring module, which shows the global distribution of each mutation by time frame and geography. D is the mutation blacklist and whitelist module.