Zhi-Ying Xu1,2,3,4, Han Gao1,2,3,4, Qi-Yuan Kuang1,2,3,4, Jia-Bao Xing1,2,3,4, Zhi-Yuan Wang1,2,3,4, Xin-Yu Cao1,2,3,4, Si-Jia Xu1,2,3,4, Jing Liu1,2,3,4, Zhao Huang1,2,3,4, Ze-Zhong Zheng1,2,3,4, Lang Gong1,2,3,4, Heng Wang1,2,3,4, Mang Shi5, Gui-Hong Zhang1,2,3,4, Yan-Kuo Sun1,2,3,4.
Abstract
African swine fever (ASF) outbreaks have caused tremendous economic losses to the pig industry in China since the disease emerged there in August 2018. Previous studies revealed that many published sequences are not suitable for detailed analyses because they lack data on quality parameters and methodology and carry outdated annotations. Thus, high-quality genomes of highly pathogenic strains that can be used as references for early Chinese ASF outbreaks are still lacking, and little is known about the features of intra-host variants of ASF virus (ASFV). In this study, full-genome sequencing of clinical samples from the first ASF outbreak in Guangdong in 2018 was performed using the MGI (MGI Tech Co., Ltd., Shenzhen, China) and Nanopore sequencing platforms, followed by Sanger sequencing to verify the variations. With 22 sequencing corrections, we obtained a high-quality genome of one of the earliest virulent isolates, GZ201801_2. After proofreading, we improved the annotations of this isolate (adding or modifying entries) using a whole-genome alignment with Georgia 2007/1. Based on the complete genome sequence, we constructed the methylation profiles of early ASFV strains in China and predicted the potential 5mC and 6mA methylation sites, which are likely involved in metabolism, transcription, and replication. Additionally, the intra-host single nucleotide variant distribution and mutant allele frequency in clinical samples of an early strain were determined for the first time, revealing a strong preference for A and T substitution mutations, non-synonymous mutations, and mutations that resulted in amino acid substitutions to lysine. In conclusion, this study provides a high-quality genome sequence, updated genome annotation, methylation profile, and mutation spectrum of early ASFV strains in China, thereby providing a reference basis for further studies on the evolution, transmission, and virulence of ASFV.
Keywords: African swine fever; annotation; methylation; mutation spectrum; viral genome
Year: 2022 PMID: 36061106 PMCID: PMC9437553 DOI: 10.3389/fvets.2022.978243
Source DB: PubMed Journal: Front Vet Sci ISSN: 2297-1769
Reference strains selected in this study.
| Strain | Origin | Year | Genome length (bp) |
|---|---|---|---|
| Georgia 2007/1 | Georgia | 2007 | 190,584 |
| China/2018/AnhuiXCGQ | Anhui | 2018 | 189,393 |
| pig/HLJ/2018 | Heilongjiang | 2018 | 189,404 |
| DB/LN/2018 | Liaoning | 2018 | 189,404 |
| pig/China/CAS19-1/2019 | Guangdong | 2019 | 189,405 |
| Wuhan 2019-1 | Hubei | 2019 | 190,576 |
| Wuhan 2019-2 | Hubei | 2019 | 190,576 |
| CADC_HN09 | China | 2019 | 190,257 |
| pig/Heilongjiang/HRB1/2020 | Heilongjiang | 2020 | 189,355 |
| WBBS01 | China | 2018 | 189,394 |
| CN/2019/InnerMongolia-AES01 | Inner Mongolia | 2019 | 189,403 |
| SY18 | Liaoning | 2018 | 188,643 |
| HuB20 | Hubei | 2020 | 188,643 |
| GZ201801 | Guangdong | 2018 | 189,393 |
Figure 1. Depth and coverage of GZ201801_2 measured using SAMtools v1.10 after sequencing (27). (A) MGI platform sequencing and (B) Nanopore sequencing.
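Depth and coverage summaries like those in Figure 1 can be derived from per-position depth values. The sketch below is an illustration only, not the authors' pipeline: it assumes the three-column, tab-separated text (reference name, position, depth) that `samtools depth -a` writes, and the function name `summarize_depth`, the `min_depth` threshold, and the toy input are hypothetical.

```python
from typing import Iterable, Tuple


def summarize_depth(lines: Iterable[str], min_depth: int = 1) -> Tuple[float, float]:
    """Return (mean depth, fraction of positions with depth >= min_depth).

    Each line is expected to hold: reference name, position, depth,
    separated by tabs (the per-position layout of `samtools depth -a`).
    """
    total_depth = 0
    covered = 0
    positions = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _ref, _pos, depth_text = line.split("\t")[:3]
        depth = int(depth_text)
        total_depth += depth
        if depth >= min_depth:
            covered += 1
        positions += 1
    if positions == 0:
        raise ValueError("no positions in input")
    return total_depth / positions, covered / positions


# Hypothetical toy input: four positions, one of them uncovered.
example = ["ref\t1\t0", "ref\t2\t10", "ref\t3\t20", "ref\t4\t30"]
mean_depth, breadth = summarize_depth(example)
print(f"mean depth = {mean_depth:.1f}, breadth of coverage = {breadth:.0%}")
```

For a real run, the same function could be fed the lines of a saved depth file, once for each platform, to compare the MGI and Nanopore data sets.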
Figure 2. All annotations in the entire genomes of Georgia 2007/1 and GZ201801. Genes marked in yellow represent annotations of GZ201801_2 that were updated relative to GZ201801. Genes in blue and red represent annotations of GZ201801_2 that differ from those of Georgia 2007/1: blue marks differences caused by site mutations, and red marks differences caused by deletion or early termination.
Figure 3. Genome-wide comparison of ASFV genomes from China and Georgia 2007/1 (considered the original strain circulating in Eurasia since 2007) against GZ201801_2, shown as a frame map of each complete genome sequence. Each red line represents a site mutation, and vacant positions represent deleted regions.
Figure 4. Methylation of the positive and reverse strands of the genome predicted using Tombo, with the corresponding sequencing depth. Scores for (A) 5mC and (B) 6mA modification of the positive and reverse strands. Raw signal levels of the sequenced samples were compared with alternative models to obtain per-site statistics, from which Tombo generated scores evaluating the likelihood of methylation at each site in the full-length viral genome. A higher score indicates that the site is more likely to be methylated. (C) Depth of Nanopore sequencing of the positive and reverse strands. The top 10 sites, with their adjacent regions, in ASFV_GZ201801_2 that are most likely to carry (D) 5mC and (E) 6mA methylation, also predicted using Tombo.
Gene position, encoded protein, and biological process of the top 10 sites where 5mC methylation was predicted (28).
| Gene | Start (bp) | End (bp) | Encoded protein | Biological process |
|---|---|---|---|---|
| F778R | 57,745 | 60,081 | Ribonucleotide reductase (large subunit) | Nucleotide metabolism, transcription, replication, and repair |
| EP1242L | 67,263 | 70,991 | RNA polymerase subunit 2 | Nucleotide metabolism, transcription, replication, and repair |
| M448R | 80,232 | 81,578 | Uncharacterized protein | Unknown |
| C257L | 85,706 | 86,479 | Uncharacterized protein | Transmembrane domain |
| C475L | 86,458 | 87,885 | Poly-A polymerase large subunit | Nucleotide metabolism, transcription, replication, and repair |
| C962R | 89,826 | 92,714 | DNA primase | Nucleotide metabolism, transcription, replication, and repair |
| B646L | 104,342 | 106,282 | P72 major capsid protein | Structural integrity: structural proteins and proteins involved in morphogenesis |
| NP1450L | 129,825 | 134,207 | RNA polymerase subunit 1 | Nucleotide metabolism, transcription, replication, and repair |
| D1133L | 141,100 | 144,501 | Helicase superfamily II | Nucleotide metabolism, transcription, replication, and repair |
| H359L | 151,901 | 152,980 | RNA polymerase subunit 3–11 | Nucleotide metabolism, transcription, replication, and repair |
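The gene coordinates listed for the 5mC sites can be used to look up which annotated gene contains a given genomic position. The following is a minimal sketch under stated assumptions: the coordinates are copied from the table and treated as 1-based and inclusive (an assumption, since the record does not state the convention), and the helper `genes_at` and the query positions are hypothetical. Because C257L and C475L overlap, the lookup returns a list rather than a single gene.

```python
from typing import List, Tuple

# (gene, start, end) copied from the 5mC table; assumed 1-based, inclusive.
GENES: List[Tuple[str, int, int]] = [
    ("F778R", 57745, 60081),
    ("EP1242L", 67263, 70991),
    ("M448R", 80232, 81578),
    ("C257L", 85706, 86479),
    ("C475L", 86458, 87885),
    ("C962R", 89826, 92714),
    ("B646L", 104342, 106282),
    ("NP1450L", 129825, 134207),
    ("D1133L", 141100, 144501),
    ("H359L", 151901, 152980),
]


def genes_at(position: int) -> List[str]:
    """Return every listed gene whose interval contains the position."""
    return [name for name, start, end in GENES if start <= position <= end]


# Hypothetical query positions, chosen only to exercise the lookup.
for pos in (105000, 86460, 1):
    print(pos, genes_at(pos))
```

A linear scan is sufficient for ten intervals; an interval tree would only be worth the extra code for a full annotation set queried many times.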
Gene position, encoded protein, and biological process of the top 10 sites where 6mA methylation was predicted (28–30).
| Gene | Start (bp) | End (bp) | Encoded protein | Biological process |
|---|---|---|---|---|
| MGF 100-1R | 11,823 | 12,197 | Uncharacterized protein | Nucleotide metabolism, transcription, replication, and repair |
| MGF 360-8L | 23,778 | 24,737 | Uncharacterized protein | Unknown |
| A859L | 51,945 | 54,521 | RNA helicase | Nucleotide metabolism, transcription, replication, and repair |
| F778R | 57,745 | 60,081 | Ribonucleotide reductase (large subunit) | Nucleotide metabolism, transcription, replication, and repair |
| F1055L | 60,598 | 63,744 | Uncharacterized protein | Nucleotide metabolism, transcription, replication, and repair |
| K205R | 63,938 | 64,555 | Related to ER stress | Activates ER stress, autophagy, and NF-κB signaling pathway |
| C962R | 89,826 | 92,714 | DNA primase | Nucleotide metabolism, transcription, replication, and repair |
| B318L | 96,040 | 96,996 | Prenyltransferase | Enzymes |
| B354L | 100,365 | 101,429 | Uncharacterized protein | Unknown |
| CP2475L | 118,014 | 125,444 | pp220 polyprotein precursor of p150, p37, p14, and p34; required for packaging of nucleoprotein core | Structural integrity: structural proteins and proteins involved in morphogenesis |
Figure 5. (A) The distribution and frequency of the intra-host single nucleotide variants (iSNVs) in which a given base (A, T, C, G) mutates into each of the other three bases. (B) The distribution and frequency of the iSNVs in which the other three bases mutate into a given base. (C) The percentage of different iSNV mutation types. (D) Number of iSNVs in coding and non-coding regions. (E) Proportion of iSNVs occurring at different positions in codons across the genome. (F) Proportion of iSNVs causing non-synonymous and synonymous mutations. (G) Proportion of iSNVs that mutate into different types of amino acid. The number of iSNVs is indicated on the corresponding bar in each panel.
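The tallies behind panels (A) to (C), together with the mutant allele frequency mentioned in the abstract, can be sketched as simple counts over a list of variant calls. This is an illustrative sketch, not the authors' method: the record layout `(ref, alt, alt_reads, depth)`, the function names, and the toy calls are all hypothetical, and mutant allele frequency is assumed here to mean alternate-allele reads divided by total depth at the site.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (reference base, alternate base, alt reads, depth).
Variant = Tuple[str, str, int, int]


def allele_frequency(alt_reads: int, depth: int) -> float:
    """Mutant allele frequency, assumed to be alt reads / total depth."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def substitution_spectrum(variants: Iterable[Variant]) -> Dict[str, Counter]:
    """Count iSNVs by substitution type, by source base, and by target base."""
    by_type: Counter = Counter()
    from_base: Counter = Counter()
    to_base: Counter = Counter()
    for ref, alt, _alt_reads, _depth in variants:
        by_type[f"{ref}>{alt}"] += 1  # e.g. "G>A", as in panel (C)
        from_base[ref] += 1           # which base mutates, as in panel (A)
        to_base[alt] += 1             # which base it mutates into, as in panel (B)
    return {"by_type": by_type, "from_base": from_base, "to_base": to_base}


# Toy calls, invented for illustration only (not data from the study).
calls = [("G", "A", 12, 400), ("C", "T", 8, 200), ("G", "A", 30, 600), ("C", "A", 5, 500)]
spectrum = substitution_spectrum(calls)
print(spectrum["by_type"].most_common())
print(spectrum["to_base"].most_common())
print([round(allele_frequency(alt, depth), 3) for _, _, alt, depth in calls])
```

Keeping source and target bases in separate counters mirrors the split between panels (A) and (B), so a preference for substitutions into A or T would show up directly in `to_base`.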