Alina Matsvay1,2, Marina Dyachkova1, Ivan Mikhaylov1, Daniil Kiselev3, Anna Say1, Valentina Burskaia4, Ilya Artyushin5, Kamil Khafizov2, German Shipulin1.
Abstract
The discovery and study of viruses carried by migratory birds are of high importance because these hosts can spread infectious diseases over significant distances. In this paper, we present and characterize the first complete genome sequence of an atadenovirus from a tern (common tern, Sterna hirundo), preliminarily named tern atadenovirus 1 (TeAdV-1). The TeAdV-1 genome is a linear double-stranded DNA molecule of 31,334 base pairs that contains 30 methionine-initiated open reading frames with a gene structure typical of the genus Atadenovirus. It has the shortest known inverted terminal repeats (ITRs) within the genus, consisting of 25 bases. The nucleotide composition of the genome is characterized by a low G + C content (33.86%), making it the most AT-rich genome among known avian adenoviruses of the genus Atadenovirus. The nucleotide sequence of the TeAdV-1 genome is highly divergent from those of known representatives of the genus, with the highest similarity to duck atadenovirus 1 (53.7%). Phylogenetic analysis of the protein sequences of core genes confirms the taxonomic affiliation of the new representative with the genus Atadenovirus, with a degree of divergence from known representatives that exceeds the interspecies distance within the genus. We therefore propose that the novel TeAdV-1 be considered a separate species.
Keywords: Adenoviridae; Atadenovirus; genome annotation; phylogenetics; tern atadenovirus 1
Year: 2021 PMID: 35056480 PMCID: PMC8781740 DOI: 10.3390/microorganisms10010031
Source DB: PubMed Journal: Microorganisms ISSN: 2076-2607
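The G + C content quoted in the abstract (33.86%) is a simple base-composition ratio. The following is a minimal sketch of that calculation, not the authors' pipeline: the function name `gc_content` and the toy sequence are illustrative assumptions, and ambiguous bases such as N are assumed to be excluded from the denominator.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Assumption: only unambiguous A/C/G/T bases are counted, so
    ambiguity codes (N, R, Y, ...) do not affect the result.
    """
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


# Toy AT-rich sequence (hypothetical, not taken from the TeAdV-1 genome).
toy = "ATGCATATTAAT"
print(f"G+C content: {gc_content(toy):.2f}%")
```

Applied to the full genome sequence, the same ratio would be expected to reproduce the reported value if the same counting convention was used.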
Figure 1. Schematic view of the TeAdV-1 genome structure. Genes and ORFs are illustrated as arrows pointing in the direction of transcription. Green indicates the conserved set of genes present in the genomes of all AdVs. Genes found in the genomes of more than one genus of AdVs, including the genus Atadenovirus, are shown in blue. Genes specific to Atadenovirus are highlighted in yellow. Genes specific to TeAdV-1 are colored red.
Predicted protein-coding genes and methionine-initiated ORFs of TeAdV-1 and syntenic genes of phylogenetically closely related duck adenovirus 1 (DAdV-1) [4] and psittacine adenovirus 3 (PsAdV-3) [59].
| TeAdV-1 gene | Start | End | Strand | Size (aa) | DAdV-1 | PsAdV-3 |
|---|---|---|---|---|---|---|
| p32 K | 236 | 1174 | − | 312 | p32 K | p32 K |
| LH2 | 1209 | 1625 | + | 138 | LH2 | E1B protein, |
| LH1 | 1656 | 2804 | + | 382 | E1B 55 K | |
| IVa2 protein | 2875 | 3696 | − | 296 | IVa2 protein | IVa2 protein |
| | 4614 | 4682 | | | | |
| DNA polymerase | 3945 | 7181 | − | 1078 | DNA polymerase | DNA polymerase |
| pTP | 7157 | 8950 | − | 602 | pTP | pTP |
| | 11680 | 11694 | | | | |
| 52 K protein | 8985 | 9971 | + | 328 | 52 K protein | 52 K protein |
| pIIIa protein | 9955 | 11664 | + | 569 | pIIIa protein | pIIIa protein |
| penton base protein | 11704 | 13062 | + | 452 | penton base protein | penton base protein |
| pVII protein | 13104 | 13559 | + | 151 | pVII | pVII |
| pX protein | 13568 | 13765 | + | 65 | pX | pX |
| pVI protein | 13800 | 14435 | + | 211 | pVI | pVI |
| hexon protein | 14456 | 17188 | + | 910 | hexon protein | hexon protein |
| protease | 17185 | 17790 | + | 201 | protease | protease |
| DNA-binding protein | 17809 | 18951 | − | 380 | DNA-binding protein | DNA-binding protein |
| 100 K protein | 19005 | 20924 | + | 639 | 100 K protein | 100 K protein |
| 22 K protein | 20758 | 20982 | + | 75 | ||
| 33 K protein | 20758 | 20973 | + | 150 | 33 K protein | 33 K protein |
| | 21059 | 21292 | | | | |
| pVIII protein | 21323 | 22129 | + | 268 | pVIII protein | pVIII protein |
| U-exon | 22142 | 22306 | − | 54 | U-exon | U-exon |
| fiber protein | 22324 | 24369 | + | 681 | fiber protein | fiber 2 protein |
| E4.3 protein | 24383 | 25273 | − | 296 | 34 K-2 | E4.3 protein |
| E4.2 protein | 25221 | 26027 | − | 268 | 34 K-1 | E4.2 protein |
| E4.1 protein | 25948 | 26385 | − | 145 | E4.1 protein | |
| ORF8 | 26628 | 26870 | − | 80 | ||
| ORF7 | 26888 | 27451 | − | 187 | ||
| ORF1 | 27508 | 27966 | + | 152 | ||
| ORF2 | 28072 | 28359 | + | 95 | ||
| ORF3 | 28366 | 28698 | + | 110 | ||
| ORF6 | 28872 | 29558 | − | 228 | ||
| ORF5 | 29658 | 30296 | − | 212 | ||
| ORF4 | 30424 | 31146 | + | 240 | ||
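The coordinates and protein sizes in the table above can be cross-checked with simple arithmetic. The sketch below rests on two assumptions that the table itself does not state: that the coordinates are 1-based and inclusive and cover the stop codon, and that a coordinate-only row is a second coding segment of the gene listed directly above it. The function name `protein_length` is hypothetical.

```python
def protein_length(segments, includes_stop=True):
    """Protein length (aa) implied by one or more coding segments.

    segments: list of (start, end) pairs, 1-based and inclusive.
    includes_stop: assumption that the last codon is a stop codon
    and therefore does not contribute a residue.
    """
    nucleotides = sum(end - start + 1 for start, end in segments)
    if nucleotides % 3 != 0:
        raise ValueError("total coding length is not a multiple of 3")
    codons = nucleotides // 3
    return codons - 1 if includes_stop else codons


# Rows taken from the table above.
print(protein_length([(236, 1174)]))                    # p32 K
print(protein_length([(2875, 3696), (4614, 4682)]))     # IVa2, two segments
print(protein_length([(7157, 8950), (11680, 11694)]))   # pTP, two segments
```

Under these assumptions the p32 K, IVa2, pTP, and hexon rows reproduce the listed sizes (312, 296, 602, and 910 aa). The 22 K and 33 K rows come out one residue short of the listed values, so this is best treated as a consistency check rather than a rule.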
Figure 2. Comparison of the TeAdV-1 genome with genomes of other members of the genus Atadenovirus. Colored bars represent the degree of pairwise identity of specific proteins conserved within the family Adenoviridae and the genus Atadenovirus; colored lines represent the mean protein and genomic sequence identity across all genes, as well as the mean GC content.
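Pairwise identity, as plotted in Figure 2, is commonly computed as the share of matching columns in a pairwise alignment. The sketch below illustrates one such convention and is not necessarily the one used for the figure: the function name `percent_identity` and the toy sequences are hypothetical, columns where both sequences carry a gap are skipped, and a gap against a residue counts as a mismatch.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity between two already-aligned sequences.

    Assumed convention: columns that are gaps in both sequences are
    ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned protein fragments (hypothetical).
print(f"{percent_identity('MKT-LV', 'MRTALV'):.1f}% identity")
```

Other conventions (for example, excluding all gapped columns from the denominator) give different numbers, so identities are only comparable when computed the same way.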
The core genes of adenoviruses, determined by analyzing groups of orthologous genes across the studied set of genomes of the type representatives of the family.
| Gene | Annotation |
|---|---|
| 100 K protein | participation in the transport of hexon monomers to the nucleus and trimerization |
| 23 K protein | participation in the cleavage of some AdV precursor proteins |
| 52 K protein | participation in the packaging of the viral DNA into the capsid |
| DBP (DNA-binding protein) | participation in the elongation phase of AdV DNA replication by unwinding the template |
| hexon | major capsid protein |
| III (penton base) | major capsid protein |
| pIIIa | minor capsid protein |
| IVa2 | participation in the packaging of the viral DNA into the capsid |
| Pol (DNA polymerase) | participation in the elongation phase of AdV DNA replication |
| pTP (preterminal protein) | the protein primer for AdV DNA replication |
| pVI | minor capsid protein |
| pVIII | minor capsid protein |
Figure 3. Maximum likelihood phylogenetic tree based on concatenated amino acid sequences of the 12 core proteins of 63 representatives of various species of the Adenoviridae family after 100 bootstrap replicates. Bootstrap values higher than 70 are marked next to the respective nodes (blue circles), showing a robust phylogenetic reconstruction.
Figure 4. Maximum likelihood phylogenetic tree based on concatenated amino acid sequences of the 12 core proteins of 29 representatives of various species of the Atadenovirus genus after 1000 bootstrap replicates. Bootstrap values higher than 70 are marked next to the respective nodes (blue circles), showing a robust phylogenetic reconstruction. A clade of related species of avian viruses of the Atadenovirus genus is indicated (orange box).
Species delimitation schemes obtained using the ASAP, PTP, and GMYC approaches. The following abbreviations are used: Bovine adenovirus F (BoAdV-F), Ovine atadenovirus D (OvAdV-D), Deer atadenovirus A (OdAdV-A), Bovine atadenovirus D (BoAdV-D), Bovine atadenovirus E (BoAdV-E), Lizard atadenovirus B (LiAdV-B), Lizard atadenovirus A (LiAdV-A), Psittacine atadenovirus A (PsAdV-A), Duck atadenovirus A (DAdV-A). The last row contains the total number of partitions obtained when analyzing 26 representatives of the Atadenovirus genus.
| ASAP | PTP | GMYC | GMYC |
|---|---|---|---|
| TAdV-1 | TAdV-1 | TAdV-1 | TAdV-1 |
| LC606503.1 | LC606503.1 | LC606503.1 | LC606503.1 |
| U40839.3 | U40839.3 | U40839.3 | U40839.3 |
| MK537328.1 | MK537328.1 | MK537328.1 | MK537328.1 |
| AF036092.3 | AF036092.3 | AF036092.3 | AF036092.3 |
| JQ345700.1 | JQ345700.1 | | |
| MT050041.1 | MT050041.1 | MT050041.1 | MT050041.1 |
| KJ156523.1 | KJ156523.1 | KJ156523.1 | KJ156523.1 |
| KJ675568.1 | KJ675568.1 | KJ675568.1 | KJ675568.1 |
| MN025529.1 | MN025529.1 | | |
| KJ452170.1 | KJ452170.1 | KJ452170.1 | KJ452170.1 |
| KF286430.1 | KF286430.1 | KF286430.1 | KF286430.1 |
| MT646045.1 | | | |
| MN310513.1 | | | |
| 9 | 11 | 8 | 14 |
Figure 5. Comparison of predicted 3D protein structures. For convenience, a different color is used for each predicted structure.
Figure 6. Predicted 3D structures of the hexon and penton base proteins, compared to the hexon and penton base proteins of HAdV-41 and LAdV-2.
Figure 7. Icosahedral asymmetric unit (AU) of TAdV. The penton base and LH3 proteins are highlighted in color.
The most reliable candidate sites identified for positive selection. Posterior probability (PP) values ≥ 0.95 are in bold.
| Protein | Amino acid coordinate | PP |
|---|---|---|
| 100 K protein | 118 | 0.886 |
| | 230 | 0.807 |
| | 450 | 0.905 |
| | 96 | 0.87 |
| | 13 | 0.903 |
| | 162 | 0.904 |
| | 180 | 0.936 |
| | 192 | |
| | 35 | 0.946 |
| pIVa2 | 123 | 0.937 |
| | 137 | 0.894 |
| | 152 | 0.946 |
| | 94 | 0.935 |
| DNA polymerase | 1044 | 0.862 |
| | 366 | 0.934 |
| | 367 | 0.927 |
| | 773 | 0.941 |
| pTP | 145 | 0.85 |
| | 187 | 0.946 |
| | 297 | 0.859 |
| | 405 | 0.926 |
| | 44 | 0.946 |
| | 445 | 0.939 |
| | 565 | 0.795 |
| | 89 | 0.93 |
| | 96 | 0.939 |