Torstein Tengs, Christine M Jonassen.
Abstract
The mobile genetic element s2m has been described in several families of single-stranded RNA viruses. The function remains elusive, but an increasing number of s2m-containing sequences are being deposited in publicly available databases. Currently, more than 700 coronavirus sequences containing s2m can be found in GenBank, including the severe acute respiratory syndrome (SARS) coronavirus genome. This is an updated review of the pattern of s2m in coronaviruses, the possible functional implications and the evolutionary history.
Keywords: coronavirus; mobile genetic element; s2m; secondary structure
Year: 2016 PMID: 28933407 PMCID: PMC5456283 DOI: 10.3390/diseases4030027
Source DB: PubMed Journal: Diseases ISSN: 2079-9721
Figure 1. Coronavirus phylogeny with s2m-containing sequences highlighted. ORF1ab polyprotein amino acid sequences were aligned using the program MUSCLE [8] with default parameters. The phylogenetic analysis was performed using the program SeaView [9] and the neighbor-joining clustering method with Kimura two-parameter distances. To avoid large clades of closely related sequences, operational taxonomic units (OTUs) with similar GenBank taxonomical annotation and almost identical sequences were identified, and basal members of these monophyletic groups were chosen to represent such sequence clusters. For instance, there are 183 complete ORF1ab polyprotein sequences available from different strains of the Middle East respiratory syndrome (MERS) coronavirus. These sequences are represented by a single accession (in this case, GenBank accession number ALB08298; isolate KOREA/Seoul/035-1-2015). Based on visual inspection of the alignment, it was determined that the sequences belonging to the Torovirinae subfamily could not be reliably aligned, and they were excluded from the analysis. Brackets show serogroups as well as betacoronavirus lineages, and key branches with 100% bootstrap support (100 pseudoreplicates) have been indicated. Red circles indicate possible losses/gains of s2m (see discussion in text).
Coronavirus sequences in GenBank.
| | Total | Containing s2m |
|---|---|---|
| Coronavirus sequences | 20,068 | 706 (3.5%) |
| Alpha coronavirus * | 7190 | 0 |
| Beta coronavirus * | 4947 | 342 (6.9%) |
| Delta coronavirus * | 141 | 60 (42.6%) |
| Gamma coronavirus * | 6360 | 281 (4.4%) |
| Bafinivirus * | 12 | 0 |
| Torovirus * | 307 | 0 |
| Complete genomes | 1507 | 523 (34.7%) |
* GenBank taxonomy database annotation.
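The percentages in the table follow directly from the two count columns. The short Python sketch below recomputes them as a consistency check. The counts are copied from the table; the names `COUNTS`, `s2m_percentage`, and `percentages` are illustrative and not part of the original analysis.

```python
# Counts copied from the table above: (total sequences, sequences containing s2m).
# The dictionary and function names are illustrative assumptions.
COUNTS = {
    "Coronavirus sequences": (20068, 706),
    "Alpha coronavirus": (7190, 0),
    "Beta coronavirus": (4947, 342),
    "Delta coronavirus": (141, 60),
    "Gamma coronavirus": (6360, 281),
    "Bafinivirus": (12, 0),
    "Torovirus": (307, 0),
    "Complete genomes": (1507, 523),
}


def s2m_percentage(total, with_s2m):
    """Share of s2m-containing sequences, as a percentage rounded to one decimal."""
    return round(100.0 * with_s2m / total, 1)


def percentages(counts):
    """Map each table row name to its s2m percentage."""
    return {name: s2m_percentage(total, with_s2m)
            for name, (total, with_s2m) in counts.items()}


if __name__ == "__main__":
    for name, pct in percentages(COUNTS).items():
        print(f"{name}: {pct}%")
```

Note that the genus-level rows do not sum to the overall totals (for example, 342 + 60 + 281 = 683 rather than 706), so the top row evidently includes sequences outside the listed taxonomic groups.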
Figure 2. s2m coronavirus sequence motifs. For each genotype, one representative accession was chosen to illustrate the conserved nature of s2m both on a primary (‘sequence’) level and on a secondary (stem-loop structure) level. Lines above the alignment indicate co-varying/stem-forming elements, and columns with non-conserved bases at these nucleotide positions have been color coded.
Figure 3. s2m secondary structure. The s2m element from Pheasant coronavirus strain ph/UK/6/99 was folded using mfold [16]. Non-Watson-Crick base-pairings are shown in red. For a more detailed folding with tertiary interactions and long-range contacts indicated, see [9].
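To make the distinction between Watson-Crick and non-Watson-Crick pairings concrete, the following minimal Python sketch classifies each base pair in a stem-loop. It assumes the structure is supplied in dot-bracket notation, which is a choice made here for illustration and not the output format of mfold. The sequence in the usage note is a made-up toy hairpin, not the s2m element, and all function names are hypothetical.

```python
# Canonical Watson-Crick pairs and G-U wobble pairs for RNA.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def paired_positions(dot_bracket):
    """Return sorted (i, j) index pairs for matching parentheses."""
    stack, pairs = [], []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return sorted(pairs)


def classify_pairs(sequence, dot_bracket):
    """Label every base pair as 'Watson-Crick', 'wobble', or 'other'."""
    if len(sequence) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    seq = sequence.upper().replace("T", "U")  # accept DNA-style input
    labelled = []
    for i, j in paired_positions(dot_bracket):
        pair = (seq[i], seq[j])
        if pair in WATSON_CRICK:
            kind = "Watson-Crick"
        elif pair in WOBBLE:
            kind = "wobble"
        else:
            kind = "other"
        labelled.append((i, j, seq[i] + seq[j], kind))
    return labelled


if __name__ == "__main__":
    # Toy hairpin (hypothetical): 4-bp stem closing a GAAA loop.
    for entry in classify_pairs("GGACGAAAGUUC", "((((....))))"):
        print(entry)
```

In the toy hairpin, the second pair of the stem is a G-U wobble, the kind of pairing that would be highlighted as non-Watson-Crick in a structure drawing.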