Arshan Nasir, Gustavo Caetano-Anollés.
Abstract
The viral supergroup includes the entire collection of known and unknown viruses that roam our planet and infect life forms. The supergroup is remarkably diverse in both its genetics and morphology and has historically remained difficult to study and classify. The accumulation of protein structure data in the past few years now provides an excellent opportunity to re-examine the classification and evolution of viruses. Here we scan completely sequenced viral proteomes from all genome types and identify protein folds involved in the formation of viral capsids and virion architectures. Viruses encoding similar capsid/coat related folds were pooled into lineages, after benchmarking against published literature. Remarkably, the in silico exercise reproduced all previously described members of known structure-based viral lineages and yielded several proposals for new additions, suggesting that it could usefully supplement experimental approaches and aid qualitative assessment of viral diversity in metagenome samples.
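The pooling step described in the abstract can be illustrated with a minimal sketch. It assumes that each viral proteome has already been annotated with a set of FSF identifiers. The FSF-to-lineage pairs are copied from the FSF table in this article, whereas the function name `pool_into_lineages` and the virus labels (`virus_A`, etc.) are hypothetical and are not taken from the authors' pipeline.

```python
from collections import defaultdict

# FSF -> lineage pairs copied from a few rows of the FSF table in this article.
FSF_TO_LINEAGE = {
    "a.115.1": "BTV-like lineage",
    "d.183.1": "HK97-like lineage",
    "e.48.1": "HK97-like lineage",
    "b.121.4": "Picornavirus-like lineage",
    "b.121.2": "PRD1/Adenovirus-like lineage",
}


def pool_into_lineages(proteome_fsfs):
    """Group viruses by the lineage of every capsid/coat FSF they encode.

    proteome_fsfs maps a virus label to the set of FSF identifiers
    detected in its proteome. FSFs without a lineage entry are ignored.
    """
    lineages = defaultdict(set)
    for virus, fsfs in proteome_fsfs.items():
        for fsf in fsfs:
            lineage = FSF_TO_LINEAGE.get(fsf)
            if lineage is not None:
                lineages[lineage].add(virus)
    # Sort members so the output is deterministic.
    return {name: sorted(members) for name, members in lineages.items()}


# Hypothetical annotations, for illustration only.
example_assignments = {
    "virus_A": {"d.183.1"},
    "virus_B": {"e.48.1", "c.37.1"},  # c.37.1 has no lineage entry here
    "virus_C": {"b.121.4"},
    "virus_D": {"c.37.1"},            # no capsid/coat FSF, so not pooled
}

pooled = pool_into_lineages(example_assignments)
for lineage, members in sorted(pooled.items()):
    print(lineage, members)
```

A virus that encodes FSFs from two lineages would appear in both groups under this sketch; how such cases are resolved in the actual study is not stated in this section.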
Keywords: SCOP; capsid; fold superfamily; protein structure; virion; virus taxonomy
Year: 2017 PMID: 28344575 PMCID: PMC5344890 DOI: 10.3389/fmicb.2017.00380
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
List of 27 capsid/coat related FSFs as identified from SCOP (.
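| SCOP ID | FSF ccs | FSF description | Lineage | Identified by | A (%) | B (%) | E (%) |
|---|---|---|---|---|---|---|---|
| 48345 | a.115.1 | A virus capsid protein alpha-helical domain | BTV-like lineage | Keyword | 0.00 | 0.00 | 0.00 |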
| 64465 | d.196.1 | Outer capsid protein sigma 3 | BTV-like lineage | Keyword | 0.00 | 0.09 | 0.00 |
| 82856 | e.42.1 | L-A virus major coat protein | BTV-like lineage | Keyword | 0.00 | 0.00 | 1.04 |
| 49818 | b.19.1 | Viral protein domain | BTV-like lineage | Literature | 0.00 | 0.00 | 0.00 |
| 56831 | e.28.1 | Reovirus inner layer core protein p3 | BTV-like lineage | Literature | 0.00 | 0.18 | 0.26 |
| 51274 | b.85.2 | Head decoration protein D (gpD, major capsid protein D) | HK97-like lineage | Keyword | 0.00 | 0.72 | 0.00 |
| 56563 | d.183.1 | Major capsid protein gp5 | HK97-like lineage | Keyword | 9.84 | 32.74 | 1.04 |
| 103417 | e.48.1 | Major capsid protein VP5 | HK97-like lineage | Keyword | 0.00 | 0.00 | 0.26 |
| 48045 | a.84.1 | Scaffolding protein gpD of bacteriophage procapsid | Picornavirus-like lineage | Keyword | 0.00 | 0.00 | 0.00 |
| 88633 | b.121.4 | Positive stranded ssRNA viruses | Picornavirus-like lineage | Literature | 0.00 | 1.08 | 12.27 |
| 88645 | b.121.5 | ssDNA viruses | Picornavirus-like lineage | SCOP relative | 0.00 | 0.00 | 4.18 |
| 88648 | b.121.6 | Group I dsDNA viruses | Picornavirus-like lineage | SCOP relative | 0.00 | 0.00 | 0.00 |
| 88650 | b.121.7 | Satellite viruses | Picornavirus-like lineage | SCOP relative | 0.00 | 0.00 | 0.00 |
| 49749 | b.121.2 | Group II dsDNA viruses VP | PRD1/Adenovirus-like lineage | SCOP relative | 0.00 | 0.00 | 1.31 |
| 47353 | a.28.3 | Retrovirus capsid dimerization domain-like | Retrotranscribing-like lineage? | Keyword | 0.00 | 0.00 | 17.23 |
| 47852 | a.62.1 | Hepatitis B viral capsid (hbcag) | Retrotranscribing-like lineage? | Keyword | 0.00 | 0.00 | 0.00 |
| 47943 | a.73.1 | Retrovirus capsid protein, N-terminal core domain | Retrotranscribing-like lineage? | Keyword | 0.00 | 0.00 | 5.22 |
| 50176 | b.37.1 | N-terminal domains of the minor coat protein g3p | Inovirus-like lineage? | Keyword | 0.00 | 0.00 | 0.00 |
| 57987 | h.1.4 | Inovirus (filamentous phage) major coat protein | Inovirus-like lineage? | Keyword | 0.00 | 0.99 | 0.00 |
| 103068 | d.254.1 | Nucleocapsid protein dimerization domain | | Keyword | 0.00 | 0.00 | 0.00 |
| 55405 | d.85.1 | RNA bacteriophage capsid protein | | Keyword | 0.00 | 0.00 | 0.26 |
| 101257 | a.190.1 | Flavivirus capsid protein C | Other/Unclassified | Keyword | 0.00 | 0.00 | 0.00 |
| 47195 | a.24.5 | TMV-like viral coat proteins | Other/Unclassified | Keyword | 0.00 | 0.00 | 4.18 |
Where available (23 out of 27), the distribution (%) in the proteomes of 122 Archaea (A), 1,115 Bacteria (B), and 383 Eukarya (E) is also given, along with assignment to one of the four experimentally defined lineages (Abrescia et al., .
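As a rough illustration of the distribution columns, the sketch below assumes that the distribution of an FSF is the percentage of proteomes in a superkingdom in which that FSF is detected at least once; this definition is an assumption, since it is not spelled out in this section. The function name `distribution_percent` and the toy proteomes are hypothetical.

```python
def distribution_percent(proteomes, fsf):
    """Return the percentage of proteomes that contain `fsf`, to 2 decimals.

    proteomes is a list of sets of FSF identifiers, one set per proteome.
    Assumption: an FSF counts once per proteome, however many copies occur.
    """
    if not proteomes:
        raise ValueError("at least one proteome is required")
    hits = sum(1 for fsfs in proteomes if fsf in fsfs)
    return round(100.0 * hits / len(proteomes), 2)


# Toy data: 122 proteomes, 12 of which carry d.183.1. The count of 12 is a
# hypothetical value chosen for illustration; the actual counts behind the
# table are not given in this section.
toy_archaea = [{"d.183.1"}] * 12 + [set()] * 110

print(distribution_percent(toy_archaea, "d.183.1"))  # 9.84
print(distribution_percent(toy_archaea, "e.48.1"))   # 0.0
```

With these toy numbers the result matches the 9.84 listed for d.183.1 in Archaea, which is consistent with, but does not confirm, the per-proteome definition assumed here.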
Figure 1. PDB structures corresponding to FSFs of four experimentally defined viral lineages (Abrescia et al., . Helices, strands, and coils are colored red, blue, and gray, respectively. Text above structures indicates PDB ID along with chain information and FSF ccs. Additional monomers, ligands, and extra molecules were removed from visualization. 1ZBA, Foot-and-mouth disease virus; 2BPA, Bacteriophage phi-X174; 1DZL, Human papillomavirus type 16; 1STM, Satellite panicum mosaic virus; 1HX6, Bacteriophage PRD1; 1C5E, Bacteriophage lambda; 1OHG, Bacteriophage HK97; 1NO7, Herpes simplex virus 1; 1FN9, Reovirus; 1M1C, Saccharomyces cerevisiae virus L-A; and 2BTV, Bluetongue virus.
Genome type, host range, taxonomy assignment, and member families are listed for 23 capsid/coat related FSFs detected in our sampled proteomes.
| FSF ccs | Genome type | Hosts | Taxonomy / member families |
|---|---|---|---|
| a.115.1 | dsRNA | Algae, Fungi, Plants, Vertebrates, Invertebrates | |
| d.196.1 | dsRNA | Algae, Fungi, Plants, Vertebrates, Invertebrates | |
| e.42.1 | dsRNA | Fungi, Protozoa, Invertebrates, Vertebrates | |
| b.19.1 | dsRNA, plus-ssRNA, minus-ssRNA | Algae, Fungi, Plants, Vertebrates, Invertebrates | |
| e.28.1 | dsRNA | Algae, Fungi, Plants, Vertebrates, Invertebrates | |
| b.85.2 | dsDNA | Archaea, Bacteria | |
| d.183.1 | dsDNA | Archaea, Bacteria | |
| e.48.1 | dsDNA | Vertebrates | |
| a.84.1 | ssDNA | Bacteria | |
| b.121.4 | plus-ssRNA, dsRNA, minus-ssRNA, and dsDNA | Algae, Plants, Vertebrates, Invertebrates, Archaea, Bacteria | |
| b.121.5 | ssDNA | Bacteria, Vertebrates, Invertebrates | |
| b.121.6 | dsDNA | Vertebrates | |
| b.121.7 | ssDNA | Unknown | Unclassified |
| b.121.2 | dsDNA | Vertebrates, Invertebrates, Protozoa, Algae, Bacteria | |
| a.28.3 | ssRNA-RT | Vertebrates | |
| a.62.1 | dsDNA-RT | Vertebrates | |
| a.73.1 | ssRNA-RT | Vertebrates | |
| b.37.1 | ssDNA | Bacteria | |
| h.1.4 | ssDNA | Bacteria | |
| d.254.1 | plus-ssRNA | Vertebrates | |
| d.85.1 | plus-ssRNA | Bacteria | |
| a.190.1 | plus-ssRNA | Vertebrates, Invertebrates | Flaviviridae |
| a.24.5 | plus-ssRNA | Plants | |
Hosts of virus families, as described by the NCBI Viral Genomes Resource (Bao et al., .
Figure 2. Virion images of representative viral families from the four experimentally defined lineages (Abrescia et al., . RNA and DNA viruses are shown in red and blue, respectively. Novel additions to existing lineages are indicated by an asterisk. Virion pictures were taken from ViralZone (Hulo et al., 2011) with permission from the Swiss Institute of Bioinformatics (SIB).
Figure 3. Virion images of representative viral families from the four computationally defined lineages of unclassified viruses. RNA and DNA viruses are shown in red and blue, respectively. Only the RNA 1 segment is shown for Benyviridae. Virion pictures were taken from ViralZone (Hulo et al., 2011) with permission from the Swiss Institute of Bioinformatics (SIB).
Figure 4. PDB structures corresponding to FSFs of the four computationally defined viral lineages, together with unclassified FSFs. Helices, strands, and coils are colored red, blue, and gray, respectively. Text above structures indicates PDB IDs along with chain information and FSF ccs. Synthetic structures (j.54.1 and j.9.7) are not shown. Additional monomers, ligands, and extra molecules were removed from visualization. 2PWO and 3DS2, Human immunodeficiency virus type 1; 1QGT, Hepatitis B virus; 2G3P and 1IFD, Bacteriophage fd; 2I9F, Equine arteritis virus; 1UNA, Bacteriophage GA; 1SFK, Kunjin virus; and 1EI7, Tobacco mosaic virus, vulgare strain.