Mateusz Kudla, Kaja Gutowska, Jaroslaw Synak, Mirko Weber, Katrin Sophie Bohnsack, Piotr Lukasiak, Thomas Villmann, Jacek Blazewicz, Marta Szachniuk.
Abstract
MOTIVATION: Viruses are the most abundant biological entities and constitute a large reservoir of genetic diversity. In recent years, knowledge about them has increased significantly as a result of dynamic development in life sciences and rapid technological progress. This knowledge is scattered across various data repositories, making a comprehensive analysis of viral data difficult.
Year: 2020 PMID: 33367605 PMCID: PMC8016492 DOI: 10.1093/bioinformatics/btaa1066
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Dataflow in the Virxicon system
Fig. 2. Viral groups in Virxicon
The number of sequences in the database by classification, molecular types, topologies and resources (August 2, 2020)
| Virus group | Sequences | % | Molecular type: RNA | % | Molecular type: DNA | % | Topology: linear | % | Topology: circular | % | Resource: NCBI | % | Resource: GenBank | % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DNA viruses | | | | | | | | | | | | | | |
| dsDNA | 19 896 | 8.6% | 3 | 0.0% | 19 893 | 8.6% | 13 963 | 6.0% | 5933 | 2.6% | 4385 | 1.9% | 15 511 | 6.7% |
| ssDNA | 22 405 | 9.7% | 9 | 0.0% | 22 396 | 9.7% | 4912 | 2.1% | 17 493 | 7.6% | 1528 | 0.7% | 20 877 | 9.0% |
| ssDNA/dsDNA | 28 | 0.0% | 0 | 0.0% | 28 | 0.0% | 2 | 0.0% | 26 | 0.0% | 14 | 0.0% | 14 | 0.0% |
| Subtotal | 42 329 | 18.3% | 12 | 0.0% | 42 317 | 18.3% | 18 877 | 8.2% | 23 452 | 10.1% | 5927 | 2.6% | 36 402 | 15.7% |
| RNA viruses | | | | | | | | | | | | | | |
| dsRNA | 87 236 | 37.7% | 86 651 | 37.4% | 585 | 0.3% | 87 236 | 37.7% | 0 | 0.0% | 1379 | 0.6% | 85 857 | 37.1% |
| ssRNA(+) | 51 650 | 22.3% | 51 427 | 22.2% | 223 | 0.1% | 51 624 | 22.3% | 26 | 0.0% | 1978 | 0.9% | 49 672 | 21.5% |
| ssRNA(-) | 15 762 | 6.8% | 15 698 | 6.8% | 64 | 0.0% | 15 412 | 6.7% | 350 | 0.2% | 1057 | 0.5% | 14 705 | 6.4% |
| ssRNA(±)/ssRNA(-) | 3255 | 1.4% | 3247 | 1.4% | 8 | 0.0% | 3251 | 1.4% | 4 | 0.0% | 245 | 0.1% | 3010 | 1.3% |
| Subtotal | 157 903 | 68.2% | 157 023 | 67.8% | 880 | 0.4% | 157 523 | 68.0% | 380 | 0.2% | 4659 | 2.0% | 153 244 | 66.2% |
| RT viruses | | | | | | | | | | | | | | |
| ssRNA-RT | 9129 | 3.9% | 4074 | 1.8% | 5055 | 2.2% | 9121 | 3.9% | 8 | 0.0% | 94 | 0.0% | 9035 | 3.9% |
| dsDNA-RT | 10 901 | 4.7% | 14 | 0.0% | 10 887 | 4.7% | 1542 | 0.7% | 9359 | 4.0% | 112 | 0.0% | 10 789 | 4.7% |
| Subtotal | 20 030 | 8.7% | 4088 | 1.8% | 15 942 | 6.9% | 10 663 | 4.6% | 9367 | 4.0% | 206 | 0.1% | 19 824 | 8.6% |
| Viroids | | | | | | | | | | | | | | |
| ssRNA(viroids) | 11 241 | 4.9% | 11 110 | 4.8% | 131 | 0.1% | 482 | 0.2% | 10 759 | 4.6% | 39 | 0.0% | 11 202 | 4.8% |
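
The percentages in the table appear to be shares of the whole database, that is, of the sum of the four group totals (DNA viruses, RNA viruses, RT viruses, viroids). The following Python sketch checks that reading against the counts listed above. The variable and function names are illustrative only and are not part of Virxicon, and the rounding to one decimal place is an assumption based on how the table is printed.

```python
# Group totals copied from the table above (sequence counts as of August 2, 2020).
group_totals = {
    "DNA viruses": 42_329,
    "RNA viruses": 157_903,
    "RT viruses": 20_030,
    "Viroids": 11_241,
}

# Assumption: percentages are relative to the sum of all group totals.
grand_total = sum(group_totals.values())


def share(count: int, total: int = grand_total) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {group: share(count) for group, count in group_totals.items()}

for group, pct in shares.items():
    print(f"{group}: {pct}%")

# Within a row, the molecular-type split should add up to the row total,
# e.g. dsDNA: 3 RNA-typed + 19 893 DNA-typed sequences.
dsdna_split_matches = (3 + 19_893 == 19_896)
```

Under this assumption the computed shares reproduce the percentages printed for the four group totals, and the same `share` helper can be applied to any individual cell of the table.
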
Fig. 3. Data distribution in viral groups with the division into RNA, DNA, RT viruses and viroids
Fig. 4. The web-based interface of Virxicon
Selected features of viral databases
Databases compared: NCBI virus, ViPR, ViralZone, Virxicon.

| Category | Features compared |
|---|---|
| I Database contents | Sequences; Virus groups; Other viral data; Functional annotations |
| II Data filtering criteria | Sequence; Sequence homology; Species; Taxonomy; Virus group; Molecular type; Topology |
| III Data available for download | Single sequences; Group-wide sequences; Search results |
| IV Other facilities | Submission of user data; Genome browser; Additional access via API; Visualized statistics |