Myeongji Cho, Hayeon Kim, Hyeon S Son.
Abstract
BACKGROUND: Polyomaviruses (PyVs) have a wide range of hosts, from humans to fish, and their effects on hosts vary. The differences in the infection characteristics of PyV with respect to the host are assumed to be influenced by the biochemical function of the LT-Ag protein, which is related to the cytopathic effect and tumorigenesis mechanism via interaction with the host protein.
Keywords: Codon usage pattern; Functional domains; LT-Ag; Polyomavirus; RSCU; Sequence motif
Year: 2019 PMID: 31727090 PMCID: PMC6854729 DOI: 10.1186/s12985-019-1245-2
Source DB: PubMed Journal: Virol J ISSN: 1743-422X Impact factor: 4.099
Proven and possible diseases associated with PyVs
| Host | Virus name | Species | Abbr. | Clinical correlate | Ref. |
|---|---|---|---|---|---|
| Human | Merkel cell polyomavirus | Human polyomavirus 5 | MCPyV | Merkel cell cancer | [ |
| Human | Trichodysplasia spinulosa-associated polyomavirus | Human polyomavirus 8 | TSPyV | Trichodysplasia spinulosa | [ |
| Human | BK polyomavirus | Human polyomavirus 1 | BKPyV | Polyomavirus-associated nephropathy; haemorrhagic cystitis | [ |
| Human | JC polyomavirus | Human polyomavirus 2 | JCPyV | Progressive multifocal leukoencephalopathy (PML) | [ |
| Human | Human polyomavirus 6 | Human polyomavirus 6 | HPyV6 | HPyV6 associated pruritic and dyskeratotic dermatosis (H6PD) | [ |
| Human | Human polyomavirus 7 | Human polyomavirus 7 | HPyV7 | HPyV7-related epithelial hyperplasia | [ |
| Monkey | Simian virus 40 | | SV40 | PML-like disease in immunocompromised animals | [ |
| Hamster | hamster polyomavirus | polyomavirus 1 | HaPyV | Skin tumors | [ |
| Mouse | mouse pneumotropic virus | | MPtV | Respiratory disease in suckling mice | [ |
| Bird | budgerigar fledgling disease virus | Aves polyomavirus 1 | BFDV | Budgerigar fledgling disease; polyomavirus disease | [ |
| Finch | Finch polyomavirus | | FPyV | Polyomavirus disease | [ |
| Goose | Goose hemorrhagic polyomavirus | | GHPV | Hemorrhagic nephritis and enteritis | [ |
References are specified for first description
Description of sequence data used in this study
| No. | ICTV Taxonomy | | NCBI Reference Sequence | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | Virus name | Abbr. | Accession No. | Host species | Isolation source | Country | Year | bp | Group | Ref. |
| 1 | bat polyomavirus 4a | BatPyV4a | NC_038556.1 | | spleen | French Guiana | 2011 | 5187 | M | [ |
| 2 | | ApanPyV1 | NC_019853.1 | | NA | Germany | NA | 5273 | P | [ |
| 3 | bat polyomavirus 5b1 | BatPyV5b-1 | NC_026767.1 | | spleen | Indonesia | 2012 | 5047 | M | [ |
| 4 | bat polyomavirus 5a | BatPyV5a | NC_026768.1 | | spleen | Indonesia | 2012 | 5075 | M | [ |
| 5 | Bornean orang-utan polyomavirus | OraPyV-Bor | NC_013439.1 | | blood | NA | NA | 5168 | P | [ |
| 6 | Cardioderma polyomavirus | CardiodermaPyV | NC_020067.1 | | rectal swab | Kenya | 2006 | 5372 | M | [ |
| 7 | bat polyomavirus 4b | BatPyV4b | NC_028120.1 | | spleen | French Guiana | 2011 | 5352 | M | [ |
| 8 | chimpanzee polyomavirus | ChPyV | NC_014743.1 | | blood | NA | NA | 5086 | P | [ |
| 9 | vervet monkey polyomavirus 1 | VmPyV1 | NC_019844.1 | | spleen | Zambia | 2009 | 5157 | P | [ |
| 10 | vervet monkey polyomavirus 3 | VmPyV3 | NC_025898.1 | | spleen | Zambia | 2009 | 5055 | P | [ |
| 11 | Eidolon polyomavirus 1 | EidolonPyV | NC_020068.1 | | rectal swab | Kenya | 2009 | 5294 | M | [ |
| 12 | | GgorgPyV1 | NC_025380.1 | | NA | Congo Republic | 2008 | 5300 | P | [ |
| 13 | Human polyomavirus 9 | HPyV9 | NC_015150.1 | | NA | Germany | 2009 | 5026 | H | [ |
| 14 | Human polyomavirus 12 | HPyV12 | NC_020890.1 | | NA | Germany | 2007 | 5033 | H | [ |
| 15 | | MfasPyV1 | NC_019851.1 | | NA | Germany | NA | 5087 | P | [ |
| 16 | Merkel cell polyomavirus | MCPyV | NC_010277.2 | | skin | USA | 2009 | 5387 | H | [ |
| 17 | hamster polyomavirus | HaPyV | NC_001663.2 | | NA | Germany | 1967 | 5372 | M | [ |
| 18 | bat polyomavirus 3b | BatPyV3b | NC_028123.1 | | spleen | French Guiana | 2011 | 4903 | M | [ |
| 19 | mouse polyomavirus | MPyV | NC_001515.2 | | NA | NA | NA | 5307 | M | NA |
| 20 | New Jersey polyomavirus | NJPyV | NC_024118.1 | | bicep muscle | USA | 2013 | 5108 | H | [ |
| 21 | Otomops polyomavirus 2 | OtomopsPyV | NC_020066.1 | | rectal swab | Kenya | 2006 | 4914 | M | [ |
| 22 | Otomops polyomavirus 1 | OtomopsPyV1 | NC_020071.1 | | rectal swab | Kenya | 2006 | 5176 | M | [ |
| 23 | Pan troglodytes verus polyomavirus 2a | PtrovPyV2a | NC_025370.1 | | NA | Cote d’Ivoire | 2010 | 5309 | P | [ |
| 24 | Pan troglodytes verus polyomavirus 3 | PtrovPyV3 | NC_019855.1 | | NA | Cote d’Ivoire | NA | 5333 | P | [ |
| 25 | Pan troglodytes verus polyomavirus 4 | PtrovPyV4 | NC_019856.1 | | NA | Cote d’Ivoire | NA | 5349 | P | [ |
| 26 | Pan troglodytes verus polyomavirus 5 | PtrovPyV5 | NC_019857.1 | | NA | Cote d’Ivoire | NA | 4994 | P | [ |
| 27 | Pan troglodytes schweinfurthii polyomavirus 2 | PtrosPyV2 | NC_019858.1 | | NA | Uganda | NA | 4970 | P | [ |
| 28 | Pan troglodytes verus polyomavirus 1a | PtrovPyV1a | NC_025368.1 | | NA | Cote d’Ivoire | 2009 | 5303 | P | [ |
| 29 | Piliocolobus badius polyomavirus 2 | PbadPyV2 | NC_039051.1 | | NA | Cote d’Ivoire | 2005 | 5148 | P | [ |
| 30 | Piliocolobus rufomitratus polyomavirus 1 | PrufPyV1 | NC_019850.1 | | NA | Cote d’Ivoire | NA | 5140 | P | [ |
| 31 | raccoon polyomavirus | RacPyV | NC_023845.1 | raccoon | NA | USA | 2011 | 5016 | M | [ |
| 32 | | RnorPyV1 | NC_027531.1 | | spleen | Germany | 2005 | 5318 | M | [ |
| 33 | bat polyomavirus 3a-B0454 | BatPyV3a-B0454 | NC_038557.1 | | spleen | French Guiana | 2011 | 5058 | M | [ |
| 34 | Sumatran orang-utan polyomavirus | OraPyV-Sum | NC_028127.1 | | blood | NA | NA | 5358 | P | [ |
| 35 | Trichodysplasia spinulosa-associated polyomavirus | TSPyV | NC_014361.1 | | skin | Netherlands | 2009 | 5232 | H | [ |
| 36 | yellow baboon polyomavirus 1 | YbPyV1 | NC_025894.1 | | spleen | Zambia | 2009 | 5064 | P | [ |
| 37 | African elephant polyomavirus 1 | AelPyV1 | NC_022519.1 | | protruding ulcerated fibroma | Denmark | 2011 | 5722 | M | [ |
| 38 | BatPyV4a | BatPyV2c | NC_038558.1 | | spleen | French Guiana | 2011 | 5371 | M | [ |
| 39 | Myodes glareolus polyomavirus 1 | BVPyV | NC_028117.1 | | blood serum and body fluids | Germany | 2013 | 5032 | M | [ |
| 40 | bat polyomavirus 6a | BatPyV6a | NC_026762.1 | | spleen | Indonesia | 2013 | 5019 | M | [ |
| 41 | bat polyomavirus 6b | BatPyV6b | NC_026770.1 | | spleen | Indonesia | 2012 | 5039 | M | [ |
| 42 | bat polyomavirus 6c | BatPyV6c | NC_026769.1 | | spleen | Indonesia | 2012 | 5046 | M | [ |
| 43 | California sea lion polyomavirus 1 | SLPyV | NC_013796.1 | | tongue | USA | 2006 | 5112 | M | [ |
| 44 | | CalbPyV1 | NC_019854.2 | | NA | Germany | NA | 5013 | P | [ |
| 45 | | CeryPyV1 | NC_025892.1 | | NA | Cameroon | NA | 5189 | P | [ |
| 46 | vervet monkey polyomavirus 2 | VmPyV2 | NC_025896.1 | | kidney | Zambia | 2009 | 5167 | P | [ |
| 47 | | CVPyV | NC_028119.1 | | blood serum and body fluids | Germany | 2013 | 5024 | M | [ |
| 48 | bat polyomavirus 2a | BatPyV2a | NC_028122.1 | | spleen | French Guiana | 2011 | 5201 | M | [ |
| 49 | equine polyomavirus | EPyV | NC_017982.1 | | eye | USA | 2003 | 4987 | M | [ |
| 50 | BK polyomavirus | BKV; BKPyV | NC_001538.1 | | NA | NA | NA | 5153 | H | [ |
| 51 | KI polyomavirus | KIPyV | NC_009238.1 | | NA | NA | NA | 5040 | H | [ |
| 52 | JC polyomavirus | JCV; JCPyV | NC_001699.1 | | NA | NA | NA | 5130 | H | [ |
| 53 | Weddell seal polyomavirus | WsPyV | NC_032120.1 | | kidney | Antarctica | 2014 | 5186 | M | NA |
| 54 | simian virus 40 | SV40 | NC_001669.1 | | NA | NA | NA | 5243 | P | [ |
| 55 | Mastomys polyomavirus | MasPyV | NC_025895.1 | | spleen | Zambia | 2009 | 4899 | M | [ |
| 56 | | MmelPyV1 | NC_026473.1 | | salivary gland | France | 2014 | 5187 | M | [ |
| 57 | Miniopterus polyomavirus | MiniopterusPyV | NC_020069.1 | | rectal swab | Kenya | 2006 | 5213 | M | [ |
| 58 | mouse pneumotropic virus | MPtV | NC_001505.2 | | NA | NA | NA | 4754 | M | [ |
| 59 | Myotis polyomavirus | MyPyV | NC_011310.1 | | NA | Canada | 2007 | 5081 | M | [ |
| 60 | Pan troglodytes verus polyomavirus 8 | PtrovPyV8 | NC_028635.1 | Western chimpanzee | colon | Netherlands | 2014 | 5163 | P | [ |
| 61 | Pteronotus polyomavirus | PteronotusPyV | NC_020070.1 | | oral swab | Guatemala | 2009 | 5136 | M | [ |
| 62 | bat polyomavirus 2b | BatPyV2b | NC_028121.1 | | spleen | French Guiana | 2011 | 5041 | M | [ |
| 63 | rat polyomavirus 2 | RatPyV2 | NC_032005.1 | | NA | USA | 2016 | 5108 | M | NA |
| 64 | | SsciPyV1 | NC_038559.1 | | NA | Germany | NA | 5067 | P | NA |
| 65 | squirrel monkey polyomavirus | SquiPyV | NC_009951.1 | | spleen | NA | NA | 5075 | P | [ |
| 66 | alpaca polyomavirus | AlPyV | NC_034251.1 | | NA | USA | 2014 | 5052 | M | [ |
| 67 | WU polyomavirus | WUPyV | NC_009539.1 | | NA | Australia | NA | 5229 | H | [ |
| 68 | yellow baboon polyomavirus 2 | YbPyV2 | AB767295.2 | | spleen and kidney | Zambia | 2009 | 5181 | P | [ |
| 69 | Human polyomavirus 6 | HPyV6 | NC_014406.1 | | skin | USA | 2009 | 4926 | H | [ |
| 70 | Human polyomavirus 7 | HPyV7 | NC_014407.1 | | skin | USA | 2009 | 4952 | H | [ |
| 71 | MW polyomavirus | MWPyV | NC_018102.1 | | stool | Malawi | 2008 | 4927 | H | [ |
| 72 | STL polyomavirus | STLPyV | NC_020106.1 | | fecal specimen | Malawi | NA | 4776 | H | [ |
| 73 | Adélie penguin polyomavirus | ADPyV | NC_026141.2 | | fecal material | Antarctica | 2012 | 4988 | A | [ |
| 74 | budgerigar fledgling disease virus | BFDV | NC_004764.2 | Falconiformes and Psittaciformes (wild birds) | NA | NA | NA | 4981 | A | [ |
| 75 | butcherbird polyomavirus | Butcherbird PyV | NC_023008.1 | | periocular skin | Australia | 2009 | 5084 | A | [ |
| 76 | canary polyomavirus | CaPyV | NC_017085.1 | | liver | Netherlands | 2007 | 5421 | A | [ |
| 77 | crow polyomavirus | CpyV | NC_007922.1 | | NA | NA | 2005 | 5079 | A | [ |
| 78 | | EgouPyV1 | NC_039052.1 | | liver | Poland | 2014 | 5172 | A | [ |
| 79 | finch polyomavirus | FPyV | NC_007923.1 | | NA | NA | 2005 | 5278 | A | [ |
| 80 | goose hemorrhagic polyomavirus | GHPV | NC_004800.1 | goose | NA | Germany | 2001 | 5256 | A | [ |
| 81 | Hungarian finch polyomavirus | HunFPyV | NC_039053.1 | | kidney and liver | Hungary | 2011 | 5284 | A | [ |
| 82 | black sea bass-associated polyomavirus 1 | BassPyV1 | NC_025790.1 | | NA | USA | 2014 | 7369 | F | [ |
| 83 | bovine polyomavirus | BPyV | NC_001442.1 | | kidney | NA | NA | 4697 | M | [ |
| 84 | dolphin polyomavirus 1 | DPyV | NC_025899.1 | | trachea | USA | 2010 | 5159 | M | [ |
| 85 | giant guitarfish polyomavirus | GfPyV1 | NC_026244.1 | | skin lesion | USA | 2014 | 3962 | F | [ |
| 86 | sharp-spined notothenia polyomavirus | SspPyV | NC_026944.1 | | NA | Antarctica | 2013 | 6219 | F | NA |
No. 1–36: Alphapolyomaviruses; No. 37–68: Betapolyomaviruses; No. 69–72: Deltapolyomaviruses; No. 73–81: Gammapolyomaviruses; No. 82–86: unassigned polyomaviruses; NA, not available
All 86 viruses were classified into 5 groups according to their host as follows: non-primate mammals (Group M); non-human primates (Group P); humans (Group H); avian (Group A); fish (Group F)
Domains and motifs of PyVs used in this study
| No. | Abbr. | Accession no. | DnaJ domain | | | LXCXE motif | | | Helicase domain | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | Start | End | nt length | Start | End | a.a. sequence | Start | End | nt length |
| 1 | BatPyV4a | NC_038556.1 | 12 | 67 | 168 | 107 | 111 | LRCDE | 405 | 564 | 480 |
| 2 | ApanPyV1 | NC_019853.1 | 12 | 77 | 198 | 122 | 126 | LFCNE | 441 | 601 | 483 |
| 3 | BatPyV5b-1 | NC_026767.1 | 12 | 74 | 189 | – | – | – | 376 | 536 | 483 |
| 4 | BatPyV5a | NC_026768.1 | 12 | 67 | 168 | – | – | – | 382 | 546 | 495 |
| 5 | OraPyV-Bor | NC_013439.1 | 12 | 77 | 198 | 122 | 126 | LFCDE | 422 | 602 | 543 |
| 6 | CardiodermaPyV | NC_020067.1 | 12 | 77 | 198 | 212 | 216 | LYCDE | 556 | 716 | 483 |
| 7 | BatPyV4b | NC_028120.1 | – | – | – | 152 | 156 | LLCEE | 458 | 651 | 582 |
| 8 | ChPyV | NC_014743.1 | 12 | 96 | 255 | – | – | – | 379 | 580 | 606 |
| 9 | VmPyV1 | NC_019844.1 | 12 | 80 | 207 | 107 | 111 | LHCNE | 479 | 640 | 486 |
| 10 | VmPyV3 | NC_025898.1 | 12 | 75 | 192 | 131 | 135 | LFCSE | 462 | 622 | 483 |
| 11 | EidolonPyV | NC_020068.1 | – | – | – | 236 | 240 | LRCDE | 588 | 752 | 495 |
| 12 | GgorgPyV1 | NC_025380.1 | – | – | – | 200 | 204 | LFCDE | 554 | 714 | 483 |
| 13 | HPyV9 | NC_015150.1 | 12 | 86 | 225 | 123 | 127 | LFCSE | 446 | 606 | 483 |
| 14 | HPyV12 | NC_020890.1 | – | – | – | – | – | – | 473 | 635 | 489 |
| 15 | MfasPyV1 | NC_019851.1 | 12 | 86 | 225 | 125 | 129 | LFCTE | 465 | 665 | 603 |
| 16 | MCPyV | NC_010277.2 | – | – | – | 212 | 216 | LFCDE | 567 | 727 | 483 |
| 17 | HaPyV | NC_001663.2 | – | – | – | 130 | 134 | LTCQE | 522 | 682 | 483 |
| 18 | BatPyV3b | NC_028123.1 | – | – | – | 107 | 111 | LYCDE | 467 | 630 | 492 |
| 19 | MPyV | NC_001515.2 | – | – | – | 142 | 146 | LFCYE | 549 | 709 | 483 |
| 20 | NJPyV | NC_024118.1 | 12 | 80 | 207 | 107 | 111 | LHCDE | 476 | 636 | 483 |
| 21 | OtomopsPyV | NC_020066.1 | 12 | 92 | 243 | 107 | 111 | LYCDE | 483 | 643 | 483 |
| 22 | OtomopsPyV1 | NC_020071.1 | – | – | – | 185 | 189 | LRCDE | 520 | 680 | 483 |
| 23 | PtrovPyV2a | NC_025370.1 | – | – | – | 200 | 204 | LFCDE | 556 | 716 | 483 |
| 24 | PtrovPyV3 | NC_019855.1 | 12 | 75 | 192 | – | – | – | 486 | 646 | 483 |
| 25 | PtrovPyV4 | NC_019856.1 | 12 | 75 | 192 | – | – | – | 489 | 646 | 474 |
| 26 | PtrovPyV5 | NC_019857.1 | 12 | 86 | 225 | 123 | 127 | LFCSE | 439 | 599 | 483 |
| 27 | PtrosPyV2 | NC_019858.1 | 12 | 85 | 222 | 108 | 112 | LYCSE | 432 | 632 | 603 |
| 28 | PtrovPyV1a | NC_025368.1 | – | – | – | 203 | 207 | LYCDE | 558 | 718 | 483 |
| 29 | PbadPyV2 | NC_039051.1 | 12 | 92 | 243 | 107 | 111 | LHCNE | 476 | 637 | 486 |
| 30 | PrufPyV1 | NC_019850.1 | 12 | 93 | 246 | 107 | 111 | LHCNE | 476 | 637 | 486 |
| 31 | RacPyV | NC_023845.1 | – | – | – | 167 | 171 | LFCEE | 504 | 685 | 546 |
| 32 | RnorPyV1 | NC_027531.1 | – | – | – | 128 | 132 | LYCSE | 535 | 698 | 492 |
| 33 | BatPyV3a-B0454 | NC_038557.1 | – | – | – | 107 | 111 | LHCHE | 477 | 637 | 483 |
| 34 | OraPyV-Sum | NC_028127.1 | 12 | 75 | 192 | – | – | – | 489 | 649 | 483 |
| 35 | TSPyV | NC_014361.1 | 12 | 77 | 198 | 122 | 126 | LFCHE | 445 | 605 | 483 |
| 36 | YbPyV1 | NC_025894.1 | 12 | 75 | 192 | 131 | 135 | LFCSE | 463 | 663 | 603 |
| 37 | AelPyV1 | NC_022519.1 | – | – | – | – | – | – | 400 | 564 | 495 |
| 38 | BatPyV2c | NC_038558.1 | – | – | – | 223 | 227 | LLCEE | 559 | 719 | 483 |
| 39 | BVPyV | NC_028117.1 | 12 | 67 | 168 | 146 | 150 | LTCHE | 383 | 574 | 576 |
| 40 | BatPyV6a | NC_026762.1 | – | – | – | 84 | 88 | LFCHE | 395 | 557 | 489 |
| 41 | BatPyV6b | NC_026770.1 | – | – | – | 98 | 102 | LFCHE | 407 | 570 | 492 |
| 42 | BatPyV6c | NC_026769.1 | – | – | – | 100 | 104 | LFCRE | 426 | 587 | 486 |
| 43 | SLPyV | NC_013796.1 | 12 | 77 | 198 | 113 | 117 | LHCHE | 397 | 556 | 480 |
| 44 | CalbPyV1 | NC_019854.2 | – | – | – | 100 | 104 | LFCNE | 410 | 570 | 483 |
| 45 | CeryPyV1 | NC_025892.1 | 12 | 75 | 192 | 105 | 109 | LFCHE | 402 | 562 | 483 |
| 46 | VmPyV2 | NC_025896.1 | 12 | 75 | 192 | 105 | 109 | LFCHE | 402 | 562 | 483 |
| 47 | CVPyV | NC_028119.1 | 12 | 67 | 168 | 145 | 149 | LSCNE | 382 | 573 | 576 |
| 48 | BatPyV2a | NC_028122.1 | 12 | 80 | 207 | – | – | – | 406 | 565 | 480 |
| 49 | EPyV | NC_017982.1 | 12 | 86 | 225 | 105 | 109 | LRCDE | 402 | 562 | 483 |
| 50 | BKPyV | NC_001538.1 | 12 | 75 | 192 | 105 | 109 | LFCHE | 402 | 562 | 483 |
| 51 | KIPyV | NC_009238.1 | – | – | – | 108 | 112 | LRCNE | 410 | 572 | 489 |
| 52 | JCPyV | NC_001699.1 | 12 | 75 | 192 | 105 | 109 | LFCHE | 401 | 561 | 483 |
| 53 | WsPyV | NC_032120.1 | 12 | 77 | 198 | 113 | 117 | LHCNE | 400 | 561 | 486 |
| 54 | SV40 | NC_001669.1 | 12 | 75 | 192 | 103 | 107 | LFCSE | 400 | 560 | 483 |
| 55 | MasPyV | NC_025895.1 | – | – | – | 101 | 105 | LFCNE | 414 | 576 | 489 |
| 56 | MmelPyV1 | NC_026473.1 | 12 | 80 | 207 | 111 | 115 | LRCDE | 365 | 559 | 585 |
| 57 | MiniopterusPyV | NC_020069.1 | 12 | 75 | 192 | 103 | 107 | LHCHE | 369 | 560 | 576 |
| 58 | MPtV | NC_001505.2 | – | – | – | 103 | 107 | LFCNE | 418 | 573 | 468 |
| 59 | MyPyV | NC_011310.1 | – | – | – | – | – | – | 441 | 603 | 489 |
| 60 | PtrovPyV8 | NC_028635.1 | 12 | 75 | 192 | 105 | 109 | LFCHE | 402 | 562 | 483 |
| 61 | PteronotusPyV | NC_020070.1 | 12 | 80 | 207 | 108 | 112 | LRCDE | 405 | 564 | 480 |
| 62 | BatPyV2b | NC_028121.1 | 12 | 80 | 207 | 108 | 112 | LRCDE | 406 | 617 | 636 |
| 63 | RatPyV2 | NC_032005.1 | 12 | 79 | 204 | 178 | 182 | LHCDE | 474 | 634 | 483 |
| 64 | SsciPyV1 | NC_038559.1 | – | – | – | 101 | 105 | LFCHE | 410 | 572 | 489 |
| 65 | SquiPyV | NC_009951.1 | – | – | – | 101 | 105 | LFCHE | 411 | 570 | 480 |
| 66 | AlPyV | NC_034251.1 | 12 | 67 | 168 | 107 | 111 | LYCNE | 407 | 567 | 483 |
| 67 | WUPyV | NC_009539.1 | 12 | 89 | 234 | 108 | 112 | LRCNE | 417 | 579 | 489 |
| 68 | YbPyV2 | AB767295.2 | 12 | 75 | 192 | 105 | 109 | LFCHE | 402 | 562 | 483 |
| 69 | HPyV6 | NC_014406.1 | – | – | – | 109 | 113 | LYCDE | 393 | 571 | 537 |
| 70 | HPyV7 | NC_014407.1 | – | – | – | 109 | 113 | LYCTE | 416 | 576 | 483 |
| 71 | MWPyV | NC_018102.1 | – | – | – | 105 | 109 | LSCNE | 421 | 580 | 480 |
| 72 | STLPyV | NC_020106.1 | 12 | 83 | 216 | 105 | 109 | LTCNE | 406 | 566 | 483 |
| 73 | ADPyV | NC_026141.2 | 8 | 61 | 162 | 69 | 73 | LYCEE | 408 | 582 | 525 |
| 74 | BFDV | NC_004764.2 | 6 | 82 | 231 | – | – | – | 372 | 532 | 483 |
| 75 | Butcherbird PyV | NC_023008.1 | 8 | 67 | 180 | 70 | 74 | LFCDE | 410 | 572 | 489 |
| 76 | CaPyV | NC_017085.1 | 8 | 61 | 162 | 67 | 71 | LSCNE | 390 | 550 | 483 |
| 77 | CpyV | NC_007922.1 | 11 | 80 | 210 | 69 | 73 | LQCEE | 405 | 569 | 495 |
| 78 | EgouPyV1 | NC_039052.1 | 8 | 75 | 204 | 70 | 74 | LYCEE | 374 | 572 | 597 |
| 79 | FPyV | NC_007923.1 | 6 | 70 | 195 | 60 | 64 | LFCDE | 382 | 543 | 486 |
| 80 | GHPV | NC_004800.1 | 8 | 81 | 222 | 65 | 69 | LFCDE | 404 | 599 | 588 |
| 81 | HunFPyV | NC_039053.1 | 6 | 77 | 216 | 60 | 64 | LFCDE | 382 | 543 | 486 |
| 82 | BassPyV1 | NC_025790.1 | – | – | – | 105 | 109 | LMCGE | 338 | 495 | 474 |
| 83 | BPyV | NC_001442.1 | 10 | 73 | 192 | 93 | 97 | LHCDE | 391 | 586 | 588 |
| 84 | DPyV | NC_025899.1 | 11 | 77 | 201 | 82 | 86 | LYCDE | 357 | 536 | 540 |
| 85 | GfPyV1 | NC_026244.1 | – | – | – | – | – | – | 348 | 517 | 510 |
| 86 | SspPyV | NC_026944.1 | – | – | – | – | – | – | 372 | 529 | 474 |
ScanProsite results, together with ProRule-based predicted intra-domain features, were used to identify the functional domains retained in the LT-Ag of PyVs. LXCXE motifs and their encoding sequences were extracted using Java programming
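The motif extraction described in this footnote can be illustrated with a short pattern search. The following is a minimal sketch in Python (not the Java code the footnote mentions, which is not given here); the function name `find_lxcxe`, the toy protein, and the toy coding sequence are hypothetical, and positions are assumed to be 1-based and inclusive, as the Start/End columns of the table appear to be.

```python
import re

# LXCXE: Leu, any residue, Cys, any residue, Glu (one-letter codes).
# The lookahead lets overlapping occurrences be reported as well.
LXCXE = re.compile(r"(?=(L.C.E))")

def find_lxcxe(protein, cds=None):
    """Return (start, end, motif, codons) for every LXCXE match.

    start/end are 1-based inclusive amino-acid positions. If the coding
    sequence (cds) is supplied, the 15 nt encoding the motif are returned
    as well; otherwise codons is None.
    """
    hits = []
    for m in LXCXE.finditer(protein.upper()):
        i = m.start()                      # 0-based index of the Leu
        codons = cds[3 * i:3 * (i + 5)] if cds else None
        hits.append((i + 1, i + 5, m.group(1), codons))
    return hits

# Toy example (hypothetical sequence, not taken from any PyV genome).
toy_protein = "MKLFCDEA"
toy_cds = "ATGAAACTGTTTTGTGATGAAGCT"
hits = find_lxcxe(toy_protein, toy_cds)
```

A sequence with no match simply yields an empty list, which corresponds to the "–" entries in the LXCXE columns.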
Fig. 1 Phylogenetic trees of PyV LT-Ag genes. PyVs were classified according to host species (mammal, avian, and fish) in the ML-based trees constructed using nucleotide sequences of LT-Ag coding genes, DnaJ domains, and helicase domains. Genera are marked as Alphapolyomaviruses, Betapolyomaviruses, Deltapolyomaviruses, Gammapolyomaviruses, or unassigned
Nucleotide compositions of the LT-Ag genes of 86 polyomaviruses
CAIH: result of comparison with Homo sapiens as reference set
Dashed line: avian polyomaviruses; solid line: fish polyomaviruses
Fig. 2 Compositional features of nucleotide sequences of LT-Ag coding genes, DnaJ domains, and helicase domains. a Nucleotide distribution of A, C, U, and G. b Distribution frequency calculated only for the third codon base. c GC and AT content at all codon positions (GC% and AT%) and at the third position (GC3s and AT3s)
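The quantities in panels a to c are simple counts over a coding sequence. The sketch below is an illustration only: the function names are hypothetical, and it computes GC content over all third codon positions, whereas GC3s in the strict sense is usually restricted to synonymous third positions (excluding Met, Trp, and stop codons).

```python
from collections import Counter

def base_frequencies(cds):
    """Fraction of A, C, G and T in a coding sequence (U is read as T)."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq)
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}

def gc_content(cds, position=None):
    """GC fraction over all sites, or over one codon position (1, 2 or 3)."""
    seq = cds.upper().replace("U", "T")
    if position is not None:
        seq = seq[position - 1::3]   # every third base, starting at that position
    return (seq.count("G") + seq.count("C")) / len(seq)

# Toy example (hypothetical sequence): codons ATG GCC TTA.
toy = "ATGGCCTTA"
freqs = base_frequencies(toy)
gc_all = gc_content(toy)
gc3 = gc_content(toy, position=3)
```

AT content at the same sites is one minus the corresponding GC value, provided the sequence contains only unambiguous bases.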
Fig. 3 The range of ENC values of the LT-Ag genes and two functional domains. The cross (×) indicates the mean ENC value, and the dots (•) indicate the minimum and maximum ENC values of the LT-Ag genes and the two domains within LT-Ag. The host-based groups comprised 9 (Group A), 3 (Group F), 13 (Group H), 36 (Group M), and 25 (Group P) nucleotide sequences of LT-Ag genes and helicase domains. DnaJ domains were not identified in 32 protein sequences, including the 3 fish PyVs; thus, a total of 54 sequences were used for the DnaJ analysis
Nucleotide diversity, selection pressure, and neutrality tests of the LT-Ag genes and two domains of the PyV groups
| | | Genetic variability | | | | | | Neutrality tests | | | Selection pressure |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Region | Group | m | n | S | η | k | π | Tajima’s D | Fu and Li’s D | Fu and Li’s F | dN/dS |
| LT-Ag | All | 86 | 944 | 837 | 2129 | 418.245 | 0.44306 | −0.04390ns | 1.45113ns | 0.96702ns | 2.163 |
| | Group A | 9 | 1725 | 1283 | 2383 | 737.889 | 0.42776 | −0.82814ns | 0.0858ns | −0.15345ns | 0.282 |
| | Group F | 3 | 1657 | 1209 | 1522 | 910.333 | 0.54939 | NA | NA | NA | 0.684 |
| | Group H | 13 | 1648 | 1336 | 2725 | 725.192 | 0.44004 | −0.80590ns | 0.16114ns | −0.11521ns | 1.673 |
| | Group M | 36 | 1404 | 1205 | 2813 | 615.989 | 0.43874 | −0.35097ns | 0.89680ns | 0.54139ns | 0.523 |
| | Group P | 25 | 1602 | 1268 | 2653 | 666.147 | 0.41582 | −0.20916ns | 0.88010ns | 0.62234ns | 0.318 |
| DnaJ domain | All | 54 | 160 | 144 | 352 | 71.204 | 0.44503 | −0.28170ns | 1.14715ns | 0.71186ns | 0.261 |
| | Group A | 9 | 162 | 119 | 214 | 68.083 | 0.42027 | −0.70347ns | 0.14282ns | −0.07065ns | 0.298 |
| | Group H | 7 | 192 | 146 | 237 | 82.143 | 0.42783 | −0.88626ns | −0.18339ns | −0.37879ns | 0.417 |
| | Group M | 19 | 162 | 136 | 277 | 63.474 | 0.39181 | −0.83778ns | 0.31536ns | −0.03513ns | 0.289 |
| | Group P | 19 | 192 | 153 | 291 | 78.585 | 0.4093 | −0.23632ns | 0.71490ns | 0.50101ns | 0.262 |
| Helicase domain | All | 86 | 424 | 348 | 827 | 165.867 | 0.3912 | 0.02756ns | 1.22733ns | 0.85387ns | 0.316 |
| | Group A | 9 | 471 | 288 | 499 | 159.361 | 0.33835 | −0.68870ns | 0.11803ns | −0.08782ns | 0.150 |
| | Group F | 3 | 453 | 285 | 345 | 210 | 0.46358 | NA | NA | NA | 0.379 |
| | Group H | 13 | 477 | 326 | 632 | 174.667 | 0.36618 | −0.65740ns | 0.21440ns | −0.02451ns | 0.260 |
| | Group M | 36 | 447 | 346 | 738 | 170.876 | 0.38227 | −0.15171ns | 0.86494ns | 0.60171ns | 0.503 |
| | Group P | 25 | 471 | 317 | 619 | 161.56 | 0.34301 | −0.05815ns | 0.97206ns | 0.75361ns | 0.142 |
m, number of sequences used; n, total number of sites (excluding sites with gaps/missing data); S, number of segregating sites; η, total number of mutations; k, average number of pairwise nucleotide differences; π, nucleotide diversity; dS, average number of synonymous substitutions per site; dN, average number of non-synonymous substitutions per site; NA, not available due to limited sequences for analysis of the gene-specific sequence dataset; ns, not significant
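The definitions of n, k, and π above fit together as π = k / n; for the full LT-Ag set, 418.245 / 944 ≈ 0.443, which matches the tabulated π. The sketch below shows that relationship on a toy alignment. It is an illustration under stated assumptions, not the software used in the study: the function name `diversity` and the toy sequences are hypothetical, and input sequences are assumed to be aligned, equal in length, and uppercase.

```python
from itertools import combinations

def diversity(aligned):
    """Return (n, k, pi) for a list of aligned, equal-length sequences.

    n  : number of sites, excluding columns with gaps or ambiguous bases
    k  : average number of pairwise nucleotide differences
    pi : nucleotide diversity, taken here as k / n
    """
    # Keep only columns in which every sequence has an unambiguous base.
    columns = [col for col in zip(*aligned) if all(b in "ACGT" for b in col)]
    n = len(columns)
    seqs = ["".join(col[i] for col in columns) for i in range(len(aligned))]
    pairs = list(combinations(seqs, 2))
    differences = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    k = differences / len(pairs)
    return n, k, k / n

# Toy alignment (hypothetical): one gapped column is dropped, leaving 5 sites.
n, k, pi = diversity(["ACGT-A", "ACGTTA", "ATGTTC"])
```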
Fig. 4 The relationship between ENC and GC3 (NC plot). ENC values were plotted against GC content at the third codon position. The ENC values expected from GC3 alone are shown as a solid line
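The solid line in an NC plot is conventionally the null curve proposed by Wright, ENC = 2 + s + 29 / (s² + (1 − s)²), where s is GC3. Assuming the figure uses that standard curve (the caption does not state the formula), it can be sketched as follows; the function name is hypothetical.

```python
def expected_enc(gc3):
    """Expected ENC when codon usage is shaped by GC3 composition alone.

    Standard null curve for NC plots: 2 + s + 29 / (s^2 + (1 - s)^2),
    with s the GC fraction at third codon positions (0 to 1).
    """
    return 2 + gc3 + 29 / (gc3 ** 2 + (1 - gc3) ** 2)

# Points on the curve for drawing the reference line.
curve = [(s / 100, expected_enc(s / 100)) for s in range(0, 101)]
```

Genes falling well below the curve have stronger codon bias than GC3 composition alone would predict.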
Fig. 5 PR2-bias plot analysis. A3/(A3 + T3) were plotted against G3/(G3 + C3). The A3 content is greater than T3, and the G3 content is greater than C3 in CDS of LT-Ag genes, DnaJ domains, and helicase domains from different host species. These LT-Ag genes and their retained domains prefer to use the T-end and G-end codons
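Each point in a PR2-bias plot is a pair of ratios computed from third-position base counts. The sketch below is a minimal illustration with a hypothetical function name; it uses all third codon positions, whereas some PR2 analyses restrict the counts to fourfold-degenerate codon families, and the caption does not say which variant was applied.

```python
def pr2_point(cds):
    """Return (G3/(G3+C3), A3/(A3+T3)) for one coding sequence.

    The first value is the abscissa and the second the ordinate of the
    PR2 plot; (0.5, 0.5) means no bias between complementary bases.
    Raises ZeroDivisionError if a base pair is absent at third positions.
    """
    third = cds.upper().replace("U", "T")[2::3]
    a, t, g, c = (third.count(base) for base in "ATGC")
    return g / (g + c), a / (a + t)

# Toy examples (hypothetical sequences).
balanced = pr2_point("GCAGCTGCGGCC")   # third bases A, T, G, C
skewed = pr2_point("AAAAAGAAT")        # third bases A, G, T
```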
Fig. 6 Neutrality plot of GC12 vs. GC3. GC12 (ordinate) was plotted against GC3 (abscissa); each point represents one LT-Ag gene from a different host organism. For the LT-Ag genes, GC12 is relatively concentrated, whereas GC3 ranges from 0.171 (Delphinus delphis [short-beaked common dolphin]) to 0.596 (Pygoscelis adeliae [Adélie penguin]). For the two functional domains, GC12 is likewise relatively concentrated, while GC3 is widely dispersed, ranging from 0.175 (Pongo pygmaeus [Bornean orangutan]) to 0.646 (Pygoscelis adeliae [Adélie penguin]) for DnaJ domains and from 0.128 (Delphinus delphis [short-beaked common dolphin]) to 0.606 (Pygoscelis adeliae [Adélie penguin]) for helicase domains
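A neutrality plot is usually summarised by the slope of the regression of GC12 on GC3: a slope near 1 suggests that mutation pressure dominates, while a slope near 0 suggests that selection constrains the first two codon positions. The sketch below shows one way to obtain the points and the slope; the function names and toy values are hypothetical, and the caption does not report a fitted slope.

```python
def gc_at(cds, position):
    """GC fraction at one codon position (1, 2 or 3)."""
    sites = cds.upper()[position - 1::3]
    return (sites.count("G") + sites.count("C")) / len(sites)

def neutrality_point(cds):
    """Return (GC3, GC12), where GC12 is the mean of GC1 and GC2."""
    gc12 = (gc_at(cds, 1) + gc_at(cds, 2)) / 2
    return gc_at(cds, 3), gc12

def regression_slope(points):
    """Ordinary least-squares slope of y on x for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Toy data (hypothetical): GC12 changes little while GC3 varies widely.
point = neutrality_point("ATGGCCTTA")
slope = regression_slope([(0.2, 0.40), (0.4, 0.42), (0.6, 0.44)])
```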
Fig. 7 RSCU analysis of PyVs. Codon preferences in the LT-Ag genes differ among the five groups. The differences among groups are relatively large for the RSCU values of specific codons, such as AGA (Arg) and TTA (Leu)
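RSCU is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, so values above 1 indicate preferred codons. A minimal sketch under the standard genetic code follows; the function name and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds):
    """Return {codon: RSCU} for every sense codon of each amino acid present."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":                      # stop codons are excluded
            families[aa].append(codon)
    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue                       # amino acid absent from this sequence
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values

# Toy example (hypothetical): three AGA and one CGT, all encoding Arg.
toy_rscu = rscu("AGAAGAAGACGT")
```

Within each amino acid the RSCU values sum to the number of synonymous codons, so a sixfold family such as Arg can reach a maximum of 6.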
Fig. 8 Difference in RSCU values of 86 PyVs. Respective RSCU of the 86 LT-Ag coding genes, 54 DnaJ domain coding sequences, and 86 helicase coding sequences. All RSCU values are shown in the chromaticity diagram via chromaticity co-ordinates. The chrominance difference enables visual comparison of large data sets with various host species
RSCU distances of the host pairs calculated from the RSCU values for the abundant codons (RSCU ≥1.6) in the LT-Ag genes and two domains of PyVs
| Region | Host pairs | RSCU distances within host pairs for abundant codons (RSCU ≥ 1.6) | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | TTT | TTA | ATT | TCT | CCT | ACT | GCT | AGA | AGG | Avg. |
| LT-Ag | A–F | 0.082 | 0.165b | 0.406 | 0.676 | 0.134 | 0.244 | 0.111 | 0.521 | 0.780 | 0.346 |
| | A–H | 0.558a | 1.593a | 0.522 | 0.680a | 0.301 | 0.764a | 0.507 | 2.855a | 0.749 | 0.948 |
| | A–M | 0.435 | 1.169 | 0.512 | 0.334 | 0.257 | 0.471 | 0.204 | 2.355 | 0.645 | 0.709 |
| | A–P | 0.460 | 1.335 | 0.572a | 0.601 | 0.385a | 0.707 | 0.279 | 2.731 | 0.785a | 0.873 |
| | F–H | 0.477 | 1.428 | 0.117 | 0.004b | 0.167 | 0.520 | 0.618a | 2.335 | 0.032 | 0.633 |
| | F–M | 0.353 | 1.004 | 0.107 | 0.342 | 0.123 | 0.227 | 0.315 | 1.834 | 0.135 | 0.493 |
| | F–P | 0.378 | 1.170 | 0.166 | 0.074 | 0.251 | 0.463 | 0.389 | 2.210 | 0.005b | 0.567 |
| | H–M | 0.123 | 0.424 | 0.010b | 0.346 | 0.044b | 0.293 | 0.303 | 0.501 | 0.104 | 0.239 |
| | H–P | 0.098 | 0.259 | 0.050 | 0.079 | 0.083 | 0.057b | 0.228 | 0.125b | 0.036 | 0.113 |
| | M–P | 0.025b | 0.166 | 0.060 | 0.267 | 0.127 | 0.236 | 0.075b | 0.376 | 0.140 | 0.164 |
| DnaJ | A–H | 1.682a | 1.064 | 0.224b | 1.896a | 0.542 | 1.580 | 0.395 | 3.157a | 0.151 | 1.188 |
| | A–M | 0.511 | 0.662 | 0.561 | 1.566 | 0.669 | 1.668a | 0.651 | 2.874 | 0.012b | 1.019 |
| | A–P | 1.476 | 1.230a | 0.788a | 1.664 | 1.304a | 1.212 | 0.786a | 2.447 | 0.449 | 1.262 |
| | H–M | 1.171 | 0.402 | 0.338 | 0.330 | 0.127b | 0.088b | 0.256 | 0.283b | 0.139 | 0.348 |
| | H–P | 0.207b | 0.166b | 0.564 | 0.232 | 0.762 | 0.368 | 0.391 | 0.711 | 0.600a | 0.444 |
| | M–P | 0.965 | 0.568 | 0.226 | 0.098b | 0.635 | 0.456 | 0.135b | 0.427 | 0.461 | 0.441 |
| Helicase | A–F | 0.142 | 0.494 | 0.489 | 0.633 | 0.653 | 0.003b | 0.168 | 0.128 | 0.716 | 0.381 |
| | A–H | 0.416 | 1.739a | 0.492 | 0.435 | 0.106b | 0.946 | 0.735a | 2.628a | 1.044a | 0.949 |
| | A–M | 0.386 | 1.074 | 0.647a | 0.059 | 0.184 | 0.837 | 0.053b | 2.052 | 0.737 | 0.670 |
| | A–P | 0.363 | 1.349 | 0.594 | 0.049 | 0.445 | 1.013 | 0.599 | 2.502 | 0.949 | 0.873 |
| | F–H | 0.558a | 1.245 | 0.002b | 1.069a | 0.546 | 0.949 | 0.568 | 2.500 | 0.328 | 0.863 |
| | F–M | 0.528 | 0.580 | 0.158 | 0.574 | 0.836 | 0.840 | 0.115 | 1.924 | 0.021b | 0.620 |
| | F–P | 0.505 | 0.854 | 0.104 | 0.585 | 1.098a | 1.016a | 0.431 | 2.374 | 0.232 | 0.800 |
| | H–M | 0.029 | 0.665 | 0.155 | 0.495 | 0.290 | 0.110 | 0.682 | 0.576 | 0.307 | 0.368 |
| | H–P | 0.053 | 0.390 | 0.102 | 0.484 | 0.552 | 0.066 | 0.136 | 0.126b | 0.096 | 0.223 |
| | M–P | 0.024b | 0.274b | 0.053 | 0.011b | 0.262 | 0.176 | 0.546 | 0.450 | 0.212 | 0.223 |
A–F, avian–fish; A–H, avian–human; A–M, avian–non-primate mammals; A–P, avian–non-human primate; F–H, fish–human; F–M, fish–non-primate mammals; F–P, fish–non-human primate; H–M, human–non-primate mammals; H–P, human–non-human primate; M–P, non-primate mammals–non-human primate. a, largest RSCU distance among the host pairs for the corresponding codon; b, smallest RSCU distance among the host pairs for the corresponding codon
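The table does not define the distance explicitly, but its values are consistent with an absolute difference between group-mean RSCU values for each codon (for TTA in LT-Ag, A–F 0.165 plus F–H 1.428 equals A–H 1.593), with "Avg." being the mean over the nine codons. The sketch below works under that assumption; the function name and the group means are hypothetical placeholders, not values from the study.

```python
from itertools import combinations

def rscu_distances(group_means, codons):
    """Per-codon and average RSCU distance for every pair of host groups.

    group_means maps a group label to {codon: mean RSCU}. The distance is
    assumed to be the absolute difference between the two group means.
    """
    result = {}
    for g1, g2 in combinations(sorted(group_means), 2):
        per_codon = {c: abs(group_means[g1][c] - group_means[g2][c]) for c in codons}
        average = sum(per_codon.values()) / len(codons)
        result[f"{g1}-{g2}"] = (per_codon, average)
    return result

# Hypothetical group means for two codons, for illustration only.
means = {
    "A": {"TTT": 1.0, "AGA": 1.5},
    "H": {"TTT": 1.5, "AGA": 4.0},
}
distances = rscu_distances(means, ["TTT", "AGA"])
```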
Fig. 9 Mean RSCU distances of the host pairs calculated from the RSCU values for the abundant codons (RSCU ≥ 1.6) in the LT-Ag genes and two domains of PyVs
Fig. 10 Correspondence analysis results for the RSCU values of strongly preferred codons in 86 PyVs (COA-RSCU). The COA results for over-represented codons (RSCU > 1.6) for the five groups are shown in scatter plots b-f for groups A, F, H, M, and P, respectively. The dot distribution patterns of groups A and F vs. groups H, M, and P were compared (a). Overall, the plotted dots show highly similar distribution patterns in all groups, scattered over the range −0.2 to +0.3 and −0.4 to +0.4. Two dots plotted outside this range were identified as the LT-Ag genes of BFDV and ADPyV, indicating that their codon usage patterns differ from the rest. Both are avian polyomaviruses belonging to group A, and their host organisms are wild birds and Pygoscelis adeliae (Adélie penguin), respectively (a)