L Y Han, C Z Cai, Z L Ji, Y Z Chen.
Abstract
The function of a substantial percentage of the putative protein-coding open reading frames (ORFs) in viral genomes is unknown. Because their sequences are not similar to those of proteins of known function, the function of these ORFs cannot be assigned on the basis of sequence similarity. Methods that complement, or can be combined with, sequence similarity-based approaches are being explored. The web-based software SVMProt (http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi) assigns protein functional family, to some extent, irrespective of sequence similarity and has been found to be useful for studying distantly related proteins [Cai, C.Z., Han, L.Y., Ji, Z.L., Chen, X., Chen, Y.Z., 2003. SVM-Prot: web-based support vector machine software for functional classification of a protein from its primary sequence. Nucleic Acids Res. 31(13): 3692-3697]. Here, 25 novel viral proteins are selected to test the capability of SVMProt for functional family assignment of viral proteins whose function cannot at present be confidently predicted by sequence similarity methods. These proteins have no sequence homolog in the Swiss-Prot database, have their precise function described in the literature, and are not included in the training sets of SVMProt. The predicted functional classes of 72% of these proteins match the literature-described function, compared with the overall accuracy of 87% for SVMProt functional class assignment of 34582 proteins. This suggests that SVMProt is, to some extent, capable of functional class assignment irrespective of sequence similarity and is potentially useful for facilitating functional study of novel viral proteins.
Year: 2005 PMID: 15582660 PMCID: PMC7111859 DOI: 10.1016/j.virol.2004.10.020
Source DB: PubMed Journal: Virology ISSN: 0042-6822 Impact factor: 3.616
Novel viral proteins, literature-described functional indications as suggested from experiment and/or sequence analysis, and SVMProt predicted functions
| Protein (Swiss-Prot or NCBI accession number) | Virus | Literature-described function | Function characterized by SVMProt (probability of correct characterization) | Prediction status |
|---|---|---|---|---|
| ADOMetase (P07693) | Bacteriophage T3 | Adenosylmethionine hydrolase (EC 3.3.1.2) | EC 3.3: hydrolase of ether bonds (99.0%); EC 2.7: transferase of phosphorus-containing groups (71.3%); DNA-binding proteins (65.4%) | M |
| AGT (P04519) | Enterobacteria phage T4 | DNA alpha-glucosyltransferase (EC 2.4.1.26) | EC 2.4: glycosyltransferase (80.4%); EC 2.7: transferase of phosphorus-containing groups (68.5%) | M |
| BGT (P04547) | Enterobacteria phage T4 | DNA beta-glucosyltransferase (EC 2.4.1.27) | EC 2.4: glycosyltransferase (95.7%); EC 2.5: transferase of alkyl or aryl groups, other than methyl groups (80.4%) | M |
| DNA-directed RNA polymerase (P42488) | African swine fever virus (strain BA71V) | DNA-directed RNA polymerase, subunit 10 homolog (EC 2.7.7.6) | EC 2.7: transferase of phosphorus-containing groups (99.0%) | M |
| DNK (P04531) | Enterobacteria phage T4 | dNMP kinase (EC 2.7.4.13) | EC 2.7: transferase of phosphorus-containing groups (99.0%); EC 2.4: glycosyltransferase (96.4%); EC 1.1: oxidoreductase of the CH–OH group of donors (71.3%) | M |
| Endonuclease II (P07059) | Enterobacteria phage T4 | Endonuclease II (EC 3.1.21.1) | EC 3.1: hydrolase of ester bonds (99.0%) | M |
| Endonuclease IV (P39250) | Enterobacteria phage T4 | Endonuclease IV (EC 3.1.21.-) | No function predicted | NM |
| Endonuclease V (P04418) | Enterobacteria phage T4 | Endonuclease V (EC 3.1.25.1) | EC 3.1: hydrolase of ester bonds (99.0%) | M |
| Exonuclease (P03697) | Bacteriophage lambda | Exonuclease (EC 3.1.11.3) | EC 3.1: hydrolase of ester bonds (99.0%); EC 4.1: carbon–carbon lyases (88.1%); EC 2.7: transferase of phosphorus-containing groups (68.5%); EC 1.1: oxidoreductase of the CH–OH group of donors (58.6%) | M |
| FALPE (Q65010) | Amsacta moorei Entomopoxvirus | Associated with unique cytoplasmic structures, filament-associated protein | No function predicted | NM |
| Gp61.9 (P13312) | Enterobacteria phage T4 | Ribonuclease (EC 3.1.-.-) | EC 3.1: hydrolase of ester bonds (99.0%) | M |
| IRF protein (P13299) | Enterobacteria phage T4 | Intron-associated endonuclease 1 (EC 3.1.-.-) | EC 3.1: hydrolase of ester bonds (99.0%); DNA-binding protein (83.9%) | M |
| I-TevII (P07072) | Enterobacteria phage T4 | Intron-associated endonuclease 2 (EC 3.1.-.-) | EC 3.1: hydrolase of ester bonds (99.0%) | M |
| MotA protein (P22915) | Bacteriophage T4 | DNA-binding, transcription regulation | DNA-binding proteins (99.0%); EC 3.1: hydrolase acting on ester bonds (68.5%) | M |
| ORF13 (P51715) | Haemophilus phage HP1 | Putative adenine-specific methylase (EC 2.1.1.72) | EC 2.1: transferase of one-carbon groups (99.0%); outer membrane (58.6%); mRNA-binding protein (58.6%) | M |
| Outer capsid protein VP4 (P35746) | Bovine rotavirus (serotype 10/strain B223) | Surface outer capsid protein | Coat protein (99.0%) | M |
| Possible CC chemokine (NP_042976) | Human herpesvirus 6 | Chemokine-like | No function predicted | NM |
| Protein kinase (P00513) | Enterobacteria phage T7 | Protein kinase (EC 2.7.1.37) | EC 2.7: transferase of phosphorus-containing groups (99.0%) | M |
| Putative BARF0 protein (Q8AZJ4) | Epstein–Barr virus | Membrane-associated; encodes three arginine-rich motifs with RNA-binding properties | EC 4.1.-.-: carbon–carbon lyase (58.6%) | NM |
| R.CviAII (P31117) | Paramecium bursaria Chlorella virus 1 | Endonuclease CviAII (EC 3.1.21.4) | EC 3.1: hydrolase of ester bonds (99.0%) | M |
| R.CviJI (P52283) | Chlorella virus IL3A | Type II restriction enzyme CviJI (EC 3.1.21.4) | EC 3.1: hydrolase of ester bonds (99.0%); rRNA-binding proteins (98.8%); EC 3.4: peptidase (68.5%) | M |
| SeMNPV ORF18 (AAF33548) | Spodoptera exigua nucleopolyhedrovirus | Transferase | No function predicted | NM |
| SPLT13 (NP_258405) | SpLtMNPV virus | A novel envelope protein | No function predicted | NM |
| TRL10 (AAL27474) | Human cytomegalovirus (HCMV) | Structural envelope glycoprotein | Transmembrane (98.2%) | NM |
The SVMProt predicted functions are categorized into one of four classes. The first class is M (matched), in which all of the literature-described functional indications are predicted. The second is PM (partially matched), in which some of the literature-described functional indications are predicted. The third is WC (weakly consistent), in which some of the predicted functions can be considered consistent with the literature-described functional indications on an inconclusive basis. The fourth is NM (not matched), in which none of the literature-described functions is matched by, or consistent with, a predicted function.
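
The status tally behind a match rate of this kind can be reproduced with a short script. The following is a minimal sketch, not the authors' code: the `statuses` list is transcribed by hand from the status column of the table as it appears here (24 rows), and the helper name `tally` is hypothetical. The last lines convert the abstract's reported figure (72% of 25 proteins) into a protein count.

```python
from collections import Counter

# Status column transcribed from the table above, in row order
# (hand-transcribed; an assumption of this sketch, not published data files).
statuses = [
    "M", "M", "M", "M", "M", "M", "NM", "M", "M", "NM", "M", "M",
    "M", "M", "M", "M", "NM", "M", "NM", "M", "M", "NM", "NM", "NM",
]

def tally(status_list):
    """Return {status: (count, percent of listed rows)} for each status class."""
    counts = Counter(status_list)
    total = len(status_list)
    return {s: (n, round(100 * n / total, 1)) for s, n in counts.items()}

counts = tally(statuses)
for status, (n, pct) in sorted(counts.items()):
    print(f"{status}: {n} of {len(statuses)} listed rows ({pct}%)")

# The abstract reports 72% of 25 proteins matched; as a protein count:
matched_reported = round(0.72 * 25)
print(f"Reported match rate corresponds to {matched_reported} of 25 proteins")
```

Computing the percentage from the row count, rather than hard-coding it, keeps the tally consistent if rows are added or corrected.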