Chuming Chen (1), Hongzhan Huang (1), Raja Mazumder (2), Darren A Natale (3), Peter B McGarvey (3), Jian Zhang (3), Shawn W Polson (1), Yuqi Wang (1), Cathy H Wu (4).
1. Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA.
2. Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA.
3. Protein Information Resource, Georgetown University Medical Center, Washington, DC 20007, USA.
4. Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711, USA; Protein Information Resource, Georgetown University Medical Center, Washington, DC 20007, USA.
Abstract
MOTIVATION: The enormous number of redundant sequenced genomes has hindered efforts to analyze and functionally annotate proteins. As the taxonomy of viruses is not uniformly defined, viral proteomes pose special challenges in this regard. Grouping viruses based on the similarity of their proteins at proteome scale can normalize against potential taxonomic nomenclature anomalies.
RESULTS: We present Viral Reference Proteomes (Viral RPs), which are computed from complete virus proteomes within UniProtKB. Viral RPs are provided at co-membership thresholds of 95, 75, 55, 35 and 15% in proteome-similarity-based clusters. Comparison of our computational Viral RPs with UniProt's curator-selected Reference Proteomes indicates that the two sets are consistent and complementary. Furthermore, each Viral RP represents a cluster of virus proteomes that is consistent with virus or host taxonomy. We provide BLASTP search and FTP download of Viral RP protein sequences, and a browser to facilitate the visualization of Viral RPs.
AVAILABILITY AND IMPLEMENTATION: http://proteininformationresource.org/rps/viruses/
CONTACT: chenc@udel.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
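The abstract does not spell out how co-membership is computed, so the following is only a minimal sketch under stated assumptions, not the authors' pipeline. It assumes each proteome is summarized as the set of protein-cluster identifiers its proteins fall into, scores a pair of proteomes by the percentage of the smaller proteome's clusters that the two share, and groups proteomes by single linkage at a chosen threshold. The names `co_membership`, `group_proteomes`, and the toy data are hypothetical.

```python
from itertools import combinations


def co_membership(clusters_a, clusters_b):
    """Percent of the smaller proteome's clusters shared with the other.

    Assumption: this scoring rule is illustrative; the published method
    may define co-membership differently.
    """
    if not clusters_a or not clusters_b:
        return 0.0
    shared = len(clusters_a & clusters_b)
    return 100.0 * shared / min(len(clusters_a), len(clusters_b))


def group_proteomes(proteome_clusters, threshold):
    """Single-linkage grouping of proteomes whose score is >= threshold."""
    parent = {name: name for name in proteome_clusters}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(proteome_clusters, 2):
        score = co_membership(proteome_clusters[a], proteome_clusters[b])
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in proteome_clusters:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy input: proteome name -> set of cluster identifiers.
toy = {
    "virusA": {"c1", "c2", "c3", "c4"},
    "virusB": {"c1", "c2", "c3", "c5"},
    "virusC": {"c1", "c9"},
}

for cutoff in (95, 75, 35):
    print(cutoff, group_proteomes(toy, cutoff))
```

Under these assumptions, lowering the threshold merges more proteomes into each group, which is why the lower cutoffs would be expected to yield fewer, broader representative sets than the 95% cutoff.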