Sayoni Das, David Lee, Ian Sillitoe, Natalie L Dawson, Jonathan G Lees, Christine A Orengo.
Abstract
MOTIVATION: Computational approaches that can predict protein functions are essential to bridge the widening function annotation gap, especially since <1.0% of all proteins in UniProtKB have been experimentally characterized. We present a domain-based method for protein function classification and prediction of functional sites that exploits functional sub-classification of CATH superfamilies. The superfamilies are sub-classified into functional families (FunFams) using a hierarchical clustering algorithm supervised by a new classification method, FunFHMMer.
Year: 2015 PMID: 26139634 PMCID: PMC4612221 DOI: 10.1093/bioinformatics/btv398
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Use of specificity-determining positions (SDPs) by FunFHMMer to infer functional coherence of cluster alignments. The coloured circles represent the node sequence clusters, and each colour denotes a unique function. Schematic representations of the parent-node multiple sequence alignment (MSA) and the child-node MSAs are shown along with the phylogenetic tree. Child nodes are separated by a dashed line. Conserved positions in the MSA are shown in red, and the SDPs are shown in green or yellow for different child nodes.
Fig. 2. Function prediction using CATH FunFams. Workflow for making function predictions using CATH Functional Families.
Fig. 3. EC number variation across protein classifications. Percentage of families or superfamilies having a certain number of EC terms for each of the domain-based protein classifications.
Fig. 4. UniProt rollback assessment. Performance of the FunFHMMer protocol on the UniProtKB/Swiss-Prot rollback assessment dataset compared with functional annotations predicted by the DFX protocol, Pfam (native) family and CDD family assignments.
Fig. 5. Network representation of the HUP Superfamily (CATH 3.40.50.620) showing available functional annotations in FunFams. The coloured nodes indicate FunFams annotated with different Enzyme Commission (EC) numbers, and the grey nodes indicate FunFams without any EC annotation, which include non-enzymes.
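The distinction drawn in the Fig. 1 caption, between positions conserved across the whole alignment and positions conserved differently in each child node, can be illustrated with a toy sketch. This is not the FunFHMMer scoring scheme: the function names, the `min_fraction` threshold and the example sequences below are all hypothetical, and a column is simply called an SDP when each child cluster has its own dominant residue and the two differ.

```python
from collections import Counter


def column_consensus(seqs, col, min_fraction=1.0):
    """Return the dominant residue of a column if it reaches min_fraction, else None.

    Gap characters ('-') are ignored when computing the fraction.
    """
    residues = [s[col] for s in seqs if s[col] != "-"]
    if not residues:
        return None
    residue, count = Counter(residues).most_common(1)[0]
    return residue if count / len(residues) >= min_fraction else None


def classify_columns(child_a, child_b, min_fraction=1.0):
    """Label each alignment column as 'conserved', 'sdp' or 'variable'.

    child_a and child_b are lists of equal-length aligned sequences,
    one list per child cluster (hypothetical toy input).
    """
    labels = []
    for col in range(len(child_a[0])):
        a = column_consensus(child_a, col, min_fraction)
        b = column_consensus(child_b, col, min_fraction)
        if a is not None and a == b:
            labels.append("conserved")  # same residue in both children
        elif a is not None and b is not None:
            labels.append("sdp")        # conserved within, different between
        else:
            labels.append("variable")   # no dominant residue in at least one child
    return labels


# Made-up aligned sequences for two child clusters.
child_a = ["MKHLA", "MKHIA", "MKHVA"]
child_b = ["MRDLA", "MRDFA", "MRDYA"]
labels = classify_columns(child_a, child_b)
print(labels)
```

In this toy input the first and last columns are identical in both clusters, the second and third are conserved within each cluster but differ between them, and the fourth has no dominant residue. Lowering `min_fraction` would tolerate some within-cluster variation.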