Chengxin Zhang, Lydie Lane, Gilbert S Omenn, Yang Zhang.
Abstract
In 2018, we reported a hybrid pipeline that predicts protein structures with I-TASSER and function with COFACTOR. I-TASSER/COFACTOR achieved high Gene Ontology (GO) prediction accuracies of Fmax = 0.69 and 0.57 for molecular function (MF) and biological process (BP), respectively, on 100 comprehensively annotated proteins. Now we report blinded analyses of newly annotated proteins in the Critical Assessment of Function Annotation 3 (CAFA3) function prediction challenge and in neXtProt. For CAFA3 results released in May 2019, our predictions on 267 and 912 human proteins with newly annotated MF and BP terms achieved Fmax = 0.50 and 0.42, respectively, on "No Knowledge" proteins, and 0.51 and 0.74, respectively, on "Limited Knowledge" proteins. While COFACTOR consistently outperforms simple homology-based analysis, its accuracy still depends on template availability. Meanwhile, in neXtProt 2019-01, 25 proteins acquired new function annotation through literature curation at UniProt/Swiss-Prot. Before the release of these curated results, we submitted to neXtProt blinded predictions of free-text function annotation based on predicted GO terms. For 10 of the 25, a good match of free-text or GO term annotation was obtained. These blind tests represent rigorous assessments of I-TASSER/COFACTOR. neXtProt now provides links to precomputed I-TASSER/COFACTOR predictions for proteins without function annotation to facilitate experimental planning on "dark proteins".
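The Fmax values quoted above are maximum F-measures taken over prediction-confidence thresholds. As a minimal sketch, assuming the protein-centric definition commonly used in CAFA evaluations (precision averaged over proteins with at least one prediction at the threshold, recall averaged over all benchmark proteins), the score could be computed as follows. The function name `fmax`, the input layout, and the toy GO identifiers are illustrative assumptions, not the authors' evaluation code, and the sketch assumes GO terms have already been propagated to their ancestor terms.

```python
def fmax(predictions, truth, thresholds=None):
    """Protein-centric maximum F-measure (hypothetical helper).

    predictions: {protein_id: {go_term: confidence score in [0, 1]}}
    truth:       {protein_id: non-empty set of annotated GO terms}
    """
    if thresholds is None:
        # Scan thresholds 0.01, 0.02, ..., 1.00 (an assumed step size).
        thresholds = [i / 100 for i in range(1, 101)]

    best = 0.0
    for t in thresholds:
        precisions = []   # one entry per protein with >= 1 prediction at t
        recall_sum = 0.0  # summed over every benchmark protein
        for protein, true_terms in truth.items():
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)

        if not precisions:
            continue  # no protein has a prediction at this threshold
        pr = sum(precisions) / len(precisions)
        rc = recall_sum / len(truth)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Toy data with made-up GO identifiers, for illustration only.
toy_predictions = {
    "P1": {"GO:A": 0.9, "GO:B": 0.4},
    "P2": {"GO:C": 0.8},
}
toy_truth = {
    "P1": {"GO:A"},
    "P2": {"GO:C", "GO:D"},
}
toy_score = fmax(toy_predictions, toy_truth)
```

Because the score is the maximum over thresholds, a method is credited with its best trade-off between precision and recall instead of being judged at a single arbitrary cutoff.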
Keywords: COFACTOR; CP50 challenge; I-TASSER; critical assessment of function annotation (CAFA) 3; neXtProt; structure-based function annotation; uncharacterized human proteins validated at protein level (uPE1)
Year: 2019 PMID: 31581775 PMCID: PMC6900986 DOI: 10.1021/acs.jproteome.9b00537
Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 4.466