Petri Törönen, Alan Medlar, Liisa Holm.
Abstract
The unprecedented growth of high-throughput sequencing has led to an ever-widening annotation gap in protein databases. While computational prediction methods are available to make up the shortfall, a majority of public web servers are hindered by practical limitations and poor performance. Here, we introduce PANNZER2 (Protein ANNotation with Z-scoRE), a fast functional annotation web server that provides both Gene Ontology (GO) annotations and free text description predictions. PANNZER2 uses SANSparallel to perform high-performance homology searches, making bulk annotation based on sequence similarity practical. PANNZER2 can output GO annotations from multiple scoring functions, enabling users to see which predictions are robust across predictors. Finally, PANNZER2 predictions scored within the top 10 methods for molecular function and biological process in the CAFA2 NK-full benchmark. The PANNZER2 web server is updated on a monthly schedule and is accessible at http://ekhidna2.biocenter.helsinki.fi/sanspanz/. The source code is available under the GNU Public Licence v3.
Year: 2018 PMID: 29741643 PMCID: PMC6031051 DOI: 10.1093/nar/gky350
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Feature comparison between selected annotation servers. DE prediction stands for free text protein descriptions. The last database update is taken from explicit statements on the annotation servers (at the time of writing, 22/03/18).
| Server | GO prediction | DE prediction | >1000 query sequences | Prob. estimate | Open source | Last database update/update schedule |
|---|---|---|---|---|---|---|
| PANNZER2 | Yes | Yes | Yes | Yes | Yes | Monthly (synchronised with UniProt) |
| ARGOT | Yes | No | Yes | No | No | 11/2016 |
| PFP | Yes | No | No | Yes | No | Unknown |
| FunFam | Yes | No | No | No | Data can be downloaded | Daily |
| INGA | Yes | No | No | Yes | No | 04/2015 |
| eggNOG | Yes | Keyword | Yes | No | Yes | 11/2017 |
| dcGO | Yes | No | Error | Yes | No | 06/2016* |
An asterisk (*) following the last database update indicates that timestamps for database files were used instead. Timestamps are conservative because the data might be older than the timestamp suggests.
Figure 1. Comparison of query throughput. We first show query throughput for combined sequence search and annotation steps for PANNZER2, eggNOG-mapper, ARGOT2 and BLAST2GO. We also show separate speeds for the annotation and BLAST steps for ARGOT2 and BLAST2GO. Notice that PANNZER2 and eggNOG-mapper outperform even the annotation step in ARGOT2 and BLAST2GO.
Comparison of PANNZER2 (abbreviated PANZ2 in the table), using the ARGOT scoring function, with ARGOT2 and eggNOG-mapper. Tests were repeated (a) omitting annotations with IEA, ISS and ND evidence codes and (b) using all GO annotations. Evaluation was repeated using 5000 query subsets of the test data to allow for comparison with ARGOT2. We show results with Fmax and with Smin. Note that higher values of Fmax and lower values of Smin indicate better performance. PANNZER2 consistently outperforms both eggNOG-mapper and ARGOT2.
| | | Comparisons with the whole dataset | | | | Comparisons with subsets of the data | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Truth set | | No IEA, ISS, ND | | All evidence codes | | No IEA, ISS, ND | | | All evidence codes | | |
| Ontology | Metric | PANZ2 | eggNOG | PANZ2 | eggNOG | PANZ2 | eggNOG | ARGOT2 | PANZ2 | eggNOG | ARGOT2 |
| BP | Fmax | 0.699 | 0.615 | 0.786 | 0.640 | 0.700 | 0.613 | 0.608 | 0.784 | 0.629 | 0.682 |
| MF | Fmax | 0.708 | 0.640 | 0.867 | 0.591 | 0.708 | 0.641 | 0.649 | 0.858 | 0.591 | 0.777 |
| CC | Fmax | 0.823 | 0.752 | 0.863 | 0.774 | 0.820 | 0.749 | 0.757 | 0.853 | 0.773 | 0.776 |
| BP | Smin | 31.401 | 45.918 | 27.643 | 45.376 | 30.264 | 45.920 | 38.375 | 27.474 | 46.408 | 42.483 |
| MF | Smin | 9.597 | 12.942 | 6.701 | 15.995 | 9.682 | 12.890 | 11.609 | 7.196 | 16.06 | 11.946 |
| CC | Smin | 9.415 | 14.053 | 7.917 | 14.114 | 9.645 | 14.184 | 13.418 | 8.692 | 14.401 | 15.587 |
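
To make the two metrics in the table concrete, here is a minimal sketch of how Fmax and Smin are commonly defined in CAFA-style evaluations. It is not the authors' evaluation code. Fmax is the best harmonic mean of precision and recall over all score thresholds, and Smin is the smallest distance from the origin in the plane of remaining uncertainty and misinformation, both weighted by the information content of GO terms. The function name `fmax_smin`, the term labels, the information-content values and the thresholds are all hypothetical. The sketch also assumes that every protein has at least one true term and that annotations have already been propagated to ancestor terms, a step that real evaluations perform and this one omits.

```python
from math import sqrt


def fmax_smin(predictions, truth, ic, thresholds):
    """Return (Fmax, Smin) for toy data.

    predictions: {protein: {term: score}}
    truth:       {protein: set of true terms}, each set non-empty (assumption)
    ic:          {term: information content}, hypothetical values
    thresholds:  iterable of score cut-offs to scan
    """
    n = len(truth)
    best_f, best_s = 0.0, float("inf")
    for tau in thresholds:
        prec_sum = rec_sum = ru = mi = 0.0
        covered = 0  # proteins with at least one prediction at this cut-off
        for protein, true_terms in truth.items():
            scores = predictions.get(protein, {})
            predicted = {t for t, s in scores.items() if s >= tau}
            hits = predicted & true_terms
            if predicted:
                prec_sum += len(hits) / len(predicted)
                covered += 1
            rec_sum += len(hits) / len(true_terms)
            # Remaining uncertainty: true terms that were missed.
            ru += sum(ic[t] for t in true_terms - predicted)
            # Misinformation: predicted terms that are not true.
            mi += sum(ic[t] for t in predicted - true_terms)
        if covered:
            pr, rc = prec_sum / covered, rec_sum / n
            if pr + rc > 0:
                best_f = max(best_f, 2 * pr * rc / (pr + rc))
        best_s = min(best_s, sqrt((ru / n) ** 2 + (mi / n) ** 2))
    return best_f, best_s


# Hypothetical toy data: two proteins, three terms.
toy_predictions = {
    "p1": {"term_a": 0.9, "term_b": 0.4},
    "p2": {"term_a": 0.8},
}
toy_truth = {"p1": {"term_a"}, "p2": {"term_a", "term_c"}}
toy_ic = {"term_a": 1.0, "term_b": 2.0, "term_c": 3.0}

f, s = fmax_smin(toy_predictions, toy_truth, toy_ic, [0.1, 0.5, 0.9])
print(f"Fmax = {f:.3f}, Smin = {s:.3f}")
```

Because Fmax is a maximum over thresholds and Smin a minimum, the two optima can occur at different cut-offs, and Smin is not bounded to the unit interval. This is why the Smin rows in the table hold values well above 1 while Fmax stays between 0 and 1.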