Alfredo Benso, Stefano Di Carlo, Hafeez Ur Rehman, Gianfranco Politano, Alessandro Savino, Prashanth Suravajhala.
Abstract
BACKGROUND: Today, large-scale genome sequencing technologies are uncovering an increasing number of new genes and proteins, which remain uncharacterized. Experimental procedures for protein function prediction are low-throughput by nature and thus cannot keep up with the rate at which new proteins are discovered. At the same time, proteins are prominent players in almost all biological processes, so precise knowledge of their functions is essential for understanding the underlying biological mechanisms. The challenge of annotating uncharacterized proteins, in functional genomics and in biology in general, motivates the use of well-orchestrated computational techniques to predict their functions accurately.
Year: 2013 PMID: 24564915 PMCID: PMC3909112 DOI: 10.1186/1477-5956-11-S1-S1
Source DB: PubMed Journal: Proteome Sci ISSN: 1477-5956 Impact factor: 2.480
Figure 1. An example of .
An example of Baker's yeast hypothetical proteins conserved with different motifs.
| No. | Hypothetical Protein | Motif Patterns and Profiles Conserved |
|---|---|---|
| 1 | YIL169C | Chemotaxis Transduce 2; T SNARE |
| 2 | Truncated TYB | INTEGRASE; ASP PROTEASE |
Figure 2. High-level view of the information flow of the proposed protein annotation pipeline.
Figure 3. An example of context similarity score based on Gene Ontology for .
Figure 4. Schematic view of predicted annotations, true annotations, and full annotation set, along with the concept of unpredictable FN.
Precision, Recall, Accuracy, and F1 for S. cerevisiae and Homo sapiens datasets under different PISth and high GOSS values
| PISth | GOSS | Dataset | Precision % | Recall % | Accuracy % | F1 % |
|---|---|---|---|---|---|---|
| 0.0 | 0.99 | Homo sapiens | 29.05 | 88.30 | 77.50 | 43.72 |
| 0.0 | 0.99 | S. cerevisiae | 5.78 | 92.92 | 66.16 | 10.87 |
| 0.25 | 0.99 | Homo sapiens | 83.34 | 79.71 | 82.65 | 81.48 |
| 0.25 | 0.99 | S. cerevisiae | 75.30 | 86.56 | 81.16 | 79.21 |
| 0.50 | 0.99 | Homo sapiens | 85.38 | 79.11 | 81.95 | 82.12 |
| 0.50 | 0.99 | S. cerevisiae | 79.73 | 81.76 | 80.06 | 80.73 |
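The F1 column can be related to the Precision and Recall columns with a short calculation. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the record does not state the formula, so this is an assumption, and the function name `f1_score` is purely illustrative. Under that assumption, the Homo sapiens rows of the table above agree with the reported F1 values to within 0.01.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale for both, here percent).

    Assumption: the standard F1 definition; the source does not spell it out.
    """
    if precision + recall == 0:
        return 0.0  # common convention when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# (PISth, precision %, recall %, reported F1 %) for the Homo sapiens rows above.
homo_sapiens_rows = [
    (0.0, 29.05, 88.30, 43.72),
    (0.25, 83.34, 79.71, 81.48),
    (0.50, 85.38, 79.11, 82.12),
]

for pis_th, precision, recall, reported in homo_sapiens_rows:
    computed = f1_score(precision, recall)
    print(f"PISth={pis_th:.2f}: computed F1={computed:.2f}, reported F1={reported:.2f}")
```

Because F1 is a harmonic mean, it stays close to the smaller of the two inputs, which is why the low precision at PISth = 0.0 keeps F1 low even though recall is high.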
Precision, Recall, Accuracy, and F1 for S. cerevisiae and Homo sapiens datasets without UFN terms
| PISth | GOSS | Dataset | Precision % | Recall % | Accuracy % | F1 % |
|---|---|---|---|---|---|---|
| 0.0 | 0.99 | Homo sapiens | 29.05 | 94.375 | 78.05 | 44.42 |
| 0.0 | 0.99 | S. cerevisiae | 5.78 | 95.85 | 66.20 | 10.89 |
| 0.25 | 0.99 | Homo sapiens | 83.34 | 90.84 | 87.80 | 86.93 |
| 0.25 | 0.99 | S. cerevisiae | 75.30 | 90.81 | 84.04 | 82.33 |
| 0.50 | 0.99 | Homo sapiens | 85.38 | 90.35 | 87.66 | 87.79 |
| 0.50 | 0.99 | S. cerevisiae | 79.73 | 89.44 | 83.73 | 84.31 |
Comparison of results with and without GO based relationships
| Dataset | PISth | GO relationships | Precision % | Recall % | Accuracy % | F1 % |
|---|---|---|---|---|---|---|
| Homo sapiens | 0.0 | with GO | 29.05 | 94.375 | 78.05 | 44.42 |
| Homo sapiens | 0.0 | w/o GO | 11.49 | 99.39 | 13.79 | 20.60 |
| S. cerevisiae | 0.0 | with GO | 5.78 | 95.85 | 66.20 | 10.89 |
| S. cerevisiae | 0.0 | w/o GO | 2.30 | 97.47 | 2.89 | 4.51 |
| Homo sapiens | 0.25 | with GO | 83.34 | 90.84 | 87.80 | 86.93 |
| Homo sapiens | 0.25 | w/o GO | 60.24 | 97.65 | 63.51 | 74.51 |
| S. cerevisiae | 0.25 | with GO | 75.30 | 90.81 | 84.04 | 82.33 |
| S. cerevisiae | 0.25 | w/o GO | 52.58 | 96.17 | 56.12 | 67.99 |
| Homo sapiens | 0.50 | with GO | 85.38 | 90.35 | 87.66 | 87.79 |
| Homo sapiens | 0.50 | w/o GO | 65.22 | 97.22 | 67.38 | 78.06 |
| S. cerevisiae | 0.50 | with GO | 79.73 | 89.44 | 83.73 | 84.31 |
| S. cerevisiae | 0.50 | w/o GO | 64.17 | 94.91 | 65.84 | 76.57 |
Figure 5. False Positive Rate trend for both Homo sapiens and Saccharomyces cerevisiae datasets.
Figure 6. Comparison of Precision, Recall, and F1 of our method (black) with S. Jaeger's [32] method (purple).
Figure 7. Comparison of Precision, Recall, Accuracy, and F1 of our method (black) with Narai's [15] method (purple).