Hanhae Kim, Junha Shin, Eiru Kim, Hyojin Kim, Sohyun Hwang, Jung Eun Shim, Insuk Lee.
Abstract
Saccharomyces cerevisiae, i.e. baker's yeast, is a widely studied model organism in eukaryote genetics because of its simple protocols for genetic manipulation and phenotype profiling. The abundance of publicly available data generated through diverse 'omics' approaches has led to the use of yeast in many systems biology studies, including large-scale gene network modeling to better understand the molecular basis of cellular phenotypes. We have previously developed a genome-scale gene network for yeast, YeastNet v2, which has been used for various genetics and systems biology studies. Here, we present an updated version, YeastNet v3 (available at http://www.inetbio.org/yeastnet/), that significantly improves the prediction of gene-phenotype associations. The extended network in YeastNet v3 covers 5818 genes (∼99% of the coding genome) wired by 362 512 functional links. YeastNet v3 provides a new web interface to run the tools for network-guided hypothesis generation. YeastNet v3 also provides edge information for all data-specific networks (∼2 million functional links) as well as for the integrated network. Users can therefore construct alternative versions of the integrated network by applying their own data integration algorithm to the same data-specific links.
Year: 2013 PMID: 24165882 PMCID: PMC3965021 DOI: 10.1093/nar/gkt981
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
The cofunctional links of nine data types in YeastNet v3
| Network description | Number of proteins (coverage of coding genome) | Number of functional associations |
|---|---|---|
| Co-citation (CC) | 4355 (74%) | 82 427 |
| Co-expression (CX) | 5730 (97%) | 242 504 |
| Domain co-occurrence (DC) | 3679 (62%) | 29 880 |
| Genomic neighbor (GN) | 1863 (32%) | 29 475 |
| Genetic interaction (GT) | 4365 (74%) | 149 498 |
| High-throughput PPI (HT) | 5487 (93%) | 141 347 |
| Literature curated PPI (LC) | 5293 (90%) | 54 421 |
| Phylogenetic profiles (PG) | 2463 (42%) | 54 496 |
| Tertiary structure of protein (TS) | 1101 (19%) | 3510 |
| YeastNet v3 | 5818 (99%) | 362 512 |
PPI: protein–protein interaction.
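The coverage percentages in the table can be checked with a few lines of arithmetic. The sketch below assumes that the denominator is the 5887 validated coding genes mentioned in the Figure 1 caption; the table itself does not state the denominator, and the dictionary keys simply reuse the abbreviations from that caption.

```python
# Protein counts copied from the table above.
PROTEIN_COUNTS = {
    "CC": 4355, "CX": 5730, "DC": 3679, "GN": 1863, "GT": 4365,
    "HT": 5487, "LC": 5293, "PG": 2463, "TS": 1101, "YeastNet v3": 5818,
}

# Assumption: coverage is relative to the 5887 validated coding genes
# cited in the Figure 1 caption.
CODING_GENES = 5887


def coverage_percent(n_proteins, total=CODING_GENES):
    """Return the coverage of the coding genome as a whole-number percentage."""
    return round(100 * n_proteins / total)


coverage = {name: coverage_percent(n) for name, n in PROTEIN_COUNTS.items()}

for name, pct in coverage.items():
    print(f"{name}: {pct}%")
```

Under this assumption, every rounded value agrees with the percentage printed in the table.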
Figure 1. Precision-recall curves for YeastNet v3, YeastNet v2 and incorporated individual functional networks for nine distinct data types (CC, co-citation; CX, co-expression; DC, domain co-occurrence; GN, gene neighbor; GT, genetic interaction; HT, high-throughput protein–protein interaction; LC, literature curated protein–protein interaction; PG, phylogenetic profiles; TS, tertiary structure of protein). Precision was calculated from cofunctional links that were derived from the KEGG pathway annotation database, which were largely independent from the links in the network training set. Recall was measured as the percentage of coverage of the 5887 validated coding genes in the yeast genome. Gene pairs for each functional network were ranked by log likelihood scores from the benchmarking process, as described in the text. Precision and recall were calculated in a cumulative manner in which every consecutive 1000 gene pairs were binned (as indicated by each symbol). The plot shows that YeastNet v3 outperforms all other networks, including YeastNet v2.
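The cumulative binning described in the Figure 1 caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the function name, the input layout, and the treatment of gold-standard negatives (gene pairs whose members are both annotated but share no pathway) are assumptions, and pairs absent from both gold-standard sets are assumed to count toward gene coverage only.

```python
def cumulative_precision_recall(scored_pairs, positives, negatives,
                                n_genes=5887, bin_size=1000):
    """Return (n_pairs, precision, recall_percent) after each bin of ranked pairs.

    scored_pairs: iterable of (gene_a, gene_b, score) tuples.
    positives / negatives: sets of frozenset({gene_a, gene_b}) gold-standard pairs
    (hypothetical inputs; e.g. derived from shared pathway annotation).
    """
    # Rank links from the highest to the lowest score.
    ranked = sorted(scored_pairs, key=lambda p: p[2], reverse=True)
    covered = set()   # genes seen so far, used for recall (genome coverage)
    tp = fp = 0
    points = []
    for i, (a, b, _score) in enumerate(ranked, start=1):
        pair = frozenset((a, b))
        covered.update(pair)
        if pair in positives:
            tp += 1
        elif pair in negatives:
            fp += 1
        # Emit one point per full bin, plus a final point for a partial bin.
        if i % bin_size == 0 or i == len(ranked):
            precision = tp / (tp + fp) if (tp + fp) else float("nan")
            recall = 100 * len(covered) / n_genes
            points.append((i, precision, recall))
    return points
```

Each returned tuple corresponds to one plotted symbol: the number of top-ranked links considered, the precision among gold-standard pairs seen so far, and the percentage of genes those links cover.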
Figure 2. Box-and-whisker plots summarize the predictive power of networks for various phenotype data sets: (a) 100 knockout phenotypes (KO); (b) 586 high-dimensional morphology parameters (HDM); and (c) 88 types of chemical/environmental sensitivities (CES). The predictive power of a network for each phenotype was measured by an ROC curve analysis and summarized as the area under the curve (AUC). AUC scores indicate how well a network recovers the connectivity among genes for a given phenotype, where an AUC of 0.5 indicates a prediction based on chance and an AUC of 1 indicates a perfect prediction. In the given box-and-whisker plots, the boundaries of the box represent the first and third quartiles, the whiskers represent the 10th and 90th percentiles, and the black circles represent individual outliers.
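To make the AUC summary concrete, the sketch below scores each gene by its summed edge weight to the known genes of a phenotype and then computes the ROC AUC from those scores. The scoring rule (a simple guilt-by-association sum) and all function names are assumptions for illustration; the caption does not specify how genes were ranked.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected weighted adjacency map from (gene_a, gene_b, weight)."""
    adj = defaultdict(dict)
    for a, b, w in edges:
        adj[a][b] = w
        adj[b][a] = w
    return adj


def neighbor_scores(adj, genes, phenotype_genes):
    """Score each gene by its summed edge weight to phenotype genes.

    A gene never contributes to its own score, so known phenotype genes
    are effectively scored in a leave-one-out fashion.
    """
    return {
        g: sum(w for n, w in adj.get(g, {}).items()
               if n in phenotype_genes and n != g)
        for g in genes
    }


def roc_auc(scores, positives):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for g, s in scores.items() if g in positives]
    neg = [s for g, s in scores.items() if g not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With this definition, a network whose phenotype genes are tightly interconnected yields an AUC near 1, while a network that carries no information about the phenotype yields an AUC near 0.5, matching the interpretation given in the caption.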
Figure 3. A schematic figure of the three options for network-guided hypothesis generation that are implemented in the YeastNet v3 web server.