Tak Lee, Sunmo Yang, Eiru Kim, Younhee Ko, Sohyun Hwang, Junha Shin, Jung Eun Shim, Hongseok Shim, Hyojin Kim, Chanyoung Kim, Insuk Lee.
Abstract
Arabidopsis thaliana is a reference plant that has been studied intensively for several decades. Recent advances in high-throughput experimental technology have enabled the generation of an unprecedented amount of data from A. thaliana, which has facilitated data-driven approaches to unravel the genetic organization of plant phenotypes. We previously published a description of a genome-scale functional gene network for A. thaliana, AraNet, which was constructed by integrating multiple co-functional gene networks inferred from diverse data types, and we demonstrated the predictive power of this network for complex phenotypes. More recently, we have observed significant growth in the availability of omics data for A. thaliana as well as improvements in data analysis methods that we anticipate will further enhance the integrated database of co-functional networks. Here, we present an updated co-functional gene network for A. thaliana, AraNet v2 (available at http://www.inetbio.org/aranet), which covers approximately 84% of the coding genome. We demonstrate significant improvements in both genome coverage and accuracy. To enhance the usability of the network, we implemented an AraNet v2 web server, which generates functional predictions for A. thaliana and 27 nonmodel plant species using an orthology-based projection of nonmodel plant genes on the A. thaliana gene network.
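The orthology-based projection mentioned in the abstract can be illustrated with a minimal sketch. It is not the AraNet v2 server code: the gene identifiers, the ortholog table, the link weights, and the scoring rule (summed weight of links to the projected seed genes) are all hypothetical and chosen only to show the idea of mapping nonmodel plant genes onto the A. thaliana network before searching it.

```python
from collections import defaultdict

def project_to_arabidopsis(query_genes, orthologs):
    """Map nonmodel-plant genes to A. thaliana genes via an ortholog table."""
    projected = set()
    for gene in query_genes:
        # Genes without a known ortholog simply contribute nothing.
        projected.update(orthologs.get(gene, ()))
    return projected

def rank_candidates(seed_genes, network):
    """Score each non-seed gene by the summed weight of its links to seed genes."""
    scores = defaultdict(float)
    for (a, b), weight in network.items():
        if a in seed_genes and b not in seed_genes:
            scores[b] += weight
        elif b in seed_genes and a not in seed_genes:
            scores[a] += weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data (not taken from AraNet v2).
orthologs = {"maizeA": ["AT1"], "maizeB": ["AT2", "AT3"], "maizeC": []}
network = {
    ("AT1", "AT4"): 2.0,
    ("AT2", "AT4"): 1.5,
    ("AT3", "AT5"): 1.0,
    ("AT4", "AT5"): 3.0,  # ignored: neither gene is a seed
}

seeds = project_to_arabidopsis(["maizeA", "maizeB", "maizeC"], orthologs)
ranking = rank_candidates(seeds, network)
print(ranking)
```

In this toy case the candidate linked to two projected seeds ranks above the candidate linked to one, which is the intuition behind searching the A. thaliana network on behalf of a nonmodel species.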
Year: 2014 PMID: 25355510 PMCID: PMC4383895 DOI: 10.1093/nar/gku1053
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Summary of the 19 data types that support the co-functional links in AraNet v2
| Supporting evidence for co-functional links | Number of genes (% of coding genome) | Number of links |
|---|---|---|
| (AT-CC) co-citation of | 7760 (28.3) | 78 000 |
| (AT-CX) co-expression of | 21 710 (79.2) | 500 595 |
| (AT-DC) domain co-occurrence between | 7634 (27.8) | 25 000 |
| (AT-GN) genomic neighborhood of | 2752 (10.0) | 51 000 |
| (AT-HT) interactions between | 4216 (15.3) | 8000 |
| (AT-LC) interactions between | 2709 (9.9) | 5168 |
| (AT-PG) phylogenetic profile similarity between | 3613 (13.2) | 66 000 |
| (CE-CC) co-citation of | 2630 (9.6) | 64 214 |
| (CE-CX) co-expression of | 3835 (14.0) | 63 000 |
| (DM-CX) co-expression of | 3968 (14.5) | 96 000 |
| (DR-CX) co-expression of | 2182 (8.0) | 46 000 |
| (HS-CX) co-expression of | 4973 (18.1) | 120 000 |
| (HS-HT) interactions between | 3855 (14.1) | 46 000 |
| (HS-LC) interactions between | 6148 (22.4) | 87 000 |
| (SC-CC) co-citation of | 4124 (15.0) | 96 152 |
| (SC-CX) co-expression of | 2964 (10.8) | 71 000 |
| (SC-GT) similarity of genetic interactions between | 2179 (7.9) | 19 000 |
| (SC-HT) interactions between | 2776 (10.1) | 57 000 |
| (SC-LC) interactions between | 4361 (15.9) | 89 866 |
| (AraNet v2) integrated network | 22 894 (83.5) | 895 000 |
Figure 1. Network assessment using a set of validation gene pairs based on SUBA3 (a) and GO-CC (b). The accuracy of the co-functional links of each network was calculated as the percentage of true positives for each bin of 1000 gene pairs. The resulting plot shows that AraNet v2 outperforms AraNet across the entire range of genome coverage. (c) A box-and-whisker plot of network prediction power for 212 GO-CC terms with more than four annotated genes, measured by the area under the ROC curve (AUC). AraNet v2 also outperforms the previous network in predicting GO-CC annotations. (d) The x-axis shows the size of each GO-CC term and the y-axis shows the prediction power for that term, measured by AUC. The two variables show no significant correlation (r = 0.012), indicating that gene set size has no impact on network prediction power.
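The two measures named in the Figure 1 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code: it assumes gene pairs are already sorted by link score, that every pair in a bin can be judged against the gold-standard set, and that AUC is computed by the standard pairwise-comparison definition. The function names and the toy data are hypothetical.

```python
def binned_accuracy(ranked_pairs, gold_positives, bin_size=1000):
    """Percentage of true positives in each consecutive bin of ranked gene pairs."""
    accuracies = []
    for start in range(0, len(ranked_pairs), bin_size):
        chunk = ranked_pairs[start:start + bin_size]
        # frozenset makes the lookup independent of gene order within a pair.
        hits = sum(1 for pair in chunk if frozenset(pair) in gold_positives)
        accuracies.append(100.0 * hits / len(chunk))
    return accuracies

def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: four ranked pairs, two of which are gold-standard positives.
ranked_pairs = [("g1", "g2"), ("g3", "g4"), ("g5", "g6"), ("g7", "g8")]
gold = {frozenset(("g2", "g1")), frozenset(("g5", "g6"))}
accuracy_per_bin = binned_accuracy(ranked_pairs, gold, bin_size=2)

# Hypothetical gene scores for one annotation term (1 = annotated, 0 = not).
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(accuracy_per_bin, auc)
```

The pairwise AUC loop is quadratic in the number of genes, which is fine for a sketch; a rank-based formula would be the usual choice at genome scale.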
Figure 2. A schematic of the two query processes for network-assisted hypothesis generation implemented in AraNet v2.
Figure 3. (a) A box-and-whisker plot of AraNet v2 prediction power (measured by AUC) for maize UniProt GO-BP annotations, compared with that of a randomized model. (b) A box-and-whisker plot summarizing the expression levels of five genes that ranked highly among maize leaf initiation candidates, across 42 nonleaf tissues (N) and 18 leaf tissues (L). Expression levels were measured as the log base 2 of the intensity of the hybridized spots. All five genes show significantly elevated expression in leaf tissues (Wilcoxon rank-sum test: GRMZM2G013617, P = 1.67 × 10^-5; GRMZM2G396114, P = 5.52 × 10^-4; GRMZM2G458728, P = 8.45 × 10^-8; GRMZM2G137046, P = 8.56 × 10^-7; GRMZM2G099319, P = 2.75 × 10^-7).
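The leaf versus nonleaf comparison in panel (b) uses a Wilcoxon rank-sum test. The sketch below shows how such a test can be computed with the normal approximation and no tie correction. It is an assumption that this variant matches the one used for the figure, and the expression values are invented for illustration rather than taken from the maize data.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2            # expected rank sum under the null
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided p-value
    return z, p

# Hypothetical log2 expression values, not the data behind Figure 3.
leaf = [9.1, 9.4, 8.8, 9.9, 9.0]
nonleaf = [6.2, 7.1, 6.8, 7.5, 6.0, 7.0]
z, p = rank_sum_test(leaf, nonleaf)
print(z, p)
```

A positive z here means the first sample (leaf) tends to rank higher than the second, matching the direction of the elevated leaf expression described in the caption.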