Chan Yeong Kim, Seungbyn Baek, Junha Cha, Sunmo Yang, Eiru Kim, Edward M Marcotte, Traver Hart, Insuk Lee.
Abstract
Network medicine has proven useful for dissecting genetic organization of complex human diseases. We have previously published HumanNet, an integrated network of human genes for disease studies. Since the release of the last version of HumanNet, many large-scale protein-protein interaction datasets have accumulated in public depositories. Additionally, the numbers of research papers and functional annotations for gene-phenotype associations have increased significantly. Therefore, updating HumanNet is a timely task for further improvement of network-based research into diseases. Here, we present HumanNet v3 (https://www.inetbio.org/humannet/, covering 99.8% of human protein coding genes) constructed by means of the expanded data with improved network inference algorithms. HumanNet v3 supports a three-tier model: HumanNet-PI (a protein-protein physical interaction network), HumanNet-FN (a functional gene network), and HumanNet-XC (a functional network extended by co-citation). Users can select a suitable tier of HumanNet for their study purpose. We showed that on disease gene predictions, HumanNet v3 outperforms both the previous HumanNet version and other integrated human gene networks. Furthermore, we demonstrated that HumanNet provides a feasible approach for selecting host genes likely to be associated with COVID-19.
Year: 2022 PMID: 34747468 PMCID: PMC8728227 DOI: 10.1093/nar/gkab1048
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 19.160
Comparison between HumanNet v2 and v3
| HumanNet v2 | HumanNet v3 |
|---|---|
| Gene Ontology Biological Process (21 October 2012) (IDA, IMP); MetaCyc | Gene Ontology Biological Process (8 March 2021) (IDA, IMP); MetaCyc r22.5 |
| Based on ∼300k full-text articles from PubMed Central | Based on ∼650k full-text articles from PubMed Central; updated algorithm for link prioritization |
| Based on 125 microarray-based and 33 RNA-seq-based GSEs (16,220 samples in total) | Inherited from HumanNet v2; re-trained with the new Gold Standard |
| Co-essentiality links based on >100 shRNA and >400 CRISPR-Cas9-based essential gene profiles | Genetic interactions from BioGRID and iRefIndex r14 and co-essentiality links based on ∼800 CRISPR-Cas9-based essential gene profiles |
| Based on three pathway databases [KEGG (5 January 2017), BioCarta (5 January 2017), and Reactome (3 January 2017)] | Latest version of the databases [KEGG (12 April 2021), BioCarta (12 April 2021), and Reactome (14 April 2021)]; updated algorithm for link prioritization |
| Based on domain profiles by InterPro r46 Profile | Based on domain profiles by InterPro r84 Profile |
| Based on 1748 prokaryotic (1626 bacteria and 122 archaea) genomes, 754 human metagenomes and 242 ocean sample metagenomes | Based on 9428 genus representative genomes of Prokaryotes from GTDB r95 |
| Transfer 10 latest functional gene networks for five species and transfer PIs of four vertebrate species (dog, cattle, rat and chicken) in iRefIndex r14; all orthology-transferred networks were integrated into a single network | Inherited from HumanNet v2; excluded from the final HumanNet v3 |
| Based on 1626 bacterial and 122 archaeal genomes; analyzed two phylogenetic profiles for bacteria and archaea separately | Inherited from HumanNet v2; re-trained with the new Gold Standard |
| Non-redundant PI set from iRefIndex r14 | Non-redundant PI set from iRefIndex r17, BioPlex 1, 2 and 3, BioGRID (v4.3.196), and IntAct (10 March 2021) databases; updated algorithm for link prioritization |
| Based on seven protein complex mapping data sets and five binary PI screen data sets | |
CC: co-citation; CX: co-expression; CE: co-essentiality; GI: genetic interaction; DB: database; DP: domain profile; GN: gene neighboring; IL: interolog; PG: phylogenetic profile; LC: literature curation; HT: high-throughput protein–protein interaction; PI: protein–protein interaction.
Figure 1. An overview of HumanNet v3. (A, B) Bar graphs illustrating improvements in the numbers of genes (A) and functional links (B) as compared to HumanNet v2. (C) A summary of the three-tier model of HumanNet v3.
Figure 2. Network assessment for disease gene predictions. (A, B) The percentage of gene pairs that share a disease annotation (y-axis, link precision) is plotted against gene coverage (x-axis, gene recall) for DisGeNET (A) and the GWAS Catalog (B); both values are calculated cumulatively for every 1000 links, starting from the top-scoring links. Because the PCNet network has no link scores, its link precision and gene recall are calculated over the entire set of links. (C, D) The area under the receiver-operating characteristic curve (AUROC) up to a false positive rate (FPR) of 1% was measured for the network-based retrieval of disease genes annotated by DisGeNET (C) or GWAS Catalog (D) (***P < 0.0001, ns: P > 0.05 according to the two-tailed Mann–Whitney U test).
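The cumulative assessment described in panels A and B of Figure 2 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `precision_recall_by_bin`, the input shapes, and the choice to compute precision only over links whose two genes both carry an annotation are assumptions made for the example.

```python
def precision_recall_by_bin(scored_links, gene_to_diseases, bin_size=1000):
    """Cumulative link precision and gene recall for every `bin_size` links.

    scored_links: iterable of (gene_a, gene_b, score) tuples (hypothetical format).
    gene_to_diseases: dict mapping each annotated gene to a set of disease terms.
    Returns a list of (gene_recall, link_precision) points, one per bin.
    """
    # Rank links from highest to lowest score.
    ranked = sorted(scored_links, key=lambda link: link[2], reverse=True)
    annotated = set(gene_to_diseases)
    covered = set()      # genes seen so far in the ranked links
    evaluated = 0        # links with both genes annotated (assumed denominator)
    shared = 0           # evaluated links whose genes share a disease term
    points = []
    for i, (a, b, _score) in enumerate(ranked, start=1):
        covered.update((a, b))
        if a in annotated and b in annotated:
            evaluated += 1
            if gene_to_diseases[a] & gene_to_diseases[b]:
                shared += 1
        # Emit a point at each bin boundary and at the final link.
        if i % bin_size == 0 or i == len(ranked):
            precision = shared / evaluated if evaluated else 0.0
            recall = len(covered & annotated) / len(annotated) if annotated else 0.0
            points.append((recall, precision))
    return points
```

For a network without link scores, such as PCNet in the caption, the same function could be called with `bin_size` equal to the total number of links, which yields a single point for the whole network.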
Figure 3. Validation of HumanNet-based candidate genes for COVID-19. (A) The number of connections in HumanNet-XC among 43 guide genes derived from COVID-19 genome-wide association studies (GWAS). The histogram represents the distribution of network connectivity for 10 000 random sets of 43 genes, and the red vertical line indicates the number of connections among the 43 guide genes. (B) Mean hit count to 722 COVID-19-related gene sets. The red line and the black line represent the mean hit count for the top candidates and for all other genes, respectively. (C, D) Enrichment ratio of DEGs specific to COVID-19 patients (C) and healthy controls (D) among top candidate genes. The different sizes of the top-candidate set used for validation are marked by color codes. DEGs were derived from three independent studies (Stephenson et al. (36), Schulte-Schrepping et al. (37), and Ren et al. (38)) and four distinct cell types (T, T cells; NK, natural killer cells; Myel, myeloid cells; B, B cells).
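The comparison in panel A of Figure 3, observed connectivity among the guide genes against a distribution from random gene sets of the same size, can be sketched as below. This is an illustrative sketch under stated assumptions: the function names, the unweighted edge-list input, and uniform sampling from all network genes are hypothetical choices, and the empirical p-value (with a +1 correction) is added for illustration only; the caption itself reports a histogram and a marker line.

```python
import random


def count_internal_links(genes, links):
    """Count network links whose two endpoints are both in `genes`."""
    gene_set = set(genes)
    return sum(1 for a, b in links if a in gene_set and b in gene_set)


def connectivity_null(guide_genes, links, all_genes, n_random=10_000, seed=0):
    """Compare guide-gene connectivity with that of random gene sets.

    links: list of (gene_a, gene_b) pairs (hypothetical unweighted edge list).
    all_genes: genes to sample random sets from (assumed: all network genes).
    Returns (observed, null_counts, empirical_p).
    """
    rng = random.Random(seed)
    observed = count_internal_links(guide_genes, links)
    k = len(set(guide_genes))
    pool = sorted(all_genes)  # fixed order so a given seed is reproducible
    null_counts = [
        count_internal_links(rng.sample(pool, k), links) for _ in range(n_random)
    ]
    # Fraction of random sets at least as connected as the guide genes.
    at_least = sum(1 for c in null_counts if c >= observed)
    empirical_p = (at_least + 1) / (n_random + 1)
    return observed, null_counts, empirical_p
```

Plotting `null_counts` as a histogram and drawing a vertical line at `observed` would give a figure of the kind the caption describes.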