
Contextualized Protein-Protein Interactions.

Anthony Federico, Stefano Monti.

Abstract

Protein-protein interaction (PPI) databases are an important bioinformatics resource, yet existing literature-curated databases usually represent cell-type-agnostic interactions, which is at variance with our understanding that protein dynamics are context specific and highly dependent on their environment. Here, we provide a resource derived through data mining to infer disease- and tissue-relevant interactions by annotating existing PPI databases with cell-contextual information extracted from reporting studies. This resource is applicable to the reconstruction and analysis of disease-centric molecular interaction networks. We have made the data and method publicly available and plan to release scheduled updates in the future. We expect these resources to be of interest to a wide audience of researchers in the life sciences.
© 2020 The Authors.

Keywords:  context-relevant PPI; network biology; protein-protein interaction

Year:  2020        PMID: 33511361      PMCID: PMC7815950          DOI: 10.1016/j.patter.2020.100153

Source DB:  PubMed          Journal:  Patterns (N Y)        ISSN: 2666-3899


Introduction

Network biology is an emerging trend in biomedical research that takes a systems-based approach to understanding biological processes and modeling complex disease, whereby interacting molecules, rather than individual genes, are mapped to phenotypic outcomes. An accurate reconstruction of the interactions of the proteome would allow for a detailed understanding of how interacting proteins carry out cellular functions, explain biological phenomena, and predict the consequences of interventions. There is currently a large selection of repositories for protein-protein interactions (PPIs) and an ever-growing number of experimentally observed or computationally predicted interactions. Efforts have emerged, such as the Proteomics Standard Initiative Common Query Interface (PSICQUIC), to aggregate these interactions across various providers, enabling querying of millions of interactions based on a subset of interactors or detection methods. This has proved to be an essential resource for network-based analyses, mapping interactome networks, and seeding advanced network models.
Whereas literature-curated interactions often include the experimental assays supporting the interaction, there is less emphasis on describing the biological context (e.g., the cell lines) in which an interaction was assayed in vitro. Protein dynamics are context specific and highly dependent on their environment. For example, it was reported that a majority of protein complexes measured in yeast were dependent on environmental conditions. Without context, one ignores the dynamic rewiring of biological networks and assumes that PPIs measured across heterogeneous cellular contexts are uniformly relevant to a given biological system under study. This assumption has been shown to be false: the local conditions under which interactions are observed are important considerations when reconstructing networks for exploring specific biological subsystems.
Thus, researchers should consider querying reported interactions relevant to the model under investigation. Previous efforts to infer environmental specificity of PPIs include integration of tissue-specific gene expression information, such as GTEx, whereby two proteins in a PPI that are both expressed above a certain threshold in a given tissue are deemed available for interaction. Here we present an alternative approach that uses the cell lines reported in the original publications of literature-curated PPIs to infer environmental context. In summary, multiple lines of evidence suggest that the cellular context of reported PPIs is an important factor in determining the relevance of their use toward other biological research efforts. We therefore present a data mining method for annotating existing PPI databases with contextual information in an attempt to determine their biological relevance.

Results

To demonstrate our method, we start with interactions from the Human Integrated Protein-Protein Interaction Reference (HIPPIE), a manually curated subset of experimentally detected PPIs from PSICQUIC. To date, HIPPIE contains 391,410 interactions from 41,330 publications sourced from various providers, including IntAct, MINT, BioGRID, HPRD, DIP, BIND, and MIPS. Each entry takes the form of two protein interactors, identified by their encoding gene symbols, and zero, one, or multiple PubMed identifiers (PIDs), in addition to other relevant information such as the types of experimental evidence supporting the interaction. The PIDs link to the original studies in which an interaction was reported. With these PIDs, one can reference the reporting studies to further understand the context in which an interaction was observed, such as the cell lines used to conduct the experiment. This information is valuable in determining which interactions are relevant to the context of one's own biological study, yet the manual curation of cellular context for hundreds of thousands of interactions is time intensive.
Here we present an approach to automating this process through the use of two additional resources: NCBI's PubTator and ExPASy's Cellosaurus. PubTator is a text mining tool for literature curation that extracts bioconcepts (e.g., gene, disease, chemical, mutation, species, and cell line) from text and has pre-processed and annotated roughly 3 million full-text PubMed Central articles. Cellosaurus describes many of the available cell lines used in biomedical research. It provides unique cell-line accessions (CLA) for more than 100,000 cell lines, which can be mapped through a controlled vocabulary to various cell-line attributes, including an official cell identifier or name, category, species of origin, etc., as well as synonyms and common spelling variations or misspellings.
Using these resources, we created a simple method for annotating a large collection of PPIs with cell-type contextual information. The basic workflow maps the PPIs and supporting publications reported in HIPPIE to the cell lines that PubTator extracts from those publications, and then maps those cell lines to the official cell-line identifiers and attributes described in Cellosaurus. The idea is that the originating article for an interaction will likely describe one or more cell lines used in the study (e.g., in the methods section) and that these cell lines may have been used to carry out the experiment itself or are at least relevant to the interactions reported. By extracting this information to annotate existing interactions, we can filter interactions by cell-line context based on the biological system/state we are interested in. We developed a fast and reproducible pipeline for annotating literature-curated PPIs with associated PIDs. The pipeline can efficiently annotate hundreds of thousands of interactions in a few minutes. It does so by fetching and processing the raw bulk data of HIPPIE, PubTator, and Cellosaurus and generating three mapping tables (Figures 1A–1C). The first table (PPI table) maps interactions to reporting publications (one to many), the second (PID table) maps publications to extracted bioconcepts (cell line) and cell-line accession numbers (one to many), and the third (CLA table) maps cell-line accessions to official cell-line names and associated cell-type information (one to one). Because of this multi-mapping, an original interaction can be supported by multiple studies, each of which could report multiple cell lines. Therefore, we create an entry in the contextualized dataset for each combination observed. Using these tables, the pipeline executes the routine described in Figure 2 to create the dataset of contextualized PPIs (Figure 1D).
Figure 1

A Graphical Overview

The schematic describes the organization of existing bioinformatics resources to create three mapping tables—(A) the PPI table which maps interactions to reporting publications, (B) the PID table which maps publications to extracted cell lines, and (C) the CLA table which maps cell-line accessions to official cell-line names and associated cell-type information—to generate (D) the presented dataset of contextualized PPIs.

Figure 2

The Main Routine Behind PPI Context

The pseudocode includes the main routine executed in the data pre-processing pipeline for creating contextualized PPI entries from the three mapping tables. The tool can be downloaded from GitHub, which includes example commands for installing the required Python dependencies and fetching the raw data.
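The routine that combines the three mapping tables can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the published pipeline: the dictionary layouts, the function name `contextualize`, the field names, and the placeholder identifiers (`PID_A`, `CLA_1`, ...) are all hypothetical, and the cell-line annotations are toy values.

```python
from collections import namedtuple

# Hypothetical output record; the field names are illustrative only.
ContextPPI = namedtuple(
    "ContextPPI", "gene_a gene_b pid accession cell_name category species"
)

def contextualize(ppi_table, pid_table, cla_table):
    """Emit one row per (interaction, publication, cell line) combination."""
    rows = []
    for (gene_a, gene_b), pids in ppi_table.items():
        for pid in pids:                               # PPI -> publications (one to many)
            for accession in pid_table.get(pid, ()):   # publication -> cell lines (one to many)
                info = cla_table.get(accession)        # accession -> attributes (one to one)
                if info is None:
                    continue                           # unknown accession: skip
                rows.append(ContextPPI(gene_a, gene_b, pid, accession,
                                       info["name"], info["category"], info["species"]))
    return rows

# Toy inputs with placeholder identifiers (not real PubMed IDs or accessions).
ppi_table = {
    ("BRCA1", "BARD1"): ["PID_A", "PID_B"],
    ("TP53", "MDM2"): ["PID_C"],   # publication without extracted cell lines
    ("EGFR", "GRB2"): [],          # no supporting publication
}
pid_table = {"PID_A": ["CLA_1"], "PID_B": ["CLA_1", "CLA_2"]}
cla_table = {
    "CLA_1": {"name": "MCF-7", "category": "Cancer cell line", "species": "Homo sapiens"},
    "CLA_2": {"name": "MDA-MB-231", "category": "Cancer cell line", "species": "Homo sapiens"},
}

rows = contextualize(ppi_table, pid_table, cla_table)
for row in rows:
    print(row)
```

Interactions with no publication, or whose publications have no extracted cell line, produce no rows, which mirrors how the text describes such interactions being ignored.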

Interactions are ignored if they do not have supporting publications or have publications where cell lines are not reported or cannot be extracted. The result is a data frame of original PPIs with additional columns, including cell name, category, species, etc., for all annotatable PPIs (contextualized PPIs). This format is compatible with the primary use case envisioned for the data: building interaction networks by filtering on one or more cell types relevant to a biological setting or question of interest. Application of this routine to the latest versions (as of June 2020) of the previously described resources started with 391,410 original interactions and found at least one publication for 385,740 interactions. This resulted in a final contextualized dataset of 1,016,726 unique interaction/cell line pairs across 2,012 unique cell lines, originating from 247,065 interactions. We found that a majority of the contextualized interactions were derived from papers reporting human-derived cancer cell lines (Figure 3B). A majority of the reported interactions indeed come from commonly used cell lines such as HeLa and HEK293 (Figure 3A).
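The primary use case, filtering contextualized PPIs by cell line to obtain a network, might look like the sketch below. The row layout `(gene_a, gene_b, cell_name)`, the function name `build_edges`, and the toy annotations are assumptions made for illustration; the real dataset's column names may differ.

```python
def build_edges(rows, cell_names):
    """Return the unique undirected edges annotated with any of the given cell lines."""
    wanted = set(cell_names)
    edges = set()
    for gene_a, gene_b, cell_name in rows:
        if cell_name in wanted:
            # Sort the pair so (A, B) and (B, A) collapse into one undirected edge.
            edges.add(tuple(sorted((gene_a, gene_b))))
    return edges

# Toy rows; the cell-line annotations here are invented for the example.
toy_rows = [
    ("BRCA1", "BARD1", "MCF-7"),
    ("BARD1", "BRCA1", "MDA-MB-231"),
    ("TP53", "MDM2", "MCF-7"),
    ("TP53", "MDM2", "HeLa"),
    ("EGFR", "GRB2", "HEK293"),
]

breast_edges = build_edges(toy_rows, ["MCF-7", "MDA-MB-231"])
breast_nodes = {gene for edge in breast_edges for gene in edge}
print(len(breast_nodes), "nodes,", len(breast_edges), "edges")
```

Collapsing rows to unique gene pairs is a design choice of this sketch: the same interaction can appear once per publication and cell line, but it should contribute a single edge to the network.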
Figure 3

Summary of Contextualized PPIs

The processed dataset provides cell-line information for each contextualized PPI. The summary plots compare the frequencies of annotations for contextualized PPIs, including (A) the most frequently annotated cell-line names, (B) cell-line species of origin, (C) cell-line sex, and (D) cell-line category. The majority of annotations were human cancer-derived cell lines.

Despite a bias toward popular cell lines, there still remain sufficient interactions for many less common cell lines to perform disease-centric modeling through filtered PPIs. For example, we reconstructed a molecular interaction network using PPIs from the breast cancer cell lines MCF-7 and MDA-MB-231, resulting in a breast cancer-centric network of 4,645 nodes and 9,015 edges. Because these PPIs are annotated with breast cancer cell lines, we would expect the interactions to have been experimentally validated in those cell lines or at least reported in a context relevant to breast cancer. Thus, this network should exhibit known properties of a breast cancer model better than non-breast cancer networks. To test this hypothesis, we assessed the network's ability to rediscover known disease genes through network propagation and compared it with the results from networks generated with other top cell lines in the dataset. To this end, we used the random walk with restart (RWR) algorithm, a popular method for network propagation. RWR measures the proximity of nodes in a graph to a given seed or set of seed nodes. The algorithm simulates a walker that starts at the seed nodes and, at each step, either moves to a random neighboring node or, with a given restart probability, jumps back to a seed. It exploits the disease module hypothesis, which postulates that disease genes are likely to be close to one another in a given network. Hence, highly traversed nodes (other than the disease-gene seeds) are classified as disease genes with high probability.
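For readers unfamiliar with RWR, the following is a minimal pure-Python sketch of the general technique using power iteration. It is not the implementation used in the study: the restart probability of 0.5, the convergence tolerance, and the function name are assumptions chosen for the example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return stationary visit probabilities for a walk that restarts at the seeds.

    adjacency: dict mapping each node to a list of its neighbors (undirected graph).
    seeds: non-empty set of nodes; the walker restarts uniformly among them.
    """
    nodes = list(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}      # mass that jumps back to the seeds
        for u in nodes:
            neighbors = adjacency[u]
            if not neighbors:
                nxt[u] += (1.0 - restart) * p[u]       # isolated node: mass stays put
                continue
            share = (1.0 - restart) * p[u] / len(neighbors)
            for v in neighbors:                        # spread the rest evenly over neighbors
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Toy path graph a - b - c - d with a single seed at "a".
toy_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = random_walk_with_restart(toy_graph, {"a"})
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

On this toy path the scores decrease with distance from the seed, which is the proximity behavior that the disease module hypothesis relies on.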
Using this algorithm, we tested whether the breast cancer network was more efficient at recovering known breast cancer disease genes. We queried 538 breast cancer genes from DisGeNET and adopted a standard random resampling approach, whereby the 538-gene set was randomly split in half, with one half used as the seed set and recovery scored on the left-out half as the area under the receiver operating characteristic (AUROC), with the process repeated 100 times. We compared the recovery scores of the breast cancer network (BRCA) with those of networks built from interactions annotated with the other 30 most frequent cell lines. For each network, we ranked and compared the mean recovery score across the 100 iterations. We found the breast cancer network to outperform networks built from non-breast cancer interactions at rediscovering known breast cancer genes (Table 1). BRCA performs significantly better than the networks of many cell lines, and these cell lines would be primary candidates for removal when reconstructing a breast cancer-centric interaction network. Although it is encouraging that BRCA outranks other networks, it performs only marginally better than networks built from commonly used non-breast cancer cell lines such as HEK293 and HeLa. This is likely due to inspection bias toward well-studied disease genes known to play a role in multiple cancers (e.g., TP53) and commonly assayed in these well-established and widely adopted cell lines [21-23]. In addition, we tested a network based on PPIs filtered for breast tissue expression (Breast Expressed) and found it had a recovery score roughly equal to that of BRCA, suggesting that both methods of inferring context (literature mining and tissue expression) perform similarly and could be used in complementary ways.
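The split-half resampling procedure described above can be sketched as follows. This is an illustration under assumptions rather than the study's code: the scoring function is passed in as a parameter (here a simple seed-neighbor count stands in for RWR), disease genes absent from the network are dropped, seeds are excluded from scoring, and all names and the toy graph are hypothetical.

```python
import random

def auroc(scores, positives, candidates):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [scores[n] for n in candidates if n in positives]
    neg = [scores[n] for n in candidates if n not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative candidate")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def split_half_recovery(nodes, disease_genes, score_fn, repeats=100, seed=0):
    """Mean and SD of AUROC over random half-splits of the disease-gene set."""
    rng = random.Random(seed)
    genes = sorted(set(disease_genes) & set(nodes))   # keep only genes present in the network
    aucs = []
    for _ in range(repeats):
        rng.shuffle(genes)
        half = len(genes) // 2
        seeds, held_out = set(genes[:half]), set(genes[half:])
        scores = score_fn(seeds)
        candidates = [n for n in nodes if n not in seeds]   # seeds are not scored
        aucs.append(auroc(scores, held_out, candidates))
    mean = sum(aucs) / len(aucs)
    sd = (sum((a - mean) ** 2 for a in aucs) / (len(aucs) - 1)) ** 0.5
    return mean, sd

# Toy network: four "disease" genes form a clique, four other genes form a separate path.
toy_graph = {
    "d1": ["d2", "d3", "d4"], "d2": ["d1", "d3", "d4"],
    "d3": ["d1", "d2", "d4"], "d4": ["d1", "d2", "d3"],
    "o1": ["o2"], "o2": ["o1", "o3"], "o3": ["o2", "o4"], "o4": ["o3"],
}

def neighbor_count_score(seeds):
    # Stand-in propagation score: the number of seed genes adjacent to each node.
    return {n: sum(v in seeds for v in nbrs) for n, nbrs in toy_graph.items()}

mean_auc, sd_auc = split_half_recovery(list(toy_graph), ["d1", "d2", "d3", "d4"],
                                       neighbor_count_score)
print(mean_auc, sd_auc)
```

Because the scoring function is a parameter, an RWR scorer can be substituted for the neighbor count without changing the resampling logic.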
Table 1

Network Propagation of Disease Genes

Cell name | Nodes | Edges | Density | Assortativity | Mean AUROC | SD | Delta
BRCA | 4,645 | 9,015 | 8.4 × 10^-4 | -1.8 × 10^-1 | 0.693 | 0.028 | -
Breast Expressed | 10,850 | 180,342 | 3.1 × 10^-3 | -6.8 × 10^-2 | 0.691 | 0.014 | 0.002
HEK293 | 11,069 | 79,207 | 1.3 × 10^-3 | -7.4 × 10^-2 | 0.664 | 0.017 | 0.029
HEK293T | 13,884 | 116,709 | 1.2 × 10^-3 | -8.2 × 10^-2 | 0.659 | 0.014 | 0.034
HeLa | 13,824 | 149,136 | 1.6 × 10^-3 | -5.4 × 10^-2 | 0.658 | 0.015 | 0.034
MEF (C57BL/6) | 4,189 | 8,689 | 9.9 × 10^-4 | -2.2 × 10^-1 | 0.643 | 0.023 | 0.049
DU145 | 3,219 | 8,075 | 1.6 × 10^-3 | -5.5 × 10^-1 | 0.637 | 0.031 | 0.055
Jurkat | 2,467 | 5,953 | 2.0 × 10^-3 | -4.1 × 10^-1 | 0.637 | 0.034 | 0.056
HCT 116 | 11,936 | 82,956 | 1.2 × 10^-3 | 1.1 × 10^-2 | 0.633 | 0.018 | 0.060
Schneider 2 | 4,228 | 18,745 | 2.1 × 10^-3 | -7.2 × 10^-2 | 0.632 | 0.028 | 0.060
U2OS | 6,667 | 25,309 | 1.1 × 10^-3 | -2.7 × 10^-1 | 0.631 | 0.020 | 0.061
SW480 | 1,763 | 4,316 | 2.8 × 10^-3 | -4.0 × 10^-1 | 0.629 | 0.029 | 0.063
MCF-10A | 11,469 | 61,722 | 9.4 × 10^-4 | -3.7 × 10^-2 | 0.627 | 0.018 | 0.066
Hep-G2 | 1,304 | 3,728 | 4.4 × 10^-3 | -3.0 × 10^-1 | 0.626 | 0.046 | 0.066
BL-21 | 5,135 | 11,433 | 8.7 × 10^-4 | -2.0 × 10^-1 | 0.625 | 0.021 | 0.068
NCI-H1975 | 1,295 | 3,546 | 4.2 × 10^-3 | -4.3 × 10^-1 | 0.618 | 0.041 | 0.074
LS513 | 1,246 | 3,486 | 4.5 × 10^-3 | -4.4 × 10^-1 | 0.603 | 0.038 | 0.089
NIH 3T3 | 2,914 | 4,806 | 1.1 × 10^-3 | -2.9 × 10^-1 | 0.603 | 0.030 | 0.090
HT-29 | 1,693 | 4,219 | 2.9 × 10^-3 | -3.7 × 10^-1 | 0.601 | 0.034 | 0.092
HeLa Kyoto | 4,992 | 16,901 | 1.4 × 10^-3 | -1.1 × 10^-1 | 0.601 | 0.022 | 0.092
MCF-10AT | 11,112 | 57,754 | 9.4 × 10^-4 | -1.4 × 10^-1 | 0.597 | 0.018 | 0.096
K-562 | 1,922 | 3,687 | 2.0 × 10^-3 | -3.8 × 10^-1 | 0.596 | 0.036 | 0.096
MRC-5 | 2,015 | 3,538 | 1.7 × 10^-3 | -3.7 × 10^-1 | 0.583 | 0.022 | 0.109
T-REx-293 | 5,395 | 19,558 | 1.3 × 10^-3 | -4.1 × 10^-1 | 0.581 | 0.022 | 0.111
HeLa S3 | 8,756 | 39,157 | 1.0 × 10^-3 | 1.0 × 10^-1 | 0.580 | 0.019 | 0.112
Sf9 | 1,819 | 3,159 | 1.9 × 10^-3 | -2.0 × 10^-1 | 0.575 | 0.029 | 0.118
JON | 1,354 | 3,629 | 4.0 × 10^-3 | -4.1 × 10^-1 | 0.574 | 0.036 | 0.119
SH-SY5Y | 8,422 | 27,864 | 7.9 × 10^-4 | -1.5 × 10^-1 | 0.571 | 0.023 | 0.122
HEK | 6,161 | 19,569 | 1.0 × 10^-3 | -2.2 × 10^-1 | 0.546 | 0.022 | 0.147
293T/AT1 | 1,994 | 3,315 | 1.7 × 10^-3 | -3.1 × 10^-1 | 0.518 | 0.028 | 0.174
hTERT-RPE1 | 2,553 | 6,577 | 2.0 × 10^-3 | -4.2 × 10^-1 | 0.486 | 0.035 | 0.206

A comparison of the recovery of breast cancer disease genes in a breast cancer-centric network and networks built from non-breast cancer interactions, in addition to measured graph properties, including nodes, edges, density, and assortativity. Delta values measure the difference in mean AUROC (area under the receiver operating characteristic) of 100 repeats between the BRCA network and the rest.
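As a quick check on how the density column relates to the node and edge counts, the sketch below assumes the standard definition for an undirected simple graph, density = 2E / (N(N - 1)). The source does not state which definition was used; under this assumption the BRCA row of Table 1 is reproduced.

```python
def density(n_nodes, n_edges):
    """Density of an undirected simple graph: observed edges over possible edges."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

# BRCA network from Table 1: 4,645 nodes and 9,015 edges.
brca_density = density(4645, 9015)
print(f"{brca_density:.1e}")
```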

We performed an additional test to determine if interactions were relevant to their derived cell-line annotations. In particular, we selected two genes known to be highly specific to breast cancer, BRCA1 and BRCA2, and counted the number of PPIs (interactions) involving one or both of these genes and annotated with breast cancer cell lines (MCF-7 and MDA-MB-231) compared with those annotated with one of the other cell lines. The expectation was that breast cancer annotated interactions should have a significantly higher proportion of interactions involving BRCA1 and/or BRCA2 than non-breast cancer annotated interactions. Indeed, relative to the total number of interactions in each network, we found BRCA1/2 interactions much more likely to be annotated with breast cancer cell lines, supporting our assumption that the method is extracting relevant cell-line interactions and that other PPIs annotated with these cell lines are also likely relevant to breast cancer (Table 2).
Table 2

Targeted Enrichment by Cell Line

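Cell name | Interactions | BRCA1/2 | Proportion | p | FDR
MDA-MB-231 | 4,185 | 152 | 0.036 | 2.6 × 10^-133 | 7.9 × 10^-132
MCF-7 | 6,577 | 67 | 0.010 | 7.0 × 10^-26 | 1.0 × 10^-24
MCF-10A | 62,019 | 174 | 0.003 | 1.9 × 10^-5 | 1.9 × 10^-4
MCF-10AT | 57,832 | 163 | 0.003 | 2.8 × 10^-5 | 2.1 × 10^-4
U2OS | 26,221 | 83 | 0.003 | 8.5 × 10^-5 | 5.1 × 10^-4
BL-21 | 11,981 | 27 | 0.002 | 3.3 × 10^-1 | 1.0
MEF (C57BL/6) | 9,265 | 21 | 0.002 | 3.4 × 10^-1 | 1.0
NIH 3T3 | 5,031 | 10 | 0.002 | 5.7 × 10^-1 | 1.0
K-562 | 4,197 | 7 | 0.002 | 7.5 × 10^-1 | 1.0
JON | 3,641 | 6 | 0.002 | 7.5 × 10^-1 | 1.0
HCT 116 | 85,366 | 164 | 0.002 | 8.0 × 10^-1 | 1.0
Sf9 | 3,414 | 4 | 0.001 | 9.2 × 10^-1 | 1.0
Hep-G2 | 3,854 | 2 | 0.001 | 1.0 | 1.0
SW480 | 4,351 | 1 | 0.000 | 1.0 | 1.0
DU145 | 8,123 | 4 | 0.000 | 1.0 | 1.0
Jurkat | 6,245 | 1 | 0.000 | 1.0 | 1.0
HeLa | 179,407 | 285 | 0.002 | 1.0 | 1.0
HeLa S3 | 39,911 | 35 | 0.001 | 1.0 | 1.0
SH-SY5Y | 27,964 | 17 | 0.001 | 1.0 | 1.0
T-REx-293 | 19,912 | 6 | 0.000 | 1.0 | 1.0
HEK | 20,382 | 4 | 0.000 | 1.0 | 1.0
HeLa Kyoto | 17,093 | 1 | 0.000 | 1.0 | 1.0
HEK293T | 140,112 | 132 | 0.001 | 1.0 | 1.0
HEK293 | 85,737 | 45 | 0.001 | 1.0 | 1.0
Schneider 2 | 18,789 | 0 | 0.000 | 1.0 | 1.0
hTERT-RPE1 | 6,739 | 0 | 0.000 | 1.0 | 1.0
HT-29 | 4,271 | 0 | 0.000 | 1.0 | 1.0
MRC-5 | 3,597 | 0 | 0.000 | 1.0 | 1.0
NCI-H1975 | 3,550 | 0 | 0.000 | 1.0 | 1.0
LS513 | 3,487 | 0 | 0.000 | 1.0 | 1.0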

A comparison of the annotation of BRCA1/2-interactions across the most frequent cell lines. The significance was computed with a hyper-geometric test for over-representation and p values were adjusted for multiple comparisons using the Benjamini-Hochberg method (FDR).
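The two statistical steps named in the caption, a hypergeometric test for over-representation followed by Benjamini-Hochberg adjustment, can be sketched in plain Python as below. This illustrates the general techniques only: how the study defined the background population is not stated in the text, so the counts in the example are hypothetical toy values, and the function names are ours.

```python
from math import comb

def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement from `population`
    items, of which `successes` are of the type being counted."""
    upper = min(successes, draws)
    tail = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(max(k, 0), upper + 1))
    return tail / comb(population, draws)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):              # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min             # enforce monotone adjusted values, capped at 1
    return adjusted

# Hypothetical toy counts: 10 interactions overall, 3 involve the genes of interest,
# and a cell line is annotated on 4 interactions, 2 of which involve those genes.
p_toy = hypergeom_upper_tail(2, population=10, successes=3, draws=4)
fdr_toy = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(p_toy, fdr_toy)
```

Using exact integer binomial coefficients keeps the tail probability accurate even when it is extremely small, at the cost of speed on very large populations.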


Reproducibility and Extension

In addition to hosting the presented data online, we also developed a command line interface utility for downloading and processing the raw data, both to reproduce our results and to extend the method. The only prerequisite is access to a machine with Python installed. The repository can be cloned from GitHub to any local directory, and the required Python dependencies installed, through the following commands:

$ git clone https://github.com/montilab/ppi-context
$ pip install -r requirements.txt

The full pipeline, which includes downloading and processing of raw data, can be run through a single command:

$ python ppictx.py --download --run

Given the constantly evolving nature of the repositories that our approach uses as input, this pipeline is an essential contribution. The pipeline can optionally take as arguments file paths to locally stored raw files if users wish to process alternative versions of the data. This pipeline is readily extensible to annotating interactions with additional cell-line information available on Cellosaurus as well as to text mining methods other than PubTator.

Discussion

PPI databases are an important bioinformatics resource. Existing literature-curated databases usually represent cell-type-agnostic interactions that are not sufficiently specific to a domain of study to significantly improve the predictive accuracy and specificity of the learned models. Due to the dynamic rewiring of biological networks in different cellular states and environments, an ability to pre-filter interactions by individual cell lines and types will increase confidence that a given interaction is present in a given biological context and will enhance our ability to model these systems. Here we present a method for annotating existing and future literature-curated PPI databases with cell-contextual information. We also generated a cleaned dataset for general use, immediately applicable to support typical PPI-based analyses with additional context, such as querying known interactions of proteins of interest, reconstruction and analyses of molecular interaction networks, and multi-omics data integration approaches. Our approach assumes that cell lines extracted from reporting articles can be used to infer the biological context in which an interaction was detected, rather than identifying the natural cellular context in which a PPI would take place. More specifically, we expect extracted cell lines to have been used for experimental assays that either directly observe or are relevant to the reported interaction. Under this assumption, extracted cell lines can be used to infer the disease or tissue relevance of annotated interactions. We use breast cancer as a primary example to support these assumptions, finding that breast cancer-centric PPI networks are enriched for breast cancer-relevant interactions and display expected network properties such as the proximity of known breast cancer disease genes. A limitation of the method was the availability of interaction-associated publications on PubMed pre-mined with PubTator.
We were able to extract at least one cell line for 6,146 of potentially 41,329 articles, leaving room for improvement. However, since many of these articles report multiple interactions, the majority of original interactions were still annotated with at least one cell line. For example, the most frequent article (PubMed: 28514442) was associated with 56,297 interactions. In addition, this study assayed the observed interactions in HEK293T cells, exemplifying the disproportionate frequencies at which interactions are annotated with popular cell lines such as HEK. This relates to a third limitation, which is that many PPIs are tested in cell lines such as HEK because of their high transfection efficiency rather than their relevance to the interrogated interaction. However, the primary purpose of the presented dataset is to provide researchers with an additional tool to make informed decisions about which literature-curated PPIs are relevant to their research needs. Some PPIs (e.g., those from high-throughput assays in HEK-related cells) may not provide contextual information researchers can leverage, while others (e.g., PPIs from many small-scale studies annotated with cell lines not primarily used as expression vectors) are likely to be more applicable. Last, direct comparisons of distinct context-specific networks are limited by the unequal and unknown sets of tested interactions per cell line (e.g., some cell lines are overstudied while some are understudied). Therefore, the absence of a PPI in a given cell line could mean either that the proteins were interrogated and found not to interact or that their interaction was never tested under those conditions; we cannot distinguish between these cases. Despite these limitations, the major use case envisioned for the presented contextualized PPIs, building interaction networks relevant to a biological system of interest, makes the dataset an important resource for researchers.
Furthermore, the contextualized dataset contains over 100 cell lines with at least 500 interactions each, facilitating an important filtering of non-relevant interactions and the application toward meaningful analyses—as exemplified by our breast cancer network—for a variety of research domains. As PPI resources continue to grow in size, so too will the contextualized dataset, as we plan to release scheduled updates of the data. In addition, we expect these annotations to improve as more full-text articles become available on PubMed Central and text mining resources such as PubTator improve and grow in coverage of available articles.

Conclusion

We use existing literature-curated PPI databases and available text mining resources to annotate interactions with cell-contextual information. The contextualized dataset is freely available and ready for use immediately in network-based analyses.

Experimental Procedures

Resource Availability

Lead Contact

Further information and requests for data or additional code should be directed to and will be fulfilled by the lead contact, Anthony Federico (anfed@bu.edu).

Materials Availability

This study did not generate any reagents or materials.

Data and Code Availability

The presented data and method are freely available online. The processed data are hosted on GitHub in addition to the source code for raw data fetching and pre-processing, which is implemented in Python and compatible with all major operating systems. We have also provided comprehensive documentation with code examples for working with the processed data.
Repository: github.com/montilab/ppi-context
Commit: 81e31020e6e4244ec23065c72d1fe614256b6391
Documentation: montilab.github.io/ppi-context
Operating systems: Linux, OS X, Windows
Programming languages: R, Python
License: GNU GPLv3
  26 in total

1.  Too many roads not taken.

Authors:  Aled M Edwards; Ruth Isserlin; Gary D Bader; Stephen V Frye; Timothy M Willson; Frank H Yu
Journal:  Nature       Date:  2011-02-10       Impact factor: 49.962

Review 2.  Interactome networks and human disease.

Authors:  Marc Vidal; Michael E Cusick; Albert-László Barabási
Journal:  Cell       Date:  2011-03-18       Impact factor: 41.582

3.  The Cellosaurus, a Cell-Line Knowledge Resource.

Authors:  Amos Bairoch
Journal:  J Biomol Tech       Date:  2018-05-10

4.  PSICQUIC and PSISCORE: accessing and scoring molecular interactions.

Authors:  Bruno Aranda; Hagen Blankenburg; Samuel Kerrien; Fiona S L Brinkman; Arnaud Ceol; Emilie Chautard; Jose M Dana; Javier De Las Rivas; Marine Dumousseau; Eugenia Galeota; Anna Gaulton; Johannes Goll; Robert E W Hancock; Ruth Isserlin; Rafael C Jimenez; Jules Kerssemakers; Jyoti Khadake; David J Lynn; Magali Michaut; Gavin O'Kelly; Keiichiro Ono; Sandra Orchard; Carlos Prieto; Sabry Razick; Olga Rigina; Lukasz Salwinski; Milan Simonovic; Sameer Velankar; Andrew Winter; Guanming Wu; Gary D Bader; Gianni Cesareni; Ian M Donaldson; David Eisenberg; Gerard J Kleywegt; John Overington; Sylvie Ricard-Blum; Mike Tyers; Mario Albrecht; Henning Hermjakob
Journal:  Nat Methods       Date:  2011-06-29       Impact factor: 28.547

Review 5.  Network medicine: a network-based approach to human disease.

Authors:  Albert-László Barabási; Natali Gulbahce; Joseph Loscalzo
Journal:  Nat Rev Genet       Date:  2011-01       Impact factor: 53.242

6.  Rewiring makes the difference.

Authors:  Andrea Califano
Journal:  Mol Syst Biol       Date:  2011-01-18       Impact factor: 11.429

7.  A Guide to Transient Expression of Membrane Proteins in HEK-293 Cells for Functional Characterization.

Authors:  Amanda Ooi; Aloysius Wong; Luke Esau; Fouad Lemtiri-Chlieh; Chris Gehring
Journal:  Front Physiol       Date:  2016-07-19       Impact factor: 4.566

8.  HIPPIE v2.0: enhancing meaningfulness and reliability of protein-protein interaction networks.

Authors:  Gregorio Alanis-Lobato; Miguel A Andrade-Navarro; Martin H Schaefer
Journal:  Nucleic Acids Res       Date:  2016-10-24       Impact factor: 16.971

9.  RRW: repeated random walks on genome-scale protein networks for local cluster discovery.

Authors:  Kathy Macropol; Tolga Can; Ambuj K Singh
Journal:  BMC Bioinformatics       Date:  2009-09-09       Impact factor: 3.169

10.  PubTator: a web-based text mining tool for assisting biocuration.

Authors:  Chih-Hsuan Wei; Hung-Yu Kao; Zhiyong Lu
Journal:  Nucleic Acids Res       Date:  2013-05-22       Impact factor: 16.971
