Erick F Velasquez, Yenni A Garcia, Ivan Ramirez, Ankur A Gholkar, Jorge Z Torres.
Abstract
The elucidation of a protein's interaction/association network is important for defining its biological function. Mass spectrometry-based proteomic approaches have emerged as powerful tools for identifying protein-protein interactions (PPIs) and protein-protein associations (PPAs). However, interactome/association experiments are difficult to interpret, considering the complexity and abundance of data that are generated. Although tools have been developed to identify protein interactions/associations quantitatively, there is still a pressing need for easy-to-use tools that allow users to contextualize their results. To address this, we developed CANVS, a computational pipeline that cleans, analyzes, and visualizes mass spectrometry-based interactome/association data. CANVS is wrapped as an interactive Shiny dashboard with simple requirements, allowing users to interface easily with the pipeline, analyze complex experimental data, and create PPI/A networks. The application integrates systems biology databases such as BioGRID and CORUM to contextualize the results. Furthermore, CANVS features a Gene Ontology tool that allows users to identify relevant GO terms in their results and create visual networks with proteins associated with relevant GO terms. Overall, CANVS is an easy-to-use application that benefits all researchers, especially those who lack an established bioinformatic pipeline and are interested in studying interactome/association data.
Year: 2021 PMID: 34432510 PMCID: PMC8693966 DOI: 10.1091/mbc.E21-05-0257
Source DB: PubMed Journal: Mol Biol Cell ISSN: 1059-1524 Impact factor: 4.138
FIGURE 1: CANVS workflow. Mass spectrometry data files (comma-delimited text files) containing protein UniProt accession numbers, protein descriptions, protein quantitative values (scores), and bait POIs are uploaded and rendered as interactive data tables. To clean the data, users can determine the significance of the identified proteins, given proper controls, using log2 fold change and a Student's t test. Significant protein identifications are then analyzed by applying Gene Ontology (GO) terms, the Comprehensive Resource of Mammalian Protein Complexes (CORUM) database, and the Biological General Repository for Interaction Datasets (BioGRID) database. The visNetwork R package is then used to visualize the GO PPI/A, CORUM PPI/A, and BioGRID PPI/A networks.
FIGURE 2: CANVS cleaning method. CANVS allows users to upload interaction/association MS data, filter by the minimum number of replicates in which a protein must be present, normalize protein values by the median value of each purification, and apply significance statistics. CANVS calculates the log2 fold change and p values, which can be visualized in an interactive volcano plot. The user can then filter by a chosen p value or fold change threshold, and the results are used in the pipeline for further analysis/visualization.
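The cleaning steps in Figure 2 can be sketched in a few lines. This is a minimal illustration in Python, not the CANVS code (which is an R Shiny application); the table layout, the function names, the rule that a value above zero counts as "present", and the pseudocount are all assumptions made for the example. Only the t statistic is computed here; a p value would come from the t distribution with n1 + n2 - 2 degrees of freedom, typically via a statistics library.

```python
import math
from statistics import mean, median, variance

# Assumed layout: protein -> {"bait": [replicate scores], "control": [replicate scores]}

def filter_by_replicates(table, min_reps):
    """Keep proteins detected (score > 0, an assumption) in at least min_reps bait replicates."""
    return {
        protein: groups
        for protein, groups in table.items()
        if sum(value > 0 for value in groups["bait"]) >= min_reps
    }

def median_normalize(table, group):
    """Divide each replicate (purification) by the median of its detected scores."""
    proteins = list(table)
    n_reps = len(table[proteins[0]][group])
    normalized = {protein: [] for protein in proteins}
    for rep in range(n_reps):
        column = [table[protein][group][rep] for protein in proteins]
        detected = [value for value in column if value > 0]
        scale = median(detected) if detected else 1.0
        for protein, value in zip(proteins, column):
            normalized[protein].append(value / scale)
    return normalized

def log2_fold_change(bait, control, pseudocount=1e-3):
    """log2 of mean bait over mean control; the pseudocount (an assumption) avoids log(0)."""
    return math.log2((mean(bait) + pseudocount) / (mean(control) + pseudocount))

def student_t(bait, control):
    """Two-sample Student's t statistic with pooled variance (equal variances assumed)."""
    n1, n2 = len(bait), len(control)
    pooled = ((n1 - 1) * variance(bait) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    diff = mean(bait) - mean(control)
    if pooled == 0:
        # Degenerate case: no within-group spread at all.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(pooled * (1 / n1 + 1 / n2))

# Toy data, invented for illustration only.
toy = {
    "P1": {"bait": [10, 12, 14], "control": [1, 2, 3]},
    "P2": {"bait": [5, 0, 0], "control": [4, 5, 6]},
    "P3": {"bait": [8, 9, 7], "control": [8, 8, 9]},
}
kept = filter_by_replicates(toy, min_reps=2)
```

Plotting the fold change against the negative log10 p value for each retained protein gives the volcano plot described in the caption.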
FIGURE 3: Selection of Gene Ontology (GO) terms and filtering of results with the selected GO terms. Users can perform a keyword search, which retrieves GO terms containing the keyword(s) of interest along with any associated subterms. Only GO terms associated with protein hits in the dataset appear and can be selected and applied as a filter. Proteins annotated with the GO terms of interest are included in the network tables.
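The keyword search and filter described in Figure 3 could look roughly like the following. This is a hedged Python sketch, not the CANVS implementation; the term identifiers (`GO:X1` and so on), the protein names, the toy hierarchy, and the function names are hypothetical placeholders rather than real GO records.

```python
def expand_with_subterms(term_ids, children):
    """Return the given terms plus all of their descendants in the hierarchy."""
    seen = set()
    stack = list(term_ids)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(children.get(term, []))
    return seen

def search_terms(keyword, go_names, children, annotations):
    """Terms whose name contains the keyword, plus subterms, limited to terms seen on hits."""
    keyword = keyword.lower()
    direct = [term for term, name in go_names.items() if keyword in name.lower()]
    candidates = expand_with_subterms(direct, children)
    present = set().union(*annotations.values())
    return candidates & present

def filter_hits(annotations, selected_terms):
    """Protein hits annotated with at least one selected term."""
    return sorted(p for p, terms in annotations.items() if terms & selected_terms)

# Hypothetical toy data: identifiers and hierarchy are placeholders.
go_names = {
    "GO:X1": "cell cycle",
    "GO:X2": "mitotic spindle assembly",
    "GO:X3": "translation",
    "GO:X4": "cell cycle checkpoint",
}
children = {"GO:X1": ["GO:X2"]}  # parent term -> child terms
annotations = {
    "PROT_A": {"GO:X2"},
    "PROT_B": {"GO:X3"},
    "PROT_C": {"GO:X1", "GO:X3"},
}

selected = search_terms("cell cycle", go_names, children, annotations)
hits = filter_hits(annotations, selected)
```

In this toy run, `GO:X2` is retrieved as a subterm even though its name lacks the keyword, while `GO:X4` matches the keyword but is dropped because no hit carries it.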
FIGURE 4: Creation of interactive PPI/A networks of (A) protein hits associated with the selected GO terms, integrating (B) CORUM protein complex information and (C) BioGRID PPI information.
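A network like those in Figure 4 reduces to a node table and an edge table, which is the general shape that network viewers such as visNetwork consume (nodes keyed by `id`, edges by `from` and `to`). The sketch below, in Python rather than R, shows one way such tables might be assembled; the function name, the `group` and `source` fields, and the complex membership data are illustrative assumptions, not CANVS internals or real CORUM entries.

```python
from itertools import combinations

def build_network(bait, hits, complexes):
    """Build node and edge records linking a bait to its hits and hits that share a complex."""
    proteins = [bait] + [hit for hit in hits if hit != bait]
    nodes = [
        {"id": p, "label": p, "group": "bait" if p == bait else "hit"}
        for p in proteins
    ]
    # Bait-to-hit edges come from the mass spectrometry experiment itself.
    edges = [{"from": bait, "to": hit, "source": "MS"} for hit in proteins[1:]]
    # Additional edges connect proteins in the dataset that belong to the same complex.
    in_dataset = set(proteins)
    for complex_name, members in complexes.items():
        shared = sorted(in_dataset & set(members))
        for a, b in combinations(shared, 2):
            edges.append({"from": a, "to": b, "source": complex_name})
    return nodes, edges

# Hypothetical example: protein and complex names are placeholders.
nodes, edges = build_network(
    bait="BAIT",
    hits=["H1", "H2", "H3"],
    complexes={"complexA": ["H1", "H2", "Z9"]},
)
```

Tagging each edge with its source keeps the experimental, complex-derived, and database-derived links distinguishable when the network is drawn.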