D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions.

Samuel Sledzieski, Rohit Singh, Lenore Cowen, Bonnie Berger.

Abstract

We combine advances in neural language modeling and structurally motivated design to develop D-SCRIPT, an interpretable and generalizable deep-learning model, which predicts interaction between two proteins using only their sequences and maintains high accuracy with limited training data and across species. We show that a D-SCRIPT model trained on 38,345 human PPIs enables significantly improved functional characterization of fly proteins compared with the state-of-the-art approach. Evaluating the same D-SCRIPT model on protein complexes with known 3D structure, we find that the inter-protein contact map output by D-SCRIPT has significant overlap with the ground truth. We apply D-SCRIPT to screen for PPIs in cow (Bos taurus) at a genome-wide scale and, focusing on rumen physiology, identify functional gene modules related to metabolism and immune response. The predicted interactions can then be leveraged for function prediction at scale, addressing the genome-to-phenome challenge, especially in species where little data are available.
Copyright © 2021 The Authors. Published by Elsevier Inc. All rights reserved.

Keywords:  cow rumen; deep learning; embedding; function prediction; genome to phenome; interpretability; language models; metabolism; module detection; protein-protein interaction

Year:  2021        PMID: 34536380      PMCID: PMC8586911          DOI: 10.1016/j.cels.2021.08.010

Source DB:  PubMed          Journal:  Cell Syst        ISSN: 2405-4712            Impact factor:   11.091


  10 in total

1.  Topsy-Turvy: integrating a global view into sequence-based PPI prediction.

Authors:  Rohit Singh; Kapil Devkota; Samuel Sledzieski; Bonnie Berger; Lenore Cowen
Journal:  Bioinformatics       Date:  2022-06-24       Impact factor: 6.931

2.  Assessing sequence-based protein-protein interaction predictors for use in therapeutic peptide engineering.

Authors:  François Charih; Kyle K Biggar; James R Green
Journal:  Sci Rep       Date:  2022-06-10       Impact factor: 4.996

3.  DeepGOZero: improving protein function prediction from sequence and zero-shot learning based on ontology axioms.

Authors:  Maxat Kulmanov; Robert Hoehndorf
Journal:  Bioinformatics       Date:  2022-06-24       Impact factor: 6.931

4.  Induced fit with replica exchange improves protein complex structure prediction.

Authors:  Ameya Harmalkar; Sai Pooja Mahajan; Jeffrey J Gray
Journal:  PLoS Comput Biol       Date:  2022-06-03       Impact factor: 4.779

5.  PEPPI: Whole-proteome Protein-protein Interaction Prediction through Structure and Sequence Similarity, Functional Association, and Machine Learning.

Authors:  Eric W Bell; Jacob H Schwartz; Peter L Freddolino; Yang Zhang
Journal:  J Mol Biol       Date:  2022-03-05       Impact factor: 6.151

Review 6.  A Survey on Deep Networks Approaches in Prediction of Sequence-Based Protein-Protein Interactions.

Authors:  Bhawna Mewara; Soniya Lalwani
Journal:  SN Comput Sci       Date:  2022-05-19

7.  Polymerase II-Associated Factor 1 Complex-Regulated FLOWERING LOCUS C-Clade Genes Repress Flowering in Response to Chilling.

Authors:  Zeeshan Nasim; Hendry Susila; Suhyun Jin; Geummin Youn; Ji Hoon Ahn
Journal:  Front Plant Sci       Date:  2022-02-09       Impact factor: 5.753

8.  TMbed: transmembrane proteins predicted through language model embeddings.

Authors:  Michael Bernhofer; Burkhard Rost
Journal:  BMC Bioinformatics       Date:  2022-08-08       Impact factor: 3.307

Review 9.  Deep learning frameworks for protein-protein interaction prediction.

Authors:  Xiaotian Hu; Cong Feng; Tianyi Ling; Ming Chen
Journal:  Comput Struct Biotechnol J       Date:  2022-06-15       Impact factor: 6.155

Review 10.  Protein-protein interaction prediction with deep learning: A comprehensive review.

Authors:  Farzan Soleymani; Eric Paquet; Herna Viktor; Wojtek Michalowski; Davide Spinello
Journal:  Comput Struct Biotechnol J       Date:  2022-09-19       Impact factor: 6.155

