Geometricus represents protein structures as shape-mers derived from moment invariants.

Janani Durairaj, Mehmet Akdel, Dick de Ridder, Aalt D J van Dijk.

Abstract

MOTIVATION: As the number of experimentally solved protein structures rises, it becomes increasingly appealing to use structural information for predictive tasks involving proteins. Because proteins vary widely in size, fold and topology, an attractive approach is to embed protein structures into fixed-length vectors, which can then be used in machine learning algorithms aimed at predicting and understanding functional and physical properties. Many existing embedding approaches are alignment-based, which makes them both time-consuming and ineffective for distantly related proteins. Library- and model-based approaches, on the other hand, depend on a small library of fragments or on a trained model, either of which may not generalize well.
RESULTS: We present Geometricus, a novel and universally applicable approach to embedding proteins in a fixed-dimensional space. The approach is fast, accurate, and interpretable. Geometricus uses a set of 3D moment invariants to discretize fragments of protein structures into shape-mers, which are then counted to describe the full structure as a vector of counts. We demonstrate the applicability of this approach in various tasks, ranging from fast structure similarity search to unsupervised clustering and structure classification, both across proteins from different superfamilies and within the same family.
AVAILABILITY AND IMPLEMENTATION: Python code available at https://git.wur.nl/durai001/geometricus.
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
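
The pipeline described under RESULTS (fragment the structure, compute rotation-invariant moments per fragment, discretize them into shape-mers, count) can be illustrated with a minimal sketch. This is not the Geometricus implementation or its API. The three invariants used here (characteristic-polynomial coefficients of the second-order central moment matrix), the log binning, the fragment length `k`, the `resolution` parameter and all function names are assumptions chosen for illustration, and the coordinates are synthetic random walks rather than real protein backbones.

```python
from collections import Counter

import numpy as np


def moment_invariants(fragment):
    """Return three rotation- and translation-invariant descriptors of an (n, 3) point set.

    Assumption: these simple second-order invariants stand in for the
    3D moment invariants mentioned in the abstract.
    """
    centered = fragment - fragment.mean(axis=0)      # removes translation
    second = centered.T @ centered / len(fragment)   # 3x3 second-order central moments
    trace = np.trace(second)
    # Characteristic-polynomial coefficients of a symmetric matrix
    # do not change when the point set is rotated.
    return np.array([
        trace,
        0.5 * (trace**2 - np.trace(second @ second)),
        np.linalg.det(second),
    ])


def shape_mer(fragment, resolution=2.0):
    """Discretize the invariants into a hashable integer tuple (assumed log binning)."""
    scaled = resolution * np.log1p(np.abs(moment_invariants(fragment)))
    return tuple(int(v) for v in np.floor(scaled))


def count_shape_mers(coords, k=8, resolution=2.0):
    """Count the shape-mers of all overlapping k-residue fragments."""
    return Counter(
        shape_mer(coords[i:i + k], resolution)
        for i in range(len(coords) - k + 1)
    )


def embed(counts, vocabulary):
    """Lay the counts out as a fixed-length vector over a shared shape-mer vocabulary."""
    return np.array([counts.get(s, 0) for s in vocabulary])


# Synthetic random-walk "backbones" stand in for real alpha-carbon coordinates.
rng = np.random.default_rng(0)
chain_a = np.cumsum(rng.normal(size=(60, 3)), axis=0)
chain_b = np.cumsum(rng.normal(size=(45, 3)), axis=0)

counts_a = count_shape_mers(chain_a)
counts_b = count_shape_mers(chain_b)
vocabulary = sorted(set(counts_a) | set(counts_b))
vec_a = embed(counts_a, vocabulary)
vec_b = embed(counts_b, vocabulary)
print(len(vocabulary), vec_a.shape, vec_b.shape)
```

Although the two chains differ in length, both end up as vectors over the same vocabulary, which is what makes them usable as inputs to similarity search, clustering or classification. A coarser `resolution` merges more fragments into the same shape-mer, and a finer one separates them.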

Year: 2020; PMID: 33381814; DOI: 10.1093/bioinformatics/btaa839

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937

