
Spec2Vec: Improved mass spectral similarity scoring through learning of structural relationships.

Florian Huber, Lars Ridder, Stefan Verhoeven, Jurriaan H Spaaks, Faruk Diblen, Simon Rogers, Justin J J van der Hooft.

Abstract

Spectral similarity is used as a proxy for structural similarity in many tandem mass spectrometry (MS/MS) based metabolomics analyses, such as library matching and molecular networking. Although weaknesses in the relationship between spectral similarity scores and true structural similarities have been described, little development of alternative scores has been undertaken. Here, we introduce Spec2Vec, a novel spectral similarity score inspired by a natural language processing algorithm, Word2Vec. Spec2Vec learns fragmental relationships within a large set of spectral data to derive abstract spectral embeddings that can be used to assess spectral similarities. Using data derived from GNPS MS/MS libraries, including spectra for nearly 13,000 unique molecules, we show that Spec2Vec scores correlate better with structural similarity than cosine-based scores. We demonstrate the advantages of Spec2Vec in library matching and molecular networking. Spec2Vec is also computationally more scalable, allowing structural analogue searches in large databases within seconds.
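The embedding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The peak-to-word format, the two-decimal rounding, the intensity weighting, and the tiny hand-made `toy_vectors` table (standing in for a trained Word2Vec model) are all assumptions made for illustration. The sketch turns each spectrum into a "document" of peak words, sums the word vectors into one spectrum embedding, and compares embeddings by cosine similarity.

```python
import math


def spectrum_to_document(peaks, decimals=2):
    """Turn (m/z, intensity) peaks into 'words' such as 'peak@105.03'."""
    return [f"peak@{mz:.{decimals}f}" for mz, _ in peaks]


def embed(peaks, word_vectors, decimals=2):
    """Intensity-weighted sum of word vectors for one spectrum (assumed scheme)."""
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    words = spectrum_to_document(peaks, decimals)
    for word, (_, intensity) in zip(words, peaks):
        vector = word_vectors.get(word)
        if vector is None:
            continue  # word unknown to the (hypothetical) trained model
        for i, value in enumerate(vector):
            total[i] += intensity * value
    return total


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical 2-D vectors; a real model would learn these from many spectra.
toy_vectors = {
    "peak@77.04": [0.9, 0.1],
    "peak@91.05": [0.85, 0.15],
    "peak@105.03": [0.8, 0.2],
    "peak@60.02": [0.1, 0.9],
}

spectrum_a = [(77.04, 0.4), (105.03, 1.0)]
spectrum_b = [(91.05, 1.0), (77.04, 0.3)]
spectrum_c = [(60.02, 1.0)]

sim_ab = cosine(embed(spectrum_a, toy_vectors), embed(spectrum_b, toy_vectors))
sim_ac = cosine(embed(spectrum_a, toy_vectors), embed(spectrum_c, toy_vectors))
print(f"a vs b: {sim_ab:.3f}, a vs c: {sim_ac:.3f}")
```

In this toy setup, spectra a and b share only one peak yet score as highly similar because their other peaks have nearby vectors. A score based purely on matching peaks would not capture that kind of learned relationship between fragments.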


Year:  2021        PMID: 33591968      PMCID: PMC7909622          DOI: 10.1371/journal.pcbi.1008724

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


