Joshua Klein (1), Luis Carvalho (1,2), Joseph Zaia (1,3)
1. Program for Bioinformatics, Boston University, Boston, MA, USA.
2. Department of Math and Statistics, Boston University, Boston, MA, USA.
3. Department of Biochemistry, Boston University, Boston, MA, USA.
Abstract
Motivation: Glycosylation is one of the most heterogeneous and complex protein post-translational modifications. Liquid chromatography coupled to mass spectrometry (LC-MS) is a common high-throughput method for analyzing complex biological samples. Accurate study of glycans requires high-resolution mass spectrometry. Mass spectrometry data contain intricate sub-structures that encode mass and abundance and must undergo several transformations before they can be used to identify biological molecules, so automated tools are needed to analyze samples in a high-throughput setting. Existing tools for interpreting the resulting data do not take related glycans into account when evaluating individual observations, which limits their sensitivity.

Results: We developed an algorithm for assigning glycan compositions from LC-MS data by exploring biosynthetic network relationships among glycans. Our algorithm optimizes a set of likelihood scoring functions based on glycan chemical properties, but uses network Laplacian regularization and, optionally, prior information about expected glycan families to smooth the likelihood and thus achieve a consistent and more representative solution. Our method identified as many or more glycan compositions than previous approaches and demonstrated greater sensitivity with regularization. Our network definition was tailored to N-glycans, but the method may be applied to glycomics data from other glycan families, such as O-glycans or heparan sulfate, where the relationships between compositions can be expressed as a graph.

Availability and implementation: Built executable: http://www.bumc.bu.edu/msr/glycresoft/ and source code: https://github.com/BostonUniversityCBMS/glycresoft.

Supplementary information: Supplementary data are available at Bioinformatics online.
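To illustrate the general idea of network Laplacian regularization mentioned in the Results, the following is a minimal sketch and not the GlycReSoft implementation. It assumes each glycan composition has a single observed score s_i and that related compositions are joined by edges, and it minimizes sum_i (t_i - s_i)^2 + lam * sum over edges (t_i - t_j)^2 with Gauss-Seidel updates. The function name `laplacian_smooth`, the parameter `lam`, and the composition labels and scores are hypothetical and chosen only for illustration.

```python
def laplacian_smooth(scores, edges, lam=1.0, n_iter=200):
    """Smooth per-node scores over a graph with a Laplacian penalty.

    Minimizes sum_i (t_i - s_i)^2 + lam * sum_{(i,j) in edges} (t_i - t_j)^2
    using Gauss-Seidel iteration. This is a generic sketch, not the
    authors' scoring model.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {node: set() for node in scores}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    smoothed = dict(scores)
    for _ in range(n_iter):
        for node in smoothed:
            # Setting the gradient to zero for one node gives:
            # t_i = (s_i + lam * sum of neighbor values) / (1 + lam * degree_i)
            total = sum(smoothed[n] for n in neighbors[node])
            degree = len(neighbors[node])
            smoothed[node] = (scores[node] + lam * total) / (1.0 + lam * degree)
    return smoothed


# Hypothetical compositions that differ by one residue form a chain.
# The middle composition has no direct evidence (score 0) but is
# supported by its two well-scored neighbors.
observed = {
    "Hex5HexNAc4": 10.0,
    "Hex5HexNAc4NeuAc1": 0.0,
    "Hex5HexNAc4NeuAc2": 10.0,
}
links = [
    ("Hex5HexNAc4", "Hex5HexNAc4NeuAc1"),
    ("Hex5HexNAc4NeuAc1", "Hex5HexNAc4NeuAc2"),
]
result = laplacian_smooth(observed, links, lam=1.0)
```

In this toy chain the weakly observed middle composition borrows support from its neighbors, which is the intuition behind the sensitivity gain described above. Larger values of `lam` pull neighboring scores closer together, while `lam = 0` returns the observed scores unchanged.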