Alexander A Merleev, Dayoung Park, Yixuan Xie, Muchena J Kailemia, Gege Xu, L Renee Ruhaak, Kyoungmi Kim, Qiuting Hong, Qiongyu Li, Forum Patel, Yu-Jui Yvonne Wan, Alina I Marusina, Iannis E Adamopoulos, Nelvish N Lal, Anupum Mitra, Stephanie T Le, Michiko Shimoda, Guillaume Luxardi, Carlito B Lebrilla, Emanual Maverakis.
Abstract
Alterations in the human glycome have been associated with cancer and autoimmunity. Thus, constructing a site-specific map of the human glycome for biomarker research and discovery has been a highly sought-after objective. However, due to analytical barriers, comprehensive site-specific glycoprofiling is difficult to perform. To develop a platform to detect easily quantifiable, site-specific, disease-associated glycan alterations for clinical applications, we have adapted the multiple reaction monitoring mass spectrometry method for use in glycan biomarker research. The adaptations allow for highly precise site-specific glycan monitoring with minimal sample preparation. Using this technique, we successfully mapped out the relative abundances of the 159 most common glycopeptides in the plasma of 97 healthy volunteers. This plasma glycome map revealed 796 significant (FDR < 0.05) site-specific inter-protein and intra-protein glycan associations, of which the vast majority were previously unknown. Since age and gender are relevant covariates in biomarker research, these variables were also characterized. Thirteen glycopeptides were found to be associated with gender and 41 with age. Using just five age-associated glycopeptides, a highly accurate age prediction model was constructed and validated (r² = 0.62 ± 0.12). The human plasma site-specific glycan map described herein has utility in applications ranging from glycan biomarker research and discovery to the development of novel glycan-altering interventions.
Year: 2020 PMID: 33060657 PMCID: PMC7567094 DOI: 10.1038/s41598-020-73588-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Site-specific map of the human serum glycome. The major glycans occurring at the glycosylation sites of the 17 most common serum glycoproteins are presented. When present, the sites of glycosylation (first of the two numbers) are as indicated in UNIPROT. When there is no position indicated, the glycosylation occurs at the immunoglobulin constant heavy chain domain 2 (CH2)-84.4 glycosylation site (IMGT numbering system). Glycan structures are presented as a four-digit code in which the first numeral represents the total number of mannose and galactose residues combined, the second represents the total number of N-acetylglucosamine residues, the third corresponds to the number of fucose residues, and the final numeral is the number of sialic acid moieties. On the right side of each diagram is the log of the relative abundances of the glycans, presented as box-and-whisker plots. The left and right bars connected to each box indicate the boundaries of the normal distribution, and the left and right box edges mark the first and third quartile boundaries within each distribution. The bold line within the box indicates the median value of the distribution. On the left of each diagram are the squares of the intra-protein Pearson Product Moment Correlation Coefficients (PPMCCs) for connected glycan pairs.
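The four-digit composition code described in the Figure 1 caption can be unpacked mechanically. The following is a minimal Python sketch, assuming each residue count is a single digit (the only case the four-digit scheme can express). The names `GlycanComposition` and `parse_glycan_code` are illustrative and do not come from the paper.

```python
from typing import NamedTuple


class GlycanComposition(NamedTuple):
    """Residue counts encoded by a four-digit glycan code (hypothetical type)."""
    hexose: int       # mannose and galactose residues combined
    hexnac: int       # N-acetylglucosamine residues
    fucose: int       # fucose residues
    sialic_acid: int  # sialic acid moieties


def parse_glycan_code(code: str) -> GlycanComposition:
    """Split a code such as '5411' into its four residue counts."""
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected exactly four digits, got {code!r}")
    return GlycanComposition(*(int(ch) for ch in code))


if __name__ == "__main__":
    # 5411 is one of the glycans named in the Figure 3 caption.
    print(parse_glycan_code("5411"))
```

Read this way, glycan 5411 carries five mannose or galactose residues, four N-acetylglucosamine residues, one fucose, and one sialic acid.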
Figure 2. Intra- and inter-protein glycan associations. Log relative abundances for individual glycan pairs were graphed, and correlations were determined using Pearson Product Moment Correlation Coefficients (PPMCCs), abbreviated as “r”. (A–D) show intra-protein correlations. (E) shows inter-protein glycan correlations. (F) shows protein-glycan correlations. A comprehensive list of all pairwise correlations between monitored analytes can be found in Data File S1 (n = 97).
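The pairwise statistic named in the Figure 2 caption, Pearson's r computed on log relative abundances, can be sketched as follows. This is a generic illustration rather than the authors' analysis code: the function names and the abundance values are made up, and base-10 logarithms are an assumption (the caption does not state the base, and r does not depend on it, since changing the base only rescales the values).

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient of two samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def log_abundance_r(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson's r between log10-transformed relative abundances."""
    return pearson_r([math.log10(v) for v in a], [math.log10(v) for v in b])


if __name__ == "__main__":
    # Made-up relative abundances for two glycans across four samples.
    glycan_a = [0.10, 0.20, 0.40, 0.80]
    glycan_b = [0.03, 0.06, 0.12, 0.24]  # proportional to glycan_a
    r = log_abundance_r(glycan_a, glycan_b)
    print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

Squaring the result gives the kind of r² value reported for connected glycan pairs in Figure 1.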
Figure 3. Effect of age and gender on glycosylation. (A) Log relative glycan abundance versus age. Examples of glycoforms significantly altered by age (a full list can be found in Table S4). Of note, IgG1 and IgG2 share several age-associated glycan modifications. Also, glycan 5411 is negatively correlated with age when present on IgG1, IgG2, and position 209 of IgM. IgM also declines with increasing age (P = 0.0011). (B) Representative site-specific glycosylations and proteins that are differentially expressed with respect to gender (a full list can be found in Table S5). The upper and lower bars connected to each box indicate the boundaries of the normal distribution, and the upper and lower box edges mark the first and third quartile boundaries within each distribution. The bold line within the box indicates the median value of the distribution. The Y-axis represents log relative abundance, or log protein concentration where indicated.
Figure 4. Age prediction models. (A) The graph represents the performance of a linear regression model for age prediction. The model was constructed from 5 different glycopeptides (IgG1 g:3510, IgG1 g:5410, IgM p:209 g:5411, IgM J chain g:5412, Hp p:241 g:7602). Diagnostic plots for the model are presented to its right (residuals vs fitted, testing for linearity; normal Q-Q, to assess the distribution of the residuals; scale-location, to assess the homoscedasticity of the data; and residuals vs leverage, to check for overly influential cases). (B) Linear regression model composed of six glycopeptides (IgG1 g:3510, IgG1 g:5410, IgG2 g:3410, IgM p:209 g:5411, IgM J chain g:5412, Hp p:241 g:7602) and 1 serum protein, IgG3. Model diagnostics are presented to the right (model performance parameters for age prediction models can be found in Table S7).
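The general shape of such a model, a multiple linear regression from log glycopeptide abundances to age, can be sketched with ordinary least squares. This is an illustration under stated assumptions, not the authors' pipeline: it uses two synthetic, noise-free predictors instead of the five glycopeptides, all function names are hypothetical, and it omits the validation step behind the reported r².

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(features: Sequence[Sequence[float]], target: Sequence[float]) -> List[float]:
    """Return [intercept, coef_1, ...] from the normal equations."""
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef: Sequence[float], f: Sequence[float]) -> float:
    """Predicted target for one feature vector."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], f))


def r_squared(coef, features, target) -> float:
    """Coefficient of determination on the given data."""
    mean = sum(target) / len(target)
    ss_res = sum((t - predict(coef, f)) ** 2 for f, t in zip(features, target))
    ss_tot = sum((t - mean) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic data generated from age = 30 + 10*x1 - 5*x2 (no noise).
    features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    ages = [30.0, 40.0, 25.0, 35.0, 45.0]
    coef = fit_ols(features, ages)
    print([round(c, 6) for c in coef], round(r_squared(coef, features, ages), 6))
```

On real data the fit would be noisy, and r² should be estimated on held-out samples rather than on the training data as done here.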