Andrew G McDonald, Keith F Tipton, Gavin P Davey.
Abstract
O-linked glycosylation is an important post-translational modification of mucin-type proteins, changes to which are important biomarkers of cancer. For this study of the enzymes of O-glycosylation, we developed a shorthand notation for representing GalNAc-linked oligosaccharides, a method for their graphical interpretation, and a pattern-matching algorithm that generates networks of enzyme-catalysed reactions. Software for generating glycans from the enzyme activities is presented, and is also available online. The degree distributions of the resulting enzyme-reaction networks were found to be Poisson in nature. Simple graph-theoretic measures were used to characterise the resulting reaction networks. In a study of in-silico single-enzyme knockouts of each of 25 enzymes known to be involved in mucin O-glycan biosynthesis, six of them (β-1,4-galactosyltransferase (β4Gal-T4), four glycosyltransferases and one sulfotransferase) were found to play the dominant role in determining O-glycan heterogeneity. In the absence of β4Gal-T4, all Lewis X, sialyl-Lewis X, Lewis Y and Sda/Cad glycoforms were eliminated, in contrast to knockouts of the N-acetylglucosaminyltransferases, which did not affect the relative abundances of O-glycans expressing these epitopes. A set of 244 experimentally determined mucin-type O-glycans obtained from the literature was used to validate the method, which was able to predict up to 98% of the most common structures obtained from human and engineered CHO cell glycoforms.
Year: 2016 PMID: 27054587 PMCID: PMC4824424 DOI: 10.1371/journal.pcbi.1004844
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Symbols used in O-glycan identifiers.
| Symbol | IUPAC Symbol | Definition |
|---|---|---|
| f | Fuc | Fucose |
| K | Kdn | 2-Keto-3-deoxy- |
| L | Gal | Galactose |
| N | Neu5Gc | N-Glycolylneuraminic acid |
| S | Neu5Ac | N-Acetylneuraminic acid |
| V | GalNAc | N-Acetylgalactosamine |
| Y | GlcNAc | N-Acetylglucosamine |
| s | -SO3H | Sulfate |
| a, b | α, β | Anomeric configuration |
| [,] | [,] | Branch delimiters |
| T | | Protein backbone |
Fig 1. Structure identifier example.
The diantennary O-glycan defined by the structure identifier [Lb4Yb6][Lb4Yb3Lb3]VT, with its IUPAC name in linear condensed form.
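As an illustration of how such an identifier can be read by a program, the following minimal Python sketch tokenises a structure identifier and reports its monosaccharide composition. It is not the authors' software: the function names, the regular expression and the label `sulfate` for the symbol `s` are assumptions made here, and only the one-letter symbols listed in the symbol table are recognised.

```python
import re
from collections import Counter

# One-letter symbols mapped to IUPAC abbreviations, following the symbol table.
# "sulfate" is a label chosen for this sketch (the table lists -SO3H).
IUPAC = {"f": "Fuc", "K": "Kdn", "L": "Gal", "N": "Neu5Gc",
         "S": "Neu5Ac", "V": "GalNAc", "Y": "GlcNAc", "s": "sulfate"}

# A residue token: symbol, optional anomer (a/b), optional linkage position.
TOKEN = re.compile(r"([fKLNSVYs])([ab]?)(\d?)")


def brackets_balanced(identifier: str) -> bool:
    """Check that every branch delimiter '[' has a matching ']'."""
    depth = 0
    for ch in identifier:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def composition(identifier: str) -> Counter:
    """Count residues in an identifier such as '[Lb4Yb6][Lb4Yb3Lb3]VT'."""
    if not identifier.endswith("T"):
        raise ValueError("identifier should end with the protein symbol T")
    if not brackets_balanced(identifier):
        raise ValueError("unbalanced branch delimiters")
    # Drop the trailing T (protein backbone) before tokenising.
    return Counter(IUPAC[m.group(1)] for m in TOKEN.finditer(identifier[:-1]))


print(dict(composition("[Lb4Yb6][Lb4Yb3Lb3]VT")))
```

Because the anomer and position are optional in the token pattern, the same function also accepts the shortened identifiers used in the figure captions, such as `[S6][S3L3]VT`.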
The enzymes of O-glycosylation included in this study.
| No. | Abbreviation | EC Number | IUBMB Name | Reaction |
|---|---|---|---|---|
| 1 | | EC 2.4.1.38 | | UDP-L + *[Y*T = UDP + *[Lb4Y*T |
| 2 | ppGalNAc-T | EC 2.4.1.41 | polypeptide | UDP-V + T = UDP + VT |
| 3 | | EC 2.4.1.65 | 3-galactosyl- 4- | GDP-f + *[Lb3Y*T = GDP + *[Lb3[fa4]Y*T |
| 4 | | EC 2.4.1.69 | galactoside 2- | GDP-f + *[Lb3Y*T = GDP + *[[fa2]Lb3Y*T; GDP-f + *[Lb3]VT = GDP + *[[fa2]Lb3]VT |
| 5 | C2Gn-T | EC 2.4.1.102 | | UDP-Y + [Lb3]VT = UDP + [Yb6][Lb3]VT |
| 6 | C1Gal-T1 | EC 2.4.1.122 | glycoprotein- 3- | UDP-L + VT = UDP + [Lb3]VT |
| 7 | | EC 2.4.1.146 | | UDP-Y + [Yb6][Lb3]VT = UDP + [Yb6][Yb3Lb3]VT |
| 8 | | EC 2.4.1.147 | acetylgalactosaminyl- | UDP-Y + VT = UDP + [Yb3]VT |
| 9 | C2/4Gn-T | EC 2.4.1.148 | acetylgalactosaminyl- | UDP-Y + [Yb3]VT = UDP + [Yb6][Yb3]VT |
| 10 | | EC 2.4.1.149 | glucosaminyltransferase | UDP-Y + *[Lb4Y*T = UDP + *[Yb3Lb4Y*T |
| 11 | | EC 2.4.1.152 | 4-galactosyl- 3- | GDP-f + *[Lb4Y*T = GDP + *[Lb4[fa3]Y*T |
| 12 | | EC 2.4.1.- | ( | UDP-L + *[Y*T = UDP + *[Lb3Y*T |
| 13 | ST6Gal-I | EC 2.4.99.1 | | CMP-S + *[Lb4Y*T = CMP + *[Sa6Lb4Y*T |
| 14 | ST6GalNAc-I | EC 2.4.99.3 | transferase | CMP-S + VT = CMP + [Sa6]VT; CMP-S + [Lb3]VT = CMP + [Sa6][Lb3]VT |
| 15 | ST3Gal-I | EC 2.4.99.4 | | CMP-S + *[Lb3]VT = CMP + *[Sa3Lb3]VT |
| 16 | ST3Gal-III/IV | EC 2.4.99.6 | | CMP-S + *[Lb4Y*T = CMP + *[Sa3Lb4Y*T |
| 17 | ST6GalNAc-III/IV | EC 2.4.99.7 | acetylgalactosaminide 6- | CMP-S + [Sa3Lb3]VT = CMP + [Sa6][Sa3Lb3]VT |
| 18 | ST6GlcNAc-I | EC 2.4.99.- | ( | CMP-S + *[Yb3Lb3]VT = CMP + *[Sa6Yb3Lb3]VT |
| 19 | Gcnt2 (I-GnT) | EC 2.4.1.- | ( glucosaminyltransferase) | UDP-Y + *[Lb4Yb3L*T = UDP + *[[Yb6][Lb4Yb3]L*T |
| 20 | CHST4/6 | EC 2.8.2.- | (GlcNAc-6- | PAP-s + *[Y*T = ABP + *[[s6]Y*T |
| 21 | GAL3ST2 | EC 2.8.2.- | ( | PAP-s + *[Lb3*T = ABP + *[[s3]Lb3*T |
| 22 | GAL4ST4 | EC 2.8.2.- | ( | PAP-s + *[Lb4*T = ABP + *[[s3]Lb4*T |
| 23 | | EC 2.4.1.37 | ( glucosaminyltransferase) | UDP-L + *[[fa2]L*T = UDP + *[La3[fa2]L*T |
| 24 | | EC 2.4.1.40 | glycoprotein-fucosylgalactoside galactosaminyltransferase | UDP-V + *[[fa2]L*T = UDP + *[Va3[fa2]L*T |
| 25 | | EC 2.4.1.- | (glycoprotein-sialylgalactoside galactosaminyltransferase) | UDP-V + *[Sa3Lb4*T = UDP + *[Sa3[Vb4]Lb4*T |
Abbreviated forms of enzyme reaction equations, including anomeric linkage types α/β (a/b). Where an EC number is unavailable, the expected sub-subclass is given. T denotes a Ser/Thr O-glycosylation site on the protein. An asterisk symbol acts as a wildcard character, denoting an oligosaccharide of unspecified length. Abbreviations used: PAP-s, 3′-phosphoadenosine-5′-phosphosulfate (PAPS); ABP, adenosine 3′,5′-bisphosphate; other symbols are defined in the text.
a. Can also act on type-1 acceptors.
b. The products of sulfotransferase action (enzymes 20–22) do not block the activities of the other transferases.
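The wildcard reaction equations above lend themselves to simple string pattern matching. The sketch below is one possible reading in Python, not the authors' Perl implementation: it assumes that each `*` in the substrate pattern captures an arbitrary (possibly empty) substring that is copied unchanged into the product pattern, and that every distinct way of matching yields a distinct product. The names `_captures` and `apply_rule` are hypothetical, donor and by-product nucleotides are omitted, and no canonical ordering of branches is attempted.

```python
from typing import Iterator, List, Set


def _captures(parts: List[str], text: str) -> Iterator[List[str]]:
    """Yield every way the wildcards between the literal `parts` can match `text`."""
    head, rest = parts[0], parts[1:]
    if not text.startswith(head):
        return
    if not rest:
        # Last literal: it must consume the remaining text exactly.
        if text == head:
            yield []
        return
    body = text[len(head):]
    # The next wildcard captures body[:i]; the remaining parts must match body[i:].
    for i in range(len(body) + 1):
        for tail in _captures(rest, body[i:]):
            yield [body[:i]] + tail


def apply_rule(substrate: str, product: str, glycan: str) -> Set[str]:
    """Apply one acceptor/product pattern pair to a glycan identifier."""
    sub_parts = substrate.split("*")
    prod_parts = product.split("*")
    if len(sub_parts) != len(prod_parts):
        raise ValueError("substrate and product need the same number of wildcards")
    results = set()
    for caps in _captures(sub_parts, glycan):
        out = prod_parts[0]
        for cap, literal in zip(caps, prod_parts[1:]):
            out += cap + literal
        results.add(out)
    return results


# EC 2.4.1.38 pattern acting on core 2: the terminal GlcNAc is galactosylated.
print(apply_rule("*[Y*T", "*[Lb4Y*T", "[Yb6][Lb3]VT"))
# Core 4 has two terminal GlcNAc residues, so two products are possible.
print(apply_rule("*[Y*T", "*[Lb4Y*T", "[Yb6][Yb3]VT"))
```

A rule without wildcards, such as `VT` giving `[Lb3]VT`, matches only when the whole identifier equals the substrate, which is how the core-forming reactions are restricted to their specific acceptors in this sketch.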
Fig 2. Enzyme simulation.
Output of the Perl script used to mimic the actions of the enzymes of Table 2, for four iterations of the method described in the text. Each in-silico reaction takes the form
Fig 3. Initial stages of O-GalNAc glycosylation.
Following the addition of GalNAc to an unoccupied serine/threonine residue on a polypeptide backbone, addition of Gal or GlcNAc forms cores 1–4, before further extension takes place. The structure identifiers shown are: VT (Tn); [S3L3]VT (ST); [S6]VT (STn); [S6][S3L3]VT (diST); [L3]VT (core 1); [Y6][L3]VT (core 2); [Y3]VT (core 3); [Y6][Y3]VT (core 4). Structures are displayed using CFG symbols. All reactions were predicted from four iterations of the method, with enzymes 1–18 of the model active. For reasons of space, not all reactions are shown.
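The iterative build-up of the core structures can be sketched in a few lines of Python. This is a simplified illustration rather than the published Perl script: it uses only the wildcard-free reactions from the enzyme table (written with anomeric letters, so `[Lb3]VT` corresponds to `[L3]VT` in the caption), starts from the bare site `T`, and the function name `grow` is hypothetical.

```python
# (substrate, product) pairs copied from the wildcard-free rows of the enzyme table.
RULES = [
    ("T", "VT"),                    # ppGalNAc-T: Tn antigen
    ("VT", "[Lb3]VT"),              # C1Gal-T1: core 1
    ("[Lb3]VT", "[Yb6][Lb3]VT"),    # C2Gn-T: core 2
    ("VT", "[Yb3]VT"),              # EC 2.4.1.147: core 3
    ("[Yb3]VT", "[Yb6][Yb3]VT"),    # C2/4Gn-T: core 4
    ("VT", "[Sa6]VT"),              # ST6GalNAc-I: sialyl-Tn
    ("[Lb3]VT", "[Sa6][Lb3]VT"),    # ST6GalNAc-I acting on core 1
]


def grow(seed, rules, iterations):
    """Return ({glycan: iteration of first appearance}, set of reaction edges)."""
    first_seen = {seed: 0}
    edges = set()
    for step in range(1, iterations + 1):
        new = {}
        for glycan in list(first_seen):
            for substrate, product in rules:
                if glycan == substrate:
                    edges.add((substrate, product))
                    if product not in first_seen:
                        new[product] = step
        first_seen.update(new)
    return first_seen, edges


first_seen, edges = grow("T", RULES, iterations=4)
for glycan, step in sorted(first_seen.items(), key=lambda item: item[1]):
    print(step, glycan)
```

With this reduced rule set the network stops growing after the third iteration; the full model keeps expanding because the wildcard rules can act on ever longer chains.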
Fig 4. Simulated O-glycosylation reaction networks.
A. Graphical rendering of a 6-O-sulfated triantennary core-2 O-glycan with structure identifier [S3L4[f3][s6]Y6][[S3L4[f3][s6]Y6][S3L4[f3][s6]Y3]L3]VT. B. Predictive network in which the enzyme simulator is run in reverse, starting from the O-glycan structure identifier in (A), stopping when the final enzyme removes GalNAc from the protein. C. The reaction network generated in the forward (biosynthetic) direction using only the enzymes encountered in panel (B). Pathways are drawn from left to right. In (B) and (C), the structure drawn in panel (A) appears at the points indicated by the blue arrows. Nodes represent distinct O-glycans, and edges (reactions) are colour-coded by the type of monosaccharide being transferred: GalNAc (brown), Gal (yellow), Fuc (red), Neu5Ac (magenta), GlcNAc (blue) and sulfate (orange).
Fig 5. Network properties.
A. The total number of O-glycans produced as a function of iteration number. B. The number of new structures appearing at each iteration number, for a series of networks limited by the maximum number of GlcNAcs incorporated (l), as indicated. C. The degree distribution after 14 iterations. D. Variation of β and γ indices, and network average clustering coefficient (〈C〉) with increasing iteration number.
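The measures named in this caption can be computed directly from an edge list. The sketch below is an assumption-laden illustration: it treats the reaction network as an undirected, connected simple graph and uses the conventional planar-graph definitions β = e/v, γ = e/(3(v − 2)) and α = (e − v + 1)/(2v − 5), which may differ from the exact definitions used in the paper. The function name `network_measures` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def network_measures(edges):
    """Degree distribution, alpha/beta/gamma indices and mean clustering coefficient."""
    adjacency = {}
    for u, w in edges:
        if u == w:
            continue  # ignore self-loops in this sketch
        adjacency.setdefault(u, set()).add(w)
        adjacency.setdefault(w, set()).add(u)
    v = len(adjacency)
    e = sum(len(nbrs) for nbrs in adjacency.values()) // 2
    if v < 3:
        raise ValueError("indices are defined here only for graphs with v >= 3")

    def local_clustering(node):
        nbrs = adjacency[node]
        k = len(nbrs)
        if k < 2:
            return 0.0
        # Count edges among the neighbours of this node.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adjacency[a])
        return 2.0 * links / (k * (k - 1))

    return {
        "alpha": (e - v + 1) / (2 * v - 5),   # assumes one connected component
        "beta": e / v,
        "gamma": e / (3 * (v - 2)),
        "avg_clustering": sum(local_clustering(n) for n in adjacency) / v,
        "degree_distribution": Counter(len(nbrs) for nbrs in adjacency.values()),
    }


# Toy graph: a triangle (a, b, c) with one pendant node d attached to c.
toy = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(network_measures(toy))
```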
Fig 6. In-silico enzyme knockouts.
Effects of in-silico enzyme knockouts on network indices. The effects of single-enzyme knockouts on (A) the β index, (B) α and γ indices and (C) the network average clustering coefficient 〈C〉 are shown; each network in A–C was generated using 15 iterations of the method described in the text; the ordinate axis in each case shows the name of the enzyme being knocked out, while the abscissa shows the difference between the wild type and knockout indices.
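An in-silico knockout amounts to regenerating the network with one enzyme's rules removed and comparing the result with the wild type. The following Python sketch illustrates the idea on a small subset of wildcard-free reactions from the enzyme table. It compares only the number of glycans produced, whereas the figure compares network indices, and the helper names `generate` and `knockout` are hypothetical.

```python
# Wildcard-free reactions grouped by enzyme abbreviation, as (substrate, product) pairs.
RULES_BY_ENZYME = {
    "ppGalNAc-T": [("T", "VT")],
    "C1Gal-T1": [("VT", "[Lb3]VT")],
    "C2Gn-T": [("[Lb3]VT", "[Yb6][Lb3]VT")],
    "ST6GalNAc-I": [("VT", "[Sa6]VT"), ("[Lb3]VT", "[Sa6][Lb3]VT")],
}


def generate(rules_by_enzyme, iterations, seed="T"):
    """Return the set of glycans reachable from `seed` within `iterations` steps."""
    pairs = [pair for rules in rules_by_enzyme.values() for pair in rules]
    glycans = {seed}
    for _ in range(iterations):
        new = {product for substrate, product in pairs if substrate in glycans}
        if new <= glycans:
            break  # nothing new appeared, so the network has stopped growing
        glycans |= new
    return glycans


def knockout(rules_by_enzyme, enzyme):
    """Return a copy of the rule set with one enzyme removed."""
    return {name: rules for name, rules in rules_by_enzyme.items() if name != enzyme}


wild_type = generate(RULES_BY_ENZYME, iterations=4)
lost = {
    enzyme: len(wild_type) - len(generate(knockout(RULES_BY_ENZYME, enzyme), iterations=4))
    for enzyme in RULES_BY_ENZYME
}
print(len(wild_type), lost)
```

Replacing the glycan count with a network index computed on the wild-type and knockout networks would give a difference of the kind plotted on the abscissa of the figure.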
Effects of single-enzyme knockouts on the distributions of common epitopes.
The numbers of O-glycans are expressed as percentages of the total number of glycans obtained in each experiment. See text for details.
| Knockout | Lea | Lex | SLea | SLex | Leb | Ley | H | A | B | Sda | other |
|---|---|---|---|---|---|---|---|---|---|---|---|
| wt | 8.7 | 15.0 | 4.1 | 7.0 | 4.1 | 6.9 | 8.7 | 1.9 | 16.0 | 10.5 | 32.8 |
| | 14.2 | 0.0 | 14.2 | 0.0 | 14.2 | 0.0 | 14.2 | 14.2 | 30.3 | 0.0 | 22.5 |
| | 0.0 | 16.3 | 0.0 | 7.8 | 0.0 | 7.7 | 9.8 | 0.0 | 16.0 | 11.6 | 41.1 |
| | 12.6 | 21.2 | 6.4 | 11.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 17.2 | 41.2 |
| C2Gn-T | 8.7 | 15.0 | 4.1 | 7.0 | 4.1 | 6.9 | 8.7 | 1.9 | 16.0 | 10.5 | 32.8 |
| C1Gal-T1 | 8.7 | 14.5 | 4.6 | 7.4 | 4.6 | 7.2 | 8.7 | 2.4 | 17.2 | 11.3 | 30.2 |
| | 8.2 | 14.2 | 4.3 | 7.3 | 4.3 | 7.1 | 8.2 | 2.3 | 18.2 | 11.1 | 31.0 |
| | 8.7 | 15.2 | 3.8 | 6.8 | 3.8 | 6.7 | 8.7 | 1.6 | 15.5 | 10.1 | 34.1 |
| C2/4Gn-T | 8.5 | 15.0 | 3.8 | 7.0 | 3.8 | 6.8 | 8.5 | 1.6 | 15.7 | 10.4 | 34.0 |
| | 8.7 | 15.0 | 4.1 | 7.0 | 4.1 | 6.9 | 8.7 | 1.9 | 16.0 | 10.5 | 32.8 |
| | 10.6 | 0.0 | 5.3 | 0.0 | 5.3 | 0.0 | 10.6 | 2.6 | 17.5 | 10.1 | 45.5 |
| | 0.0 | 19.5 | 0.0 | 9.7 | 0.0 | 9.7 | 0.0 | 0.0 | 14.8 | 14.0 | 41.5 |
| ST6Gal-I | 9.4 | 16.4 | 4.5 | 7.9 | 4.5 | 7.7 | 9.4 | 2.1 | 18.0 | 11.9 | 27.4 |
| ST3Gal-I | 8.7 | 15.0 | 4.1 | 7.0 | 4.1 | 6.9 | 8.7 | 1.9 | 16.1 | 10.5 | 32.7 |
| ST3Gal-III/IV | 10.8 | 18.4 | 0.0 | 0.0 | 5.3 | 9.0 | 10.8 | 2.6 | 21.0 | 0.0 | 35.7 |
| Gcnt2 | 8.6 | 8.6 | 6.3 | 6.4 | 6.3 | 6.3 | 8.6 | 4.3 | 21.4 | 11.1 | 29.3 |
| | 9.4 | 16.2 | 4.5 | 7.8 | 4.5 | 7.6 | 9.4 | 2.1 | 0.0 | 11.7 | 39.1 |
| | 9.4 | 16.2 | 4.5 | 7.8 | 4.5 | 7.6 | 9.4 | 0.0 | 17.7 | 11.7 | 28.9 |
| | 9.1 | 15.8 | 4.3 | 7.4 | 4.3 | 7.4 | 9.1 | 2.0 | 17.2 | 0.0 | 36.7 |
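Percentages of this kind can be obtained by searching each generated identifier for an epitope motif and dividing by the total number of glycans. The Python sketch below shows one way to do this. The motif strings are assumptions derived here from the product patterns in the enzyme table and the usual epitope definitions (for example, Lewis x as a terminal `Lb4[fa3]Y`); the counting rules actually used in the paper may differ, and the input list is a made-up example, not model output.

```python
# Motifs written in the structure-identifier notation. The leading "[" restricts
# a match to a branch terminus, so sialylated forms are counted separately.
# These strings are assumptions made for this sketch.
EPITOPES = {
    "Lex": "[Lb4[fa3]Y",
    "SLex": "[Sa3Lb4[fa3]Y",
    "Lea": "[Lb3[fa4]Y",
    "SLea": "[Sa3Lb3[fa4]Y",
}


def epitope_percentages(glycans, epitopes=EPITOPES):
    """Percentage of glycans carrying each motif; 'other' carries none of them."""
    glycans = list(glycans)
    if not glycans:
        raise ValueError("no glycans supplied")
    counts = {name: 0 for name in epitopes}
    counts["other"] = 0
    for glycan in glycans:
        hits = [name for name, motif in epitopes.items() if motif in glycan]
        for name in hits:
            counts[name] += 1  # a glycan may carry more than one epitope
        if not hits:
            counts["other"] += 1
    return {name: round(100.0 * n / len(glycans), 1) for name, n in counts.items()}


# Hypothetical input list for illustration only.
example = [
    "[Lb4[fa3]Yb6][Lb3]VT",
    "[Sa3Lb4[fa3]Yb6][Lb3]VT",
    "[Lb3[fa4]Yb6][Lb3]VT",
    "[Lb4Yb6][Lb3]VT",
    "[Lb3]VT",
]
print(epitope_percentages(example))
```

Because one glycan can carry several epitopes, the percentages in a row need not sum to 100, which is consistent with several rows of the table.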
O-Glycans common to more than one published study, with their predictions in silico.
The structure marked NP was not predicted by the model constructed from the unmodified activities of Table 2. The sources of each glycan are given as reference numbers.
| Structure identifier | Sources | Predicted |
|---|---|---|
| [S3L3]VT | [ | ✓ |
| [S6][S3L3]VT | [ | ✓ |
| [S3L4Y6][S3L3]VT | [ | ✓ |
| [S3L4Y6][L3]VT | [ | ✓ |
| [L4Y6][S3L3]VT | [ | ✓ |
| [S6][L3]VT | [ | ✓ |
| [L4Y6][L3]VT | [ | ✓ |
| [L3]VT | [ | ✓ |
| [Y6][S3L3]VT | [ | ✓ |
| [S6]VT | [ | ✓ |
| [L4[f3]Y6][S3L3]VT | [ | ✓ |
| [S3L4[f3]Y6][S3L3]VT | [ | ✓ |
| [S3L4[f3]Y6][L3]VT | [ | ✓ |
| [L4Y3L4Y6][S3L3]VT | [ | ✓ |
| VT | [ | ✓ |
| [Y6][L3]VT | [ | ✓ |
| [S6][Y3]VT | [ | ✓ |
| [S6][L4Y3]VT | [ | ✓ |
| [L4[f3]Y6][L3]VT | [ | ✓ |
| [L4Y3L4Y6][L3]VT | [ | ✓ |
| [Y3]VT | [ | ✓ |
| [Y3L4Y6][L3]VT | [ | ✓ |
| [S6][S6L4Y3]VT | [ | ✓ |
| [S6L4Y6][S3L3]VT | [ | ✓ |
| [S6L4Y3L4Y6][S3L3]VT | [ | ✓ |
| [S3L4[f3]Y6][[S3L4[f3]Y6][S3L4[f3]Y3]L3]VT | [ | ✓ |
| [S3L4[f3]Y6][[S3L4[f3]Y6][L4[f3]Y3]L3]VT | [ | ✓ |
| [S3L4[f3]Y6][[S3L4Y6][S3L4[f3]Y3]L3]VT | [ | ✓ |
| [S3L4[f3]Y6][[L4[f3]Y6][S3L4[f3]Y3]L3]VT | [ | ✓ |
| [S3L4[f3]Y6][[L4[f3]Y6][S3L4Y3]L3]VT | [ | ✓ |
| [S3L4[f3]Y6][[L4Y6][S3L4[f3]Y3]L3]VT | [ | ✓ |
| [S3L4Y6][[S3L4[f3]Y6][S3L4[f3]Y3]L3]VT | [ | ✓ |
| [S3L4Y6][[L4[f3]Y6][S3L4[f3]Y3]L3]VT | [ | ✓ |
| [S3L4Y3]VT | [ | ✓ |
| [S3L4Y3L4Y6][S3L3]VT | [ | ✓ |
| [S3L4Y3L3]VT | [ | ✓ |
| [L4[s6]Y6][S3L3]VT | [ | ✓ |
| [L4[f3]Y6][[S3L4[f3]Y6][S3L4[f3]Y3]L3]VT | [ | ✓ |
| [L4[f3]Y3L4Y6][S3L3]VT | [ | ✓ |
| [L4Y6][[L4Y6][L4Y3]L3]VT | [ | ✓ |
| [L4Y3]VT | [ | ✓ |
| [L4Y3L4[f3]Y6][L3]VT | [ | NP |
| [L4Y3L4Y3]VT | [ | ✓ |
| [L4Y3L4Y3L4Y6][L3]VT | [ | ✓ |
| [L4Y3L3]VT | [ | ✓ |
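The validation step reduces to a set comparison between observed and predicted identifiers. The short Python sketch below shows the calculation on three identifiers taken from the table. The function name `coverage` is hypothetical, and the way the figure in the abstract was computed is not stated here, so the comparison with it is only indicative.

```python
def coverage(observed, predicted):
    """Percentage of observed structure identifiers that the model also predicts."""
    observed = set(observed)
    if not observed:
        raise ValueError("no observed structures supplied")
    hits = observed & set(predicted)
    return 100.0 * len(hits) / len(observed)


# Three identifiers from the table; the last one is the structure marked NP.
observed = ["[S3L3]VT", "[L3]VT", "[L4Y3L4[f3]Y6][L3]VT"]
predicted = ["[S3L3]VT", "[L3]VT"]
print(round(coverage(observed, predicted), 1))

# The table lists 45 structures, of which one is marked NP.
print(round(100.0 * 44 / 45, 1))
```

Applied to the whole table, 44 of the 45 listed structures are predicted, about 97.8%, which agrees with the "up to 98%" quoted in the abstract.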