Tiago Oliveira1, Mingfeng Zhang2, Eun Ji Joo2, Hisham Abdel-Azim3, Chun-Wei Chen2, Lu Yang2, Chih-Hsing Chou3, Xi Qin2, Jianjun Chen2, Kathirvel Alagesan1, Andreia Almeida1, Francis Jacob4, Nicolle H Packer1,5,6, Mark von Itzstein1, Nora Heisterkamp2, Daniel Kolarich1,6.
1. Institute for Glycomics, Griffith University, Gold Coast Campus, QLD, Australia.
2. Department of Systems Biology, Beckman Research Institute City of Hope, Monrovia, CA, USA.
3. Division of Hematology/Oncology and Bone Marrow Transplant, Children's Hospital Los Angeles, Los Angeles, CA, USA.
4. Glyco-Oncology, Ovarian Cancer Research, Department of Biomedicine, University Hospital Basel and University of Basel, Basel, Switzerland.
5. Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW, Australia.
6. ARC Centre of Excellence for Nanoscale BioPhotonics, Griffith University, QLD and Macquarie University, NSW, Australia.
Each year almost half a million new patients are diagnosed with leukemia (Globocan2020) 1. Acute leukemias, including acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML), belong to the group of more aggressive leukemias characterized by a rapid proliferation of malignant hematopoietic cells. ALL involving B-cell precursors (BCP-ALL) represents the most common type of cancer in children, and is also frequently diagnosed in adults 2. BCP-ALL can be further subdivided into 23 categories based on molecular characteristics 3. One subclass, MLL-r, involves rearrangements of the mixed‑lineage leukemia (MLL) gene located on chromosome 11q23. The t(4;11)(q21;q23)/MLL-AFF1(AF4) is the most frequent translocation involving the MLL gene, but 94 other genes can also be involved 4.
More than 70% of infants with BCP-ALL have MLL involvement. Although in general treatment options for pediatric BCP-ALL have significantly improved and overall survival rates in children exceed 90% 5, this specific subtype has among the lowest overall survival rates in adults as well as children 3, 6. Furthermore, newer therapies such as the infusion of autologous CAR-T cells directed against the CD19 antigen are less effective in MLL-r B-ALL due to lineage switch 7-9. Thus, discovering possible leukemia-specific antigens in MLL-r leukemia remains an important goal in the development of future therapeutics.
While proteins have been primarily viewed as treatment targets, it is likely that other targets exist based on yet-to-be-discovered significant cell surface differences. For example, and importantly, potential differences at the level of glycosylation as a major form of post-translational protein modification have never been explored. This is of particular significance as glycosylation affects virtually all cell surface receptors as well as the extracellular matrix, both of which are well known to play a major role in supporting cancer cell survival 10.
Together with other glycoconjugates (proteoglycans and glycolipids), glycoproteins form the glycocalyx, a complex layer that surrounds every living cell. The glycocalyx structure is highly cell-type specific and is subject to dynamic changes, in particular as a consequence of malignant transformation 11-13. Such cell-surface alterations are important because they impact cellular recognition processes, cell behavior and immune responses 14-16.
However, these alterations are difficult, if not impossible, to fully determine solely using genomic approaches as they are multi-layered and are partly the outcome of genome-independent regulation. This makes a combined multi‑omics approach the best and only option to capture a holistic and detailed picture of the glycocalyx structure. An in-depth multi‑omics data set of this type, however, does not exist for primary MLL‑r patient cells. In this unique study, we have undertaken such an integrated multi‑omics investigation of primary MLL‑r and control normal precursor B cells from bone marrow. As outlined in the workflow, we have undertaken the first combined transcriptomics study that incorporates both glycomics and proteomics analyses for MLL-r primary patient cells. Our results provide the first reported evidence that MLL-r cells have undergone a radical transformation of the protein glycocalyx when compared with healthy precursor B-cells. These data provide an exciting advance towards the development of novel therapies targeting this low-survival leukemia subtype.
Methods
Reagents
Water was purified using a Milli Q-8 system (Merck KGaA, Darmstadt, Germany). High quality-grade reagents were purchased from Sigma-Aldrich (St. Louis, MO, USA), unless otherwise mentioned. PNGase F (Cat#P0705) was from New England Biolabs, 500,000 U/mL. Sequencing-grade Trypsin was from Roche (Cat#11047841001). High-grade chloroform and methanol used to perform protein precipitation were from Merck (Cat#1024442500 and 1060072500, respectively).
Ethics statement
All human specimen collection protocols were reviewed and approved by Institutional Review Boards. All methods were performed in accordance with the relevant guidelines and regulations, and collections complied with ethical practices and Institutional Review Board approvals.
Pilot glycan isolation study
Primary leukemia samples typically undergo some processing. To determine if such procedures would significantly affect glycan recovery, we tested this on biological duplicate samples from the MLL-r B-cell precursor ALL cell line RS4;11 (ATCC #CRL-1873; established from patient bone marrow (BM) leukemia cells with an MLL-AF4 fusion protein). We compared RBC lysis, Ficoll density gradient/mononuclear cell isolation, flow sorting, 10% DMSO freezing medium/freeze/thaw, and samples receiving no further treatment (). For comparison, we also included samples grown on OP9 stromal cells. The relative intensities of recovered glycans were comparable between the different procedures (). Our analytical approach also allows discrimination between closely related sialylated glycan structures ().
MLL-r and normal control precursor B cell isolation
Starting materials for isolation of control healthy precursor B cells were four BM samples (R7B1, R7B3, R7B6 and R7B11) from different donors depleted of CD34+ stem cells. R7B11 was not further processed, whereas R7B6 was enriched in CD19+ B cells, and R7B1 and R7B3 were enriched for both B‑ and T-cells [CD19 and TCRα/β]. To isolate normal CD19+CD10+ precursor B-cells, around 2×10⁹ of such viably frozen normal bone marrow cells from each sample were applied to EasySep™ Release Human CD19 Positive Selection Kit (STEMCELL Technologies, Vancouver, BC, Canada, Cat#17754) and EasySep™ Human CD10 Positive Selection Kit (STEMCELL Technologies, Cat#18358) columns.
MLL samples in this study were BM0 [(11q23), relapse sample, 98% blasts], BM37 [46XY, t(9;11), diagnosis, 98% blasts], and BM41 [60% blasts, MLL rearranged (11q23; t(9;11)), 46XY]. Ficoll‑Paque Plus (GE Healthcare, Cat#17-1440-02) centrifugation was used to remove red blood cells, according to the manufacturer's instructions, and viable cells were washed before being frozen with DMSO. Viably frozen cells were thawed and washed twice with ice-cold DPBS (500×g, 4 min, 4 °C). Each sample, consisting of 4-6×10⁶ total cells, was divided into two fractions, one for proteomic/glycomic analyses and the other for RNA sequencing.
Sample preparation for (glyco)protein extraction
Frozen cell pellets of normal bone marrow (n = 4) and MLL-r (n = 3) cells were processed according to the same protocol: 2-3 million cells per sample were lysed in cell lysis buffer composed of Dulbecco's PBS (Sigma, Cat#D8537) containing 1% Triton X-100 (Sigma, Cat#T8787) and Protease Inhibitors (Sigma, Cat#P8340). Briefly, 400 µL of the cell lysis buffer were added to each sample, and tubes were kept on ice for 20 min. Samples were then homogenized using a T‑10 ULTRA-TURRAX® (IKA, Cat#0003737000) in 3 × 3 sec cycles, with 30 sec on ice between cycles. Samples were kept on ice for 10 more minutes, after which sonication was performed for 30 sec using a water-bath sonicator to shred DNA. Samples were centrifuged for 15 min at 15,000×g, 4 °C. Supernatants were collected into new tubes and all resulting aliquots were stored at -80 °C prior to use. After lysis, all samples were quantified in triplicate using a Pierce™ BCA Protein Assay Kit (Thermo Scientific, Cat#23225).
Protein content extracted from a similar number of cells from samples R7B11 and BM41 was on average approximately 25 µg compared to 115 µg or more for the other 5 samples (). To ensure high-quality proteomics and glycomics preparations, these two samples were excluded from the final preparations. The protein lysates of the other five samples were used as described in each individual method and as presented in .
RNA seq and RNA expression analysis
RNA was extracted from frozen cell pellets using Trizol (Thermo Fisher 315596026) and further purified using an RNeasy Mini kit (Qiagen #74104). Quality control analysis was done using Bioanalyzer RNA Nano and D1000 chips. Nucleic acid concentrations were determined by Qubit. The mRNA library was prepared using an Illumina TruSeq Stranded mRNA High Throughput Prep kit and samples were sequenced using an Illumina NextSeq 500 Mid-Output Sequencing Reagent kit (v2, 150 cycles), 132 M reads, on an Illumina NextSeq 500 instrument. RNA-seq results were aligned twice to the human genome. One analysis used the GRCh37 annotation file (around 50,000 genes, including pseudogenes and non-coding RNA), whereas the second alignment, the results of which were used for the proteomic comparisons presented here, used the genome assembly GRCh38.p13 (GCA_000001405.28) genome annotation files (around 19,862 mainly protein-encoding genes but not including IGHM). For the latter analysis, reads were quality-checked and processed using the nf-core/rnaseq analysis workflow v1.3 pipeline consisting of Nextflow v19.04.1, FastQC v0.11.8, Cutadapt v2.4, Trim Galore! v0.5.0, STAR vSTAR_2.6.1d, HISAT2 v2.1.0, Picard MarkDuplicates v2.18.27, Samtools v1.9, featureCounts v1.6.4, StringTie v1.3.5, Preseq v2.0.3, deepTools v3.2.0, RSeQC v3.0.0 and MultiQC v1.7 software. Raw count results were analyzed using edgeR to determine significantly regulated genes (criteria: fold change ≥2; p < 0.05; low expression filter rpkm <1.0). A total of 3936 genes met these criteria and were differentially expressed in this comparison, with 1883 genes up-regulated and 2053 genes down-regulated. The normalized RNA counts were plotted using GraphPad Prism (v8.4.3). Diagnosis leukemia samples including 70 MLL-r cases and 74 normal bone marrow controls 17 from the MiLE study (GSE13159) were compared for expression of GAG synthesis enzymes using online tools 18.
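The selection criteria stated above (fold change ≥2, p < 0.05, low-expression filter at 1.0 RPKM) can be illustrated with a small filter. This is a minimal sketch and not the edgeR workflow itself: the function names, the example numbers, and the choice to apply the RPKM filter to the higher of the two group means are assumptions made here for illustration; only the thresholds and the standard RPKM definition are taken as given.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def classify(log2_fc, p_value, rpkm_mllr, rpkm_control,
             min_fold=2.0, max_p=0.05, min_rpkm=1.0):
    """Label one gene as 'up', 'down', 'ns' or 'filtered'.

    log2_fc is log2(MLL-r / control) as reported by the statistics tool.
    Assumption: a gene is dropped as low-expressed only when BOTH group
    means fall below min_rpkm.
    """
    if max(rpkm_mllr, rpkm_control) < min_rpkm:
        return "filtered"
    if p_value >= max_p or abs(log2_fc) < math.log2(min_fold):
        return "ns"
    return "up" if log2_fc > 0 else "down"


# Hypothetical values, for illustration only.
print(rpkm(500, 2000, 25_000_000))
print(classify(3.1, 0.001, 12.0, 1.5))
print(classify(5.0, 0.001, 0.4, 0.1))
```

A fold change of ≥2 corresponds to |log2FC| ≥ 1, which is why the sketch compares against `math.log2(min_fold)`; the reported totals are also internally consistent (1883 up-regulated + 2053 down-regulated = 3936).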
N- and O-glycan release for glycomics analyses
50 µg of (glyco)proteins were reduced by adding a volume of 500 mM of Dithiothreitol to each sample, to a final concentration of 20 mM, at 50 °C for 1 hour. After cooling, samples were then subjected to alkylation using 40 mM Iodoacetamide in the dark at room temperature. The reaction was quenched by adding another 20 mM of Dithiothreitol and incubating for 5 minutes.
Proteins were precipitated using the Chloroform:Methanol:Water separation as described previously 19. The resulting protein pellet was left to air dry for 10 min. 5 µL of 8 M urea were added to each sample to resuspend the protein pellet using intensive vortexing. The final concentration of urea was adjusted to 4 M by adding 5 µL of pure water (MQ-H2O).
The urea-dissolved proteins were dot blotted onto a PVDF membrane (Immobilon-P, 0.44 µm pore, Merck Millipore), and N- and O-glycans were released as described previously 20 (see also Supplementary Methods for details). Before mass spectrometry analyses, N- and O-glycans were carbon cleaned off‑line using porous-graphitized carbon (PGC) material packed on top of C18 ZipTips to avoid any potential contaminants and then stored at -20 °C until their PGC-LC-ESI-MS/MS analyses.
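The dilution steps above follow the usual C1·V1 = C2·V2 relationship. The sketch below works through two of them: the volume of 500 mM Dithiothreitol stock needed to reach 20 mM, and the urea concentration after adding 5 µL of water to 5 µL of 8 M urea. The 48 µL sample volume is a hypothetical value chosen for illustration (the text does not state the sample volume), and the function names are made up here.

```python
def stock_volume_to_add(sample_ul, stock_conc, final_conc):
    """Volume of stock to add to a sample to reach final_conc.

    Solves stock_conc * v = final_conc * (sample_ul + v) for v, i.e. it
    accounts for the volume increase caused by adding the stock itself.
    Concentrations must share one unit (here mM).
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final concentration must be between 0 and the stock concentration")
    return final_conc * sample_ul / (stock_conc - final_conc)


def diluted_concentration(conc, volume_ul, diluent_ul):
    """Concentration after adding diluent_ul of solvent to volume_ul."""
    return conc * volume_ul / (volume_ul + diluent_ul)


# Hypothetical 48 µL sample: 2 µL of 500 mM stock gives 20 mM in 50 µL.
print(stock_volume_to_add(48.0, 500.0, 20.0))
# 5 µL of 8 M urea plus 5 µL of water gives 4 M, as stated in the text.
print(diluted_concentration(8.0, 5.0, 5.0))
```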
PGC-nanoLC-ESI-MS/MS glycomics
The N- and O-glycome was determined using the PGC-nanoLC-ESI-MS/MS glycomics technology as described previously 20-22 (see also Supplementary Methods for details). Previous studies relating to glycosylation and BCP-ALL focused on the identification of particular glycoconjugates or glycan traits, such as the acetylation of sialic acid, 9‑O-acetyl‑Neu5Ac 23, 24. We note that the technologies used here do not allow routine evaluation of the level of O-acetylation of sialic acids, as these labile modifications are lost due to the buffers used during the PGC-LC-ESI-MS/MS analyses. In addition, because of limited cell numbers, this study did not examine the samples for GAGs. All details are described following the respective MIRAGE (Minimum Information Required for A Glycomics Experiment) guidelines in the Supplementary Material 25-28.
Glycan structure determination and relative quantitation
Glycan structures were determined as previously described 21, 22 (see also Supplementary Methods for details). Unsupervised clustering analysis of the relative glycan abundances, and the respective heat map representation, were performed using the package pheatmap (v1.0.12) in RStudio (v1.3.1073). The relative intensities were also plotted using GraphPad Prism (v8.4.3), and p-values were calculated by performing an unpaired t-test. Significant differences between groups (p < 0.01) are marked with an asterisk (*). All represented N‑ and O‑glycan structures and monosaccharides are depicted following the rules of the Symbol Nomenclature for Glycans (SNFG) 29, 30.
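As an illustration of relative quantitation, the sketch below converts raw signal intensities into percentages of the total signal in one sample. It is a hedged example only: the exact quantitation procedure is given in the cited references and Supplementary Methods, so normalizing to the summed intensity of all identified glycans is an assumption here, and the glycan names and values are hypothetical.

```python
def relative_abundance(intensities):
    """Express each glycan's intensity as a percentage of the sample total.

    intensities: mapping of glycan name -> raw intensity (arbitrary units).
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return {name: 100.0 * value / total for name, value in intensities.items()}


# Hypothetical intensities for three glycans in one sample.
sample = {"glycan_A": 50.0, "glycan_B": 30.0, "glycan_C": 20.0}
for name, percent in relative_abundance(sample).items():
    print(f"{name}: {percent:.1f}%")
```

Expressing each structure as a share of its own sample's total makes samples with different amounts of injected material comparable before clustering or group comparisons.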
Cas9-CRISPR screen of glyco-enzymes
KOPN8 (https://web.expasy.org/cellosaurus/CVCL_1866), an MLL-r B-cell precursor ALL cell line, was genotype-verified using STR genotyping. Cells were stably transduced with a lentiviral vector containing a blasticidin-selection marker and expressing the Cas9 protein (AddGene #52962 plasmid 31). The sgRNA LV library was constructed in the pU6-sgRNA-EF1Alpha-PURO-T2A-RFP (ipUSEPR) vector 32. Each neutral selection gene control [neg, LacZ, Luc and Ren] was covered by ten sgRNAs, and 2 sgRNAs each were directed against ten essential gene controls [PCNA, POLR2D, POLR2A, PRL9, RPL2, CDK9, RPA3, RPS20, MYC, BRD4]. Cells transduced with LV expressing the latter sgRNAs would be expected to be depleted from the cell population. Target glycogenes were covered by 10 sgRNAs each. The entire screen included 1082 sgRNAs with 102 genes encoding glycan-remodeling enzymes, 4 neutral genes and 11 genes of which ablation is expected to reduce cell growth and viability. On d0, biological duplicates of around 20×10⁶ KOPN8/Cas9 cells in 1640 medium were transduced using 10 µg/mL polybrene at a low MOI to ensure that most cells would be transduced with one or no sgRNAs. After 24 h, puromycin selection at 4 µg/mL was applied for 9 days, and 3 µg/mL puromycin was used from d10-d32. On d20 of selection, almost all cells contained an ipUSEPR construct based on FACS for the RFP marker. Cells were harvested for DNA isolation on d28. After d32, cells were also plated on an OP9 stromal feeder layer and grown for an additional period under puromycin selection. Cells in the culture supernatant were harvested on d48 and also used for DNA isolation to obtain two biological replicates. Each isolate from 1-5×10⁶ cells contained sufficient DNA to yield an approximately 1000x coverage. DNAs were sequenced on an Illumina NextSeq 550. Results were analyzed using MAGeCK [Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout] 33 (https://hpc.nih.gov/apps/MAGeCK.html). GiniIndex values for numbers of sgRNAs with 0 read counts varied between 0.09 and 0.36. The median-normalized read counts and the distribution of read counts were comparable across samples.
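The library composition given above can be checked with simple arithmetic. The sketch below reproduces the stated total of 1082 sgRNAs from the gene counts in the summary sentence (102 glyco-enzyme genes and 4 neutral controls at 10 sgRNAs each, 11 growth-reducing controls at 2 sgRNAs each) and estimates cells per sgRNA for a harvested isolate. The dictionary layout and function names are illustrative, and treating coverage as cells divided by library size assumes roughly one integrated sgRNA per cell.

```python
# (number of genes, sgRNAs per gene) for each group, as stated in the text.
LIBRARY = {
    "glycan-remodeling enzymes": (102, 10),
    "neutral controls": (4, 10),
    "essential controls": (11, 2),
}


def library_size(groups):
    """Total number of sgRNAs across all gene groups."""
    return sum(n_genes * n_guides for n_genes, n_guides in groups.values())


def cells_per_sgrna(n_cells, n_sgrnas):
    """Average representation, assuming about one sgRNA per cell."""
    return n_cells / n_sgrnas


total = library_size(LIBRARY)
print(total)
print(round(cells_per_sgrna(1_000_000, total)))
print(round(cells_per_sgrna(5_000_000, total)))
```

Under these assumptions, the lower end of the stated 1-5×10⁶ cells per isolate gives a representation in the region of the approximately 1000x coverage mentioned in the text.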
High-pH fractionation and C18-nanoLC-ESI-MS/MS analyses of peptides
50 µg of protein (samples R7B1, R7B3, R7B6, BM0 and BM37) and 10 µg (samples R7B11 and BM41) were reduced, alkylated and precipitated with Chloroform-Methanol as described above. The protein pellet was air-dried for 10 minutes before 100 µL of 25 mM of ammonium bicarbonate (Sigma, Cat#09830) were added to the pellets. Trypsin was added at a 1:25 enzyme:protein ratio and samples were incubated for 18 h at 37 °C. After digestion, trypsin was heat‑inactivated at 95 °C for 10 min, and samples were dried under vacuum. 1000 U of PNGase F (2 µL) prepared in 50 µL of H₂¹⁸O (Sigma, Cat#329878) were added, and samples were incubated at 37 °C for 3 h to deglycosylate N-linked glycopeptides before drying under vacuum.
Peptides were resuspended in 300 µL of 0.1% trifluoroacetic acid (TFA) and fractionated using a Pierce™ High pH Reversed-Phase Peptide Fractionation Kit (Sigma, Cat#84868) following the manufacturer's instructions. Briefly, the resuspended peptides were loaded onto the pre-conditioned supplied spin columns, and washed (3,000×g, 2 min) once using water. Increasing concentrations of acetonitrile (ACN) (5%, 7.5%, 10%, 12.5%, 15%, 17.5%, 20%, and 50%) in 0.1% triethylamine (TEA) buffer were used to elute (3,000×g, 2 min) the bound peptides into eight distinct fractions. All the resulting fractions were dried under vacuum and kept at -20 °C until analysis. Samples were resuspended in 0.1% TFA and peptide amounts were quantified using a Thermo Scientific™ NanoDrop™ One/OneC Microvolume UV-Vis Spectrophotometer.
The off-line fractionated peptides were identified using an Orbitrap Fusion™ Tribrid™ Mass Spectrometer coupled to an UltiMate™ 3000 UHPLC nanoLC (both Thermo Scientific™) (see Supplementary Methods for details on the columns and methods used). Based on the NanoDrop quantitation, a volume corresponding to 600 ng of peptides was injected for each fraction.
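The reagent amounts in this protocol can be cross-checked with a few lines of arithmetic. The sketch below computes the trypsin mass implied by the 1:25 enzyme:protein ratio, the PNGase F units contained in 2 µL of the 500,000 U/mL stock listed under Reagents, and the injection volume for 600 ng of peptides. The function names are illustrative, and the 0.3 µg/µL peptide concentration is a hypothetical NanoDrop reading.

```python
def trypsin_ug(protein_ug, protein_per_enzyme=25.0):
    """Trypsin mass (µg) for a 1:protein_per_enzyme enzyme:protein ratio."""
    return protein_ug / protein_per_enzyme


def enzyme_units(stock_u_per_ml, volume_ul):
    """Enzyme units contained in volume_ul of a stock given in U/mL."""
    return stock_u_per_ml * volume_ul / 1000.0


def injection_volume_ul(target_ng, conc_ug_per_ul):
    """Volume (µL) that contains target_ng of peptides."""
    return target_ng / (conc_ug_per_ul * 1000.0)


print(trypsin_ug(50.0))                 # 50 µg samples
print(trypsin_ug(10.0))                 # 10 µg samples
print(enzyme_units(500_000, 2.0))       # PNGase F in 2 µL of stock
print(injection_volume_ul(600.0, 0.3))  # hypothetical 0.3 µg/µL reading
```

The second calculation agrees with the text: 2 µL of a 500,000 U/mL stock contains 1000 U of PNGase F.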
Proteomics data analyses
All files were analyzed using the Andromeda search engine integrated into the MaxQuant suite (v6.10.43) 34. For high pH fractionated sample analyses using MaxQuant, the 8 fractions were combined according to their respective sample. As two injections were made for each fraction, this resulted in two combinations of 8 fractions, namely injection 1 and injection 2 for each sample. The HCD-MS/MS spectra were searched against an in silico tryptic digest of Homo sapiens proteins from the UniProt sequence database (v10; May 2020) containing 20,359 protein sequences (Swiss-Prot IDs). All MS/MS spectra were searched with the pre-set MaxQuant parameters, and the following modifications were used: cysteine carbamidomethylation was set as a fixed modification; methionine oxidation, acetylation of protein N-terminus, and asparagine deamidation and 18O deamidation were allowed as variable modifications. False discovery rate (FDR) of the peptide spectral matches (PSMs), protein, and site were set to 1% based on Andromeda score. The match between runs (MBR) algorithm was activated to allow matching MS features between the different sample fractions and improve quantification 34.
LFQ-Analyst was used for the label-free quantitation (LFQ) of the MaxQuant pre‑processed proteomic datasets 35. Two main “Conditions” were defined as “MLL-r” and “Normal BM”, and each injection was used as an independent replicate. In the Advanced Options setting, the “Adjusted p‑value cut-off” was set to 0.01 (q-value, FDR<1%), whilst the “Log2 fold change cut-off” (log2FC) was set to 2.
The log2 fold changes and p‑values calculated by LFQ-Analyst were used to generate a volcano plot representation using the package EnhancedVolcano (v1.7.16) in RStudio (v1.3.1073). GraphPad Prism (v8.4.3) was used for the RNA-protein integrative analyses, by plotting the calculated magnitude (log2) differences derived from our proteomics and RNA-seq analyses, targeting solely the differentially expressed proteins.
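The grouping of fractions into per-injection replicates described above can be sketched as a simple mapping from raw files to experiment names. This is an illustration only: the file-naming pattern is hypothetical, the function is not part of MaxQuant or LFQ-Analyst, and using the five samples prepared at 50 µg as the example set is an assumption.

```python
from collections import defaultdict

# Condition labels as defined in the text; sample-to-condition assignment
# follows the sample descriptions in the Methods.
CONDITIONS = {
    "BM0": "MLL-r",
    "BM37": "MLL-r",
    "R7B1": "Normal BM",
    "R7B3": "Normal BM",
    "R7B6": "Normal BM",
}


def group_fractions(samples, n_fractions=8, n_injections=2):
    """Map each (sample, injection) replicate to its list of fraction files."""
    experiments = defaultdict(list)
    for sample in samples:
        for injection in range(1, n_injections + 1):
            for fraction in range(1, n_fractions + 1):
                # Hypothetical file name; real acquisitions will differ.
                raw_file = f"{sample}_F{fraction}_inj{injection}.raw"
                experiments[f"{sample}_inj{injection}"].append(raw_file)
    return dict(experiments)


experiments = group_fractions(CONDITIONS)
print(len(experiments))            # replicates (sample x injection)
print(experiments["BM0_inj1"][:2]) # first two fraction files of one replicate
```

Each replicate then pools its eight fractions, and the two injections of a sample enter the quantitation as independent replicates of that sample's condition.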
Results
MLL-r patient cells undergo a distinct protein O-glycome transformation
The initiating step of protein O-glycosylation is tightly controlled by 20 distinct GALNT enzymes that post-translationally transfer an N-acetylgalactosamine (GalNAc) onto folded glycoproteins 36. Of these, GALNT1, 2 and 3 are considered to be the most widely expressed and responsible for glycosylating the bulk of glycoprotein acceptor substrates 37. Each GALNT has specific preferences for the protein sequence/structure motifs that it can glycosylate, and the activity of some GALNTs can also depend on the previous action of other GALNTs. Because these enzymes are highly regulated in a cell-, tissue- and protein‑specific manner 37, 38, O-glycosylation is a major regulator of cell function 16. RNA-seq analyses identified ten GALNT gene transcripts, of which six showed expression level differences between MLL-r and control cells (Fig. ). At the protein level, expression of GALNT1, GALNT2 and GALNT7 was confirmed, which also revealed significantly increased GALNT7 in MLL‑r cells (Fig. and ).
The activity of GALNTs is the rate-limiting first step in O-glycan biosynthesis, but further O‑glycan extension is regulated by the concerted and competitive action of various glycosyltransferases (GTs). The mRNA expression levels of GTs responsible for extending the 3‑position of the initial GalNAc residue, such as C1GALT1 (encoded by C1GALT1), were unchanged. In contrast, expression levels of its essential molecular chaperone Cosmc (C1GALT1C1) 39, 40 doubled in MLL-r cells (Fig. ). mRNA expression was three-fold increased for GCNT1, the transferase responsible for initiating Core 2 type O-glycan synthesis (Fig. ), whereas expression of ST3GAL1, the sialyltransferase known to add a sialic acid on the core 1 galactose, remained unchanged (Fig. ). In addition, expression of ST6GALNAC1, the sialyltransferase competing with GCNT1 and thus preventing Core 2 type O-glycosylation, was very low in MLL-r cells (Fig. ).
These transcriptomics data suggest a major remodeling of the MLL-r O-glycome.
O-glycomics by LC MS/MS confirmed that MLL-r BCP-ALL cells exhibited a significantly altered O-glycome, shifting towards Core 2-type O-glycans, while Core 1-types were the major forms in normal control BCP cells. Overall, we identified 21 distinct O-glycans, including five Core 1, thirteen Core 2 and two O-fucose type glycans next to a sialylated hexose disaccharide (Fig. and 2B; ). In MLL-r cells, 51% of all glycans were Core 2 O-glycans compared to 26% in normal BCP cells. The level of Core 1 type O-glycans was almost halved in MLL-r (37% versus 64%, Fig. and C).
More than 1300 human proteins are reported to be O-glycosylated 41-43. We found that mRNAs for around 70% of these were expressed in MLL-r/control cells, with 40% exhibiting significantly different expression in MLL-r samples (159 higher, 197 lower, ) compared to normal precursor B cell controls. Proteomics confirmed the presence of 241 previously reported O-glycoproteins, of which 33 were differentially expressed (with 9 upregulated in MLL-r cells; ). Together, these glycomic, transcriptomic and proteomic data provide clear evidence for the extensive remodeling in the O-glycoprotein and O‑glycan components of the MLL-r cell glycocalyx.
Sialylation and Lewis X fucosylation are increased on N-glycans of MLL-r cells
Protein N-glycosylation is critical for correct protein folding, cell-cell recognition and cellular interactions 44. In both MLL-r and control cells, oligomannose and complex type N‑glycans were the main forms of N-glycosylation (Fig. ). MLL-r cells showed increased complex type N-glycan levels compared to controls (52.4% versus 42.6%), while the levels of paucimannose (10.9% and 11.9%) and hybrid type (≈5%) N-glycans were similar. The complex type N-glycans were mainly biantennary. Only 5% of all N-glycans showed features consistent with tri- or tetra‑antennary structures, but these could not be further characterized as a consequence of their low abundance levels.
The overall N-glycan sialylation levels were increased in MLL-r cells, with N-glycan structures carrying α2-6, both α2-3 and α2-6, or exclusively α2-3 linked sialic acids (Fig. ). These glycomic findings correlated well with the increased ST3GAL3 mRNA levels (Fig. ), which is likely to be the cause of the observed increased sialylation of tri- and tetra-antennary N-glycans in MLL-r cells. For ST6GAL1, however, a correlation between the decreased mRNA and protein levels in MLL-r cells and a decrease in α2-6 sialylation could not be found, as the overall α2‑6 sialylation levels remained unchanged (Fig. ).
The functional behavior of glycoproteins and cells is also well known to be influenced by fucosylation 45, 46. The human genome contains 13 functionally distinct fucosyltransferase genes (FUTs), of which 5 were detected at the transcript level (Fig. ). Our glycomics analyses showed that core fucosylation, a product of FUT8 activity, was the major fucose modification present on about one-third of all N-glycans. While FUT8 transcript levels were lower in MLL-r, no difference in core-fucosylation levels on the glycans was observed between MLL-r and control cells (Fig. and F, ).
In contrast, Lewis X type fucose was present on less than 1% of N-glycans in controls, but was almost triple the level in the MLL-r cells (Fig. ). Consistent with the glycomics data, transcripts for FUT4, one enzyme responsible for Lewis X synthesis 47, 48, were increased more than threefold (Fig. ). Overall, these data indicate a slight shift in the MLL-r N-glycome towards more complex type and Lewis X fucosylated N-glycans. Importantly, our data also indicate that a change in glycosyltransferase transcript levels is not automatically reflected in a change in the actual protein glycosylation.
Terminal lactosamine levels are increased in MLL-r
The protein levels of B4GALT1, the enzyme responsible for transferring galactose to generate the LacNAc epitope (Galβ1-4GlcNAc-) 47, were reduced 2.9-fold in MLL-r cells (Fig. ; ), without observing corresponding changes in B4GALT1 mRNA levels (). Interestingly, terminal (non-reducing end) LacNAc epitope levels on both N- and O-glycans were significantly increased in MLL-r cells. This included LacNAc glycoepitopes capped with α2-3 NeuAc residues (Fig. and 4B) that have recently been confirmed to be recognized by Galectin-1 and -3, while α2-6 sialylation of their counterparts prevents Galectin binding 49. Galectins are glycan-binding proteins involved in key physiological processes such as inflammation and signaling 50-52. They are also critical components of the tumor microenvironment, in particular in the bone marrow and in the context of various forms of leukemia 53. An increase of Galectin-1 has previously been reported in BCP-ALL, and specifically in the MLL-r subtype of ALL 54, 55. This is also supported by our data that showed an increase in Galectin-1 (LGALS1) mRNA and protein levels in MLL-r cells (Fig. and 4D, respectively). The transcript levels of Galectin-3 (LGALS3) and Galectin-3 binding protein (LGALS3BP), important regulators of innate immune responses often found upregulated in various cancer types 56, were significantly decreased in MLL-r cells. Reduced LGALS3BP was also confirmed at the protein level (Fig. and 4D). Thus, the observed glycocalyx differences between MLL-r and normal bone marrow (NBM) control cells are expected to significantly impact their ability to interact with and be recognized by glycan-binding proteins.
Combined proteomics and transcriptomics suggest remodeling of the MLL-r proteoglycome
Because the limited availability of patient-derived primary cell material precluded direct GAG analysis, we undertook quantitative proteomic and RNA-seq analyses of GAG-associated transferases as proxies for potential glycocalyx changes affecting proteoglycans. As shown in Fig. , heparan sulfate (HS) and chondroitin sulfate (CS) share a common tetrasaccharide linker (‑GlcAβ3Galβ3Galβ4Xylβ-Ser/Thr) that is attached via a xylose residue to a Ser or Thr residue 57. In MLL-r cells, two enzymes involved in the GAG linker biosynthesis, B3GALT6 and XYLT1, showed increased mRNA expression (Fig. and ).
This linker tetrasaccharide is the starting point to synthesize either CS or HS chains. In MLL-r cells, CSGALNACT1 and CSGALNACT2 transcripts were reduced while EXTL2 and EXTL3 transcripts showed an increase (Fig. ). These data suggest that the GAG glycocalyx of MLL-r cells undergoes a major remodeling compared to control cells with a shift from CS towards HS (see also ). This shift is further supported by the fact that mRNA and protein expression levels of N-sulfoglucosamine sulfohydrolase (SGSH/SPHM, Fig. ), a hydrolase involved in HS degradation, are significantly decreased in MLL‑r cells. On the other hand, significantly increased levels of bone marrow proteoglycan (PRG2), a potent inhibitor of heparanase (HPSE) 58 and also a proteinase inhibitor 59, were unambiguously identified at the protein level (Fig. ) even though transcript levels were below the cut-off value of 1 rpkm. HPSE itself was not detected at the protein level, and no significant trends were observed in terms of the expression of the HPSE transcript (). These data suggest (1) the presence of HS and (2) the active protection of HS in MLL-r cells, and support the notion that the proteoglycome presents novel potential therapeutic targets for MLL-r.
CRISPR screen supports redundancy in the critical role of glycan-remodeling enzymes on cell survival
Taken together, a dramatic remodeling of the glycome of MLL-r leukemia cells compared to normal BCP cells is apparent. We next selected 102 genes encoding glyco-enzymes expressed in MLL-r cells to determine if any are uniquely critical to cell survival or if functional redundancy is likely to exist. We performed a CRISPR library dropout screen in which each gene was represented by 10 different sgRNAs (). In this type of assay, sgRNAs are introduced into cells via lentiviral transduction at a very low multiplicity of infection to ensure that individual cells contain only a single sgRNA, and cultures are grown for an extended period of time to allow for loss of cells with destruction of essential genes. The barcodes in the sgRNA constructs then allow their detection via DNA sequencing (Fig. ). sgRNAs against glyco-enzyme genes were compared to sgRNAs that destroy essential genes such as MYC as well as to non-relevant sgRNAs. As shown in Fig. , compared to non-relevant sgRNAs (green circles), sgRNAs ablating many of the enzymes involved in glycan synthesis did not selectively disappear, indicating that these enzymes are not essential for cell survival in culture conditions. Other sgRNAs, under-represented compared to the non-relevant sgRNAs, included those targeting O-glycan synthesis (GALNT2, GCNT1, C1GALT1, C1GALT1, ). MGAT1 ablation clearly reduced cell survival (Fig. ), and it ranked among the most critical genes, comparable to essential genes such as MYC (Fig. ). Interestingly, ablation of OGT1 and OGA, enzymes responsible for addition and removal of O-GlcNAc monosaccharides 60, and NGLY1, an enzyme located in the cytosol that removes N-glycans from glycoproteins subjected to proteasome degradation 61, was as lethal as that of essential genes such as MYC (Fig. ).
Transcriptomic and proteomic data show typically close correlation
To date no in-depth proteomic screening has been reported for patient-derived human primary MLL-r cells. The current knowledge on protein changes in leukemic MLL-r compared to normal precursor B cells is largely based on very few reported transcriptomics studies in which matched normal controls were included as a benchmark to evaluate differences 3, 62, 63. Therefore, we analyzed the total MLL-r proteome and compared it to that of control cells to evaluate how well the transcriptome and proteome data correlate. Overall, 408 proteins exhibited significantly different levels in the label-free quantitation (LFQ) analyses (FDR≤1% and log2FC≥2), of which 206 were upregulated in MLL-r cells (Fig. and ). This included 40 proteins that were previously reported to be O-glycosylated in other cell types 42. Unsurprisingly, the RNA-seq analyses reported a higher number of differentially expressed transcripts, identifying 3936 genes with differential expression at the mRNA level (FDR≤5% and log2FC≥1), with 1883 genes upregulated in MLL-r cells (). The number of differentially expressed proteins detected by high-pH fractionation proteomics thus corresponded to approximately 10% of the number of differentially regulated genes.
To investigate the correlation between our transcriptomic and proteomic datasets, we plotted log2 value changes of RNA against protein changes of the 397 differentially expressed proteins identified by both techniques (Fig. ). Out of these, a very good correlation between the expression of protein and transcript was found for 227 proteins (Fig. ).
Previous RNA expression analyses for BCP-ALL that included MLL-r samples and normal controls 3, 62, 63 reported increased expression of hallmark genes such as FLT3 64 and CSPG4 65, which was confirmed in our study both by transcriptomics and LFQ proteomics (Fig. and 7B). FLT3, for example, was upregulated almost 9 log2-fold at the RNA-seq level, but just 4.35 log2-fold in the proteomics dataset (Fig. ).
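A comparison of RNA and protein log2 changes of the kind described above can be sketched with two simple measures: the Pearson correlation coefficient and the fraction of genes changing in the same direction at both levels. This is a minimal illustration under stated assumptions, not the analysis performed in the study (which used GraphPad Prism); the function names and all example values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def same_direction_fraction(pairs):
    """Fraction of (rna_log2fc, protein_log2fc) pairs with matching sign."""
    agree = sum(1 for rna, protein in pairs if rna * protein > 0)
    return agree / len(pairs)


# Hypothetical (RNA log2FC, protein log2FC) pairs, for illustration only.
pairs = [(6.0, 3.5), (3.2, 2.8), (-2.5, -3.1), (1.4, -2.2)]
rna, protein = zip(*pairs)
print(round(pearson_r(rna, protein), 3))
print(same_direction_fraction(pairs))

# Ratio of differentially expressed proteins to differentially expressed genes.
print(round(408 / 3936, 3))
```

The last line reproduces the approximately 10% figure from the counts stated in the text (408 proteins versus 3936 genes).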
Integrin α4 (ITGA4), for which increased expression has previously been associated with very poor outcomes in BCP-ALL 66, was the only one of the ten detected integrins that was upregulated in MLL-r cells at the transcript level. Transcriptomics data also showed increased expression of mucin-like glycoproteins including CD43 (SPN) and decreased expression of P-selectin glycoprotein ligand-1 (SELPLG) (). Both are highly glycosylated proteins, and these modifications may explain why they were not detected in our proteomics dataset. In fact, one CD43 peptide (363-400) was identified in the proteomics dataset but excluded from further consideration because of our minimum requirement of two or more unique peptides for positive protein identification.
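As a rough illustration of the comparison described in this section, the following Python sketch applies the two sets of thresholds reported above (FDR≤1% and log2FC≥2 for proteins; FDR≤5% and log2FC≥1 for transcripts) and then checks, for genes called by both techniques, whether the direction of change agrees. All gene names and values are invented toy data, the names `Change`, `significant` and `direction_concordance` are hypothetical, and applying the cut-off to the absolute log2 fold change is an assumption. It is not the analysis pipeline used in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One differential-expression call (hypothetical container)."""
    gene: str
    log2fc: float  # log2 fold change, MLL-r versus control
    fdr: float     # false discovery rate as a fraction (0.01 == 1%)


def significant(changes, max_fdr, min_abs_log2fc):
    """Keep calls passing both cut-offs; |log2FC| is an assumption."""
    return {
        c.gene: c
        for c in changes
        if c.fdr <= max_fdr and abs(c.log2fc) >= min_abs_log2fc
    }


def direction_concordance(rna, protein):
    """Return genes called in both datasets and those changing in the same direction."""
    shared = sorted(set(rna) & set(protein))
    agree = [g for g in shared if (rna[g].log2fc > 0) == (protein[g].log2fc > 0)]
    return shared, agree


# Toy data, invented purely for illustration (not values from this study).
rna_calls = [
    Change("GENE_A", 3.1, 0.001),
    Change("GENE_B", -1.4, 0.02),
    Change("GENE_C", 0.4, 0.30),
    Change("GENE_D", 2.2, 0.04),
]
protein_calls = [
    Change("GENE_A", 2.5, 0.005),
    Change("GENE_B", 2.1, 0.008),
    Change("GENE_D", 1.2, 0.009),
]

rna_sig = significant(rna_calls, max_fdr=0.05, min_abs_log2fc=1.0)
protein_sig = significant(protein_calls, max_fdr=0.01, min_abs_log2fc=2.0)
shared, agree = direction_concordance(rna_sig, protein_sig)

print(f"called by both techniques: {shared}")
print(f"same direction of change:  {agree}")
```

In this toy example, GENE_D passes the transcript thresholds but not the stricter protein thresholds, and GENE_B is called by both techniques with opposite signs, which is the kind of discordant case that a combined transcriptome/proteome comparison is meant to surface.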
Discussion
Our studies clearly show that the glycocalyx is rich in hitherto unexplored diagnostic and therapeutic targets that remain elusive in gene/transcriptome-focused screening evaluations. Primary MLL-r cells, which have a block in differentiation compared to normal precursor B-lineage controls, show a vastly increased O-glycome complexity. Major shifts in the MLL-r glycocalyx include a strong increase in Core 2 type O-glycans (Fig. ), as well as an overall increase in sialylated N-glycans (Fig. ). These changes were accompanied by several alterations of the cell glycosylation machinery, determined at both the gene expression and glycosyltransferase levels. Remarkably, the Core 1 to Core 2 shift in the O-glycome of MLL-r cells seems to be mainly a consequence of the expression of an important transferase, GCNT1, which was upregulated at the transcript level in MLL-r samples (Fig. ). These results show a striking similarity to observations made in CML and AML cells, in which increased activity and expression of GCNT1 led to an increase in Core 2 O-glycans compared to normal granulocytes 67. Interestingly, Giovannone et al investigated protein O-glycosylation in mature peripheral normal and malignant B-cells and found that less differentiated B-cells also exhibited higher O-glycan complexity, correlating with higher GCNT1 expression 68, supporting the notion that differentiation of B-cells in general reduces the complexity of the O-glycome.
Furthermore, we found that a specific glycosyltransferase, GALNT7, which is one of the O-GalNAc glycosylation-initiating enzymes, was significantly upregulated in patient samples at both transcript and protein levels (Fig. and 1B), as corroborated by gene expression array data (). 
Although this enzyme has previously been linked to carcinogenesis in cultured cell models of colorectal and breast cancer (for example 69, 70), this is the first time that GALNT7 has been shown to be increased at both the transcript and protein level in primary patient MLL-r cells compared to normal controls. Yeoh et al 71 also reported a significant loss of GALNT7 mRNA in conjunction with a reduced leukemia burden in B-cell precursor patient samples during the course of initial induction chemotherapy (also see ). GALNT7 (also known as ppGalNAcT7) was reported to have a strict substrate preference and to glycosylate peptides not glycosylated by any other ppGalNAcT isoform 37. This supports the notion that potential protein substrates such as CD45 68, CD43 or CD44 in MLL-r cells could undergo site-specific O-glycosylation events that are absent in normal BCP cells and that could be used as a leukemia-specific marker or target for treatment.
We also identified increased levels of sialylated N-glycans in MLL-r cells. This is of interest given that aberrant sialylation, with an emphasis on altered expression of ST6GAL1, has been extensively related to malignant transformation (as reviewed in 72, 73). Here we observed increased overall levels of sialylated N-glycans in MLL-r cells, with increases in both α2-6- and α2-3-linked sialic acid-containing glycoconjugates (Fig. ). Intriguingly, although MLL-r cells contained higher levels of α2-6 sialic acid-containing N-glycans, ST6GAL1 transcript and protein levels were significantly lower in these patient cells, suggesting either that lower ST6Gal1 levels are still sufficient to modify N-glycans with α2-6-linked NeuAc residues, or that alternative pathways are present in MLL-r cells that maintain the overall α2-6 NeuAc levels. The results of our CRISPR screen (Fig. ), which tested the enzymes involved in glycan synthesis, are consistent with the latter possibility. Interestingly, T-cell ALL cells made resistant to the chemotherapeutic agent desoxyepothilone B have decreased levels of α2-6 NeuAc residues on their membrane glycoproteins, which correlated with reduced ST6Gal1 activity and mRNA expression 74.
Fucosylation was also altered in MLL-r cells. 
Interestingly, CD15, a Lewis X antigen detected by flow cytometry, is regarded as a hallmark of MLL-r BCP-ALL 75. We found that, compared to normal BCP cells, levels of N-glycan-associated Lewis X type fucose were significantly higher in the MLL-r samples, with FUT4, one of the enzymes responsible for Lewis X synthesis, being increased more than threefold (Fig. and 3F, also see ). Increased FUT4 RNA is also included in the signature that clusters MLL-r leukemias apart from other subclasses 3 and was suggested to regulate migration and adhesion of B-ALL cells by integrin α5β1 76. Thus, we believe that increased Lewis X synthesis could contribute to MLL-AF4-driven migration and invasion 77.
GAGs are extremely important biomolecules of the extracellular matrix, as they modulate protein function and stability. Tsidulko et al reported that many proteoglycans are expressed in EBV-transformed normal and malignant mature B-cell lines 78, and most studies investigating GAGs have similarly focused on in vitro cultured leukemia cell lines 79. Makatsori et al showed that the distribution of CS and HS differed between various leukemic cell lines 80. They also showed that CS was the more abundant GAG secreted into the cell culture medium, while the cellular levels of CS and HS were roughly comparable. GAGs have also been shown to be important for the ability of hematopoietic progenitor cells to bind to bone marrow endothelial cells 81, indicating an intrinsic relevance for GAGs in specific stages of immune cell maturation.
Our RNA-seq results suggest that MLL-r cells partly shift their GAG biosynthesis pathways from CS to HS (Fig. ). Interestingly, of the six known CS proteoglycans, CSPG4 (NG2), which is diagnostic for the MLL-r subtype of B-cell precursor ALL 82, 83, is also strongly overexpressed in our proteomics dataset. 
CSPG4 has a single O-linked CS chain attached at S995, with an oncofetal composition in many cancer cell types and mediating adhesion to integrins α4, β1 and α5β1 84. CHST11 (CS-specific sulfotransferase 11) is required for the synthesis of this CS structure 85, and we found this enzyme to be overexpressed in MLL-r cells compared to normal controls (Fig. ). Thus, CSPG4 may play an important role in MLL-r leukemia cell adhesion. Indeed, a recent study using a PDX model of MLL-r ALL showed that enzymatic removal of the CS chain from CSPG4 enhanced the effects of standard chemotherapy in mice 86, suggesting that the observed chemo-sensitisation resulted from reduced integrin-mediated adhesion in the bone marrow microenvironment.
In agreement with previous reports 54, 55, we observed high levels of Galectin-1 in MLL-r samples (2.75-fold higher, Fig. ), as well as higher levels of Galectin-1 binding epitopes on these cells (Fig. ). It has been suggested that Galectin-1 can also interact with chondroitin sulfate B and might be involved in extracellular matrix assembly in smooth muscle cells 87. Given the strong upregulation of Galectin-1 in MLL-r cells, a cell surface remodeling towards higher levels of HS might also play an important role in MLL-r cell survival within the bone marrow environment.
Many of the heparan sulfate proteoglycans (HSPGs) are expressed at relatively high levels in normal and leukemic B-cell precursors (). SPOCK2 mRNA, encoding a secreted HSPG 88, was especially abundant, and SDC2 and HSPG2 were significantly higher in the MLL-r samples (). Interestingly, HSPG2 was included in the group of 72 genes downregulated on d8 in a cohort of pediatric patients treated with chemotherapy and designated by Yeoh et al as predictive of overall response 71. 
HSPGs are key regulators of the bone marrow niche of normal hematopoietic stem cells 89, and thus the proposed cell surface remodeling from CS to HS synthesis could both regulate extracellular protein modification and change the GAG composition of proteins that are modified by both CS and HS, such as TGFBR3 88.
We also compared RNA expression data with protein expression in the same cells. Within the proteomics dataset, we reliably confirmed protein level changes for about 11% of all transcripts identified as differentially regulated in MLL-r cells. Within these 11%, we also identified, and reliably quantified, proteins generally considered to be present at low abundance levels, such as Golgi-bound glycosyltransferases (Fig. ). The overall trend with respect to up- or downregulation was in good agreement between the techniques for about 80% of the detected proteins/transcripts. Nevertheless, proteomics delivered a different quantitative result for around 19% of proteins and identified six proteins not detected by transcriptomics. Thus, there is clearly significant value in applying both approaches, as they deliver orthogonal information and cross-validate findings. For the 11 proteins that were reliably identified and found to exhibit significant changes solely in the proteomics data, no corresponding values were present in the RNA-seq data. In part, this was because the transcripts were not included in the UCSC annotation of hg38 (P0DTU4, a specific T-cell receptor beta chain) or were excluded from automated downstream analysis because they were not annotated as protein coding (IGHA1, IGHM). For others, however, RNA expression levels fell below the cut-off threshold of 1.0 (ACY1, CES1, EPX, ITGA2B, ITGB3, MMS22L, PRG2, VTN), indicating that high levels of specific proteins were present despite low to non-existent levels of the corresponding mRNA. 
We speculate that this could be due to recruitment of protein from exogenous sources within the bone marrow microenvironment. For example, vitronectin (VTN) and PRG2 are both well known to bind to HS chains 90, 91, indirectly supporting our transcriptomics findings, which suggest GAG remodeling into an HS-rich glycocalyx in MLL-r cells. Similarly, eosinophil peroxidase (EPX), a protein rich in calcium and strongly expressed in bone marrow, could associate with strongly negatively charged GAG chains via charge interactions. ITGA2B and ITGB3 encode integrins that form the platelet glycoprotein (GP)IIb/IIIa 92, and their presence is most likely caused by platelet residues in the protein preparations. The lower integrin α2b and β3 (ITGA2B, ITGB3) protein levels found in the MLL-r samples () are consistent with the presumed BM microenvironment origin of these proteins: they would be expected to be present at lower levels in the BM of leukemia cases, which typically present with thrombocytopenia. Thus, our analyses of the primary leukemia samples may have detected changes in the tumor microenvironment that are invisible in cell culture-based experiments.
We found that the expression of many genes involved in glycosylation differs between MLL-r cells and normal controls: 61 of the 221 human glycosyltransferases are differentially expressed in MLL-r cells. Some of these differences are likely to be caused by the overall modified metabolic state of leukemic cells compared to the corresponding normal controls, but a subset could also be aberrantly transcribed due to the fusion of the MLL gene [also known as KMT2A] with different partners. Godfrey et al analyzed potential target genes that contain an enhancer which can bind histone H3 with K79 methylation in, among others, the MLL-r B-ALL cell lines RS4;11 and SEM 93. 
Such genes are candidates for deregulation by the recruitment of MLL1/KMT2A fusion proteins to the DNA through interaction with the histone H3K79 methyltransferase DOT1L. We found that 865 genes with differential expression between MLL-r and controls in our RNA-seq analyses contain such enhancers, including genes known to have aberrant expression such as CSPG4 and MEIS1. Interestingly, candidate MLL-DOT1L-regulated genes include those encoding enzymes involved in the O-glycosylation (GCNT1, GALNT1, GALNT2, GALNT7) and N-glycosylation (MGAT1, MGAT4A, MGAT4B, and MGAT5) pathways [not shown].
Our survey of survival-associated genes (Fig. ) involved in glycan remodeling showed that those encoding enzymes that add terminal modifications such as sialylation and fucosylation have functional redundancy or are otherwise not critical for MLL-r cell survival under steady-state growth in culture. In contrast, the MGAT1 gene, which encodes the first branching enzyme in the N-glycan biosynthesis pathway, was clearly required for cell survival. This result was unexpected, given that knockout of mgat1 in mice is compatible with embryonic development until mid-gestation 94. In addition, shRNA-mediated MGAT1 knockdown in a human prostate cancer cell line was not lethal to these cells 95. We were also able to identify three other genes whose function is critical to MLL-r B-ALL cells. High levels of OGT protein expression were measured in controls and MLL-r samples (Fig. ), and significantly lower OGA/MGEA5 RNA levels were found in the MLL-r samples (). Homeostasis of O-GlcNAc through the activities of OGT and OGA has previously been implicated in normal hematopoietic development 96, 97, in differentiation arrest of AML and Jurkat T-cells 98, as well as in other cancers 99. However, our study is the first to report that this modification is also essential for the survival of MLL-r B-ALL cells. 
The essential function of NGLY1 for MLL-r cell survival under conditions of steady-state growth was unexpected because human patients with NGLY1 loss-of-function exist 100. This enzyme is a non-redundant deglycosylase needed for the proteasomal processing of misfolded proteins 101, with an essential function for AML cell survival 102. Interestingly, Tomlin et al reported the development of a small molecule inhibitor of NGLY1 that enhanced the cytotoxicity of the proteasome inhibitor bortezomib, which is used to treat the mature B-cell malignancies multiple myeloma and mantle cell lymphoma 103. Our results suggest that such drug combination approaches may have specific lethality for MLL-r B-cell precursor ALLs as well.
In summary, we describe the first multi-omics study, covering the transcriptome, proteome and glycome, from limited numbers (≈10^6 cells per analysis) of primary MLL-r patient cells. While aspects of our study are consistent with previous observations concerning specific protein markers (FLT3, CSPG4, LGALS1) 64, 65, we have also found significantly different expression of other proteins such as the tyrosine kinase Fes. Interestingly, Kohlman et al reported that Fes would discriminate between AML and ALL with 11q rearrangement 104. However, in our analysis we found that Fes protein and mRNA are higher in leukemic B-cell precursor cells compared to normal controls, suggesting Fes as a general target for MLL-rearranged leukemias. High expression of Fes in BCP-ALL with MLL rearrangement compared to normal BM controls is also consistent with published data (). Thus, these results suggest that the development of a small molecule inhibitor targeting this kinase could be useful to treat MLL-rearranged leukemias. 
For example, a combination therapy with inhibitors of DOT1L or of the DOT1L/MENIN interaction, which are in clinical trials for the treatment of MLL-r acute leukemias 105, could be an exciting new therapeutic approach.
Aberrant protein glycosylation has been extensively linked to malignancy across every type of cancer known to date (reviewed in 12, 106, 107). Importantly, even minor alterations in glycan structure can strongly change the activity of glycoconjugates and, through this mechanism, alter cell behavior. Here we present the first combined multi-omics study employing transcriptomics, proteomics and glycomics to capture the protein and glycan landscape of primary non-cultured MLL-r patient cells. This first comprehensive multi-omics study on MLL-r thus also represents a valuable, novel data resource. We find that the leukemia cell glycome, together with many proteins associated with glycan biosynthesis, is significantly changed compared to normal control precursor B-cells, indicating that the malignant phenotype of these cells could be regulated by such changes. We highlight the relevance of using primary, patient-derived cells to identify leukemia-associated proteins invisible to gene-centric approaches, and demonstrate how integrated multi-omics screening delivers a novel, inclusive map of the MLL-r cell landscape. Future studies are therefore warranted to determine how glycans on proteins have been remodeled and how these can be specifically targeted.
Conclusions
This first multi-omics analysis of primary MLL-r BCP-ALL leukemia cells revealed a global reorganization of the MLL-r cell glycocalyx. The integrated multi-omics workflow not only enabled us to discover previously unidentified diagnostic/therapeutic protein targets for MLL-r, but also showed that a multi-omics approach can identify novel markers that are not detectable by transcriptomics alone.
Supplementary materials and methods, figures, table 1, and table headings for tables 2-7. Supplementary tables 2-7.