Towards understanding the extensive diversity of protein N-glycan structures in eukaryotes.

Charlotte Toustou1, Marie-Laure Walet-Balieu1, Marie-Christine Kiefer-Meyer1, Marine Houdou2,3, Patrice Lerouge1, François Foulquier2, Muriel Bardor1,2.   

Abstract

N-glycosylation is an important post-translational modification of proteins that has been highly conserved during evolution and is found in Eukaryota, Bacteria and Archaea. In eukaryotes, N-glycan processing is sequential, involving multiple specific steps within the secretory pathway as proteins travel through the endoplasmic reticulum and the Golgi apparatus. In this review, we first summarize the different steps of the N-glycan processing and further describe recent findings regarding the diversity of N-glycan structures in eukaryotic clades. This comparison allows us to explore the different regulation mechanisms of N-glycan processing among eukaryotic clades. Recent findings regarding the regulation of protein N-glycosylation are highlighted, especially the regulation of the biosynthesis of complex-type N-glycans through manganese and calcium homeostasis and the specific role of transmembrane protein 165 (TMEM165) for which homologous sequences have been identified in several eukaryotic clades. Further research will be required to characterize the function of TMEM165 homologous sequences in different eukaryotic clades.
© 2021 The Authors. Biological Reviews published by John Wiley & Sons Ltd on behalf of Cambridge Philosophical Society.

Keywords:  Golgi apparatus; N-glycans; calcium homeostasis; endoplasmic reticulum; eukaryotes; glycosylation; glycosyltransferases; manganese homeostasis; regulation; structural diversity

Year:  2021        PMID: 34873817      PMCID: PMC9300197          DOI: 10.1111/brv.12820

Source DB:  PubMed          Journal:  Biol Rev Camb Philos Soc        ISSN: 0006-3231


INTRODUCTION

N‐glycosylation is a co‐ and post‐translational modification (PTM) of proteins resulting in the attachment of an oligosaccharide to the nascent protein through the formation of an N‐glycosyl bond between the monosaccharide moiety and an asparagine residue belonging to the consensus sequence Asn‐X‐Ser/Thr/Cys (Gil, Velander & Van Cott, 2009; Zielinska et al., 2010; Matsui et al., 2011; Aebi, 2013). N‐glycosylation represents the most abundant PTM of proteins (Khoury, Baliban & Floudas, 2011) and influences their physicochemical properties and biological functions (Lingg et al., 2012; Varki, 2017a). Glycoproteomics studies have identified at least 2,000 glycoproteins in each of the following model organisms Caenorhabditis elegans, Arabidopsis thaliana, Drosophila melanogaster and Danio rerio (Zielinska et al., 2012), with this number a likely underestimate. This process is universal and is found in Eukaryota, Bacteria, and Archaea (Nothaft & Szymanski, 2010). In bacteria, N‐glycan processing starts in the cytoplasmic compartment and continues in the periplasm (Li et al., 2017). By contrast, in eukaryotes, the N‐glycosylation pathway is initiated in the endoplasmic reticulum (ER) and continues in the Golgi apparatus while glycoproteins travel along the secretory pathway (Colley, Varki & Kinoshita, 2015). In this review, we mainly focus on N‐glycosylation pathways in eukaryotes, describing the main steps leading to the extensive diversity of N‐glycan structures found in different organisms. We describe the more processed structures of N‐glycans isolated from the main groups Amorphea, Archaeplastida, Excavates and TSAR (Telonemia, Stramenopila, Alveolata and Rhizaria). We also highlight recent findings regarding the regulation of this process. 
It was recently reported in mammals and plants that the biosynthesis of complex‐type N‐glycans is regulated through manganese (Mn2+) and calcium (Ca2+) homeostasis involving the newly identified transmembrane protein 165 (TMEM165). Future directions for research on protein N‐glycosylation are also discussed.
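The consensus sequence mentioned above (Asn‐X‐Ser/Thr/Cys) lends itself to a simple sequence scan. The following is a minimal sketch only: it assumes one-letter amino acid codes and applies the commonly stated convention that X is any residue except proline, which is not spelled out in the text above. The function name and the example sequence are hypothetical, and a match indicates only a potential site, not actual occupancy.

```python
import re

# Potential N-glycosylation sequon: Asn-X-Ser/Thr/Cys.
# Assumption: X is any residue except proline (common convention).
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][STC])")


def find_sequons(protein_seq):
    """Return 1-based positions of Asn residues in candidate sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein_seq.upper())]


if __name__ == "__main__":
    # Hypothetical peptide: NGS matches, NPT is excluded (X = Pro),
    # and NNC / NCT are two overlapping candidate sites.
    print(find_sequons("MNGSANPTNNCT"))  # [2, 9, 10]
```

Such a scan over-predicts, since site occupancy also depends on protein folding and on the oligosaccharyltransferase.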

N‐GLYCAN BIOSYNTHESIS IN THE ER: A CONSERVED PROCESS DURING EVOLUTION

In eukaryotes, protein glycosylation involves hundreds of different molecular actors, including enzymes involved in the biosynthesis and transport of nucleotide sugars as well as glycosyltransferases (GTs) and glycosylhydrolases (GHs), called glycoenzymes. In humans for example, more than 500 glycoenzymes take part in N‐ and O‐glycan processing including 114 enzymes and transporters involved in carbohydrate metabolism and transport, 76 glycosidases and more than 200 GTs (Neelamegham & Mahal, 2016). GTs involved in N‐glycan biosynthesis or processing predominantly use activated nucleotide monosaccharides (also called nucleotide sugars) as donor substrates and catalyse their transfer to an acceptor substrate that can either be a lipid‐linked oligosaccharide or an N‐glycan (Rini & Esko, 2017). They represent between 1 and 2% of eukaryotic genomes (Lairson et al., 2008). These actors are localised in different compartments, allowing efficient stepwise and well‐controlled N‐glycan processing. In eukaryotes, the synthesis of nucleotide sugars such as uridine diphosphate‐galactose (UDP‐Gal), guanosine diphosphate‐fucose (GDP‐Fuc), uridine diphosphate‐N‐acetylglucosamine (UDP‐GlcNAc), uridine diphosphate‐glucose (UDP‐Glc) or guanosine diphosphate‐mannose (GDP‐Man) starts in the cytoplasm. By contrast, cytidine monophosphate‐N‐acetylneuraminic acid (CMP‐Neu5Ac) synthesis occurs in the nucleus and uridine diphosphate‐xylose (UDP‐Xyl) in animals is formed in the Golgi apparatus from imported uridine diphosphate‐glucuronic acid (UDP‐GlcA). These nucleotide sugars are used as donor substrates in the different steps of N‐glycan processing (Bar‐Peled & O'Neill, 2011). N‐glycan processing is initiated in the ER with the synthesis of an oligosaccharide intermediate precursor linked to a dolichol (Dol) lipid, commonly known as lipid‐linked oligosaccharide (LLO), of diverse final structure ranging from Glc3Man9GlcNAc2‐P‐P‐Dol (Fig. 1) to a minimal form GlcNAc2‐P‐P‐Dol in protists (Samuelson & Robbins, 2015). LLO synthesis begins at the cytosolic face of the ER and requires the successive action of many different GTs called ALGs (asparagine‐linked glycosylation). These enzymes are embedded in the ER membrane and use activated nucleotide sugars and specific sugar acceptors to elongate the LLO (Aebi, 2013).
Fig. 1

Structures of oligosaccharidic precursors and examples of mature N‐glycans. In the endoplasmic reticulum, the oligosaccharidic precursor is synthesized and then transferred en bloc to the nascent protein. This structure is found in almost all eukaryotes. However, in some protists and microalgae, the oligosaccharidic precursor can be linear and shortened. The oligosaccharide is then processed further in the Golgi apparatus to yield a completely different structure. As exemplified here, the four main types of human and plant N‐glycans found in mature N‐glycoproteins are oligomannose, complex, hybrid and paucimannose. The linkages between each monosaccharide are shown. N‐glycan structures drawn according to the recently updated Symbol Nomenclature for Glycans (Varki, 2017b). Note that all eukaryotic N‐glycans share the common core glycan structure Man2‐3GlcNAc2 (shaded in grey). Fuc, fucose; Gal, galactose; Glc, glucose; GlcNAc, N‐acetylglucosamine; Man, mannose; Neu5Ac, N‐acetylneuraminic acid; Xyl, xylose.

The first step during LLO biosynthesis is catalysed by dolichol‐phosphate GlcNAc phosphotransferase 1 (DPAGT1), also known as ALG7, and results in the transfer of a phospho‐GlcNAc group from UDP‐GlcNAc on the dolichol phosphate (Dol‐P). A GlcNAc and five Man residues then are sequentially added from UDP‐GlcNAc and GDP‐Man by the successive actions of multi‐enzymatic complexes ALG13/14 and the three mannosyltransferases ALG1 (β(1,4)‐mannosyltransferase), ALG2 (α(1,3)/(1,6)‐mannosyltransferase) and ALG11 (α(1,2)‐mannosyltransferase) (Gao, Nishikawa & Dean, 2004) to yield the dolichol pyrophosphate heptasaccharide Man5GlcNAc2‐P‐P‐Dol. At this stage, the LLO undergoes a trans bilayer translocation across the ER membrane, requiring the activity of a flippase (Alaimo et al., 2006; Rush, 2016; Verchère et al., 2021). 
In humans, deficiency of the proposed flippase RFT1 leads to a severe decrease in N‐glycosylation site occupancy on newly synthesized glycoproteins (Vleugels et al., 2009). After flipping of Man5GlcNAc2‐P‐P‐Dol, synthesis in most eukaryotes is completed on the luminal side of the ER (Breitling & Aebi, 2013) with four mannosylation steps respectively catalysed by ALG3, ALG9, ALG12 and ALG9. Then, three terminal Glc residues are consecutively added by ALG6, ALG8 and ALG10 to achieve the biosynthesis of the oligosaccharide precursor Glc3Man9GlcNAc2‐P‐P‐Dol that will be used as a donor for the N‐glycosylation of specific Asn residues (Fig. 1). Its en bloc transfer from the lipid‐linked precursor to newly synthesized proteins is catalysed by oligosaccharyltransferase (OST) (Silberstein & Gilmore, 1996; Lizak et al., 2011; Breitling & Aebi, 2013). An N‐glycosyl bond is then formed between the anomeric carbon of the proximal‐reducing GlcNAc of the oligosaccharide and the nitrogen atom of the lateral chain of the asparagine belonging to the consensus N‐glycosylation site of the protein (Kowarik et al., 2002). In most organisms, OST is an enzymatic complex composed of several different subunits (Ruiz‐Canada, Kelleher & Gilmore, 2009; Hamieh et al., 2017; Wild et al., 2018; Ramírez, Kowal & Locher, 2019). The oligosaccharide structure linked to the newly synthesized N‐glycoprotein will then be further trimmed in the ER lumen by the action of specific glycosidases: α‐glucosidase I, α‐glucosidase II and eventually an ER‐mannosidase leading to Man8‐9GlcNAc2 oligomannosides. Interaction of glucosylated glycan intermediates with two ER‐resident lectin‐like chaperones, calnexin and calreticulin, ensures ER quality control of N‐glycoproteins (Parodi, Cummings & Aebi, 2015). Once correctly folded, glycoproteins leave the ER to reach the Golgi apparatus.
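The stepwise assembly of the LLO described above can be summarised as an ordered table of enzymes and the residues they add. The sketch below is illustrative only. The text states that ALG1, ALG2 and ALG11 together add five Man residues; the per-enzyme split used here (one, two and two) is an assumption based on the generally accepted model, and all names in the code are hypothetical.

```python
from collections import Counter

# (enzyme, ER face, residue added, number of residues)
# Assumption: ALG1 adds one Man, ALG2 two and ALG11 two; the text only
# gives the total of five Man residues for these three enzymes.
LLO_STEPS = [
    ("ALG7 (DPAGT1)", "cytosolic", "GlcNAc", 1),
    ("ALG13/14", "cytosolic", "GlcNAc", 1),
    ("ALG1", "cytosolic", "Man", 1),
    ("ALG2", "cytosolic", "Man", 2),
    ("ALG11", "cytosolic", "Man", 2),
    # Flipping of Man5GlcNAc2-P-P-Dol to the luminal face occurs here.
    ("ALG3", "luminal", "Man", 1),
    ("ALG9", "luminal", "Man", 1),
    ("ALG12", "luminal", "Man", 1),
    ("ALG9", "luminal", "Man", 1),
    ("ALG6", "luminal", "Glc", 1),
    ("ALG8", "luminal", "Glc", 1),
    ("ALG10", "luminal", "Glc", 1),
]


def shorthand(counts):
    """Format residue counts in the Glc/Man/GlcNAc order used in the text."""
    return "".join(
        f"{name}{counts[name]}" for name in ("Glc", "Man", "GlcNAc") if counts[name]
    )


def assemble(steps):
    """Return (enzyme, ER face, composition after the step) for each step."""
    counts = Counter()
    history = []
    for enzyme, face, residue, number in steps:
        counts[residue] += number
        history.append((enzyme, face, shorthand(counts)))
    return history


if __name__ == "__main__":
    for enzyme, face, composition in assemble(LLO_STEPS):
        print(f"{enzyme:<14} {face:<10} {composition}-P-P-Dol")
```

Under these assumptions the composition after ALG11 is Man5GlcNAc2, the species that is flipped, and the final composition is Glc3Man9GlcNAc2. Organisms with a truncated precursor could be sketched by dropping the corresponding luminal steps from the table.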

N‐GLYCAN BIOSYNTHESIS IN THE GOLGI APPARATUS LEADS TO A RICH DIVERSITY OF N‐GLYCAN STRUCTURES

Oligosaccharides ranging from Man8‐9GlcNAc2 to the canonical Man5GlcNAc2 are present in most eukaryotes and are derived from the trimming of ER Man8‐9GlcNAc2 by α‐mannosidases. By contrast, a specific Man5GlcNAc2 structure has been recently identified in organisms synthesizing a truncated LLO Glc3Man5GlcNAc2‐P‐P‐Dol in the ER (Fig. 1), as reported for Trichomonas vaginalis, Tetrahymena thermophila, Chlamydomonas reinhardtii and Trypanosoma brucei (Jones et al., 2005; Manthri et al., 2008; Paschinger et al., 2012; Levy‐Ontman et al., 2014; Lucas et al., 2018). This non‐canonical oligomannoside will be called ‘linear’ Man5GlcNAc2 herein since it exhibits a Manα(1,2)Manα(1,2)Man sequence α(1,3)‐linked to the β‐Man residue of the core. After leaving the ER, oligomannosidic N‐glycans (Fig. 1) are processed further in the Golgi apparatus. However, in contrast to the ER processing steps, Golgi N‐glycan maturation involves a large diversity of glycosidases and GTs giving rise to a diversity of organism‐specific complex, hybrid and paucimannosidic N‐glycans (Fig. 1). This N‐glycan diversity is discussed in the following subsections for Amorphea, Excavates, Archaeplastida and the TSAR supergroup.
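Compositions throughout this review are written in a compact shorthand such as Man5GlcNAc2 or Glc3Man9GlcNAc2. As a convenience for readers who wish to tabulate such compositions, the following is a minimal parsing sketch. The residue list is taken from the abbreviations used in the text and figure legends, the function name is hypothetical, and ranges such as Man8‐9GlcNAc2 or the ‐P‐P‐Dol suffix are deliberately not handled.

```python
import re
from collections import Counter

# Residue abbreviations used in the text; longer names must be tried
# first so that, for example, "GlcNAc" is not read as "Glc" + "NAc".
RESIDUES = [
    "GlcNAc", "GalNAc", "HexNAc", "Neu5Ac", "Neu5Gc",
    "GlcA", "HexA", "dHex",
    "Glc", "Gal", "Man", "Fuc", "Xyl", "Ara", "Hex",
    "Me", "P",
]
_TOKEN = re.compile(
    "(" + "|".join(sorted(RESIDUES, key=len, reverse=True)) + r")(\d*)"
)


def parse_composition(shorthand):
    """Parse e.g. 'Glc3Man9GlcNAc2' into {'Glc': 3, 'Man': 9, 'GlcNAc': 2}.

    A residue without a trailing number counts once. Unknown text
    raises ValueError rather than being silently skipped.
    """
    counts = Counter()
    position = 0
    while position < len(shorthand):
        match = _TOKEN.match(shorthand, position)
        if match is None:
            raise ValueError(
                f"cannot parse {shorthand!r} at position {position}"
            )
        counts[match.group(1)] += int(match.group(2) or 1)
        position = match.end()
    return dict(counts)


if __name__ == "__main__":
    for text in ("Man5GlcNAc2", "Glc3Man9GlcNAc2", "P1Glc3Man9GlcNAc2"):
        print(text, parse_composition(text))
```

Note that a composition carries no linkage information, so the canonical and 'linear' Man5GlcNAc2 isomers discussed above parse to the same counts.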

Amorphea

Amorphea are members of a taxonomic supergroup that includes fungi and animals and can be divided into two main subgroups: Amoebozoa and Obazoa. The latter includes Opisthokonta that can further be subdivided into two groups: Holomycota (fungi, yeasts) and Holozoa (animals and their closest single‐cell relatives) (Fig. 2).
Fig. 2

The structural diversity of N‐glycans synthesized in eukaryotes overlaid on the phylogenetic distribution of the different organisms. The most mature N‐glycan structures are shown for each clade. N‐glycan structures drawn according to the recently updated Symbol Nomenclature for Glycans (Varki, 2017b) using the Glycoworkbench tool v2.1. Phylogeny adapted from Burki et al. (2020) and http://lifemap.univ‐lyon1.fr/explore.html. Ara, arabinose; Araƒ, arabinofuranose; dHex, deoxyhexose; Fuc, fucose; Gal, galactose; Galƒ, galactofuranose; GalNAc, N‐acetylgalactosamine; Glc, glucose; GlcA, glucuronic acid; GlcNAc, N‐acetylglucosamine; Hex, hexose; HexA, Hexuronic acid; HexNAc, N‐acetylhexosamine; Man, mannose; Neu5Ac, N‐acetylneuraminic acid; Neu5Gc, N‐glycolylneuraminic acid; Xyl, xylose.

Amoebozoa

To date, protein N‐glycan profiles have been reported for only three amoebozoan species: two human parasites, Acanthamoeba strains (clade Discosea) and Entamoeba histolytica (clade Evosea), and the non‐infectious slime mould Dictyostelium discoideum (clade Evosea). Amoebozoan species synthesize oligomannosides ranging from Man5GlcNAc2 to Man9GlcNAc2 with some additional substitutions (Fig. 2) (Magnelli et al., 2008; Schiller et al., 2009, 2012; Feasley et al., 2010; Feasley, van der Wel & West, 2015; Obregón et al., 2019). In Acanthamoeba strains, the major N‐glycan structures are oligomannosides with 7–10 hexoses (Hex) partially substituted with methyl (Me) and pentose residues for which the position is still ambiguous (Fig. 2) (Schiller et al., 2012). Further analyses have led to the biochemical characterization of a canonical Man5GlcNAc2 N‐glycan substituted by an α(1,6)‐Fuc residue capped with an α‐linked Man (Schiller et al., 2012). Similar substitutions of the core Fuc with a Hex have also been reported in nematodes (Gutternigg et al., 2007b; Yan et al., 2012; Paschinger & Wilson, 2015), cephalopods (Zhang et al., 1997), gastropods (Wuhrer et al., 2004) and platyhelminthes (Paschinger et al., 2011). However, in these organisms, the Fuc residue is capped with a Gal residue instead of a Man residue. The most abundant N‐glycan found in E. histolytica is Man5GlcNAc2 (Magnelli et al., 2008). N‐glycome analysis of E. histolytica has also highlighted smaller oligomannosides from Man2GlcNAc2 to Man5GlcNAc2 carrying a Gal residue linked to the terminal mannoses. This Gal residue might eventually be substituted by an additional Glc (Magnelli et al., 2008). N‐glycans from D. discoideum are characterized by the presence of additional GlcNAc residues on oligomannoside N‐glycans (Couso et al., 1987; Schiller et al., 2009; Feasley et al., 2010; Nakagawa, Tojo & Fujii, 2011). 
Additional glycan modifications, such as fucosylation of the proximal core GlcNAc and a bisecting GlcNAc, have been reported, as well as substitutions on terminal monosaccharides with Fuc, sulfate or methylphosphate (Fig. 2) (Feasley et al., 2010; Nakagawa et al., 2011; Hykollari et al., 2013, 2017). Concerning core fucosylation, both α(1,3)‐ and α(1,6)‐linkages have been reported (Schiller et al., 2009; Nakagawa et al., 2011).

Obazoa

Holomycota

Holomycota include yeasts and the large kingdom of Fungi. In this group, the main N‐glycan structures found on glycoproteins are oligomannosides (Fig. 2). For the phylum Basidiomycota, the two species Pseudozyma antarctica and Malassezia furfur synthesize oligomannosides composed of 3–9 Man residues (Flores et al., 2019). Although oligomannosides are also observed in Saccharomycotina species (Gong et al., 2009), substitutions by Gal residues have been reported (Ballou, Ballou & Ball, 1994; Ziegler et al., 1999). Moreover, yeasts often extend their N‐glycan structures to form large polymannosidic structures (de Pourcq, de Schutter & Callewaert, 2010). Concerning Leotiomyceta, especially Penicillium species, similar N‐glycans have been identified with additional modifications such as the attachment of galactofuranose (Galf) residues, phosphoethanolamine and phosphorylcholine (Fig. 2) (Hykollari et al., 2016).

Holozoa (metazoans)

Porifera
Sponges were long considered members of the Archaeplastida kingdom but are now classified as a branch of Metazoa. N‐glycosylation in Porifera seems not to be as complex as that of Bilateria described below (Fig. 2). The sponge Haliclona caerulea synthesizes protein N‐linked glycans based on a Man7GlcNAc2 structure substituted with N‐acetylhexosamine (HexNAc) and deoxyhexose (dHex) residues leading to a HexNAc7Man7dHex2 structure (Fig. 2). Larger N‐glycans with up to 13 additional Hex residues have also been detected but these still require structural characterization (Carneiro et al., 2013).
Cnidaria
In Cnidaria, investigation of the protein N‐glycan profiles of the freshwater Hydra magnipapillata revealed polyfucosylated LacdiNAc (GalNAcβ(1,4)GlcNAc) structures that play a role in its regenerative process (Sahadevan et al., 2014).
Bilateria
Bilateria is a very large group composed of Protostomia (arthropods, annelids, and molluscs) and Deuterostomia (echinoderms, chordates and hemichordates). The diversity of the N‐glycan structures within this phylum is large, ranging from paucimannosidic to di‐, tri‐ and tetra‐antennary complex N‐glycans (Fig. 2). In Protostomia, protein N‐glycosylation is more sophisticated. For molluscs, data have been reported mainly for two subgroups: Bivalvia and Gastropoda which includes Heterobranchia and Caenogastropoda. N‐glycan analysis of these clades has revealed a large variety of structures, including oligomannosidic and complex N‐glycans with an immense variety of modifications including core xylose, methylated Hex, Fuc and N‐acetylgalactosamine (GalNAc) substitutions (Fig. 2) (Lommerse et al., 1997; Dolashka‐Angelova et al., 2003; Gutternigg et al., 2007a). The di‐ and tri‐antennary structures can also be mono‐ or di‐sulfated on the Hex residues or O‐methylated on the terminal Gal, Man or GalNAc (Dolashka‐Angelova et al., 2003; Kurz et al., 2013). A core di‐fucosylation has been observed in the bivalve mollusc Crassostrea virginica in addition to a repertoire of sulfated and blood group‐A epitopes (Kurz et al., 2013). Alternatively, the core α(1,3)‐Fuc residue can be replaced by a Hex residue on the proximal GlcNAc. Polyantennary N‐glycans containing 4‐O‐methyl GlcA(1,4)GlcNAc(1,3)Fuc(1,4)GlcNAc have been described in the mollusc Mytilus edulis (Zhou et al., 2013). Monofucosylation on the LacdiNAc motif associated with or without a core xylose has been reported in Biomphalaria glabrata (Lehr et al., 2007). Within Mollusca, xylosylation of the core seems to be specific to the phylum Heterobranchia (Gutternigg et al., 2007a). Other remarkable features are the presence in Rapana venosa of a HexNAc(HexA)Fuc motif linked to a terminal GlcNAc (Sandra et al., 2007). 
In Volvarina rubella, a carnivorous and scavenging marine gastropod, a complex N‐glycome encompasses a range of oligomannosidic, paucimannosidic, core‐modified and complex N‐glycans. The latter include highly modified N‐glycans bearing N‐methyl‐2‐aminoethylphosphonate, phosphorylcholine, methyl, sulfate, or aminoethylphosphonate substitutions and also a core α(1,6)‐Fuc capped with a Hex and an additional Fuc residue (Fucα(1,2)Galβ(1,4)Fucα(1,6)) (Eckmair et al., 2016). N‐glycans of Platyhelminthes exhibit less‐sophisticated structures compared to other bilaterians (Fig. 2). They perform core α(1,6)‐fucosylation as well as galactosylation and phosphorylation of the terminal GlcNAc residues as reported for Fasciola hepatica N‐glycans (Ravidà et al., 2016). N‐glycan structures identified on proteins of Dugesia japonica more commonly have a Galβ(1,4)Fuc disaccharide on the proximal GlcNAc residue and an additional methyl Hex linked to this motif (Paschinger et al., 2011). More complex N‐glycan structures (not shown in Fig. 2) have been observed in trematodes, notably the presence of polygalactosylation (up to five Gal residues, which may be methylated) on the core fucose. Methylation was also observed on the Man residues leading to polymethylated glycans in Schmidtea mediterranea (Subramanian et al., 2018). In addition, LacdiNAc bearing three Fuc residues, Lewis x epitope (Fucα(1,3)‐Galβ(1,4)), core xylose or even core difucosylation were identified in Schistosoma species (Khoo et al., 1997; Wuhrer et al., 2006a,b; Jang‐Lee et al., 2007; Mickum et al., 2016). Nematodes represent 80% of land animals (van den Hoogen et al., 2019). The structural diversity of the N‐glycans found in nematodes is very large, for example approximately 150 different N‐glycan structures have been identified in Dirofilaria immitis (Martini et al., 2019). 
A large complexity and diversity of N‐glycans exist in the model organism Caenorhabditis elegans, which shares similar N‐glycan features with other nematode species, such as Fuc residues or a Gal‐Fuc disaccharide linked to the core, and attachment of phosphorylcholine which can also be found in Ascaris suum, Oesophagostomum dentatum and Haemonchus contortus (van Die et al., 1999; Cipollo et al., 2005; Pöltl et al., 2007; Butschi et al., 2010; Yan et al., 2012, 2018; Paschinger & Wilson, 2015). However, C. elegans also has species‐specific N‐glycan modifications, such as the presence of a bisecting Gal motif that, to our knowledge, has never been reported in other organisms (Yan et al., 2015). A detailed list of the diversity of N‐glycan structures in C. elegans is provided by Paschinger, Yan & Wilson (2019) with the most processed structure being MeFuc5Gal5Man2GlcNAc2 (Fig. 2). Other structures such as poly‐fucosylated GlcNAc motifs and tetra‐antennary N‐glycans with terminal GlcA have been identified in D. immitis (Martini et al., 2019). Multi‐antennary LacdiNAc structures have also been described in D. immitis and Trichinella spiralis (Fig. 2; Martini et al., 2019; Morelle et al., 2000). Insects are divided into four main phyla: Polyneoptera, Hymenoptera, Diptera and Lepidoptera (Fig. 2). As reported for nematodes, insects have the ability to attach GlcA to their N‐glycans. This residue has been found in all insect phyla investigated except Polyneoptera for which this substitution has not yet been described (Aoki & Tiemeyer, 2010; Stanton et al., 2017; Hykollari et al., 2018). The polyneopteran Locusta migratoria was the first example of an organism that can synthesise N‐glycans exhibiting an aminoethylphosphonate motif (Hård et al., 1993). This substitution was subsequently observed in glycoproteins from royal jelly and bee venom (Hykollari et al., 2018). LacdiNAc motifs similar to those described above for Heterobranchia also have been reported. 
Surprisingly, Lepidoptera can synthesise protein N‐glycans with phosphorylcholine substituents (Stanton et al., 2017). Sulfation of Man and Fuc residues of the core N‐glycan has also been observed in Lepidoptera. By contrast, in Hymenoptera, only Man sulfation has been reported (Stanton et al., 2017). Such sulfation of N‐glycans has also been reported in Crustacea, especially in the spiny lobster Panulirus interruptus (Van Kuik et al., 1986, 1987). The presence of sialic acid substitutions in Drosophila melanogaster has also been reported (Aoki & Tiemeyer, 2010). The super‐phylum Deuterostomia is composed of echinoderms, chordates and hemichordates. Proteins from echinoderms carry oligomannosides up to Man9GlcNAc2 (Şahar & Deveci, 2017). Recently, mass spectrometry analyses demonstrated the presence of more complex N‐glycans in Holothuria atra. Indeed, Vanbeselaere et al. (2020) identified oligomannoside structures (72%), the most sophisticated being an unusual N‐glycan containing a phosphate (P) and three Glc residues (P1Glc3Man9GlcNAc2). Truncated forms containing only seven Man have also been identified. The remaining 28% of N‐glycans are complex and hybrid structures exhibiting core α(1,6)‐Fuc and a LacNAc (Galβ(1,4)GlcNAc) motif. This latter motif can also be fucosylated and substituted by two N‐glycolylneuraminic acids (Neu5Gc), the first being α(2,3)‐linked to the Gal residue and the second attached in α(2,6) to the GlcNAc of the antennae. The N‐glycome of the brittle star Ophiactis savignyi was recently described. In this species, core fucosylation, and various sulfate and Neu5Gc substitutions that can be methylated have been reported. Structures with polyLacNAc motifs or a terminal GlcNAc bearing a phosphate residue are also synthesized in O. savignyi (Eckmair et al., 2020). Chordates include vertebrates (fish, amphibians, reptiles, birds, and mammals) and tunicates. 
The tunicate Ciona intestinalis synthesises mainly hybrid N‐glycans containing nine Man residues, one Glc and with core β(1,2)‐Xyl and core α(1,3)‐Fuc residues attached (Yagi et al., 2008). Species from the phylum Actinopterygii (ray‐finned fishes) synthesize very complex poly‐antennary N‐glycans substituted with core α(1,6)‐Fuc, bisecting GlcNAc, LacNAc motif with or without α(2,3)‐linked sialic acid and/or β(1,4)‐Gal and a Fuc on the outer GlcNAc residues (Taguchi et al., 1994; Hanzawa, Suzuki & Natsuka, 2017). In mammals, complex N‐glycans are di‐ to tetra‐antennary structures composed of LacNAc sequences carrying a core α(1,6)‐Fuc. These antennae are terminated by sialic acid residues, either Neu5Ac or Neu5Gc (Figs 1 and 2). Humans do not have Neu5Gc (Brinkman‐Van der Linden et al., 2000; Tangvoranuntakul et al., 2003; Bardor et al., 2005). Some mammalian species, but not humans, are able to synthesize the ‘alpha Gal’ epitope that consists of α(1,3)‐Gal linked to the LacNAc motif (Galili et al., 1988; Galili, 2013).

Excavates

Excavates are diverse single‐cell flagellate organisms ranging from free‐living species such as the freshwater alga Euglena gracilis to human parasites like Trypanosoma brucei (clade Euglenozoa) or Trichomonas vaginalis (clade Metamonada).

Euglenozoa

The freshwater alga E. gracilis is able to synthesise oligomannoside N‐glycans ranging from Man6GlcNAc2 to Man9GlcNAc2 (de la Canal & Parodi, 1985; O'Neill et al., 2017), a small proportion of which are substituted by a putative aminoethylphosphonate moiety (O'Neill et al., 2017). Oligomannosidic N‐glycans have also been observed on the variant surface glycoproteins of the human pathogen T. brucei (Zamze et al., 1990, 1991; Jones et al., 2005; Manthri et al., 2008; Damerow et al., 2016). However, in contrast to E. gracilis, T. brucei synthesizes complex N‐glycans with unusually large poly‐LacNAc‐containing structures (Fig. 2) (Zamze et al., 1991; Mehlert et al., 1998; Atrih et al., 2005; Manthri et al., 2008; Damerow et al., 2016). Such poly‐LacNAc‐containing structures have only been observed on proteins of the bloodstream form but not in the procyclic form of T. brucei (Mehlert et al., 1998; Nolan, Geuskens & Pays, 1999). Although rare, these poly‐LacNAc structures have also been found in cancerous tissue in humans (Ichikawa et al., 1999; Ishida et al., 2005; Holst, Wuhrer & Rombouts, 2015).

Parabasalia

Protein N‐glycosylation in the clade Parabasalia has received little attention to date, and the available data are restricted to the sexually transmitted parasite T. vaginalis. The main N‐glycan of T. vaginalis is a linear Man5GlcNAc2 oligomannoside that results from the transfer in the ER of a truncated precursor Glc3Man5GlcNAc2‐P‐P‐Dol on the N‐glycosylation site of the protein (Samuelson et al., 2005; Paschinger et al., 2012; Lombard, 2016). This linear Man5GlcNAc2 oligomannoside is then trimmed into Man4GlcNAc2 and Man3GlcNAc2 that are substituted by a pentose (likely Xyl), phosphoethanolamine or LacNAc motif, thus generating hybrid N‐glycan structures (Fig. 2) (Paschinger et al., 2012).

Archaeplastida

The Archaeplastida are a major group of autotrophic eukaryotes comprising Rhodophyta and Chloroplastida. This latter subgroup is subdivided into organisms of the clade Streptophyta, which includes land plants and desmids, and the Chlorophyta clade containing the green algae.

Chloroplastida

Streptophyta

In addition to oligomannosides, Embryophytes (land plants and mosses) and Desmidiales are able to synthesise complex‐type N‐glycans (Lerouge et al., 1998). To date, studies have mainly been carried out in land plants and mosses, but similar structures have been recently reported in the charophycean green alga Penium margaritaceum belonging to Desmidiales (Ruiz‐May et al., 2018). Complex N‐glycans of streptophytes are characterized by the presence of a core GlcNAc2Man3GlcNAc2 similar to that of mammals and insects that is substituted by a core α(1,3)‐Fuc and a core β(1,2)‐Xyl (Fig. 1). In land plants and mosses, additional modifications of N‐glycans could result in the synthesis of the Lewis a (Lea) epitope that results from the transfer of both an α(1,4)‐Fuc and a β(1,3)‐Gal to terminal GlcNAc residues (Figs 1 and 2). These complex N‐glycan epitopes are highly conserved in land plants (Lerouge et al., 1998) and also in mosses such as Physcomitrella patens (Fitchette et al., 1999; Viëtor et al., 2003; Mega, 2007). Proteins from land plants also carry truncated oligosaccharides consisting of mature N‐glycans lacking terminal GlcNAc residues (Lerouge et al., 1998). Such structures are called paucimannosidic N‐glycans (Fig. 1). These glycans result from the action of hexosaminidases in the secretory system, the extracellular matrix or the storage organelles (e.g. vacuole) (Strasser et al., 2007; Liebminger et al., 2011; Shin et al., 2017).

Chlorophyta

The clade Chlorophyta includes green algae such as Chlorella and Chlamydomonas species. Although phylogenetically close to Embryophytes, their protein N‐glycan profiles largely differ from those of land plants. Oligomannosides represent the major N‐glycans of proteins isolated from Chlorophytes. Man5GlcNAc2 oligomannoside found in Chlamydomonas reinhardtii exhibits a linear sequence (Vanier et al., 2017). This non‐canonical structure is derived from Glc3Man5GlcNAc2‐P‐P‐Dol occurring in the ER of this green microalga (Fig. 1) (Lucas et al., 2018). In addition, oligomannosides from chlorophytes are partially mono‐ or di‐ O‐methylated and substituted with deoxyhexoses and/or pentoses (Mathieu‐Rivet et al., 2013, 2014; Levy‐Ontman et al., 2014; Schulze et al., 2017; Vanier et al., 2017; Mócsai et al., 2019; Lucas et al., 2020; Oltmanns et al., 2020). In Chlamydomonas reinhardtii, N‐linked glycans are substituted by α(1,3)‐Fuc residues on the proximal GlcNAc and Xyl residues linked to α‐Man and β‐Man residues of Man4GlcNAc2 and Man5GlcNAc2 oligomannosides (Lucas et al., 2020). Two recent studies reporting the protein N‐glycomes of Chlorella species (clade Chlorellales) revealed an unsuspected variety of new N‐glycans composed of short oligomannosides substituted by various Gal, arabinose (Ara) and GlcNAc residues, some of which (Gal and Ara) exhibited either furanose or pyranose forms (Mócsai et al., 2019, 2020a,b). In addition, N‐glycans exhibiting GlcNAc at their non‐reducing end have also been reported in Botryococcus braunii (clade Elliptochloris) (Schulze et al., 2017). The presence of such GlcNAc‐terminated glycans suggests the action of an N‐acetylglucosaminyltransferase I in the N‐glycan processing of these microalgae, a transferase that is absent in C. reinhardtii (Vanier et al., 2017).

Rhodophyta

In rhodophytes, the available protein N‐glycosylation profiles come mainly from Porphyridium sp. Proteins from this red alga carry O‐methylated oligomannosides substituted with one or two Xyl residues. Of particular interest, one of these two Xyl residues is reported to be linked to the second GlcNAc of the core N‐glycan structure (Levy‐Ontman et al., 2011).

TSAR

The TSAR supergroup includes the clades Telonemia, Stramenopila, Alveolata and Rhizaria. To date, data regarding N‐glycosylation have only been published for proteins isolated from alveolates and stramenopiles.

Alveolata

The phylum Alveolata consists of unicellular organisms (protozoans) such as dinoflagellates, ciliates and apicomplexans (protozoan parasites including those responsible for the malaria and toxoplasmosis diseases). In contrast to dinoflagellates for which no information on protein N‐glycosylation is available, several studies have focused on protein N‐glycan structures in ciliates and apicomplexans. Short oligomannosides have been characterized in apicomplexans such as Toxoplasma gondii and Cryptosporidium parvum. For other apicomplexan species, such as the malaria parasite Plasmodium falciparum, only one or two GlcNAc residues are transferred to the asparagine of the N‐glycosylation consensus site (Fauquenoy et al., 2008, 2011; Garénaux et al., 2008; Luk, Johnson & Beckers, 2008; Haserick et al., 2017; Gas‐Pascual et al., 2019). Glucose‐terminated linear Man5GlcNAc2 N‐linked glycan has been identified in apicomplexan species (Taniguchi et al., 1985; Fauquenoy et al., 2008, 2011; Luk et al., 2008; Calow et al., 2016; Haserick et al., 2017). More recently, unexpected structures like Glc2Man6GlcNAc2 and Man6GlcNAc2 have been reported in T. gondii (Gas‐Pascual et al., 2019). These glycans derive from the trimming of a truncated Glc2‐3Man5‐6GlcNAc2‐P‐P‐Dol precursor synthesized in the ER (Taniguchi et al., 1985; Garénaux et al., 2008; Haserick et al., 2017). In ciliates, shorter oligomannosides (Man3GlcNAc2) have been characterized, especially in Tetrahymena thermophila and Tetrahymena pyriformis (Taniguchi et al., 1985; Becker & Rüsing, 2003; Weide et al., 2006; Calow et al., 2016).

Stramenopila

Data on protein N‐glycans in stramenopiles are restricted to Phaeodactylum tricornutum. In contrast to apicomplexan species, this diatom is able to synthesize oligomannoside N‐glycans resulting from complete ER processing of the lipid‐linked precursor up to Glc2Man9GlcNAc2‐P‐P‐Dol (Lucas et al., 2018). After quality control events occurring in the ER, this glycan precursor is then trimmed into oligomannosides ranging from Man9GlcNAc2 to Man5GlcNAc2 N‐glycans (Baïet et al., 2011). These oligomannosides have recently been demonstrated to be identical to those of mammals and plants (Dumontier et al., 2021). With regards to Golgi maturation, expression of an N‐acetylglucosaminyltransferase gene [for α‐1,3‐mannosyl‐glycoprotein 2‐β‐GlcNAc transferase (GnT I)] from P. tricornutum was demonstrated to be able to restore complex N‐glycan maturation in the mammalian Chinese Hamster Ovary (CHO) cell line Lec1 mutant that lacks its endogenous GnT I (Baïet et al., 2011). However, although this diatom expresses a functional GnT I and other molecular actors of N‐glycan processing in the Golgi machinery (Zhang et al., 2019), only small amounts of paucimannosidic N‐glycans bearing a core α(1,3)‐Fuc have been detected in P. tricornutum proteins.
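For readers who follow such profiles by mass spectrometry, the trimming series mentioned above (Man9GlcNAc2 down to Man5GlcNAc2) can be made concrete with a short calculation. The following is only a minimal sketch: it assumes free, underivatized glycans and standard monoisotopic residue masses, and the function names are illustrative rather than taken from any of the cited studies.

```python
# Monoisotopic masses in Da (standard values for anhydro-sugar residues).
HEX = 162.05282     # hexose residue (Man, Glc or Gal)
HEXNAC = 203.07937  # N-acetylhexosamine residue (GlcNAc)
WATER = 18.01056    # one water, accounting for the free reducing end


def oligomannoside_mass(n_man, n_glcnac=2):
    """Monoisotopic mass of a free Man(n)GlcNAc(m) glycan (assumption: no label, no adduct)."""
    return n_man * HEX + n_glcnac * HEXNAC + WATER


def trimming_series(start=9, stop=5):
    """Masses for the oligomannosides from Man(start)GlcNAc2 down to Man(stop)GlcNAc2."""
    return {
        f"Man{n}GlcNAc2": round(oligomannoside_mass(n), 4)
        for n in range(start, stop - 1, -1)
    }


if __name__ == "__main__":
    for name, mass in trimming_series().items():
        print(f"{name}: {mass:.4f} Da")
```

Each trimming step removes one mannose, so consecutive members of the series differ by one hexose residue (162.05 Da); Man5GlcNAc2 comes out at about 1234.43 Da under these assumptions. Observed m/z values will differ depending on the adduct and on any derivatization used.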

REGULATION OF THE N‐GLYCOSYLATION PATHWAY

At the evolutionary level, glycosylation provides eukaryotes with a rich combinatorial system, generating an incredible diversity of N‐glycan structures without prior genome modifications (Gagneux, Aebi & Varki, 2015; Varki & Gagneux, 2015). Fig. 2 shows the most common processed N‐glycans for the different eukaryotic clades. However, within a given organism, N‐glycoproteins exist as multiple glycosylation variants depending on their maturation level and number of N‐glycosylation sites. Regulation of the glycoproteome therefore is essential for numerous biological and physiological functions and hence represents an unprecedented level of complexity. An important aspect of this regulation is that there is no gene encoding N‐glycan structures per se, but 2% of the mammalian genome is estimated to encode genes involved in the synthesis of these N‐glycan structures. As noted in Section II, a prodigious enzymatic arsenal secures the biosynthesis of N‐glycan structures (Moremen, Tiemeyer & Nairn, 2012). Expression of these actors is a key parameter in the observed N‐glycan structural diversity, depending on many cellular parameters such as regulation at the transcriptional (transcription factors, epigenetic regulation), post‐transcriptional (microRNA, miRNA) (Nairn et al., 2008; Antony et al., 2014; Dewal et al., 2015; Lau et al., 2015), translational (speed of synthesis) and post‐translational (conformation, modifications, interaction with other proteins) levels (Rabouille et al., 1995; Tu & Banfield, 2010; Struwe & Reinhold, 2012; Gao et al., 2014; Neelamegham & Mahal, 2016; Blackburn et al., 2018; Liu, Doray & Kornfeld, 2018). Interestingly, the glycan structure itself, at least for the N‐glycan processing, governs the abundance of certain glycoproteins. Indeed, for N‐glycoproteins, quality control based on structures of the glycan chains takes place in the ER (Ruddock & Molinari, 2006; Xu & Ng, 2015). 
This quality control not only monitors N‐glycoprotein folding but also discriminates correctly folded glycoproteins from poorly folded ones destined to be degraded. Folding state can be considered as a key element in governing the N‐glycan structure generated through the Golgi apparatus. The accessibility of the oligosaccharide side chains to GTs and glycosidases in the Golgi apparatus determines whether a specific N‐glycan is processed into a complex‐type N‐glycan or not. This has been illustrated nicely for the well‐known phytohemagglutinin‐L (PHA‐L), the lectin of the common bean Phaseolus vulgaris, which possesses a non‐processed oligomannosidic N‐glycan at the asparagine‐12 and a complex‐type N‐glycan at asparagine‐60 (Sturm & Chrispeels, 1986; Sturm, Bergwerff & Vliegenthart, 1992; Bardor et al., 1999). Besides the notion of accessibility, the N‐glycan structures generated through the Golgi apparatus are also dictated by the strong substrate specificities and sequential action of different Golgi GTs and GHs. This led to the concept of “Go/No Go” pathways proposed by H. Schachter, the best example being the activity of the GnT I considered as a ‘Go’ signal for the activity of the Golgi mannosidase II (Schachter, 1991; Ünligil et al., 2000). The spatiotemporal organization of the glycosylation actors in the secretory pathway is also crucial (Blackburn, D'Souza & Lupashin, 2019). The molecular mechanisms that govern this differential subcellular localization are far from being identified and understood. Nevertheless, about 10 years ago the importance of intra‐vesicular Golgi trafficking in the regulation of protein glycosylation was highlighted in human patients (Wu et al., 2004; Foulquier et al., 2006, 2007; D'Souza, Taher & Lupashin, 2020). 
Many mutations affecting seven of the eight constituent subunits of the conserved oligomeric Golgi (COG) complex (COG3 being the exception) were subsequently identified in patients with congenital disorders of glycosylation (CDG), a group of rare metabolic diseases affecting glycosylation (Reynders et al., 2011; Péanne et al., 2018). The COG complex has been studied in many different organisms (Saccharomyces cerevisiae, A. thaliana, D. melanogaster, C. elegans) and, in most of them, COG defects were shown to be associated with Golgi morphological disruption and with defects in protein glycosylation and vesicular trafficking (Blackburn et al., 2019). The mechanisms by which glycosylation defects are generated have been the subject of numerous studies and have been partially elucidated. COG defects disturb the Golgi structure and the retrograde intra‐Golgi trafficking of vesicles containing important proteins for the glycosylation process, such as GTs and/or nucleotide‐sugar transporters (Reynders et al., 2009, 2011). Instability of some GTs resulting from inadequate lysosomal targeting, in particular GlcNAc transferases I and II (MGAT), α‐mannosidase class 2A member 1 (Man2A1), β‐1,4‐galactosyltransferase 1 (β4GalT1) and β‐galactoside α‐2,6‐sialyltransferase 1 (ST6Gal1), has been reported in COG‐depleted cells (Foulquier et al., 2006; Reynders et al., 2009; Pokrovskaya et al., 2011). Altogether, these processes account for the observed microheterogeneity of the N‐glycan structures on a specific glycoprotein. In addition, regulation of Golgi N‐glycan processing through Mn2+ and Ca2+ homeostasis has been investigated very recently. This recently described mode of regulation is discussed below.

Overview of the different GTs and their Mn2+ dependency within N‐glycan processing

Human Golgi GTs can be classified into three main groups according to their structural folds: GT‐A, GT‐B and GT‐C (Oriol et al., 2002; Lairson et al., 2008; Breton, Fournel‐Gigleux & Palcic, 2012; Gloster, 2014; Albuquerque‐Wendt et al., 2019). GT‐A and GT‐B share a close topology and use nucleotide sugars as donor substrates. Structurally, GT‐As possess two abutting secondary structures called Rossmann folds (alternating β‐sheets and α‐helices, β/α/β). By contrast, in GT‐Bs, the two β/α/β Rossmann domains face each other and are flexibly linked. One of the main differences between GT‐A and GT‐B is the presence of a highly conserved DXD motif (aspartic acid–any amino acid–aspartic acid) in the active site of GT‐A that is absent in GT‐B. This motif is crucial as it allows the carboxylates to coordinate a divalent cation and/or a ribose required for the stabilization of the nucleotide sugar. Note that for one specific GT, GnT I, this motif consists of three specific amino acids: glutamate, aspartic acid and aspartic acid (EDD) (Breton et al., 2012). GT‐As are thus considered metal‐ion‐dependent enzymes while GT‐Bs are generally metal‐ion‐independent. More recently, GT‐Cs have been identified as a new structural family of GTs possessing several hydrophobic transmembrane domains and using lipid phosphate‐linked sugars as donor substrates (Oriol et al., 2002; Gloster, 2014; Albuquerque‐Wendt et al., 2019). These specific GTs act in the ER during the so‐called dolichol cycle. Like GT‐As, GT‐Cs possess a DXD or even a DD motif that is crucial for their enzymatic activity (Lommel et al., 2011; Albuquerque‐Wendt et al., 2019). While magnesium (Mg2+), zinc (Zn2+) or cobalt (Co2+) ions can be used as a cofactor by GT‐A enzymes, Mn2+ is considered the most prominent. This has been investigated by following the kinetics of catalytic activity of β4GalT1 (EC 2.4.1.38), a key enzyme of N‐glycan processing that catalyses, only in the presence of Mn2+, the transfer of Gal from UDP‐Gal to the non‐reducing end GlcNAc of complex N‐glycans (Ramakrishnan, Ramasamy & Qasba, 2006). β4GalT1 is a GT‐A folded enzyme containing a D248XD250 motif located in the cleft of its catalytic domain, which serves as a Mn2+ binding site that then allows the binding of UDP‐Gal (Arnold et al., 2000; Ramakrishnan et al., 2006). Fig. 3 summarizes the known and putative Mn2+‐dependent glycoenzymes that are involved in the N‐glycan‐processing pathway in mammals (human) and plants (Arabidopsis thaliana). The affinity of these enzymes for Mn2+ is not yet known, and it could well be that, depending on their subcellular localization, a tight regulation of local Mn2+ homeostasis is required to sustain the corresponding glycosylation reactions. These Mn2+‐dependent enzymes are Golgi localized, highlighting the importance of Golgi Mn2+ homeostatic regulation in N‐glycan processing. Surprisingly, the molecular mechanisms of Golgi Mn2+ homeostasis were unknown until recently.
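As an illustration of how the DXD motif (and the EDD variant noted for GnT I) can be located in a protein sequence, the following minimal sketch scans a sequence with a regular expression. The helper name and the toy sequence are hypothetical and are not taken from any real enzyme; a sequence match alone does not establish a functional metal‐binding site, since such short motifs also occur by chance.

```python
import re


def find_motif(sequence, pattern):
    """Return (1-based position, matched residues) for every occurrence of `pattern`.

    The pattern is wrapped in a lookahead so that overlapping occurrences
    (e.g. within a run such as "DDDD") are all reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


# Hypothetical toy sequence, for illustration only.
toy = "MKLDVDGTAEDDPLDDD"

dxd_hits = find_motif(toy, "D[A-Z]D")  # aspartate, any residue, aspartate
edd_hits = find_motif(toy, "EDD")      # glutamate, aspartate, aspartate

print(dxd_hits)  # [(4, 'DVD'), (15, 'DDD')]
print(edd_hits)  # [(10, 'EDD')]
```

In practice, candidate hits would be checked against a structural alignment of the catalytic domain before being interpreted as a cation‐coordinating site.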
Fig. 3

Mn2+‐dependent glycoenzymes of N‐glycosylation processing. During the biosynthesis of N‐linked glycans, the oligosaccharyltransferase (OST) complex ensures the transfer of the oligosaccharide precursor to the nascent protein. The catalytic activity of this complex is driven by subunits STT3A for co‐translational glycosylation and STT3B for co‐ and post‐translational glycosylation. Both subunits act as glycosyltransferases (GTs) and are Mn2+ dependent. Afterwards, during endoplasmic reticulum (ER) quality control of N‐glycoproteins, UGGT activity also requires Mn2+ ions, as do GnT I, GnT II, β4GalT1, GnT IVa,b and GnT Vb, involved in Golgi N‐glycan processing in mammals, and GnT I, GnT II, FUT11 and β3GalT1 in Arabidopsis thaliana. Symbol nomenclature of monosaccharides follows Varki (2017b), see Fig. 1 for key. The red dot indicates the asparagine residue from the Asn‐X‐Ser/Thr/Cys consensus sequence. β3GalT1, β(1,3)‐galactosyltransferase 1; β4GalT1, β(1,4)‐galactosyltransferase 1; FUT11, α(1,3)‐fucosyltransferase; GnT I, mannosyl α(1,3)‐glycoprotein β(1,2)‐N‐acetylglucosamine transferase; GnT II, mannosyl α(1,6)‐glycoprotein β(1,2)‐N‐acetylglucosamine transferase; GnT IVa,b, mannosyl α(1,3)‐glycoprotein β(1,4)‐N‐acetylglucosamine transferase 4a and 4b; GnT V, mannosyl α(1,6)‐glycoprotein β(1,6)‐N‐acetylglucosamine transferase 5; mRNA, messenger RNA; UGGT, UDP‐glucose glycoprotein glucosyltransferase 1.

TMEM165, an essential protein in Golgi N‐glycan processing and Mn2+ homeostasis

The discovery of TMEM165 as a gene related to human CDG (Houdou & Foulquier, 2020) began a new era in the understanding of the regulation of Golgi Mn2+ homeostasis (Foulquier et al., 2012). An understanding of the crucial role of TMEM165 in Golgi glycosylation maintenance arose unambiguously from the identification of CDG patients with mutations in TMEM165. All TMEM165‐CDG patients were diagnosed as type II CDG based on serum‐transferrin isoelectric focusing. In addition, mass spectrometry analysis of total N‐glycans from patients’ sera highlighted strong Golgi glycosylation defects with an increased level of abnormal N‐glycan structures lacking both Gal and sialic acid residues (Foulquier et al., 2012). This was confirmed in TMEM165 knock‐out cell lines (HEK cells) (Morelle et al., 2017), where the synthesised N‐glycans provided evidence for a strong galactosylation defect, a mild GlcNAcylation defect and a slight sialylation defect. It is now clear that TMEM165 deficiency affects not only N‐glycan processing but also all other Golgi Mn2+‐dependent glycosylation pathways such as glycosaminoglycan and glycolipid synthesis (Bammens et al., 2015; Morelle et al., 2017; Haouari et al., 2020). TMEM165 was also recently found to be involved in lactose biosynthesis during milk production (Snyder et al., 2019). The glycosylation defect is not restricted to TMEM165 but can be extended to all investigated uncharacterized protein family 0016 (UPF0016) members, a conserved family of transporters to which TMEM165 belongs (Demaegd et al., 2014; Thines, Stribny & Morsomme, 2020). One of the best examples is Gdt1p, the yeast ortholog of TMEM165. The phenotype of the yeast null mutant gdt1Δ is a strong growth defect when cultured in the presence of a high external calcium chloride concentration, which can be correlated with strong N‐ and O‐glycosylation abnormalities (Demaegd et al., 2013). In addition to yeast Gdt1p, Yang et al. (2021) recently characterized the function of PML3 (photosynthesis‐affected mutant 71 like 3; At5g36290), one of the five UPF0016 family members found in Arabidopsis thaliana, and demonstrated that PML3 plays a critical role in Golgi glycosylation and cell wall biosynthesis under Mn2+‐deficient conditions. Interestingly, they also demonstrated a lower abundance of core α(1,3)‐Fuc and β(1,2)‐Xyl on the N‐glycan structures isolated from protein extracts of pml3 mutants. This defect is completely suppressed when the plant mutants are grown in Mn2+‐supplemented hydroponic nutrient solution (Yang et al., 2021).

TMEM165 structure and functional domains

TMEM165 belongs to the UPF0016 family, a highly conserved family of membrane proteins. A phylogenetic analysis of TMEM165 homologous sequences among different eukaryotic clades shows strong sequence homologies (Fig. 4). The sequence alignments reveal the presence of two copies of the hydrophobic domain E‐Ø‐G‐D‐(KR)‐(ST) (where Ø indicates any hydrophobic residue) oriented in an antiparallel manner and highly conserved throughout species independently of the synthesis of complex‐type N‐glycans. This specific orientation could result from an ancient gene‐duplication event as proposed by Demaegd et al. (2014) based on bacterial UPF0016 members. Many UPF0016 homologous sequences can be found in databases but only a few have been characterized, including the eukaryotic TMEM165 (human, mouse, zebrafish), Gdt1p (Saccharomyces cerevisiae) (Foulquier et al., 2012; Demaegd et al., 2013, 2014; Reinhardt, Lippolis & Sacco, 2014; Bammens et al., 2015; Snyder et al., 2019) and PAM71 (photosynthesis‐affected mutant 71), CMT1 (chloroplast manganese transporter 1) and PML3–5 (plant A. thaliana) (Schneider et al., 2016; Hoecker, Leister & Schneider, 2017; Eisenhut et al., 2018; Hoecker et al., 2020). During the last decade, many studies have attempted to unravel the exact functions of these different members (Demaegd et al., 2013; Reinhardt et al., 2014; Bammens et al., 2015; Colinet et al., 2016; Potelle et al., 2017; Snyder et al., 2017, 2019; Thines et al., 2018; Stribny et al., 2020; Wang et al., 2020). So far, all these studies agree upon a function as a cation transporter required to maintain Ca2+, Mn2+ and H+ homeostasis in the Golgi apparatus. Site‐directed mutagenesis within the two consensus motifs E‐Ø‐G‐D‐(KR)‐(ST) demonstrated the crucial involvement of amino acid residues in these consensus motifs for Ca2+ and Mn2+ transport activities. The resulting altered transport function could be due to impaired cation affinity or pocket conformation changes, emphasizing that the amino acids of the two conserved motifs constitute the cation binding sites of the UPF0016 members.
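The E‐Ø‐G‐D‐(KR)‐(ST) consensus can be expressed as a simple pattern for screening candidate sequences. The sketch below is illustrative only: the set of residues counted as hydrophobic (Ø) is an assumption made here, the function names are hypothetical, and the toy sequence is not a real UPF0016 member. It also ignores membrane topology, so it cannot test the antiparallel orientation of the two copies.

```python
import re

# Assumption: residues treated as "hydrophobic" (Ø); adjust to the definition in use.
HYDROPHOBIC = "AVILMFW"

# E - Ø - G - D - (K or R) - (S or T)
UPF0016_MOTIF = re.compile(f"E[{HYDROPHOBIC}]GD[KR][ST]")


def find_upf0016_motifs(sequence):
    """Return (1-based position, matched hexapeptide) for each consensus match."""
    return [(m.start() + 1, m.group(0)) for m in UPF0016_MOTIF.finditer(sequence.upper())]


def has_two_copies(sequence):
    """True if at least two copies of the consensus motif are present."""
    return len(find_upf0016_motifs(sequence)) >= 2


# Hypothetical toy sequence carrying two copies of the motif.
toy = "MAAELGDKTLLVVAEIGDRSFF"

print(find_upf0016_motifs(toy))  # [(4, 'ELGDKT'), (15, 'EIGDRS')]
print(has_two_copies(toy))       # True
```

A screen of this kind only flags candidates; assigning a sequence to the UPF0016 family would still rely on alignment‐based classification such as that used for Fig. 4.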
Fig. 4

Phylogenetic tree showing a selection of transmembrane protein 165 (TMEM165) homologs in Eukaryotes. Putative TMEM165 protein sequences from organisms belonging to the clades depicted in Fig. 2 were retrieved from the subfamily PTHR12608:SF1 (TRANSMEMBRANE PROTEIN 165) of the PANTHER database version 16 (Mi et al., 2021; http://www.pantherdb.org/). A basic local alignment tool for proteins (BlastP) search with the human TMEM165 protein sequence (accession number Q9HC07) was also carried out using the non‐redundant protein sequences (https://blast.ncbi.nlm.nih.gov/) or in the EnsemblProtists database (Phaeodactylum tricornutum), to identify additional similar sequences. The phylogeny was created using the Phylogeny.fr platform (Dereeper et al., 2008; http://www.phylogeny.fr/index.cgi) with the selected 21 sequences. The ‘One Click’ mode was chosen to perform the analysis with the following steps: the protein sequences were aligned using MUSCLE 3.8.31 (Edgar, 2004), a curation step by Gblocks 0.91b (Castresana, 2000) was included and the phylogenetic tree was subsequently built using the PhyML 3.1/3.0 aLRT program (Guindon & Gascuel, 2003; Anisimova & Gascuel, 2006). Graphical representation and editing of the phylogenetic tree were done with TreeDyn 198.3 (Chevenet et al., 2006). The red numbers show the percentage of replicate trees in which the associated taxa clustered together, thus reflecting evolutionary distances based on sequence homologies.

TMEM165 homologous sequences in eukaryotic clades and future research

For TMEM165 homologous sequences, the two consensus domains E‐Ø‐G‐D‐(KR)‐(ST) are highly conserved across the different eukaryotic clades (Fig. 4) although an extensive diversity of N‐glycan structures can be found. Some species clearly show evidence for Golgi glycosylation maturation while some only possess oligomannosidic‐type N‐glycans resulting from ER glycosylation. As shown for A. thaliana, TMEM165 orthologs are required for proper maturation of N‐glycan structures (Yang et al., 2021). The function of TMEM165 in the eukaryotic clades synthesizing only oligomannosidic‐type N‐glycans therefore remains an open question. We noted from our phylogenetic analysis that the highest variability and heterogeneity of TMEM165 proteins is found in the N‐terminal region, which varies in length and does not show any obvious conservation. Depending on the species, the N‐terminal extension of the eukaryotic UPF0016 members may thus have a role in targeting these proteins to their proper final destination. The observed diversity and variability in the N‐terminal part of the eukaryotic UPF0016 members could then explain their different subcellular localizations. Specific targeting systems have certainly evolved to ensure the proper localization of proteins, and the inherent complexity of the N‐terminal region found in TMEM165 homologous sequences may coincide with the evolution of protein translocation systems used for membrane insertion and targeting. In terms of function, differential subcellular localization could allow the transport of Mn2+/Ca2+ from the cytoplasm to specific luminal compartments such as the ER/Golgi in order to enable specific enzymatic reactions such as glycosylation and/or to protect the cytoplasm from Mn2+ excess and toxicity. Further studies are required to understand better the contribution of TMEM165 homologous sequences throughout the different eukaryotic clades and their possible involvement in the regulation of N‐glycosylation.

CONCLUSIONS

The extensive diversity observed for most mature N‐glycan structures, summarized in Fig. 2, indicates that N‐glycan processing steps in the Golgi apparatus involve a large repertoire of organism‐specific glycosidases and GTs, as well as other enzymes required for specific sugar substitutions (e.g. methyl, sulfate, methylphosphate, phosphoethanolamine, phosphorylcholine, etc.). The initial investigations of N‐glycosylation mainly focused on mammalian proteins, attempting to unravel this essential protein PTM and to understand the functions of mature N‐glycans in human physiology and pathology. More recently, N‐glycosylation of proteins from other eukaryotes has been investigated. The emergence of sensitive and efficient analytical technologies for the structural identification of mixtures of complex oligosaccharides has allowed in‐depth study of N‐glycan profiles from numerous organisms, even for low amounts of available proteins, highlighting the rich structural diversity of mature N‐glycans among eukaryotes. The functions of such N‐glycan diversity of secreted proteins in eukaryotes are diverse. To date, numerous studies carried out on the functions of protein N‐glycosylation in mammals have demonstrated its importance in cell–cell interactions, cell–protein recognition and interaction, and in human pathology (Varki, 2017a). The diversity of mature N‐glycans in other eukaryotes suggests that N‐glycosylation of proteins in all of these organisms does not solely function to ensure their proper folding in the ER. Indeed, considering the energy cost for each step of the N‐glycan pathway, we can assume that the N‐glycans resulting from Golgi maturation play important biological functions in organisms belonging to the different clades. Beyond the elucidation of the biological relevance of such N‐glycan diversity, the regulation of the Golgi processing steps is a key issue. Indeed, such regulation is multifactorial. 
Moreover, recent studies have demonstrated that in some organisms, complex N‐glycan processing steps are regulated through Golgi Mn2+ homeostasis, including molecular actors like TMEM165. More research is needed to understand the regulation of the N‐glycan processing steps in other eukaryotes and to establish whether regulation of N‐glycosylation through Mn2+ homeostasis is universal. In this context, future studies should attempt to decipher the roles of TMEM165 in the regulation of N‐glycan processing in organisms that synthesise oligomannosides or mature N‐glycans.

1.  Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis.

Authors:  J Castresana
Journal:  Mol Biol Evol       Date:  2000-04       Impact factor: 16.240

2.  A conserved acidic motif is crucial for enzymatic activity of protein O-mannosyltransferases.

Authors:  Mark Lommel; Andrea Schott; Thomas Jank; Verena Hofmann; Sabine Strahl
Journal:  J Biol Chem       Date:  2011-09-28       Impact factor: 5.157

Review 3.  Retention mechanisms for ER and Golgi membrane proteins.

Authors:  Caiji Gao; Yi Cai; Yejun Wang; Byung-Ho Kang; Fernando Aniento; David G Robinson; Liwen Jiang
Journal:  Trends Plant Sci       Date:  2014-04-30       Impact factor: 18.313

4.  Multiple xylosyltransferases heterogeneously xylosylate protein N-linked glycans in Chlamydomonas reinhardtii.

Authors:  Pierre-Louis Lucas; Elodie Mathieu-Rivet; Philippe C T Song; Anne Oltmanns; Corinne Loutelier-Bourhis; Carole Plasson; Carlos Afonso; Michael Hippler; Patrice Lerouge; Narimane Mati-Baouche; Muriel Bardor
Journal:  Plant J       Date:  2020-01-25       Impact factor: 6.417

5.  Man, apes, and Old World monkeys differ from other mammals in the expression of alpha-galactosyl epitopes on nucleated cells.

Authors:  U Galili; S B Shohet; E Kobrin; C L Stults; B A Macher
Journal:  J Biol Chem       Date:  1988-11-25       Impact factor: 5.157

Review 6.  Biochemistry, molecular biology, and genetics of the oligosaccharyltransferase.

Authors:  S Silberstein; R Gilmore
Journal:  FASEB J       Date:  1996-06       Impact factor: 5.191

Review 7.  N-linked protein glycosylation in the ER.

Authors:  Markus Aebi
Journal:  Biochim Biophys Acta       Date:  2013-04-10

8.  Mutation of the COG complex subunit gene COG7 causes a lethal congenital disorder.

Authors:  Xiaohua Wu; Richard A Steet; Ognian Bohorov; Jaap Bakker; John Newell; Monty Krieger; Leo Spaapen; Stuart Kornfeld; Hudson H Freeze
Journal:  Nat Med       Date:  2004-04-25       Impact factor: 53.440

9.  Mapping the distribution of Golgi enzymes involved in the construction of complex oligosaccharides.

Authors:  C Rabouille; N Hui; F Hunte; R Kieckbusch; E G Berger; G Warren; T Nilsson
Journal:  J Cell Sci       Date:  1995-04       Impact factor: 5.285

10.  Complexity of the eukaryotic dolichol-linked oligosaccharide scramblase suggested by activity correlation profiling mass spectrometry.

Authors:  Alice Verchère; Andrew Cowton; Aurelio Jenni; Monika Rauch; Robert Häner; Johannes Graumann; Peter Bütikofer; Anant K Menon
Journal:  Sci Rep       Date:  2021-01-14       Impact factor: 4.379
