Luis Sanchez-Pulido, Chris P Ponting.
Abstract
Carriers of the Glu167Lys coding variant in the TM6SF2 gene have recently been identified as being more susceptible to non-alcoholic fatty liver disease (NAFLD), yet exhibit lower levels of circulating lipids and hence are protected against cardiovascular disease. Despite the physiological importance of these observations, the molecular function of TM6SF2 remains unknown, and no sequence similarity with functionally characterized proteins has been identified. In order to trace its evolutionary history and to identify functional domains, we embarked on a computational protein sequence analysis of TM6SF2. We identified a new domain, the EXPERA domain, which is conserved among TM6SF, MAC30/TMEM97 and EBP (D8, D7 sterol isomerase) protein families. EBP mutations are the cause of chondrodysplasia punctata 2 X-linked dominant (CDPX2), also known as Conradi-Hünermann-Happle syndrome, a defective cholesterol biosynthesis disorder. Our analysis of evolutionary conservation among EXPERA domain-containing families, together with the previously suggested catalytic mechanism for the EBP enzyme, indicates that the TM6SF and MAC30/TMEM97 families are both highly likely to possess, as does the EBP family, catalytic activity as sterol isomerases. This unexpected prediction of enzymatic functions for TM6SF and MAC30/TMEM97 is important because it now permits detailed experiments to investigate the function of these key proteins in various human pathologies, from cardiovascular disease to cancer.
Keywords: CDPX2; EBP; MAC30; NAFLD; TM6SF2; TMEM97; cancer; cholesterol
Year: 2014 PMID: 25566323 PMCID: PMC4263179 DOI: 10.3389/fgene.2014.00439
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. EXPERA domain-containing proteins in humans. (A) Summary table of known function/disease relationships for all EXPERA domain-containing proteins in humans. (B) Transmembrane topology and common features in human members of the EXPERA domain-containing families. Common topology for the core four transmembrane regions (colored cylinders) present in all members of the EXPERA superfamily (Figures 3, S1–S3). The blue oval labels the putative C-terminal ER retention signal (lysine-rich sequence). Black circles label the most conserved position of the EXPERA superfamily (E101 and E255 in TM6SF2, D56 in MAC30 and D108 in EBP).
Figure 3. Representative multiple sequence alignment of the EXPERA domain. Putative EBP catalytic residues (identified by alanine-scanning) described by Moebius et al. are labeled in black (Moebius et al., 1999). A mutation identified in TM6SF2 is labeled in red (Holmen et al., 2014; Kozlitina et al., 2014; Sookoian et al., 2014). Human sequence names are highlighted and the only member of the EXPERA superfamily in Saccharomyces cerevisiae, part of the MAC30/TMEM97 family, is indicated by a yellow box. Numbers shown in green represent inserted amino acids that have been removed from the alignment. Different groups of the EXPERA sequences identified by sequence similarity are shown by colored lines to the left of the alignment: light red, TM6SF family first repeat; dark red, TM6SF family second repeat; yellow, MAC30/TMEM97 family; purple, EBP family. DUF2781 (in blue), previously defined in Pfam (includes TM6SF second repeat and MAC30 family). The TMHMM transmembrane helix (Krogh et al., 2001) consensus predictions are shown below the alignment for each family, as red, yellow, and violet cylinders for the TM6SF (repeats 1 and 2), MAC30/TMEM97, and EBP families, respectively (see Figures S1–S3). The limits of the protein sequences included in the alignment are indicated by flanking residue positions. Alignments were produced with T-Coffee, HMMer, and HHpred (Eddy, 1996; Notredame et al., 2000; Söding et al., 2005; Finn et al., 2011) using default parameters and slightly refined manually. The alignment was presented with the program Belvu (Sonnhammer and Hollich, 2005) using a coloring scheme indicating the average BLOSUM62 scores (which are correlated with amino acid conservation) of each alignment column: red (>0.7), violet (between 0.7 and 0.4) and light yellow (between 0.4 and 0.2). Sequences are named according to their UniProt identifications (Wu et al., 2006).
Species abbreviations: ARATH, Arabidopsis thaliana (Mouse-ear cress); ASPFC, Neosartorya fumigata (Fungus); AURDE, Auricularia delicata (White-rot fungus); CAEEL, Caenorhabditis elegans; CANGA, Candida glabrata (Yeast); CHLRE, A8JGX8_CHLRE, Chlamydomonas reinhardtii (Green alga); CIOIN, Ciona intestinalis; CRAGI, Crassostrea gigas (Pacific oyster); DEBHA, Debaryomyces hansenii (Yeast); DICDI, Dictyostelium discoideum (Slime mold); EMIHU, Emiliania huxleyi (Chromalveolata); HUMAN, Homo sapiens; LOTGI, Lottia gigantea (Giant owl limpet); MALGO, Malassezia globosa (Fungus); MONBE, Monosiga brevicollis (Choanoflagellate); NAEGR, Naegleria gruberi (Amoeba); NEMVE, Nematostella vectensis (Starlet sea anemone); NEUCR, Neurospora crassa (Fungus); OSTTA, Ostreococcus tauri (Green alga); PHACS, Phanerochaete carnosa (Fungus); PICPG, Komagataella pastoris (Yeast); PIRID, Piriformospora indica (Fungus); SALR5, Salpingoeca rosetta (Choanoflagellate); SCHPO, Schizosaccharomyces pombe (Fission yeast); SELML, Selaginella moellendorffii (Spikemoss); STRPU, Strongylocentrotus purpuratus (Purple sea urchin); THAPS, Thalassiosira pseudonana (Marine diatom); TRIAD, Trichoplax adhaerens; VOLCA, Volvox carteri (Green alga); YARLI, Yarrowia lipolytica (Yeast); YEAST, Saccharomyces cerevisiae (Baker's yeast).
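The coloring bands quoted in the Figure 3 caption can be written out as a small lookup. This is a minimal sketch in Python, not Belvu's own code: the function name `conservation_color` is hypothetical, and the handling of scores that fall exactly on a band boundary (0.7, 0.4, 0.2) is an assumption, since the caption does not say which band they belong to.

```python
def conservation_color(avg_score):
    """Return the caption's color band for an average column score.

    Assumption: boundary values are assigned to the lower band at 0.7
    and to the named band at 0.4 and 0.2; the caption leaves this open.
    """
    if avg_score > 0.7:
        return "red"
    if avg_score >= 0.4:
        return "violet"
    if avg_score >= 0.2:
        return "light yellow"
    return None  # below 0.2: no color is listed in the caption


# Example with made-up column scores (illustrative values only).
for score in (0.85, 0.55, 0.25, 0.05):
    print(score, conservation_color(score))
```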
Figure 4. HHpred comparison E-values. The numbers correspond to E-values from HHpred (Söding et al., 2005) profile searches against a Pfam profile database that includes profiles representing the families shown in the figure. Profile-to-profile matches were evaluated in terms of an E-value, which is the expected number of non-homologous proteins with a score higher than that obtained for the database match. An E-value much lower than one indicates statistical significance. Solid lines represent statistically significant sequence similarity relationships; e.g., the calibrated MAC30/TMEM97 family profile finds the profiles of the EBP family and the TM6SF first EXPERA repeat with E-values of 0.003 and 0.03, respectively. The black dotted line between the TM6SF second EXPERA domain repeat and EBP marks the only relationship found with a value that is not highly significant (E-value 1.2). The sequence similarity between the TM6SF (second EXPERA domain repeat) and MAC30/TMEM97 families (presented inside the blue dotted oval) was already described in a DUF (Domain of Unknown Function) entry of Pfam (DUF2781, Pfam family identification: PF10914) (Bateman et al., 2010; Punta et al., 2012).
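The solid-versus-dotted convention in the Figure 4 caption amounts to a threshold on the E-value. The sketch below, in Python, applies such a threshold to the three E-values quoted in the caption. The cutoff of 0.1 and the names `SIGNIFICANCE_CUTOFF` and `line_style` are assumptions made for illustration; the caption only states that an E-value "much lower than one" indicates significance.

```python
# E-values quoted in the Figure 4 caption (query profile, matched profile).
E_VALUES = {
    ("MAC30/TMEM97", "EBP"): 0.003,
    ("MAC30/TMEM97", "TM6SF repeat 1"): 0.03,
    ("TM6SF repeat 2", "EBP"): 1.2,
}

# Hypothetical cutoff: the caption gives no numeric threshold.
SIGNIFICANCE_CUTOFF = 0.1


def line_style(e_value, cutoff=SIGNIFICANCE_CUTOFF):
    """Return 'solid' for a significant match and 'dotted' otherwise."""
    return "solid" if e_value < cutoff else "dotted"


for (query, hit), e_value in E_VALUES.items():
    print(f"{query} -> {hit}: E = {e_value} ({line_style(e_value)})")
```

With this assumed cutoff, the two MAC30/TMEM97 matches are drawn solid and the TM6SF second repeat to EBP match is drawn dotted, in line with the caption.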
Figure 2. (A) Mapping alanine-scanning mutagenesis and known disease-causing missense mutations in the EBP model. Alanine-scanning (Moebius et al., 1999) identified 11 residues as major determinants of EBP catalytic activity (His77, Glu81, Trp102, Tyr105, Asp109, Arg111, Tyr112, Glu123, Thr126, Asn194, and Trp197; here renumbered to the current EBP_HUMAN SwissProt entry numbering by subtracting one from the number of each position). Four (orange points) are present in exactly the same positions as disease-associated mutations (H76Y, E80K, R110Q, and W196S) and the remaining seven (W101, Y104, D108, Y111, E122, T125, and N193) (yellow points) are located in the vicinity of disease-associated mutations (fewer than five residues distant). Mapped CDPX2 disease-causing missense mutations (red points), derived from the Human Gene Mutation Database (HGMD) and PubMed analysis (Stenson et al., 2003), are: M1I (Steijlen et al., 2007), M1V (Hello et al., 2010), R62W (Herman et al., 2002), L66P (Whittock et al., 2003), C67R (Morice-Picard et al., 2011), W68C (Lambrecht et al., 2014), C72Y (Herman et al., 2002), I75N (Barboza-Cerda et al., 2014), H76Y (Umekoji et al., 2008), E80K (Braverman et al., 1999; Ikegawa et al., 2000; Aughton et al., 2003), W82C (Has et al., 2002; Shirahama et al., 2003), S98F (Tysoe et al., 2008), S98P (Tysoe et al., 2008), E103K (Kolb-Mäurer et al., 2008), G107R (Derry et al., 1999), R110Q (Derry et al., 1999; Hou, 2013), V119G (Non-lethal) (Cañueto et al., 2012; Bode et al., 2013), G130V (Herman et al., 2002), S133R (Braverman et al., 1999; Derry et al., 1999), R147G (Becker et al., 2001), R147H (Braverman et al., 1999; Has et al., 2000; Ikegawa et al., 2000; Shirahama et al., 2003), G157S (Herman et al., 2002), D162H (Whittock et al., 2003), L164P (Cañueto et al., 2012), Y165C (Shirahama et al., 2003), G173R (Herman et al., 2002), W196S (Herman et al., 2002), L203P (Has et al., 2002), D206Y (Ausavarat et al., 2008).
L18P and W47C (Milunsky et al., 2003; Furtado et al., 2010) present a less severe phenotype called MEND (Male EBP Disorder with Neurological Defects) syndrome (Arnold et al., 2012). (B) Reaction catalyzed by EBP. Cholesterol carbon atoms C7, C8, and C9 are labeled.
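The renumbering and position matching described in the Figure 2 caption can be reproduced with a few lines of Python. This is a sketch built only from the residue numbers listed in the caption. The helper names (`renumber`, `nearest_distance`) are hypothetical, and it measures distance along the sequence only, which is an assumption about how "vicinity" was assessed.

```python
# Residues from alanine scanning, in the original numbering (from the caption).
ALANINE_SCAN_OLD = [77, 81, 102, 105, 109, 111, 112, 123, 126, 194, 197]

# Positions of the CDPX2 missense mutations listed in the caption.
DISEASE_POSITIONS = {
    1, 62, 66, 67, 68, 72, 75, 76, 80, 82, 98, 103, 107, 110, 119,
    130, 133, 147, 157, 162, 164, 165, 173, 196, 203, 206,
}


def renumber(positions, offset=1):
    """Shift positions to the current EBP_HUMAN numbering (subtract one)."""
    return [p - offset for p in positions]


def nearest_distance(position, references):
    """Sequence distance from a position to the closest reference position."""
    return min(abs(position - r) for r in references)


renumbered = renumber(ALANINE_SCAN_OLD)

# Residues that coincide exactly with a disease-associated position.
exact = [p for p in renumbered if p in DISEASE_POSITIONS]

# For the rest, the distance to the closest disease-associated position.
nearby = {
    p: nearest_distance(p, DISEASE_POSITIONS)
    for p in renumbered
    if p not in DISEASE_POSITIONS
}

print("exact matches:", exact)
print("nearest distances:", nearby)
```

The exact matches recover the four positions named in the caption (76, 80, 110, and 196); the `nearby` dictionary leaves the choice of distance cutoff to the reader.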