Erol C Bayraktar1, Konnor La1, Kara Karpman2, Gokhan Unlu1,3,4, Ceren Ozerdem1, Dylan J Ritter3,4, Hanan Alwaseem5, Henrik Molina5, Hans-Heinrich Hoffmann6, Alec Millner7, G Ekin Atilla-Gokcumen7, Eric R Gamazon3, Amy R Rushing3, Ela W Knapik3,4, Sumanta Basu8, Kıvanç Birsoy9. 1. Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA. 2. Center for Applied Mathematics, Cornell University, Ithaca, NY, USA. 3. Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA. 4. Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA. 5. Proteomics Resource Center, The Rockefeller University, New York, NY, USA. 6. Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY, USA. 7. Department of Chemistry, University at Buffalo, The State University of New York (SUNY), Buffalo, NY, USA. 8. Department of Statistics and Data Science, Cornell University, Ithaca, NY, USA. 9. Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA. kbirsoy@rockefeller.edu.
Abstract
Coessentiality mapping has been useful to systematically cluster genes into biological pathways and identify gene functions[1-3]. Here, using the debiased sparse partial correlation (DSPC) method[3], we construct a functional coessentiality map for cellular metabolic processes across human cancer cell lines. This analysis reveals 35 modules associated with known metabolic pathways and further assigns metabolic functions to unknown genes. In particular, we identify C12orf49 as an essential regulator of cholesterol and fatty acid metabolism in mammalian cells. Mechanistically, C12orf49 localizes to the Golgi, binds membrane-bound transcription factor peptidase, site 1 (MBTPS1, site 1 protease) and is necessary for the cleavage of its substrates, including sterol regulatory element binding protein (SREBP) transcription factors. This function depends on the evolutionarily conserved uncharacterized domain (DUF2054) and promotes cell proliferation under cholesterol depletion. Notably, c12orf49 depletion in zebrafish blocks dietary lipid clearance in vivo, mimicking the phenotype of mbtps1 mutants. Finally, in an electronic health record (EHR)-linked DNA biobank, C12orf49 is associated with hyperlipidaemia through phenome analysis. Altogether, our findings reveal a conserved role for C12orf49 in cholesterol and lipid homeostasis and provide a platform to identify unknown components of other metabolic pathways.
While most components of metabolic pathways have been well-defined, a significant
portion of metabolic reactions still has unidentified enzymes or regulatory components,
even in lower organisms[4-8]. Co-essentiality mapping was previously
used for systematic identification of large-scale relationships among individual
components of gene sets[1-3]. Perturbation of enzymes or regulatory
units involved in the same metabolic pathway should display similar effects on cellular
fitness across cell lines, suggesting that correlation of essentiality profiles may
provide the unique opportunity to identify unknown components associated with a
particular metabolic function.
To generate a putative co-essentiality network for metabolic genes, we analyzed
genetic perturbation datasets from the DepMap project collected from 558 cancer cell
lines (Fig. 1a)[9-11]. Existing
computational methods for constructing co-essentiality networks primarily rely on
Pearson correlation, which is not suitable for distinguishing between direct and
indirect gene associations and leads to false positive edges in the network (Extended Data Fig. 1a,b). In contrast, Gaussian graphical models (GGMs) calculate partial correlations
and offer a unique advantage over commonly used Pearson correlation networks by
automatically removing indirect associations among genes from the network, hence
reducing false positives and producing a small, high-confidence set of putative
interactions for follow-up validation[12]. We therefore applied debiased sparse partial correlation (DSPC), a
GGM technique, to measure associations between the essentiality scores of genes from
human cancer cell lines. In prior work[13], we successfully used DSPC to build networks among metabolites
and identified new biological compounds. Of note, this method, while useful for
generating high confidence lists, does not account for dependence among cell lines, a
key strength of previously published work[3,11]. After removing
networks with large numbers of components (i.e. electron transport chain), we focused on
genes with a high Pearson correlation (|r|>0.35) with at least one of the 2,998
metabolism-related genes in the dataset. Our analysis of positively correlated genes
revealed a set of 202 genes organized in 35 metabolic networks, to 33 of which we could
assign a metabolic function using literature searches and the STRING database (Fig. 1b, Extended Data
Fig. 2).
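The difference between Pearson and partial correlation described above can be illustrated with a minimal sketch. It is not the DSPC procedure itself: it simply inverts a sample covariance matrix, which only works when there are many more samples than variables, and it uses simulated data for three hypothetical variables arranged in a chain (a affects b, b affects c). All names and values are assumptions made for illustration.

```python
import numpy as np

def partial_correlation(data):
    """Partial correlation matrix from the inverse covariance (precision) matrix.

    Simplified illustration only: plain matrix inversion, no sparsity
    penalty and no de-biasing step.
    """
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Simulated (hypothetical) scores: a -> b -> c, with no direct a-c link.
rng = np.random.default_rng(0)
n = 5000
a = rng.normal(size=n)
b = a + 0.5 * rng.normal(size=n)
c = b + 0.5 * rng.normal(size=n)
scores = np.column_stack([a, b, c])

pearson = np.corrcoef(scores, rowvar=False)
partial = partial_correlation(scores)

# Pearson reports a strong a-c association (inherited through b),
# whereas the a-c partial correlation is close to zero.
print("Pearson a-c:", round(pearson[0, 2], 3))
print("Partial a-c:", round(partial[0, 2], 3))
```

In this toy chain the indirect a-c edge would appear in a Pearson network but not in a partial correlation network, while the direct a-b and b-c edges are retained in both.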
Figure 1
Genetic coessentiality analysis assigns metabolic functions to
uncharacterized genes
A. Scheme of the computational steps to generate the
metabolic coessentiality network.
B. Heatmap depicting the partial correlation values of the
essentialities of genes in the metabolic coessentiality networks.
C. Correlated essentialities of the genes encoding members
of glycolysis, pyruvate metabolism, squalene synthesis, mevalonate and sialic
acid metabolism. The thickness of the lines indicates the level of partial
correlation.
D. Genetic coessentiality analysis assigns metabolic
functions to uncharacterized genes. Orange and blue boxes show genes with
unknown and known functions, respectively. The thickness of the lines is
indicative of partial correlation.
E. Pearson correlation values of the essentiality scores
of genes in indicated metabolic networks.
F. Unbiased clustering of fitness variation of indicated
genes across 558 human cancer cell lines.
Extended Data Fig. 1
Comparative Simulation between partial and Pearson correlation
A. Simulation experiment of a subnetwork from an E. coli network
demonstrating the advantage of using partial correlation over Pearson
correlation.
B. Receiver operating characteristic (ROC) curve based on the
simulated data. (n= 500 independent samples)
Extended Data Fig. 2
Metabolic coessentiality modules
35 Metabolic coessentiality modules. Blue line indicates a
previously known interaction between the genes. Poorly characterized genes
are highlighted as orange.
Among these networks are glycolysis (PGAM1,
GPI, ENO1, HK2,
PGP), squalene synthesis (FDPS,
FDFT1, SQLE), sialic acid metabolism
(SLC35A1, CMAS, GNE,
NANS), plasmalogen synthesis (FAR1, AGPS, TMEM189,
PEX7) and pyruvate utilization (MPC2,
PDHB, DLAT, CS, MDH2,
MPC1) but also networks that were not part of a known metabolic
pathway, suggesting the presence of unidentified metabolic pathways (Fig. 1c). Our analysis also identified associations between
genes of unknown function and those encoding components of well-characterized metabolic
pathways. Interestingly, the functions of three of these genes have recently been
discovered (Fig. 1d, Extended Data Fig. 2). UBIAD1, a prenyltransferase, has been shown to bind
to HMGCR to promote its degradation at the ER in the presence of sterols[14]. CHP1, which is associated with the glycerolipid
synthesis pathway in our analysis, binds to and is necessary for the function of the
protein product of AGPAT6, the rate-limiting enzyme for glycerolipid
synthesis[15]. Additionally, a
recent study identified TMEM189, a gene associated with plasmalogen
synthesis, as the elusive plasmanylethanolamine desaturase[16]. Interestingly, squalene and mevalonate
synthesis clustered into different networks, consistent with additional functions of the
branches of cholesterol metabolism. Indeed, while loss of HMG-CoA synthase would
decrease all intermediates as well as cholesterol, loss of squalene synthase or
downstream enzymes would decrease cholesterol but increase upstream intermediates, hence
leading to different cellular outcomes[17]. Finally, several genes of unknown function, such as
C12orf49 and TMEM41A, have correlated
essentialities with those of genes encoding components of sterol regulatory element
binding proteins (SREBP)-regulated lipid metabolism, raising the possibility that they
may be involved in the regulation of SREBPs or their downstream targets (Fig. 1e,f; Extended Data Fig. 3a). Due to their strong correlation and
unknown function, we focused our attention on these two genes.
Extended Data Fig. 3
C12orf49 is necessary for cell growth under sterol depletion
A. Pearson correlation values of the essentiality
scores of the indicated genes across different cancer cell lines
(n=558).
B. Differential sgRNA score for C12orf49 gene of Jurkat
cell line in the presence or absence of sterols.
C. Fold change in cell number (log2) of U-87 MG or
MDA-MB-435 c12orf49_KO cell line following a 6-day growth under lipoprotein
depletion in the absence or presence of sterols. (mean ± SD, n=3
biologically independent samples). Statistical significance was determined
by two-tailed unpaired t-test.
D. Immunoblots of c12orf49 in the indicated knockout
cells of HEK293T. Actin was used as the loading control. The experiment was
repeated independently twice with similar results.
E. (left) Immunoblots of c12orf49 knockout and addback
cells in Jurkat cells. Actin was used as the loading control. The experiment
was repeated independently twice with similar results. (right) Fold change
in cell number (log2) of indicated knockout and rescued addback Jurkat cells
following a 6-day growth under lipoprotein depletion in the absence or
presence of sterols. (mean ± SD, n=3 biologically independent
samples). Statistical significance was determined by two-tailed unpaired
t-test.
F. Fold change in cell number (log2) of indicated
knockout and rescued addback HEK293T cells following a 6-day growth under
lipoprotein depletion in the absence or presence of sterols. (mean ±
SD, n=3 biologically independent samples). Statistical significance was
determined by two-tailed unpaired t-test.
Sterol regulatory element binding proteins (SREBPs) are transcription factors
that regulate transcription of genes encoding many enzymes in cholesterol and fatty
acid synthesis[18]. SREBPs are normally
bound to endoplasmic reticulum (ER) membranes and are activated through a proteolytic
cascade regulated by sterols[19,20]. Cleaved SREBPs localize to the nucleus
and induce expression of cholesterol synthesis genes, enabling cells to survive under sterol depletion[21,22]. Given the strong coessentialities of
C12orf49 and TMEM41A with the SREBP pathway, we
hypothesized that these uncharacterized genes may be required for the activation of
cholesterol synthesis and cell proliferation upon cholesterol deprivation. To address
this possibility, we generated a small CRISPR library consisting of 103 sgRNAs targeting
genes involved in SREBP maturation and lipid metabolism (3–8 sgRNA/gene) (Fig. 2a). Using this focused library, we performed
negative selection screens for genes whose loss potentiates anti-proliferative effects
of lipoprotein depletion. Among the scoring genes were MBTPS1 and
SCAP, both of which are involved in SREBP processing[23-25], but also C12orf49, a gene of unknown function
that has not been previously linked to cholesterol metabolism (Fig. 2b, Extended Data Fig.
3b). Consistent with the screening results, depletion of
C12orf49 strongly decreases proliferation of HEK293T, Jurkat and
other cancer cell lines (U87 and MDA-MB-435) under cholesterol depletion, indicating a
generalized role for C12orf49 in cholesterol homeostasis (Fig. 2c,d; Extended Data Fig. 3c,d). Importantly, expression of an sgRNA-resistant human C12orf49 cDNA in the null cells or free cholesterol addition
completely restores proliferation under lipoprotein depletion (Extended Data Fig. 3e,f).
None of the SREBPs scored, likely owing to their highly complementary and redundant functions.
Notably, TMEM41A was not a scoring gene in these screens, suggesting
that it may function in other downstream processes regulated by SREBPs, such as lipid
biosynthesis or saturation. Indeed, TMEM41A, similar to fatty acid synthesis enzymes,
localizes to the ER and its loss substantially impacts cellular lipid composition (Extended Data Fig. 4a-c). In individual assays, TMEM41A-null cells are more
sensitive to palmitate treatment, which kills cells at high concentrations, likely
due to the dysregulation of membrane saturation (Extended Data Fig. 4d,e). Altogether,
these results identify C12orf49 and TMEM41A as major
components of cholesterol and fatty acid metabolism.
Figure 2
C12orf49 is necessary for cholesterol synthesis and SREBP-induced gene
expression in human cells
A. Schematic for the focused CRISPR-Cas9 based genetic
screen.
B. Differential sgRNA scores for the indicated genes. Blue
bars indicate genes that are significantly and differentially essential under
lipoprotein depletion. Boxes represent the median, and the first and third
quartiles, and the whiskers represent the minimum and maximum of all data
points. n=8 independent sgRNAs targeting each gene except for previously
validated sgRNAs for ACSL3 (n=3) and ACSL4
(n=4)[15].
C. Immunoblot of C12orf49 in the indicated cancer cell
lines (left). Actin was used as the loading control. Fold
change in cell number (log2) of Jurkat wild type and C12orf49_KO
cells following 6-day growth under lipoprotein depletion with the indicated
treatments (mean ± SD, n=3 biologically independent samples)
(middle). Representative images of indicated cell lines
under lipoprotein depletion at the end of the experiment
(right).
D. Fold change in cell number (log2) of HEK293T
wild type and C12orf49_KO cells following 6-day growth under lipoprotein
depletion with the indicated treatments (mean ± SD, n=3 biologically
independent samples).
E. Mass isotopologue analysis of cholesterol in Jurkat
wild type and C12orf49_KO cells in the absence and presence of sterols after 48
hours of incubation with 13C-acetate (mean ± SD, n=3
biologically independent samples).
F. Fold change in mRNA levels (log2) of SREBP
target genes in indicated Jurkat cell lines following 8h growth under
lipoprotein depletion in the presence and absence of sterols (mean ± SD,
n=3 biologically independent samples).
G. Immunoblots of SREBP target proteins in indicated
Jurkat cell lines following 24h growth under lipoprotein depletion in the
presence and absence of sterols. Actin was used as the loading control.
H. Immunoblots of mature SREBP1 and SREBP2 in indicated
Jurkat cell lines following 24h growth under lipoprotein depletion in the
presence and absence of sterols. Lamin B1 was used as the loading control.
I. Localization of SREBP1 in
C12orf49-null HEK293T cells expressing control or
C12orf49 cDNA under lipoprotein depletion in the presence
or absence of sterols (Scale bar, 8 μm).
The experiments were repeated independently at least twice with similar
results. Statistical significance was determined by two-tailed unpaired
t-test.
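The proliferation readout used throughout these panels, fold change in cell number (log2) summarized as mean ± SD over three replicates, can be sketched as follows. The cell counts and the seeding number below are hypothetical values chosen for illustration, not data from the study.

```python
import math
import statistics

def log2_fold_change(final_counts, initial_count):
    """log2 of (final cell number / seeded cell number) for each replicate."""
    return [math.log2(count / initial_count) for count in final_counts]

# Hypothetical counts after growth under lipoprotein depletion;
# 1e5 cells seeded per replicate (assumed value).
wild_type = log2_fold_change([1.6e6, 1.5e6, 1.7e6], initial_count=1e5)
knockout = log2_fold_change([2.1e5, 1.9e5, 2.0e5], initial_count=1e5)

for name, values in (("wild type", wild_type), ("knockout", knockout)):
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD over n=3 replicates
    print(f"{name}: {mean:.2f} ± {sd:.2f} (log2 fold change)")
```

On this scale a value of 1 corresponds to one population doubling, so a difference of 3 units between conditions reflects an eightfold difference in final cell number.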
Extended Data Fig. 4
TMEM41A is involved in lipid metabolism
A. Pearson correlation values of the essentiality
scores of the indicated genes across different cancer cell lines
(n=558).
B. Localization of TMEM41A to ER. Wild type HEK293T
cells expressing FLAG-TMEM41A cDNA were processed for immunofluorescence
analysis using antibodies against FLAG and PDI (ER). White color indicates
overlap (Scale bar, 8 μm). The experiment was repeated independently
twice with similar results.
C. Heatmap showing the relative abundance of indicated
lipid species in TMEM41A-null Jurkat cells and those expressing sgRNA
resistant TMEM41A cDNA.
D. Immunoblot of TMEM41A in Jurkat wild type cell line,
TMEM41A nulls and those expressing TMEM41A cDNA. Actin was used as the
loading control. The experiment was repeated independently twice with
similar results.
E. Fold change in cell number (log2) of Jurkat wild
type cell line, TMEM41A-null cells and those expressing TMEM41A cDNA after a
7-day growth upon treatment of indicated palmitate concentrations
(0–80 μM) (mean ± SD, n=3 biologically independent samples).
Statistical significance was determined by two-tailed unpaired t-test.
We next sought to understand why cells require C12orf49 to
proliferate under cholesterol depletion. To first determine whether C12orf49 is necessary for de novo cholesterol synthesis, we
performed metabolite tracing experiments in Jurkat cells using
[U-13C]-acetate (Fig. 2e). While acetate
contributes to cellular cholesterol under lipoprotein depletion, we observed
significantly lower labeling in C12orf49-null cells, indicating a
defect in synthesis (Fig. 2e). Consistent with
the requirement of sterols for viral infection[26-28],
C12orf49 loss also decreases Bunyamwera virus infectivity in
mammalian cell lines and total viral titers (Extended Data
Fig. 5a). As the cholesterol synthesis pathway comprises over thirty successive
steps that are transcriptionally regulated[22,29-32], we considered that a dysfunction in gene
expression might lead to defective synthesis and reliance on extracellular cholesterol.
Indeed, C12orf49-null cells fail to induce expression of cholesterol
metabolism genes under sterol depletion (Fig.
2f,g). Furthermore, in line with the
role of SREBPs in the transcription of cholesterol synthesis genes, loss of C12orf49
reduced mature (cleaved) SREBP protein levels and blocked nuclear translocation of
SREBPs (Fig. 2h,i). Similarly, expression of other genes known to be induced by SREBPs, such
as fatty acid synthase (FASN), low density lipoprotein receptor
(LDLR), acetyl-coA carboxylase (ACC) and ATP citrate lyase
(ACLY) did not change in C12orf49-null cells
(Fig. 2f,g, Extended Data Fig. 5b). Finally, SREBPs
fail to induce the transcription of the reporter luciferase under the control of sterol
regulatory elements in C12orf49-null cells (Extended Data Fig. 5c). These results suggest that C12orf49,
like SCAP and MBTPS1, is necessary for SREBP activation and subsequent regulation of its
biosynthetic targets.
Extended Data Fig. 5
Role of C12orf49 in sterol synthesis and SREBP-mediated
transcription
A. (top left) Percentage of Bunyamwera virus-positive
cells at 72 hours post-infection (MOI = 0.1 IU/ml) in indicated knockout and
addback HEK293T cells (mean ± SD, n=3 biologically independent
samples). Statistical significance was determined by two-tailed unpaired
t-test. (top right) Viral titer measured by TCID50 assays on BHK-21 cells
with the harvested supernatant from the Bunyamwera virus infected HEK293T
cells of C12orf49 knockouts and addbacks. (mean ± SD, n=3
biologically independent samples) Statistical significance was determined by
two-tailed unpaired t-test. (bottom) Growth of the viral titers at different
time points in the knockout and addback cells.
B. Fold change in mRNA levels (log2) of SREBP target
genes in indicated Jurkat cell lines following 8h growth under lipoprotein
depletion in the presence and absence of sterols (mean ± SD,
n=3).
C. Relative luminescence activity (Luciferase/Renilla)
in the indicated HEK293 cell lines following transfection with firefly
luciferase under SRE promoter and Renilla luciferase for normalization of
transfection following 24h growth under lipoprotein depletion in the
presence and absence of sterols (mean ± SD, n=3 biologically
independent samples). Statistical significance was determined by two-tailed
unpaired t-test.
C12orf49 is ubiquitously expressed among different tissues
(Extended Data Fig. 6a) and contains an
uncharacterized conserved domain, DUF2054 (Extended Data
Fig. 6b-e). Upon sterol depletion,
SCAP, a chaperone protein, transports SREBP to the Golgi complex where it is
subsequently cleaved by membrane bound transcription factor peptidase, Site 1 (MBTPS1,
site-1-protease). The evidence that a primary role of C12orf49 may be in SREBP
processing raised the question of where within this pathway C12orf49 functions. To
address this, we treated cells with brefeldin A, which disassembles the Golgi
compartments and redistributes them to the ER, eliminating the need for SREBP transport
to the Golgi and allowing the cleavage of SREBP1 regardless of the presence of
sterols[33,34]. Interestingly, brefeldin A treatment failed to
induce SREBP cleavage in C12orf49-null cells, strongly suggesting that
C12orf49 functions downstream of SCAP localization (Fig.
3a). Notably, overexpression of the mature SREBP isoforms completely
eliminated the sensitivity of C12orf49-null cells, indicating that
C12orf49 does not impact nuclear function of mature SREBP (Fig. 3b). Consistent with its role downstream of SCAP, C12orf49 mainly
localizes to cis- and trans-Golgi (GM130 and p230, respectively) (Fig. 3c). While the N-terminal region of C12orf49 provides the
Golgi localization signal of the protein, this region is dispensable for SREBP
activation (Fig. 3d). Instead, localizing the
conserved DUF2054 domain to the Golgi, but not to other organelles (ER and mitochondria), is
sufficient to activate SREBP cleavage and signaling, as well as proliferation under
lipoprotein depletion (Fig. 3e,f; Extended Data Fig.
6f).
Extended Data Fig. 6
C12orf49 gene expression in various tissues
A. Gene expression analysis across different tissues
for C12orf49. Box plots are shown as median and 25th and 75th percentiles;
points are displayed as outliers if they are above or below 1.5 times the
interquartile range (Source: GTEx Portal).
B. DUF2054 profile hidden Markov Model (HMM) logo from
Pfam shows 14 conserved cysteines, 3 of which are CC-dimers.
C. Different architectures of DUF2054 in different
species. (Source: Pfam)
D. Occurrence of DUF2054 domain across different
species.
E. Predicted N-glycosylation site (UniProtKB) and
transmembrane domains (predicted with TMHMM v.2.0) for C12orf49.
F. Scheme for different functional domains of
C12orf49.
Figure 3
C12orf49 is a Golgi localized protein and binds S1P to regulate cholesterol
metabolism
A. Scheme depicting the action of Brefeldin A which
disassembles the Golgi compartments and redistributes them to the ER (left).
Immunoblots of mature SREBP1 and SREBP2 in indicated Jurkat cells in the
presence and absence of sterols or Brefeldin A (1 μg/ml) for 6 hours in the
lipoprotein depleted serum (right). Lamin B1 was used as the loading
control.
B. Fold change in cell number (log2) of Jurkat
wild type and C12orf49_KO cells overexpressing a control or mature SREBP cDNA
following 7-day growth under lipoprotein depleted serum in the absence or
presence of sterols (mean ± SD, n=3 biologically independent
samples).
C. Localization of C12orf49 to the Golgi. Wild type
HEK293T cells expressing C12orf49 cDNA were processed for
immunofluorescence analysis using antibodies against c12orf49, calreticulin
(ER), p230 (trans-Golgi) and GM130 (cis-Golgi). White color indicates overlap.
(Scale bar, 8 μm).
D. N-terminal region of C12orf49 is sufficient for Golgi
localization. Wild type HEK293T cells expressing C12orf49(1–70)-
HA-mNeonGreen cDNA were processed for immunofluorescence analysis using
antibodies against HA and GM130 (Golgi). White color indicates overlap. (Scale
bar, 8 μm)
E. Fold change in cell number (log2) of Jurkat
C12orf49_KO cells overexpressing indicated cDNAs following 6-day growth under
lipoprotein depletion serum with indicated sterol concentrations (mean ±
SD, n=3 biologically independent samples) (left).
Immunofluorescence analysis of overexpressed DUF2054 domain alone or tagged with
the Golgi targeting sequence of B3GALT1 (amino acids 1–61) in HEK293T
cells (right). White indicates overlap (Scale bar, 8
μm).
F. Immunoblots of SREBP1 and several SREBP target proteins
of Jurkat C12orf49_KO cell lines expressing the indicated cDNAs following 24h
growth under lipoprotein depletion in the presence and absence of sterols. Actin
and Lamin B1 were used as the loading controls for whole cell and nuclear
extracts, respectively.
G. iBAQ based mass spectrometric analysis identified
proteins immunoprecipitated from HEK293T cells expressing FLAG-C12orf49 (n=6
biologically independent samples) or GalT-FLAG cDNA (n=2
biologically independent samples). Log2 transformed fold differences
are indicated on x-axis. Selected proteins are marked to show proteins of
particular interest. Filled circles indicate that a protein was not detectable
in the control samples. For visualization, an unpaired two-tailed t-test was
performed.
H. Immunoblot analysis of C12orf49 interaction partners.
Glycosylated MBTPS1 co-immunoprecipitated with c12orf49. GalT- FLAG was used as
a near-neighbor control immunoprecipitation.
I. Immunoblot analysis of c12orf49 immunoprecipitates in
the HEK293T C12orf49_KO cells expressing the indicated cDNAs. DUF2054 was
localized to mitochondria, ER or Golgi using specified targeted sequences.
The experiments were repeated independently at least twice with similar
results. Statistical significance was determined by two-tailed unpaired
t-test.
To begin to understand the precise mechanism by which C12orf49 regulates SREBP
processing and cholesterol metabolism, we sought to identify candidate regulators of
SREBP processing that interact with C12orf49. Mass spectrometric analyses of
immunoprecipitates of C12orf49, as compared to a Golgi-localized control, revealed the
presence of several proteins including OS9 and MBTPS1 (Fig. 3g, Extended Data Fig. 7a).
MBTPS1 is a member of the subtilisin-like proprotein convertase family and is originally
made as an inactive precursor in the ER[35]. This inactive precursor undergoes a series of autocatalytic
cleavages at two sites, creating active forms, which can be glycosylated[33,36]. In turn, active forms of site-1-protease catalyze the proteolytic
cleavage of its substrates including SREBPs. In individual immunoprecipitation
experiments, C12orf49 specifically immunoprecipitates with an N-glycosylated form of
S1P, as shown by its sensitivity to PNGase F, a glycosidase that cleaves the asparagine
linked glycosylation residues (Fig. 3h). This
interaction requires the correct localization of the protein to the Golgi and the
presence of the DUF2054 domain, as forced localization of the protein to other organelles
prevents the interaction (Fig. 3i). Notably, loss
of C12orf49 impacts cleavage of S1P targets including GNPTAB[37], CREB3L2 and CREB4[38], though at different levels (Extended Data Fig. 7b). Consistent with the dysfunction of the
Golgi-ER recycling of SCAP in the absence of S1P activity[39], SCAP localizes to the Golgi even in the
presence of sterols in the C12orf49 knockouts. These experiments suggest
that the Golgi-localized C12orf49 binds and regulates S1P function (Extended Data Fig. 7c).
Extended Data Fig. 7
The impact of C12orf49 loss on the cleavage of MBTPS1 targets
A. Immunoblot analysis of OS9 in the C12orf49
immunoprecipitates of the HEK293T cell line expressing the indicated cDNAs.
The experiment was repeated independently twice with similar results.
B. Immunoblot analysis of cleavage of other site-1
protease targets, GNPTAB, CREB3L2 and CREB4 at 24-hours following
transfection in the C12orf49-knockout and addback HEK293T cells. Actin was
used as loading control. The experiment was repeated independently twice
with similar results.
C. Localization of SCAP-GFP in c12orf49 null HEK293T
cells expressing control or C12orf49 cDNA under lipoprotein depletion in the
presence or absence of sterols (Scale bar, 8 μm). The experiment was
repeated independently twice with similar results.
Because C12orf49 is conserved in metazoa and in some
plants, we next asked whether these homologs could replace C12orf49 when expressed in
human cells (Fig. 4a; Extended Data Fig. 8a). With the exception of the
A. thaliana homolog, overexpression of any of the
C12orf49 homologs rescued the sensitivity of Jurkat C12orf49-knockout cells to cholesterol depletion and restored SREBP
activation (Fig. 4b,c). Notably, A. thaliana C12orf49 possesses a long
C-terminal glycosyltransferase domain, raising the possibility that this protein may
have evolved an additional role in plants (Extended Data
Fig. 6c). Collectively, these results suggest that the functional
relationship between C12orf49 and S1P is evolutionarily conserved.
Figure 4
C12orf49 function is conserved and essential for organismal lipid
homeostasis
A. Phylogenetic tree of C12orf49 in
organisms.
B. Fold change in cell number (log2) of Jurkat
C12orf49_KO cells overexpressing indicated C12orf49 cDNAs of different organisms
following a 6-day growth under lipoprotein depletion in the presence or absence
of sterols (mean ± SD, n=3 biologically independent samples). Statistical
significance was determined by two-tailed unpaired t-test.
C. Immunoblots of SREBP1 (nuclear) and SREBP target
proteins of Jurkat c12orf49_KO cell lines expressing the indicated cDNAs
following 24h growth under lipoprotein depletion in the presence and absence of
sterols. Actin and Lamin B1 were used as the loading controls for whole cell and
nuclear extracts, respectively. The experiment was repeated independently twice
with similar results.
D. Schematic showing genomic locus of zebrafish
c12orf49, g1 and g2 guide RNA target sites are marked by
arrows.
E. Experimental strategy for feeding and dietary clearance
assay.
F. Lipid absorption defects are marked by Oil Red O
staining (full gut) in mutant larvae. Quantification shows similar defects in
c12orf49 (trans-heterozygous germline mutant) and mbtps1 germline
mutants, as well as c12orf49-gRNA injected larvae
(c12orf49 and c12orf49). Number of larvae with represented phenotype is
indicated on corresponding images. Gut is demarcated by dashed lines.
G. CRISPR-Cas9 generated mutations detected in
c12orf49 and c12orf49 injected larvae. del: deletion, ins: insertion,
sub: substitution. Number of base pair changes are indicated. Dashes indicate
deletions, insertions are shown in green, substitutions in lower-case
letters.
H. Flow chart describing disease association study using
PrediXcan method in BioVU biobank.
Significance is tested by logistic regression analysis (two-sided), n =
25,000. Multiple testing adjustment is done using Bonferroni correction. GTEx:
Genotype-Tissue Expression, EHR: electronic health record.
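The Bonferroni adjustment mentioned in panel H can be sketched briefly: each per-phenotype p-value is compared against the nominal significance level divided by the number of tests. The phenotype labels, p-values and alpha below are hypothetical placeholders, not results or settings from the study.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests whose p-value passes the Bonferroni-adjusted threshold.

    p_values: dict mapping a test label to its unadjusted p-value.
    Returns (dict of label -> bool, adjusted threshold).
    """
    threshold = alpha / len(p_values)
    calls = {label: p <= threshold for label, p in p_values.items()}
    return calls, threshold

# Hypothetical association p-values for four phenotypes.
p_values = {
    "phenotype_A": 1e-6,
    "phenotype_B": 0.004,
    "phenotype_C": 0.2,
    "phenotype_D": 0.03,
}

calls, threshold = bonferroni_significant(p_values)
print("adjusted threshold:", threshold)
for label, significant in calls.items():
    print(label, "significant" if significant else "not significant")
```

With four tests the threshold becomes 0.0125, so a nominally significant p-value such as 0.03 no longer passes; in a phenome-wide scan the number of tests, and hence the stringency, is far larger.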
Extended Data Fig. 8
Conservation of C12orf49 function in metazoa and zebrafish
A. Phylogenetic tree of the C12orf49 genes across
species (Source: TreeFam).
B. DNA gel showing the cutting efficiencies of c12orf49
sgRNAs used in the zebrafish experiments. Upper bands (smears) represent DNA
heteroduplexes caused by CRISPR-Cas9 mutations; lower band is unedited DNA.
This assay was repeated twice with similar results.
C. Strategy to evaluate the effect of
CRISPR-Cas9-generated c12orf49 mutations at transcript level. c12orf49-g2
founder F0 fish were crossed and F1 progeny was individually analyzed.
Briefly, RNA was isolated from individual larvae, then cDNA was synthesized.
Using exon-specific primers g2 target site was PCR amplified and sequenced.
Various mutations detected from transcripts are shown.
Building upon the conserved function and to further study
C12orf49 in a more physiologically relevant context, we used
zebrafish as a model organism. Since our biochemical data show that S1P is unable to
cleave and activate SREBP in the absence of C12orf49, we postulated that zebrafish
s1p-mutant
(mbtps1 allele shown to block
SREBP activation[40]) and
c12orf49-mutant models would demonstrate comparable phenotypes in
their lipid metabolism. Indeed, a dietary lipid clearance assay on a high-cholesterol
diet revealed similar intestinal lipid absorption blockade in both
s1p and
c12orf49 mutants generated by CRISPR/Cas9 system (Fig. 4d-g; Extended Data Fig. 8b,c). While previous studies showed cranioskeletal malformations associated
with mbtps1 mutations, c12orf49 mutants do not display
these phenotypes, suggesting that mbtps1 targets may be affected to a
different extent upon c12orf49 loss (Extended Data Fig. 7b) or alternative pathways exist to compensate for the
loss in different tissues. Collectively, these results suggest that C12orf49, like S1P,
may regulate lipid metabolism in vivo. To gain insight into
C12orf49 function in human physiology, we also examined disease
associations to reduced genetically regulated expression (GReX) of
C12orf49 in the genotype-linked Electronic Health Records (EHR) of
the BioVU biobank[41,42] using the PrediXcan[43] method. This analysis, performed in
~25,000 BioVU subjects, revealed a significant association of reduced
C12orf49 GReX to mixed hyperlipidemia (p=0.0326) and other
secondary intestinal phenotypes (Fig. 4h; Extended Data Fig. 9). These results collectively
suggest that C12orf49 functions in organismal lipid homeostasis and may
be associated with dysregulated lipid metabolism in humans.
Extended Data Fig. 9
GReX analysis identifies C12orf49 association with mixed
hyperlipidemia
Disease traits associated with reduced c12orf49 GReX in BioVU
biobank. Phecodes are indicated in parentheses. Traits are categorized into
systems (y-axis), and significance is displayed on x-axis. Significance is
tested by logistic regression analysis (two-sided), n = 25,000. Multiple
testing adjustment is done using Bonferroni correction.
Metabolic coessentiality network offers an alternative method to discover
unknown components of cellular metabolism and functionally assign them to existing
pathways. Using this method, here, we identify C12orf49 as an essential
component of SREBP processing and cholesterol-sensing in mammalian cells. Precisely how
C12orf49 contributes to the proteolysis of SREBPs is not known but our findings suggest
that its interaction with S1P is likely involved in the regulation of cholesterol
metabolism. Remarkably, C12orf49 is highly conserved, even in lower organisms. As a
subset of these organisms lacks an SREBP ortholog yet harbors orthologs of
C12orf49 and MBTPS1, the association between C12orf49 and S1P is likely relevant to
cellular processes other than SREBP in these organisms. Interestingly,
C12orf49 is associated with hyperlipidemia, so future work
is needed to understand whether this protein is implicated in human disease or has
any clinical value. In conclusion, our work adds a new component to cellular cholesterol
regulation and provides a platform to determine the function of other unknown metabolic
components.
MATERIALS AND METHODS
Metabolic Coessentiality analysis
We adopted a three-step method to build a putative interaction network
among genes based on their co-essentiality scores. In step I, we removed genes
that were strongly correlated with a large number of genes, because the pathway
analysis literature suggests that few proteins have many interaction partners. To
do this, we calculated a Pearson correlation network among all 17,638 genes with
a threshold of |r|=0.25. Then we ranked the genes based on their degrees in this
network and removed the top 10% from downstream analysis.
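Step I can be sketched as follows. This is a minimal illustration rather than the code used for the analysis: the function names, the strict inequality |r| > 0.25 and the input layout (a dictionary of per-gene score lists) are assumptions made for the example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def remove_high_degree_genes(scores, r_cutoff=0.25, top_fraction=0.10):
    """Drop the genes with the highest degree in the |r| > r_cutoff network.

    scores: dict mapping gene name -> list of essentiality scores,
            one value per cell line (same cell-line order for every gene).
    Returns the retained genes and the degree of every gene.
    """
    degree = {gene: 0 for gene in scores}
    for g1, g2 in combinations(scores, 2):
        if abs(pearson(scores[g1], scores[g2])) > r_cutoff:
            degree[g1] += 1
            degree[g2] += 1
    # Rank genes by degree (highest first) and remove the top fraction.
    ranked = sorted(scores, key=lambda gene: degree[gene], reverse=True)
    removed = set(ranked[:int(len(ranked) * top_fraction)])
    kept = {gene: vals for gene, vals in scores.items() if gene not in removed}
    return kept, degree
```

The all-pairs loop is quadratic in the number of genes, so at genome scale a vectorized correlation matrix would be preferable; the sketch favours readability.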
In steps II and III, we built partial correlation networks following the
Correlation Analysis workflow proposed in Section 3.1 of previous work[13]. Since calculating partial
correlation among essentiality scores of many genes using fewer cell lines is
computationally intensive, this workflow builds on a useful property of Gaussian
graphical models that was previously established[44]. This property ensures that genes in
different connected components of the partial correlation network are marginally
uncorrelated. Therefore, we can first construct a network by applying a
threshold on Pearson correlation, and then estimate partial correlation networks
separately for each of its connected components.
In step II of our analysis, we built such a Pearson correlation network
with a threshold |r|=0.35. Since we are only interested in finding novel genes
that interact with metabolic genes, we removed all the non-metabolic genes that
are not connected to any metabolic genes in this network, using a curated
metabolic gene set[45-47]. Of note, we curated this
metabolic gene set by exhaustive analysis of every known human gene combined
with searches of KEGG database and literature verifying the known or proposed
metabolic function of each gene[45]. Focusing on positive Pearson correlations, this led to a
network with 515 genes (275 metabolic genes, 240 non-metabolic genes) consisting
of 55 components (component size varied between 3 and 20).
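A possible sketch of step II is given below. The function names and the toy gene labels used with it are hypothetical, and "connected to a metabolic gene" is read here as sharing an edge with one, which is one interpretation of the text.

```python
from collections import defaultdict


def connected_components(edges):
    """Return connected components (as sets of nodes) of an undirected graph."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def metabolic_components(correlations, metabolic_genes, r_cutoff=0.35):
    """Threshold positive correlations, prune, and split into components.

    correlations: dict mapping (gene_a, gene_b) -> Pearson r.
    Non-metabolic genes are kept only if they share an edge with at least
    one metabolic gene (an assumption of this sketch).
    """
    metabolic_genes = set(metabolic_genes)
    edges = [pair for pair, r in correlations.items() if r > r_cutoff]
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    keep = {g for g in neighbours
            if g in metabolic_genes or neighbours[g] & metabolic_genes}
    pruned = [(a, b) for a, b in edges if a in keep and b in keep]
    return connected_components(pruned)
```

Each returned component can then be analysed separately, which is what makes the partial correlation step tractable.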
In step III, we calculated separate partial correlation matrices for
each of these connected components and used statistically significant partial
correlations (FDR < 0.05) to construct the putative interaction network.
We used R function ‘pcor’ from library ‘ppcor’, and
debiased graphical lasso[48]
implemented in the DSPC software[13], as two different ways to calculate partial correlation
networks. The debiased graphical lasso has an in-built regularization step and
is particularly suitable when the number of genes in the network is high
compared to the number of cell lines. Since the Pearson network components were
reasonably small, the results of the two methods were qualitatively similar and
we reported the output from ‘pcor’ in this paper. Finally, we
removed interactions between genes located within ±1 cytogenetic band of each other to
reduce false interactions, because CRISPR-Cas9 genome editing has been reported to
induce large truncations[49,50].
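For readers unfamiliar with partial correlation, the computation for one connected component can be sketched in plain Python. This is an illustration under stated assumptions, not the ‘pcor’ or DSPC implementation: it uses the standard identity that the partial correlation of genes i and j given all others equals -P_ij / sqrt(P_ii P_jj), where P is the inverse of the correlation matrix, and it omits p-values, FDR control and the regularization used by the debiased graphical lasso.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(corr):
    """Partial correlation of each pair given all other variables.

    corr: correlation (or covariance) matrix as a list of lists.
    """
    precision = invert(corr)
    n = len(corr)
    return [[1.0 if i == j else
             -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
             for j in range(n)] for i in range(n)]
```

The inversion requires more cell lines than genes in the component, which is why the text notes that a regularized estimator is preferable when components are large relative to the number of cell lines.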
Cell lines
Cell lines HEK293T, Jurkat, MDA-MB-435, U-87 and BHK-21 were purchased
from the ATCC. Cell lines were verified to be free of mycoplasma contamination
and the identities of all were authenticated by STR profiling.
Antibodies, compounds and constructs
Custom antibodies for c12orf49 and TMEM41A were designed and generated at
YenZym Antibodies, using synthetic peptides with the sequences QEERAVRDRNLLQVHDHNQP (amino
acids 37–56 of c12orf49) and ETSTANHIHSRKDT (amino acids 251–264
of TMEM41A). Other antibodies, compounds, supplies, equipment, software,
experimental models and constructs are provided in the supplementary files.
Cell Culture Conditions
Jurkat cells were maintained in RPMI media (GIBCO) containing 2 mM glutamine,
10% fetal bovine serum, penicillin and streptomycin. HEK293T, U87M and
MDA-MB-435 cells were maintained in DMEM media (GIBCO) containing 4.5g/L
glucose, 4mM glutamine, 10% fetal bovine serum, penicillin and streptomycin. All
cells were maintained in monolayer culture at 37ºC and 5% CO2.
Focused CRISPR-based genetic screen
The highly focused sgRNA library was designed by including
representation of each gene within the SREBP module. For some of the genes, our
sgRNAs have previously been published and validated[15]; we therefore used a smaller number of
sgRNAs for these genes. Oligonucleotides for sgRNAs were synthesized by
Integrated DNA Technologies and annealed before they were introduced in
lentiCRISPR-v2 vector using a T4 DNA ligase kit (NEB), following
manufacturer’s instructions. Ligation products were then transformed in
NEB stable competent E. coli (NEB) and the resulting colonies
were grown overnight at 32 °C and plasmids isolated by Miniprep (QIAGEN).
This plasmid pool was used to generate a lentiviral library containing five
sgRNAs per gene target. This viral supernatant was titred in each cell line by
infecting target cells at increasing amounts of virus in the presence of
polybrene (8 μg ml−1) and by determination of cell
survival after 3 days of selection with puromycin. One million Jurkat cells were
infected at a MOI of 1 before selection with puromycin for 3 days. An initial
pool of one million cells was collected. Infected cells were then cultured for
14 population doublings in the lipoprotein depleted serum containing media in
the presence or absence of cholesterol, after which one million cells were
collected and their genomic DNA was extracted by a DNeasy Blood & Tissue kit
(QIAGEN). For amplification of sgRNA inserts, we performed PCR using specific
primers for each condition. PCR amplicons were then purified and sequenced on a
MiSeq (Illumina). Sequencing reads were mapped and the abundance of each sgRNA
was measured. The sgRNA score is defined as the log2 fold change in the
abundance of an sgRNA targeting a particular gene between the initial and final
populations. The guide scores and guide sequences are
available in Supplementary
Table 1.
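The sgRNA score defined above can be illustrated with a short sketch. Normalizing counts to the total reads of each sample and adding a pseudocount are assumptions of this example; the text does not specify how abundance was normalized.

```python
import math


def sgrna_scores(initial_counts, final_counts, pseudocount=1.0):
    """log2 fold change in normalized sgRNA abundance (final vs initial).

    Counts are converted to fractions of total reads so that samples with
    different sequencing depths are comparable; the pseudocount (an
    assumption of this sketch) avoids taking the log of zero.
    """
    total_initial = sum(initial_counts.values())
    total_final = sum(final_counts.values())
    scores = {}
    for guide, start in initial_counts.items():
        end = final_counts.get(guide, 0)
        frac_initial = (start + pseudocount) / total_initial
        frac_final = (end + pseudocount) / total_final
        scores[guide] = math.log2(frac_final / frac_initial)
    return scores
```

A negative score indicates that the guide was depleted during the 14 population doublings; comparing scores between the cholesterol-supplemented and cholesterol-free conditions highlights genes required specifically under sterol depletion.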
Generation of knockout and cDNA overexpression cell lines
For knockout experiments of C12orf49, sgRNA
(5′-TTTCAGGCTACGTTTGCGAG-3′) was cloned into lentiCRISPR-v1-GFP
vector by T4 DNA ligase (NEB) after linearization with BsmBI. Vector was
transfected into HEK293T cells with lentiviral packaging vectors VSV-G and
Delta-VPR using XtremeGene transfection reagent (Roche). Media was changed 24 hr
after transfection. The virus containing supernatant was collected at 48h and
filtered through a 0.45 µm filter before use. Jurkat cells were spin-infected at a
MOI of 1 in 6-well tissue culture plates using 8 μg
ml−1 of polybrene at 1,200g for 1.5 h.
Virus was removed 24 hours after infection and single cell sorting was performed
into 96 well plates using GFP. Separately, HEK293T cells were transfected with
the same vector and single cell sorted similarly following selection by
puromycin for 3 days. For overexpressions, gBlocks(IDT) containing the
guide-resistant version of c12orf49 and other indicated cDNAs were cloned into
the pMXs retroviral vector by linearizing with BamHI and NotI, followed by
Gibson assembly. Epitope tags were added to the cDNAs when indicated.
Overexpression plasmids were transfected with retroviral packaging plasmids
Gag-pol and VSV-G into HEK293T cells. After transduction, cells were selected
with blasticidin.
Immunoblotting
Cell pellets were washed twice with ice-cold PBS before lysis in SDS
lysis buffer (10 mM Tris-HCl pH 6.8, 100mM NaCl, 1 mM EDTA, 1mM EGTA, 1% SDS)
supplemented with protease inhibitors. Each cell lysate was sonicated thrice for
15s on ice with a 2 min interval between each sonication. Proteins from
membranes and nuclei were isolated using the Cell Fractionation Kit (CST #9038).
Protein concentrations of the samples were determined by a Pierce BCA Protein
Assay Kit (Thermo Scientific) with bovine serum albumin as a protein standard.
Samples were mixed with 5x SDS loading buffer and boiled for 5 min. Finally,
samples were resolved on 8%, 12% or 16% SDS–PAGE gels and analyzed by
immunoblotting. Immunoblot analysis of c12orf49 knockouts was performed
following deglycosylation with PNGase F (New England Biolabs) under denaturing
conditions, according to the manufacturer’s instructions.
For SREBP targets, 24 hours before extraction, Jurkat cells were washed
three times with PBS and plated as triplicates (1 × 106 cells
per replicate) in 6-well plates using RPMI medium supplemented with 10% LPDS
supplemented with 50uM compactin and 50uM sodium mevalonate in the presence or
absence of sterols (10 μg ml−1 cholesterol, 1 μg
ml−1 25-hydroxycholesterol). For nuclear extracts, cells
were also provided 25 μg ml−1
N-acetyl-leucinal-leucinal-norleucinal for the last 3 hours. The rest of the
immunoblotting was performed as described above. Immunoprecipitated proteins
were equally split into different tubes and reactions were performed under
denaturing conditions with the indicated deglycosylation enzyme according to the
manufacturer’s manual.
Proliferation assays
Cell lines were cultured as triplicates in 96-well plates at 500 cells
(suspension) or 200 cells (adherent) per well in a final volume of 0.2 ml
RPMI-1640 medium (suspension) or DMEM media (adherent) supplemented with 10%
lipoprotein depleted serum (Kalen) with indicated treatments. A duplicate plate
was set up to determine initial luminescence on the day plates were set up,
without any treatment. To measure luminescence, 40 μl of Cell Titer Glo
reagent (Promega) was added in each well according to the manufacturer’s
instructions and data was obtained using a SpectraMax M3 plate reader (Molecular
Devices). Data are presented as relative fold change in luminescence of the
final measurement to the initials. For proliferation assays under lipoprotein
depletion luminescence was measured after 6 days of growth. In cholesterol
rescue experiments, 100 μg ml−1 LDL (corresponding to
total 50 μg ml−1 of cholesterol) or 10 μg
ml−1 free cholesterol were used as indicated. Cell culture
images were taken using a Primovert microscope (Zeiss).
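The fold-change readout can be expressed compactly. In this sketch, dividing each final reading by the mean of the day-0 readings is an assumption, as are the function names; the log2 transform matches the axis used in the proliferation panels.

```python
import math
from statistics import mean, stdev


def log2_fold_change(initial_luminescence, final_luminescence):
    """Per-replicate log2 fold change relative to the mean day-0 reading."""
    baseline = mean(initial_luminescence)
    return [math.log2(value / baseline) for value in final_luminescence]


def summarize(values):
    """Mean and sample standard deviation, as in 'mean ± SD, n=3'."""
    return mean(values), stdev(values)
```

A value of 3, for example, corresponds to three population doublings over the 6-day assay.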
Isotope tracing experiments and lipid metabolite profiling
Jurkat cells were washed three times with PBS and plated as triplicates
(1 × 106 cells per replicate) in 6-well plates using RPMI
medium supplemented with 10% LPDS in the presence or absence of sterols (10
μg ml−1 cholesterol, 1 μg ml−1
25-hydroxycholesterol). After 24 h, media was replaced with fresh medium
containing sodium acetate (10mM) or 13C1 sodium acetate
(10 mM). Following an incubation of 48 hours, cell pellets were washed twice
with 1 ml of 0.9% NaCl (800g for 2 minutes) and resuspended in 600 μl of
cold LC-MS grade methanol. Non-polar metabolites were extracted by consecutive
addition of 300 μl of LC-MS grade water followed by 400 μl of
LC-MS grade chloroform. The samples were vortexed (10 min) and centrifuged for
10 min at 20,000g and 4°C. The lipid-containing chloroform layer was
carefully removed and dried under liquid nitrogen. Dry lipid extracts were
stored at −80°C till further analysis.
The lipid extracts were saponified in 200 ul of 2M methanolic KOH (95%
methanol) for 2 hours at 60°C in a thermoblock (Eppendorf ThermoMixer).
Upon cooling to room temperature, 200ul of 5% NaCl was added to the saponified
extracts and the mixture was vortexed and acidified with 6N HCl (pH <2).
HPLC grade hexanes was added and the mixture was vortexed vigorously for 10
seconds (3X). After a centrifugation for 10 min at 20,000g and 4°C, the
hexane layer was transferred to a glass vial. The lipids were extracted with
hexanes twice more, adding 300ul hexanes each time. The combined hexane layers
were dried under liquid nitrogen and stored at −80 °C until LC-MS
analysis.
Lipids were separated on an Ascentis Express C18 2.1 mm × 150 mm
× 2.7 μm particle size column (Supelco) connected
to a Vanquish UPLC system and a Q Exactive benchtop orbitrap mass spectrometer
(Thermo Fisher Scientific), equipped with a heated electrospray ionization
(HESI) probe. Dried lipid extracts were reconstituted in 50 μl of 65:30:5
acetonitrile: isopropanol: water (v/v/v), vortexed for 10 sec, centrifuged for
10 min (20,000 g, 4°C) and 5 μl of the supernatant was injected
into the LC-MS in a randomized order, with separate injections for positive and
negative ionization modes. Mobile phase A consisted of 10mM ammonium formate in
60:40 water: acetonitrile (v/v) with 0.1% formic acid, and mobile phase B
consisted of 10mM ammonium formate in 90:10 isopropanol:acetonitrile (v/v) with
0.1% formic acid. Chromatographic separation was achieved using the previously
described gradient[51]. The
column oven and autosampler were held at 55 °C and 4 °C,
respectively.
The mass spectrometer was operated with the following parameters:
positive or negative ion polarity; spray voltage, 3500 V; heated capillary
temperature, 285 °C; source temperature, 250 °C; sheath gas, 60
(arbitrary units); auxiliary gas, 20 (arbitrary units). External mass
calibration was performed every five days using the standard calibration
mixture.
Mass spectra were acquired in positive ionization mode, using a Top3
data-dependent MS/MS method. The full MS scan was acquired as such; 70,000
resolution, 1 × 106 AGC target, 250 ms max injection time,
scan range 350 – 450 m/z. The data-dependent MS/MS scans
were acquired at a resolution of 17,500, AGC target of 1 ×
105, 75 ms max injection time, 1.0 Da isolation width, stepwise
normalized collision energy (NCE) of 20, 30, 40 units and 8 sec dynamic
exclusion.
Relative quantification of unlabeled and labeled cholesterol was
performed using Skyline Daily (MacCoss Lab)[52] with the maximum mass and retention time tolerance set
to 2 ppm and 20 sec, respectively. The measured isotopologues of cholesterol in
the unlabeled acetate experiments were used to correct for natural isotope
abundance in the [13C1] acetate-treated samples. Data are
presented as percentage of the labeled cholesterol in the total pool.
Real-time PCR assays
Jurkat cells were washed three times with PBS and plated as triplicates
(1 × 106 cells per replicate) in 6-well plates using RPMI
medium supplemented with 10% LPDS supplemented with 50uM compactin and 50uM
sodium mevalonate in the presence or absence of sterols (10 μg
ml−1 cholesterol, 1 μg ml−1
25-hydroxycholesterol). After an 8-hour incubation, RNA was isolated from cell
pellets by a RNeasy Kit (Qiagen) according to the manufacturer’s
protocol. RNA was spectrophotometrically quantified and equal amounts were used
for cDNA synthesis with the Superscript II RT Kit (Invitrogen). qPCR analysis
was performed on an ABI Real Time PCR System (Applied Biosystems) with the SYBR
green Mastermix (Applied Biosystems). Primers for each target are provided in
the supplementary
files. Results were normalized to β-actin.
Immunofluorescence
For lipoprotein depletion experiments, HEK293T cells were washed three
times with PBS, resuspended in DMEM supplemented with 10% LPDS and seeded
(2× 105) on coverslips in 6-well plates previously coated with
poly-D-lysine (Sigma). 12h later, cells were transfected with 100ug of
pMXS-mCherry-SREBP1 with the XtremeGENE 9 DNA transfection reagent, according to
the manufacturer’s manual. After 12 hours, cells were switched to fresh
media with 10% LPDS supplemented with 50uM compactin and 50uM sodium mevalonate
in the presence or absence of sterols (10 μg ml−1
cholesterol, 1 μg ml−1 25-hydroxycholesterol).
Following 16-hour incubation, cells were fixed for 15 min with 4%
paraformaldehyde diluted in PBS at room temperature. After three washes with
PBS, cells on the coverslips were permeabilized by incubation with 0.05% Triton
X-100 in PBS for 10 min at room temperature prior to another three PBS washes.
Coverslips were blocked with normal donkey serum (20X diluted in PBS) at room
temperature for 20 min and washed thrice with PBS. Coverslips were then blocked
with 5% normal donkey serum (NDS) for 1 hour at room temperature, before an
overnight incubation with the indicated primary antibodies diluted in 5% NDS at
4 °C. On the next day, following three washes with PBS, coverslips were then
incubated with secondary antibodies (Alexa Fluor 488 and Alexa Fluor 568) in the
dark for 1 hour at room temperature. Three washes with PBS were followed by an
incubation with a 300 nM solution of DAPI in PBS for 5 min in dark. Coverslips
were washed three times with PBS and finally mounted onto slides with Prolong
Gold antifade mounting media (Invitrogen). Images were taken on a confocal
microscope. For other localization experiments, HEK293T cells were cultured and
transfected in DMEM with 10% FBS.
Brefeldin A treatment
Jurkat cells were grown in RPMI supplemented with 10% serum. One day
before stimulation, 1×106 cells were plated in 6-well plates.
On the day of the experiment, cells were washed three times with PBS and
resuspended in fresh media with 10% LPDS supplemented with 50uM compactin and
50uM sodium mevalonate in the presence or absence of sterols (10 μg
ml−1 cholesterol, 1 μg ml−1
25-hydroxycholesterol) and Brefeldin 1ug/ml was added to the indicated cells. 6
hours post-induction, cell pellets were subjected to nuclear extraction as
described above.
Immunoprecipitation
Before the day of immunoprecipitation, HEK293T cells overexpressing the
indicated plasmids were plated (1× 107) in a 15-cm culture
dish. After 15 hours, cells were washed with ice cold PBS twice and lysed in
immunoprecipitation lysis buffer (50 mM Tris⋅HCl, pH 7.4, 150 mM NaCl, 1
mM EDTA, 1% Triton X-100 and cOmplete EDTA-free protease inhibitor). The mixture
was placed on an end-over-end rotator for 10 minutes at 4 °C and spun down at
1000g for 4 minutes to separate the supernatant. For anti-FLAG
immunoprecipitations, the FLAG-M2 affinity gel was washed with 1 mL TBS (150 mM
NaCl) twice and 40 uL of the affinity gel was then added to the lysate
supernatant and incubated rotating at 4 °C for 3 hours. Affinity gel was placed on
spin columns (Chromotek) and washed thrice with TBS. Proteins were eluted by
incubating with 100 ng/uL of 3X FLAG peptide in lysis buffer for 15 min at room
temperature. For the proteomics experiment, proteins were chemically crosslinked
in live cells prior to lysis by adding dithiobis(succinimidyl propionate) to a
working concentration of 2.5 mM and incubating for 7 min at room temperature.
Crosslinking reaction was quenched by adding 1/10 volume of 1M Tris pH 8.5 to
the media and incubating for 2 min at room temperature.
Proteomics
Competitively eluted (3X FLAG peptide) samples, in 1% Triton, were
diluted 2-fold followed by precipitation overnight in 6 volumes ice cold
acetone. Precipitates were dissolved and chemically reduced in 35uL 8M Urea/70mM
ammonium bicarbonate/20mM Dithiothreitol followed by alkylation (50mM
iodoacetamide). Samples were diluted and digested using Endopeptidase LysC (Wako
Chemicals) followed by additional dilution and trypsinization (Promega).
Acidified tryptic peptides were desalted[53] and analyzed using nano-LC-MS/MS (EasyLC1200 and Fusion
Lumos operated in High-High mode, ThermoFisher). Data were queried against
UniProt human database (March 2016) concatenated with common contaminants and
quantitated using MaxQuant v. 1.6.0.13 [54]. False discovery rates of 2% and 1% were applied to
peptide and protein identification. The iBAQ[55] values obtained from MaxQuant, were filtered, using
Perseus software[56], and the
following filters; 80% of replicates must contain a valid value in either the
‘experiment’ (n=6) and/or ‘control’ (n=2) groups,
protein must be matched to a minimum of 3 razor/unique peptides. Missing values
in the ‘control’ samples were imputed (Perseus) from a normal
distribution. For visualization only, a t-test was performed (Fig. 3g).
Phylogenetic analysis
Protein sequences of C12orf49 in different species (UniProtKB) were
aligned using the Clustal W and MegAlign Software (DNASTAR). Phylogenetic tree
was constructed automatically by applying BioNJ algorithm with uncorrected
pairwise distance metrics and global gap removal.
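The distance metric named above can be illustrated as follows. This sketch covers only the uncorrected pairwise distance (the fraction of differing sites) after removing every alignment column that contains a gap; it does not implement BioNJ tree construction, and the function names are illustrative.

```python
def remove_gapped_columns(alignment):
    """Drop every alignment column in which any sequence has a gap ('-')."""
    keep = [i for i in range(len(alignment[0]))
            if all(seq[i] != "-" for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def p_distance(seq_a, seq_b):
    """Uncorrected distance: fraction of aligned sites that differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def distance_matrix(alignment):
    """Pairwise uncorrected distances after global gap removal."""
    cleaned = remove_gapped_columns(alignment)
    return [[p_distance(a, b) for b in cleaned] for a in cleaned]
```

The resulting matrix is the input a distance-based method such as BioNJ would consume.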
CRISPR/Cas9 genome editing in zebrafish
CRISPR/Cas9 target sites within zebrafish c12orf49 gene
(GRCz11 assembly, gene name: zgc:110063) were identified using
CHOPCHOP[57] web tool.
Two independent genomic sites within c12orf49 locus were
targeted by alternative guide RNAs (gRNAs), namely g1 and g2 with the following
sequences; g1: 5’-GGTCTGAGTCCCTCGCCTCCAGG-3’ and g2:
5’-GGATGAACTTAACCTTCCACTGG-3’. Genomic locations targeted by gRNA
g1 and g2 are as follows: chr5:11947798 and chr5:11947828, respectively. A
cloning-free method to generate gRNA template was performed as previously
described [58]. Guide RNAs were
synthesized with MEGAshortscript T7 transcription kit (ThermoFisher Scientific).
To generate mutations with CRISPR/Cas9 system, a mixture of 500 pg purified Cas9
protein (PNA Bio Inc, # CP01) and 300 pg of either gRNA was injected into
one-cell stage embryos of wild-type (AB) crosses. Efficient generation of
mutations was confirmed by DNA heteroduplex formation assay[59] using following primers: forward
5’-ATGTACAGGAGGAGCGAACG-3’ and reverse
5’-TGAGAAGGCTCTTTCCCTGA-3’. RNA was isolated from zebrafish
embryos using TRIzol method following manufacturer’s instructions; cDNA
was synthesized using oligo dT primers. Following exonic primer (reverse) was
used in combination with the forward primer listed above to amplify c12orf49-g2
targeted site: Exonic Reverse:5’-CTCGAGCTGGGAGCATTAAC-3’
Sequence-confirmed mutant embryos were grown to adulthood to generate
two independent germline mutant lines, c12orf49
and c12orf49
, thus establishing F0 founders. These allelic F0
lines were then crossed to each other to produce trans-heterozygous mutant F1
embryos that carry a c12orf49
mutation in their maternal copy and a
c12orf49
mutation in their paternal copy. The advantage of
this cross is the ability to eliminate off-target effects that potentially might
have been induced in either animal, and drive to homozygosity only the targeted
site.
Dietary Lipid Clearance Assay
Injected embryos were grown to 5 dpf stage and fed with 10% organic
chicken egg yolk for 4 hours, followed by 16 hours of fasting. Next, zebrafish
larvae were fixed in 4% paraformaldehyde and processed for oil red O staining to
assay dietary lipid clearance in the digestive system, as described
previously[60]. Stained
larvae were imaged with Zeiss Axioimager Z1 scope equipped with Axiocam HRc
camera.
PrediXcan Discovery Analyses
We investigated the association of c12orf49 with
hyperlipidemia. We performed PrediXcan[43] analysis, leveraging a SNP-based prediction model in
colon (transverse). We estimated the genetically regulated gene expression
(GReX) in the approximately twenty five thousand BioVU subjects [41,61,62] using the
GTEx resource (v6p)[63,64] as a reference transcriptome
panel, and tested for association with hyperlipidemia[41]. From the weights
$\hat{w}_j$ derived from the gene expression imputation
model for c12orf49 (driven by the single-nucleotide
polymorphism rs10507274 with effect allele “C” with false
discovery rate[65] (q-value) of
0.03) and the number of effect alleles $X_{ij}$ for individual i at the
variant j, we estimated GReX as follows: $\mathrm{GReX}_i = \sum_j \hat{w}_j X_{ij}$ in the BioVU subjects. We performed logistic regression to
determine the association between GReX and the disease trait. To maximize the
quality of the phenome information, we required at least two ICD9 or ICD10 codes
on different clinical visits to instantiate a phecode for diagnosis of the
phenotype.
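The two rules in this section, the weighted sum that defines GReX and the two-visit requirement for assigning a phecode, can be sketched as below. The function names, the data layout, the treatment of missing genotypes as zero effect alleles, and any weights or codes passed in are assumptions for illustration; no values from the prediction model are reproduced here.

```python
def grex(weights, dosages):
    """Genetically regulated expression: sum over variants of weight * dosage.

    weights: dict variant_id -> prediction-model weight.
    dosages: dict variant_id -> number of effect alleles (0, 1 or 2).
    Variants missing from `dosages` contribute zero (an assumption).
    """
    return sum(w * dosages.get(variant, 0) for variant, w in weights.items())


def has_phecode(coded_visits, phecode_icd_codes, min_visits=2):
    """True if mapped ICD codes appear on at least `min_visits` distinct visits.

    coded_visits: iterable of (visit_date, icd_code) pairs for one subject.
    phecode_icd_codes: set of ICD9/ICD10 codes mapped to the phecode.
    """
    dates = {date for date, code in coded_visits if code in phecode_icd_codes}
    return len(dates) >= min_visits
```

The per-subject GReX value then serves as the predictor, and the phecode status as the outcome, in the logistic regression.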
Analytical Validation of Method and Comparison with Alternatives
Pearson correlation is the most commonly used method for building
co-essentiality networks among genes. Pan et al. (2019) have used genome-scale
Pearson correlation networks to identify functional modules and protein
complexes[2]. However,
gene networks based on statistically significant Pearson correlation tend to
have many edges, including many false positives, which makes it difficult to
identify suitable targets for novel gene interaction discovery and wet-lab
validation. Thus there is a need for computational methods with higher
specificity (fewer false positives) that identify fewer but high-confidence
putative genetic interactions from data. In a recent work, Wainberg et al.
(2019) proposed an alternative co-essentiality network method based on
generalized least squares (GLS), which explicitly accounts for non-independence
of cell lines and reduces the number of false positives and has identified
93,575 significant co-essential gene pairs[3]. Although these comprehensive methods undoubtedly
identified many novel gene functions, we wanted to create a conservative method
that more easily allowed us to manually curate each individual network. As
a result, we looked towards alternative methods and filters that allowed us to
shortlist putatively novel gene interactions.
In essence, both methods described above measure pairwise
association between two genes, without accounting for indirect or
spurious effects due to their interactions with a third gene. Partial
correlation, a canonical method in classical statistics, allows explicitly
accounting for such indirect associations and produces a smaller but
high-confidence set of putative interactions for follow-up wet-lab validation.
While clustering based on pairwise correlation allows us to zoom in on a
specific module of genes, calculating partial correlation among genes within the
module helps us focus on gene pairs which are more likely to interact
directly. As a result, we were better equipped with a
manageable list of gene interactions that can be studied at an experimental
scale. This is in sharp contrast with the Pearson correlation-based methods
described above, which only analyse the association between two genes at a
time.
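For three genes the adjustment can be written in closed form. The standard first-order partial correlation of genes x and y given a third gene z is shown below, followed by a worked example with illustrative (hypothetical) correlation values in which x and y are each correlated with z at 0.8 and with each other at 0.64.

```latex
\[
\rho_{xy \cdot z} \;=\; \frac{\rho_{xy} - \rho_{xz}\,\rho_{yz}}
                            {\sqrt{\left(1-\rho_{xz}^{2}\right)\left(1-\rho_{yz}^{2}\right)}}
\]
% Illustrative values: \rho_{xz} = \rho_{yz} = 0.8 and \rho_{xy} = 0.64
\[
\rho_{xy \cdot z} \;=\; \frac{0.64 - 0.8 \times 0.8}{\sqrt{(1-0.64)(1-0.64)}}
                  \;=\; \frac{0}{0.36} \;=\; 0
\]
```

In this example the pairwise correlation of 0.64 is fully explained by the shared dependence on z, so the partial correlation vanishes and no edge would be drawn between x and y.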
The principle of filtering out effects of other nodes in a network is at
the core of graphical modeling literature in statistics and machine learning.
Prior works have successfully employed this idea to build metabolic
networks[12,13]. Here we illustrate the benefit of such
a strategy using a simulation experiment based on biologically inspired network
structure.
We select a subnetwork of 30 nodes from an E. coli
network using the GeneNetWeaver software[66], a popular tool for benchmarking network inference
methods. This network has a few hubs, with a main hub node at gene
fis. We then simulated (log) co-essentiality score of every
gene g (denoted by X) based on the
following rule:
Here, pa(g) denotes the set of genes in the network
which have an outgoing edge to gene g. In other words,
essentiality score of gene g is influenced by the essentiality
score of its parent genes pa(g), although the main hub gene
fis exerts a stronger effect than other parent genes. The
term e in the above equation denotes standard Gaussian noise in the structural
equation system.
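A reduced version of this kind of simulation is sketched below. It is not the 30-node GeneNetWeaver network, and the coefficient `hub_effect` is a hypothetical value chosen for illustration: one hub gene drives two child genes that have no edge between them, with standard Gaussian noise added to each.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_given(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n_samples=500, hub_effect=2.0, seed=0):
    """Simulate scores for a hub gene and two children that never interact.

    Each child follows X_child = hub_effect * X_hub + e with standard
    Gaussian noise e; hub_effect is a hypothetical coefficient.
    """
    rng = random.Random(seed)
    hub = [rng.gauss(0, 1) for _ in range(n_samples)]
    child_a = [hub_effect * h + rng.gauss(0, 1) for h in hub]
    child_b = [hub_effect * h + rng.gauss(0, 1) for h in hub]
    return hub, child_a, child_b


hub, child_a, child_b = simulate()
r_ab = pearson(child_a, child_b)
r_ab_given_hub = partial_given(r_ab, pearson(child_a, hub), pearson(child_b, hub))
print(f"Pearson r(a, b) = {r_ab:.2f}; partial r(a, b | hub) = {r_ab_given_hub:.2f}")
```

Under this model the two children are expected to show a high Pearson correlation even though they share no edge, whereas their partial correlation given the hub should be close to zero.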
We simulated essentiality scores according to the above model for
n=500 independent samples (cell lines), and used Pearson
and partial correlation (using both ‘pcor’ and debiased graphical
lasso) to reconstruct the gene networks from data (statistically significant
partial correlations (FDR < 0.05) were used to construct edges in
networks). Results of this experiment are displayed in Extended Data Fig. 1a. As expected, we see that gene
pairs which are connected only through fis (e.g. xylR,
xylH, pdxA, lysV) have high Pearson correlation, leading to false
positive edges. However, such edges are rarely picked up in both partial
correlation networks.
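The effect can be illustrated with a three-variable sketch in which two genes share a hub parent but do not interact directly. It uses the standard first-order partial correlation formula, conditioning on the hub alone, which is a simplification: pcor and the debiased graphical lasso condition on all other genes at once. The effect size of 2.0 and the variable names are assumptions made for this example.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(1)
n = 500  # number of simulated cell lines
hub = [rng.gauss(0, 1) for _ in range(n)]
# Both genes depend on the hub (hypothetical effect size 2.0), not on each other.
gene_a = [2.0 * h + rng.gauss(0, 1) for h in hub]
gene_b = [2.0 * h + rng.gauss(0, 1) for h in hub]

r = pearson(gene_a, gene_b)             # large: shared hub effect
pr = partial_corr(gene_a, gene_b, hub)  # near zero once the hub is conditioned on
print(f"Pearson: {r:.2f}, partial given hub: {pr:.2f}")
```

Under this model the population Pearson correlation between the two genes is 0.8, while their partial correlation given the hub is zero.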
We note that building a Pearson correlation network with a high cutoff
(very small p-value) is not an alternative to partial correlation. In the
example above, even genes having only an indirect association through
fis may have higher Pearson correlation than two genes that
interact directly (e.g. marA and putA) due to
the strong effect of fis. So a network of large absolute
correlation is likely to keep more indirect associations and miss some of the
directly interacting gene pairs. This can be seen in the ROC curve of Extended Data Fig. 1b, where we calculate
false positives and false negatives based on a range of cut-offs on Pearson and partial
correlation.
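As a sketch of how such a curve can be traced, the function below thresholds the absolute value of each gene-pair score at every observed cut-off and records the resulting false and true positive rates. The function name and the four-pair example are hypothetical, and it assumes that the ground truth contains at least one true edge and one non-edge.

```python
def roc_points(scores, is_true_edge):
    """Return (FPR, TPR) pairs from thresholding |score| at each observed value."""
    pos = sum(1 for t in is_true_edge if t)
    neg = len(is_true_edge) - pos
    points = [(0.0, 0.0)]
    for cut in sorted({abs(s) for s in scores}, reverse=True):
        # A pair is called an edge when its absolute score reaches the cut-off.
        tp = sum(1 for s, t in zip(scores, is_true_edge) if abs(s) >= cut and t)
        fp = sum(1 for s, t in zip(scores, is_true_edge) if abs(s) >= cut and not t)
        points.append((fp / neg, tp / pos))
    return points

# Four gene pairs: correlation-type scores and whether each pair is a true edge.
example = roc_points([0.9, 0.8, 0.3, 0.1], [True, False, True, False])
```

Applying the same function to the Pearson and partial correlation scores of one simulated data set gives two curves that can be compared directly.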
We conducted a more systematic simulation study by repeating the above
experiments on N=20 replicates, varying the number of genes (p = 30, 40, 50) and the
number of cell lines (n = 100, 200, 300, 400, 500). The numbers of false positives
and true positives for Pearson correlation and the two types of partial
correlation methods (pcor and DGLASSO) are reported in Supplementary Table 2. Standard
errors calculated over the N=20 replicates are shown in parentheses. These
results show that partial correlation networks substantially reduce the number
of false positives (and hence increase specificity) relative to Pearson correlation, while
reducing the true positives to some extent. Our simulation results also show
that partial correlation tends to have lower power (sensitivity) as the network
size (p) increases. This is expected since calculation of the partial correlation
matrix requires estimation of O(p^2) parameters. Therefore, we do not
advocate using partial correlation at genome-scale, and only use it to filter
the set of interactions in small components (modules) obtained by Pearson
correlation or other pairwise association methods. Developing a one-step method
that combines the strengths of both Pearson and partial correlation, is
applicable at genome scale and possibly accounts for dependence among cell lines
as in Wainberg et al. (2019)[3] is
an interesting research question, but it is beyond the scope of this paper and is left
for future work.
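To make the O(p^2) scaling concrete, the short sketch below counts the distinct gene pairs, and hence pairwise partial correlations, for the network sizes used in the simulation. The function name is hypothetical.

```python
def n_pairwise_parameters(p):
    """Number of distinct off-diagonal entries of a symmetric p x p matrix."""
    return p * (p - 1) // 2

for p in (30, 40, 50):
    print(p, n_pairwise_parameters(p))  # 435, 780 and 1225 pairs
```

At p = 50 there are already 1,225 pairs to estimate, more than twice the largest sample size considered (n = 500).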
For mixed-population knockout experiments in U-87 MG and MDA-MB-435, an sgRNA
targeting C12orf49 (5′-TTTCAGGCTACGTTTGCGAG-3′) was
cloned into the lentiCRISPR-V2-puro vector. The vector was transfected into HEK293T
cells with lentiviral packaging vectors VSV-G and Delta-VPR using XtremeGene
transfection reagent (Roche). Indicated cells were spin-infected in 6-well
tissue culture plates using 8 μg ml−1 of polybrene at
1,200g for 1.5 h and selected with puromycin at the
corresponding minimum lethal dose for 3 days. For knockout experiments of
TMEM41A, sgRNAs (5′-CATGCTGCTACCTGCTCTCC-3′,
5′-TCGCCTTGTACTTGCTGTCG-3′) were cloned into lentiCRISPR-v1-GFP
vector. Following transduction, cells were single-cell sorted using GFP.
For overexpression, a guide-resistant version of TMEM41A and the other plasmids used were
cloned into the pMXs retroviral expression vector, and expression was carried out by viral
transduction and selection as described.
Viral infectivity assays
The green fluorescent protein (GFP)-tagged Bunyamwera virus (BUNV-GFP)
[67] (generously
provided by Richard M Elliott) was amplified in BHK-21 cells and titrated by
median tissue culture infectious dose (TCID50). For virus replication assays,
HEK293T cells (WT and C12orf49 KO) were seeded into poly-L-lysine coated 24-well
plates at 2.5 × 10^4 cells/well using lipid-depleted DMEM
supplemented with 10% fetal bovine serum (FBS). The following day, cells were
washed with Opti-MEM (Gibco) and infected with BUNV-GFP diluted in 200 μL
Opti-MEM at a multiplicity of infection (MOI) of 0.1 infectious units (IU)/mL.
Cells were inoculated for 2 h at 37°C before virus inoculum was removed
and washed off using Opti-MEM. For the remainder of the virus infection assay,
cells were cultured in lipid-depleted DMEM. Supernatants with progeny BUNV-GFP
were harvested at various timepoints (0, 24, 48, 72 hpi) and the infectious
titers were determined by TCID50 assays on BHK-21 cells. At the final timepoint
(72 hpi), cells were harvested into 250 μl Accumax cell dissociation
medium (eBioscience) and transferred to a 96-well block containing 250 μl
4% paraformaldehyde (PFA) fixation solution. Cells were pelleted at a relative
centrifugal force (RCF) of 930 for 5 min at 4°C, resuspended in cold
phosphate-buffered saline (PBS) containing 3% FBS and stored at 4°C until
flow cytometry analysis. Samples were analyzed using the LSRII flow cytometer
(BD Biosciences) equipped with a 488 nm laser for detection of GFP, and the
resulting data were analyzed using FlowJo software (Treestar).
Lipid metabolite profiling for TMEM41A null cells
The procedures for lipid extraction and analysis of the cellular
lipidomes were adopted from previously described protocols[68]. Briefly, Jurkat cells were washed three
times with PBS and plated as triplicates (1 × 10^6 cells per
replicate) in 6-well plates using RPMI medium supplemented with 10% FBS. After 24
hours, cell pellets were resuspended in 1 mL cold PBS. A 30 μL aliquot of
the cell suspension was taken for determining protein concentration. The
remaining 970 μL of cell suspension was then transferred to a homogenizer
to which 2 mL of chloroform and 1 mL of methanol were added. The solution was
kept on ice and homogenized 30 times. The homogenized solution was centrifuged
(500 rcf, 4 °C, 10 minutes) to separate aqueous and organic layers. The
organic layer was carefully transferred into a 1-dram glass vial, of which 1.5
mL was transferred into a new vial to ensure equal volume was removed from each
extract. The chloroform extract was dried under vacuum. Samples were then
resuspended in a calculated amount of chloroform based on total protein
concentration.
Lipidomics data were acquired using an Agilent 1260 HPLC paired with an
Agilent 6530 Accurate-Mass Quadrupole Time-of-Flight mass spectrometer. A Gemini
C18 reversed-phase column (5 μm, 4.6 × 50 mm, Phenomenex) with a C18
reversed-phase guard cartridge was used in negative mode. Mobile phase A was
95:5 water:methanol (v/v) and mobile phase B was 60:35:5
isopropanol:methanol:water (v/v). Mobile phases were supplemented with 0.1%
(w/v) ammonium hydroxide for negative mode. The gradient used for separation
began after 5 minutes, increasing from 0% B to 100% B over 60 minutes. At 65
minutes an isocratic gradient at 100% B was applied for 7 minutes, followed by
equilibration of the column with 0% B for 8 minutes. The flow rate for the
initial 5 minutes was 0.1 mL/min and was increased to 0.5 mL/min for the
remaining gradient. A DualJSI fitted electrospray ionization source was used.
Capillary voltage was set to 3500 V and fragmentor voltage set to 175 V. The
drying gas temperature was set to 350 °C with a flow rate of 12 L/min.
Targeted data analysis was performed using MassHunter Qualitative Analysis
software (version B.06.00, Agilent). The corresponding m/z for
each lipid was extracted and the peak area was manually integrated.
Lipotoxicity assays
Palmitic acid was conjugated to BSA. A 12 mM solution of the fatty acid
was prepared in 20 mL of 0.01 M NaOH and stirred for 30 min at 70 °C,
followed by addition into a stirring 60 mL 10% BSA solution in PBS to make a
final concentration of 3 mM. The solution was stirred for 1 h at 37 °C to allow fatty
acids to conjugate with BSA. Finally, the fatty acid-BSA solution was filtered
through a 0.22 μm filter and stored in a glass container at 4 °C. Indicated
Jurkat cells were cultured as triplicates in 96-well plates at 400 cells per
well in a final volume of 0.2 ml RPMI-1640 with increasing concentrations of
palmitate. A duplicate plate was set up to determine initial luminescence on the
day plates were set up, without any treatment. To measure luminescence, 40
μl of Cell Titer Glo reagent (Promega) was added to each well according
to the manufacturer’s instructions and data were obtained using a
SpectraMax M3 plate reader (Molecular Devices). Data are presented as relative
fold change in luminescence of the final measurement to the initial measurement.
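The two calculations in this section can be sketched as follows: the dilution that takes the 12 mM palmitate stock to a 3 mM conjugate, and the fold change of final over initial luminescence. The function names and the luminescence readings are hypothetical, and the log2 step is an assumption based on the figure legends rather than on this section.

```python
import math

def diluted_concentration_mM(stock_mM, stock_mL, added_mL):
    """Concentration after mixing a stock into an added volume (C1*V1 = C2*V2)."""
    return stock_mM * stock_mL / (stock_mL + added_mL)

def relative_fold_change(final_lum, initial_lum):
    """Luminescence at the final measurement relative to the initial plate."""
    return final_lum / initial_lum

# 20 mL of 12 mM palmitate added to 60 mL of BSA solution.
palmitate_mM = diluted_concentration_mM(12.0, 20.0, 60.0)  # 3.0 mM

# Hypothetical readings: initial plate and final measurement.
fold = relative_fold_change(8000.0, 1000.0)
log2_fold = math.log2(fold)
```

The dilution check reproduces the stated final concentration: 12 mM × 20 mL / 80 mL = 3 mM.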
Luciferase Reporter assays
Three tandem repeats of the Sterol Regulatory Element (SRE-1) in the
promoter of LDLR were cloned into pGL4.20 luciferase vector. Parental, knockout
and addback HEK293T cells were washed three times with PBS, resuspended in DMEM
supplemented with 10% LPDS and seeded (2.5 × 10^4) in 96-well
plates previously coated with poly-D-lysine (Sigma). 12 h later, cells were
transfected with increasing amounts of pGL-3xSRE and pRL-SV40 (1:20 ratio of
Renilla:total plasmid) with the XtremeGENE 9 DNA transfection reagent,
according to the manufacturer’s manual. After 12 hours, cells were
switched to fresh media with 10% LPDS supplemented with 50 μM compactin and 50 μM
sodium mevalonate in the presence or absence of sterols (10 μg
ml−1 cholesterol, 1 μg ml−1
25-hydroxycholesterol). At 24 h, cells were lysed and luminescence was read
using the Dual-Glo Luciferase Assay System (Promega) and SpectraMax M3 plate
reader (Molecular Devices). Data are presented as Firefly/Renilla
luminescence.
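A minimal sketch of the Firefly/Renilla normalization is shown below, with mean and standard deviation computed across replicate wells. The function names and the triplicate readings are hypothetical values for illustration only.

```python
from statistics import mean, stdev

def normalized_luciferase(firefly, renilla):
    """Per-well Firefly/Renilla ratio, correcting for transfection efficiency."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and renilla must have one reading per well")
    return [f / r for f, r in zip(firefly, renilla)]

def summarize(values):
    """Mean and sample standard deviation across replicate wells."""
    return mean(values), stdev(values)

# Hypothetical triplicate readings for one condition.
ratios = normalized_luciferase([200.0, 300.0, 400.0], [100.0, 100.0, 100.0])
avg, sd = summarize(ratios)
```

Dividing by the Renilla signal within each well means that differences in transfection efficiency between wells do not appear as differences in reporter activity.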
Cleavage assays of other site-1 protease targets
Knockout and addback HEK293T cells were plated in DMEM supplemented with
10% FBS (2 × 10^5) in 6-well plates. 12 h later, cells were
transfected with 100 ng of plasmids encoding triple tandem HA-tagged GNPTAB, CREB3L2 or
CREB4 with the XtremeGENE 9 DNA transfection reagent, according to the
manufacturer’s manual. 24 hours post-transfection, total proteins were
extracted and immunoblotted as described above.
SCAP localization
HEK293T cells were plated in DMEM supplemented with 10% FBS (2 ×
10^5) on coverslips in 6-well plates previously coated with
poly-D-lysine (Sigma). 12 h later, cells were transfected with 100 μg of GFP-SCAP
with the XtremeGENE 9 DNA transfection reagent, according to the
manufacturer’s manual. After 12 hours, cells were switched to fresh media
with 10% LPDS supplemented with 50 μM compactin and 50 μM sodium mevalonate in the
presence or absence of sterols (10 μg ml−1 cholesterol,
1 μg ml−1 25-hydroxycholesterol). Following 16-hour
incubation, cells were fixed and processed for imaging as described above.
Anti-GFP antibody (ProteinTech) was used for detection of SCAP.
Gene expression, conservation and architecture analysis
Gene expression across different tissues was obtained from GTEx. For the
uncharacterized domain of unknown function (DUF2054), the Hidden Markov Model (HMM)
logo, different domain architectures and occurrence across different species
were obtained from Pfam (EMBL-EBI).
For C12orf49, predicted motifs and post-translational modifications were
obtained from UniProtKB. Prediction of transmembrane helices of human C12orf49
was performed using the TMHMM Server v.2.0. The phylogenetic tree of C12orf49
across species is described at TreeFam (EMBL-EBI).
Statistical analysis
Sample size, mean, and significance (p-values) are indicated in the text
and figure legends. Error bars in the experiments represent standard deviation
(SD) from either independent experiments or independent samples. Statistical
analyses were performed using GraphPad Prism 7 or reported by the relevant
computational tools.
Data availability
The data supporting the findings of this study are available from the
corresponding author upon reasonable request. Source data for all figures are
included with the online version of the paper.
Code availability
The code for the computational analysis used in this study is
available from the corresponding author upon reasonable request.
Comparative Simulation between partial and Pearson correlation
A. Simulation experiment of a subnetwork from an E. coli network
demonstrating the advantage of using partial correlation over Pearson
correlation.
B. Receiver operating characteristic (ROC) curve based on the
simulated data. (n= 500 independent samples)
Metabolic coessentiality modules
35 metabolic coessentiality modules. Blue line indicates a
previously known interaction between the genes. Poorly characterized genes
are highlighted in orange.
C12orf49 is necessary for cell growth under sterol depletion
A. Pearson correlation values of the essentiality
scores of the indicated genes across different cancer cell lines
(n=558).
B. Differential sgRNA score for C12orf49 gene of Jurkat
cell line in the presence or absence of sterols.
C. Fold change in cell number (log2) of U-87 MG or
MDA-MB-435 C12orf49_KO cell line following a 6-day growth under lipoprotein
depletion in the absence or presence of sterols. (mean ± SD, n=3
biologically independent samples). Statistical significance was determined
by two-tailed unpaired t-test.
D. Immunoblots of C12orf49 in the indicated knockout
cells of HEK293T. Actin was used as the loading control. The experiment was
repeated independently twice with similar results.
E. (left) Immunoblots of C12orf49 knockout and addback
cells in Jurkat cells. Actin was used as the loading control. The experiment
was repeated independently twice with similar results. (right) Fold change
in cell number (log2) of indicated knockout and rescued addback Jurkat cells
following a 6-day growth under lipoprotein depletion in the absence or
presence of sterols. (mean ± SD, n=3 biologically independent
samples). Statistical significance was determined by two-tailed unpaired
t-test.
F. Fold change in cell number (log2) of indicated
knockout and rescued addback HEK293T cells following a 6-day growth under
lipoprotein depletion in the absence or presence of sterols. (mean ±
SD, n=3 biologically independent samples). Statistical significance was
determined by two-tailed unpaired t-test.
TMEM41A is involved in lipid metabolism
A. Pearson correlation values of the essentiality
scores of the indicated genes across different cancer cell lines
(n=558).
B. Localization of TMEM41A to ER. Wild type HEK293T
cells expressing FLAG-TMEM41A cDNA were processed for immunofluorescence
analysis using antibodies against FLAG and PDI (ER). White color indicates
overlap (Scale bar, 8 μm). The experiment was repeated independently
twice with similar results.
C. Heatmap showing the relative abundance of indicated
lipid species in TMEM41A-null Jurkat cells and those expressing sgRNA
resistant TMEM41A cDNA.
D. Immunoblot of TMEM41A in Jurkat wild type cell line,
TMEM41A nulls and those expressing TMEM41A cDNA. Actin was used as the
loading control. The experiment was repeated independently twice with
similar results.
E. Fold change in cell number (log2) of Jurkat wild
type cell line, TMEM41A-null cells and those expressing TMEM41A cDNA after a
7-day growth upon treatment with indicated palmitate concentrations
(0–80 μM). (mean ± SD, n=3 biologically independent samples).
Statistical significance was determined by two-tailed unpaired t-test.
Role of C12orf49 in sterol synthesis and SREBP-mediated
transcription
A. (top left) Percentage of Bunyamwera virus-positive
cells at 72 hours post-infection (MOI = 0.1 IU/mL) in indicated knockout and
addback HEK293T cells (mean ± SD, n=3 biologically independent
samples). Statistical significance was determined by two-tailed unpaired
t-test. (top right) Viral titer measured by TCID50 assays on BHK-21 cells
with the harvested supernatant from the Bunyamwera virus-infected HEK293T
cells of C12orf49 knockouts and addbacks (mean ± SD, n=3
biologically independent samples). Statistical significance was determined by
two-tailed unpaired t-test. (bottom) Growth of the viral titers at different
time points in the knockout and addback cells.
B. Fold change in mRNA levels (log2) of SREBP target
genes in indicated Jurkat cell lines following 8h growth under lipoprotein
depletion in the presence and absence of sterols (mean ± SD,
n=3).
C. Relative luminescence activity (Luciferase/Renilla)
in the indicated HEK293 cell lines following transfection with firefly
luciferase under SRE promoter and Renilla luciferase for normalization of
transfection following 24h growth under lipoprotein depletion in the
presence and absence of sterols (mean ± SD, n=3 biologically
independent samples). Statistical significance was determined by two-tailed
unpaired t-test.
C12orf49 gene expression in various tissues
A. Gene expression analysis across different tissues
for C12orf49. Box plots are shown as median and 25th and 75th percentiles;
points are displayed as outliers if they are above or below 1.5 times the
interquartile range (Source: GTEx Portal).
B. DUF2054 profile hidden Markov Model (HMM) logo from
Pfam shows 14 conserved cysteines, 3 of which are CC-dimers.
C. Different architectures of DUF2054 in different
species. (Source: Pfam)
D. Occurrence of DUF2054 domain across different
species.
E. Predicted N-glycosylation site (UniProtKB) and
transmembrane domains (predicted with TMHMM v.2.0) for C12orf49.
F. Scheme for different functional domains of
C12orf49.
The impact of C12orf49 loss on the cleavage of MBTPS1 targets
A. Immunoblot analysis of OS9 in the C12orf49
immunoprecipitates of the HEK293T cell line expressing the indicated cDNAs.
The experiment was repeated independently twice with similar results.
B. Immunoblot analysis of cleavage of other site-1
protease targets, GNPTAB, CREB3L2 and CREB4 at 24-hours following
transfection in the C12orf49-knockout and addback HEK293T cells. Actin was
used as loading control. The experiment was repeated independently twice
with similar results.
C. Localization of SCAP-GFP in C12orf49-null HEK293T
cells expressing control or C12orf49 cDNA under lipoprotein depletion in the
presence or absence of sterols (Scale bar, 8 μm). The experiment was
repeated independently twice with similar results.
Conservation of C12orf49 function in metazoa and zebrafish
A. Phylogenetic tree of the C12orf49 genes across
species (Source: TreeFam).
B. DNA gel showing the cutting efficiencies of c12orf49
sgRNAs used in the zebrafish experiments. Upper bands (smears) represent DNA
heteroduplexes caused by CRISPR-Cas9 mutations; lower band is unedited DNA.
This assay was repeated twice with similar results.
C. Strategy to evaluate the effect of
CRISPR-Cas9-generated c12orf49 mutations at transcript level. c12orf49-g2
founder F0 fish were crossed and F1 progeny was individually analyzed.
Briefly, RNA was isolated from individual larvae, then cDNA was synthesized.
Using exon-specific primers g2 target site was PCR amplified and sequenced.
Various mutations detected from transcripts are shown.
GReX analysis identifies C12orf49 association with mixed
hyperlipidemia
Disease traits associated with reduced C12orf49 GReX in BioVU
biobank. Phecodes are indicated in parentheses. Traits are categorized into
systems (y-axis), and significance is displayed on x-axis. Significance is
tested by logistic regression analysis (two-sided), n = 25,000. Multiple
testing adjustment is done using Bonferroni correction.