Mckenzie Tu1, Sarah Saputo1. 1. Department of Chemistry and Biochemistry, SUNY Brockport, Brockport, NY, USA.
Abstract
The serine incorporator (SERINC) family is a family of multipass transmembrane proteins associated with biosynthesis of serine-containing phospholipids and sphingolipids. Humans have 5 paralogs, SERINC1-5, which have been linked to disease, including variable expression in tumor lines and activity as restriction factors against HIV-1. Despite recent studies, the cellular function of SERINC proteins has yet to be fully elucidated. The goal of this study was to investigate the role of SERINC3 by expanding upon its interactome. We used a variety of bioinformatic tools to identify cellular factors that interact with SERINC3 and assessed how sequence variation might alter these interactions. Analysis of the promoter region indicates that SERINC3 is putatively regulated by transcription factors involved in tissue-specific development. Analysis of the unique 3'-untranslated region of one variant of HsSERINC3 revealed that this region serves as a conserved site of regulation by both RNA binding proteins and miRNA. In addition, SERINC3 is putatively regulated at the protein level by several posttranslational modifications. Our results show that extra-membrane portions of SERINC3 are subject to variation in the coding sequence and contain areas of relatively low conservation. Overall, our data suggest that regions of low homology, as well as variations in the nucleotide and protein sequences of HsSERINC3, may lead to aberrant function and alternative regulatory mechanisms in homologs. The functional consequences of these sequence and structural variations need to be explored systematically to fully appreciate the role of SERINC3 in both health and disease.
Serine incorporator (SERINC) proteins constitute a unique protein family that show
minimal amino acid homology to other proteins but are highly conserved among
eukaryotes.[1,2]
Yeast possess a similar membrane protein, dubbed TMS1, which localizes to the
vacuolar membrane and exhibits modest homology to mammalian homologs.[3,4] Humans encode five paralogs
that contain between 8 and 11 transmembrane domains that are characteristic of
SERINC-family proteins. These SERINCs were originally named for their proposed
ability to incorporate serine into membranes as phosphatidylserine or sphingolipids.
Localization studies have revealed that SERINC3 and SERINC5 are present in
the perinuclear region, Golgi apparatus (SERINC3) as well as the plasma membrane.

In multiple model systems, SERINC-family proteins have been linked to membrane
trafficking. Both SERINC1 and SERINC3 in Homo sapiens were found to
be cargo proteins that act in trafficking to exchange intermediates between cellular compartments.
Cells deficient in the adaptor complex 4 (AP-4) possessed aberrant
localization of SERINC1 and SERINC3. Both SERINC proteins colocalize with the
autophagy-related protein 9A (ATG9A) and interact with AP-4 complex factors. The
five adaptor protein complexes observed in humans act in distinct pathways to
regulate transport of vesicles to distinct cellular localizations. The
clathrin-independent complexes have specifically been associated with the
trans-golgi network, facilitating transport from the Golgi apparatus to the early
endosome and plasma membrane.
Disruption of AP-4-associated transport has been linked to several forms of
spastic paraplegia, a disease associated with weakness and abnormal gait.

The functions of SERINC-family proteins have been expanded to include activity as
restriction factors against gamma-retroviruses in Mus musculus and
lentiviruses in Homo sapiens.[5,9] Members of this protein family
have been demonstrated to function by impairing the penetration of the viral
particle into the cytoplasm through a mechanism that is counteracted by Nef, an HIV-1 accessory
protein. Viral accessory proteins, like Nef, play a significant part in viral
replication and infection.
Protein systems responsible for host cell trafficking are hijacked and can
function in immune cell circumnavigation.
Of the SERINC proteins encoded in the human genome, SERINC3 and SERINC5
possess the greatest activity to inhibit Nef-defective virus infectivity upon
ectopic expression in “low Nef-responsive” cells.
In the absence of Nef, SERINC3 is successfully incorporated into viral
particles preventing delivery of the viral core by inhibiting the expansion of the
fusion pore.

The structures of HsSERINC5 and the ortholog from Drosophila
melanogaster were elucidated, confirming the presence of a multipass
helical structure as well as a well-defined lipid binding groove.
Our analysis of the SERINC3 structure confirms the presence of the 11
transmembrane helices that are conserved in most SERINC family proteins. Other
studies have taken a step further to assign cellular roles to specific structures
and posttranslational modifications. For example, mutational analysis revealed the
key amino acids that are associated with the ability of SERINC5 to localize to the
plasma membrane and possess the ability to restrict HIV-1 infection, an activity
consistent with its association with AP-4.
Another study linked the cellular functions of SERINC5 in HIV-1 restriction
to posttranslational modification and proteasomal degradation.
Based on our findings, it is likely that SERINC3 undergoes a similar
mechanism of regulation as SERINC5. However, these findings need to be validated
using in vitro and in vivo model systems.

Although the SERINC-family proteins were first described in relation to their
differential expression in tumor cell lines,[14,15] disruption of homologs has been
observed in other diseases. For example, variants of other SERINC-family proteins
have been identified with links to alcohol dependence.
Allelic variations may also be associated with differential ability to
interact with or restrict HIV-1 infection.
Despite these recent studies, the cellular roles and functional network
associated with SERINC proteins have yet to be fully characterized. Therefore, we
chose to further investigate the function, structure, and regulation of
HsSERINC3. The goal of this study was to use an in
silico approach to conduct an in-depth analysis of the SERINC3 genomic
locus, protein structure, and functional networks.
Materials and Methods
SERINC3 sequences
The data on the human SERINC3 gene, including sequences and single nucleotide
polymorphisms, were collected from Entrez Gene on the National Center for
Biotechnology Information (NCBI) website.
Promoter analysis
Pairwise alignment of promoter sequences was conducted using the EMBOSS Sequence
Alignment tool (https://www.ebi.ac.uk/Tools/psa/emboss_needle/) with the
BLOSUM62 matrix and a default gap open penalty of 14.
Prediction and conservation of transcription factor binding sites within
the SERINC3 promoter was completed with Ciiider (http://ciiider.com/)
using a matrix of transcription factor binding profiles
and a deficit of 0.1. The promoter was defined as the 1000 nucleotides
preceding the SERINC3 start codon as done previously.
Conservation of the promoter region (chr20: 43150589-43151592) was
analyzed using the Dcode.org tool developed by the Ovcharenko lab.
To do this, the Evolutionary Conserved Regions (ECR) browser was used
with the following parameters: graph (smooth), ECR length (100), ECR similarity
(70), layer height (55), and coordinate system (relative).
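As a rough illustration of what a deficit threshold means when a promoter is scanned with a position frequency matrix, the following Python sketch scores each window of a sequence against a toy matrix and keeps the windows whose normalized score falls within 0.1 of a perfect match. The matrix, the example sequence, and the function name `scan_forward_strand` are hypothetical, and the scoring scheme is a simplified assumption rather than the algorithm implemented in Ciiider. Unlike the analysis described here, the sketch scans only one strand.

```python
from typing import Dict, List, Tuple

# Hypothetical toy position frequency matrix (counts of each base per position).
# Its consensus is AGCA; it does not correspond to any real transcription factor.
TOY_PFM: Dict[str, List[int]] = {
    "A": [18, 0, 0, 20],
    "C": [1, 0, 19, 0],
    "G": [1, 20, 0, 0],
    "T": [0, 0, 1, 0],
}


def scan_forward_strand(
    sequence: str, pfm: Dict[str, List[int]], max_deficit: float = 0.1
) -> List[Tuple[int, str, float]]:
    """Return (0-based start, window, deficit) for windows within max_deficit.

    The deficit is defined here (an assumption) as 1 minus the window score
    rescaled so that the worst possible window scores 0 and the best scores 1.
    """
    width = len(pfm["A"])
    columns = [[pfm[base][i] for base in "ACGT"] for i in range(width)]
    best = sum(max(column) for column in columns)
    worst = sum(min(column) for column in columns)
    if best == worst:
        raise ValueError("matrix carries no information")

    hits = []
    seq = sequence.upper()
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in pfm for base in window):
            continue  # skip windows containing ambiguous bases such as N
        raw = sum(pfm[base][i] for i, base in enumerate(window))
        deficit = 1.0 - (raw - worst) / (best - worst)
        if deficit <= max_deficit:
            hits.append((start, window, round(deficit, 3)))
    return hits


if __name__ == "__main__":
    print(scan_forward_strand("TTAGCATT", TOY_PFM))
```

A smaller deficit cut-off keeps only windows that are closer to the consensus, so the number of putative sites reported depends directly on this parameter.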
Gene ontology analysis
The proteins that interact with SERINC3 were analyzed with the GeneOntology tool
(http://geneontology.org/),[22,23] which allows for the
categorization of proteins based on annotated biological function. The analysis
type was PANTHER Overrepresentation Test (Released 20210224) using GO Ontology
database DOI: 10.5281/zenodo.5228828 Released 2021-08-18. The embedded Fisher
exact test was used to calculate the P value and a false
discovery rate cut-off of .01 was used.
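The statistics behind an over-representation test of this kind can be sketched in a few lines of Python. The example below computes a one-sided Fisher exact P value as the upper tail of the hypergeometric distribution and then applies a Benjamini-Hochberg adjustment. Both the one-sided form and the Benjamini-Hochberg procedure are assumptions made for illustration; the sketch is not the PANTHER implementation, and the counts in the usage lines are hypothetical.

```python
from math import comb
from typing import List


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher exact P value, P(X >= k).

    k: proteins in the input list annotated with the GO term
    n: size of the input list
    K: proteins in the reference set annotated with the GO term
    N: size of the reference set
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest to enforce monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: 12 of 144 input proteins carry a term that
    # annotates 200 of 20000 reference proteins.
    p = enrichment_p_value(12, 144, 200, 20000)
    print(f"P = {p:.3g}")
    print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

Terms whose adjusted value falls below the chosen false discovery rate cut-off would be reported as over-represented.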
3′UTR characterization
Analysis of the 3′ untranslated region was done using RBPSuite (http://www.csbio.sjtu.edu.cn/bioinf/RBPsuite/)
and miRDB (http://mirdb.org/).
RBPSuite and miRDB provide the probability that the analyzed
sequence has a binding site for an RNA binding protein or microRNA, respectively.
The GeneOntology tool allows for the categorization of proteins based on the
annotated biological function.
Posttranslational modification prediction
Posttranslational modification prediction was performed using the following tools
and cut-offs: NetPhos (https://services.healthtech.dtu.dk/service.php?NetPhos-3.1)
(threshold: 0.90), NetGlyc (https://services.healthtech.dtu.dk/service.php?NetNGlyc-1.0)
(threshold: 0.50), ASEB (http://cmbi.bjmu.edu.cn/huac)
(P value cut-off .05), and Biocuckoo (http://pail.biocuckoo.org/)
(balanced cut-off option). These tools predict the presence of sites of
posttranslational modification in the protein sequence by comparing it with a database of previously
characterized sequences. When possible, redundant analyses were performed with
these tools to confirm results.
Single nucleotide polymorphisms
Single nucleotide polymorphisms (SNPs) were downloaded from the NCBI Variation
Viewer (https://www.ncbi.nlm.nih.gov/variation/view). The effect of
select SNPs on protein stability was analyzed with I-Mutant2.0 (https://folding.biofold.org/i-mutant/i-mutant2.0.html), an
online support vector machine.
The ∆∆G value was calculated at 25°C from the unfolding Gibbs free energy
value of the mutated protein minus the unfolding Gibbs free energy value of the
wild type.
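The sign convention described above can be written compactly as follows. The subscript and superscript labels are notation chosen here for illustration and are not taken from the I-Mutant2.0 documentation.

```latex
\Delta\Delta G \;=\; \Delta G_{\mathrm{unfold}}^{\mathrm{mutant}} \;-\; \Delta G_{\mathrm{unfold}}^{\mathrm{wild\ type}}
```

Under this convention, a negative ∆∆G indicates that the substitution is predicted to destabilize the folded protein, and a positive ∆∆G indicates a predicted increase in stability.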
Structure analysis
Two-dimensional (2D) projections of SERINC3 were prepared using the
web-accessible Protter software (http://wlab.ethz.ch/protter/start/).
Colors and red lines were added using Adobe Illustrator (Adobe Systems,
San Jose, CA, USA). Analysis of the relative conservation of the SERINC3
compared with homolog sequences was performed using the ConSurf Server
(https://consurf.tau.ac.il/).
ConSurf uses multiple protein alignment data to predict the relative
conservation of amino acids. The search was done using the HMMER homolog search
algorithm, an E-value cut-off of 0.0001, and the UNIREF-90 protein database.
Results
Study design
The activity of a protein is controlled at different levels in a hierarchy that is
suited to its cellular role. One way to characterize the function of a protein
is to identify the set of factors that interact with it. As
the cellular role of SERINC3 is not completely understood, we reasoned that
identification of cellular factors that interact with SERINC3 would further
elucidate its function. Starting with analysis of the SERINC3 promoter, we used
an in silico approach to expand on the interactome of SERINC3
at each step of the central dogma. Regulation at the RNA level has the potential
to occur through several different mechanisms;
however, as the 3′-untranslated region (UTR) of SERINC3 is unique among
SERINC paralogs, we analyzed this region for sequences that matched binding
sites for characterized micro-RNAs (miRNAs) or RNA binding proteins. Finally, we
analyzed the protein structure of SERINC3 to search for amino acids that may be
modified by posttranslational modifications (PTMs). At each point, DNA, RNA, and
protein, conservation of sequences was also assessed.
Regulation of SERINC3 expression
The expression of SERINC3 has been reported to be dysregulated in tumor cell lines.
Although the presence of a 5' enhancer has been documented,[15,35] little is
known about the proteins that control the expression of SERINC3. To investigate
the regulation of the SERINC3 gene, we used an in silico
prediction tool to locate putative sites for the binding of transcription
factors. Use of the Ciiider toolkit allowed for the analysis and visualization
of putative transcription factor binding sites based on data contained in
position frequency matrices.
Using the matrices developed by Khan et al,
we analyzed the promoter of SERINC3, defined as the 1000 base pairs
prior to the ATG site. Our results indicated the presence of 579 putative
binding sites that matched the consensus sequences of 238 regulatory proteins
with a deficit of 0.1 set by the Ciiider software (Supplemental Table S1). Gene ontology (GO) analysis of the
transcription factors with putative binding sites revealed a majority involved
in differentiation (110), tissue-specific morphogenesis (104), or development
(203) (Supplemental Table S2). For example, 15 transcription factors
that have been previously documented for their roles in kidney development have
a total of 41 sites in the HsSERINC3 promoter that match the respective
consensus sequences (Figure 1). There are also 16 putative binding sites for the Lhx1 regulator
in the HsSERINC3 promoter, both on the coding and noncoding
strands, which have been omitted from Figure 1 for clarity. Of the 15
transcription factors involved in kidney development that bind the
HsSERINC3 promoter, only the consensus sites of GATA3,
Lhx1, Pax2, OSR2, and Smad4 are found in the coding strand of
HsSERINC3.
Figure 1.
SERINC3 promoter analysis. Putative transcription factor binding sites
were mapped to promoters using the Ciiider Software and the JASPAR 2018
CORE vertebrates set of matrices. Promoters were defined as the 1000
nucleotides preceding the SERINC3 start codon. SERINC3 promoters from
model organisms were analyzed for percent identity compared to the
HsSERINC3 promoter using EMBOSS matcher (noted in
italics). Conservation of select putative transcription factor binding
sites associated with transcriptional regulation of human kidney
development is shown, with the exception of Lhx1, which was omitted for
clarity.
Next, we considered the conservation of the SERINC3 promoter and asked if
transcriptional regulation would be altered in other model organisms. Pairwise
analyses of the SERINC3 promoter sequences from human and select model organisms
revealed conservation of the promoter region in primates. Compared with the
human SERINC3 promoter, the percent identity was the highest with Pan
troglodytes (98.8%), Gorilla gorilla (98.8%), and
Macaca mulatta (87.7%) (Figure 1). The SERINC3 promoter sequence
was somewhat less conserved in other mammals, including Sus
scrofa (pig, 64.4%), Cricetulus griseus (Chinese
hamster, 47.7%), and Mus musculus (mouse, 44.5%).

The presence of putative transcription factor binding sites in the SERINC3
promoter region led us to next ask if the respective binding regions within the
promoter were conserved. As before, the SERINC3 promoter sequences were
retrieved from NCBI and the Ciiider software tool
was used to both align the sequences and predict putative sites of
transcription factor binding. The conservation of the putative sites of
transcriptional regulator binding associated with the GO term “kidney
development” was mapped on the SERINC3 promoters of Homo sapiens and 13
model organisms (Figure 1). For the 15 transcription factors involved in kidney development
with consensus sites in the HsSERINC3 promoter, many
of the consensus sites were conserved. Among primate SERINC3 promoters,
regulation appears to be highly conserved among the set of transcription factors
that include FOXC1, SOX8, Pax2, and others. In other mammals, a reduced level of
sequence conservation meant that the putative binding sites for regulators were
less conserved. For example, the SERINC3 promoter of Felis
catus had 58.7% identity to the human promoter. Our results
demonstrated that, of the 15 regulatory proteins predicted to bind the
HsSERINC3 promoter, only 4 (SOX4, Nhx3-1, Pax2, and Smad4) are
predicted to bind the FcSERINC3 promoter.

Taken together, these results suggest that the transcription of SERINC3 is
controlled by a variety of transcriptional regulators and may contribute to
tissue-specific development. In addition, our analysis suggests that SERINC3
expression may be regulated in a similar manner in primates to that of humans,
whereas it may be subject to different transcriptional regulation in other
mammals.
The 3′UTR controls expression of SERINC3 at the RNA level
The role of both coding and noncoding RNA has expanded in recent decades to
include an additional layer of control over eukaryotic gene expression. The
presence of a ~2.8 kb untranslated region in the 3′ region of variant 1 of SERINC3
suggested a method of alternative control that is unique among human SERINC
paralogs. To gain insight on the function of the SERINC3 3′UTR, we examined the
3′UTR sequence for putative binding sites for regulatory proteins and microRNA
(miRNA). The sequence corresponding to the SERINC3 3′UTR, according to NCBI
(chr20: 44500295-44497441), was used as input for analysis with RBPSuite
and miRDB to identify putative sites for regulation by proteins and miRNA,
respectively.

First, putative sites for protein binding were detected in
nonoverlapping segments of 101 nucleotides of the SERINC3 3′UTR using a score threshold of 0.90. A
total of 144 proteins binding 807 sites were identified (Figure 2, top, and Supplemental Table S3). This analysis revealed the presence of
potential hotspots of regulation by RNA binding proteins, most notably in
segments 3, 13, and 21, with 103, 108, and 83 RBPs binding each respective region. The
poly-A region, corresponding to segment 29, was also identified as a potential
hotspot of protein binding with a predicted 98 binding sites. Using the gene
ontology tool, we were able to further classify the proteins with predicted
sites in the SERINC3 3′UTR. GO analysis
indicated that the proteins predicted to bind the SERINC3 3′UTR have
roles in mRNA stability, transport, processing, and others (Table 1).
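The segment-based bookkeeping used in this analysis can be sketched as follows. The 3′UTR length (2854 nucleotides) and the segment size (101 nucleotides) are taken from the text; the function names and the example site positions are hypothetical and serve only to show how predicted sites could be tallied per segment.

```python
from collections import Counter
from math import ceil
from typing import Iterable, List

UTR_LENGTH = 2854   # nucleotides after the stop codon, as stated in the text
SEGMENT_SIZE = 101  # nonoverlapping window size, as stated in the text


def segment_index(position: int, segment_size: int = SEGMENT_SIZE) -> int:
    """Return the 1-based segment number for a 1-based nucleotide position."""
    if position < 1:
        raise ValueError("positions are 1-based")
    return (position - 1) // segment_size + 1


def count_per_segment(
    positions: Iterable[int],
    utr_length: int = UTR_LENGTH,
    segment_size: int = SEGMENT_SIZE,
) -> List[int]:
    """Count sites per segment; index 0 of the result holds segment 1."""
    n_segments = ceil(utr_length / segment_size)
    counts = Counter(
        segment_index(p, segment_size) for p in positions if 1 <= p <= utr_length
    )
    return [counts.get(segment, 0) for segment in range(1, n_segments + 1)]


if __name__ == "__main__":
    # Hypothetical start positions of predicted binding sites.
    example_sites = [1, 50, 102, 2854]
    print(count_per_segment(example_sites))
```

With these values the 3′UTR divides into 29 segments, the last of which is shorter than 101 nucleotides, which is consistent with the poly-A region falling in segment 29.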
Figure 2.
Analysis of SERINC3 3′UTR. Top: Sites of predicted
binding by miRNA and RNA binding proteins as predicted by miRDB and
RBPSuite, respectively. The 3′UTR of SERINC3 was retrieved from NCBI and
was the 2854 nucleotides after the stop codon. Plot shows the number of
predicted binding sites (left vertical axis) for RNA binding proteins
(blue) and miRNA (orange) and the frequency of SNPs (gray line, right
axis) in nonoverlapping segments of 101 nucleotides.
Bottom: Axis for each organism corresponds to the
percent homology relative to HsSERINC3 3′UTR. Colored regions are
indicated as yellow for untranslated regions and green for simple
repeats.
Table 1.
Gene ontology terms associated with the proteins predicted to bind the 3′UTR.
GO biological process | GO accession | P value
establishment of RNA localization | GO:0051236 | 2.58E–09
gene expression | GO:0010467 | 1.01E–33
gene silencing by RNA | GO:0031047 | 9.12E–09
IRES-dependent viral translational initiation | GO:0075522 | 2.89E–05
mRNA 3′-end processing | GO:0031124 | 5.88E–13
mRNA cleavage involved in mRNA processing | GO:0098787 | 5.36E–05
ncRNA metabolic process | GO:0034660 | 1.66E–11
negative regulation of RNA metabolic process | GO:0051253 | 1.25E–04
negative regulation of translation | GO:0017148 | 9.48E–10
protein export from nucleus | GO:0006611 | 5.30E–07
regulation of gene expression | GO:0010468 | 3.49E–15
regulation of mRNA processing | GO:0050684 | 9.56E–21
regulation of mRNA stability | GO:0043488 | 2.81E–05
regulation of translation | GO:0006417 | 6.92E–14
RNA export from nucleus | GO:0006405 | 2.24E–07
RNA transport | GO:0050658 | 2.33E–09
viral process | GO:0016032 | 9.89E–05
Next, the SERINC3 3′UTR was analyzed with the miRDB tool to detect putative sites
of miRNA binding. A total of 25 putative regulatory sites and 61
miRNAs with a target prediction score greater than 80 were identified (Supplemental Table S4). We used a lower threshold for the miRDB
tool based on the likelihood of true-positive hits as determined by the
creators. Similar to the RBPs, the miRNAs appeared to bind in hotspots in the
SERINC3 3′UTR (Figure 2, top). These hotspots correspond to segments 6, 7, 8, and 29,
possessing 31, 6, 11, and 7 sites, respectively. With the exception of the
poly-A tail, located in segment 29, these miRNA binding hotspots do not
correspond to segments with a high frequency of putative RBP sites.

The pattern of putative regulatory sites in the 3′UTR led us to next ask if these
sites were conserved. An alignment of the region following the stop codon was
performed with the web-based Dcode.org tool
(Figure 2, bottom). The species that exhibited the highest homology to the 3′UTR of
HsSERINC3 were P. troglodytes, M. mulatta, and C.
familiaris, supporting the previous finding that SERINC3 is
highly conserved in mammals.
The conserved regions roughly aligned with the hotspot regions of
predicted sites for RNA binding proteins and miRNA binding (Figure 2). For example, our analysis
suggested that the miRNA binding hotspots seen in segments 6 to 8 of
HsSERINC3 appear to be conserved in Monodelphis
domestica, Rattus norvegicus, and M. musculus. In
addition, the sequences corresponding to the poly-adenine tail, which were also
predicted hotspots for binding of RBPs and miRNAs, were
conserved. In contrast, in segments 21 to 24, which contained 203 predicted binding
sites for 103 RBPs, there was a noticeable lack of homology among the analyzed
sequences. Interestingly, although the 3′UTR of M. musculus
displayed relatively low homology relative to the primate sequences, there was a
select region between segments 19 and 20 that possessed higher similarity
relative to the surrounding sequences. Although no binding sites were identified
in this region, other structural features may have a role in SERINC3
regulation.

Variation in the sequence of the SERINC3 3′UTR can also be observed through SNPs.
Regions with an elevated frequency of SNPs may have altered
mechanisms of regulation. SNP data were retrieved from the NCBI
Variation Viewer, and a total of 580 documented nucleotide variations
were observed in this region (Supplemental Table S5). The frequency of documented sequence
variation was plotted against the segments of the 3′UTR of SERINC3 (Figure 2). We observed
a range of 0 to 34 SNPs per 101-nucleotide segment throughout the 3′UTR of
SERINC3. There was no obvious correlation between SNP frequency and putative
binding sites of RBPs or miRNAs. The regions of the lowest SNP frequency were in
segments 1 and 29, the beginning and end of the 3′ region. The central region
possessed the highest frequency of SNPs with the highest being 34 SNPs
throughout the span of the 101 nucleotides of segment 12 (Figure 2). It is possible that the
higher frequency of SNPs in this region may result in altered regulatory
sequences and differential control. Overall, our analysis of the 3′UTR of
SERINC3 reveals that this region serves as a conserved site of regulation for
mRNA that may be altered by relative conservation and variation in sequence. In
addition, the presence of numerous putative sites of regulation suggests that
the 3′UTR of SERINC3 is highly regulated to alter properties such as RNA
half-life and localization.
Posttranslational modification of the SERINC3 protein
The two variants of SERINC3 encoded by Homo sapiens are
predicted to yield the same protein product. However, due to the difficulty
associated with purification of membrane proteins, the structure of
HsSERINC3 has not been elucidated. To gain insight on the
cellular roles of SERINC3, the protein sequence was examined for putative sites
of posttranslational modification.

The web-based tool Protter was used to determine the amino acids that are
amenable to modification by predicting the membrane topology of SERINC3.
Consistent with other SERINC family proteins, the topology of SERINC3 protein
contained 11 transmembrane domains[1,2] (Figure 3). This tool also verified the
presence of 3 glycosylation sites, N33 and N187 on the Golgi side and N314 that
is exposed to the cytoplasm.
Figure 3.
Prediction of the structure, modification, and relative sequence
conservation of Homo sapiens SERINC3. Amino acid
residues were colored according to the relative conservation when
compared with >50 homologous sequences. Predicted PTM including sites
of ubiquitination (+), acetylation (triangle), glycosylation (diamond),
and phosphorylation (*) are labeled with putative enzyme, if known.
Next, we used a series of web-based tools to predict the presence of
posttranslational modifications. We limited our search to residues that were
predicted to be exposed to the cytoplasm or Golgi side of the membrane (Figure 3). According to
our findings, SERINC3 has several putative modifications on both the Golgi- and
cytoplasmic-facing regions. The modifications included ubiquitination (8),
phosphorylation (18), acetylation (1), and N-glycosylation (3).
The sequence of SERINC3 contained sites that matched the consensus
phosphorylation sites of PKC, PKA, CKI, DNAPK, and cdk5.
One predicted acetylation site on the cytoplasmic side of the membrane
was detected at K266 (Figure 3). Based on prediction of consensus sites, the SIRT1 enzyme was
predicted to modify K266 with a P value of .0298.

A similar analysis revealed putative ubiquitination sites using the web-based
tool Biocuckoo (Figure 3). A
total of 8 sites were predicted, 3 on the cytoplasmic side of the membrane
and 5 on the Golgi side. Interestingly, the lysine residues at positions 33,
118, 120, 123, and 328 were located proximal to other sites of predicted PTM,
such as sites of predicted phosphorylation, suggesting the potential
for competing modifications.
Variation in sequence may alter SERINC3 regulation and structure
The level of variation in protein regions is strongly dependent on their structural
and functional importance within a protein. Therefore, we next asked if these
sites of predicted PTM were conserved or susceptible to variability through
SNPs. A previous study revealed that the presence of point mutations in SERINC5
glycosylation sites resulted in mislocalization as well as failure to
successfully incorporate into the HIV-1 virion.
Although SERINC family proteins are well-conserved in mammals, we asked
if the sites of putative regulation were susceptible to variation. To answer
this question, we evaluated the sequence for conservation among homologs and for
the presence of single nucleotide polymorphisms.

Using the ConSurf tool
to align select homologous sequences, we analyzed the relative
conservation of each amino acid in the SERINC3 sequence (Figure 3 and Supplemental Table S5). Based on a multiple sequence alignment
of more than 50 homologous sequences, each amino acid in SERINC3 was scored for
the relative conservation on a scale of 1 (indicating a variable residue) to 9
(indicating a conserved residue) (Figure 3). Our analysis showed that the
exposed loops corresponding to the regions between helices 2 and 3, 8 and 9, and
9 and 10 showed the greatest variability in sequences relative to homologs. A
majority of the putative ubiquitination sites (7/8) were ranked with a score of
5 or less according to the analysis performed by ConSurf indicating that these
sites are not highly conserved. Of the predicted phosphorylation sites outside
of the membrane, 8 were conserved, having ConSurf-associated homology scores of
6 or above. Overall, the considerable variation in amino acid sequence in
homologs suggests that regulatory mechanisms may not be conserved in other model
organisms.

Next, we considered variation in SERINC3 that occurs through documented SNPs.
Missense point mutations in the coding sequence of SERINC3 were retrieved using
the NCBI Variation Viewer. In total, SERINC3 protein exhibited 321 documented
SNPs resulting in a missense mutation, where 173 of those were located outside
of the membrane regions (Supplemental Table S4). Our analysis of SNPs revealed that the
loop regions were susceptible to increased variability relative to the
transmembrane regions. Specifically, we observed a ratio of 0.82
SNPs per residue in the exposed regions, compared with the transmembrane regions,
which only had a ratio of 0.64. These data suggest that the regions exposed to
the cytoplasm or Golgi apparatus are susceptible to more variation than the
transmembrane regions. Further analysis revealed that the 5-6 loop region
exhibited the lowest ratio of SNPs to residues, as expected with its high number of
conserved residues (Table 2). In contrast, loops 1-2 and 6-7, as well as the C-terminal tail,
exhibited the highest ratios of SNPs to amino acids.
Table 2. Frequency of single nucleotide polymorphisms (SNPs) in cytoplasmic or
Golgi-exposed regions of SERINC3.

| Region | # SNPs | # residues | SNPs per residue |
|---|---|---|---|
| N-term | 4 | 5 | 0.80 |
| loop 1-2 | 12 | 11 | 1.09 |
| loop 2-3 | 22 | 39 | 0.56 |
| loop 3-4 | 8 | 12 | 0.67 |
| loop 4-5 | 4 | 5 | 0.80 |
| loop 5-6 | 6 | 21 | 0.29 |
| loop 6-7 | 11 | 11 | 1.00 |
| loop 7-8 | 9 | 11 | 0.82 |
| loop 8-9 | 38 | 49 | 0.78 |
| loop 9-10 | 34 | 53 | 0.64 |
| loop 10-11 | 16 | 19 | 0.84 |
| C-term | 9 | 6 | 1.50 |
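
The per-region ratios in Table 2 can be recomputed directly from the tabulated counts. The following is a minimal Python sketch, not the authors' pipeline: the dictionary and function names are illustrative assumptions, and only the counts come from Table 2. It reports two summary values for the exposed regions, because pooling all counts and averaging the per-region ratios give different numbers.

```python
# Counts transcribed from Table 2: region -> (missense SNPs, residues).
# Names below (TABLE2, per_region_ratios, ...) are illustrative, not from the study.
TABLE2 = {
    "N-term": (4, 5),
    "loop 1-2": (12, 11),
    "loop 2-3": (22, 39),
    "loop 3-4": (8, 12),
    "loop 4-5": (4, 5),
    "loop 5-6": (6, 21),
    "loop 6-7": (11, 11),
    "loop 7-8": (9, 11),
    "loop 8-9": (38, 49),
    "loop 9-10": (34, 53),
    "loop 10-11": (16, 19),
    "C-term": (9, 6),
}


def per_region_ratios(table):
    """SNPs per residue for each exposed region."""
    return {region: snps / residues for region, (snps, residues) in table.items()}


def pooled_ratio(table):
    """Total SNPs divided by total residues across all regions."""
    total_snps = sum(snps for snps, _ in table.values())
    total_residues = sum(residues for _, residues in table.values())
    return total_snps / total_residues


def mean_of_ratios(table):
    """Unweighted mean of the per-region ratios."""
    ratios = per_region_ratios(table)
    return sum(ratios.values()) / len(ratios)


ratios = per_region_ratios(TABLE2)
# List regions from least to most variable.
for region, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{region:<10} {ratio:.2f}")
print(f"pooled: {pooled_ratio(TABLE2):.2f}")
print(f"mean of ratios: {mean_of_ratios(TABLE2):.2f}")
```

With these counts, the pooled ratio is about 0.71 and the mean of the per-region ratios is about 0.82. The latter matches the value quoted in the text, although which summary was intended is an assumption here.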
The presence of SNPs can alter both the identity of the residue susceptible to
PTM and the stability of the structure. The change in free energy of protein
folding associated with the new amino acid was determined using the I-Mutant2.0
online tool (Table 3 and Supplemental Table S6). Of the SNPs we analyzed, 19 point
mutations resulted in a ΔΔG
greater than zero, indicating that the new amino acid would increase the
stability of the SERINC3 tertiary structure. The
majority of SNPs, however, resulted in a structure that is less stable (168 with
ΔΔG < 0, Supplemental Table S6). Several missense mutations were also
found to occur at a putative site of posttranslational modification. These sites
included the predicted sites of acetylation (K66), phosphorylation (S327, S331,
S359, T468, S473), and ubiquitination (K33, K328, K432). In contrast, the SNPs
S122T and S122N, with ΔΔG of 0.31 and 0.04 kcal/mol, respectively, would result in an
increase in stability of the folded SERINC3 protein. Overall, these results
suggest that sequence variation, whether reflected in the relative conservation of
amino acids or in individual nucleotide variation, has the potential to
alter the structure and regulatory mechanisms of the SERINC3 protein.
Table 3. SERINC3 structural stability based on free energy change.

| Variant ID | Residue change | ΔΔG (kcal/mol) |
|---|---|---|
| rs768124514 | Lys33Asn | −1.02 |
| rs1555830242 | Ser122Thr | 0.31 |
| rs1555830242 | Ser122Asn | 0.04 |
| rs748350977 | Ser122Arg | −0.28 |
| rs185462055 | Ser328Asn | −1.88 |
| rs762874477 | Ser331Thr | −1.51 |
| rs762106624 | Ser359Arg | −1.42 |
| rs1254122914 | Lys432Glu | −0.45 |
| rs936670439 | Thr468Ile | −0.01 |
| rs201460770 | Thr468Ala | −0.93 |
| rs760368420 | Ser473Cys | −1.91 |
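
The sign convention stated in the text (ΔΔG greater than zero read as an increase in stability, less than zero as a decrease) can be applied to the Table 3 values in a few lines. This is a minimal Python sketch under that convention, not the I-Mutant2.0 workflow itself: the values are transcribed from Table 3, and the names `TABLE3`, `classify`, and `group_by_effect` are hypothetical helpers introduced only for illustration.

```python
# Rows transcribed from Table 3: (variant ID, residue change, ddG in kcal/mol).
TABLE3 = [
    ("rs768124514", "Lys33Asn", -1.02),
    ("rs1555830242", "Ser122Thr", 0.31),
    ("rs1555830242", "Ser122Asn", 0.04),
    ("rs748350977", "Ser122Arg", -0.28),
    ("rs185462055", "Ser328Asn", -1.88),
    ("rs762874477", "Ser331Thr", -1.51),
    ("rs762106624", "Ser359Arg", -1.42),
    ("rs1254122914", "Lys432Glu", -0.45),
    ("rs936670439", "Thr468Ile", -0.01),
    ("rs201460770", "Thr468Ala", -0.93),
    ("rs760368420", "Ser473Cys", -1.91),
]


def classify(ddg: float) -> str:
    """Label a variant using the sign convention stated in the text."""
    if ddg > 0:
        return "stabilizing"
    if ddg < 0:
        return "destabilizing"
    return "neutral"


def group_by_effect(rows):
    """Group residue changes by predicted effect on folding stability."""
    groups = {"stabilizing": [], "destabilizing": [], "neutral": []}
    for _variant_id, change, ddg in rows:
        groups[classify(ddg)].append(change)
    return groups


groups = group_by_effect(TABLE3)
for effect, changes in groups.items():
    print(f"{effect}: {len(changes)} ({', '.join(changes) or 'none'})")
```

A sign-only classification treats values very close to zero (for example Thr468Ile at −0.01) the same as large shifts; a tolerance band around zero would be a reasonable refinement, but the text does not specify a threshold, so none is assumed here.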
Discussion
The SERINC proteins have emerged as proteins of interest because their cellular
functions are not well characterized. Since the initial observation of variable
expression in tumor lines,
the function of the SERINC family of proteins has been expanded to include
serine-containing phospholipid biosynthesis, cellular trafficking, and the
ability to restrict lentiviruses, such as HIV-1.[2,6] Herein, we used several
bioinformatics tools to gain insight into the cellular factors that interact with
SERINC3 at the level of DNA, RNA, and protein (Figure 4).
Figure 4.
Model of the predicted SERINC3 interactome.

Our analysis of the SERINC3 promoter indicated the presence of putative transcription
factor binding sites with roles in development of organs, including the generation
of neurons (Figure 1 and Supplemental Table S2). Gene expression of SERINC3 in humans is also
regulated by the enhancer present 16 nucleotides upstream of the start site.
Conservation of regions with predicted transcription factor binding sites
suggests that regulation of SERINC3 at the transcriptional level might be conserved
in mammals.

More than half of human genes use alternative cleavage and polyadenylation to
generate alternative 3′UTR isoforms. Untranslated regions contain sequence-specific
binding sites for proteins and regulatory RNA that can alter splicing, cellular
location, and mRNA stability. The effects of variants on protein structure can vary
dramatically depending on the type of protein and the extent of variation.
Examination of the SERINC3 3′UTR revealed that the transcripts are differentially
regulated. Based on the GO terms associated with the putative RBPs, the variant 1
transcript of SERINC3 may have an alternate location or relative stability in the
cell. In support of this finding, roles for SERINC3 have been indicated in the
plasma membrane as well as the Golgi apparatus.
In addition, the SERINC proteins appear to be associated with
different adaptor proteins that start in the Golgi and are destined for
various organelles.
The roles of SERINC3 and its paralogs in membrane trafficking link directly to
the initial studies of variable expression in tumors, as vesicular trafficking can
be highly upregulated in tumors.

The cellular activities of SERINC paralogs appear to be regulated by
posttranslational modification. For example, SERINC4 protein is subject to
degradation by the proteasome, which contributes to its activity in restricting
HIV-1 replication.
In addition, N-glycosylated SERINC5 has been observed
to be preferentially incorporated into HIV-1 virions.
Our search for putative sites of posttranslational modifications in SERINC3
extra-membrane regions yielded the prediction of sites of putative phosphorylation,
glycosylation, acetylation, and ubiquitination. Addition of these functional groups
to the primary structure of SERINC3 may be a mechanism to modulate protein functions
and dynamically coordinate a signaling network.

Other studies that used high-throughput techniques have detected the presence of
sites of ubiquitination[37-39] and phosphorylation in
SERINC3. The proximity of the PTM sites to one another suggests that SERINC3 might be coordinately
regulated. For example, sites of predicted phosphorylation by PKC or PKA are in
close proximity on the 3-4 loop and C-terminus of SERINC3. Many of the residues with
predicted PTMs are also susceptible to missense mutations that could result in
changes to the tertiary structure as well as the regulation of SERINC3.

Prediction of an acetylation site at K266 in the 2-3 loop of SERINC3 by SIRT1 might
add another layer of regulation. Sirtuin 1 (SIRT1) is a conserved enzyme
that has been demonstrated to have roles in oxidative stress and other metabolic activities.
Although most commonly associated with modulation of histone activity,
acetylases and deacetylases can target other proteins.

The role of SERINC3 in the cell, as well as its association with disease, is unclear.
According to the Cancer Genome Atlas, somatic mutations in SERINC3 have been
observed in bladder, endometrial, and other cancers. Another large-scale study found
SNPs in the sequence of SERINC3 associated with breast cancer
and progressive supranuclear palsy.
Similarly, our data suggest that SNPs in the HsSERINC3 coding sequence
have the potential to alter its regulation as well as the stability of its folded
structure, which may contribute to abnormal cellular phenotypes. These findings are in
agreement with a recent study of SERINC5 that found that variability within a
cytoplasmic exposed region alters the ability to restrict HIV.
The exposed regions of SERINC3 appear to be susceptible to SNPs in humans and have a
considerable range of conservation at the level of amino acids. The variability in
the SERINC3 protein sequence that we observed in our studies has the potential to
alter the structure and function of SERINC3 at the cellular level. It is possible
that these variations in SERINC3 sequence can generate cellular phenotypes that
contribute to disease, which is a future area of investigation.

In this study, we used an in silico approach to predict and
characterize the functional network of SERINC3; as such, additional studies are
required to validate these findings. Throughout the study, statistical significance
was kept stringent to limit the number of predicted false-positive results (see
“Methods” section). When possible, redundant analyses were performed using separate
tools to confirm hits and reduce false positives. In vivo analysis
might reveal tissue-specific interactions or regulation. To our knowledge, our study
is the first to investigate the SERINC3 interactome and to provide a predictive analysis
of variation and regulators at the level of DNA, RNA, and protein. Our data suggest
that SERINC3 is regulated at the transcriptional level by several transcription
factors and at the RNA level by RBPs and miRNAs. Several sites of predicted protein
modification were also identified. The functional and structural impacts of SNPs were
also investigated using computational prediction tools. The results found here
suggest that SERINC3 is coordinately regulated and that sequence variation has the
potential to alter both protein structure and cellular function.

Supplemental material, sj-xlsx-1-bbi-10.1177_11779322221092944 for From Beginning
to End: Expanding the SERINC3 Interactome Through an in silico Analysis by
Mckenzie Tu and Sarah Saputo in Bioinformatics and Biology Insights