Didac Santesmasses1, Vadim N Gladyshev1. 1. Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.
Abstract
Selenium is incorporated into selenoproteins as the 21st amino acid selenocysteine (Sec). There are 25 selenoproteins encoded in the human genome, and their synthesis requires a dedicated machinery. Most selenoproteins are oxidoreductases with important functions in human health. A number of disorders have been associated with deficiency of selenoproteins, caused by mutations in selenoprotein genes or Sec machinery genes. We discuss mutations that are known to cause disease in humans and report their allele frequencies in the general population. The occurrence of protein-truncating variants in the same genes is also presented. We provide an overview of pathogenic variants in selenoprotein genes from a population genomics perspective.
Keywords:
genetic variance; human disease; selenium; selenocysteine; selenoprotein
Selenium is an essential trace element in mammals. Its main biological functions are mediated by selenoproteins, which contain selenium in the form of the 21st amino acid selenocysteine (Sec). Selenoproteins are important oxidoreductase enzymes widely conserved across mammals [1]. They are involved in diverse molecular pathways, with Sec typically found at the catalytic site [2]. Sec is incorporated into selenoproteins in response to a UGA codon, normally a stop codon, through a recoding mechanism that is selenium-dependent and involves a dedicated machinery [3]. There are 25 known selenoprotein genes in humans, and 24 in mice [4]. Mouse models have been particularly instrumental in interrogating the functions of selenoproteins and assessing their gene essentiality [5]. Five selenoproteins have been reported to be essential in mice: Gpx4 [6,7], Txnrd1 [8], Txnrd2 [9], Selenot [10], and Selenoi [11]. In humans, the significance of selenium and selenoproteins for health is manifested by inherited congenital disorders caused by mutations that disrupt selenoprotein synthesis or affect individual selenoproteins.
Here we provide an overview of clinically relevant genetic variants based on the analysis of the literature and specialized databases. We used ClinVar, a public archive of reports of the relationships of genetic variation and phenotypes with assessment of clinical relevance of variants submitted by researchers and genetic testing labs [12], and gnomAD, an aggregate of exome and genome sequencing data of unrelated individuals sequenced as part of various disease-specific and population genetic studies [13].
2. Pathogenic Variants in Selenoproteins
Selenoproteins associated with human syndromes thus far include five proteins: SELENON, GPX4, TXNRD1, TXNRD2, and SELENOI. Disruption of the selenoprotein synthesis pathway, which causes generalized selenoprotein deficiency, has been associated with mutations in SEPSECS, SECISBP2, and Sec tRNA (TRU-TCA1-1). The consequences of selenoprotein deficiency and causal mutations have been reviewed recently [14,15,16,17]. ClinVar currently lists a total of 72 pathogenic or likely pathogenic variants in selenoprotein genes, and 35 in the Sec synthesis machinery factors. Of those variants, 44 are observed in at least one individual in gnomAD.
2.1. SELENON
Selenoprotein N (SELENON, also known as SEPN1) is a transmembrane protein located in the endoplasmic reticulum (ER) [18]. Initially described by the in silico identification of its SECIS element [19], SELENON was shortly after linked to a congenital rigid spine muscular dystrophy [20], becoming the first selenoprotein associated with human disease. Its protein sequence contains an EF-hand domain and a Sec residue as part of a SCUG motif, where U corresponds to Sec, reminiscent of the Sec-containing motif in thioredoxin reductases. Since its association with disease, there has been considerable interest in characterizing its functions in muscle [21,22]. A recent study showed how SELENON functions as a calcium sensor through its calcium-binding EF-hand domain and activates sarco/endoplasmic reticulum calcium ATPase (SERCA2)-mediated calcium uptake into the ER in a redox-dependent manner [23]. Knockout mice were also developed [24,25].
SELENON-related myopathy (SELENON-RM, formerly SEPN1-RM) is a congenital disorder caused by loss-of-function variants in the SELENON gene. SELENON-RM comprises four neuromuscular disorders initially described separately as rigid spine muscular dystrophy [20,26], multi-minicore disease [27], congenital fiber type disproportion [28], and desmin-related myopathy with Mallory body-like inclusions [29]. The clinical phenotype is characterized by early-onset axial muscle weakness, spinal rigidity, and scoliosis, with respiratory failure. Transmission is autosomal recessive, and patients are homozygous or compound heterozygous. To date, many genetic variants that disrupt SELENON, or prevent Sec insertion, have been identified in SELENON-RM patients [30]. ClinVar currently lists 65 variants in SELENON as pathogenic/likely pathogenic. Notably, those variants include a change in the Sec codon TGA to TAA [20], which produces a premature stop codon (currently erroneously classified as synonymous in ClinVar); a variant that affects the conserved quartet core of the SECIS [31]; and several variants in the Selenocysteine Redefinition Element (SRE), a small RNA hairpin loop adjacent to the UGA codon [20,32]. In those cases where Sec synthesis efficiency is reduced, transcript levels have been shown to be decreased, suggesting that the mRNA is targeted by nonsense-mediated decay [31,32]. The remaining pathogenic variants are missense changes in conserved residues, small insertions and deletions that lead to frameshift, and splice donor/acceptor variants.
The gnomAD database contains 30 of the pathogenic or likely pathogenic variants, which include both missense and protein-truncating variants (PTV: frameshift, stop gain, and splice variants) (Figure 1). Though they can be considered rare, those included in gnomAD are, presumably, the most common SELENON-RM associated variants in the general population. Their allele frequencies range from 0.000004 to 0.0002, and carriers are all heterozygotes. Some of the variants are more frequent in certain populations. For example, the frameshift variant p.Asn204LysfsTer63 (g.26135244C>CA) had an allele frequency of 0.001353 in the Ashkenazi Jewish population. In addition to those listed in ClinVar, other PTVs are present in gnomAD (Figure 1), which have not been assessed for clinical significance. Given that all but one of the PTVs that have been submitted to ClinVar are pathogenic (19 in total), this raises the question of whether the additional 23 truncating variants in gnomAD could potentially cause SELENON-RM.
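The misclassification of the TGA-to-TAA change at the Sec codon, noted above, can be illustrated with a minimal Python sketch. It is not the annotation logic of ClinVar or of any real tool; the function `classify_codon_change` and its `sec_codon` flag are hypothetical names introduced here for illustration. Under the standard genetic code both TGA and TAA are stop codons, so a naive annotator sees a stop-to-stop change, whereas an annotator that knows the reference TGA is recoded as Sec sees a Sec-to-stop change.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_codon_change(ref, alt, sec_codon=False):
    """Classify a codon change, optionally treating TGA as Sec.

    Only the stop/Sec logic is modelled; all other codons are lumped
    together as "sense". `sec_codon` is a hypothetical flag: True means
    the reference TGA is the recoded selenocysteine codon.
    """
    def meaning(codon):
        if sec_codon and codon == "TGA":
            return "Sec"
        return "stop" if codon in STOP_CODONS else "sense"

    ref_meaning, alt_meaning = meaning(ref), meaning(alt)
    if ref_meaning == alt_meaning and ref_meaning != "sense":
        return "synonymous"      # stop retained, or Sec retained
    if alt_meaning == "stop":
        return "stop gained"     # includes Sec -> stop
    return "other"               # cases not modelled in this sketch


naive = classify_codon_change("TGA", "TAA")
sec_aware = classify_codon_change("TGA", "TAA", sec_codon=True)
print("standard code:", naive)
print("Sec-aware:    ", sec_aware)
```

With the standard code the change is reported as synonymous, while the Sec-aware reading reports a stop gain, which is the interpretation consistent with loss of the Sec residue.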
Figure 1
Pathogenic variants and protein-truncating variants in SELENON. (a) Genomic organization of SELENON exons, and location of ClinVar pathogenic variants (above) and gnomAD protein-truncating variants (PTV) (below). The shape and color of each variant correspond to its predicted consequence (see legend). For gnomAD variants only, the size of the symbol is proportional to the allele frequency. The genomic notation is used to describe each variant. (b) Location of variants along the SELENON protein sequence. The length of the protein is indicated on the right. Vertical black lines correspond to exon boundaries, and the vertical orange line corresponds to the Sec residue. The same aesthetics for variants as in (a) are used. The genomic coordinates correspond to genome build GRCh37/hg19; the SELENON gene structure and protein sequence correspond to transcript ENST00000374315.
2.2. GPX4
GPX4 is unique among the five glutathione peroxidases that depend on selenium in humans due to its ability to reduce lipid hydroperoxides and to use protein thiols as donors of electrons in addition to glutathione [33,34]. GPX4 also functions as a major regulator of ferroptosis, a form of regulated cell death characterized by the accumulation of lipid hydroperoxides [35,36]. Gpx4 is essential for embryonic mouse development [6,7], and its deficiency in mice leads to neuronal degeneration, ataxia, and seizures [37].
GPX4 is associated with Sedaghatian-type spondylometaphyseal dysplasia (SSDM). The syndrome was described in 1980 as a congenital autosomal recessive disorder [38], and more recently, inactivating variants in GPX4 were identified in SSDM patients [39]. Patients show skeletal disorder and brain atrophy and die shortly after birth due to respiratory failure [40].
There are five pathogenic or likely pathogenic variants in GPX4 associated with SSDM in ClinVar (Figure 2). Three of them were reported in [39], while for the other two (p.Gly51_His52insTer and p.Ile170fs) there is no study citation in ClinVar. Only one of the pathogenic variants is observed in gnomAD, c.476+5G>A (g.1105813G>A), with allele frequency 0.00003586. Additional protein-truncating variants in GPX4 are observed in gnomAD, which are not present in ClinVar. Their allele frequencies range from 0.000004 to 0.000065.
Figure 2
Pathogenic variants and protein-truncating variants in GPX4. Location of ClinVar pathogenic variants (above) and gnomAD protein-truncating variants (below), along the GPX4 protein sequence. The same aesthetics as in Figure 1b are used. GPX4 transcript: ENST00000354171.
2.3. Thioredoxin Reductases
Thioredoxin reductases (TXNRD) are oxidoreductases that play a major role in the disulfide reduction system of the cell by converting thioredoxins to their reduced state. There are three TXNRDs in mammals, with different cellular or tissue localization. TXNRD1 is localized mainly in the cytosol and nucleus throughout different tissues, TXNRD2 is localized in the mitochondria also throughout tissues, and TXNRD3 is highly expressed in the testis [2]. All three proteins have the carboxy-terminal motif GCUG that contains the Sec residue [1]. Both Txnrd1 and Txnrd2 are essential for embryonic development in mice [8,9]. TXNRD2 has been associated with two different, unrelated diseases: dilated cardiomyopathy [41] and familial glucocorticoid deficiency [42]. TXNRD1 has been associated with generalized epilepsy [43].
Two variants in TXNRD2, p.Ala59Thr and p.Gly375Arg, were identified in patients who suffered from dilated cardiomyopathy [41]. The clinical significance of p.Gly375Arg is currently uncertain in ClinVar, although it is predicted to be deleterious by the algorithms Polyphen and SIFT, used in gnomAD. The missense variant p.Gly375Arg (c.1123G>A, allele frequency (AF) = 0.000016; and c.1123G>C, AF = 0.00003) is observed in five heterozygous subjects in gnomAD. Only one of them is part of the control set, so it is not possible to assume that they are all healthy. A change to glutamic acid in this position is also present in gnomAD, p.Gly375Glu (AF = 0.000004). The other variant associated with dilated cardiomyopathy, p.Ala59Thr, is not present in ClinVar, but is also observed in gnomAD (AF = 0.000004). In this position, a change to proline is also observed, p.Ala59Pro, with an allele frequency of 0.00001. All missense variants in position 59 are heterozygous.
The variant p.Y447Ter (g.19865895A>C) in TXNRD2 introduces a premature UAG stop codon leading to a truncated protein that lacks the Sec residue. The homozygous form of this variant was identified in several members of a consanguineous Kashmiri family affected with familial glucocorticoid deficiency [42]. The absence of cardiomyopathy in the family was surprising because it implied clinical heterogeneity associated with TXNRD2. The variant was also observed in a heterozygous patient with dilated cardiomyopathy [44]. The global allele frequency in gnomAD is 0.0005 (Figure 3), observed in 141 heterozygotes, but the variant appears to be much more common in the South Asian population, with an allele frequency of 0.0035.
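The relationship between the figures quoted for p.Y447Ter can be checked with a short Python sketch. It assumes that allele frequency is the allele count divided by the number of alleles sampled (AF = AC / AN), and that each of the 141 heterozygotes contributes exactly one variant allele, with no homozygotes. Because the frequencies in the text are rounded, the results are approximate. The function names are hypothetical and introduced only for illustration.

```python
def implied_allele_number(allele_count, allele_frequency):
    """Solve AF = AC / AN for AN, the number of alleles sampled."""
    return allele_count / allele_frequency


def fold_enrichment(population_af, global_af):
    """Ratio of a population-specific allele frequency to the global one."""
    return population_af / global_af


# Values quoted in the text for TXNRD2 p.Y447Ter.
heterozygotes = 141       # assumption: one variant allele per carrier
global_af = 0.0005
south_asian_af = 0.0035

allele_number = implied_allele_number(heterozygotes, global_af)
enrichment = fold_enrichment(south_asian_af, global_af)

print(f"implied alleles sampled: about {allele_number:,.0f}")
print(f"implied individuals:     about {allele_number / 2:,.0f}")
print(f"South Asian vs global:   about {enrichment:.0f}-fold")
```

Under these assumptions, the global frequency corresponds to roughly 282,000 sampled alleles, and the South Asian frequency is about seven times the global one.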
Figure 3
Pathogenic variants and protein-truncating variants in TXNRD1 and TXNRD2. Location of pathogenic variants (above) and gnomAD protein-truncating variants (below) along the TXNRD1 and TXNRD2 protein sequence. The same aesthetics as Figure 1b are used. TXNRD1 transcript: ENST00000526390; TXNRD2 transcript: ENST00000400521.
TXNRD1 has been associated with genetic generalized epilepsy. The homozygous variant p.Pro190Leu (g.104714898C>T) was shown to cause decreased TXNRD1 protein levels and turnover rate, and segregated with the disease in a family [43]. The variant is found in one heterozygous subject in gnomAD (AF = 0.000004).
2.4. SELENOI
Selenoprotein I (SELENOI; also known as EPT1, Ethanolamine phosphotransferase 1) was recently added to the list of essential selenoproteins in mice [11]. In humans, two pathogenic variants are currently known. They cause spastic paraplegia, described in two different families [45,46]. The missense variant p.Arg112Pro (g.26596259G>C) [45] affects a highly conserved arginine residue within the CDP-alcohol phosphatidyltransferase domain (Pfam PF01066). The splice variant g.26607825A>G leads to aberrantly spliced transcripts with exons 6 and 8 affected [46]. The mode of inheritance is autosomal recessive: all patients were homozygous, and unaffected direct relatives were heterozygous. Those two variants are not present in gnomAD (Figure 4), and presumably, are very rare in the population. In concordance with the importance of SELENOI, a strong selection against protein-truncating variants was observed in humans [47].
Figure 4
Pathogenic variants and protein-truncating variants in SELENOI. Location of ClinVar pathogenic variants (above) and gnomAD protein-truncating variants (below), along the SELENOI protein sequence. The same aesthetics as Figure 1b are used. SELENOI transcript: ENST00000260585.
The Sec-specific tRNA (tRNA[Ser]Sec), encoded by TRU-TCA1-1, plays a central role in the synthesis of selenoproteins. The Sec tRNA provides the backbone for the biosynthesis of Sec [48], and its anticodon recognizes context-dependent UGA codons to specify Sec insertion [49]. Deletion of the corresponding mouse gene (Trsp) is embryonic lethal [50]. Several mouse models have been developed to study its role in health [51]. The Sec tRNA has unique features that distinguish it from other tRNAs: it is the longest tRNA, with 90 nucleotides compared to ~75 in other tRNAs, and it has a unique structure, with a long 9-base-pair (bp) acceptor stem, a long 6-bp D stem, and an unusually long variable arm [52]. It is transcribed by RNA Pol III, like other tRNAs, but has unique promoter elements [53]. The mature tRNA contains a few modified bases [49], and the tRNA pool is composed of two major isoforms, containing either 5-methoxycarbonylmethyluridine (mcm5U) or 5-methoxycarbonylmethyl-2′-O-methyluridine (mcm5Um) at position 34 [54]. The presence of mcm5Um governs the expression of stress-related selenoproteins [55].
The single nucleotide change C65G was identified in patients [56,57]. The first patient described [56] exhibited a clinical phenotype similar to that observed in patients with SECISBP2 mutations. Primary cells from the proband showed low tRNA[Ser]Sec expression with a reduction in mcm5Um levels and decreased i6A modification in position 37. This suggests that the tRNA post-transcriptional maturation was impaired. The selenoprotein expression profile showed a deficiency of stress-related selenoproteins, but preserved levels of housekeeping selenoproteins. The position C65 is located in the acceptor arm, adjacent to C64 in the TΨC arm, which interacts with SEPSECS [58]. The precise mechanism leading to the imbalance of tRNA[Ser]Sec isoforms is unclear, with impaired post-transcriptional maturation or unstable interaction with SEPSECS being possibilities [16]. The variant C65G was not observed in gnomAD, but in that same position, the variant C65T was observed in two subjects, both heterozygotes. In total, 83 variants in 55 sites of TRU-TCA1-1 are currently present in gnomAD, with allele frequencies ranging from 0.000007 to 0.0008. All variants are heterozygous, except for one: the change C28T, in the anticodon arm, is observed in two homozygous subjects. Given that many heterozygous variants exist in the general population, it would be reasonable to assume that a single wild type TRU-TCA1-1 allele is enough to maintain adequate expression of selenoproteins, as observed in heterozygotes for the variant C65G [56]. Remarkably, two subjects have a variant in position 35 of the anticodon triplet. One variant is C35T, which produces a TTA anticodon that is complementary to the stop codon UAA. The other one is C35G, producing a TGA anticodon, which is a Ser anticodon. Both subjects are heterozygous, but the change in the anticodon could have consequences not only on the synthesis of selenoproteins, but potentially on the entire proteome.
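The effect of the two position-35 variants on codon recognition can be worked through with a small Python sketch. It assumes strict Watson-Crick pairing between the anticodon and the codon (wobble and base modifications are ignored) and that the anticodon occupies positions 34 to 36 of the tRNA. The helper names `mutate_anticodon` and `codon_read_by` are hypothetical and introduced only for illustration.

```python
# Watson-Crick complement, written in the RNA alphabet of the mRNA codon.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A", "U": "A"}


def codon_read_by(anticodon):
    """Return the mRNA codon (5'->3') paired by an anticodon (5'->3').

    Pairing is antiparallel, so the codon is the reverse complement.
    """
    return "".join(COMPLEMENT[base] for base in reversed(anticodon.upper()))


def mutate_anticodon(anticodon, position, new_base, first_position=34):
    """Replace the base at a tRNA position (34, 35, or 36) of the anticodon."""
    index = position - first_position
    return anticodon[:index] + new_base + anticodon[index + 1:]


wild_type = "TCA"  # anticodon of TRU-TCA1-1 in the DNA alphabet
variants = {
    "wild type": wild_type,
    "C35T": mutate_anticodon(wild_type, 35, "T"),
    "C35G": mutate_anticodon(wild_type, 35, "G"),
}

for name, anticodon in variants.items():
    print(f"{name}: anticodon {anticodon} pairs with codon {codon_read_by(anticodon)}")
```

The wild-type anticodon pairs with UGA, the C35T anticodon with the stop codon UAA, and the C35G anticodon with UCA, which is a serine codon in the standard genetic code.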
3.2. SECISBP2
The SECIS binding protein 2 (SECISBP2) binds the SECIS element in the 3′UTR of selenoprotein mRNAs, and interacts with EEFSEC, tRNA[Ser]Sec, and the ribosome, to incorporate Sec into the growing peptide. It is an obligate limiting factor for selenoprotein synthesis [59], and it is an essential gene in mice [60]. Its deficiency disrupts the synthesis of selenoproteins, which, in humans, is manifested by a multisystem disorder characterized by low circulating selenium and abnormal thyroid hormone levels [16]. A total of 18 pathogenic variants have been identified in 13 individuals from 11 families [16]. Three of the subjects are homozygous, and the rest are compound heterozygous. Most variants produce a truncated protein, either by stop gain or frameshift, and three are missense variants.
The first 400 N-terminal amino acids of the human SECISBP2 protein are dispensable for Sec incorporation [59,61]. The C-terminal region (positions 399 to 784) comprises two domains, the Sec incorporation domain (SID), and the RNA binding domain (RBD) [62]. The RBD contains an L7Ae RNA-binding domain that interacts with the SECIS and the 28S ribosomal RNA [63]. Mouse models carrying human mutations have been developed to study the effect of specific pathogenic variants that affect either the RBD or SID regions [64]. The results showed that the variant in the RBD abrogates Sec insertion, while the particular variant tested in the SID results in residual SECISBP2 activity in the brain, but renders the protein unstable in a tissue-specific manner, being completely degraded in mouse liver.
ClinVar lists seven variants classified as pathogenic/likely pathogenic (Figure 5). Four of them are observed in gnomAD, all with an allele frequency below 0.00003. In addition, 93 protein-truncating variants are observed in gnomAD, which are not listed in ClinVar. They are all heterozygous and their allele frequencies are below 0.00009.
Figure 5
Pathogenic variants and protein-truncating variants in SECISBP2. Location of ClinVar pathogenic variants (above) and gnomAD protein-truncating variants (below), along the SECISBP2 protein sequence. The same aesthetics as Figure 1b are used. SECISBP2 transcript: ENST00000375807.
A paralog of SECISBP2, named SECISBP2L, is found in vertebrates [65]. Its function has not been elucidated, but it was shown to lack Sec incorporation activity [66]. Based on gnomAD, SECISBP2L has strong selective constraints against truncating variants [47]. Mice lacking Secisbp2l have been phenotyped and deposited on the International Mouse Phenotyping Consortium (IMPC). Deletion of Secisbp2l has, apparently, no effect on viability, but it shows significant phenotypes in body size, metabolism/adipose tissue, cardiovascular system, skeleton, hearing, and vision (https://www.mousephenotype.org/data/genes/MGI:1917604, accessed on 17 October 2021). However, deletion of Secisbp2 is also reported to have no effect on viability (https://www.mousephenotype.org/data/genes/MGI:1922670, accessed on 17 October 2021), which is in contradiction with the previous body of work that reported Secisbp2 as essential [60].
3.3. SEPSECS
SEPSECS catalyzes the conversion of Ser-tRNA[Ser]Sec to Sec-tRNA[Ser]Sec, the final step in the biosynthesis of selenocysteine [67]. The crystal structure of human SEPSECS in complex with tRNA[Ser]Sec has been solved [58], providing substantial information on its function. The interaction of SEPSECS and tRNA[Ser]Sec occurs through the tRNA long acceptor-TΨC arms. There is no published study of Sepsecs knockout mice, but deletion of the gene is reported as embryonic lethal, with significant phenotypes in heterozygotes including cardiovascular, pigmentation, and vision (https://www.mousephenotype.org/data/genes/MGI:1098791, accessed on 17 October 2021).
Deficiency of SEPSECS in humans causes pontocerebellar hypoplasia 2D (PCH2D), a neurological condition characterized by neurodegeneration and epilepsy. Multiple families from diverse ethnic backgrounds carry homozygous or compound heterozygous mutations that show similar clinical characteristics [16,68].
ClinVar currently lists 27 pathogenic or likely pathogenic variants in SEPSECS (Figure 6). Most are protein-truncating variants, and three of them are missense variants. Eleven of those are present in gnomAD and their global allele frequencies are below 0.00004. Many protein-truncating variants observed in gnomAD are not present in ClinVar and have no clinical significance category. The most common is p.Tyr429Ter (g.25125772G>T), which has been observed in three unrelated patients in Finland [69]. All three were compound heterozygous for the same two variants, p.Tyr429Ter and p.Thr325Ser. p.Tyr429Ter is reportedly enriched in the population in Finland [69]. In gnomAD, both variants are observed exclusively within the Finnish population, where the allele frequencies are 0.002791 for p.Tyr429Ter, and 0.0002792 for p.Thr325Ser. The second most common truncating SEPSECS variant in gnomAD is g.25155152C>T, which disrupts the canonical splice acceptor site in intron 4. Interestingly, this variant also appears to be enriched in the Finnish population, with allele frequency 0.002192. Among the 58 carriers in gnomAD, only three are non-Finnish. ClinVar lists its clinical significance as uncertain as it has not been reported as pathogenic or benign. These observations raise the question of whether the Finnish population is enriched with pathogenic variants in SEPSECS.
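To give a sense of scale for the Finnish allele frequencies quoted above, the following Python sketch estimates how often the p.Tyr429Ter/p.Thr325Ser compound heterozygous genotype would be expected to arise. This is a back-of-the-envelope calculation, not an analysis from the cited studies: it assumes Hardy-Weinberg proportions (random mating, no selection), that the two variants lie on different haplotypes, and that the gnomAD Finnish frequencies are representative. The function names are hypothetical.

```python
def carrier_frequency(q):
    """Expected fraction of heterozygous carriers for allele frequency q."""
    return 2 * q * (1 - q)


def compound_het_frequency(q1, q2):
    """Expected fraction of individuals with one copy of each of two
    different variant alleles of the same gene (one per chromosome)."""
    return 2 * q1 * q2


# Finnish allele frequencies quoted in the text.
q_tyr429ter = 0.002791
q_thr325ser = 0.0002792

carriers = carrier_frequency(q_tyr429ter)
compound = compound_het_frequency(q_tyr429ter, q_thr325ser)

print(f"p.Tyr429Ter carriers:   about 1 in {1 / carriers:,.0f}")
print(f"compound heterozygotes: about 1 in {1 / compound:,.0f}")
```

Under these assumptions, roughly 1 in 180 individuals would carry p.Tyr429Ter, while the compound heterozygous genotype would be expected in fewer than 2 per million.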
Figure 6
Pathogenic variants and protein-truncating variants in SEPSECS. Location of ClinVar pathogenic variants (above) and gnomAD protein-truncating variants (below) along the SEPSECS protein sequence. The same aesthetics as Figure 1b are used. SEPSECS transcript: ENST00000382103.
4. Common Genetic Variance
Genetic variation is often shared among many individuals in a population. This common variation reflects coinheritance of haplotypes. Functional annotation of haplotypes (groups of single nucleotide variants) is needed to understand hereditary factors linked to complex disease. Genome-wide association studies (GWAS) are designed to find associations between common genetic variants and particular traits.
The NHGRI-EBI GWAS catalog (https://www.ebi.ac.uk/gwas, accessed on 17 October 2021) is a repository of genetic variant-trait associations from published studies [70]. The catalog currently lists 24 associations involving variants in selenoprotein genes. Most of those variants fall in introns and it is not clear what impact on protein function they might have, or even whether they are the causal variants. Three of them are missense and produce an amino acid change in three selenoproteins. The variant rs225014 (p.Thr92Ala) in DIO2 has been shown to be involved in insulin resistance [71], thyroid functionality [72,73] and the onset and progression of osteoarthritis [74] in humans. DIO2 activates the prohormone thyroxine (T4) into the active thyroid hormone 3,3′,5-triiodothyronine (T3) [2]. The variant p.Thr92Ala is particularly common, with an allele frequency of 0.41. The variant rs5771225 (p.Val3Ala) in SELENOO is associated with late onset Alzheimer’s disease [75] and has an allele frequency of 0.22. SELENOO is a pseudokinase that transfers AMP from ATP to Ser, Thr, and Tyr residues (AMPylation) [76]. Lastly, the variant rs1050450 (p.Pro200Leu) in GPX1, with an allele frequency of 0.28, is associated with decreased hemoglobin levels. Its clinical significance is currently classified as benign according to ClinVar.
Four GWAS on circulating selenium concentrations have been published in recent years [77,78,79,80]. Perhaps not surprisingly, individual variants in selenoproteins did not reach genome-wide significance levels. The authors note that although the selenium measurements used (circulating and toenail selenium) are accepted biomarkers of selenium level, the selenium concentration might not reflect its functional significance in selenium-sufficient populations [79]. Nonetheless, a locus overlapping genes involved in metabolism of sulfur-containing amino acids reached significance level in two independent cohorts [77,79]. The region overlaps with the genes dimethylglycine dehydrogenase (DMGDH) and betaine-homocysteine S-methyltransferase (BHMT), both involved in conversion of homocysteine to methionine. A prospective GWAS also found a strong association of the same locus on chromosome 5 with a greater increase of circulating selenium after supplementation [80]. These studies revealed a link between selenium exposure and homocysteine metabolism.
The main source of selenium and other essential trace elements in humans is the diet. Studies on bioavailability in plants and animals suggest that selenium levels vary widely across world regions, with relatively low selenium in several countries, notably in some parts of China [81,82]. To explore how human populations have adapted to varying levels of dietary selenium throughout history, the signatures of positive selection were assessed by Castellano and collaborators through a survey using genetic polymorphisms in selenoproteins, Cys-containing homologs, and Sec synthesis machinery factors, in 50 human populations [83]. The strongest signals were observed in populations from China. The genes with the largest contributions included selenoproteins DIO2, SELENOS, GPX1, SELENOM and SELENOF, and the Sec machinery factors SEPHS2 and SEPSECS. Several single nucleotide polymorphisms with known functional consequences showed high levels of population differentiation, including the missense substitution p.Thr92Ala in DIO2 [84], and in GPX1, the missense variant p.Pro200Leu, and a noncoding change A to G (rs3811699) in its promoter region. Positive selection in GPX1 in human populations has also been reported in other studies [85,86].
5. Conclusions
Selenocysteine is a genetic trait shared by bacteria, archaea, and eukaryotes. Its origin maps to the root of the tree of life. Throughout evolution, some organisms lost the ability to synthesize selenoproteins and replaced selenocysteine in essential proteins with cysteine, which contains sulfur instead of selenium. Non-selenoprotein cysteine-containing homologs exist for almost all known selenoproteins; they are less costly for the cell to synthesize, but Sec is preferred when it comes to catalysis. It is not fully understood what the advantage of Sec over Cys is. Nevertheless, the presence of 25 selenoproteins in the human genome should convey the requirement for the unique properties of Sec that cannot be compensated by the use of Cys.
Elucidation of the molecular biology behind the insertion of Sec into proteins in response to UGA paved the way for the identification of a genetic association between selenoproteins and human disease twenty years ago. Since then, several disorders caused by selenoprotein deficiency have been described. Their clinical phenotypes are diverse and different systems are affected, which is not surprising, given that selenoproteins carry out diverse functions and are expressed in different tissues. Moreover, global deficiency of selenoproteins caused by defects in the synthesis machinery results in more complex phenotypes. More surprising, however, are the differences observed between patients with defects in different components of the Sec machinery: defects in SECISBP2 and TRU-TCA1-1 affect mainly thyroid function, muscle, and growth, while defects in SEPSECS affect the brain and cause a more severe phenotype.
A subset of the observed stop gain variants introduce a UGA codon. This is a special type of genetic variation when it occurs in selenoprotein genes because selenoprotein mRNAs carrying additional UGA codons may produce proteins with extra Sec residues. Sec insertion can occur at multiple sites [87,88], and the resulting proteins may be partially functional or have a negative gain of function. mRNAs with early termination codons can be targeted by nonsense-mediated decay (NMD) to prevent their translation. However, we previously showed that, in the case of selenoproteins, NMD is less efficient if the stop codon gained is UGA than if it is one of the other two stop codons [47], which may increase insertion of extra Sec residues. TGA gain is observed at multiple sites in all selenoproteins included in this review and is indicated in the corresponding figures.
The advent of genome sequencing, particularly exome sequencing for clinical diagnosis, has led to the identification of dozens of variants that cause disease through deficiency of selenoproteins, either by inactivating individual selenoprotein genes or by disrupting the selenoprotein synthesis pathway. Initiatives like ClinVar and gnomAD have become instrumental for researchers and clinicians by giving access to a vast amount of genomic data, and will help accelerate our understanding of the associations between genetic variation and health. Undoubtedly, more pathogenic variants in selenoproteins will be discovered, which will provide insights into the function of selenoproteins and opportunities to better understand the role of selenium in human health and disease.