Monica Lopes-Marques, Catarina Serrano, Ana R Cardoso, Renato Salazar, Susana Seixas, António Amorim, Luisa Azevedo, Maria J Prata.
Abstract
The gene encoding the cytosolic β-glucosidase GBA3 shows pseudogenization due to a truncated allele (rs358231) that is polymorphic in humans. Since this enzyme is involved in the transformation of many plant β-glycosides, this particular case of gene loss may have been influenced by dietary adaptations during evolution. In humans, apart from the inactivating allele, we found that GBA3 accumulated additional damaging mutations, implying an extensive GBA3 loss. The distribution of loss-of-function alleles differed significantly between human populations, and these differences can be partially related to their staple diets. The analysis of mammalian orthologs disclosed that GBA3 underwent at least nine pseudogenization events. Most pseudogenization events occurred in carnivorous lineages, suggesting a possible link to a β-glycoside-poor diet. However, GBA3 was also lost in omnivorous and herbivorous species, hinting that the physiological role of GBA3 is not fully understood and that other, unknown causes may underlie GBA3 pseudogenization. One such possibility is a putative role in sialic acid biology, where GBA3 participates in a cellular network involving NEU2 and CMAH. Overall, our data show that the recurrent loss of GBA3 in mammals is likely to represent an evolutionary endpoint of the relaxation of selective constraints triggered by diet-related factors.
Year: 2020 PMID: 32665690 PMCID: PMC7360587 DOI: 10.1038/s41598-020-68106-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Frequency distribution of 1000 Genomes Project (Phase 3) variants with negative effects in GBA3 and GBA.
| Variant ID | Mutation | African | American | East Asian | European | South Asian | HMA |
|---|---|---|---|---|---|---|---|
| | p.R213P | 0.001 | 0.007 | 0.029 | 0.011 | 0.001 | |
| rs182102815 | p.G182S | 0.004 | 0.003 | ||||
| | p.D106N | 0.001 | 0.006 | 0.039 | 0.001 | ||
| rs533876334 | p.A15P | 0.001 | |||||
| rs544339352 | p.C53S | 0.001 | |||||
| rs187359066 | p.R82C | 0.001 | |||||
| rs529839966 | p.T88R | 0.001 | |||||
| rs571805473 | p.P265S | 0.001 | |||||
| rs538886341 | p.Y281C | 0.001 | |||||
| rs200660617 | p.V306A | 0.001 | |||||
| rs371662599 | p.Y347X | 0.001 | |||||
| rs200623163 | p.R389C | 0.001 | |||||
| rs560225618 | p.K402E | 0.001 | |||||
| rs186578587 | p.L419V | 0.001 | |||||
| rs371075149 | p.N422K | 0.001 | |||||
| rs191769903 | p.F433L | 0.001 | |||||
| rs370728701 | p.V438A | 0.001 | |||||
| rs421016 | p.L483P (b) | 0.002 | 0.001 | 0.012 | 0.002 | ||
| rs76763715 | p.N409S (b) | 0.001 | 0.002 | ||||
| rs149171124 | p.E427X (b) | 0.001 | |||||
| rs146519305 | p.R534C | 0.01 | |||||
| rs369068553 | p.V499M (b) | 0.001 | |||||
HMA: frequency of individuals homozygous for the minor allele (all 1KGP populations combined). (a) Protein variant mislabeled in databases as "loss of stop codon". (b) Variants associated with Gaucher disease in ClinVar.
Figure 1. Gene annotation of GBA3. Schematic representation of the mutations identified in GBA3 in different taxonomic groups; each cluster of grey squares represents one exon (5 exons in total). Square color code: red, stop codons; yellow, loss of splice site AG-GT or AG-GC (note: exon 2 presents a conserved GC donor splice site in all species except Trichechus manatus latirostris); blue, deletion; green, insertion. The number within each square indicates the number of nucleotides inserted or deleted, and dark grey squares represent regions with missing data. Mutations conserved across species are highlighted by black arrowheads; below these, the three adjacent amino acids preceding the observed stop codon (X) are shown.
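The two kinds of disabling mutation annotated in Figure 1 (premature stop codons, and insertions or deletions of a given length) can be illustrated with a minimal Python sketch. It assumes the standard nuclear genetic code and a coding sequence that is already in frame. The function names `first_premature_stop` and `indel_shifts_frame` are hypothetical helpers introduced here for illustration and are not taken from the study's annotation pipeline.

```python
# Illustrative sketch only; helper names are hypothetical, not from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear genetic code (assumption)


def first_premature_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon
    that occurs before the final codon, or None if there is none."""
    cds = cds.upper()
    n_codons = len(cds) // 3
    # The last codon is skipped: a stop there is the normal terminator.
    for i in range(n_codons - 1):
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i
    return None


def indel_shifts_frame(n_nucleotides):
    """An insertion or deletion shifts the reading frame unless its
    length is a multiple of three."""
    return n_nucleotides % 3 != 0


if __name__ == "__main__":
    print(first_premature_stop("ATGTAAGGGTGA"))  # stop at the second codon
    print(indel_shifts_frame(1), indel_shifts_frame(3))
```

Under these assumptions, a 1- or 2-nucleotide indel is frame-disrupting, whereas a 3-nucleotide indel removes or adds a codon without shifting the frame.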
Figure 2. Selection and phylogenetic analysis. Maximum likelihood phylogenetic analysis of GBA3 nucleotide sequences; node values correspond to posterior probabilities (aBayes). Species contained in the collapsed clades are listed in Supplementary Table 5. Clades analysed in CodeML are indicated by grey boxes containing the corresponding clade letter (A1, A2, A3, B1, B2, C1 and D1) and the omega values. For the Cetacea RELAX analysis, the K value is also indicated with the corresponding p-value. * indicates sequences predicted manually in unannotated or poorly annotated genomes, and Ψ indicates pseudogenes.
Likelihood ratio test (LRT) and p-values.
| Null hypothesis tested (H0) | Alternative hypothesis (HA) | df | LRT | p-value |
|---|---|---|---|---|
| ω0 = ωA2 = ωA3 = ωB1 = ωBranchB1 = ωB2 = ωC1 = ωD1 | | 1 | 30.4 | < 0.05 |
| ω0 = ωA1 = ωBranchA1 = ωA3 = ωB1 = ωBranchB1 = ωB2 = ωC1 = ωD1 | | 1 | 0.68 | 0.41 |
| ω0 = ωA1 = ωBranchA1 = ωA2 = ωB1 = ωBranchB1 = ωB2 = ωC1 = ωD1 | | 1 | 0.36 | 0.55 |
| ω0 = ωA1 = ωBranchA1 = ωA2 = ωA3 = ωB2 = ωC1 = ωD1 | | 1 | 35 | < 0.05 |
| ω0 = ωA1 = ωBranchA1 = ωA2 = ωA3 = ωB1 = ωBranchB1 = ωC1 = ωD1 | | 1 | 0.98 | 0.32 |
| ω0 = ωA1 = ωA2 = ωA3 = ωB1 = ωBranchB1 = ωB2 = ωC1 = ωD1 | | 1 | 3.56 | 0.06 |
| ω0 = ωA1 = ωBranchA1 = ωA2 = ωA3 = ωB1 = ωB2 = ωC1 = ωD1 | | 1 | 0.06 | 0.81 |
| ω0 = ωA1 = ωBranchA1 = ωA2 = ωA3 = ωB1 = ωB2 = ωBranchB1 = ωD1 | | 1 | 0.47 | 0.49 |
| ω0 = ωA1 = ωBranchA1 = ωA2 = ωA3 = ωB1 = ωB2 = ωBranchB1 = ωC1 | | 1 | 1.83 | 0.18 |
| ω0 = ωA2 = ωA3 = ωB2 = ωC1 = ωD1 | | 1 | 67.07 | < 0.05 |
| ω0 = ωA2 = ωA3 = ωB2 = ωC1 = ωD1 | | 2 | 67.76 | < 0.05 |
| ω0 = ωA2 = ωA3 = ωB2 = ωC1 = ωD1 | | | | |
| ω0 = ωA2 = ωA3 = ωB2 = ωC1 = ωD1 | | 4 | 74.62 | < 0.05 |
| | | 1 | 0.70 | 0.40 |
| | | 3 | 7.56 | 0.06 |
| | | 1 | 0.00 | 1.00 |
P-values < 0.05 were considered significant. Clades: A1, Pinnipedia; A2, Canidae; A3, Feliformia; B1, Cetacea; B2, Cetacea; C1, Chiroptera; D1, Rodentia. Ancestral branches: Branch A1, Pinnipedia; Branch B1, Cetacea.
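The relation between the LRT statistic, the degrees of freedom and the p-value in the table above can be sketched in Python. This is a minimal illustration, assuming the usual chi-square approximation for nested models (the record does not state how the p-values were obtained). The function `chi2_sf` is a hypothetical helper written with the standard library only, using the closed-form chi-square tail for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with an
    integer number of degrees of freedom (closed-form series)."""
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + total


if __name__ == "__main__":
    # (LRT, df) pairs taken from rows of the table above.
    for lrt, df in [(0.68, 1), (3.56, 1), (7.56, 3), (30.4, 1)]:
        print(f"LRT={lrt}, df={df}, p={chi2_sf(lrt, df):.3g}")
```

Under this assumption, LRT = 0.68 with df = 1 gives p of about 0.41 and LRT = 7.56 with df = 3 gives p of about 0.06, in line with the tabulated values. Where SciPy is available, `scipy.stats.chi2.sf` computes the same tail probability.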
Figure 3. Gene annotation of NEU2. Schematic representation of the NEU2 gene and the mutations identified; each group of grey squares represents one exon (2 exons in total). Square color code: red, stop codons; yellow, loss of canonical AG-GT splice site; blue, deletion; green, insertion. The number within each square indicates the number of nucleotides inserted or deleted, and dark grey squares represent regions with missing data. Mutations conserved across species are highlighted by black arrowheads; below these, the three adjacent amino acids preceding the observed stop codon (X) are shown.
Figure 4. (A) Schematic illustration of the roles proposed for GBA3 and NEU2 in sialic acid metabolism. (B) Coding status of GBA3, NEU2 and CMAH in mammals: P indicates a polymorphic pseudogene, X indicates gene loss, and * indicates a coding ORF previously reported as lost.