Safoura Khamse, Samira Alizadeh, Stephan H Bernhart, Hossein Afshar, Ahmad Delbari, Mina Ohadi.
Abstract
The human SBF1 (SET binding factor 1) gene, alternatively known as MTMR5, is predominantly expressed in the brain, and its epigenetic dysregulation is linked to late-onset neurocognitive disorders (NCDs), such as Alzheimer's disease. This gene contains a (GCC)-repeat in the interval between +1 and +60 of the transcription start site (SBF1-202 ENST00000380817.8). We sequenced the SBF1 (GCC)-repeat in a sample of 542 Iranian individuals, comprising individuals with late-onset NCDs (N = 260) and controls (N = 282). While multiple alleles were detected at this locus, the 8- and 9-repeat alleles predominated, together forming >95% of the allele pool in each of the two groups. Among a number of anomalies, the allele distribution was significantly different in the NCD group versus controls (Fisher's exact p = 0.006), primarily as a result of enrichment of the 8-repeat in the former. The genotype distribution departed from the Hardy-Weinberg principle in both groups (p < 0.001), and was significantly different between the two groups (Fisher's exact p = 0.001). We detected a significantly low frequency of the 8/9 genotype in both groups, a higher frequency of this genotype in the NCD group, and a reverse order of the 8/8 versus 9/9 genotypes in the NCD group versus controls. Biased heterozygous/heterozygous ratios were also detected for the 6/8 versus 6/9 genotypes (in favor of 6/8) across the human samples studied (Fisher's exact p = 0.0001). Bioinformatics studies revealed that the number of (GCC)-repeats may change the RNA secondary structure and interaction sites, at least across human exon 1. This STR was specifically expanded beyond 2 repeats in primates. In conclusion, we report indication of a novel biological phenomenon, in which there is selection against certain heterozygous genotypes at an STR locus in humans. We also report different allele and genotype distributions at this STR locus in late-onset NCD versus controls. In view of the location of this STR in the 5' untranslated region, RNA/RNA or RNA/DNA heterodimer formation of the involved genotypes and alternative RNA processing and/or translation should be considered.
Year: 2022 PMID: 36104480 PMCID: PMC9474449 DOI: 10.1038/s41598-022-19878-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Allele distribution of the human SBF1 (GCC) repeat in the NCD and control groups.
| Allele | Controlsᵃ, count (%) | NCDsᵃ, count (%) | Total, count (%) |
|---|---|---|---|
| 5-repeat | 0 (0.0%) | 1 (0.2%) | 1 (0.1%) |
| 6-repeat | 16 (2.8%) | 12 (2.3%) | 28 (2.6%) |
| 7-repeat | 1 (0.2%) | 0 (0.0%) | 1 (0.1%) |
| 8-repeat | 224 (39.7%) | 256 (49.2%) | 480 (44.3%) |
| 9-repeat | 313 (55.5%) | 248 (47.7%) | 561 (51.8%) |
| 10-repeat | 10 (1.8%) | 3 (0.6%) | 13 (1.2%) |
| Total | 564 (100.0%) | 520 (100.0%) | 1084 (100.0%) |

ᵃ Fisher's exact p = 0.006. Counts and percentages are given within each group.
Figure 1. Allele frequency of the SBF1 (GCC)-repeat in the human samples studied. While multiple alleles were detected, the 8- and 9-repeat alleles were predominantly abundant. Significant excess of the 8-repeat was detected in the NCD group versus controls.
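As a rough illustration of the 8-repeat excess, the allele counts can be collapsed into a 2×2 table (8-repeat versus all other alleles, NCDs versus controls) and tested with a two-sided Fisher's exact test. This is a simplification introduced here, not the authors' analysis: the reported p = 0.006 refers to the full allele-by-group table, so the value below is not expected to match it. The function name `fisher_exact_2x2` is hypothetical, and only the Python standard library is used.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    extreme = sum(w for w in map(weight, range(lowest, highest + 1)) if w <= observed)
    return extreme / comb(row1 + row2, col1)


# Allele counts taken from the table above (assumed collapse: 8-repeat vs. rest)
ncd_8, ncd_other = 256, 520 - 256
ctrl_8, ctrl_other = 224, 564 - 224

p_value = fisher_exact_2x2(ncd_8, ncd_other, ctrl_8, ctrl_other)
print(f"8-repeat vs. other alleles, two-sided Fisher's exact p = {p_value:.4f}")
```

The weights are compared as exact integers, which avoids the floating-point tolerance usually needed when deciding which tables count as "at least as extreme".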
Genotype distribution of the human SBF1 (GCC) repeat in the NCD and control groups.
| Genotype | Controlsᵃ, count (%) | NCDsᵃ, count (%) | Total, count (%) |
|---|---|---|---|
| 5/6 | 0 (0.0%) | 1 (0.4%) | 1 (0.2%) |
| 6/8 | 12 (4.3%) | 11 (4.2%) | 23 (4.2%) |
| 6/9 | 4 (1.4%) | 0 (0.0%) | 4 (0.7%) |
| 7/8 | 1 (0.4%) | 0 (0.0%) | 1 (0.2%) |
| 8/8 | 93 (33.0%) | 100 (38.5%) | 193 (35.6%) |
| 8/9 | 23 (8.2%) | 45 (17.3%) | 68 (12.5%) |
| 8/10 | 2 (0.7%) | 0 (0.0%) | 2 (0.4%) |
| 9/9 | 141 (50.0%) | 101 (38.8%) | 242 (44.6%) |
| 9/10 | 4 (1.4%) | 1 (0.4%) | 5 (0.9%) |
| 10/10 | 2 (0.7%) | 1 (0.4%) | 3 (0.6%) |
| Total | 282 (100.0%) | 260 (100.0%) | 542 (100.0%) |

ᵃ Fisher's exact p = 0.001. Counts and percentages are given within each group.
Figure 2. Genotype frequency of the SBF1 (GCC)-repeat in the human samples studied. The genotype distribution departed from the Hardy-Weinberg principle (HWP) in both groups and was different between the two groups.
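The departure from the Hardy-Weinberg principle can be made concrete by comparing the observed 8/9 heterozygote counts with the counts expected from the allele frequencies. The sketch below is only an illustration built from the genotype counts reported above; it is not the authors' test procedure, and the helper names `allele_counts` and `hw_expected` are hypothetical.

```python
from collections import Counter

# Genotype counts as reported in the genotype table (zero-count genotypes omitted)
controls = {(6, 8): 12, (6, 9): 4, (7, 8): 1, (8, 8): 93, (8, 9): 23,
            (8, 10): 2, (9, 9): 141, (9, 10): 4, (10, 10): 2}
ncds = {(5, 6): 1, (6, 8): 11, (8, 8): 100, (8, 9): 45, (9, 9): 101,
        (9, 10): 1, (10, 10): 1}


def allele_counts(genotypes):
    """Count alleles: each individual contributes both alleles of its genotype."""
    counts = Counter()
    for (first, second), n in genotypes.items():
        counts[first] += n
        counts[second] += n
    return counts


def hw_expected(genotypes, first, second):
    """Expected genotype count under Hardy-Weinberg proportions."""
    alleles = allele_counts(genotypes)
    total_alleles = sum(alleles.values())   # 2N
    individuals = total_alleles / 2         # N
    p = alleles[first] / total_alleles
    q = alleles[second] / total_alleles
    if first == second:
        return individuals * p * p          # homozygote: N * p^2
    return 2 * individuals * p * q          # heterozygote: 2 * N * p * q


for name, group in (("Controls", controls), ("NCDs", ncds)):
    expected = hw_expected(group, 8, 9)
    print(f"{name}: 8/9 observed = {group[(8, 9)]}, expected = {expected:.1f}")
```

With these counts, roughly 124 heterozygous 8/9 individuals would be expected among controls and roughly 122 in the NCD group, against 23 and 45 observed, which is the deficit the abstract describes.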
Figure 3. Identification of a genotype at the short extreme of the allele range in one instance of late-onset NCD.
Figure 4. Accessibility (probability of being unpaired) of all regions of 10 nt length, ending at base x, for the first exon of human SBF1 with 5 to 10 repeats. Differences in 3 regions were detected, at about nt 50, about nt 200, and about nt 220.
Interaction groups across various human SBF1 (GCC)-repeats.ᵃ
| Group | Interaction structure | Interacting sequences | Repeat lengths |
|---|---|---|---|
| Group 1 | ((((((.(((((&))))).)))))) | CGUGCUGGUGGC&GCCAUGAGCGCG | 5 vs 5, 5 vs 6, 5 vs 7, 5 vs 8, 5 vs 9, 5 vs 10, 6 vs 6, 6 vs 7, 6 vs 8, 8 vs 8 |
| Group 2 | (((((((((.((..(((((&)))))..)).))))))))) | GCCAUGGCGCGGCUCGCGG&CCGCGUCCCUCGCCAUGGC | 6 vs 9, 6 vs 10, 7 vs 7, 7 vs 8, 7 vs 9, 7 vs 10, 8 vs 9, 8 vs 10, 9 vs 9, 9 vs 10, 10 vs 10 |

ᵃ Repeat lengths are listed with their interaction structure and sequences in bracket-dot notation: matching parentheses are the opening and closing bases of a base pair, dots are unpaired bases, and "&" separates the two interacting sequences.
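To show how the bracket-dot notation in the footnote is read, the sketch below pairs each opening parenthesis with its closing partner across the "&" separator and lists the resulting base pairs. It is a minimal reader written for this illustration (the function name `interaction_pairs` is hypothetical) and does not predict structures; the structures and sequences are copied from the two interaction groups above.

```python
def interaction_pairs(structure, sequences):
    """Return the base pairs encoded by a two-strand bracket-dot string.

    '(' opens a base pair, ')' closes the most recently opened one,
    '.' is an unpaired base, and '&' separates the two strands.
    """
    left_struct, right_struct = structure.split("&")
    left_seq, right_seq = sequences.split("&")
    if len(left_struct) != len(left_seq) or len(right_struct) != len(right_seq):
        raise ValueError("structure and sequence lengths differ")

    flat_struct = left_struct + right_struct
    flat_seq = left_seq + right_seq
    open_positions, pairs = [], []
    for position, symbol in enumerate(flat_struct):
        if symbol == "(":
            open_positions.append(position)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced closing parenthesis")
            partner = open_positions.pop()
            pairs.append((flat_seq[partner], flat_seq[position]))
    if open_positions:
        raise ValueError("unbalanced opening parenthesis")
    return pairs


# Watson-Crick pairs plus the G-U wobble pair
ALLOWED = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

group1 = interaction_pairs("((((((.(((((&))))).))))))",
                           "CGUGCUGGUGGC&GCCAUGAGCGCG")
group2 = interaction_pairs("(((((((((.((..(((((&)))))..)).)))))))))",
                           "GCCAUGGCGCGGCUCGCGG&CCGCGUCCCUCGCCAUGGC")

for name, pairs in (("Group 1", group1), ("Group 2", group2)):
    valid = all(pair in ALLOWED for pair in pairs)
    print(f"{name}: {len(pairs)} base pairs, all canonical or wobble: {valid}")
```

Read this way, the Group 1 interaction contains 11 base pairs and the Group 2 interaction 16, all of them Watson-Crick or G-U wobble pairs.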
Figure 5. Sequence alignment of the SBF1 (GCC)-repeat across selected vertebrate species. The (GCC)-repeat expanded beyond 2 repeats in primates.