Gabriela Canalli Kretzschmar1, Nina Moura Alencar1, Saritha Suellen Lopes da Silva2, Carla Daniela Sulzbach2, Caroline Grisbach Meissner1, Maria Luiza Petzl-Erler1, Ricardo Lehtonen R Souza2, Angelica Beate Winter Boldt1.
Abstract
Several genome-wide association studies (GWAS) of late-onset Alzheimer's disease (LOAD) have been carried out, mainly in European and Asian populations. They have associated many polymorphisms with the disease, but several of these associations lack a functional explanation. GWAS are fundamental for identifying disease-associated loci, although they often do not point to causal polymorphisms. Functional investigations are therefore essential for establishing causality, although a failed functional validation does not necessarily rule causality out. Furthermore, the allele frequencies of associated genetic variants may vary widely between populations, so these associations need to be replicated in other ethnicities. Our study therefore sought to replicate, in 150 AD patients and 114 elderly controls from the South Brazilian population, 18 single-nucleotide polymorphisms (SNPs) associated with AD in European GWAS, followed by functional investigation of the associated SNPs using bioinformatic tools. Of the 18 SNPs investigated, only four were associated with the disease in our population: rs769449 (APOE), rs10838725 (CELF1), and rs6733839 and rs744373 (BIN1-CYP27C1). We identified 54 variants in linkage disequilibrium (LD) with the associated SNPs, most of which act as expression or splicing quantitative trait loci (eQTLs/sQTLs) in genes previously associated with AD or with a possible functional role in the disease, such as CELF1, MADD, MYBPC3, NR1H3, NUP160, SPI1, and TOMM40. Interestingly, eight of these variants are located within long non-coding RNA (lncRNA) genes that have not previously been investigated in AD. Some of these polymorphisms can change the secondary structures of these lncRNAs, leading to either loss or gain of microRNA (miRNA)-binding sites and deregulating downstream pathways.
Our pioneering work not only replicated LOAD associations for polymorphisms not previously associated with the disease in the Brazilian population but also identified six lncRNAs that may interfere with LOAD development. These results underscore the importance of functionally exploring the associations found in large-scale association studies across different populations, as a basis for personalized and inclusive medicine in the future.
Keywords: APOE; Alzheimer’s disease; BIN1; CELF1; GWAS; lncRNA; miRNA
Year: 2021 PMID: 34291080 PMCID: PMC8287568 DOI: 10.3389/fmolb.2021.632314
Source DB: PubMed Journal: Front Mol Biosci ISSN: 2296-889X
FIGURE 1. Workflow. This study is divided into two stages: 1, association study; 2, in silico analysis. Only the variants associated in the studied population (South Brazilian) proceeded to in silico analysis. SNP: single-nucleotide polymorphism; LOAD: late-onset Alzheimer’s disease; GWAS: genome-wide association studies; eQTL: expression quantitative trait locus; sQTL: splicing quantitative trait locus; LD: linkage disequilibrium; lncRNAs: long non-coding RNAs; miRNAs: microRNAs.
Demographic and clinical characteristics of research participants.
| Variable | Controls | Patients |
|---|---|---|
| Male (%) | 28 (24.8) | 52 (34.7) |
| Average age (min–max) | 70.8 (60–99) | 75.6 (60–90) |
| | 20 (17.7) | 70 (47.9) |
| Euro-Brazilian (%) | 92 (80.7) | 120 (80) |
| Admixed (%) | 20 (17.5) | 28 (18.7) |
| Indeterminate (%) | 2 (1.8) | 2 (1.3) |
Ancestry was self-reported. The proportions agree with the South Brazilian population’s actual genomic composition (Lima-Costa et al., 2015). We emphasize that Euro-Brazilian participants are descendants of Europeans but are admixed. APOE: apolipoprotein E.
Single-nucleotide polymorphisms (SNPs) selected for this study and their minor allele frequencies.
| Region/closest gene | SNP | Alleles (maj./min.) | CHR | Position GRCh38.p12 | Region | Control (%) | Control N& | Patients (%) | Patients N& | CEU* (%) | TSI* (%) | IBS* (%) | YRI* (%) | AD-associated allele | References |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ABCA7 | rs4147929 | G/A | 19 | 1063444 | Intron | 17.7 | 226 | 17.0 | 300 | 18 | 22 | 23 | 0 | A | |
| ADAMTS9-AS2 | rs704454 | T/C | 3 | 64941350 | Intron | 24.7 | 218 | 25.3 | 288 | 26 | 30 | 29 | 19 | ? | |
| APOE | rs769449 | G/A | 19 | 44906745 | Intron | 7.14 | 224 | 20.6 | 296 | 15 | 9 | 8 | 0 | A | |
| BIN1-CYP27C1 | rs6733839 | C/T | 2 | 127135234 | Intergenic | 33.04 | 224 | 40.0 | 300 | 40 | 42 | 31 | 39 | T | |
| BIN1-CYP27C1 | rs744373 | A/G | 2 | 127137039 | Intergenic | 28.3 | 222 | 34.8 | 290 | 30 | 29 | 25 | 56 | ? | |
| CD2AP | rs10948363 | A/G | 6 | 47520026 | Intron | 25.6 | 226 | 26.3 | 300 | 28 | 24 | 24 | 10 | G | |
| CD33 | rs3865444 | C/A | 19 | 51224706 | Promoter | 32.6 | 224 | 33.4 | 296 | 32 | 27 | 29 | 1 | ? | |
| CELF1 | rs10838725 | T/C | 11 | 47536319 | Intron | 30.7 | 228 | 28.7 | 300 | 34 | 34 | 29 | 0 | C | |
| CLU | rs11136000 | C/T | 8 | 27607002 | Intron | 41.6 | 226 | 40.7 | 300 | 37 | 40 | 36 | 60 | C | |
| CTNNA2 | rs2974151 | C/G | 2 | 79926171 | Intron | 14.6 | 226 | 12.2 | 296 | 12 | 15 | 11 | 15 | ? | |
| EPHA1 | rs11771145 | G/A | 7 | 143413669 | Intron | 36.7 | 226 | 32.7 | 300 | 35 | 36 | 35 | 59 | ? | |
| INPP5D | rs35349669 | C/T | 2 | 233159830 | Intron | 38.5 | 226 | 36.3 | 300 | 42 | 43 | 40 | 5 | T | |
| MS4A6A | rs610932 | G/T | 11 | 60171834 | Intergenic | 41.5 | 224 | 41.2 | 296 | 52 | 53 | 43 | 42 | ? | |
| PICALM | rs3851179 | C/T | 11 | 86157598 | Intergenic | 33.6 | 226 | 33.4 | 296 | 43 | 36 | 35 | 5 | C | |
| ZCWPW1 | rs1476679 | T/C | 7 | 100406823 | Intron | 20.8 | 226 | 23.0 | 300 | 32 | 31 | 21 | 0 | ? | |
| PTK2B | rs28834970 | T/C | 8 | 27337604 | Intron | 35.0 | 226 | 35.3 | 300 | 34 | 33 | 35 | 18 | C | |
| SLC24A4 | rs10498633 | G/T | 14 | 92460608 | Intron | 17.3 | 226 | 18.0 | 300 | 18 | 16 | 22 | 11 | ? | |
| SNCA | rs3857059 | A/G | 4 | 89754087 | Intron | 19.9 | 226 | 17.1 | 292 | 8 | 7 | 7 | 65 | ? | |
A: although this SNP has not been associated with AD in GWAS, it is commonly associated with Parkinson’s disease (PD) in GWAS. We selected this SNP to see if we would find any association with LOAD in our population. Allele frequencies of the investigated SNPs in three European populations and one African population are presented for comparative purposes. SNP: single-nucleotide polymorphism; ?: no information or both alleles were AD-associated in independent studies; maj.: major allele; min.: minor allele; CEU: population of Utah with northern and western European ancestry; TSI: population of Toscana, Italy; IBS: Iberian population, Spain; YRI: African population of Yoruba. *Allele frequencies according to data from the 1000 Genomes Project (Consortium, 2010); ABCA7: ATP-binding cassette subfamily A member 7; ADAMTS9-AS2: ADAMTS9 antisense RNA 2; APOE: apolipoprotein E; BIN1: bridging integrator 1; CYP27C1: cytochrome P450 family 27 subfamily C member 1; CD2AP: CD2-associated protein; CD33: CD33 molecule; CELF1: CUGBP Elav-like family member 1; CLU: clusterin; CTNNA2: catenin alpha 2; EPHA1: EPH receptor A1; INPP5D: inositol polyphosphate-5-phosphatase D; MS4A6A: membrane-spanning 4-domains A6A; PICALM: phosphatidylinositol-binding clathrin assembly protein; ZCWPW1: zinc finger CW-type and PWWP domain containing 1; PTK2B: protein tyrosine kinase 2 beta; SLC24A4: solute carrier family 24 member 4; SNCA: synuclein alpha; & = number of chromosomes. Some samples were excluded due to low genotyping quality. The maximum sample number was 150 patients (300 chromosomes) and 114 elderly controls (228 chromosomes).
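As a rough illustration of how the minor allele frequencies and chromosome counts in the table relate to an effect size, the sketch below converts them back into approximate allele counts and computes an unadjusted allelic odds ratio with a Woolf (log-based) 95% confidence interval. This is not the analysis used in the study, which relied on logistic regression; the helper names `allele_counts` and `allelic_odds_ratio` are hypothetical, and the recovered counts are only as precise as the rounded percentages allow.

```python
import math

def allele_counts(maf_percent, n_chromosomes):
    """Recover approximate (minor, major) allele counts from a rounded MAF."""
    minor = round(maf_percent / 100 * n_chromosomes)
    return minor, n_chromosomes - minor

def allelic_odds_ratio(case_minor, case_major, ctrl_minor, ctrl_major):
    """Unadjusted allelic odds ratio with a Woolf 95% confidence interval."""
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    se_log_or = math.sqrt(
        1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major
    )
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# rs769449 (APOE): MAF and chromosome counts as listed in the table
ctrl_minor, ctrl_major = allele_counts(7.14, 224)
case_minor, case_major = allele_counts(20.6, 296)
or_, lo, hi = allelic_odds_ratio(case_minor, case_major, ctrl_minor, ctrl_major)
print(f"allelic OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An allelic comparison of this kind ignores covariates and genotype models, so it should be read only as a sanity check on the direction of the frequency difference, not as a substitute for the regression results reported below.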
Results of univariate analysis for all available variables.
| Independent variable | OR | 95% CI | p-value |
|---|---|---|---|
| Ethnicity | 0.93 | 0.49–1.76 | 0.827 |
| Sex | 0.62 | 0.36–1.07 | |
| Schooling | 0.88 | 0.73–1.07 | |
| Smoking habit | 2.28 | 1.27–4.09 | |
| Alcoholism | 3.24 | 1.45–7.22 | |
| Diabetes | 0.81 | 0.45–1.45 | 0.476 |
| Cholesterol | 0.75 | 0.44–1.29 | 0.299 |
| Hypertension | 0.71 | 0.40–1.25 | 0.232 |
| BMI | 0.65 | 0.45–0.96 | |
| AD in family | 4.87 | 2.25–10.53 | |
| | 1.85 | 0.73–4.66 | |
| | 0.13 | 0.03–0.57 | |
| | 4.28 | 2.39–7.66 | |
Variables with p-values lower than 0.220 were considered for multivariate regression analysis (in bold); BMI: body mass index.
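Where only an odds ratio and its 95% confidence interval are at hand, a p-value can be approximated by treating the interval as a Wald interval on the log-odds scale. The sketch below is a generic back-calculation under that assumption, not the study's STATA output; `wald_p_from_ci` is a hypothetical helper, and the result is approximate because the published ORs and CIs are rounded.

```python
import math

def wald_p_from_ci(odds_ratio, ci_lower, ci_upper, z_crit=1.96):
    """Approximate two-sided p-value from an OR and its 95% Wald CI."""
    # Standard error of log(OR), recovered from the width of the interval
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided standard normal tail probability P(|Z| > |z|)
    return math.erfc(abs(z) / math.sqrt(2))

# Rows taken from the univariate table: (OR, CI lower, CI upper)
rows = {
    "Ethnicity": (0.93, 0.49, 1.76),      # reported p = 0.827
    "Diabetes": (0.81, 0.45, 1.45),       # reported p = 0.476
    "Smoking habit": (2.28, 1.27, 4.09),
}
for name, (or_, lo, hi) in rows.items():
    p = wald_p_from_ci(or_, lo, hi)
    status = "below" if p < 0.220 else "not below"
    print(f"{name}: approximate p = {p:.3f} ({status} the 0.220 threshold)")
```

For the rows whose p-values are reported, the approximation lands within about 0.01 of the published figure, which suggests the same back-calculation gives a reasonable estimate for the other rows.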
Significant results of the polymorphisms investigated with LOAD.
| Region | SNP | Genotype/model | OR | 95% CI | P | Pc# | IV | HWE (CON) | HWE (PAT) |
|---|---|---|---|---|---|---|---|---|---|
| APOE | rs769449 | A/A* | – | – | – | | | 1 | 1 |
| | | A/G | 0.84 | 0.32–2.25 | 0.736 | | | | |
| | | G/G | | | | | | | |
| | | A+ | | | | | | | |
| | | G+* | – | – | – | | | | |
| | | Additive model | | | | | | | |
| BIN1-CYP27C1 | rs6733839 | C/C | 0.55 | 0.29–1.05 | | | BMI, e4 | 0.281 | 0.865 |
| | | C/T | 1.30 | 0.70–2.41 | 0.406 | | | | |
| | | T/T | 1.96 | 0.76–5.05 | 0.165 | | | | |
| | | C+ | 0.51 | 0.20–1.32 | 0.165 | | | | |
| | | T+ | 1.81 | 0.95–3.43 | | | BMI, e4 | | |
| | | Additive model | | | | | | | |
| BIN1-CYP27C1 | rs744373 | A/A | 0.69 | 0.36–1.29 | 0.246 | | | 0.242 | 0.856 |
| | | A/G | 0.98 | 0.51–1.86 | 0.948 | | | | |
| | | G/G | | | | | | | |
| | | A+ | 0.32 | 0.10–1.0 | | | BMI, e4 | | |
| | | G+ | 1.45 | 0.77–2.74 | 0.246 | | | | |
| | | Additive model | 1.55 | 0.96–2.51 | 0.076 | | BMI, e4 | | |
| CELF1 | rs10838725 | C/C | | | | | | 0.660 | 0.110 |
| | | C/T | 1.48 | 0.80–2.75 | 0.212 | | | | |
| | | T/T | 0.95 | 0.52–1.75 | 0.875 | | | | |
| | | C+ | 1.05 | 0.57–1.93 | 0.875 | | | | |
| | | T+ | | | | | | | |
| | | Additive model | 0.83 | 0.50–1.36 | 0.453 | | | | |
The values are the result of logistic regression performed by STATA. Bold: significant p-value; underline: trend; OR: odds ratio; CI: confidence interval; P= p-value; Pc#: p-value corrected for false discovery rate; IV: independent variable; HWE: Hardy–Weinberg equilibrium; PAT: patients; CON: controls; BMI: body mass index; +: allele carrier; *: it is not possible to calculate since all the controls have the rs769449*G allele. APOE: apolipoprotein E; BIN1: bridging integrator 1; CYP27C1: cytochrome P450 family 27 subfamily C member 1; CELF1: CUGBP Elav-like family member 1. All results are in Supplementary Table S2.
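The Pc column holds p-values corrected for the false discovery rate. The source does not state which procedure was used; the sketch below assumes the Benjamini-Hochberg step-up method, one common choice, and the input p-values are hypothetical placeholders rather than values from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the tests, sorted by ascending raw p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four tests (not taken from the study)
raw = [0.001, 0.020, 0.030, 0.400]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p = {p:.3f} -> FDR-adjusted = {q:.3f}")
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps a smaller raw p-value from ending up with a larger adjusted value than its neighbour.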
Characterization of lncRNAs potentially involved in AD.
| lncRNA ID | Other IDs | Position (GRCh38) | Genes within 2 Kb | Class. | BP | Variant | miRNA target |
| |
|---|---|---|---|---|---|---|---|---|---|
| Gain | Loss | ||||||||
|
|
| chr11:47602805-47611134 | - | lincRNA | 1,280 | rs71457224 | - | - | - |
| rs10769282 | - | hsa-miR-373-5p | 0.6488 | ||||||
|
|
| chr19:44909374-44914968 |
| ? | 5,594 | rs10414043 | hsa-miR-5089-3p | hsa-miR-1273g-3p | 0.3916 |
| hsa-miR-4252 | |||||||||
|
| hsa-miR-1227-3p | ||||||||
| rs7256200 | - | hsa-miR-4284 | 0.2019 | ||||||
|
|
| chr19:44907906-44909013 |
| Antisense | 526 | rs429358 | hsa-miR-4479 | hsa-miR-147b | 0.0666 |
|
| - | chr19:44907758-44909389 |
| ? | 1,051 | rs429358 | hsa-miR-6869-3p | - | 0.0666 |
|
| HSALNT0039381 | chr2:127133598-127135107 | - | lincRNA | 1,509 | rs4663105 | hsa-miR-6776-5p | hsa-miR-6839-3p | 0.9863 |
| hsa-miR-4455 | |||||||||
|
| - | chr2:127116083-127139365 |
| lincRNA | 4,459 | rs744373 | hsa-miR-5008-5p | hsa-miR-2467-5p | 0.9571 |
| hsa-miR-657 | |||||||||
| hsa-miR-6822-3p | |||||||||
| rs730482 | - | hsa-miR-192-5p | 0.4906 | ||||||
| hsa-miR-215-5p | |||||||||
| hsa-miR-4766-3p | |||||||||
| hsa-miR-1224-3p | |||||||||
Class.: lncRNA classification; bp: base pairs; * p-value of the possibility of SNP impacting the lncRNA structure [this p-value is empirical, being generated in silico, through the position of the SNP, the GC content of the molecule, and the size of the sequence (p < 0.2 = possibly harmful)].
FIGURE 2. Prediction of secondary structures of lncRNAs possibly involved in AD. The secondary structures were predicted with the RNAfold web server, which is based on the Vienna RNA package, considering the minimum free energy (MFE) and positional entropy. Structures were generated for both alleles (alleles in bold are deemed to have a possibly harmful role). The impact of each variant on the structure was assessed by visual inspection of the change in the molecule and by the p-value provided by lncRNASNP2. This p-value is empirical, being generated in silico from the SNP’s position, the GC content of the molecule, and the length of the sequence (p < 0.2 = possibly harmful). On the positional entropy scale, low entropy (red) indicates little structural flexibility, making the prediction more reliable, while regions of high entropy (violet) may adopt several alternative structures.
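Positional entropy summarizes how spread out the base-pairing probabilities are at each nucleotide. The sketch below follows one common definition (Shannon entropy over all pairing partners plus the unpaired state) and is not the RNAfold web server's own code; the function name, the input format, and the toy probabilities are all hypothetical.

```python
import math

def positional_entropy(length, pair_probs):
    """Shannon entropy (in bits) per position from base-pair probabilities.

    pair_probs maps (i, j), with 1 <= i < j <= length, to the probability
    that position i pairs with position j. Being unpaired is treated as
    one additional outcome at every position.
    """
    outcomes = {pos: [] for pos in range(1, length + 1)}
    for (i, j), p in pair_probs.items():
        outcomes[i].append(p)
        outcomes[j].append(p)
    entropy = []
    for pos in range(1, length + 1):
        probs = list(outcomes[pos])
        probs.append(max(0.0, 1.0 - sum(probs)))  # probability of being unpaired
        entropy.append(sum(-p * math.log2(p) for p in probs if p > 0))
    return entropy

# Toy 4-nt example: 1 always pairs with 4; 2 pairs with 3 half of the time
toy = {(1, 4): 1.0, (2, 3): 0.5}
print(positional_entropy(4, toy))
```

Positions with a single dominant state score near zero, matching the reading in the caption that low entropy marks the more reliable parts of a predicted structure.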