Hyojung Paik1,2,3,4, Junehawk Lee1, Chan-Seok Jeong1, Jun Sung Park5, Jeong Ho Lee5,6, Nadav Rappoport2,7, Younghoon Kim1, Hee-Young Sohn8, Chulman Jo8, Jimin Kim1, Seong Beom Cho9.
Abstract
Observations of comorbidity between heart diseases, including cardiac dysfunction (CD), and cognitive impairment, such as Alzheimer's disease and dementia (AD/D), are increasing. This comorbidity might be due to a pleiotropic effect of genetic variants shared between CD and AD/D. Here, we validated the comorbidity of CD and AD/D based on diagnostic records from millions of patients in Korea and the University of California, San Francisco Medical Center (odds ratio 11.5; 95% confidence interval [CI], 8.5-15.5). By integrating a comprehensive human disease-SNP association database (VARIMED, VARiants Informing MEDicine) and whole-exome sequencing of 50 brains from individuals with and without Alzheimer's disease (AD), we identified missense variants in coding regions including APOB, a known risk factor for CD and AD/D, which potentially have a pleiotropic role in both diseases. Of the identified variants, site-directed mutation of ADIPOQ (268 G > A; Gly90Ser) in neurons produced abnormal aggregation of tau proteins (p = 0.02), suggesting a functional impact for AD/D. The association of CD and ADIPOQ variants was confirmed based on domain deletion in cardiac cells. Using the UK Biobank, which includes data from over 500,000 individuals, we examined a pleiotropic effect of the ADIPOQ variant by comparing CD- and AD/D-associated phenotypic evidence, including cardiac hypertrophy and cognitive degeneration. These results indicate that the convergence of health care records and genetic evidence may help to dissect the molecular underpinnings of heart disease and associated cognitive impairment, and could potentially serve a prognostic function. Validation of disease-disease associations through health care records and genomic evidence can determine whether health conditions share risk factors based on pleiotropy.
Year: 2022 PMID: 36114174 PMCID: PMC9481623 DOI: 10.1038/s41398-022-02144-0
Source DB: PubMed Journal: Transl Psychiatry ISSN: 2158-3188 Impact factor: 7.989
Oligos used to synthesize 3×FLAG DNA fragments.
| | Sense 5′–3′ | Anti-sense 5′–3′ |
|---|---|---|
| (i) N-3×FLAG | ATGGACTACAAAGACCATGACGGTGATTATAAAGATCATGACATCGACTACAAGGATGACGATGACAAG | CTTGTCATCGTCATCCTTGTAGTCGATGTCATGATCTTTATAATCACCGTCATGGTCTTTGTAGTCCAT |
| (ii) 3×FLAG-C | GACTACAAAGACCATGACGGTGATTATAAAGATCATGACATCGACTACAAGGATGACGATGACAAGTAG | CTACTTGTCATCGTCATCCTTGTAGTCGATGTCATGATCTTTATAATCACCGTCATGGTCTTTGTAGTC |
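As a quick consistency check on the oligo table, each anti-sense oligo should be the reverse complement of its sense partner so that the two strands anneal into a double-stranded 3×FLAG fragment. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the wrapped rows of each table cell are consecutive segments of one oligo, joined in order.

```python
# Complement map for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Each oligo is written as the table's row segments, concatenated in order
# (assumption: the rows are line-wrapped pieces of a single sequence).
OLIGOS = {
    "N-3xFLAG": (
        "ATGGACTACAAAGACCA" "TGACGGTGATTATAAAGAT"
        "CATGACATCGACTACAAG" "GATGACGATGACAAG",
        "CTTGTCATCGTCATCCTTGTA" "GTCGATGTCATGATCTTTATAA"
        "TCACCGTCATGGTCTT" "TGTAGTCCAT",
    ),
    "3xFLAG-C": (
        "GACTACAAAGACCATGACGG" "TGATTATAAAGATCATGACAT"
        "CGACTACAAGGATGACGATG" "ACAAGTAG",
        "CTACTTGTCATCGTCATC" "CTTGTAGTCGATGTCATG"
        "ATCTTTATAATCACCGTC" "ATGGTCTTTGTAGTC",
    ),
}


def is_annealing_pair(sense: str, antisense: str) -> bool:
    """True when the anti-sense oligo is the exact reverse complement."""
    return reverse_complement(sense) == antisense


for name, (sense, antisense) in OLIGOS.items():
    print(name, len(sense), is_annealing_pair(sense, antisense))
```

Both pairs pass this check, which supports reading the wrapped rows as single 69-nt oligos.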
Fig. 1 Identifying shared genetic architecture by repurposing scaled digital health data.
Schematic workflow for developing a hypothesis in the observation phase from digital health data (gray boxes) and studying it using genomic data in the discovery phase (white boxes). In the observation phase, the HIRA database and the electronic health records (EHR) of the UCSF Medical Center were used to suggest the association of cardiac dysfunction (CD), such as hypertensive cardiac disease, with dementia and Alzheimer’s disease (AD/D). The public database of HIRA (Health Insurance Review and Assessment Service) served as a nationwide observation database of inpatient/outpatient diagnoses in South Korea, and the EHR of UCSF served as the validation database. After establishing the propensity for CD among AD/D diagnoses, we examined the genetic associations between CD and AD/D in the discovery phase.
Summary of the datasets used.
| Datasets and features | Frequency | |
|---|---|---|
| Observation phase | National inpatient/outpatient set of HIRAa | |
| Total patients | ||
| Selected patientsb | ||
| No. of men in the selected set | 341,788 (44.7%) | |
| No. of women in the selected set | 420,104 (55.2%) | |
| No. of diagnoses | ||
| No. of unique diagnosis codes (ICD-10)c | 5859 (5–3-letter level) | |
| Unique diagnosis high-level (3 letter)d | ||
| Mean age at diagnosis | 54.19 (±23.26) | |
| Outcome of diagnoses | ||
| Deceasede | 43,545 (5.7%) | |
| Alivef | 1,252,114 | |
| UCSF Medical Centerg (2012.01 to 2017.01) | ||
| Total patients | 6,852,000 | |
| Total cases of diagnoses | 44,545,038 | |
| No. of unique diagnosis codes (ICD-10-CM) | 29,893 | |
| Unique diagnosis codes with 3 letters | 1831 | |
| Mean age at diagnosis | 48.67 (±23.41) | |
| Outcome of diagnoses | ||
| Deceased | 2961 | |
| Not deceased | 143,996 | |
| Pending | 21,248,546 | |
| Discovery phase | VARIMED (VARiants Informing MEDicine) | |
| No. of reviewed publications | ||
| No. of SNPs (dbSNP IDs) | 130,426 (129,890) | |
| No. of traits (disease/non-disease traits)h | 4,223 (1,489/2,374) | |
| No. of associations between SNPs and traits | ||
| NBB (Netherlands Brain Bank) | ||
| Total patients (AD/non-AD)i | ||
| Hippocampal formation (HF) samples | 50 | |
| Blood (BL) samples | 50 | |
| Mean deceased age (AD/non-AD deceased) | 83.5 ± 8/71.4 ± 12.6 | |
| Sex (AD deceased) | Male = 14; Female = 29 | |
| Sex (non-AD deceased) | Male = 3; Female = 4 | |
| Braak staging (AD/non-AD deceased) | 5.02 ± 1.1/0.71 ± 0.48 | |
| UK Biobank | ||
| No. of total participants | 502,543 | |
| Participants with whole-exome seq (WES)j | 49,960 |
aThe Health Insurance Review and Assessment Service of Korea (HIRA). We utilized non-longitudinal sets consisting of randomly sampled in/outpatient sets built annually from 2009 to 2011 (www.hira.or.kr).
bTo minimize the re-enrollment of patients into the 2011 set from the 2009 and 2010 sets, we selected only deceased patients from the 2009 and 2010 sets. In addition, we excluded records of non-disease-related diagnoses, including injuries, poisoning, and childbirth, using diagnosis codes.
cInternational Statistical Classification of Disease and Related Health Problems 10th Revision (ICD-10).
dBased on the hierarchical structure of ICD-10 codes, which consists of a 5-letter level for a disease with familial history and a 3-letter level for general disease classification, we used transformed diagnosis codes at the 3-digit level in this study.
eDetected outcomes in health insurance reviews (HIRA).
fOther non-deceased outcomes included ongoing patients, transferred, sent back, others, and discharged while alive.
gDeidentified electronic medical records (EMRs) from the University of California, San Francisco (UCSF) Medical Center (a tertiary-care university hospital).
hCounted based on MeSH terms (Medical Subject Headings, the National Library of Medicine’s controlled vocabulary) for traits including eye color and diseases such as asthma.
iAll collected samples were of Western European ancestry.
jWe analyzed participants with WES data to validate the phenotypic effect of the germline variant of interest.
Bold characters emphasize the numbers of the table.
Statistics of diagnosis trajectory analysis using the set of HIRA.
| | Features | Frequency |
|---|---|---|
| Trajectorya = the first and followed nodes linked via edges | No. of trajectories | |
| Node = Diagnosed patients as a | Total type of diagnosis | 604 |
| | The 1st diagnosis | |
| | Patients in the 1st diagnosis | |
| | Fatal outcomes of diagnosis | 2,134 |
| Edge (Directed acyclic edge) = Sequence between (FDR < 0.1)b | Subsequent diagnoses after the 1st diagnosis | |
| | Mean of steps diagnoses in a trajectory | |
aThe trajectory consists of the first node, representing a set of patients diagnosed with disease i, and succeeding nodes connected via directed acyclic edges, representing subsequent diagnoses that occur more frequently than expected by chance.
bAll the presented edges were statistically significant (Relative association (RA) for the co-occurrence of diagnosis i and j > 1 and FDR of the binomial test for the co-diagnosis of diagnosis i and j < 0.1; FDR of the binomial test for the sequential directionality of diagnosis dates <0.1). The details of the diagnosis trajectory model are presented on the following website (https://www.youtube.com/watch?v=jJMds31-e2g).
Bold characters emphasize the numbers of the table.
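The edge criteria in footnote b can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the expected co-occurrence count used for RA assumes independence of the two diagnoses, the directionality test is written as a one-sided binomial test against p = 0.5, and the FDR adjustment shown is the Benjamini-Hochberg procedure, which the source does not name.

```python
from math import comb


def relative_association(n_both: int, n_i: int, n_j: int, n_total: int) -> float:
    """Observed co-diagnoses of i and j divided by the count expected
    if the two diagnoses were independent (assumed definition of RA)."""
    expected = n_i * n_j / n_total
    return n_both / expected


def binom_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """One-sided binomial p-value P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def keep_edge(ra: float, fdr_codiagnosis: float, fdr_direction: float) -> bool:
    """Apply the three thresholds stated in footnote b."""
    return ra > 1 and fdr_codiagnosis < 0.1 and fdr_direction < 0.1
```

In this reading, an edge from diagnosis i to diagnosis j is kept only when all three conditions hold, with the adjusted p-values computed across all candidate diagnosis pairs.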
Fig. 2 The preceding diagnosis of heart disease before dementia.
A, B Traced models of disease diagnosis patterns using the directed acyclic graph (DAG) model. Our DAG model consisted of nodes for disease diagnoses and edges for subsequent diagnoses that occurred more often than expected by chance (FDR < 0.1). A From the HIRA dataset, we found that 6.1% of the patients suffering from hypertensive heart and renal disease (n = 425) were diagnosed with unspecified dementia and Alzheimer’s disease (a major subtype of dementia). B We validated whether the identical pattern was repeated in the independent dataset obtained from the electronic health records (EHR) of the UCSF Medical Center. Of the 672 patients with hypertensive heart and kidney disease, 6.8% (n = 46) were diagnosed with dementia. C, D Due to the vague definition of the “hypertensive heart and kidney disease” diagnosis, we looked into a more detailed diagnosis code at the 5-digit level of ICD-10. C In the HIRA dataset, approximately 46% of dementia diagnoses made in patients who already had hypertensive heart and renal diseases were identified after heart failure. Therefore, the diagnosis of dementia after the diagnosis of hypertensive heart and renal disease was significantly enriched for cardiac disease (CD) (p = 1.08E–04, hypergeometric test). D Likewise, the EHR from the UCSF Medical Center showed an identical pattern. Of the 46 dementia diagnoses made in patients with pre-existing hypertensive heart and renal disease, 33 were identified after the diagnosis of heart failure. The diagnosis of dementia after the diagnosis of hypertensive heart and kidney disease was enriched for cardiac diseases, such as heart failure (p = 4.66E–02, hypergeometric test).
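The enrichment p-values in panels C and D come from hypergeometric tests. A minimal sketch of such a one-sided test is given below, assuming Python 3.8+ for `math.comb`. The function name is hypothetical, and the background counts in the usage example are placeholders, because the caption does not report how many patients in each cohort had heart failure.

```python
from math import comb


def hypergeom_upper_tail(k: int, population: int, successes: int, draws: int) -> float:
    """P(X >= k) when `draws` items are sampled without replacement from a
    population of size `population` containing `successes` marked items."""
    upper = min(successes, draws)
    favorable = sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(k, upper + 1)
    )
    return favorable / comb(population, draws)


# Illustrative call in the spirit of panel D: 46 dementia diagnoses among
# 672 patients, 33 of them after heart failure. The heart-failure count of
# 350 is a HYPOTHETICAL placeholder, not a value from the source.
p_value = hypergeom_upper_tail(k=33, population=672, successes=350, draws=46)
print(f"one-sided enrichment p-value: {p_value:.3g}")
```

The printed value depends entirely on the placeholder background and should not be compared with the p-values reported in the caption.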
Candidates with pleiotropic features (i.e., shared genetic risks) between CD and AD/D.
| Genomic locationa | Gene symbol | Ref/Alt2b | Average pathogenic scorec | No. of ADd | Allele frequency, ExAC (all) | Allele frequency, ExAC (Finnish cohort) | Clinical significance (ClinVar) |
|---|---|---|---|---|---|---|---|
| Chr6, 150949095 | MTHFD1L | G/A | 0.815 | 3 | 0.01264 | 0.0065 | NS |
| Chr2, 115840821 | DPP10 | C/A | 0.792 | 2 | 0.00308 | 0.00076 | NS |
| Chr3, 186854237 | ADIPOQ | G/A | 0.861 | 1 | 0.00302 | 0.00045 | NS |
None of the selected variants were detected in the non-AD samples.
NS not significant.
aGenomic assembly version GRCh38.
bReference nucleotides (Ref) and alternative nucleotides (Alt).
cAverage pathogenic scores from 16 algorithms (SIFT, Polyphen2_HDIV, Polyphen2_HVAR, MutationTaster, PROVEAN, REVEL, CADD, fathmm-MKL, Eigen-PC-raw, GERP++, phyloP100way_vertebrate, phyloP20way_mammalian, phastCons100way_vertebrate, phastCons20way_mammalian, VEST3, and SiPhy_29way_logOdds).
dNumber of patients with Alzheimer’s disease (AD) in our NBB WES dataset (43 AD and seven non-AD samples).
Fig. 3 Identification of candidate pleiotropic variants for CD and AD/D.
A Integration of the variant–disease associations metaset from our VARIMED and the whole-exome sequencing (WES) dataset. We selected 3 variants in MTHFD1L, DPP10, and ADIPOQ for further validation. B Reconfirmation of selected germline variants using Sanger sequencing.
Fig. 4 Validation of pleiotropic effects for CD and AD/D.
A–D Validation of the functional impact of the 3 identified AD/D variants. We constructed site-directed mutant Neuro-2a cell lines (MTHFD1L-M, DPP10-M, and ADIPOQ-M). We assessed the abundance of CP13, PHF1, and tau proteins in each cell line using three biological replicates. A Loading abundance of CP13, PHF1, and tau proteins, as well as β-actin, in neuronal cell lines by transfected gene. ‘M’ = mutated. ‘W’ = wild type. B–D Levels of aggregation of CP13, PHF1, and tau normalized by β-actin. ADIPOQ-M displayed abnormal aggregation of tau and CP13 (D) (p < 0.05, t-test). E Pathway enrichment analysis of 473 selected differentially expressed genes (DEGs) (FDR p < 0.05, log2 fold change >5) in ADIPOQ knockout H9c2 cells (rat cardiac cells). Cardiac dysfunction and cognitive impairment pathways were enriched among the 473 DEGs (FDR p < 0.05, hypergeometric test). F–H Functional impact of the ADIPOQ variant on a population scale using the UK Biobank. We selected 69 individuals from the minor allele group of ADIPOQ (c.268G>A) and 276 from the major allele group based on propensity score matching. F The average heart wall thickness across 16 sites was significantly increased in the ADIPOQ-M group (ADIPOQ c.268A) (p = 0.0023, Wilcoxon test). G Differences in mean reaction time (RT), calculated over 12 rounds, to press a “snap” button when the two cards presented matched. The X-axis represents the difference in the measured RTs between the initial assessment (0 years) and the third assessment (5–10 years later). Each plot shows the RTs by allele group (ADIPOQ-W and ADIPOQ-M). H Difference in the mean RTs between the baseline and the third assessment. The minor allele group (ADIPOQ-M) showed a longer mean RT than the major allele group in the third assessment compared to the first assessment (p < 0.05, t-test).
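The group sizes in panels F–H (69 minor allele carriers and 276 major allele individuals) correspond to a 1:4 matching ratio. The sketch below shows one common way to obtain such a set, greedy nearest-neighbor matching without replacement on a precomputed propensity score. It is an assumption-laden illustration: the caption does not state the matching algorithm, the covariates, or any caliper, and the function name and identifiers are hypothetical.

```python
def match_controls(
    case_scores: dict[str, float],
    control_scores: dict[str, float],
    ratio: int = 4,
) -> dict[str, list[str]]:
    """Greedy nearest-neighbor matching without replacement.

    Each case receives the `ratio` still-unmatched controls whose
    propensity scores are closest to its own.
    """
    available = dict(control_scores)  # copy so the caller's dict is untouched
    matches: dict[str, list[str]] = {}
    for case_id, score in case_scores.items():
        nearest = sorted(available, key=lambda cid: abs(available[cid] - score))[:ratio]
        matches[case_id] = nearest
        for control_id in nearest:
            del available[control_id]  # each control is used at most once
    return matches


# Toy example with made-up scores: one carrier, four candidate controls.
toy = match_controls(
    {"carrier_1": 0.50},
    {"ctrl_a": 0.10, "ctrl_b": 0.45, "ctrl_c": 0.52, "ctrl_d": 0.90},
    ratio=2,
)
print(toy)
```

With 69 carriers and a ratio of 4, a procedure of this kind yields 69 × 4 = 276 matched controls, which agrees with the counts in the caption. Greedy matching depends on the order in which cases are processed, so optimal matching or a caliper may be preferable in practice.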