Peripheral transcriptomic biomarkers for early detection of sporadic Alzheimer disease?

Abstract

Alzheimer disease (AD) is the major epidemic of the 21st century, its prevalence rising along with improved human longevity. Early AD diagnosis is key to successful treatment, as currently available therapeutics only allow small benefits for diagnosed AD patients. By contrast, future therapeutics, including those already in preclinical or clinical trials, are expected to afford neuroprotection prior to widespread brain damage and dementia. Brain imaging technologies are developing as promising tools for early AD diagnostics, yet their high cost limits their utility for screening at-risk populations. Blood or plasma transcriptomics, proteomics, and/or metabolomics may pave the way for cost-effective AD risk screening in middle-aged individuals years ahead of cognitive decline. This notion is exemplified by data mining of blood transcriptomics from a published dataset. Consortia blood sample collection and analysis from large cohorts with mild cognitive impairment followed longitudinally for their cognitive state would allow the development of a reliable and inexpensive early AD screening tool.

Keywords: Alzheimer disease; bioinformatics; mild cognitive impairment; proteomics; transcriptomics

Substances：
Biomarkers

Year: 2018 PMID： 30936769 PMCID： PMC6436957

Source DB: PubMed Journal: Dialogues Clin Neurosci ISSN： 1294-8322 Impact factor: 5.986

Early detection of sporadic Alzheimer disease is crucial for successful therapy

Cardiovascular diseases and cancer are currently the major causes of death among the elderly in developed countries. However, the World Health Organization (WHO) projects that by 2050, old-age dementia, most notably as a result of sporadic (late-onset) Alzheimer disease (AD), will triple compared with current rates and become the major cause of death in developed as well as in low- and middle-income countries.[1] According to the WHO report (updated in December 2017),[1] there are presently 50 million patients with dementia worldwide, with 10 million new cases diagnosed each year. While there are many different forms of dementia, AD is the most common one and contributes to 60% to 70% of all dementia cases. The number of patients is projected to increase to 152 million by 2050, due mostly to increased life expectancies in middle- and low-income countries. Accordingly, the global societal cost is estimated to nearly triple from US$818 billion in 2015 to over US$2 trillion already by 2030.[1] The current societal cost already represents 1.1% of global gross domestic product (GDP); this monetary burden is higher, on average 1.4% of the local GDP, in developed countries. For example, a recent study from France calculated annual care costs of nearly €30,000 per patient with advanced AD.[2] Extrapolating this cost to the 152 million patients projected by 2050, the global cost could become as high as US$5 trillion, which represents 7% of current global GDP (though costs in low-income countries may be lower). In any scenario, future societal costs for treating AD patients represent a worrying burden for the global economy.
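The extrapolation above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' calculation: the patient count and per-patient cost are the figures quoted in the text, whereas the euro-to-dollar exchange rate (`USD_PER_EUR`) is an assumption introduced here for illustration only.

```python
# Figures quoted in the text above.
PROJECTED_PATIENTS_2050 = 152_000_000   # projected patients by 2050
ANNUAL_COST_EUR = 30_000                # annual care cost per patient (French study)

# ASSUMPTION: illustrative EUR-to-USD rate, not taken from the source.
USD_PER_EUR = 1.15


def extrapolated_annual_cost_usd(patients, cost_eur, usd_per_eur):
    """Naive extrapolation: every patient incurs the same annual cost."""
    return patients * cost_eur * usd_per_eur


total_usd = extrapolated_annual_cost_usd(
    PROJECTED_PATIENTS_2050, ANNUAL_COST_EUR, USD_PER_EUR
)
print(f"Extrapolated annual cost: US${total_usd / 1e12:.1f} trillion")
```

Under this assumed exchange rate the result is roughly US$5 trillion, in line with the figure given in the text; as noted there, real costs in low-income countries may be lower than this uniform-cost sketch implies.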
The only sensible solution to this worrying trend would be the development of therapeutics for very early disease stages, along with affordable diagnostics for AD risk detection among middle-aged individuals and population-wide screening programs, so that the progress of dementia may be addressed (and ideally halted) prior to dementia onset. Current first-line AD therapeutics include acetylcholinesterase inhibitors and glutamate signaling ligands, and are at best capable of slowing the progress of dementia. Their low therapeutic capacity reflects the fact that once an individual is diagnosed with AD, the brain damage is already too extensive for such drugs to repair the widespread impairment.[3,4] Many tentative AD therapeutics are still in various stages of preclinical or clinical development.[5,6] It is hoped that such future “disease-modifying therapeutics” would exhibit improved efficacy in slowing brain neurodegeneration, provided that they are prescribed early on, that is, well before severe brain damage has taken place. Prospective disease-modifying therapeutics include neurotrophic factors or their synthetic mimetics and agents capable of reducing neuroinflammation or allowing clearance of soluble amyloid-β so that its aggregation and subsequent brain accumulation are drastically reduced. At this time most such tentative AD therapeutics are in early stages of development; their approval and marketing may take many years.[7,8] Moreover, it has been suggested that AD may represent a disorder rather than a single distinct disease, reflecting metabolic, developmental, or cardiovascular deficits.[9-11] Hence, once a larger selection of disease-modifying therapeutics becomes available, hopefully before 2030, successful treatment may require patient stratification according to AD subtypes for prescribing precision medicine therapeutics according to each patient's symptoms, demographics, identified insufficiencies, and comorbidities.
In this manner, precision medicine for AD patients would be prescribed similarly to the already practiced successful application of biomarker-dependent precision medicine in oncology.[12,13] In this commentary we suggest that identifying diagnostic biomarkers based on blood or plasma transcriptomes, proteomes, and/or metabolomes may provide the best approach both for enabling cost-effective population-wide screening for individuals at high risk of developing late-onset dementia and for enhancing precision medicine for such individuals prior to onset of dementia.[14-16]

Early Alzheimer disease diagnosis by imaging technologies

Precision medicine in oncology is typically based on DNA, RNA, and protein biomarkers detected in biopsied tumor tissues; brain biopsies are obviously not an option for CNS disorders, including neurodegenerative disorders. Thus, research on early detection of AD has focused over the past two decades on brain-imaging technologies. This topic has been extensively reviewed; see for example recent reviews on functional magnetic resonance imaging (fMRI) by Topiwala et al[17] and on positron emission tomography (PET) imaging by Fantoni et al[18] and by Shea et al.[19] In addition to brain imaging, studies on early detection biomarkers for sporadic AD are ongoing with retinal[20] and vascular imaging.[21] Cerebrospinal fluid (CSF) samples are also extensively studied as a suitable biofluid for early detection of sporadic AD, in particular with protein studies on CSF amyloid-β and tau.[22-25] Yet, while brain imaging tools and CSF protein analysis offer hopes for improved early AD detection, their high costs (brain imaging) and health risks (CSF sampling) prohibit their use as population-wide screening tools. By comparison, we estimate that testing an individual for her AD risk with genomic biomarkers would cost around 500 euros. Fees for such tests will likely keep decreasing over the next 5 to 10 years due to reduced costs of DNA and RNA sequencing technologies.

DNA vs RNA biomarkers for early Alzheimer disease detection

Genome-wide association studies (GWAS) examining the presence of common single-nucleotide polymorphisms (SNPs) in DNA samples from unrelated individuals by using microarray technologies have dominated the search for disease risk and phenotype-associated alleles for two decades.[26-28] Over the last decade, along with decreasing costs of next-generation whole genome DNA-sequencing technologies, human DNA studies have been moving from microarray-based to DNA-sequencing technologies, which yield far higher coverage of an individual's genome sequence. DNA sequencing informs not only on the presence of common SNPs but also on rare SNPs, as well as on the presence of short insertions, deletions, duplications, and rearrangements in a donor's DNA sequence.[29] Early SNP-based microarrays for human DNA studies typically examined over half a million SNPs. Next-generation whole genome DNA-sequencing (WGS) may examine nearly the entire 3.2 billion nucleotides of the human DNA sequence. Given the large DNA sequence heterogeneity among humans, such huge amounts of data mean that very large cohorts are required for deriving statistically meaningful findings about associations between disease and DNA alleles. Thus, cohorts of tens of thousands of patients along with matched controls have become the norm rather than the exception for identifying disease risk genotypes or phenome-genome associations.
Indeed, recently published GWAS have featured 46 939 patients and 27 910 controls for identifying prostate cancer risk alleles[30]; 449 484 individuals for identifying genetic loci associated with neuroticism[31]; 269 867 individuals for identifying loci linked to intelligence[32]; 60 720 patients along with 618 527 controls for identifying allergic rhinitis risk loci[33]; and 498 134 individuals for a UK Biobank study on the effects of coffee drinking on mortality.[34] Studying such extremely large cohorts demonstrates the utility of the GWAS approach, but at the same time also the need for collecting samples from very large cohorts, a scheme requiring large research consortia. Such consortia illustrate the way forward for advancing genomic medicine and future precision medicine research for complex disorders. However, one must bear in mind that the power afforded by such large GWAS consortia depends heavily on accurate patient phenotyping. This becomes more problematic when research teams include a large number of participating medical centers, which may add noise to the project's data when they are located in countries where divergent diagnostic criteria are applied. Such concerns have been raised, for example, in the context of establishing pharmacogenomic biomarkers for precision medicine[35,36] and seem to be highly relevant for sporadic (late-onset) AD risk detection. Research on sporadic AD has also benefited from GWAS. In 2007, a GWAS of 1086 sporadic AD patients and controls[37] verified the already established status of APOE as the major risk gene for sporadic AD, with the presence of the SNP rs4420638 near APOE yielding an odds ratio of 4.01 with a Bonferroni-corrected P-value of 5.3×10⁻³⁴. To our knowledge, this P-value remains the smallest reported to date with similarly sized patient cohorts of multigenic disorders.
Further sporadic AD GWAS have so far identified major risk alleles in about 30 additional genes, some still requiring validation by larger cohort studies.[16] The largest sporadic AD study to date was published by the Alzheimer's Disease Sequencing Project in May 2018[38] and featured exome sequencing of 6965 sporadic AD patients and 13 252 controls. The only gene attaining genome-wide level statistical significance (besides APOE) was SORL1 (sortilin related receptor 1), encoding a preproprotein that, following proteolytic processing, generates a receptor that likely plays a role in protein endocytosis and sorting, possibly affecting the clearance of extracellular amyloid-β.[39] This finding awaits an independent validation. However, these extensive GWAS and exome-sequencing research projects have not yet been (and in our opinion are not likely to be) translated into population-wide programs for sporadic AD risk screening. The reasons are diverse and mirror the insufficient sensitivity and specificity of data derived from DNA genotyping or sequencing alone for AD risk determination. 
This in turn reflects that besides DNA sequence, an individual's non-genetic factors, including lifestyle, level of education, diet, gut microbiome, comorbidities, and other external exposures, contribute to the risk of acquiring sporadic AD.[40,41] Such non-genetic factors may or may not be reflected by an individual's epigenome, the collective term for non-inherited DNA modifications, which have been extensively studied also in the context of AD risk prediction.[42,43] In contrast with DNA genotyping or sequencing, transcriptomics, the study of gene expression profiles at the level of RNA (including also the transcribed RNA of noncoding genes, microRNAs and long-noncoding RNAs), informs not only about inherited but also, albeit indirectly, about noninherited genomic information.[44,45] As such, transcriptomic studies also capture genomic information affected by individuals' environmental exposures and comorbidities; these may in part be relevant for risks of acquiring complex diseases, including sporadic AD. Thus transcriptomics appears to be potentially more informative than DNA genotyping or DNA sequencing for developing diagnostic tools for complex disorders, including sporadic AD.[46,47] An additional advantage is derived from improved statistical power: transcriptomic technologies examine the expression levels of up to 20 000 human genes in given biological samples, that is, at least 1000-fold fewer variables are being measured compared with GWAS or DNA sequencing studies; this in turn allows more power and thus smaller cohorts.[48,49] Deep sequencing may in addition detect up to twice as many long-noncoding RNAs (lncRNAs), though the functions of most of them remain enigmatic.
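The statistical-power argument above can be illustrated with a simple multiple-testing calculation. The sketch below is an illustration under stated assumptions rather than an analysis from the source: it assumes a Bonferroni correction at a family-wise alpha of 0.05, a two-sided z-test with a fixed effect size, 80% power, and a variant count that is exactly 1000-fold the number of genes. The function names are hypothetical.

```python
from statistics import NormalDist

N_GENES = 20_000                # expression levels measured (from the text)
N_VARIANTS = N_GENES * 1_000    # ASSUMPTION: exactly 1000-fold more variables


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


def relative_sample_size(alpha, n_tests, power=0.80):
    """Quantity proportional to the sample size needed by a two-sided
    z-test at the Bonferroni-corrected level, for a fixed effect size."""
    norm = NormalDist()
    z_alpha = -norm.inv_cdf(bonferroni_threshold(alpha, n_tests) / 2)
    z_beta = norm.inv_cdf(power)
    return (z_alpha + z_beta) ** 2


ratio = (relative_sample_size(0.05, N_VARIANTS)
         / relative_sample_size(0.05, N_GENES))
print(f"Per-test threshold, transcriptome: {bonferroni_threshold(0.05, N_GENES):.1e}")
print(f"Per-test threshold, variants:      {bonferroni_threshold(0.05, N_VARIANTS):.1e}")
print(f"Relative sample size (variants / transcriptome): {ratio:.2f}")
```

The sketch isolates only the effect of the number of tests; it holds the effect size fixed, so differences in effect size between expression changes and DNA variants, which also influence the cohort size needed, are not captured.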
It is important to keep in mind that the use of RNA as an early AD risk biomarker, as in the context of other diseases, has its limitations: the transcriptome is highly dynamic[50] and highly cell-specific[51]; hence RNA findings in peripheral tissues should be taken with caution, and require support by additional diagnostic tools. Moreover, mRNA levels may not faithfully reflect the corresponding protein levels.[52] Hence, studies are required for validation of tentative RNA biomarkers reported by transcriptomic studies with postmortem brain tissues or blood samples. In addition to blood, saliva samples are also a likely source of biomarkers for early diagnosis of AD. In recent years, several notable findings in salivary samples have been reported that await confirmation in larger AD and mild cognitive impairment (MCI) cohorts. These studies include measurements of salivary lactoferrin,[53] amyloid-β,[54,55] and tau.[56] However, salivary tau was found unsuitable as an AD or MCI biomarker in a recent larger cohort.[57] To our knowledge, no studies (as of July 2018) have reported on tentative salivary-based RNA biomarkers for AD or MCI. As noted above, a key problem with early AD diagnostics is that the brain tissue is inaccessible for biopsy sampling, while CSF collection is not a valid option for population-wide screening; this leaves blood or saliva as the most likely biosamples. Blood samples have been extensively employed in the search for early AD biomarkers, in particular proteomic and metabolomic biomarkers, topics beyond the scope of this article. Readers are referred to fine recent reviews on protein[58-61] and metabolome[62] early AD biomarkers.
A roadmap for developing a blood test for early AD detection was recently proposed by Kiddle et al;[61] these authors outline a plan focused on changing research approach concepts from small studies to large multicenter consortia efforts and improved data sharing between research teams, a notion with which we strongly concur as the best way forward. Below, for demonstrating the untapped potential for discovering early AD biomarkers by studying blood transcriptomic signatures from large cohorts of MCI and AD patients, we provide a demonstrative example based on data mining of a single blood gene expression dataset.

MCI peripheral blood data mining example as proof of concept

Data mining of existing blood transcriptomics datasets was performed on the Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/gds/) on dataset GSE63060 (contributed by Sood et al 2015)[63] using the GEO2R software tool (https://www.ncbi.nlm.nih.gov/geo/geo2r/). This tool allows comparison of groups of samples whose transcriptomics datasets have been deposited on the GEO server in order to identify genes that are differentially expressed between groups of samples. We applied this tool on dataset GSE63060[63] for comparing sets of candidate genes; we chose four genes reported as having differential expression in blood-derived cell lines based on amyloid-β sensitivity or when comparing cell lines from AD patients and controls: RGS2, SIRT1, INPP4B, and FAH.[64,65] To this list we added a fifth gene, SERPINA1, based on the suggestion of higher serum alpha-1-antitrypsin protein as a tentative AD biomarker by Wetterling and Tegtmeyer, 1994.[66] Next, receiver operating characteristic (ROC) curve analysis[67] was applied to the blood expression levels of the above candidate genes as a tentative MCI prediction tool. ROC curves demonstrate the diagnostic capacity of classifiers as their discrimination thresholds are varied. In the context of diagnostic biomarkers, they allow estimation of classifier accuracy by showing the relationship between clinical sensitivity and specificity for group differences. In a given ROC curve, the x-axis shows the false positive fraction (1 minus specificity) and the y-axis shows the true positive fraction (sensitivity). Additionally, the area under the curve (AUC) of a ROC curve of a given test can be used as a measure of its discriminative capacity.
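To make the ROC and AUC definitions above concrete, the following minimal sketch builds a ROC curve by sweeping the decision threshold over a set of scores and integrates it with the trapezoidal rule. The scores and labels are hypothetical toy values, not data from GSE63060, and the function names are introduced here for illustration only.

```python
def roc_points(scores, labels):
    """Return (false positive fraction, true positive fraction) pairs obtained
    by sweeping the decision threshold over every observed score.
    labels: 1 = case, 0 = control; a sample is called positive if score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a ROC curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical toy data: one composite score per individual.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]

curve = roc_points(toy_scores, toy_labels)
auc = area_under_curve(curve)
print(f"AUC = {auc:.4f}")  # prints "AUC = 0.8125" for this toy data
```

An AUC of 0.5 corresponds to a classifier no better than chance and an AUC of 1.0 to perfect separation; for this toy example, 13 of the 16 case-control pairs are ranked correctly, giving 0.8125.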
As shown in Figure 1, while each of the five genes showed a fold-difference (FD) of between 1.1 and 1.3 between MCI and nondemented controls (with significance values between P=3.47E-03 and P=2.7E-07), our tentative 5-gene tool showed a superior separation between MCI and controls: FD=1.8 and P=4.60E-12. As shown in Table I, ROC curve analysis applied on the transcriptomic dataset GSE63060, which shows gene expression levels in blood samples from AD patients (n=145), MCI patients (n=80), and nondemented age-matched controls (n=104), indicated that our tentative 5-gene tool identified MCI patients vs controls with an AUC=0.777 and P=1.20E-10. Applying the same tool on the AD patients vs controls in the same GSE63060 dataset identified AD patients with AUC=0.683 and P=8.8E-7. Less robust AUC and P-values were found for applying the same tool on GSE63061 by the same authors,[63] which shows gene expression levels in blood samples from AD patients (n=139), MCI patients (n=109), and nondemented controls (n=134). Of note, in both of these GEO datasets, ROC curve analysis of our proposed 5-gene tool indicated more robust identification of MCI patients compared with AD patients (each compared with nondemented age-matched controls). This observation suggests that our proposed tool is likely more adequate for correctly classifying MCI than AD patients. We identified two further GSE files containing transcriptomic data from at least 40 individuals, GSE85426 and GSE4229.[68] However, these two files, unlike GSE63060 and GSE63061, include data derived from RNA of peripheral blood mononuclear cells (PBMC) rather than whole blood, do not include MCI patients, and have considerably fewer samples (these datasets are from 90 and 18 AD patients and 90 and 22 nondemented controls, respectively). Indeed, ROC curves for the latter two datasets indicated that our example 5-gene prediction tool could not distinguish between AD patients and controls.
Therefore, we cannot draw conclusions about the MCI or AD biomarker utility of transcriptomics of whole blood biosamples compared with PBMC, nor about the value of transcriptomics for identifying MCI compared with AD patients. The reason is, unfortunately, the minuscule number of such GEO datasets currently available for bioinformatics analysis. This clearly demonstrates the need for further blood transcriptomic studies with larger MCI and AD cohorts, as well as the unmet need for open data sharing from such studies.
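The relationship between an AUC, the group sizes, and its P-value can be sketched as follows. The source does not state which test produced the reported P-values; this sketch assumes the standard normal approximation to the Mann-Whitney U statistic (without tie correction) as one common way to test an AUC against 0.5. The function name `auc_p_value` is hypothetical.

```python
from math import erfc, sqrt


def auc_p_value(auc, n_cases, n_controls):
    """Two-sided P-value for the null hypothesis AUC = 0.5, using the normal
    approximation to the Mann-Whitney U statistic (no tie correction).
    ASSUMPTION: this may differ from the test used in the source."""
    # Under the null, AUC has mean 0.5 and this standard deviation.
    sd = sqrt((n_cases + n_controls + 1) / (12 * n_cases * n_controls))
    z = abs(auc - 0.5) / sd
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(z)).
    return erfc(z / sqrt(2))


# Group sizes and AUC values as reported in the text for GSE63060.
p_mci = auc_p_value(0.777, n_cases=80, n_controls=104)
p_ad = auc_p_value(0.683, n_cases=145, n_controls=104)
print(f"MCI vs control: P = {p_mci:.2e}")
print(f"AD vs control:  P = {p_ad:.2e}")
```

Under this assumption the P-values come out at the same order of magnitude as those reported for GSE63060, and an AUC of exactly 0.5 gives P=1. The sketch also makes the cohort-size point explicit: for a fixed AUC, the standard deviation under the null shrinks as the groups grow, so larger cohorts yield stronger evidence.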

Looking ahead

The peripheral blood GSE63060 data mining example based on the expression of five candidate genes presented in Figure 1 is definitely inadequate for serving as an early diagnostic for sporadic AD detection. We present it merely to illuminate the potential of transcriptomic data mining in discovering early sporadic AD biomarkers that may, in future, be utilized for population-wide screening. Table I demonstrates that at this time, our illustrative example lacks validation owing to the lack of sufficiently large open GEO datasets from MCI patients. GSE63060 includes transcriptomic data from merely 80 MCI and 104 matched nondemented individuals; as discussed above, much larger cohorts (at least several hundreds of individuals) would be needed for developing prediction tools for sporadic AD with robust specificity and sensitivity. For example, a recent transcriptome-wide association study included breast tissue RNA from 229 000 women (122 977 cases and 105 974 controls) for identifying new candidate susceptibility genes for breast cancer.[69] Having large cohort datasets openly shared on the GEO website would allow the application of machine learning tools[70,71] for devising reliable and affordable diagnostic tools for AD as well as other disorders of old age. Ideally, such datasets should incorporate blood transcriptomics (or microRNA profiling of plasma or serum) along with proteomic and metabolomic data.[72,73] In addition, efforts must include the collection of longitudinal blood samples as well as of longitudinal cognitive score data, along the lines of the successful UK Biobank approach. With increased projected human longevity, and the expected approval and marketing of disease-modifying therapeutics for neurodegenerative disorders, investing in such studies and making their data openly available seems to be the best way forward.

Table I. AUC and P-values derived from GEO datasets following ROC curve analysis by the 5-gene example tool presented in Figure 1.

GSE#	Reference	RNA source	MCI (n)	AD (n)	Control (n)	MCI vs control	AD vs control	Comments
GSE63060	63	whole blood	80	145	104	AUC=0.777, P=1.20E-10	AUC=0.683, P=8.8E-7	
GSE63061	63	whole blood	109	139	134	AUC=0.638, P=2.2E-4	AUC=0.627, P=2.7E-4	
GSE85426	NA	PBMC	0	90	90	NA	AUC=0.472, P=0.524	No INPP4B probe
GSE4229	68	PBMC	0	18	22	NA	AUC=0.500, P=1.000	No SIRT1 probe

71 in total

1. Deciphering mRNA Sequence Determinants of Protein Production Rate.

Authors: Juraj Szavits-Nossan; Luca Ciandrini; M Carmen Romano
Journal: Phys Rev Lett Date: 2018-03-23 Impact factor: 9.161

Review 2. The vascular facet of late-onset Alzheimer's disease: an essential factor in a complex multifactorial disorder.

Authors: Yasser Iturria-Medina; Vladimir Hachinski; Alan C Evans
Journal: Curr Opin Neurol Date: 2017-12 Impact factor: 5.710

Review 3. Lost in Translation? Finding Our Way To Effective Alzheimer's Disease Therapies.

Authors: Joseph F Quinn
Journal: J Alzheimers Dis Date: 2018 Impact factor: 4.472

Review 4. A Review of Fluid Biomarkers for Alzheimer's Disease: Moving from CSF to Blood.

Authors: Kaj Blennow
Journal: Neurol Ther Date: 2017-07-21

5. Early diagnosis of mild cognitive impairment and Alzheimer's disease based on salivary lactoferrin.

Authors: Eva Carro; Fernando Bartolomé; Félix Bermejo-Pareja; Alberto Villarejo-Galende; José Antonio Molina; Pablo Ortiz; Miguel Calero; Alberto Rabano; José Luis Cantero; Gorka Orive
Journal: Alzheimers Dement (Amst) Date: 2017-05-26

6. SIRT1, miR-132 and miR-212 link human longevity to Alzheimer's Disease.

Authors: A Hadar; E Milanesi; M Walczak; M Puzianowska-Kuźnicka; J Kuźnicki; A Squassina; P Niola; C Chillotti; J Attems; I Gozes; D Gurwitz
Journal: Sci Rep Date: 2018-05-31 Impact factor: 4.379

7. CSF biomarkers of Alzheimer's disease concord with amyloid-β PET and predict clinical progression: A study of fully automated immunoassays in BioFINDER and ADNI cohorts.

Authors: Oskar Hansson; John Seibyl; Erik Stomrud; Henrik Zetterberg; John Q Trojanowski; Tobias Bittner; Valeria Lifke; Veronika Corradini; Udo Eichenlaub; Richard Batrla; Katharina Buck; Katharina Zink; Christina Rabe; Kaj Blennow; Leslie M Shaw
Journal: Alzheimers Dement Date: 2018-03-01 Impact factor: 21.566

8. Expression profiling: a cost-effective biomarker discovery tool for the personal genome era.

Authors: David Gurwitz
Journal: Genome Med Date: 2013-05-14 Impact factor: 11.117

9. Whole-exome sequencing in 20,197 persons for rare variants in Alzheimer's disease.

Authors: Neha S Raghavan; Adam M Brickman; Howard Andrews; Jennifer J Manly; Nicole Schupf; Rafael Lantigua; Charles J Wolock; Sitharthan Kamalakaran; Slave Petrovski; Giuseppe Tosto; Badri N Vardarajan; David B Goldstein; Richard Mayeux
Journal: Ann Clin Transl Neurol Date: 2018-05-24 Impact factor: 4.511

10. Synaptic proteins in CSF as potential novel biomarkers for prognosis in prodromal Alzheimer's disease.

Authors: Flora H Duits; Gunnar Brinkmalm; Charlotte E Teunissen; Ann Brinkmalm; Philip Scheltens; Wiesje M Van der Flier; Henrik Zetterberg; Kaj Blennow
Journal: Alzheimers Res Ther Date: 2018-01-15 Impact factor: 6.982
