Mariet Allen1, Minerva M Carrasquillo1, Cory Funk2, Benjamin D Heavner2, Fanggeng Zou1, Curtis S Younkin3, Jeremy D Burgess1, High-Seng Chai4, Julia Crook2, James A Eddy2, Hongdong Li2, Ben Logsdon5, Mette A Peters5, Kristen K Dang5, Xue Wang3, Daniel Serie3, Chen Wang4, Thuy Nguyen1, Sarah Lincoln1, Kimberly Malphrus1, Gina Bisceglio1, Ma Li1, Todd E Golde6, Lara M Mangravite5, Yan Asmann2, Nathan D Price2, Ronald C Petersen7, Neill R Graff-Radford8, Dennis W Dickson1, Steven G Younkin1, Nilüfer Ertekin-Taner1,8.
Abstract
Previous genome-wide association studies (GWAS), conducted by our group and others, have identified loci that harbor risk variants for neurodegenerative diseases, including Alzheimer's disease (AD). Human disease variants are enriched for polymorphisms that affect gene expression, including some that are known to associate with expression changes in the brain. Postulating that many variants confer risk to neurodegenerative disease via transcriptional regulatory mechanisms, we have analyzed gene expression levels in the brain tissue of subjects with AD and related diseases. Herein, we describe our collective datasets, comprising GWAS data from 2,099 subjects; microarray gene expression data from 773 brain samples, 186 of which also have RNAseq; and an independent cohort of 556 brain samples with RNAseq. We expect that these datasets, which are available to all qualified researchers, will enable investigators to explore and identify transcriptional mechanisms contributing to neurodegenerative diseases.
Year: 2016 PMID: 27727239 PMCID: PMC5058336 DOI: 10.1038/sdata.2016.89
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Meta-data for each of the four studies.
| A. Mayo LOAD GWAS (Data Citation 2) | LOAD Case control GWAS. Uses samples from 3 cohorts: Total 2,099 subjects (Post-QC). This data is used to identify loci associated with LOAD risk. | Mayo Clinic Jacksonville (JS)/Antemortem | Clinical: AD Cases and Controls, collected at Mayo Clinic Jacksonville. Age at first diagnosis of AD or age at study entry: 60–80. | LOAD GWAS Genotypes, demographics | Illumina Hap 300 | Carrasquillo | |
| Mayo Clinic Rochester (RS)/Antemortem | Clinical: AD Cases and Controls, collected at Mayo Clinic Rochester. Age at first diagnosis of AD or age at study entry: 60–80. | ||||||
| Mayo Clinic Brain Bank (AUT)/ Postmortem | Post-mortem: AD Cases (Braak ≥4.0) and Other Pathologies (Braak ≤2.5). Age at death: 60–80. | ||||||
| B. Mayo eGWAS (Data Citations 3,4) | WG-DASL gene expression measures for a subset of Mayo Brain Bank subjects that were included in the Mayo LOAD GWAS: RNA was isolated from two brain regions: TCX and CER. This data is utilized to identify loci associated with brain gene expression in subjects with AD, subjects with Other brain pathologies that do not meet criteria for AD (Non-AD), and the combined cohort. | Mayo Brain Bank/Temporal Cortex | Post-mortem: AD Cases (Braak ≥4.0) and Other Pathologies (Braak ≤2.5). Age at death: 60–80. | Gene expression phenotypes, eGWAS results, covariates | Illumina WG-DASL | Zou | |
| Mayo Brain Bank/Cerebellum | |||||||
| C. Mayo Pilot RNAseq (Data Citation 5) | RNAseq gene expression measures for a subset of Mayo Brain Bank subjects that were included in the Mayo LOAD GWAS: RNA was isolated from TCX. This data is utilized to identify loci associated with brain gene expression in subjects with AD and subjects with PSP. | Mayo Brain Bank/Temporal Cortex | Post-mortem: AD Cases (Braak ≥4.0) and pathologic diagnosis of PSP (Braak≤2.5). Age at death: 60–80. | Gene expression phenotypes, covariates | IlluminaHiSeq2000, 50 bp, paired end RNAseq | Allen | |
| D. Mayo RNAseq (Data Citations 6,7) | RNAseq gene expression measures for subjects from the Mayo Brain Bank non-overlapping with the Mayo LOAD GWAS, and also from Banner Sun Health Institute. RNA was isolated from two brain regions: TCX and CER. This data is utilized to compare brain gene expression between different pairwise diagnostic groups. | Mayo Brain Bank and Banner Sun Health/Temporal Cortex | Post-mortem: AD Cases (Braak ≥4.0), pathologic diagnoses of PSP (Braak ≤3), pathologic aging (Braak ≤3) and elderly control brains (Braak ≤3) without neurodegenerative diagnoses. Age at death ≥60. | Gene expression phenotypes, covariates | IlluminaHiSeq2000, 101 bp, paired end RNAseq | NA | |
| Mayo Brain Bank and Banner Sun Health/Cerebellum |
Figure 1. Overview of the relationship of the four genomic datasets herein described.
Demographics for the cohorts included in each of the four studies.
| 74.0±4.8 (60–80) | 73.2±4.4 (60–80) | 73.6±5.5 (60–80) | 71.6±5.6 (60–80) | 73.6±5.6 (60–80) | 71.7±5.5 (60–80) | 74.1±5.7 (60–80) | 71.9±5.4 (60–80) | |
| 549/277/18 (65%) | 344/889/22 (27%) | 123/79/0 (61%) | 49/146/2 (25%) | 126/71/0 (64%) | 45/130/2 (25%) | 58/36/0 (62%) | 20/72/0 (22%) | |
| 482 (57%) | 641 (51%) | 108 (53%) | 78 (40%) | 101 (51%) | 63 (36%) | 41 (44%) | 37 (40%) | |
| NA | NA | 6.3±0.9 (5–9) | 6.9±1.0 (5–9.3) | 7.2±1.0 (5–9.4) | 7.2±1.0 (5–9) | 7.0±0.7 (6.2–9) | 7.0±0.9 (5.7–9.3) | |
Description of data files deposited in the AMP-AD Knowledge Portal.
| The Genotype and Covariate data from the Mayo LOAD GWAS study (A) and the eSNP results under the Mayo eGWAS study (B) were previously made available through the NIAGADS repository: | |||||||
|---|---|---|---|---|---|---|---|
| A. Mayo LOAD GWAS (Data Citation 2) | 10.7303/syn2910256 | Genotype and Covariate | MayoLOADGWAS_SNPGenotypes_covariates.csv | 10.7303/syn3205821.6 | Covariate information for subjects included in Mayo LOAD GWAS, comma delimited text file | FID=Family ID; IID=Individual ID; RSGWAS=‘1’ indicates that the sample is part of the Mayo Clinic Rochester Antemortem Cohort; AUTGWAS=‘1’ indicates that the sample is part of the Mayo Clinic Brain Bank Postmortem Cohort; AgeOver60=number of years beyond the age of 60 for age at death (Mayo Clinic Brain Bank Postmortem Cohort) or age at diagnosis (Mayo Clinic Rochester and Jacksonville Antemortem Cohorts); Sex=1 if Male; APOE4_Dose (+/−)=‘1’ indicates carriers of at least one copy of the APOEε4 allele; APOE4_dosage (0,1,2)=number of APOEε4 alleles; APOE_Genotype=APOE genotype calls. For all columns ‘−9’ indicates missing data. | Carrasquillo |
| MayoLOADGWAS_SNPGenotypes.bed | 10.7303/syn3205812.4 | LOAD GWAS binary ped file (PLINK format), genotype information | Binary File, No Column Headers. | ||||
| MayoLOADGWAS_SNPGenotypes.bim | 10.7303/syn3205814.4 | LOAD GWAS binary map file (PLINK format), variant information | Binary File, No Column Headers. | ||||
| MayoLOADGWAS_SNPGenotypes.fam | 10.7303/syn3205816.4 | LOAD GWAS fam file (PLINK format), pedigree and phenotype information | Six columns: Column 1=Family ID; Column 2=Individual ID; Column 3=Paternal ID; Column 4=Maternal ID; Column 5=Sex (1=male; 2=female); Column 6=Phenotype (1=control; 2=AD case) | ||||
| B. Mayo eGWAS (Data Citation 3, Data Citation 4) | 10.7303/syn3157225 | Array Expression and Covariate (Data Citation 3) | MayoEGWAS_arrayExpression_CBE_covariates.csv | 10.7303/syn3256502.1 | Covariate information for subjects with cerebellum expression measures, comma delimited text file. | Dxn=Diagnosis (0=Non-AD, 1=AD); Sex=1 if female and 0 if male; Age=Age at Death; E4dose=Number of APOEε4 alleles; plate1-plate4 (cerebellum) or plate 2-plate 5 (temporal cortex)=PCR plate - technical covariate; RIN=RNA integrity number - a numerical assessment of the integrity of RNA; RINsqAdj=(RIN-RINmean)^2 - a statistical adjustment of the RIN. | Zou |
| MayoEGWAS_arrayExpression_TCX_covariates.csv | 10.7303/syn3617056.1 | Covariate information for subjects with temporal cortex expression measures, comma delimited text file. | |||||
| MayoEGWAS_arrayExpression_CBE.csv | 10.7303/syn3256501.1 | Gene expression phenotypes from cerebellum tissue samples, comma delimited text file. | FID=Family ID; IID=Individual ID; ILMN_1762337 to ILMN_2137536=WG-DASL Illumina Probe ID. | ||||
| MayoEGWAS_arrayExpression_TCX.csv | 10.7303/syn3617054.1 | Gene expression phenotypes from temporal cortex tissue samples, comma delimited text file. | |||||
| 10.7303/syn3157249 | eSNP Results (Data Citation 4) | MayoEGWAS_analysis_eQTL_CBE_results_AD.gz | 10.7303/syn3207163.3 | Brain expression GWAS (eGWAS) results obtained using Hap300 genotypes in the cerebellar samples of the AD subjects, text file with 444,372 rows. | CHR=Chromosome; SNP=SNP rs number; BP=Physical position (base-pair) according to NCBI Ref Seq, Build 36.2; A1=Tested Allele; TEST=The ‘additive’ SNP genotype test; NMISS=Number of non-missing individuals included in analysis; BETA=Regression coefficient. Based on the SNP minor allele using an additive model; STAT=Coefficient t-statistic; P=Asymptotic p-value for t-statistic (uncorrected); PROBE=WG-DASL Illumina Probe ID; txStart=Starting base pair for the RefSeq gene the brain level of which is tested for associations. (Based on Build 36.2); txEnd=Ending base pair for the RefSeq gene the brain level of which is tested for associations. (Based on Build 36.2); SYMBOL=Gene symbol for the RefSeq gene tested in the eGWAS; hasSNP=If the probe sequence harbors ≥1 SNPs, then this is shown as ‘TRUE’ under the ‘SNP-In-Probe’ column, and ‘FALSE’, otherwise. | ||
| MayoEGWAS_analysis_eQTL_CBE_results_nonAD.gz | 10.7303/syn3207167.3 | Brain expression GWAS (eGWAS) results obtained using Hap300 genotypes in the cerebellar samples of the non-AD subjects, text file with 443,171 rows | |||||
| MayoEGWAS_analysis_eQTL_CBE_results.gz | 10.7303/syn3207165.3 | Brain expression GWAS (eGWAS) results obtained using Hap300 genotypes in the cerebellar samples of the combined AD and non-AD subjects, text file with 443,784 rows | |||||
| MayoEGWAS_analysis_eQTL_TCX_results_AD.gz | 10.7303/syn3207169.3 | Brain expression GWAS (eGWAS) results obtained using Hap300 genotypes in the temporal cortex samples of the AD subjects, text file with 450,813 rows. | |||||
| MayoEGWAS_analysis_eQTL_TCX_results_nonAD.gz | 10.7303/syn3207175.3 | Brain expression GWAS (eGWAS) results obtained using Hap300 genotypes in the temporal cortex samples of the non-AD subjects, text file with 440,065 rows. | |||||
| MayoEGWAS_analysis_eQTL_TCX_results.gz | 10.7303/syn3207173.3 | Brain expression GWAS (eGWAS) results obtained using Hap300 genotypes in the temporal cortex samples of the combined AD and non-AD subjects, text file with 445,356 rows. | |||||
| MayoEGWAS_analysis_eQTL_CBE_imputedResults_AD.gz | 10.7303/syn3207177.3 | Brain expression GWAS (eGWAS) results obtained using HapMap2 imputed genotypes in the cerebellar samples of the AD subjects, text file with 4,427,924 rows. | CHR=Chromosome; SNP=SNP rs number; BP=Physical position (base-pair) according to NCBI Ref Seq, Build 36.2; A1=Tested Allele; TEST=The ‘additive’ SNP genotype test; NMISS=Number of non-missing individuals included in analysis; BETA=Regression coefficient. Based on the SNP minor allele using an additive model; STAT=Coefficient t-statistic; P=Asymptotic p-value for t-statistic (uncorrected); PROBE=WG-DASL Illumina Probe ID; txStart=Starting base pair for the RefSeq gene the brain level of which is tested for associations. (Based on Build 36.2); txEnd=Ending base pair for the RefSeq gene the brain level of which is tested for associations. (Based on Build 36.2); SYMBOL=Gene symbol for the RefSeq gene tested in the eGWAS; hasSNP=If the probe sequence harbors ≥1 SNPs, then this is shown as ‘TRUE’ under the ‘SNP-In-Probe’ column, and ‘FALSE’, otherwise. We also calculated for each probe within each analytic group, percent detection rate above background. Probes that are detected in >12.5%, >25%, >50% and >75% of the subjects in each analytic group are annotated by four separate columns labeled respectively and indicated with ‘TRUE’ and ‘FALSE’ statements. | ||||
| MayoEGWAS_analysis_eQTL_CBE_imputedResults_nonAD.gz | 10.7303/syn3207181.3 | Brain expression GWAS (eGWAS) results obtained using HapMap2 imputed genotypes in the cerebellar samples of the non-AD subjects, text file with 4,419,055 rows | |||||
| MayoEGWAS_analysis_eQTL_CBE_imputedResults.gz | 10.7303/syn3207179.3 | Brain expression GWAS (eGWAS) results obtained using HapMap2 imputed genotypes in the cerebellar samples of the combined AD and non-AD subjects, text file with 4,483,512 rows | |||||
| MayoEGWAS_analysis_eQTL_TCX_imputedResults_AD.gz | 10.7303/syn3207183.3 | Brain expression GWAS (eGWAS) results obtained using HapMap2 imputed genotypes in the temporal cortex samples of the AD subjects, text file with 4,426,363 rows. | |||||
| MayoEGWAS_analysis_eQTL_TCX_imputedResults_nonAD.gz | 10.7303/syn3207187.3 | Brain expression GWAS (eGWAS) results obtained using HapMap2 imputed genotypes in the temporal cortex samples of the non-AD subjects, text file with 4,425,955 rows. | |||||
| MayoEGWAS_analysis_eQTL_TCX_imputedResults.gz | 10.7303/syn3207185.3 | Brain expression GWAS (eGWAS) results obtained using HapMap2 imputed genotypes in the temporal cortex samples of the combined AD and non-AD subjects, text file with 4,484,344 rows. | |||||
| C. Mayo Pilot RNAseq (Data Citation 5) | 10.7303/syn3157268 | RNA-seq Alignment, Expression and Covariate | MayoPilotRNAseq Alzheimers Disease RNAseq BAMs | 10.7303/syn5580964 | Folder of BAM alignment files, 1 file per sample. Sample ID (IID) for each subject is provided in the first 12 digits of each file name. | NA | Allen |
| MayoPilotRNAseq Progressive Supranuclear Palsy RNAseq BAMs | 10.7303/syn5584594 | Folder of BAM alignment files, 1 file per sample. Sample ID (IID) for each subject is provided in the first 12 digits of each file name. | |||||
| MayoPilotRNAseq_RNAseq_TCX_AD_covariates.csv | 10.7303/syn3607480.2 | Covariate information for AD subjects with temporal cortex expression measures, comma delimited text file. | IlluminaSampleID=Individual ID (matches corresponding IID in Mayo LOAD GWAS files); Sex=M if Male and F if Female; Age at death; RIN=RNA integrity number - a numerical assessment of the integrity of RNA; RINsqAdj=(RIN-RINmean)^2 - a statistical adjustment of the RIN; Library Batch=RNAseq library preparation batch; FCC1MR9ACXX - FCD1LUUACXX (AD) or FCD1GH3ACXX -FCC1CDJACXX (PSP)=‘1’ indicates sample was on sequencing flowcell - technical covariate; Flowcell=sequencing flowcell name. | ||||
| MayoPilotRNAseq_RNAseq_TCX_PSP_covariates.csv | 10.7303/syn3607506.1 | Covariate information for PSP subjects with temporal cortex expression measures, comma delimited text file. | |||||
| MayoPilotRNAseq_RNAseq_TCX_AD_geneCounts.tsv | 10.7303/syn3607482.2 | Gene expression phenotypes from AD temporal cortex tissue samples, space delimited text file with 64,254 rows. | Column 1=Ensembl Gene IDs; 1823480105_B - 1796012340_B=Illumina Sample IDs (matches corresponding IID in Mayo LOAD GWAS files). | ||||
| MayoPilotRNAseq_RNAseq_TCX_AD_geneCounts_normalized.tsv | 10.7303/syn3607497.1 | Normalized gene expression phenotypes from AD temporal cortex tissue samples, tab delimited text file with 64,254 rows | |||||
| MayoPilotRNAseq_RNAseq_TCX_AD_transcriptCounts.tsv | 10.7303/syn3607485.2 | Transcript expression phenotypes from AD temporal cortex tissue samples, space delimited text file with 208,245 rows. | Column 1=Ensembl Transcript IDs; 1823480105_B - 1796012340_B=Illumina Sample IDs (matches corresponding IID in Mayo LOAD GWAS files). | ||||
| MayoPilotRNAseq_RNAseq_TCX_AD_transcriptCounts_normalized.tsv | 10.7303/syn3607502.1 | Normalized transcript expression phenotypes from AD temporal cortex tissue samples, tab delimited text file with 208,245 rows. | |||||
| MayoPilotRNAseq_RNAseq_TCX_PSP_geneCounts.tsv | 10.7303/syn3607487.2 | Gene expression phenotypes from PSP temporal cortex tissue samples, space delimited text file with 64,254 rows | Column 1=Ensembl Gene IDs; 1811024586_A - 1811024534_A=Illumina Sample IDs (matches corresponding IID in Mayo LOAD GWAS files). | ||||
| MayoPilotRNAseq_RNAseq_TCX_PSP_geneCounts_normalized.tsv | 10.7303/syn3607513.1 | Normalized gene expression phenotypes from PSP temporal cortex tissue samples, tab delimited text file with 64,254 rows | |||||
| MayoPilotRNAseq_RNAseq_TCX_PSP_transcriptCounts.tsv | 10.7303/syn3607489.2 | Transcript expression phenotypes from PSP temporal cortex tissue samples, space delimited text file with 208,245 rows. | Column 1=Ensembl Transcript IDs; 1811024586_A - 1811024534_A=Illumina Sample IDs (matches corresponding IID in Mayo LOAD GWAS files). | ||||
| MayoPilotRNAseq_RNAseq_TCX_PSP_transcriptCounts_normalized.tsv | 10.7303/syn3607519.1 | Normalized transcript expression phenotypes from PSP temporal cortex tissue samples, tab delimited text file with 208,245 rows. | |||||
| D. Mayo RNAseq (Data Citation 6,Data Citation 7) | 10.7303/syn3163039 | TCX RNA-seq Alignment, Expression and Covariate (Data Citation 6) | MayoRNAseq Temporal Cortex BAMs | 10.7303/syn4894912 | Folder of BAM alignment files, 1 file per sample. Sample ID (IID) for each subject is provided in the first 6 to 8 digits of each file name. | NA | NA |
| MayoRNAseq_RNAseq_TCX_covariates.csv | 10.7303/syn3817650.5 | Covariate information for subjects with temporal cortex expression measures, comma delimited text file. | Sample ID=3 to 5 digit unique sample identifier followed by _TCX; Source=Pathology lab that provided the tissue; Tissue=Brain region sampled; RIN=RNA integrity number - a numerical assessment of the integrity of RNA; Diagnosis=Pathological diagnosis at time of death (AD=Alzheimer's Disease; PSP=Progressive Supranuclear Palsy; Pathologic Aging; Control); Sex=M if Male, F if Female; AgeAtDeath; Flowcell=sequencing flowcell. | ||||
| MayoRNAseq_RNAseq_TCX_geneCounts.tsv | 10.7303/syn4650257.8 | Gene expression phenotypes, temporal cortex tissue samples, space delimited text file with 64,254 rows | Column 1=Ensembl Gene IDs; 11344_TCX to 1047_TCX etc.=Sample IDs (matches corresponding SampleID in Covariate file). | ||||
| MayoRNAseq_RNAseq_TCX_geneCounts_normalized.tsv | 10.7303/syn4650265.4 | Normalized gene expression phenotypes, temporal cortex tissue samples, tab delimited text file with 64,254 rows | |||||
| MayoRNAseq_RNAseq_TCX_transcriptCounts.tsv | 10.7303/syn5600752.2 | Transcript expression phenotypes, temporal cortex tissue samples, space delimited text file with 208,245 rows | Column 1=Ensembl Transcript IDs; 11344_TCX to 1047_TCX etc.=Sample IDs (matches corresponding SampleID in Covariate file). | ||||
| MayoRNAseq_RNAseq_TCX_transcriptCounts_normalized.tsv | 10.7303/syn5600755.2 | Normalized Transcript expression phenotypes, temporal cortex tissue samples, tab delimited text file with 208,245 rows | |||||
| 10.7303/syn5049298 | CER RNA-seq Alignment, Expression and Covariate (Data Citation 7) | MayoRNAseq Cerebellum BAMs | 10.7303/syn5049322 | Folder of BAM alignment files, 1 file per sample. Sample ID for each subject is provided in the first 6 to 8 digits of each file name. | NA | ||
| MayoRNAseq_RNAseq_CBE_covariates.csv | 10.7303/syn5223705.2 | Covariate information for subjects with cerebellum expression measures, comma delimited text file. | Sample ID=3 to 5 digit unique sample identifier followed by _CER; Source=Pathology lab that provided the tissue; Tissue=Brain region sampled; RIN=RNA integrity number - a numerical assessment of the integrity of RNA; Diagnosis=Pathological diagnosis at time of death (AD=Alzheimer's Disease; PSP=Progressive Supranuclear Palsy; Pathologic Aging; Control); Sex=M if Male, F if Female; AgeAtDeath; ApoE=APOE genotype; Flowcell=sequencing flowcell. | ||||
| MayoRNAseq_RNAseq_CBE_geneCounts.tsv | 10.7303/syn5201012.4 | Gene expression phenotypes, cerebellum tissue samples, tab delimited text file with 64,254 rows | Column 1=Ensembl Gene IDs; 1000_CER to 991_CER=Sample IDs (matches corresponding SampleID in Covariate file). | ||||
| MayoRNAseq_RNAseq_CBE_geneCounts_normalized.tsv | 10.7303/syn5201007.4 | Normalized gene expression phenotypes, cerebellum tissue samples, tab delimited text file with 64,254 rows | |||||
| MayoRNAseq_RNAseq_CBE_transcriptCounts.tsv | 10.7303/syn5600773.2 | Transcript expression phenotypes, cerebellum tissue samples, space delimited text file with 208,245 rows | Column 1=Ensembl Transcript IDs; 1000_CER to 991_CER=Sample IDs (matches corresponding SampleID in Covariate file). | ||||
| MayoRNAseq_RNAseq_CBE_transcriptCounts_normalized.tsv | 10.7303/syn5600772.2 | Normalized Transcript expression phenotypes, cerebellum tissue samples, tab delimited text file with 208,245 rows |
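
The file descriptions above give enough detail to sketch how a few of the deposited text files might be read. The following is a minimal, illustrative Python sketch, not the authors' code. It relies only on what the table states: the six-column layout of the PLINK .fam file, the use of '−9' as the missing-data code in the LOAD GWAS covariate file, and the definition RINsqAdj=(RIN-RINmean)^2. The function names (`read_fam`, `read_covariates`, `rin_sq_adj`) are hypothetical, and the code assumes the missing-data code is stored in the file as an ASCII "-9" and that the covariate file carries a header row.

```python
import csv
import io
from statistics import mean

# Assumption: the missing-data code appears in the file as ASCII "-9".
MISSING = "-9"

# Column order of the .fam file, as listed in the table above.
FAM_COLUMNS = ("fid", "iid", "father", "mother", "sex", "phenotype")


def read_fam(text):
    """Parse .fam content: six whitespace-separated columns, no header."""
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(FAM_COLUMNS):
            raise ValueError(f"line {line_no}: expected 6 columns, got {len(fields)}")
        rows.append(dict(zip(FAM_COLUMNS, fields)))
    return rows


def read_covariates(text):
    """Read a comma-delimited covariate file, mapping the missing code to None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (None if value.strip() == MISSING else value.strip())
         for key, value in row.items()}
        for row in reader
    ]


def rin_sq_adj(rins):
    """Compute (RIN - RINmean)^2 for each sample, following the table's definition."""
    rin_mean = mean(rins)
    return [(rin - rin_mean) ** 2 for rin in rins]
```

Each function takes the file contents as a string (for example, the result of `open(path).read()`), so the sketch can be tried on a few pasted lines before any download. Values are kept as strings because the table does not specify numeric types for every column.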