Literature DB >> 31504207

Allele-Specific Transcript Abundance: A Pilot Study in Healthy Centenarians.

Lauren C Tindale1,2, Nina Thiessen1, Stephen Leach1, Angela R Brooks-Wilson1,2.   

Abstract

The genetic basis of healthy aging and longevity remains largely unexplained. One hypothesis as to why long-lived individuals do not appear to have a lower number of common-complex disease variants, is that despite carrying risk variants, they express disease-linked alleles at a lower level than the wild-type alleles. Allele-specific abundance (ASA) is the different transcript abundance of the two haplotypes of a diploid individual. We sequenced the transcriptomes of four healthy centenarians and four mid-life controls. CIBERSORT was used to estimate blood cell fractions: neutrophils were the most abundant source of RNA, followed by CD8+ T cells, resting NK cells, and monocytes. ASA variants were more common in noncoding than coding regions. Centenarians and controls had a comparable distribution of ASA variants by predicted effect, and we did not observe an overall bias in expression toward major or minor alleles. Immune pathways were most highly represented among the gene set that showed ASA. Although we found evidence of ASA in disease-associated genes and transcription factors, we did not observe any differences in the pattern of expression between centenarians and controls in this small pilot study.
© The Author(s) 2019. Published by Oxford University Press on behalf of The Gerontological Society of America.

Entities:  

Keywords:  Genetics; Human aging; Longevity; Transcriptomics

Year:  2020        PMID: 31504207      PMCID: PMC7243586          DOI: 10.1093/gerona/glz188

Source DB:  PubMed          Journal:  J Gerontol A Biol Sci Med Sci        ISSN: 1079-5006            Impact factor:   6.053


Several studies have shown that long-lived individuals do not appear to carry a substantially lower number of common-complex disease risk alleles compared with typical individuals (1–3). It may be that despite carrying risk alleles, they express the disease-related variants at a lower level. Allele-specific abundance (ASA) is the different transcript abundance of the two haplotypes of a diploid individual. ASA occurs across the genome in different biological processes and across multiple tissues (4). ASA can be caused by allele-specific alternative splicing, variation in transcriptional start or stop sites, cis-acting regulatory variants, epigenetic differences, such as imprinting, DNA methylation, chromatin state, and differences in mRNA stability (4). ASA has been shown to be under genetic control using lymphoblastoid cell lines where monozygotic twins were found to have a more similar degree of ASA than unrelated controls (5). Although different tissues have been shown to have important differences in expression, whole blood (WB) analysis has advantages. Peripheral blood is a cost-effective and clinically relevant tissue, it provides an overall view of the body since it comes into contact with almost all organs and tissues (6), and it is useful when looking for biomarkers of biological age (7) and diseases where the tissue of interest is not readily available. WB collected in PAXgene tubes have identified and validated a gene expression signature that distinguished between Alzheimer’s disease (AD) patients and cognitively healthy controls, showing that AD could be detected away from the primary site of the disease (8). Mutations causing expression differences in cardiac-restricted genes for Long QT syndrome, Marfan syndrome, and hypertrophic cardiomyopathy were also identified from WB from PAXgene tubes (9). In a pilot study of four healthy centenarians and four mid-life controls, we hypothesized that healthy centenarians may exhibit an overall pattern of lower expression of disease-associated alleles. This would effectively allow them to favor expression of nonrisk alleles, despite carrying disease-causing variants. Heterozygous SNPs in the transcribed region of a gene may be used as indicators to determine if one copy of the gene is more highly expressed than the other. We tested whether preferential transcript abundance was observed, and whether it was skewed in centenarians versus controls by examining ASA in disease-associated genes and variants.

Method

The Super-Seniors Study

Super-Seniors are aged 85 years and older with no reported history of cancer, cardiovascular disease (CVD), diabetes, major pulmonary disease, or dementia (10). Four Super-Senior centenarians aged 100–104 years (two males, two females), and four controls aged 50–56 years (two males, two females) who were not selected for health, participated in a transcriptome pilot study. Research ethics board (REB) approval was received from the joint Clinical REB of BC Cancer and the University of British Columbia, and the REB of Simon Fraser University. Participants gave written informed consent.

Sample Preparation

WB was collected in PAXgene tubes, which preserve in vivo transcription levels. RNA was extracted using the PAXgene blood miRNA kit (QIAGEN) and globin depleted using the GLOBINclear kit (Thermo Fisher Scientific, MA). Transcriptome and exome libraries were constructed using the Illumina TruSeq stranded mRNA kit and Agilent Technologies SureSelect v4+UTR kit, respectively.

Analysis

Transcriptomes were aligned in a two-step process using bwa-mem 0.7.7a (11) and JAGuaR (12) 2.2: (a) reads were aligned to a genomic plus exon–exon junction reference created using hg38_no_alt reference with annotations from ensembl79 and refseq, and (b) reads that aligned to the exon–exon junction sequence were repositioned into standard genomic coordinates to become large-gapped genomic alignments. Exomes were aligned using bwa-mem 0.7.7a. Variants were called using bcftools mpileup. CIBERSORT (13), an in silico flow cytometry tool, was used to estimate immune cell type abundances from gene expression data. The signature gene file of 22 distinct immune cell types provided by CIBERSORT was used as a reference for gene expression signatures. Exome sequence calling was used to determine heterozygous SNPs, which were then matched to the transcriptome sequences. Autosomal variants were filtered for a read depth (RD) ≥30 and an alternate allele frequency >0.7 or <0.3. Variants with ASA were compared by SNPEff predicted effect (14) and by major or minor allele bias as determined from minor allele frequencies reported in dbSNP build 150 (15). Gene lists were entered into QIAGEN Ingenuity® Pathway Analysis (IPA) to determine enriched pathways and physiological functions. To look for disease associations we compared genes showing ASA to lists of publically available disease associations. We looked for evidence of ASA in the 60 genes recommended by ACMG for reporting of incidental findings (16), as well as variants with SNP-trait associations that reached genome-wide significance listed in the NHGRI-EBI GWAS catalog v1.0.1 (17). Variants with ASA were input into SNP2TFBS to map SNPs to transcription factors (TF) (18).

Results

CIBERSORT estimated the mean major cell types to be neutrophils, followed by CD8+ T cells, monocytes, and resting natural killer (NK) cells (Supplementary Figure S1; Supplementary Table S1). Cell type distribution varied between participants. However, a larger sample size would be required to determine if there is a pattern of differences between centenarians and controls. After transcriptome alignment, there were ~240–320 million mapped reads per sample. Centenarians had a mean of 344,167 (SD = 14,369) transcriptome variants and 130,068 (SD = 10,188) heterozygous exome variants; controls had a mean of 355,220 (SD = 136,358) transcriptome variants and 128,775 (SD = 3,029) heterozygous exome variants. There was a mean of 30,335 (SD = 1,480) matches between the transcriptome and heterozygous exome variants in centenarians and 28,670 (SD = 2,822) matches in controls. After filtering for RD ≥30, there were 16,679 (SD = 1,108) matches in centenarians and 15,000 (SD = 1,967) matches in controls. After filtering for alternate allele frequency >0.7 or <0.3, the final list contained 1,145 (SD = 92) ASA variants in centenarians and 1,048 (SD = 136) ASA variants in controls.

Distribution and Function of Variants With ASA

Variants with ASA were most frequently noncoding: 3′UTR variants were most frequent, followed by intronic variants, downstream gene variants, and then coding missense and synonymous variants (Supplementary Figure S2). There was no distinguishable difference in the proportion of variants with major allele bias versus minor allele bias when comparing centenarians and controls (Figure 1). Among ASA variants with an allelic ratio ≥0.7:0.3, centenarians had 49.1% of variants skewed to the minor allele, compared with 59.9% in controls. At the more extreme ASA variants with an allelic ratio ≥0.9:0.1, centenarians had 53.3% of variants skewed to the minor allele, compared with 51.2% among controls. There was no difference in the proportion of variants skewed to the minor allele between centenarians and controls for allelic ratio ≥0.7:0.3 (χ 2 = 3.4; p = .065) or for allelic ratio ≥0.9:0.1 (χ 2 = 3.33; p = .068). There was no difference in the proportion in % skewed to the minor allele between the allelic ratio ≥0.7:0.3 and the allelic ratio ≥0.9:0.1 sets (χ 2 = 0.06; p = .806).
Figure 1.

(A) Proportion of major and minor allele bias by group. (B) Coding region variants by major and minor allele bias (n = 3,700 coding variants with ASA ≥ 0.7:0.3). (C) Coding region variants by major and minor allele bias with alternate allele frequency >0.9 or <0.1 (n = 360 coding variants with ASA ≥ 0.9:0.1).

(A) Proportion of major and minor allele bias by group. (B) Coding region variants by major and minor allele bias (n = 3,700 coding variants with ASA ≥ 0.7:0.3). (C) Coding region variants by major and minor allele bias with alternate allele frequency >0.9 or <0.1 (n = 360 coding variants with ASA ≥ 0.9:0.1).

Disease Association of Genes and Variants With ASA

Eighty-two SNPs observed to have ASA ≥0.7:0.3 in either Super-Seniors or controls overlapped with SNPs in the NHGRI-EBI GWAS catalog. Trait associations from the catalog were filtered to keep only associations with exclusion criteria diseases for Super-Seniors. Twenty-one disease-associated SNPs remained. Of these, 9 were skewed to the minor allele and 12 were skewed to the major allele. Seven were skewed toward the disease-associated allele, 9 were skewed away from the disease-associated allele, and 5 did not have a clear disease-associated allele (Table 1). Seven SNPs had ASA in multiple subjects, all of which were skewed in the same direction. Among the set of 21 disease-associated SNPs, there were 20 incidences of ASA in centenarians and 13 in controls.
Table 1.

NHGRI-EBI GWAS Catalog SNPs With Evidence of Allele-Specific Expression ≥0.7:0.3 in Centenarians or Controls

Geners IDAssociated Disease/TraitSkewed to Minor Allele?Skewed to GWAS Disease- Associated Allele?CentsControls
MTX1 rs1057941Multiple cancersNoNo10
TRIM66 rs11042023ObesityNoNo01
GSDMB rs11078927AsthmaNoYes21
HIP1 rs1167827Body mass indexNon/a10
SREBF1 rs11868035Parkinson’s diseaseYesNo01
LYSMD4 rs12185079Post bronchodilator FEV1/FVC ratio in COPDYesYes10
SLC25A44 rs2072499Testicular germ cell tumorNoNo01
OR52K2-OR52K1 rs2278170Amyotrophic lateral sclerosisNon/a20
PRC1 rs2290203Breast cancerNoYes10
GSDMB rs2290400Type 1 diabetes, bronchial hyper-responsiveness in asthmaNoNo31
GSDMB rs2305480Asthma, ulcerative colitisNon/a21
TNS1 rs2571445Pulmonary functionYesYes11
CASP8 rs3769823Basal cell carcinomaNoYes10
TRIM8 rs3850699Prostate cancerYesNo01
SFTPD rs721917Chronic obstructive pulmonary diseaseNoNo10
ASAH1 rs7508Atrial fibrillationYesYes11
CD226 rs763361Type 1 diabetesYesn/a01
C9orf72 rs774359Amyotrophic lateral sclerosisYesn/a01
RCCD1 rs79548680Type 2 diabetesYesYes21
RNF207 rs846111QT intervalNoNo10
SDCCAG8 rs953492Diastolic blood pressureYesNo01
Total2013
NHGRI-EBI GWAS Catalog SNPs With Evidence of Allele-Specific Expression ≥0.7:0.3 in Centenarians or Controls There were 2,182 and 2,052 genes where ASA was observed in centenarians or controls, respectively. When the gene list was entered into QIAGEN IPA to determine enriched pathways and physiological functions, the top canonical pathways in centenarians were: antigen presentation; autophagy; crosstalk between dendritic cells and NK cells; CD28 signaling in T helper cells. The top canonical pathways in controls were: antigen presentation; NK cell signaling; Th1 pathway; Th1 and Th2 activation. Top centenarian genes with ASA were defined as genes for which there was evidence of ASA in four centenarians and only one or no controls, or in three centenarians and no controls; and vice versa for top genes with ASA in controls. There were 35 top genes with ASA in centenarians and 23 top genes with ASA in controls (Supplementary Table S2). In QIAGEN IPA, the top canonical pathways for the top genes with ASA in centenarians were, in order: citrulline-nitric oxide cycle; LXR/RXR activation; IL-12 signaling and production in macrophages; superpathway of citrulline metabolism; production of nitric oxide. The top canonical pathways for the top genes with ASA in controls were: LXR/RXR activation; role of osteoblast, osteoclasts, and chondrocytes in rheumatoid arthritis; IL-10 signaling; CCR5 signaling in macrophages. Among the 60 “ACMG” genes, we found that centenarians and controls showed ASA ≥0.7:0.3 in 8 and 11 genes, respectively (Table 2). Of note, in LDLR ASA was observed in one centenarian and all four controls; and in TMEM43 ASA was observed in three centenarians and no controls. All other ASA variants in ACMG genes were unique with the exception of two SNPs in PRKAG2, which were each present in two subjects. One variant had a high predicted functional impact: a TTCT>TTCTCT indel at chr18:31068033 in DSC2.
Table 2.

ACMG Genes With Allele-Specific Abundance ≥0.7:0.3 in Centenarians or Controls

GeneDiseaseCentenariansControls
APC Adenomatous polyposis coli1
BRCA1 Breast-ovarian cancer, familial 11
DSC2 Arrhythmogenic right ventricular cardiomyopathy, type 111
KCNH2 Long QT syndrome 21
LDLR Familial hypercholesterolemia14
MYH11 Aortic aneurysm, familial thoracic 41
NF2 Neurofibromatosis, type 21
PRKAG2 Familial hypertrophic cardiomyopathy 621
SDHAF2 Paragangliomas 21
SMAD3 Loeys–Dietz syndrome type 31
SMAD4 Juvenile polyposis syndrome,1
TGFBR1 Loeys–Dietz syndrome type 1A1
Loeys–Dietz syndrome type 2A
Marfan’s syndrome
TMEM43 Arrhythmogenic right ventricular cardiomyopathy, type 53
TNNI3 Familial hypertrophic cardiomyopathy 71
TPM1 Familial hypertrophic cardiomyopathy 31
TSC1 Tuberous sclerosis 11
VHL Von Hippel–Lindau syndrome1
Total1114
ACMG Genes With Allele-Specific Abundance ≥0.7:0.3 in Centenarians or Controls Six thousand thirty-nine unique variants with rsIDs were entered into SNP2TFBS (15). TF enrichment was determined by the ratio (total TF sites identified by SNP2TFBS): (TFs identified in the input list) (Supplementary Table S3). ZBTB33, Tcfcp2l1, NHLH1, TFAP2C, and Pax2 were the top five most enriched TFs.

Discussion

Overall we found that centenarians and controls had comparable distributions of ASA variants with no difference in proportion by predicted effect. ASA variants were more common in noncoding than coding regions. We did not observe an overall bias in expression toward the major or minor alleles in centenarians or controls. Since this is an exploratory study, to detect allelic ratios with a difference ≥0.7:0.3, samples were sequenced to 330–400 million raw reads per sample. Li and colleagues (19) calculated that at RD 30, there is ~60% statistical power to detect an allelic ratio of 0.7:0.3, ~90% power to detect an allelic ratio of 0.8:0.2, and nearly 100% power to detect an allelic ratio of 0.9:0.1. They further calculated that to achieve RD 30 in ~90% and ~95% of exonic variants would require ~190 million and ~240 million reads, respectively. Immune pathways were most highly represented among the gene set that showed ASA in both centenarians and controls. We cannot determine whether this observation is related to aging; however, the presence of ASA in genes in these pathways is nonetheless of interest. The occurrence of immune dysfunction with aging is well established, including increased susceptibility to infection, frequency of neoplasia, inflammation, and autoimmune response (20). A common factor underlying the pathogenesis of many age-related chronic diseases, including CVD, cancer, neurodegenerative diseases, and diabetes is inflammation (21). It has been previously suggested that centenarians may have genetic factors that allow them to better maintain immune function as they age (22,23). When looking at ACMG genes with potential clinical utility (16), we found evidence of ASA in some genes related to disease etiology. Although it may be an artifact of small sample size, LDLR, for which ASA was observed in one centenarian and all four controls, and TMEM43, for which ASA was observed in three centenarians and no controls, are potentially interesting to examine in a larger study. LDLR is associated with familial hypercholesterolemia and is a major apoE receptor; variants of the APOE gene are the most replicated association with longevity and healthy aging phenotypes (24) including the Super-Senior phenotype (25). TMEM43 is associated with arrhythmogenic right ventricular cardiomyopathy, type 5. ASA variants also overlapped with disease-associated NHGRI-EBI GWAS catalogue variants and were slightly more prevalent in centenarians; this difference should be explored further in a larger study. It is possible that among patients with disorders associated with these genes, even when a pathogenic variant is not identified, ASA may play a role in disease pathogenicity if a cis-acting factor is affecting gene expression. There was evidence that ASA variants overlapped with numerous TFs, which are possible mechanisms for cis-acting affects. ASA can be caused by numerous mechanisms that cannot be distinguished in the present study. It is possible that some of our observed cases of ASA could be the result of alternative transcripts that have been collapsed into single loci. ASA differences between blood cell types could confound our analyses. By using a RD 30 cutoff we are unable to examine ASA in low-abundance genes. A caveat to the ASA observed is that it may not reflect true in vivo RNA expression levels. Limitations of this study include that we cannot determine if ASA differences are associated with healthy aging, or are a consequence of getting old. As a pilot study, the sample size was very small, and may lead to false negative findings. We did not find evidence for a pattern of lower expression of disease-associated variants in centenarians compared to mid-life controls. The presence of ASA in some ACMG genes suggests that in patients with disorders that are highly associated with a certain gene, but for which no causal variant is apparent, looking at ASA may be of interest. As well, the representation of immune pathways among the ASA gene set may be related to decreased immune function with aging.

Funding

This work was supported by the Lotte and John Hecht Memorial Foundation. Click here for additional data file.
  25 in total

1.  dbSNP: the NCBI database of genetic variation.

Authors:  S T Sherry; M H Ward; M Kholodov; J Baker; L Phan; E M Smigielski; K Sirotkin
Journal:  Nucleic Acids Res       Date:  2001-01-01       Impact factor: 16.971

Review 2.  Genome-wide allele-specific analysis: insights into regulatory variation.

Authors:  Tomi Pastinen
Journal:  Nat Rev Genet       Date:  2010-06-22       Impact factor: 53.242

3.  Whole blood RNA offers a rapid, comprehensive approach to genetic diagnosis of cardiovascular diseases.

Authors:  Todd E Miller; Lijing You; Robert J Myerburg; Paul J Benke; Nanette H Bishopric
Journal:  Genet Med       Date:  2007-01       Impact factor: 8.822

4.  Burden of Common Complex Disease Variants in the Exomes of Two Healthy Centenarian Brothers.

Authors:  Lauren C Tindale; Andy Zeng; Karla L Bretherick; Stephen Leach; Nina Thiessen; Angela R Brooks-Wilson
Journal:  Gerontology       Date:  2015-06-10       Impact factor: 5.140

5.  A gene expression pattern in blood for the early detection of Alzheimer's disease.

Authors:  Birgitte Boonstra Booij; Torbjørn Lindahl; Peter Wetterberg; Nina Voss Skaane; Solve Sæbø; Guri Feten; Phil D Rye; Lena Iren Kristiansen; Nina Hagen; Marianne Jensen; Ken Bårdsen; Bengt Winblad; Praveen Sharma; Anders Lönneborg
Journal:  J Alzheimers Dis       Date:  2011       Impact factor: 4.472

Review 6.  The search for longevity and healthy aging genes: insights from epidemiological studies and samples of long-lived individuals.

Authors:  Joanne M Murabito; Rong Yuan; Kathryn L Lunetta
Journal:  J Gerontol A Biol Sci Med Sci       Date:  2012-04-12       Impact factor: 6.053

7.  Identification of allele-specific alternative mRNA processing via transcriptome sequencing.

Authors:  Gang Li; Jae Hoon Bahn; Jae-Hyung Lee; Guangdun Peng; Zugen Chen; Stanley F Nelson; Xinshu Xiao
Journal:  Nucleic Acids Res       Date:  2012-03-29       Impact factor: 16.971

8.  Burden of disease variants in participants of the Long Life Family Study.

Authors:  Meredith Stevenson; Harold Bae; Nicole Schupf; Stacy Andersen; Qunyuan Zhang; Thomas Perls; Paola Sebastiani
Journal:  Aging (Albany NY)       Date:  2015-02       Impact factor: 5.682

9.  The Super-Seniors Study: Phenotypic characterization of a healthy 85+ population.

Authors:  Julius Halaschek-Wiener; Lauren C Tindale; Jennifer A Collins; Stephen Leach; Bruce McManus; Kenneth Madden; Graydon Meneilly; Nhu D Le; Joseph M Connors; Angela R Brooks-Wilson
Journal:  PLoS One       Date:  2018-05-24       Impact factor: 3.240

10.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.