Sergey V Lobanov, Branduff McAllister, Mia McDade-Kumar, G Bernhard Landwehrmeyer, Michael Orth, Anne E Rosser, Jane S Paulsen, Jong-Min Lee, Marcy E MacDonald, James F Gusella, Jeffrey D Long, Mina Ryten, Nigel M Williams, Peter Holmans, Thomas H Massey, Lesley Jones.
Abstract
Huntington's disease (HD) is caused by an expanded CAG tract in HTT. The length of the CAG tract accounts for over half the variance in age at onset of disease, and age at onset is also influenced by other genetic factors, mostly implicating the DNA maintenance machinery. We examined a single nucleotide variant, rs79727797, on chromosome 5 in the TCERG1 gene, previously reported to be associated with Huntington's disease, and a quasi-tandem repeat (QTR) hexamer in exon 4 of TCERG1 with a central pure repeat. We developed a method for calling perfect and imperfect repeats from exome-sequencing data, and tested association between the QTR in TCERG1 and residual age at motor onset (after correcting for the effects of CAG length in the HTT gene) in 610 individuals with Huntington's disease via regression analysis. We found a significant association between age at onset and the sum of the repeat lengths from both alleles of the QTR (p = 2.1 × 10⁻⁹), with each added repeat hexamer reducing age at onset by one year (95% confidence interval [0.7, 1.4]). This association explained that previously observed with rs79727797. The association with age at onset in the genome-wide association study is due to a QTR hexamer in TCERG1, translated to a glutamine/alanine tract in the protein. We could not distinguish whether this was due to cis-effects of the hexamer repeat on gene expression or of the encoded glutamine/alanine tract in the protein. These results motivate further study of the mechanisms by which TCERG1 modifies onset of HD.
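The regression described in the abstract can be illustrated with a minimal sketch: residual age at onset is regressed on the sum of the two allelic QTR lengths, and the slope is read as years per added hexamer. The helper `ols_slope_intercept` and the toy records are hypothetical and are not study data; the toy values were deliberately built to lie on a line with slope −1 so that the output mirrors the reported effect size of about one year per hexamer. The authors' actual model and covariates are not given here.

```python
from statistics import mean


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical toy records (NOT study data):
# (QTR length allele 1, QTR length allele 2, residual age at onset in years)
toy = [
    (38, 38, 0.0),
    (38, 40, -2.0),
    (35, 38, 3.0),
    (36, 38, 2.0),
    (40, 40, -4.0),
    (35, 35, 6.0),
]

n_sum = [a + b for a, b, _ in toy]   # sum of both alleles' repeat lengths
residual = [r for _, _, r in toy]    # onset residual after CAG-length correction

slope, intercept = ols_slope_intercept(n_sum, residual)
print(f"slope = {slope:.2f} years per added hexamer")
```

A negative slope means that longer combined repeat length goes with earlier onset than CAG length alone predicts.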
Year: 2022 PMID: 36064847 PMCID: PMC9445028 DOI: 10.1038/s41525-022-00317-w
Source DB: PubMed Journal: NPJ Genom Med ISSN: 2056-7944 Impact factor: 6.083
Fig. 1 The relationship of rs79727797 to the CAGGCC hexanucleotide short tandem repeat in TCERG1.
a The sequence of the tandem repeat region in exon 4 of TCERG1 (orange). The blue polygon bounds the quasi-tandem repeat (QTR), the central part of which contains a pure repeat, the CAGGCC hexanucleotide short tandem repeat (STR). b The TCERG1 protein domains and location of the repeat tract. c The variant alleles seen at the tandem repeat locus arranged in descending order of prevalence.
Hexanucleotide repeat allele frequencies in TCERG1.
| Allele | QTR length N | QTR length ΔN | STR length N | STR length ΔN | Number of alleles | Allele frequency (%) |
|---|---|---|---|---|---|---|
| A1 | 38 | 0 | 6 | 0 | 1114 | 91.31 |
| A2 | 35 | −3 | 3 | −3 | 50 | 4.10 |
| A3 | 36 | −2 | 4 | −2 | 28 | 2.30 |
| A4 | 40 | +2 | 8 | +2 | 24 | 1.97 |
| A5 | 34 | −4 | 4 | −2 | 1 | 0.08 |
| A6ᵃ | 38 | 0 | 6 | 0 | 1 | 0.08 |
| A7 | 39 | +1 | 7 | +1 | 1 | 0.08 |
| A8 | 39 | +1 | 6 | 0 | 1 | 0.08 |

QTR quasi-tandem repeat, STR short tandem repeat.
ᵃ Allele A6 differs from the reference allele, A1, by a synonymous SNV (see Fig. 1c).
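The frequency column can be checked against the allele counts with a short calculation. This is only an arithmetic sketch: the counts are copied from the table, and the variable names are illustrative. The total of 1220 alleles matches two alleles for each of the 610 individuals in the cohort.

```python
# Allele counts copied from the table above.
counts = {
    "A1": 1114, "A2": 50, "A3": 28, "A4": 24,
    "A5": 1, "A6": 1, "A7": 1, "A8": 1,
}

total = sum(counts.values())  # two alleles per individual: 2 * 610

# Frequency in percent, rounded to two decimals as in the table.
freq = {allele: round(100 * n / total, 2) for allele, n in counts.items()}

for allele, pct in freq.items():
    print(f"{allele}: {pct:.2f}%")
```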
Fig. 2 TCERG1 tandem repeat genotype counts and associated mean residual ages at onset.
a Quasi-tandem repeat (QTR) genotypes. b Short tandem repeat (STR) genotypes. Black numbers mark genotype counts. Red and blue numbers indicate mean residual ages at onset for individual genotypes, early onset in red, late onset in blue.
Fig. 3 The relationship between hexanucleotide quasi-tandem repeat (QTR) length and residual age at onset of HD.
a–c Histograms showing the distribution of the sum of two QTR repeat lengths, N_sum = N_min + N_max, for the groups with early (red, R < −R_thr) and late (blue, R > R_thr) onset. Panels a, b, and c correspond to residual age at onset thresholds R_thr of 0, 13, and 20 years, respectively. d Association of the sum of two QTR repeat lengths, N_sum, with the residual age at onset for the entire HD cohort. Red pluses indicate the mean residual age at onset for every sum of QTR repeat lengths. Grey and black dashed lines are plotted using coefficients of the linear regression analysis and regression with selection.
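The grouping behind panels a–c can be sketched as follows: each individual is assigned to the early group if the residual is below minus the threshold, to the late group if it is above the threshold, and to neither otherwise, and the sum of the two QTR lengths is then tallied per group. The function names and the toy records are hypothetical and are not study data; only the inequalities and the thresholds 0, 13, and 20 years come from the caption.

```python
from collections import Counter


def qtr_sum(allele_lengths):
    """Sum of the shorter and longer QTR allele lengths for one individual."""
    return min(allele_lengths) + max(allele_lengths)


def onset_group(residual, r_thr):
    """Return 'early', 'late', or None for a residual age at onset (years)."""
    if residual < -r_thr:
        return "early"
    if residual > r_thr:
        return "late"
    return None  # within the threshold band: excluded from both histograms


def histograms(records, r_thr):
    """Tally QTR length sums separately for the early and late onset groups."""
    tallies = {"early": Counter(), "late": Counter()}
    for alleles, residual in records:
        group = onset_group(residual, r_thr)
        if group is not None:
            tallies[group][qtr_sum(alleles)] += 1
    return tallies


# Hypothetical toy records (NOT study data): ((allele 1, allele 2), residual)
toy = [((38, 38), 1.5), ((38, 40), -14.0), ((35, 38), 21.0), ((38, 38), -3.0)]

for r_thr in (0, 13, 20):  # thresholds used in panels a, b, and c
    print(r_thr, histograms(toy, r_thr))
```

Raising the threshold keeps only individuals with more extreme residuals, so the two histograms shrink as the threshold grows.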
Fig. 4 Locus zoom plot showing the relationship of the rs79727797 association with residual age at onset to that of the sum of two quasi-tandem repeat (QTR) lengths (black cross) in 468 subjects with both single nucleotide variant (SNV) and sequencing data.
The associations of age at onset with the sum of STR (red cross) and QTR (blue cross) repeat lengths in all 610 subjects are also shown. The bar on the right of the plot indicates the strength of linkage disequilibrium (r²) between each SNV and the tandem repeat. The p-value threshold for genome-wide significance (5 × 10⁻⁸) is shown with a black dashed line.
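As a small worked example, the reported QTR p-value can be compared with the genome-wide significance threshold on a −log10 scale. This assumes the plot follows the usual convention of showing −log10(p) on the vertical axis, which the caption does not state; the variable names are illustrative.

```python
import math

p_qtr = 2.1e-9        # p-value reported in the abstract for the QTR length sum
p_threshold = 5e-8    # genome-wide significance threshold from the caption

# On a -log10 scale, smaller p-values plot higher.
y_qtr = -math.log10(p_qtr)
y_threshold = -math.log10(p_threshold)

print(f"QTR sum: {y_qtr:.2f}, threshold: {y_threshold:.2f}")
print("exceeds genome-wide significance:", p_qtr < p_threshold)
```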