Jessica Dawson1, Fiona K Baine-Savanhu1, Marc Ciosi2, Alastair Maxwell2, Darren G Monckton2, Amanda Krause1.
Abstract
Huntington disease (HD) is a dominantly inherited neurodegenerative disorder caused by the expansion of a polyglutamine-encoding CAG repeat in the huntingtin gene (HTT). Recently, it has been established that disease severity in HD is best predicted by the number of pure CAG repeats rather than the total number of glutamines encoded. Along with the discovery of DNA repair gene variants as trans-acting modifiers of HD severity, these data reveal somatic expansion of the CAG repeat as a key driver of HD onset. Using high-throughput DNA sequencing, we have determined the precise sequence and somatic expansion profiles of the HTT repeat tract of 68 HD-affected and 158 HD-unaffected African ancestry individuals. A high level of HTT repeat sequence diversity was observed, with three likely African-specific alleles identified. In the most common disease allele (30 out of 68), the typical proline-encoding CCGCCA sequence was absent. This CCGCCA-loss disease allele was associated with an age of diagnosis approximately 7.1 years earlier and occurred exclusively on haplotype B2. Although somatic expansion was associated with an earlier age of diagnosis in the study overall, the CCGCCA-loss disease allele displayed reduced somatic expansion relative to the typical HTT expansions in blood DNA. We propose that the CCGCCA loss occurring on haplotype B2 is an African cis-acting modifier that appears to alter the age at diagnosis of HD through a mechanism that is not driven by somatic expansion. The assessment of a group of individuals from an understudied population has highlighted population-specific differences that emphasize the importance of studying genetically diverse populations in the context of disease.
Keywords: African ancestry; CAG repeat; CCGCCA loss; Huntington disease; cis-acting modifier; genetically diverse
Year: 2022 PMID: 35935919 PMCID: PMC9352962 DOI: 10.1016/j.xhgg.2022.100130
Source DB: PubMed Journal: HGG Adv ISSN: 2666-2477
Figure 1. The HTT disease and non-disease allele structures in African ancestry individuals
Schematic representation of the HTT disease and non-disease allele structures defined for this study. The typical allele structures were grouped together as Q1-2-2-P2-2, while the atypical allele structures are shown individually so that deviations from the reference allele structure are clear.
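The allele-structure names used throughout (for example Q1-2-0-9-2) can be read as four counts following the CAG tract. The sketch below parses such a name and applies the grouping visible in the summary table, where every typical allele has the form Q1-2-2-N-2. The field names (`caacag`, `ccgcca`, `ccg`, `cct`) are an assumption based on the usual layout of the HTT repeat region, not labels taken from this record, and the helper functions are hypothetical.

```python
import re
from typing import NamedTuple


class AlleleStructure(NamedTuple):
    # Field names are assumed labels for the four counts after "Q1".
    caacag: int
    ccgcca: int
    ccg: int
    cct: int


# An optional leading asterisk marks novel alleles in the summary table.
_PATTERN = re.compile(r"^\*?Q1-(\d+)-(\d+)-(\d+)-(\d+)$")


def parse_structure(name: str) -> AlleleStructure:
    """Parse a name such as 'Q1-2-0-9-2' into its four counts."""
    cleaned = name.strip().replace("\u2217", "*")  # normalise the asterisk operator
    match = _PATTERN.match(cleaned)
    if match is None:
        raise ValueError(f"not a fully specified allele structure: {name!r}")
    return AlleleStructure(*(int(group) for group in match.groups()))


def is_typical(structure: AlleleStructure) -> bool:
    """Typical alleles follow Q1-2-2-N-2; only the third count varies."""
    return structure.caacag == 2 and structure.ccgcca == 2 and structure.cct == 2


def has_ccgcca_loss(structure: AlleleStructure) -> bool:
    """True when the second count is zero, as in Q1-2-0-9-2."""
    return structure.ccgcca == 0


common_disease = parse_structure("Q1-2-0-9-2")
print(common_disease, is_typical(common_disease), has_ccgcca_loss(common_disease))
```

The grouped label Q1-2-2-P2-2 is deliberately rejected by the parser, since it stands for a set of alleles rather than a single structure.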
Summary of African ancestry HTT disease and non-disease alleles
| Group | Allele structure | CAG range (non-disease) | CAG range (disease) | CAACAG | CCGCCA | CCG | CCT | Non-disease n | Non-disease % | Disease n | Disease % | p value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Typical alleles | Q1-2-2-6-2 | 14–17 | – | 2 | 2 | 6 | 2 | 11 | 2.9 | – | – | 0.384 |
| | Q1-2-2-7-2 | 15–28 | 41–55 | 2 | 2 | 7 | 2 | | | 10 | 14.7 | 0.064 |
| | Q1-2-2-8-2 | 17 | – | 2 | 2 | 8 | 2 | 5 | 1.3 | – | – | 1 |
| | Q1-2-2-9-2 | 15–28 | 40 | 2 | 2 | 9 | 2 | 29 | 7.6 | 1 | 1.5 | 0.066 |
| | Q1-2-2-10-2 | 11–20 | 40–54 | 2 | 2 | 10 | 2 | | | 20 | 29.4 | |
| | Q1-2-2-11-2 | 12–21 | – | 2 | 2 | 11 | 2 | 18 | 4.7 | – | – | 0.089 |
| | Q1-2-2-12-2 | 17 | – | 2 | 2 | 12 | 2 | 1 | 0.3 | – | – | 1 |
| | ∗Q1-2-2-13-2 | 17 | – | 2 | 2 | 13 | 2 | 1 | 0.3 | – | – | 1 |
| Typical alleles subtotal | | | | | | | | 235 | 61.2 | 31 | 45.6 | |
| Atypical alleles | Q1-2-2-4-3 | 23 | – | 2 | 2 | 4 | 3 | 1 | 0.3 | – | – | 1 |
| | Q1-2-2-6-3 | 15–23 | 42–44 | 2 | 2 | 6 | 3 | 2 | 0.5 | | | |
| | Q1-2-2-9-3 | 12–21 | – | 2 | 2 | 9 | 3 | | | – | – | |
| | Q1-2-2-10-3 | 16 | – | 2 | 2 | 10 | 3 | 1 | 0.3 | – | – | 1 |
| | ∗Q1-4-2-4-3 | – | 42 | 4 | 2 | 4 | 3 | – | – | 1 | 1.5 | 0.154 |
| | Q1-4-2-7-3 | 14–19 | – | 4 | 2 | 7 | 3 | 22 | 5.7 | – | – | 0.059 |
| | ∗Q1-4-2-10-2 | 16–19 | – | 4 | 2 | 10 | 2 | 4 | 1.0 | – | – | 1 |
| | Q1-2-0-9-2 | 16–32 | 40–58 | 2 | 0 | 9 | 2 | 27 | 7.0 | | | |
| | Q1-0-0-9-2 | – | 39–46 | 0 | 0 | 9 | 2 | – | – | | | |
| Atypical alleles subtotal | | | | | | | | 149 | 38.8 | 37 | 54.4 | |
The novel allele structures unique to this study are indicated by an asterisk (∗). The most common non-disease and disease allele structures are indicated in underlined italics. The statistically significant frequency differences between the non-disease and disease alleles are indicated in italics (non-disease alleles: Q1-2-2-10-2 p = 0.048; disease alleles: Q1-2-2-6-3 p = 5.587 × 10⁻³, Q1-2-2-9-3 p = 9.142 × 10⁻⁸, and Q1-2-0-9-2 p = 3.119 × 10⁻¹³).
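The percentages in the table are allele counts divided by the 384 non-disease alleles (235 + 149) or the 68 disease alleles. The record does not name the test behind the reported p values, so the sketch below uses a two-sided Fisher's exact test purely as an assumption; it is not expected to reproduce the published values exactly. The counts are those of Q1-2-0-9-2: 27 non-disease alleles from the table and 30 of 68 disease alleles from the abstract.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    # Sum every table that is at most as likely as the observed one.
    p_value = sum(p for p in map(prob, support) if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


def percent(count: int, total: int) -> float:
    """Frequency as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


DISEASE_TOTAL = 68        # disease alleles (abstract)
NON_DISEASE_TOTAL = 384   # 235 typical + 149 atypical non-disease alleles

disease_carriers = 30      # Q1-2-0-9-2, disease alleles (abstract)
non_disease_carriers = 27  # Q1-2-0-9-2, non-disease alleles (table)

p_value = fisher_exact_two_sided(
    disease_carriers,
    DISEASE_TOTAL - disease_carriers,
    non_disease_carriers,
    NON_DISEASE_TOTAL - non_disease_carriers,
)
print(percent(non_disease_carriers, NON_DISEASE_TOTAL))  # 7.0
print(percent(disease_carriers, DISEASE_TOTAL))          # 44.1
print(f"assumed Fisher's exact test: p = {p_value:.3g}")
```

Exact integer arithmetic via `math.comb` avoids overflow for tables of this size; a different test or correction would give a different p value.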
Summary of the HTT haplogroups/haplotypes and associated allele structures in disease and non-disease alleles
| Haplogroup | Haplotype | Allele structure | Non-disease n | Non-disease % | Disease n | Disease % |
|---|---|---|---|---|---|---|
| A | ∗A2a | Q1-2-2-7-2 | – | – | 1 | 1.5 |
| | ∗A2b | Q1-2-2-7-2 | 13 | 3.4 | 1 | 1.5 |
| | A4a | Q1-2-2-7-2 | 5 | 1.3 | 3 | 4.5 |
| | A4b | Q1-2-2-7-2 | 2 | 0.5 | 5 | 7.5 |
| | A6 | Q1-2-2-7-2 | 34 | 8.9 | – | – |
| B | B1 | Q1-2-2-9-2 | 1 | 0.3 | 1 | 1.5 |
| | | | 25 | 6.6 | | |
| C | C2 | Q1-4-2-7-3 | 21 | 5.5 | – | – |
| | C4 | Q1-2-2-9-2 | 21 | 5.5 | 1 | 1.5 |
| | C4c | Q1-2-2-6-2 | 11 | 2.9 | – | – |
| | | Q1-2-2-10-2 | 69 | 18.1 | 19 | 28.4 |
| | C8 | Q1-2-2-9-2 | 7 | 1.8 | – | – |
| C-SA | C3 | Q1-2-2-10-2 | 1 | 0.3 | – | – |
| | C9 | Q1-2-2-6-3 | 2 | 0.5 | 4 | 6.0 |
| | C10 | Q1-2-2-4-3 | 1 | 0.3 | – | – |
| Other | O | Q1-2-2-7-2 | 36 | 9.4 | – | – |
| Total | | | #381 | 100.0 | 67 | 100.0 |
The two haplotypes that had not been previously identified in African ancestry individuals are indicated by an asterisk (∗). The most common non-disease and disease haplogroup/haplotype are indicated in underlined italics. The most common disease allele structure Q1-2-0-9-2 (29 out of 67 = 43.3%) is indicated in italics. Two samples (one disease allele and three non-disease alleles) from the sequence diversity analysis presented in Table 1 were excluded due to unsuccessful tag-SNP genotyping (#).
Figure 2. Frequency of the HTT haplotype B2 in the populations of the 1000 Genomes Project
The African B2 haplotype was defined by SNPs rs2857936-rs762855-rs4690073 as described by Baine et al. The haplotype frequencies were obtained using the LDhap tool from the LDlink suite (ldlink.nci.nih.gov). Haplotype B2 had its highest frequencies among the African and African ancestry populations, ranging between 6.6% and 9.9%. Outside of the continental African populations, Puerto Rico (American) had the highest frequency of haplotype B2 (3.4%), followed by the five East Asian populations (range 0.5% to 1.0%). The Colombian (American), Utah resident (European), and Sri Lankan (South Asian) populations had low frequencies (0.5%), and B2 was not detected in the rest of the populations analyzed. The results were comparable with the frequency of B2 in the African ancestry non-disease alleles included in this study. This indicates that, although this analysis was conducted only in non-disease alleles, haplotype B2 may be of African origin and African specific.
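A haplotype frequency of this kind is the share of phased chromosomes that carry a given combination of alleles at the three tag SNPs. The sketch below shows the counting step only. The chromosomes and the allele combination named `PLACEHOLDER_B2` are invented for illustration; the real B2-defining alleles are not given in this record, and this is not how LDhap is implemented.

```python
from collections import Counter
from typing import Dict, List, Tuple

TAG_SNPS = ("rs2857936", "rs762855", "rs4690073")

Haplotype = Tuple[str, ...]


def haplotype_frequencies(chromosomes: List[Dict[str, str]]) -> Dict[Haplotype, float]:
    """Return the frequency of each tag-SNP allele combination.

    Each item maps an rsID to the allele on one phased chromosome.
    """
    counts = Counter(tuple(chrom[snp] for snp in TAG_SNPS) for chrom in chromosomes)
    total = sum(counts.values())
    return {haplotype: n / total for haplotype, n in counts.items()}


# Hypothetical alleles, chosen only to make the example run.
PLACEHOLDER_B2: Haplotype = ("C", "A", "G")

toy_chromosomes = [
    {"rs2857936": "C", "rs762855": "A", "rs4690073": "G"},
    {"rs2857936": "T", "rs762855": "G", "rs4690073": "G"},
    {"rs2857936": "T", "rs762855": "G", "rs4690073": "A"},
    {"rs2857936": "T", "rs762855": "G", "rs4690073": "G"},
]

frequencies = haplotype_frequencies(toy_chromosomes)
print(f"{100 * frequencies.get(PLACEHOLDER_B2, 0.0):.1f}%")  # 25.0%
```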
Figure 3. The HTT allele structure associated with age at HD diagnosis and somatic expansion of the HD allele in blood DNA in African ancestry individuals
(A) Linear regression analysis testing the association between the log-transformed AoD and the inherited CAG repeat length for each disease allele structure revealed a significant association (r² = 0.61, p = 1.36 × 10⁻¹⁰). The Q1-0-0-9-2 and Q1-2-0-9-2 disease allele structures, characterized by the loss of one or more of the intervening sequences, had the earliest AoD.
(B) The estimated marginal mean AoD for the disease allele structures, corrected for repeat size. The Q1-2-0-9-2 allele structure had the earliest mean AoD (n = 30, 45.5 years: 95% CI = 43.0–48.2), followed by Q1-0-0-9-2 (n = 2, 47.1 years: 95% CI = 38.1–58.1), Q1-4-2-4-3 (n = 1, 50.4 years: 95% CI = 37.4–67.9), Q1-2-2-P2-2 (n = 31, 53.0 years: 95% CI = 49.9–56.3), and Q1-2-2-6-3 (n = 4, 56.9 years: 95% CI = 48.9–66.0).
(C) Linear regression analysis testing the association between the log transformed AoD (corrected for CAG repeat length and allele structure) and expansion score. Overall, a significant association (p = 0.012) was identified.
(D) The estimated marginal mean expansion score for the allele structures, corrected for CAG repeat length and age at sampling. The Q1-0-0-9-2 (n = 2, 0.32: 95% CI = 0.227–0.458) and Q1-2-0-9-2 (n = 30, 0.42: 95% CI = 0.380–0.460) allele structures had the lowest mean expansion score followed by Q1-2-2-6-3 (n = 4, 0.44: 95% CI = 0.348–0.545), Q1-4-2-4-3 (n = 1, 0.44: 95% CI = 0.288–0.680), and Q1-2-2-P2-2 (n = 31, 0.60: 95% CI = 0.535–0.669).
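Because AoD was modelled on the log scale (panel A), the means in panel B are presumably back-transformed, in which case each confidence interval should be symmetric around its mean on the log scale and the mean should sit close to the geometric midpoint of the interval, sqrt(lower × upper). The back-transformation is an assumption, not something the caption states; the quick check below uses only the numbers reported in panel B.

```python
from math import sqrt

# (reported mean AoD, 95% CI lower, 95% CI upper) in years, from panel B
AOD_MARGINAL_MEANS = {
    "Q1-2-0-9-2": (45.5, 43.0, 48.2),
    "Q1-0-0-9-2": (47.1, 38.1, 58.1),
    "Q1-4-2-4-3": (50.4, 37.4, 67.9),
    "Q1-2-2-P2-2": (53.0, 49.9, 56.3),
    "Q1-2-2-6-3": (56.9, 48.9, 66.0),
}


def geometric_midpoint(lower: float, upper: float) -> float:
    """Midpoint of an interval that is symmetric on the log scale."""
    return sqrt(lower * upper)


for structure, (mean, lower, upper) in AOD_MARGINAL_MEANS.items():
    midpoint = geometric_midpoint(lower, upper)
    print(f"{structure}: reported {mean:.1f}, geometric midpoint {midpoint:.2f}")
```

In each case the geometric midpoint lies within about 0.1 years of the reported mean, which is consistent with intervals computed on the log scale and rounded to one decimal place.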
Multiple linear models testing the association between the HD phenotype and various explanatory variables
| 1 | Ln(AoD) ~ CAG + allele structures + expansion score | 0.625 | 1.296 × 10⁻⁹ | 60 | |||
| 2 | |||||||
| 30 | |||||||
| 4 | Q1-2-2-6-3 | −0.840 | 0.846 | ||||
| 1 | Q1-4-2-4-3 | −5.903 | 0.411 | ||||
| E | |||||||
| 2 | Ln(AoD) ~ CAG + haplotypes + expansion score | 0.664 | 2.989 × 10⁻⁸ | 60 | |||
| 1 | A2a | 2.050 | 0.784 | ||||
| 1 | A2b | 11.664 | 0.163 | ||||
| 1 | A4a | 12.985 | 0.137 | ||||
| 4 | |||||||
| 1 | |||||||
| 1 | C4 | 2.773 | 0.719 | ||||
| 16 | |||||||
| 4 | |||||||
| E | |||||||
The statistically significant explanatory variables are indicated in italics. Model 1. Linear model testing the association of the CAG repeat length, allele structure, and expansion score with the AoD, relative to the grouped typical allele structure Q1-2-2-P2-2. The R² and p values of the overall model show a significant association (r² = 0.63, p = 1 × 10⁻⁹); the CAG repeat length, allele structures Q1-0-0-9-2 and Q1-2-0-9-2, and expansion score were each significantly associated with the AoD. Model 2. Linear model testing the association of the CAG repeat length, background haplotype, and expansion score with the AoD, relative to the most common haplotype B2. The R² and p values of the overall model show a significant association (r² = 0.66, p = 3 × 10⁻⁸); the CAG repeat length; haplotypes A4b, B1, C5, and C9; and expansion score were each significantly associated with the AoD.
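Written out, the formula for Model 1 corresponds to a linear model of the following form. The symbols are notation introduced here for illustration, not taken from the paper: $s_i$ is the allele structure of individual $i$, $\mathrm{ES}_i$ the expansion score, and the sum runs over the non-reference allele structures, with Q1-2-2-P2-2 as the reference level.

```latex
\ln(\mathrm{AoD}_i) = \beta_0
  + \beta_{\mathrm{CAG}}\,\mathrm{CAG}_i
  + \sum_{k \neq \mathrm{ref}} \gamma_k\,\mathbb{1}[s_i = k]
  + \beta_{\mathrm{ES}}\,\mathrm{ES}_i
  + \varepsilon_i
```

Model 2 has the same form with the allele-structure indicators replaced by haplotype indicators and haplotype B2 as the reference level.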