Marco Toffoli, Xiao Chen, Michael A Eberle, Christos Proukakis, Fritz J Sedlazeck, Chiao-Yin Lee, Stephen Mullin, Abigail Higgins, Sofia Koletsi, Monica Emili Garcia-Segura, Esther Sammler, Sonja W Scholz, Anthony H V Schapira.
Abstract
Carriers of GBA variants are at increased risk of Parkinson's disease (PD) and Lewy body dementia (LBD). The presence of the pseudogene GBAP1 predisposes to structural variants, complicating genetic analysis. We present two methods to resolve recombinant alleles and other variants in GBA: Gauchian, a tool for short-read, whole-genome sequencing data analysis, and Oxford Nanopore sequencing after PCR enrichment. Both methods were concordant for 42 samples carrying a range of recombinants and GBAP1-related mutations, and Gauchian outperformed the GATK Best Practices pipeline. Applying Gauchian to sequencing of over 10,000 individuals shows that copy number variants (CNVs) spanning GBAP1 are relatively common in Africans. CNV frequencies in PD and LBD are similar to controls. Gains may coexist with other mutations in patients, and a modifying effect cannot be excluded. Gauchian detects more GBA variants in LBD than PD, especially severe ones. These findings highlight the importance of accurate GBA analysis in these patients.
Year: 2022 PMID: 35794204 PMCID: PMC9259685 DOI: 10.1038/s42003-022-03610-7
Source DB: PubMed Journal: Commun Biol ISSN: 2399-3642
Fig. 1 Schematic illustration of the different types of GBA recombinant alleles and positions of PCR primers used to detect them with ONT.
Not to scale, corresponding roughly to g.chr1:155,210,000-155,245,000. a Wild-type allele. Only primer pair 1 will produce an amplicon. b Non-reciprocal recombination (gene conversion). Similar to non-recombinant alleles, only primer pair 1 will produce an amplicon. c Reciprocal crossover between gene and pseudogene resulting in a 20.6 kb deletion (CNL). Only primer pair 2 will produce an amplicon. d Reciprocal crossover between gene and pseudogene resulting in a 20.6 kb duplication (CNG). Both primer pair 1 and primer pair 3 will produce amplicons. Note that the normal allele is present and that amplification with primer pair 3 will produce an amplicon independently of the number of copy number gains. CNG copy number gain, CNL copy number loss.
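The amplicon logic in the Fig. 1 caption can be sketched as a small lookup. This is only an illustrative reading of the caption, not the authors' analysis code: the function name `interpret_amplicons` and the wording of the returned labels are hypothetical, and a real sample carries two alleles, so several findings can apply at once.

```python
def interpret_amplicons(pair1: bool, pair2: bool, pair3: bool) -> list[str]:
    """Translate presence/absence of the three amplicons into allele findings.

    Follows the Fig. 1 caption: primer pair 1 amplifies intact GBA (wild-type,
    gene conversion, or the intact copy on a duplicated allele), primer pair 2
    amplifies only the 20.6 kb deletion (CNL), and primer pair 3 amplifies only
    the 20.6 kb duplication (CNG), regardless of how many gains are present.
    """
    findings = []
    if pair1:
        findings.append("intact GBA present (wild-type, gene conversion, or CNG allele)")
    if pair2:
        findings.append("CNL allele present")
    if pair3:
        findings.append("CNG allele present (number of gains not resolved)")
    if not findings:
        # Hypothetical fallback: the caption does not describe this case.
        findings.append("no amplicon (failed PCR or unexpected structure)")
    return findings


# Example: a sample with one deleted allele and one intact allele.
print(interpret_amplicons(pair1=True, pair2=True, pair3=False))
```

Because primer pair 1 also amplifies gene-conversion alleles, this pattern alone cannot separate them from wild-type; that distinction needs the sequence of the amplicon itself.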
Details of cross-validation between Gauchian and ONT.
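| Sample^a | CN change | Other variants | CNV | Variant type | Number of samples |
|---|---|---|---|---|---|
| NA20756 | 1 | | Gain | CNG with no other variant | 11 |
| HG01912 | 3 | | Gain | CNG with no other variant | |
| HG01889 | 5 | | Gain | CNG with no other variant | |
| HG02284 | 6 | | Gain | CNG with no other variant | |
| HG03547 | 3 | | Gain | CNG with no other variant | |
| NA19909 | 4 | | Gain | CNG with no other variant | |
| HG03895 | 1 | | Gain | CNG with no other variant | |
| NA18917 | 2 | | Gain | CNG with no other variant | |
| NA19711 | 2 | | Gain | CNG with no other variant | |
| HG03575 | 4 | | Gain | CNG with no other variant | |
| Brain-S1 | 1 | | Gain | CNG with no other variant | |
| PP-3307* | 1 | p.L483P | Gain | CNG + SNV | 1 |
| Brain-S2 | 4 | c.1263del+RecTL | Gain | CNG + c.1263del+RecTL conversion | 2 |
| Brain-S3 | 4 | c.1263del+RecTL | Gain | CNG + c.1263del+RecTL conversion | |
| HG03428 | −1 | | Loss | CNL, non-pathogenic | 3 |
| NA19024 | −1 | | Loss | CNL, non-pathogenic | |
| PP-12224 | −1 | | Loss | CNL, non-pathogenic | |
| HG00422 | −1 | RecNciI | Loss | Pathogenic CNL (RecNciI CNL) | 1 |
| Brain-S4 | −1 | p.L483P | Loss | Non-pathogenic CNL + p.L483P | 1 |
| HG00119 | 0 | c.1263del+RecTL | No CN change | Gene conversion | 2 |
| HG00115 | 0 | c.1263del+RecTL | No CN change | Gene conversion | |
| PP-3420 | 0 | p.L483P | No CN change | SNV | 9 |
| PP-3700 | 0 | p.L483P | No CN change | SNV | |
| PP-57787 | 0 | p.L483P | No CN change | SNV | |
| PP-59343 | 0 | p.L483P | No CN change | SNV | |
| PP-59926 | 0 | p.L483P | No CN change | SNV | |
| PP-60060 | 0 | p.L483P | No CN change | SNV | |
| Brain-S5 | 0 | p.L483P | No CN change | SNV | |
| PP-41342* | 0 | p.L483P/p.E365K | No CN change | SNV | |
| PP-3429 | 0 | p.A495P | No CN change | SNV | |
| PP-3762*, PP-42378*, PP-3476, PP-3179, PP-3001, PP-3173, PP-3023, PP-42444, PP-3406, PP-56534, PP-52772, PP-41705 | 0 | | No CN change | No | 12 |

The number of samples is given on the first row of each variant-type group.
Samples with * were discordant with BWA-GATK.
^a Samples with IDs starting with NA- and HG- were obtained from NHGRI; samples with IDs starting with PP were obtained from PPMI; samples marked as brain were obtained from QSBB.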
Non-pathogenic CNVs in 1kGP and AMP-PD cohorts.
| | 1kGP European, control | 1kGP African, control | 1kGP Other, control | PD European, case | PD European, control | PD African, case | PD African, control | PD Other or unknown, case | PD Other or unknown, control | LBD European, case | LBD European, control |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CNL | 2 | 3 | 9 | 8^a | 7 | 0 | 0 | 0 | 0 | 13^c | 13 |
| CNG | 1 | 74 | 18 | 11^b | 6 | 1 | 2 | 0 | 1 | 21^d | 11^e |
| Total | 503 | 661 | 1340 | 2227 | 1213 | 22 | 27 | 76 | 15 | 2598 | 1941 |

^a Three out of the eight PD cases with non-pathogenic CN losses also have a pathogenic GBA variant (two samples have p.L483P and one sample has p.N409S).
^b Four out of the 11 PD cases with CN gains also have a pathogenic GBA variant (three samples have p.L483P and one sample has p.N409S).
^c One out of the 13 LBD cases with non-pathogenic CN losses also has a pathogenic GBA variant, p.L483P.
^d Five out of the 21 LBD cases with CN gains also have a pathogenic or PD-related GBA variant (p.L483P, p.D448H, c.1263del+RecTL, p.T408M, and compound heterozygote p.L483P/p.D448H).
^e One out of 11 LBD controls with CN gains also has a PD-related GBA variant, p.T408M.
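The statement that CNVs spanning GBAP1 are relatively common in Africans can be checked directly from the 1kGP columns of the table above. The following is a minimal sketch, assuming each CNL/CNG entry counts carrier samples and each Total entry is the number of samples in that group; the helper name `carrier_frequency` is hypothetical.

```python
# Counts copied from the 1kGP columns of the non-pathogenic CNV table.
counts_1kgp = {
    "European": {"CNL": 2, "CNG": 1, "total": 503},
    "African": {"CNL": 3, "CNG": 74, "total": 661},
    "Other": {"CNL": 9, "CNG": 18, "total": 1340},
}


def carrier_frequency(n_carriers: int, n_samples: int) -> float:
    """Fraction of samples carrying the CNV (assumes one count per sample)."""
    return n_carriers / n_samples


freqs = {
    pop: {kind: carrier_frequency(c[kind], c["total"]) for kind in ("CNL", "CNG")}
    for pop, c in counts_1kgp.items()
}

for pop, f in freqs.items():
    print(f"{pop}: CNL {f['CNL']:.2%}, CNG {f['CNG']:.2%}")
```

Under these assumptions, CN gains are carried by roughly 11% of African 1kGP samples (74 of 661), against about 0.2% of European samples (1 of 503), while CN losses stay below 1% in every 1kGP group.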
GBAP1-like variants in the exons 9–11 homology region in 1kGP, PD and LBD cohorts.
| Cohort | Group | p.A495P | p.L483P | p.D448H | c.1263del | RecNciI (CNL) | RecNciI (conversion) | c.1263del+RecTL (CNL) | c.1263del+RecTL (conversion) | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| 1kGP | | 1 | 5 | 0 | 2 | 1 | 0 | 0 | 2 | 11 |
| PD | Case | 3 | 14 | 1 | 0 | 1 | 2 | 1 | 0 | 22 |
| PD | Control | 0 | 6 | 1 | 0 | 0 | 0 | 0 | 0 | 7 |
| LBD | Case | 4 | 23 | 4 | 6 | 10 | 3 | 2 | 2 | 54* |
| LBD | Control | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 3 |
| PD + LBD called by Gauchian | | 9 | 43 | 7 | 6 | 11 | 5 | 3 | 2 | 86 |
| PD + LBD called by BWA-GATK | | 9 (+11 FP) | 27 | 7 (+2 FP) | 0 | 0 | 1 | 0 | 0 | 44 |

*One sample is compound heterozygous for p.L483P and p.D448H.
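The last two rows of the table above allow a per-variant comparison of the two pipelines. The sketch below assumes that the Gauchian calls serve as the reference set and that the BWA-GATK counts outside the "(+N FP)" annotations are a subset of them; neither assumption is stated in the table itself, and the variable names are illustrative.

```python
variants = [
    "p.A495P", "p.L483P", "p.D448H", "c.1263del",
    "RecNciI (CNL)", "RecNciI (conversion)",
    "c.1263del+RecTL (CNL)", "c.1263del+RecTL (conversion)",
]
# PD + LBD counts copied from the table, in the column order above.
gauchian = [9, 43, 7, 6, 11, 5, 3, 2]
bwa_gatk_true = [9, 27, 7, 0, 0, 1, 0, 0]
bwa_gatk_fp = [11, 0, 2, 0, 0, 0, 0, 0]  # the "(+N FP)" annotations

for name, ref, found, fp in zip(variants, gauchian, bwa_gatk_true, bwa_gatk_fp):
    # Fraction of Gauchian calls also reported by BWA-GATK for this variant.
    recovered = found / ref
    print(f"{name}: {found}/{ref} recovered ({recovered:.0%}), {fp} false positives")

overall = sum(bwa_gatk_true) / sum(gauchian)
print(f"Overall: {sum(bwa_gatk_true)}/{sum(gauchian)} recovered ({overall:.0%})")
```

On these numbers BWA-GATK recovers 44 of the 86 Gauchian calls, about half, and none of the c.1263del or CNL-type recombinant calls.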
Fig. 2 Gauchian detects challenging GBA variants through targeted copy number calling and haplotype phasing.
a Median mapping quality (red line) across 2504 1kGP samples plotted for each position in the GBA/GBAP1 region (hg38). A median filter is applied in a 50 bp window. The eleven exons of GBA are shown as orange boxes. GBAP1 and MTX1 exons are shown as green and purple boxes, respectively. The 4 kb major homology region (98.1% sequence similarity, exons 9–11) between GBA and GBAP1 is shaded in pink and highlights an area of low mapping accuracy. The light blue box shows the 10 kb unique region between the two genes in which copy number calling is performed in Gauchian. b Distribution of normalised depth in the 10 kb CN calling region in 2504 1kGP samples, showing peaks at CN1 (CNL), 2 (no CNV) and 3-8 (CNG). c Recombinant haplotypes in the exons 9–11 homology region, distinguished by GBA/GBAP1 differentiating bases (x-axis). Reference genome sequences are shaded in yellow. There is an error in hg38 where the first three sites of GBAP1 show GBA bases, which could lead to alignment errors. The GBA recombinant haplotypes are shown in the white background, including those where one or a few nearby sites are mutated to the corresponding GBAP1 base, resulting from either gene conversion or CNL. Grey bases indicate that the base can be either GBA or GBAP1 depending on the breakpoint position of the CNL/conversion. Shaded in purple are two example GBAP1 haplotypes, found by Gauchian, that have been partially converted to GBA and can cause false-positive GBA variant calls by standard secondary analysis pipelines. For the first example, the reverse-p.L483P variant on GBAP1 directs aligners to align GBAP1 reads to GBA, causing the nearby p.A495P false-positive call. For the second example, the reverse-c.1263del variant inserts 55 bp to GBAP1, driving GBAP1 reads to align to GBA, causing the nearby p.D448H false-positive call. CNG copy number gain, CNL copy number loss, CNV copy number variant.
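Panel b describes normalised depth in the 10 kb region clustering around integer copy numbers. The idea can be sketched with a simple nearest-integer rule. This is not Gauchian's actual model, which is not described here: the function names, the `tolerance` of 0.35 and the cap at CN 8 are assumptions chosen for illustration, and the depth is assumed to be scaled so that a sample without a CNV sits near 2.

```python
from typing import Optional


def call_copy_number(normalised_depth: float, max_cn: int = 8,
                     tolerance: float = 0.35) -> Optional[int]:
    """Return the nearest integer copy number, or None for a no-call.

    A no-call is returned when the depth lies further than `tolerance`
    from the nearest integer or outside the range 0..max_cn.
    """
    cn = round(normalised_depth)
    if cn < 0 or cn > max_cn:
        return None
    if abs(normalised_depth - cn) > tolerance:
        return None
    return cn


def classify(cn: Optional[int]) -> str:
    """Label a copy number using the abbreviations of Fig. 2."""
    if cn is None:
        return "no-call"
    if cn < 2:
        return "CNL"
    if cn == 2:
        return "no CNV"
    return "CNG"


for depth in (0.97, 2.04, 2.5, 3.1, 6.2):
    print(depth, classify(call_copy_number(depth)))
```

A depth halfway between two peaks, such as 2.5, is deliberately left as a no-call in this sketch instead of being forced into a neighbouring copy number.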
Summary of samples carrying GBA coding variants detected in 1kGP, PD and LBD cohorts.
| Cohort | Group | p.N409S | Severe* variants | Total | Total excluding p.N409S |
|---|---|---|---|---|---|
| 1kGP | | 3 | 14 | 53 | 50 |
| PD | Case | 128 | 38 | 296 | 171 |
| PD | Control | 158 | 9 | 200 | 43 |
| PD | OR (95% CI) | n/a | 2.12 (1.07–4.71) | n/a | 2.07 (1.48–2.95) |
| LBD | Case | 59 | 79 | 353 | 298 |
| LBD | Control | 19 | 2 | 86 | 67 |
| LBD | OR (95% CI) | n/a | 30.83 (9.71–187.55) | n/a | 3.68 (2.82–4.87) |
*Severe and mild variants are defined in Supplementary Table 5.
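The OR (95% CI) rows summarise carrier counts in cases versus controls. As a generic illustration of how such a figure is formed from a 2×2 table, the sketch below computes an odds ratio with a Wald confidence interval. The Wald interval is a stand-in chosen for simplicity: the method and the group sizes behind the reported values are not given in this table, so the output need not match them, and the numbers in the example call are made up.

```python
import math


def odds_ratio_wald(case_carriers: int, case_total: int,
                    control_carriers: int, control_total: int,
                    z: float = 1.96) -> tuple[float, float, float]:
    """Odds ratio with a Wald 95% CI (z = 1.96) from carrier counts.

    Returns (odds_ratio, lower, upper). Raises ValueError if any cell of
    the 2x2 table is zero or negative, where this formula is undefined.
    """
    a = case_carriers
    b = case_total - case_carriers          # non-carrier cases
    c = control_carriers
    d = control_total - control_carriers    # non-carrier controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells of the 2x2 table must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for the Wald interval.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not taken from the table).
or_, lo, hi = odds_ratio_wald(30, 1000, 15, 1000)
print(f"OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

With very small control counts, such as the two LBD controls carrying severe variants, a Wald interval is known to behave poorly, and an exact method is usually preferred.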