Nikita Simone Pillay, Owen A Ross, Alan Christoffels, Soraya Bardien.
Abstract
Parkinson's disease (PD) is a neurodegenerative disorder with a heterogeneous genetic etiology. The advent of next-generation sequencing (NGS) technologies has aided novel gene discovery in several complex diseases, including PD. This Perspective article aimed to explore the use of NGS approaches to identify novel loci in familial PD, and to consider their current relevance. A total of 17 studies, spanning various populations (including Asian, Middle Eastern and European ancestry), were identified. All the studies used whole-exome sequencing (WES), with only one study incorporating both WES and whole-genome sequencing (WGS). Some studies incorporated additional genetic analyses (including linkage analysis, haplotyping and homozygosity mapping) to enhance their efficacy. Also, the use of consanguineous families and the specific search for de novo mutations appeared to facilitate the finding of causal mutations. Across the studies, similarities and differences were observed in downstream analysis methods and in the types of bioinformatic tools used. Although these studies serve as a practical guide for novel gene discovery in familial PD, these approaches have not significantly resolved the "missing heritability" of PD. We speculate that what is needed is the use of third-generation sequencing technologies to identify complex genomic rearrangements and new sequence variation missed with existing methods. Additionally, the study of ancestrally diverse populations (in particular those of Black African ancestry), with the concomitant optimization and tailoring of sequencing and analytic workflows to these populations, is critical. Only then will the way be paved for exciting new discoveries in the field.
Keywords: Parkinson’s disease; African ancestry; bioinformatic pipelines; diverse populations; familial PD; next-generation sequencing; third-generation sequencing; whole-exome sequencing
Year: 2022 PMID: 35299952 PMCID: PMC8921601 DOI: 10.3389/fgene.2022.781816
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
List of published studies that identified novel Parkinson’s disease loci using next-generation sequencing approaches.
| Reference | Gene | Population | Pre-NGS screening approach used | Study Participants Screened (Sequencing platform used) | QC and Read Alignment Tools | Variant Calling Tools | Variant Annotation and | Variant Inclusion/Exclusion Criteria | Mutations Identified/(Chromosome) |
|---|---|---|---|---|---|---|---|---|---|
|
|
| Swiss family (Family A) | None | WES on a PD-affected pair of first-degree cousins | SOAPaligner (read alignment to the human reference genome - Hg18, build 36.1) | SOAPsnp (SNP calling) | Database of Genomic Variants v6 (determination of structural variants against CNVs) | Variants were excluded if | Homozygous c.1858G > A |
| - on the X chromosome | - p.Asp620Asn (16q11.2) | ||||||||
| - homozygous (autosomal-dominant inheritance of disease was assumed) | |||||||||
| - non-coding | |||||||||
| - synonymous | |||||||||
| - variants present in dbSNP v.130 | |||||||||
| Variants were subsequently genotyped in a multi-ethnic case-control series (4,326 patients and 3,309 controls) | |||||||||
| Confirmation | |||||||||
|
|
| Austrian family | Haplotyping and linkage analysis (Merlin software) | WES on two PD-affected second cousins (Genome Analyzer IIx system, Illumina) | Burrows-Wheeler Aligner (BWA version 0.5.8) (read alignment to human reference genome - Hg19) | SAMtools (v 0.1.7) (SNV and InDel calling) | PolyPhen2, SNAP and SIFT (pathogenicity prediction) | Variants were excluded if | Heterozygous c.1858G > A |
| - present in the 72 control exomes of non-PD patients | - p.Asp620Asn (16q11.2) | ||||||||
| - present in dbSNP131 and 1000-Genomes Project | |||||||||
| - had an average heterozygosity of more than 0.02 | |||||||||
| Variants were included if | |||||||||
| - heterozygous | |||||||||
| - non-synonymous | |||||||||
|
|
| Palestinian family (two patients and their unaffected brother) | Homozygosity mapping and SNP genotyping in a consanguineous family (SNP genotyping using Affymetrix GeneChip Human Mapping 250K Nsp Array) | WES on a single index patient (GAIIx, Illumina) | Burrows-Wheeler Aligner (BWA) (sequence reads were aligned to human reference genome - hg18 (NCBI36)) | Genome Analysis Toolkit (GATK) (variant calling) | ANNOVAR (variant annotation) | Variants were excluded if | Homozygous c.801-2A > G (1p31.3) |
| Picard (marking of PCR duplicates) | SeattleSeq Annotation (GERP score) | - present in dbSNP132, 1000-Genomes Project and in-house databases | |||||||
| PolyPhen, SIFT and MutationTaster (pathogenicity prediction) | Variants were included if | ||||||||
| NHLBI Exome Sequencing Project website release Version: v.0.0.9 (mutation frequency in ethnically matched controls) | - non-synonymous | ||||||||
| - conservation score GERP >3 | |||||||||
| Confirmation | |||||||||
|
|
| Iranian family (healthy parents, who were first-degree relatives, as well as two affected, and three unaffected siblings) | Genome-wide SNP genotyping and homozygosity mapping were performed on a consanguineous PD family (HumanOmniExpress beadchips and HiScanSQ system, Illumina) | WES on two PD-affected siblings (HiSeq 2000, Illumina) | Burrows-Wheeler Aligner (BWA) tool (alignment of raw sequence reads to the human reference genome - NCBI GRCh37) | GATK Unified Genotyper tool (SNP/SNV/InDel calling) | AnnTools (variant annotation) | Variants were excluded if | Homozygous c.773G > A |
| Genome Studio program (genotyping quality assessment) | Genome Analysis Toolkit (GATK v1.5-16-g58245bf) (base-quality re-calibration and local realignment) | MutPred, SNPs&GO, Mutalyzer, HomoloGene (NCBI) and ClustalW2 (pathogenicity prediction) | - present in dbSNP137, 1000 Genomes Project and Exome Variant Server of the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project databases | - p.Arg258Gln (21q22.11) | |||||
| PLINK (homozygous segment identification) | Variants were included if | ||||||||
| Illumina genome viewer (homozygous segment visualization) | - located in exons or splice sites | ||||||||
| Confirmation | |||||||||
|
|
| German Family | Genotyping of the top ten candidate variants (KORA-AGE cohort using MALDI-TOF mass spectrometry on the Sequenom platform) | WES on 2 PD-affected second cousins (Genome Analyzer IIx system, Illumina) | Burrows-Wheeler Aligner (BWA 0.5.8) (read alignment) | SAMtools (version 0.1.7) (SNV/InDel calling) | SIFT/PROVEAN, PolyPhen-2 and MutationTaster (pathogenicity prediction) | Variants were excluded if: observed in in-house exome database, dbSNP135, 1000-Genomes Project and NHLBI-ESP (EA only) databases with a minor allele frequency >1% | Heterozygous c.1970C > T |
| Variants were included if | - p.Ser657Asn (7q32.3) | ||||||||
| Linkage analysis on 6 family members using oligonucleotide SNP arrays (500K, Illumina) | - non-synonymous | ||||||||
| - exonic/coding | |||||||||
| MERLIN (Linkage analysis) | - missense, nonsense, stoploss, splice site or frameshift variants | ||||||||
| Confirmation | |||||||||
|
|
| Canadian (Dutch-German-Russian Mennonite) family | None | WES on three PD-affected members (Agilent SureSelect 38 Mb Human All Exon Kit, Illumina Genome Analyzer) | Bowtie 12.70 and Burrows-Wheeler Aligner (BWA 0.5.9) (read alignment to human reference genome - NCBI Build 37.1) | SAMtools (variant calling) | SIFT (pathogenicity prediction) | Variants were excluded if | Homozygous c.2564A > G |
| Genome Analysis Toolkit (GATK) (local realignment around insertions and deletions) | - Phred quality score <20 | - p.Asn855Ser (3q22.1) | |||||||
| - frequently observed in population databases (minor allele frequency >1%) | |||||||||
| Confirmation | |||||||||
|
|
| Japanese family | Genome-wide linkage analysis on 8 affected and 5 unaffected individuals of the family (Genome-Wide Human SNP Array 6.0, Affymetrix) | WES on three patients & WGS on one patient (HiSeq 2000, Illumina) | Burrows-Wheeler Aligner (BWA-MEM version 0.5.9) (read alignment to human reference genome - UCSC hg19) | SAMtools version 0.1.16 (SNV/InDel calling) | PolyPhen-2 & MutationTaster (pathogenicity prediction) | Variants were excluded if | Heterozygous c.182C > T |
| SNPHitLink & MERLIN (linkage analysis) | - present in the 1000 Genomes Project, dbSNP138, the Human Genetic Variation database, and the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP) database | - p.Thr61Ile (7p11.2) | |||||||
| Variants were included if | |||||||||
| - located in exons or splice sites | |||||||||
| - heterozygous state | |||||||||
| - non-synonymous or caused aberrant splicing | |||||||||
| - located in regions with a positive logarithm of odds (LOD) score greater than 1 | |||||||||
| - not noted in unaffected Japanese controls | |||||||||
| Confirmation by Sanger sequencing | |||||||||
|
|
| South Indian family | None | WES on a single index patient (HiSeq 2000, Illumina) | FASTX-Toolkit (pre-alignment QC) | SAMtools and GATK (variant calling) | wANNOVAR (variant annotation) | Variants were excluded if | Homozygous c.169C > A |
| Burrows-Wheeler Aligner (BWA) (read alignment) | KGGSeq (variant filtering) | - present in databases (dbSNP 135, 137 and 138, 1000 Genomes and National Heart, Lung, and Blood Institute (NHLBI) 6500 exomes and ExAC) with a MAF >0.01 | - p.P57T (11p15.4) | ||||||
| SAMtools (post-alignment QC) | Variants were included if | ||||||||
| BEDTools (assessment of target coverage and depth) | - heterozygous | ||||||||
| Confirmation | |||||||||
|
|
| North Indian family | None | WES on two affected siblings (HiSeq 2000, Illumina) | FASTX-Toolkit (pre-alignment QC) | SAMtools and GATK (variant calling) | wANNOVAR (variant annotation) | Variants were excluded if | Homozygous c.89_90insGTCGCCCC |
| Burrows-Wheeler Aligner (BWA) (read alignment) | KGGSeq (variant filtering) | - present in databases (dbSNP 135, 137 and 138, 1000 Genomes and National Heart, Lung, and Blood Institute (NHLBI) 6500 exomes and ExAC) with a MAF >0.01 | - p.Gln32fs (7q32.3) | ||||||
| SAMtools (post-alignment QC) | Variants were included if | ||||||||
| BEDTools (assessment of target coverage and depth) | - homozygous (autosomal recessive inheritance assumed) | ||||||||
| - exonic variants | |||||||||
| - shared between the two affected individuals | |||||||||
| Confirmation | |||||||||
|
|
| Canadian-Mennonite (same family as DNAJC13) | None | WES on one unaffected individual and 4 distantly related affected cousins (HiSeq 2500, Illumina) | Genome Analysis Toolkit (GATK v1.1) (read alignment to human reference genome - Hg19) | Unified Genotyper from the Genome Analysis Toolkit (SNV/InDel calling and computing variant quality scores (VQS) and Phred-scaled likelihood scores) | ANNOVAR (variant annotation) | Variants were excluded if | Heterozygous c.422G > T |
| PolyPhen2 (pathogenicity prediction) | - present in multiple databases including dbSNP (v130), HapMap and 1000 Genomes with a MAF >0.01 | - p.Arg141Leu (20p13-p12.3) | |||||||
| SpliceView, NNsplice, and ESEfinder (splicing effect prediction) | - VQSLOD < −3 | ||||||||
| - alternate Phred-scaled likelihood scores <99 | |||||||||
| Variants were included if | |||||||||
| - the average read per targeted base was >65X with the Phred quality score of ≥30 | |||||||||
| Confirmation | |||||||||
|
|
| Spanish Basque family | None | WES on index patient (HiSeq 2000, Illumina) | Burrows-Wheeler Aligner Tool (BWA) (read alignment to the human reference genome - NCBI GRCh37.p13) | GATK Unified Genotyper tool (SNP/InDel calling) | AnnTools kit (variant annotation) | Variants were excluded if | Heterozygous c.5885G > A |
| Genome Analysis Toolkit (GATK v1.5-16-g58245bf) (base-quality re-calibration and local realignment) | Picard (exome statistics) | - intragenic, intronic, and non-coding exonic | - p.Arg1962His | ||||||
| MutPred, SNPs&GO, MutationTaster, and CADD (pathogenicity prediction) | - present in the dbSNP149 build, 1000 Genomes Project phase 3, the Exome Variant Server of the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project and the Exome Aggregation Consortium databases with a MAF >0.05 | and c.8959G > A - p.Gly2987Arg | |||||||
| HomoloGene database (protein conservation across species) | Variants were included if | (8p23.2) | |||||||
| Human Gene Mutation Database (HGMD) & NCBI ClinVar database (genotype-phenotype correlation) | - mapping quality (q30 or higher) | ||||||||
| - depth of coverage (d10 or higher) | |||||||||
|
|
| Canadian and Italian family | Positional cloning (Ion AmpliSeq™ Exome Kit and the Ion Proton™ System, Thermo Fisher Scientific) | WES on index patient (HiSeq 2000, Illumina) | Torrent Suite Software | Torrent Variant Caller (tvc 4.2-18) (variant calling) | ANNOVAR (variant annotation) | Confirmation, segregation analysis and screening | Homozygous c.187A > T |
| - p.K63* (10q21.3) | |||||||||
| and c.79-2A > G - p.V27Wfs*14 (10q21.3) | |||||||||
|
|
| Italian family | Genome-wide SNP array genotyping and linkage analysis in ten affected relatives (HumanCNV370 bead chip, Illumina) | WES on index PD patient (HiSeq 2000, Illumina) | Burrows-Wheeler Aligner (BWA-MEM version 0.5.9) (read alignment to human reference genome - UCSC hg19) | Genome Analysis Toolkit (GATK) v3 (variant calling) | Cartagenia Bench Lab NGS v5.0.1 (variant filtering) | Variants were excluded if | Homozygous |
| SpliceSiteFinder-like, MaxEntScan, NNSPLICE, GeneSplicer, and Human Splicing Finder integrated in Alamut Visual version 4.2 (splicing effect prediction) | - present in dbSNP, Exome Variant Server NHLBI GO Exome Sequencing Project (ESP), 1000 Genomes, Genome of the Netherlands (GoNL), Exome Aggregation Consortium (ExAC) and the Genome Aggregation Database (gnomAD) databases with a MAF >0.01 | - p.Gly603Arg (14q11.2) | |||||||
| Copy number analysis (Nexus Copy Number, BioDiscovery) | Variants were included if | ||||||||
| MERLIN (linkage analysis) | - heterozygous | ||||||||
| - exonic | |||||||||
| - non-synonymous | |||||||||
| - within 5 bp of a splice site | |||||||||
| - predicted to be pathogenic with ≥5 in silico tools | |||||||||
| Confirmation by Sanger sequencing | |||||||||
|
|
| Han Chinese family | None | WES on 39 early-onset PD (EOPD) patients (probands), their parents, and 20 unaffected siblings (HiSeq 2000, Illumina) | Burrows-Wheeler Aligner (BWA version 0.5.9-r16) (alignment to the human reference genome - hg19) | HaplotypeCaller in GATK (SNV/InDel calling) | PolyPhen-2 (pathogenicity prediction) | Variants were excluded if | Heterozygous c.691+3dupA (6q22.1) |
| Picard (marking of PCR duplicates) | DAPPLE (Disease Association Protein-Protein Link Evaluator) (construction of protein-protein interaction networks) | - present in dbSNP137, the Han Chinese samples of the 1000 Genomes Project, or both of the two offspring in quads | |||||||
| GATK (InDel realignment and recalibration of base quality scores) | GEO2R (determination of differential gene expression in protein networks) | - indels were in known structural variation regions | |||||||
| Gene Ontology (GO) (gene annotation) | Variants were included if | ||||||||
| KEGG pathway database (functional enrichment) | - Phred quality scores >30 | ||||||||
| PLINK (single variant associations) | - there was only one type of alternative allele | ||||||||
| - the read coverage of alternative alleles in the offspring was greater than 4 | |||||||||
| - more than 30% and less than 5% of the covered reads were the alternative allele for the offspring and parents, respectively | |||||||||
| - for the offspring: PL (0/0)≥30, PL (0/1) = 0, and PL (1/1)≥30 (PL: Phred-scaled likelihoods for a given genotype) | |||||||||
| - for both parents PL (0/0) = 0, PL (0/1)≥30, and PL (1/1)≥30 | |||||||||
| - adjacent SNVs were located at least 10 bp apart | |||||||||
| Confirmation of variants | |||||||||
| (Lin et al., 2019) |
| Taiwanese Family | Custom-designed NGS Gene Panel (including 40 genes associated with parkinsonism) screening | WES on three affected individuals (Ion Torrent™ Next-Generation Sequencing Exon v2 kit and platform) | Burrows-Wheeler Aligner (BWA-MEM) (alignment to the human reference genome - GRCh37/hg19) | GATK (variant calling) | ANNOVAR (variant annotation) | Variants were excluded if | Heterozygous c.941A > C |
| Picard (marking and removing duplicates) | CADD, PolyPhen-2 and SIFT (pathogenicity prediction) | - present in dbSNP144, 1000 Genomes Project, ExAC, gnomAD and the Taiwan Biobank with a MAF >0.01 | - p.Tyr314Ser (3p21.31) | ||||||
| Human Splicing Finder (splicing effect prediction) | Variants were included if | ||||||||
| - exonic | |||||||||
| Confirmation of co-segregation | |||||||||
|
|
| Afrikaner family (South Africa) | None | WES on three affected individuals (HiSeq 2000, Illumina) | Burrows-Wheeler Aligner (BWA-MEM) (alignment to the human reference genome - GRCh37/hg19) | GATK (variant calling) | ANNOVAR (variant annotation) | Variants were excluded if | Heterozygous p.G849D (C > T) |
| SAMtools (mpileup) (read coverage statistics) | SIFT, PolyPhen-2, MutationTaster, CADD, GERP++ (pathogenicity prediction) | - present in the ExAC database, gnomAD, the 1000 Genomes Project and dbSNP databases | (11q13.1) | ||||||
| Allen Brain Atlas, Human Protein Atlas, KEGG database, PANTHER (pathway and expression analysis) | Variants were included if | ||||||||
| - minimum Phred quality score >30 | |||||||||
| Confirmation | |||||||||
|
|
| Australian Families (family #002 and #433) | Probands were screened for known PD causes including SNVs and expansions of repetitive regions in ATXN2, ATXN3 and TBP, and copy number variations in SNCA and PARK2 | #433 ( | Torrent Suite (v4.0) was used for Ion Torrent data (alignment to the human reference genome) | HaplotypeCaller from the Genome Analysis Toolkit (v3.5) for the MiSeq data (variant calling) | ANNOVAR (variant annotation) | Variants were excluded if | SIPA1L1 - Heterozygous p.R236Q (14q24.2) |

| WES on three PD-affected siblings (Ion AmpliSeq capture kit, sequenced using the Ion Torrent (Thermo Fisher Scientific, Waltham, MA, USA)) | SAMtools and bedtools2 (alignment to the human reference genome) | Torrent Suite (v4.0) was used for Ion Torrent data (variant calling) | - seen in >30% of the MiSeq in-house datasets (2n = 48) or >0.5% of the AnnEx Annotated Exomes browser (2n = 5,902, | KCNJ15 - Heterozygous p.R28C (21q22.13) | ||||

| #002 ( | SAMtools and bedtools2 (variant calling) | Variants were included if | ||||||
| WES on 2 PD-affected siblings and 2 PD-affected cousins and an unaffected cousin | - present in affected members of the family while taking into consideration incomplete penetrance | ||||||||
| (Illumina HiSeq, Illumina MiSeq and Ion Torrent) | - exonic or in a splicing region (RefSeq v61) | ||||||||
| - missense allele | |||||||||
| - minor allele frequency of <0.01 in the gnomAD database | |||||||||
| Confirmation | |||||||||
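
Many of the inclusion and exclusion criteria in the table follow one pattern: discard variants that are common in population databases, keep high-quality calls in coding or splice regions, and match the zygosity to the assumed inheritance model. The sketch below illustrates that pattern only; it is not the pipeline of any listed study. The `Variant` record, the consequence labels and the default thresholds (MAF cutoff 0.01, Phred quality above 30) are assumptions chosen to mirror values that recur in the table.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Functional classes retained in this sketch (hypothetical labels, loosely
# mirroring the "missense, nonsense, stoploss, splice site or frameshift" criterion).
RETAINED_CONSEQUENCES = {"missense", "nonsense", "stoploss", "splice_site", "frameshift"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified variant record used only for illustration."""
    variant_id: str
    consequence: str           # annotation label, e.g. "missense"
    max_population_maf: float  # highest MAF across the databases consulted (0.0 if absent)
    phred_quality: float       # Phred-scaled call quality
    zygosity: str              # "het" or "hom"


def passes_filters(
    variant: Variant,
    maf_cutoff: float = 0.01,
    min_phred: float = 30.0,
    required_zygosity: Optional[str] = None,
) -> bool:
    """Return True if the variant survives all exclusion/inclusion checks."""
    if variant.max_population_maf > maf_cutoff:
        return False  # excluded: too common in population databases
    if variant.phred_quality <= min_phred:
        return False  # excluded: quality must exceed the threshold
    if variant.consequence not in RETAINED_CONSEQUENCES:
        return False  # excluded: synonymous, non-coding, and similar classes
    if required_zygosity is not None and variant.zygosity != required_zygosity:
        return False  # excluded: does not fit the assumed inheritance model
    return True


def filter_variants(variants: Iterable[Variant], **criteria) -> List[Variant]:
    """Apply passes_filters to every variant and keep the survivors."""
    return [v for v in variants if passes_filters(v, **criteria)]


# Made-up example records, not data from any of the studies.
candidates = [
    Variant("var_rare_missense", "missense", 0.0, 55.0, "het"),
    Variant("var_common_missense", "missense", 0.12, 60.0, "het"),
    Variant("var_rare_synonymous", "synonymous", 0.0, 70.0, "het"),
    Variant("var_low_quality", "frameshift", 0.0, 12.0, "hom"),
    Variant("var_rare_frameshift", "frameshift", 0.001, 48.0, "hom"),
]

dominant_hits = filter_variants(candidates, required_zygosity="het")
recessive_hits = filter_variants(candidates, required_zygosity="hom")
print([v.variant_id for v in dominant_hits])
print([v.variant_id for v in recessive_hits])
```

Passing `required_zygosity="hom"` corresponds to the autosomal recessive assumption used for consanguineous families, while `"het"` corresponds to a dominant model. Real pipelines would read these fields from annotated VCF output rather than from hand-built records.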
FIGURE 1. Summary of tools used to analyze next-generation sequencing data in the 17 studies that identified novel Parkinson’s disease genes.
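
The Han Chinese row of the table lists explicit genotype-likelihood criteria for calling de novo variants in offspring. The sketch below shows one way those thresholds could be checked for a single site. The `SampleCall` structure and function names are hypothetical, and the thresholds are copied from the table (more than 4 alternative reads and more than 30% alternative-allele fraction in the offspring, less than 5% in each parent, plus the listed PL patterns). The other criteria in that row, such as database exclusion and SNV spacing, are not modeled here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SampleCall:
    """Hypothetical per-sample call at one site."""
    ref_reads: int
    alt_reads: int
    pl: Tuple[int, int, int]  # Phred-scaled likelihoods for genotypes (0/0, 0/1, 1/1)

    @property
    def alt_fraction(self) -> float:
        total = self.ref_reads + self.alt_reads
        return self.alt_reads / total if total else 0.0


def offspring_looks_heterozygous(call: SampleCall) -> bool:
    """Offspring criteria: >4 alt reads, >30% alt fraction, PL pattern (>=30, 0, >=30)."""
    pl_hom_ref, pl_het, pl_hom_alt = call.pl
    return (
        call.alt_reads > 4
        and call.alt_fraction > 0.30
        and pl_hom_ref >= 30
        and pl_het == 0
        and pl_hom_alt >= 30
    )


def parent_looks_hom_ref(call: SampleCall) -> bool:
    """Parent criteria: <5% alt fraction, PL pattern (0, >=30, >=30)."""
    pl_hom_ref, pl_het, pl_hom_alt = call.pl
    return (
        call.alt_fraction < 0.05
        and pl_hom_ref == 0
        and pl_het >= 30
        and pl_hom_alt >= 30
    )


def is_candidate_de_novo(child: SampleCall, mother: SampleCall, father: SampleCall) -> bool:
    """A site is a candidate only if the child passes and both parents pass."""
    return (
        offspring_looks_heterozygous(child)
        and parent_looks_hom_ref(mother)
        and parent_looks_hom_ref(father)
    )


# Made-up read counts and likelihoods for illustration.
child = SampleCall(ref_reads=20, alt_reads=15, pl=(60, 0, 80))
clean_parent = SampleCall(ref_reads=40, alt_reads=0, pl=(0, 45, 90))
noisy_parent = SampleCall(ref_reads=40, alt_reads=5, pl=(0, 45, 90))

print(is_candidate_de_novo(child, clean_parent, clean_parent))
print(is_candidate_de_novo(child, clean_parent, noisy_parent))
```

Candidates that pass such a filter would still need confirmation, which the table records as a separate step for each study.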