
Elevated common variant genetic risk for Tourette syndrome in a densely-affected pedigree.

Andrew McQuillin¹, James J Crowley^2,3,4, Matthew Halvorsen⁵, Jin Szatkiewicz⁵, Poorva Mudgal⁵, Dongmei Yu^6,7, Ashley E Nordsletten^8,9, David Mataix-Cols⁸, Carol A Mathews¹⁰, Jeremiah M Scharf^6,7,11, Manuel Mattheisen^8,12,13,14, Mary M Robertson¹⁵.

Abstract

Tourette syndrome (TS) is a highly heritable neuropsychiatric disorder with complex patterns of genetic inheritance. Recent genetic findings in TS have highlighted both numerous common variants with small effects and a few rare variants with moderate or large effects. Here we searched for genetic causes of TS in a large, densely-affected British pedigree using a systematic genomic approach. This pedigree spans six generations and includes 122 members, 85 of whom were individually interviewed, and 53 of whom were diagnosed as "cases" (consisting of 28 with definite or probable TS, 20 with chronic multiple tics [CMT], and five with obsessive-compulsive behaviors [OCB]). A total of 66 DNA samples were available (25 TS, 15 CMT, 4 OCB cases, and 22 unaffecteds) and all were genotyped using a dense single nucleotide polymorphism (SNP) array to identify shared segments, copy number variants (CNVs), and to calculate genetic risk scores. Eight cases were also whole genome sequenced to test whether any rare variants were shared identical by descent. While we did not identify any notable CNVs, single nucleotide variants, indels or repeat expansions of near-Mendelian effect, the most distinctive feature of this family proved to be an unusually high load of common risk alleles for TS. We found that cases within this family carried a higher load of TS common variant risk similar to that previously found in unrelated TS cases. Thus far, the strongest evidence from genetic data for contribution to TS risk in this family comes from multiple common risk variants rather than one or a few variants of strong effect.

Year: 2021 PMID: 34526668 PMCID: PMC8881309 DOI: 10.1038/s41380-021-01277-w

Source DB: PubMed Journal: Mol Psychiatry ISSN: 1359-4184 Impact factor: 13.437

Introduction

While the etiology of Tourette syndrome (TS)[1] remains unknown, in the 1885 description of the condition that now bears his name, Georges Gilles de la Tourette noted a role for heredity. Today, we recognize TS as a highly heritable (~60–80%)[2-4] neuropsychiatric disorder characterized by recurrent, nonrhythmic tics with a childhood onset and male bias.[1] First-degree relatives of TS cases have a 15-to-25-fold increased risk of TS[5, 6] compared with the general population, representing one of the highest recurrence risks for common neuropsychiatric disorders. Gene discovery for TS is in its infancy, yet as with other psychiatric disorders,[7] the available results suggest that much of overall genetic risk derives from a combination of many common loci of small effect and fewer rare variants of large effect. Regarding common variation, published TS genome-wide association studies (GWAS)[8, 9] to date reveal substantial SNP heritability (~30%) and genetic overlap with obsessive compulsive disorder (OCD),[10] though sample sizes to date are relatively small (4,819 cases and 9,488 controls; larger studies are underway). Regarding rare variation, there is now evidence for several genes with likely gene-disrupting mutations[11-16] or copy number variants (CNVs).[17-20] The relative importance of common versus rare variation in TS, particularly in individual families, has yet to be determined. To begin to address this question, we evaluated a notable TS pedigree using modern genomic methods. Large pedigrees with an unusually high rate of symptom presentation are appealing targets for psychiatric genetic study. 
There is already strong evidence that rare coding variants contribute to psychiatric illness risk across multiplex families, for example in autism.[21] More recently, there have been multiple studies focused on schizophrenia and bipolar disorder that have found that families with multiple affected individuals display higher common variant genetic risk.[22-24] The pedigree studied here was first described by Robertson and Gourdie in 1990.[25] They reported a multiply-affected six-generation kindred with one founding couple and affected family members in five generations. A total of 122 members were identified, 85 were individually interviewed, and 53 were diagnosed as ‘cases’. ‘Cases’ consisted of 28 with definite or probable TS, 20 with chronic multiple motor or vocal tics (CMT), and 5 with obsessive-compulsive behavior (OCB). Only eight of the 53 cases had previously consulted a doctor for their symptoms, with one receiving a diagnosis of TS. Many of those diagnosed later in adulthood presented with mild symptoms. For a full clinical and psychopathological description of the pedigree, see Robertson and Gourdie (1990).[25] This pedigree has previously been examined by linkage and candidate gene analysis. While an initial complex segregation analysis by Curtis et al. (1992)[26] suggested the possibility that TS could have been inherited as a single major gene with autosomal dominant inheritance, subsequent work by Brett et al. (1994, 1995a, b, 1996, 1997)[27-31] was inconsistent with this hypothesis as areas of interest on chromosome 3, chromosome 8, and several candidate genes were ruled out. A genome-wide linkage scan was performed by Curtis et al. (2004)[32], pointing to areas of modest interest (at least one result significant at 0.005 using two-point analysis) on chromosomes 5, 10 and 13.
A subsequent factor analytic study of phenotypes from this pedigree by Robertson and Cavanna (2007)[33] suggested that the genetics of TS in this pedigree may be more complex than previously appreciated (i.e. more than a few segregating risk loci). We reanalyzed this TS pedigree using modern genomic methods. All 66 available DNA samples (25 TS, 15 CMT, 4 OCB cases and 22 unaffecteds) were genotyped using a dense SNP array to identify shared segments and CNVs, and to calculate genetic risk scores. Eight cases were also whole genome sequenced to test whether any rare variants were shared identical by descent. These eight samples were selected from the most recent generation with DNA available and were spread across the pedigree in order to maximize the genetic distance between samples.

Methods and Materials

Samples

All individuals provided written informed consent. The structure of the pedigree studied here is shown in Figure 1. We had access to a total of 66 DNA samples (25 TS, 15 CMT, 4 OCB cases, and 22 unaffecteds). For a full clinical and psychopathological description of the pedigree, see Robertson and Gourdie (1990).[25] DNA was extracted from peripheral blood using standard methods.

Figure 1.

The pedigree for the family being studied. The plus signs indicate individuals where we were able to obtain DNA and conduct array-based genotyping. Individuals being pointed to with an arrow were subjected to whole genome sequencing. We highlight inbreeding loops which represent potential means through which recessive genotypes or polygenic risk for Tourette syndrome could accumulate.

SNP array

All 66 samples were genotyped on the Illumina Multi-Ethnic Genotyping Array (MEGA) with ~1.7 million markers. Genotyping was performed at the SNP&SEQ Technology Platform at Uppsala University, Sweden and the results were analyzed using the software GenomeStudio 2011.1 from Illumina Inc. The raw dataset provided contains data from a total of 1,779,819 markers across a total of 67 samples (1 of which was a known duplicate). For downstream shared segment analysis, as quality control, we verified sex and relatedness and included only variants that were missing calls in <=2% of samples, carried a Mendelian error rate < 0.02 and had a minor allele frequency (MAF) within the cohort of at least 0.01 (N = 736,144 SNPs). All 66 distinct samples had a genotype missingness < 0.02 and qualified for inclusion within the analysis.
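
The variant-level filters described above can be sketched as a simple predicate. This is an illustrative sketch only, not the pipeline used in the study: the `VariantStats` container, the field names and the demo records are assumptions introduced here, and only the three thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    """Per-variant summary statistics (hypothetical container, not from the paper)."""
    snp_id: str
    missing_rate: float        # fraction of samples with no call
    mendel_error_rate: float   # fraction of informative trios/duos with an error
    maf: float                 # minor allele frequency within the cohort


def passes_qc(v, max_missing=0.02, max_mendel=0.02, min_maf=0.01):
    """Apply the three thresholds quoted in the text."""
    return (
        v.missing_rate <= max_missing          # missing calls in <= 2% of samples
        and v.mendel_error_rate < max_mendel   # Mendelian error rate < 0.02
        and v.maf >= min_maf                   # cohort MAF of at least 0.01
    )


def filter_variants(variants):
    """Return only the variants that pass all filters."""
    return [v for v in variants if passes_qc(v)]


# Made-up records purely to exercise the filter.
example = [
    VariantStats("rs_demo_1", 0.00, 0.000, 0.250),  # passes everything
    VariantStats("rs_demo_2", 0.05, 0.000, 0.250),  # too many missing calls
    VariantStats("rs_demo_3", 0.00, 0.000, 0.005),  # too rare in the cohort
    VariantStats("rs_demo_4", 0.02, 0.019, 0.010),  # sits on the boundaries, passes
]
kept = [v.snp_id for v in filter_variants(example)]
```

In practice these filters would be applied with a genotype toolkit rather than by hand; the sketch only makes the inequality directions explicit.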

Shared segment analysis

We used the MEGA array genotype data to identify segments of the genome identical by descent (IBD). This was performed across all 66 samples included in the MEGA array data. Beagle (version r1399, settings: impute=false, ibd=true) was used to collect pairwise IBD segments and DASH (v1.1.0, settings: default) was used to cluster pairwise IBD segment calls. We converted these IBD segments into PLINK bed/bim/fam files, and then used GCTA v1.92.3 to transform these data into a genetic relationship matrix (GRM) for pedigree members using the subset of segments found at MAF >= 0.01. We used GCTA v1.92.3 again (--mlma --maf 0.01) to determine whether any of these segments were associated with affected status within the pedigree at a multiple-test-corrected level of significance (i.e. case/control with 44 cases and 22 controls), using the computed GRM to control for pairwise relatedness within the family. Cases were defined as individuals with CMT, TS or OCB.

CNV analysis (arrays)

We used PennCNV v1.0.4 to call CNVs from genotype array data. The PennCNV script ‘detect_cnv.pl’ was used to call CNVs from sample-level B Allele Frequency (BAF) and Log R Ratio (LRR) data from all 1,779,819 markers that were on the array. We used ‘filter_cnv.pl’ to retain CNVs that were supported by at least 10 SNPs and had a confidence score of at least 10. We then used ‘clean_cnv.pl’ (specifically, the function ‘combineseg’) to merge separate CNV calls on single samples that are likely part of one large single CNV, in order to form a final CNV callset. From this callset we identified a subset of 5 samples that had an LRR standard deviation or BAF drift > 95th percentile within the pedigree and excluded these from further analysis, leaving 61 samples (42 affecteds, 19 unaffecteds) for analysis. We note that of these 5 samples, 4 were >95th percentile for both LRR standard deviation and BAF drift, offering strong support for their exclusion from analysis. Using the CNV callset, we performed standard CNV association testing, and also looked for specific CNV calls associated with a psychiatric phenotype from prior studies. We performed CNV case/control association testing using PLINK v1.07, using 100,000 permutations and a seed value of 892523. We understood that this analysis was particularly underpowered, and thus took note of any CNVs with an association p-value < 0.05. We also examined in detail any CNV that carried a reciprocal overlap greater than 0.5 with an entry from a list of 54 previously identified neurodevelopmental CNVs.[34]
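
The call-level filter and the reciprocal overlap criterion used above can be illustrated with a short sketch. The function names, the half-open coordinate convention and the example coordinates are assumptions made here for illustration; the study itself relied on the PennCNV scripts named above.

```python
def passes_cnv_filter(num_snps, confidence, min_snps=10, min_conf=10.0):
    """Keep calls supported by at least 10 SNPs with a confidence score of at least 10."""
    return num_snps >= min_snps and confidence >= min_conf


def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Smaller of the two overlap fractions between intervals A and B.

    Intervals are treated as half-open [start, end) on the same chromosome
    (an assumption of this sketch). A value above 0.5 means the shared
    stretch covers more than half of BOTH intervals.
    """
    if a_end <= a_start or b_end <= b_start:
        raise ValueError("intervals must have positive length")
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


# Made-up coordinates: a 500 kb call against a 450 kb reference region.
call = (20_000_000, 20_500_000)
region = (20_100_000, 20_550_000)
ro = reciprocal_overlap(*call, *region)
matches_known_region = ro > 0.5
```

Taking the minimum of the two fractions is what makes the criterion reciprocal: a small call sitting inside a very large reference region does not qualify, and neither does the reverse.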

Whole genome sequencing (WGS)

A total of 8 cases were sequenced on the Illumina HiSeq X to a mean coverage of 36X with paired-end 150 bp reads. Sequencing was performed at the SNP&SEQ Technology Platform at Uppsala University, Sweden. We used the BAM and genomic VCF files produced by the Uppsala University bioinformatics core, aligned on GRCh37 and produced following GATK best practices, for downstream analyses. We performed joint calling using the genomic VCFs followed by variant quality score recalibration, once again according to Broad GATK best practices workflows. As quality control, we tested for concordance between array-based variant genotypes and sequence-based genotypes, which would be consistent with sample concordance. Disagreement between the genotypes would suggest that sample switching had occurred in WGS data.
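
As a rough illustration of the concordance check described above, the sketch below compares genotypes from the two platforms at shared sites. It is a simplified stand-in under stated assumptions (genotypes coded as alternate-allele counts 0/1/2, `None` for a missing call, made-up variant keys) and is not the procedure used in the study.

```python
def genotype_concordance(array_calls, wgs_calls):
    """Fraction of variants with identical genotypes on both platforms.

    Both arguments map a variant key to an alternate-allele count (0, 1 or 2),
    or None when the genotype is missing. Only variants called on both
    platforms are compared. Returns None if nothing can be compared.
    """
    shared = [
        key for key, gt in array_calls.items()
        if gt is not None and wgs_calls.get(key) is not None
    ]
    if not shared:
        return None
    matches = sum(1 for key in shared if array_calls[key] == wgs_calls[key])
    return matches / len(shared)


# Made-up genotypes for one sample: v4 is missing on the array, v5 is WGS-only.
array_calls = {"v1": 0, "v2": 1, "v3": 2, "v4": None}
wgs_calls = {"v1": 0, "v2": 1, "v3": 1, "v5": 2}
rate = genotype_concordance(array_calls, wgs_calls)
```

A rate near 1 for the expected array/WGS pair, and clearly lower rates for mismatched pairs, would be the pattern consistent with no sample switching.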

CNV analysis (WGS)

To call structural variants (deletions, duplications, inversions and translocations) from WGS data, we used Delly2 v0.7.7. For the calling of each distinct SV type we followed the suggested workflow for germline SV calling by the developers: 1) single sample calling using developer-provided exclusion regions for hg19, and using the 1kg_human_ref.fa file; 2) merging of sample level genotype calls into a unified site list with constraints min_sv_size = 500, max_sv_size = 1000000, max_breakpoint_offset = 500, min_reciprocal_overlap = 0.5; 3) re-genotyping each single sample using the unified site list; 4) merging of re-genotyped single-sample call BCFs to get one merged call BCF; and 5) application of the germline SV filter to the merged call BCF. We screened for CNVs (deletions, duplications, translocations and inversions) that overlapped exons from coding genes in Ensembl 37.75, and for deletions and duplications, did not overlap by more than 50% with CNVs found in > 1% of the general population. We screened for CNVs that were absent from 9 internal non-pedigree samples and were found in at least 1 of the 8 index cases. We also screened for CNVs that had reciprocal overlap with the previously described 54 neurodevelopmental CNV regions[34] and were found in at least one sample.

Repeat expansion analysis (WGS)

We screened the 8 WGS samples for shared previously reported pathogenic repeat expansions using TREDPARSE v0.7.8.[35] A total of 30 tri-nucleotide repeat expansion loci were screened for the presence of an expansion size with prior evidence of pathogenicity. Any positive results were listed, in particular any that were shared between more than one sample. In addition, we used ExpansionHunter v2.5.3, and screened for instances of expansions at the sample level where the majority of the reads spanned the entirety of the repeat, as opposed to containing the repeat region as a subset of the read.

SNV/indel analysis (WGS)

For alignment and variant calling, GATK best practices were followed. Realignment, base quality score recalibration, and realignment at indel sites were performed at SciLifeLab. From genome-level VCF files we performed joint genotyping on 8 TS pedigree members alongside 9 additional samples in-batch, forming a single callset. We then performed variant quality score recalibration following GATK best practices, requiring that all variants fall within a VQSR tranche of >99.9% for SNVs and >95% for indels. After this we performed transcript-level variant annotation on the VCF using Variant Effect Predictor v90. Finally, additional annotation of the SNV/indel VCF was carried out using vcfanno v0.2.6, and included global and subpopulation allele frequency across exome and genome samples in gnomAD v2.0.2, as well as noncoding constraint percentiles derived from human genome data.[36] We prioritized loss of function coding variation based on the probability of loss-of-function intolerance (pLI) from the nonpsychiatric subset of ExAC v0.3.1,[37] and missense coding variation based on the MPC score (https://doi.org/10.1101/148353). We screened for coding and noncoding variants with potential for contribution of risk to TS based on sharing between sequenced index cases to a degree that defied expectation. Three analyses of rare variation were constructed: 1) a genome-level screen for variants with evidence of excess sharing among index cases at a level of significance that survives a multiple test threshold; 2) tests for an excess of rare nonsynonymous variant sharing relative to synonymous variants of similar MAF; and 3) tests for an excess of rare constrained (context-dependent tolerance score or CDTS<1%) noncoding variant sharing relative to nonconstrained (CDTS>=1%) noncoding variant sharing.
P-values from variant-level association testing were derived from the R package RVS.[38] We merged genomewide tests for oversharing with data from WGS deletions, duplications, translocations and inversions described previously. We also catalogued all coding SNVs and indels that were rare in all populations in gnomAD (gnomAD v2.0.2 popmax MAF < 0.0001), found in at least one sequenced case and carried strong evidence for being subject to purifying selection, either based on being loss of function within a gene that is highly loss of function intolerant (pLI > 0.995) or based on being a missense variant with an MPC score >= 2.
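
The cataloguing rule for damaging coding variants can be written as a small predicate. This is a minimal sketch: the function name, the argument names and the simplified consequence labels ("lof", "missense") are assumptions introduced here, while the three thresholds are those quoted above.

```python
def is_damaging_coding_variant(consequence, popmax_maf, pli=None, mpc=None):
    """Return True if a coding variant meets the 'damaging' criteria in the text.

    consequence : simplified label, "lof" or "missense" (hypothetical coding)
    popmax_maf  : gnomAD popmax minor allele frequency
    pli         : gene-level pLI score (used for loss-of-function variants)
    mpc         : variant-level MPC score (used for missense variants)
    """
    if popmax_maf >= 0.0001:       # must be rare in all gnomAD populations
        return False
    if consequence == "lof":       # loss of function in a highly LoF-intolerant gene
        return pli is not None and pli > 0.995
    if consequence == "missense":  # missense with MPC score >= 2
        return mpc is not None and mpc >= 2
    return False


# Made-up annotations to exercise each branch.
examples = [
    ("lof", 0.0, 0.999, None),       # damaging: rare LoF, pLI > 0.995
    ("lof", 0.0, 0.90, None),        # not damaging: gene tolerates LoF
    ("missense", 0.00005, None, 2.3),  # damaging: rare missense, MPC >= 2
    ("missense", 0.001, None, 2.3),    # not damaging: too common
    ("synonymous", 0.0, None, None),   # not damaging: neutral class
]
flags = [is_damaging_coding_variant(*e) for e in examples]
```

Missing pLI or MPC annotations are treated as "not damaging" here, which is a conservative choice of this sketch rather than something stated in the text.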

Genetic risk scores (GRS)

In order to generate discovery and target TS datasets for GRS analyses, we split the individual-level data from Yu et al.[9] into two datasets. The first comprised 2108 cases and 5726 controls and was used for the discovery GWAS. The second comprised 2191 cases and 2848 controls and was used as the target sample. To process the pedigree data the same way as the unaffected and TS sample, we first redid variant and sample-level QC using the ricopili function ‘preimp_dir_12’ on default settings resulting in a dataset of 762,276 variants across all 66 pedigree samples. We then used the ricopili ‘impute_dirsub_57’ function on the dataset to impute common variation using 1000 genomes phase 3 as a reference panel, just as was done with external samples. Next, we subsetted on samples that could be classified as European-ancestry according to PEDDY v0.4.2, resulting in a target sample set of 2832 unaffected controls, 2171 TS-affected cases, 22 unaffected pedigree members and 43 affected pedigree members. We also utilized published GWAS summary statistics for OCD[39] and ADHD[40], consisting of case/control cohort sizes of 2688/7037 and 19,099/43,194 respectively. Genetic risk score calculation was done using PRScs (version update Apr 24, 2020 on https://github.com/getian107/PRScs). PRScs SNP-level weights were calculated using PRScs.py, using author-provided 1000 genomes EUR LD reference filesets and defining the size of each GWAS. Once all SNP-level weights were calculated using PRScs, the genetic risk score for a given sample was calculated as the linear combination of SNP weights multiplied by SNP dosages across autosomes, carried out in PLINK v1.90b3w using the --score option and providing an autosomal SNP weight file as input. All normalization and analysis of computed GRS values were done using R v3.4.3.
For a given set of GRS values relevant to a phenotype, we Z-transformed the values specifically for the subset of samples that were in the pedigree or in the set of unrelated TS cases or population controls. We removed any instance of sample GRS outside of 6 SD from the mean (only two noted, both of which were in the TS GRS test and were unrelated TS cases). We conducted multiple-group comparisons with integrated multiple correction using the Tukey honest significant difference test, implemented in R via the function ‘TukeyHSD’. Plotting was done using ggplot2 (Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978–3-319–24277-4, https://ggplot2.tidyverse.org.), with code for raincloud plots taken from https://micahallen.org/2018/03/15/introducing-raincloud-plots/ (https://wellcomeopenresearch.org/articles/4-63/v1). For comparisons of GRS that effectively controlled for pairwise relatedness, we used an approach described previously[22] that relies on the R package GMMAT.[41] We formed a kinship matrix across all pedigree samples and all external control samples based on autosomal SNPs from the previously described set of genotypes that have a MAF of at least 0.05 in the full cohort. In R, we then used the GMMAT function glmm.wald() to form a logistic mixed model where for a given binary comparison, membership of one group versus the other was the outcome, and the only listed fixed effect was gender. The kinship relationship matrix for included samples was provided as a random effect in the model. As described in[22] we fitted this model by maximum likelihood using Nelder–Mead optimization. To make the fitting tractable, we selected a set of 260 random samples from the control cohort set (based on retaining a 1:4 pedigree/control sample ratio) for use in our analysis versus the 65 assessable samples in the pedigree dataset. 
Finally, we performed a final set of tests using GMMAT that were specifically focused on within-pedigree samples, in order to test the following null hypotheses: 1) that within affected samples, there would be no difference in TS GRS between CMT cases and formal TS cases; 2) within unaffected samples there would be no difference in GRS between male and female individuals; 3) within affected samples there would be no difference in GRS between male and female individuals.
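
The core GRS arithmetic described in this section (a weighted sum of dosages, a Z-transformation, and removal of values beyond 6 SD) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and toy inputs are invented here, the sample standard deviation is assumed for the Z-transformation, and PLINK's own scaling conventions for scores are not reproduced.

```python
from statistics import mean, stdev


def genetic_risk_score(weights, dosages):
    """Linear combination of per-SNP weights and allele dosages for one sample.

    weights : dict mapping SNP id -> effect-size weight (e.g. from PRScs)
    dosages : dict mapping SNP id -> dosage of the effect allele (0 to 2)
    SNPs without a dosage for this sample are skipped.
    """
    return sum(w * dosages[snp] for snp, w in weights.items() if snp in dosages)


def z_transform(values):
    """Centre and scale scores using the sample mean and standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]


def drop_outliers(z_scores, limit=6.0):
    """Remove standardized scores lying more than `limit` SD from the mean."""
    return [z for z in z_scores if abs(z) <= limit]


# Toy example with made-up weights and dosages.
weights = {"snp_a": 0.10, "snp_b": -0.05}
score = genetic_risk_score(weights, {"snp_a": 2.0, "snp_b": 1.0})
standardized = z_transform([1.0, 2.0, 3.0])
retained = drop_outliers([-1.2, 0.3, 6.5])
```

Because the scores are standardized before group comparison, any constant rescaling of the raw score (for example, averaging over the number of alleles rather than summing) leaves the Z-scores unchanged as long as every sample is scored over the same set of SNPs.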

Results

Quality Control for array and WGS data

The genotype array data proved to be high quality. Genotype concordance between duplicates in the data was 99.99%. We used PLINK to assess sequence-derived sex and pairwise relatedness in array data. We found that all 66 genotyped family member sexes were concordant with those reported in the provided family pedigree. We next performed pairwise relatedness with the plink ‘--genome’ function on LD-pruned variants with MAF > 10% (see Methods and Materials), and verified that all relationships reported in the pedigree up to the second degree are supported by the array data. Like the array data, WGS data also were of high quality. On average, 98.9% of reads per sample were successfully aligned to the reference and mean coverage was 36.1X (range: 31.4X-39.0X). We used PLINK to perform relatedness checks between WGS data and reportedly matching array data, using genotypes from 50,108 easily-intersected variants between the data. All WGS/array sample pairs reported as coming from the same individual showed near 100% relatedness, consistent with the DNA from these pairs coming from a single individual.

Shared segments in array data

Using high density array data, we identified 57,533 total distinct shared segments (i.e. haplotypes) within the pedigree. We tested for the presence of a shared segment found at a higher dosage in affected family members (CMT, OCB, TS) at a significance level that survives multiple test correction (0.05 / 57,533 = 8.7 × 10^−7). To accomplish this, we utilized a mixed model approach that takes familial pairwise relationships as covariates in comparing haplotype dosage, as described elsewhere.[23] We were unable to identify any shared haplotype that passed the preset significance threshold (Figure 2A, Supplemental Table 1). The region that came closest was a segment which maps to chr7:34282415 (P = 1.7 × 10^−4). The genes nearby (BMPER and NPSR1) have no reported relevance to neuropsychiatric phenotypes. These null results are consistent with the absence of a single fully penetrant variant that might explain the high rate of TS within the pedigree.
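
The significance threshold used here is a Bonferroni correction, which divides the nominal alpha by the number of tests. The short sketch below reproduces that arithmetic for the shared-segment analysis; the function name is an assumption of this sketch, while the counts and the P-value are those reported above.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 57,533 distinct shared segments were tested.
segment_threshold = bonferroni_threshold(57_533)

# The closest segment (chr7:34282415) had P = 1.7e-4, well above the threshold.
top_segment_p = 1.7e-4
is_significant = top_segment_p < segment_threshold
```

The same calculation, with a different number of tests, gives the thresholds quoted for the array CNV and WGS oversharing screens.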

Figure 2.

Manhattan plots for single locus and variant tests in affected pedigree members versus unaffected controls. We were unable to identify a single array IBD segment (top), array CNV (middle) or WGS SNV/indel/CNV (bottom) which was found in cases relative to controls at a degree of significance that survives multiple testing. This is not to say that there is no contribution of risk from rare variants within this particular pedigree, but rather that we lack evidence for a particular single rare variant contributing to risk in the pedigree.

CNV screen in array data

We called CNVs from array intensity data using PennCNV v1.0.4, and tested for evidence of overrepresentation of single CNVs in affecteds, as well as the presence of CNVs with known neuropsychiatric consequence.[34] We focused the analysis on a subset of 6513 QC-passing calls found within 61 QC-passing samples (42 affecteds, 19 unaffecteds). We first used PLINK v1.07 to determine if there is a CNV where the degree of elevated frequency in affecteds versus unaffecteds passes the multiple test corrected threshold. Here, since we were able to detect a total of 12,037 distinct markers for CNVs found at least once across the pedigree, we set the threshold as 0.05 / 12,037 = 4.2 × 10^−6. We were again unable to find a CNV that passed the preset threshold (Figure 2B, Supplemental Table 2). The most significant results were duplications overlapping the genes SLC2A3 and SLC2A14, neither of which carries any known relevance to neuropsychiatric phenotypes. We next screened the CNV callset for overlap with a set of 53 previously described neurodevelopmental CNVs. We detected 2 overlapping calls mapping to a single parent-child transmission of a 450 kb 15q11.2 deletion. Given that this is the only observation of this CNV in the cohort, and that both parent and child are reported as unaffected, it seems unlikely that this CNV is relevant to the phenotype.

Oversharing of SNVs, indels and SVs in WGS data

For the 8 affected cases that we whole genome sequenced, we used the RVS package in R to compute, given the pedigree, the probability of the sharing pattern observed by chance. All CNVs that overlapped a coding base were included, while only SNVs and indels that were either nonsynonymous or in a region with CDTS < 5% were considered. We weighted this probability for SNVs and indels by their frequency in gnomAD v2.0.2, corrected for the total number of variants assessed in this way (5349 total), and searched for a variant where the degree of oversharing passed the multiple test corrected threshold (0.05 / 5349 = 9.3 × 10^−6). We were unable to identify a single variant for which the probability of the observed sharing occurring by chance passed the significance threshold. The most significant instance of oversharing was a noncoding SNV within an intronic region of the protein-coding gene WDR37 (4 of 8 pedigree samples, p = 2.8 × 10^−5). The most significant oversharing of a CNV was that of a duplication overlapping the gene CGNL1 (4 of 8 pedigree samples, p = 1 × 10^−4). We did identify 3 variants that were found in 5 of 8 samples. One was a noncoding SNV in the 5’ untranslated region of the gene BBS10, another was a noncoding SNV in the 3’ untranslated region of the gene SLC2A4RG, and the final variant was a missense SNV in the gene RTEL1. None of the overlapping protein-coding genes have known psychiatric relevance. Only RTEL1 showed particularly high gene expression in brain in GTEx.

Sharing of damaging SNVs and indels in WGS data

While there was no compelling evidence in the data for a single penetrant variant that was found among all 8 index cases that underwent WGS, we also tested the hypothesis that there may be damaging rare variants in the pedigree that altogether are overshared to a greater degree than expected by chance. For a given selection of genomic bases that fit a broad classification of annotation (coding, noncoding) we used a corresponding set of variant level annotations that could be considered neutral within these spaces (synonymous variation and variants in regions with CDTS > 1%, respectively). We compared the ratio of sharing/nonsharing of variants that are rare (gnomAD popmax MAF < 0.0001) with a deleterious classification to that of neutral variation, since these variants best represent patterns of sharing specific to this pedigree. We noted some oversharing of loss of function coding variation relative to presumably neutral synonymous variation (Odds Ratio=3.39, p=0.009, Supplemental Figure S1). There was no significant evidence for oversharing of noncoding variation in constrained versus nonconstrained regions, or for missense variation relative to synonymous variation. Oversharing of loss of function variants would be consistent with a non-zero amount of these variants contributing to total genetic risk for TS for at least a fraction of cases. Despite this, there was no single variant with deleterious annotation that was found in all 8 index cases that had potential to explain the heritability within this pedigree. Ultimately, these data do not offer compelling evidence that there are discernable shared rare variants that contribute to TS risk within this particular pedigree.
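
The sharing comparison above reduces to a 2×2 table: deleterious versus neutral variants, each split into shared and not shared. The sketch below shows how an odds ratio and a one-sided exact P-value could be computed from such a table. It is a sketch under explicit assumptions: the text does not state which test produced the reported P-value, so the use of Fisher's exact test is an assumption of this example, and the counts are invented for illustration and are not the study's data.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    a = deleterious shared, b = deleterious not shared,
    c = neutral shared,     d = neutral not shared.
    """
    if b == 0 or c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact P-value for enrichment in the top-left cell.

    Sums hypergeometric probabilities of tables at least as extreme as the
    observed one, holding the row and column totals fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)
    upper = min(row1, col1)
    numer = sum(comb(row1, x) * comb(row2, col1 - x) for x in range(a, upper + 1))
    return numer / denom


# Invented counts, for illustration only.
a, b, c, d = 3, 1, 1, 3
example_or = odds_ratio(a, b, c, d)
example_p = fisher_exact_greater(a, b, c, d)
```

With counts this small the exact test is the natural choice; for the much larger variant counts in a genome-wide comparison, a logistic model or a chi-squared approximation would give similar answers.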

Damaging coding SNVs and indels present in the pedigree

We identified a total of 27 variants that could be classified as damaging coding variants (Table S3). Of these, 23 could be classified as damaging based on having missense annotation with an MPC score >= 2, while 4 could be classified as damaging based on having loss of function annotation and falling within a gene that has a pLI score in the nonpsychiatric subset of ExAC v0.3.1 > 0.995. Of these, only 2 were shared, a damaging missense variant in ALKBH6 (carried by 3 of 8 index cases) and a damaging missense variant in PDE4A (found in 2 of 8 index cases).

Genetic risk scores of unrelated TS cases and healthy controls

As a final analysis of the genetic variation within the pedigree, we tested the hypothesis that the family carries a burden of genetic risk on the more common end of the allele frequency spectrum, detectable through computation of common variant genetic risk scores (GRS). We calculated sample-level GRS values for TS, OCD and ADHD across a European ancestry cohort consisting of 65 genotyped pedigree members (43 affected, 22 unaffected), 2171 unrelated TS cases and 2832 healthy controls. See Methods and Materials for details. Before proceeding, we checked for evidence of any sort of systemic stratification of pedigree and unrelated case/control data, and were unable to find any. We selected a subset of 13 unrelated married-in members of the pedigree and constructed a principal components analysis (PCA) of genetic variation with MAF > 0.05 in these individuals alongside unrelated TS cases and unaffected controls. Since the 13 pedigree members are unrelated we do not expect that the PCA will be impacted by any patterns of relatedness, and should only be influenced by ethnicity. We found no evidence for stratification of these individuals and the rest of the unrelated case/control cohort across 20 principal components (Supplemental Figure S2). We also conducted PCA on the unrelated case/control cohort with a representative in-family member (sample 14, close in relationship to pedigree founders) and do not see any separation of the in-family member from other unrelated samples across PCs (Supplemental Figure S3). We constructed a set of GRS comparisons meant to test the null hypothesis that there is an absence of group differences with regards to GRS calculated from meaningful summary statistics. We partition the samples into four different groups, 1) population controls, 2) unrelated TS cases, 3) unaffected pedigree samples and 4) affected pedigree samples. When using TS GRS values computed from PRScs, we found substantial differences between groups.
We found that, as expected, TS cases have a higher GRS than population controls (difference = 0.33, Tukey’s HSD adjusted P = 1.28 × 10^−12). We failed to find statistically significant evidence that unaffected family members have a higher load than controls (difference = 0.41, adjusted P = 0.22). We did, however, observe a higher level of GRS in affected family members than controls (difference = 0.67, adjusted P = 3.96 × 10^−5). We failed to observe a statistically significant difference in GRS between affected and unaffected family members, potentially due to low sample size (difference = 0.27, adjusted P = 0.72). Similar analyses of the load of common risk variation for OCD and ADHD did not yield similar results. We did not find significant evidence of elevated OCD GRS in affected pedigree samples relative to controls (difference = −0.15, adjusted P = 0.75, Supplemental Figure S4). We found that pedigree affected samples had unusually low ADHD GRS values relative to controls (difference = −0.74, P = 6.02 × 10^−6, Supplemental Figure S5). This is surprising given the genetic correlation between ADHD and TS, but could simply reflect the genetic profile of this family primarily being centered on TS risk. Interestingly, unaffected pedigree samples did not display a significant difference in ADHD GRS from controls (difference = 0.19, P = 0.80). We performed several follow-up analyses to determine if the elevation in GRS among pedigree affecteds could be explained via upstream batch effects or technical biases, and were unable to identify any. For example, we repeated the GRS procedure with variants that had MAF > 0.20 in unaffected samples from the TS GWAS summary statistics to determine if the GRS elevation was limited to rare variation, and found that there was still a clear elevated GRS in affected pedigree members relative to controls (difference = 0.59, P = 4.73 × 10^−4).
It was important to compare GRS values between pedigree samples and unaffected controls in a manner that controls for pairwise relatedness between all samples, since in this comparison essentially all cases come from the same family. To do this we followed a protocol described in separate work[22] that relies on the R package GMMAT[41] for fitting and analyzing a logistic mixed model. In the model, the outcome was membership in the group being tested, sex was a fixed effect, and the genetic relationship matrix between all included samples was a random effect. As before, we found that samples with higher GRS were more likely to be cases within the pedigree than controls outside the pedigree, even taking pairwise relatedness into account (OR=1.75, 95% CI=1.01–3.04, P=0.047). This particular difference was not significant at a false discovery rate of 0.1, but after adjustment there was a significant difference in TS GRS between the whole pedigree and external controls (OR=1.72, 95% CI=1.11–2.67, P=0.016; Supplemental Figure S6). Finally, we did not find significant evidence for a difference in TS GRS between affected in-pedigree individuals with TS versus CMT (OR=0.59, 95% CI=0.26–1.37, P=0.22), nor did we find evidence for a difference in TS GRS between in-pedigree unaffected female versus male individuals (OR=0.77, 95% CI=0.28–2.16, P=0.63). We did note a significant difference between in-pedigree affected female versus male individuals (OR=2.88, 95% CI=1.22–6.81, unadjusted P=0.02, FDR adjusted P=0.08), consistent with female cases in this pedigree having a higher load of autosomal TS polygenic risk than males.
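As a rough consistency check on the odds ratios reported above, the sketch below recovers a z statistic and two-sided P value from each OR and its 95% confidence interval. It assumes the intervals are symmetric Wald intervals on the log-odds scale; the text does not say which test was used to obtain the P values, so agreement is expected only approximately. The function name is illustrative and is not part of GMMAT.

```python
import math


def wald_from_odds_ratio(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover (beta, SE, z, two-sided P) from an OR and its 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p


# (OR, CI lower, CI upper, P as reported in the text)
reported = {
    "affected pedigree vs external controls": (1.75, 1.01, 3.04, 0.047),
    "whole pedigree vs external controls": (1.72, 1.11, 2.67, 0.016),
    "affected female vs male (in pedigree)": (2.88, 1.22, 6.81, 0.02),
}

for label, (or_, low, high, p_reported) in reported.items():
    beta, se, z, p = wald_from_odds_ratio(or_, low, high)
    print(f"{label}: z = {z:.2f}, recovered P = {p:.3f}, reported P = {p_reported}")
```

Under this assumption the recovered P values land close to the reported ones, with small discrepancies that are plausibly due to rounding of the published intervals.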

Discussion

Our comprehensive analyses did not conclusively identify any rare variants of strong effect in this British pedigree. However, we cannot exclude the presence of risk variants with more complex inheritance patterns, variants with more cryptic functional effects, or variants missed because of limited sequencing coverage or the choice of individuals sequenced. Nonetheless, our results suggest that the presence of many common TS risk alleles may partly explain the high incidence of tic disorders in this pedigree. These results may be somewhat surprising given how densely affected the pedigree is, but they are consistent with GWAS of unrelated tic disorder cases, which indicate a high degree of polygenicity. Prior studies of large pedigrees with other psychiatric disorders have found similar results. Collins et al[42] analyzed a large Spanish pedigree densely affected by bipolar disorder and found no rare variants of strong effect, but an unusually high load of common risk alleles for bipolar disorder. Szatkiewicz et al[23] genomically characterized a large pedigree from Northern Sweden with a high density of psychotic disorders and likewise found no rare variants of strong effect, but rather an elevated common variant risk profile. Diniz et al[43] studied a large, multigenerational pedigree from a rural village in Brazil with a high prevalence of mood disorders and noted a substantial role for common genetic variation, given the strength of linkage signals observed. One possible explanation for the accumulation of common risk alleles in a family is non-random mating, where partners with similar phenotypes are more likely to mate. In support of this idea, Nordsletten et al[44] found pervasive patterns of non-random mating within and across 11 major psychiatric disorders, including OCD (tic disorders were not reported due to insufficient power for analysis). 
Furthermore, in the Brazilian pedigree mentioned above, de Jong et al[24] found that polygenic risk contributions from married-in individuals, caused by non-random mating on phenotype, appeared to increase risk for psychiatric disorders. Finally, Kurlan et al[45] observed bilineal transmission (from both the maternal and paternal sides) in TS families. Strengths of this study include the large pedigree size and the depth of genomic characterization, which covered multiple types of rare variation (SNVs, indels, CNVs, repeat expansions) and common variation for relevant traits. Limitations include the fact that only eight of the 53 cases had previously consulted a doctor for their symptoms, and many of those diagnosed later in adulthood presented with mild symptoms. Furthermore, there was substantial heterogeneity of symptoms among the affecteds, which might explain the somewhat surprising lack of association with OCD GRS and the negative association with ADHD GRS. In conclusion, we systematically assessed this large pedigree dense with TS, CMT and OCB for genetic variants of strong effect. This comprehensive analysis did not conclusively identify any CNVs, SNVs, indels or repeat expansions of near-Mendelian effect. However, the common variant risk profiles of cases in this pedigree suggest that the etiology of TS, CMT and OCB in this pedigree is related more to the cumulative burden of TS genetic risk than to one or a few variants of extremely strong effect.

42 in total

Review 1. The genetics of Tourette syndrome.

Authors: Hao Deng; Kai Gao; Joseph Jankovic
Journal: Nat Rev Neurol Date: 2012-03-13 Impact factor: 42.937

Review 2. Genetic architectures of psychiatric disorders: the emerging picture and its implications.

Authors: Patrick F Sullivan; Mark J Daly; Michael O'Donovan
Journal: Nat Rev Genet Date: 2012-07-10 Impact factor: 53.242

3. Familial clustering of tic disorders and obsessive-compulsive disorder.

Authors: Heidi A Browne; Stefan N Hansen; Joseph D Buxbaum; Shannon L Gair; Judith B Nissen; Kathrine H Nikolajsen; Diana E Schendel; Abraham Reichenberg; Erik T Parner; Dorothy E Grice
Journal: JAMA Psychiatry Date: 2015-04 Impact factor: 21.596

4. Familial Risks of Tourette Syndrome and Chronic Tic Disorders. A Population-Based Cohort Study.

Authors: David Mataix-Cols; Kayoko Isomura; Ana Pérez-Vigil; Zheng Chang; Christian Rück; K Johan Larsson; James F Leckman; Eva Serlachius; Henrik Larsson; Paul Lichtenstein
Journal: JAMA Psychiatry Date: 2015-08 Impact factor: 21.596

Review 5. Tourette's syndrome.

Authors: James F Leckman
Journal: Lancet Date: 2002-11-16 Impact factor: 79.321

6. A twin study of Tourette syndrome.

Authors: R A Price; K K Kidd; D J Cohen; D L Pauls; J F Leckman
Journal: Arch Gen Psychiatry Date: 1985-08

Review 7. The genetics of Tourette syndrome: a review.

Authors: Julia A O'Rourke; Jeremiah M Scharf; Dongmei Yu; David L Pauls
Journal: J Psychosom Res Date: 2009-09-30 Impact factor: 3.006

8. Interrogating the Genetic Determinants of Tourette's Syndrome and Other Tic Disorders Through Genome-Wide Association Studies.

Authors: Dongmei Yu; Jae Hoon Sul; Fotis Tsetsos; Muhammad S Nawaz; Alden Y Huang; Ivette Zelaya; Cornelia Illmann; Lisa Osiecki; Sabrina M Darrow; Matthew E Hirschtritt; Erica Greenberg; Kirsten R Muller-Vahl; Manfred Stuhrmann; Yves Dion; Guy Rouleau; Harald Aschauer; Mara Stamenkovic; Monika Schlögelhofer; Paul Sandor; Cathy L Barr; Marco Grados; Harvey S Singer; Markus M Nöthen; Johannes Hebebrand; Anke Hinney; Robert A King; Thomas V Fernandez; Csaba Barta; Zsanett Tarnok; Peter Nagy; Christel Depienne; Yulia Worbe; Andreas Hartmann; Cathy L Budman; Renata Rizzo; Gholson J Lyon; William M McMahon; James R Batterson; Danielle C Cath; Irene A Malaty; Michael S Okun; Cheston Berlin; Douglas W Woods; Paul C Lee; Joseph Jankovic; Mary M Robertson; Donald L Gilbert; Lawrence W Brown; Barbara J Coffey; Andrea Dietrich; Pieter J Hoekstra; Samuel Kuperman; Samuel H Zinner; Pétur Luðvigsson; Evald Sæmundsen; Ólafur Thorarensen; Gil Atzmon; Nir Barzilai; Michael Wagner; Rainald Moessner; Roel Ophoff; Carlos N Pato; Michele T Pato; James A Knowles; Joshua L Roffman; Jordan W Smoller; Randy L Buckner; A Jeremy Willsey; Jay A Tischfield; Gary A Heiman; Hreinn Stefansson; Kári Stefansson; Danielle Posthuma; Nancy J Cox; David L Pauls; Nelson B Freimer; Benjamin M Neale; Lea K Davis; Peristera Paschou; Giovanni Coppola; Carol A Mathews; Jeremiah M Scharf
Journal: Am J Psychiatry Date: 2019-03-01 Impact factor: 18.112

9. Partitioning the heritability of Tourette syndrome and obsessive compulsive disorder reveals differences in genetic architecture.

Authors: Lea K Davis; Dongmei Yu; Clare L Keenan; Eric R Gamazon; Anuar I Konkashbaev; Eske M Derks; Benjamin M Neale; Jian Yang; S Hong Lee; Patrick Evans; Cathy L Barr; Laura Bellodi; Fortu Benarroch; Gabriel Bedoya Berrio; Oscar J Bienvenu; Michael H Bloch; Rianne M Blom; Ruth D Bruun; Cathy L Budman; Beatriz Camarena; Desmond Campbell; Carolina Cappi; Julio C Cardona Silgado; Danielle C Cath; Maria C Cavallini; Denise A Chavira; Sylvain Chouinard; David V Conti; Edwin H Cook; Vladimir Coric; Bernadette A Cullen; Dieter Deforce; Richard Delorme; Yves Dion; Christopher K Edlund; Karin Egberts; Peter Falkai; Thomas V Fernandez; Patience J Gallagher; Helena Garrido; Daniel Geller; Simon L Girard; Hans J Grabe; Marco A Grados; Benjamin D Greenberg; Varda Gross-Tsur; Stephen Haddad; Gary A Heiman; Sian M J Hemmings; Ana G Hounie; Cornelia Illmann; Joseph Jankovic; Michael A Jenike; James L Kennedy; Robert A King; Barbara Kremeyer; Roger Kurlan; Nuria Lanzagorta; Marion Leboyer; James F Leckman; Leonhard Lennertz; Chunyu Liu; Christine Lochner; Thomas L Lowe; Fabio Macciardi; James T McCracken; Lauren M McGrath; Sandra C Mesa Restrepo; Rainald Moessner; Jubel Morgan; Heike Muller; Dennis L Murphy; Allan L Naarden; William Cornejo Ochoa; Roel A Ophoff; Lisa Osiecki; Andrew J Pakstis; Michele T Pato; Carlos N Pato; John Piacentini; Christopher Pittenger; Yehuda Pollak; Scott L Rauch; Tobias J Renner; Victor I Reus; Margaret A Richter; Mark A Riddle; Mary M Robertson; Roxana Romero; Maria C Rosàrio; David Rosenberg; Guy A Rouleau; Stephan Ruhrmann; Andres Ruiz-Linares; Aline S Sampaio; Jack Samuels; Paul Sandor; Brooke Sheppard; Harvey S Singer; Jan H Smit; Dan J Stein; E Strengman; Jay A Tischfield; Ana V Valencia Duarte; Homero Vallada; Filip Van Nieuwerburgh; Jeremy Veenstra-Vanderweele; Susanne Walitza; Ying Wang; Jens R Wendland; Herman G M Westenberg; Yin Yao Shugart; Euripedes C Miguel; William McMahon; Michael Wagner; Humberto Nicolini; Danielle Posthuma; Gregory L Hanna; Peter Heutink; Damiaan Denys; Paul D Arnold; Ben A Oostra; Gerald Nestadt; Nelson B Freimer; David L Pauls; Naomi R Wray; S Evelyn Stewart; Carol A Mathews; James A Knowles; Nancy J Cox; Jeremiah M Scharf
Journal: PLoS Genet Date: 2013-10-24 Impact factor: 5.917

10. Genome-wide association study of Tourette's syndrome.

Authors: J M Scharf; D Yu; C A Mathews; B M Neale; S E Stewart; J A Fagerness; P Evans; E Gamazon; C K Edlund; S K Service; A Tikhomirov; L Osiecki; C Illmann; A Pluzhnikov; A Konkashbaev; L K Davis; B Han; J Crane; P Moorjani; A T Crenshaw; M A Parkin; V I Reus; T L Lowe; M Rangel-Lugo; S Chouinard; Y Dion; S Girard; D C Cath; J H Smit; R A King; T V Fernandez; J F Leckman; K K Kidd; J R Kidd; A J Pakstis; M W State; L D Herrera; R Romero; E Fournier; P Sandor; C L Barr; N Phan; V Gross-Tsur; F Benarroch; Y Pollak; C L Budman; R D Bruun; G Erenberg; A L Naarden; P C Lee; N Weiss; B Kremeyer; G B Berrío; D D Campbell; J C Cardona Silgado; W C Ochoa; S C Mesa Restrepo; H Muller; A V Valencia Duarte; G J Lyon; M Leppert; J Morgan; R Weiss; M A Grados; K Anderson; S Davarya; H Singer; J Walkup; J Jankovic; J A Tischfield; G A Heiman; D L Gilbert; P J Hoekstra; M M Robertson; R Kurlan; C Liu; J R Gibbs; A Singleton; J Hardy; E Strengman; R A Ophoff; M Wagner; R Moessner; D B Mirel; D Posthuma; C Sabatti; E Eskin; D V Conti; J A Knowles; A Ruiz-Linares; G A Rouleau; S Purcell; P Heutink; B A Oostra; W M McMahon; N B Freimer; N J Cox; D L Pauls
Journal: Mol Psychiatry Date: 2012-08-14 Impact factor: 15.992

2 in total

Review 1. Tourette syndrome research highlights from 2021.

Authors: Andreas Hartmann; Per Andrén; Cyril Atkinson-Clement; Virginie Czernecki; Cécile Delorme; Nanette Marinette Debes; Natalia Szejko; Keisuke Ueda; Kevin Black
Journal: F1000Res Date: 2022-06-29

Review 2. Current understanding of the genetics of tourette syndrome.

Authors: Wei-De Lin; Fuu-Jen Tsai; I-Ching Chou
Journal: Biomed J Date: 2022-01-15 Impact factor: 7.892
