Jillian P Casey, Tiago Magalhaes, Judith M Conroy, Regina Regan, Naisha Shah, Richard Anney, Denis C Shields, Brett S Abrahams, Joana Almeida, Elena Bacchelli, Anthony J Bailey, Gillian Baird, Agatino Battaglia, Tom Berney, Nadia Bolshakova, Patrick F Bolton, Thomas Bourgeron, Sean Brennan, Phil Cali, Catarina Correia, Christina Corsello, Marc Coutanche, Geraldine Dawson, Maretha de Jonge, Richard Delorme, Eftichia Duketis, Frederico Duque, Annette Estes, Penny Farrar, Bridget A Fernandez, Susan E Folstein, Suzanne Foley, Eric Fombonne, Christine M Freitag, John Gilbert, Christopher Gillberg, Joseph T Glessner, Jonathan Green, Stephen J Guter, Hakon Hakonarson, Richard Holt, Gillian Hughes, Vanessa Hus, Roberta Igliozzi, Cecilia Kim, Sabine M Klauck, Alexander Kolevzon, Janine A Lamb, Marion Leboyer, Ann Le Couteur, Bennett L Leventhal, Catherine Lord, Sabata C Lund, Elena Maestrini, Carine Mantoulan, Christian R Marshall, Helen McConachie, Christopher J McDougle, Jane McGrath, William M McMahon, Alison Merikangas, Judith Miller, Fiorella Minopoli, Ghazala K Mirza, Jeff Munson, Stanley F Nelson, Gudrun Nygren, Guiomar Oliveira, Alistair T Pagnamenta, Katerina Papanikolaou, Jeremy R Parr, Barbara Parrini, Andrew Pickles, Dalila Pinto, Joseph Piven, David J Posey, Annemarie Poustka, Fritz Poustka, Jiannis Ragoussis, Bernadette Roge, Michael L Rutter, Ana F Sequeira, Latha Soorya, Inês Sousa, Nuala Sykes, Vera Stoppioni, Raffaella Tancredi, Maïté Tauber, Ann P Thompson, Susanne Thomson, John Tsiantis, Herman Van Engeland, John B Vincent, Fred Volkmar, Jacob A S Vorstman, Simon Wallace, Kai Wang, Thomas H Wassink, Kathy White, Kirsty Wing, Kerstin Wittemeyer, Brian L Yaspan, Lonnie Zwaigenbaum, Catalina Betancur, Joseph D Buxbaum, Rita M Cantor, Edwin H Cook, Hilary Coon, Michael L Cuccaro, Daniel H Geschwind, Jonathan L Haines, Joachim Hallmayer, Anthony P Monaco, John I Nurnberger, Margaret A Pericak-Vance, Gerard D Schellenberg, Stephen W Scherer, James S Sutcliffe, Peter Szatmari, Veronica J Vieland, Ellen M Wijsman, Andrew Green, Michael Gill, Louise Gallagher, Astrid Vicente, Sean Ennis.
Abstract
Autism spectrum disorder (ASD) is a highly heritable disorder of complex and heterogeneous aetiology. It is primarily characterized by altered cognitive ability including impaired language and communication skills and fundamental deficits in social reciprocity. Despite some notable successes in neuropsychiatric genetics, overall, the high heritability of ASD (~90%) remains poorly explained by common genetic risk variants. However, recent studies suggest that rare genomic variation, in particular copy number variation, may account for a significant proportion of the genetic basis of ASD. We present a large scale analysis to identify candidate genes which may contain low-frequency recessive variation contributing to ASD while taking into account the potential contribution of population differences to the genetic heterogeneity of ASD. Our strategy, homozygous haplotype (HH) mapping, aims to detect homozygous segments of identical haplotype structure that are shared at a higher frequency amongst ASD patients compared to parental controls. The analysis was performed on 1,402 Autism Genome Project trios genotyped for 1 million single nucleotide polymorphisms (SNPs). We identified 25 known and 1,218 novel ASD candidate genes in the discovery analysis including CADM2, ABHD14A, CHRFAM7A, GRIK2, GRM3, EPHA3, FGF10, KCND2, PDZK1, IMMP2L and FOXP2. Furthermore, 10 of the previously reported ASD genes and 300 of the novel candidates identified in the discovery analysis were replicated in an independent sample of 1,182 trios. Our results demonstrate that regions of HH are significantly enriched for previously reported ASD candidate genes and the observed association is independent of gene size (odds ratio 2.10). Our findings highlight the applicability of HH mapping in complex disorders such as ASD and offer an alternative approach to the analysis of genome-wide association data.
Year: 2011 PMID: 21996756 PMCID: PMC3303079 DOI: 10.1007/s00439-011-1094-6
Source DB: PubMed Journal: Hum Genet ISSN: 0340-6717 Impact factor: 4.132
Fig. 1 The principles and analytical approach of homozygous haplotype mapping. (a) The schematic outlines the principle of homozygous haplotype (HH) mapping. SNP genotype data are collected on each case and control. Homozygous and heterozygous SNPs are shown in black and grey respectively. Firstly, runs of homozygosity (ROH) are identified in the samples (outlined in purple boxes). The overlapping ROH region shared by a minimum of three individuals (shown between red dashed lines) is considered for the HH analysis. The haplotypes within the overlapping ROH region are identified and a Fisher’s exact test is applied to determine if a particular HH is significantly more common in ASD cases compared to parental controls. Only the haplotypes of those individuals who have an ROH in the region in question are considered. In the above example all four individuals (3 ASD cases and 1 parental control) have an overlapping ROH. However, the haplotype in the overlapping ROH may differ. The 3 ASD cases have haplotype A (blue) while the parental control has haplotype B (red). Haplotype A is shared at a higher frequency in ASD cases compared to parental controls (apply Fisher’s test) and is termed a risk homozygous haplotype (rHH). This is an example of an rHH that is specific to ASD probands. (b) Flowchart of homozygous haplotype analysis of ASD cohort. The discovery analysis was performed on 1,402 AGP trios from the AGP stage 1 collection. The replication study involved an additional 1,182 AGP trios from the stage 2 collection. The stage 1 and 2 samples were clustered together to (1) separate stage 1 and 2 individuals into population clusters of similar ancestry and (2) classify stage 2 individuals into the joint ancestry-matched population clusters for the stage 2 replication study. The same rHH mapping strategy was applied to the discovery (stage 1) and replication (stage 2) data sets independently. The genes located in homozygous haplotypes significantly more common in ASD cases compared to parental controls were identified in each analysis. The rHH candidate genes were then compared for the ancestry-matched groups that had at least 50 probands in both the discovery and replication sample sets. To assess the contribution of genomic architecture to the rHH findings in ASD, the same strategy was applied to two additional disease data sets: bipolar disorder (BD) and coronary artery disease (CAD). The locations of the rHH in ASD, BD and CAD were compared
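The steps in panel a can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the `min_snps` threshold and the toy genotypes are all invented here, and a real analysis would define ROH with a dedicated tool and far longer SNP runs.

```python
from math import comb

def find_roh(genotypes, min_snps=3):
    """Return (start, end) index pairs, end exclusive, for runs of homozygous SNPs."""
    runs, start = [], None
    for i, g in enumerate(genotypes):
        homozygous = g[0] == g[1]
        if homozygous and start is None:
            start = i
        elif not homozygous and start is not None:
            if i - start >= min_snps:
                runs.append((start, i))
            start = None
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes)))
    return runs

def shared_region(rohs):
    """Intersect one ROH per individual; None if they do not overlap."""
    start = max(s for s, _ in rohs)
    end = min(e for _, e in rohs)
    return (start, end) if start < end else None

def haplotype(genotypes, region):
    """Inside an ROH every SNP is homozygous, so one allele per SNP gives the haplotype."""
    s, e = region
    return "".join(g[0] for g in genotypes[s:e])

def fisher_right_tail(a, b, c, d):
    """One-sided Fisher's exact p value, P(X >= a), for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(comb(row1, k) * comb(n - row1, col1 - k)
                     for k in range(a, upper + 1))
    return favourable / comb(n, col1)

# Toy genotypes (invented for illustration): three cases, one parental control.
cases = {
    "case1": ["AA", "GG", "CC", "TT", "AG"],
    "case2": ["AG", "GG", "CC", "TT", "AA"],
    "case3": ["AA", "GG", "CC", "TT", "CC"],
}
controls = {"parent1": ["AG", "AA", "CC", "GG", "TT"]}

# Take the first ROH of each individual and intersect them.
everyone = {**cases, **controls}
region = shared_region([find_roh(g)[0] for g in everyone.values()])

case_haps = [haplotype(g, region) for g in cases.values()]
control_haps = [haplotype(g, region) for g in controls.values()]

candidate = case_haps[0]              # haplotype carried by the toy cases
a = case_haps.count(candidate)        # cases with the candidate haplotype
b = len(case_haps) - a                # cases with another haplotype
c = control_haps.count(candidate)     # controls with the candidate haplotype
d = len(control_haps) - c             # controls with another haplotype
p_value = fisher_right_tail(a, b, c, d)
```

With only four individuals the toy test cannot reach the 5% level; the example is meant to show the shape of the 2 × 2 table (carriers versus non-carriers among individuals with an ROH in the region), not a significant result.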
Fig. 2 Genetic ancestry of AGP sample set. Principal component analysis (PCA) of 2,584 ASD proband samples (discovery stage 1 = 1,402 samples, replication stage 2 = 1,182 samples) was performed in EIGENSOFT. Tracy–Widom statistics indicated that the first eight principal components (PCs) were significantly contributing to the genetic variation of the sample set (Supplementary Table 2). The Hopach hierarchical clustering algorithm was applied to eigenvalues (y-axis) from the first eight PCs (x-axis) (van der Laan and Pollard 2002). In the ‘Pop’ column each sample is coloured according to the AGP site at which it was collected (see legend). Hopach clustering, non-parametric bootstrapping and genetic distance calculations (Supplementary Table 3) identified ten ancestral population clusters labelled C1 to C10. The five population clusters with a minimum of 50 probands (C2–C6) were used in the discovery analysis (n = 1,019 trios)
Fig. 3 Comparison of HH sharing in ASD cases and parental controls. The normalised number of rHH (HH that are more common in one group compared to the other) in five population clusters with a minimum of 50 probands. The dark grey bars represent the number of HH that are more common (Fisher’s exact test right p value <0.05) in ASD probands compared to parental controls. Such regions are referred to as rHH throughout the paper. The light grey bars denote the number of HH that are more common (Fisher’s exact test left p value <0.05) in parental controls compared to ASD probands. To account for differences in sample size, counts have been normalised to a group of 100 samples (Supplementary Material 1). The number of rHH identified in ASD probands is significantly greater than the number of rHH identified in parental controls across the five population clusters (paired t test p value = 0.008)
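The normalisation and the paired comparison can be illustrated roughly as follows. The exact normalisation is defined in Supplementary Material 1, which is not reproduced here, so the simple proportional scaling below is an assumption. The proband counts come from the summary table of rHH results; the control-enriched counts are not given in this record, so `paired_t` is exercised on invented numbers only.

```python
from math import sqrt
from statistics import mean, stdev

# From the summary table: cluster -> (number of ASD probands, significant rHH).
discovery = {
    "C2": (148, 40),
    "C3": (289, 99),
    "C4": (85, 11),
    "C5": (280, 100),
    "C6": (217, 57),
}

def per_100(count, n_samples):
    """Scale a raw count to a group of 100 samples (assumed proportional scaling)."""
    return 100.0 * count / n_samples

normalised = {cluster: per_100(rhh, n) for cluster, (n, rhh) in discovery.items()}

def paired_t(xs, ys):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

# Invented values, used only to show how the paired statistic is formed.
t_stat, dof = paired_t([3.0, 5.0, 7.0], [1.0, 2.0, 3.0])
```

Pairing by cluster means each cluster contributes one difference (proband count minus control count), so between-cluster differences in sample size and ancestry do not inflate the variance of the comparison.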
Summary of rHH results for each population cluster
| Cluster | No. of ASD probands | No. of parental controls | Significant rHH: total | Significant rHH: ASD-specific | Significant rHH: enriched | Total no. of rHH genes |
|---|---|---|---|---|---|---|
| C2 | 148 | 294 | 40 | 23 | 17 | 243 |
| C3 | 289 | 584 | 99 | 44 | 55 | 341 |
| C4 | 85 | 170 | 11 | 5 | 6 | 79 |
| C5 | 280 | 560 | 100 | 48 | 52 | 417 |
| C6 | 217 | 434 | 57 | 18 | 39 | 372 |
| Total | 1,019 | 2,024 | 307 | 138 | 169 | 1,452 (a) |
A summary of results for each of the ancestry-matched population clusters in the discovery analysis. The number of homozygous haplotypes over-represented [5% significance level, risk homozygous haplotypes (rHH)] in the ASD cohort is further subdivided into ASD-specific (only present in probands) and enriched (more common in probands than controls). The number of genes implicated by the rHH in each population cluster is shown in the final column. A total of 307 rHH were identified across the 5 population clusters. These regions contained or were adjacent to 1,452 genes
(a) When genes that are found in more than one population cluster are considered only once, the final number of genes is 1,243
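The difference between the two totals is simply the difference between summing per-cluster gene counts and taking the union of the per-cluster gene sets. The sketch below checks the arithmetic reported in the table and abstract, then shows the union on placeholder gene names (`GENE_A` and so on are hypothetical, since the per-cluster gene lists are not given here).

```python
# Per-cluster gene counts from the summary table.
genes_per_cluster = {"C2": 243, "C3": 341, "C4": 79, "C5": 417, "C6": 372}
total_with_repeats = sum(genes_per_cluster.values())

# Unique genes reported in the abstract: 25 known plus 1,218 novel candidates.
unique_reported = 25 + 1218

# Listings that repeat a gene already counted in another cluster.
repeat_listings = total_with_repeats - unique_reported

# Placeholder gene sets (hypothetical) showing why the union is smaller than the sum.
toy = {
    "C2": {"GENE_A", "GENE_B"},
    "C3": {"GENE_B", "GENE_C"},
    "C5": {"GENE_A", "GENE_C", "GENE_D"},
}
toy_with_repeats = sum(len(genes) for genes in toy.values())
toy_unique = len(set().union(*toy.values()))
```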
Fig. 4 rHH identified in four population clusters in the vicinity of CADM2. An rHH located in a non-coding evolutionary-conserved region on 3p12.1 was identified in four of the five population clusters. The coloured bars represent the run of homozygosity in each patient/parental control carrying the rHH. For each population cluster, the rHH is the shared ROH segment. The ROH profile is presented with a conservation plot (ECR browser; conservation throughout mouse, dog and rhesus monkey of fragments >350 bp at 75% identity indicated in red). rHH adjacent to CADM2 were identified in 23/1,019 ASD cases and 11/2,031 parental controls [Yates’ corrected χ² p value = 1.9 × 10⁻⁵, OR = 4.26 (2.1, 8.6)]
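For readers who want to see how statistics of this kind are formed from a 2 × 2 table, here is a minimal sketch using the carrier counts quoted in the caption. The Woolf (log) interval and the 95% level are assumptions, as the caption does not say how the interval was obtained. The output of this sketch should be close to the published figures but need not match them exactly, since the authors' exact procedure is not described in this record.

```python
from math import erfc, exp, log, sqrt

def two_by_two(case_hits, case_total, control_hits, control_total):
    """Odds ratio, Woolf 95% interval, Yates-corrected chi-square and its p value."""
    a, b = case_hits, case_total - case_hits
    c, d = control_hits, control_total - control_hits
    n = a + b + c + d

    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (exp(log(odds_ratio) - 1.96 * se_log_or),
          exp(log(odds_ratio) + 1.96 * se_log_or))

    # Yates' continuity correction subtracts n/2 from |ad - bc| before squaring.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return odds_ratio, ci, chi2, p_value

# Carrier counts quoted in the caption: 23/1,019 cases and 11/2,031 controls.
odds_ratio, ci, chi2, p_value = two_by_two(23, 1019, 11, 2031)
```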
Previously identified ASD candidate genes located in rHH
| Population cluster | No. of genes identified in rHH regions | No. of rHH genes previously implicated in ASD | ASD candidate genes located in rHH regions |
|---|---|---|---|
| Discovery stage 1 | | | |
| C2 | 243 | 7 | |
| C3 | 341 | 12 | ALAS1, FBXO33, |
| C4 | 79 | – | – |
| C5 | 417 | 5 | ACO2, |
| C6 | 372 | 11 | |
| Replication stage 2 | | | |
| C3 | 529 | 8 | ALAS1, FOXP2, GBE1, GRIK2, |
| C4 | 70 | 1 | |
| C5 | 731 | 9 | ACO2, |
| C6 | 46 | – | – |
A number of genes that have previously been implicated in ASD were found to be located in rHH regions in both the discovery and replication HH mapping studies. In the discovery analysis, 25 previously reported ASD candidate genes occurred in rHH regions. Nine ASD candidate genes were located in rHH in more than one population group and are shown in bold. Another 16 ASD candidates were located in rHH in a single population group and may represent population-specific susceptibility genes. We also found that 16 previously implicated ASD genes were located in rHH regions in the replication study. Two ASD candidate genes were located in rHH in more than one population group and 14 ASD genes were population-specific. Ten ASD candidate genes (ALAS1, CSMD3, FOXP2, GABRG1, GBE1, GRM8, FBXO33, IMMP2L, SLC4A10 and ACO2) occurred in rHH regions in both the discovery and replication HH mapping analyses