De novo gene mutations highlight patterns of genetic and neural complexity in schizophrenia.

Bin Xu, Iuliana Ionita-Laza, J Louw Roos, Braden Boone, Scarlet Woodrick, Yan Sun, Shawn Levy, Joseph A Gogos, Maria Karayiorgou.

Abstract

To evaluate evidence for de novo etiologies in schizophrenia, we sequenced at high coverage the exomes of families recruited from two populations with distinct demographic structures and history. We sequenced a total of 795 exomes from 231 parent-proband trios enriched for sporadic schizophrenia cases, as well as 34 unaffected trios. We observed in cases an excess of de novo nonsynonymous single-nucleotide variants as well as a higher prevalence of gene-disruptive de novo mutations relative to controls. We found four genes (LAMA2, DPYD, TRRAP and VPS39) affected by recurrent de novo events within or across the two populations, which is unlikely to have occurred by chance. We show that de novo mutations affect genes with diverse functions and developmental profiles, but we also find a substantial contribution of mutations in genes with higher expression in early fetal life. Our results help define the genomic and neural architecture of schizophrenia.

Year:  2012        PMID: 23042115      PMCID: PMC3556813          DOI: 10.1038/ng.2446

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   38.330


Schizophrenia is a severe psychiatric disorder with a strong genetic component[1]. While the contribution of rare de novo copy number variants (CNVs) to schizophrenia risk is well established[2-5], the contribution of de novo nucleotide variants has not yet been probed extensively[6,7]. We completed exome sequencing of 146 Afrikaner family trios of subjects with a diagnosis of schizophrenia or schizoaffective disorder (SCZAFF)[3,8], as well as 34 unaffected control trios (53 and 22 trios, respectively, have been described previously[7]). We also sequenced the exomes of 85 US trios of subjects with schizophrenia or SCZAFF (see Methods and Supplementary Table 1). We excluded carriers of rare de novo CNVs (≥ 30 kb), based on prior CNV scans of these cohorts[3,4]. We used an analytical pipeline previously described[7] and a series of filters, including final validation by Sanger sequencing of all family members (see Methods and Supplementary Fig. 1). More than 90% of single nucleotide variants (SNVs) and 20% of insertions/deletions (indels) were validated (Supplementary Table 2). In our control cohort, we identified 16 exonic de novo SNVs and 1 protein-truncating indel in 34 subjects (0.50 event/sample) (Table 1). The point mutation rate in the captured coding sequence is 1.28 × 10^-8 per base per generation. Among the 16 de novo SNVs, 11 are predicted to be non-synonymous (non-syn) missense and 5 synonymous (syn) changes. The non-syn/syn ratio of 2.20 is consistent with neutral expectation (2.23)[9,10] and with those reported for control (unaffected siblings) samples from the Simons Simplex Collection (SSC) (non-syn/syn = 2.23, n = 200, Ref.[11]; 2.11, n = 31, Ref.[12]; 2.99, n = 343, Ref.[13]; average 2.65, n = 574, combined SSC control group).
Table 1

Distribution of de novo events in family cohorts

Values are the total number of de novo events (number of events/subject).

| Variant Type | Afrikaner cases (n = 146) | US cases (n = 85) | Total cases (n = 231) | Afrikaner controls (n = 34) |
|---|---|---|---|---|
| SNVs | 93 (0.64) | 53 (0.62) | 146 (0.63) | 16 (0.47) |
| - Non-syn | 80 (0.55) | 41 (0.48) | 121 (0.52) | 11 (0.32) |
| - Syn | 13 (0.09) | 12 (0.14) | 25 (0.11) | 5 (0.15) |
| Non-syn/syn ratio | 6.15 | 3.42 | 4.84 | 2.20 |
| - nonsense | 2 (0.01) | 4 (0.05) | 6 (0.03) | 0 (0.00) |
| - canonical splice site | 3 (0.02) | 2 (0.02) | 5 (0.02) | 0 (0.00) |
| - splice consensus | 7 (0.05) | 4 (0.05) | 11 (0.05) | 1 (0.03) |
| All de novo indels | 9 (0.06) | 4 (0.05) | 13 (0.06) | 1 (0.03) |
| - no frameshift | 3 (0.02) | 1 (0.01) | 4 (0.02) | 0 (0.00) |
| - frameshift | 6 (0.04) | 3 (0.04) | 9 (0.04) | 1 (0.03) |
| All LOFs | 11 (0.08) | 9 (0.11) | 20 (0.09) | 1 (0.03) |
| All likely functional | 99 (0.68) | 51 (0.6) | 150 (0.65) | 13 (0.38) |
| Functional/syn ratio | 7.62 | 4.25 | 6 | 2.6 |
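
The derived rows of Table 1 follow directly from the raw counts. The short sketch below recomputes the per-subject rate and the two ratios from the counts in the table; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Counts copied from Table 1; "functional" is the "All likely functional" row.
table1 = {
    "Afrikaner cases":    {"n": 146, "nonsyn": 80,  "syn": 13, "functional": 99},
    "US cases":           {"n": 85,  "nonsyn": 41,  "syn": 12, "functional": 51},
    "Total cases":        {"n": 231, "nonsyn": 121, "syn": 25, "functional": 150},
    "Afrikaner controls": {"n": 34,  "nonsyn": 11,  "syn": 5,  "functional": 13},
}


def summarize(row):
    """Return the derived Table 1 quantities for one cohort, rounded to 2 places."""
    return {
        "nonsyn_per_subject": round(row["nonsyn"] / row["n"], 2),
        "nonsyn_syn_ratio": round(row["nonsyn"] / row["syn"], 2),
        "functional_syn_ratio": round(row["functional"] / row["syn"], 2),
    }


summary = {cohort: summarize(row) for cohort, row in table1.items()}

for cohort, values in summary.items():
    print(cohort, values)
```
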
In the 146 Afrikaner probands, we observed 93 confirmed de novo exonic point mutations (92 SNVs and 1 dinucleotide substitution) and 9 confirmed de novo indel events (Table 1). Six of the indels result in protein truncations and 3 in single amino acid deletions. Additional query of de novo SNVs located within the flanking intronic regions identified 3 that altered a canonical splice site (Table 1) and 7 that altered the consensus sequence flanking canonical splice sites (Methods). Overall, 73 of 146 patients (50%) carry at least one likely functional de novo event (non-syn, indel, splice site mutations). The point mutation rate in the captured coding sequence is 1.73 × 10^-8 per base per generation, not significantly different from the one observed in our control sample. Moreover, we found no differences in the distribution or frequency of multiple de novo point mutations within individuals in cases versus controls (Supplementary Fig. 2). The 93 identified de novo point mutations included 80 non-syn changes and 13 syn changes. The non-syn/syn ratio of 6.15 is higher than neutral expectation (2.23)[9,10] (P = 1.92 × 10^-4, two-sided exact binomial test). To further assess the statistical significance of the observed enrichment of non-syn variants in cases, we performed permutation testing by randomly permuting the case/control labels of the trios in our dataset (Supplementary Note). Based on 100,000 such permuted datasets we obtain a permutation one-sided P = 0.033. By contrast, analysis of the enrichment of non-syn point mutations among private inherited variants (i.e. variants present only in one family and serving as proxy for evolutionarily young mutation events) did not reveal any significant differences in cases versus controls (Supplementary Table 3). Indeed, de novo variants are ~3.7 times more likely than rare inherited variants to harbor non-syn changes in cases (P < 0.0001, chi-square test) but not in controls (P < 0.77).
In addition to non-syn events, we observed a 2.6-fold enrichment when comparing the rate of de novo events that almost certainly lead to loss-of-function (LOF; nonsense, frameshift indels and canonical splice site mutations) in cases (11/146; 7.5%) versus controls (1/34; 2.9%). Overall, using permutation testing we found a significant enrichment of likely functional de novo variants over syn ones in Afrikaner cases versus controls (permutation one-sided P = 0.017) (Table 1 and Supplementary Note). Analysis of the US cohort revealed a point mutation rate of 1.73 × 10^-8 per base per generation, similar to the Afrikaner cases (Table 1). The non-syn/syn ratio of 3.42 (Table 1) is higher than neutral expectation (2.23)[9,10] but the difference is not statistically significant. In addition to the smaller size, this is likely due to higher uncertainty in family history status in the US cohort, which may not be entirely depleted of familial cases, consistent with a lower de novo CNV rate[4] as compared to the Afrikaner cohort[3]. By contrast, when comparing the rate of LOF de novo events per patient (8/85; 9.4%) we observed a 3.2-fold enrichment, similar to the one observed in the Afrikaner sample. In the combined sample of 231 affected families the non-syn/syn ratio (4.84, Table 1) remained higher compared to neutral expectation[9] (P = 2.1 × 10^-4, two-sided exact binomial test). There was a differential enrichment of non-syn events between de novo and rare private inherited events (P < 0.0001). Notably, as in both individual cohorts, the rate of LOF events per affected trio (20/231; 8.7%) was 2.8 times higher than in control trios (1/34; 2.9%), strongly supporting a role for gene-disrupting mutations (Tables 1, 2). Permutation testing revealed a significant enrichment of likely functional de novo variants over syn ones (permutation one-sided P = 0.026, Supplementary Note).
We estimate that 46% of all likely functional de novo mutations identified represent genuine risk variants (Supplementary Note).
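
The label-permutation idea used above can be sketched as follows. This is a minimal illustration, not the procedure in the Supplementary Note: the test statistic (difference in the non-syn fraction between cases and controls), the per-trio tuple layout and the toy inputs are all assumptions made for the example, and per-trio counts are not given in the text.

```python
import random


def nonsyn_fraction(trios):
    """Fraction of point mutations that are non-syn across (nonsyn, syn) trios."""
    nonsyn = sum(n for n, _ in trios)
    syn = sum(s for _, s in trios)
    total = nonsyn + syn
    return nonsyn / total if total else 0.0


def permutation_p(cases, controls, n_perm=10_000, seed=0):
    """One-sided permutation P for a higher non-syn fraction in cases.

    Case/control labels are shuffled across trios; the statistic used here
    is an assumed stand-in for the one in the Supplementary Note.
    """
    rng = random.Random(seed)
    observed = nonsyn_fraction(cases) - nonsyn_fraction(controls)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = nonsyn_fraction(pooled[:n_cases]) - nonsyn_fraction(pooled[n_cases:])
        if stat >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: each tuple is (non-syn count, syn count) for one trio.
toy_cases = [(1, 0)] * 20
toy_controls = [(0, 1)] * 20
p_toy = permutation_p(toy_cases, toy_controls, n_perm=2000)
print(p_toy)
```

With real data, each trio would contribute its validated de novo counts, and 100,000 permutations would match the number reported in the text.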
Table 2

LOF mutations in schizophrenia probands

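| ID | Schizophrenia Cohort | Mutation Type | Gene Symbol | LOF mutations in SA/SSC controls |
|---|---|---|---|---|
| trio_090 | US | frameshift | XPR1 | no |
| trio_107 | US | frameshift | CCDC39 | yes (a) |
| trio_121 | US | frameshift | KDM5C | yes (a) |
| trio_005 | SA | frameshift | KIAA0467 | no |
| trio_020 | SA | frameshift | HIST1H1E | no |
| trio_026 | SA | frameshift | RB1CC1 | no |
| trio_042 | SA | frameshift | ESAM | no |
| trio_092 | SA | frameshift | LAMA2 | no |
| trio_027 | SA | frameshift | DDHD2 | no |
| trio_101 | US | nonsense | SSBP3 | no |
| trio_118 | US | nonsense | NUP54 | no |
| trio_124 | US | nonsense | DPYD | no |
| trio_128 | US | nonsense | STAP2 | no |
| trio_053 | SA | nonsense | URB2 | no |
| trio_085 | SA | nonsense | RARG | no |
| trio_018 | SA | splice site (b) | SYNGAP1 | no |
| trio_072 | SA | splice site (b) | BRPF1 | no |
| trio_111 | US | splice site (b) | PRDX6 | no |
| trio_103 | US | splice site (b) | NLRC5 | no |
| trio_016 | SA | splice site (b) | CUGBP2 | no |

(a) Ref [13]

(b) canonical splice site

SA = Afrikaner
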

Considering phenotypic correlates, we observed a correlation between paternal age at proband’s birth and number of de novo events per offspring (Supplementary Fig. 3) but did not find any other significant differences (Supplementary Note and Supplementary Tables 4, 5). Genes affected by de novo variants in our study were not significantly over-represented in two previously established comprehensive lists of synaptic genes (Supplementary Note). In addition, pathway analyses using the DAVID Annotation Tool did not find any significantly enriched functional clusters. Evaluation of protein-protein interactions (PPI) revealed a significantly greater connectivity among mutational targets than would be expected by chance (P < 0.05). Multigene PPI clusters included one centered on MTOR and one centered on CANX that includes extracellular matrix and cell adhesion proteins (Supplementary Fig. 4) suggesting that diverse schizophrenia risk genes may converge on a shorter list of functional modules. We then examined to what extent the enrichment in functional de novo events is determined by the developmental pattern of brain expression of the mutated genes [14,15] (Supplementary Note and Supplementary Table 6). In both hippocampus (HPC) (Figure 1a and Supplementary Fig. 5) and dorsolateral prefrontal cortex (DLPFC) (Supplementary Fig. 6), two brain areas implicated in schizophrenia[16], the highest effect size was observed for genes showing highest expression during the prenatal period (Figure 1a and Supplementary Fig. 6). Importantly, there is a functional correlation between prenatal expression bias of target genes and neurodevelopmental impact of the corresponding mutations. 
Specifically, among patients carrying de novo mutations, those with mutations in prenatally-biased genes are more likely to have had multiple (≥ 3) behavioral abnormalities in childhood (before the age of 10)[17] as well as worse functional outcome following disease onset (Figure 1b and Supplementary Note). In addition, comparison of all genes with functional de novo events identified in our schizophrenia families (n = 145) with those identified in autism spectrum disorder (ASD) families (n = 675)[10-13] revealed 15 shared genes (Figure 1c), an overlap within random expectation (P = 0.29, Supplementary Note). However, 11 of the 15 shared genes (73%) are included in the list of our prenatally-biased targets (Supplementary Table 7). The probability that this overlap arises by chance is very low (P = 0.004).
Figure 1

Enrichment of non-syn or functional de novo variants according to temporal expression profiles of target genes

a. We grouped our target genes from the combined schizophrenia cohorts into three classes [prenatal brain expression biased (“prenatally-biased”), postnatal brain expression biased (“postnatally-biased”) and “non-biased”] according to their temporal trajectory in reference to a previously described global expression switch occurring before birth and determined the relative enrichment of non-syn or, more generally, functional de novo variants over syn ones. The relative enrichment in differentially regulated target genes is shown for de novo mutations in the Afrikaner and US probands (top) and the combined SSC control group (middle), as well as for private transmitted variants in Afrikaner and US probands (bottom). The non-syn/syn ratio for prenatally-biased genes was significantly higher than neutral expectation (2.23) (7, P = 0.004). In comparison, the corresponding values for postnatally-biased genes were 4.4 (P = 0.06), whereas for non-biased genes were 4.13 (P = 0.12). Similar analysis of de novo variants in the SSC control group or of private transmitted variants in Afrikaner and US probands showed no differential in the distribution of effect sizes in any of the temporal trajectories tested. ‘Non-syn’ refers to non-synonymous mutations (missense and nonsense SNVs); “functional” refers to non-syn, indel and splice site mutations. Indel data was not available for transmitted variants.

b. Left: An elevated frequency of patients who had multiple (≥ 3) childhood behavioral abnormalities (Supplementary Note) was found among carriers of de novo functional mutations in prenatally-biased genes (10 out of 29 compared to 4 out of 51 patients carrying mutations in genes with no fetal brain bias, 4.4-fold enrichment, P = 0.0047, Fisher’s exact test). Right: An elevated frequency of patients with severe disease functional outcome (Supplementary Note) was found among carriers of de novo functional mutations in prenatally-biased genes (35 out of 38 compared to 45 out of 66 patients carrying mutations in genes with no fetal brain bias, 1.35-fold enrichment, P = 0.007, Fisher’s exact test).

c. Venn diagrams depicting the overlap between either all target genes or only prenatally-biased genes harboring functional de novo mutations identified in the Afrikaner and US probands (n = 145) and those identified in ASD exome scans[10-13] (n = 675).
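
The fold enrichments and Fisher's exact tests quoted for panel b can be recomputed from the four counts in each comparison. The sketch below uses a small pure-Python two-sided Fisher's exact test (summing all tables no more probable than the observed one); the function name and the choice of this two-sided convention are assumptions, and the original analysis was run in R.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Panel b, left: multiple childhood behavioral abnormalities.
# 10 of 29 carriers (prenatally-biased genes) vs 4 of 51 (no fetal brain bias).
fold_childhood = (10 / 29) / (4 / 51)
p_childhood = fisher_exact_two_sided(10, 19, 4, 47)

# Panel b, right: severe functional outcome.
# 35 of 38 carriers (prenatally-biased genes) vs 45 of 66 (no fetal brain bias).
fold_outcome = (35 / 38) / (45 / 66)
p_outcome = fisher_exact_two_sided(35, 3, 45, 21)

print(round(fold_childhood, 1), p_childhood)
print(round(fold_outcome, 2), p_outcome)
```
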

The majority of prenatally-biased targets are highly expressed during the first and second trimester of pregnancy (Supplementary Fig. 7) and show an overrepresentation of nuclear genes involved in chromatin remodeling, nuclear transport and transcriptional control, as well as in protein translation and degradation (Supplementary Tables 6, 8). They also include genes involved in cell-cell and cell-matrix interactions, a subset of which interacts with two key adhesive proteins, THBS1[18] and ITGA6[19], involved in synaptogenesis, axonal growth and cortical layering (Supplementary Fig. 4). We confirmed that genes with prenatal expression bias are highly enriched in microRNA targets[15] (Supplementary Note) and also found a nominally significant enrichment of hsa-mir-367 and hsa-mir-1244 targets (Supplementary Tables 9, 10). By comparison, the risk conferred by postnatally-biased targets is related to their involvement in intracellular signaling processes (GTPase-, DAG-, or calcium-modulated signaling, Supplementary Table 8), which regulate diverse aspects of neuronal connectivity. We next set out to evaluate which of the individual genes are more likely to confer disease risk. We found 4 genes altered by two de novo events each in unrelated probands (3 of these genes were affected in patients across the two different populations tested) (Table 3 and Supplementary Note). These genes were struck twice by a combination of a nonsense and a missense de novo SNV (DPYD), or a combination of a splice site mutation (Supplementary Tables 11, 12) with either a de novo missense SNV (TRRAP and VPS39) or indel (LAMA2). None of these genes were affected in the Afrikaner or the SSC control group and no such mutational combinations were reported for any gene among 488 de novo events in the SSC control group.
Given the number of de novo mutations in our dataset, observation of four such recurrent events has a P-value of 0.002 (Supplementary Note) with one of the 4 combinations (in LAMA2) being individually statistically significant (P = 0.017).
Table 3

Genes hit by recurrent de novo events

Genes hit by recurrent de novo SNVs/indels

| Sample ID | Gene Symbol | Chr Locus | Mutation Type | DNA change | RNA/Amino acid change | Cohort | Gene Name |
|---|---|---|---|---|---|---|---|
| trio_124 | DPYD | 1p21.3 | Nonsense | c.1863G>A | p.Trp621* | US | dihydropyrimidine dehydrogenase |
| trio_016 | DPYD | | Missense | c.1615G>A | p.Gly539Arg | SA | |
| trio_092 | LAMA2 | 6q22.13 | Frameshift | c.9139_9146del7 | p.Ser3050ThrfsX27 | SA | laminin, alpha 2 |
| trio_049 | LAMA2 | | splice site (a) | c.4718-3 | r.(spl?) | SA | |
| trio_033 | TRRAP | 7q22.1 | Missense | c.883A>T | p.Ile295Phe | SA | transformation/transcription domain-associated protein |
| trio_099 | TRRAP | | splice site (a) | c.7223.+6 | r.(spl?) | US | |
| trio_120 | VPS39 | 15q15.1 | Missense | c.2368C>T | p.Arg790Cys | SA | vacuolar protein sorting 39 homolog (S. cerevisiae) |
| trio_125 | VPS39 | | splice site (a) | c.441+8 | r.(spl?) | US | |

Genes hit by both de novo SNVs and de novo CNVs

| Sample ID | Gene Symbol | Chr Locus | Mutation Type | DNA change | RNA/Amino acid change | Cohort | Gene Name |
|---|---|---|---|---|---|---|---|
| trio_091 | DGCR2 | 22q11.2 | Missense | c.1163C>G | p.Pro388Arg | SA | DiGeorge syndrome critical region gene 2 |
| | DGCR2 | | CNV (del) (b) | | | SA | |
| trio_064 | TOP3B | 22q11.2 | Missense | c.1415G>A | p.Arg472Gln | SA | topoisomerase (DNA) III beta |
| | TOP3B | | CNV (del) | | | SA | |
| trio_121 | CIT | 12q24.23 | Missense | c.238T>C | p.Tyr80His | US | citron (rho-interacting, serine/threonine kinase 21) |
| | CIT | | CNV (dup (c)) (b) | | | SA | |
| trio_111 | STAG1 | 3q22.3 | Missense | c.667A>T | p.Thr223Ser | US | stromal antigen 1 |
| | STAG1 | | CNV (del) (b) | | | SA | |
| trio_078 | SMAP2 | 1p34.2 | Missense | c.896G>A | p.Ser299Asn | SA | small ArfGAP2 |
| | SMAP2 | | CNV (dup) (b) | | | SA | |

(a) Consensus splice site mutation. Mutations in LAMA2 and TRRAP are predicted to be damaging (Supplementary Tables 11, 12).

(b) Ref 3

(c) Intragenic duplication

SA = Afrikaner

LAMA2 encodes the laminin alpha 2 chain, which constitutes one of the subunits of laminin 2 and 4 and binds to ITGA6, a prenatally-biased target (Supplementary Fig. 4). The indel in LAMA2 disrupts a critical C-terminal domain, while the splice mutation affects a highly conserved nucleotide at -1 position adjacent to the canonical splice acceptor AG motif and is expected to disrupt splicing (Supplementary Table 12). Homozygous mutations in LAMA2 lead with variable penetrance to congenital muscular dystrophy characterized by CNS involvement, including white matter abnormalities, cognitive impairment, seizures and neuronal migration defects[20]. In addition, a de novo mutation in the isoform LAMA1 was described in another schizophrenia cohort[6]. DPYD (Dihydropyrimidine dehydrogenase) is the initial and rate-limiting factor in the pathway of pyrimidine catabolism[21] and also modulates production of beta-alanine, a neuromodulator of inhibitory transmission in the brain[22]. We identified one missense and one nonsense SNV in the Afrikaner and US cohort, respectively. Abnormal urinary excretion of thymine and uracil confirmed DPYD deficiency in the missense mutation carrier (Supplementary Fig. 8). Heterozygous deletions either encompassing or within DPYD, as well as altered expression[23] have been described in ASD[24] and intellectual disability (ID)[25]. Neither autistic features nor ID were present in our DPYD mutation carriers suggesting variable expressivity. A GWAS mega-analysis in schizophrenia identified the strongest association (P = 1.6 × 10−11) at rs1625579, a variant located at 1p21.3, in the intron of MIR137[26] and within a haplotypic block (D’ > 0.9) that extends to the 5’ of DPYD (Supplementary Fig. 9). We did not find mutations in MIR137 and therefore association with rs1625579 may reflect contribution of DPYD variants. 
The splice mutation in TRRAP affects position +6 relative to the splice junction and is predicted to disrupt splicing and binding of the SRP55 splicing factor within a splicing enhancer. Notably, TRRAP and VPS39 are also mutated in ASD cases[12]. We also compared the identified functional de novo mutations to the de novo CNVs identified previously in our two cohorts (22 CNVs affecting 156 genes)[3,4]. Five genes (DGCR2, TOP3B, CIT, STAG1 and SMAP2) were altered by both de novo SNVs and CNVs (Table 3), two of them in patients across the two different populations tested. Two of these genes are within the 22q11.2 schizophrenia susceptibility locus. Our findings implicate a contribution from a diverse set of de novo mutations of relatively high but incomplete penetrance to the genomic architecture of schizophrenia in the context of a mutation-selection balance model and highlight the importance of using family samples where disease history has been thoroughly ascertained to illuminate their role. In that respect, focusing on our comprehensively ascertained Afrikaner cohort we estimate that at least 17.6% of sporadic cases carry a de novo pathogenic exonic mutation (Supplementary Note) and at least 9.9% carry a de novo CNV[3]. Thus, such mutations account for ~ 1/4 to 1/3 of all sporadic cases. Given that results from scans of non-exonic regions are still forthcoming, this is likely an underestimate. Equally important is the contribution of our findings toward understanding the neural architecture of schizophrenia risk.
Given that we estimate the number of schizophrenia-risk loci to be more than 850 (Supplementary Note), our findings unveil an exquisite sensitivity of the neural circuits underlying susceptibility to schizophrenia to precise levels or activity of many diverse proteins and signaling modules and suggest that focusing on circuits may be more commensurate with the heterogeneity of schizophrenia than other proposed mechanisms that concentrate on specific neurotransmitters or cell-types[27]. In addition, we show that in determining disease risk not only the function of the target gene but also the timing of the genetic insult is of critical importance. Specifically, although de novo mutations affect genes with diverse functions and developmental profiles, we describe a substantial contribution of mutations in developmentally regulated genes with higher expression during early- and mid-fetal life and show that such mutations are enriched among adult patients with prominent early, pre-psychotic, deviant behaviors. Our findings provide a mechanistic context to interpret epidemiological correlations among various prenatal environmental insults during the first and second trimester of pregnancy and risk for schizophrenia[28]. Moreover, the fact that expression of many prenatally-biased genes is under strict microRNA control may explain emerging links between microRNA dysregulation and psychiatric disorders[29]. The challenge remains to identify the affected biological processes and neural circuits and determine how they are affected. Unbiased network-based approaches as well as animal and cellular models of recurrent mutations will be invaluable toward this goal[30].

METHODS

Cohorts

The samples analyzed here comprise trios collected from two distinct populations, the Afrikaner population from South Africa (European, mostly Dutch descent) (146 schizophrenia trios) and the U.S. population (Northern European descent) (85 schizophrenia trios). Of the 146 Afrikaner probands, 122 (83.6%) had a diagnosis of schizophrenia and 24 (16.4%) were diagnosed with SCZAFF disorder. Of the 85 U.S. probands, 46 (54.1%) had a diagnosis of schizophrenia, and 39 (45.9%) were diagnosed with SCZAFF disorder. The control cohort consisted of 34 trios with established Afrikaner heritage. Control families included unaffected subjects screened against presence and history of treatment for any psychiatric condition, as well as history of mental illness in 1st- or 2nd-degree relatives. Both affected and control trios were recruited and characterized in the context of our ongoing, large-scale genetic studies of schizophrenia and have been described previously[3,7,8]. Because de novo mutations are more likely to account for sporadic forms of the disease, we took great care to determine reliably and in-depth the family history status and generate cohorts enriched in sporadic cases (Supplementary Note). However, negative or positive family history was not a screening criterion. In the Afrikaner cohort it was possible to determine absence of disease in 1st- or 2nd-degree relatives due to the cohesive family structure, the large catchment area and long-term care provided by the local recruiting hospital that affords detailed hospital records over several generations[3,8]. In the geographically fragmented and ethnically diverse U.S. cohort we were able to determine absence of disease in 1st-degree relatives only (Supplementary Note). For additional cohort characteristics, see Supplementary Note. Informed consent was obtained from all participants and the Institutional Review Committees of Columbia University and University of Pretoria approved all procedures.
Paternity and maternity were confirmed prior to sequencing via the Affymetrix Genome-Wide Human SNP Array 5.0 as well as via a panel of microsatellite markers. DNA for all study subjects was extracted from whole blood and analysis was performed blind to affected status while maintaining knowledge of the parent-child relations.

Exome library construction

Exome capture and sequencing were performed using the following methods: Genomic DNA (~3 μg) was sheared to 200–300 bp using a Covaris Acoustic Adaptor. Fragments were end-repaired, dA-tailed, and sequencing adaptor oligonucleotides ligated using reagents from New England BioLabs. Libraries were barcoded using the Illumina index read strategy, which uses six-base sequences within the adapter that are sequenced separately from the genomic DNA insert. Ligated products were size-selected during purification steps. The DNA library was subsequently enriched for sequences with 5’ and 3’ adapters by PCR amplification using primers complementary to the adapter sequences (ligation-mediated PCR, LM-PCR). Exonic DNA was captured using two hybridization systems: Agilent SureSelect v2 (n = 85 trios) and NimbleGen SeqCap EZ v2 (n = 180 trios). Following capture, another round of LM-PCR was performed to generate the final library. Each library was quantitated by fluorescent methods (PicoGreen) and fragment sizes measured with the Agilent Bioanalyzer. Finally, the molar concentration of each library was measured using the size information from the Agilent Bioanalyzer and DNA quantitation information from a real-time PCR assay (Kapa Biosystems per manufacturer’s protocol). Each library was normalized to 10 nM and sequenced using an Illumina HiSeq 2000.

Exome data analysis for de novo SNVs and indels

The exome data analysis pipeline has been described previously[7]. Briefly, raw sequencing data were mapped to the human reference genome (build hg19) using the Burrows-Wheeler Aligner (BWA v0.5.81536). The Genome Analysis Toolkit (GATK, version 5091) was used to remove duplicates, perform local realignment and map quality score recalibration to produce a “cleaned” BAM file and then make genotype calls for all trios jointly. The resulting Variant Call Format (VCF, version 4.0) files were annotated using the GenomicAnnotator module in GATK to identify and label the called variants that are within the targeted coding regions and overlap with known and likely benign SNPs reported in dbSNP v132 (see URLs). The filtered genotype calls were further validated using the mpileup module in the SAMtools (see URLs) as described previously[7]. Indel calls were made by the Dindel software using one “cleaned” BAM file per run. The resulting VCF files were further revalidated using the same SAMtools procedure described above for point mutations. To determine potential mutations at splice-donor or acceptor sites, GATK variant calls were made in a batch fashion (90 samples per batch) that covered each target coding region and 50 bp flanking segments in each direction. The variants in the resulting VCF files were annotated according to refGene-big-table-hg19.txt (see URLs). A variant was annotated as a “canonical splice site mutation” if it disrupted the largely invariable core canonical 2-base-pair acceptor (AG) or donor (GU) sites. De novo variants within 10 bp surrounding the exon-intron boundary, included in the consensus sequence flanking core canonical splice sites and therefore likely to modulate splicing efficiency, were annotated as “consensus splice site mutations”. 
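
The two splice-site categories defined above can be expressed as a simple distance rule. The sketch below is illustrative only: it parses HGVS-style intronic coordinates such as c.441+8, treats offsets of 1 or 2 as the canonical dinucleotide and offsets up to 10 as the consensus region, and ignores exonic positions near the boundary. The function name, the regular expression and the restriction to intronic offsets are assumptions, not the annotation code used in the study.

```python
import re

CANONICAL_MAX = 2    # the 2-bp core acceptor (AG) or donor (GU) site
CONSENSUS_MAX = 10   # the 10-bp window quoted in the text

# Matches coding-DNA intronic offsets such as "c.441+8" or "c.4718-3".
_INTRONIC = re.compile(r"^c\.\d+([+-])(\d+)")


def classify_splice_variant(hgvs_c):
    """Return (category, side) for an HGVS-like coding-DNA position string."""
    match = _INTRONIC.match(hgvs_c)
    if match is None:
        return ("not intronic", None)
    side = "donor" if match.group(1) == "+" else "acceptor"
    distance = int(match.group(2))
    if distance <= CANONICAL_MAX:
        return ("canonical splice site", side)
    if distance <= CONSENSUS_MAX:
        return ("consensus splice site", side)
    return ("other intronic", side)


# The first two positions appear in Table 3; the last two are hypothetical.
for variant in ["c.441+8", "c.4718-3", "c.100+1", "c.100+25"]:
    print(variant, classify_splice_variant(variant))
```
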
Candidate de novo variants were tested using standard Sanger sequencing on an ABI 3730xl DNA Analyzer to validate presence of each mutation in the subject and absence in the parental genomes, by designing custom primers (Primer3) based on ~500 bp of sequence flanking each variant. The total number of de novo SNVs found and validated in a given cohort was divided by the total bases analyzed to calculate a per-base rate of point mutations in the captured coding sequence.
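
The per-base rate calculation described above can be sketched as a single division. The text does not state the total number of bases analyzed, so the callable target size used below (18.4 Mb per trio) is a hypothetical placeholder chosen for illustration, and counting two haploid copies per proband is likewise an assumption.

```python
def per_base_mutation_rate(n_de_novo_snvs, n_trios, callable_bases_per_trio,
                           haploid_copies=2):
    """De novo point mutations per base per generation.

    haploid_copies=2 assumes each proband contributes two transmitted
    haploid genomes; this is an assumption, not stated in the text.
    """
    bases_analyzed = n_trios * callable_bases_per_trio * haploid_copies
    return n_de_novo_snvs / bases_analyzed


# Hypothetical callable coding target per trio (not a figure from the paper).
HYPOTHETICAL_TARGET_BP = 18.4e6

control_rate = per_base_mutation_rate(16, 34, HYPOTHETICAL_TARGET_BP)
afrikaner_rate = per_base_mutation_rate(93, 146, HYPOTHETICAL_TARGET_BP)

print(f"controls:        {control_rate:.2e}")
print(f"Afrikaner cases: {afrikaner_rate:.2e}")
```

Under these assumptions the sketch gives values close to the reported 1.28 × 10^-8 and 1.73 × 10^-8, which only shows that the reported rates are mutually consistent with a common target size.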

Variant detection pipeline and QC

Because the entire capture and sequencing procedure was conducted blind to affected status for all three cohorts, we expected no bias among cohorts. To further demonstrate that variant detection and QC are consistent across all samples and all experimental conditions, we compared the average percentage of reads at 1X, 8X, 20X and 30X coverage for these conditions. The comparison is shown in Supplementary Fig. 1. There were no differences for any of these parameters.

Statistics

The two-sided exact binomial test was conducted using R. Fisher’s exact test or chi-square test with Yates’ correction was used for the analysis of contingency tables, depending on the sample sizes, using R.
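
As an illustration of the exact binomial comparison against neutral expectation, the sketch below tests the Afrikaner counts from Table 1 (80 non-syn among 93 point mutations) against a null proportion derived from the neutral non-syn/syn ratio of 2.23. It is a pure-Python stand-in for the R analysis; the conversion of the ratio to a proportion and the two-sided convention (summing outcomes no more probable than the observed one) are assumptions about how the test was set up.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_test_two_sided(k, n, p):
    """Exact two-sided binomial P: sum of outcomes no more probable than k."""
    observed = binom_pmf(k, n, p)
    # Small relative tolerance guards against floating-point ties.
    threshold = observed * (1 + 1e-7)
    total = sum(
        prob for prob in (binom_pmf(i, n, p) for i in range(n + 1))
        if prob <= threshold
    )
    return min(1.0, total)


# Neutral non-syn/syn ratio of 2.23 expressed as a non-syn proportion.
neutral_ratio = 2.23
p0 = neutral_ratio / (1 + neutral_ratio)

# Afrikaner cases: 80 non-syn out of 93 de novo point mutations (Table 1).
p_afrikaner = binom_test_two_sided(80, 93, p0)
print(f"null proportion = {p0:.4f}, P = {p_afrikaner:.2e}")
```
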

Annotation of the functional impact of the de novo mutations

The functional impact of the de novo mutations was annotated from several different resources. The PolyPhen-2[31] (see URLs) online batch query server was used with the full annotation settings to determine the non-syn or syn nature of the mutations and predict their functional impact by further classifying them as non-tolerated (damaging) or benign at a given site. The Grantham score for each coding variant was determined by the Grantham matrix table[32]. The phyloP score for each coding variant was extracted from the “phyloP46wayAll” table in the UCSC Table Browser (see URLs). Regarding splice site variants, we consider mutations directly disrupting canonical splice sites as severe disruptive events without further analysis. For mutations in consensus splice sites, we used the mutation analysis module of the Human Splice Finder program (HSF, Version 2.4.1, see URLs)[33] to predict their functional impact. Briefly, a 100-nucleotide sequence surrounding the exon-intron boundary was extracted from the UCSC browser and the wild-type and mutated sequences were imported into the HSF mutation analysis module to detect potential disruption of splicing signals. Supplementary Tables 11, 12 show the HSF-derived results for the identified consensus splice site mutations.

Gene set enrichment analysis and protein-protein interaction network analyses

The DAVID Functional Annotation Chart[34] (see URLs) was used to assess whether a given gene set with de novo mutations was enriched in particular GO terms or functional keywords defined in Swiss-Prot (SP) and Protein Information Resource (PIR). Target genes were mapped in the database and functional annotation chart analysis was conducted with the default settings. We used the Disease Association Protein–protein Link Evaluator (DAPPLE)[35] to determine if there was excess protein–protein interaction among the genes hit by likely functional de novo variants. A list of all target genes with likely functional altered mutations was submitted to the DAPPLE server (see URLs) with default settings.

Temporal expression profile analysis of the genes carrying de novo mutations

To investigate developmental expression of target genes we took advantage of the Human Brain Transcriptome (HBT) database (see URLs), a compendium of exon-level expression profiles across developmental stages from embryonic to late adulthood[14]. Genes harboring de novo events were grouped into three classes (prenatal brain-biased, postnatal brain-biased and non-biased) according to their temporal trajectory in reference to a global expression turning point occurring between mid–late and late fetal stage[14]. For each class, the ratio of non-syn or likely functional variants to neutral ones was calculated.