Masaki Nishioka, An-A Kazuno, Takumi Nakamura, Naomi Sakai, Takashi Hayama, Kumiko Fujii, Koji Matsuo, Atsuko Komori, Mizuho Ishiwata, Yoshinori Watanabe, Takashi Oka, Nana Matoba, Muneko Kataoka, Ahmed N Alkanaq, Kohei Hamanaka, Takashi Tsuboi, Toru Sengoku, Kazuhiro Ogata, Nakao Iwata, Masashi Ikeda, Naomichi Matsumoto, Tadafumi Kato, Atsushi Takata.
Abstract
Bipolar disorder is a severe mental illness characterized by recurrent manic and depressive episodes. To better understand its genetic architecture, we analyze ultra-rare de novo mutations in 354 trios with bipolar disorder. For germline de novo mutations, we find significant enrichment of loss-of-function mutations in constrained genes (corrected-P = 0.0410) and deleterious mutations in presynaptic active zone genes (FDR = 0.0415). An analysis integrating single-cell RNA-sequencing data identifies a subset of excitatory neurons preferentially expressing the genes hit by deleterious mutations, which are also characterized by high expression of developmental disorder genes. In the analysis of postzygotic mutations, we observe significant enrichment of deleterious ones in developmental disorder genes (P = 0.00135), including the SRCAP gene mutated in two unrelated probands. These data collectively indicate the contributions of both germline and postzygotic mutations to the risk of bipolar disorder, supporting the hypothesis that postzygotic mutations of developmental disorder genes may contribute to bipolar disorder.
Year: 2021 PMID: 34145229 PMCID: PMC8213845 DOI: 10.1038/s41467-021-23453-w
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Patterns of gDNM enrichment in BD.
a–d Plots of per-individual rates of the following four types of gDNMs in the affected and unaffected groups: loss of function (LoF), damaging missense/inframe indel, non-damaging missense, and synonymous. Damaging missense gDNMs are defined as those with a Combined Annotation Dependent Depletion (CADD) score >15. a Rates of all gDNMs in BD (orange, N = 257) and controls (green, N = 1640). b Rates of gDNMs not observed in the general population (gnomAD and ToMMo) in BD and controls. c Rates of gDNMs not observed in the general population and hitting a constrained gene (pLI > 0.9) in BD and controls. d Rates of gDNMs not observed in the general population and hitting a constrained gene (pLI > 0.9) in bipolar I or schizoaffective disorder (BDI + SCZAD, magenta, N = 203) and controls. The mean of gDNM counts in the affected and unaffected groups for each mutational type is indicated as the colored points accompanied by the error bars (95% confidence intervals). Uncorrected P values calculated by one-tailed permutation tests are shown on the right of the plots.
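The one-tailed permutation test mentioned in the Fig. 1 caption can be illustrated with a minimal sketch. It assumes the test statistic is the difference in mean per-individual gDNM count between the affected and unaffected groups, which the caption does not state explicitly; the function name, the toy counts, and the add-one correction are illustrative choices, not details from the paper.

```python
import random

def one_tailed_permutation_test(cases, controls, n_perm=10000, seed=0):
    """Estimate P(mean difference >= observed) under random group relabeling."""
    rng = random.Random(seed)
    n_cases = len(cases)
    n_controls = len(controls)
    pooled = list(cases) + list(controls)
    observed = sum(cases) / n_cases - sum(controls) / n_controls
    hits = 0
    for _ in range(n_perm):
        # Shuffle the group labels by shuffling the pooled counts.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_cases]) / n_cases - sum(pooled[n_cases:]) / n_controls
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero (an assumption).
    return (hits + 1) / (n_perm + 1)

# Hypothetical per-individual mutation counts, not data from the study.
toy_cases = [1, 0, 2, 1, 1, 0, 2, 1]
toy_controls = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
p_value = one_tailed_permutation_test(toy_cases, toy_controls)
print(p_value)
```

The test is one-tailed because only an excess in the affected group counts as evidence; a deficit yields a P value near 1.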
Fig. 2 Gene set enrichment analysis of deleterious gDNMs in BD.
a Gene ontology (GO) terms enriched among the genes hit by deleterious (LoF or damaging missense/inframe indel) gDNMs in BD at a false discovery rate (FDR) <0.1. FDR adjustment was performed based on the number of terms in the corresponding GO category (biological process, cellular component, or molecular function). Uncorrected P values for our observations (the red dotted lines) were calculated by DNENRICH, which considers confounding factors such as gene sizes and local sequence contexts (“Methods”). The histograms indicate the distributions of the expected number of gDNMs hitting the corresponding GO term (the x axis), which were generated by one million random permutations by DNENRICH. Top, BD; bottom, controls. b Network visualization of the GO terms enriched among the genes hit by deleterious gDNMs in BD at uncorrected P < 0.05. The node colors and label sizes indicate the statistical significance of enrichment (deep red indicates the most significant terms). The node sizes indicate the number of genes included in a term. The edge width is proportional to the overlap coefficient. The full list of GO terms with uncorrected P < 0.05 is shown in Supplementary Data 2. c Enrichment analysis of the genes hit by deleterious gDNMs in BD for the top 2% of the genes with the highest expression in each of the 54 human tissues in the GTEx dataset. Uncorrected P values calculated by DNENRICH are shown as bar plots. The dotted and solid lines indicate the nominal (P = 0.05) and the Bonferroni-corrected (P = 0.00093 = 0.05/54) significance threshold, respectively. Bars are color coded as shown on the lower right of the plot. d Enrichment analysis of the genes hit by deleterious gDNMs in BD for the top 2% of the genes with the highest expression in each developmental period of a brain region in the BrainSpan Human Developmental Transcriptome dataset. Cells are color coded by uncorrected P values calculated by DNENRICH as shown on the right of the grid cells.
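The overlap coefficient that sets the edge widths in the GO-term networks can be sketched as follows. This assumes the standard definition (intersection size divided by the size of the smaller set); the caption does not spell out the formula, and the term names and gene symbols below are placeholders, not gene sets from the paper.

```python
def overlap_coefficient(genes_a, genes_b):
    """Return |A & B| / min(|A|, |B|), or 0.0 if either set is empty."""
    a, b = set(genes_a), set(genes_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

# Placeholder gene sets for two hypothetical GO terms.
term_x = {"GENE1", "GENE2", "GENE3"}
term_y = {"GENE2", "GENE3", "GENE4", "GENE5"}

coefficient = overlap_coefficient(term_x, term_y)
print(round(coefficient, 3))  # 0.667
```

Because the denominator is the smaller set, a small term fully contained in a larger one scores 1.0, which suits networks where nested GO terms should appear tightly linked.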
Fig. 3 Single-cell (nucleus) enrichment analysis of the genes hit by deleterious gDNMs in BD.
a UMAP representation of 16 cell clusters (c0–15) identified from single-nucleus RNA sequencing data of human adult anterior cingulate cortices. Cell clusters are annotated based on the patterns of marker gene expression (b and Supplementary Fig. 3). b Expression patterns of representative marker genes for excitatory neurons (SLC17A7), general and specific interneurons (GAD1, SST, and VIP), astrocytes (GFAP), and oligodendrocytes (OLIG1). c A plot of the top 5% of the cells preferentially expressing genes hit by deleterious gDNMs in BD onto the UMAP. The blue and gray dots indicate the top 5% cells and the others, respectively. d Proportions of cells preferentially expressing the genes hit by deleterious gDNMs in BD in each cluster. Cluster IDs and cell types (red: excitatory, blue: interneuron, and light green: other cells) are shown on the left of the bar plots. The red vertical line indicates the theoretical expectation (5%). P values calculated by comparing the observed and expected proportions (one-tailed binomial test with Bonferroni correction) are shown on the right of the bar plots. N.S. not significant. e Enrichment analysis of the genes hit by deleterious gDNMs in BD (top, orange) or controls (bottom, green) for the c8 signature genes. Uncorrected P values were calculated by DNENRICH, which considers confounding factors such as gene sizes and local sequence contexts (“Methods”). f Plots of the expression patterns of representative genes upregulated in cluster 8 (c8 signature genes) onto the UMAP. g Network visualization of the GO terms significantly enriched among the c8 signature genes (FDR < 0.05). The node colors and label sizes indicate the statistical significance of enrichment (deep red indicates the most significant terms). The node sizes indicate the number of genes included in a term. The edge width is proportional to the overlap coefficient. Nodes are connected when the overlap coefficient of the genes they contain is >0.5.
h A Venn diagram showing overlaps between the c8 signature genes and known DD genes[37] or the index genes in the largest BD GWAS to date[4]. Uncorrected P values for the observed overlaps under the hypergeometric distribution (one tailed) and the symbols of the overlapping genes are shown. CACNA1C is underlined as the only gene in the intersection of three gene sets: c8 signature, DD, and BD GWAS index genes.
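The one-tailed hypergeometric P value for a gene-set overlap, as described for panel h, can be computed with the standard library alone. This is a sketch under stated assumptions: the background size and set sizes below are hypothetical, and the caption does not say which background gene universe the authors used.

```python
from math import comb

def hypergeom_upper_tail(k, population, successes, draws):
    """Return P(X >= k) when drawing `draws` genes without replacement
    from `population` genes, of which `successes` belong to the reference set."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favorable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favorable / total

# Hypothetical sizes, not taken from the paper.
background_genes = 18000   # assumed gene universe
reference_set = 300        # e.g. a list of known disorder genes
signature_set = 200        # e.g. cluster signature genes
observed_overlap = 12

p_overlap = hypergeom_upper_tail(
    observed_overlap, background_genes, reference_set, signature_set
)
print(p_overlap)
```

Exact integer arithmetic via `math.comb` avoids the rounding problems of factorial-based formulas; only the final division is done in floating point.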
Genes with recurrent deleterious gDNMs in BD and other neuropsychiatric/developmental disorders.
| Gene | pLI | gDNM countᵃ, BD | gDNM countᵃ, SCZ | gDNM countᵃ, ASD | gDNM countᵃ, DD | gDNM countᵃ, Control | Uncorrected P, BD | gDNM count totalᵃ, BD + SCZ + ASD | Uncorrected P, BD + SCZ + ASD | gDNM count totalᵃ, BD + SCZ + ASD + DD | Uncorrected P, BD + SCZ + ASD + DD |
|---|---|---|---|---|---|---|---|---|---|---|---|
|  | 0.9971 | 2 (0, 2) | 1 (0, 1) | 1 (1, 0) | 7 (1, 6) | 0 | 1.13 × 10⁻⁴ | 4 (1, 3) | 8.58 × 10⁻⁴ | 11 (2, 9) |  |
|  | 0.9964 | 2 (0, 2) | 0 | 0 | 3 (3, 0) | 0 | 5.47 × 10⁻⁴ | 2 (0, 2) | 0.231 | 5 (3, 2) | 0.341 |
|  | 1 | 1 | 2 | 4 | 14 | 0 | 0.0148 | 7 |  | 21 |  |
|  | 1 | 1 | 1 | 2 | 0 | 0 | 0.00404 | 4 | 5.62 × 10⁻⁶ | 4 | 0.00135 |
|  | 1 | 1 | 0 | 1 | 1 | 0 | 0.00378 | 2 | 0.00496 | 3 | 0.00997 |
|  | 0.9999 | 1 | 0 | 1 | 0 | 0 | 0.00542 | 2 | 0.00991 | 2 | 0.130 |
|  | 0.9743 | 1 | 0 | 0 | 1 | 0 | 0.00125 | 1 | 0.0334 | 2 | 0.00937 |
|  | 0.9997 | 1 | 0 | 0 | 1 | 0 | 0.00330 | 1 | 0.0861 | 2 | 0.0564 |
|  | 0.9999 | 1 | 0 | 0 | 1 | 0 | 0.00436 | 1 | 0.112 | 2 | 0.0909 |
|  | 1 | 1 | 0 | 0 | 1 | 1 | 0.00458 | 1 | 0.117 | 2 | 0.0988 |
Boldface indicates genes with exome-wide significance defined as P < 2.74 × 10⁻⁶ (= 0.05/18,271).
ASD autism spectrum disorder, BD bipolar disorder, DD developmental disorder, gDNM germline de novo mutation, LoF loss of function, pLI probability of being LoF-intolerant, SCZ schizophrenia.
ᵃ For genes with multiple deleterious gDNMs in BD and other disorders, the total numbers of deleterious gDNMs are shown, with the numbers of LoF and damaging missense/inframe indel gDNMs in parentheses. For constrained genes with LoF gDNMs in BD and other disorders, the numbers of LoF gDNMs are shown.
ᵇ For genes with multiple deleterious gDNMs in BD and other disorders, P values were calculated by comparing the observed and expected numbers of deleterious gDNMs. For constrained genes with LoF gDNMs in BD and other disorders, P values were calculated by comparing the observed and expected numbers of LoF gDNMs. The listed genes are limited to those also hit by deleterious/LoF gDNMs in neuropsychiatric/developmental disorders other than BD (i.e., SCZ, ASD, or DD in this Table, Supplementary Table 1).
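The two Bonferroni thresholds quoted in this record (0.05/18,271 for exome-wide significance and 0.05/54 for the GTEx tissue analysis) can be checked with a few lines of arithmetic. The helper names are illustrative; the only inputs are numbers stated in the text and one P value from the table.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

def passes(p_value, alpha, n_tests):
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)

exome_wide = bonferroni_threshold(0.05, 18271)  # genes tested, as stated in the table note
gtex_tissues = bonferroni_threshold(0.05, 54)   # tissues, as stated in the Fig. 2c caption

print(f"{exome_wide:.2e}")    # 2.74e-06
print(f"{gtex_tissues:.5f}")  # 0.00093

# A P value of 5.62e-6, as listed in the table, is above the exome-wide threshold.
print(passes(5.62e-6, 0.05, 18271))  # False
```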
Fig. 4 Systematic analysis of pzDNMs in BD and deleterious pzDNMs in KMT2C and SRCAP.
a Schematic representation of the domain structure of the KMT2C protein and the positions of DNMs reported in our and previous studies (see Supplementary Table 1). LoF and missense DNMs are shown on the upper and lower sides, respectively. The DNMs are color coded according to the diagnostic group. Kleefstra syndrome phenotypic spectrum is included in DD. SCZ schizophrenia. b A confirmed LoF (p.Lys3601*) pzDNM in KMT2C in BD. Left: Sanger sequencing chromatogram confirming the de novo p.Lys3601* variant in the BD proband. The intensity of the variant allele is lower than that of the reference allele. Middle: the allele distribution of the p.Lys3601* (g.151859861T>A) variant and a nearby heterozygous common SNP (rs74483926, g.151859683G>A) in the cloned PCR fragments from the BD proband. Red letters indicate mosaicism of the p.Lys3601* variant in the paternal allele. Right: variant allele fractions of the p.Lys3601* variant in the saliva, nail, and hair of the BD proband shown as a bar plot. This observation excludes the possibility that the p.Lys3601* pzDNM is due to clonal hematopoiesis of indeterminate potential (CHIP). c Proportions of the deleterious (LoF or damaging missense) pzDNMs (left) and gDNMs (right) not observed in the general population in BD hitting a known DD gene[37]. The dotted lines indicate the theoretical expectation based on an established mutational model[39]. P values calculated by a comparison between the observation and the expectation (one-tailed binomial test) are shown above the bars. d The two pzDNMs (p.Leu696Phe and p.Arg971Cys) in SRCAP observed in two different BD probands. Top: schematic representation of the domain structure of the SRCAP protein[47] and the positions of the observed pzDNMs. Bottom: the evolutionary conservation around the pzDNM sites from the Multiz Alignments of 100 Vertebrates. For the amino acid hit by the p.Arg971Cys variant, no missense variant introducing a substitution to cysteine was observed.
e Structure of the human SRCAP complex. The location of the ATPase lobe 1 containing Leu696 had weaker density and was thus omitted from the deposited coordinates. f Structure of the yeast SWR1 complex bound to a nucleosome. Leu696 in human SRCAP corresponds to Leu774 in yeast SWR1. Arg971 in human SRCAP is conservatively substituted with Lys (Lys1069) in yeast SWR1. g A close-up view of Leu774 and nearby residues in the yeast SWR1 complex. The positions of the two DNA gyres (SHL +2 and SHL −6) are shown. The residues involved in a hydrophobic core with Leu774 (corresponding to Leu696 in the human SRCAP complex) and in recognition of the SHL −6 position of DNA are shown as sticks with nitrogen atoms in blue.
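The one-tailed binomial tests named in the captions for Fig. 3d and Fig. 4c compare an observed proportion with a theoretical expectation. A minimal sketch, using hypothetical counts and the 5% expectation and 16 clusters quoted for Fig. 3d; the expected proportion for Fig. 4c comes from a mutational model whose value is not given here, so it is not reproduced.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def bonferroni_adjust(p_value, n_tests):
    """Multiply by the number of tests and cap the result at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical counts for one cluster, not taken from the paper.
cells_in_cluster = 200
top_cells_in_cluster = 25
expected_fraction = 0.05  # theoretical expectation quoted for Fig. 3d
n_clusters = 16           # clusters c0-15

p_raw = binomial_upper_tail(top_cells_in_cluster, cells_in_cluster, expected_fraction)
p_adjusted = bonferroni_adjust(p_raw, n_clusters)
print(p_raw, p_adjusted)
```

Summing the exact tail is adequate for counts of this size; for very large cell numbers a library routine with better numerical behavior would be a safer choice.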