
Functional and Genomic Features of Human Genes Mutated in Neuropsychiatric Disorders.

Diego A Forero, Carlos F Prada, George Perry.

Abstract

BACKGROUND: In recent years, a large number of studies around the world have led to the identification of causal genes for hereditary types of common and rare neurological and psychiatric disorders.
OBJECTIVE: To explore the functional and genomic features of known human genes mutated in neuropsychiatric disorders.
METHODS: A systematic search was used to develop a comprehensive catalog of genes mutated in neuropsychiatric disorders (NPD). Functional enrichment and protein-protein interaction analyses were carried out. A false discovery rate approach was used for correction for multiple testing.
RESULTS: We found several functional categories that are enriched among NPD genes, including gene ontologies, protein domains, tissue expression, signaling pathways and regulation by brain-expressed miRNAs and transcription factors. Sixty-six of those NPD genes are known to be druggable. Several topographic parameters of protein-protein interaction networks and the degree of conservation between orthologous genes were identified as significant among NPD genes.
CONCLUSION: These results represent one of the first analyses of enrichment of functional categories of genes known to harbor mutations for NPD. These findings could be useful for the future creation of computational tools for prioritization of novel candidate genes for NPD.

Keywords:  Biological psychiatry; Brain diseases; Computational biology; Genomics; Neurological disorders; Systems biology

Year: 2016; PMID: 27990183; PMCID: PMC5120378; DOI: 10.2174/1874205X01610010143

Source DB: PubMed; Journal: Open Neurol J; ISSN: 1874-205X


INTRODUCTION

Neuropsychiatric disorders (NPD) represent a large burden on global public health, in terms of the disability-adjusted life-years associated with them [1]. Taking into account the severity and chronicity of some of these disorders, global annual costs of NPD have been estimated at several trillion dollars [2]. For several NPD, particularly neurological disorders, a high heritability has been identified for subtypes with Mendelian inheritance [3]. In recent years, several large efforts have been carried out to identify the causal genes for a large number of NPD [4]. Initially, classical genome-wide linkage studies, followed by fine-mapping and gene sequencing analyses, were used. More recently, genome-wide and exome sequencing studies [5] have identified a large number of causal genes for NPD [6]. Several available databases provide information on genes mutated in specific categories of NPD [7]. However, a global functional analysis of all genes known to harbor mutations for NPD is lacking. In the current work, we present a comprehensive catalog of 300 genes mutated in neuropsychiatric disorders and explore their genomic and functional features.

METHODS

Genes mutated in NPD were identified through a combination of automatic and manual searches of the scientific literature and associated databases. Original articles were identified and data (such as first author, gene names, disorders and PubMed identifiers, PMIDs) were extracted and stored. The HUGO Gene Nomenclature Committee (HGNC) database [8] was used to identify official gene symbols and names. The DAVID server [9] was used to convert HGNC IDs to Ensembl Gene IDs. Ensembl BioMart [10] was used to retrieve chromosome, band, gene start and end, gene size, transcript count and GC% data. The LiftOver tool of the University of California at Santa Cruz (UCSC) genome browser [11] was used to convert coordinates from the hg38 to the hg19 assembly; hg19 was used because the latest available annotation for that genome version was more complete.

The DAVID server [9] was used for functional clustering and enrichment analysis of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, gene expression, chromosomal location, InterPro domains, UCSC transcription factor binding sites (TFBS) and Gene Ontology (GO) terms. The Babelomics (FatiGO) server [12] was used for functional enrichment analysis of miRNA targets and KEGG pathways. For both programs, the option of comparing against the entire genome was chosen and a False Discovery Rate (FDR) approach was used to correct for multiple testing. A random sample of protein-coding genes (from the Ensembl database, N=300) was generated as a comparison group for continuous variables (gene length, GC content and transcript counts). Because those variables were not normally distributed, they were compared with a Mann-Whitney U test in Stata 11.

Protein-protein interaction (PPI) data were retrieved from the Human Interactome Project (Center for Cancer Systems Biology, Harvard University, USA), which consolidates several datasets: HI-II-14 and Lit-BM-13 [13], HI-I-05 [14], Venkatesan-09 [15] and Yu-11 [16]. This yielded 3482 interactions for 134 NPD proteins and 619 interactor proteins. The VLOOKUP function in Excel 2013 was used to generate and integrate new tables. Cytoscape 3.1 [17] was used for analysis and visualization of PPI networks. To facilitate PPI visualization, a subnetwork of highly connected proteins (>25 connections) was generated with the respective options in Cytoscape. A PPI network enrichment analysis was carried out with the SNOW tool [12], focusing on the following parameters: relative betweenness, connections and clustering coefficient. A list of druggable genes [18] was downloaded from the DGIdb database [19].

Sequences of the corresponding orthologous genes in Hominoids (chimpanzee, gorilla, orangutan and gibbon) were downloaded from the Ensembl database [20] and aligned using the MUSCLE alignment program [21]. Geneious software was used as a bioinformatics platform for all comparative analyses [22]. Two groups of genes were created: a group of proteins that are highly conserved between primates (>90% identity) and a second, less conserved group (<90% identity). Genes that have a unique gene structure in humans, compared with orthologues, were identified. Additionally, NPD genes located near or inside fragile regions of the human X chromosome were identified [23].
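The multiple-testing correction described above can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which is one common FDR method; the text does not state which FDR procedure the DAVID and FatiGO servers applied, so this is an illustration and not a reproduction of their output. The function name `benjamini_hochberg` and the raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five tested categories (not from the study).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(raw)

# Categories kept at an FDR threshold of 0.05 (threshold chosen for illustration).
significant = [p for p, adj in zip(raw, q) if adj < 0.05]
print(q)
print(significant)
```

In this toy input, two raw p-values below 0.05 (0.039 and 0.041) no longer pass after adjustment, which is the behavior a multiple-testing correction is meant to provide when many categories are tested at once.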

RESULTS

A total of 300 genes were identified as known to harbor mutations for NPD (Table ). These genes belong to several functional categories, such as neurotransmitter receptors, ion channels, synaptic proteins and adhesion molecules, among other groups (Table S2). A functional enrichment analysis of these genes found several significant categories (Table ). Fifteen percent of NPD genes are located on chromosome X, and NPD genes have larger gene lengths and transcript counts than a random sample of protein-coding genes. In terms of functional pathways, genes related to Wnt, Notch and MAPK signaling and to long-term potentiation mechanisms were overrepresented (Table ). Among protein domains, only the ion transport domain from InterPro was significant. In terms of regulatory mechanisms, binding sites for several transcription factors (TF) known to be involved in brain physiology and targets of three miRNAs (hsa-let-7a, hsa-mir-92b, hsa-let-7g) were enriched (Table ), and there was an enrichment of genes expressed in prefrontal cortex and occipital lobe. Significant Gene Ontology categories included nervous system development, transmission of nerve impulse, neuron projection and ion channel activity (Table ).

Several topographic parameters of protein-protein interaction networks were significant: relative betweenness, connections and clustering coefficient (Table ). Fig. () shows an overview of protein-protein interactions for a subnetwork of highly connected proteins. Sixty-six NPD genes were identified as druggable (Table S3).

From the analysis of conservation among orthologues of NPD genes, two main groups were identified: a group of 272 genes that are highly conserved between primates (>90% identity) and a second, less conserved group (<90% identity) with 28 genes. A multiple alignment of the second group of orthologous genes showed that the encoded proteins had from 55.1 to 90.6% identity, with a percentage of identical sites between 13.5 and 79.2% (Table S4). As an example, Fig. (S1) shows the alignment of the REEP1 orthologous genes, highlighting their low protein identity, and Fig. (S2) shows the protein alignment of ARID1B, underscoring that the human protein has 429 additional amino acids at the N-terminal position (amino acids 1 to 429) compared with orthologous genes found in Hominoids. Finally, nine NPD genes, highly conserved in primates, were found inside or adjacent to fragile regions previously reported in the human X chromosome (Table S5).
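The grouping of genes by protein conservation can be sketched as follows. This is a minimal illustration, assuming that identity is the percentage of matching columns in a pairwise alignment against the human sequence and that a gene is classified by its lowest identity across the compared species; the text does not specify how Geneious summarized identity, so both choices are assumptions. The sequences, species labels and function names are hypothetical toy values, not data from the study.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical columns between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if a == b:
            matches += 1  # a column with a gap in only one sequence is a mismatch
    return 100.0 * matches / compared if compared else 0.0


def conservation_group(identity_values, threshold=90.0):
    """Classify by the lowest identity (assumption); exactly 90% counts as less conserved."""
    return "highly conserved" if min(identity_values) > threshold else "less conserved"


# Toy aligned protein fragments (hypothetical, for illustration only).
human = "MKTAYIAKQR"
orthologues = {
    "toy_chimp": "MKTAYIAKQR",
    "toy_gibbon": "MKSAY-AKHR",
}

identities = {name: percent_identity(human, seq) for name, seq in orthologues.items()}
group = conservation_group(identities.values())
print(identities, group)
```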

DISCUSSION

These results represent one of the first analyses of enrichment of functional categories of genes known to harbor mutations for NPD [4]. Previous studies that analyzed all genes for human diseases identified several genomic features (such as gene length) that were significant predictors [24]. In this study, we found several genomic features of NPD genes, such as larger gene lengths and transcript counts, location on chromosome X, presence of ion transport protein domains, expression in prefrontal cortex and regulation by several transcription factors that are known to be involved in brain function [4, 25]. As miRNAs are being identified as novel major regulators of brain function and NPD [26], it is interesting that we found a possible common regulation by three miRNAs. Given the large number of features tested, a false discovery rate approach was used to correct for multiple testing. In terms of functional analyses, we found an enrichment of gene ontologies related to neural transmission and plasticity and of signaling networks linked to synaptic plasticity (such as Wnt and Notch), which have been previously postulated as underlying several NPD [27-29]. Of special interest, from a systems biology perspective, several topographic parameters of protein-protein interaction networks were significant for NPD genes [30, 31]. We found that 66 NPD genes are known to be druggable, a finding of relevance for the development of novel therapeutic interventions [19]. We also found that nine NPD genes are located inside or adjacent to fragile regions previously reported in the human X chromosome [23], that 28 NPD genes are less conserved among primates (<90% identity) and that 5 NPD genes show a unique gene structure in humans, compared with orthologues.
Of special relevance, from a global public health perspective, is the future identification of additional causal genes for NPD, particularly in developing countries [32-36]. These results could be useful for the future creation of computational tools [37] that allow prioritization of novel candidate genes (including ncRNAs [26, 38]) for NPD, incorporating several of the parameters that were found in this work as significant for NPD genes.

Table 1

Genomic analysis of 300 human genes known to be mutated in neuropsychiatric disorders.

| Category | Feature | n (%) | p value | FDR |
|---|---|---|---|---|
| Chromosomal Location | Chromosome X | 45/294 (15.3) | 1.0E-11 a | 8.1E-9 |
| Gene Size | Gene Length | | 0.0000 d | |
| Transcriptional Complexity | Transcript count | | 0.0000 d | |
| Gene Expression (GNF_U133A) | Expression in Occipital Lobe | 88/294 (29.9) | 2.9E-11 a | 3.1E-8 |
| Gene Expression (GNF_U133A) | Expression in Prefrontal Cortex | 73/294 (24.8) | 4.6E-7 a | 4.9E-4 |
| Protein Domains (INTERPRO) | Ion Transport Domain | 14/294 (4.8) | 3.8E-8 a | 5.8E-5 |
| TF binding sites (UCSC) | SOX5 | 148/294 (60.5) | 1.2E-12 a | 1.4E-9 |
| TF binding sites (UCSC) | ZIC2 | 108/294 (36.7) | 1.7E-12 a | 2.1E-9 |
| TF binding sites (UCSC) | PAX6 | 191/294 (65.0) | 1.5E-11 a | 1.8E-8 |
| TF binding sites (UCSC) | NF1 | 141/294 (48.0) | 7.4E-10 a | 9.1E-7 |
| TF binding sites (UCSC) | POU3F2 | 189/294 (64.3) | 4.4E-7 a | 5.4E-4 |
| TF binding sites (UCSC) | EN1 | 174/294 (59.2) | 6.9E-7 a | 8.5E-4 |
| miRNA targets | hsa-let-7a | 21/300 (7.0) | 0.001 b | 0.03 |
| miRNA targets | hsa-mir-92b | 18/300 (6.0) | 0.001 b | 0.04 |
| miRNA targets | hsa-let-7g | 23/300 (7.7) | 0.0005 b | 0.02 |

a. DAVID server

b. FatiGO server

c. SNOW server

d. Mann-Whitney U test

Abbreviations: TF: Transcription Factor; FDR: False Discovery Rate.
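An over-representation p-value of the kind reported in the table can be approximated with a one-sided hypergeometric test. The sketch below is a plain hypergeometric upper tail, which is a standard way to test enrichment but is not necessarily the exact statistic computed by the DAVID or FatiGO servers. The counts 45 and 294 are taken from the chromosome X row; the background sizes (850 X-linked genes among 20000 genes) are hypothetical placeholders, so the resulting p-value is illustrative only.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the annotation of interest."""
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Observed counts from the chromosome X row: 45 annotated genes out of 294.
observed_hits = 45
list_size = 294

# Hypothetical background (assumption, not from the study).
background_hits = 850
background_size = 20000

p_x = hypergeom_upper_tail(observed_hits, list_size, background_hits, background_size)
print(p_x)
```

Exact integer binomial coefficients are used so that very small tail probabilities are not lost to floating-point underflow before the final division.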

Table 2

Functional enrichment analysis of 300 human genes known to be mutated in neuropsychiatric disorders.

| Category | Feature | n (%) | p value | FDR |
|---|---|---|---|---|
| Biological Process (GO) | Nervous system development | 76/294 (25.9) | 1.6E-23 a | 2.8E-20 |
| Biological Process (GO) | Transmission of nerve impulse | 39/294 (13.3) | 2.3E-18 a | 4.1E-15 |
| Cellular Component (GO) | Neuron projection | 43/294 (14.6) | 3.8E-24 a | 5.2E-21 |
| Molecular Function (GO) | Ion channel activity | 29/294 (9.9) | 2.1E-10 a | 3.1E-7 |
| Signaling Pathways (KEGG) | Wnt signaling pathway | 8/300 (2.7) | 0.0008 b | 0.03 |
| Signaling Pathways (KEGG) | Notch signaling pathway | 5/300 (1.7) | 0.0003 b | 0.01 |
| Signaling Pathways (KEGG) | Long-term potentiation | 5/300 (1.7) | 0.001 b | 0.04 |
| Signaling Pathways (KEGG) | MAPK signaling pathway | 11/300 (3.7) | 0.001 b | 0.03 |
| Protein-Protein Interaction Networks | Relative betweenness | | 0.01 c | |
| Protein-Protein Interaction Networks | Connections | | 0.01 c | |
| Protein-Protein Interaction Networks | Clustering Coefficient | | 0.0007 c | |

a. DAVID server

b. FatiGO server

c. SNOW server

Abbreviations: GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; MAPK: Mitogen-activated protein kinases; FDR: False Discovery Rate.
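Two of the network parameters listed in the table, connections (node degree) and clustering coefficient, can be computed directly from an edge list. The sketch below uses a toy network with hypothetical protein labels; it does not reproduce the SNOW or Cytoscape analyses, and relative betweenness is omitted because it requires shortest-path enumeration. The hub threshold of more than 2 connections is chosen for the toy data only (the study used >25 connections for its subnetwork).

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-interactions."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbours that interact with each other."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Toy interaction list with hypothetical protein labels.
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P1", "P4")]
adj = build_adjacency(edges)

degree = {node: len(partners) for node, partners in adj.items()}
clustering = {node: clustering_coefficient(adj, node) for node in adj}

# Highly connected proteins under the toy threshold.
hubs = sorted(node for node, d in degree.items() if d > 2)
print(degree, clustering, hubs)
```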

REFERENCES (10 of 38 shown)

1.  Computational tools for prioritizing candidate genes: boosting disease gene discovery. (Review)

Authors:  Yves Moreau; Léon-Charles Tranchevent
Journal:  Nat Rev Genet       Date:  2012-07-03       Impact factor: 53.242

2.  Towards a proteome-scale map of the human protein-protein interaction network.

Authors:  Jean-François Rual; Kavitha Venkatesan; Tong Hao; Tomoko Hirozane-Kishikawa; Amélie Dricot; Ning Li; Gabriel F Berriz; Francis D Gibbons; Matija Dreze; Nono Ayivi-Guedehoussou; Niels Klitgord; Christophe Simon; Mike Boxem; Stuart Milstein; Jennifer Rosenberg; Debra S Goldberg; Lan V Zhang; Sharyl L Wong; Giovanni Franklin; Siming Li; Joanna S Albala; Janghoo Lim; Carlene Fraughton; Estelle Llamosas; Sebiha Cevik; Camille Bex; Philippe Lamesch; Robert S Sikorski; Jean Vandenhaute; Huda Y Zoghbi; Alex Smolyar; Stephanie Bosak; Reynaldo Sequerra; Lynn Doucette-Stamm; Michael E Cusick; David E Hill; Frederick P Roth; Marc Vidal
Journal:  Nature       Date:  2005-09-28       Impact factor: 49.962

3.  The druggable genome: an update.

Authors:  Andreas P Russ; Stefan Lampel
Journal:  Drug Discov Today       Date:  2005-12       Impact factor: 7.851

4.  A functional polymorphism in the promoter region of MAOA gene is associated with daytime sleepiness in healthy subjects.

Authors:  Diego A Ojeda; Carmen L Niño; Sandra López-León; Andrés Camargo; Ana Adan; Diego A Forero
Journal:  J Neurol Sci       Date:  2013-12-11       Impact factor: 3.181

5.  A high resolution map of mammalian X chromosome fragile regions assessed by large-scale comparative genomics.

Authors:  Carlos Fernando Prada; Paul Laissue
Journal:  Mamm Genome       Date:  2014-08-03       Impact factor: 2.957

6.  MIR137 variants identified in psychiatric patients affect synaptogenesis and neuronal transmission gene sets.

Authors:  M Strazisar; S Cammaerts; K van der Ven; D A Forero; A-S Lenaerts; A Nordin; L Almeida-Souza; G Genovese; V Timmerman; A Liekens; P De Rijk; R Adolfsson; P Callaerts; J Del-Favero
Journal:  Mol Psychiatry       Date:  2014-06-03       Impact factor: 15.992

7.  Cytoscape 2.8: new features for data integration and network visualization.

Authors:  Michael E Smoot; Keiichiro Ono; Johannes Ruscheinski; Peng-Liang Wang; Trey Ideker
Journal:  Bioinformatics       Date:  2010-12-12       Impact factor: 6.937

8.  Next-generation sequencing to generate interactome datasets.

Authors:  Haiyuan Yu; Leah Tardivo; Stanley Tam; Evan Weiner; Fana Gebreab; Changyu Fan; Nenad Svrzikapa; Tomoko Hirozane-Kishikawa; Edward Rietman; Xinping Yang; Julie Sahalie; Kourosh Salehi-Ashtiani; Tong Hao; Michael E Cusick; David E Hill; Frederick P Roth; Pascal Braun; Marc Vidal
Journal:  Nat Methods       Date:  2011-04-24       Impact factor: 28.547

9.  Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data.

Authors:  Matthew Kearse; Richard Moir; Amy Wilson; Steven Stones-Havas; Matthew Cheung; Shane Sturrock; Simon Buxton; Alex Cooper; Sidney Markowitz; Chris Duran; Tobias Thierer; Bruce Ashton; Peter Meintjes; Alexei Drummond
Journal:  Bioinformatics       Date:  2012-04-27       Impact factor: 6.937

10.  Speeding disease gene discovery by sequence based candidate prioritization.

Authors:  Euan A Adie; Richard R Adams; Kathryn L Evans; David J Porteous; Ben S Pickard
Journal:  BMC Bioinformatics       Date:  2005-03-14       Impact factor: 3.169
