
OKCAM: an ontology-based, human-centered knowledgebase for cell adhesion molecules.

Chuan-Yun Li, Qing-Rong Liu, Ping-Wu Zhang, Xiao-Mo Li, Liping Wei, George R Uhl.

Abstract

'Cell adhesion molecules' (CAMs) are essential elements of cell/cell communication that are important for proper development and plasticity of a variety of organs and tissues. In the brain, appropriate assembly and tuning of neuronal connections is likely to require appropriate function of many cell adhesion processes. Genetic studies have linked and/or associated CAM variants with psychiatric, neurologic, neoplastic, immunologic and developmental phenotypes. However, despite increasing recognition of their functional and pathological significance, no systematic study has enumerated CAMs or documented their global features. We now report compilation of 496 human CAM genes in six gene families based on manual curation of protein domain structures, Gene Ontology annotations, and 1487 NCBI Entrez annotations. We map these genes onto a cell adhesion molecule ontology that contains 850 terms with up to seven levels of depth and provides a hierarchical description of these molecules and their functions. We develop OKCAM, a CAM knowledgebase that provides ready access to these data and ontologic system at http://okcam.cbi.pku.edu.cn. We identify global CAM properties that include: (i) functional enrichment, (ii) over-represented regulation modes and expression patterns and (iii) relationships to human Mendelian and complex diseases, and discuss the strengths and limitations of these data.

Year:  2008        PMID: 18790807      PMCID: PMC2686464          DOI: 10.1093/nar/gkn568

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

‘Cell adhesion molecules’ play central roles in much of the connection and communication between cells and their synapses (1). Cell adhesion-related communication is essential for many aspects of the proper development of a variety of organs and tissues (1). This cellular communication also plays substantial roles in the plasticity of cell recognition processes in the developed organism (2). Cell adhesion molecules (CAMs) may be especially important in the brain. The brain requires proper connections of many trillions of synapses to develop properly as well as substantial plasticity in many of these synapses to facilitate learning and memory. The dynamics of neuronal synaptic recognition, connection and disconnection appear to make substantial contributions to disorders that display mnemonic features, including addictions and autism (3,4). Current physiologic and cell biologic studies have implicated CAMs as good candidates to play important roles in synapse adhesion (1,5), neuronal connectivity and communication (1), signal transduction (5–8) and proper arrangement of pre-synaptic active zones and postsynaptic densities at classical synapses (9,10). Current genetic studies have linked and/or associated variants in cell adhesion molecule genes with psychiatric, neurologic, neoplastic, immunologic and developmental phenotypes. The importance of CAMs in learning and memory-associated disorders is demonstrated in recent genome-wide association studies (11). Vulnerabilities to addictions are associated with variants in CAM genes in studies of several independent samples (12–14). Genetic variants of the CAM genes NRXN1 and CNTNAP2 have been associated with autism (4,15). Variants in neuregulin have been associated with vulnerability to schizophrenia (16). Variants in an adhesion-like protein KIAA0319 have been associated with dyslexia (17,18).
These data underscore the importance of cell adhesion molecules in both Mendelian and complex disorders of brain and other organs and suggest that a more comprehensive view of these genes and molecules would be valuable. However, there is currently no systematic study that enumerates: (i) the number of genes and gene families that function as CAMs; (ii) common and/or global CAM functions, including those that might extend beyond their cell/cell recognition functions; (iii) common CAM genetic variants that might provide individual differences in CAM structures and functions; (iv) over-represented regulation modes and expression patterns and (v) CAM associations with diseases, especially with brain disorders. We now report compilation of a list of 496 human CAM genes and construction of a corresponding cell adhesion molecule ontology (CAMO) to systematically address these questions. Detailed annotations on CAM genes are provided. Global properties of CAM genes, overrepresented types of variation, overrepresented regulation modes and expression patterns, and disease associations are identified. We report a knowledgebase for cell adhesion molecules (OKCAM) that provides ready access to these data and the associated ontologic system that we describe here.

IDENTIFICATION OF HUMAN CAM GENES AND RODENT HOMOLOGS

CAMs were identified based on compilation of data from manual curation of protein domain structures, Gene Ontology annotations, and 1487 annotation entries from keyword queries based on NCBI Entrez Gene annotations (Figure 1). First, we identified features of common protein domains for CAM families based on common motifs from cadherin, immunoglobulin/FibronectinIII (IgFn), integrin, neurexin, neuroligin and catenin families. Using these features, we developed Perl scripts to retrieve and standardize related InterPro domain architectures and the proteins that contain such architectures (19). After manual curation, 44 types of protein domains with 202 detailed domain architectures were identified. These included 532 human proteins that map onto 218 human genes. We used similar protocols to identify cell adhesion gene lists for rat and mouse; these genes were then further mapped to the human genome using Homologene (20). We next extracted CAMs using the Gene Ontology term ‘cell adhesion’ (GO:0007155) (21). We focus on curated entries; entries that are identified only by annotations that display Evidence Code IEA (Inferred from Electronic Annotation) are noted in Supplementary Table 7. Two hundred eighteen human proteins were identified, which mapped onto 196 human genes. Finally, we manually curated 1487 annotation entries selected from results of the Entrez Gene query ‘adhesion AND Homo sapiens [organism]’ (20). This approach added 136 more human genes to the list of cell adhesion molecules. In total, we thus identified 496 unique human CAM genes and their homologs in other species.
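The three-source compilation described above can be pictured as a set union keyed on gene identifiers. The following is a minimal sketch, not the authors' Perl pipeline: the function name `merge_sources` and the three toy gene sets are hypothetical and chosen only for illustration. The closing arithmetic rests on the assumption that the 136 Entrez-derived genes were all new to the list, in which case the domain-based (218) and Gene Ontology-based (196) gene sets would share 54 genes.

```python
def merge_sources(domain_genes, go_genes, entrez_genes):
    """Union three gene collections and report what each later source adds."""
    merged = set(domain_genes)
    added_by_go = set(go_genes) - merged          # genes first seen via GO
    merged |= added_by_go
    added_by_entrez = set(entrez_genes) - merged  # genes first seen via Entrez
    merged |= added_by_entrez
    return merged, added_by_go, added_by_entrez


# Toy inputs for illustration only; these are not the OKCAM gene lists.
domain = {"CDH1", "CDH2", "NCAM1"}
go = {"CDH1", "ITGB1"}
entrez = {"NCAM1", "ITGB1", "NRXN1"}

merged, new_go, new_entrez = merge_sources(domain, go, entrez)
print(sorted(merged), sorted(new_go), sorted(new_entrez))

# Overlap implied by the reported counts, assuming the 136 Entrez genes are all new.
implied_overlap = 218 + 196 + 136 - 496
print(implied_overlap)
```

Tracking what each source contributes, rather than only the final union, makes it easy to see how much each curation step adds to the list.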
Figure 1.

Collection of Human CAMs. CAMs were compiled by integrating Gene Ontology annotations, domain structure information and keywords query against NCBI Entrez Gene annotations. Four hundred and ninety-six unique human genes were identified as CAMs (additional genes that may also function in this way are identified in the supplement).

Meta-data about the domain architectures for CAMs in nonhuman species provided information about CAM evolutionary histories. Across the 113 types of protein domains assessed in our dataset, 705 detailed domain architectures were noted. Among these, only 44 domains with 202 domain architectures were identified in all three species (human, rat and mouse). For example, in the cadherin superfamily, only one human gene encodes a protein with enzymatic activity, though several dozen cadherins with enzymatic activities are found in bacteria and yeast. Several categories with large numbers of domain architectures that can be detected in lower species, including Caenorhabditis elegans, Drosophila melanogaster and Danio rerio, are totally absent from human, rat and/or mouse. These categories include ‘IgCAM-like cadherins’ that display 29 such domain architectures, ‘cadherins with Leucine-rich structures’ that display two such domain architectures, ‘toxin-related cadherins’ that display 36 such domain architectures and ‘cadherins with surface anchor structures’ that display seven such domain architectures. In striking contrast, 119 of the 123 ‘cadherin’ genes that can be identified in humans fall into the category of ‘simple cadherins’, which includes genes with only simple combinations of cadherin prodomains, cadherin domains and cadherin cytoplasmic domains. Although 79%, but not all, of the proteins that we identify in this study display characteristic InterPro domains, the domain architecture patterns we identify do imply the specification of the CAMs in mammals.

DATA ANNOTATIONS

To elucidate the functions of CAMs, detailed annotations were given to each CAM gene. These data allow interpretation of features of each CAM at five levels: gene family and basic information, genetics, regulation, expression, and Mendelian or complex disease linkage/association. Information about gene family and basic characterization comes from NCBI Entrez gene annotations (20), Gene Ontology (21), InterPro domains (19), protein interaction databases (22–24), knowledgebases for molecular pathways including KEGG (25), BioCarta and Pathway Interaction Database (PID) and the NCBI PubMed database (20). Genetic variations in these genes, including chromosome recombination hotspots (26), SNPs (20), insertion/deletions (27), chromosomal translocations (27) and CNVs (27), were retrieved from the UCSC Genome Browser Database (26), HapMap (28), NCBI dbSNP database (20) and Database of Genomic Variants (27), respectively. Information about potential or actual modes of regulation was annotated based on the presence of experimentally validated transcription factor binding sites (TFBS) (29), experimentally validated (30,31) and putative miRNA targets (32), noncoding RNA loci (33), cis/trans-natural antisense transcripts (NATs) (34,35), alternative splicing and post-translational modifications (36) from databases that included TransFac (29), Argonaute (31), TarBase (30), PicTar (32), NatsDB (34,35), NONCODE (33) and dbPTM (36). Information about mRNA expression levels came from: (i) integrated human expressed sequence tag profiles based on developmental stages and tissue distributions, as deposited in Unigene (20) and (ii) mouse brain region expression profiles described in the Allen Brain Atlas (37), with mapping of these data to human orthologs using Homologene (20). We integrated gene expression information at peptide/protein levels by collecting expressed proteins and peptides deposited in the PRIDE database (38). 
To assess potential disease linkages or associations, we integrated OMIM (20) and genome-wide association datasets (39), from public data deposited in the Genetic Association Database (39) and an additional 12 in-house genome wide association datasets. Full descriptions of the annotation statistics are provided in Table 1. These annotations, extending from genome to post-translational modification, provide a novel avenue for studies of the global properties of CAM genes, overrepresented types of variation, overrepresented regulation modes and expression patterns, and disease associations, as we discuss in the following sections.
Table 1.

Annotations for CAM genes

Description | Evidence entry no. | Annotated gene no. | Annotation coverage (%) | Reference
CAM gene families and basic information | | | |
    NCBI Entrez Gene annotations | 496 | 496 | 100.0 | (20)
    PubMed entries | 18 371 | 490 | 98.8 | (20)
    Gene Ontology annotations | 4728 | 478 | 96.4 | (21)
    Protein interactions in BIND | 230 | 77 | 15.5 | (23)
    Protein interactions in HPRD | 2225 | 218 | 44.0 | (24)
    Protein interactions in BioGRID | 2566 | 199 | 40.1 | (22)
    Enriched KEGG pathways | 17 | 285 | 57.4 | (25)
    Enriched BioCarta pathways | 19 | 259 | 52.2 | NA
    Enriched PID pathways (Pathway Interaction Database) | 16 | 396 | 79.8 | NA
CAM genetics | | | |
    Recombination hotspots | 714 | 252 | 50.8 | (26)
    Chromosome insertion/deletion | 493 | 159 | 32.1 | (27)
    GVD CNV | 371 | 150 | 30.2 | (27)
    SNP in CDS regions | 3756 | 418 | 84.3 | (20)
    SNP in UTR regions | 4489 | 429 | 85.5 | (20)
    SNP in intron regions | 236 213 | 427 | 86.1 | (20)
CAM expression | | | |
    Unigene expression profiles | 24 480 | 451 | 90.9 | (20)
    Allen Brain Atlas expression profiles (expressed in brain) | 6205 | 355 | 71.6 | (37)
    Allen Brain Atlas expression profiles (highly expressed in brain) | 1326 | 78 | 15.7 | (37)
    Proteomics evidence in PRIDE | 15 331 | 277 | 55.8 | (38)
CAM regulation | | | |
    Validated transcription factor binding sites in TransFac | 125 | 21 | 4.2 | (29)
    Validated transcription factor binding sites by ChIP-chip | 189 | 140 | 28.2 | (29)
    Putative miRNA targets in PicTar | 35 513 | 236 | 47.6 | (32)
    Validated miRNA targets in TarBase | 5 | 5 | 1.0 | (30)
    Validated miRNA targets in Argonaute | 419 | 109 | 22.0 | (31)
    Cis-NATs regulation | 221 | 219 | 44.2 | (34,35)
    Trans-NATs regulation | 145 | 36 | 7.2 | (34,35)
    Noncoding RNA loci | 167 | 95 | 19.2 | (33)
    Alternative splicing | 11 026 | 465 | 93.8 | NA
    Experimentally validated PTMs | 3080 | 373 | 75.2 | (36)
    Putative PTMs | 15 829 | 401 | 80.8 | (36)
Possible CAM diseases and disorders | | | |
    OMIM (with phenotype) | 144 | 75 | 15.1 | (20)
    Vulnerable markers identified by GWA | 647 | 80 | 16.1 | (39)
    In-house GWA vulnerable markers | 121 | 64 | 12.9 | NA
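The coverage column in Table 1 appears to be the number of annotated genes divided by the 496 CAM genes, expressed as a percentage. The short sketch below recomputes a few rows under that assumption; the helper name `coverage_percent` is hypothetical, and the gene counts are copied from the table.

```python
TOTAL_CAM_GENES = 496  # size of the curated human CAM gene list


def coverage_percent(annotated_genes, total=TOTAL_CAM_GENES):
    """Share of CAM genes carrying an annotation, as a percentage with one decimal."""
    return round(100.0 * annotated_genes / total, 1)


# Annotated gene counts taken from three rows of Table 1.
rows = {
    "Gene Ontology annotations": 478,
    "Protein interactions in HPRD": 218,
    "OMIM (with phenotype)": 75,
}
for name, genes in rows.items():
    print(f"{name}: {coverage_percent(genes)}%")
```

For these three rows the recomputed values agree with the published column (96.4, 44.0 and 15.1).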

CONSTRUCTION OF A CAMO

We iteratively organized the information and knowledge for CAMs to construct a novel CAMO. CAMO was constructed as a directed acyclic graph (DAG) using DAG-Edit (40) to input, manage and update data, as shown in the screenshot (Supplementary Figure 1). We annotated each term with name, definition and source references. We added its relationship to other terms based on manual reviews of domain architecture and functional annotations at the five levels noted above. If vertices represent terms and the relationships between terms are represented by edges, the terms in a DAG can be connected via a directed graph without cycles. CAMO thus provides a hierarchical description of functions and properties of CAMs with five top-level categories: CAM gene families, CAM genetics, CAM regulation, CAM expression and CAM diseases. Each top-level term is further divided into several categories to describe the functions in detail (Figure 2). In toto, CAMO has 850 terms with up to seven levels of depth. We mapped the 496 human genes that function in cell adhesion onto CAMO, providing a novel systematic description of CAMs (Figure 2). CAMO thus provides more specific, complete and resolved information about CAMs to scientists, especially to neuroscientists, than is available in general-purpose ontologies such as MeSH (41) and Gene Ontology (21).
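The DAG structure described above can be sketched in a few lines. This is an illustration only, not the DAG-Edit data model: the class `TermGraph` is hypothetical, and the placement of ‘psychiatric disorders’ and ‘drug addiction’ under ‘CAM diseases’ is assumed for the example. It shows the two properties the text relies on, namely that the term graph has no cycles and that its depth can be counted in levels.

```python
class TermGraph:
    """Minimal directed graph of ontology terms (parent -> children)."""

    def __init__(self):
        self.children = {}

    def add_relation(self, parent, child):
        self.children.setdefault(parent, set()).add(child)
        self.children.setdefault(child, set())

    def is_acyclic(self):
        # Depth-first search with three states: unseen, in progress, done.
        UNSEEN, ACTIVE, DONE = 0, 1, 2
        state = dict.fromkeys(self.children, UNSEEN)

        def visit(term):
            state[term] = ACTIVE
            for child in self.children[term]:
                if state[child] == ACTIVE:
                    return False  # back edge: the graph contains a cycle
                if state[child] == UNSEEN and not visit(child):
                    return False
            state[term] = DONE
            return True

        return all(visit(t) for t in self.children if state[t] == UNSEEN)

    def depth(self, root):
        # Number of levels on the longest path from root (valid only for a DAG).
        return 1 + max((self.depth(c) for c in self.children[root]), default=0)


# Hypothetical fragment of the ontology, for illustration only.
camo = TermGraph()
camo.add_relation("CAMO", "CAM diseases")
camo.add_relation("CAMO", "CAM gene families")
camo.add_relation("CAM diseases", "psychiatric disorders")
camo.add_relation("psychiatric disorders", "drug addiction")
print(camo.is_acyclic(), camo.depth("CAMO"))
```

Because a term in a DAG may have more than one parent, `children` maps each term to a set rather than assuming a strict tree.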
Figure 2.

Structure of CAMO. CAMO provides a hierarchical description of functions and properties of CAMs with five top-level categories (A): CAM expression (B), CAM diseases (C), CAM genetics (D), CAM gene families (E) and CAM regulation (F). Each top-level term is further divided into several categories that allow more detailed functional descriptions.


OKCAM WEB INTERFACE DESIGN

We developed a PostgreSQL database termed ‘OKCAM (Ontology-based Knowledgebase for Cell Adhesion Molecules)’ to manage the CAM gene list, annotations and ‘CAMO’. We implemented a web-based user interface of this database that uses PHP and PHP/SQL query scripts. Cross-references to key external databases were included to integrate functional information about CAM genes. These external databases provide annotations for CAM gene families, CAM genetics and genomics, CAM regulation modes and expression patterns, and relationships between CAMs and human diseases (Figure 3).
Figure 3.

Structure of OKCAM Web Server. Several interactive browsing options were implemented to facilitate user queries of OKCAM. These include ontology overview (A), full gene list overview (B), chromosomal overview (C), text search (D) and BLAST search (D). Each interactive browsing interface returns CAM gene/gene lists that meet query requirements (F). Users can then obtain further detailed annotations mentioned above by clicking on gene names (G). A download page makes all data, database schema and PostgreSQL commands available (E).

The information for each CAM gene is integrated and presented in a single graphical web page. For example, the OKCAM entry page for cadherin 1 (CDH1) (http://okcam.cbi.pku.edu.cn/entry-info.php?id=999) shows that CDH1 is located on chromosome 16 in a chromosome region that contains a recombination hotspot, copy number variations and insertion/deletions (‘CAM genetics information’). CDH1 transcripts are relatively highly expressed in adult (‘developmental stage’), mammary gland (‘tissue distribution’) and cerebral cortex (‘brain region’). Translation products are also expressed in placenta/blood serum (‘protein expression’). CDH1 is implicated in neoplasia by genome-wide association studies and OMIM annotations (‘CAM disease’). Potential CDH1 regulatory modes include alternative splicing regulation, cis-NATs regulation, miRNA regulation as well as post-translational modifications (‘CAM regulation’). Links to the original databases and other resources facilitate information tracing. We implemented four interactive browsing options in OKCAM to facilitate user queries. Users can browse cell adhesion genes by ‘CAMO’, displayed as hierarchical trees on the homepage. They can zoom in on a particular branch of the ontology by clicking the ‘+’ sign to expand the branch.
For example, a user interested in ‘psychiatric disorders’ may expand this category, focus on ‘drug addiction’ and see the 49 CAM genes currently mapped on this term by clicking the number that follows this term (Figures 2 and 3). A ‘Chromosomal Overview’ browser supports browsing the CAM genes by clicks on chromosomal locations marked by ‘+++’ (Figure 3). A text search interface facilitates database queries that use either gene IDs or names. A fourth interface supports sequence searching based on BLAST nucleotide and amino acid sequence similarities. Each interactive browsing interface returns CAM gene/gene lists that meet query requirements. Users can then obtain further detailed annotation by clicking on the gene name (Figure 3). A download page makes all data, database schema and PostgreSQL commands available at http://okcam.cbi.pku.edu.cn/download.php.

APPLICATIONS OF OKCAM

The comprehensive annotations and ontology system of OKCAM facilitate studies of the global properties of the CAM genes, overrepresented types of variation, overrepresented regulation modes and expression patterns, and disease associations.

GLOBAL FEATURES OF CAMs

CAMs in our dataset were annotated using Gene Ontology (GO) (21) and the pathway databases KEGG (25), BioCarta and Pathway Interaction Database (PID). We can thus identify significantly enriched Gene Ontology terms and pathways using DAVID (42) and KOBAS (43,44), respectively. We selected the functional categories that were more likely to be biologically meaningful by calculating the statistical significance of each functional category in the input set of genes versus all annotated genes in the human genome. There was statistically significant enrichment for CAM genes in 16 ‘molecular function’ terms (Supplementary Table 1), 11 ‘subcellular localization’ terms (Supplementary Table 2) and 45 ‘biological processes’ terms (Supplementary Table 3), when compared to corresponding data for the whole genome. Identification of functional enrichment for several of the ‘molecular function’ and ‘subcellular localization’ terms is reassuring. This identification provides relatively little additional information, however, since CAMs do function as ‘adhesion molecules’. Most are well documented to sit within (or be anchored to) plasma membranes. However, there is also significant enrichment for other molecular functions that might not have been so readily anticipated, including calcium binding, protein kinase, and protein phosphatase activities (Supplementary Tables 1 and 4). The significant overrepresentation of CAM localizations within receptor complexes and extracellular matrix is also of interest (Supplementary Table 2). It is interesting that the CAMs identified in this work are overrepresented in not only ‘cell adhesion’ but also in biological processes that include signal transduction, responses to external stimuli, cell motility, migration, and nervous system development (Supplementary Table 3). 
Reassuringly, the molecular pathway enrichment analyses that used each of the three different pathway databases provided results that implicated their roles in largely similar functional pathways (Supplementary Table 5). Data from OKCAM annotations for protein interactions allowed us to develop a molecular network based on proteins that could interact with the CAMs identified here (Supplementary Figure 2). As for other established biological networks (45,46), the connectivity distribution of the network that we nominated in this way appears to follow scale-free rules. CAMs appear to interact with each other to form a relatively tight ‘core’ that interrelates with hundreds of other signal transduction genes. Focus on the ‘hub nodes’ in this apparent network (Supplementary Figure 2) may even help to elucidate novel CAM roles in signal transduction that come from its partnerships with other signaling molecules.
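The enrichment question asked above, whether a functional category is more common among CAM genes than among all annotated genes, is often framed as a hypergeometric tail probability. The sketch below is a generic illustration and is not claimed to reproduce the statistics computed by DAVID or KOBAS; the function name `hypergeom_enrichment_p` and the toy counts are assumptions made for the example.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, observed):
    """P(X >= observed) when `drawn` genes are sampled without replacement
    from `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(observed, upper + 1)
    )
    return tail / total


# Toy numbers for illustration: 20 genes overall, 5 carry the term,
# and 2 of the 4 genes in the set of interest carry it.
p_value = hypergeom_enrichment_p(population=20, annotated=5, drawn=4, observed=2)
print(round(p_value, 4))
```

Using exact integer binomial coefficients avoids floating-point underflow for the very small tail probabilities that genome-scale gene lists can produce.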

CAM REGULATORY MODES

Mapping the CAMs in our dataset onto CAMO and detailed gene structural/regulatory terms allows us to identify specific potential regulatory modes for these CAMs. We can then perform Monte Carlo analyses to test whether these structural/regulatory modes are overrepresented among CAMs. At the human genomic level, both recombination ‘hotspots’ (Monte Carlo P = 0.024) and copy number variations (Monte Carlo P < 0.0001) are over-represented in chromosome regions that contain CAM genes. Indeed, ‘cell adhesion molecule’ is the GO category that is most enriched in the genes that overlap with 1447 copy number variants identified using Affymetrix 500 K and whole genome TilePath (WGTP) reagents (47). There is a more modest but still significant 1.42-fold enrichment for CAM genes in chromosomal regions that contain both copy number variations and recombination hotspots (P = 0.07). By contrast, we detected no significant difference in the densities of single nucleotide polymorphism (SNP) distributions in chromosomal regions that contain CAM genes versus the whole genome (P > 0.5). When we tested potential overrepresentation of transcriptional regulatory modes using hypergeometric tests, we found that the potential for miRNA regulation was significantly enriched for CAM genes when compared to the whole genome (P < 0.0001). In contrast, no over-represented transcription factor regulation for CAM genes was detected using either low-scale experimentally validated (P = 0.37) or ChIP-chip data (P = 0.51). There was no significant over- or under-representation of CAMs among genes involved in either cis- or trans-NAT (35) regulation (P > 0.5 for each). We can also seek overrepresentation of CAM alternative splicing by compiling the alternative splicing isoforms for each human gene mapped on CAMO and plotting the distributions of the numbers of isoforms for (i) CAMs versus (ii) all human genes (Supplementary Figure 3). The overall distributions appear similar.
However, genes that utilize a wealth of alternative transcripts, those that encode ∼40–50 alternatively spliced isoforms, are over-represented in the dataset that encodes CAMs. These genes provide an apparently distinct ‘peak’ in the distribution curve (Supplementary Figure 3). This analysis agrees with our previous work that has characterized multiple alternative splicing events in specific addiction-associated CAMs (13). We integrated post-translational modification (PTM) data to identify possible contributions of this regulatory mode to CAM functions. On the basis of the experimentally validated PTM data deposited in dbPTM, the 496 CAM genes are candidates for involvement in glycosylation (334 genes), phosphorylation (114 genes), amidation (22 genes), palmitoylation (eight genes), methylation (three genes), farnesylation (two genes), myristoylation (two genes), sulfation (one gene) and acetylation (one gene). There is a highly significant enrichment for CAM N-linked glycosylation (331 genes, P < 0.0001), but not for O-linked glycosylation (10 genes). No significant over- or under-representation was detected for other modes of post-translational modification. On the basis of the OKCAM annotations and CAMO, we identified a list of regulatory modes for cell adhesion molecules. These analyses identified both expected and unexpected CAM regulatory modes. First, the data document the overrepresentation of CNVs within CAM genes, in ways that were suggested in even some of the initial descriptions of CNVs (48). Documenting a 1.4-fold enrichment for CAM genes in chromosomal regions that contain both copy number variations and recombination hotspots both supports these initial observations and provides a possible mechanism for the abundance of CNVs in CAM genes. Secondly, although many papers have described many alternative splicing isoforms for CAMs, it was somewhat surprising to note that the largest diversity of alternative transcripts (e.g. ∼40–50) was selectively over-represented among CAM genes.
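The Monte Carlo tests mentioned in this section can be illustrated with a simple resampling scheme. The text does not specify how the random sets were drawn, so the sketch below assumes gene-level resampling: it repeatedly draws random gene sets of the same size as the CAM set and counts how often they overlap a genomic feature at least as much as the CAM set does. The function name `monte_carlo_p` and the toy gene identifiers are hypothetical.

```python
import random


def monte_carlo_p(all_genes, cam_genes, feature_genes, n_iter=2000, seed=0):
    """Empirical P-value for over-representation of a feature among cam_genes."""
    rng = random.Random(seed)       # fixed seed keeps the sketch reproducible
    background = sorted(all_genes)  # a sequence is required for sampling
    feature = set(feature_genes)
    cam = set(cam_genes)
    observed = len(cam & feature)
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(background, len(cam))
        if sum(g in feature for g in sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


# Toy data for illustration: 200 genes, 10 of which overlap the feature.
genes = [f"g{i}" for i in range(200)]
feature = genes[:10]
p_enriched = monte_carlo_p(genes, genes[:5], feature)  # every CAM gene overlaps
p_null = monte_carlo_p(genes, genes[100:105], feature)  # no CAM gene overlaps
print(p_enriched, p_null)
```

A region-based variant that samples chromosomal intervals matched for length and gene density would be a natural refinement, at the cost of needing genome coordinates.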

CAM EXPRESSION PATTERNS

Integration of data from human expressed sequence tags (EST) derived from brain libraries and mouse brain atlas expression profiles provided strong levels of agreement that support use of this comparative approach (Supplementary Table 6). We thus analyzed CAM expression patterns and levels in 17 mouse brain regions, based on Allen Brain Atlas profiles from murine brains. For each brain region, we used the program R to plot the density curves that illustrate the frequency distributions of expression levels for (i) CAMs and (ii) all human genes expressed in this brain region (Supplementary Figure 4). For 16 of the 17 brain regions, the expression distribution curves for the two datasets merged. In these brain regions, CAM genes taken as a group appear to be expressed in ways that are not markedly different from those of other brain-expressed genes. However, in the cerebral cortex, CAM genes with the highest expression levels appear to be over-represented. There is thus an additional peak in the CAM distribution curve that is not found when all other genes are examined (Supplementary Figure 4). While much prior data documents expression of many CAMs in cerebral cortex, the specificity of the relatively richer expression of CAMs in this brain region provides a novel observation.
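The density-curve comparison described above was carried out in R. As a language-neutral illustration, the sketch below implements a plain Gaussian kernel density estimate in Python. The fixed bandwidth and the toy expression levels are assumptions made for the example, and `gaussian_kde` is a hypothetical helper rather than a wrapper around any library function. A second mode at high expression levels, like the one reported for cerebral cortex, shows up as a second peak in the estimated density.

```python
from math import exp, pi, sqrt


def gaussian_kde(values, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    if not values or bandwidth <= 0:
        raise ValueError("need at least one value and a positive bandwidth")
    norm = 1.0 / (len(values) * bandwidth * sqrt(2.0 * pi))

    def density(x):
        return norm * sum(exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return density


# Toy expression levels for illustration: a main group plus a high-expression group.
cam_levels = [1.0, 1.2, 0.9, 1.1, 5.0, 5.1, 4.9]
density = gaussian_kde(cam_levels, bandwidth=0.3)
for x in (1.0, 3.0, 5.0):
    print(x, round(density(x), 3))
```

Evaluating the CAM density and the all-gene density on the same grid and overlaying them gives the kind of comparison shown in Supplementary Figure 4.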

CAM DISEASE ASSOCIATIONS

We assessed potential relationships between CAM variants and disease using data from OMIM, public GWAS data and our in-house datasets. These data nominate 167 human CAMs as likely to contain variants that could contribute to individual differences in vulnerability to disorders in brain and a variety of other organs (Figure 4). CAMs were identified by association and/or linkage findings in disorders of the nervous system (91 genes), immune system (30 genes), metabolism (29 genes), cardiovascular system (28 genes), skin and connective tissues (26 genes), musculoskeletal system (25 genes) and hyperplasia and/or tumors (23 genes). When assessed in relation to specific disorders or narrower classes of disorders, there were relatively large numbers of cell adhesion molecules implicated in substance dependence (49 genes), Alzheimer's disease (42 genes), tumors (21 genes), heart disease (20 genes), bipolar disorder (18 genes), autoimmune diseases (19 genes) and diabetes mellitus (17 genes). The number of CAMs whose variants are tentatively implicated in nervous system phenotypes is larger than anticipated by chance (Figure 4). The distribution of findings in other disorders is similar to that displayed by all genes, when comparing data from either OMIM or GWA datasets.
Figure 4.

Distribution of CAMs in OMIM and GWA. OMIM, GWA and/or our in-house GWA data implicate variants in at least 167 (of the 496) CAM genes in various diseases. Data from OMIM share disease distribution patterns with those from GWA studies.


DISCUSSION

‘Cell adhesion molecules’ are increasingly recognized as ‘cell adhesion receptors’, since many of their functions are not just ‘cell glue’ but rather are more consistent with roles in cell–cell and cell–matrix interactions and in molecular recognition events that transduce signals. The computational approaches that we use here to define and characterize a universe of ‘cell adhesion’ molecules provide both expected and unexpected results. These results should be assessed in light of the strengths and limitations of the approaches used here, and the strengths and limitations of the underlying datasets employed for these analyses. We also discuss details of the strengths and limitations of these data in Supplementary Text 1. We have attempted to provide as comprehensive a list of human CAM genes, annotations and ontology-based CAM knowledgebase as possible. However, it is clear that there will be rapid progress in the study of these molecules and of cell adhesion mechanisms. The OKCAM database provides means for integrating new data and updating knowledge, in ways that should facilitate a progressively better understanding of the global and specific CAM properties. As CAM genomic features, regulatory modes, expression patterns and disease associations become clearer, we hope that OKCAM will become even more comprehensive and useful.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR online.

FUNDING

National Institutes of Health Intramural Research Program (NIDA), NIH grants P50CA/DA84718; China Scholarship Council (C.Y.L.); China National High-tech 863 Programs (2006AA02A312, 2006AA02Z334); 973 Programs (2007CB946904). Funding for open access charge: NIDA/IRP grants P50CA/DA84718.

Conflict of interest statement. None declared.