Chuan-Yun Li, Qing-Rong Liu, Ping-Wu Zhang, Xiao-Mo Li, Liping Wei, George R Uhl.
Abstract
'Cell adhesion molecules' (CAMs) are essential elements of cell/cell communication that are important for proper development and plasticity of a variety of organs and tissues. In the brain, appropriate assembly and tuning of neuronal connections is likely to require appropriate function of many cell adhesion processes. Genetic studies have linked and/or associated CAM variants with psychiatric, neurologic, neoplastic, immunologic and developmental phenotypes. However, despite increasing recognition of their functional and pathological significance, no systematic study has enumerated CAMs or documented their global features. We now report compilation of 496 human CAM genes in six gene families based on manual curation of protein domain structures, Gene Ontology annotations, and 1487 NCBI Entrez annotations. We map these genes onto a cell adhesion molecule ontology that contains 850 terms, up to seven levels of depth and provides a hierarchical description of these molecules and their functions. We develop OKCAM, a CAM knowledgebase that provides ready access to these data and ontologic system at http://okcam.cbi.pku.edu.cn. We identify global CAM properties that include: (i) functional enrichment, (ii) over-represented regulation modes and expression patterns and (iii) relationships to human Mendelian and complex diseases, and discuss the strengths and limitations of these data.
Year: 2008 PMID: 18790807 PMCID: PMC2686464 DOI: 10.1093/nar/gkn568
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Collection of Human CAMs. CAMs were compiled by integrating Gene Ontology annotations, domain structure information and keyword queries against NCBI Entrez Gene annotations. Four hundred and ninety-six unique human genes were identified as CAMs (additional genes that may also function in this way are identified in the supplement).
Annotations for CAM genes
| Description | Evidence entry no. | Annotated gene no. | Annotation coverage (%) | Reference |
|---|---|---|---|---|
| CAM gene families and basic information | ||||
| NCBI Entrez Gene annotations | 496 | 496 | 100.0 | ( |
| PubMed entries | 18 371 | 490 | 98.8 | ( |
| Gene Ontology annotations | 4728 | 478 | 96.4 | ( |
| Protein interactions in BIND | 230 | 77 | 15.5 | ( |
| Protein interactions in HPRD | 2225 | 218 | 44.0 | ( |
| Protein interactions in BioGRID | 2566 | 199 | 40.1 | ( |
| Enriched KEGG pathways | 17 | 285 | 57.4 | ( |
| Enriched BioCarta pathways | 19 | 259 | 52.2 | NA |
| Enriched PID pathways (Pathway interaction database) | 16 | 396 | 79.8 | NA |
| CAM genetics | ||||
| Recombination hotspots | 714 | 252 | 50.8 | ( |
| Chromosome insertion/deletion | 493 | 159 | 32.1 | ( |
| GVD CNV | 371 | 150 | 30.2 | (27) |
| SNP in CDS regions | 3756 | 418 | 84.3 | ( |
| SNP in UTR regions | 4489 | 429 | 85.5 | (20) |
| SNP in intron regions | 236 213 | 427 | 86.1 | (20) |
| CAM expression | ||||
| UniGene expression profiles | 24 480 | 451 | 90.9 | ( |
| Allen Brain Atlas expression profiles (expressed in brain) | 6205 | 355 | 71.6 | ( |
| Allen Brain Atlas expression profiles (highly expressed in brain) | 1326 | 78 | 15.7 | (37) |
| Proteomics evidence in PRIDE | 15 331 | 277 | 55.8 | ( |
| CAM regulation | ||||
| Validated transcription factor binding sites in TRANSFAC | 125 | 21 | 4.2 | ( |
| Validated transcription factor binding sites by ChIP-chip | 189 | 140 | 28.2 | (29) |
| Putative miRNA targets in PicTar | 35 513 | 236 | 47.6 | ( |
| Validated miRNA targets in TarBase | 5 | 5 | 1.0 | ( |
| Validated miRNA targets in Argonaute | 419 | 109 | 22.0 | ( |
| Cis-NATs regulation | 221 | 219 | 44.2 | ( |
| Trans-NATs regulation | 145 | 36 | 7.2 | ( |
| Noncoding RNA loci | 167 | 95 | 19.2 | ( |
| Alternative splicing | 11 026 | 465 | 93.8 | NA |
| Experimentally validated PTMs | 3080 | 373 | 75.2 | ( |
| Putative PTMs | 15 829 | 401 | 80.8 | (36) |
| Possible CAM diseases and disorders | ||||
| OMIM (with phenotype) | 144 | 75 | 15.1 | ( |
| Vulnerable markers identified by GWA | 647 | 80 | 16.1 | ( |
| In-house GWA vulnerable markers | 121 | 64 | 12.9 | NA |
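
The "Annotation coverage (%)" column appears to be the number of annotated genes divided by the 496 compiled CAM genes. The following is a minimal sketch under that assumption; the function name `annotation_coverage` and the constant `TOTAL_CAM_GENES` are hypothetical and not part of OKCAM, and the gene counts are copied from three rows of the table.

```python
# Assumption: coverage = annotated genes / total compiled CAM genes, as a percentage.
TOTAL_CAM_GENES = 496  # number of human CAM genes reported in the abstract


def annotation_coverage(annotated_genes: int, total: int = TOTAL_CAM_GENES) -> float:
    """Return the percentage of CAM genes with at least one annotation, to one decimal."""
    if not 0 <= annotated_genes <= total:
        raise ValueError("annotated_genes must lie between 0 and total")
    return round(100.0 * annotated_genes / total, 1)


# Gene counts taken from the "Annotated gene no." column of the table.
rows = {
    "PubMed entries": 490,
    "Protein interactions in HPRD": 218,
    "OMIM (with phenotype)": 75,
}

for description, gene_count in rows.items():
    print(f"{description}: {annotation_coverage(gene_count)}%")
```

For these three rows the computed values agree with the printed coverage. Not every row of the table need match this ratio exactly, so the sketch is best read as an approximation of how the column was derived.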
Figure 2. Structure of CAMO. CAMO provides a hierarchical description of functions and properties of CAMs with five top-level categories (A): CAM expression (B), CAM diseases (C), CAM genetics (D), CAM gene families (E) and CAM regulation (F). Each top-level term is further divided into several categories that allow more detailed functional descriptions.
Figure 3. Structure of OKCAM Web Server. Several interactive browsing options were implemented to facilitate user queries of OKCAM. These include ontology overview (A), full gene list overview (B), chromosomal overview (C), and text search and BLAST search (both D). Each interactive browsing interface returns the CAM genes or gene lists that meet the query requirements (F). Users can then obtain the detailed annotations described above by clicking on gene names (G). A download page makes all data, the database schema and PostgreSQL commands available (E).
Figure 4. Distribution of CAMs in OMIM and GWA. OMIM, GWA and/or our in-house GWA data implicate variants in at least 167 (of the 496) CAM genes in various diseases. Data from OMIM share disease distribution patterns with those from GWA studies.