Aron Marchler-Bauer, Yu Bo, Lianyi Han, Jane He, Christopher J Lanczycki, Shennan Lu, Farideh Chitsaz, Myra K Derbyshire, Renata C Geer, Noreen R Gonzales, Marc Gwadz, David I Hurwitz, Fu Lu, Gabriele H Marchler, James S Song, Narmada Thanki, Zhouxi Wang, Roxanne A Yamashita, Dachuan Zhang, Chanjuan Zheng, Lewis Y Geer, Stephen H Bryant.
Abstract
NCBI's Conserved Domain Database (CDD) aims at annotating biomolecular sequences with the location of evolutionarily conserved protein domain footprints, and functional sites inferred from such footprints. An archive of pre-computed domain annotation is maintained for proteins tracked by NCBI's Entrez database, and live search services are offered as well. CDD curation staff supplements a comprehensive collection of protein domain and protein family models, which have been imported from external providers, with representations of selected domain families that are curated in-house and organized into hierarchical classifications of functionally distinct families and sub-families. CDD also supports comparative analyses of protein families via conserved domain architectures, and a recent curation effort focuses on providing functional characterizations of distinct subfamily architectures using SPARCLE: Subfamily Protein Architecture Labeling Engine. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Year: 2016 PMID: 27899674 PMCID: PMC5210587 DOI: 10.1093/nar/gkw1129
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
URLs and other resources associated with the CDD project
| URL | Description |
|---|---|
| | CD-Search interface utilizing the RPS-BLAST algorithm and the model database, and to the CDART database of pre-computed domain annotation |
| | BATCH CD-Search interface utilizing the RPS-BLAST algorithm and the model database, and to the CDART database of pre-computed domain annotation. Up to 4,000 protein queries may be submitted per request |
| | Entrez interface to CDD |
| https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml | CDD project home page |
| | CDART domain architecture viewer |
| | CDD FTP site, see README file for content |
| | Domain hierarchy editor/viewer and protein structure/alignment viewer |
| | RPS-BLAST stand-alone tool for searching databases of profile models, part of the NCBI toolkit distribution |
| | Entrez interface to SPARCLE (Subfamily Protein Architecture Labeling Engine) |
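The Batch CD-Search entry above states a limit of 4,000 protein queries per request. A minimal sketch of how a client might split a longer list of query identifiers into request-sized batches follows. It makes no network calls; the function name `chunk_queries` and the placeholder accessions are hypothetical and not part of any NCBI API.

```python
from typing import Iterator, List

# Limit taken from the resource table: up to 4,000 protein queries per request.
MAX_QUERIES_PER_REQUEST = 4000


def chunk_queries(
    queries: List[str], limit: int = MAX_QUERIES_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `queries`, each holding at most `limit` items.

    Hypothetical helper for illustration only; submitting each batch to the
    service is left to the caller.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    for start in range(0, len(queries), limit):
        yield queries[start:start + limit]


# Placeholder identifiers, not real accessions.
example_queries = [f"QUERY_{i}" for i in range(9000)]
batches = list(chunk_queries(example_queries))
batch_sizes = [len(batch) for batch in batches]
```

Each batch can then be submitted as a separate request, so that no single submission exceeds the stated limit.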
Figure 1. CD-Search reporting pre-computed domain annotation for the protein with GenBank accession KUG45846, a hypothetical protein from Pseudomonas savastanoi pv. fraxini. The section circled in red provides the functional label that has been assigned to the subfamily domain architecture characterized by the string ‘cd00714 cd05008 cd05009’, which is shared by over 70 000 sequences in Entrez/protein.
Figure 2. Subfamily domain architecture summary page. The summary pages include a browser that provides options for retrieving sub-sets of the sequences sharing the same subfamily architecture, such as those from particular sources, a particular organism, or those that are linked to papers in PubMed.
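The captions describe a subfamily architecture as a string of domain-model accessions and mention retrieving sub-sets of the sequences that share it. The sketch below illustrates that idea on in-memory records only and is not a description of how SPARCLE is implemented. It assumes the string is the space-joined, ordered list of domain accessions. The class and function names are hypothetical, and every record except KUG45846 is made-up placeholder data.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ProteinRecord:
    accession: str
    organism: str
    domains: Tuple[str, ...]  # assumed ordering: as listed in the architecture string


def architecture_string(domains: Iterable[str]) -> str:
    """Join domain-model accessions into a key such as 'cd00714 cd05008 cd05009'."""
    return " ".join(domains)


def group_by_architecture(
    records: Iterable[ProteinRecord],
) -> Dict[str, List[ProteinRecord]]:
    """Map each architecture string to the records that share it."""
    groups: Dict[str, List[ProteinRecord]] = defaultdict(list)
    for record in records:
        groups[architecture_string(record.domains)].append(record)
    return dict(groups)


def filter_by_organism(
    records: Iterable[ProteinRecord], organism: str
) -> List[ProteinRecord]:
    """Keep only the records annotated with the given organism name."""
    return [record for record in records if record.organism == organism]


records = [
    # Accession and architecture taken from the Figure 1 caption.
    ProteinRecord("KUG45846", "Pseudomonas savastanoi pv. fraxini",
                  ("cd00714", "cd05008", "cd05009")),
    # Hypothetical placeholder records for illustration.
    ProteinRecord("PLACEHOLDER_1", "Example organism A",
                  ("cd00714", "cd05008", "cd05009")),
    ProteinRecord("PLACEHOLDER_2", "Example organism A", ("cd00714",)),
]

groups = group_by_architecture(records)
shared = groups["cd00714 cd05008 cd05009"]
subset = filter_by_organism(shared, "Pseudomonas savastanoi pv. fraxini")
```

Using the architecture string as a dictionary key keeps the grouping order-sensitive: two proteins with the same domains in a different order fall into different groups under this assumption.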