Corin Yeats, Jonathan Lees, Phil Carter, Ian Sillitoe, Christine Orengo.
Abstract
The Gene3D structural domain database provides domain annotations for 7 million proteins, based on the manually curated structural domain superfamilies in CATH. These annotations are integrated with functional, genomic and molecular information from external resources, such as GO, EC, UniProt and the NCBI Taxonomy database. We have constructed a set of web services that provide programmatic access to this integrated database, as well as the Gene3D domain recognition tool (Gene3DScan) and protein sequence annotation pipeline for analysing novel protein sequences. Example queries include retrieving all curated GO terms for a domain superfamily or all the multi-domain architectures for the human genome. The services can be accessed using simple HTTP calls and are able to return results in a range of formats for quick downloading and easy parsing, graphical rendering and data storage. Hence, they provide a simple, but flexible means of integrating domain annotations and associated data sets into locally run pipelines and analysis software. The services can be found at http://gene3d.biochem.ucl.ac.uk/WebServices/.
Year: 2011 PMID: 21646335 PMCID: PMC3125800 DOI: 10.1093/nar/gkr438
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. (A) Graphical output from the Gene3DScanSvc, generated using the Pfam domain drawing JavaScript library. In this example, the domain architecture of FtsA from Thermotoga maritima has been calculated to consist of four domains, including two discontinuous domains from the same superfamily, as indicated by the colours. Discontinuous domains are identified by jagged internal edges and linking black lines. Information about displayed elements is shown in pop-up boxes activated by rolling the mouse over the domain of interest. (B) Part of the corresponding CSV format response. E-values and boundaries produced by HMMER and DomainFinder are reported.
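
As a rough illustration of how a CSV response of this kind might be consumed locally, the sketch below groups boundary rows by domain and flags discontinuous domains (those with more than one segment). The column names (`domain_id`, `superfamily`, `start`, `end`, `evalue`) and the sample rows are invented for illustration; they are not the actual Gene3DScan output format, which should be checked against the on-line documentation.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in a HYPOTHETICAL layout; not real Gene3DScan output.
SAMPLE_CSV = """domain_id,superfamily,start,end,evalue
d1,sf_A,1,80,1e-20
d1,sf_A,160,190,1e-20
d2,sf_B,81,159,3e-12
"""


def read_domain_segments(csv_text):
    """Return {domain_id: sorted list of (start, end) segments}."""
    segments = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        segments[row["domain_id"]].append((int(row["start"]), int(row["end"])))
    return {dom: sorted(segs) for dom, segs in segments.items()}


def discontinuous_domains(domains):
    """List the domains built from more than one sequence segment."""
    return [dom for dom, segs in domains.items() if len(segs) > 1]


domains = read_domain_segments(SAMPLE_CSV)
flagged = discontinuous_domains(domains)
```

Keeping the segments as sorted `(start, end)` tuples makes it easy to pass them on to a drawing routine or an overlap check.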
Complete list of current Gene3D web services, their root URIs and a brief description of the services
| Service name | URI | Description |
|---|---|---|
| Gene3DScan | | |
| Synchronous | /SuperfamilyScan | Scan FASTA for structural domains (<1000 sequences). |
| Asynchronous | /async | Scan large FASTA for structural domains (<2.5MB). |
| Computational services | | |
| Coiled-coils | | Simple interface to Marcoils. |
| Transmembrane helices | /tmhmm | Simple interface to TMHMM v2. |
| Disordered regions | /anchor | Simple interface to IUPred. |
| MetaMotif | /metamotif | Unified front end to the computational services. |
| Data services | | |
| CATH Superfamily Descriptions | /CathFamilyDescriptions | Get descriptions for CATH superfamily codes. |
| CATH Superfamily Members | /CathFamilyMembers | Get the UniProt members for a CATH superfamily. |
| CATH structural domains mapped to UniProt | /CathToUniprotMap | Get the location of CATH (PDB) domains in protein sequences. |
| CATH-Gene3D phylogenetic profiles | /GenomeProfiles | Get the distribution of superfamilies for approximately 2000 genomes. |
| Detailed domain assignments for complete genomes | /GenomeAssignments | Get detailed domain assignments for complete genomes in Gene3D. |
| Domain assignments and protein architectures | /DomainArchitectures | Get domain assignments for individual proteins and large-scale collections. |
| Enzyme Commission Code Assignments | /EnzymeCodes | Get EC codes associated with superfamilies. |
| Functional residues | /FunctionalResidues | Get functional residues (e.g. active sites) that overlap with domains. |
| GO functional annotations | /GoFunctions | Get GO function terms associated with superfamilies. |
| Pfam families with no structural representatives | /PfamNsr | Get the Pfam family annotations that do not overlap with a Gene3D domain. |
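
The two Gene3DScan entries in the table differ in their input limits, so a client may want to pick the endpoint before submitting. The following is a minimal sketch of that choice, assuming the sequence-count limit is the only constraint on the synchronous service and that "2.5MB" means 2,500,000 bytes; both are assumptions, and the function names are hypothetical.

```python
SYNC_MAX_SEQUENCES = 1000      # table: synchronous scan, <1000 sequences
ASYNC_MAX_BYTES = 2_500_000    # table: asynchronous scan, <2.5MB (decimal MB assumed)


def count_fasta_records(fasta_text):
    """Count FASTA records by their '>' header lines."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def choose_scan_path(fasta_text):
    """Pick the Gene3DScan root URI from the table for a given FASTA input."""
    n_records = count_fasta_records(fasta_text)
    if n_records == 0:
        raise ValueError("no FASTA records found")
    if n_records < SYNC_MAX_SEQUENCES:
        return "/SuperfamilyScan"
    if len(fasta_text.encode("utf-8")) < ASYNC_MAX_BYTES:
        return "/async"
    raise ValueError("input exceeds the asynchronous size limit; split the file")


small_input = ">seq1\nMKTAYIAK\n>seq2\nMKVLAAGI\n"
large_input = "".join(f">s{i}\nMK\n" for i in range(1000))
small_path = choose_scan_path(small_input)
large_path = choose_scan_path(large_input)
```

Inputs that exceed the asynchronous limit are rejected rather than silently truncated, leaving it to the caller to split the file.
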
The services divide into three sets: Gene3DScan, external computational services and data access services. Examples can be found in the on-line documentation for all services, along with a complete list of paths for the data services at http://gene3d.biochem.ucl.ac.uk/Gene3DDataServices/rest/service_paths.html.
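
To show how the root URIs in the table could be turned into request URLs for a local pipeline, here is a minimal sketch. The base URL is an assumption, obtained by dropping `service_paths.html` from the documentation address above; whether each root URI resolves directly beneath it, and which sub-paths and parameters each service accepts, should be confirmed in the on-line documentation. No request is sent.

```python
from urllib.parse import urljoin

# ASSUMED base, inferred from the documentation URL in the text.
BASE_URL = "http://gene3d.biochem.ucl.ac.uk/Gene3DDataServices/rest/"

# Data-service root URIs and descriptions, copied from the table.
DATA_SERVICE_ROOTS = {
    "/CathFamilyDescriptions": "Descriptions for CATH superfamily codes",
    "/CathFamilyMembers": "UniProt members for a CATH superfamily",
    "/CathToUniprotMap": "Location of CATH (PDB) domains in protein sequences",
    "/GenomeProfiles": "Distribution of superfamilies across genomes",
    "/GenomeAssignments": "Detailed domain assignments for complete genomes",
    "/DomainArchitectures": "Domain assignments and protein architectures",
    "/EnzymeCodes": "EC codes associated with superfamilies",
    "/FunctionalResidues": "Functional residues that overlap with domains",
    "/GoFunctions": "GO function terms associated with superfamilies",
    "/PfamNsr": "Pfam annotations not overlapping a Gene3D domain",
}


def data_service_url(root, base_url=BASE_URL):
    """Join a known data-service root URI onto the (assumed) base URL."""
    if root not in DATA_SERVICE_ROOTS:
        raise ValueError(f"unknown data service root: {root}")
    return urljoin(base_url, root.lstrip("/"))


go_url = data_service_url("/GoFunctions")
```

Validating the root against the table before building the URL catches typos locally instead of as an HTTP error from the server.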