Su Datt Lam1, Natalie L Dawson1, Sayoni Das1, Ian Sillitoe1, Paul Ashford1, David Lee1, Sonja Lehtinen2, Christine A Orengo1, Jonathan G Lees3.
Abstract
Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of domain annotations of Ensembl and UniProtKB protein sequences. Domains are predicted using a library of profile HMMs representing 2737 CATH superfamilies. Gene3D has previously featured in the Database issue of NAR and here we report updates to the website and database. The current Gene3D (v14) release has expanded its domain assignments to ∼20,000 cellular genomes and over 43 million unique protein sequences, more than doubling the number of protein sequences since our last publication. Amongst other updates, we have improved our Functional Family annotation method. We have also improved the quality and coverage of our 3D homology modelling pipeline of predicted CATH domains. Additionally, the structural models have been expanded to include an extra model organism (Drosophila melanogaster). We also document a number of additional visualization tools in the Gene3D website.
Year: 2015 PMID: 26578585 PMCID: PMC4702871 DOI: 10.1093/nar/gkv1231
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. (A) Mutation data displayed for NOD2_HUMAN and summarized for each domain. (B) Clicking on a domain provides a link to the domain page, where structural models and sequence alignments of the domain are available. Various data, including which mutations affect interactions, are displayed, and clicking on one of these mutations shows the residue in ‘space-fill’. In this case we highlight a mutation in the N-terminal domain of NOD2 which decreases its protein interaction strength with ATG16L1. The alignment is built from the FunFam seed alignment and the query domain is added using the MAFFT ‘add_sequences’ function.
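As a rough illustration of the alignment step described in the Figure 1 caption, the sketch below builds a MAFFT command line that adds a query sequence to an existing seed alignment. It assumes that the ‘add_sequences’ function corresponds to MAFFT's `--add` option; the file names and the helper `build_mafft_add_command` are hypothetical, and the exact options used by the Gene3D pipeline are not stated in the source.

```python
from typing import List


def build_mafft_add_command(
    seed_alignment: str, query_fasta: str, keep_length: bool = True
) -> List[str]:
    """Build an argument list for adding sequences to an existing alignment.

    Assumption: the 'add_sequences' function maps to MAFFT's --add option.
    File names passed in are placeholders chosen by the caller.
    """
    cmd = ["mafft", "--add", query_fasta]
    if keep_length:
        # --keeplength keeps the columns of the existing alignment unchanged,
        # so the seed alignment is not re-gapped by the added sequence.
        cmd.append("--keeplength")
    cmd.append(seed_alignment)
    return cmd


# Hypothetical file names, for illustration only.
command = build_mafft_add_command("funfam_seed.aln.fasta", "query_domain.fasta")
print(" ".join(command))
```

MAFFT writes the resulting alignment to standard output, so the list can be passed to `subprocess.run(command, stdout=handle, check=True)` with `handle` opened on the desired output file.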
Figure 2. Domain family tree view of the domain superfamily 2.10.10.10 in the pan-Ensembl (compara) taxonomic tree. A red border indicates that the taxonomic level has at least one gene assigned to the query domain family. Nodes filled in blue indicate that the child taxonomic levels are hidden; clicking on these nodes expands them to show extra species.
Figure 3. Example highly connected domains from the Domain-protein interaction network view. (A) A4_HUMAN. The large circular node (with a representative domain superfamily image inside) shows the domain with the most overlapping interaction annotations (from the A4_HUMAN protein), and the label shows its superfamily code and region on the full protein sequence. The small grey nodes show proteins with which this domain of A4_HUMAN is likely to mediate interactions (NB: in this image the interactions are filtered to include only those where more than 50% of the sub-region annotation is covered by the domain). The width of the edge indicates the proportion of the sub-region annotation that is covered by the domain. Blue links indicate cases where a mutated residue in the domain has been shown to affect the interaction. (B) Interactions for ABL1_HUMAN, zoomed in on the SH3 domain of ABL1_HUMAN. The networks are built using cytoscape.js.
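The filtering and edge-width rule described in the Figure 3 caption can be sketched as a simple interval-overlap calculation. This is a minimal illustration under stated assumptions, not the Gene3D implementation: the `Region` type, the function names, the partner labels and all coordinates are hypothetical, and residue ranges are assumed to be 1-based and inclusive.

```python
from typing import Dict, NamedTuple


class Region(NamedTuple):
    """A residue range on the full protein sequence (assumed 1-based, inclusive)."""
    start: int
    end: int


def coverage_fraction(domain: Region, annotation: Region) -> float:
    """Proportion of the sub-region annotation that lies inside the domain."""
    overlap = min(domain.end, annotation.end) - max(domain.start, annotation.start) + 1
    annotation_length = annotation.end - annotation.start + 1
    return max(overlap, 0) / annotation_length


def filter_interactions(
    domain: Region, annotations: Dict[str, Region], threshold: float = 0.5
) -> Dict[str, float]:
    """Keep partners whose annotated sub-region is more than `threshold` covered.

    The returned fraction is the quantity the caption maps to edge width.
    """
    kept = {}
    for partner, region in annotations.items():
        fraction = coverage_fraction(domain, region)
        if fraction > threshold:
            kept[partner] = fraction
    return kept


# Made-up coordinates and partner names, for illustration only.
domain = Region(100, 200)
annotations = {
    "partner_1": Region(150, 250),  # roughly half covered by the domain
    "partner_2": Region(120, 180),  # fully inside the domain
    "partner_3": Region(190, 300),  # mostly outside the domain
}
kept = filter_interactions(domain, annotations)
print(sorted(kept))
```

Using a strict greater-than comparison follows the caption's wording ("more than 50%"); an annotation covered exactly half by the domain would be excluded.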