Ian Sillitoe, Antonina Andreeva, Tom L Blundell, Daniel W A Buchan, Robert D Finn, Julian Gough, David Jones, Lawrence A Kelley, Typhaine Paysan-Lafosse, Su Datt Lam, Alexey G Murzin, Arun Prasad Pandurangan, Gustavo A Salazar, Marcin J Skwark, Michael J E Sternberg, Sameer Velankar, Christine Orengo.
Abstract
Genome3D (https://www.genome3d.eu) is a freely available resource that provides consensus structural annotations for representative protein sequences taken from a selection of model organisms. Since the last NAR update in 2015, the method of data submission has been overhauled, with annotations now being 'pushed' to the database via an API. As a result, contributing groups are now able to manage their own structural annotations, making the resource more flexible and maintainable. The new submission protocol brings a number of additional benefits, including instant validation of data and no requirement to synchronise releases between resources. It also makes it possible to submit these structural annotations as an automated part of existing internal workflows. In turn, these improvements make it easier to open Genome3D up to new prediction algorithms and groups. For the latest release of Genome3D (v2.1), the underlying dataset of sequences used as prediction targets has been updated using the latest reference proteomes available in UniProtKB. Several new reference proteomes of particular interest to the wider scientific community have also been added: cow, pig, wheat and Mycobacterium tuberculosis. These additions, along with improvements to the underlying predictions from contributing resources, have ensured that the number of annotations in Genome3D has nearly doubled since the last NAR update article. The new API has also been used to facilitate the dissemination of Genome3D data into InterPro, thereby widening the visibility of both the annotation data and annotation algorithms.
Year: 2020 PMID: 31733063 PMCID: PMC7139969 DOI: 10.1093/nar/gkz967
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Summary of the resources currently contributing to Genome3D. Resources either contribute structural predictions (domain annotations and/or 3D models) or a domain classification scheme
| Resource | Principal Investigator | Contribution | Classification source |
|---|---|---|---|
| DomSerf | Jones | Prediction (3D Models) | CATH |
| FUGUE | Blundell | Prediction (Domains) | CATH + SCOP |
| Gene3D | Orengo | Prediction (3D Models + Domains) | CATH |
| pDomTHREADER | Jones | Prediction (Domains) | CATH |
| PHYRE2 | Sternberg / Kelley | Prediction (3D Models + Domains) | SCOP + PDB |
| SUPERFAMILY | Gough | Prediction (3D Models + Domains) | SCOP |
| VIVACE | Blundell | Prediction (3D Models) | CATH + SCOP |
| CATH | Orengo | Classification | - |
| SCOP | Murzin | Classification | - |
Figure 1. Screenshot showing a typical Genome3D webpage for the gene ABL1_HUMAN. Annotations are grouped into predicted domains (coloured according to the evolutionary relationships within CATH and SCOP) or predicted 3D structures if a group has provided 3D coordinates.
Figure 2. A summary of the increase in target sequences and the number of annotations (split into domain annotations and 3D models) when comparing the latest version of Genome3D (v2.1) against the Genome3D release from the previous NAR update article (v1.0, 2015).
Figure 3. An overview of the workflow for data entry in the new Genome3D API. Key advantages over the previous mechanism include: removing inefficient, time-consuming and error-prone manual processing steps; giving resources the flexibility to submit annotations on their own schedule; allowing annotations to be integrated into existing pipelines; and reducing the work involved in adding new annotation groups and algorithms.
Description of Genome3D databases and releases
| Release | Permission | Domain | Purpose |
|---|---|---|---|
| LATEST | Read only | | Most recent static release |
| HEAD | Read / Write | | Manage annotations for upcoming release |
| DAILY | Read / Write | | Testing area (refreshed daily from HEAD) |
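The release layout in the table above can be modelled in a few lines of Python. This is only a minimal sketch for illustration: the permissions and purposes are copied from the table, but the names `Release`, `ReleaseInfo`, `choose_target` and `check_writable` are hypothetical and are not part of the Genome3D API. The policy of sending trial submissions to DAILY and real ones to HEAD is an assumption inferred from the "Purpose" column.

```python
from dataclasses import dataclass
from enum import Enum


class Release(Enum):
    """The three Genome3D databases listed in the table."""
    LATEST = "LATEST"
    HEAD = "HEAD"
    DAILY = "DAILY"


@dataclass(frozen=True)
class ReleaseInfo:
    writable: bool
    purpose: str


# Permissions and purposes copied from the table above.
RELEASES = {
    Release.LATEST: ReleaseInfo(False, "Most recent static release"),
    Release.HEAD: ReleaseInfo(True, "Manage annotations for upcoming release"),
    Release.DAILY: ReleaseInfo(True, "Testing area (refreshed daily from HEAD)"),
}


def choose_target(is_test_run: bool) -> Release:
    """Hypothetical policy: trial submissions go to DAILY, real ones to HEAD."""
    return Release.DAILY if is_test_run else Release.HEAD


def check_writable(release: Release) -> None:
    """Raise PermissionError if the release does not accept submissions."""
    if not RELEASES[release].writable:
        raise PermissionError(f"{release.value} is read only")
```

A contributing group's pipeline might call `check_writable(choose_target(is_test_run=True))` before attempting a submission, so that a misconfigured job pointed at LATEST fails early rather than at the server.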
Figure 4. (A) Genome3D predictions table for IPR035074. Ten domains have been found for two UniProt accessions. (B) D2TKH3 protein page in InterPro. The sequence viewer shows Genome3D domains and 3D structure predictions alongside integrated InterPro entries and unintegrated InterPro member databases.