Emilio Potenza, Tomás Di Domenico, Ian Walsh, Silvio C E Tosatto.
Abstract
MobiDB (http://mobidb.bio.unipd.it/) is a database of intrinsically disordered and mobile proteins. Intrinsically disordered regions are key for the function of numerous proteins. Here we provide a new version of MobiDB, a centralized source aimed at providing the most complete picture of the different flavors of disorder in protein structures, covering all UniProt sequences (currently over 80 million). The database features three levels of annotation: manually curated, indirect and predicted. Manually curated data is extracted from the DisProt database. Indirect data is inferred from PDB structures that are considered an indication of intrinsic disorder. The 10 predictors currently included (three ESpritz flavors, two IUPred flavors, two DisEMBL flavors, GlobPlot, VSL2b and JRONN) enable MobiDB to provide disorder annotations for every protein in the absence of more reliable data. The new version also features a consensus annotation and classification for long disordered regions. To complement the disorder annotations, MobiDB features additional annotations from external sources. Annotations from the UniProt database include post-translational modifications and linear motifs. Pfam annotations are displayed in graphical form and are link-enabled, allowing the user to visit the corresponding Pfam page for further information. Experimental protein-protein interactions from STRING are also classified for disorder content.
Year: 2014 PMID: 25361972 PMCID: PMC4384034 DOI: 10.1093/nar/gku982
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 19.160
Disorder sources consensus definition matrix
| DisProt | PDB | Predictors | Consensus |
|---|---|---|---|
| Disorder | Structure | Ambiguous | |
| Disorder | Ambiguous | Ambiguous | |
| Structure | Disorder | Ambiguous | |
| Structure | Ambiguous | Ambiguous | |
| Ambiguous | Ambiguous | | |
| Ambiguous | Ambiguous | | |
Each possible annotation scenario is listed for the three data sources (DisProt, PDB, predictors) together with its consensus annotation. Ambiguous is used for residues with conflicting annotations warranting further investigation, which may be due to folding upon binding events. LC means low confidence. Combinations yielding structure as consensus are underlined and those for disorder are shown in bold. Sources which are not contributing to the consensus are shown in italics.
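As a rough illustration of how a source-priority consensus of this kind could be computed per residue, here is a minimal Python sketch. It is not the published matrix: it assumes only what the text states, namely that experimental sources (DisProt, PDB) take precedence over predictors, that conflicting experimental annotations yield an ambiguous consensus, and that predictors are used when no more reliable data exist. The function name `consensus`, the label strings and the use of `None` for "no annotation" are hypothetical, and the low-confidence (LC) case is not modeled.

```python
from typing import Optional

# Hypothetical label values; the real database may encode states differently.
DISORDER = "disorder"
STRUCTURE = "structure"
AMBIGUOUS = "ambiguous"


def consensus(disprot: Optional[str],
              pdb: Optional[str],
              predictors: Optional[str]) -> Optional[str]:
    """Return an illustrative per-residue consensus annotation.

    Each argument is one of the labels above, or None when that
    source has no annotation for the residue.
    """
    # Collect the experimental sources that actually annotate the residue.
    experimental = [s for s in (disprot, pdb) if s is not None]

    if experimental:
        # Any ambiguous or conflicting experimental evidence is flagged
        # for further investigation (assumption based on the caption).
        if AMBIGUOUS in experimental or len(set(experimental)) > 1:
            return AMBIGUOUS
        # All experimental sources agree; predictors do not contribute.
        return experimental[0]

    # No experimental data: fall back to the predictor annotation.
    return predictors
```

For example, `consensus(DISORDER, STRUCTURE, DISORDER)` returns `"ambiguous"` under these assumptions, because the two experimental sources disagree and the predictor annotation is ignored.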
Figure 1. Search results page. In this example, the keyword ‘P53’ is searched in organism ‘human’ and the first 20 results (out of 262) are shown. Long disorder (% LD) coloring is as follows: none (white), low (green), medium (yellow), high (red) and full (black, not shown). Default sorting is by UniProt results, but can be changed by clicking on % LD or length.
Figure 2. Sequence annotations for alpha-synuclein (UniProt entry: P37840). (a) Overview of disorder annotations combining DisProt, NMR, X-ray and predictors. The highlighted red circle shows the experimentally determined and predicted long disordered region. Other information includes secondary structure and Pfam domains. Each of these annotations can be downloaded by clicking on the corresponding green button on the top right side of the page. (b) Detailed disorder annotation showing experimental (DisProt, NMR and X-ray) and predicted disorder (10 predictors). For each entry, it is possible to view the detailed sequence annotation by clicking on the green magnifying glass icon (see red circle and left inset). Where available, the 3D structure can be visualized to inspect interesting protein regions (see red circle and right inset). The red circle highlights the only known complete alpha-synuclein structure (PDB entry 2kkw). (c) Known protein–protein interactions deduced from PDB files and STRING are shown in analogy to the search results page, with color-coded long disorder percentage, length, protein name and organism. (d) Functional sequence features from UniProt, including binding sites, post-translational modifications and sequence regions.