David Sehnal, Lukáš Pravda, Radka Svobodová Vařeková, Crina-Maria Ionescu, Jaroslav Koča.
Abstract
Well-defined biomacromolecular patterns, such as binding sites, catalytic sites, and specific protein or nucleic acid sequences, precisely modulate many important biological phenomena. We introduce PatternQuery, a web-based application designed for detection and fast extraction of such patterns. The application uses a unique query language with Python-like syntax to define the patterns that will be extracted from datasets provided by the user, or from the entire Protein Data Bank (PDB). Moreover, the database-wide search can be restricted using a variety of criteria, such as PDB ID, resolution, and organism of origin, to provide only relevant data. The extraction generally takes a few seconds for several hundred entries, and up to approximately one hour for the whole PDB. The detected patterns are made available for download to enable further processing, and are also presented in a clear tabular and graphical form directly in the browser. The unique design of the language and the provided service could pave the way towards novel PDB-wide analyses, which were either difficult or unfeasible in the past. The application is available free of charge at http://ncbr.muni.cz/PatternQuery.
Year: 2015 PMID: 26013810 PMCID: PMC4489247 DOI: 10.1093/nar/gkv561
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The query recognizes the binding pocket of any residue containing a pyranose moiety in the envelope glycoprotein gp160 from Human immunodeficiency virus 1 in complex with Homo sapiens immunoglobulins (3u7y). One of the recognized patterns is highlighted in the box. (A) First, the query identifies a pyranose moiety (a ring composed of 5 carbons and an oxygen atom). (B) Then, all residues which include this pattern in their structure are identified. (C) Finally, all the residues that are at most 4 Å from any of the pyranose-containing residues are detected as well. This ensures all the potential coordination partners are recognized properly. The molecules were visualized using PyMOL.
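The three steps in Figure 1 can be illustrated with a minimal sketch in plain Python. This is not the PatternQuery language itself and not the authors' implementation: the `Atom` class, the function names, the residue labels, and the toy coordinates are all hypothetical, chosen only to show the logic of (A) finding a 5-carbon, 1-oxygen ring, (B) collecting the residues that contain it, and (C) adding every residue within 4 Å.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    idx: int
    element: str
    residue: str  # hypothetical residue label, e.g. "SUG 1"
    xyz: tuple


def find_rings(atoms, bonds, size=6):
    """Return simple cycles of `size` atoms as frozensets of atom indices."""
    adj = {a.idx: set() for a in atoms}
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)
    rings = set()

    def walk(path):
        for nxt in adj[path[-1]]:
            if nxt == path[0] and len(path) == size:
                rings.add(frozenset(path))  # cycle closed at the start atom
            elif nxt not in path and len(path) < size:
                walk(path + [nxt])

    for start in adj:
        walk([start])
    return rings


def pyranose_rings(atoms, bonds):
    """Step A: six-membered rings made of five carbons and one oxygen."""
    element = {a.idx: a.element for a in atoms}
    wanted = ["C"] * 5 + ["O"]
    return [r for r in find_rings(atoms, bonds)
            if sorted(element[i] for i in r) == wanted]


def residues_containing(atoms, rings):
    """Step B: residues that own at least one atom of a matched ring."""
    by_idx = {a.idx: a for a in atoms}
    return {by_idx[i].residue for ring in rings for i in ring}


def ambient_residues(atoms, core, cutoff=4.0):
    """Step C: core residues plus any residue with an atom within `cutoff` Å."""
    core_atoms = [a for a in atoms if a.residue in core]
    near = set(core)
    for a in atoms:
        if any(math.dist(a.xyz, c.xyz) <= cutoff for c in core_atoms):
            near.add(a.residue)
    return near


# Toy data (invented): a planar six-membered ring plus two other residues.
atoms = [
    Atom(k, "O" if k == 0 else "C", "SUG 1",
         (1.4 * math.cos(math.radians(60 * k)),
          1.4 * math.sin(math.radians(60 * k)), 0.0))
    for k in range(6)
]
atoms.append(Atom(6, "N", "ASN 2", (3.0, 0.0, 0.0)))   # 1.6 Å from the ring
atoms.append(Atom(7, "C", "GLY 3", (10.0, 0.0, 0.0)))  # 8.6 Å from the ring
bonds = [(k, (k + 1) % 6) for k in range(6)]

rings = pyranose_rings(atoms, bonds)
core = residues_containing(atoms, rings)
pocket = ambient_residues(atoms, core)
print(sorted(pocket))
```

The brute-force ring search and the all-pairs distance check are adequate for a toy structure; a real PDB-scale search would need spatial indexing and a proper ring-perception method.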
Figure 2. The PatternQuery Explorer mode is tailored for querying smaller user-defined datasets (up to 100 entries) uploaded in one of the supported formats. Additionally, a subset of the PDB archive can be queried as well, based on PDB ID or a variety of metadata.