Nur Syatila Ab Ghani1, Reeki Emrizal2, Sabrina Mohamed Moffit2, Hazrina Yusof Hamdani3, Effirul Ikhwan Ramlan4, Mohd Firdaus-Raih1,2.
Abstract
The GrAfSS (Graph theoretical Applications for Substructure Searching) webserver is a platform to search for three-dimensional substructures of: (i) amino acid side chains in protein structures; and (ii) base arrangements in RNA structures. The webserver interfaces the functions of five different graph theoretical algorithms - ASSAM, SPRITE, IMAAAGINE, NASSAM and COGNAC - into a single substructure searching suite. Users will be able to identify whether a three-dimensional (3D) arrangement of interest, such as a ligand binding site or 3D motif, observed in a protein or RNA structure can be found in other structures available in the Protein Data Bank (PDB). The webserver also allows users to determine whether a protein or RNA structure of interest contains substructural arrangements that are similar to known motifs or 3D arrangements. These capabilities allow for the functional annotation of new structures that were either experimentally determined or computationally generated (such as the coordinates generated by AlphaFold2) and can provide further insights into the diversity or conservation of functional mechanisms of structures in the PDB. The computed substructural superpositions are visualized using integrated NGL viewers. The GrAfSS server is available at http://mfrlab.org/grafss/.
Year: 2022 PMID: 35639505 PMCID: PMC9252811 DOI: 10.1093/nar/gkac402
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 19.160
Figure 1. An overview of the options and flow for a GrAfSS search that begins with selecting the type of macromolecule and progresses to the different search programs and database options based on the user's intended search objectives. The different query formats and types of databases searched are presented to better illustrate the different searches that the GrAfSS webserver can execute.
Figure 2. Graph theoretical representations of amino acid side chains and RNA bases used in GrAfSS. (A) The 20 amino acids are represented by Key Start (indicated in yellow) and Key End (indicated in green) pseudo-atoms as graph nodes for the SPRITE and ASSAM algorithms; for the IMAAAGINE algorithm, a single Key pseudo-atom is used, as indicated in cyan. Pseudo-atoms where the Key Start/End and the single Key positions overlap are indicated in purple. (B) The four RNA bases are represented by pseudo-atom vectors that are also the nodes of a graph. (C) The connectivity of the bases by hydrogen bonds (dotted lines) is represented in a connection table (lower panel).
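The representation in panel (A) can be illustrated with a minimal Python sketch. It assumes that each residue is a node carrying a Key Start and a Key End pseudo-atom, and that each edge is labelled with the four distances between the pseudo-atoms of two residues. The caption does not state how edges are defined, so the edge labelling, the `ResidueNode` class, the `build_graph` function and the coordinates are all hypothetical and are not taken from the GrAfSS code.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


@dataclass(frozen=True)
class ResidueNode:
    """One graph node: a residue reduced to two pseudo-atoms (assumed layout)."""
    name: str     # residue label, e.g. "HIS57" (illustrative)
    start: tuple  # Key Start pseudo-atom (x, y, z)
    end: tuple    # Key End pseudo-atom (x, y, z)


def build_graph(nodes):
    """Return a fully connected graph as {(name_a, name_b): distances}.

    Each edge stores four distances in the order
    start-start, start-end, end-start, end-end.
    This edge definition is an assumption made for illustration.
    """
    edges = {}
    for a, b in combinations(nodes, 2):
        edges[(a.name, b.name)] = (
            dist(a.start, b.start),
            dist(a.start, b.end),
            dist(a.end, b.start),
            dist(a.end, b.end),
        )
    return edges


# Hypothetical coordinates, not taken from any PDB entry.
triad = [
    ResidueNode("HIS57", (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    ResidueNode("ASP102", (3.0, 4.0, 0.0), (3.0, 4.0, 1.0)),
    ResidueNode("SER195", (6.0, 0.0, 0.0), (6.0, 0.0, 1.0)),
]
graph = build_graph(triad)
for pair, distances in graph.items():
    print(pair, [round(d, 2) for d in distances])
```

Under this assumption, two side chain arrangements are comparable when their graphs share nodes of the same residue type whose edge distances agree within a tolerance.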
Information on the corresponding input formats, programs, example search objectives and source datasets for the databases used.
| Macromolecule | Program | Search objective | Query (format) | Data set source for search database |
|---|---|---|---|---|
| Protein | SPRITE | Search for the presence of a 3D substructure composed of amino acid side chain arrangements in a protein structure. | Protein structure coordinate file (*.pdb, *.cif) or a four-character PDB ID. | • Catalytic Site Atlas ( |
| | ASSAM | Search for protein structures having a similar 3D substructure as the query. | 3D motif or substructure composed of 3–12 residues (*.pdb). | • Non-redundant PDB datasets at 30% and 35% sequence identity excluding mutant structures; |
| | IMAAAGINE | Search for protein structures having a similar 3D substructure as the query. | Conceptual / hypothetical substructure or motif composed of 3 to 8 residues that users can define using the interface provided. | • Selected proteomes by AlphaFold |
| RNA | NASSAM | Search for the presence of a 3D substructure composed of base arrangements in a structure containing RNA chains. | Structure coordinate file (*.pdb, *.cif) containing RNA chains. | • RNA base arrangements from the Nucleic Acids Interaction Library ( |
| | COGNAC | Search for clusters of RNA bases that are interconnected by at least one hydrogen bond. | Structure coordinate file (*.pdb) containing RNA chains and base connection pattern options of 2 to 6 bases. An option to upload two files for comparisons is available. | • PDB structures containing RNA chains (with resolution of 3.5 Å or higher). |
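The COGNAC search objective in the table, clusters of RNA bases interconnected by at least one hydrogen bond, can be sketched as a connected-component search over a connection table. This is a simplified illustration and not the COGNAC algorithm itself. It assumes the hydrogen-bonded base pairs have already been detected, and the function names, the base labels and the use of the 2 to 6 base range as a size filter are assumptions.

```python
from collections import deque


def connection_table(hbond_pairs):
    """Build a symmetric connection table {base: set of H-bonded bases}."""
    table = {}
    for a, b in hbond_pairs:
        table.setdefault(a, set()).add(b)
        table.setdefault(b, set()).add(a)
    return table


def base_clusters(table, min_size=2, max_size=6):
    """Return sorted clusters of bases linked through hydrogen bonds.

    A cluster is a connected component of the connection table whose
    size lies within [min_size, max_size].
    """
    seen = set()
    clusters = []
    for base in sorted(table):
        if base in seen:
            continue
        # Breadth-first walk over hydrogen-bonded neighbours.
        component = {base}
        queue = deque([base])
        while queue:
            current = queue.popleft()
            for neighbour in table[current]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if min_size <= len(component) <= max_size:
            clusters.append(sorted(component))
    return clusters


# Hypothetical hydrogen-bonded base pairs (chain:base+number labels).
pairs = [("A:G10", "A:C25"), ("A:C25", "A:A40"), ("A:U5", "A:A60")]
print(base_clusters(connection_table(pairs)))
```

A pattern-based search would additionally compare the shape of each cluster (for example, linear versus cyclic connectivity) rather than only its size.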
Comparison of web servers that can, to some extent, be used to detect substructural similarities and 3D motifs in protein and RNA structures; features found only in the GrAfSS webserver are marked with an asterisk (*).
| | Available Comparable Webservers | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Ef-Seek ( | GrAfSS | MultiBind ( | ProFunc ( | PDBeMotif ( | ProBis ( | R3D-BLAST ( | RAG-3D ( | RASMOT-3D Pro ( | RCLICK ( | SA-Mot ( | SETTER ( | SuMo ( | WebFR3D ( |
| PDB ID | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| Protein structure coordinate file in PDB format | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | | | | |
| Protein structure coordinate file in mmCIF format* | | | | | | | | | | | | | | |
| User-defined query of conceptual amino acid arrangements / nucleic acid arrangement / interaction | | ✓ | ✓ | ✓ | ✓ | | | | | | | | | |
| Structure coordinate file containing RNA chain(s) in PDB format | | ✓ | ✓ | ✓ | ✓ | | | | | | | | | |
| Structure coordinate file containing RNA chain(s) in mmCIF format* | | | | | | | | | | | | | | |
| Representatives of the PDB | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | | | | | |
| 3D arrangements (i.e. motifs, functional site, ligand binding site) | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | | |
| AlphaFold structures at EBI* | | | | | | | | | | | | | | |
| Homologous structures (fold similarity) | ✓ | ✓ | | | | | | | | | | | | |
| Local structural similarity | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |
| Pairwise structural similarity | ✓ | ✓ | ✓ | | | | | | | | | | | |
| Catalytic sites | | ✓ | ✓ | | | | | | | | | | | |
| Ligand binding sites | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | | | | | | | |
| DNA/RNA-binding sites | | ✓ | ✓ | ✓ | | | | | | | | | | |
| Protein-protein interfaces | | ✓ | ✓ | | | | | | | | | | | |
| Various 3D motifs | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | |
| Similar 3D arrangements to known drug binding sites* | | | | | | | | | | | | | | |
| List of predicted 3D motifs / substructure ranked by structural similarity scores (i.e. RMSD) | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | | | |
| Direct molecular visualization of results - 3D motifs / substructure enabled | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Downloadable output files | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | |
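The table lists ranking of hits by structural similarity scores such as RMSD as an output feature. A minimal sketch of such a ranking follows, assuming that each hit has already been superposed onto the query and that its pseudo-atom coordinates are listed in the same order as the query's. The function names, hit identifiers and coordinates are hypothetical, and the scoring used by the servers in the table may differ.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) points that are already superposed and paired by index."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (p - q) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for p, q in zip(point_a, point_b)
    )
    return sqrt(squared / len(coords_a))


def rank_hits(query, hits):
    """Return (rmsd, hit_id) tuples sorted from most to least similar."""
    return sorted((rmsd(query, coords), hit_id) for hit_id, coords in hits.items())


# Hypothetical query and hits; the identifiers are illustrative only.
query = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
hits = {
    "hit_a": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    "hit_b": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "hit_c": [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
}
for score, hit_id in rank_hits(query, hits):
    print(f"{hit_id}: {score:.3f}")
```

A lower RMSD indicates a closer match, so the first entry of the sorted list is the hit most similar to the query.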
Figure 3. Example of a search process that a GrAfSS user can carry out to (A) explore a thematic spatial formation by providing a conceptual amino acid arrangement and searching: (B) a database of non-redundant PDB structures or (C) a database of biological assemblies from the PDB. (D) The results of the search can then be further investigated by providing the specific arrangement as a query to determine whether there are other representative structures in the PDB that also contain a similar arrangement. All search results can then be visualized using the embedded NGL viewer as presented in (B), (C) and (D).