Claudia Andreini, Antonio Rosato.
Abstract
All living organisms require metal ions for energy production and for metabolic and biosynthetic processes. Within cells, metal ions form adducts with metabolites and macromolecules (proteins and nucleic acids). Proteins that must bind one or more metal ions to carry out their physiological function are called metalloproteins. About one third of all protein structures in the Protein Data Bank involve metalloproteins. Over the past few years, the number of computational tools and techniques that use 3D structural information to support the investigation of metalloproteins has grown considerably. This trend has been boosted by the successful application of neural networks and machine/deep learning (ML/DL) approaches in molecular and structural biology at large. In this review, we discuss recent advances in the development and availability of resources dealing with metalloproteins from a structure-based perspective. We start by addressing tools for the prediction of metal-binding sites (MBSs) using structural information on apo-proteins. Then, we provide an overview of the methods for, and lessons learned from, the structural comparison of MBSs in a fold-independent manner. We then describe databases of metalloprotein/MBS structures. Finally, we summarize recent ML/DL applications that enhance the functional interpretation of metalloprotein structures.
Keywords: bioinorganic chemistry; copper; iron; metal-binding; structural biology; transition metals; zinc
Year: 2022 PMID: 35887033 PMCID: PMC9323969 DOI: 10.3390/ijms23147684
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Figure 1. Definition of a MBS according to the MetalPDB protocol. For each metal atom in a given 3D structure, the non-hydrogen non-carbon atoms at a distance smaller than 3.0 Å from the metal ion (green sphere) are identified as its donor atoms (blue atoms), i.e., the atoms that bind directly to the metal. The protein residues or small molecules that contain at least one donor atom are the metal ligands (cyan sticks), and constitute the first coordination sphere of the metal ion. The full MBS is obtained by including any other residue or chemical species with at least one atom within 5.0 Å from a metal ligand (orange sticks). Metal ligands provided by small molecules (e.g., water, ammonia, synthetic inhibitors) or ions (e.g., acetate, hydroxamate) are called exogenous ligands. This example MBS is the zinc(II)-binding site of human Schlafen 5 protein (PDB entry 7Q3Z [31]).
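The two distance cutoffs in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the MetalPDB implementation: the `Atom` record, the `metal_binding_site` function and the toy coordinates are hypothetical names and data introduced here, and the sketch assumes that atoms are already parsed into simple records and that other metal ions need no special handling.

```python
from dataclasses import dataclass
from math import dist

DONOR_CUTOFF = 3.0  # Å, metal to donor atom (strictly smaller, as in the caption)
SITE_CUTOFF = 5.0   # Å, ligand atom to any atom of a neighbouring residue


@dataclass(frozen=True)
class Atom:
    """Hypothetical minimal atom record (not a MetalPDB data structure)."""
    residue: str   # residue or molecule label, e.g. "CYS-1"
    name: str      # atom name, e.g. "SG"
    element: str   # element symbol, e.g. "S"
    xyz: tuple     # Cartesian coordinates in Å


def metal_binding_site(metal, atoms):
    """Return (donor atoms, ligand residues, second-shell residues) for one metal."""
    # Donor atoms: non-hydrogen, non-carbon atoms closer than 3.0 Å to the metal.
    donors = [a for a in atoms
              if a is not metal
              and a.element not in ("H", "C")
              and dist(a.xyz, metal.xyz) < DONOR_CUTOFF]
    # Metal ligands: residues or molecules contributing at least one donor atom.
    ligands = {a.residue for a in donors}
    ligand_atoms = [a for a in atoms if a.residue in ligands]
    # Rest of the site: residues with an atom within 5.0 Å of any ligand atom.
    neighbours = {a.residue for a in atoms
                  if a is not metal
                  and a.residue not in ligands
                  and any(dist(a.xyz, l.xyz) <= SITE_CUTOFF for l in ligand_atoms)}
    return ([(a.residue, a.name) for a in donors],
            sorted(ligands), sorted(neighbours))


# Toy coordinates, invented for illustration only.
zn = Atom("ZN-100", "ZN", "ZN", (0.0, 0.0, 0.0))
toy_atoms = [
    zn,
    Atom("CYS-1", "SG", "S", (2.3, 0.0, 0.0)),
    Atom("CYS-1", "CB", "C", (3.0, 1.0, 0.0)),
    Atom("HIS-2", "NE2", "N", (0.0, 2.1, 0.0)),
    Atom("HOH-5", "O", "O", (0.0, 0.0, 2.2)),    # exogenous ligand (water)
    Atom("ALA-3", "CA", "C", (6.0, 0.0, 0.0)),   # close to CYS-1, not to the metal
    Atom("GLY-4", "CA", "C", (20.0, 0.0, 0.0)),  # far from everything
]
donors, ligands, neighbours = metal_binding_site(zn, toy_atoms)
```

In this toy case the water molecule counts as an exogenous ligand, while ALA-3 enters the site only through its proximity to a ligand residue.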
Figure 2. The concept of the template-based detection of MBSs. Each template MBS from a suitably designed library, such as all of the MBSs from the non-redundant PDB, is docked to the query apo-structure. If the docking is successful, the position of the site and the nature of the bound metal are predicted. Often, the docking procedure is guided by first identifying candidate metal ligands in the query structure. The template MBS is the manganese(II)-binding site of the LigD phosphoesterase domain of Pseudomonas aeruginosa (PDB entry 3N9B [34]); the apo-structure is taken from PDB entry 2LJ6, which is the solution structure of the same protein in the absence of manganese(II) [35].
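The guiding step mentioned in the Figure 2 caption, finding candidate ligands whose geometry is compatible with a template, can be sketched as a distance-pattern search. This is a deliberately simplified stand-in for docking, not the procedure of any tool in this review: the function names, the 0.5 Å tolerance and the coordinates are assumptions made for illustration, and element types are ignored.

```python
from itertools import permutations
from math import dist


def pairwise(points):
    """Distances between all pairs of points, in a fixed (i < j) order."""
    return [dist(points[i], points[j])
            for i in range(len(points))
            for j in range(i + 1, len(points))]


def match_template(template_donors, candidates, tol=0.5):
    """Find sets of candidate atoms that reproduce the template donor geometry.

    template_donors: donor-atom coordinates of the template MBS.
    candidates: (label, xyz) pairs for candidate donor atoms in the apo-structure.
    tol: allowed deviation in Å per distance (an arbitrary value, not from the source).
    """
    target = pairwise(template_donors)
    hits = set()
    # Try every ordered assignment of candidates to template donors.
    for combo in permutations(candidates, len(template_donors)):
        found = pairwise([xyz for _, xyz in combo])
        if all(abs(f - t) <= tol for f, t in zip(found, target)):
            hits.add(tuple(sorted(label for label, _ in combo)))
    return sorted(hits)


# Toy data: a three-donor template and four candidate atoms (invented coordinates).
template = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
query = [("A", (10.0, 10.0, 10.0)), ("B", (13.0, 10.0, 10.0)),
         ("C", (10.0, 14.0, 10.0)), ("D", (30.0, 30.0, 30.0))]
matches = match_template(template, query)
```

A real template-based predictor would go on to superimpose the template onto each match in order to place the metal ion and score the fit; the exhaustive permutation search used here is only practical for a handful of candidates.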
Figure 3. Detection of relationships between MPs with divergent sequences. MBS-driven structural superposition permits the detection of similarities between proteins with no homology, or which have experienced extensive sequence divergence. PDB entry 2PA8 is the crystal structure of the D/L subcomplex of the RNA polymerase of Sulfolobus solfataricus [51]. PDB entry 1QLB is the crystal structure of fumarate reductase from the respiratory complex II-like of Wolinella succinogenes [52]. This example superposition was computed with the MetalS2 [53] tool.
Figure 4. Comparison of a mouse zinc(II)-binding domain with the AlphaFold prediction of the whole structure and analysis of the site with DeepCys. (A) The first model of the NMR structure of the PHDVC5HCHNSD1 tandem domain of the Nsd1 protein from Mus musculus, with one of its four zinc(II) MBSs highlighted (the metal ion is in magenta) (PDB entry 2NAA [83]). (B) AlphaFold model of the entire human homolog of this protein (UniProt entry Q96L73), with the corresponding site highlighted. The 2NAA structure was input to DeepCys to predict the likelihood that each of the three cysteines is metal-binding; this is mapped onto the model. Some parts of the model were omitted for clarity.
Summary of all of the resources mentioned in this article. The resources are listed in the same order as they are discussed in the corresponding sections. Only links active as of 15 June 2022 are included.
| Tool Name and Link | Implemented Approach | Reference |
|---|---|---|
| | | |
| | Identification of cavities with high hydrophobicity contrast | [ |
| CHED | Identification of suitable arrangement(s) of triads of the CHED residues based on the distances between candidate donor atoms | [ |
| IonCom | Integration of four structure-based predictors and a novel sequence-based predictor | [ |
| MIB | Docking MBS templates with the fragment transformation method | [ |
| ZINCCLUSTER | Detection of known structural patterns | [ |
| Predictive algorithm in the | Identification of accessible cavities whose center of mass is within 3.5 Å from the β-carbon atoms of three or more CHED residues | [ |
| BioMetAll | Identification of cavities followed by their validation against pre-defined geometric patterns of the protein backbone | [ |
| N.A. | Docking MBS templates with geometric hashing against an ensemble of 11 structural conformations for the query protein, generated with coarse-grained molecular mechanics | [ |
| | | |
| Zincbindpredict | Application of a portfolio of predictive models, each optimized to detect a specific type of zinc-binding site. Each type corresponds to a different zinc-binding pattern. | [ |
| | Prediction of positions where metal ligands can be introduced, based on protein backbone coordinates, to design artificial MPs | [ |
| | | |
| MetalS2 | Pairwise metal-centered superposition of MBSs based on a combination of sequence and structural similarity | [ |
| MetalS3 | A web server using an optimized version of MetalS2 to search the MetalPDB database for MBSs structurally similar to the query | [ |
| mFASD | A structure-based algorithm to predict which metal populates a MBS based on systematic comparison against a template library | [ |
| MeCOM | Pairwise superposition of MBSs based on a combination of site features and the position of the Cα atoms | [ |
| TopMatch + Sahle | Scoring of pairwise structural superpositions computed by the TopMatch tool, which ignores metal ions, with the sahle function to detect alignments having a good overlap of the MBSs | [ |
| | | |
| MetalPDB | MetalPDB collects structural information on all the MBSs present in the Protein Data Bank | [ |
| BioLiP | A database collecting structures of protein adducts, including metal-protein complexes | [ |
| ZincBind | A database specialized in zinc-binding sites, built on biological assemblies | [ |
| PyDISH | PyDISH specializes in the analysis of heme-binding sites in PDB structures | [ |
| VirusMED | A database of epitopes, drug-binding sites and metal-binding sites in viral proteins of known 3D structure | [ |
| InterMetalDB | A database of MBSs occurring at macromolecular interfaces, built on biological assemblies | [ |
| MetLigDB | MetLigDB focuses on the structural and chemical properties of small molecules that bind directly to the metal ion(s) in MP structures | [ |
| MeLAD | A database derived from the 3D structures of all metalloenzyme-ligand adducts, which integrates detailed analyses of metal-binding pharmacophores, metalloenzyme structural similarity and ligand chemical similarity | [ |
| | | |
| | Use of conditional variational autoencoders for the automated design of artificial metalloproteins | [ |
| | Identification of disease-related mutations through a multichannel convolutional neural network (MCCNN) | [ |
| DeepCys | Discrimination of four different cysteine roles, i.e., metal-binding, disulphide formation, sulphenylation and thioether | [ |
| MAHOMES | Discrimination of enzymatic and non-enzymatic metals in MPs | [ |
| AlphaFill | A database derived from AlphaFold predictions of apo-proteins where holo-structures of MPs have been reconstructed | [ |
| bindEmbed21 | bindEmbed21 uses a combination of homology-based inference and a convolutional neural network to predict whether a protein residue binds to a metal ion, a nucleic acid, or a small molecule | [ |
| | Sequence-based prediction of MPs using a NN trained with information derived from 3D structures | [ |
| | Discrimination of physiological and adventitious zinc-binding sites in MPs using a recurrent neural network (RNN) | [ |
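One of the geometric criteria listed in the table (accessible cavities whose center of mass lies within 3.5 Å of the β-carbon atoms of three or more CHED residues, i.e., Cys, His, Glu, Asp) lends itself to a short illustration. The sketch below assumes that cavity centers and β-carbon coordinates have already been computed by some other means; the function name, the input layout and the toy coordinates are hypothetical and do not come from the tool in question.

```python
from math import dist

CHED = {"CYS", "HIS", "GLU", "ASP"}  # the four residue types behind the acronym
CB_CUTOFF = 3.5                      # Å, cavity center to β-carbon


def candidate_sites(cavity_centers, cb_atoms, min_residues=3):
    """Keep cavity centers that have enough CHED β-carbons nearby.

    cavity_centers: list of (x, y, z) tuples.
    cb_atoms: list of (residue_type, residue_number, xyz) tuples.
    Returns a list of (center, [residue_number, ...]) pairs.
    """
    hits = []
    for center in cavity_centers:
        near = [number for res_type, number, xyz in cb_atoms
                if res_type in CHED and dist(center, xyz) <= CB_CUTOFF]
        if len(near) >= min_residues:
            hits.append((center, near))
    return hits


# Toy input, invented for illustration only.
centers = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
cbs = [("CYS", 1, (3.0, 0.0, 0.0)),
       ("HIS", 2, (0.0, 3.0, 0.0)),
       ("ASP", 3, (0.0, 0.0, -3.0)),
       ("ALA", 4, (1.0, 1.0, 1.0)),    # close, but not a CHED residue
       ("GLU", 5, (20.0, 3.0, 0.0))]   # the only CHED residue near the second cavity
sites = candidate_sites(centers, cbs)
```

Only the first cavity passes the filter, because the second has a single CHED residue within the cutoff.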