Nan Ye1, Feng Zhou2, Xingchen Liang2, Haiting Chai3, Jianwei Fan2, Bo Li4, Jian Zhang2.
Abstract
Clear evidence shows that metal ions are closely involved in, and finely tune, dynamic homeostasis in living organisms. They have been shown to be associated with protein structure, stability, regulation, and function. Even small changes in metal ion concentration can shift their effects from naturally beneficial to harmful, leading to degenerative diseases, malignant tumors, and cancers. Accurate characterization and prediction of metalloproteins at the residue level promise informative clues for investigating the intrinsic mechanisms of protein-metal ion interactions. Compared with biophysical or biochemical wet-lab technologies, computational methods provide open web interfaces to high-resolution databases and high-throughput predictors for the efficient investigation of metal-binding residues. This review surveys and details 18 public databases of metal-protein binding. We collect a comprehensive set of 44 computation-based methods and classify them into four categories: learning-, docking-, template-, and meta-based methods. We analyze the benchmark datasets, assessment criteria, feature construction, and algorithms. We also compare several methods on two benchmark testing datasets and discuss the currently publicly available predictive tools. Finally, we summarize the challenges and underlying limitations of current studies and propose several prospective directions for the future development of the related databases and methods.
Year: 2022 PMID: 35402609 PMCID: PMC8989566 DOI: 10.1155/2022/8965712
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. Fraction of the top 10 metal-binding interactions stored in the PDB (date: December 20, 2021).
Summary of recently released databases of metal ion binding interactions.
| Name | Year | Considered metal ions | Number of sites | Web link | Ref. | Citation | Availability1 |
|---|---|---|---|---|---|---|---|
| InterMetalDB | 2021 | All metal ion binding | 6,423 | | [ | N/A | Yes |
| MeLAD | 2020 | All metal ion binding | N/A | | [ | 9 | Yes |
| ZincBindDB | 2019 | Zn | 24,992 | | [ | 23 | Yes |
| MetalPDB (v2) | 2018 | All metal ion binding | N/A | | [ | 90 | No |
| BioLiP | 2013 | All metal ion binding | 146,969 | | [ | 446 | Yes |
| ZiFDB (v2) | 2013 | Zn | N/A | | [ | 25 | No |
| MetalPDB (v1) | 2013 | All metal ion binding | N/A | | [ | 108 | No |
| BioMe | 2012 | All metal ion binding | 20,307 | | [ | 30 | No |
| MetLigDB | 2011 | Zn, Mn, Fe, Ni, Mg, Cu, Co, Mo | 732 | | [ | 13 | Yes |
| MIPS | 2010 | All metal ion binding | N/A | | [ | 28 | Yes |
| MEDB | 2010 | All metal ion binding | N/A | | [ | 14 | No |
| ZiFDB (v1) | 2009 | Zn | N/A | | [ | 87 | No |
| MetalMine | 2009 | All metal ion binding | 412 | | [ | 3 | No |
| Metal-MACiE | 2009 | All metal ion binding | N/A | | [ | 60 | Yes |
| ZifBASE | 2009 | Zn | N/A | | [ | 35 | Yes |
| MESPEUS | 2008 | Na, Mg, K, Ca, Mn, Fe, Co, Ni, Cu, Zn | 34,896 | | [ | 102 | No |
| MSDsite | 2005 | All metal ion binding | N/A | | [ | 122 | Yes |
| MDB | 2002 | All metal ion binding | N/A | | [ | 276 | No |
1Availability was checked on December 1, 10, and 20, 2021.
Figure 2. Flowchart of computation-based methods for predicting metal-binding residues.
Summary of learning-based methods.
| Type | Method1 | Ref. | Year | Metal ion binding2 | Dataset3 | Resolution | Sequence similarity (tool)4 | Prediction model5 | Cross-validation | Independent test | Measurements6 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Sequence-based | Liu et al. | [ | 2020 | Zn, Cu, Fe, Co, Mn, Ca, Mg, Na, K | 5,340 | ≤3 Å | 30% (CD-HIT) | RF | 5-fold | √ | SN, SP, ACC, MCC |
| | MIonSite | [ | 2019 | Zn, Ca, Mg, Mn, Fe, Cu, Fe, Co, Na, K, Cd, Ni | 7,676 | N/A | 30% (CD-HIT) | SVM, AdaBoost | 5-fold | √ | SN, SP, ACC, MCC, AUC |
| | MPLs-Pred | [ | 2019 | General metal ions | 1,492 | N/A | 30% (CD-HIT) | RF | 10-fold | √ | SN, SP, ACC, MCC |
| | SXGBsite | [ | 2019 | Ca, Zn, Mg, Mn, Fe | 4,421 | N/A | 40% (PISCES) | GBM | 5-fold | √ | SN, SP, ACC, MCC, AUC |
| | Wang et al. | [ | 2019 | Zn, Cu, Fe, Mn, Ca, Mg, Na, K | 5,146 | ≤3 Å | 30% (N/A) | SVM, SMO | 5-fold | √ | SN, SP, ACC, MCC |
| | znMachine | [ | 2019 | Zn | 2,043 | ≤3 Å | 30% (BLASTclust) | SVM, NN | 5-fold | √ | SN, SP, ACC, MCC, PRE, AUC |
| | SSWPNN | [ | 2019 | Zn | 213 | ≤2.5 Å | 70% (N/A) | SVM, NN | 5-fold | √ | SN, SP, PRE, F1, MCC, ACC |
| | ZinCaps | [ | 2019 | Zn | 738 | ≤3 Å | N/A (N/A) | CN | 5-fold | √ | SN, SP, ACC, MCC, AUC |
| | Haberal and Oğul | [ | 2018 | General metal ions | 2,727 | N/A | N/A (N/A) | CNN | 5-fold | √ | SN, ACC, PRE, F1 |
| | ZincBinder | [ | 2018 | Zn | 738 | ≤2.5 Å | 30% (PISCES) | SVM | 5-fold | √ | SN, SP, ACC, MCC, AUC |
| | EC-RUS | [ | 2017 | Ca, Mg, Mn, Fe, Zn | 4,421 | N/A | 40% (PISCES) | WSRC | 5-fold | √ | SN, SP, ACC, MCC, AUC |
| | Cao et al. | [ | 2017 | Zn, Cu, Fe, Co, Mn, Ca, Mg, K, Na | 5,340 | ≤3 Å | 30% (CD-HIT) | SVM | 5-fold | √ | SN, SP, ACC, MCC |
| | Kumar | [ | 2017 | Cu, Ca, Co, Fe, Mg, Mn, Ni, Zn | 3,922 | N/A | 50% (CD-HIT) | RF | 10-fold | √ | SN, SP, ACC, MCC |
| | DeepMBS | [ | 2017 | General metal ions | 2,727 | ≤3 Å | N/A (N/A) | CNN | 5-fold | √ | SN, PRE, F1 |
| | Qiao et al. | [ | 2017 | Ca | 2,239 | N/A | 30% (CD-HIT) | SVM | 5-fold | √ | SN, ACC, PRE, MCC, AUC |
| | IonCom | [ | 2016 | Zn, Cu, Fe, Ca, Mg, Mn, Na, K | 1,374 | N/A | 30% (CD-HIT) | SVM, AdaBoost | 5-fold | √ | SN, SP, ACC, MCC |
| | Jiang et al. | [ | 2016 | Ca | 1,885 | ≤3 Å | 25% (N/A) | SVM | 5-fold | √ | SN, SP, ACC, MCC |
| | TargetCom | [ | 2016 | Cu, Fe, Zn | 1,373 | ≤3 Å | 40% (CD-HIT) | SVM, AdaBoost | 5-fold | √ | SN, SP, ACC, MCC |
| | OSML | [ | 2015 | Ca, Zn, Mg, Mn, Fe | 4,421 | N/A | 40% (PISCES) | SVM | 5-fold | √ | SN, SP, ACC, MCC |
| | TargetS | [ | 2013 | Ca, Zn, Mg, Mn, Fe | 4,421 | N/A | 40% (PISCES) | SVM, AdaBoost | 5-fold | √ | SN, SP, ACC, MCC, AUC |
| | ETMB-RBF | [ | 2013 | General metal ions | 55 | N/A | 20% (BLASTclust) | RBFN | 10-fold | √ | SN, SP, ACC, MCC |
| | ZincExplorer | [ | 2013 | Zn | 392 | ≤3 Å | N/A (N/A) | SVM | 5-fold | √ | SN, SP, PRE, MCC, AUPRC |
| | Horst et al. | [ | 2010 | Ca | 635 | ≤2.1 Å | 35% (N/A) | LR | 10-fold | √ | MCC, AUC, AUPRC |
| Structure-based | Nguyen et al. | [ | 2021 | Mn, Fe, Co, Ni, Cu, Zn | 9,955 | ≤2.5 Å | 90% (N/A) | RF | 5-fold | × | ACC |
| | TMP-MIBS | [ | 2021 | General metal ions | 427 | N/A | 40% (CD-HIT) | RF | 10-fold | √ | SN, SP, ACC, MCC, AUC |
| | Zincbindpredict | [ | 2021 | Zn | N/A | ≤2 Å | 40% (CD-HIT) | RF | 5-fold | √ | SN, PRE, F1, MCC |
| | Wang et al. | [ | 2021 | Zn, Cu, Fe, Ca, Mg, Mn, Na, K, Co | 5,340 | ≤3 Å | 30% (N/A) | MLP, SVM | 5-fold | √ | SN, SP, ACC, MCC |
| | DELIA | [ | 2020 | Ca, Mn, Mg | 3,966 | N/A | 30% (CD-HIT) | CNN | 5-fold | √ | SN, PRE, MCC, AUC |
| | Hu et al. | [ | 2020 | Zn, Cu, Fe, Co, Mn, Ca, Mg, Na, K | 5,340 | ≤3 Å | 30% (CD-HIT) | GBM | 5-fold | √ | SN, SP, FPR, ACC, MCC |
| | MetalExplorer | [ | 2017 | Ca, Co, Cu, Fe, Ni, Mg, Mn, Zn | 3,192 | ≤2.5 Å | 30% (CD-HIT) | RF | 5-fold | √ | SN, FPR, PRE, AUC, AUPRC |
| | FINDSITE-metal | [ | 2011 | Ca, Co, Cu, Fe, Mg, Mn, Ni, Zn | 860 | N/A | 35% (PISCES) | SVM | 2-fold | √ | ACC, SPC, PPV |
| | Zinc identifier | [ | 2011 | Zn | 1,103 | ≤2.5 Å | N/A (N/A) | RF | 5-fold | √ | SN, PRE, SP, FPR, AUC, AUPRC |
1The name of each method is taken from its publication or, where the method is unnamed, from the last name of its first author. 2"General metal ions" means that the predictor does not differentiate between types of metal ion binding; otherwise, the specific metal ions are listed. 3The number is the size of the benchmark dataset. 4The value is the protein sequence similarity threshold used in the benchmark dataset; the tool used for clustering proteins is given in parentheses. 5SMO: sequential minimal optimization; SVM: support vector machine; WSRC: weighted sparse representation-based classifier; NN: neural network; CN: capsule network; CNN: convolutional neural network; RF: random forest; GBM: gradient boosting machine; RBFN: radial basis function network; LR: logistic regression; MLP: multilayer perceptron. 6SN: sensitivity/recall; SP: specificity; ACC: accuracy; MCC: Matthews correlation coefficient; PRE: precision; F1: F1-score; AUC: area under the ROC curve; AUPRC: area under the precision-recall curve; FPR: false positive rate (FPR = 1 - SP).
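The threshold-based measurements in footnote 6 can all be derived from the four counts of a residue-level confusion matrix. The following is a minimal sketch using the standard textbook definitions, not code from any of the surveyed predictors; the function name `binary_metrics` and the example counts are hypothetical, chosen only for illustration. AUC and AUPRC are omitted because they require ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute residue-level metrics from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding.
    tn/fp: non-binding residues predicted as non-binding / binding.
    Ratios with a zero denominator are reported as 0.0 (a convention
    assumed here, not taken from the review).
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sn = ratio(tp, tp + fn)                   # sensitivity / recall
    sp = ratio(tn, tn + fp)                   # specificity
    pre = ratio(tp, tp + fp)                  # precision
    acc = ratio(tp + tn, tp + fp + tn + fn)   # accuracy
    f1 = ratio(2 * pre * sn, pre + sn)        # harmonic mean of PRE and SN
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)   # Matthews correlation coefficient
    return {
        "SN": sn, "SP": sp, "ACC": acc, "PRE": pre,
        "F1": f1, "FPR": 1 - sp, "MCC": mcc,
    }


# Hypothetical counts: 50 binding residues among 1,000 residues in total.
metrics = binary_metrics(tp=40, fp=10, tn=940, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example ACC is 0.980 while MCC is about 0.789. Binding residues are typically a small minority of all residues, so accuracy is dominated by the non-binding class, which is one reason MCC appears alongside ACC for most methods in the table.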
Figure 3. Summary of feature construction and selection for learning-based methods. Light green cells indicate features derived directly from the sequence, light blue cells indicate profile-based features, light red cells indicate putative structure-based features, and light grey cells indicate native structure-based features. 1In the amino acid pair column, cells without annotations indicate original amino acid pairs; cells annotated with “k-spaced” indicate k-spaced amino acid pairs. 2PWM: position weight matrix; PSSM: position specific scoring matrix; PMS: position matrix scoring; PAM: point accepted mutation; EMS: evolutionary matrix scoring. 3TS: topology structure; DM: distance matrix; GTN: graph theoretic network; RRCG: residue-residue contact graphs. 4FFS: forward feature selection; EB: experience-based; BA: Boruta algorithm; mRMR: minimum-redundancy maximum-relevancy; MDGI: mean decrease Gini index [50–51, 53–58, 65, 77–81, 87, 90–105].
Figure 4. Ribbon and surface models of the X-ray structure of Ca2+- and Zn2+-bound calmodulin (PDB: 4HEX) from Mus musculus. The red sphere represents the bound zinc ion and the green spheres represent calcium ions; the spatially adjacent residues participating in their coordination are shown as sticks.
Summary of docking-based, template-based, and meta-based methods.
| Type | Method | Year | Notes |
|---|---|---|---|
| Docking-based | mFASD [ | 2015 | Captures the characteristics of metal-binding sites and discriminates between most types of these sites |
| | Zhou et al. [ | 2015 | Uses a FEATURE-based calcium model and converts high-scoring regions into specific site predictions |
| | GaudiMM [ | 2019 | Finds poses that satisfy metal-derived geometrical rules and applies post-optimization |
| | BioMetAll [ | 2020 | Predicts metal-binding sites with particular motifs, determines transient sites in structures, and predicts potential mutations to generate convenient sites |
| Template-based | Deng et al. [ | 2006 | Uses a graph theory algorithm to find oxygen clusters in the protein (high potential for calcium binding) |
| | Goyal et al. [ | 2008 | Describes the generation of 3D structural motifs for metal-binding sites from known metalloproteins |
| | Levy et al. [ | 2009 | Analyzes whether structural models based on remote homology are effective in predicting 3D metal-binding sites |
| | FunFOLD [ | 2011 | Uses an automated method for ligand clustering and identification of binding residues |
| | FunFOLDQA [ | 2012 | Uses a fully automated agglomerative clustering approach for both ligand identification and residue selection |
| | FunFOLD2 [ | 2013 | Proposes a method that includes protein-ligand binding prediction and a quality assessment protocol |
| Meta-based | Li et al. [ | 2017 | Integrates the results of ZincExplorer [ |
| | IBayes_Zinc [ | 2019 | Adopts a Bayesian method and combines the predictions from ZincExplorer [ |
Figure 5. Comparative assessment of several predictors on two benchmark datasets. (a) and (c) show MCC bar charts for the considered methods on different metal ion binding residues in Yu et al.'s and Cao et al.'s testing datasets, respectively. (b) shows the AUC values of three predictors on the corresponding metal ion binding residues.
A breakdown of predictive tools of metal-binding residues.
| Method | Year | Platform1 | Web link | Availability2 |
|---|---|---|---|---|
| TMP-MIBS [ | 2021 | SS | | Yes |
| Wang et al. [ | 2021 | WS | | No |
| Zincbindpredict [ | 2021 | WS | | No |
| DELIA [ | 2020 | WS | | Yes |
| BioMetAll [ | 2020 | SS | | Yes |
| MPLs-Pred [ | 2019 | WS | | Yes |
| SXGBsite [ | 2019 | SS | | Yes |
| MIonSite [ | 2019 | SS | | Yes |
| znMachine [ | 2019 | WS&SS | | No |
| ZinCaps [ | 2019 | SS | | Yes |
| EC-RUS [ | 2017 | SS | | Yes |
| MetalExplorer [ | 2017 | WS | | No |
| Cao et al. [ | 2017 | WS | | No |
| ZincBinder [ | 2017 | WS&SS | | Yes |
| SSWPNN [ | 2017 | SS | | Yes |
| Jiang et al. [ | 2016 | WS | | No |
| TargetCom [ | 2016 | SS | | No |
| OSML [ | 2015 | WS | | Yes |
| mFASD [ | 2015 | SS | | Yes |
| FunFOLD2 [ | 2013 | WS | | Yes |
| ZincExplorer [ | 2013 | WS | | No |
| TargetS [ | 2013 | WS | | Yes |
| FunFOLDQA [ | 2012 | SS | | Yes |
| Zincidentifier [ | 2012 | WS | | No |
| FINDSITE-metal [ | 2011 | WS | | No |
| FunFOLD [ | 2011 | WS&SS | | Yes |
| Goyal et al. [ | 2008 | WS | | No |
| Deng et al. [ | 2006 | SS | | No |
1WS: web server; SS: standalone software. 2Availability was checked on December 1, 10, and 20, 2021.