Shiwei Wang, Haoyu Lin, Zhixian Huang, Yufeng He, Xiaobing Deng, Youjun Xu, Jianfeng Pei, Luhua Lai.
Abstract
The location and properties of ligand binding sites provide important information for uncovering protein functions and for directing structure-based drug design. However, because binding site detection depends on three-dimensional (3D) structural data, functional analysis based on ligand binding sites is difficult for proteins without structural information. Recent developments in protein structure prediction and the 3D structures built by AlphaFold provide an unprecedented opportunity for analyzing ligand binding sites in human proteins. Here, we constructed the CavitySpace database, the first pocket library for all the proteins in the human proteome, using CAVITY, a widely applied ligand binding site detection program. Our analysis showed that known ligand binding sites could be well recovered. We grouped the predicted binding sites according to their similarity; these groupings can be used in protein function prediction and drug repurposing studies. Novel binding sites in highly reliable predicted structure regions provide new opportunities for drug discovery. CavitySpace is freely available and provides a valuable tool for drug discovery and protein function studies.
Keywords: AlphaFold predicted structures; CAVITY; ligand binding sites; pocket library
Year: 2022 PMID: 35883523 PMCID: PMC9312471 DOI: 10.3390/biom12070967
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
Figure 1. The flowchart for building CavitySpace.
Figure 2. The druggability distribution of AlphaFold cavities (a) and hrefPDB cavities (b).
Figure 3. Distributions of the Index, defined as the proportion of cavity residues with high confidence (pLDDT > 90). AF_cavity: cavities detected from AlphaFold predicted protein structures; Subset_1: cavities from proteins that have known 3D structures in our hrefPDB dataset; Subset_2: the cavities remaining after removing Subset_1 from AF_cavity. For each dataset, the distribution of all cavities (blue) and of cavities with strong druggability (orange) is shown separately.
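The Index in Figure 3 is a simple ratio, and a minimal sketch may make the definition concrete. The function name `high_confidence_index`, the residue numbers, and the pLDDT values below are hypothetical and for illustration only; this is not the CavitySpace implementation.

```python
from typing import Iterable, Mapping


def high_confidence_index(
    cavity_residues: Iterable[int],
    plddt: Mapping[int, float],
    threshold: float = 90.0,
) -> float:
    """Fraction of cavity residues whose pLDDT is strictly above `threshold`."""
    residues = list(cavity_residues)
    if not residues:
        raise ValueError("cavity has no residues")
    n_high = sum(1 for res in residues if plddt[res] > threshold)
    return n_high / len(residues)


# Hypothetical per-residue pLDDT scores keyed by residue number.
plddt_scores = {10: 95.2, 11: 91.0, 12: 88.4, 13: 72.5}
index = high_confidence_index([10, 11, 12, 13], plddt_scores)
print(index)
```

In this toy case, two of the four cavity residues exceed pLDDT 90, so the Index is 0.5.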
Figure 4. (a) The recall rate of hrefPDB cavities at different Tc thresholds. (b) The proportion of true ligand binding sites recovered under different residue coverage thresholds, for both hrefPDB structures and AlphaFold structures. Residue coverage measures how many of the true binding site residues are included in one of the CAVITY-detected binding sites.
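The residue coverage criterion in Figure 4(b) can be sketched as follows. This is a sketch under stated assumptions: coverage is expressed here as a fraction of the true binding site residues, a site counts as recovered when its best-covering cavity meets the threshold, and the names `best_residue_coverage` and `recovery_rate` as well as the residue sets are hypothetical.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def best_residue_coverage(true_site: Set[int], cavities: Iterable[Set[int]]) -> float:
    """Largest fraction of true-site residues contained in any single cavity."""
    if not true_site:
        raise ValueError("true binding site has no residues")
    return max(
        (len(true_site & cavity) / len(true_site) for cavity in cavities),
        default=0.0,  # no detected cavities means nothing is covered
    )


def recovery_rate(
    cases: Sequence[Tuple[Set[int], List[Set[int]]]], threshold: float
) -> float:
    """Proportion of true sites whose best coverage reaches `threshold`."""
    if not cases:
        raise ValueError("no binding sites to evaluate")
    recovered = sum(
        1 for site, cavities in cases
        if best_residue_coverage(site, cavities) >= threshold
    )
    return recovered / len(cases)


# Hypothetical data: (true binding site residues, detected cavities) per protein.
cases = [
    ({1, 2, 3, 4}, [{1, 2, 3}, {7, 8}]),  # best coverage 3/4
    ({5, 6}, [{9}]),                      # best coverage 0
]
print(recovery_rate(cases, threshold=0.5))
```

Sweeping `threshold` over a range of values would give a curve of the kind shown in panel (b), one per structure source.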