Nallusamy Saranya, Samuel Selvaraj.
Abstract
Atomic-level molecular similarity and diversity studies have gained considerable importance through their wide application in bioinformatics and chemoinformatics for drug design. The availability of large volumes of data on chemical compounds calls for new methodologies that search these archives efficiently, in less time and with modest computational power. We describe an alphabetic algorithm for similarity searching based on atom-atom bonding preferences in ligands. We represented 170 cyclin-dependent kinase 2 (CDK-2) inhibitors as strings over a pre-defined alphabet so that they could be searched with known protein sequence alignment tools. A common pattern extracted from this set of compounds was then used as a query for database searching to retrieve similar active compounds. The area under the receiver operating characteristic (ROC) curve was used to assess discrimination between similar and dissimilar compounds in the databases. An average retrieval rate of about 60% was obtained in cross-validation using a home-grown dataset and the Directory of Useful Decoys (DUD, whose decoys are drawn from the ZINC database). This approach should help in the effective retrieval of similar compounds by database search.
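The idea of writing a ligand as an alphabetic string can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the `PAIR_TO_LETTER` table is hypothetical (the abstract does not list the actual alphabet), the input is a plain list of bonded element pairs rather than a parsed structure file, and `difflib.SequenceMatcher` is used only as a simple stand-in for the protein sequence alignment tools mentioned above.

```python
from difflib import SequenceMatcher

# Hypothetical mapping from bonded atom pairs to letters. The letters are
# chosen from the amino-acid alphabet so that a protein alignment tool
# could read the strings; the paper's real assignments are not given here.
PAIR_TO_LETTER = {
    ("C", "C"): "A",
    ("C", "N"): "C",
    ("C", "O"): "D",
    ("C", "S"): "E",
    ("N", "N"): "F",
    ("N", "O"): "G",
}


def encode_bonds(bonds, unknown="X"):
    """Encode a list of bonded element pairs as an alphabetic string."""
    letters = []
    for a, b in bonds:
        key = tuple(sorted((a, b)))  # treat C-N and N-C as the same pair
        letters.append(PAIR_TO_LETTER.get(key, unknown))
    return "".join(letters)


def string_similarity(s1, s2):
    """Similarity in [0, 1]; a stand-in for a real alignment score."""
    return SequenceMatcher(None, s1, s2).ratio()


# Toy bond lists, not taken from any real ligand.
query = encode_bonds([("C", "C"), ("N", "C"), ("O", "C")])
candidate = encode_bonds([("C", "C"), ("C", "N"), ("C", "S")])
score = string_similarity(query, candidate)
```

In a real workflow the bond list would come from a structure file and the strings would be passed to an alignment tool; the sketch only shows how the encoding makes ligands comparable as text.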
Keywords: Atom pair; CDK-2; molecular similarity; similarity searching
Year: 2012 PMID: 22829718 PMCID: PMC3398777 DOI: 10.6026/97320630008498
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Figure 1: Alphabetic representation of atom pairs in (PDBID_HTMID) 1E1V_CMG.
Figure 2: Workflow of the alphabet algorithm.
Figure 3: Clustering of alphabetic strings of 200 ligands.
Figure 4: (A) Common features of ligand RC8, boxed with a dark line in the string alignment; (B) common features mapped (marked as spheres) onto the structure of ligand RC8.
Figure 5: Common features (marked as spheres) of ligand RC8 reported in the (PDBID) 3DDP binding site. The nitrogen atom marked in yellow favors hydrogen bonding.
Figure 6: ROC curves for the retrieval of actives in (A) the DUD database and (B) the local database.
Figure 7: ROC curves for the retrieval of actives using a query developed from 13 compounds in (A) the DUD database and (B) the local database.
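For readers unfamiliar with the metric behind the ROC figures, the area under the ROC curve can be computed directly from ranked search scores. The sketch below is a generic pairwise (Mann-Whitney) formulation in Python, not the authors' code; the function name `roc_auc` and the toy scores are assumptions for illustration only.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    scores: similarity score per compound (higher means more similar).
    labels: 1 for a known active, 0 for a decoy.
    Returns the fraction of (active, decoy) pairs in which the active
    scores higher, counting ties as half.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Toy values only: two actives and two decoys with made-up scores.
auc = roc_auc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0])
```

An AUC of 1.0 means every active is ranked above every decoy, and 0.5 corresponds to random ranking.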