
Advances in cheminformatics methodologies and infrastructure to support the data mining of large, heterogeneous chemical datasets.

Rajarshi Guha, Kevin Gilbert, Geoffrey Fox, Marlon Pierce, David Wild, Huapeng Yuan.

Abstract

In recent years, there has been an explosion in the availability of publicly accessible chemical information, including chemical structures of small molecules, structure-derived properties and associated biological activities in a variety of assays. These data sources present us with a significant opportunity to develop and apply computational tools to extract and understand the underlying structure-activity relationships. Furthermore, by integrating chemical data sources with biological information (protein structure, gene expression and so on), we can attempt to build up a holistic view of the effects of small molecules in biological systems. Equally important is the ability of non-experts to access and use state-of-the-art cheminformatics methods and models. In this review, we present recent developments in cheminformatics methodologies and infrastructure that provide a robust, distributed approach to mining large and complex chemical datasets. In the area of methodology development, we highlight recent work on characterizing structure-activity landscapes, Quantitative Structure-Activity Relationship (QSAR) model domain applicability and the use of chemical similarity in text mining. In the area of infrastructure, we discuss a distributed web services framework that allows easy deployment of, and uniform access to, computational methods (statistics, cheminformatics and computational chemistry), data and models. We also discuss the development of PubChem-derived databases and highlight techniques that allow us to scale the infrastructure to extremely large compound collections through distributed processing on Grids. Because this work is applicable to arbitrary types of cheminformatics problems, we also present case studies on virtual screening for anti-malarials and predictions of anti-cancer activity.
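As a rough illustration of how chemical similarity can be used to characterize a structure-activity landscape, the sketch below scores each pair of compounds by its activity difference divided by its structural dissimilarity, which is similar in form to the structure-activity landscape index (SALI). The abstract does not state which formulation the review uses, so this is an assumption. The fingerprints (sets of on-bit indices), the activity values and the function names are all hypothetical.

```python
from itertools import combinations


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def activity_cliff_scores(fingerprints, activities):
    """Score every pair by |activity difference| / (1 - similarity).

    Pairs with identical fingerprints are skipped to avoid division by zero.
    """
    scores = {}
    for a, b in combinations(sorted(fingerprints), 2):
        sim = tanimoto(fingerprints[a], fingerprints[b])
        if sim < 1.0:
            scores[(a, b)] = abs(activities[a] - activities[b]) / (1.0 - sim)
    return scores


# Hypothetical toy data: on-bit indices and pIC50-like activity values.
fingerprints = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 3, 5},
    "mol_c": {7, 8, 9},
}
activities = {"mol_a": 8.0, "mol_b": 5.0, "mol_c": 5.5}

scores = activity_cliff_scores(fingerprints, activities)
steepest_pair = max(scores, key=scores.get)
```

A high score flags a pair of structurally similar compounds with very different activities (an "activity cliff"). In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.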

Year:  2010        PMID: 20370695     DOI: 10.2174/157340910790980115

Source DB:  PubMed          Journal:  Curr Comput Aided Drug Des        ISSN: 1573-4099            Impact factor:   1.606


  4 in total

1.  Exploiting PubChem for Virtual Screening.

Authors:  Xiang-Qun Xie
Journal:  Expert Opin Drug Discov       Date:  2010-12       Impact factor: 6.098

2.  USR-VS: a web server for large-scale prospective virtual screening using ultrafast shape recognition techniques.

Authors:  Hongjian Li; Kwong-S Leung; Man-H Wong; Pedro J Ballester
Journal:  Nucleic Acids Res       Date:  2016-04-22       Impact factor: 16.971

3.  The rcdk and cluster R packages applied to drug candidate selection.

Authors:  Adrian Voicu; Narcis Duteanu; Mirela Voicu; Daliborca Vlad; Victor Dumitrascu
Journal:  J Cheminform       Date:  2020-01-20       Impact factor: 5.514

4.  Cheminformatics and the Semantic Web: adding value with linked data and enhanced provenance.

Authors:  Jeremy G Frey; Colin L Bird
Journal:  Wiley Interdiscip Rev Comput Mol Sci       Date:  2013-01-08
