Christine Galustian, Angus G Dalgleish.
Abstract
The discovery of effective cancer treatments is a key goal for pharmaceutical companies. However, the cost of bringing a cancer drug to market in the USA is now estimated at $1 billion per FDA-approved drug, with many months of research at the bench and costly clinical trials. A growing number of papers highlight the use of data mining tools to determine associations between drugs, genes or protein targets, and possible mechanisms of action or therapeutic efficacy, which could be harnessed to provide information that can refine or direct new clinical cancer studies and lower costs. This report reviews the paper by R.J. Epstein, which illustrates the potential of text mining using Boolean parameters in cancer drug discovery, and other studies which use alternative data mining approaches to aid cancer research.
Keywords: cancer; clinical trials; data mining; drug discovery
Year: 2010 PMID: 20234771 PMCID: PMC2834378 DOI: 10.4137/cin.s3191
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
A selection of databases with direct application to cancer drug and target molecule discovery.
| Database | Data type | Description |
|---|---|---|
| ClinicalTrials.gov | Text | A list of clinical trials in progress or completed: currently 80,055 trials with locations in 170 countries |
| PubMed | Text | Public access but no abstract publications available on this database |
| Web of Knowledge | Text | Abstract publications + full papers available |
| Oncomine | Gene array data | Largest cancer-specific microarray-based database (48,000,000 gene expression measurements from over 4700 microarray experiments) |
| CGAP database | Gene array data | Contains cancer-specific cDNA libraries, clones, and sequence data in addition to microarray data. Public access |
| Gene Expression Atlas | Gene array data | Uses gene expression data from tissue samples |
| Genemap | Gene array data | Run by Stanford University: 1975 published microarrays spanning 22 tumor types |
| Pathway Studio | Gene array data and text | Private software program integrating gene array and text mining analyses to identify pathways of action for genes/drugs |
| Open Proteomics Database | Mass spectrometry data | Public database for storing and disseminating mass spectrometry-based proteomics data. The database currently contains roughly 3,000,000 spectra representing experiments from 5 different organisms |
| EMBL proteomic database (PRIDE database) | Mass spectrometry data | PRIDE currently contains 9,964 experiments, 2,564,320 proteins, 12,015,539 peptides, 1,753,906 unique peptides, and 53,348,019 spectra |
| NCI clinical proteomics program | SELDI-TOF data | Restricted at present to SELDI-TOF data sets from ovarian and pancreatic cancer. Data sets only, not a database as such at present |
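
The abstract refers to text mining with Boolean parameters. As a rough illustration of that general idea only (not a reconstruction of the method in the reviewed paper), the sketch below counts how many text records satisfy an AND / OR / NOT term query. The function names (`tokenize`, `boolean_match`, `count_hits`) and the toy records are invented for this example; a real study would query a text database such as those listed above instead of an in-memory list.

```python
import re

def tokenize(text):
    """Lower-case a text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def boolean_match(text, all_of=(), any_of=(), none_of=()):
    """Return True if the text contains every term in all_of (AND),
    at least one term in any_of when any_of is non-empty (OR),
    and no term in none_of (NOT). Terms must be lower-case single words."""
    tokens = tokenize(text)
    if not all(term in tokens for term in all_of):
        return False
    if any_of and not any(term in tokens for term in any_of):
        return False
    return not any(term in tokens for term in none_of)

def count_hits(records, **query):
    """Count the records that satisfy the Boolean query."""
    return sum(boolean_match(record, **query) for record in records)

# Invented toy records for illustration; these are not real abstracts.
TOY_RECORDS = [
    "Drug A inhibits kinase X in breast cancer cells",
    "Kinase X expression in healthy tissue",
    "Drug A and drug B in a lung cancer trial",
    "Drug B toxicity in mice",
]

# Records mentioning "drug" AND "cancer" but NOT "lung".
hits = count_hits(TOY_RECORDS, all_of=("drug", "cancer"), none_of=("lung",))
print(hits)
```

Comparing hit counts for different term combinations (for example, a drug name with and without a candidate target) is one simple way such queries could suggest associations worth checking by hand. Matching here is on exact lower-case words, so synonyms and multi-word terms would need extra handling.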