Aisyah Mohd Noor, Lars Holmberg, Cheryl Gillett, Anita Grigoriadis.
Abstract
In the past decade, cancer research has seen an increasing trend towards high-throughput techniques and translational approaches. The increasing availability of assays that utilise smaller quantities of source material and produce higher volumes of data output has resulted in the necessity for data storage solutions beyond those previously used. Multifactorial data, both large in sample size and heterogeneous in context, needs to be integrated in a standardised, cost-effective and secure manner. This requires technical solutions and administrative support not normally financially accounted for in small- to moderate-sized research groups. In this review, we highlight the Big Data challenges faced by translational research groups in the precision medicine era, an era in which the genomes of over 75,000 patients will be sequenced by the National Health Service over the next 3 years to advance healthcare. In particular, we have looked at three main themes of data management in relation to cancer research, namely (1) cancer ontology management, (2) IT infrastructures that have been developed to support data management and (3) the unique ethical challenges introduced by utilising Big Data in research.
Year: 2015 PMID: 26492224 PMCID: PMC4815885 DOI: 10.1038/bjc.2015.341
Source DB: PubMed Journal: Br J Cancer ISSN: 0007-0920 Impact factor: 7.640
Published integrative databases for cancer research
| Database | Institution | Cancer type | Size | DBMS and technologies |
| --- | --- | --- | --- | --- |
| Breast Cancer Surgical Outcomes Research Database (BRCASO) | Group Health Cooperative, Kaiser Permanente Colorado, Marshfield Clinic | Breast | 6095 | SQL Server |
| Pancreatic Expression Database (PED) | ICR, QMUL | Pancreas | 7636 | MySQL, MartView (BioMart), Perl |
| Breast Information Core (BIC) | International Agency for Research on Cancer | Breast | - | Sybase Server, SQL, Perl |
| Pathology Analytic Imaging Standards (PAIS) | Emory University | Breast, brain | 4740 | IBM DB2 Server, SQL, XML |
| Breast Diseases Registry System (BDRS) | Middle East Technical University | Breast | - | SQL Server, XML |
| Cooperative Prostate Cancer Tissue Resource (CPCTR) | University of Pittsburgh | Prostate | >6000 | Oracle, PL/SQL |
| Pennsylvania Cancer Alliance Bioinformatics Consortium (PCABC) Biorepository | University of Pittsburgh | Melanoma, breast, prostate | >11 000 | NCI Cancer Biomedical Informatics Grid (caBIG), Java |
| METABRIC Repository | Cambridge University | Breast | 2000 | CancerGrid, SQIV, SPARQL, XML |
| Genes-to-Systems Breast Cancer (G2SBC) Database | Institute for Biomedical Technologies | Breast | >2000 | MySQL, PHP, JavaScript |
| SPORE Head and Neck Neoplasm Database | University of Pittsburgh | Head and neck | 6553 | Oracle, PL/SQL, Java |
| GEM Registry | Cambridge University | GI | - | MS Access, SQL |
| Cancer Gene Expression Database (CGED) | Nara Institute of Science and Technology | Breast, GI | >400 | - |
| OncomiR Database (OncomiRdbB) | Council of Scientific and Industrial Research, India | Breast | 782 | MySQL, Perl |
| Stanford Translational Research Integrated Database Environment (STRIDE) | Stanford University | Various | 1.3 million | Oracle, XML |
| Thoracic Oncology Program Database Project | University of Chicago | Thoracic | - | MS Access |
| Georgetown Database of Cancer (G-DOC) | Georgetown University | Breast, GI | >3000 | Oracle, Java |
| Breast Cancer Gene Expression Miner (bc-GenExMiner) | Centre de Lutte Contre le Cancer René Gauducheau | Breast | >3000 | MySQL, PHP, Java |
| Data Warehouse for Translational Research (DW4TR) | Windber Research Institute | Breast | >5000 | Oracle, AJAX |
| Danish Centre for Translational Research in Breast Cancer (DCTB) | The Danish Centre for Translational Breast Cancer Research | Breast | - | - |
| Cancer Genomics Hub | National Cancer Institute | Various | >11 000 | XML, Apache Solr Web |
| Catalogue of Somatic Mutations in Cancer (COSMIC) | Wellcome Trust Sanger Institute | Various | - | Oracle, BioMart |
Abbreviations: DB=database; DBMS=Database Management System; GI=gastrointestinal cancer; ICR=The Institute of Cancer Research, London; QMUL=Queen Mary University of London.
Figure 1. Translational research data in the era of Genomics England. Research data from multi-disciplinary fields such as genomics, histopathology, mouse models and fluorescence imaging as managed by a typical translational research group will be integrated with their associated clinical data managed by the institutional biobank and healthcare centre, encompassing features such as treatment, follow-up, demographic and diagnostic data. In alliance with the Genomics England project, these data and their associated biosamples will be used in GeCIP studies and fed back to both healthcare and on-going translational studies within the research group.