Rosalind J Cutts, José Afonso Guerra-Assunção, Emanuela Gadaleta, Abu Z Dayem Ullah, Claude Chelala.
Abstract
BCCTBbp (http://bioinformatics.breastcancertissuebank.org) was initially developed as the data-mining portal of the Breast Cancer Campaign Tissue Bank (BCCTB), a vital resource of breast cancer tissue for researchers to support and promote cutting-edge research. BCCTBbp is dedicated to maximising research on patient tissues by initially storing genomics, methylomics, transcriptomics, proteomics and microRNA data mined from the literature and linking them to pathways and mechanisms involved in breast cancer. Currently, the portal holds 146 datasets comprising over 227,795 expression/genomic measurements from various breast tissues (e.g. normal, malignant or benign lesions), cell lines and body fluids. BCCTBbp can be used to build on breast cancer knowledge and maximise the value of existing research. By recording a large number of annotations on samples and studies, and linking to other databases such as NCBI, Ensembl and Reactome, the portal supports a wide variety of investigations. Additionally, BCCTBbp has a dedicated analytical layer allowing researchers to further analyse stored datasets. An important future role for BCCTBbp is to make available all data generated on BCCTB tissues, thus building a valuable resource of information on the tissues in BCCTB that will avoid repetition of experiments and expand scientific knowledge.
Year: 2014 PMID: 25332396 PMCID: PMC4384036 DOI: 10.1093/nar/gku984
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Accessing the BCCTBbp data through the MartView web query interface. In this first example our goal was to identify differentially regulated genes in Infiltrating Lobular Carcinoma (ILC) and their related pathways, and to check whether any of these genes appear in the Parker PAM 50 gene set. The MartView interface can be accessed by following the ‘Advanced/BioMart Search’ button on the main bioinformatics portal web page (http://bioinformatics.breastcancertissuebank.org). The query starts by choosing the ‘BCCTB Bioinformatics Portal’ from the ‘Choose Database’ drop-down selection in the right panel. Users are then automatically directed to choose a dataset from the ‘Choose Dataset’ drop-down menu. (A) The ‘BCCTB Bioinformatics Gene Data’ dataset can be chosen from the drop-down menus. The left panel will refresh automatically, displaying the ‘Filters’ and ‘Attributes’ nodes with their default settings. The next step involves choosing the appropriate attributes and filters to restrict the query. Clicking on the ‘Filters’ or ‘Attributes’ node on the left displays the related page on the right, where filters or attributes are arranged into sections that can be expanded or collapsed using the ‘+/−’ box. To choose an attribute or a filter, users simply click on the checkbox next to its description. A summary of the selected filters and attributes is automatically displayed in the left panel. We click on ‘Filters’ in the left panel and then expand the ‘SELECT GENES FROM TRANSCRIPTOMICS PROFILING EXPERIMENTS:’ section. From the rich list of possible options, we select ‘Specimen histology’ and then ‘ILC versus normal lobular’ from the related list. Clicking on the ‘Count’ button in the tool bar at any time while constructing the query returns the number of genes satisfying the selected criteria. (B) Clicking on the ‘Attributes’ node in the left panel allows the user to select which attributes of the data will be returned in the results.
On the right panel, these are arranged into modules for ‘Study Data’, ‘Features’, ‘Structures’, ‘Transcript Event’, ‘Homologs’, ‘Variation’ and ‘Sequences’. To select the output content, click the ‘Attributes’ node on the left and choose the attribute page on the right. In this example we are interested in exploring the de-regulated pathways. From the ‘Study Data’ attribute page, the Pathway URL and Pathway name options are selected. Finally, pressing the ‘Results’ button at the top left of the page produces a table containing the requested information (C). The results can be downloaded to a file by selecting the export option at the top of the result page and pressing the ‘GO’ button. There are options to change the format (‘HTML’, ‘CSV’ for comma separated values, ‘TSV’ for tab separated values, ‘XLS’ for Excel, ‘ADF’ for array description format) and to make the results unique. One can also select a compressed file output, in which case the query runs in the background and the results are downloaded later; this requires an email address, to which a notification containing the download URL is sent. There is also an option to produce a Perl script that repeats the query programmatically with the BioMart Perl API. Because of the one-to-many relationship between genes and transcripts in the database, some results may appear more than once. To address this, a button is available to present only unique results, and the transcript attribute can be removed from the final results.
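The Figure 1 caption notes that MartView can produce a script to repeat a query programmatically. As a rough illustration of what such a query carries, the Python sketch below assembles the XML query document used by BioMart 0.7-style mart services for the ILC example. It is only a sketch under stated assumptions: the internal dataset, filter and attribute names (`bcctb_gene_data`, `specimen_histology`, `pathway_name`, `pathway_url`) are hypothetical placeholders, since the source gives only the display labels, and the code does not contact any server.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes,
                        unique_rows=True, formatter="TSV"):
    """Return a BioMart-style XML query string.

    dataset    -- internal dataset name (hypothetical in this sketch)
    filters    -- dict mapping filter name -> value
    attributes -- list of attribute names to return as columns
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": formatter,          # e.g. TSV, CSV, HTML
        "header": "1",                   # include a header row
        # Mirrors the "unique results only" option described in the caption.
        "uniqueRows": "1" if unique_rows else "0",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# All names below are illustrative assumptions, not the portal's real identifiers.
xml_query = build_biomart_query(
    dataset="bcctb_gene_data",
    filters={"specimen_histology": "ILC versus normal lobular"},
    attributes=["pathway_name", "pathway_url"],
)
print(xml_query)
```

In a typical BioMart deployment this XML would be submitted as the `query` parameter of the mart service; the exact service path on this portal is not given in the source, so it is left out here. Leaving the transcript attribute out of `attributes` and keeping `unique_rows=True` corresponds to the caption's advice for avoiding repeated rows.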
Figure 2. Accessing the BCCTBbp data through the BCCTB Sample Finder. The BCCTB Sample Finder (https://breastcancertissuebank.org/bcc/tissueBank?Name=sampleFinder, described elsewhere, manuscript in peer-review) is used to find samples within the Tissue Bank matching characteristics of interest. Here the Sample Finder is used to retrieve ILC samples available from BCCTB. In the lower left corner, the bioinformatics section indicates that 18 studies with ILC samples are stored in BCCTBbp. Clicking on the hyperlink ‘View details’ produces a detailed list of these studies, with PubMed and GEO links (where available), as well as the total number of samples in each dataset. Users can then go to the main BCCTBbp to mine the results published on these samples (see Figure 1).
Figure 3. Analytical layer and integrated modules. The analytical tools can be accessed from the menu bar at http://bioinformatics.breastcancertissuebank.org/analysisTools.html. (A) First, the dataset of interest is chosen by pressing the radio button to the left of the dataset title. For this example we select the dataset from Ivshina et al., as it has the largest sample size and contains survival information (http://bioinformatics.breastcancertissuebank.org/analysisTools.html?dset=Ivshina). (B) Selecting the ‘Transcriptomics Analysis’ option and giving a gene name, MELK in this example, produces an analysis of MELK gene (Entrez Gene ID: 9833) expression per molecular subtype. (C) Returning to the analysis screen and selecting ‘Survival analysis by gene of interest’ for the MELK gene produces the survival figure for two sub-groups automatically defined by the median value of MELK gene expression. Risk group assignment is presented with RG1 (black) for low expression and RG2 (red) for high expression of MELK.
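The median-based grouping described for Figure 3 can be sketched in a few lines of Python. This is an illustration only, not the portal's implementation: the sample identifiers and values are made up, placing samples exactly at the median in the low group is an assumption, and the Kaplan-Meier estimator is a commonly used choice for survival curves that the source does not confirm.

```python
from statistics import median


def assign_risk_groups(expression):
    """Split samples at the median expression: RG1 = low, RG2 = high."""
    cutoff = median(expression.values())
    # Assumption: samples exactly at the median are placed in RG1 (low).
    return {sample: ("RG2" if value > cutoff else "RG1")
            for sample, value in expression.items()}


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time per sample
    events -- True if the event was observed, False if censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        observed = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1 - observed / at_risk
        curve.append((t, survival))
    return curve


# Made-up expression values for a gene of interest (illustrative only).
expression = {"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0}
groups = assign_risk_groups(expression)
print(groups)

# Made-up follow-up data for one risk group.
curve = kaplan_meier(times=[1, 2, 3], events=[True, False, True])
print(curve)
```

In practice each risk group would get its own curve, computed from the follow-up times and event indicators of the samples assigned to it.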