José Caldas, Nils Gehlenborg, Eeva Kettunen, Ali Faisal, Mikko Rönty, Andrew G Nicholson, Sakari Knuutila, Alvis Brazma, Samuel Kaski.
Abstract
MOTIVATION: Genome-wide measurement of transcript levels is a ubiquitous tool in biomedical research. As experimental data continue to be deposited in public databases, it is becoming important to develop search engines that enable the retrieval of relevant studies given a query study. While retrieval systems based on meta-data already exist, data-driven approaches that retrieve studies based on similarities in the expression data itself have greater potential to uncover novel biological insights.
Year: 2011 PMID: 22106335 PMCID: PMC3259436 DOI: 10.1093/bioinformatics/btr634
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Flowchart outlining the key steps of the REx information retrieval framework. ‘MSigDB’ is the Molecular Signatures Database; ‘NDCG’ is the normalized discounted cumulative gain measure.
Fig. 2. Plate diagram of the proposed graphical model. Rectangles indicate sets of variables, with the cardinality of the set marked in the bottom right corner. Gray nodes correspond to observed data.
Fig. 3. Bar plots of MPM versus pleura log-ratio gene expression values obtained via RT-PCR. The heights of the bars represent the log-ratio expression of the corresponding genes, and error bars indicate SD.
Fig. 4. Data-driven retrieval performance, NDCG results. The box plots summarize the distribution of NDCG results for 219 interpretable query comparisons. ‘LDA’ corresponds to our earlier method (Caldas ).
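The NDCG measure named in Figs 1 and 4 can be illustrated with a minimal sketch. The record does not say which NDCG variant the authors used, so the form below (linear gain, log2 rank discount, normalization by the ideal ordering of the same list) is an assumption. The function names `dcg` and `ndcg` and the example relevance scores are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores.

    Assumed variant: linear gain with a log2(rank + 1) discount,
    where rank starts at 1 for the top-ranked result.
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering.

    Returns 0.0 when no retrieved item has positive relevance,
    which avoids division by zero.
    """
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical relevance scores of retrieved comparisons, in ranked order.
perfect_ranking = [3, 2, 1]
reversed_ranking = [1, 2, 3]
```

A ranking that places the most relevant results first scores 1, and any other ordering of the same results scores lower, so the box plots in Fig. 4 can be read as distributions of such per-query scores.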