| Literature DB >> 26355519 |
Mohammad Shahrokh Esfahani, Edward R Dougherty.
Abstract
Small samples are commonplace in genomic/proteomic classification, the result being inadequate classifier design and poor error estimation. The problem has recently been addressed by utilizing prior knowledge in the form of a prior distribution on an uncertainty class of feature-label distributions. A critical issue remains: how to incorporate biological knowledge into the prior distribution. For genomics/proteomics, the most common kind of knowledge is in the form of signaling pathways. Thus, it behooves us to find methods of transforming pathway knowledge into knowledge of the feature-label distribution governing the classification problem. In this paper, we address the problem of prior probability construction by proposing a series of optimization paradigms that utilize the incomplete prior information contained in pathways (both topological and regulatory). The optimization paradigms employ the marginal log-likelihood, established using a small number of feature-label realizations (sample points) regularized with the prior pathway information about the variables. In the special case of a Normal-Wishart prior distribution on the mean and inverse covariance matrix (precision matrix) of a Gaussian distribution, these optimization problems become convex. Companion website: gsp.tamu.edu/Publications/supplementary/shahrokh13a.Entities:
Mesh:
Year: 2014 PMID: 26355519 DOI: 10.1109/TCBB.2013.143
Source DB: PubMed Journal: IEEE/ACM Trans Comput Biol Bioinform ISSN: 1545-5963 Impact factor: 3.710