Neil D Clarke, Guillaume Bourque.
Abstract
Our group produced the best predictions overall in the DREAM3 signaling response challenge, ranking first by a substantial margin in the cytokine sub-challenge and nearly tying for best in the phosphoprotein sub-challenge. We achieved this success using a simple interpolation strategy. For each combination of a stimulus and an inhibitor for which predictions were required, we noted that there were six other datasets using the same stimulus (but different inhibitor treatments) and six other datasets using the same inhibitor (but different stimuli). Therefore, for each treatment combination for which values were to be predicted, we calculated rank correlations for the data that were in common between the treatment combination and each of the 12 related combinations. The data from the 12 related combinations were then used to calculate the missing values, weighting the contribution from each experiment by its rank correlation coefficient. The success of this simple method suggests that the missing data were largely over-determined by similarities in the treatments. We offer some thoughts on the current state and future development of DREAM that are based on our success in this challenge, our success in the earlier DREAM2 transcription factor target challenge, and our experience as the data provider for the gene expression challenge in DREAM3.
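The interpolation strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`rank`, `spearman`, `predict_missing`), the position-based tie-breaking in the ranking, the clipping of negative correlations to zero, and the fallback to an unweighted mean are all assumptions made for illustration. The abstract states only that contributions were weighted by rank correlation coefficients.

```python
import numpy as np

def rank(values):
    """Return 0-based ranks; ties are broken by position (a simplification)."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    return float(np.corrcoef(rank(x), rank(y))[0, 1])

def predict_missing(target, related):
    """Fill NaNs in `target` with a correlation-weighted average of `related`.

    target  : 1-D array with NaN where values must be predicted
    related : 2-D array, one row per related experiment, same column order
    """
    target = np.asarray(target, dtype=float)
    related = np.asarray(related, dtype=float)
    known = ~np.isnan(target)
    # Correlate each related experiment with the target over the shared data.
    weights = np.array([spearman(target[known], row[known]) for row in related])
    # Assumption: negatively correlated experiments contribute nothing.
    weights = np.clip(weights, 0.0, None)
    if weights.sum() == 0:
        # Assumption: fall back to an unweighted mean if no weight is positive.
        weights = np.ones(len(related))
    filled = target.copy()
    filled[~known] = weights @ related[:, ~known] / weights.sum()
    return filled
```

In the challenge setting, `related` would hold the 12 experiments that share either the stimulus or the inhibitor with the combination being predicted.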
Year: 2010 PMID: 20126276 PMCID: PMC2811179 DOI: 10.1371/journal.pone.0008417
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Visualization of the data provided to predictors for the phosphoprotein sub-challenge.
Phosphoprotein levels were normalized such that values above the median of all values are yellow and those below the median are red. Each column is one of the phosphoproteins, clustered based on similarity in expression. Rows correspond to experiments, sorted in an arbitrary hierarchical manner (cell type, time point, stimulus type, and inhibitor type). The white rows that appear to subdivide the dataset represent the missing data to be predicted.
Figure 2. Determination of weights for calculating the weighted averages of similar experiments.
(A) Example of how correlations between inhibitors and stimuli were calculated. The two colored columns represent the vectors of phosphoprotein values obtained under all experimental conditions, sorted in an arbitrary but defined way. In the case of the mTOR inhibitor, data for the IGF-I stimulus are missing; these data are to be predicted. Similarly, in the case of the MEK inhibitor, data for the IFNg stimulus are missing. The data in common (dashed box) were used to calculate the Spearman rank correlation coefficient. (B) Graphic representation of the normalized correlation coefficients relating inhibitors (top) and stimuli (bottom). The matrices are asymmetric because correlation coefficients were normalized separately for each inhibitor (stimulus), setting the maximum in a row to 1 (yellow) and the minimum to 0 (black). Other values were scaled linearly between the minimum and maximum correlation coefficients in the row.
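The row-wise normalization described in panel (B) can be sketched as follows. This is an illustrative reading of the caption, not the authors' implementation: the function name `normalize_rows` is hypothetical, and the caption does not say how self-correlations or rows with identical values were handled, so the sketch rescales whatever matrix it is given and maps a constant row to zeros.

```python
import numpy as np

def normalize_rows(corr):
    """Rescale each row linearly so its minimum is 0 and its maximum is 1."""
    corr = np.asarray(corr, dtype=float)
    lo = corr.min(axis=1, keepdims=True)
    hi = corr.max(axis=1, keepdims=True)
    # Assumption: a row with no spread is mapped to zeros, not divided by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (corr - lo) / span

# A symmetric input becomes asymmetric, because each row has its own min and max.
corr = np.array([
    [0.9, 0.5, 0.1],
    [0.5, 0.8, 0.6],
    [0.1, 0.6, 0.7],
])
weights = normalize_rows(corr)
```

Here `weights[0, 1]` and `weights[1, 0]` differ even though `corr[0, 1]` equals `corr[1, 0]`, which is the asymmetry the caption refers to.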