Elizabeth Siewert, Katerina J Kechris.
Abstract
Although genome-wide expression data sets from multiple species are now more commonly generated, there have been few studies on how to best integrate this type of correlated data into models. Starting with a single-species, linear regression model that predicts transcription factor binding sites as a case study, we investigated how best to take into account the correlated expression data when extending this model to multiple species. Using a multivariate regression model, we accounted for the phylogenetic relationships among the species in two ways: (i) a repeated-measures model, where the error term is constrained; and (ii) a Bayesian hierarchical model, where the prior distributions of the regression coefficients are constrained. We show that both multiple-species models improve predictive performance over the single-species model. When compared with each other, the repeated-measures model outperformed the Bayesian model. We suggest a possible explanation for the better performance of the model with the constrained error term.
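As a rough illustration of the repeated-measures idea only, and not the authors' implementation, the sketch below fits a regression in which each gene's errors are correlated across species. It makes several assumptions that the abstract does not state: the function name `gls_repeated_measures` is hypothetical, a single intercept and slope are shared by all species, and the between-species error correlation matrix `R` (for example, one derived from a phylogeny) is treated as known. The estimate is ordinary generalized least squares, obtained by whitening with the Cholesky factor of `R`.

```python
import numpy as np


def gls_repeated_measures(X, Y, R):
    """Generalized least squares with errors correlated across species.

    X, Y : arrays of shape (n_genes, n_species); predictor and expression.
    R    : (n_species, n_species) positive-definite error correlation
           matrix, ASSUMED known here (e.g. derived from a phylogeny).
    Returns the estimates [intercept, slope], shared across species
    (a simplifying assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n_genes, n_species = Y.shape

    # Cholesky factor L with R = L @ L.T; multiplying by inv(L) decorrelates
    # the per-gene error vector.
    L = np.linalg.cholesky(np.asarray(R, dtype=float))

    def whiten(A):
        # Apply inv(L) to each gene's vector of species values.
        return np.linalg.solve(L, A.T).T

    # Whitened design matrix: intercept column and predictor column.
    design = np.stack(
        [whiten(np.ones((n_genes, n_species))).ravel(), whiten(X).ravel()],
        axis=1,
    )
    response = whiten(Y).ravel()

    # Ordinary least squares on the whitened data equals GLS on the original.
    beta, *_ = np.linalg.lstsq(design, response, rcond=None)
    return beta
```

With `R` set to the identity matrix this reduces to ordinary least squares on the pooled data, which is one way to see that the constraint enters only through the error term.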
Keywords: evolution; expression; multi-species; multivariate modeling; transcription
Year: 2013 PMID: 23703923 PMCID: PMC4964853 DOI: 10.1002/sim.5850
Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.373