BACKGROUND: Biclustering is a popular method for identifying the experimental conditions under which biological signatures are co-expressed. However, the general biclustering problem is NP-hard, offering room to focus algorithms on specific biological tasks. We hypothesize that conditional co-regulation of genes is a key factor in determining cell phenotype and that accurately segregating conditions in biclusters will improve phenotype predictions. Thus, we developed a bicluster sampled coherence metric (BSCM) for determining which conditions and signals should be included in a bicluster. RESULTS: Our BSCM calculates condition- and cluster-size-specific p-values, and we incorporated these into the popular integrated biclustering algorithm cMonkey. We demonstrate that incorporating our new algorithm significantly improves bicluster co-regulation scores (p-value = 0.009) and GO annotation scores (p-value = 0.004). Additionally, we used a bicluster-based signal to predict whether a given experimental condition will result in yeast peroxisome induction. With the new algorithm, classifier accuracy improves from 41.9% to 76.1%. CONCLUSIONS: We demonstrate that the proposed BSCM helps determine which signals ought to be co-clustered, resulting in more accurately assigned bicluster membership. Furthermore, we show that BSCM can be extended to more accurately detect the experimental conditions under which the genes are co-clustered. Features derived from this more accurate analysis of conditional regulation result in a dramatic improvement in the ability to predict a cellular phenotype in yeast. The latest cMonkey is available for download at https://github.com/baliga-lab/cmonkey2. The experimental data and source code featured in this paper are available at http://AitchisonLab.com/BSCM. BSCM has been incorporated in the official cMonkey release.
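To make the idea of a condition- and cluster-size-specific p-value concrete, the following is a minimal sketch and not the BSCM implementation shipped with cMonkey. The abstract does not give the exact statistic, so this sketch assumes that coherence is measured as the variance of expression across genes within one condition, and that the background is built by sampling random gene sets of the same size as the cluster. The function name `sampled_coherence_pvalue`, its parameters, and the toy data are all hypothetical.

```python
import random
import statistics


def sampled_coherence_pvalue(expr, cluster_genes, condition, n_samples=1000, seed=0):
    """Empirical p-value for how coherent a gene set is in one condition.

    expr: dict mapping gene name -> list of expression values (one per condition).
    cluster_genes: genes in the candidate bicluster.
    condition: index of the condition (column) being tested.
    Assumption: coherence = population variance across genes (lower is tighter).
    """
    rng = random.Random(seed)
    all_genes = sorted(expr)
    k = len(cluster_genes)

    # Observed spread of the cluster's genes in this condition.
    observed = statistics.pvariance([expr[g][condition] for g in cluster_genes])

    # Background: random gene sets of the SAME size in the SAME condition,
    # which is what makes the p-value condition- and size-specific.
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(all_genes, k)
        if statistics.pvariance([expr[g][condition] for g in sample]) <= observed:
            hits += 1

    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_samples + 1)


# Toy data (hypothetical): 20 genes, 2 conditions.
# Genes g0..g4 agree perfectly in condition 0 but are scattered in condition 1.
expr = {}
for i in range(20):
    cond0 = 1.0 if i < 5 else float(i)
    cond1 = float((i * 7) % 20)
    expr[f"g{i}"] = [cond0, cond1]

cluster = [f"g{i}" for i in range(5)]
p_tight = sampled_coherence_pvalue(expr, cluster, condition=0)
p_loose = sampled_coherence_pvalue(expr, cluster, condition=1)
print(p_tight, p_loose)
```

In this toy setup the cluster receives a small p-value in condition 0 and a much larger one in condition 1, so under a threshold-based rule only condition 0 would be kept in the bicluster. Recomputing the background for each condition and each cluster size is the design choice the abstract describes; the variance statistic and the thresholding step are assumptions made only for illustration.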