Hannah De Los Santos (1,2), Kristin P Bennett (1,2,3), Jennifer M Hurley (4,5).
1. Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
2. Institute for Data Exploration and Applications, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
3. Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
4. Department of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
5. Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
Abstract
MOTIVATION: Circadian rhythms are approximately 24-h endogenous cycles that control many biological functions. To identify these rhythms, biological samples are taken over circadian time and analyzed using a single omics type, such as transcriptomics or proteomics. Comparisons of data from these single-omics approaches have shown that transcriptional rhythms are not necessarily conserved at the protein level, implying extensive circadian post-transcriptional regulation. However, proteomic methods are known to be noisier than transcriptomic methods, so some proteins previously classified as arrhythmic despite having rhythmic transcripts may have rhythms that were missed because of noise, rather than being products of post-transcriptional regulation.
RESULTS: To determine whether information from less noisy transcriptomic data can inform rhythm detection in noisier proteomic data, and thus identify rhythms in the proteome more accurately, we created the Multi-Omics Selection with Amplitude Independent Criteria (MOSAIC) application. MOSAIC combines model selection and joint modeling of multiple omics types to recover significant circadian and non-circadian trends. Using both synthetic data and proteomic data from Neurospora crassa, we showed that MOSAIC accurately recovers circadian rhythms at higher rates not only in the proteome but also in the transcriptome, outperforming existing methods for rhythm identification. Moreover, by quantifying non-circadian trends alongside circadian trends, our methodology allowed us to recognize the diversity of circadian regulation as compared to non-circadian regulation.
AVAILABILITY AND IMPLEMENTATION: MOSAIC's full interface is available at https://github.com/delosh653/MOSAIC. An R package for this functionality, mosaic.find, can be downloaded at https://CRAN.R-project.org/package=mosaic.find.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
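As a rough illustration of the model-selection idea mentioned in the abstract, the sketch below fits a linear trend and a fixed-period cosine to a single time course and keeps whichever has the lower AIC. This is not MOSAIC's implementation: the use of AIC, the two candidate models, the fixed 24-h period, and the names `fit_aic` and `select_model` are all assumptions made for illustration, and the joint modeling of multiple omics types is not shown.

```python
import numpy as np

def fit_aic(design, y):
    """Ordinary least-squares fit; return AIC under Gaussian errors."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    n, k = design.shape
    rss = float(resid @ resid)
    # AIC up to an additive constant; the +1 counts the noise variance.
    return n * np.log(rss / n) + 2 * (k + 1)

def select_model(t, y, period=24.0):
    """Compare a linear trend with a fixed-period cosine (hypothetical helper)."""
    w = 2 * np.pi / period
    ones = np.ones_like(t)
    designs = {
        # Non-circadian candidate: intercept + slope.
        "linear": np.column_stack([ones, t]),
        # Circadian candidate: cos/sin pair encodes amplitude and phase linearly.
        "cosine": np.column_stack([ones, np.cos(w * t), np.sin(w * t)]),
    }
    scores = {name: fit_aic(X, y) for name, X in designs.items()}
    return min(scores, key=scores.get), scores

# Synthetic time courses sampled every 2 h over 48 h (illustrative data only).
rng = np.random.default_rng(0)
t = np.arange(0, 48, 2.0)
rhythmic = 1.0 + 0.8 * np.cos(2 * np.pi * t / 24 - 1.0) + rng.normal(0, 0.1, t.size)
trending = 0.5 + 0.05 * t + rng.normal(0, 0.1, t.size)

best_rhythmic, _ = select_model(t, rhythmic)
best_trending, _ = select_model(t, trending)
print(best_rhythmic, best_trending)
```

Writing the cosine as a cos/sin pair keeps both candidates linear in their parameters, so one least-squares routine serves for both; a free period or damping term would require nonlinear fitting instead.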