Yu Zhang1,2, Juan Xie3,4, Jinyu Yang3,4, Anne Fennell4,5, Chi Zhang6, Qin Ma3,4,5.
1. College of Computer Science and Technology, Jilin University, Changchun, China.
2. Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun, China.
3. Department of Mathematics and Statistics, South Dakota State University, Brookings, SD, USA.
4. Department of Agronomy, Horticulture and Plant Science, South Dakota State University, Brookings, SD, USA.
5. BioSNTR, Brookings, SD, USA.
6. Center for Computational Biology and Bioinformatics and Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA.
Abstract
Motivation: Biclustering is widely used to identify genes that are co-expressed under subsets of the conditions in a large-scale transcriptomic dataset. The QUBIC program is recognized as one of the most efficient and effective biclustering methods for biological data interpretation. However, it is available only as a C implementation and through a low-throughput web interface.
Results: An R implementation of QUBIC is presented here with two unique features: (i) an average efficiency improvement of 82%, achieved by refactoring and optimizing the C source code of QUBIC; and (ii) a comprehensive set of functions to facilitate biclustering-based biological studies, including qualitative representation (discretization) of expression data, query-based biclustering, bicluster expansion, bicluster comparison, heatmap visualization of any identified bicluster and co-expression network elucidation.
Availability and Implementation: The package is implemented in R (as of version 3.3) and is available from Bioconductor at http://bioconductor.org/packages/QUBIC, where installation and usage instructions can be found.
Contact: qin.ma@sdstate.edu
Supplementary Information: Supplementary data are available at Bioinformatics online.