Patryk Orzechowski1,2, Krzysztof Boryczko3, Jason H Moore1.
Abstract
Biclustering is a technique of discovering local similarities within data. For many years the complexity of the methods and parallelization issues limited its application to big data problems. With the development of novel scalable methods, biclustering has finally started to close this gap. In this paper we discuss the caveats of biclustering and present its current challenges and guidelines for practitioners. We also try to explain why biclustering may soon become one of the standards for big data analytics.
Keywords: biclustering; big data; biomarker detection; co-clustering; data mining; disease subtype identification; gene-drug interaction; parallel algorithms; precision medicine
Year: 2019 PMID: 31251324 PMCID: PMC6598466 DOI: 10.1093/gigascience/giz078
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1. Different patterns in biclustering. The original patterns were sorted first by rows and then by columns for visualization purposes. In biclusters with a constant or upregulated pattern, all values are identical. Constant-row (constant-column) patterns have the same value across all columns (rows) of the bicluster, although the values may differ between rows (columns). In a bicluster with a shift pattern, the contribution of a given row is added to the contribution of a given column, whereas in a scale pattern the row contribution is multiplied by the column contribution. In a shift-scale pattern, each row contributes twice: through a factor that multiplies the column contribution and through an additive term that further shifts the values. In plaid patterns, the data are modeled as a sum of multiple layers. Note that all of these patterns can be considered order-preserving.
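The patterns described in the caption can be illustrated with a minimal NumPy sketch. It is not taken from the paper: the array names, the numeric values, and the helper `is_order_preserving` are hypothetical, and the row factors are assumed to be positive so that scaling does not reverse the column order.

```python
import numpy as np

# Hypothetical row and column contributions (illustrative values only).
rows = np.array([1.0, 2.0, 3.0])             # one contribution per row (assumed positive)
cols = np.array([10.0, 20.0, 30.0, 40.0])    # one contribution per column
row_shift = np.array([0.5, -1.0, 2.0])       # additive per-row term for shift-scale

constant = np.full((3, 4), 7.0)                          # every value identical
constant_rows = np.repeat(rows[:, None], 4, axis=1)      # same value across a row
constant_cols = np.repeat(cols[None, :], 3, axis=0)      # same value down a column
shift = rows[:, None] + cols[None, :]                    # a_ij = r_i + c_j
scale = rows[:, None] * cols[None, :]                    # a_ij = r_i * c_j
shift_scale = rows[:, None] * cols[None, :] + row_shift[:, None]  # a_ij = r_i * c_j + s_i

# Plaid: the data are a sum of layers, here two overlapping blocks.
layer_a = np.zeros((4, 5))
layer_a[:3, :3] = 2.0
layer_b = np.zeros((4, 5))
layer_b[1:, 2:] = 5.0
plaid = layer_a + layer_b


def is_order_preserving(m):
    """Return True if one column ordering makes every row non-decreasing."""
    order = np.argsort(m[0], kind="stable")   # ordering induced by the first row
    return bool(np.all(np.diff(m[:, order], axis=1) >= 0))


# The check is applied to the single-layer patterns only.
single_layer = [constant, constant_rows, constant_cols, shift, scale, shift_scale]
all_ordered = all(is_order_preserving(m) for m in single_layer)
```

With positive row factors, every single-layer pattern above shares one column ordering across its rows, which is the sense in which the caption calls them order-preserving. A negative factor in `rows` would reverse the ordering for that row in the scale and shift-scale cases.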