Jason T Serviss (1), Jesper R Gådin (2), Per Eriksson (2), Lasse Folkersen (2, 3), Dan Grandér (1).
(1) Department of Oncology and Pathology, Karolinska University Hospital Solna, Cancer Center Karolinska, Stockholm, Sweden.
(2) Department of Medicine, Cardiovascular Medicine Unit, Karolinska University Hospital Solna, Center for Molecular Medicine, Stockholm, Sweden.
(3) Department of Bioinformatics, Technical University of Denmark, Copenhagen, Denmark.
Abstract
SUMMARY: Multi-dimensional data generated via high-throughput experiments are increasingly used in conjunction with dimensionality reduction methods to ascertain whether the resulting separations of the data correspond to known classes. This is particularly useful for determining whether a subset of the variables, e.g. genes in a specific pathway, can alone separate samples into these established classes. Despite this, the evaluation of class separations is often subjective and performed via visualization. Here we present the ClusterSignificance package, a set of tools designed to assess the statistical significance of class separations downstream of dimensionality reduction algorithms. In addition, we demonstrate the design and utility of the ClusterSignificance package and use it to determine the importance of long non-coding RNA expression in the identity of multiple hematological malignancies. AVAILABILITY AND IMPLEMENTATION: ClusterSignificance is an R package available via Bioconductor (https://bioconductor.org/packages/ClusterSignificance) under GPL-3. CONTACT: dan.grander@ki.se. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
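The general idea of testing whether a class separation seen after dimensionality reduction is stronger than chance can be illustrated with a label permutation test. The sketch below is an illustration only and is not the ClusterSignificance algorithm or its API: it is written in Python rather than R, it assumes exactly two classes, and it uses the distance between class centroids as a stand-in separation score. The function names `centroid`, `separation` and `permutation_pvalue` are hypothetical.

```python
import math
import random


def centroid(points):
    """Return the coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def separation(points, labels):
    """Euclidean distance between the centroids of the two classes.

    This score is an assumption made for illustration; other separation
    measures could be substituted without changing the permutation logic.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    groups = [[p for p, l in zip(points, labels) if l == c] for c in classes]
    return math.dist(centroid(groups[0]), centroid(groups[1]))


def permutation_pvalue(points, labels, iterations=1000, seed=0):
    """Estimate how often shuffled labels separate at least as well as the real ones."""
    rng = random.Random(seed)
    observed = separation(points, labels)
    shuffled = list(labels)
    at_least = 0
    for _ in range(iterations):
        # Shuffling keeps the class sizes fixed and breaks any link
        # between a sample's position and its class.
        rng.shuffle(shuffled)
        if separation(points, shuffled) >= observed:
            at_least += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least + 1) / (iterations + 1)


# Hypothetical 2-D coordinates, e.g. the first two components after reduction.
reduced = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (0.0, 0.3), (0.4, 0.2),
           (5.1, 5.0), (4.8, 5.3), (5.2, 4.9), (5.0, 5.1), (4.9, 4.7)]
known_classes = ["A"] * 5 + ["B"] * 5

p_value = permutation_pvalue(reduced, known_classes)
print(f"permutation p-value: {p_value:.4f}")
```

A small p-value here indicates that few random relabelings separate the samples as well as the known classes do, which replaces a visual judgment with a number. The number of iterations bounds the smallest p-value that can be reported.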
Authors: C A Staunton; E D Owen; K Hemmings; A Vasilaki; A McArdle; R Barrett-Jolley; M J Jackson Journal: Skelet Muscle Date: 2022-01-29 Impact factor: 4.912
Authors: Sam T M Ball; Numan Celik; Elaheh Sayari; Lina Abdul Kadir; Fiona O'Brien; Richard Barrett-Jolley Journal: PLoS One Date: 2022-05-10 Impact factor: 3.752