Abstract
The application of statistics has been instrumental in clarifying our understanding of the genome. While insights have been derived for almost all levels of genome function, statistics has had its greatest impact on improving our knowledge of transcriptional regulation. But the drive to extract the most meaningful inferences from big data can often lead us to overlook the fundamental role that statistics plays, and specifically, the basic assumptions that we make about big data. Normality is a statistical property that is often assumed without our being consciously aware of it. This review highlights the inherent value of non-normal distributions to big data analysis by discussing use cases of non-normality that focus on gene expression data. Collectively, these examples help to motivate why, now more than ever, non-normality is important for learning about gene regulation, transcriptomics, and more.
Keywords: Big data; Gene expression; Gene expression variability; Non-normality; Single-cell sequencing; Skewness
Year: 2019 PMID: 30617454 PMCID: PMC6381358 DOI: 10.1007/s12551-018-0494-4
Source DB: PubMed Journal: Biophys Rev ISSN: 1867-2450
Fig. 1 Distributions come in different shapes and sizes. a Normal distribution. b Gamma distribution. c Bimodal distribution
Fig. 2 Contrasting differential average gene expression against differential variability in gene expression. a Differential expression relies upon identifying significant genes with a large difference in average expression and a small amount of variance. b Two scenarios are shown demonstrating how changes in variability of gene expression could occur between two phenotypic groups
Fig. 3 Moments characterize different properties of the distribution. The first three moments are shown using a hypothetical gene expression distribution collected from a population of single cells
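The idea in the Fig. 3 caption can be illustrated with a minimal sketch, assuming the first three moments are taken in the usual sense of mean, variance, and skewness (the record itself does not spell this out). The function name `first_three_moments`, the gamma parameters, and the simulated values are all hypothetical and stand in for real single-cell expression measurements.

```python
import random
import statistics


def first_three_moments(values):
    """Return (mean, variance, skewness) for a sequence of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    # Second central moment (population variance).
    variance = sum((x - mean) ** 2 for x in values) / n
    # Third central moment, standardized by variance^(3/2) to give skewness.
    third = sum((x - mean) ** 3 for x in values) / n
    skewness = third / variance ** 1.5
    return mean, variance, skewness


# Assumption: simulated, right-skewed expression values for one gene
# across 5000 single cells, drawn from a gamma distribution.
rng = random.Random(0)
expression = [rng.gammavariate(2.0, 1.5) for _ in range(5000)]

mean, variance, skewness = first_three_moments(expression)
print(f"mean={mean:.2f} variance={variance:.2f} skewness={skewness:.2f}")
```

A skewness clearly above zero indicates a long right tail, which is one simple signal that a normality assumption may not hold for that gene.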