Big data uncertainties.

Pierre-André G Maugis.

Abstract

Big data, the idea that an ever-larger volume of information is constantly being recorded, suggests that new problems can now be subjected to scientific scrutiny. However, can classical statistical methods be used directly on big data? We analyze the problem by looking at two known pitfalls of big datasets. First, they are biased, in the sense that they do not offer a complete view of the populations under consideration. Second, they present a weak but pervasive level of dependence between all their components. In both cases we observe that the uncertainty of conclusions obtained by statistical methods increases when those methods are applied to big data, either because of a systematic error (bias) or because of a larger degree of randomness (increased variance). We argue that the key challenge raised by big data is not only how to use big data to tackle new problems, but also how to develop tools and methods able to rigorously articulate the new risks therein.
Copyright © 2016. Published by Elsevier Ltd.
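
The two pitfalls named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the equicorrelation model, the function names `variance_of_mean` and `biased_sample_mean`, and the recording rule used in the demo are all assumptions chosen for illustration. Under equicorrelation, the variance of a sample mean is σ²/n · (1 + (n − 1)ρ), which tends to σ²ρ rather than to zero as n grows. Under value-dependent recording, the sample mean settles on the wrong value no matter how many records are collected.

```python
import random
import statistics


def variance_of_mean(n, sigma2, rho):
    """Variance of the mean of n equicorrelated observations.

    Assumption (illustrative, not from the paper): every observation has
    variance sigma2 and every pair of observations has correlation rho.
    """
    return sigma2 / n * (1 + (n - 1) * rho)


def biased_sample_mean(population, n, keep_prob, rng):
    """Mean of n recorded units when recording depends on the unit's value.

    keep_prob(x) is a hypothetical recording probability for a unit with
    value x; units that are not recorded never enter the dataset.
    """
    sample = []
    while len(sample) < n:
        x = rng.choice(population)
        if rng.random() < keep_prob(x):
            sample.append(x)
    return statistics.fmean(sample)


if __name__ == "__main__":
    # Weak dependence: the variance stops shrinking once n is large.
    for n in (100, 10_000, 1_000_000):
        print(n, variance_of_mean(n, 1.0, 0.0), variance_of_mean(n, 1.0, 0.01))

    # Selection bias: positive values are recorded more often than others.
    rng = random.Random(0)
    population = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
    recorded = biased_sample_mean(
        population, 20_000, lambda x: 0.9 if x > 0 else 0.3, rng
    )
    print("population mean:", statistics.fmean(population))
    print("recorded mean:", recorded)
```

In this sketch a correlation as small as 0.01 puts a floor of about 0.01 under the variance of the mean, and the recorded mean stays away from the population mean even with tens of thousands of records. Both effects are invisible to a standard error computed as if the data were independent and complete.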

Keywords:  Big data; Selection bias; Sparse dependence; Statistical inference

Year:  2016        PMID: 29801956     DOI: 10.1016/j.jflm.2016.09.005

Source DB:  PubMed          Journal:  J Forensic Leg Med        ISSN: 1752-928X            Impact factor:   1.614


  2 in total

1.  Medical Big Data Is Not Yet Available: Why We Need Realism Rather than Exaggeration. (Review)

Authors:  Hun Sung Kim; Dai Jin Kim; Kun Ho Yoon
Journal:  Endocrinol Metab (Seoul)       Date:  2019-12

2.  Uncertainty in and around biophysical modelling: insights from interdisciplinary research on agricultural digitalization.

Authors:  M Espig; S C Finlay-Smits; E D Meenken; D M Wheeler; M Sharifi
Journal:  R Soc Open Sci       Date:  2020-12-23       Impact factor: 2.963
