Big data in medical science--a biostatistical view.

Harald Binder, Maria Blettner.

Abstract

BACKGROUND: Inexpensive techniques for measurement and data storage now enable medical researchers to acquire far more data than can conveniently be analyzed by traditional methods. The expression "big data" refers to quantities on the order of magnitude of a terabyte (10^12 bytes); special techniques must be used to evaluate such huge quantities of data in a scientifically meaningful way. Whether data sets of this size are useful and important is an open question that currently confronts medical science.
METHODS: In this article, we give illustrative examples of the use of analytical techniques for big data and discuss them in the light of a selective literature review. We point out some critical aspects that should be considered to avoid errors when large amounts of data are analyzed.
RESULTS: Machine learning techniques enable the recognition of potentially relevant patterns. When such techniques are used, certain additional steps should be taken that are unnecessary in more traditional analyses; for example, patient characteristics should be differentially weighted. If this is not done as a preliminary step before similarity detection, which is a component of many data analysis operations, characteristics such as age or sex will be weighted no higher than any one out of 10 000 gene expression values. Experience from the analysis of conventional observational data sets can be called upon to draw conclusions about potential causal effects from big data sets.
CONCLUSION: Big data techniques can be used, for example, to evaluate observational data derived from the routine care of entire populations, with clustering methods used to analyze therapeutically relevant patient subgroups. Such analyses can provide complementary information to clinical trials of the classic type. As big data analyses become more popular, various statistical techniques for causality analysis in observational data are becoming more widely available. This is likely to be of benefit to medical science, but specific adaptations will have to be made according to the requirements of the applications.
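
The weighting issue raised under RESULTS can be illustrated with a small numerical sketch. It is not the authors' procedure: the patients are simulated, all features are assumed to be standardized already, and the helper names (`weighted_distance`, `block_weights`, `clinical_fraction`) as well as the 50/50 split between clinical and gene expression features are hypothetical choices made only for illustration. The sketch compares how much two clinical characteristics (age, sex) contribute to a patient-to-patient distance when every feature has the same weight versus when the clinical block is given half of the total weight.

```python
import math
import random


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def block_weights(n_clinical, n_genes, clinical_share=0.5):
    """Per-feature weights that give the clinical block `clinical_share`
    of the total weight; all weights sum to 1 (illustrative assumption)."""
    w_clin = clinical_share / n_clinical
    w_gene = (1.0 - clinical_share) / n_genes
    return [w_clin] * n_clinical + [w_gene] * n_genes


def clinical_fraction(a, b, weights, n_clinical):
    """Fraction of the squared weighted distance due to the clinical features."""
    terms = [w * (x - y) ** 2 for x, y, w in zip(a, b, weights)]
    total = sum(terms)
    return sum(terms[:n_clinical]) / total if total > 0 else 0.0


random.seed(0)
N_CLINICAL, N_GENES = 2, 10_000  # age and sex, plus 10 000 expression values


def simulated_patient(age_z, sex):
    """Simulated patient: standardized age, sex coded 0/1, random expression."""
    genes = [random.gauss(0.0, 1.0) for _ in range(N_GENES)]
    return [age_z, sex] + genes


# Two patients who differ clearly in age and sex.
p1 = simulated_patient(age_z=-1.0, sex=0.0)
p2 = simulated_patient(age_z=1.0, sex=1.0)

equal = [1.0] * (N_CLINICAL + N_GENES)         # every feature counts the same
balanced = block_weights(N_CLINICAL, N_GENES)  # clinical block gets half the weight

frac_equal = clinical_fraction(p1, p2, equal, N_CLINICAL)
frac_balanced = clinical_fraction(p1, p2, balanced, N_CLINICAL)

print(f"clinical share of distance, equal weights:    {frac_equal:.4f}")
print(f"clinical share of distance, balanced weights: {frac_balanced:.4f}")
```

With equal weights, age and sex are swamped by the gene expression values, so any clustering or nearest-neighbour step built on that distance effectively ignores them. How much weight the clinical block should actually receive depends on the application and is not something this sketch can settle.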

Year:  2015        PMID: 25797506      PMCID: PMC4381554          DOI: 10.3238/arztebl.2015.0137

Source DB:  PubMed          Journal:  Dtsch Arztebl Int        ISSN: 1866-0452            Impact factor:   5.594


