Christine Choirat (1), Danielle Braun (2, 3), Marianthi-Anna Kioumourtzoglou (4). Affiliations: (1) Swiss Data Science Center, ETH Zurich and EPFL, Switzerland. (2) Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA. (3) Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA. (4) Department of Environmental Health Sciences, Columbia University Mailman School of Public Health, New York, NY.
Abstract
PURPOSE OF REVIEW: Data science is a rapidly growing trans-disciplinary field that aims to harness the power of data to gain information or insights on researcher-defined topics of interest. In this paper, we review how data science can help advance environmental health research. RECENT FINDINGS: We discuss the concepts of computationally scalable handling of Big Data and the design of efficient research data platforms, and how data science can provide solutions for methodological challenges in environmental health research, such as high-dimensional outcomes and exposures, and prediction models. Finally, we discuss tools for reproducible research. SUMMARY: In this paper, we present opportunities to improve environmental health research capabilities by embracing data science, and the pitfalls that environmental health researchers should avoid when employing data science approaches. Throughout the paper, we emphasize the need for environmental health researchers to collaborate more closely with biostatisticians and data scientists to ensure robust and interpretable results.
Keywords:
Big Data; Data Science; Environmental Health Research; Environmental Mixtures; High-Dimensional; Reproducibility; Research Data Platforms