Hierarchical confounder discovery in the experiment-machine learning cycle.

Alex Rogozhnikov¹, Pavan Ramkumar¹, Rishi Bedi¹, Saul Kato¹,², G Sean Escola¹,³.

Abstract

The promise of machine learning (ML) to extract insights from high-dimensional datasets is tempered by confounding variables. It behooves scientists to determine if a model has extracted the desired information or instead fallen prey to bias. Due to features of natural phenomena and experimental design constraints, bioscience datasets are often organized in nested hierarchies that obfuscate the origins of confounding effects and render confounder amelioration methods ineffective. We propose a non-parametric statistical method called the rank-to-group (RTG) score that identifies hierarchical confounder effects in raw data and ML-derived embeddings. We show that RTG scores correctly assign the effects of hierarchical confounders when linear methods fail. In a public biomedical image dataset, we discover unreported effects of experimental design. We then use RTG scores to discover crossmodal correlated variability in a multi-phenotypic biological dataset. This approach should be generally useful in experiment-analysis cycles and to ensure confounder robustness in ML models.
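
The abstract describes the RTG score only at a high level. As a rough illustration of the general idea (a rank-based, non-parametric measure, related to the Mann-Whitney U test listed in the keywords), the sketch below scores how well distance rankings separate same-group samples from all others. It is not the paper's definition of the RTG score: the function name `rank_group_separation`, the Euclidean metric, and the averaging over samples are assumptions made for illustration, and the sketch does not handle nested hierarchies.

```python
from math import dist
from statistics import mean


def rank_group_separation(points, groups):
    """Illustrative rank-based separation score (hypothetical, not the paper's RTG).

    For each sample, compare its distances to same-group samples against its
    distances to other-group samples. The per-sample value is the normalized
    Mann-Whitney U statistic (equivalently an AUC): the fraction of
    (same, other) pairs in which the same-group sample is closer, with ties
    counted as one half. Returns the mean over samples, so values near 1.0
    mean the grouping strongly structures the data and values near 0.5 mean
    it does not.
    """
    per_sample = []
    for i, (p, g) in enumerate(zip(points, groups)):
        same, other = [], []
        for j, (q, h) in enumerate(zip(points, groups)):
            if j == i:
                continue  # never compare a sample with itself
            (same if h == g else other).append(dist(p, q))
        if not same or not other:
            continue  # score is undefined for this sample
        wins = sum((s < o) + 0.5 * (s == o) for s in same for o in other)
        per_sample.append(wins / (len(same) * len(other)))
    if not per_sample:
        raise ValueError("need at least two groups, one with two or more samples")
    return mean(per_sample)


# Usage: a grouping variable (for example, a batch label) that fully
# determines where samples fall in a 1-D embedding.
embedding = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
batch = ["a", "a", "a", "b", "b", "b"]
score = rank_group_separation(embedding, batch)
```

Because only the ordering of distances enters the score, it is unchanged by any monotonic rescaling of the distances, which is the sense in which such a measure is non-parametric.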

Entities: Chemical

Keywords: Mann-Whitney U test; bias; confounders; debiasing; experimental design; hierarchical confounders; machine learning; robustness; stem cell biology

Year: 2022 PMID: 35465234 PMCID: PMC9024009 DOI: 10.1016/j.patter.2022.100451

Source DB: PubMed Journal: Patterns (N Y) ISSN: 2666-3899

12 in total

1. A robust removing unwanted variation-testing procedure via γ-divergence.

Authors: Hung Hung
Journal: Biometrics Date: 2019-08-20 Impact factor: 2.571

2. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs.

Authors: Varun Gulshan; Lily Peng; Marc Coram; Martin C Stumpe; Derek Wu; Arunachalam Narayanaswamy; Subhashini Venugopalan; Kasumi Widner; Tom Madams; Jorge Cuadros; Ramasamy Kim; Rajiv Raman; Philip C Nelson; Jessica L Mega; Dale R Webster
Journal: JAMA Date: 2016-12-13 Impact factor: 56.272

3. Most Ligand-Based Classification Benchmarks Reward Memorization Rather than Generalization.

Authors: Izhar Wallach; Abraham Heifets
Journal: J Chem Inf Model Date: 2018-05-08 Impact factor: 4.956

4. Development and Validation of a Deep Learning System for Diabetic Retinopathy and Related Eye Diseases Using Retinal Images From Multiethnic Populations With Diabetes.

Authors: Daniel Shu Wei Ting; Carol Yim-Lui Cheung; Gilbert Lim; Gavin Siew Wei Tan; Nguyen D Quang; Alfred Gan; Haslina Hamzah; Renata Garcia-Franco; Ian Yew San Yeo; Shu Yen Lee; Edmund Yick Mun Wong; Charumathi Sabanayagam; Mani Baskaran; Farah Ibrahim; Ngiap Chuan Tan; Eric A Finkelstein; Ecosse L Lamoureux; Ian Y Wong; Neil M Bressler; Sobha Sivaprasad; Rohit Varma; Jost B Jonas; Ming Guang He; Ching-Yu Cheng; Gemmy Chui Ming Cheung; Tin Aung; Wynne Hsu; Mong Li Lee; Tien Yin Wong
Journal: JAMA Date: 2017-12-12 Impact factor: 56.272

5. Two-sample tests for comparing intra-individual genetic sequence diversity between populations.

Authors: Peter B Gilbert; A J Rossini; Raj Shankarappa
Journal: Biometrics Date: 2005-03 Impact factor: 2.571

6. Insights into the Mutational Burden of Human Induced Pluripotent Stem Cells from an Integrative Multi-Omics Approach.

Authors: Matteo D'Antonio; Paola Benaglio; David Jakubosky; William W Greenwald; Hiroko Matsui; Margaret K R Donovan; He Li; Erin N Smith; Agnieszka D'Antonio-Chronowska; Kelly A Frazer
Journal: Cell Rep Date: 2018-07-24 Impact factor: 9.423

7. Analysis of Transcriptional Variability in a Large Human iPSC Library Reveals Genetic and Non-genetic Determinants of Heterogeneity.

Authors: Ivan Carcamo-Orive; Gabriel E Hoffman; Paige Cundiff; Noam D Beckmann; Sunita L D'Souza; Joshua W Knowles; Achchhe Patel; Dimitri Papatsenko; Fahim Abbasi; Gerald M Reaven; Sean Whalen; Philip Lee; Mohammad Shahbazi; Marc Y R Henrion; Kuixi Zhu; Sven Wang; Panos Roussos; Eric E Schadt; Gaurav Pandey; Rui Chang; Thomas Quertermous; Ihor Lemischka
Journal: Cell Stem Cell Date: 2016-12-22 Impact factor: 25.269

8. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study.

Authors: John R Zech; Marcus A Badgeley; Manway Liu; Anthony B Costa; Joseph J Titano; Eric Karl Oermann
Journal: PLoS Med Date: 2018-11-06 Impact factor: 11.069

9. Resolving challenges in deep learning-based analyses of histopathological images using explanation methods.

Authors: Miriam Hägele; Philipp Seegerer; Sebastian Lapuschkin; Michael Bockmayr; Wojciech Samek; Frederick Klauschen; Klaus-Robert Müller; Alexander Binder
Journal: Sci Rep Date: 2020-04-14 Impact factor: 4.379

10. Addressing variability in iPSC-derived models of human disease: guidelines to promote reproducibility. (Review)

Authors: Viola Volpato; Caleb Webber
Journal: Dis Model Mech Date: 2020-01-17 Impact factor: 5.758

1 in total

1. Data science, human intelligence, and therapeutics discovery: An interview with Sean Escola, Saul Kato, and Pavan Ramkumar.

Authors: Pavan Ramkumar; Saul Kato; G Sean Escola
Journal: Patterns (N Y) Date: 2022-04-08