When can Multi-Site Datasets be Pooled for Regression? Hypothesis Tests, ℓ₂-consistency and Neuroscience Applications.

Hao Henry Zhou¹, Yilin Zhang¹, Vamsi K Ithapu¹, Sterling C Johnson¹,², Grace Wahba¹, Vikas Singh¹.

Abstract

Many studies in biomedical and health sciences involve small sample sizes due to logistical or financial constraints. Often, identifying weak (but scientifically interesting) associations between a set of predictors and a response necessitates pooling datasets from multiple diverse labs or groups. While there is a rich literature in statistical machine learning that addresses distributional shifts and inference in multi-site datasets, it is less clear when such pooling is guaranteed to help (and when it is not), independent of the inference algorithms we use. In this paper, we present a hypothesis test to answer this question for both classical and high-dimensional linear regression. We precisely identify regimes where pooling datasets across multiple sites is sensible, and show how such policy decisions can be made via simple checks executable on each site before any data transfer ever happens. With a focus on Alzheimer's disease studies, we present empirical results showing that in regimes suggested by our analysis, pooling a local dataset with data from an international study improves power.

Year: 2017 PMID: 31742253 PMCID: PMC6859896

Source DB: PubMed Journal: Proc Mach Learn Res
