
Conservation machine learning: a case study of random forests.

Moshe Sipper, Jason H Moore.

Abstract

Conservation machine learning conserves models across runs, users, and experiments, and puts them to good use. We have previously shown the merit of this idea through a small-scale preliminary experiment involving a single dataset source, 10 datasets, and a single so-called cultivation method, which is used to produce the final ensemble. In this paper, focusing on classification tasks, we perform extensive experimentation with conservation random forests, involving 5 cultivation methods (including a novel one introduced herein, lexigarden), 6 dataset sources, and 31 datasets. We show that significant improvement can be attained by making use of models we already possess anyway, and envisage the possibility of repositories of models (not merely datasets, solutions, or code), which could be made available to everyone, thus having conservation live up to its name and furthering the cause of data and computational science.
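The core idea in the abstract, keeping the models from every run and combining them afterwards, can be illustrated with a minimal sketch. It is not the authors' implementation: decision stumps stand in for random-forest trees so that only the standard library is needed, the toy dataset and all names (`jungle`, `train_stump`, `majority_vote`) are hypothetical, and the plain majority vote over all conserved models is a simple aggregation chosen for illustration rather than any of the five cultivation methods studied in the paper.

```python
import random

def make_data(n, rng):
    """Toy binary task (an assumption for illustration): label is 1 when x0 + x1 > 1."""
    X = [(rng.random(), rng.random()) for _ in range(n)]
    y = [int(a + b > 1.0) for a, b in X]
    return X, y

def stump_predict(stump, x):
    """A stump is (feature index, threshold, class predicted above the threshold)."""
    feature, threshold, polarity = stump
    return polarity if x[feature] > threshold else 1 - polarity

def accuracy(predict, X, y):
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)

def train_stump(X, y, rng, n_candidates=20):
    """Fit one weak model on a bootstrap sample by trying random splits."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
    candidates = [(rng.randrange(2), rng.random(), rng.randrange(2))
                  for _ in range(n_candidates)]
    return max(candidates,
               key=lambda s: accuracy(lambda x: stump_predict(s, x), Xb, yb))

def majority_vote(models, x):
    """Predict 1 only when a strict majority of the models vote 1."""
    votes = sum(stump_predict(m, x) for m in models)
    return int(2 * votes > len(models))

rng = random.Random(0)
X_train, y_train = make_data(200, rng)
X_val, y_val = make_data(200, rng)

# Conserve the models from every run instead of discarding them.
jungle = []
N_RUNS, MODELS_PER_RUN = 5, 10
for run in range(N_RUNS):
    forest = [train_stump(X_train, y_train, rng) for _ in range(MODELS_PER_RUN)]
    jungle.extend(forest)

# Compare the first run on its own with the ensemble of all conserved models.
single_run_acc = accuracy(
    lambda x: majority_vote(jungle[:MODELS_PER_RUN], x), X_val, y_val)
jungle_acc = accuracy(lambda x: majority_vote(jungle, x), X_val, y_val)
print(f"single run: {single_run_acc:.3f}  all conserved models: {jungle_acc:.3f}")
```

Whether the larger ensemble actually scores higher depends on the data and the models; the sketch only shows the bookkeeping of conserving models across runs and evaluating them together.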

Year: 2021    PMID: 33574563    PMCID: PMC7878914    DOI: 10.1038/s41598-021-83247-4

Source DB: PubMed    Journal: Sci Rep    ISSN: 2045-2322    Impact factor: 4.379


  6 in total

1.  A heuristic method for simulating open-data of arbitrary complexity that can be used to compare and evaluate machine learning methods.

Authors:  Jason H Moore; Maksim Shestov; Peter Schmitt; Randal S Olson
Journal:  Pac Symp Biocomput       Date:  2018

2.  There's plenty of room at the Top: What will drive computer performance after Moore's law? (Review)

Authors:  Charles E Leiserson; Neil C Thompson; Joel S Emer; Bradley C Kuszmaul; Butler W Lampson; Daniel Sanchez; Tao B Schardl
Journal:  Science       Date:  2020-06-05       Impact factor: 47.728

3.  GAMETES: a fast, direct algorithm for generating pure, strict, epistatic models with random architectures.

Authors:  Ryan J Urbanowicz; Jeff Kiralis; Nicholas A Sinnott-Armstrong; Tamra Heberling; Jonathan M Fisher; Jason H Moore
Journal:  BioData Min       Date:  2012-10-01       Impact factor: 2.522

4.  PMLB: a large benchmark suite for machine learning evaluation and comparison.

Authors:  Randal S Olson; William La Cava; Patryk Orzechowski; Ryan J Urbanowicz; Jason H Moore
Journal:  BioData Min       Date:  2017-12-11       Impact factor: 2.522

5.  The UK Biobank resource with deep phenotyping and genomic data.

Authors:  Clare Bycroft; Colin Freeman; Desislava Petkova; Gavin Band; Lloyd T Elliott; Kevin Sharp; Allan Motyer; Damjan Vukcevic; Olivier Delaneau; Jared O'Connell; Adrian Cortes; Samantha Welsh; Alan Young; Mark Effingham; Gil McVean; Stephen Leslie; Naomi Allen; Peter Donnelly; Jonathan Marchini
Journal:  Nature       Date:  2018-10-10       Impact factor: 49.962

  1 in total

1.  Using Machine Learning to Evaluate the Role of Microinflammation in Cardiovascular Events in Patients With Chronic Kidney Disease.

Authors:  Xiao Qi Liu; Ting Ting Jiang; Meng Ying Wang; Wen Tao Liu; Yang Huang; Yu Lin Huang; Feng Yong Jin; Qing Zhao; Gui Hua Wang; Xiong Zhong Ruan; Bi Cheng Liu; Kun Ling Ma
Journal:  Front Immunol       Date:  2022-01-10       Impact factor: 7.561

