The gradient boosting algorithm and random boosting for genome-assisted evaluation in large data sets.

O González-Recio, J A Jiménez-Montero, R Alenda.

Abstract

In the next few years, with the advent of high-density single nucleotide polymorphism (SNP) arrays and genome sequencing, genomic evaluation methods will need to deal with a large number of genetic variants and an increasing sample size. The boosting algorithm is a machine-learning technique that may alleviate the drawbacks of dealing with such large data sets. This algorithm combines different predictors in a sequential manner with some shrinkage on them; each predictor is applied consecutively to the residuals from the committee formed by the previous ones to form a final prediction based on a subset of covariates. Here, a detailed description is provided and examples using a toy data set are included. A modification of the algorithm called "random boosting" was proposed to increase predictive ability and decrease computation time of genome-assisted evaluation in large data sets. Random boosting uses a random selection of markers to add a subsequent weak learner to the predictive model. These modifications were applied to a real data set composed of 1,797 bulls genotyped for 39,714 SNP. Deregressed proofs of 4 yield traits and 1 type trait from January 2009 routine evaluations were used as dependent variables. A 2-fold cross-validation scenario was implemented. Sires born before 2005 were used as a training sample (1,576 and 1,562 for production and type traits, respectively), whereas younger sires were used as a testing sample to evaluate predictive ability of the algorithm on yet-to-be-observed phenotypes. Comparison with the original algorithm was provided. The predictive ability of the algorithm was measured as Pearson correlations between observed and predicted responses. Further, estimated bias was computed as the average difference between observed and predicted phenotypes. 
The results showed that the modified boosting algorithm could be run in 1% of the time required by the original algorithm, with negligible differences in accuracy and bias. This modification may be used to speed up the computation of genome-assisted evaluation in large data sets such as those obtained from consortia.
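
The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weak learner (a least-squares regression on a single marker), the function names `random_boost`, `predict` and `evaluate`, the parameter names and default values, and the simulated toy data are all assumptions made for illustration. Each iteration fits the weak learner to the current residuals using only a random subset of markers and adds it to the model with shrinkage. Predictive ability and bias are then computed as in the abstract: the Pearson correlation and the average difference between observed and predicted phenotypes.

```python
import numpy as np


def random_boost(X, y, n_iter=300, shrinkage=0.1, frac_markers=0.2, seed=0):
    """Boosting on residuals with single-marker least-squares weak learners.

    Each iteration searches only a random subset of markers;
    frac_markers=1.0 searches every marker (no random selection).
    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    col_means = X.mean(axis=0)
    Xc = X - col_means                     # centred marker codes
    ss = (Xc ** 2).sum(axis=0)             # per-marker sum of squares
    y_mean = y.mean()
    coef = np.zeros(p)
    resid = y - y_mean                     # residuals of the current committee
    k = max(1, int(round(frac_markers * p)))
    for _ in range(n_iter):
        cand = rng.choice(p, size=k, replace=False)   # random marker subset
        cand = cand[ss[cand] > 0]                     # skip monomorphic markers
        if cand.size == 0:
            continue
        xr = Xc[:, cand].T @ resid         # cross-products with the residuals
        gain = xr ** 2 / ss[cand]          # drop in residual sum of squares
        best = int(np.argmax(gain))
        j = cand[best]
        b = shrinkage * xr[best] / ss[j]   # shrunken least-squares estimate
        coef[j] += b
        resid -= b * Xc[:, j]              # next learner sees the new residuals
    intercept = y_mean - col_means @ coef  # undo the centring
    return intercept, coef


def predict(X, intercept, coef):
    return intercept + X @ coef


def evaluate(y_obs, y_pred):
    """Pearson correlation and mean (observed - predicted) difference."""
    r = np.corrcoef(y_obs, y_pred)[0, 1]
    bias = np.mean(y_obs - y_pred)
    return r, bias


# Simulated toy data (NOT the bull data set described in the abstract).
rng = np.random.default_rng(1)
n, p = 300, 500
freq = rng.uniform(0.1, 0.9, size=p)                  # allele frequencies
X = rng.binomial(2, freq, size=(n, p)).astype(float)  # genotypes coded 0/1/2
true_effects = np.zeros(p)
true_effects[rng.choice(p, size=10, replace=False)] = rng.normal(size=10)
y = X @ true_effects + rng.normal(size=n)

# Train on the first 200 records, test on the remaining 100.
X_tr, y_tr, X_te, y_te = X[:200], y[:200], X[200:], y[200:]
intercept, coef = random_boost(X_tr, y_tr, frac_markers=0.2)
r_test, bias_test = evaluate(y_te, predict(X_te, intercept, coef))
print(f"test correlation = {r_test:.3f}, test bias = {bias_test:.3f}")
```

Because each iteration scans only `frac_markers * p` columns, the per-iteration cost falls roughly in proportion to the fraction of markers sampled. Any saving in this toy example should not be read as reproducing the timing reported above.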

Year: 2012 PMID: 23102953 DOI: 10.3168/jds.2012-5630

Source DB: PubMed Journal: J Dairy Sci ISSN: 0022-0302 Impact factor: 4.034

Cited by: 9 in total

1. Machine learning in postgenomic biology and personalized medicine.

Authors: Animesh Ray
Journal: Wiley Interdiscip Rev Data Min Knowl Discov Date: 2022-01-24

2. Genomic Prediction Methods Accounting for Nonadditive Genetic Effects.

Authors: Luis Varona; Andres Legarra; Miguel A Toro; Zulma G Vitezica
Journal: Methods Mol Biol Date: 2022

3. Genome-Enabled Prediction Methods Based on Machine Learning.

Authors: Edgar L Reinoso-Peláez; Daniel Gianola; Oscar González-Recio
Journal: Methods Mol Biol Date: 2022

4. A Model for Predicting Cervical Cancer Using Machine Learning Algorithms.

Authors: Naif Al Mudawi; Abdulwahab Alazeb
Journal: Sensors (Basel) Date: 2022-05-29 Impact factor: 3.847

5. Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View.

Authors: Wei Luo; Dinh Phung; Truyen Tran; Sunil Gupta; Santu Rana; Chandan Karmakar; Alistair Shilton; John Yearwood; Nevenka Dimitrova; Tu Bao Ho; Svetha Venkatesh; Michael Berk
Journal: J Med Internet Res Date: 2016-12-16 Impact factor: 5.428

6. Benchmarking Parametric and Machine Learning Models for Genomic Prediction of Complex Traits.

7. Prediction performance of linear models and gradient boosting machine on complex phenotypes in outbred mice.

8. Symptom-Based COVID-19 Prognosis through AI-Based IoT: A Bioinformatics Approach.

9. Deep learning versus parametric and ensemble methods for genomic prediction of complex phenotypes.