The impact of clustering methods for cross-validation, choice of phenotypes, and genotyping strategies on the accuracy of genomic predictions.

Johnna L Baller, Jeremy T Howard, Stephen D Kachman, Matthew L Spangler.

Abstract

For genomic predictors to be of use in genetic evaluation, their predicted accuracy must be a reliable indicator of their utility, and thus unbiased. The objective of this paper was to evaluate the accuracy of prediction of genomic breeding values (GBV) using different clustering strategies and response variables. Red Angus genotypes (n = 9,763) were imputed to a reference 50K panel. The influence of clustering method [k-means, k-medoids, principal component (PC) analysis on the numerator relationship matrix (A) and the identical-by-state genomic relationship matrix (G) as both data and covariance matrices, and random] and response variable [deregressed estimated breeding values (DEBV) and adjusted phenotypes] was evaluated for cross-validation. The GBV were estimated using a Bayes C model for all traits. Traits for DEBV included birth weight (BWT), marbling (MARB), rib-eye area (REA), and yearling weight (YWT). Adjusted phenotypes included BWT, YWT, and ultrasonically measured intramuscular fat percentage and REA. Prediction accuracies were estimated as the genetic correlation between GBV and the associated response variable from a bivariate animal model. A simulation mimicking a cattle population, replicated 5 times, was conducted to quantify differences between true and estimated accuracies. The simulation used the same clustering methods and response variables, with the addition of 2 genotyping strategies (random and top 25% of individuals) and forward validation. The prediction accuracies were estimated similarly, and true accuracies were estimated as the correlation between the residuals of a bivariate model including true breeding value (TBV) and GBV. Using the adjusted Rand index, random clusters were clearly different from relationship-based clustering methods. In both real and simulated data, random clustering consistently led to the largest estimates of accuracy, while no method was consistently associated with more or less bias than other methods. In simulation, random genotyping led to higher estimated accuracies than selection of the top 25% of individuals. Interestingly, random genotyping seemed to overpredict true accuracy, while selective genotyping tended to underpredict accuracy. When forward-in-time validation was used, DEBV led to less biased estimates of GBV accuracy. Results suggest the highest, least biased GBV accuracies are associated with random genotyping and DEBV.
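
The abstract compares cluster assignments with the adjusted Rand index (ARI) but does not say how it was computed. The following is a minimal sketch of the standard ARI on two fold assignments, not the authors' code. The function name `adjusted_rand_index` and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Standard adjusted Rand index between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must cover the same items")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items to form a pair")

    # Contingency counts: how many items share each (cluster_a, cluster_b) cell.
    cell_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of item pairs placed together in both, in A only, and in B only.
    together_both = sum(comb(c, 2) for c in cell_counts.values())
    together_a = sum(comb(c, 2) for c in a_counts.values())
    together_b = sum(comb(c, 2) for c in b_counts.values())

    # Agreement expected by chance, and the maximum attainable agreement.
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Degenerate case: both partitions are trivial (one cluster or all singletons).
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Hypothetical toy data: nine animals from three families of three.
family_folds = [0, 0, 0, 1, 1, 1, 2, 2, 2]      # folds that keep families together
relabeled_folds = [2, 2, 2, 0, 0, 0, 1, 1, 1]   # same grouping, different fold labels
spread_folds = [0, 1, 2, 0, 1, 2, 0, 1, 2]      # every family split across all folds

ari_same = adjusted_rand_index(family_folds, relabeled_folds)
ari_spread = adjusted_rand_index(family_folds, spread_folds)
```

The ARI depends only on which items are grouped together, so relabeling the folds leaves it at 1. Values near or below 0 indicate agreement no better than chance, which is the sense in which random clusters can be said to differ from relationship-based ones.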
© The Author(s) 2019. Published by Oxford University Press on behalf of the American Society of Animal Science. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Keywords:  beef cattle; bias; genomic prediction; simulation

Year:  2019        PMID: 30721970      PMCID: PMC6447245          DOI: 10.1093/jas/skz055

Source DB:  PubMed          Journal:  J Anim Sci        ISSN: 0021-8812            Impact factor:   3.159


  2 in total

1.  Genomic prediction using pooled data in a single-step genomic best linear unbiased prediction framework.

Authors:  Johnna L Baller; Stephen D Kachman; Larry A Kuehn; Matthew L Spangler
Journal:  J Anim Sci       Date:  2020-06-01       Impact factor: 3.159

2.  Using pooled data for genomic prediction in a bivariate framework with missing data.

Authors:  Johnna L Baller; Stephen D Kachman; Larry A Kuehn; Matthew L Spangler
Journal:  J Anim Breed Genet       Date:  2022-06-14       Impact factor: 3.271
