Claas Heuer, Christoph Scheel, Jens Tetens, Christa Kühn, Georg Thaller.
Abstract
BACKGROUND: Categorical traits without ordinal representation of classes do not qualify for threshold models. Alternatively, the multinomial problem can be assessed by a sequence of independent binary contrasts using schemes such as one-vs-all or one-vs-one. Class probabilities can be arrived at by normalization or pair-wise coupling strategies. We assessed the predictive ability of whole-genome regression models and support vector machines for the classification of horses into four German Warmblood breeds.
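As a minimal sketch of the two routes to class probabilities mentioned above, the Python below rescales one-vs-all outputs so that they sum to one, and couples one-vs-one outputs using the identity r_ij = p_i / (p_i + p_j). The function names, the dictionary layout, and the particular closed-form coupling rule are assumptions made for illustration; the abstract does not state which normalization or coupling procedure was used.

```python
def normalize_one_vs_all(scores):
    """Rescale per-class one-vs-all probabilities so that they sum to 1.

    scores: dict mapping class label -> estimated P(class vs. all others).
    """
    total = sum(scores.values())
    return {label: value / total for label, value in scores.items()}


def couple_one_vs_one(pairwise, eps=1e-12):
    """Combine pairwise probabilities into one probability per class.

    pairwise[(i, j)] is the estimated P(class i | class i or j).
    Only one orientation of each pair is needed; the reverse is 1 - r_ij.
    If r_ij = p_i / (p_i + p_j) holds exactly, then
    sum_{j != i} 1 / r_ij = (K - 2) + 1 / p_i, which is solved for p_i here.
    This closed form is an illustrative choice, not necessarily the paper's.
    """
    classes = sorted({label for pair in pairwise for label in pair})
    k = len(classes)
    raw = {}
    for i in classes:
        inverse_sum = 0.0
        for j in classes:
            if i == j:
                continue
            if (i, j) in pairwise:
                r_ij = pairwise[(i, j)]
            else:
                r_ij = 1.0 - pairwise[(j, i)]
            inverse_sum += 1.0 / max(r_ij, eps)  # guard against division by zero
        raw[i] = 1.0 / (inverse_sum - (k - 2))
    # Estimated pairwise values are rarely consistent, so renormalize.
    total = sum(raw.values())
    return {label: value / total for label, value in raw.items()}


if __name__ == "__main__":
    print(normalize_one_vs_all({1: 0.8, 2: 0.4, 3: 0.4}))
    print(couple_one_vs_one({(1, 2): 0.625, (1, 3): 0.5 / 0.7, (2, 3): 0.6}))
```

When the pairwise inputs are exactly consistent with one underlying probability vector, the coupling recovers that vector; with estimated inputs the final renormalization only guarantees that the outputs sum to one.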
Year: 2016 PMID: 26867647 PMCID: PMC4751658 DOI: 10.1186/s12711-016-0192-2
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 4.297
Overview of the analysed subpopulations
| Subpopulation | Numeric indicator | Individuals | Sires | Dams |
|---|---|---|---|---|
| Hanoverian | 1 | 306 | 95 | 296 |
| Holsteiner | 2 | 348 | 98 | 322 |
| Oldenburger | 3 | 219 | 95 | 216 |
| Trakehner | 4 | 44 | 33 | 43 |
Fig. 1 Pedigree-derived and marker-derived additive relationship matrices arranged by subpopulations. The upper and lower triangular matrices represent the pedigree-derived and marker-derived additive relationships, respectively. Diagonals are set to 1.
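The layout described in this caption can be sketched as follows: one square matrix takes its upper triangle from the pedigree-derived relationships and its lower triangle from the marker-derived relationships, with the diagonal fixed at 1. The function name and the toy 3 x 3 values are hypothetical and are not taken from the study.

```python
def combine_triangles(upper_source, lower_source):
    """Build one matrix from two square matrices of equal size.

    Entries above the diagonal come from upper_source, entries below the
    diagonal come from lower_source, and the diagonal is set to 1.
    """
    n = len(upper_source)
    combined = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j > i:
                combined[i][j] = upper_source[i][j]
            elif j < i:
                combined[i][j] = lower_source[i][j]
    return combined


if __name__ == "__main__":
    # Toy values for illustration only.
    pedigree = [[1.0, 0.5, 0.25], [0.5, 1.0, 0.1], [0.25, 0.1, 1.0]]
    genomic = [[1.02, 0.4, 0.2], [0.4, 0.98, 0.05], [0.2, 0.05, 1.01]]
    for row in combine_triangles(pedigree, genomic):
        print(row)
```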
Measures of relationship and separability between subpopulations
| Base | A | G | | |
|---|---|---|---|---|
| 12 | 0.0002 (0.0075) | −0.0290 (0.0321) | 0.0151 (0.0196) | 0.9809 (0.0158) |
| 13 | 0.0029 (0.0269) | 0.0085 (0.0447) | 0.0029 (0.0040) | 0.9179 (0.0675) |
| 14 | 0.0008 (0.0143) | 0.0123 (0.0334) | 0.0070 (0.0100) | 0.8444 (0.0779) |
| 23 | 0.0003 (0.0084) | −0.0176 (0.0377) | 0.0109 (0.0147) | 0.9442 (0.0269) |
| 24 | 0.0000 (0.0000) | −0.0269 (0.0252) | 0.0117 (0.0176) | 0.9207 (0.0700) |
| 34 | 0.0001 (0.0057) | 0.0003 (0.0262) | 0.0097 (0.0137) | 0.7967 (0.1376) |
The averages (standard deviations) of pair-wise additive genetic relationship (A), genomic relationship (G), average values of all markers, pedigree-based heritability on the liability scale, marker-based heritability on the liability scale for the trait “subpopulation assignment”, percentage (pEV) and number (nEV) of eigenvectors included in the variable selection threshold model and the prediction accuracy of subpopulation assignment (CV-Accuracy) of a leave-one-out cross-validation using SVM. Base indicates the classes used in the pairwise contrasts. Numeric indicators for subpopulations are: 1 Hanoverian, 2 Holsteiner, 3 Oldenburger, 4 Trakehner
Fig. 2 Scatterplot of the first two eigenvectors of the genomic relationship matrix (a) and cumulative proportion of explained variance by eigenvalues in decreasing order (b).
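The quantity plotted in panel (b) can be computed from a list of eigenvalues as in the sketch below. The eigenvalues are assumed to be already available and non-negative; the function names and the example values are hypothetical, and the eigendecomposition itself is not shown.

```python
def cumulative_explained_variance(eigenvalues):
    """Cumulative share of total variance, eigenvalues taken in decreasing order."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    cumulative = []
    for value in ordered:
        running += value
        cumulative.append(running / total)
    return cumulative


def components_needed(eigenvalues, target):
    """Smallest number of leading eigenvectors whose share reaches target (0 to 1)."""
    shares = cumulative_explained_variance(eigenvalues)
    for count, share in enumerate(shares, start=1):
        if share >= target:
            return count
    return len(shares)


if __name__ == "__main__":
    toy_eigenvalues = [1.0, 3.0, 4.0, 2.0]  # illustrative values only
    print(cumulative_explained_variance(toy_eigenvalues))
    print(components_needed(toy_eigenvalues, 0.8))
```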
Fig. 3 Inclusion probabilities of eigenvectors by pair-wise contrasts. Results show posterior means of the inclusion probabilities for eigenvectors that were obtained from pair-wise genomic relationship matrices and fitted as random effects in the BayesC threshold model. The pair-wise response variable was “subpopulation assignment”. Numeric indicators for subpopulations are as follows: 1 Hanoverian, 2 Holsteiner, 3 Oldenburger, 4 Trakehner
Prediction accuracies and scaled Brier scores for subpopulation assignment
| Base | Contrast | Accuracy 1 | Accuracy 2 | Accuracy 3 | Accuracy 4 | Accuracy Total | sBS 1 | sBS 2 | sBS 3 | sBS 4 | sBS Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Ridge Regression | | | | | | | | | | | |
| 1234 | One-vs-one | 0.97 | 0.99 | 0.00 | 0.86 | 0.74 | 0.86 | 0.97 | −0.43 | 0.66 | 0.58 |
| 1234 | One-vs-all | 0.86 | 1.00 | 0.54 | 0.86 | 0.84 | 0.69 | 0.98 | 0.18 | 0.68 | 0.68 |
| 234 | One-vs-one | | 1.00 | 0.87 | 0.75 | 0.93 | | 0.97 | 0.72 | 0.59 | 0.85 |
| 234 | One-vs-all | | 1.00 | 0.87 | 0.93 | 0.95 | | 0.98 | 0.70 | 0.77 | 0.87 |
| 134 | One-vs-one | 0.85 | | 0.66 | 0.64 | 0.76 | 0.66 | | 0.31 | 0.36 | 0.50 |
| 134 | One-vs-all | 0.85 | | 0.67 | 0.82 | 0.78 | 0.66 | | 0.32 | 0.59 | 0.52 |
| 124 | One-vs-one | 0.97 | 1.00 | | 0.66 | 0.96 | 0.93 | 0.99 | | 0.38 | 0.92 |
| 124 | One-vs-all | 0.97 | 1.00 | | 0.80 | 0.97 | 0.92 | 0.99 | | 0.61 | 0.94 |
| 123 | One-vs-one | 0.84 | 1.00 | 0.54 | | 0.83 | 0.65 | 0.96 | 0.09 | | 0.63 |
| 123 | One-vs-all | 0.85 | 1.00 | 0.53 | | 0.83 | 0.65 | 0.97 | 0.07 | | 0.63 |
| SVM | | | | | | | | | | | |
| 1234 | One-vs-one | 0.86 | 1.00 | 0.50 | 0.59 | 0.81 | | | | | |
| 234 | One-vs-one | | 1.00 | 0.85 | 0.73 | 0.93 | | | | | |
| 134 | One-vs-one | 0.87 | | 0.62 | 0.59 | 0.75 | | | | | |
| 124 | One-vs-one | 0.96 | 1.00 | | 0.59 | 0.96 | | | | | |
| 123 | One-vs-one | 0.86 | 1.00 | 0.50 | | 0.82 | | | | | |
The proportion of correct classifications to all classifications (Accuracy) and the scaled Brier Scores (sBS) per class and over all samples (Total) for the classification of subpopulation assignment using leave-one-out cross-validation. Base indicates the classes that were included in training and testing. Numeric indicators for subpopulations are: 1 Hanoverian, 2 Holsteiner, 3 Oldenburger, 4 Trakehner
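The two measures named in this caption can be sketched as follows. Accuracy is the share of correct classifications, overall or within each true class. For the scaled Brier score the caption gives no formula, so the sketch assumes one common definition: one minus the Brier score divided by the Brier score of a reference that always predicts the observed class frequency. Under that assumption a value of 1 is a perfect prediction, 0 matches the reference, and negative values are worse than the reference. All function names are hypothetical.

```python
def accuracy(true_labels, predicted_labels):
    """Proportion of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_accuracy(true_labels, predicted_labels):
    """Accuracy computed separately over the samples of each true class."""
    result = {}
    for label in sorted(set(true_labels)):
        pairs = [(t, p) for t, p in zip(true_labels, predicted_labels) if t == label]
        result[label] = sum(t == p for t, p in pairs) / len(pairs)
    return result


def scaled_brier_score(indicators, probabilities):
    """Scaled Brier score for one class (assumed definition, see lead-in).

    indicators: 1 if the sample belongs to the class, else 0.
    probabilities: predicted probability of belonging to the class.
    """
    n = len(indicators)
    brier = sum((p - y) ** 2 for y, p in zip(indicators, probabilities)) / n
    prevalence = sum(indicators) / n
    reference = prevalence * (1.0 - prevalence)  # Brier score of the constant predictor
    if reference == 0.0:
        raise ValueError("scaled Brier score is undefined when only one outcome occurs")
    return 1.0 - brier / reference


if __name__ == "__main__":
    truth = [1, 1, 2, 2, 3]
    predicted = [1, 2, 2, 2, 1]
    print(accuracy(truth, predicted))
    print(per_class_accuracy(truth, predicted))
    print(scaled_brier_score([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2]))
```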
Fig. 4 Posterior predictive probabilities by subpopulation and contrast scheme. The binary classifier was a Ridge Regression threshold model. Posterior predictive probabilities for subpopulation assignment were obtained from leave-one-out cross-validation
Fig. 5 Posterior predictive probabilities by subpopulation and contrast scheme excluding Oldenburger from training. The binary classifier was a Ridge Regression threshold model. Posterior predictive probabilities for subpopulation assignment were obtained from leave-one-out cross-validation