Anthony Tedde, Clément Grelet, Phuong N Ho, Jennie E Pryce, Dagnachew Hailemariam, Zhiquan Wang, Graham Plastow, Nicolas Gengler, Eric Froidmont, Frédéric Dehareng, Carlo Bertozzi, Mark A Crowe, Hélène Soyeurt.
Abstract
We predicted the dry matter intake of dairy cows using parity, week of lactation, milk yield, the milk mid-infrared (MIR) spectrum, and MIR-based predictions of bodyweight and of the fat, protein, lactose, and fatty acid contents of milk. The dataset comprised 10,711 samples from 534 dairy cows of geographically diverse origins (Australia, Canada, Denmark, and Ireland). We set up partial least squares (PLS) regressions with different constructs and a one-hidden-layer artificial neural network (ANN) using the highest-contribution variables. In the ANN, we replaced the spectra with their projections onto the first 25 PLS factors, which explain 99% of the spectral variability, to reduce the model complexity. Cow-independent 10 × 10-fold cross-validation (CV) achieved the best performance, with root mean square errors (RMSECV) of 3.27 ± 0.08 kg for the PLS regression and 3.25 ± 0.13 kg for the ANN. Although the available datasets differed significantly, we also performed a country-independent validation (CIV) to measure the models' performance fairly. We found RMSECIV varying from 3.73 to 6.03 kg for PLS and from 3.69 to 5.08 kg for ANN. Finally, based on the country-independent validation, we compared the developed models' performance with that achieved by the National Research Council's equation.
Keywords: artificial neural network; dairy cows; dimensionality reduction; dry matter intake; feed efficiency; feed intake; machine learning; mid infrared spectra; partial least square
Year: 2021 PMID: 34064417 PMCID: PMC8147833 DOI: 10.3390/ani11051316
Source DB: PubMed Journal: Animals (Basel) ISSN: 2076-2615 Impact factor: 2.752
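The cow-independent cross-validation mentioned in the abstract requires that all records of a given cow fall into the same fold, so that no animal contributes to both calibration and validation. The following is a minimal sketch of such a split in plain Python, not the authors' implementation; the function names, the cow identifiers, and the toy record layout are assumptions made for illustration.

```python
import random

def cow_independent_folds(cow_ids, n_folds=10, seed=0):
    """Assign each record to a fold so that all records of a cow share one fold."""
    cows = sorted(set(cow_ids))
    rng = random.Random(seed)
    rng.shuffle(cows)
    # Deal the shuffled cows round-robin into the folds.
    fold_of_cow = {cow: i % n_folds for i, cow in enumerate(cows)}
    return [fold_of_cow[c] for c in cow_ids]

def split_indices(folds, test_fold):
    """Return (train, test) record indices for one held-out fold."""
    train = [i for i, f in enumerate(folds) if f != test_fold]
    test = [i for i, f in enumerate(folds) if f == test_fold]
    return train, test

# Hypothetical toy records: 12 cows with 3 test-day records each.
cow_ids = [f"cow{c:02d}" for c in range(12) for _ in range(3)]
folds = cow_independent_folds(cow_ids, n_folds=4, seed=1)
train, test = split_indices(folds, test_fold=0)
train_cows = {cow_ids[i] for i in train}
test_cows = {cow_ids[i] for i in test}
```

Repeating the split with ten different seeds and ten folds each would give a 10 × 10-fold scheme of the kind named in the abstract.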
Structure and prediction performance of recent literature models predicting the dry matter intake of dairy cows.
| Authors | Model 1 | Regression Type 2 | R2cv 3 | RMSEcv 4 (kg) | RPD 5 |
|---|---|---|---|---|---|
| Grelet et al. (2020) | MIR + PRT + MY | SVM | 0.66 | 2.71 | 1.67 |
| Lahart et al. (2019) | MY + fat + prot + BW + DIM + PRT | OLS | 0.71 | 1.67 | 1.85 |
| | MIR | PLS | 0.48 | 2.24 | 1.38 |
| | MY + fat + prot + BW + DIM + PRT + MIR | PLS | 0.76 | 1.51 | 2.04 |
| Wallén et al. (2018) [15] | fat + prot + lact | OLS | 0.01 | 3.52 | 1.00 |
| | MY | OLS | 0.24 | 3.07 | 1.15 |
| | MY + fat + prot + lact | OLS | 0.25 | 3.06 | 1.15 |
| | MIR | PLS | 0.14 | 3.27 | 1.08 |
| | fat + prot + lact + MIR | PLS | 0.06 | 3.44 | 1.02 |
| | MY + MIR | PLS | 0.27 | 3.01 | 1.17 |
| | MY + BW | OLS | 0.29 | 2.97 | 1.18 |
| | MY + BW + MIR | PLS | 0.29 | 2.98 | 1.18 |
| Shetty et al. (2017) [11] | MY | PLS | 0.58 | 2.22 | 1.55 |
| | MY + BW | PLS | 0.72 | 1.82 | 1.89 |
| | MY + BW + MIR | PLS | 0.82 | 1.46 | 2.36 |
| | MY + fat + BW | PLS | 0.79 | 1.60 | 2.15 |
| | MY + fat + prot + BW | PLS | 0.78 | 1.61 | 2.14 |
| | MY + fat + prot + lact + BW | PLS | 0.77 | 1.64 | 2.10 |
| | MIR | PLS | 0.31 | 2.86 | 1.20 |
1 Milk yield (MY), bodyweight (BW), days in milk (DIM), parity (PRT), lactose (lact), protein (prot). 2 Partial least square (PLS), ordinary least square (OLS), and support vector machines (SVM) regressions. 3 Cow-independent R-squared of cross-validation (R2 cv). 4 Cow-independent root mean square error of cross-validation (RMSEcv). 5 RPD = Ratio of performance to deviation. This value was approximated for Wallén et al. [15] and Shetty et al. [11] using the published global standard deviation.
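The ratio of performance to deviation in footnote 5 is the standard deviation of the reference values divided by the root mean square error. As a hedged illustration (the helper names are assumptions, and the exact standard deviations used by the authors are not given here), the sketch below back-calculates the standard deviation implied by three of the Shetty et al. rows in the table above.

```python
def rpd(sd, rmse):
    """Ratio of performance to deviation: SD of the reference values over RMSE."""
    return sd / rmse

def implied_sd(rmse, rpd_value):
    """Standard deviation implied by a reported RMSE and RPD (SD = RPD * RMSE)."""
    return rmse * rpd_value

# (RMSEcv in kg, RPD) pairs copied from the Shetty et al. (2017) rows of the table.
shetty_rows = [(2.22, 1.55), (1.82, 1.89), (1.46, 2.36)]
implied = [round(implied_sd(r, p), 2) for r, p in shetty_rows]
```

The three implied values agree to within rounding, which is consistent with a single global standard deviation having been used for all rows of that study, as the footnote states.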
Figure 1. Descriptive statistics of the training datasets (red = AUS, blue = CAN, orange = GPE). (a) Measured dry matter intake; (b) Parity; (c) Milk yield; (d) Bodyweight predicted from a regression involving milk mid-infrared spectra, parity, days in milk, and milk yield; (e) Week of lactation; (f) Projection of the milk mid-infrared spectra onto their first two principal components, computed from a principal component analysis involving 277 spectral points; (g) Prediction of short-chain fatty acids from milk mid-infrared spectra; (h) Prediction of medium-chain fatty acids from milk mid-infrared spectra; (i) Prediction of long-chain fatty acids from milk mid-infrared spectra; (j) Prediction of fat content in milk from milk mid-infrared spectra; (k) Prediction of protein content in milk from milk mid-infrared spectra; (l) Prediction of lactose content in milk from milk mid-infrared spectra.
Descriptive statistics of the original datasets.
| Variables 1 | Measure 2 | AUS 3 | CAN 3 | GPE 3 |
|---|---|---|---|---|
| Total number of records | | 5743 | 4105 | 1115 |
| Dry matter intake | Mean | 23 | 23 | 20 |
| | s.d. | 5 | 4 | 5 |
| Bodyweight | Mean | 554 | 635 | 611 |
| | s.d. | 36 | 36 | 59 |
| Milk yield | Mean | 26 | 40 | 35 |
| | s.d. | 5 | 10 | 9 |
| Parity | Primiparous | 1340 | 1311 | 281 |
| | Multiparous | 4403 | 2794 | 834 |
| Week of lactation | Min | 6 | 1 | 1 |
| | Q1 | 14 | 11 | 3 |
| | Median | 15 | 19 | 4 |
| | Q3 | 17 | 27 | 6 |
| | Max | 23 | 43 | 8 |
| Short-chain FA | Mean | 7.82 | 7.75 | 7.52 |
| | s.d. | 0.68 | 1.12 | 0.88 |
| Medium-chain FA | Mean | 33.44 | 53.66 | 41.74 |
| | s.d. | 3.06 | 4.25 | 5.50 |
| Long-chain FA | Mean | 35.08 | 38.16 | 38.81 |
| | s.d. | 3.26 | 4.51 | 6.34 |
| Fat | Mean | 4.69 | 3.73 | 4.79 |
| | s.d. | 1.07 | 0.96 | 0.85 |
| Protein | Mean | 2.59 | 3.06 | 3.05 |
| | s.d. | 0.24 | 0.29 | 0.35 |
| Lactose | Mean | 4.20 | 4.89 | 5.07 |
| | s.d. | 0.16 | 0.23 | 0.21 |
1 Variables in the datasets. FA = Fatty Acids. 2 Function applied to a variable or an entire dataset, per herd. Total number of records (Total); number of records belonging to primiparous/multiparous cows (Primiparous/Multiparous); average value of a specific variable (Mean); standard deviation of a specific variable (s.d.); minimum or maximum (Min/Max); first, second, or third quartile (Q1, Median, Q3). 3 Statistics for the corresponding datasets: AUS, CAN, or GPE.
Figure 2. Scores for the variable importance of MIR fatty acid contents (g/100 g of fat) to predict dairy cows' dry matter intake.
Cow-independent cross-validation and out-of-sample country-independent performance for partial least squares regressions predicting the dry matter intake (in kg) of dairy cows from datasets with different origins (AUS = Australia (number of records (N) = 5629), CAN = Canada (N = 4063), GPE = GplusE project (N = 1019)).
| M | Features 1 | RMSEcv (kg) 2, AUS | RMSEcv (kg), CAN | RMSEcv (kg), GPE | R2cv 3, AUS | R2cv, CAN | R2cv, GPE | RMSEv (kg) 4, CAUS (VCAN, VGPE) | RMSEv (kg), CCAN (VAUS, VGPE) | RMSEv (kg), CGPE (VAUS, VCAN) |
|---|---|---|---|---|---|---|---|---|---|---|
| M1 | MIR | 3.86 | 3.48 | 3.69 | 0.20 | 0.24 | 0.35 | (6.65, 11.13) | (19.5, 4.88) | (18.78, 8.11) |
| M2 | PRT | 4.07 | 3.38 | 4.13 | 0.11 | 0.28 | 0.19 | (4.19, 5.28) | (3.54, 5.03) | (5.8, 5.74) |
| M3 | pFA | 4.07 | 3.65 | 3.94 | 0.11 | 0.16 | 0.26 | (7.69, 4.39) | (5.53, 5.49) | (4.52, 6.45) |
| M4 | pBW | 3.95 | 3.31 | 4.05 | 0.16 | 0.31 | 0.22 | (6.2, 6.58) | (5.11, 4.25) | (7.3, 4.7) |
| M5 | MY | 4.02 | 3.54 | 3.62 | 0.13 | 0.20 | 0.37 | (4.72, 7.32) | (5.72, 4.25) | (7.18, 4.56) |
| M6 | WOL | 4.29 | 68.19 | 4.39 | 0.02 | 0.06 | 0.07 | (4.32, 3 × 10⁵) | (6 × 10¹³, 5 × 10⁷) | (1 × 10¹⁴, 5.01) |
| M7 | pMC | 4.20 | 3.96 | 4.42 | 0.06 | 0.02 | 0.07 | (4.33, 8.41) | (4.72, 5.13) | (4.83, 5.56) |
| M8 | PRT + MIR | 3.69 | 3.13 | 3.33 | 0.27 | 0.38 | 0.46 | (4.98, 10.52) | (11.37, 4.22) | (13.83, 6.54) |
| M9 | pBW + MIR | 3.60 | 3.04 | 3.30 | 0.30 | 0.42 | 0.46 | (5.92, 10.45) | (6.54, 4.02) | (9.94, 7.44) |
| M10 | MY + MIR | 3.42 | 3.01 | 3.05 | 0.36 | 0.42 | 0.53 | (4.86, 8.47) | (23.79, 3.7) | (22.01, 5.14) |
| M11 | pFA + MIR | 3.89 | 3.48 | 3.70 | 0.19 | 0.24 | 0.34 | (14.3, 6.08) | (13.28, 4.66) | (14.37, 10.46) |
| M12 | PRT + MIR + pBW | 3.56 | 2.98 | 3.31 | 0.32 | 0.44 | 0.46 | (5.23, 10.47) | (5.11, 3.77) | (6.46, 5.22) |
| M13 | PRT + MIR + MY | 3.39 | 2.87 | 3.06 | 0.37 | 0.48 | 0.53 | (4.54, 9.36) | (20.57, 3.61) | (20.26, 4.98) |
| M14 | pBW + MIR + MY | 3.38 | 2.89 | 3.07 | 0.38 | 0.47 | 0.53 | (4.37, 9.4) | (17.42, 3.64) | (17.55, 5.33) |
| M15 | PRT + MIR + pBW + MY | 3.38 | 2.86 | 3.06 | 0.38 | 0.48 | 0.53 | (4.45, 9.98) | (13.51, 3.5) | (13.73, 5.01) |
| M16 | PRT + MIR + pBW + MY + WOL | 3.41 | 2.86 | 3.09 | 0.37 | 0.48 | 0.53 | (4.33, 13.14) | (5.91, 6.8) | (5.95, 3.93) |
| M17 | PRT + MIR + pBW + MY + pFA | 3.38 | 2.86 | 3.07 | 0.38 | 0.48 | 0.53 | (4.55, 10.67) | (10.52, 3.48) | (11, 5.66) |
| M18 | PRT + MIR + pBW + MY + pMC | 3.38 | 2.86 | 3.06 | 0.38 | 0.48 | 0.53 | (4.51, 9.99) | (13.59, 3.49) | (13.87, 4.88) |
| M19 | M16 + pFA | 3.41 | 2.86 | 3.10 | 0.37 | 0.48 | 0.52 | (4.79, 11.94) | (4.43, 5.69) | (4.55, 3.64) |
| M20 | M16 + pMC | 3.41 | 2.86 | 3.09 | 0.37 | 0.48 | 0.53 | (4.35, 13.14) | (5.83, 6.78) | (5.89, 3.89) |
| M21 | M17 + pMC | 3.38 | 2.86 | 3.07 | 0.38 | 0.48 | 0.53 | (4.59, 10.66) | (10.41, 3.48) | (11.02, 5.55) |
| M22 | M16 + pFA + pMC | 3.41 | 2.86 | 3.10 | 0.37 | 0.48 | 0.52 | (4.81, 11.93) | (4.43, 5.69) | (4.55, 3.63) |
1 Milk yield (MY), predicted bodyweight (pBW), days in milk (DIM), week of lactation (WOL), parity (PRT), milk mid-infrared spectra (MIR), predicted fatty acids (pFA), predicted milk composition (pMC: fat, lactose, protein). 2 Cow-independent root mean square error of cross-validation (RMSEcv), with calibration made on either the AUS, CAN, or GPE dataset. 3 Cow-independent R-squared of cross-validation (R2cv), with calibration made on either the AUS, CAN, or GPE dataset. 4 Out-of-sample country-independent root mean square error of validation (RMSEv), with validation made on either the AUS (VAUS), CAN (VCAN), or GPE (VGPE) dataset and calibration made on either AUS (CAUS), CAN (CCAN), or GPE (CGPE).
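The country-independent validation described in footnote 4 calibrates a model on one dataset and reports the root mean square error on each dataset that took no part in calibration. The sketch below illustrates that loop with a one-feature least squares line standing in for the PLS regression; the stand-in model, the function names, and the toy numbers are all assumptions and are not taken from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def fit_line(x, y):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def country_independent_rmse(data, calibration):
    """Calibrate on one country and report RMSEv on every other country."""
    slope, intercept = fit_line(*data[calibration])
    results = {}
    for country, (x, y) in data.items():
        if country == calibration:
            continue  # never validate on the calibration country
        predictions = [slope * xi + intercept for xi in x]
        results[country] = rmse(y, predictions)
    return results

# Made-up toy numbers (milk yield in kg, dry matter intake in kg); not study data.
data = {
    "AUS": ([20.0, 25.0, 30.0], [19.0, 22.0, 25.0]),
    "CAN": ([30.0, 40.0, 50.0], [20.0, 23.0, 26.0]),
    "GPE": ([25.0, 35.0, 45.0], [17.0, 20.0, 24.0]),
}
rmse_v = country_independent_rmse(data, calibration="AUS")
```

Each calibration country yields one pair of validation errors, which matches the paired layout of the RMSEv columns in the table above.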
Cow-independent cross-validation and out-of-sample country-independent performance for partial least squares regressions predicting the dry matter intake (in kg) of dairy cows from datasets combining data from three different origins (AUS = Australia, CAN = Canada, GPE = GplusE project).
| Model 1 | Statistics 2 | AUS + CAN (N = 9692) 3 | AUS + GPE (N = 6648) | CAN + GPE (N = 5082) | AUS + CAN + GPE (N = 10,711) |
|---|---|---|---|---|---|
| M12 | RMSEcv (kg) | 3.39 | 3.62 | 3.12 | 3.42 |
| | R2cv | 0.34 | 0.35 | 0.47 | 0.37 |
| | RMSEv (kg) | 5.05 | 4.26 | 5.41 | NA |
| M15 | RMSEcv (kg) | 3.27 | 3.45 | 2.95 | 3.28 |
| | R2cv | 0.38 | 0.41 | 0.53 | 0.42 |
| | RMSEv (kg) | 4.45 | 3.75 | 6.34 | NA |
| M16 | RMSEcv (kg) | 3.27 | 3.43 | 2.94 | 3.27 |
| | R2cv | 0.38 | 0.41 | 0.53 | 0.42 |
| | RMSEv (kg) | 4.63 | 3.8 | 5.49 | NA |
| M19 | RMSEcv (kg) | 3.27 | 3.43 | 2.94 | 3.27 |
| | R2cv | 0.38 | 0.42 | 0.53 | 0.42 |
| | RMSEv (kg) | 4.4 | 3.73 | 6.03 | NA |
| M20 | RMSEcv (kg) | 3.27 | 3.43 | 2.94 | 3.27 |
| | R2cv | 0.38 | 0.41 | 0.53 | 0.42 |
| | RMSEv (kg) | 4.59 | 3.81 | 5.43 | NA |
| M22 | RMSEcv (kg) | 3.27 | 3.43 | 2.94 | 3.27 |
| | R2cv | 0.38 | 0.42 | 0.53 | 0.42 |
| | RMSEv (kg) | 4.39 | 3.73 | 5.99 | NA |
1 M12 = PRT + MIR + pBW, M15 = M12 + MY, M16 = M15 + WOL, M19 = M16 + pFA, M20 = M16 + pMC, and M22 = M19 + pMC, with milk yield (MY), predicted bodyweight (pBW), days in milk (DIM), week of lactation (WOL), parity (PRT), milk mid-infrared spectra (MIR), predicted fatty acids (pFA), predicted milk composition (pMC: fat, lactose, protein). 2 Cow-independent root mean square error of cross-validation (RMSEcv), with calibration made on either the AUS, CAN, or GPE dataset. Cow-independent R-squared of cross-validation (R2cv), with calibration made on either the AUS, CAN, or GPE dataset. Out-of-sample country-independent root mean square error of validation (RMSEv), with validation made on the AUS, CAN, or GPE dataset whenever that dataset did not participate in the calibration process. NA if no data were left for out-of-sample country-independent validation. 3 Calibration datasets, with N being the number of records of each dataset.
Performance of cow-independent cross-validation and out-of-sample country-independent validation for artificial neural networks predicting the dry matter intake (in kg) of dairy cows.
| Model 1 | Dataset 2 | Size 3 | RMSEcv 4 | R2cv 5 | RPDcv 6 | RMSEv 7 |
|---|---|---|---|---|---|---|
| M19 | AUS ∪ CAN | 3 | 3.25 | 0.39 | 1.28 | 4.2 |
| | AUS ∪ GPE | 3 | 3.46 | 0.41 | 1.30 | 3.69 |
| | CAN ∪ GPE | 2 | 3 | 0.51 | 1.43 | 5.08 |
| | AUS ∪ CAN ∪ GPE | 2 | 3.25 | 0.43 | 1.33 | NA |
1 M19 = PRT + MIR + pBW + MY + WOL + pFA, with milk yield (MY), predicted bodyweight (pBW), days in milk (DIM), week of lactation (WOL), parity (PRT), milk mid-infrared spectra (MIR), predicted fatty acids (pFA). 2 Calibration datasets. 3 Number of nodes in the hidden layer of the artificial neural network. 4 Cow-independent root mean square error of cross-validation (RMSEcv). 5 Cow-independent R-squared of cross-validation (R2cv). 6 Ratio of performance to deviation of cross-validation (RPDcv). 7 Out-of-sample country-independent root mean square error of validation (RMSEv), with validation made on the AUS, CAN, or GPE dataset whenever that dataset did not participate in the calibration process. NA if no data were left for out-of-sample country-independent validation.
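For the artificial neural network, the abstract states that the spectra were replaced by their projections onto the first 25 PLS factors, which explain 99% of the spectral variability. One common way to choose such a cut-off is to take the smallest number of factors whose cumulative explained variance reaches the threshold. The sketch below shows that rule only; the function name and the variance ratios are hypothetical and do not come from the study.

```python
def n_factors_for(explained_ratios, threshold=0.99):
    """Smallest number of leading factors whose cumulative ratio reaches threshold."""
    total = 0.0
    for k, ratio in enumerate(explained_ratios, start=1):
        total += ratio
        if total >= threshold:
            return k
    raise ValueError("threshold not reached with the available factors")

# Hypothetical per-factor shares of spectral variance, for illustration only.
ratios = [0.60, 0.25, 0.10, 0.03, 0.015, 0.005]
k = n_factors_for(ratios)
```

Feeding the network a few dozen factor scores instead of several hundred spectral points reduces the number of input weights, which is the complexity reduction the abstract refers to.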
Comparison of out-of-sample country-independent performance between models M19, M19b, and the NRC (2001) equation.
| Validation Data | RMSEv, nrc2001 1 | RMSEv, PLS M19 2 | RMSEv, ANN M19 3 |
|---|---|---|---|
| AUS | 4.72 | 6.03 | 5.08 |
| CAN | 4.29 | 3.73 | 3.69 |
| GPE | 3.96 | 4.4 | 4.2 |
1 Out-of-sample country-independent root mean square error of validation (RMSEv) for the equation provided by the National Research Council [1], with validation made on either the AUS, CAN, or GPE dataset. 2 Out-of-sample country-independent root mean square error of validation (RMSEv) for the partial least squares M19 model (M19 = PRT + MIR + pBW + MY + WOL + pFA), with validation made on either the AUS, CAN, or GPE dataset, with milk yield (MY), predicted bodyweight (pBW), days in milk (DIM), week of lactation (WOL), parity (PRT), milk mid-infrared spectra (MIR), predicted fatty acids (pFA). 3 Out-of-sample country-independent root mean square error of validation (RMSEv) for the artificial neural network M19 model (M19 = PRT + MIR + pBW + MY + WOL + pFA), with validation made on either the AUS, CAN, or GPE dataset, with milk yield (MY), predicted bodyweight (pBW), days in milk (DIM), week of lactation (WOL), parity (PRT), projections of the milk mid-infrared spectra onto the first 25 PLS factors explaining 99% of their spectral variability (MIR), predicted fatty acids (pFA).
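For readers unfamiliar with the benchmark in this table, the sketch below shows the dry matter intake equation for lactating cows as it is commonly cited from NRC (2001), together with the Gaines formula for 4% fat-corrected milk. This record does not reproduce the equation, so the coefficients, the choice of fat-correction formula, and the input values are assumptions to be checked against the original publication.

```python
import math

def fcm4(milk_kg, fat_pct):
    """4% fat-corrected milk (Gaines formula): 0.4 * milk + 15 * fat yield, in kg."""
    fat_kg = milk_kg * fat_pct / 100.0
    return 0.4 * milk_kg + 15.0 * fat_kg

def nrc2001_dmi(fcm_kg, bodyweight_kg, week_of_lactation):
    """Dry matter intake (kg/d) as commonly cited from NRC (2001); assumed form."""
    energy_term = 0.372 * fcm_kg + 0.0968 * bodyweight_kg ** 0.75
    # The exponential factor lowers predicted intake in early lactation.
    lactation_factor = 1.0 - math.exp(-0.192 * (week_of_lactation + 3.67))
    return energy_term * lactation_factor

# Hypothetical cow, for illustration only: 30 kg milk at 4.0% fat, 600 kg, week 20.
fcm = fcm4(milk_kg=30.0, fat_pct=4.0)
dmi = nrc2001_dmi(fcm_kg=fcm, bodyweight_kg=600.0, week_of_lactation=20)
```

The equation needs only fat-corrected milk yield, bodyweight, and week of lactation, so it can be evaluated on each validation dataset without any calibration step.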
Figure 3. Evolution of the predicted dry matter intake for Walloon Holstein cows with (a) lactation and (b) year. Predictions from the partial least squares regression model M19 are shown in blue and those from the artificial neural network M19 in red. Evolution of (c) predicted fat content in milk and (d) milk yield and 4% fat-corrected milk yield with year.