
Machine Learning Based Prediction of Insufficient Herbage Allowance with Automated Feeding Behaviour and Activity Data.

Abu Zar Shafiullah¹, Jessica Werner², Emer Kennedy³, Lorenzo Leso⁴, Bernadette O'Brien⁵, Christina Umstätter⁶.

Abstract

Sensor technologies that measure grazing and ruminating behaviour as well as physical activities of individual cows are intended to be included in precision pasture management. One of the advantages of sensor data is that they can be analysed to support farmers in many decision-making processes. This article thus considers the performance of a set of RumiWatchSystem recorded variables in the prediction of insufficient herbage allowance for spring calving dairy cows. Several commonly used machine learning (ML) models were applied to the binary classification problem, i.e., sufficient or insufficient herbage allowance, and the predictive performance was compared based on classification evaluation metrics. Most of the ML models and the generalised linear model (GLM) performed similarly in the leave-out-one-animal (LOOA) approach to validation studies. However, cross validation (CV) studies, where a portion of features in the test and training data resulted from the same cows, revealed that support vector machine (SVM), random forest (RF) and extreme gradient boosting (XGBoost) performed relatively better than the other candidate models. In general, these ML models attained 88% AUC (area under receiver operating characteristic curve) and around 80% sensitivity, specificity, accuracy, precision and F-score. This study further identified that the number of rumination chews per day and grazing bites per minute were the most important predictors and examined the marginal effects of the variables on model prediction towards a decision support system.


Keywords: binary classification; feeding behaviour and activities; herbage allowance; machine learning; precision pasture management


Year: 2019 PMID： 31623092 PMCID： PMC6832637 DOI： 10.3390/s19204479

Source DB: PubMed Journal: Sensors (Basel) ISSN： 1424-8220 Impact factor: 3.576

1. Introduction

One of the key roles of precision pasture management is to ensure that herbage allowance is well maintained and utilised for individual cows through the application of smart farming technologies. For these technologies to be used economically and efficiently, it is important that the recorded data are analysed to assist farmers in diverse decision-making processes. The RumiWatchSystem, consisting of a noseband pressure sensor [1] and a pedometer [2], is such a sensor-based technology, with which the physical activities as well as the grazing and ruminating behaviour of individual cows can be recorded. The reliability and validity of sensor data and their applications in precision farming have been studied widely in the literature. For example, Greenwood, et al. [3] proposed simple initial algorithms for predicting pasture intake by individual cattle using sensor data. Other studies (e.g., [4,5]) addressed the scope of developing support systems that could assist farmers with proper feed allowances, physical activities and behavioural changes, estimation of herbage dry matter and locomotion behaviour of the cattle. In a similar context, the present study considers the problem of identifying cows with insufficient herbage allowance based on a set of RumiWatchSystem recorded variables. Since direct measurement of the herbage intake of cows on pasture is difficult, time consuming and expensive, this study explored the scope of using the variables as predictors of a decision class in binary classification, i.e., sufficient or insufficient herbage allowance. The data were collected from a study where a group of spring calving dairy cows had access to 100% of their intake capacity as herbage allowance, whereas another group had 60% of their intake capacity [6]. Each cow was equipped with an automated noseband pressure sensor and a pedometer, which continuously recorded the feeding and activity related variables.
For the present study, the recorded variables were summarised (total or mean) to extract the features in 24-h windows. The rationale of this study lies in the fact that the complexities of herbage intake measurements can be reduced substantially if a classification model is found that efficiently predicts the insufficient allowance using the extracted features, towards a decision support system for optimal pasture management. The subsequent sections of this article are organised as follows. The datasets used in this study, exploratory analysis for variable selection, commonly used machine-learning (ML) models in R [7] and the performance metrics used for evaluating and comparing the models are discussed in Section 2. Section 3 demonstrates the results of validation studies for the commonly used ML models and generalised linear model (GLM). This section further identifies the important variables, observed thresholds and the marginal effects of the variables on the model prediction. Section 4 discusses the study findings followed by a summary of this article in Section 5.

2. Materials and Methods

2.1. Data Collection

Data were collected for this study from a larger overall experiment at Teagasc, Moorepark Dairy Research Farm, Animal & Grassland Research and Innovation Centre, Fermoy, Co. Cork, Ireland. The experiment was conducted in spring 2016 using 105 calving cows to examine the effects of restricted herbage allowance on milk production, immunology and indicators of reproductive health of grazing dairy cows. Ethical approval was received from the Teagasc Animal Ethics Committee (TAEC; TAEC100/2015) and the procedure authorisation was granted by the Irish Health Products Regulatory Authority (HPRA). For the present study, 40 focal cows were selected for recording the feeding behaviour and activities using the RumiWatchSystem. Out of these, 10 cows were randomly selected to have 100% of their intake capacity. The remaining 30 cows had restricted herbage allowance, i.e., 60% of their intake capacity. The 60% group was further divided into six blocks with respect to the period of restriction (two-week or six-week) and the stage of lactation at the commencement of restriction: start (S: restriction started at the beginning of the experiment), mid (M: two weeks after the S restriction commenced) or late (L: four weeks after the S restriction commenced). The behaviour of cows in the 100% group was monitored over a 10-week period. The three blocks S2, M2 and L2, which received a two-week restriction of herbage allowance, had their behaviour recorded during the full two-week period, whereas the behaviour of blocks M6 and L6, which received a six-week restriction, was recorded during the last two weeks of the restriction period. The S6 block was monitored during the entire six-week restriction period in order to mitigate the imbalance in the number of rows for the 100% and 60% groups in the combined data. The RumiWatchSystem recorded pressure and accelerometer data at a resolution of 10 Hz.
The raw data were then converted into one-hour summaries by generic algorithms included in the RumiWatch Converter V.7.3.36, which were later summarised in individual daily records (features) per animal. Some data were lost and some cows were replaced because of injuries and sensor breakdowns. As a result, 63 individual daily records per cow were included for the 100% group over the 10-week period, and 12 or 13 daily records per individual cow for the 60% group (except the S6 block), depending on the application time of the sensor, as only complete daily records during the two-week period were considered. Only two cows had fewer than 12 daily records, due to technical issues with the sensor device. In the case of the S6 block, 38 individual daily records were included for four cows and 36 daily records for one cow during the six-week period. Missing and incomplete rows were removed so that the prediction performance of the competing models could be compared strictly on complete data. Thus, the combined dataset included 1096 rows and 21 columns, with 629 rows for the cows with 100% herbage allowance and 467 rows for the cows with 60% allowance. Each column included the extracted features (daily mean or total) of individual cows based on a recorded feeding behaviour or activity related variable. Out of the 21 features (variables), those listed in Table 1 were, on average, significantly different in the 100% and 60% allowance groups, and hence were considered as model predictors in this study. The study design is further discussed in [6].
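The step from one-hour summaries to complete daily records can be sketched as follows. This is only an illustration in Python (the study itself worked in R), and several things are assumptions: the record layout, the made-up values, the choice of a total for RUMINATECHEW and a mean for BITEFREQ, and the `hours_per_day` parameter, which is lowered to 2 so that the toy data form a "complete" day.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical hourly summaries: (cow_id, day, variable, value).
# Variable names follow Table 1; the numbers are invented for illustration.
hourly = [
    ("cow01", "day1", "RUMINATECHEW", 1200),
    ("cow01", "day1", "RUMINATECHEW", 900),
    ("cow01", "day1", "BITEFREQ", 60.0),
    ("cow01", "day1", "BITEFREQ", 70.0),
    ("cow01", "day2", "BITEFREQ", 55.0),  # incomplete day, will be dropped
]

# Assumption: per-day counts are totals, per-minute rates are means.
AGGREGATE = {"RUMINATECHEW": sum, "BITEFREQ": mean}

def daily_features(records, hours_per_day=24):
    """Collapse hourly records into one feature row per (cow, day)."""
    grouped = defaultdict(lambda: defaultdict(list))
    for cow, day, variable, value in records:
        grouped[(cow, day)][variable].append(value)
    rows = {}
    for key, variables in grouped.items():
        # Keep only complete days: every variable needs all hourly values.
        if all(len(vals) == hours_per_day for vals in variables.values()):
            rows[key] = {name: AGGREGATE[name](vals)
                         for name, vals in variables.items()}
    return rows

features = daily_features(hourly, hours_per_day=2)
print(features)
```

Dropping incomplete days before aggregation mirrors the statement that only complete daily records were considered; how the original pipeline detected incompleteness is not described in the text.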

Table 1

List of feeding behaviour and activity related variables used in the classification models.

Notation	Grazing Behaviour
BITEFREQ	Bite frequency or grazing bites per min (n/min)
GRAZINGSTART	Number of grazing bouts started per day (grazing bout = minimum duration of 7 min and intra-bout interval is smaller than 7 min [9]) (n/day)
	Rumination Behaviour
RUMINATECHEW	Number of rumination chews per day (n/day)
RUMICHEWBOLUS	Mean number of rumination chews per bolus (n/bolus)
RUMIBOUTLENGTH	Mean duration of a rumination bout (rumination bout = minimum duration of 3 min and intra-bout interval is smaller than 1 min [9]) (min/bout)
RUMIBOUTTIME	Time of rumination within all rumination bouts (min/day)
	Activity
HACTIVITY	Head movement activity index (n) based on accelerometer data; the averaged variance of 3-dimensional acceleration captured on the head in 10-s segments
LAYDOWN	Number of events (n) at which the pedometer angle changes its position from a vertical angle towards a horizontal angle for a duration of at least 50 s when the cow is lying down or standing up [2]

The combined dataset was divided into six subsets based on the blocks of cows in the 60% allowance group. Throughout this paper, S2, S6, M2, M6, L2 and L6 denote the blocks of cows with restricted allowance as well as the datasets, which contained the respective rows from the 60% and 100% herbage allowance groups. In addition, the 100% and 60% groups are called sufficient allowance and insufficient allowance in the prediction of decision classes. The S2, M2 and L2 datasets were merged to create W2, which comprised the recorded features for the two-week duration. Similarly, the S6, M6 and L6 datasets were merged to create W6. These additional subsets of the combined data were used to compare the changes in prediction performance as the duration of 60% herbage allowance increased from two to six weeks, regardless of the lactation stages of the cows. Thus, the numbers of rows which corresponded to the cows with unrestricted and restricted allowance in the subsets S2, S6, M2, M6, L2 and L6 were (130, 65), (130, 65), (119, 60), (130, 56), (130, 52), and (120, 38), respectively. In the present study, a number of predictive models were first applied to the combined data and the performance was compared based on the leave-out-one-animal (LOOA [8]) approach to validation and cross validation (CV) studies. The models were further compared using the subsets of the combined data based on CV studies.

2.2. Variable Selection

A set of predictor variables was selected based on the exploratory analysis, i.e., box plots (Figure 1), t-tests (Table A1 and Table A2) and analysis of variance (Table A3). The selected variables were broadly classified as grazing behaviour, rumination behaviour and activity. The definitions, measurement units and notations used to denote the variables are presented in Table 1. For each variable, the measurement unit indicates the extracted feature (using 24-h window) considered in this study. Throughout this paper, the variable names will refer to the corresponding features extracted from the sensor data.

Figure 1

Side-by-side box plots of selected variables using the combined data for sufficient (100%) and insufficient (60%) herbage allowance groups.

Table A1

Independent samples t-tests for significant differences of means of each predictor in the 100% and 60% herbage allowance groups using S2, S6, M2, M6, L2 and L6 data.

Variable	S2	S6	M2	M6	L2	L6
BITEFREQ	6.35 ***	4.76 ***	3.48 *	5.04 ***	3.2 ***	3.07 ***
GRAZINGSTART	−0.97 **	−0.29	0.05	−0.68	−0.76 *	−0.21
RUMINATECHEW	−1689 *	−9169 ***	−7482 ***	−3969 ***	−8429 ***	−7218 ***
RUMICHEWBOLUS	−1.24	−8.12 ***	−2.87 ***	−3.06 ***	−3.71 ***	−4.95 ***
RUMIBOUTLENGTH	−3.47 ***	−10.90 ***	−7.18 ***	−6.04 ***	−7.12 ***	−9.0 ***
RUMIBOUTTIME	−12.76	−107.04 ***	−105.07 ***	−53.9 ***	−104.4 ***	−80.0 ***
HACTIVITY	13.81 ***	15.31 ***	6.71	17.9 ***	8.53 **	9.29 *
LAYDOWN	−2.09 ***	−1.48 ***	−2.45 ***	−1.32 **	−1.36 ***	−0.81

P-value: *** < 0.001; ** < 0.01; * < 0.05.

Table A2

Independent samples t-tests for significant differences of means of each predictor in the 100% and 60% herbage allowance groups using W2, W6 and combined data.

Variable	W2	W6	Combined
BITEFREQ	4.5 ***	4.22 ***	4.48 ***
GRAZINGSTART	−0.538 *	−0.343	−0.60 **
RUMINATECHEW	−5812 ***	−6840 ***	−6680 ***
RUMICHEWBOLUS	−2.52 ***	−5.45 ***	−4.48 ***
RUMIBOUTLENGTH	−6.0 ***	−8.71 ***	−7.2 ***
RUMIBOUTTIME	−73.48 ***	−81.7 ***	−80.1 ***
HACTIVITY	9.89 ***	14.2 ***	10.2 ***
LAYDOWN	−2.0 ***	−1.19 ***	−1.7 ***

P-value: *** < 0.001; ** < 0.01; * < 0.05.

Table A3

ANOVA F-test and multiple comparisons tests for the blocks of 60% and 100% herbage allowance groups using each predictor of the combined data.

Variable	S2	S6	M2	M6	L2	L6	F-Test
BITEFREQ	6.2 ***	4.9 ***	2.4 **	4.9 ***	3.5	4.9 ***	***
GRAZINGSTART	−0.13	−0.83	−0.61 ***	−0.65	−1.82	0.60	***
RUMINATECHEW	−2672 **	−8147 ***	−7154 ***	−6517 ***	−11,215 ***	−1674	***
RUMICHEWBOLUS	−0.067	−6.76 ***	−3.12 **	−6.43 ***	−2.85 **	−3.62 ***	***
RUMIBOUTLENGTH	−3.82 **	−8.11 ***	−4.87 ***	−7.65 ***	−10.7 **	−6.27 ***	***
RUMIBOUTTIME	−29.42 *	−93.06 ***	−85.13 ***	−65.85	−154.47 ***	−26.37 ***	***
HACTIVITY	13.44 ***	9.31 ***	7.03	13.08 **	−5.08	25.22 ***	***
LAYDOWN	−2.40 ***	−1.78 ***	−1.13	−1.59 ***	−2.04 ***	−0.92	***

P-value: *** < 0.001; ** < 0.01; * < 0.05.

On average, the RumiWatchSystem-recorded measures of these variables in the sufficient allowance group were significantly different from those in at least one of the blocks of the insufficient allowance group. For example, using the combined data, the side-by-side box plots in Figure 1 show that most of the selected variables centred higher in the 100% group than in the 60% group, except bite frequency per minute (BITEFREQ) and head activity index (HACTIVITY), which centred higher in the 60% group. In this study, the GLM and ML models used these variables as predictors of the herbage allowance classes.
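As a rough illustration of the kind of two-group comparison behind Tables A1 and A2, the sketch below computes a mean difference and a t statistic for two independent samples. It is written in Python rather than the R used in the study, the data are invented, and the unequal-variance (Welch) form is an assumption, since the text does not say which variant of the independent samples t-test was applied. The sign convention (restricted minus unrestricted) is likewise inferred from the positive BITEFREQ entries, not stated in the source.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (mean difference a - b, Welch t statistic)."""
    diff = mean(sample_a) - mean(sample_b)
    # Standard error of the difference without assuming equal variances.
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    return diff, diff / se

# Invented daily bite frequencies (n/min) for the two allowance groups.
restricted = [68.0, 70.0, 72.0, 71.0, 69.0]    # 60% allowance
unrestricted = [64.0, 66.0, 65.0, 67.0, 63.0]  # 100% allowance

diff, t_stat = welch_t(restricted, unrestricted)
print(f"mean difference = {diff:.2f}, t = {t_stat:.2f}")
```

A p-value would additionally require the t distribution with the Welch-Satterthwaite degrees of freedom, which is omitted here to keep the sketch dependency-free.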

2.3. Classification Models

The commonly used ML models and a GLM with binomial family were considered for the binary classification problem. For convenience, the dependent variable herbage allowance is denoted by $y$, where $y = 1$ and $y = 0$ refer to the insufficient and sufficient herbage allowance class, respectively. Given a set of predictor variables $X$ for $n$ observations, the GLM with logit link (Equation (1)),

$$\operatorname{logit}(p_i) = \log\left(\frac{p_i}{1 - p_i}\right) = \mathbf{x}_i^{\top}\boldsymbol{\beta}, \qquad (1)$$

predicts insufficient herbage allowance if the estimated logit is positive, $\operatorname{logit}(\hat{p}_i) > 0$, or sufficient allowance if $\operatorname{logit}(\hat{p}_i) \le 0$. Here, $p_i$ denotes the probability of insufficient allowance and $1 - p_i$ denotes the probability of sufficient allowance for the $i$th observation ($i = 1, \ldots, n$), $\mathbf{x}_i$ is the corresponding row of $X$ and $\boldsymbol{\beta}$ is the vector of regression coefficients. The GLM was implemented using the glm function of the stats package in R [7]. Table 2 presents the list of ML methods considered in this study, and the packages that implement the methods in R. In each case, the underlying classification model used the variables of Table 1 as predictors. For more details on the hyper-parameters of the specific ML methods, see the R package [10].
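The decision rule of a logit-link GLM can be illustrated with a short Python sketch (the study fitted its models in R). The coefficients, intercept and feature values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not estimates from the paper. The rule assumed here is the usual one: a positive estimated logit, equivalently a predicted probability above 0.5, is classified as insufficient allowance.

```python
from math import exp

def predict_allowance(features, coefficients, intercept):
    """Return (logit, probability of insufficient allowance, class label)."""
    logit = intercept + sum(coefficients[name] * value
                            for name, value in features.items())
    probability = 1.0 / (1.0 + exp(-logit))  # inverse of the logit link
    label = 1 if logit > 0 else 0            # 1 = insufficient, 0 = sufficient
    return logit, probability, label

# Hypothetical fitted values, for illustration only.
coefficients = {"BITEFREQ": 0.10, "RUMINATECHEW": -0.0002}
intercept = -1.0
cow_day = {"BITEFREQ": 70.0, "RUMINATECHEW": 25000.0}

logit, probability, label = predict_allowance(cow_day, coefficients, intercept)
print(f"logit = {logit:.2f}, p = {probability:.3f}, class = {label}")
```

The signs of the made-up coefficients follow the direction of the group differences reported for these two variables (higher bite frequency and fewer rumination chews under restriction), but their magnitudes carry no meaning.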

Table 2

List of machine learning methods with R packages.

Machine Learning	R Package	Function(s)
k-Nearest Neighbour (kNN)	class [11]	knn
Linear Discriminant Analysis (LDA)	MASS [11]	lda
Neural Network (NNET)	nnet [11]	nnet
Naïve Bayes (NB)	e1071 [12]	naiveBayes
Support Vector Machine (SVM)	e1071 [12]	svm
Decision Tree (DT)	rpart [13]	rpart
Random Forest (RF)	randomForest [14]	randomForest
Extreme Gradient Boosting (XGBoost)	xgboost [15]	xgb.DMatrix, xgb.train

In this study, the performance of the ML models and GLM was first compared using the combined data. At this stage, the predictive performance of the models was evaluated based on the LOOA approach and CV studies. Then, the GLM and selected ML models, which achieved desirable performance, were further compared based on CV studies using the S2, M2, L2, S6, M6, L6, W2 and W6 datasets. Thus, the effect of the restriction period on predictive performance was examined for separate blocks and regardless of the lactation stages of the calving cows. Finally, the important variables and partial dependencies of model prediction were examined for random forest (RF).

2.4. Evaluation Metrics

The prediction performance of the candidate models was compared based on a number of classification evaluation metrics. The metrics were estimated in validation studies using the confusion matrix (Table 3) of actual and predicted classes for the test cases. Table 4 shows the estimation formulae for the list of metrics considered in this study.

Table 3

Confusion matrix for estimating the classification evaluation metrics based on the number of actual and predicted classes among the test cases.

	Allocated Herbage Allowance
Predicted Herbage Allowance	Insufficient	Sufficient
Insufficient	True Positive (TP)	False Positive (FP)
Sufficient	False Negative (FN)	True Negative (TN)

Table 4

Estimators of sensitivity, specificity, accuracy, positive predictive value (PPV), F-score and the area under receiver operating characteristic curve (AUC) in terms of the number of true positive (TP), false positive (FP), true negative (TN) and false negative (FN) classes among the test cases.

Evaluation Metric	Estimator
Sensitivity	TP / (TP + FN)
Specificity	TN / (TN + FP)
Accuracy	(TN + TP) / (TP + FP + TN + FN)
Positive predictive value (PPV)	TP / (TP + FP)
F-score	(2 × Sensitivity × PPV) / (Sensitivity + PPV)
AUC	Area under ROC curve
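The estimators in Table 4 can be written out as a small Python sketch (illustrative only; the study used R). The confusion-matrix counts and the scores are invented. The AUC function uses the pairwise-ranking interpretation, i.e., the share of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half; this is one standard way to compute the area under the ROC curve and is not necessarily the routine the authors used.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the Table 4 metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    ppv = tp / (tp + fp)
    # F-score is the harmonic mean of sensitivity and PPV.
    f_score = 2 * sensitivity * ppv / (sensitivity + ppv)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "ppv": ppv,
        "f_score": f_score,
    }

def auc_by_ranking(scores_positive, scores_negative):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(scores_positive) * len(scores_negative))

# Invented counts: positive = insufficient allowance.
metrics = classification_metrics(tp=80, fp=20, tn=80, fn=20)
# Invented predicted probabilities of insufficient allowance.
auc = auc_by_ranking([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(metrics, round(auc, 3))
```

With these balanced toy counts every confusion-matrix metric equals 0.8; with unbalanced classes, as in the subsets of this study, accuracy and PPV can diverge noticeably from sensitivity.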

For binary classification, one way to evaluate the performance of a predictive model is the estimation of accuracy, i.e., the rate of correctly predicting the class of a test case. Accuracy is a commonly used evaluation metric since it takes into account both true negative and true positive rates. Here, negative means sufficient allowance and positive means insufficient allowance. However, in the case of imbalanced training data, accuracy is often overestimated. The area under receiver operating characteristic curve (AUC [16]) also considers true negative and true positive rates and is often used along with other evaluation metrics. In the context of the present study, AUC denotes the probability that a randomly chosen cow with insufficient allowance is ranked higher than a randomly chosen cow with sufficient allowance. Both accuracy and AUC range in value from 0 to 1, a higher value indicating greater ability to discriminate one class from the other. According to Steensels, et al. [17], a diagnostic test is usually classified, in order of decreasing AUC, as excellent, good, fair, poor or fail (the AUC range of each class is given in [17]). Since the subsets of the combined data were unbalanced, accuracy and AUC were not sufficient in this study to validate the performance of the competing models. Moreover, in the case of an animal monitoring model, it is often more important to identify cows with insufficient feed allowance than those with sufficient allowance. Thus, additional metrics, namely specificity, sensitivity, positive predictive value (PPV) and F-score, were considered in this study. Here, specificity (the rate of predicting sufficient allowance given that the cow had sufficient allowance) assesses the prediction performance for the test cows in the 100% herbage allowance group. Conversely, sensitivity, PPV and F-score focus on the correct prediction rate for cows with insufficient herbage allowance.
Sensitivity of a model estimates the rate at which insufficient allowance was predicted when a randomly selected cow actually had 60% allowance. The PPV metric further estimates the proportion of predicted insufficient allowances that were actually insufficient. The F-score considers both sensitivity and PPV since it is the harmonic mean of these two metrics. Thus, a high F-score implies that the model is highly efficient in predicting insufficient herbage allowance. In this study, the performance of the candidate models was compared based on the estimates of these metrics using validation studies. For the combined data, the estimates were first obtained based on the LOOA approach, where the data from one animal form the test set while the data from the remaining animals form the training set. Since the candidate models are then trained without any features from the animal in the test set, the LOOA approach gives estimated metrics that are more reliable for the prediction of a new (unseen) animal. However, in the present context, since the previous data of cows on pasture can be included in the training set, the evaluation metrics were further estimated based on CV studies. This approach identified the models that may perform relatively better when a support system continuously updates the training data with the previous records of cows on pasture. Given a dataset, the CV study was conducted as follows.

i. Randomly split the observations into a training and a test set such that each observation has a 70% chance to be included in the training set and a 30% chance to be included in the test set.
ii. Train the ML models (fit the GLM) in the training set and apply them for predicting the herbage allowance classes in the test set.
iii. Create a confusion matrix for each model and estimate the evaluation metrics of Table 4.
iv. Repeat Steps i–iii a large number (1000) of times and summarise the results by the mean and standard error of the estimates for each model.
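The two validation schemes described above can be sketched in Python as follows (the study itself used R). Everything concrete here is an assumption made for illustration: the synthetic one-feature data, the stand-in threshold classifier in place of the ML models and GLM, the helper names, and the standard error computed as the standard deviation of the repeated estimates divided by the square root of the number of repeats.

```python
import random
from math import sqrt
from statistics import mean, stdev

def looa_splits(animal_ids):
    """Yield (train, test) row indices, holding out one animal at a time."""
    for animal in sorted(set(animal_ids)):
        test = [i for i, a in enumerate(animal_ids) if a == animal]
        train = [i for i, a in enumerate(animal_ids) if a != animal]
        yield train, test

def random_split(n_rows, rng, p_train=0.7):
    """Send each row to the training set with probability p_train, else to test."""
    train, test = [], []
    for i in range(n_rows):
        (train if rng.random() < p_train else test).append(i)
    return train, test

def repeated_cv(n_rows, evaluate, repeats=1000, seed=0):
    """Repeat the random split; return (mean, standard error) of the metric."""
    rng = random.Random(seed)
    values = [evaluate(*random_split(n_rows, rng)) for _ in range(repeats)]
    return mean(values), stdev(values) / sqrt(repeats)

# Synthetic data (not from the study): 20 animals x 10 daily rows, one feature.
data_rng = random.Random(1)
animals = [f"cow{k:02d}" for k in range(20) for _ in range(10)]
y = [k % 2 for k in range(20) for _ in range(10)]  # 1 = insufficient allowance
x = [data_rng.gauss(2.0 * label, 1.0) for label in y]

def hits(train, test):
    """Stand-in classifier: threshold midway between the training class means."""
    pos = mean(x[i] for i in train if y[i] == 1)
    neg = mean(x[i] for i in train if y[i] == 0)
    threshold = (pos + neg) / 2
    return sum((x[i] > threshold) == (y[i] == 1) for i in test)

# LOOA: pool the held-out predictions of all animals into one estimate.
looa_accuracy = sum(hits(tr, te) for tr, te in looa_splits(animals)) / len(y)
# Repeated random split: 200 repeats here to keep the example quick.
cv_mean, cv_se = repeated_cv(len(y), lambda tr, te: hits(tr, te) / len(te),
                             repeats=200)
print(f"LOOA accuracy {looa_accuracy:.2f}; CV accuracy {cv_mean:.2f} (SE {cv_se:.3f})")
```

The random split deliberately ignores animal identity, so rows from the same cow can land in both the training and the test set, which is the property that distinguishes the CV study from the LOOA approach.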

3. Results

3.1. Predictive Performance

Table 5 and Table 6 summarise the results for combined data using LOOA approach to validation and CV studies. In LOOA approach, since the estimates were obtained by using a single confusion matrix for all calving cows under study, the standard errors of the estimates were not applicable.

Table 5

Predictive performance of machine learning and generalised linear models based on the estimated sensitivity, specificity, accuracy, positive predictive value (PPV), F-score and the area under receiver operating characteristic curve (AUC) using leave-out-one-animal approach to validation studies for combined data.

Classifier	Sensitivity	Specificity	Accuracy	PPV	F-Score	AUC
kNN	0.70	0.71	0.71	0.64	0.67	0.78
NB	0.72	0.74	0.73	0.68	0.70	0.81
NNET	0.77	0.67	0.71	0.63	0.70	0.80
LDA	0.78	0.65	0.70	0.62	0.69	0.79
DT	0.74	0.67	0.70	0.63	0.68	0.78
SVM	0.74	0.61	0.67	0.59	0.66	0.74
XGBoost	0.73	0.59	0.65	0.57	0.64	0.72
RF	0.75	0.63	0.68	0.60	0.67	0.76
GLM	0.74	0.64	0.69	0.63	0.70	0.76

The estimates in bold correspond to the best models.

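Table 6

Predictive performance of machine learning and generalised linear models based on the estimated sensitivity, specificity, accuracy, positive predictive value (PPV), F-score and the area under receiver operating characteristic curve (AUC) using cross validation studies for combined data.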

Classifier	Sensitivity	Specificity	Accuracy	PPV	F-Score	AUC
kNN	0.66 (0.004)	0.67 (0.003)	0.67 (0.002)	0.65 (0.003)	0.65 (0.003)	0.73 (0.003)
NB	0.73 (0.004)	0.73 (0.003)	0.73 (0.002)	0.71 (0.004)	0.72 (0.003)	0.81 (0.003)
NNET	0.76 (0.004)	0.78 (0.004)	0.76 (0.002)	0.76 (0.004)	0.76 (0.003)	0.85 (0.003)
LDA	0.76 (0.003)	0.78 (0.004)	0.77 (0.002)	0.77 (0.004)	0.76 (0.002)	0.85 (0.002)
DT	0.75 (0.003)	0.76 (0.004)	0.75 (0.003)	0.74 (0.005)	0.74 (0.003)	0.83 (0.003)
SVM	0.79 (0.004)	0.80 (0.003)	0.79 (0.002)	0.79 (0.004)	0.79 (0.002)	0.88 (0.002)
XGBoost	0.79 (0.003)	0.81 (0.003)	0.80 (0.002)	0.80 (0.003)	0.79 (0.002)	0.88 (0.002)
RF	0.80 (0.003)	0.80 (0.003)	0.80 (0.002)	0.79 (0.004)	0.79 (0.002)	0.88 (0.002)
GLM	0.76 (0.004)	0.77 (0.003)	0.76 (0.002)	0.76 (0.004)	0.76 (0.003)	0.85 (0.003)

Standard errors are indicated in parentheses. The estimates in bold correspond to the best models.

It can be observed that in both LOOA and CV studies the ML models predicted the sufficient and insufficient allowance classes relatively more accurately than GLM. Table 5 reveals that, on average, the prediction accuracy of insufficient allowance using linear discriminant analysis (LDA) (78% sensitivity) and that of sufficient allowance using naïve Bayes (NB) (74% specificity) were higher than those of all other models. Additionally, the NB model attained relatively higher prediction accuracy (73%), PPV (68%), F-score (70%) and AUC (81%), which indicates that the model can be more reliable in predicting the herbage allowance classes of new calving cows based on the current data. The neural network (NNET) and GLM also attained F-scores equal to 70%. The more advanced ML models such as RF and XGBoost attained similar accuracy when predicting the insufficient allowance but relatively lower accuracy when predicting the sufficient allowance in the LOOA approach. The sensitivity, specificity, accuracy, PPV, F-score and AUC estimates of the random forest (RF) model were 75%, 63%, 68%, 60%, 67% and 76%, respectively. Comparing the results in Table 6, it is further observed that there was an increase in the estimated metrics of most models when a portion of features in the training and test set were observed from the same cows. This indicates that the models were over-trained in the CV approach, i.e., the estimates may be reliable only if the future prediction of herbage allowance is based on previous records of the cows included in the training set. Using the CV approach, the support vector machine (SVM), extreme gradient boosting (XGBoost) and RF models achieved relatively higher accuracy (79–80%) and AUC (88%) than GLM and the other ML models. The observed accuracy and AUC for GLM were 76% and 85%. Comparing the sensitivity, specificity, PPV and F-score, the SVM, XGBoost and RF models, on average, scored higher values (79–81%), whereas the estimates for the other ML models lay in the range 65–78% (Table 6).
GLM attained these estimates around 76%. The standard errors of the estimates were small in CV studies, which indicates that the estimates were precise. Based on the results in Table 5 and Table 6, the GLM, RF, XGBoost, SVM, LDA, NNET, and NB models were selected for CV studies using the subsets of the combined data.

3.2. Effects of Restriction Period

Table 7 and Table 8 summarise the CV results for GLM and RF using the subsets of the combined data. Similar tables for SVM, XGBoost, LDA, NNET and NB are given in Appendix A (Table A4, Table A5, Table A6, Table A7 and Table A8).

Table 7

Predictive performance of generalised linear model based on the estimated sensitivity, specificity, accuracy, positive predictive value (PPV), F-score and the area under receiver operating characteristic curve (AUC) for two-week and six-week restriction periods among the cows in early (S), mid (M) and late (L) lactation stage using cross validation studies.

Subset	Sensitivity	Specificity	Accuracy	PPV	F-Score	AUC
S2	0.78 (0.01)	0.81 (0.006)	0.80 (0.006)	0.68 (0.013)	0.72 (0.01)	0.88 (0.007)
S6	0.84 (0.01)	0.88 (0.006)	0.86 (0.005)	0.81 (0.01)	0.82 (0.008)	0.94 (0.004)
M2	0.82 (0.012)	0.82 (0.007)	0.81 (0.006)	0.69 (0.014)	0.74 (0.01)	0.89 (0.007)
M6	0.78 (0.014)	0.80 (0.005)	0.79 (0.004)	0.62 (0.012)	0.68 (0.009)	0.87 (0.006)
L2	0.74 (0.015)	0.81 (0.007)	0.78 (0.006)	0.63 (0.016)	0.67 (0.012)	0.85 (0.008)
L6	0.81 (0.02)	0.84 (0.005)	0.83 (0.005)	0.67 (0.02)	0.72 (0.013)	0.90 (0.007)
W2	0.74 (0.008)	0.84 (0.004)	0.81 (0.003)	0.63 (0.008)	0.68 (0.006)	0.87 (0.004)
W6	0.71 (0.009)	0.85 (0.003)	0.81 (0.003)	0.61 (0.008)	0.65 (0.007)	0.86 (0.004)

Standard errors are indicated in parentheses.

Table 8

Predictive performance of random forest based on the estimated sensitivity, specificity, accuracy, positive predictive value (PPV), F-score and the area under receiver operating characteristic curve (AUC) for two-week and six-week restriction periods among the cows in early (S), mid (M) and late (L) lactation stage using cross validation studies.

Subset	Sensitivity	Specificity	Accuracy	PPV	F-Score	AUC
S2	0.90 (0.011)	0.87 (0.007)	0.88 (0.006)	0.78 (0.015)	0.84 (0.01)	0.96 (0.005)
S6	0.91 (0.008)	0.91 (0.005)	0.91 (0.004)	0.86 (0.01)	0.88 (0.006)	0.97 (0.002)
M2	0.89 (0.015)	0.87 (0.007)	0.88 (0.006)	0.79 (0.015)	0.83 (0.013)	0.95 (0.008)
M6	0.89 (0.009)	0.88 (0.004)	0.89 (0.004)	0.79 (0.011)	0.83 (0.007)	0.95 (0.002)
L2	0.87 (0.011)	0.89 (0.006)	0.88 (0.005)	0.79 (0.014)	0.82 (0.009)	0.95 (0.005)
L6	0.91 (0.015)	0.90 (0.005)	0.90 (0.005)	0.80 (0.02)	0.85 (0.013)	0.96 (0.005)
W2	0.78 (0.008)	0.84 (0.004)	0.82 (0.003)	0.62 (0.01)	0.68 (0.007)	0.88 (0.004)
W6	0.78 (0.006)	0.88 (0.003)	0.85 (0.002)	0.69 (0.009)	0.73 (0.006)	0.91 (0.002)

Standard errors are indicated in parentheses.

Table A4

Predictive performance of support vector machine based on the estimated sensitivity, specificity, accuracy, positive predictive value (PPV), F-score and the area under receiver operating characteristic curve (AUC) for two-week and six-week restriction period among the cows in early (S), mid (M) and late (L) lactation stage using cross validation studies.

Subset	Sensitivity	Specificity	Accuracy	PPV	F-Score	AUC
S2	0.88 (0.011)	0.82 (0.06)	0.84 (0.005)	0.68 (0.011)	0.76 (0.008)	0.92 (0.005)
S6	0.88 (0.008)	0.90 (0.005)	0.89 (0.004)	0.84 (0.009)	0.86 (0.007)	0.96 (0.002)
M2	0.91 (0.013)	0.82 (0.008)	0.85 (0.006)	0.68 (0.013)	0.78 (0.011)	0.94 (0.006)
M6	0.89 (0.012)	0.83 (0.005)	0.85 (0.005)	0.68 (0.016)	0.76 (0.012)	0.93 (0.004)
L2	0.85 (0.012)	0.85 (0.005)	0.85 (0.004)	0.72 (0.013)	0.77 (0.008)	0.92 (0.004)
L6	0.87 (0.014)	0.79 (0.005)	0.81 (0.004)	0.52 (0.017)	0.64 (0.014)	0.89 (0.005)
W2	0.79 (0.007)	0.83 (0.004)	0.82 (0.003)	0.60 (0.008)	0.68 (0.006)	0.89 (0.003)
W6	0.77 (0.007)	0.87 (0.002)	0.84 (0.002)	0.66 (0.007)	0.70 (0.005)	0.90 (0.003)