Andreas W Oehm1, Andrea Springer2, Daniela Jordan2, Christina Strube2, Gabriela Knubben-Schweizer1, Katharina Charlotte Jensen3,4, Yury Zablotski1.
Abstract
Fasciola hepatica and Ostertagia ostertagi are internal parasites of cattle compromising physiology, productivity, and well-being. Parasites are complex in their effect on hosts, sometimes making it difficult to identify clear directions of associations between infection and production parameters. Therefore, unsupervised approaches not assuming a structure reduce the risk of introducing bias to the analysis. They may provide insights which cannot be obtained with conventional, supervised methodology. An unsupervised, exploratory cluster analysis approach using the k-mode algorithm and partitioning around medoids detected two distinct clusters in a cross-sectional data set of milk yield, milk fat content, milk protein content as well as F. hepatica or O. ostertagi bulk tank milk antibody status from 606 dairy farms in three structurally different dairying regions in Germany. Parasite-positive farms grouped together with their respective production parameters to form separate clusters. A random forests algorithm characterised clusters with regard to external variables. Across all study regions, co-infections with F. hepatica or O. ostertagi, respectively, farming type, and pasture access appeared to be the most important factors discriminating clusters (i.e. farms). Furthermore, farm level lameness prevalence, herd size, BCS, stage of lactation, and somatic cell count were relevant criteria distinguishing clusters. This study is among the first to apply a cluster analysis approach in this context and potentially the first to implement a k-medoids algorithm and partitioning around medoids in the veterinary field. The results demonstrated that biologically relevant patterns of parasite status and milk parameters exist between farms positive for F. hepatica or O. ostertagi, respectively, and negative farms. 
Moreover, the machine learning approach confirmed results of previous work and shed further light on the complex setting of associations between parasitic diseases, milk yield and milk constituents, and management practices.
Year: 2022 PMID: 35816512 PMCID: PMC9273072 DOI: 10.1371/journal.pone.0271413
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Descriptive statistics of the data across the three study regions (North = 191 farms, East = 201 farms, South = 214 farms).
| Variable | North: Mean | North: Range | North: 1st Qu. | North: Median | North: 3rd Qu. | East: Mean | East: Range | East: 1st Qu. | East: Median | East: 3rd Qu. | South: Mean | South: Range | South: 1st Qu. | South: Median | South: 3rd Qu. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Herd size | 93.71 | 10.00–486.00 | 51.50 | 79.00 | 115.50 | 334.80 | 1.00–2,821.00 | 129.00 | 245.00 | 418.00 | 46.46 | 5.00–231.00 | 27.00 | 40.50 | 59.00 |
| BCS | 3.05 | 2.54–4.58 | 2.90 | 3.03 | 3.16 | 3.33 | 2.34–3.98 | 3.18 | 3.36 | 3.50 | 3.68 | 2.71–4.26 | 3.54 | 3.74 | 3.85 |
| Milk yield | 26.08 | 20.00–30.82 | 24.83 | 26.08 | 27.48 | 25.65 | 14.74–31.78 | 24.61 | 26.10 | 27.14 | 25.19 | 19.40–31.20 | 24.21 | 25.45 | 26.50 |
| Milk fat | 3.81 | 3.37–4.32 | 3.69 | 3.78 | 3.92 | 3.68 | 3.09–4.58 | 3.61 | 3.67 | 3.74 | 3.96 | 3.53–4.35 | 3.88 | 3.94 | 4.35 |
| Milk protein | 3.20 | 2.93–3.49 | 3.12 | 3.18 | 3.29 | 3.11 | 2.54–3.49 | 3.07 | 3.13 | 3.16 | 3.35 | 3.12–3.61 | 3.30 | 3.36 | 3.40 |
| SCC | 217.40 | 122.90–663.90 | 183.90 | 221.70 | 239.20 | 228.27 | 27.64–365.94 | 199.49 | 222.58 | 254.36 | 205.20 | 106.2–421.8 | 167.0 | 197.7 | 233.0 |
| Lameness | 25.82 | 0.00–76.92 | 14.72 | 23.08 | 34.74 | 38.51 | 0.00–77.63 | 30.91 | 39.00 | 47.92 | 25.59 | 0.00–76.47 | 15.10 | 23.57 | 33.33 |
| DIM | 213.40 | 131.60–360.70 | 194.60 | 209.40 | 231.40 | 205.20 | 101.30–333.70 | 187.80 | 204.50 | 219.20 | 197.20 | 118.50–614.70 | 175.50 | 191.00 | 197.20 |
| Parity | 2.85 | 1.79–5.10 | 2.60 | 2.77 | 3.08 | 2.80 | 1.86–5.15 | 2.54 | 2.72 | 2.97 | 2.90 | 1.85–4.76 | 2.57 | 2.85 | 3.11 |
1 number of lactating and dry cows
2 bayesian median value per farm
3 in kg/day
4 in %
5 × 1,000
6 in number of cells/ml
7 Farm level prevalence in %
Fig 1. Cluster plot of the partitioning around medoids clustering process for F. hepatica (regions North and South).
Region North (top): Two distinct clusters are displayed with 161 farms in cluster 1 (red) and 30 observations in cluster 2 (blue). Region South (bottom): Two clusters with 51 observations in cluster 1 (red) and 163 observations in cluster 2 (blue) naturally aggregated.
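The clustering step behind such a plot can be illustrated with a small sketch. The code below is a simplified, swap-based k-medoids procedure in the spirit of partitioning around medoids, not the authors' implementation. The Gower-style distance for mixed numeric and categorical data, the function names, and the toy farm records are all assumptions made for illustration.

```python
import random

# Hypothetical farm records: (milk yield kg/day, milk fat %, milk protein %, antibody status).
# These values are invented for illustration and are not taken from the study data.
farms = [
    (26.5, 3.75, 3.18, "negative"),
    (26.9, 3.78, 3.20, "negative"),
    (27.2, 3.70, 3.15, "negative"),
    (25.0, 3.92, 3.27, "positive"),
    (24.8, 3.95, 3.25, "positive"),
    (25.3, 3.90, 3.28, "positive"),
]

def feature_ranges(records):
    """Return max - min for numeric columns and None for categorical columns."""
    ranges = []
    for column in zip(*records):
        if all(isinstance(v, (int, float)) for v in column):
            ranges.append(max(column) - min(column))
        else:
            ranges.append(None)
    return ranges

def gower_like_distance(a, b, ranges):
    """Average per-feature dissimilarity for mixed numeric/categorical records."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            total += 0.0 if x == y else 1.0      # categorical: simple mismatch
        else:
            total += abs(x - y) / r if r > 0 else 0.0  # numeric: range-scaled difference
    return total / len(a)

def pam(records, k, ranges, seed=0, max_iter=100):
    """Swap medoids with non-medoids while the total distance to the nearest medoid decreases."""
    n = len(records)
    dist = [[gower_like_distance(records[i], records[j], ranges) for j in range(n)]
            for i in range(n)]

    def cost(medoids):
        return sum(min(dist[i][m] for m in medoids) for i in range(n))

    rng = random.Random(seed)
    medoids = rng.sample(range(n), k)            # random initial medoids (an assumption)
    best = cost(medoids)
    for _ in range(max_iter):
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                trial_cost = cost(trial)
                if trial_cost < best - 1e-12:    # accept any strictly improving swap
                    medoids, best, improved = trial, trial_cost, True
        if not improved:
            break
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels

ranges = feature_ranges(farms)
medoids, labels = pam(farms, k=2, ranges=ranges)
print("medoid rows:", medoids)
print("cluster labels:", labels)
```

Because each medoid is an actual observation, every cluster can be summarised by one representative farm, which is convenient when numeric milk parameters are combined with a categorical antibody status.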
Descriptive cluster statistics of the F. hepatica cluster analysis in regions North and South (continuous variables).
| Variable | North: Mean | North: Range | North: 1st Qu. | North: Median | North: 3rd Qu. | South: Mean | South: Range | South: 1st Qu. | South: Median | South: 3rd Qu. |
|---|---|---|---|---|---|---|---|---|---|---|
| Cluster 1 | | | | | | | | | | |
| BCS | 3.06 | 2.54–4.58 | 2.93 | 3.05 | 3.16 | 3.55 | 2.71–4.07 | 3.34 | 3.64 | 3.78 |
| SCC | 218.90 | 122.90–663.90 | 184.30 | 211.80 | 242.70 | 210.40 | 112.50–393.40 | 175.20 | 205.50 | 224.50 |
| Lame | 25.05 | 0.00–76.92 | 15.71 | 23.08 | 34.75 | 15.98 | 0.00–48.00 | 7.87 | 14.81 | 21.86 |
| DIM | 213.50 | 143.50–360.70 | 197.30 | 210.30 | 231.10 | 200.10 | 139.70–291.30 | 174.60 | 201.80 | 219.70 |
| Parity | 2.85 | 1.79–5.06 | 2.60 | 2.77 | 3.09 | 3.11 | 2.13–4.76 | 2.74 | 2.96 | 3.47 |
| Milk yield | 26.24 | 20.77–30.31 | 24.95 | 26.36 | 27.60 | 24.59 | 21.12–29.23 | 23.00 | 24.48 | 25.60 |
| Milk fat | 3.79 | 3.37–4.24 | 3.68 | 3.76 | 3.91 | 3.92 | 3.53–4.18 | 3.83 | 3.92 | 4.00 |
| Milk protein | 3.19 | 2.93–3.49 | 3.11 | 3.17 | 3.27 | 3.33 | 3.18–3.61 | 3.27 | 3.32 | 3.39 |
| Cluster 2 | | | | | | | | | | |
| BCS | 2.98 | 2.73–3.41 | 2.82 | 2.94 | 3.08 | 3.73 | 2.92–4.26 | 3.62 | 3.77 | 3.87 |
| SCC | 209.80 | 134.40–342.00 | 182.30 | 211.00 | 228.10 | 203.60 | 106.20–421.80 | 164.60 | 194.50 | 235.20 |
| Lame | 24.56 | 0.00–73.13 | 13.40 | 22.57 | 33.77 | 28.60 | 0.00–76.47 | 19.59 | 26.32 | 36.24 |
| DIM | 212.70 | 131.60–360.60 | 183.80 | 202.70 | 233.70 | 196.30 | 118.50–614.70 | 176.70 | 189.90 | 210.80 |
| Parity | 2.85 | 1.90–4.15 | 2.61 | 2.77 | 3.07 | 2.84 | 1.85–4.50 | 2.54 | 2.81 | 3.07 |
| Milk yield | 25.27 | 20.00–30.82 | 24.43 | 25.26 | 26.07 | 25.38 | 19.40–31.20 | 24.41 | 25.66 | 25.55 |
| Milk fat | 3.92 | 3.62–4.32 | 3.81 | 3.90 | 4.04 | 3.97 | 3.69–4.35 | 3.88 | 3.96 | 4.05 |
| Milk protein | 3.26 | 3.07–3.46 | 3.18 | 3.25 | 3.36 | 3.36 | 3.18–3.53 | 3.31 | 3.36 | 3.40 |
1 × 1,000
2 in number of cells/ml
3 Farm level prevalence in %
4 in kg
5 in %
Descriptive statistics (observations per cluster) of the F. hepatica cluster analysis in study regions North and South (categorical variables).
| Region / cluster (counts [%]) | Housing system: Tie stall | Housing system: Free stall | Housing system: Other | Pasture access: Yes | Pasture access: No | Farming type: Conventional | Farming type: Organic | Herd size¹: Small | Herd size¹: Medium | Herd size¹: Large | F. hepatica: Negative | F. hepatica: Positive | O. ostertagi: Negative | O. ostertagi: Positive |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Region North | | | | | | | | | | | | | | |
| Cluster 1 (161.00 [84.29]) | 7.00 [4.35] | 137.00 [85.09] | 17.00 [10.56] | 121.00 [75.16] | 40.00 [24.84] | 156.00 [96.89] | 5.00 [3.11] | 41.00 [25.47] | 79.00 [49.07] | 41.00 [25.47] | 161.00 [100.00] | 0.00 [0.00] | 96.00 [59.63] | 65.00 [40.37] |
| Cluster 2 (30.00 [15.71]) | 1.00 [3.33] | 25.00 [83.33] | 4.00 [13.33] | 30.00 [100.00] | 0.00 [0.00] | 26.00 [86.67] | 4.00 [13.33] | 7.00 [23.33] | 16.00 [53.33] | 7.00 [23.33] | 0.00 [0.00] | 30.00 [100.00] | 4.00 [13.33] | 26.00 [86.67] |
| Region South | | | | | | | | | | | | | | |
| Cluster 1 (51.00 [23.83]) | 22.00 [43.14] | 28.00 [54.90] | 1.00 [1.96] | 47.00 [92.16] | 4.00 [7.84] | 30.00 [58.82] | 21.00 [41.18] | 23.00 [45.10] | 26.00 [50.98] | 2.00 [3.93] | 0.00 [0.00] | 51.00 [100.00] | 6.00 [11.76] | 45.00 [88.24] |
| Cluster 2 (163.00 [76.17]) | 33.00 [20.25] | 124.00 [76.07] | 6.00 [3.68] | 136.00 [83.44] | 27.00 [16.56] | 151.00 [92.64] | 12.00 [7.36] | 29.00 [17.79] | 84.00 [51.53] | 50.00 [30.67] | 163.00 [100.00] | 0.00 [0.00] | 129.00 [79.14] | 34.00 [20.86] |
1 number of cows present on farm
region North: small < 51.50 cows, medium 51.50–115.50 cows, large > 115.50 cows
region South: small < 27 cows, medium 27–59 cows, large > 59 cows
Fig 2. Cluster plot of the k-medoids clustering process for O. ostertagi in the three study regions.
Region North (top): Two clusters with 91 observations in cluster 1 (red) and 100 observations in cluster 2 (blue). Region East (middle): Two clusters with 71 observations in cluster 1 (red) and 130 observations in cluster 2 (blue). Region South (bottom): Two clusters with 79 observations in cluster 1 (red) and 135 observations in cluster 2 (blue).
Descriptive statistics of the O. ostertagi cluster analysis across study regions (continuous variables).
| Variable | North: Mean | North: Range | North: 1st Qu. | North: Median | North: 3rd Qu. | East: Mean | East: Range | East: 1st Qu. | East: Median | East: 3rd Qu. | South: Mean | South: Range | South: 1st Qu. | South: Median | South: 3rd Qu. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster 1 | | | | | | | | | | | | | | | |
| BCS | 3.02 | 2.54–4.58 | 2.85 | 2.97 | 3.12 | 3.28 | 2.34–3.93 | 3.11 | 3.33 | 3.49 | 3.60 | 2.71–4.07 | 3.44 | 3.67 | 3.80 |
| SCC | 224.10 | 134.40–663.90 | 183.70 | 211.80 | 244.90 | 232.44 | 27.64–365.94 | 198.32 | 230.94 | 267.71 | 206.60 | 106.20–393.40 | 170.80 | 205.50 | 227.40 |
| Lame | 22.14 | 0.00–76.92 | 10.19 | 19.12 | 28.00 | 36.52 | 0.00–70.95 | 25.40 | 38.12 | 49.07 | 19.92 | 0.00–57.90 | 9.55 | 18.75 | 26.77 |
| DIM | 211.40 | 131.60–360.60 | 187.80 | 205.00 | 230.20 | 210.40 | 101.30–333.70 | 193.90 | 208.40 | 226.40 | 201.40 | 118.50–291.30 | 176.30 | 203.00 | 220.90 |
| Parity | 2.90 | 1.79–5.10 | 2.59 | 2.78 | 3.18 | 2.82 | 1.97–5.09 | 2.52 | 2.71 | 3.01 | 3.00 | 1.92–4.76 | 2.59 | 2.92 | 3.21 |
| Milk yield | 25.52 | 20.00–29.90 | 24.42 | 25.50 | 26.71 | 24.40 | 14.74–29.49 | 23.25 | 24.92 | 26.11 | 24.58 | 20.11–31.20 | 22.89 | 24.40 | 26.03 |
| Milk fat | 3.84 | 3.37–4.32 | 3.72 | 3.79 | 3.96 | 3.71 | 3.40–4.41 | 3.63 | 3.70 | 3.77 | 3.92 | 3.53–4.18 | 3.84 | 3.92 | 4.00 |
| Milk protein | 3.21 | 2.95–3.49 | 3.13 | 3.19 | 3.30 | 3.10 | 2.76–3.49 | 3.04 | 3.11 | 3.16 | 3.34 | 3.18–3.61 | 3.27 | 3.33 | 3.40 |
| Cluster 2 | | | | | | | | | | | | | | | |
| BCS | 3.07 | 2.69–3.81 | 2.97 | 3.06 | 3.17 | 3.35 | 2.84–3.98 | 3.21 | 3.37 | 3.50 | 3.74 | 2.92–4.26 | 3.62 | 3.79 | 3.87 |
| SCC | 211.40 | 122.90–458.60 | 184.30 | 211.00 | 234.80 | 226.00 | 132.00–332.40 | 200.80 | 220.30 | 246.90 | 204.50 | 115.80–421.80 | 166.60 | 194.50 | 236.40 |
| Lame | 29.17 | 3.18–61.36 | 17.67 | 27.48 | 39.10 | 39.60 | 0.00–77.63 | 32.64 | 39.40 | 47.59 | 28.90 | 0.00–76.47 | 19.59 | 27.12 | 36.47 |
| DIM | 215.20 | 166.60–360.70 | 199.10 | 212.20 | 232.60 | 202.30 | 127.00–320.10 | 185.00 | 203.80 | 217.20 | 194.70 | 134.50–614.70 | 175.20 | 188.90 | 203.90 |
| Parity | 2.80 | 1.98–4.18 | 2.60 | 2.74 | 2.99 | 2.79 | 1.86–5.15 | 2.56 | 2.74 | 2.95 | 2.84 | 1.85–4.50 | 2.56 | 2.82 | 3.07 |
| Milk yield | 26.60 | 20.77–30.82 | 25.43 | 26.75 | 27.84 | 26.33 | 16.36–31.78 | 25.48 | 26.51 | 27.58 | 25.55 | 19.40–29.11 | 24.76 | 25.75 | 26.59 |
| Milk fat | 3.78 | 3.44–4.13 | 3.68 | 3.76 | 3.91 | 3.66 | 3.09–4.58 | 3.60 | 3.66 | 3.72 | 3.98 | 3.69–4.35 | 3.89 | 3.97 | 4.05 |
| Milk protein | 3.29 | 2.93–3.45 | 3.11 | 3.18 | 3.28 | 3.12 | 2.54–3.33 | 3.08 | 3.13 | 3.16 | 3.36 | 3.19–3.53 | 3.31 | 3.36 | 3.40 |
1 × 1,000
2 in number of cells/ml
3 Farm level prevalence in %
4 in kg
5 in %
Descriptive statistics (observations per cluster) of the O. ostertagi cluster analysis across study regions (categorical variables).
| Region / cluster (counts [%]) | Housing system: Tie stall | Housing system: Free stall | Housing system: Other | Pasture access: Yes | Pasture access: No | Farming type: Conventional | Farming type: Organic | Herd size¹: Small | Herd size¹: Medium | Herd size¹: Large | F. hepatica: Negative | F. hepatica: Positive | O. ostertagi: Negative | O. ostertagi: Positive |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Region North | | | | | | | | | | | | | | |
| Cluster 1 (91.00 [47.64]) | 7.00 [7.69] | 70.00 [76.92] | 14.00 [15.38] | 85.00 [93.41] | 6.00 [6.59] | 82.00 [90.11] | 9.00 [9.89] | 27.00 [29.67] | 54.00 [59.34] | 10.00 [10.99] | 65.00 [71.43] | 26.00 [28.57] | 0.00 [0.00] | 91.00 [100.00] |
| Cluster 2 (100.00 [52.36]) | 1.00 [1.00] | 92.00 [92.00] | 7.00 [7.00] | 66.00 [66.00] | 34.00 [34.00] | 100.00 [100.00] | 0.00 [0.00] | 21.00 [21.00] | 41.00 [41.00] | 38.00 [38.00] | 96.00 [96.00] | 4.00 [4.00] | 100.00 [100.00] | 0.00 [0.00] |
| Region East | | | | | | | | | | | | | | |
| Cluster 1 (71.00 [35.32]) | 2.00 [2.82] | 55.00 [77.46] | 14.00 [19.72] | 55.00 [77.46] | 16.00 [22.54] | 55.00 [77.46] | 16.00 [22.54] | 24.00 [33.80] | 38.00 [53.52] | 9.00 [12.68] | - | - | 0.00 [0.00] | 71.00 [100.00] |
| Cluster 2 (130.00 [64.68]) | 0.00 [0.00] | 102.00 [78.46] | 28.00 [21.54] | 78.00 [60.00] | 52.00 [40.00] | 126.00 [96.92] | 4.00 [3.08] | 26.00 [20.00] | 63.00 [48.46] | 41.00 [31.54] | - | - | 130.00 [100.00] | 0.00 [0.00] |
| Region South | | | | | | | | | | | | | | |
| Cluster 1 (79.00 [36.92]) | 27.00 [34.18] | 50.00 [63.29] | 2.00 [2.53] | 57.00 [72.15] | 22.00 [27.85] | 52.00 [65.82] | 27.00 [34.18] | 27.00 [34.18] | 39.00 [49.37] | 13.00 [16.46] | 34.00 [43.04] | 45.00 [56.96] | 0.00 [0.00] | 79.00 [100.00] |
| Cluster 2 (135.00 [63.08]) | 28.00 [20.74] | 102.00 [75.56] | 5.00 [3.70] | 17.00 [12.59] | 118.00 [87.41] | 129.00 [95.56] | 6.00 [4.44] | 25.00 [18.52] | 71.00 [52.59] | 39.00 [28.89] | 129.00 [95.56] | 6.00 [4.44] | 135.00 [100.00] | 0.00 [0.00] |
1 number of cows present on farm, categorised
region North: small < 51.50 cows, medium 51.50–115.50 cows, large > 115.50 cows
region East: small < 129.00 cows, medium 129.00–418.00 cows, large > 418.00 cows
region South: small < 27 cows, medium 27–59 cows, large > 59 cows
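The region-specific cutoffs above coincide with the 1st and 3rd quartiles of herd size reported in the descriptive statistics for each region. A minimal sketch of this categorisation follows; the function and constant names are hypothetical, and only the cutoff values come from the footnote.

```python
# Region-specific herd size cutoffs (lower, upper), taken from the table footnote.
HERD_SIZE_CUTOFFS = {
    "North": (51.50, 115.50),
    "East": (129.00, 418.00),
    "South": (27.00, 59.00),
}

def herd_size_category(region, n_cows):
    """Classify a herd as small, medium, or large using the cutoffs of its region."""
    lower, upper = HERD_SIZE_CUTOFFS[region]
    if n_cows < lower:
        return "small"
    if n_cows > upper:
        return "large"
    return "medium"  # values on either cutoff fall into the medium category

print(herd_size_category("North", 79))
print(herd_size_category("East", 500))
print(herd_size_category("South", 20))
```

Using within-region quartiles keeps the categories comparable across regions whose absolute herd sizes differ considerably.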
Fig 3. Mean decrease accuracy plots and variable importance plots for the random forest classification process of clusters 1 and 2 in study regions North and South (F. hepatica analyses).
A: study region North. B: study region South. The mean decrease accuracy plot shows how much classification accuracy the model loses when each variable is excluded. The larger this loss, i.e. the higher the mean decrease accuracy value, the more important the variable is for distinguishing clusters. The first three criteria appear to be the most valuable for characterising clusters in region North, whereas five variables were ranked as the most relevant in region South.
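The idea behind a mean decrease accuracy ranking can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' analysis: a simple nearest-centroid classifier stands in for the random forest, variable exclusion is approximated by randomly permuting one variable at a time (a common way to compute this measure), and the toy data and all function names are hypothetical.

```python
import random

def fit_centroids(X, y):
    """Compute the per-class mean of every feature (stand-in for a fitted model)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda lab: sum((v - c) ** 2 for v, c in zip(row, centroids[lab])))

def accuracy(centroids, X, y):
    return sum(predict(centroids, row) == label for row, label in zip(X, y)) / len(y)

def mean_decrease_accuracy(centroids, X, y, n_repeats=20, seed=0):
    """Average loss of accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the cluster label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(centroids, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Invented toy data: feature 0 separates the clusters, feature 1 is uninformative.
X = [[0, 0.50], [0, 0.52], [0, 0.48], [0, 0.51],
     [1, 0.49], [1, 0.50], [1, 0.52], [1, 0.48]]
y = ["cluster 1"] * 4 + ["cluster 2"] * 4

centroids = fit_centroids(X, y)
importances = mean_decrease_accuracy(centroids, X, y)
print(importances)
```

A variable whose permutation leaves accuracy unchanged receives an importance near zero, which corresponds to the low-ranked variables at the bottom of such plots.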
Fig 4. Mean decrease accuracy plots and variable importance plots for the random forest classification process of clusters 1 and 2 in study regions North, East, and South (O. ostertagi analyses).
A: study region North. B: study region East. C: study region South. The plots show the decrease in prediction accuracy when single variables are excluded. In region North, the first eight factors are the most valuable for characterising clusters compared with the remaining variables. In study regions East and South, the first three variables were the most important for differentiating clusters.