Marina Gulyaeva, Falk Huettmann, Alexander Shestopalov, Masatoshi Okamatsu, Keita Matsuno, Duc-Huy Chu, Yoshihiro Sakoda, Alexandra Glushchenko, Elaina Milton, Eric Bortz.
Abstract
Avian influenza (AI) is a complex but still poorly understood disease, particularly with respect to reservoirs, co-infections, connectedness and wider landscape perspectives. Low pathogenic (low-path, LP) AI in chickens, caused by strains of AI viruses (AIVs) that are less virulent than highly pathogenic AIVs (HPAIVs), is not yet well described, and its contribution to wider AI and immune system issues remains unknown. Co-circulation of LPAIVs with HPAIVs suggests that they interact ecologically. Here we show, for the Pacific Rim, an international approach to data mining and model-predicting LP AI and its ecological niche using machine learning, open access data sets and geographic information systems (GIS) at a 5 km pixel size for best-possible inference. The approach is based on the best-available data on the issue (~40,827 records of lab-analyzed field data from Japan, Russia, Vietnam, Mongolia and Alaska, together with the Influenza Research Database (IRD) and U.S. Department of Agriculture (USDA) data sets, as well as 19 GIS data layers). We sampled 157 hosts and 110 low-path AIVs with 32 species as drivers. The prevalence across low-path AIV subtypes is dominated by Muscovy ducks, Mallards, Whistling Swans and gulls, which also emphasizes industrial impacts in the human-dominated wildlife contact zone. This investigation sets a good precedent for the study of reservoirs, big data mining, predictions and subsequent outbreaks of HPAI and other pandemics.
Year: 2020 PMID: 33033298 PMCID: PMC7545095 DOI: 10.1038/s41598-020-73664-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Study area of the eASIA project and sampling sites in the Pacific Rim.
Figure 2. Workflow of this study to obtain best-available AI data and to data mine and predict them with machine learning in a geographic information system (GIS) for best-possible predictions and inference for the Pacific Rim study area (IRD = Influenza Research Database; USDA = U.S. Department of Agriculture); for more details, model specifications etc., see the manuscript text.
List of GIS Predictors used in this study to data mine and predict low path (LP) Avian Influenza (AI) *
| Predictor number | Predictor group | Predictor name | Meaning |
|---|---|---|---|
| 1 | Species | Host species | Species that was caught and sampled for AI assessment. This predictor is an attribute of the field-based AI lab data and was used in the data mining, but it is not GIS-based and was not used in creating the GIS map predictions |
| 2 | Landscape classification | Koeppen–Geiger | A widely used climate classification scheme |
| 3 | Landscape classification | GLC2000 | A modern scheme of landcover classes |
| 4 | Landscape classification | NPD | NPD |
| 5 | Livestock | Poultry density | Map of density of poultry per pixel |
| 6 | Livestock | Pig density | Map of density of pigs per pixel |
| 7 | Precipitation | March | Rainfall in March |
| 8 | Precipitation | June | Rainfall in June |
| 9 | Precipitation | September | Rainfall in September |
| 10 | Precipitation | December | Rainfall in December |
| 11 | Temperature | March | Temperature in March |
| 12 | Temperature | June | Temperature in June |
| 13 | Temperature | September | Temperature in September |
| 14 | Temperature | December | Temperature in December |
| 15 | Proximity | Proximity to roads | Road closeness |
| 16 | Proximity | Proximity to coastline | Coastal or not |
| 17 | Cyclone | Cyclone | Cyclone occurs in that area, or not |
| 18 | Topography | Altitude | Altitude above sea level |
| 19 | Topography | Slope | Slope in degrees |
* Source and details: Except for host species, these data come from Sriram and Huettmann (unpublished; https://essd.copernicus.org/preprints/essd-2016-65/).
Figure 3. Map of low-path AI data, showing the distribution of presence and absence data.
Prevalences of host species for low-path AI strains from the compiled AI dataset.
| Host species | AI samples | Presences of low-path AI | Proportion of samples with low-path AI present (%) |
|---|---|---|---|
| Tufted duck | 1 | 1 | 100.00 |
| Whistling swan | 8 | 8 | 100.00 |
| Chicken | 488 | 450 | 92.21 |
| Duck * | 1,120 | 1,015 | 90.63 |
| Emperor Goose | 79 | 18 | 22.78 |
| Muscovy duck | 103 | 18 | 17.48 |
| Environment | 259 | 25 | 9.65 |
| Mallard | 4,338 | 159 | 3.67 |
| Green-winged teal | 1,448 | 47 | 3.25 |
| Pintail | 5,166 | 136 | 2.63 |
| Ring-necked duck | 85 | 2 | 2.35 |
| Shoveler | 1,514 | 34 | 2.25 |
| Gadwall | 74 | 1 | 1.35 |
| Cackling goose | 122 | 1 | 0.82 |
| Glaucous-winged gull | 3,490 | 26 | 0.74 |
| Greater white-fronted goose | 527 | 2 | 0.38 |
| Sandpiper | 644 | 2 | 0.31 |
| American Wigeon | 1,616 | 4 | 0.25 |
| Unidentified Larus gull | 422 | 1 | 0.24 |
* "Duck" with no species defined usually refers to Mallard or Muscovy duck.
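The percentage column in the table above can be reproduced from the two count columns as presences divided by samples. The following is a minimal sketch of that arithmetic for a handful of rows; the function and variable names are illustrative assumptions and are not taken from the study's own workflow.

```python
# Counts (samples, presences) copied from a few rows of the table above.
rows = [
    ("Chicken", 488, 450),
    ("Emperor Goose", 79, 18),
    ("Muscovy duck", 103, 18),
    ("Mallard", 4338, 159),
    ("Pintail", 5166, 136),
]


def prevalence_percent(samples, presences):
    """Share of samples with low-path AI present, in percent (2 decimals)."""
    if samples <= 0:
        raise ValueError("samples must be a positive count")
    return round(100.0 * presences / samples, 2)


# Hypothetical helper structure: host name -> prevalence in percent.
prevalence = {host: prevalence_percent(n, pos) for host, n, pos in rows}

# Print hosts from highest to lowest prevalence.
for host, pct in sorted(prevalence.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{host:15s} {pct:6.2f} %")
```

Hosts with very few samples (for example, a single Tufted duck) reach 100% from one positive record, so the percentage is best read together with the sample count.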
Figure 4. Relationship between prevalence and importance rank (contribution) for the top 10 low-path AI strains (data shown in Table 2).
Figure 5. Partial dependence plots of the top 3 predictors: a) host species for data mining, and b) a 3-dimensional partial dependence plot for predictions (Koeppen–Geiger classification and poultry density index).
Figure 6. a) Model-predicted surface of low-path AI, b) Alaska zoom-in, c) Asia zoom-in. The map is a heatmap in which predicted presence and absence are shown as a relative index of occurrence (RIO; red = presence, green = absence, with gradient colors in between).
Importance ranking of predictors for the low-path AI model based on the TreeNet algorithm (SPM).
| Predictor rank of importance | Name of predictor | Percent importance ranking |
|---|---|---|
| 1 | Host species* | 100 |
| 2 | Koeppen–Geiger classification | 24 |
| 3 | Proximity to known poultry farms | 13 |
| 4 | Mean precipitation June | 12 |
| 5 | Proximity to roads | 12 |
| 6 | GLC2000 landcover | 11 |
| 7 | Slope in degrees | 11 |
| 8 | Proximity to coast | 10 |
| 9 | Proximity to pig farms | 9 |
| 10 | Mean precipitation September | 8 |
| 11 | Mean temperature December | 8 |
| 12 | Altitude | 7 |
| 13 | National park | 7 |
| 14 | Mean temperature March | 6 |
| 15 | Mean precipitation March | 6 |
| 16 | Mean temperature September | 5 |
| 17 | Mean temperature June | 5 |
| 18 | Mean precipitation December | 5 |
| 19 | Located in cyclone area | 3 |
* An attribute associated with the AI lab data; it was used for the data mining but not for the landscape prediction surface, because such information is not really available on a landscape scale (see also [10,15]).
Figure 7. Model assessment and overlay of the predicted surface of low-path AI vs. alternative AI data: a) IRD Asia data, b) USDA Alaska data.
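The percent column in the ranking above reads as a relative scale in which the strongest predictor is set to 100 and all others are expressed against it. A minimal sketch of that scaling follows, assuming this common convention applies here. The raw scores are made up for illustration (chosen only so that the scaled values resemble the first three rows), and the function name is hypothetical.

```python
def scale_to_top(raw_scores):
    """Express each score as a rounded percentage of the largest score."""
    top = max(raw_scores.values())
    if top <= 0:
        raise ValueError("at least one score must be positive")
    return {name: round(100 * score / top) for name, score in raw_scores.items()}


# Hypothetical raw importance scores; NOT values reported by the study.
raw = {
    "Host species": 0.50,
    "Koeppen-Geiger classification": 0.12,
    "Proximity to known poultry farms": 0.065,
}

relative = scale_to_top(raw)

# List predictors from most to least important on the relative scale.
for name, pct in sorted(relative.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pct:3d}  {name}")
```

On such a scale the values do not sum to 100; a value of 24 means roughly a quarter of the top predictor's contribution, not 24% of the model.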