Christopher L Burdett1, Brian R Kraus1, Sarah J Garza1, Ryan S Miller2, Kathe E Bjork2.
Abstract
Livestock distribution in the United States (U.S.) can only be mapped at county-level or coarser resolution. We developed a spatial microsimulation model called the Farm Location and Agricultural Production Simulator (FLAPS) that simulated the distribution and populations of individual livestock farms throughout the conterminous U.S. Using domestic pigs (Sus scrofa domesticus) as an example species, we customized iterative proportional-fitting algorithms for the hierarchical structure of the U.S. Census of Agriculture and imputed unpublished state- or county-level livestock population totals that were redacted to ensure confidentiality. We used a weighted sampling design to collect data on the presence and absence of farms and used these data to develop a national-scale distribution model that predicted the distribution of individual farms at a 100 m resolution. We implemented microsimulation algorithms that simulated the populations and locations of individual farms using output from our imputed Census of Agriculture dataset and distribution model. Approximately 19% of county-level pig population totals were unpublished in the 2012 Census of Agriculture and needed to be imputed. Using aerial photography, we confirmed the presence or absence of livestock farms at 10,238 locations and found livestock farms were correlated with open areas, cropland, and roads, and also areas with cooler temperatures and gentler topography. The distribution of swine farms was highly variable, but cross-validation of our distribution model produced an area under the receiver-operating characteristics curve value of 0.78, which indicated good predictive performance. Verification analyses showed FLAPS accurately imputed and simulated Census of Agriculture data based on absolute percent difference values of < 0.01% at the state-to-national scale, 3.26% for the county-to-state scale, and 0.03% for the individual farm-to-county scale.
Our output data have many applications for risk management of agricultural systems including epidemiological studies, food safety, biosecurity issues, emergency-response planning, and conflicts between livestock and other natural resources.
Year: 2015 PMID: 26571497 PMCID: PMC4646625 DOI: 10.1371/journal.pone.0140338
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The structure of the FLAPS model.
The FLAPS population simulation model consists of three interactive sub-models: (1) a missing-data model, (2) a distribution model, and (3) a simulation model. The output of the missing-data and distribution models provides input data for the simulation model. The definitions of the acronyms are: (1) IPF = iterative proportional fitting, and (2) LR = logistic regression.
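As a rough illustration of the IPF step in the missing-data model, the following is a minimal sketch of classic two-way iterative proportional fitting. It is not the authors' customized hierarchical algorithm; the function name `ipf`, the seed table, and all totals are hypothetical values chosen only to show the row and column rescaling.

```python
def ipf(seed, row_totals, col_totals, tol=1e-9, max_iter=1000):
    """Classic two-way IPF on a list-of-lists table (illustrative only)."""
    table = [row[:] for row in seed]
    for _ in range(max_iter):
        # Rescale each row so it sums to its known row total.
        for i, target in enumerate(row_totals):
            s = sum(table[i])
            if s > 0:
                table[i] = [v * target / s for v in table[i]]
        # Rescale each column so it sums to its known column total.
        for j, target in enumerate(col_totals):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # Stop once the row sums still match after the column rescaling.
        err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_totals))
        if err < tol:
            break
    return table


# Hypothetical example: 2 counties x 3 size classes.
# The seed could be farm counts; the margins are known population totals.
seed = [[10.0, 4.0, 1.0], [20.0, 2.0, 3.0]]
county_totals = [500.0, 900.0]          # assumed county population totals
class_totals = [300.0, 400.0, 700.0]    # assumed size-class population totals
fitted = ipf(seed, county_totals, class_totals)
```

The fitted cells keep the relative structure of the seed while matching both sets of margins, which is the property that makes IPF useful for filling redacted cells when the surrounding totals are published.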
Fig 2. The density of domestic swine (A) farms, and (B) populations in the conterminous United States.
Data are from the 2012 Census of Agriculture [16]. Counties colored black in (B) are those where swine population data were withheld to ensure respondent confidentiality.
Example of Census of Agriculture data from 2012 for the entire U.S. (including Alaska and Hawaii) showing the paired nature of the frequency distributions for the number of swine farms and individual pigs.
The number of swine farms is not confidential information and is published for all hierarchical levels of the Census of Agriculture. In contrast, the number of individual pigs can reveal socioeconomic information about individual farms and can be redacted, most commonly for county totals and subtotals because these finer-resolution categories contain fewer farms.
| | | Farm/population-size class (a) | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Data type | Total (b) | 1 to 24 | 25 to 49 | 50 to 99 | 100 to 199 | 200 to 499 | 500 to 999 | ≥ 1000 |
| Farm | 63,246 | 41,688 | 3,435 | 2,161 | 1,469 | 2,115 | 1,977 | 10,401 |
| Population | 66,026,785 | 244,250 | 116,808 | 146,967 | 201,460 | 683,977 | 1,384,921 | 63,248,402 |
a The total number of farms or population occurring within each of seven farm/population-size bins. Data from Table 19, 2012 U.S. Census of Agriculture [16].
b Grand totals for the farm and population data types representing the total number of swine farms and total swine population for the entire U.S.
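The paired structure of the table can be checked with a few lines of arithmetic. The sketch below treats the first value in each row as the grand total (footnote b) and the remaining seven values as the size-class bins (footnote a); the variable names are illustrative.

```python
# Values copied from the table above (2012 Census of Agriculture, Table 19).
farm_total = 63_246
farm_bins = [41_688, 3_435, 2_161, 1_469, 2_115, 1_977, 10_401]

pig_total = 66_026_785
pig_bins = [244_250, 116_808, 146_967, 201_460, 683_977, 1_384_921, 63_248_402]

# The seven size-class bins should re-aggregate exactly to the grand totals.
farms_consistent = sum(farm_bins) == farm_total
pigs_consistent = sum(pig_bins) == pig_total

# Share of farms and of pigs in the largest size class (>= 1000 pigs).
large_farm_share = farm_bins[-1] / farm_total
large_pig_share = pig_bins[-1] / pig_total

print(farms_consistent, pigs_consistent)
print(f"{large_farm_share:.1%} of farms hold {large_pig_share:.1%} of pigs")
```

The same re-aggregation constraint (bins summing to a published total) is what allows redacted population cells to be bounded and imputed at finer hierarchical levels.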
Covariates used to model the distribution of swine farms in the United States.
| Covariate | Description |
|---|---|
| Land cover | |
| | Barren land |
| | Cropland |
| | Developed areas (low, medium and medium-high intensities) |
| | Upland forest |
| | Grassland |
| | Open areas |
| | Pasture |
| | Shrubland or scrubland |
| | Developed areas (high intensity) |
| | Water |
| | Lowland areas |
| Topography | |
| Slope | Slope (measured in degrees) |
| Rugged | Ruggedness (measurement of topographic variation) |
| Climate | |
| Temp | Mean annual temperature (1950–2000) |
| Precip | Mean annual precipitation (1950–2000) |
| Transportation | |
| | Roads |
a Covariates with prefix d are measured as linear distances (m) to the environmental or anthropogenic feature.
b Sources and references: Land-cover categories: 2006 National Land Cover Dataset [21]; Topography: National Elevation Dataset [22]; Climate: WORLDCLIM database [23]; Transportation: Environmental Systems Research Institute (ESRI) World Transportation [24].
c Open areas = Cropland + Pasture + Grassland + low and medium intensity Developed areas
Fig 3. A flow chart of iterative steps in the simulation model based on algorithms used to place individual farms with both geographic (i.e., location) and demographic (i.e., population) attributes.
Fig 4. Mean absolute percent differences for states in our county-to-state IPF verification analysis.
This map depicts the reaggregation of our county-level estimates of swine populations to the state-level totals from which they were derived. Most missing data in the Census of Agriculture occur at the county level, and these missing data preclude the high accuracy (mean absolute percent differences of ≤ 0.03%) that our IPF algorithms achieved at the other two hierarchical scales (individual farms to counties, and states to the national total). For the nine states with absolute percent differences of > 5%, we overlaid the percent of the total U.S. swine population occurring in each state. Collectively, these nine states comprised only 2.24% of the total U.S. swine industry.
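The verification metric can be sketched as follows, assuming the absolute percent difference is computed relative to the published total (the source does not spell out the formula). The state names and population values are hypothetical.

```python
def abs_pct_diff(reaggregated, published):
    """Absolute percent difference of a re-aggregated total from the published total.

    Assumption: the published Census value is the denominator.
    """
    return abs(reaggregated - published) / published * 100.0


# Hypothetical (state, sum of imputed county totals, published state total).
checks = [
    ("State A", 1_032_600, 1_000_000),
    ("State B", 49_900, 50_000),
    ("State C", 8_500, 8_000),
]

diffs = {name: abs_pct_diff(est, pub) for name, est, pub in checks}
mean_diff = sum(diffs.values()) / len(diffs)

# States exceeding the 5% threshold highlighted in the figure.
flagged = [name for name, d in diffs.items() if d > 5.0]
```

In this toy example the only flagged state is also the one with the smallest population, mirroring the pattern in the caption where large relative errors are concentrated in states with few pigs.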
Variable importance ranks for individual covariates used for predicting the distribution of swine farms in the conterminous U.S.
Each run is one iteration of a 5-fold cross-validation in which 80% of the dataset was used for model building and 20% for model testing. Quadratic forms of these covariates were used when their AIC values were lower than those of the linear forms (S1 Table).
| Covariate | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Mean | SE |
|---|---|---|---|---|---|---|---|
| | 0.438 | 0.448 | 0.449 | 0.443 | 0.468 | 0.449 | 0.005 |
| | 0.223 | 0.227 | 0.237 | 0.227 | 0.214 | 0.226 | 0.004 |
| Temp | 0.223 | 0.234 | 0.191 | 0.238 | 0.230 | 0.223 | 0.008 |
| | 0.201 | 0.195 | 0.215 | 0.209 | 0.178 | 0.200 | 0.006 |
| Slope² | 0.141 | 0.138 | 0.133 | 0.133 | 0.130 | 0.135 | 0.002 |
| Precip² | 0.077 | 0.059 | 0.061 | 0.082 | 0.056 | 0.067 | 0.005 |
| | 0.010 | 0.009 | 0.013 | 0.009 | 0.009 | 0.010 | 0.001 |
| | 0.010 | 0.006 | 0.002 | 0.007 | 0.009 | 0.007 | 0.001 |
| | 0.010 | 0.006 | 0.005 | 0.004 | 0.010 | 0.007 | 0.001 |
| | 0.001 | 0.003 | 0.003 | 0.002 | 0.001 | 0.002 | < 0.001 |
a Covariates with prefix d are measured as distance to the environmental or anthropogenic feature.
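The 80%/20% split behind each run in the table above can be sketched with a plain k-fold partition. This assumes the 10,238 photo-verified locations form the modelling dataset and that folds were assigned at random; the helper name `k_fold_indices` and the seed are hypothetical.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold partition."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


n_locations = 10_238  # assumed size of the presence/absence dataset
splits = list(k_fold_indices(n_locations, k=5))

for run, (train, test) in enumerate(splits, start=1):
    print(f"Run {run}: {len(train)} training, {len(test)} testing")
```

Each location is used for testing exactly once, so the five runs give five independent estimates of variable importance whose mean and standard error can then be tabulated.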
Model-selection analysis for logistic regression modeling of swine farm distribution in the conterminous U.S.
Results are shown for models with ΔAIC ≤ 2.0; all of these models were used to develop the model-averaged coefficients for our final distribution model.
| Model | Log-likelihood | K | AIC | ΔAIC |
|---|---|---|---|---|
| | -3463.2 | 10 | 6946.5 | 0.0 |
| | -3462.3 | 11 | 6946.7 | 0.2 |
| | -3463.5 | 10 | 6947.1 | 0.6 |
| | -3464.5 | 9 | 6947.1 | 0.6 |
| | -3465.0 | 9 | 6947.9 | 1.4 |
| | -3464.0 | 10 | 6948.0 | 1.5 |
| | -3465.2 | 9 | 6948.4 | 1.9 |
a Covariates with prefix d are measured as distance to the environmental or anthropogenic feature.
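The numeric columns of the model-selection table can be cross-checked under the assumption that they are, in order, the log-likelihood, the number of parameters (K), AIC, and ΔAIC. This reading is an inference from the values, since AIC = 2K - 2 ln L reproduces the third column to within rounding. The sketch also computes Akaike weights, a common basis for model averaging; the source does not state which weighting the authors used.

```python
import math

# (log-likelihood, K, published AIC, published delta-AIC), copied from the table.
rows = [
    (-3463.2, 10, 6946.5, 0.0),
    (-3462.3, 11, 6946.7, 0.2),
    (-3463.5, 10, 6947.1, 0.6),
    (-3464.5, 9, 6947.1, 0.6),
    (-3465.0, 9, 6947.9, 1.4),
    (-3464.0, 10, 6948.0, 1.5),
    (-3465.2, 9, 6948.4, 1.9),
]

# Recompute AIC = 2K - 2 ln L and compare with the published column.
recomputed = [2 * k - 2 * ll for ll, k, _, _ in rows]
max_gap = max(abs(r - pub) for r, (_, _, pub, _) in zip(recomputed, rows))

# Akaike weights: w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2).
rel_lik = [math.exp(-delta / 2) for _, _, _, delta in rows]
weights = [r / sum(rel_lik) for r in rel_lik]

for (_, k, _, delta), w in zip(rows, weights):
    print(f"K={k:2d}  dAIC={delta:.1f}  weight={w:.3f}")
```

Because all seven models lie within 2 AIC units of the best one, no single model dominates the weights, which is the usual motivation for averaging coefficients rather than selecting one model.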
Fig 5. The probability surface used to simulate the locations of individual farms throughout the conterminous United States.
The blue to red color scheme represents a gradient of low to high predicted probability values at a 100 m resolution.
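One simple way to turn such a probability surface into simulated farm locations is to draw raster cells with probability proportional to their predicted values. The sketch below illustrates that idea only; it is not the FLAPS placement algorithm, the tiny 3 x 3 surface is hypothetical, and cells are drawn with replacement for brevity.

```python
import random


def sample_farm_cells(prob_surface, n_farms, rng):
    """Draw (row, col) cells with probability proportional to the surface value."""
    cells = [(r, c) for r, row in enumerate(prob_surface) for c in range(len(row))]
    weights = [p for row in prob_surface for p in row]
    # random.Random.choices samples with replacement, weighted by `weights`.
    return rng.choices(cells, weights=weights, k=n_farms)


# Hypothetical surface: each value is the predicted probability for one 100 m cell.
surface = [
    [0.00, 0.10, 0.05],
    [0.20, 0.80, 0.00],
    [0.05, 0.30, 0.60],
]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
farm_cells = sample_farm_cells(surface, n_farms=200, rng=rng)
```

Cells with a predicted probability of zero are never selected, and high-probability cells receive proportionally more simulated farms. A production version would also need to handle sampling without replacement and county-level farm counts.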