| Literature DB >> 36172296 |
Byron K. Williams, Eleanor D. Brown.
Abstract
The actual state of ecological systems is rarely known with certainty, but management actions must often be taken regardless of imperfect measurement (partial observability). Because of the difficulties in accounting for partial observability, it is usually treated in an ad hoc fashion, or simply ignored altogether. Yet incorporating partial observability into decision processes lends a realism that has the potential to improve ecological outcomes significantly. We review frameworks for dealing with partial observability, focusing specifically on dynamic ecological systems with Markovian transitions, i.e., transitions among system states that are influenced by the current system state and management action over time. Fully observable states are represented in an observable Markov decision process (MDP), whereas obscure or hidden states are represented in a partially observable process (POMDP). POMDPs can be seen as a natural extension of observable MDPs. Management under partial observability generalizes the situation for complete observability, by recognizing uncertainty about the system's state and incorporating sequential observations associated with, but not the same as, the states themselves. Decisions that otherwise would depend on the actual state must be based instead on state probability distributions ("belief states"). Partial observability requires adaptation of the entire decision process, including the use of belief states and Bayesian updates, valuation that includes expectations over observations, and optimal strategy that identifies actions for belief states over a continuous belief space. We compare MDPs and POMDPs and highlight POMDP applications to some common ecological problems. We clarify the structure and operations, approaches for finding solutions, and analytic challenges of POMDPs for practicing ecologists. 
Both observable and partially observable MDPs can use an inductive approach to identify optimal strategies and values, with a considerable increase in mathematical complexity with POMDPs. Better understanding of POMDPs can help decision makers manage imperfectly measured ecological systems more effectively. Published 2022. This article is a U.S. Government work and is in the public domain in the USA. Ecology and Evolution published by John Wiley & Sons Ltd.
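The abstract's core mechanism — replacing the unknown true state with a belief state that is updated by Bayes' rule after each observation — can be sketched in a few lines. The following is a minimal illustration, not taken from the article: the two states, the transition matrix `P`, and the observation model `Q` are hypothetical numbers chosen only to show the update.

```python
# Bayesian belief update for a 2-state POMDP (illustrative numbers only).
# P[x][x'] = Pr(next state x' | current state x), for a fixed action.
# Q[x'][y] = Pr(observation y | next state x').

def belief_update(b, P, Q, y):
    """Return the updated belief over states after observing y:
        b'(x') ∝ Q(y | x') * sum_x P(x' | x) * b(x)
    """
    n = len(b)
    # Predict: propagate the current belief through the transition matrix
    pred = [sum(b[x] * P[x][xp] for x in range(n)) for xp in range(n)]
    # Correct: weight each predicted state by the observation likelihood
    post = [Q[xp][y] * pred[xp] for xp in range(n)]
    norm = sum(post)
    return [p / norm for p in post]

# Hypothetical example: states 0 = "extant", 1 = "extinct" (absorbing);
# observations 0 = "seen", 1 = "not seen".
P = [[0.9, 0.1],
     [0.0, 1.0]]
Q = [[0.6, 0.4],   # Pr(seen | extant) = 0.6
     [0.0, 1.0]]   # the species is never seen if extinct
b = [0.5, 0.5]
b_seen = belief_update(b, P, Q, 0)       # sighting rules out extinction
b_not_seen = belief_update(b, P, Q, 1)   # belief shifts toward extinction
```

Because `Q` assigns zero probability to a sighting when the species is extinct, a "seen" observation drives the belief to certainty in the extant state, while "not seen" only shifts probability gradually toward extinction.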
Keywords: Markov decision process; decision strategy; partial observability; system dynamics; uncertainty
Year: 2022 PMID: 36172296 PMCID: PMC9468910 DOI: 10.1002/ece3.9197
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 3.167
Applications of partially observable Markov decision processes in ecology
| Author(s) | Resource Context | Species | Actions | Features | Solution approach |
|---|---|---|---|---|---|
| Chadès et al. | Endangered species | Sumatran tiger | Manage habitat, survey, do nothing | 2 states (extant/extinct), 3 actions, 3 time steps | Analytical solution, incremental pruning |
| Fackler and Pacifici | Pest infestation | General pest species | Unspecified | 2 hidden models, environment signals used for updating model uncertainty | State space discretization, stochastic dynamic programming |
| Haight and Polasky | Invasive species | Forest pest | Monitoring only, treatment only, both, or neither | Discretized approximation of belief space for 3 states | Customized FORTRAN program for 4 actions over 20 time periods |
| Lane | Commercial fishing | Salmon | Intra‐seasonal fishing within geographic fishing zones | Finite states for each zone; actions are fishing within a zone, or not | Sondik one‐pass algorithm |
| McDonald‐Madden et al. | Endangered species | Sumatran tiger | Survey, management, do nothing | Considers 2 populations, compares results with perfect observability | Incremental pruning |
| Memarzadeh and Boettiger | Marine fisheries | Argentine hake | Fish harvest | Handles either or both structural uncertainty and partial observability | Point‐based value iteration with SARSOP |
| Regan et al. | Invasive species | Branched broomrape | Low‐cost but inefficient action, high‐cost but efficient action, do nothing | 3 states, 3 actions, 2 observations | Stochastic dynamic programming |
| Rout et al. | Invasive species | Black rat | Quarantine, control, mixture | 3 states, 3 actions | Incremental pruning |
| Tomberlin | Endangered seabird habitat | Marbled murrelet | Monitoring, habitat management | 2 occupancy states, 2 actions, 5 time steps | Eagle/Monahan algorithms |
| Pascal et al. | Threatened and endangered species | Sumatran tiger | Manage, monitor | Framed in terms of the termination of management or surveying, 2 observations | Point‐based value iteration with SARSOP |
| Tomberlin | Erosion control on roads in redwood forests | Forestry | High or low monitoring effort, erosion treatment, no action | High‐ and low‐level erosion conditions | Backward recursion |
| Nicol et al. | Management of shorebird habitats under changing sea‐level conditions | 10 species of migratory shorebirds | Protection of non‐breeding sites against sea‐level change | Factored MOMDP with observed states, uncertain and non‐stationary transition structure | Point‐based value iteration with Symbolic Perseus |
| Nicol and Chadès | Marine oil spill risks | Sea otter | Reduction or cleanup of oil spills, reintroduction, monitoring | Discretization of a continuous state with CU‐tree | Point‐based value iteration with Perseus |
| Kling et al. | Marine species in a preserve | Lionfish | Lionfish removal, monitoring | Observations based on removal effort and presence/absence of monitoring | Projected belief MDP |
| Memarzadeh et al. | Commercial fisheries | Multiple fish species | Fishing catch quotas | Compares MSY, MDP, and POMDP solutions | Point‐based value iteration with SARSOP |
| Sloggy et al. | Commercial forestry | Loblolly pine | Monitoring, harvest, regeneration, delay | Valuation based on stochastic prices and harvest amounts | Projected belief MDP |
| Fackler and Haight | Invasive species | Forest pest | Treatment only, monitoring only, both monitoring and treatment, no action | Discretization of a continuous state | Monahan's exact method, variations on Lovejoy's discretization method |
| Fackler et al. | Recreation management | Golden eagle | Management of recreation near nesting sites | Observable occupancy, uncertain disturbance effect | Unspecified |
| MacLachlan et al. | Bovine tuberculosis in cattle herds | Domestic cattle | Herd testing for infection, followed by isolation if found | 3 states per herd, testing and control of multiple herds | Projected belief MDP |
| Sethi et al. | Commercial fisheries | Commercial fish species | Fish harvest quotas | Discrete observations, continuous states and actions; spline interpolations | Value function iteration for infinite time horizon |
| Chadès et al. | Australian birds threatened by habitat degradation and predation | Gouldian finch | Fire and grazing management, feral cat control, provision of nesting boxes, do nothing | Observable states, stationary but unknown state transitions | MOMDP ("hidden model MDP") with discrete states and models, several solution algorithms |
See Uther and Veloso (1998).
FIGURE 1 Model of waterfowl population dynamics, including survival rates for spring–summer and fall–winter, harvest rates for young and adults, and an age ratio for reproduction/recruitment.
FIGURE 2 Influence diagram for an observable Markov decision process
FIGURE 3 Influence diagram for a partially observable Markov decision process (after Chadès et al., 2021).
FIGURE 4 Policy tree for an observable Markov decision process
FIGURE 5 Policy tree for a partially observable Markov decision process
FIGURE 6 Value functions for terminal time T, with 2 states, 4 actions, and a belief state. Each action generates a different return function. The partitioning of belief space into 3 segments, and the optimal action for each segment, are determined by which return function produces the largest value at each belief state. The optimal value function is indicated by darkened line segments.
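The construction in Figure 6 can be made concrete with a small sketch. With 2 states, a belief state reduces to p = Pr(state 1), each action's terminal value V_a(p) = (1 − p)·R(a, x0) + p·R(a, x1) is linear in p, and the optimal value function is the pointwise maximum over actions. The action labels and return values below are hypothetical, chosen only so that the maximum partitions belief space into three segments as in the figure.

```python
# Terminal-time value functions over a 2-state belief space
# (illustrative returns, not the article's numbers).
# R[a] = (R(a, x0), R(a, x1)): immediate return in each state.
R = {"a1": (10.0, 0.0),
     "a2": (6.0, 5.0),
     "a3": (0.0, 12.0)}

def value(action, p):
    """Linear value of an action at belief p = Pr(state 1)."""
    r0, r1 = R[action]
    return (1 - p) * r0 + p * r1

def optimal(p):
    """Optimal terminal value and action at belief p:
    the pointwise maximum over the actions' linear functions."""
    return max(((value(a, p), a) for a in R), key=lambda t: t[0])
```

With these numbers, "a1" is optimal near p = 0, "a3" near p = 1, and "a2" on a middle segment — the three-segment partition of belief space that the caption describes, with one dominated action never chosen.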
Immediate return for conservation action a given state x [table entries not preserved in this record]
Optimal time‐specific values and conservation actions for state x [table entries not preserved in this record]
Probabilities corresponding to each observation for a given state x [table entries not preserved in this record]
Immediate returns for two actions, given two states, along with returns averaged over belief state b [table entries not preserved in this record]
FIGURE 7 Valuation at time T−1 for a policy tree with a given root action and optimal sub‐policies thereafter. Graphs display (i) immediate returns; (ii) backcast values for each observation, along with partition segment cutpoints; and (iii) the accumulation of immediate returns and optimal backcast values over observations.
FIGURE 8 Valuation at time T−1 for a policy tree with an alternative root action and optimal sub‐policies thereafter, displayed as in Figure 7.
Values at time T−1 for each action and each observation following the root action [table entries not preserved in this record]
FIGURE 9 Combining the two value functions to produce the optimal valuation for time T−1. The partitioning of belief space is determined by the time T partitions of the two functions and by their intersection points. The optimal action for belief states in each partition segment is determined by which of the 2 value functions produces the larger value.
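The backward step that Figures 7–9 describe — for each candidate root action, accumulate the immediate expected return plus, for each observation, the probability of that observation times the optimal time-T value at the Bayes-updated belief, then take the maximum over actions — can be sketched as follows. All action names, matrices, and numbers here are hypothetical, chosen only to exercise the computation.

```python
# One-step POMDP backup at time T-1 for a 2-state, 2-observation example
# (illustrative data, not the article's).
# P[a][x][x'] = transition probs, Q[a][x'][y] = observation probs,
# R[a] = (R(a, x0), R(a, x1)), V_T maps a belief to the optimal terminal value.

def backup(b, actions, P, Q, R, V_T):
    """Return (value, action) maximizing immediate return plus the
    expected optimal time-T value over observations."""
    best = None
    for a in actions:
        v = b[0] * R[a][0] + b[1] * R[a][1]  # immediate expected return
        # Predicted next-state distribution under action a
        pred = [sum(b[x] * P[a][x][xp] for x in (0, 1)) for xp in (0, 1)]
        for y in (0, 1):
            # Probability of observing y, and the Bayes-updated belief
            py = sum(Q[a][xp][y] * pred[xp] for xp in (0, 1))
            if py > 0:
                b_next = [Q[a][xp][y] * pred[xp] / py for xp in (0, 1)]
                v += py * V_T(b_next)
        if best is None or v > best[0]:
            best = (v, a)
    return best

# Hypothetical setup: "manage" earns a return and yields informative
# observations; "wait" earns nothing and observes nothing useful.
actions = ["manage", "wait"]
P = {"manage": [[1.0, 0.0], [0.0, 1.0]],
     "wait":   [[1.0, 0.0], [0.0, 1.0]]}
Q = {"manage": [[0.8, 0.2], [0.2, 0.8]],   # informative
     "wait":   [[0.5, 0.5], [0.5, 0.5]]}   # uninformative
R = {"manage": (1.0, 0.0), "wait": (0.0, 0.0)}
V_T = lambda b: max(2.0 * b[0], 2.0 * b[1])  # max of 2 linear functions

best_value, best_action = backup([0.5, 0.5], actions, P, Q, R, V_T)
```

Note the structure the figures emphasize: because V_T is itself a maximum of linear functions, the informative action is worth more than its immediate return alone — sharper beliefs land on higher segments of the terminal value function, which is why "manage" dominates "wait" here.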