Hamed Ghaedi, Allison C Reilly, Hiba Baroud, Daniel V Perrucci, Celso M Ferreira.
Abstract
A spatially resolved understanding of flood hazard intensity is required for accurate predictions of infrastructure reliability and losses in the aftermath of a flood. Currently, researchers who wish to predict flood losses or infrastructure reliability following a flood usually rely on computationally intensive hydrodynamic modeling or on flood hazard maps (e.g., the 100-year floodplain) to build a spatially resolved understanding of the flood's intensity. However, both have specific limitations. The former requires subject matter expertise to create the models as well as significant computation time, while the latter is a static metric that does not vary from one event to another. The objective of this work is to develop an integrated data-driven approach to rapidly predict flood damages using two emerging flood intensity heuristics, namely the Flood Peak Ratio (FPR) and NASA's Giovanni Flooded Fraction (GFF). This study uses data on flood claims from the National Flood Insurance Program (NFIP) as a proxy for flood damage, along with other well-established flood exposure variables, such as regional slope and population. The approach uses statistical learning methods to generate predictive models at two spatial levels, nationwide and statewide, for the entire contiguous United States. A variable importance analysis demonstrates the significance of FPR and GFF data in predicting flood damage. In addition, model performance was higher in the state-level analysis than in the nationwide analysis, indicating the effectiveness of both the FPR and GFF models at the regional level. A data-driven approach to predicting flood damage using the FPR and GFF data offers promise given its relative simplicity, its reliance on publicly accessible data, and its comparatively fast computational speed.
Year: 2022 PMID: 35921327 PMCID: PMC9348728 DOI: 10.1371/journal.pone.0271230
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. Framework of methodology.
Out-of-sample confusion matrix for monthly-county records in 2016.
This matrix sums together each of the 2,555 counties’ confusion matrices.
| | Predicted | |
|---|---|---|
| | 20,188 (66%) | 7,587 (25%) |
| | 1,246 (4%) | 1,639 (5%) |
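As a quick consistency check on the matrix above, the following sketch recomputes each cell's share of all monthly-county records and an overall agreement rate. It assumes, as is conventional but not stated in the surviving table, that the diagonal cells (20,188 and 1,639) are the correctly classified records. The names `counts`, `cell_shares`, and `agreement_rate` are illustrative and do not come from the paper.

```python
# Cell counts copied from the confusion matrix, in the row order printed above.
counts = [
    [20188, 7587],
    [1246, 1639],
]


def cell_shares(matrix):
    """Return each cell as a whole-number percentage of all records."""
    total = sum(sum(row) for row in matrix)
    return [[round(100 * cell / total) for cell in row] for row in matrix]


def agreement_rate(matrix):
    """Fraction of records on the diagonal (assumed to be correct predictions)."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return diagonal / total


total_records = sum(sum(row) for row in counts)
print(total_records)                      # 2,555 counties x 12 months
print(cell_shares(counts))                # percentages shown in the table
print(round(agreement_rate(counts), 3))   # share of records on the diagonal
```

The total of 30,660 records matches 2,555 counties times 12 months, which agrees with the caption's statement that the matrix sums the county-level matrices for 2016.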
Summary of model performance for FPR model.
| | In-Sample | | Out-of-sample | | | | | |
|---|---|---|---|---|---|---|---|---|
| | 0.64 | 6.11 | 0.67 | 0.08 | 6.70 | 1.37 | 0.16 | 0.58 |
| | 0.51 | 5.41 | 0.64 | 0.08 | 6.75 | 1.37 | 0.15 | 0.73 |
| | 0.98 | 7.09 | 1.01 | 0.07 | 7.09 | 1.39 | 0.14 | 0.48 |
| | 0.79 | 13.88 | 0.81 | 0.20 | 12.45 | 10.81 | 0.13 | 0.28 |
| | 0.83 | 7.39 | | | | | | |
a. MAE: Mean Absolute Error.
b. RMSE: Root Mean Squared Error.
c. SD-MAE: Standard Deviation of MAE.
d. SD-RMSE: Standard Deviation of RMSE.
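For readers unfamiliar with the four error measures in the footnotes, here is a minimal sketch of how they can be computed. It assumes that SD-MAE and SD-RMSE are standard deviations taken across repeated hold-out folds, which this excerpt does not state. The toy `folds` data and all function names are hypothetical.

```python
import math
import statistics


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up (actual, predicted) claim counts for two hold-out folds.
folds = [
    ([0, 2, 5], [1, 2, 3]),
    ([1, 0, 4], [1, 1, 4]),
]

fold_mae = [mae(a, p) for a, p in folds]
fold_rmse = [rmse(a, p) for a, p in folds]

summary = {
    "MAE": statistics.mean(fold_mae),
    "SD-MAE": statistics.stdev(fold_mae),    # spread of MAE across folds
    "RMSE": statistics.mean(fold_rmse),
    "SD-RMSE": statistics.stdev(fold_rmse),  # spread of RMSE across folds
}
print(summary)
```

Because RMSE squares each error before averaging, a few very large misses raise it far more than they raise MAE, which is one reason the two measures can rank models differently.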
Summary of model performance for GFF model.
| | In-Sample | | Out-of-sample | | | | | |
|---|---|---|---|---|---|---|---|---|
| | 0.67 | 6.36 | 0.73 | 0.10 | 6.97 | 1.72 | 0.08 | 0.57 |
| | 0.42 | 4.33 | 0.65 | 0.10 | 6.91 | 1.59 | 0.11 | 0.86 |
| | 1.04 | 7.26 | 1.05 | 0.09 | 7.11 | 1.75 | 0.10 | 0.42 |
| | 0.81 | 7.86 | 0.83 | 0.12 | 7.89 | 1.93 | 0.02 | 0.10 |
| | 0.83 | 7.39 | | | | | | |
a. MAE: Mean Absolute Error.
b. RMSE: Root Mean Squared Error.
c. SD-MAE: Standard Deviation of MAE.
d. SD-RMSE: Standard Deviation of RMSE.
Fig 2. Variable importance plot for the FPR model.
See S1 Table for a description of variables.
Fig 3. Variable importance plot for the GFF model.
See S1 Table for a description of variables.
Fig 4. Partial dependence plots of the three most important variables for both the FPR model and the GFF model.
See S1 Table for a description of variables.
Fig 5. Predictive accuracy measures for each state using the RF model.
a. R² (FPR model), b. R² (GFF model), c. RMSE reduction compared to null model (FPR model, out-of-sample), d. RMSE reduction compared to null model (GFF model, out-of-sample), e. MAE reduction compared to null model (FPR model, out-of-sample), f. MAE reduction compared to null model (GFF model, out-of-sample), g. Correlation between predicted and actual values (FPR model), h. Correlation between predicted and actual values (GFF model). Note: MAE, Mean Absolute Error; RMSE, Root Mean Squared Error. The base map in this figure is from 2010 TIGER/Line Shapefiles, prepared by the U.S. Census Bureau. It is in the public domain and is not copyrighted [51].
The most important variable, in terms of error reduction, for each state in statewide analysis.
See S1 Table for a description of variables and abbreviations.
| State | Top Important Variable (FPR) | Top Important Variable (GFF) | State | Top Important Variable (FPR) | Top Important Variable (GFF) |
|---|---|---|---|---|---|
| AL | ratio_greater_0.2 | flooded.frac.max | MT | ratio_greater_1 | flooded.frac.max |
| AZ | population_density | flooded.frac.max | NE | ratio_greater_0.5 | flooded.frac.max |
| AR | max_ratio | gff_greater_0.05 | NV | ratio_greater_0.5 | population_density |
| CA | max_ratio | barren | NH | ratio_greater_0.5 | flooded.frac.max |
| CO | devmed | herbaceuous | NJ | max_ratio | flooded.frac.max |
| CT | ratio_greater_0.2 | water | NM | penetration_rate | flooded.frac.max |
| DE | ratio_greater_1 | gff_greater_0.05 | NY | penetration_rate | penetration_rate |
| FL | ratio_greater_0.5 | gff_greater_0.05 | NC | ratio_greater_1 | gff_greater_0.05 |
| GA | max_ratio | gff_greater_0.05 | OH | ratio_greater_0.5 | flooded.frac.max |
| ID | population_density | devmed | OK | ratio_greater_0.5 | flooded.frac.max |
| IL | ratio_greater_1 | gff_greater_0.05 | OR | max_ratio | flooded.frac.max |
| IN | ratio_greater_2 | flooded.frac.max | PA | ratio_greater_1 | flooded.frac.max |
| IA | ratio_greater_1 | gff_greater_0.05 | SC | max_ratio | gff_greater_0.05 |
| KS | ratio_greater_0.5 | flooded.frac.max | SD | max_ratio | gff_greater_0.05 |
| KY | water | flooded.frac.max | TN | max_ratio | gff_greater_0.05 |
| LA | max_ratio | gff_greater_0.2 | TX | max_ratio | gff_greater_0.05 |
| ME | max_ratio | flooded.frac.max | UT | max_ratio | flooded.frac.max |
| MD | ratio_greater_0.5 | gff_greater_0.05 | VT | ratio_greater_0.5 | herbaceuous |
| MA | planted/cultivated | flooded.frac.max | VA | ratio_greater_1 | gff_greater_0.05 |
| MI | population_density | flooded.frac.max | WA | max_ratio | flooded.frac.max |
| MN | ratio_greater_0.5 | gff_greater_0.05 | WV | max_ratio | flooded.frac.max |
| MS | ratio_greater_1 | gff_greater_0.05 | WI | max_ratio | gff_greater_0.05 |
| MO | ratio_greater_0.2 | flooded.frac.max | WY | max_ratio | flooded.frac.max |
Comparison of the maximum actual number of claims from actual events versus the maximum predicted number of claims from predicted events using the state-level analysis in 2016 at the county level.
| Error metric | Null model | Maximum actual vs. maximum predicted claims | | Error reduction compared to null model (%) | |
|---|---|---|---|---|---|
| | | FPR model | GFF model | FPR model | GFF model |
| MAE | 112.77 | 62.28 | 62.60 | 44.77 | 44.49 |
| RMSE | 471.65 | 471.57 | 472.07 | 0.02 | -0.09 |
| | 0.05 | 0.06 | | | |
a. MAE: Mean Absolute Error.
b. RMSE: Root Mean Squared Error.
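The "error reduction compared to null model" columns are consistent with a simple relative difference, 100 × (null error − model error) / null error. This formula is inferred from the tabulated values and is not quoted from the paper. The sketch below recomputes the percentages from the MAE and RMSE rows of the table above; the function name `error_reduction_pct` is illustrative.

```python
def error_reduction_pct(null_error, model_error):
    """Percentage by which a model lowers the error relative to the null model.

    Negative values mean the model performs worse than the null model.
    """
    return 100.0 * (null_error - model_error) / null_error


# Values copied from the maximum-claims comparison table.
table = {
    "MAE": {"null": 112.77, "FPR": 62.28, "GFF": 62.60},
    "RMSE": {"null": 471.65, "FPR": 471.57, "GFF": 472.07},
}

reductions = {
    metric: {
        model: round(error_reduction_pct(row["null"], row[model]), 2)
        for model in ("FPR", "GFF")
    }
    for metric, row in table.items()
}

for metric, by_model in reductions.items():
    print(metric, by_model)
```

Under this reading, both models cut MAE by roughly 45% but leave RMSE essentially unchanged, and the GFF model's RMSE is marginally worse than the null model's, hence the negative entry.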
Comparison of the total actual number of claims from actual events versus the total predicted number of claims from predicted events using the state-level analysis in 2016 at the county level.
| Error metric | Null model | Total actual vs. total predicted claims | | Error reduction compared to null model (%) | |
|---|---|---|---|---|---|
| | | FPR model | GFF model | FPR model | GFF model |
| MAE | 138.76 | 67.32 | 66.23 | 51.48 | 52.27 |
| RMSE | 581.69 | 560.83 | 560.15 | 3.59 | 3.70 |
| | 0.22 | 0.22 | | | |
a. MAE: Mean Absolute Error.
b. RMSE: Root Mean Squared Error.
Comparison of the total actual number of claims from actual events versus the total predicted number of claims from only predicted events with some time overlap with the actual events using the state-level analysis in 2016 at the county level.
| Error metric | Null model | Actual claims vs. overlapped predicted claims | | Error reduction compared to null model (%) | |
|---|---|---|---|---|---|
| | | FPR model | GFF model | FPR model | GFF model |
| MAE | 65.79 | 28.82 | 27.50 | 56.19 | 58.19 |
| RMSE | 195.17 | 180.27 | 178.21 | 7.63 | 8.69 |
| | 0.16 | 0.18 | | | |
a. MAE: Mean Absolute Error.
b. RMSE: Root Mean Squared Error.