Rong Zhang1, Zhao-Yue Chen1, Li-Jun Xu1, Chun-Quan Ou2.
1. State Key Laboratory of Organ Failure Research, Department of Biostatistics, Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou 510515, China.
2. State Key Laboratory of Organ Failure Research, Department of Biostatistics, Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou 510515, China. Electronic address: ouchunquan@hotmail.com.
Abstract
BACKGROUND: Drought is a major natural disaster that causes severe social and economic losses. Predicting regional droughts may provide important information for drought preparedness and farm irrigation. Existing drought prediction models are mainly based on data from a single weather station, so a multistation-based prediction model is needed.
OBJECTIVES: This study optimizes the predictor selection process and develops a new model to predict droughts using past drought index values, meteorological measures and climate signals from 32 stations in Shaanxi province, China, during 1961 to 2016.
METHODS: We applied and compared two methods, a cross-correlation function and a distributed lag nonlinear model (DLNM), for selecting the optimal predictors and specifying their lag times. We then built a DLNM, an artificial neural network model and an XGBoost model and compared their validation performance in predicting the Standardized Precipitation Evapotranspiration Index (SPEI) 1-6 months in advance.
RESULTS: The DLNM was better than the cross-correlation function in predictor selection and lag effect determination. The XGBoost model predicted SPEI with a lead time of 1-6 months more accurately than the DLNM and the artificial neural network, with cross-validation R² values of 0.68-0.82, 0.72-0.89, 0.81-0.92, and 0.84-0.95 at the 3-, 6-, 9- and 12-month time scales, respectively. Moreover, the XGBoost model had the highest prediction accuracy for overall droughts (89%-97%) and for three specific drought categories (i.e., moderate, severe, and extreme) (76%-94%).
CONCLUSION: This study offers a new modeling strategy for drought prediction based on multistation data. Incorporating the nonlinear and lag effects of predictors into the XGBoost method can significantly improve the prediction accuracy of SPEI and drought.
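As an illustration of the cross-correlation approach to lag selection named in METHODS, the following is a minimal sketch and not the authors' implementation. It assumes a single predictor and a single target series, uses synthetic data generated in the script itself, and picks the lag with the largest absolute Pearson correlation. The function names `pearson` and `lagged_correlations`, the maximum lag of 6 and the synthetic series are all hypothetical choices made for this example. The DLNM-based selection that the study found to perform better is not shown.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlations(predictor, target, max_lag):
    """Correlate predictor[t - lag] with target[t] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = predictor[: len(predictor) - lag]  # predictor shifted back by `lag`
        y = target[lag:]                       # target aligned to the shift
        result[lag] = pearson(x, y)
    return result


# Synthetic monthly series (assumption: NOT the study's data).
# The target follows the predictor with a 3-month delay plus noise.
rng = random.Random(0)
n_months = 240
true_lag = 3
predictor = [rng.gauss(0.0, 1.0) for _ in range(n_months)]
target = [
    (predictor[t - true_lag] if t >= true_lag else 0.0) + rng.gauss(0.0, 0.5)
    for t in range(n_months)
]

correlations = lagged_correlations(predictor, target, max_lag=6)
best_lag = max(correlations, key=lambda lag: abs(correlations[lag]))
print("selected lag (months):", best_lag)
```

A cross-correlation of this kind captures only linear association at each lag taken separately, which may be one reason a model that represents nonlinear and lagged effects jointly was preferred in the study.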