
Multiple comparisons of precipitation variations in different areas using simultaneous confidence intervals for all possible ratios of variances of several zero-inflated lognormal models.

Patcharee Maneerat, Sa-Aat Niwitpong.

Abstract

Flash flooding and landslides regularly cause injury, death, and homelessness in Thailand. An advance warning system is necessary for predicting natural disasters, and analyzing the variability of daily precipitation might be useful in this regard. Moreover, analyzing the differences in precipitation data among multiple weather stations could be used to predict variations in meteorological conditions throughout the country. Since precipitation data in Thailand follow a zero-inflated lognormal (ZILN) distribution, multiple comparisons of precipitation variation in different areas can be addressed by using simultaneous confidence intervals (SCIs) for all possible pairwise ratios of variances of several ZILN models. Herein, we formulate SCIs using Bayesian, generalized pivotal quantity (GPQ), and parametric bootstrap (PB) approaches. The results of a simulation study provide insight into the performances of the SCIs. Those based on PB and the Bayesian approach via probability matching with the beta prior performed well in situations with a large amount of zero-inflated data with a large variance. In addition, the Bayesian SCI based on the reference-beta prior and the GPQ-based SCI can be considered as alternative approaches for small-to-large and medium-to-large sample sizes, respectively, when the number of populations is large. These approaches were applied to estimate the precipitation variability among weather stations in lower southern Thailand to illustrate their efficacies. ©2021 Maneerat and Niwitpong.


Keywords:  Bayesian approach; Beta prior; Parametric bootstrap approach; Precipitation variation; Rainfall data; Ratio of variances; Simulation

Year:  2021        PMID: 35036147      PMCID: PMC8697768          DOI: 10.7717/peerj.12659

Source DB:  PubMed          Journal:  PeerJ        ISSN: 2167-8359            Impact factor:   2.984


Introduction and Motivation

In early 2021, approximately 186,300 people in lower southern Thailand were affected by heavy rainfall resulting in flash flooding, landslides, and windstorms, as reported by Thailand’s Department of Disaster Prevention and Mitigation (DDPM) (Thailand, 2021). Four provinces in the lower southern region of Thailand were affected by flooding: Songkhla (60 households), Pattani (2,810 households), Yala (12,082 households), and Narathiwat (22,308 households). Meanwhile, landslides in Yala and Narathiwat affected approximately 57 households (Thailand, 2021). Unfortunately, these natural disasters resulted in deaths and injuries (David, 2021). It would be possible to reduce the impact of natural disasters if governmental organizations had an early warning system that could be triggered to warn people in high-risk areas in advance of impending catastrophes. Analyzing the dispersion of historical precipitation data can provide essential information, because high variation can indicate imminent flooding. Importantly, it could also be used to predict precipitation variation in each area. From the historical evidence of flooding in lower southern Thailand, the precipitation data in the four areas are inflated with zero observations, while the non-zero precipitation records are log-normally distributed, as can be seen in the empirical application section. These properties indicate that the precipitation data satisfy the assumptions of a zero-inflated lognormal (ZILN) distribution and can be modeled accordingly. The ZILN model, also referred to as the delta-lognormal model, is appropriate for modeling right-skewed data with a proportion of zeros (Aitchison & Brown, 1963; Fletcher, 2008; Wu & Hsieh, 2014; Hasan & Krishnamoorthy, 2018; Maneerat, Niwitpong & Niwitpong, 2019). Variance is a measure of dispersion used in statistical inference for both point and interval (e.g., confidence interval; CI) estimation.
Several researchers have formulated point and interval estimates via various approaches. For example, Burdick & Graybill (1984) established CIs for linear combinations of the variance components using the unbalanced one-way classification model and the Graybill-Wang procedure by considering the inequality of the design (Graybill & Wang, 1980). Ciach & Krajewski (1999) estimated the radar-raingauge difference variances which can be separated into the area-point ground raingauge originating from resolution difference between them, and the error of the radar area-average rainfall estimate. Another important approach for variance estimation is bootstrapping based on t-statistics to formulate nonparametric CIs for a single variance and the difference between variances, which was used to estimate the variance in insurance data for properties (Cojbasic & Tomovic, 2007). Bebu & Mathew (2008) used a modified single log-likelihood ratio procedure to construct CIs for the ratio of bivariate lognormal variances and applied it to compare variation in health care costs. Cojbasic & Loncar (2011) suggested Hall’s bootstrapped-t method for constructing one-sided CIs (lower and upper endpoint CIs) for the variances of skewed distributions and illustrated the efficacy of their method by analyzing revenue variability within the food retail industry. Later, Herbert et al. (2011) suggested an analytical method for the difference between two independent variances that performed well even with small unequal sample sizes and highly skewed leptokurtic data; they used data from a randomized trial for a cholesterol-lowering drug to portray the efficacies of their proposed methods. Harvey & Merwe (2012) revealed that a Bayesian CI based on the highest posterior density outperformed one based on the equal-tailed interval for the variance of lognormal distribution with zero observations. 
Maneerat, Niwitpong & Niwitpong (2020) showed that the highest posterior density interval based on a probability matching prior produced the narrowest interval with correct coverage for comparing delta-lognormal variances; they applied it to estimate the difference between rainfall variability in the lower and upper northern regions of Thailand. Recently, Bayesian credible intervals based on a non-informative prior were presented by Maneerat, Niwitpong & Niwitpong (2021a) for the single variance of a delta-lognormal model and applied to daily rainfall records. Nevertheless, no studies have yet been conducted on simultaneous CIs (SCIs) for pairwise comparisons of the variances of several ZILN models, and so we directed our research toward filling this gap. Hence, we estimated all possible ratios of variances of several ZILN models by using SCIs based on Bayesian, parametric bootstrap (PB), and generalized pivotal quantity (GPQ) approaches. These approaches were chosen because the Bayesian and PB approaches can be used to construct CIs capable of handling situations with large differences in the variances and a high proportion of zero values in delta-lognormal models, respectively (Maneerat, Niwitpong & Niwitpong, 2020), while CIs based on the GPQ approach perform quite well when the variance is large (maneeratEstimatingFishDispersal2020). Their efficacies were determined via simulation studies and precipitation data from four areas of the lower southern region of Thailand in terms of the coverage rate (CR), the lower error rate (LER), the upper error rate (UER), and the average width (AW).

Model and methods

Model

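For $h$ groups, $d_i$; $i = 1, 2, \ldots, h$, denotes the probability of having zero observations, while the non-zero observations, which occur with the remaining probability $d_i' = 1 - d_i$, follow a lognormal distribution denoted as $LN(\mu_i, \sigma_i^2)$ with mean $\mu_i$ and variance $\sigma_i^2$. For random samples from the groups, let $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})$ denote a ZILN variate based on $n_i$ observations from group $i$ with the probability density function given by

$$f(y_{ij}; d_i, \mu_i, \sigma_i^2) = d_i I_0[y_{ij}] + d_i' \frac{1}{y_{ij}\sqrt{2\pi\sigma_i^2}} \exp\left\{-\frac{(\ln y_{ij} - \mu_i)^2}{2\sigma_i^2}\right\} I_{(0,\infty)}[y_{ij}],$$

where $I$ denotes the indicator function. For $Y_{ij} = 0$, the number of zero observations $n_{i(0)}$ follows a binomial distribution with sample size $n_i$ and the probability of having zero observations $d_i$, where $n_i = n_{i(0)} + n_{i(1)}$, $n_{i(0)} = \#\{j: Y_{ij} = 0\}$ and $n_{i(1)} = \#\{j: Y_{ij} > 0\}$; $j = 1, 2, \ldots, n_i$. For $Y_{ij} > 0$, $W_{ij} = \ln Y_{ij}$ are normally distributed with mean $\mu_i$ and variance $\sigma_i^2$. For a ZILN model, the maximum likelihood estimates of $d_i$, $\mu_i$ and $\sigma_i^2$ are $\hat{d}_i = n_{i(0)}/n_i$, $\hat{\mu}_i = \sum_{j: Y_{ij} > 0} W_{ij}/n_{i(1)}$ and $\hat{\sigma}_i^2 = \sum_{j: Y_{ij} > 0} (W_{ij} - \hat{\mu}_i)^2/n_{i(1)}$, respectively.

For the $i$th group, the population variance of $Y_{ij}$ is given by

$$V_i = d_i' \exp(2\mu_i + \sigma_i^2)\left[\exp(\sigma_i^2) - d_i'\right],$$

which can be log-transformed as $T_i = \ln V_i = \ln d_i' + (2\mu_i + \sigma_i^2) + \ln[\exp(\sigma_i^2) - d_i']$. Considering the third term of $T_i$ leads to obtaining $\ln[\exp(\sigma_i^2) - d_i'] \approx \sigma_i^2$ when $\sigma_i^2$ is large. Thus, the log-transformed variance $V_i$ can be approximated as

$$T_i \approx \ln d_i' + 2(\mu_i + \sigma_i^2).$$

Given $\hat{d}_i' = 1 - \hat{d}_i$, $\hat{\mu}_i$ and $\hat{\sigma}_i^2$ from the observations, the estimates of $T_i$ can be written as $\hat{T}_i = \ln \hat{d}_i' + 2(\hat{\mu}_i + \hat{\sigma}_i^2)$; $i = 1, 2, \ldots, h$. Using the delta theorem, an approximate variance of $\hat{T}_i$, denoted $\mathrm{Var}(\hat{T}_i)$, is obtained. In the present study, the parameter of interest is all pairwise ratios among the variances of several ZILN models, taken on the log scale, which is defined as

$$\lambda_{ik} = \ln(V_i/V_k) = T_i - T_k.$$

Its estimates can be obtained as $\hat{\lambda}_{ik} = \hat{T}_i - \hat{T}_k$; $\forall i \neq k$ and $i, k = 1, 2, \ldots, h$. From Eq. (4), the variance of $\hat{\lambda}_{ik}$ can be expressed as

$$\mathrm{Var}(\hat{\lambda}_{ik}) = \mathrm{Var}(\hat{T}_i) + \mathrm{Var}(\hat{T}_k) - 2\,\mathrm{Cov}(\hat{T}_i, \hat{T}_k),$$

where the covariance between $\hat{T}_i$ and $\hat{T}_k$ is $\mathrm{Cov}(\hat{T}_i, \hat{T}_k) = 0$ because $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})$ comprise an independent and identically distributed (iid) random vector from a ZILN model. Thus, the estimates $\hat{T}_i$ are independent random variables. Using the estimates from the samples, the estimated variance of $\hat{\lambda}_{ik}$ becomes

$$\widehat{\mathrm{Var}}(\hat{\lambda}_{ik}) = \widehat{\mathrm{Var}}(\hat{T}_i) + \widehat{\mathrm{Var}}(\hat{T}_k),$$

where $\widehat{\mathrm{Var}}(\hat{T}_i)$ and $\widehat{\mathrm{Var}}(\hat{T}_k)$ denote the estimates of $\mathrm{Var}(\hat{T}_i)$ and $\mathrm{Var}(\hat{T}_k)$, respectively.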
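As a rough numerical check of the model above, the following sketch computes the approximate log-transformed variance ln(1 − d) + 2(μ + σ²) from summary statistics and forms the pairwise differences between stations. The function names are illustrative assumptions rather than the authors' code, and the inputs are the rounded summary statistics reported in Table 4, so agreement with the published values is only up to rounding.

```python
import math
from itertools import combinations

def ziln_mle(y):
    """Return (d_hat, mu_hat, sigma2_hat) for one sample of non-negative values."""
    logs = [math.log(v) for v in y if v > 0]
    n, n1 = len(y), len(logs)
    d_hat = (n - n1) / n                                      # proportion of zeros
    mu_hat = sum(logs) / n1                                   # mean of the log non-zero values
    sigma2_hat = sum((w - mu_hat) ** 2 for w in logs) / n1    # MLE, divides by n1
    return d_hat, mu_hat, sigma2_hat

def log_variance_exact(d, mu, sigma2):
    """ln V for a ZILN model, with V = d' exp(2 mu + sigma2) [exp(sigma2) - d']."""
    d1 = 1.0 - d
    return math.log(d1) + 2.0 * mu + sigma2 + math.log(math.exp(sigma2) - d1)

def log_variance_approx(d, mu, sigma2):
    """Large-variance approximation: ln d' + 2 (mu + sigma2)."""
    return math.log(1.0 - d) + 2.0 * (mu + sigma2)

# Rounded (d_hat, mu_hat, sigma^2) per station, copied from Table 4.
stations = {
    "Songkhla": (0.37097, 1.909, 2.982),
    "Songkhla-Sadao": (0.59677, 1.828, 3.509),
    "Yala": (0.40323, 2.155, 3.490),
    "Narathiwat": (0.43548, 2.253, 4.238),
    "Pattani": (0.46774, 1.669, 2.950),
}

T_hat = {name: log_variance_approx(*params) for name, params in stations.items()}
lam_hat = {(a, b): T_hat[a] - T_hat[b] for a, b in combinations(T_hat, 2)}

for pair, value in lam_hat.items():
    print(pair, round(value, 3))
```

With these inputs the approximate log-variances agree with the last column of Table 4, and the ten differences agree with the point estimates in Table 5, to within the rounding of the tabulated statistics.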

Methods

To estimate λ, the SCIs are formulated based on Bayesian, GPQ and PB approaches.

The Bayesian approach

The essential feature of the Bayesian approach is the use of a situation-specific prior distribution that reflects knowledge or subjective belief about the parameter of interest; this is updated in accordance with Bayes’ theorem to yield the posterior distribution. Thus, CIs based on the Bayesian approach are derived by using the posterior distribution. In Bayesian theory, the CI is referred to as the credible interval, and it is not unique for a given posterior distribution. The following methods are used to define suitable credible intervals: the narrowest interval for a univariate distribution (the highest posterior density interval) (Box & Tiao, 1973); the interval for which the probability of being below is the same as that of being above, which is sometimes referred to as the equal-tailed interval (Gelman et al., 2014); or the interval with the mean as the central point (assuming that it exists). In the present study, the SCIs based on the Bayesian approach were constructed based on the equal-tailed interval. Motivated by Maneerat, Niwitpong & Niwitpong (2020), the probability-matching-beta (PMB) and reference-beta (RB) priors were our choice for the parameters $(d_i', \mu_i, \sigma_i^2)$ in this study. Thus, Bayesian SCIs for $\lambda_{ik}$ were established as follows:

The PMB prior: The probability-matching prior for $(\mu_i, \sigma_i^2)$ is combined with the prior of $d_i'$ as a beta distribution with a = b = 1/2, which defines the PMB prior for $(d_i', \mu_i, \sigma_i^2)$. Updating this prior with the likelihood gives the joint posterior, from which the respective marginal posterior distributions of $d_i'$, $\mu_i$ and $\sigma_i^2$ are obtained. Substituting these marginal posteriors into $T_i$ and $T_k$ gives the posterior of $\lambda_{ik} = T_i - T_k$. In agreement with Ganesh (2009), the 100(1 − α)% Bayesian-based SCI with the PMB prior for $\lambda_{ik}$ is constructed around $\hat{\lambda}_{ik}$ using a critical value equal to the (1 − α) percentile of the distribution of the corresponding posterior-based statistic.

The RB prior: This is a non-informative prior derived from the Fisher information matrix (Maneerat, Niwitpong & Niwitpong, 2020). In the RB prior of $(d_i', \mu_i, \sigma_i^2)$, the prior of $d_i'$ is a beta distribution. When this prior is combined with the likelihood in Eq. (9), the resulting posterior of $(d_i', \mu_i, \sigma_i^2)$ differs from that obtained under the PMB prior, and the marginal posteriors are denoted analogously. The posterior of $\lambda_{ik}$ is again obtained as the difference between the posteriors of $T_i$ and $T_k$. According to Ganesh (2009), the 100(1 − α)% Bayesian-based SCI with the RB prior for $\lambda_{ik}$ is constructed in the same way, with a critical value equal to the (1 − α) percentile of the distribution of the corresponding posterior-based statistic.
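The equal-tailed interval mentioned above can be illustrated with a minimal sketch: given draws from any posterior distribution, the interval endpoints are simply the α/2 and 1 − α/2 sample quantiles. The draws below come from a standard normal distribution purely as a stand-in; they are not draws from the ZILN posteriors used in the paper, and the function name is an assumption.

```python
import numpy as np

def equal_tailed_interval(draws, alpha=0.05):
    """Equal-tailed 100(1 - alpha)% interval from posterior draws."""
    draws = np.asarray(draws, dtype=float)
    lower, upper = np.quantile(draws, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lower), float(upper)

rng = np.random.default_rng(0)
# Stand-in draws (standard normal), NOT the posterior of lambda_ik.
posterior_draws = rng.normal(loc=0.0, scale=1.0, size=20_000)
lower_95, upper_95 = equal_tailed_interval(posterior_draws, alpha=0.05)
print(lower_95, upper_95)
```

By construction, the same posterior mass (α/2) lies below the lower endpoint and above the upper endpoint, which is what distinguishes this interval from the highest posterior density interval.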

The GPQ approach

Motivated by Wu & Hsieh (2014), the GPQ of $d_i$ is formulated using the arcsine square-root variance-stabilizing transformation. Moreover, the GPQs for $(\mu_i, \sigma_i^2)$ are also obtained from a transformation of the normal approximation by using the central limit theorem (Tian, 2005; Hasan & Krishnamoorthy, 2017). The GPQ for $T_i$ is obtained by substituting the GPQs of $d_i$, $\mu_i$ and $\sigma_i^2$ into the expression for $T_i$; the random variables involved are independent and follow standard normal, normal and chi-square distributions, respectively. Thus, the corresponding GPQ of $\lambda_{ik}$ can be expressed as the difference between the GPQs of $T_i$ and $T_k$. Therefore, the 100(1 − α)% SCI for $\lambda_{ik}$ based on the GPQ approach is constructed around $\hat{\lambda}_{ik}$ using the estimated variance of $\hat{\lambda}_{ik}$ and a critical value equal to the (1 − α) percentile of the distribution of $Q^{GPQ}$. In agreement with Hannig et al. (2006) and Kharrati-Kopaei & Eftekhar (2017), the asymptotic coverage probability of the SCI for $\lambda_{ik}$ based on the GPQ is slightly modified from that in Maneerat, Niwitpong & Niwitpong (2021b) (see the proof of Theorem 1 in the Appendix).

Theorem 1. Let $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})$ be a random sample from a ZILN model for group $i$. For $Y_{ij} = 0$, $n_{i(0)}$ is binomially distributed with the proportion of zero inflation $d_i = E(n_{i(0)}/n_i)$. For $Y_{ij} > 0$, $\ln Y_{ij}$ is normally distributed with mean $\mu_i = E(\ln Y_{ij})$ and variance $\sigma_i^2$. Moreover, let $\lambda_{ik} = T_i - T_k$, where $T_i$ from group $i$ is the log-transformed variance of the ZILN model. Given $y_i = (y_{i1}, y_{i2}, \ldots, y_{in_i})$, let $\widehat{\mathrm{Var}}(\hat{\lambda}_{ik})$ be an approximated variance of $\hat{\lambda}_{ik} = \hat{T}_i - \hat{T}_k$, where $(\hat{T}_i, \hat{T}_k)$ are the estimates of $(T_i, T_k)$. Suppose that the sample-size ratios converge to limits in (0, 1) as the sample sizes tend to infinity. It then follows that the asymptotic coverage probability of the 100(1 − α)% SCI for $\lambda_{ik}$ based on the GPQ approach is $1 - \alpha$ for $\forall i \neq k$ and $i, k = 1, \ldots, h$.

The PB approach

Here, we assume that the data come from a known distribution with unknown parameters, which are estimated by using samples simulated from the estimated distribution. In the present study, the PB approach is adjusted to suit our particular situation. Let the observed values of $\hat{d}_i$, $\hat{\mu}_i$ and $\hat{\sigma}_i^2$ represent the estimated values of the parameters $d_i$, $\mu_i$ and $\sigma_i^2$, respectively. Thus, we can obtain the empirical distribution of $T_i$ based on the PB approach. In accordance with Sadooghi-Alvandi & Malekzadeh (2014), the respective sampling distributions of $(\hat{d}_i, \hat{\mu}_i, \hat{\sigma}_i^2)$ are expressed in terms of independent random variables with standard normal and chi-square distributions. The PB variable-based pivotal quantity is formed from these quantities, and its distribution is obtained by replacing the parameters with the observed values from the samples. Hence, the 100(1 − α)% SCI for $\lambda_{ik}$ based on the PB approach is constructed around $\hat{\lambda}_{ik}$ using the estimated variance of $\hat{\lambda}_{ik}$ and a critical value equal to the (1 − α) percentile of the distribution of $M^{PB}$. Theorem 2 shows the asymptotic coverage probability of the 100(1 − α)% SCI for $\lambda_{ik}$ based on the PB approach (see the proof in the Appendix).

Theorem 2. Suppose that $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})$ comprise an iid random vector from a ZILN model based on $n_i$ observations from population group $i$. Let $\hat{\lambda}_{ik} = \hat{T}_i - \hat{T}_k$ be the estimate of $\lambda_{ik}$, where $\hat{T}_i$ and $\hat{T}_k$ are the approximately log-transformed variances from the $i$th and $k$th population groups, respectively. Hence, the SCI based on the PB approach has asymptotic coverage probability $1 - \alpha$, where $\widehat{\mathrm{Var}}(\hat{\lambda}_{ik})$ is the estimated variance of $\hat{\lambda}_{ik}$; $\forall i \neq k$ and $i, k = 1, 2, \ldots, h$.
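To make the general idea of a parametric bootstrap SCI concrete, here is a simplified sketch. It is deliberately not the pivotal quantity of the paper: instead of the authors' standardized statistic, it refits the ZILN model to samples simulated from the fitted model and uses the (1 − α) percentile of the maximum absolute deviation of the bootstrap log-ratios from the point estimates as a common half-width. All function names and the demonstration parameters are assumptions made for illustration.

```python
import numpy as np

def fit_ziln(y):
    """MLEs (d_hat, mu_hat, sigma2_hat) of a ZILN model for one sample."""
    y = np.asarray(y, dtype=float)
    logs = np.log(y[y > 0])
    return 1.0 - logs.size / y.size, logs.mean(), logs.var()

def log_variance(d, mu, sigma2):
    """Approximate log-variance: ln(1 - d) + 2 (mu + sigma2)."""
    return np.log(1.0 - d) + 2.0 * (mu + sigma2)

def simulate_ziln(n, d, mu, sigma2, rng):
    """Draw n ZILN values; redraw if fewer than two are non-zero."""
    while True:
        positive = rng.lognormal(mu, np.sqrt(sigma2), n)
        y = np.where(rng.random(n) < d, 0.0, positive)
        if np.count_nonzero(y) >= 2:
            return y

def pb_sci(samples, alpha=0.05, B=2000, seed=0):
    """Simplified (unstandardized) parametric-bootstrap SCIs for T_i - T_k."""
    rng = np.random.default_rng(seed)
    fits = [fit_ziln(y) for y in samples]
    T_hat = np.array([log_variance(*f) for f in fits])
    h = len(samples)
    pairs = [(i, k) for i in range(h) for k in range(i + 1, h)]
    max_dev = np.empty(B)
    for b in range(B):
        # Refit each group on data simulated from its fitted model.
        T_star = np.array([
            log_variance(*fit_ziln(simulate_ziln(len(y), *f, rng)))
            for y, f in zip(samples, fits)
        ])
        max_dev[b] = max(
            abs((T_star[i] - T_star[k]) - (T_hat[i] - T_hat[k])) for i, k in pairs
        )
    q = np.quantile(max_dev, 1.0 - alpha)   # common critical value
    return {(i, k): (T_hat[i] - T_hat[k] - q, T_hat[i] - T_hat[k] + q)
            for i, k in pairs}

# Hypothetical demonstration data: three groups of 50 observations each.
rng = np.random.default_rng(1)
demo = [simulate_ziln(50, d, 1.0, 2.0, rng) for d in (0.1, 0.2, 0.3)]
intervals = pb_sci(demo, B=300)
for pair, (lower, upper) in intervals.items():
    print(pair, round(lower, 3), round(upper, 3))
```

Because the critical value is taken from the maximum over all pairs, the intervals hold simultaneously in the bootstrap world; a version closer to the paper would additionally scale each deviation by the estimated standard deviation of the corresponding log-ratio.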

Simulation studies and results

Simulation studies were conducted to assess the performances of the SCIs based on the Bayesian, GPQ, and PB approaches for all pairwise ratios of variances of several ZILN distributions: Bayesian SCIs based on PMB and RB priors (Maneerat, Niwitpong & Niwitpong, 2020), the GPQ-based SCI (Wu & Hsieh, 2014), and the PB-based SCI (Sadooghi-Alvandi & Malekzadeh, 2014; Li, Song & Shi, 2015; Kharrati-Kopaei & Eftekhar, 2017). The CRs, LERs, UERs, and AWs of the SCIs were determined when the number of population groups (h) was fixed at 3 and 5; the optimal values of CR, LER, UER, and AW are 95%, 2.5%, 2.5% and 0, respectively, which were used to judge the best-performing SCI. The critical values for the Bayesian SCIs based on PMB and RB priors, GPQ and PB were also assessed. Throughout the simulation studies, the simulation procedure to estimate the CRs, LERs, and UERs was as follows:

(i) Generate random samples $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})$ from the ZILN model with parameters $(d_i, \mu_i, \sigma_i^2)$, and compute the required estimates; $i = 1, 2, \ldots, h$, from the samples.
(ii) Compute the critical values for each method using 2500 Monte Carlo simulations.
(iii) Apply the SCIs based on the Bayesian approach with PMB and RB priors, GPQ, and PB approaches given in Eqs. (14), (18), (21) and (31), respectively, and record whether or not the values of ($\lambda_{ik}$; $i \neq k$) fall within their corresponding confidence intervals.
(iv) Repeat steps (i)-(iii) M = 5000 times.
(v) For each method, obtain the number of times that all ($\lambda_{ik}$; $i \neq k$) are in their corresponding SCIs to estimate the CR, and obtain the number of times that ($\lambda_{ik}$; $i \neq k$) are less than or greater than their corresponding SCIs to estimate the LER and UER, respectively.

For the three-group comparison, the following parameter combinations were used: large variances $(\sigma_1^2, \sigma_2^2, \sigma_3^2) = (3, 5, 7)$; small (30, 30, 30), moderate (50, 50, 50), large [(100, 100, 100) and (100, 100, 200)], small-to-large (30, 50, 100) and medium-to-large (50, 100, 200) sample sizes; and zero-inflation percentages of (10, 20, 30), (10, 30, 50) and (30, 50, 50).
For the five-group comparison, the following parameter combinations were used: large variances $(\sigma_1^2, \sigma_2^2, \sigma_3^2, \sigma_4^2, \sigma_5^2) = (1, 1, 2, 2, 3)$; small-to-large (30, 50, 50, 100, 200), medium-to-large (50, 50, 50, 100, 100), and large (70, 100, 100, 200, 200) sample sizes; and zero-inflation percentages of (10, 10, 20, 20, 20), (20, 20, 30, 30, 50), (20, 30, 50, 50, 70) and (50, 50, 50, 70, 70). The results are reported in Table 1.
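Step (i) of the procedure above, generating ZILN samples, can be sketched as follows for the smallest three-group setting. The generator name `rziln` is hypothetical, and the value of μ is an assumption because the simulated means are not stated in the text; the variances, sample sizes, and zero-inflation percentages are taken from the three-group design.

```python
import numpy as np

def rziln(n, d, mu, sigma2, rng):
    """Draw n ZILN values: zero with probability d, otherwise lognormal(mu, sigma2)."""
    is_zero = rng.random(n) < d
    positive = rng.lognormal(mean=mu, sigma=np.sqrt(sigma2), size=n)
    return np.where(is_zero, 0.0, positive)

rng = np.random.default_rng(2021)
sizes = (30, 30, 30)            # small sample sizes from the three-group design
zero_probs = (0.10, 0.20, 0.30) # zero-inflation percentages (10, 20, 30)
sigma2s = (3.0, 5.0, 7.0)       # variances (3, 5, 7)
mu = 0.0                        # ASSUMPTION: the simulated mean is not given in the text

samples = [rziln(n, d, mu, s2, rng) for n, d, s2 in zip(sizes, zero_probs, sigma2s)]
for y in samples:
    print(len(y), int(np.sum(y == 0)))  # sample size and number of zeros
```

Repeating this generation M times and applying each SCI to every replicate would give the Monte Carlo estimates of the coverage and error rates described in steps (iii) to (v).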
Table 1

Performance measures of the SCIs based on the different approaches.

n_i                   d_i (%)           B-PMB                  B-RB                   GPQ                    PB                     AW
                                        LER    CR      UER     LER    CR      UER     LER    CR      UER     LER    CR      UER     B-PMB    B-RB     GPQ      PB
3 sample groups and $(\sigma_1^2, \sigma_2^2, \sigma_3^2) = (3, 5, 7)$
(30^3)                (10,20,30)        1.993  97.973  0.033   1.307  98.693  0.000   0.707  99.293  0.000   2.460  97.540  0.000   22.961   25.493   27.764   22.942*
                      (10,30,50)        1.880  98.113  0.007   1.200  98.800  0.000   0.967  99.033  0.000   2.900  97.100  0.000   28.702   33.323   33.300   27.254*
                      (30,50,50)        1.120  98.873  0.007   0.427  99.573  0.000   0.620  99.380  0.000   2.520  97.480  0.000   30.764   35.737   36.813   29.113*
(50^3)                (10,20,30)        2.833  96.800  0.367   2.300  97.567  0.133   1.107  98.887  0.007   2.347  97.627  0.027   15.521*  16.544   19.078   16.893
                      (10,30,50)        2.887  97.027  0.087   2.173  97.800  0.027   1.253  98.747  0.000   2.607  97.393  0.000   18.848*  20.654   22.403   19.733
                      (30,50,50)        2.087  97.840  0.073   1.413  98.567  0.020   0.973  99.027  0.000   2.320  97.673  0.007   20.104*  21.996   24.567   21.096
(100^3)               (10,20,30)        3.480  95.140  1.380   3.200  95.693  1.107   1.273  98.527  0.200   1.960  97.767  0.273   10.015*  10.325   12.448   11.681
                      (10,30,50)        3.660  95.627  0.713   3.200  96.420  0.380   1.427  98.540  0.033   2.087  97.833  0.080   11.866*  12.410   14.327   13.422
                      (30,50,50)        3.220  96.040  0.740   2.780  96.747  0.473   1.167  98.813  0.020   2.073  97.853  0.073   12.389*  12.948   15.408   14.202
(30, 50, 100)         (10,20,30)        1.787  96.753  1.460   1.367  97.453  1.180   0.380  99.480  0.140   1.127  98.467  0.407   12.846*  13.402   16.552   14.152
                      (10,30,50)        1.853  97.127  1.020   1.387  97.993  0.620   0.420  99.553  0.027   1.420  98.353  0.227   14.348*  15.042   18.368   15.604
                      (30,50,50)        1.013  97.947  1.040   0.547  98.687  0.767   0.260  99.653  0.087   1.053  98.627  0.320   16.343*  17.452   20.826   17.181
(50, 100, 200)        (10,20,30)        2.580  94.773  2.647   2.247  95.293  2.460   0.467  99.047  0.487   0.847  98.307  0.847   8.637*   8.822    11.261   10.230
                      (10,30,50)        2.847  95.073  2.080   2.593  95.560  1.847   0.667  99.093  0.240   1.313  98.140  0.547   9.522*   9.725    12.334   11.166
                      (30,50,50)        2.173  95.880  1.947   1.793  96.533  1.673   0.380  99.380  0.240   1.020  98.460  0.520   10.618*  10.939   13.751   12.189
(100^2, 200)          (10,20,30)        3.253  94.213  2.533   2.953  94.693  2.353   0.967  98.673  0.360   1.507  97.920  0.573   7.952*   8.090    10.266   9.647
                      (10,30,50)        2.940  95.013  2.047   2.620  95.533  1.847   0.980  98.793  0.227   1.460  98.127  0.413   8.985*   9.184    11.489   10.773
                      (30,50,50)        2.567  95.387  2.047   2.227  96.007  1.767   0.900  98.893  0.207   1.547  98.047  0.407   9.888*   10.197   12.666   11.709
5 sample groups and $(\sigma_1^2, \sigma_2^2, \sigma_3^2, \sigma_4^2, \sigma_5^2) = (1, 1, 2, 2, 3)$
(30, 50^2, 100, 200)  (10,10,20,20,20)  0.326  99.504  0.170   0.232  99.626  0.142   0.344  99.568  0.088   0.756  99.002  0.242   6.224    6.471    6.310    5.600*
                      (20,20,30,30,50)  0.244  99.620  0.136   0.154  99.754  0.092   0.244  99.694  0.062   0.666  99.164  0.170   6.952    7.250    7.067    6.201*
                      (20,30,50,50,70)  0.154  99.738  0.108   0.092  99.828  0.080   0.322  99.642  0.036   0.788  99.084  0.128   8.510    8.971    8.513    7.369*
                      (50,50,50,70,70)  0.062  99.882  0.056   0.026  99.942  0.032   0.116  99.872  0.012   0.426  99.490  0.084   9.572    10.226   9.861    8.223*
(50^3, 100^2)         (10,10,20,20,20)  0.398  99.504  0.098   0.338  99.582  0.080   0.558  99.414  0.028   1.122  98.788  0.090   6.614    6.826    6.557    5.914*
                      (20,20,30,30,50)  0.392  99.512  0.096   0.312  99.618  0.070   0.526  99.448  0.026   1.092  98.810  0.098   7.791    8.100    7.567    6.768*
                      (20,30,50,50,70)  0.358  99.618  0.024   0.244  99.748  0.008   0.582  99.398  0.020   1.196  98.754  0.050   10.067   10.737   9.488    8.354*
                      (50,50,50,70,70)  0.204  99.766  0.030   0.136  99.850  0.014   0.254  99.746  0.000   0.822  99.166  0.012   10.687   11.352   10.571   9.039*
(70, 100^2, 200^2)    (10,10,20,20,20)  0.784  99.038  0.178   0.710  99.140  0.150   0.810  99.120  0.080   1.232  98.640  0.128   4.499    4.565    4.507    4.237*
                      (20,20,30,30,50)  0.666  99.174  0.160   0.580  99.280  0.140   0.620  99.310  0.070   1.058  98.826  0.116   5.218    5.321    5.116    4.783*
                      (20,30,50,50,70)  0.620  99.290  0.090   0.550  99.380  0.060   0.750  99.200  0.060   1.158  98.744  0.098   6.546    6.743    6.202    5.753*
                      (50,50,50,70,70)  0.374  99.548  0.078   0.310  99.630  0.060   0.370  99.600  0.030   0.680  99.258  0.062   6.938    7.139    6.892    6.249*

Notes.

Note: (100^3, 200^2) = (100, 100, 100, 200, 200). An asterisk (*) marks the best-performing method (set in bold in the original table).

For h = 3 with large variance, Table 1 and Fig. 1 reveal that all of the methods provided CRs close to or greater than the nominal confidence level (95%). Meanwhile, the SCIs based on the Bayesian approach with the PMB prior and on the GPQ approach maintained a good balance between LER and UER. Importantly, the AW of PB was narrower than those of the other methods for small sample sizes, while those of the Bayesian approach based on the PMB prior were slightly narrower than the others for the other sample sizes. For the five-group comparison (h = 5; Table 1 and Fig. 2), the PB approach provided the best CRs and narrowest AWs for all scenarios tested.
Figure 1

The CR and AW performance measures for three sample groups: (A) CR (B) AW.

Figure 2

The CR and AW performance measures for five sample groups: (A) CR (B) AW.

An empirical application of the four methods to daily precipitation data

Daily precipitation records comprise publicly available data from the Thailand Meteorological Department (Department, 2021). Flash floods, landslides, and windstorms caused by heavy rainfall occurred in four provinces in the lower southern area of Thailand, namely Songkhla, Yala, Narathiwat, and Pattani, during January 2021, as reported by Thailand’s Department of Disaster Prevention and Mitigation (Thailand, 2021). According to the automatic weather system (Department, 2021), Songkhla has two weather stations, in the Songkhla and Sadao districts, which means that we could simultaneously estimate variations in precipitation at five weather stations. Daily precipitation data from December 2020 to January 2021 (Table 2) were used in the analysis. Figure 3 shows histograms along with normal quantile–quantile (Q-Q), cumulative distribution function (CDF) and probability-probability (P-P) plots. Furthermore, the Akaike information criterion (AIC) and Bayesian information criterion (BIC) values of five models (normal, logistic, lognormal, exponential, and Cauchy) fitted to the non-zero precipitation data were compared to check the appropriateness of each model (Table 3). The AIC and BIC values for the lognormal model were the lowest at every station, and thus it provided the best fit. The data from all of the stations were also zero-inflated, thereby supporting the assumptions for ZILN.
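The model comparison described above rests on the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of fitted parameters and n the number of non-zero records. The sketch below is an illustration under assumptions, not the authors' code: the helper names are hypothetical, and the short data vector is only the first few non-zero Songkhla values from Table 2, so its output is not comparable with Table 3.

```python
import math

def aic(loglik, k):
    """Akaike information criterion."""
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2.0 * loglik

def lognormal_loglik(x):
    """Maximized lognormal log-likelihood for strictly positive data."""
    logs = [math.log(v) for v in x]
    n = len(logs)
    mu = sum(logs) / n
    s2 = sum((w - mu) ** 2 for w in logs) / n   # MLE of the log-scale variance
    # Sum over observations of: -ln x - 0.5 ln(2 pi s2) - (ln x - mu)^2 / (2 s2)
    return -0.5 * n * (math.log(2.0 * math.pi * s2) + 1.0) - sum(logs)

# Illustrative subset only: first non-zero Songkhla records from Table 2.
subset = [160.0, 14.6, 20.8, 8.8, 0.2, 52.0, 39.4]
ll = lognormal_loglik(subset)
print(aic(ll, k=2), bic(ll, k=2, n=len(subset)))

# For a fixed model, BIC - AIC depends only on k and n: k ln n - 2k.
gap_two_params = bic(0.0, 2, 39) - aic(0.0, 2)   # n = 39 non-zero Songkhla records
print(round(gap_two_params, 3))
```

This gap gives a quick consistency check on Table 3: for Songkhla, with 39 non-zero records, the two-parameter models should differ by 2 ln 39 − 4 between BIC and AIC, which matches the reported 390.938 − 387.611 for the normal model up to rounding.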
Table 2

Daily precipitation data in five stations of southern Thailand.

      Weather stations: December 2020                        Weather stations: January 2021
Date  Songkhla  Songkhla-Sadao  Yala    Narathiwat  Pattani  Songkhla  Songkhla-Sadao  Yala    Narathiwat  Pattani
1     160.0     56.4            46.4    38.6        82.0     0.8       4.2             6.6     31.2        0.8
2     14.6      85.8            46.6    70.0        0.0      1.4       8.2             5.6     6.4         2.0
3     20.8      4.2             55.8    74.2        0.0      2.6       42.6            49.6    38.6        49.8
4     8.8       0.2             27.0    0.4         7.2      21.4      8.4             28.6    10.4        4.4
5     0.0       0.0             0.2     0.0         0.0      9.2       70.2            137.8   62.8        49.0
6     0.0       0.0             0.0     0.0         0.2      0.2       2.8             84.8    13.2        0.2
7     0.0       0.0             0.0     0.0         0.0      0.0       0.0             1.8     9.2         2.8
8     0.2       0.0             1.6     0.0         0.0      0.4       0.0             0.4     0.0         1.2
9     52.0      0.0             0.0     0.0         0.0      0.8       0.0             0.0     1.4         0.0
10    39.4      0.0             0.0     0.8         3.6      29.0      15.6            2.8     12.6        22.8
11    0.6       0.0             2.8     9.2         9.8      23.0      0.6             0.2     0.2         0.0
12    12.2      4.2             17.2    0.0         8.0      5.0       0.2             0.6     3.6         1.2
13    5.4       37.2            2.0     8.2         12.8     0.0       0.0             2.4     3.0         1.0
14    9.4       0.0             0.0     0.0         3.4      5.4       0.0             0.0     0.0         0.0
15    7.0       2.4             12.4    78.4        7.2      1.8       0.0             0.0     0.0         0.0
16    19.2      25.6            43.8    43.0        62.8     0.8       0.0             0.0     0.0         0.0
17    84.4      97.4            126.4   162.0       164.8    0.0       0.0             0.0     0.0         0.0
18    97.2      9.2             113.8   141.2       46.4     0.0       0.0             0.0     1.2         0.0
19    92.0      19.2            39.8    43.6        26.2     0.0       0.0             0.0     0.0         0.0
20    19.8      7.2             27.8    20.4        7.0      0.0       0.0             0.0     0.0         0.0
21    5.4       0.4             0.0     0.2         3.4      0.0       0.0             0.0     0.0         0.0
22    0.0       0.0             1.2     1.0         3.4      0.0       0.0             0.0     0.0         0.0
23    23.8      0.0             31.0    61.4        12.6     0.0       0.0             0.0     0.0         0.0
24    23.4      0.0             19.6    6.6         0.0      0.0       0.0             2.2     0.0         0.0
25    2.2       0.0             46.6    39.8        6.8      0.0       0.0             0.0     0.0         0.0
26    1.0       10.0            27.6    84.0        2.8      0.0       0.0             0.0     0.0         0.0
27    0.0       0.0             1.0     0.0         0.2      0.0       0.0             2.0     0.2         0.0
28    0.0       0.0             0.0     0.0         0.0      0.0       0.0             0.0     0.0         0.0
29    0.0       0.0             0.0     0.0         0.0      0.0       0.0             0.0     0.0         0.0
30    0.0       0.0             0.0     0.0         0.0      4.4       0.0             0.0     0.0         0.0
31    6.2       0.4             11.2    89.2        3.2      9.6       2.0             3.0     0.4         1.6

Notes.

Source: Thailand Meteorological Department Automatic Weather System.

http://www.aws-observation.tmd.go.th/web/climate/climate_past.asp.

Figure 3

Histogram, normal Q-Q, CDF and P-P plots of nonzero precipitation records in five stations of southern Thailand: (A) Songkhla (B) Songkhla-Sadao (C) Yala (D) Narathiwat (E) Pattani.

Table 3

The AIC and BIC results for five associated models.

Stations                 Criterion  Normal   Lognormal  Logistic  Exponential  Cauchy
Songkhla                 AIC        387.611  305.171    373.337   317.644      345.549
                         BIC        390.938  308.498    376.664   319.308      348.876
Songkhla-Sadao district  AIC        241.141  196.707    238.534   203.226      225.198
                         BIC        243.579  199.145    240.971   204.445      227.635
Yala                     AIC        373.538  313.718    368.171   322.168      365.426
                         BIC        376.760  316.940    371.393   323.779      368.648
Narathiwat               AIC        362.209  310.600    359.299   317.455      358.947
                         BIC        365.320  313.711    362.410   319.010      362.058
Pattani                  AIC        328.067  242.474    313.959   260.584      273.318
                         BIC        331.060  245.467    316.952   262.080      276.311
Table 4 shows that the variance $\sigma_i^2$ was greater than the mean $\mu_i$ at every station, indicating quite large precipitation variation in the data used in the present study. When the four methods were applied to the daily precipitation data, the 95% SCIs based on the Bayesian, GPQ and PB approaches for all pairwise comparisons among the five weather stations covered their point estimates (Table 5). In agreement with the simulation results for n1 = n2 = n3 = 50 and n4 = n5 = 100, the PB approach provided the best SCI performance for the ratios of variances of several ZILN models. The estimates indicate that Narathiwat had the highest variation in precipitation, followed by Yala. These results are in line with the Asia Disaster Monitoring and Response System (Thailand, 2021), which reported that both areas were affected by flooding and landslides damaging 22,308 households in Narathiwat and 12,082 households in Yala during the time period covered by the data used in the study.
Table 4

Summary statistics for five stations.

Weather stations         i  n_i(1) (non-zero)  n_i(0) (zero)  $\hat{d}_i$ (%)  $\hat{\mu}_i$  $\sigma_i^2$  $\hat{\lambda}_i$
Songkhla                 1  39                 23             37.097           1.909          2.982         9.317
Songkhla-Sadao district  2  25                 37             59.677           1.828          3.509         9.766
Yala                     3  37                 25             40.323           2.155          3.490         10.774
Narathiwat               4  35                 27             43.548           2.253          4.238         12.411
Pattani                  5  33                 29             46.774           1.669          2.950         8.607
Table 5

95% SCIs of all pairwise log-ratios of precipitation variabilities among five weather stations in lower southern Thailand.

Methods                            Limits          Songkhla/Songkhla-Sadao  Songkhla/Yala  Songkhla/Narathiwat  Songkhla/Pattani  Songkhla-Sadao/Yala
                                   Point estimate  −0.4489                  −1.4568        −3.0939              0.71043           −1.0079
Bayesian SCIs based on PMB prior   Lower           −8.7881                  −9.796         −11.4331             −7.6287           −9.3471
                                   Upper           7.8903                   6.8824         5.2452               9.0496            7.3313
                                   Width           16.6783                  16.6783        16.6783              16.6783           16.6783
Bayesian SCIs based on RB prior    Lower           −9.4711                  −10.479        −12.1161             −8.3117           −10.0301
                                   Upper           8.5733                   7.5654         5.9283               9.7326            8.0143
                                   Width           18.0444                  18.0444        18.0444              18.0444           18.0444
SCI based on GPQ                   Lower           −9.3037                  −9.2166        −11.9695             −6.6362           −10.4292
                                   Upper           8.4059                   6.303          5.7816               8.0571            8.4134
                                   Width           17.7096                  15.5196        17.7511              14.6932           18.8426
SCI based on PB                    Lower           −7.4257                  −7.5709        −10.0871             −5.0781           −8.4311
                                   Upper           6.5279                   4.6573         3.8992               6.4989            6.4153
                                   Width           13.9536                  12.2281        13.9863              11.577            14.8464

Methods                            Limits          Songkhla-Sadao/Narathiwat  Songkhla-Sadao/Pattani  Yala/Narathiwat  Yala/Pattani  Narathiwat/Pattani
                                   Point estimate  −2.645                     1.1593                  −1.6371          2.1672        3.8043
Bayesian SCIs based on PMB prior   Lower           −10.9842                   −7.1798                 −9.9763          −6.1719       −4.5348
                                   Upper           5.6941                     9.4985                  6.702            10.5064       12.1435
                                   Width           16.6783                    16.6783                 16.6783          16.6783       16.6783
Bayesian SCIs based on RB prior    Lower           −11.6672                   −7.8629                 −10.6593         −6.855        −5.2178
                                   Upper           6.3771                     10.1815                 7.385            11.1894       12.8266
                                   Width           18.0444                    18.0444                 18.0444          18.0444       18.0444
SCI based on GPQ                   Lower           −13.0047                   −7.9247                 −11.078          −5.8532       −5.2999
                                   Upper           7.7146                     10.2433                 7.8037           10.1876       12.9086
                                   Width           20.7193                    18.168                  18.8817          16.0408       18.2085
SCI based on PB                    Lower           −10.8075                   −5.9981                 −9.0757          −4.1522       −3.369
                                   Upper           5.5175                     8.3168                  5.8014           8.4866        10.9777
                                   Width           16.325                     14.3149                 14.8771          12.6388       14.3467

Discussion

From the above numerical results, it can be seen that the SCIs based on PB and the Bayesian approach based on the PMB prior dealt with large variations in the data better than the other approaches. The PB-based SCI has some strong points for small sample sizes because random samples are obtained via bootstrap resampling. Furthermore, the performance of the Bayesian SCI based on the PMB prior declined as the number of populations increased and the sample size decreased. Although the GPQ method provided appropriate CRs, its AWs were wider than those of the other methods, possibly because the GPQ of d is limited for cases with unequal zero-inflated percentages, whereas it has performed quite well for a single population group (Wu & Hsieh, 2014; Maneerat, Niwitpong & Niwitpong, 2021a). Further research could be conducted to explore subjective or prior beliefs about parameters when using the Bayesian approach for parameter estimation.

Conclusions

SCIs for the comparison of the variance ratios among several ZILN models were formulated by applying Bayesian approaches based on the PMB and RB priors, along with the GPQ and PB approaches. In practice, the daily precipitation data for each of the weather stations considered were overdispersed (i.e., the variance was greater than the mean) and zero-inflated (Table 4). Thus, the ZILN distribution is an appropriate model for estimating parameters in the construction of SCIs for multiple comparisons between their variances. For three populations, all of the methods produced 95% SCIs for all pairwise comparisons among variances covering the true parameter. Meanwhile, the SCI constructed via the Bayesian approach based on the PMB prior maintained a good balance between LER and UER and provided the narrowest AWs except for small sample sizes. On the other hand, the PB-based SCI could handle extreme cases when the sample sizes were small with large variances. For five populations, the PB-based SCI performed the best overall, with the Bayesian approach based on the RB prior for small-to-large sample sizes and the GPQ approach for medium-to-large and large sample sizes providing acceptable results, and thus can be recommended as alternative SCIs.

1.  Inferences on the mean of zero-inflated lognormal data: the generalized variable approach.

Authors:  Lili Tian
Journal:  Stat Med       Date:  2005-10-30       Impact factor: 2.373

2.  Comparing the means and variances of a bivariate log-normal distribution.

Authors:  Ionut Bebu; Thomas Mathew
Journal:  Stat Med       Date:  2008-06-30       Impact factor: 2.373

3.  Bayesian confidence intervals for the difference between variances of delta-lognormal distributions.

Authors:  Patcharee Maneerat; Sa-Aat Niwitpong; Suparat Niwitpong
Journal:  Biom J       Date:  2020-06-22       Impact factor: 2.207

4.  Simultaneous confidence intervals for all pairwise comparisons of the means of delta-lognormal distributions with application to rainfall data.

Authors:  Patcharee Maneerat; Sa-Aat Niwitpong; Suparat Niwitpong
Journal:  PLoS One       Date:  2021-07-06       Impact factor: 3.240
