
Improved estimation in negative binomial regression.

Euloge Clovis Kenne Pagui, Alessandra Salvan, Nicola Sartori

Abstract

Negative binomial regression is commonly employed to analyze overdispersed count data. With small to moderate sample sizes, the maximum likelihood estimator of the dispersion parameter may be subject to a significant bias, which in turn affects inference on the mean parameters. This article proposes inference for negative binomial regression based on adjustments of the score function aimed at mean or median bias reduction. The resulting estimating equations generalize those available for improved inference in generalized linear models and can be solved using a suitable extension of iterative weighted least squares. Simulation studies confirm the good properties of the new methods, which are also found to solve, in many cases, numerical problems of maximum likelihood estimation. The methods are illustrated and evaluated using two case studies: an Ames salmonella assay data set and data on epileptic seizures. Inference based on adjusted scores turns out to generally improve on maximum likelihood, and even on explicit bias correction, with median bias reduction being overall preferable.
© 2022 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

Keywords:  adjusted score; iterative weighted least squares; maximum likelihood; mean and median bias reduction; parameterization invariance

Year:  2022        PMID: 35277866      PMCID: PMC9314673          DOI: 10.1002/sim.9361

Source DB:  PubMed          Journal:  Stat Med        ISSN: 0277-6715            Impact factor:   2.497


INTRODUCTION

Regression models for count data are employed in many contexts, especially in the social sciences, economics, biology, and epidemiology. It is not uncommon for empirical counts to display substantial overdispersion, and a popular modeling approach is negative binomial regression; see, for example, Section 7.3 of the book by Agresti and the monograph by Hilbe for recent accounts. Frequentist inference about mean and shape parameters in negative binomial regression is typically based on the likelihood, and this is the method of choice for standard software, such as the glm.nb function of the R package MASS. Maximum likelihood has been studied starting from Fisher and Anscombe for independent and identically distributed data and from Lawless for the regression setting. With moderate sample sizes, the maximum likelihood estimator of the shape parameter may be subject to a substantial bias, which can also affect the inferential conclusions about the regression coefficients. General improved estimation methods based on adjustments of the likelihood equations have been proposed starting from the contributions of Firth and Kenne Pagui et al, resulting in mean or median bias reduction, respectively. While mean bias reduction yields an estimator with reduced bias, median bias reduction is such that each component of the estimator is, with high accuracy, median unbiased, that is, it has the same probability of underestimating and overestimating the corresponding parameter component. Mean bias reduction is invariant under linear transformations of the parameters, while median bias reduction is invariant under monotone component-wise parameter transformations.
Unlike traditional bias correction, which subtracts an estimate of the bias from the maximum likelihood estimate (see, for instance, Section 9.2 of Cox and Hinkley), both mean and median bias reduction methods do not rely on finiteness of the maximum likelihood estimate and have the advantage of solving practical issues related to boundary estimates, which can occur with positive probability in models for discrete data. Obtaining the quantities required for mean and median bias reduction, as well as developing efficient software, is not always straightforward, but is necessary in order to make these methods available to practitioners. A major effort has been devoted to generalized linear models, leading to the brglm2 package for the software R. Additional effort was needed for other specific models, such as beta regression or cumulative link models. Negative binomial regression does not fall into the generalized linear models class when the shape parameter is unknown, as is the case in practical applications. Therefore, due to its widespread use, it is of interest to provide the quantities required for mean and median bias reduction, together with an efficient implementation, and to assess whether the general theoretical properties of the methods produce appreciable improvements over standard maximum likelihood. Previous work in this direction includes the paper by Saha and Paul, who, for independent and identically distributed data, derived a bias corrected maximum likelihood estimator for the shape parameter and showed that it is preferable to other methods. The authors also give an example of bias correction involving negative binomial regression, although the expression of the correction is not provided.
In this article, we derive the adjusted score equations for mean and median bias reduction for negative binomial regression and show that they can be solved by iterative weighted least squares after an appropriate adjustment of the ordinary working variate, or adjusted dependent variable, for maximum likelihood. Moreover, the method is developed for various link functions and parameterizations of the shape parameter. An R implementation is given in the brnb function which has been added to the current version of the R package brglm2. Mean and median bias reduced estimators are compared with the maximum likelihood estimator through an extensive simulation experiment under different scenarios. Two case studies, the Ames salmonella reverse mutagenicity assay and the epileptic seizures data, are also considered and include comparison with other methods previously used for the same data, in particular bias correction. The results indicate that, overall, mean and median bias reduction are both preferable to standard likelihood inference, even after bias correction, especially with moderate sample sizes. Median bias reduction provides the best empirical coverage of Wald‐type confidence intervals for all parameters. Moreover, numerical problems that lead to unavailability of the maximum likelihood estimate, and therefore of its bias correction, occur more frequently than with mean or median bias reduction. In addition, traditional bias correction is seen to be less accurate than mean and median bias reduction when the number of covariates is large relative to the sample size. The rest of the article is organized as follows. In Section 2, we introduce the notation for the negative binomial regression model. In Section 3, we give the adjusted score functions for mean and median bias reduction, together with computational details. Sections 4 and 5 contain simulation results and case studies, respectively. A brief discussion is given in Section 6. 
The Supplementary Material contains additional figures and simulation results together with R code to reproduce the analyses in the article.

NEGATIVE BINOMIAL REGRESSION

Using Poisson regression when overdispersion is present typically leads to underestimation of standard errors of regression coefficients and therefore to potentially misleading inferential conclusions. Negative binomial regression allows overdispersion to be modeled by introducing a shape parameter $\kappa > 0$ in the variance specification, so that, for count mean response $\mu$, the inflated variance has the form $\mu + \kappa \mu^2$. Poisson regression is a limiting case as $\kappa$ approaches zero. Let $y_i$, $i = 1, \ldots, n$, be realizations of independent negative binomial random variables $Y_i$ with mean $\mu_i$ and variance $v_i = \mu_i + \kappa \mu_i^2$. The probability mass function of $Y_i$ is
$$\Pr(Y_i = y_i) = \frac{\Gamma(y_i + 1/\kappa)}{\Gamma(1/\kappa)\, y_i!}\, \frac{(\kappa \mu_i)^{y_i}}{(1 + \kappa \mu_i)^{y_i + 1/\kappa}}, \qquad y_i = 0, 1, 2, \ldots \qquad (1)$$
In a regression setting, we consider $\mu_i = g^{-1}(\eta_i)$, where $g^{-1}$ is the inverse of the link function and $\eta_i = x_i^\top \beta$ is the linear predictor, with $\beta$ a $p$-vector of regression coefficients and $x_i$ a $p$-vector of covariates, the $i$th row of the model matrix $X$. When an intercept is included in the linear predictor, $x_{i1} = 1$, $i = 1, \ldots, n$. The usual choice for the link function is $g(\mu) = \log \mu$. For the sake of generality, the derivation below is for a generic monotone reparameterization of $\kappa$, say $\psi = t(\kappa)$, with inverse $\kappa = t^{-1}(\psi)$ and derivative $t'(\kappa)$. Common choices are $\psi = \kappa$, $\psi = 1/\kappa$, or $\psi = \log \kappa$. Let $\theta = (\beta^\top, \psi)^\top$. Noting that, for any integer $y_i \geq 0$, $\log\{\Gamma(y_i + 1/\kappa)/\Gamma(1/\kappa)\} = \sum_{t=0}^{y_i - 1} \log(1 + \kappa t) - y_i \log \kappa$, the log likelihood is, up to an additive constant,
$$\ell(\theta) = \sum_{i=1}^{n} w_i \left\{ \sum_{t=0}^{y_i - 1} \log(1 + \kappa t) + y_i \log \mu_i - \left( y_i + \frac{1}{\kappa} \right) \log(1 + \kappa \mu_i) \right\},$$
where $w_i$ is a fixed weight for the $i$th observation, $i = 1, \ldots, n$, and the inner sum is zero when $y_i = 0$. Weights are typically equal to 1, but can be greater than 1 with grouped data. The score function has components $U_\beta = \partial \ell / \partial \beta$ and $U_\psi = \partial \ell / \partial \psi$ given by
$$U_\beta = X^\top W D^{-1} (y - \mu), \qquad U_\psi = \frac{1}{t'(\kappa)} \sum_{i=1}^{n} w_i \left\{ \sum_{t=0}^{y_i - 1} \frac{t}{1 + \kappa t} + \frac{1}{\kappa^2} \log(1 + \kappa \mu_i) - \frac{(y_i + 1/\kappa)\, \mu_i}{1 + \kappa \mu_i} \right\},$$
where $D$ is a diagonal matrix with diagonal elements $d_i = \partial \mu_i / \partial \eta_i$, $W$ is a diagonal matrix with diagonal elements $w_i d_i^2 / v_i$, $y = (y_1, \ldots, y_n)^\top$, and $\mu = (\mu_1, \ldots, \mu_n)^\top$. The expected information is block diagonal,
$$i(\theta) = \begin{pmatrix} X^\top W X & 0_p \\ 0_p^\top & i_{\psi\psi} \end{pmatrix},$$
where $0_p$ is a $p$-vector of zeros and $i_{\psi\psi} = i_{\kappa\kappa}/\{t'(\kappa)\}^2$, with $i_{\kappa\kappa} = E(-\partial^2 \ell / \partial \kappa^2)$ computable through the tail probabilities $\Pr(Y_i > t)$. The maximum likelihood estimate $\hat\theta = (\hat\beta^\top, \hat\psi)^\top$ is obtained as the solution of the equations $U_\beta = 0_p$ and $U_\psi = 0$, which can be solved using a Fisher scoring algorithm. Exploiting the orthogonality between $\beta$ and $\psi$, the current iterate $\psi^{(j+1)}$ is found by replacing the current iterate $\beta^{(j+1)}$ into the $j$th Fisher scoring iteration for $\psi$. The procedure is alternated until convergence.
With simple algebra, the $j$th iteration of the Fisher scoring algorithm for $\beta$ updates the current iterate $\beta^{(j)}$ providing
$$\beta^{(j+1)} = (X^\top W^{(j)} X)^{-1} X^\top W^{(j)} z^{(j)}, \qquad (2)$$
where the superscript $(j)$ indicates that the quantity is evaluated at $(\beta^{(j)}, \psi^{(j)})$ and $z$ is the vector with elements $z_i = \eta_i + (y_i - \mu_i)/d_i$, usually called the adjusted dependent variables or working variates. Equation (2) has the same form as the iterative weighted least squares (IWLS) iteration in generalized linear models.

MEAN AND MEDIAN BIAS REDUCTION

Bias of maximum likelihood estimators in small samples or with sparse data can result in a significant loss of accuracy of the related inferential procedures. An extensive amount of literature has focused on methods for reducing mean bias. A general classification separates explicit methods, also called bias correction, obtained by subtracting from the maximum likelihood estimate an estimate of its first-order bias, from implicit methods, also called bias reduction, obtained by modifying the score function. A unified review is presented by Kosmidis. See also Greenland et al for an expository discussion of sparse data bias and available remedies. The same classification also holds for methods aiming at improving other centering properties of the estimator, such as median centering. Generally, explicit methods are one-step approximations to the corresponding implicit methods, using the maximum likelihood estimate as a starting value. Therefore, they are less accurate and rely on the existence of the latter. We recall below mean and median bias reduction and obtain the relevant expressions for negative binomial regression. Consider a regular model with $d$-dimensional parameter $\theta$, log likelihood $\ell(\theta)$, score function $U(\theta)$, and expected information $i(\theta)$, the latter assumed in the following to be of order $n$. We let $\theta_r$ be a component of $\theta$, $r = 1, \ldots, d$, and $j(\theta)$ be the observed information. While the maximum likelihood estimate $\hat\theta$ is a solution of the equation $U(\theta) = 0$, the improved estimates proposed here are based on adjusted score equations having the general form $U(\theta) + A(\theta) = 0$, with $A(\theta)$ a model-dependent adjustment term of order $O(1)$ under repeated sampling. All the proposed adjustments involve the quantities (see Kosmidis and Firth for their first use)
$$P_r = E\{U(\theta)\, U(\theta)^\top U_r(\theta)\}, \qquad Q_r = -E\{j(\theta)\, U_r(\theta)\}, \qquad r = 1, \ldots, d.$$
In particular, Firth showed that the leading term of order $O(n^{-1})$ of the bias of the maximum likelihood estimator is reduced to order $O(n^{-2})$ with $A(\theta) = A^*(\theta)$, where $A^*(\theta)$ has elements
$$A^*_r = \frac{1}{2}\, \mathrm{tr}\{i(\theta)^{-1} (P_r + Q_r)\}, \qquad r = 1, \ldots, d,$$
with $\mathrm{tr}$ the trace operator. We let $U^*(\theta) = U(\theta) + A^*(\theta)$ and we denote by $\theta^*$ the corresponding estimator, solution of $U^*(\theta) = 0$.
The bias corrected maximum likelihood estimator, see, for example, Section 9.2 of Cox and Hinkley and Section 5.3 of Barndorff-Nielsen and Cox, is given by $\tilde\theta = \hat\theta - b(\hat\theta)$, where $b(\theta)$ is the term of order $O(n^{-1})$ of the bias of $\hat\theta$ and is equal to $-i(\theta)^{-1} A^*(\theta)$. Also $\tilde\theta$ has bias of order $O(n^{-2})$, although the availability of $\tilde\theta$ relies on the existence of $\hat\theta$. Both bias reduction and bias correction are tied to a specific parameterization. This means that if $\phi(\theta)$ is a nonlinear reparameterization of $\theta$, the transformed estimator $\phi(\theta^*)$ or $\phi(\tilde\theta)$ will not have reduced bias of order $O(n^{-2})$. Equivariance under nonlinear componentwise reparameterizations is obtained with median bias reduction of Kenne Pagui et al, leading to the estimator $\theta^\dagger$ satisfying, in the continuous case, the improved median centering property $\Pr(\theta^\dagger_r \leq \theta_r) = 1/2 + O(n^{-3/2})$, $r = 1, \ldots, d$, in contrast with the corresponding order of error $O(n^{-1/2})$ for the maximum likelihood estimator. In more detail, the adjusted score for median bias reduction, as given in formula (10) of the cited paper, has the form $U(\theta) + A^\dagger(\theta)$, with $A^\dagger(\theta)$ involving, in addition to the quantities $P_r$ and $Q_r$, a vector whose expression is given in the Appendix. The median bias reduced estimator $\theta^\dagger$ is obtained as a solution of $U(\theta) + A^\dagger(\theta) = 0$. Since both $A^*(\theta)$ and $A^\dagger(\theta)$ are of order $O(1)$, $\theta^*$ and $\theta^\dagger$ have the same asymptotic normal distribution as the maximum likelihood estimator. In practice, standard errors are computed using the diagonal elements of the inverse Fisher information, evaluated at the corresponding estimate, that is, $i(\theta^*)^{-1}$ and $i(\theta^\dagger)^{-1}$, respectively. The asymptotic coverage error of the associated Wald confidence intervals will be the same as with maximum likelihood, although the empirical coverage error is typically better due to improved centering. For the negative binomial regression model (1), we have $\theta = (\beta^\top, \psi)^\top$, and the quantity $A^*(\theta)$, whose derivation is in the Appendix, has components involving the hat values and the derivatives of the link function and of the transformation $t(\cdot)$. Here the "hat" value $h_i$ for the $i$th observation is obtained as the $i$th diagonal element of the matrix $H = X (X^\top W X)^{-1} X^\top W$, and $\sum_{i=1}^{n} h_i = p$. The expression of $A^\dagger(\theta)$ is given in the Appendix.
For independent and identically distributed observations, with $\mu_i = \mu$, bias reduction for the shape parameter was considered by Zhang et al in Example 4. The adjustment term for median bias reduction has an analogous structure, with the expressions for the quantities involved derived in the Appendix. With simple algebra, the $j$th iteration of IWLS for mean BR, which updates the current iterate $\beta^{(j)}$, leads to
$$\beta^{(j+1)} = (X^\top W^{(j)} X)^{-1} X^\top W^{(j)} z^{*(j)}, \qquad (5)$$
where $z^*$ is the adjusted version of the working variate $z$ defined in (2). The $j$th iteration of IWLS for median BR has the same expression as (5), with working variate $z^\dagger$ in place of $z^*$. All the improved methods for negative binomial regression, together with maximum likelihood fitting, are implemented in the brnb R function of the R package brglm2. Maximum likelihood fitting can also be performed using the glm.nb function of the MASS R package.

SIMULATION STUDIES

In this section, the properties of the estimators are assessed through simulation under different scenarios corresponding to combinations of values of $n$, $\mu$, and $\kappa$. For each setting, we run 10 000 Monte Carlo replications. For all configurations, we used the logarithmic link function and the identity transformation $\psi = \kappa$ for the shape parameter. Maximum likelihood (ML), mean and median bias reduced (BR) estimates were computed using the brnb R function. Convergence is achieved when the absolute difference between the previous and current estimates falls below the default tolerance of brnb, with the default option setting to 100 the maximum number of iterations. For a scalar parameter $\theta$, let us denote by $\hat\theta_r$, $r = 1, \ldots, R$, the $r$th Monte Carlo value of an estimator and by $\widehat{\mathrm{se}}_r$ its corresponding standard error, computed using the square root of a diagonal element of the inverse of the Fisher information, evaluated at $\hat\theta_r$. Let, in addition, $\mathrm{MSE} = R^{-1} \sum_{r=1}^{R} (\hat\theta_r - \theta)^2$ be the empirical mean squared error, $B = R^{-1} \sum_{r=1}^{R} (\hat\theta_r - \theta)$ the empirical mean bias, and $\mathrm{SD} = (\mathrm{MSE} - B^2)^{1/2}$ the empirical standard deviation. Moreover, let $1_A$ be the indicator function of the set $A$ and $z_\alpha$ the $\alpha$-quantile of the standard normal distribution. Estimators are evaluated in terms of empirical probability of underestimation, $\mathrm{PU} = 100\, R^{-1} \sum_{r=1}^{R} 1_{\{\hat\theta_r \leq \theta\}}$; estimated relative (mean) bias, $\mathrm{RBIAS} = 100\, B/|\theta|$; estimated root mean squared error, $\mathrm{RMSE} = \mathrm{MSE}^{1/2}$; estimated coverage probability of 95% Wald-type confidence intervals, $\mathrm{WALD} = 100\, R^{-1} \sum_{r=1}^{R} 1_{\{|\hat\theta_r - \theta| \leq z_{0.975}\, \widehat{\mathrm{se}}_r\}}$; and the relative increase in estimated mean squared error from its absolute minimum due to bias, $\mathrm{IBMSE} = 100\, B^2 / \mathrm{SD}^2$. Except for RMSE, the performance measures are expressed in percentages. We first conducted a simulation study with constant mean, that is, with intercept only, and shape parameter $\kappa$. In particular, for sample sizes $n \in \{20, 50\}$, Monte Carlo samples were drawn from the negative binomial with values of the parameters $\mu \in \{2, 5\}$ and $\kappa \in \{0.5, 0.75, 1, 1.5, 2\}$. When the empirical variance is less than the mean, which happened in some simulated samples, mostly with $n = 20$ and small values of $\kappa$, ML, mean and median BR estimates do not exist. We denote by $A_1$ the number of simulated samples, out of 10 000, where this occurred.
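The performance measures just defined are straightforward to compute from Monte Carlo output. The sketch below assumes RBIAS is normalized by the absolute true value, which is one common convention and may differ from the exact definition used in the article.

```python
def performance(estimates, std_errors, theta):
    """Monte Carlo summaries: probability of underestimation (PU),
    relative mean bias (RBIAS, % of |theta|), root mean squared error
    (RMSE), coverage of 95% Wald intervals (WALD), and the increase in
    MSE due to bias (IBMSE).  All but RMSE are percentages."""
    R = len(estimates)
    bias = sum(e - theta for e in estimates) / R
    mse = sum((e - theta) ** 2 for e in estimates) / R
    sd2 = mse - bias ** 2                       # empirical variance
    pu = 100.0 * sum(e <= theta for e in estimates) / R
    rbias = 100.0 * bias / abs(theta)
    wald = 100.0 * sum(abs(e - theta) <= 1.959963985 * s
                       for e, s in zip(estimates, std_errors)) / R
    ibmse = 100.0 * bias ** 2 / sd2
    return {"PU": pu, "RBIAS": rbias, "RMSE": mse ** 0.5,
            "WALD": wald, "IBMSE": ibmse}
```

A well-centered estimator has PU near 50, RBIAS and IBMSE near 0, and WALD near the nominal 95.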
Nonconvergence of the ML algorithm was also observed in some simulated samples with empirical variance greater than the mean, especially with $n = 20$ and small values of $\kappa$. Few of these cases showed nonconvergence also for mean and median BR. In particular, in the samples with empirical variance greater than the mean, we found $A_2$ nonconvergence samples using ML. Among these samples, we found $A_3$ nonconvergence samples using mean BR and, finally, among these, $A_4$ nonconvergence samples using median BR. For each setting, with $n \in \{20, 50\}$, Table 1 gives the values of $A_1$, $A_2$, $A_3$, $A_4$, and $10\,000 - A_1 - A_2$.
TABLE 1

Computational diagnostics in 10 000 replications

                     μ = 2                              μ = 5
κ               0.5    0.75   1      1.5    2      0.5    0.75   1       1.5    2
n = 20
  A1            535    214    108    43     17     18     3      0       0      0
  A2            163    85     42     17     12     6      3      0       3      1
  A3            2      1      0      0      1      0      0      0       2      1
  A4            0      0      0      0      0      0      0      0       2      1
  10 000−A1−A2  9302   9701   9850   9940   9971   9976   9994   10 000  9997   9999
n = 50
  A1            36     6      0      0      0      0      0      0       0      0
  A2            16     0      1      0      0      0      0      0       0      0
  A3            2      0      0      0      0      0      0      0       0      0
  A4            0      0      0      0      0      0      0      0       0      0
  10 000−A1−A2  9948   9994   9999   10 000 10 000 10 000 10 000 10 000  10 000 10 000

Note: A1 indicates the number of samples with empirical variance less than the empirical mean. Of the remaining samples, A2 is the number of nonconvergence samples using ML, which include the A3 nonconvergence samples using mean BR, which in turn include the A4 nonconvergence samples using median BR. The quantity 10 000 − A1 − A2 represents the number of samples with convergence for all methods out of 10 000 replications.

In order to compare the methods on the same samples, the results reported in Figures 1 and 2 are based on the simulated samples in which all methods converged (the $10\,000 - A_1 - A_2$ samples in Table 1). Moreover, as done in References 8, 11, the results based, for each method, on all the samples in which that method converged are displayed in Figures S1 and S2 in Section S.2 of the Supplementary Material. The qualitative conclusions of the simulation described below are unchanged.
FIGURE 1

Estimated probability of underestimation (PU), relative bias (RBIAS), root mean squared error (RMSE), and coverage probability of 95% Wald-type confidence intervals (WALD) for the intercept, with $\mu \in \{2, 5\}$ and $\kappa \in \{0.5, 0.75, 1, 1.5, 2\}$. Results for ML (black squares), mean BR (blue circles), and median BR (red triangles). Except for RMSE, the vertical axes represent percentages.

FIGURE 2

Estimated probability of underestimation (PU), relative bias (RBIAS), root mean squared error (RMSE), and coverage probability of 95% Wald-type confidence intervals (WALD) for the shape parameter $\kappa$, with $\mu \in \{2, 5\}$ and $\kappa \in \{0.5, 0.75, 1, 1.5, 2\}$. Results for ML (black squares), mean BR (blue circles), and median BR (red triangles). Except for RMSE, the vertical axes represent percentages.

Both mean and median BR achieve the desired goals, that is, they are effective in mean and median centering, respectively, and both are preferable to ML. We recall, however, that mean centering is tied to a specific parameterization. In this respect, the larger empirical relative bias of median BR for the shape parameter is not observed in other parameterizations, such as the inverse or the log parameterization. Median BR provides empirical coverage of the 95% Wald-type confidence intervals better than its competitors, especially for the shape parameter and small sample sizes. As expected, all three estimators improve as the sample size and $\mu$ increase. With the aim of checking the improvement in the order of the bias for mean BR, and of the distance from 0.5 of the probability of underestimation for median BR, we simulated samples of increasing size, with a design that keeps the simulation standard error bounded. The results, given in Section S.3 of the Supplementary Material, are in line with the theory. Indeed, mean BR provides a reduction in the order of the bias from $O(n^{-1})$ to $O(n^{-2})$.
Similarly, the order of the error in the probability of underestimation is seen to decrease from $O(n^{-1/2})$ to $O(n^{-3/2})$ for median BR, even though theoretically the result is only guaranteed for continuous models. We now consider a second simulation study involving covariates. The linear predictor is connected to the mean with the log link, and the identity transformation $\psi = \kappa$ is considered for the shape parameter. In particular, we let
$$\eta_i = \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \beta_5 x_{i5}, \qquad (6)$$
where $x_{i2}$ and $x_{i3}$ are independent realizations of Bernoulli random variables with probabilities 0.8 and 0.5, respectively; $x_{i4}$ are independent realizations of a uniform random variable; and $x_{i5}$ are independent realizations of a Poisson with mean 2.5, $i = 1, \ldots, n$. The true regression coefficients are held fixed, four values are considered for the shape parameter $\kappa$, and several sample sizes $n$ are considered. For each combination of $\kappa$ and $n$, we run 10 000 Monte Carlo replications, where the values of the explanatory variables were held constant throughout the simulations. The summaries of the simulation results for the regression coefficients are presented in Figure 3 for the smallest value of $\kappa$. The other values of $\kappa$ gave similar results, which are summarized in Figures S4 to S6 in Section S.4 of the Supplementary Material. Figure 4 summarizes the results for $\kappa$. With the smallest $n$ and $\kappa$, we found 39, 11, and 9 samples out of 10 000 where the IWLS algorithm did not reach convergence for ML, mean BR, and median BR, respectively, while no convergence problems arose for the larger sample sizes. A sufficient condition for existence of the ML estimate is satisfied in most of the nonconvergence cases; therefore, nonconvergence is mostly due to numerical problems. Moreover, the percentage of samples showing nonconvergence decreases as $n$ increases. As in the previous simulation study, the reported results are based only on samples which led to convergence for all methods. Looking at the four performance measures, it appears that mean and median BR are preferable to ML for small $n$. On the other hand, the results improve for all three methods as $n$ increases.
As $n$ increases, for estimation of the regression coefficients, median BR is comparable to mean BR in terms of estimated relative (mean) bias, while it proves to be more accurate in achieving median centering. Moreover, in all scenarios, median BR provides empirical coverages of Wald-type confidence intervals closest to the 95% nominal value. The results in Figure 4 show that the improvement given by both mean and median BR over ML is substantial in all scenarios and more pronounced than in the previous case with the intercept parameter only.
FIGURE 3

Estimated probability of underestimation (PU), relative bias (RBIAS), root mean squared error (RMSE), coverage probability of 95% Wald-type confidence intervals (WALD), and increase in estimated mean squared error due to bias (IBMSE) for estimation of the regression parameters. Simulation results for ML (black squares), mean BR (blue circles), and median BR (red triangles). Except for RMSE, the vertical axes represent percentages.

FIGURE 4

Estimated probability of underestimation (PU), relative bias (RBIAS), root mean squared error (RMSE), coverage probability of 95% Wald-type confidence intervals (WALD), and increase in estimated mean squared error due to bias (IBMSE) for estimation of the shape parameter $\kappa$. Simulation results for ML (black squares), mean BR (blue circles), and median BR (red triangles). Except for RMSE, the vertical axes represent percentages.

Finally, in Section S.5 of the Supplementary Material, we investigated by simulation the behavior of the methods when the generating model has no overdispersion. In particular, samples are generated from a Poisson regression model with mean satisfying (6). As expected, in many samples the negative binomial fitting procedures did not converge; in such cases, the corresponding procedure for Poisson regression was used. The results in Table S1 indicate that mean and median BR are essentially equivalent to, and sometimes better than, ML for inference about the regression coefficients.

CASE STUDIES

We consider two case studies, namely the Ames salmonella assay data and the epileptic seizures data. The first data set has one explanatory variable with six levels and three observations at each level. The second data set has counts of epileptic seizures for 59 matched pairs.

Ames salmonella data

We consider data from an Ames salmonella reverse mutagenicity assay, previously analyzed using negative binomial regression by several authors in order to account for the observed overdispersion. The response variable $Y$ corresponds to the number of revertant colonies observed on a plate, while the covariate $x$ is the dose level of quinoline on the plate. Three observations were taken at each of six dose levels, leading to a total of 18 observations. We focus on the analysis based on the log-linear model
$$\log \mu_i = \beta_0 + \beta_1 x_i + \beta_2 \log(x_i + 10).$$
In the above expression, the constant 10 represents the smallest non-zero dose level. The main interest is in testing the significance of the mutagenic effect. The presence of overdispersion was confirmed by the Pearson statistic based on the residuals of the Poisson model. We also compared Poisson and negative binomial models using parametric bootstrap. The code is available in Section S.6 of the Supplementary Material. The results of both tests support the choice of a negative binomial model. Table 2 shows the estimates obtained with ML, mean bias correction (BC), mean BR, and median BR, using the identity transformation $\psi = \kappa$ for the shape parameter. Estimates of the regression coefficients have the same interpretation as in Poisson log-linear models, and their values turn out to be comparable across methods. Mean and median bias reduced estimates of the shape parameter are comparable, but slightly different from the maximum likelihood estimate. This in turn reflects on the standard errors of the regression parameter estimates.
TABLE 2

Ames salmonella assay: Parameter estimates and corresponding standard errors in parentheses

       ML                Mean BC           Mean BR           Median BR
β0     2.198 (0.325)     2.210 (0.348)     2.216 (0.352)     2.211 (0.359)
β1     −0.001 (0.00039)  −0.001 (0.00042)  −0.001 (0.00042)  −0.001 (0.00043)
β2     0.313 (0.088)     0.311 (0.095)     0.309 (0.096)     0.309 (0.098)
κ      0.049 (0.028)     0.063 (0.033)     0.065 (0.033)     0.069 (0.035)
A simulation study, with covariates fixed at the observed values and true parameter value equal to the observed mean BR estimate, is included in Section S.6 of the Supplementary Material and confirms the findings of the previous section, with mean and median BR showing improved repeated sampling behavior with respect to ML.

Epileptic seizures data

We consider here the epileptic seizures data on 2-week seizure counts for 59 epileptics. The data were analyzed by several authors. The number of seizures was recorded for a baseline period of 8 weeks, and then patients were randomly assigned to a treatment group or a control group. Counts were then recorded for four successive 2-week periods. The response was the number of observed seizures. We analyzed the data by comparing the response before and after the treatment, hence obtaining a set of 59 matched pairs. The only covariates in the linear predictor are then given by the two treatment indicators. As in the previous example, Poisson overdispersion was confirmed both by the Pearson statistic and by parametric bootstrap. The code is available in Section S.7 of the Supplementary Material. Therefore, we assume a negative binomial model for the response $y_{ij}$, $i = 1, \ldots, 59$, $j = 1, 2$, with mean $\mu_{ij}$ and variance $\mu_{ij} + \kappa \mu_{ij}^2$, where
$$\log \mu_{ij} = \alpha_i + \beta_1 z_j (1 - t_i) + \beta_2 z_j t_i,$$
with $z_j = 0$ for the baseline period and $z_j = 1$ for the post-treatment period. The intercepts $\alpha_i$ determine the stratified structure corresponding to each subject, $i = 1, \ldots, 59$, while $t_i = 0$ if subject $i$ received the placebo and $t_i = 1$ if subject $i$ received the treatment. We focus on inference about $\beta_1$, $\beta_2$, and $\kappa$, while the intercepts are treated as incidental nuisance parameters. This is a rather extreme case of a fixed effects model for clustered data, where it is well known that ML inference for the parameters of interest is problematic (see p. 292 of Cox and Hinkley). Therefore, it is of particular interest to assess the behavior of BR methods, which provide improved estimates also for the nuisance parameters. Negative binomial regression for clustered data is considered in Section 7.5.5 of Demidenko. Figure 5 displays the parameter estimates and the corresponding confidence intervals obtained with different methods. In addition to ML, mean BC, mean BR, and median BR, modified profile likelihood (MPL) for the three-dimensional parameter of interest has also been included for comparison.
MPL for this model was previously proposed by Bellio and Sartori due to its higher-order accuracy in models with many nuisance parameters. In this setting, the same higher-order accuracy is guaranteed by the BR methods. The difference of mean BR, median BR, and MPL with respect to ML is particularly pronounced for $\kappa$. This reflects on the lengths of the confidence intervals, with those from ML being inaccurately short, as illustrated by the simulation results below.
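The stratified design described above can be made concrete as follows; the indicator coding is one plausible choice and may differ from the exact parameterization used by the authors.

```python
def seizure_design(treated, n_subjects=59):
    """Model matrix for the matched-pairs model: each subject contributes
    a baseline row and a post-treatment row.  The first n_subjects columns
    are subject-specific intercepts alpha_i; the last two columns are the
    post-period indicators for the placebo and the treated group.
    treated[i] is 1 if subject i received the treatment, else 0."""
    X = []
    for i in range(n_subjects):
        for period in (0, 1):                  # 0 = baseline, 1 = after
            row = [0.0] * (n_subjects + 2)
            row[i] = 1.0                       # subject intercept
            if period == 1:
                if treated[i]:
                    row[n_subjects + 1] = 1.0  # treated post-period
                else:
                    row[n_subjects] = 1.0      # placebo post-period
            X.append(row)
    return X
```

The resulting 118 × 61 matrix makes the extreme parameters-to-observations ratio of this example explicit: 59 nuisance intercepts against only two observations each.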
FIGURE 5

Epileptic seizures data: Points represent the parameter estimates while the vertical lines represent 95% Wald‐type confidence intervals

We run 10 000 replications with covariates fixed at the observed values and true parameters set to the observed mean BR estimates. We found only 13 samples out of 10 000 where the IWLS algorithm did not reach convergence for ML; of these, 4 showed nonconvergence also for mean BR and median BR. The results are reported in Table 3 for the 9987 samples in which the IWLS algorithm achieved convergence for all the approaches. For the regression coefficients, all the approaches are almost equivalent in terms of PU, RBIAS, and RMSE, while the coverage of confidence intervals based on BR methods substantially improves upon ML. Mean BC, while providing a noticeable improvement upon ML, is not as good as the BR methods. This is mainly related to estimation of the shape parameter. Indeed, in this extreme scenario, only BR methods are seen to provide reasonable inference about $\kappa$.
TABLE 3

Epileptic seizures data: Simulation results for ML (hat), mean BC (tilde), mean BR (star), and median BR (dagger) for the parameters of interest

        PU       RBIAS     RMSE    WALD     IBMSE
β^1     50.21    −0.05     0.11    80.90    0.00
β~1     50.28    −0.30     0.11    89.44    0.00
β*1     50.38    −0.12     0.11    94.28    0.00
β†1     50.37    −0.10     0.11    94.25    0.00
β^2     49.54    0.92      0.11    81.39    0.05
β~2     49.43    0.99      0.11    89.18    0.06
β*2     50.34    0.23      0.11    93.98    0.00
β†2     50.36    0.22      0.11    93.96    0.00
κ^      100.00   −67.31    0.08    0.25     3331.25
κ~      96.21    −36.26    0.05    31.22    360.50
κ*      46.09    3.80      0.03    82.50    2.02
κ†      48.14    2.44      0.03    82.88    0.88
Although not of direct interest in the present example, both mean and median BR provide improved estimates also of the nuisance parameters. The simulation results for these are presented in Figure S7 of the Supplementary Material. Once again, we can appreciate the improved performance of mean and median BR by looking at the coverages of the 95% Wald-type confidence intervals, which are the closest to the nominal value.

DISCUSSION

For negative binomial regression, we developed inference based on adjusted score equations for mean and median bias reduction. Simulation results confirm the theoretical properties of the methods and indicate that they are both effective in improving over standard likelihood inference, and even over traditional mean bias correction. This is especially notable when the number of parameters is large compared to the sample size, as illustrated by the simulation results for the case study in Section 5.2. These methods also solve in most cases numerical problems that may occur with ML, and consequently with mean bias correction. Even though mean and median bias reduction aim at different centering properties, in practice they lead to similar conclusions. On the other hand, median bias reduction seems slightly preferable in terms of coverage accuracy of Wald confidence intervals. Other types of confidence intervals, such as those based on the asymptotic distribution of the likelihood ratio or score statistics, could be preferable to Wald intervals. However, these are not available for mean and median bias reduced estimators. Construction of adjusted score intervals could be the object of future research.
REFERENCES

1. Piegorsch WW. Maximum likelihood estimation for the negative binomial dispersion parameter. Biometrics. 1990.

2. Saha K, Paul S. Bias-corrected maximum likelihood estimator of the negative binomial dispersion parameter. Biometrics. 2005.

3. Bellio R, Sartori N. Practical use of modified maximum likelihoods for stratified data. Biom J. 2006.

4. Anscombe FJ. Sampling theory of the negative binomial and logarithmic series distributions. Biometrika. 1950.

5. Greenland S, Mansournia MA, Altman DG. Sparse data bias: a problem hiding in plain sight. BMJ. 2016.

6. Thall PF, Vail SC. Some covariance models for longitudinal count data with overdispersion. Biometrics. 1990.
