
Data misreporting during the COVID19 crisis: The role of political institutions.

Antonis Adam, Sofia Tsarsitalidou.

Abstract

We use Benford's law of first digits to determine whether there is evidence of data misreporting in the total reported COVID19 cases across countries. We then use regression analysis to model cross-country differences in the Mean Absolute Deviation of the actual data from the distribution predicted by Benford's law, and to identify the factors that lead to data misreporting. Using the Instrumental Variable model of Lewbel (2012) and Settler Mortality as an external instrument for democracy, we show that autocratic countries are more likely to misreport COVID19 cases.
© 2022 Elsevier B.V. All rights reserved.


Keywords:  COVID19; Data misreporting; Democracy

Year:  2022        PMID: 35153346      PMCID: PMC8820020          DOI: 10.1016/j.econlet.2022.110348

Source DB:  PubMed          Journal:  Econ Lett        ISSN: 0165-1765


Introduction

During the COVID19 pandemic, governments worldwide faced serious incentives to misreport the actual number of patients. For example, travel restrictions were determined by the number of recorded cases in the targeted country. Similarly, all governments had incentives to appear efficient in dealing with the crisis, both to their “electorate” and to external observers. Likewise, government response and effectiveness in dealing with the pandemic will be critical to all future elections.
What, then, are the country characteristics related to data misreporting? It is natural to expect that countries with poor-quality political institutions are more likely to resort to data manipulation: the absence of effective checks and balances and of a powerful opposition, together with the need of autocrats to project power both within the country and abroad, makes autocracies more likely to report manipulated data. In the present paper, we exploit Benford’s law of first digits to answer the above question. To isolate the impact of strategic misreporting from other reasons, we use the Lewbel (2012) IV model to estimate the causal effect of the political regime on the reliability of the reported COVID19 cases. We find a negative association between the level of democracy and data misreporting, suggesting that data misreporting might be guided by the governments’ wishes.

Benford’s law and COVID19 data

Any positive number $x$ can be written in scientific notation as $x = s \times 10^{k}$, where $s \in [1, 10)$ is the significand and $k$ is an integer. Following Hill (1998), Benford’s law states that if random samples of numbers are taken from distributions selected at random, the first-digit frequencies of the significands in the combined sample will converge to Benford’s distribution, i.e., the probability that digit $d \in \{1, \dots, 9\}$ occurs as the leading digit of $s$ is given by
$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right).$$
The above states that in a collection of random numbers, 1 will appear as the first digit with a probability of 30.1%, 2 will appear with a probability of 17.61%, etc. Significant deviations of the actual distribution from the theoretical one are then attributed to data misreporting. According to Nigrini (2015), Benford’s law can be applied to numerical data (not, e.g., labels) that represent the sizes of events or facts, have no built-in minimum or maximum value, and are distributed over several orders of magnitude.
To evaluate the extent of data misreporting, we use the Total Number of COVID19 cases in each country, a variable that satisfies all the above conditions. Using the Total Number of Cases has an additional advantage. If we assume that data are not just misreported but that governments manipulate them in a sophisticated manner, their focus would be on new cases, as this is, typically, the measure used to evaluate government efficiency. Thus, any data manipulation of new COVID19 cases would make the total cases move erratically. The data we use cover 196 countries from the day of the first recorded case of COVID19 infection until July 6, 2021. To measure data misreporting, we compute the Mean Absolute Deviation (MAD), i.e., the mean absolute difference between the expected probability that a particular digit will appear as the first digit and the realized frequency. Fig. 1 presents the estimated MADs, with darker areas indicating higher MAD.
Fig. 1

MADs across the world.
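The MAD computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the synthetic `cumulative_cases` series, and the choice to express the MAD in percentage points (suggested by the magnitudes quoted in the text) are all assumptions.

```python
import math
from collections import Counter


def benford_probabilities():
    """Expected first-digit probabilities under Benford's law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


def benford_mad(values, in_percent=True):
    """Mean absolute deviation between observed and Benford first-digit shares.

    Non-positive values (e.g. days before the first case) are ignored.
    The percentage-point scale is an assumption, not taken from the paper.
    """
    digits = [first_digit(v) for v in values if v > 0]
    if not digits:
        raise ValueError("need at least one positive value")
    counts = Counter(digits)
    expected = benford_probabilities()
    scale = 100.0 if in_percent else 1.0
    deviations = [
        abs(counts.get(d, 0) / len(digits) - expected[d]) * scale
        for d in range(1, 10)
    ]
    return sum(deviations) / 9


# Hypothetical cumulative case counts: smooth exponential growth over 200 days.
cumulative_cases = [round(3 * 1.08 ** t) for t in range(200)]
print(round(benford_mad(cumulative_cases), 3))
```

A series spanning several orders of magnitude, such as the synthetic one here, should give a small MAD; a series stuck within one order of magnitude gives a large one regardless of whether it was manipulated.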

According to our computations, countries in the Middle East and Eastern Africa appear to have unreliable data. In contrast, Northern America and EU countries report data more in line with the theoretical predictions. Note that Australia and New Zealand have high MADs. This can be attributed to the fact that these two countries had a low number of active COVID19 cases due to the highly restrictive government measures.
Failure to comply with Benford’s law provides evidence of data misreporting. Misreporting, however, can have several causes: poor data collection methods, the unavailability of COVID19 tests, and even stringent lockdown policies may explain these deviations (Koch and Okamura, 2020). Our strategy, then, is to use an econometric framework to determine the factors that explain non-compliance with Benford’s law. If differences in MADs are determined by factors that correlate with the government’s data collection ability, or with the policies to contain the spread of the disease, non-compliance is not an important issue. However, to the extent that MADs are associated with political variables, data manipulation by the government may explain cross-country differences.

Model specification

The dependent variable of our model is the MAD as derived above. The variable of interest is the type of political regime. We use the dichotomous measure of democracy of Bjørnskov and Rode (2020). The political regime is a crucial factor affecting the government’s incentives to falsify data. Specifically, we expect that less democratic countries are more prone to data manipulation. First, in democracies, there are checks and balances that limit the ability of the government to falsely report data, as the separation of powers and the development of independent economic and political institutions constrain the incumbent’s ability to manipulate official statistics. Second, autocrats need to project power, since their stay in power relies on this. A weak autocratic ruler will eventually be deposed, especially if he cannot protect the well-being of his citizens and thus gain their loyalty. As we want to derive the causal effect of the incentives and constraints that the political regime places on the government to misreport data, we employ an Instrumental Variable model as in Lewbel (2012). This method uses external instruments complemented with internal ones to increase efficiency. The internal instruments are constructed from the auxiliary equation’s residuals multiplied by the mean-centered exogenous variables. The external instrument follows Acemoglu et al. (2001), who show that different colonization strategies created different sets of political and economic institutions. The colonization strategy was influenced by the conditions in the places where colonizers settled. Hence, we use the log of the mortality rate of the first European settlers in the colonies (Settler Mortality).
Our model then takes the following form:
$$\text{MAD}_i = \beta_0 + \beta_1 \text{Democracy}_i + X_i'\gamma + \varepsilon_i,$$
where $X_i$ is a set of exogenous regressors, which include an index of quality of government, the share of imports plus exports in GDP (Openness), the Stringency index of government policies during the pandemic, the total number of Hospital beds (per 1000 people), the Cardiovascular death rate, the log of the total population, and the log of GDP per capita. More details about the control variables and the data employed are given in the Appendix.
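The construction of the internal instruments described above (auxiliary-equation residuals multiplied by the mean-centered exogenous variables) can be illustrated with a short NumPy sketch. It is a simplified stand-in rather than the authors' estimation code: the function `lewbel_2sls`, the simulated data, and the continuous `democracy` variable are hypothetical, and only point estimates are computed (no robust standard errors or diagnostic tests).

```python
import numpy as np


def ols(y, X):
    """Least-squares coefficients of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def lewbel_2sls(y, endog, X, external=None):
    """2SLS with heteroskedasticity-based internal instruments.

    y: outcome (n,); endog: endogenous regressor (n,);
    X: exogenous regressors without a constant (n, k);
    external: optional external instrument(s), (n,) or (n, m).
    Returns [constant, coefficients on X..., coefficient on endog].
    """
    n = len(y)
    Xc = np.column_stack([np.ones(n), X])
    # Auxiliary equation: endogenous regressor on the exogenous variables.
    resid = endog - Xc @ ols(endog, Xc)
    # Internal instruments: mean-centered exogenous variables times residuals.
    internal = (X - X.mean(axis=0)) * resid[:, None]
    blocks = [Xc, internal]
    if external is not None:
        blocks.append(np.asarray(external).reshape(n, -1))
    Z = np.column_stack(blocks)
    # First stage on all instruments, then second stage with fitted values.
    fitted = Z @ ols(endog, Z)
    return ols(y, np.column_stack([Xc, fitted]))


# Simulated data (purely illustrative; the true coefficient on democracy is -2).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=(n, 1))              # one exogenous control
w = rng.normal(size=n)                   # external instrument
u = rng.normal(size=n)                   # shared shock creating endogeneity
e2 = u + np.exp(0.5 * x[:, 0]) * rng.normal(size=n)  # heteroskedastic error
democracy = 0.5 * x[:, 0] + w + e2
mad = 1.0 - 2.0 * democracy + 0.3 * x[:, 0] + u + rng.normal(size=n)

beta = lewbel_2sls(mad, democracy, x, external=w)
print(beta[-1])  # estimated coefficient on the endogenous regressor
```

The internal instruments only help when the auxiliary-equation errors are heteroskedastic, which is why the simulated `e2` has a variance that depends on `x`.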

Results

Table 1 presents our results.
Table 1

The effect of democracy on COVID19 data misreporting.

Column  Specification                Democracy  (t-statistic)  Observations  R2
(1)     Baseline model               −2.750*    (−3.875)       96            0.418
(2)     February to June 2020        −0.928     (−1.315)       96            0.501
(3)     October 2020 to June 2021    −3.298*    (−3.503)       96            0.254
(4)     OLS                          −2.114*    (−3.155)       114           0.378
(5)     VDEM                         −6.994*    (−3.471)       96            0.361
(6)     Polity                       −0.191*    (−2.618)       96            0.375
(7)     Without Controls             −1.531*    (−2.543)       130           0.175
(8)     Total Deaths                 −2.580*    (−2.631)       96            0.472
(9)     Excluding high MADs          −1.518*    (−2.650)       90            0.335

Diagnostic statistics, in the order reported (not every column reports each statistic):
Kleibergen–Paap: 45.08, 15.43, 29.66, 393.89, 45.08, 29.57
Hansen J: 17.05, 14.87, 16.76, 16.48, 20.18, 21.30, 12.72, 0.335
Breusch–Pagan (p-value): 23.63 (0.00), 26.55 (0.00), 25.93 (0.00), 9.11 (0.025), 23.63 (0.00), 25.98 (0.00)

Note: Breusch–Pagan tests the null hypothesis of homoscedasticity in the first-stage equation (p-value in parentheses). Hansen J is an overidentification test of all instruments; the joint null hypothesis is that the instruments are valid. Kleibergen–Paap tests for weak identification; it rejects the hypothesis of weak identification at the 5% level. Robust t-statistics in parentheses. * denotes statistical significance.

The results indicate a causal effect of democracy on data misreporting. This effect, however, is conditional on the ability of the government to collect data, as captured by the quality of government indicator, on the level of development, and on the remaining variables that control for factors affecting the deviation from Benford’s law (e.g., the intensity of the lockdown policies). Moreover, the estimated effect of democracy is quantitatively significant, as democratic countries have on average a MAD approximately 2.7 points lower than autocracies.
Estimating the same model for different periods, we establish that the effect of democracy is present only at the later stages of the pandemic (column 3); we fail to find any significant effect for the initial period of February 2020 to June 2020 (column 2). This suggests that autocratic governments did not try to manipulate the pandemic data at the early stages. It might also explain the failure to detect data misreporting when examining COVID19 cases early in the pandemic (Koch and Okamura, 2020). After all, there was great uncertainty regarding the correct response to the pandemic during the first wave, and there was no proper yardstick against which to compare a government’s performance. Moreover, several countries were left vulnerable after the first wave, which increased accountability regarding the spread of COVID19 and hence the pressure on the government to appear efficient. So, in the second wave, the incentive to misreport the data was much higher.
Next, we perform a series of robustness tests. First, in column (4), we estimate an OLS model, and in columns (5) and (6), we use alternative measures of democracy. Then, we drop all control variables, keeping only democracy (column 7). Next, column (8) reports the results using the MADs computed from the total number of deaths instead of total cases. Finally, in column (9), we drop the top 5% of countries by MAD (i.e., countries with a MAD higher than 9), since we want to ensure that outliers do not drive our results. All results confirm that the core relationship under study is robust.

Conclusions

COVID19 data from autocratic countries can generally be considered unreliable, and particular caution is warranted when policy is designed using data from these countries.

1.  Democracy and COVID-19 outcomes.

Authors:  Gokhan Karabulut; Klaus F Zimmermann; Mehmet Huseyin Bilgin; Asli Cansin Doker
Journal:  Econ Lett       Date:  2021-03-27

2.  Benford's Law and COVID-19 reporting.

Authors:  Christoffer Koch; Ken Okamura
Journal:  Econ Lett       Date:  2020-09-14