A flexible framework for intervention analysis applied to credit-card usage during the coronavirus pandemic.

Anson T. Y. Ho, Lealand Morin, Harry J. Paarsch, Kim P. Huynh

Abstract

We develop a variant of intervention analysis designed to measure a change in the law of motion for the distribution of individuals in a cross-section, rather than modeling the moments of the distribution. To calculate a counterfactual forecast, we discretize the distribution and employ a Markov model in which the transition probabilities are modeled as a multinomial logit distribution. Our approach is scalable and is designed to be applied to micro-level data. A wide panel often carries with it several imperfections that complicate the analysis when using traditional time-series methods; our framework accommodates these imperfections. The result is a framework rich enough to detect intervention effects that not only shift the mean, but also those that shift higher moments, while leaving lower moments unchanged. We apply this framework to document the changes in credit usage of consumers during the COVID-19 pandemic. We consider multinomial logit models of the dependence of credit-card balances, with categorical variables representing monthly seasonality, homeownership status, and credit scores. We find that, relative to our forecasts, consumers have greatly reduced their use of credit. This result holds for homeowners and renters as well as consumers with both high and low credit scores.

Crown Copyright © 2022 Published by Elsevier B.V. on behalf of International Institute of Forecasters. All rights reserved.

Keywords:  COVID-19; Consumer finance; Intervention analysis; Liquidity; Markov model

Year:  2021        PMID: 35035005      PMCID: PMC8748006          DOI: 10.1016/j.ijforecast.2021.12.012

Source DB:  PubMed          Journal:  Int J Forecast        ISSN: 0169-2070


Introduction and motivation

In an interview published in this journal, George E. P. Box described how he and George C. Tiao set out to study air pollution in Los Angeles (Peña, 2001). This analysis led to their seminal paper on intervention analysis, Box and Tiao (1975), in which they studied pollution concentration levels surrounding the introduction of a law restricting the types of gasoline sold. Since then, intervention analysis has been used in such diverse fields as driver safety (Bhattacharyya & Layton, 1979), marketing (Leone, 1987), television viewership (Krishnamurthi et al., 1989), call-center operations (Bianchi et al., 1998), and, more recently, the study of disease outbreaks (Daughton et al., 2017). Intervention analysis has also featured prominently in the economics literature—to investigate the effects of government policies on the Consumer Price Index (Box & Tiao, 1975), the inflation-targeting strategies of central banks (Angeriz & Arestis, 2008), the effect of local tax policy (Bonham & Gangnes, 1996), the linkages between equity and futures markets (Bhar, 2001), and the impact of natural disasters on capital markets (Worthington & Valadkhani, 2004). In this paper, we apply intervention analysis to measure changes in credit usage during the COVID-19 pandemic, using Canadian consumer credit data from January 2017 to September 2020. Researchers typically conduct intervention analysis by estimating a time-series model, such as an autoregressive integrated moving average (ARIMA) model, with a long series of aggregated data. In contrast, we take a short but extremely wide panel of individual-level data and use consecutive pairs of monthly observations to identify the law of motion. This approach is the first to take advantage of the variation among individuals within a cross-section, and permits estimation over a much shorter time span or with data that are more widely spaced over time than would otherwise be possible.
Using our method, we discretize the cross-sectional data of individuals in a panel data set to approximate nonparametrically the distribution of measurements. That is, the law of motion in this model governs the time series of the distributions of the individual outcomes. Nonparametric methods have been applied to the mean prediction from a time-series model within an intervention analysis. Stock (1989) applied a semiparametric model for the variable of interest that included dummy variables to measure the effects of the intervention. Park (2012) followed a similar approach, to build a central mean subspace for the time series, which is a dimension-reduction method focused on the conditional mean. To the best of our knowledge, however, no such method has been applied to the distribution of the variable of interest. The unique feature of our approach is that it avoids modeling the moments of the population. The discretized distribution is modeled to follow a Markov process, with transition probabilities represented by a multinomial distribution, permitting these probabilities to depend on covariates. The result is a flexible framework for conducting intervention analysis. Consequently, effects related to a time-varying mean or variance are automatically admitted. The distribution is fit to the data with little risk of bias from misspecification, creating a reliable benchmark for measuring the effects of the intervention. Furthermore, this framework is resilient to all sorts of otherwise inconvenient features of the data—including point masses as well as natural boundaries or missing observations. Such an approach is critically important when modeling the behavior of a large population of individuals, many of whose experiences interact with these inconvenient features of the variable in question and would otherwise interfere with the measurement obtained with conventional methods.
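To make the mechanics concrete, the following sketch estimates a transition matrix from pre-intervention consecutive-pair observations and iterates a histogram forward as a counterfactual forecast. It is a minimal illustration of the idea, not the estimation code used in the paper; the number of bins, the synthetic panel, and the helper function are all invented for this example.

```python
import numpy as np

def estimate_transition_matrix(states_t, states_t1, K):
    """Estimate a K x K Markov transition matrix from consecutive-pair
    observations (row j = current state, column k = next state)."""
    counts = np.zeros((K, K))
    for j, k in zip(states_t, states_t1):
        counts[j, k] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Row-normalize; rows with no observations default to self-transition.
    return np.where(row_sums > 0, counts / row_sums.clip(min=1), np.eye(K))

rng = np.random.default_rng(0)
K = 4
# Synthetic panel: pre-intervention pairs of discretized balances.
s_t = rng.integers(0, K, size=10_000)
s_t1 = (s_t + rng.integers(-1, 2, size=10_000)) % K   # noisy persistence
P = estimate_transition_matrix(s_t, s_t1, K)

# Counterfactual: iterate the last pre-intervention histogram forward.
p0 = np.bincount(s_t1, minlength=K) / s_t1.size
forecast = [p0]
for _ in range(6):                     # six months ahead
    forecast.append(forecast[-1] @ P)
```

The counterfactual distributions in `forecast` would then be compared with the observed post-intervention histograms to measure the effect of the intervention.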
In addition to allowing a greater ability to fit the data, the flexibility of our framework allows a researcher to detect responses to interventions that lie outside the canonical set of responses. Typically, the response to an intervention is assumed to take on one of four forms: a gradual change, an abrupt change, an abrupt change that reverses, or an abrupt change that gradually subsides. All four of these forms describe a change in the mean or expected value of the variable and are otherwise only differentiated by the time dimension. Our approach also addresses the nature of the change in the distribution, which can manifest in changes other than the mean. For instance, the response to an intervention could take the form of an increase in the variance or skewness of the distribution, leaving the mean constant, following any of the canonical response patterns in the time dimension. With a framework that is nonparametric, the possibilities extend beyond an analysis of the moments of the distribution. In the language of the time-series literature, the goal of intervention analysis is to estimate a form of structural break in a dynamic process. Our approach is rich enough to detect structural breaks that materialize in the shape of the distribution, rather than in the values of parameters, such as those specifying the mean of the distribution. This framework is applied to analyze the credit-card balances of Canadians during the COVID-19 pandemic. Our empirical application extends the work in Ho et al. (2021) by entailing a closer inspection of consumer credit usage. In general, our results show large decreases in credit-card balances that are consistent with their findings. Interestingly, we also found that consumers with the tightest credit constraints did not become more indebted during the first wave of COVID-19. Indeed, there was a large increase in the fraction of these consumers with a balance of less than $500. 
The effect of COVID-19 on their credit usage persisted when the economy reopened after the first wave subsided. On the other hand, we found a substantial decrease in the proportion of creditworthy homeowners with balances between $3000 and $14,000 during the first-wave lockdown periods. This group of consumers also had the strongest recovery in credit usage after the economy reopened, suggesting that the reduction in balances was due to limited spending opportunities. Some of these creditworthy homeowners, however, also appeared in greater proportions in the highest balance categories, although this occurred for a small number of consumers. Our empirical findings complement the literature concerned with the effects of COVID-19 on household finances. Much of this research regarding the crisis has focused on consumer spending. According to Baker et al. (2020), consumers significantly reduced overall spending, and credit-constrained households responded rapidly to the fiscal stimulus payments from the 2020 CARES Act in the United States. In Denmark and Spain, similar effects on consumer spending were found in Andersen et al. (2020) and Carvalho et al. (2020). In Canada, Ho et al. (2021) used a special case of our framework and found that consumers paid down balances of their credit cards and home-equity lines of credit (HELOCs). In other work, Chen, Engert, et al. (2020) showed that consumers also increased cash holdings. Considering the economic disruption and surging unemployment rate due to COVID-19, our findings are in sharp contrast to existing evidence in the household finance literature that individuals tap into various forms of credit to smooth their income shocks. For instance, Sullivan (2008) and Agarwal and Qian (2014) found that consumers with few assets smooth their unemployment shocks via unsecured debts.
Wealthy homeowners also tend to borrow against their home equity through mortgage refinancing (Chen, Michaux, and Roussanov, 2020, Hurst and Stafford, 2004) or HELOCs (Agarwal et al., 2006). Our methodology also contributes to the literature on applying nonparametric methods to credit risk management. An early application was that of Khandani et al. (2010), who constructed nonlinear nonparametric forecasting models of consumer credit risk. They obtained significant improvements in the accuracy of forecasted delinquencies. Kruppa et al. (2013) also documented improved forecasts from using an implementation of random forests. Yao et al. (2017) used a support vector machine as a classifier in their two-stage loss given default models for credit cards. Jiang et al. (2021) employed large-scale alternative data to construct credit scores to predict consumer delinquency. Although our methodology employs a more rudimentary form of nonparametric analysis, our framework is rich enough to incorporate both the nonlinearity and the dependence present in the data. The ability to account for dependence affords other possibilities for credit modeling in the time dimension. For policymakers, this framework is particularly useful to analyze the effect of policies that target specific groups of individuals, such as mortgage stress testing and macroprudential policy (see, for example, Siddall, 2016). These policies often concern the tail of a distribution and have important implications for financial stability (Allen et al., 2020) that can be masked by investigating only lower moments. For practitioners, our modeling framework can be used to guide a dynamic model-fitting strategy for credit risk management. The quality of predictions from a credit risk model may degrade over time due to various factors, such as business competition, that shift the client base. Industry practitioners can use our framework to analyze the distribution of their clients’ characteristics.
A change in the distribution may indicate that the business landscape has changed over time. For instance, a different population of credit applications may be flowing through the model, or an existing set of clients may be experiencing changes, such that the model specification is different from that under the original model-fitting exercise. Generally, individual-level data are often stored as wide panels of microdata with a limited number of observations. Such data are often highly variable and rarely conform to a parametric distribution. The flexibility of our framework provides a procedure to manage risks in complex and realistic environments. The remainder of the paper comprises four additional sections: In the next section, we first summarize briefly the notions behind intervention analysis. We then develop our econometric model to augment the standard approach to intervention analysis to produce a more flexible framework. In Section 3, we provide simulation evidence for the size and power properties of our test statistics. In Section 4, we apply our technique to a data set that is well suited to this approach—Canadian consumer credit data—and report our empirical results. In the final section of the paper, Section 5, we summarize and conclude.

Intervention analysis

In this section, we describe our methodology for conducting intervention analysis. We begin with a description of a standard approach, to contrast with our framework.

Background

The canonical form of intervention analysis is conducted with a time-series model, most commonly, the autoregressive moving-average (ARMA) model. ARMA models were designed to do two basic things: (1) refine forecasts by using past information, and (2) admit dependence when constructing confidence intervals as well as testing hypotheses. Testing hypotheses concerning events—either anthropogenic or natural—that cause changes in the process can be tricky because distinguishing between the dynamic effects of the change and the dependence in the process is difficult. Researchers interested in investigating the magnitude of the effects of changes have to take a stand: dependence in the errors often either masks the effect itself or makes it difficult to decide whether the effect is statistically significant. To address such difficulties, Box and Tiao developed the tools of intervention analysis. In general, an ARMA model of the random variable $Y_t$ in period $t$ can be written as
$$\phi(B)(Y_t - \mu) = \theta(B)\varepsilon_t,$$
where $\phi(B)$ and $\theta(B)$ are polynomials in the lag operator $B$, which has the property $B Y_t = Y_{t-1}$, for instance. The elements of the sequence $\{\varepsilon_t\}$ are assumed to be jointly independent, Gaussian innovations that have mean zero and constant variance $\sigma^2$. An example of an ARMA(1,1), for instance, would involve
$$(1 - \phi_1 B)(Y_t - \mu) = (1 + \theta_1 B)\varepsilon_t.$$
In this case, for technical reasons, the following restrictions would be required: $|\phi_1| < 1$ and $|\theta_1| < 1$. The above general ARMA model can be rewritten as
$$Y_t = \mu_t + \frac{\theta(B)}{\phi(B)}\,\varepsilon_t,$$
where we have subscripted the mean, $\mu_t$, to highlight the fact that a change, denoted $\delta_t$ below, will be relative to that object. Suppose, now, that at some time $T$ a change occurs, so that after that date,
$$\mu_t = \mu + \delta_t,$$
where $\delta_t$ represents the magnitude of the change in period $t$. Several possible patterns exist concerning how the intervention may affect the mean of the stochastic process over time: First, a permanent, constant change in the mean may occur; for example, in Panel (1) of Fig. 1, an increase in every period after $T$ is obtained. Second, a brief constant change in the mean could occur; for example, in Panel (2) of Fig. 1, a temporary decrease in the mean occurs, but has no effect thereafter. Third, a gradual increase (or decrease) might happen; for example, in Panel (3) of Fig. 1, the mean of the process rises to an asymptote. Fourth, a brief initial change, but then a return to the previous mean, may be the outcome; for example, in Panel (4) of Fig. 1, a temporary increase obtains, which then subsides.
Fig. 1

Plots of four different intervention patterns.

To model the first sort of intervention, simply introduce a dummy variable $S_t^{(T)}$, which equals zero up to $T$ and one thereafter, so
$$\mu_t = \mu + \delta S_t^{(T)}.$$
To model the second sort of intervention, introduce a similar dummy variable, which equals zero up to $T$, one during the period of effect, and then zero thereafter, so that the mean shifts by $\delta$ only during that window. To model the third sort of intervention, introduce a dummy variable $S_t^{(T)}$, which equals zero up to $T$ and one thereafter, but some additional structure is required. Specifically, one needs a model of the gradual effect. One such model involves period-to-period increases that are proportional to the current distance between the old mean $\mu$ and the new one $(\mu + \delta)$, such as
$$\delta_t = \delta_{t-1} + \lambda(\delta - \delta_{t-1}),$$
where $\delta_T = 0$, and the parameter $\lambda \in (0, 1)$ determines the speed of the gradual change: if $\lambda$ is small (say, 0.1), then the change is slow, whereas if $\lambda$ is large (say, 0.9), then the change is relatively fast. This, then, reduces to the following:
$$\mu_t = \mu + \delta\left[1 - (1 - \lambda)^{t - T}\right] S_t^{(T)},$$
where $t \ge T$. To model the fourth sort of intervention, introduce a dummy variable $P_t^{(T)}$, which equals zero up to $T$, one in period $T$, and zero thereafter. Again, some additional structure is required in order to model the gradual return to the old mean. As in the third case, one such model involves a decrease that is proportional to the distance between the new mean $(\mu + \delta)$ and the old one $\mu$. This then reduces to the following:
$$\mu_t = \mu + \delta(1 - \lambda)^{t - T}, \quad t \ge T.$$
In order to estimate the effect of an intervention, one must know its date—exactly—because one must specify one of the above four models of the intervention beginning at that date. Perhaps belaboring the obvious, if one gets that date wrong, then the entire analysis is at risk: one may totally mismeasure the effect. That noted, once one has determined $T$, one can then identify and estimate the ARMA model in the usual way, using the data from before $T$, and then forecast what would have occurred under the old regime. Based on that forecast, one can then calculate the differences between actual outcomes after the intervention and the forecasted ones.
By examining these differences, one can then choose one of the four models of the intervention pattern, and then estimate a particular empirical specification using all of the data. In our work, we do not put structure on the shape of the intervention, choosing instead to estimate the effect nonparametrically. One of the main limitations of intervention analysis is that the ARMA process is assumed to remain the same after the intervention as it was before the intervention. Obviously, this is a strong assumption, but it is necessary in order to disentangle the dynamic effects of the intervention from the dynamic effects of the dependence. This assumption is reasonable for our analysis, since the true extent of the pandemic was largely unknown before its arrival in Canada, and the policy response was quickly executed, allowing very little time for consumers to act in anticipation.
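The four canonical mean paths described above can be generated directly from the dummy-variable constructions; the sketch below uses invented values for the intervention date `T`, the effect size `delta`, and the adjustment speed `lam`.

```python
import numpy as np

T, horizon = 50, 100
delta, lam = 2.0, 0.3          # effect size and speed of gradual change
t = np.arange(horizon)

step = (t >= T).astype(float)                    # S_t: 0 before T, 1 after

mu1 = delta * step                               # (1) permanent shift
mu2 = delta * ((t >= T) & (t < T + 10))          # (2) brief constant change
mu3 = delta * (1 - (1 - lam) ** (t - T)) * step  # (3) gradual rise to asymptote
mu4 = delta * (1 - lam) ** (t - T) * step        # (4) abrupt change that decays
```

Plotting `mu1` through `mu4` reproduces the qualitative shapes of the four panels of Fig. 1.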

A flexible framework for intervention analysis

We now outline a framework for intervention analysis that focuses on the distribution of the cross-section over time. To begin, consider a sample of $N$ outcomes $\{Y_1, Y_2, \ldots, Y_N\}$ that have been drawn independently from a common cumulative distribution function (CDF) $F_Y(y)$. The Glivenko–Cantelli Lemma allows one to estimate the population CDF consistently using the empirical distribution function (EDF):
$$\hat{F}_N(y) = \frac{1}{N} \sum_{n=1}^{N} \mathbb{1}(Y_n \le y), \tag{1}$$
where $\mathbb{1}(A)$ equals one if the event $A$ obtains, and is zero otherwise. That is, the EDF is the proportion of sample outcomes at or below some value $y$. Using transformations of Eq. (1), one can also construct consistent estimates of such population measures as the mean and variance as well as quantiles, such as the median. This is useful because an intervention need not affect the variable of interest in the same way across the support of the distribution. In other words, nonlinearities may exist. One way in which nonlinearities can manifest themselves is when those who are currently at a boundary (for example, zero credit-card balances) are induced to the interior (for example, positive credit-card balances), whereas those at the interior (for example, those who already have positive credit-card balances) respond differently. Currently, some existing models can deal reasonably effectively with the interior (for example, positive) outcomes, whereas other models can accommodate boundary (for example, zero) outcomes. Integrating the two kinds of models is, however, an open area of development. Inevitably, external factors beyond our control can affect outcomes, too. Such omitted factors (which may often be unmeasurable) can exhibit strong dependence over time—in short, correlated errors. Correlation in the errors need not imply inconsistency when estimating changes due to events, but correlation can result in estimates of sampling variability that are incorrect, which in turn implies potentially misleading test statistics.
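The EDF in Eq. (1) amounts to one line of code; the sample values below are invented for illustration.

```python
import numpy as np

def edf(sample, y):
    """Empirical distribution function: proportion of outcomes <= y."""
    sample = np.asarray(sample)
    return np.mean(sample <= y)

# Hypothetical credit-card balances, including a point mass at zero.
balances = np.array([0.0, 0.0, 120.5, 750.0, 3200.0])
```

Here `edf(balances, 0.0)` returns 0.4, the proportion of consumers with a zero balance, illustrating how the EDF handles the point mass at the boundary without any special treatment.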
In basic statistics courses, it is common to write the mean squared error (MSE) of an estimator $\hat{\theta}$ of an object $\theta$ in terms of the variance (Var) of $\hat{\theta}$ as well as the square of its bias (Bias). In other words,
$$\mathrm{MSE}(\hat{\theta}) = \mathrm{Var}(\hat{\theta}) + \left[\mathrm{Bias}(\hat{\theta})\right]^2. \tag{2}$$
This decomposition in Eq. (2) highlights the tension in some situations where an unbiased estimator has considerable sampling variability, which can be reduced by introducing a biased estimator of $\theta$ that has less sampling variability. The research of Stein (1956) is an important example, and helped to spawn the empirical Bayes literature. Elsewhere, building on the research of Andrey Tikhonov in the 1940s, reported initially in Russian, researchers have suggested regularization as a way to reduce variance, but at the cost of introducing bias; for example, the LASSO of Tibshirani (1996) is an important method. Recently, Mao and Zheng (2020) noted that economic theory can be used as a form of regularization to reduce sampling variability, but again by introducing bias. These considerations of the tradeoff between bias and sampling variability are the primary motivation for developing this specific framework for intervention analysis. In order to achieve an accurate benchmark with which to compare performance, we aim to minimize the imposition of theory. This approach allows much more freedom for the data to inform the analysis, producing a representative counterfactual prediction. As long as we have a large data set, we have the luxury of taking such an agnostic approach to modeling the law of motion while still maintaining a low degree of sampling variability. An added benefit of our flexible framework is that it allows for a consistent estimate of the intervention response in the presence of bounds, across the entire distribution. To illustrate this point, consider an analysis of a simple autoregressive model with a lower bound of zero. In the context of credit-card balances, this is represented by a discrete time series of the consumer’s net worth.
When the net worth is negative, the value is represented by the positive value of the credit-card balance. If the consumer’s net worth is positive, then the consumer holds a zero balance on credit cards and might also hold a positive balance in another account that we do not observe, such as a checking account. Using this censored series, the estimates of the parameters in the autoregressive model would be both biased and inconsistent. In particular, the estimates of the mean credit balance would be biased upwards, that is, higher credit balances, since the censored series ignores the positive net worth, that is, negative credit balances, at zero. Furthermore, the constraint to non-negative credit-card balances creates the illusion of stronger mean reversion, since the series never moves beyond zero in the negative direction. It is even true that, for example, a random walk with a negative drift would appear to be stationary. In less extreme cases, the speed of mean reversion would be overestimated, as there is, effectively, an infinite degree of mean reversion over the lower bound of zero that is never crossed. In the above example, the researcher naïvely estimates the model as if the data were not censored. A more sophisticated treatment of censored data can be achieved with a Tobit model, named in honor of the American Nobel laureate in economics, James Tobin (1958). This sort of model is designed to analyze random variables in a cross-section that has both continuous and discrete components. In Appendix A.1, we outline an example in which a series conforms to such a specification. Following Amemiya (1984), consider the latent random variable $Y_t^*$, which follows the Gaussian law, having mean $\mu$ and variance $\sigma^2$.
Suppose the observed random variable $Y_t$ is defined by the following:
$$Y_t = \begin{cases} Y_t^*, & Y_t^* > 0, \\ 0, & Y_t^* \le 0. \end{cases}$$
In the notation of the example above, $Y_t^*$ represents the negative of the discounted lifetime income of the consumer.1 For all the observations in which $Y_t^*$ is positive, the consumer holds a credit-card balance. When $Y_t^*$ is negative or zero, however, only a zero balance is recorded in the credit-card account. As a result, a point mass exists at $Y_t = 0$, which is accounted for in the likelihood function of the Tobit model. In Appendix A.1, we study a specific example with a first-order autoregressive model—an AR(1) process:
$$Y_t^* = \mu + \rho\,(Y_{t-1}^* - \mu) + U_t,$$
where $U_t$ denotes independently and identically distributed Gaussian random variables having mean zero and variance $\sigma_U^2$. We consider the results of analyzing these data with a Tobit model, while ignoring the dependence. Robinson (1982) showed that, in this situation, one can estimate the mean and variance parameters consistently, but the asymptotic standard errors are different from those derived from the Hessian matrix for the Tobit model. Furthermore, one cannot consistently estimate the correlation parameter $\rho$. We collected some Monte Carlo evidence that supports these claims. Perhaps most disturbing, however, we found that the error in the estimation of the standard errors grew with the sample size. Resolving these problems would require an estimation method that properly accounts for the dependence. To specify the likelihood function appropriately requires integrating out the missing observation each time a zero is recorded, which can be computationally burdensome, especially in the event of many consecutive zero observations, as is common in our data set. Aside from dependence in the data, the presence of heteroskedasticity can also complicate inference. Hurd (1979) as well as Arabmazar and Schmidt (1981) demonstrated that heteroskedasticity within the Gaussian family could result in biased and inconsistent Tobit MLEs. In addition, the standard errors based on the Hessian matrix are also incorrect.
Arabmazar and Schmidt (1982) as well as Paarsch (1984) demonstrated that when the errors do not follow the Gaussian law, the Tobit MLE, assuming the Gaussian law, is biased and inconsistent; the standard errors based on the Hessian matrix are incorrect, too. In Appendix A.1, we provide some Monte Carlo evidence for these claims in the case when the errors follow the lognormal (Galton) law. These simple models underscore the importance of employing a model with the flexibility to account for the features of the data. The flexibility of our modeling framework also affords the ability to detect responses that take on a variety of forms. In each of the response patterns depicted in Fig. 1, the response is clearly detectable through a change in the mean of the process. This characterizes the sort of response that can be detected with the traditional forms of intervention analysis. In contrast, in Fig. 2, we expand the set of patterns of intervention responses shown in Fig. 1 to include responses in which the mean remains constant, but the changes take effect in the higher moments. In these mean-invariant intervention responses, the focus is on the cross-sectional distribution of the response, rather than focusing solely on the time dimension.
Fig. 2

Plots of four different mean-invariant intervention responses.

Plots of four different mean-invariant intervention responses. As a simple example, consider Panel (1) of Fig. 2, where the intervention manifests in the form of a change in variance with a constant mean. In Panel (2), the intervention response changes the kurtosis of the distribution, leaving all three lower moments unchanged. In our application to credit usage, these distributions were selected to model the following potential response. Suppose the pandemic affected two groups in opposite directions: For instance, suppose consumers with low balances became better off and paid down any debts they had. On the other side of the distribution, suppose consumers who borrowed heavily were made worse off, so that they accumulated more debt. This could leave the mean unchanged and emerge only as a change in variance. Now suppose the population is split into four groups: Some consumers with zero balances begin to borrow, while some consumers with extremely high balances cut back their lavish spending. This chain of events could result in a shift to a more platykurtic distribution, but one that does not change in terms of mean or variance. Panel (3) of Fig. 2 depicts another type of change: a zero-inflated exponential distribution, where the mean of the exponential is linked to the proportion of zeroes to maintain a constant mean of the mixed distribution. In this example, the variance and skewness change, but the mean remains constant. In our empirical application, this would be consistent with the notion that some consumers pay off their debts, while other consumers are driven deeper into debt. We also considered the possibility that the mean stays constant, but the mode of the distribution changes, as in Panel (4) of Fig. 2. To achieve this, we used a zero-inflated model with a two-parameter distribution—in this case, the beta distribution. In this example, the proportion of zeros increased, while the higher mode shifted upward. 
This would be a type of mean-invariant response that would be easier to detect for a given sample size because it occurs with large proportional changes in probability mass, particularly between the regions containing the modes. Below, we present a relatively simple, yet flexible, framework that can detect any one of these forms of responses, among many others. Moreover, it is a framework within which the transition from the boundary to the interior is admitted, and within which forms of Markovian dependence can be accommodated.
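As a check on the zero-inflated-exponential construction in Panel (3), the following sketch links the exponential mean to the zero mass so that the overall mean is held fixed while the variance moves; the parameter values are invented for illustration.

```python
import numpy as np

def zie_moments(p0, target_mean):
    """Mean and variance of a zero-inflated exponential in which the
    exponential mean m is linked to the zero mass p0 so that the
    overall mean stays fixed: (1 - p0) * m = target_mean."""
    m = target_mean / (1.0 - p0)
    mean = (1.0 - p0) * m
    # E[Y^2] = (1 - p0) * 2 * m^2 for the exponential component.
    var = (1.0 - p0) * 2.0 * m * m - mean ** 2
    return mean, var

# Hypothetical pre- and post-intervention parameterizations.
before = zie_moments(p0=0.2, target_mean=1.0)
after = zie_moments(p0=0.5, target_mean=1.0)
```

Both parameterizations have mean 1.0, but the variance rises from 1.5 to 3.0 as the zero mass grows, a mean-invariant response of the sort depicted in Panel (3) of Fig. 2.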

Baseline model

Even though the EDF is an extremely useful function in statistics, most users of statistics are more familiar with the histogram, which is obviously related to the EDF as well. Specifically, imagine dividing the support of $Y$ into $K$ mutually exclusive and exhaustive intervals $\{A_1, A_2, \ldots, A_K\}$, where $A_k = (c_{k-1}, c_k]$ for cut-points $c_0 < c_1 < \cdots < c_K$.2 In short, $Y$ is contained with certainty in one of the intervals, so $\Pr(Y \in A_1 \cup A_2 \cup \cdots \cup A_K) = 1$. The histogram is then the fraction of the sample whose values fall within each of those intervals. Unlike the EDF, which can deal with both continuous and discrete values of the outcome variable, when $Y$ is continuous, the histogram is an approximation to the probability density function (PDF) $f_Y(y)$; for discrete random variables, at the appropriate granularity, the histogram is an unbiased and consistent estimator of the probability mass function (PMF). In the case of balances, which are measured to the nearest cent, this may be important. One straightforward way to implement the histogram involves counting the number of outcomes in each interval, and then scaling that frequency by the total number of observed outcomes $N$. In short, letting $N_1$ denote the count of values in the interval $A_1$, $N_2$ the count of values in the interval $A_2$, and so forth, then
$$\hat{p}_k = \frac{N_k}{N}$$
for $k = 1, 2, \ldots, K$. Another, more complicated (but later useful) way to implement the histogram involves introducing the random vector $\boldsymbol{D} = (D_1, D_2, \ldots, D_K)$, which is defined as follows: $D_k$ equals one if $Y \in A_k$, and zero otherwise. Note that the values of $\boldsymbol{D}$ sum to one. The PMF of $\boldsymbol{D}$ corresponds to that of the well-known multinomial distribution, which can be written as
$$\Pr(\boldsymbol{D} = \boldsymbol{d}) = \prod_{k=1}^{K} p_k^{d_k}, \tag{3}$$
where $d_k \in \{0, 1\}$, $\sum_{k=1}^{K} d_k = 1$, and $\sum_{k=1}^{K} p_k = 1$. One way to parameterize $p_k$ in Eq. (3), which respects the restrictions imposed above, involves introducing the following logit transformation:
$$p_k = \frac{\exp(\beta_k)}{\sum_{j=1}^{K} \exp(\beta_j)}, \qquad \beta_1 = 0.$$
By adding up, $\sum_{k=1}^{K} p_k = 1$. The logit transformation is introduced because of its computational parsimony and numerical tractability. Specifically, although this transformation constrains $(p_1, p_2, \ldots, p_K)$ to the unit simplex, the parameters $\beta_k$ are contained in the real line, which is particularly useful when it comes to numerical optimization.
Note, too, that a one-to-one mapping exists between the relative frequencies, $p_k$ and $p_1$, and the logit parameter $\beta_k$; specifically,
$$\beta_k = \ln\!\left(\frac{p_k}{p_1}\right)$$
for $k = 2, 3, \ldots, K$. For notational parsimony, collect the parameters in the vector $\boldsymbol{\beta} = (\beta_2, \beta_3, \ldots, \beta_K)$. In terms of training this model, the appropriate estimation technique depends on the particular specification. Broadly speaking, we estimate $\boldsymbol{\beta}$ by maximizing the likelihood function, using common tolerance criteria for convergence to an optimum. A quasi-Newton method will often work; as in the case of indicator variables, however, the parameters can sometimes be concentrated out by using sample histograms on subsets of the data. For more complex models with continuous variables, the parameters are estimated by numerically optimizing the likelihood function, but these parameter searches will also be executed on subsets of the data, since the observations previously in a particular state are the only ones relevant for estimating the transition probabilities to the next state. In numerical terms, the Hessian matrix is block diagonal. We describe the calculations involved in performing the numerical optimization in Appendix A.2.
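The mapping between bin frequencies and logit parameters can be verified in a few lines; the balance values and cut-points below are invented for illustration.

```python
import numpy as np

# Hypothetical credit-card balances and bin cut-points.
balances = np.array([0, 0, 40, 350, 900, 2200, 2500, 7000, 12000, 30000.0])
edges = [0, 1, 500, 3000, 14000, np.inf]        # K = 5 intervals

# Histogram probabilities p_k: fraction of the sample in each interval.
counts = np.histogram(balances, bins=edges)[0]
p = counts / counts.sum()

# Logit parameters relative to the first bin: beta_k = log(p_k / p_1).
beta = np.log(p / p[0])

# The inverse (logit) mapping recovers the probabilities exactly.
p_back = np.exp(beta) / np.exp(beta).sum()
```

Because the mapping is one-to-one, `p_back` matches `p` to machine precision, and the unconstrained `beta` can be handed to a numerical optimizer in place of the simplex-constrained probabilities.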

Dependence

In many applications, persistent, potentially unobserved factors have almost surely been omitted, which can induce dependence among outcomes over time. Many different forms of dependence can exist. Perhaps the simplest generalization of the independent process assumed above is the first-order Markov process. To explain it, consider for each observation a sequence of random vectors generated over the time horizon. In general, the distribution of the current vector depends on all past realizations, not just those observed; the conditional PMF thus depends on the entire history. In order to make any headway in investigating the importance of dependence over time, a device must be introduced to limit the horizon of this dependence. The first-order Markov assumption does just that. Simply put, the first-order Markov assumption states that the previous period's vector is a sufficient statistic for the entire history up until the current period. Put another way, the information in the earlier history, even if it were observed, can be ignored in an empirical analysis. To illustrate how this process works, consider a simple example in which only two states exist—zero and one. Such a specification is often referred to as a mover-stayer model in the statistics literature. In the mover-stayer model, the dynamics of movements from state to state are governed by a transition matrix, whose elements in Eq. (6) are probabilities: element (j, k) of the transition matrix is the probability of transiting to state k given that the current state is j, so the elements are non-negative and, by adding up, each row sums to one. Under independence, the rows of the transition matrix are identical, so the mover-stayer model nests a model in which independence is assumed.
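A mover-stayer model of this kind is easy to simulate and to estimate from consecutive pairs of observations, which is the same identification strategy used for the wide panel in the text. In this sketch (the transition matrix and panel dimensions are assumed values), each row of the transition matrix is estimated as a conditional histogram:

```python
import numpy as np

# Simulate a mover-stayer panel under an assumed 2-state transition matrix,
# then recover the matrix from consecutive pairs of observations.
rng = np.random.default_rng(0)
P = np.array([[0.9, 0.1],     # row j: P(next = 0 | current = j), P(next = 1 | current = j)
              [0.3, 0.7]])

n, T = 5000, 4                # many individuals, few periods (a wide panel)
states = np.empty((n, T), dtype=int)
states[:, 0] = rng.integers(0, 2, size=n)
for t in range(1, T):
    u = rng.random(n)
    states[:, t] = (u > P[states[:, t - 1], 0]).astype(int)

# The MLE of each row is the conditional histogram of next-period states.
P_hat = np.zeros((2, 2))
prev, nxt = states[:, :-1].ravel(), states[:, 1:].ravel()
for j in range(2):
    m = prev == j
    P_hat[j] = np.bincount(nxt[m], minlength=2) / m.sum()
```

With 5,000 individuals and only three transitions each, the estimates are already precise, which illustrates why a short but wide panel suffices to identify the law of motion.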
This sort of discrete-state, first-order Markov process can be used to approximate a linear, first-order autoregressive [AR(1)] model of a continuous, positive random variable in logarithms, with independent and identically distributed standard normal errors. The intercept in Eq. (7) is a location parameter of sorts, the error coefficient is a scaling parameter, and the autoregressive coefficient controls the amount of linear dependence. In order for the process to be stationary (that is, for it not to explode as time proceeds), the absolute value of the autoregressive coefficient must be less than one. Under these assumptions, the unconditional distribution of the level then belongs to the lognormal family. The presence of point masses (for instance, at zero), however, makes it difficult to implement Gaussian autoregressive processes, such as those defined by Eqs. (7), (8), which is why we chose to discretize the problem. In keeping with the notation of Eq. (6), for each observation we specify a transition matrix of the general form given in Eq. (9).

Introducing feature variables

When a feature variable is introduced, the loss function depends on parameters collected in a vector. In this case, an extra subscript, followed by a comma, is added to the definition of the transition probability to alert the reader to the fact that this function now depends on the value of the feature variable. The loss function for the sample can be formed by adding up the individual terms. Obviously, including additional feature variables is straightforward, but computationally tedious.

Dealing with seasonality

One way to deal with seasonality is to assume that the transition probabilities change periodically, such as from month to month. This is as simple as introducing a set of binary feature variables in the above framework: for each period, define a vector of monthly indicators that has zeros everywhere except for a one in the relevant month, and enter it through the logit link function, so that the transition probabilities vary by calendar month. If seasonal indicators are the only covariates, as in Ho et al. (2021), then the maximum likelihood estimator is equivalent to estimating a series of separate transition matrices, as in Eq. (9), except that the sample for each matrix is restricted to the observations corresponding to each season. Further, the rows of these transition matrices are estimated by calculating a series of histograms, conditioning on the category in the previous observation.
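The equivalence between monthly indicators and separately estimated monthly transition matrices can be sketched directly. In this sketch the function name and toy inputs are ours:

```python
import numpy as np

def monthly_transition_matrices(prev_state, next_state, month, K):
    """Seasonal MLE with mutually exclusive monthly indicators: one K-by-K
    transition matrix per calendar month, each row a conditional histogram
    over that month's transition pairs."""
    P = np.zeros((12, K, K))
    for m in range(12):
        in_month = month == m
        for k in range(K):
            mask = in_month & (prev_state == k)
            if mask.any():
                P[m, k] = np.bincount(next_state[mask], minlength=K) / mask.sum()
    return P

# Toy data: state 0 always moves to 1 in month 0 and always stays in month 1.
prev = np.array([0, 0, 0, 0])
nxt = np.array([1, 1, 0, 0])
month = np.array([0, 0, 1, 1])
P_monthly = monthly_transition_matrices(prev, nxt, month, K=2)
```

Because the monthly indicators are mutually exclusive and exhaustive, each matrix is estimated on a disjoint subset of the transition pairs, which is what makes the likelihood separable.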

Measuring the effect

How do we decompose the observed distribution of the variable of interest? That is, how do we distinguish between dependence in the data and the effect of the intervention on this variable? We estimate the model up to the date before the intervention. Then, based on those estimates, we calculate the one-, two-, three-, …, h-month-ahead forecast distributions according to the Markov model in Eq. (10). We then compare the distributions generated by Eq. (10) to those actually obtained. As a summary measure of the difference, we consider the percentage difference between what actually obtained and what was predicted to obtain in each cell. This object can be plotted on the ordinate versus the various cells on the abscissa to provide the reader with a visual description of how the variables changed relative to what would have been predicted before the intervention. A second statistic facilitates formal statistical testing of the difference between the two distributions. This statistic was first proposed by Kullback and Leibler (1951) and is commonly referred to as the Kullback–Leibler divergence criterion. It is based on a concept of information introduced by Shannon (1948). Belov and Armstrong (2011) demonstrated that, for a pair of continuous distributions, a version of this statistic has a limiting (asymptotic) chi-squared distribution with one degree of freedom. Parkash and Mukesh (2013) investigated the case of discrete distributions, which corresponds to our application, determining that the statistic has a limiting chi-squared distribution, but with degrees of freedom equal to the number of categories minus one. Song (2002) demonstrated that the Kullback–Leibler divergence statistic is asymptotically equivalent to the likelihood-ratio statistic for detecting a difference between distributions.
In our application of this statistic, however, many parameters must be estimated, so it remains an empirical question as to the sample size that achieves the asymptotic distribution. We document this in the following section.
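The forecasting-and-comparison step can be sketched compactly. This is a minimal version; the row-stochastic convention and the 2n scaling of the divergence are our reading of the test described above:

```python
import numpy as np

def forecast_distribution(pi0, P, h):
    """h-step-ahead forecast: repeatedly multiply the distribution by the
    transition matrix (rows of P sum to one)."""
    pi = pi0.copy()
    for _ in range(h):
        pi = pi @ P
    return pi

def kl_statistic(p_obs, p_fcst, n):
    """Kullback-Leibler divergence between the observed and forecasted
    distributions, scaled by 2n so that it can be referred to a
    chi-squared distribution with (number of categories - 1) degrees
    of freedom (the scaling is our assumption)."""
    m = p_obs > 0
    return 2.0 * n * np.sum(p_obs[m] * np.log(p_obs[m] / p_fcst[m]))
```

The statistic is zero when the observed and forecasted distributions coincide, and it grows with the sample size for any fixed discrepancy, which is what gives the test its power in wide panels.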

Simulation evidence

In this section, we provide two types of simulation evidence concerning the effectiveness of our test statistic. First, we verify the limiting distribution of our statistic under the null hypothesis for a reasonably large sample size. Second, we conduct a power analysis to determine the magnitude of a deviation from the null hypothesis that can be detected with this statistic, as a function of the size of the sample.

Empirical distribution

We implemented a parametric bootstrap simulation of our forecast statistic. Under the null hypothesis of no change in the transition matrix, first-order asymptotic distribution theory predicts that the Kullback–Leibler divergence statistic follows a chi-squared distribution when comparing the forecasted probabilities to those observed under the data-generating process. We verify this claim here. We begin our investigation by defining a transition matrix with zero-inflated lognormal distributions in the columns. This matrix was chosen to approximate the transition matrix actually obtained from the observations in the empirical example in the next section. In particular, it imposes a limiting distribution with a long tail and an atom at a boundary. Across the categories, the zero probabilities decrease and the lognormal means increase to place most weight on the diagonal elements. The simulation is primed with a vector of starting values with frequencies drawn from the multinomial distribution with probabilities from the ergodic distribution defined by the transition matrix. This vector has an integer count of individuals across the categories, including the zero category. For each period in the estimation sample, we generated the state vector of counts for the next period by looping down the categories and taking draws from the multinomial distribution with probabilities defined by the corresponding column of the transition matrix. These are draws of vectors of integers whose sums add up to the bin count from the last period.3 After the final in-sample period, the calculation of the estimated transition matrix was completed for each bootstrap replication. For the out-of-sample periods, the data-generating process continued to produce the next vectors recursively, with the true transition matrix, as before. This generated realized counts of individuals in the categories at each period.
We also computed forecasts by left-multiplying the estimated transition matrix with the last vector of relative frequencies from the in-sample period. For each of the bootstrap replications, we calculate the Kullback–Leibler distance between the forecasted and realized distributions. As shown in Fig. 3, this distance statistic follows the chi-squared distribution with degrees of freedom equal to the number of categories minus one.
Fig. 3

Bootstrap distribution under the null hypothesis.
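A stripped-down version of this parametric bootstrap can be sketched with a small three-category chain standing in for the zero-inflated lognormal grid; for simplicity, this sketch forecasts with the true transition matrix rather than an estimated one, and all numerical values are assumed:

```python
import numpy as np

rng = np.random.default_rng(2)
P = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])   # rows: P(next = k | current = j)
pi0 = np.array([0.4, 0.4, 0.2])
n, T, reps = 10_000, 3, 500

stats = []
for _ in range(reps):
    counts = rng.multinomial(n, pi0)             # prime with starting counts
    for _ in range(T):                           # propagate category by category
        new = np.zeros(3, dtype=int)
        for k in range(3):
            new += rng.multinomial(counts[k], P[k])
        counts = new
    p_obs = counts / n
    p_fcst = pi0 @ np.linalg.matrix_power(P, T)  # deterministic forecast
    m = p_obs > 0
    stats.append(2 * n * np.sum(p_obs[m] * np.log(p_obs[m] / p_fcst[m])))
stats = np.array(stats)
# Under the null, the statistic behaves like chi-squared with K - 1 = 2
# degrees of freedom, so its mean across replications should be near 2.
```

Because the individuals evolve independently, the terminal counts are exactly multinomial with probabilities given by the forecast, which is why the chi-squared limit applies.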

Power analysis

We defined local alternatives in a particular fashion to approximate the estimates from the data in our empirical application to credit-card balances. We increased the probability mass in the zero category by a multiplicative factor, moving probability mass proportionately from the other categories. We observe 99% rejection probability under the alternative with a 10% deviation in the zero category, and 55% rejection probability under a smaller deviation. This indicates that a sample size of 10,000 individuals can detect deviations of 10% of the probability mass in the zero category with power near one—that is, a change from 15% to 13.5% in the zero category, for instance. A larger sample size can detect a reallocation of 1% of the probability mass. A pair of distributions from the null hypothesis and the alternative hypothesis are depicted in Fig. 4. The dashed vertical line represents the 5% critical value of the chi-squared distribution with degrees of freedom equal to the number of categories minus one, with 55% of the probability mass of the distribution of the alternative in the rejection region.
Fig. 4

Bootstrap distribution under the null and alternative hypotheses.
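A power calculation of this form can be sketched as follows; the null probabilities and the deviation factor are assumed values, not the paper's estimates:

```python
import numpy as np

# Power sketch for a zero-inflated alternative: multiply the mass in the
# zero category by delta and rescale the remaining categories proportionately.
rng = np.random.default_rng(3)
p_null = np.array([0.15, 0.45, 0.40])
delta = 1.10                                   # 10% more mass on zero
p_alt = p_null.copy()
p_alt[0] *= delta
p_alt[1:] *= (1.0 - p_alt[0]) / p_alt[1:].sum()

n, reps = 10_000, 1000
crit = 5.991                                   # chi-squared(2), 5% critical value
rejections = 0
for _ in range(reps):
    p_obs = rng.multinomial(n, p_alt) / n      # draw a sample from the alternative
    m = p_obs > 0
    stat = 2 * n * np.sum(p_obs[m] * np.log(p_obs[m] / p_null[m]))
    rejections += stat > crit
power = rejections / reps                      # rejection frequency
```

Repeating the loop over a grid of `delta` values and sample sizes traces out the power curves described in the text.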

We then considered a set of local alternatives of this form to calculate power curves. That is, we increased the probability mass in the zero category by a multiplicative factor and shifted this mass proportionately from the other categories, with a factor of one under the null hypothesis and factors up to 10.0 under the alternative hypotheses. The power curve appeared invariant across sample sizes, in terms of the number of individuals each period. In particular, the power curve for a sample size of 20,000 was hardly distinguishable from that for a sample size of 50,000. This confirmed that the test has the power to detect deviations in probability mass inversely proportional to the square root of the sample size. Although this example illustrates the effectiveness of our method at detecting a change in the distribution, this form of response, with a changing mean, can also be detected with the traditional form of intervention analysis. We also analyzed the mean-invariant intervention responses shown in Fig. 2. The intervention pattern in Panel (3) is closely related to the example above, in that it employs a sequence of zero-inflated exponential distributions. In this example, in contrast, the exponential parameter is linked to the proportion of zeros to maintain a fixed mean. We investigated several alternatives in which the parameter of the exponential component of the mixture is set to 0.40 under the null distribution. With a modest sample size of 10,000 in the cross-section, the test achieves 99.3% power under an alternative that corresponds to a weight of 0.125 on zero and a parameter on the exponential distribution of 0.38. This sort of change implies a minor change in the variance or skewness of the distribution: the standard deviation rises roughly 2%, from 2.53 to 2.58, and the skewness rises less than 1%, from 2.052 to 2.056.
In this example, the changing weight on zero drives the power of the test, although it requires a larger change: 25% more probability mass on zero, rather than the 10% in the zero-inflated lognormal case above, which had a changing mean response. Overall, this indicates that our framework can detect such changes even when the mean is unchanged, in which case the traditional form of intervention analysis would be ineffective. Panel (1) of Fig. 2 illustrates a simple example of a case in which the variance changes in response to an intervention but the mean remains constant. To examine this case, we generated alternatives from normal distributions with a common mean but different standard deviations. As in the case of Panel (3) described above, this response to an intervention would not be detected under the traditional framework for intervention analysis. Within our flexible framework, however, the effect can be detected: we found a power of 99.8% for an alternative with a standard deviation of 2.7, an 8% increase in variance. For the third case of our power analysis, we considered the model in Panel (2) of Fig. 2, in which the first three moments remain constant and the kurtosis changes. We used a mixture of two normal distributions: some weight on a first normal distribution, with the remaining probability mass placed on a second normal distribution with the same mean and, under the null, the same standard deviation. This results in a normal distribution under the null, with an excess kurtosis of zero. For the alternative distributions, we set the variance of the second component so as to impose a constant variance of the mixture distribution across the alternatives.
With a sample size of 10,000, the test had 99.5% power to detect an alternative whose resulting mixture had an excess kurtosis of 0.72. Thus, even with the first three moments matching, our framework can detect small changes in the fourth moment with a modest sample size of 10,000 in the cross-section. Finally, we conducted a simulation to determine the ability of our model to detect a response of the form of Panel (4), in which the mode of the distribution changed while the mean remained the same. This sequence of alternatives also featured a zero-inflated distribution, with an atom at zero and the remaining probability mass on a beta distribution. We fixed one shape parameter of the beta distribution under the null and alternatives, and varied the other to impose a constant mean. We found a rejection probability of 99.6% with a sample size of 10,000 for an alternative that places 0.12 probability mass on zero. Although the mean was held the same throughout, the distributions can have a large degree of divergence with this sort of alternative, since the null and alternative distributions differ greatly in terms of the regions of the support with high and low probability mass. In each of these simulations, we took draws from the null and alternative distributions to analyze the performance of our statistic. In our empirical application, however, we also estimated the transition matrices that determine the path to the null distribution. The question remains whether the measurement of the response is still reliable when the forecast distribution is calculated from estimated transition matrices. To study this question, we conducted another set of simulations by drawing the transition matrix from the asymptotic distribution of the maximum-likelihood estimators. For the log-odds of the probabilities in each of the columns of the transition matrix, we obtained the estimate and the Hessian matrix.
In each bootstrap replication, we drew from the multivariate normal distribution with mean at the estimates and covariance matrix the negative of the inverse of the Hessian matrix. These log-odds were then transformed into probabilities and inserted as columns of the bootstrap transition matrix . We simulated the model with the zero-inflated lognormal distribution with a changing mean to illustrate a case that more closely matches that found in the credit-card data. For sample sizes in the tens of thousands, the results were not perceptibly different from those above, as would be expected from the asymptotic distribution theory for the maximum-likelihood estimator of the parameters of the multinomial distribution.
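One such bootstrap draw can be sketched for a single column of the transition matrix; the point estimates and covariance matrix below are assumed values for illustration, whereas in the application they come from the estimate and Hessian for each column:

```python
import numpy as np

rng = np.random.default_rng(4)
theta_hat = np.array([-0.5, -1.2])       # estimated log-odds vs. the reference category
cov = np.array([[0.020, 0.005],
                [0.005, 0.030]])         # negative inverse Hessian (assumed)

def draw_column(theta_hat, cov, rng):
    """One bootstrap draw: sample the log-odds from the asymptotic normal
    distribution of the MLE, then map them back onto the simplex."""
    theta_b = rng.multivariate_normal(theta_hat, cov)
    z = np.concatenate(([0.0], theta_b))
    p = np.exp(z - z.max())
    return p / p.sum()

p_b = draw_column(theta_hat, cov, rng)   # one column of the bootstrap matrix
```

Repeating the draw for every column assembles a full bootstrap transition matrix whose sampling variation reflects the estimation error in the log-odds.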

Empirical application

To apply our framework for intervention analysis, we investigated the credit usage of Canadian consumers during the COVID-19 pandemic. Here, we first describe the data set and then demonstrate that our method is useful and appropriate for analyzing these data.

Consumer credit data

Through a contract with the Bank of Canada, we acquired access to monthly anonymized data from TransUnion®, one of the two credit bureaus in Canada. The data set contains account-level balances on credit cards from January 2017 to September 2020. We aggregated account-level balances to the individual level, in order to study individuals’ decisions and to avoid complications from individuals’ choice-of-card decisions (Felt et al., 2021). The data set also contains consumers’ credit scores as well as the encrypted postal codes of their primary residential addresses.4 Following Bhutta and Keys (2016), we defined consumers to be homeowners if they ever had a mortgage or a home-equity line of credit while living at their current postal code. For the analysis in this paper, a random 1% sample of individuals was constructed from the entire data set, based on the power analysis presented above. There are 290,436 credit-card holders in this sample with a total count of 10,528,372 observed monthly balances, of whom 124,229 are homeowners, accounting for 4,803,515 monthly observations. Once a sample data set of individual-level balances was obtained, we assigned the continuous balance variables to discrete balance categories. Instead of using evenly spaced bins, we organized consumer balances into intervals of increasing width to account for the lengthy tail of the distribution. The balance categories were sorted into intervals of width $250 up to $1500, intervals of $500 up to $6000, $1000 up to $10,000, $2000 up to $20,000, $5000 up to $30,000, one $30,000–$40,000 category, and a category of $40,000 and above in the tail. This grid of categories is representative of the variety of credit lines typically offered to different types of consumers in different risk categories. We defined the histogram bins according to risk categories for this particular application, because this aligns with the most commonly occurring credit lines.
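The grid described above can be written out explicitly. The sketch below is our reconstruction, assuming right-closed intervals and a separate atom for exact zeros:

```python
import numpy as np

# Upper boundaries of the positive-balance bins, following the widths in
# the text; exact zeros form their own atom category below the first bin.
edges = np.concatenate([
    [0.0],
    np.arange(250, 1501, 250),      # $250-wide bins up to $1,500
    np.arange(2000, 6001, 500),     # $500-wide bins up to $6,000
    np.arange(7000, 10001, 1000),   # $1,000-wide bins up to $10,000
    np.arange(12000, 20001, 2000),  # $2,000-wide bins up to $20,000
    np.arange(25000, 30001, 5000),  # $5,000-wide bins up to $30,000
    [40000.0, np.inf],              # $30,000-$40,000, then $40,000 and above
])

def balance_category(balance):
    """Map balances to categories: 0 for an exact zero, then 1..28 for the
    positive intervals (closure on the right is an assumption on our part)."""
    b = np.asarray(balance, dtype=float)
    cat = np.digitize(b, edges[1:], right=True) + 1
    return np.where(b == 0.0, 0, cat)
```

With the zero atom included, this grid has 29 categories, which would put 28 degrees of freedom on the divergence test.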
We divided consumers into sets of customers that financial institutions treat similarly. In other circumstances, one would start with common techniques for defining histogram bins. These boundaries would then be refined with knowledge of the relevant boundaries in the data, such as the bound at zero in our application. Finally, any atoms in the data should be placed in their own bins, separate from the adjacent observations. With these factors taken into account, we also recommend that some bins be combined if separating them would result in a sparse transition matrix. For this reason, the bins in our application increase in width for the categories of customers with higher balances. Our aim is to characterize changes in credit usage during the pandemic. Fig. 5 conveys this information through a series of histograms. The columns furthest back represent the category with the most consumers: those with zero balances. The next most common category comprises consumers with balances between $0 and $250. Clearly, membership in the low-balance categories increased in the first six months of the pandemic (March to September 2020). In the spring of 2020, more consumers had a balance of less than $250 than in any other period in the sample. Over the sample period from January 2017 to June 2020, the average monthly balance was $4064; by June 2020, it had dropped to roughly $3205. This represents a decline of 20%, following sustained pre-pandemic year-over-year growth of 2.7%. In addition to the COVID-19 pandemic effect, a clear seasonal pattern also exists in the low-balance categories: average balances are higher in the fourth quarter of every year (the holiday season) and decrease in the first quarter of every year. Our statistical framework aims to quantify these changes during the pandemic, while accounting for the counterfactual path over time.
Fig. 5

Histograms of account-specific credit-card balances.

At the individual level, four main features of the data set are accommodated nicely by our empirical framework. First, the distribution in a given month has a nonstandard shape, which is not characterized adequately by a small number of moments. Most notably, the highest bars at zero in Fig. 5 show that a substantial fraction (perhaps as much as 15%) of individuals in the sample have zero balances, precisely on the boundary. Second, the series exhibit strong seasonality: the distribution is expected to shift leftward during the spring months, which is precisely when we aim to measure the effect of the pandemic. Third, considerable dependence exists over time; for example, if an account had a zero balance at the end of last month, then it is very likely (in general, over 50% of the time) to have a zero balance at the end of this month.5 The framework that we have proposed has the flexibility to accommodate all of these characteristics of credit-card balances.

Modeling procedure

With the data set of credit-card balances, we show by example how our framework is used to detect changes in the distribution in response to the intervention. A researcher should follow this modeling procedure:

1. Specify the explanatory variables to predict transition probabilities.
2. Estimate the transition probabilities by calculating the parameter values that maximize the value of the likelihood function.
3. Use the estimated parameter values to calculate the predicted transitions of the mass of consumers in each category, beginning with the initial proportions.
4. Calculate the actual proportions of consumers in each category throughout the sample.
5. Calculate the KLD statistic for the comparison of the predicted and actual proportion vectors for each time period within the sample.
6. The next step depends on whether the KLD statistic detects anomalies in the proportion of consumers in each category.
(a) If the KLD statistic detects differences in proportions, then modify the list of explanatory variables. Normally, this will involve adding new explanatory variables that have variation related to the anomalies found in the tests with the KLD statistic.
(b) If the KLD statistic does not detect anomalies, then the model is adequate to detect changes out of sample.

Although the model that results from Step 6(b) is adequate to detect a response to the intervention, the researcher may wish to employ a richer model, with additional explanatory variables. This would provide the added benefit of more precise predictions of the transition probabilities, thus allowing the researcher to detect changes of a smaller magnitude or to detect similar changes with a smaller sample size in the cross-section. Furthermore, the addition of explanatory variables would be appropriate for testing hypotheses involving differences in responses by subsets of the population or in a way that is related to other explanatory variables.
In fact, a measurement of the differences in response between subsamples without including related variables would confound the measurement of the law of motion with any differences in the distribution across those subsamples. This would introduce inconsistency into the estimation of the response from mismeasuring the counterfactual post-intervention law of motion. This suggests that the researcher should consider a model that goes beyond simply being adequate to detect changes without false positives within the pre-intervention sample. The inclusion of additional variables should be guided by the question at hand. In our empirical application, the pandemic effect must be measured against a benchmark that includes the differences in the distribution of credit-card balances across credit-score categories.
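The procedure, specialized to the simplest case without covariates, can be sketched end to end; the function names and toy input are ours:

```python
import numpy as np

def kld_stat(p_obs, p_fcst, n):
    m = p_obs > 0
    return 2.0 * n * np.sum(p_obs[m] * np.log(p_obs[m] / p_fcst[m]))

def run_procedure(states, n_categories):
    """Steps 2-5 for a 'Fixed'-style specification: estimate one transition
    matrix from consecutive pairs, roll the initial proportions forward,
    and compare forecast to actual proportions in each period. Step 6 then
    compares the statistics to a chi-squared critical value."""
    n, T = states.shape
    K = n_categories
    P = np.zeros((K, K))
    prev, nxt = states[:, :-1].ravel(), states[:, 1:].ravel()
    for k in range(K):                    # Step 2: conditional histograms
        m = prev == k
        if m.any():
            P[k] = np.bincount(nxt[m], minlength=K) / m.sum()
    pi = np.bincount(states[:, 0], minlength=K) / n
    stats = []
    for t in range(1, T):
        pi = pi @ P                       # Step 3: predicted proportions
        p_obs = np.bincount(states[:, t], minlength=K) / n   # Step 4
        stats.append(kld_stat(p_obs, pi, n))                 # Step 5
    return np.array(stats)
```

If the in-sample statistics exceed the critical value, Step 6(a) calls for adding explanatory variables whose variation is related to the anomalies; otherwise the model is ready for out-of-sample detection.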

Models

We estimated four models that seek to characterize the law of motion of credit-card balances before the pandemic. These were then used to construct a counterfactual forecast with which to compare consumer behavior during the pandemic. The approach we utilized is similar to that for “excess mortality” in the demography literature; see Statistics Canada (2020). Our simplest model, labeled “Histograms”, serves as a benchmark for the others. We compared the histograms month-by-month during 2020 to those estimated over the sample period from January 2017 to January 2020. This model takes into account the seasonality across the months of the calendar year. It is also simple to estimate: it requires only the free parameters of 12 histograms, which are estimated very precisely given our sample size. The histograms, however, do not take into account the dependence noted in Fig. 5. The next model, labeled “Fixed”, accounts for dependence between balances in consecutive months. It assumes a first-order Markov process for the transitions between months. The parameters in this model are the transition probabilities that govern the movement of consumers between balance categories each month. A single transition matrix is estimated from pairs of observations over the entire sample. Using the transition matrix, we constructed counterfactual forecasts for the distribution of credit-card balances. We initialized the forecasts with the proportion of consumers in each balance category observed in January 2020, reflecting activity recorded during the month of December 2019.6 We then calculated a series of forecasts by left-multiplying the vector of proportions by the transition matrix for each month. The forecast for each subsequent month is calculated in the same way, using the forecast from the period before.
We considered a third model, labeled “Monthly”, by augmenting the “Fixed” model with a simple set of monthly indicators.7 A separate transition matrix is estimated for each calendar month, using the monthly indicators as covariates. It combines the characteristics of the “Histograms” model and the model with the “Fixed” transition matrix. Since these monthly indicators are mutually exclusive and exhaustive, each transition matrix is estimated using only the observations that correspond to the particular pairs of months. This alleviates the computational burden of estimating many more parameters. The model with monthly transition matrices has 12 times as many parameters as the “Fixed” model—some 9,744 parameters. In the fourth model, labeled “Covariates”, we used two variables in addition to the seasonal indicators as covariates to estimate the transition probabilities. We constructed a categorical variable on creditworthiness by dividing credit scores into three categories. Consumers with a credit score below 700 are placed in the “Low” credit-score category; those with credit scores between 700 and 839 are considered “Medium”; and those with credit scores of 840 or above are allocated to the “High” category.8 The other variable included is a home-ownership indicator. Including an interaction term to separate homeowners within each credit-score category, there are 10 times as many parameters, adding up to a total of 97,440 parameters.

Empirical results

We evaluated these models by performing a series of tests to detect differences between the observed distributions and the forecasted ones. In this set of tests, we compared the benchmark with an h-step-ahead forecast. In Table 1, we collect the results of this series of comparisons for the four models. The column labeled “Histograms” shows the comparison to sample histograms; “Fixed” denotes the comparison to the forecast with a fixed transition matrix; the “Monthly” column refers to the forecasts using separate transition matrices for each month; and the “Covariates” column refers to the same statistic using separate transition matrices, each estimated with credit-score and home-ownership categories. The statistic is the Kullback–Leibler divergence statistic comparing the h-step-ahead forecasted distribution with the observed sample distribution. The p-value columns show the probability of observing a more extreme statistic under the chi-squared distribution with 28 degrees of freedom—the number of balance categories minus one. For reference, these divergence statistics should be compared to the critical values of 41.34, 48.28, and 56.89, corresponding to the 5%, 1%, and 0.1% (10 basis-point) levels of significance.
Table 1

h-step-ahead forecasts from alternative models.

                 Histograms  p-value     Fixed  p-value   Monthly  p-value  Covariates  p-value
January 2020          357.1   0.0000     508.4   0.0000      60.3   0.0004        56.2   0.0012
February 2020         329.8   0.0000   1,142.7   0.0000      68.4   0.0000        55.9   0.0013
March 2020            261.0   0.0000   2,078.4   0.0000     234.6   0.0000       134.5   0.0000
April 2020          3,981.9   0.0000   8,124.9   0.0000   5,326.5   0.0000     4,168.5   0.0000
May 2020            6,718.1   0.0000  10,174.7   0.0000   8,693.4   0.0000     6,577.4   0.0000
June 2020           4,416.8   0.0000   6,529.4   0.0000   6,453.2   0.0000     4,405.6   0.0000
July 2020           2,679.1   0.0000   5,138.6   0.0000   4,494.3   0.0000     2,741.5   0.0000
August 2020         2,358.3   0.0000   4,115.0   0.0000   3,874.1   0.0000     2,566.6   0.0000
September 2020      2,219.5   0.0000   3,809.5   0.0000   3,919.0   0.0000     2,705.0   0.0000
All models detected a statistically significant shift in the distributions that occurred in April 2020. This was followed by a larger deviation in May 2020 that subsided over the subsequent months, although a large difference persisted thereafter. The models differed in their characterization of the period before the pandemic. For the model with monthly transitions, the statistics up to March did not wander far from those expected under the null hypothesis of no change, especially considering that they were calculated with tens of millions of observations. The fit was even closer for the forecasts from the model with credit-score and home-ownership covariates. As a measure of goodness of fit, we also compared the deviations from predictions from one-step-ahead forecasts. We document in Appendix A.3 that the model with a fixed transition matrix is not suited to the data, as it erroneously detects changes in the months before the pandemic, even when conditioning on the distribution in the previous month. This problem was largely absent from the one-step-ahead forecasts with the model with monthly transition matrices: the distance statistics were all within a reasonable distance of the critical values at conventional levels of significance. This suggests that the model with monthly transition matrices is as good as it needs to be to detect a difference in distributions, and that no further complexity is warranted to answer the question of whether a change has occurred. The decision over which model is appropriate, if any, can be settled by analyzing the performance of the model within the pre-intervention period. We conducted this comparison for two of the models above and present the results in Appendix A.4. The model with a fixed transition matrix appears to detect changes that follow a pattern through the seasons. The tendency for this model to detect false positives will degrade the accuracy of the measurement of the intervention effect. 
The richer models with monthly seasonality do not appear to raise false positives through the pre-pandemic period. The fact that these models produce forecasts that do not detect deviations in the pre-pandemic period shows that they provide a reliable benchmark for analyzing the effects of the pandemic. The model with covariates, however, is the appropriate choice, since we aim to detect effects specific to subsets of consumers with different home-ownership statuses and risk profiles. This is evident in that the histograms and the fixed, non-seasonal model erroneously detect differences in credit-card balances in the early months. On closer inspection of the fixed model, we found that this behavior is also observed in each year of the sample, since the fixed transition model does not account for the annual pattern of declining balances after the holiday season through March and April, when many consumers receive tax refunds. The model with histograms estimated over the sample period does not account for the year-over-year growth of credit usage. We then analyzed the path of credit-card balances throughout the pandemic, using home-ownership status and credit scores as explanatory variables, with the predictions from the “Covariates” model. This provided deeper insight into the findings of Ho et al. (2021) that consumers reduced their overall level of borrowing from credit cards and home-equity lines of credit (HELOCs). It remains an empirical question whether the pattern differs for consumers without access to home equity and/or other forms of credit. The changes in the distribution of credit usage for homeowners with medium credit scores are depicted in Fig.
6. The results in May and August 2020 form bookends on the transition through the first wave of the pandemic, since most Canadian provinces implemented some form of economic lockdown in April and gradually reopened in July. In May, the proportion of these consumers with balances from $0 to $500 (the lowest three categories) increased by nearly 30%. There was a larger fraction of consumers with balances less than $2,000, while the largest decreases occurred for those with balances ranging from $8,000 to $10,000. By August, the effect had weakened but was still significant. Generally, smaller changes are observed in balance categories below $14,000, while the greatest decreases occurred in balance categories above $16,000. The findings are similar for non-homeowners, except that the changes take place in lower-balance categories. Overall, the changes in credit usage largely mimic those of the full population, similar to the results in Ho et al. (2021). Our results provide evidence that the decrease in credit-card balances can be attributed to reduced spending during the lockdown. Although the gradual reopening of the economy may have attenuated the effect, the high spending that commonly occurs in the summer months was still missing during the pandemic.
Fig. 6

Deviations from forecasted credit-card balances for homeowners with medium credit scores.
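The deviations plotted in Figs. 6 through 8 are proportional changes by balance category, that is, observed counts relative to the counterfactual forecast. A minimal sketch of this calculation, with hypothetical counts, might look like:

```python
def proportional_deviations(observed, forecast):
    """Per-category deviation of observed counts from forecast counts,
    expressed as a proportion: +0.30 means 30% more consumers than the
    counterfactual forecast predicts for that category."""
    return [(o - f) / f for o, f in zip(observed, forecast)]

# Hypothetical counts for the three lowest balance categories ($0 to $500)
forecast = [1000.0, 800.0, 600.0]  # counterfactual forecast
observed = [1300, 1040, 780]       # observed during the pandemic
devs = proportional_deviations(observed, forecast)  # each +0.30
```

A value of +0.30 across these categories would correspond to the roughly 30% increase in low-balance consumers reported for May 2020.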

We further investigated the changes for consumers with the tightest credit constraints—that is, those with low credit scores and no home equity to use as collateral for a loan. Fig. 7 depicts the proportional changes in balance categories for these consumers in May and August 2020. The scale is magnified because the proportional changes are much larger: nearly 40% more consumers had positive balances below $250. The change in the proportion of consumers with zero balances is even larger: the forecast for August 2020 predicts 1187.8 consumers with balances of zero, whereas 2367 consumers are actually observed in that category. Overall, the May and August distributions for these consumers are more similar to each other than are those of the homeowners with medium credit scores in Fig. 6. This suggests a stronger and sustained reduction in credit usage by consumers with the tightest credit constraints. This finding is interesting because, first, consumers with tight credit constraints did not become more indebted, despite the unemployment rate surging from 7.9% in March to 13.7% in May 2020. Second, these consumers did not change their credit usage when unemployment subsided to 10.2% in August as the economy reopened.
Fig. 7

Deviations from forecasted credit-card balances for non-homeowners with low credit scores.

At the other extreme, homeowners with high credit scores show a different pattern of changes, as seen in Fig. 8. The proportion of these consumers in the very-low-balance categories does increase, just as it does for consumers in the other groups, but to a lesser degree. In May, there was a large decrease in the number of these homeowners with balances from $3,000 to $14,000. The shift in these intermediate balance categories is much more pronounced than in the rest of the population, with changes of up to 50%. A subset of these creditworthy homeowners increased their use of credit during the pandemic: we recorded 30% more consumers in this group with balances from $25,000 to $30,000, and even more with balances above $30,000. Although we observed large percentage changes, very few consumers experienced them: most homeowners with high credit scores do not hold high credit-card balances. Also note that in August 2020, the effect of COVID-19 on the credit usage of these consumers was more moderate than for other groups. These results suggest that affluent homeowners reduced their use of credit during the peak of the first wave of COVID-19, possibly due to limited spending opportunities in an economic lockdown. This group of creditworthy consumers had the strongest response in credit usage during the economic recovery.
Fig. 8

Deviations from forecasted credit-card balances for homeowners with high credit scores.


Summary and conclusions

We developed an approach to intervention analysis that avoids modeling the moments of the population. Our approach features a discretized distribution that is modeled to follow a Markov process, with transition probabilities that may depend on covariates. This tool proved very useful in modeling transitions with seasonality and conditioning on covariates. The result is a flexible framework for conducting intervention analysis, with little risk of bias from misspecification. The flexibility of this approach creates a reliable benchmark for measuring the effects of an intervention and is resilient to what would otherwise be inconvenient features of the data. This approach is useful for modeling the behavior of a large population of individuals when conventional methods of intervention analysis may fail. Our procedure can be extended to many other applications. For instance, this modeling approach could be used to detect responses to policy changes. Leverage ratios in mortgage lending, such as the loan-to-value ratio, are known to have unique clustered distributions due to lending regulations (Bilyk et al., 2017). Changes in mortgage stress tests, such as lowering the debt-service ratio, may induce nontrivial distributional effects in the tail of the distribution that may not be noticed in lower moments (Bilyk & teNyenhuis, 2018). Changes in the distribution of loan balances, such as an increasing portion of loans with very high balances or a deterioration in loan quality, may have important tail-risk implications for a financial institution's loss-given-default modeling (for example, Yao et al., 2015), consequently affecting its capital buffer requirement. In addition, this modeling approach can be used to detect changes in a population to guide a dynamic model-fitting strategy: perhaps a change will trigger a model rebuild or, at least, a further evaluation of the performance of an existing credit-risk model.
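As a rough sketch of the two ingredients summarized above (multinomial-logit transition probabilities that depend on covariates, and the Markov law of motion for the discretized distribution), the following illustrates the mechanics. The covariates, coefficients, and transition matrix are hypothetical, and this is not the authors' implementation:

```python
import math

def transition_probs(x, betas):
    """Multinomial-logit transition probabilities out of one balance state.
    x: covariate vector (e.g. month dummies, homeownership, credit score);
    betas: one coefficient vector per non-base destination state (the base
    state's coefficients are normalized to zero)."""
    utilities = [0.0] + [sum(b * xi for b, xi in zip(beta, x)) for beta in betas]
    exps = [math.exp(u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def step(dist, P):
    """One period of the Markov law of motion for the discretized
    cross-sectional distribution: new_j = sum_i dist_i * P[i][j]."""
    k = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(k)) for j in range(k)]

# Hypothetical two-state example: covariates x = (intercept, homeowner dummy)
P_row = transition_probs([1.0, 1.0], [[0.5, -0.2]])  # one row of P; sums to 1
P = [[0.7, 0.3],
     [0.4, 0.6]]            # hypothetical two-state transition matrix
dist = step([0.5, 0.5], P)  # -> [0.55, 0.45]
```

A counterfactual forecast is produced by iterating `step` forward from the last pre-intervention distribution, using transition matrices estimated on pre-intervention pairs of months.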
This framework could also be applied to a number of risk models designed as inputs for particular investment decisions. For example, it could be used in credit approval decisions to determine whether there is a change in the composition of customers. For credit-line assignment, an institution would want to know whether consumers need higher or lower lines of credit. In the marketing of financial products, firms can form their competitive strategy by understanding trends in consumer characteristics. In any of these modeling situations, there may be an unknown change in the competitive landscape, resulting in a different population of customers being evaluated by the model. One should, however, be careful to apply this method only to situations in which the pre-intervention period is unaffected by anticipation of the intervention. If the timing of the intervention response is mis-specified, this approach would not identify the counterfactual path: the estimate would be confounded with changes made in anticipation of the intervention, leading to a biased estimate of the response. For our application to the pandemic, the initial changes took place over a timeline that was short relative to the time between observations. We used this framework to provide a plausible forecast of the distribution of credit-card balances under the counterfactual state in which the COVID-19 pandemic did not take place. We found a significant downward shift in consumer credit usage in Canada, slashing billions of dollars off credit-card balances across the country, to an unprecedented level. Moreover, this result applies to homeowners and non-homeowners alike, although the change in distribution represents a greater reduction of debt for homeowners. Further, we found an overall reduction in credit usage for consumers with credit scores at all levels, a finding in stark contrast to the experience over the years leading up to the pandemic. Allen et al.
(2021) provided a structural econometric approach to quantify the amount left on the table for households who do not request a deferral on credit-card balances. Future work can attempt to link the intervention analysis to structural models.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Table A.1

Some Monte Carlo evidence with Gaussian errors.

Parameter    Mean      Variance   Minimum    Maximum

μ0 = 0.5
T = 100      0.45192   0.92045    −3.76182   3.81443
T = 200      0.46592   0.69396    −3.39578   2.91524
T = 500      0.48435   0.45701    −1.61577   2.20638
T = 1000     0.49304   0.32192    −0.92277   1.56088

σ0 = 2.29416
T = 100      2.01571   0.50511    0.50365    4.77354
T = 200      2.14657   0.39701    1.15388    5.07263
T = 500      2.23906   0.26690    1.47906    3.66571
T = 1000     2.26828   0.19464    1.67941    3.13396

V(μ̂)
T = 100      0.06938   0.06399    0.01117    1.34104
T = 200      0.03421   0.01999    0.00732    0.52426
T = 500      0.01372   0.00435    0.00470    0.05856
T = 1000     0.00685   0.00149    0.00339    0.01645
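Tables A.1 and A.2 summarize the sampling distribution of estimators across Monte Carlo replications. A generic sketch of how such a summary is produced (here for the sample mean under Gaussian errors; the replication count and seed are illustrative, not the authors' exact design):

```python
import random
import statistics

def monte_carlo_summary(T, reps=1000, mu0=0.5, sigma0=2.29416, seed=0):
    """Draw `reps` Gaussian samples of size T, compute the sample mean of
    each, and summarize the estimates by their mean, variance, minimum,
    and maximum, in the format of Tables A.1 and A.2."""
    rng = random.Random(seed)
    estimates = [statistics.fmean(rng.gauss(mu0, sigma0) for _ in range(T))
                 for _ in range(reps)]
    return (statistics.fmean(estimates), statistics.variance(estimates),
            min(estimates), max(estimates))

# As T grows, the mean of the estimates approaches mu0 and their variance
# shrinks (about sigma0**2 / T for the sample mean), mirroring the pattern
# down the rows of Table A.1.
mean_est, var_est, lo, hi = monte_carlo_summary(T=500)
```

The same loop with lognormal draws in place of `rng.gauss` would produce a summary in the format of Table A.2.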
Table A.2

Some Monte Carlo evidence with lognormal errors.

Parameter    Mean      Variance   Minimum    Maximum

μ0 = 1.0
T = 100      9.10341   2.02235    3.36693    29.30399
T = 200      9.53405   1.47786    5.11464    19.70764
T = 500      9.81798   0.94632    6.71972    14.55416
T = 1000     9.91227   0.66267    7.68366    12.91488

σ0 = 4.98723
T = 100      4.54009   1.74165    1.34228    42.71136
T = 200      4.70636   1.38829    2.09365    31.85160
T = 500      4.83995   0.97589    2.69117    20.95802
T = 1000     4.89231   0.74422    3.17284    15.35153

V(μ̂)
T = 100      0.23448   0.31823    0.01784    18.06199
T = 200      0.11987   0.10189    0.02181    5.04738
T = 500      0.04867   0.02411    0.01446    0.87672
T = 1000     0.02447   0.00860    0.01006    0.23543
Table A.3

Divergence from out-of-sample, one-step-ahead forecasts.

Month           Fixed      p-value   Monthly    p-value
January 2020    422.05     0.0000    44.33      0.0258
February 2020   236.36     0.0000    44.39      0.0254
March 2020      205.79     0.0000    53.95      0.0023
April 2020      2,767.89   0.0000    3,792.64   0.0000
May 2020        818.40     0.0000    1,266.47   0.0000
June 2020       138.65     0.0000    66.50      0.0001
July 2020       47.27      0.0128    71.40      0.0000
August 2020     149.07     0.0000    200.57     0.0000
Table A.4

Goodness of fit of in-sample forecasts.

Month            Fixed     p-value   Monthly   p-value
February 2017    312.10    0.0000    27.85     0.4725
March 2017       122.94    0.0000    22.19     0.7727
April 2017       134.91    0.0000    21.78     0.7913
May 2017         22.63     0.7511    32.55     0.2527
June 2017        71.79     0.0000    13.68     0.9893
July 2017        46.86     0.0142    26.58     0.5415
August 2017      29.51     0.3872    29.59     0.3831
September 2017   30.25     0.3516    35.57     0.1540
October 2017     110.33    0.0000    36.92     0.1207
November 2017    80.77     0.0000    13.80     0.9886
December 2017    188.52    0.0000    21.28     0.8135

January 2018     282.43    0.0000    16.18     0.9630
February 2018    194.43    0.0000    20.84     0.8318
March 2018       124.65    0.0000    16.33     0.9606
April 2018       71.78     0.0000    27.17     0.5087
May 2018         127.73    0.0000    26.98     0.5193
June 2018        69.36     0.0000    17.12     0.9463
July 2018        48.29     0.0100    20.28     0.8539
August 2018      54.04     0.0022    32.84     0.2415
September 2018   30.18     0.3548    9.34      0.9996
October 2018     67.72     0.0000    18.73     0.9064
November 2018    72.70     0.0000    19.22     0.8912
December 2018    128.74    0.0000    20.09     0.8611

January 2019     412.25    0.0000    25.34     0.6092
February 2019    194.28    0.0000    17.92     0.9281
March 2019       90.30     0.0000    11.13     0.9981
April 2019       139.28    0.0000    20.12     0.8600
May 2019         56.68     0.0011    15.00     0.9785
June 2019        54.48     0.0020    24.18     0.6721
July 2019        45.71     0.0187    19.95     0.8664
August 2019      47.85     0.0111    27.26     0.5040
September 2019   31.54     0.2935    28.49     0.4385
October 2019     92.91     0.0000    22.74     0.7460
November 2019    42.47     0.0392    25.40     0.6057
December 2019    224.44    0.0000    14.26     0.9853