The power of bootstrap and asymptotic tests
Introduction
In recent years, it has become common to use the bootstrap to perform hypothesis tests in econometrics. Its use for this purpose has been advocated by Horowitz (1994, 1997), Davidson and MacKinnon (1999), and several others. If the bootstrap is to benefit from asymptotic refinements, the original test statistic must be asymptotically pivotal under the null hypothesis; that is, its asymptotic distribution under the null must not depend on any unknown features of the data generating process (DGP). With such a test statistic, the errors committed by using the bootstrap are generally of an order lower, by a factor of either $n^{-1/2}$ or $n^{-1}$, where n is the sample size, than the errors committed by relying on asymptotic theory; see Beran (1988), Hall (1992, Section 3.12), or Davidson and MacKinnon (1999).
A convenient way to perform bootstrap inference is to compute bootstrap P values. After computing a test statistic, say $\hat\tau$, in the usual way, one uses a random bootstrap DGP, denoted by $\mu^*$ and constructed so as to satisfy the null hypothesis under test, to generate B bootstrap samples, each of which is used to compute a bootstrap test statistic $\tau^*_j$, $j = 1, \ldots, B$. The bootstrap P value may then be estimated by the proportion of bootstrap statistics that are more extreme than $\hat\tau$. As $B \to \infty$, this estimated bootstrap P value tends to the “ideal” bootstrap P value $p^*(\hat\tau)$, which is defined as $p^*(\hat\tau) = \Pr_{\mu^*}\bigl(\tau \in \mathrm{Rej}(\hat\tau)\bigr)$, where $\mathrm{Rej}(\hat\tau)$ is the rejection region for a test for which the critical value is $\hat\tau$. For a one-tailed test that rejects in the upper tail, for instance, $\mathrm{Rej}(\hat\tau)$ is just the set of real numbers greater than $\hat\tau$. In this paper, we ignore the fact that $p^*(\hat\tau)$ has to be estimated. The effect of the estimation error can be made as small as desired by appropriate choice of B; see Davidson and MacKinnon (2000).
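In code, the estimated bootstrap P value for an upper-tail test is simply the proportion of bootstrap statistics exceeding $\hat\tau$. The following is a minimal Python sketch; the helper names `simulate_null` and `compute_stat`, which stand in for the bootstrap DGP $\mu^*$ and the test statistic, are hypothetical placeholders, not part of the paper.

```python
import numpy as np

def bootstrap_p_value(tau_hat, simulate_null, compute_stat, B=999):
    """Estimate the bootstrap P value of an upper-tail test.

    simulate_null() draws one sample from the bootstrap DGP mu*,
    which is constructed so as to satisfy the null hypothesis;
    compute_stat(sample) returns the test statistic for that sample.
    """
    tau_star = np.array([compute_stat(simulate_null()) for _ in range(B)])
    # Proportion of the B bootstrap statistics more extreme than tau_hat
    return np.mean(tau_star > tau_hat)
```

As a toy illustration, if the statistic is $\sqrt{n}$ times the mean of an i.i.d. standard normal sample, the bootstrap P value for $\hat\tau = 1.645$ comes out near the asymptotic value 0.05.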
If the original data are generated by a DGP $\mu_0$, the “true” P value $p(\hat\tau)$, which is just a deterministic function of $\hat\tau$, is by construction a drawing from the uniform distribution $U(0,1)$. But, since the bootstrap DGP $\mu^*$ is a function of the data, the bootstrap P value $p^*(\hat\tau)$ is in general drawn from a different distribution. Consequently, the rejection probability (RP) of a bootstrap test at nominal level $\alpha$ is in general different from $\alpha$, even when $\mu_0$ satisfies the null hypothesis under test.
It is natural to ask whether bootstrapping a test has any effect on its power. Answering this question is complicated by the fact that asymptotic tests often suffer from substantial size distortion. In simulation studies, it is common to adjust for this distortion by using critical values for which the RP under some DGP $\mu_0$ that satisfies the null hypothesis is exactly equal to the desired nominal level $\alpha$. With statistics that are not exactly pivotal, the adjustment depends on the specific choice of $\mu_0$.
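This sort of level adjustment amounts to replacing the asymptotic critical value by an empirical quantile of the statistic simulated under the chosen null DGP $\mu_0$. A hedged sketch, with the hypothetical helpers `simulate_null` and `compute_stat` as above:

```python
import numpy as np

def level_adjusted_critical_value(simulate_null, compute_stat,
                                  alpha=0.05, M=9999):
    """Critical value whose rejection probability under the chosen
    null DGP mu_0 equals alpha, up to simulation error.

    For an upper-tail test, this is the empirical (1 - alpha)
    quantile of the statistic's distribution under mu_0.
    """
    stats = np.array([compute_stat(simulate_null()) for _ in range(M)])
    return np.quantile(stats, 1.0 - alpha)
```

For a statistic that is asymptotically N(0,1) under the null, the adjusted 5% critical value estimated this way should be close to 1.645; any gap between the two is exactly the finite-sample size distortion the adjustment removes.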
Conventional asymptotic power analysis relies on the notion of a drifting DGP which, as the sample size n tends to infinity, drifts to the null DGP $\mu_0$. In order to study the difference between the power of a bootstrap test and the level-adjusted power of the asymptotic test on which it is based, which we call the bootstrap discrepancy, a suitable drifting DGP must be chosen. We demonstrate in Section 2 that, for any choice of drifting DGP, the bootstrap discrepancy may be of either sign and is, in general, of the same order in n as the size distortion of the bootstrap test.
In Section 3, we consider how best to choose the drifting DGP. We argue that the objective should be minimization of the bootstrap discrepancy, and we show that this is feasible only if $\hat\tau$ and $\mu^*$ are asymptotically independent in a sense that we make precise. In Davidson and MacKinnon (1999), we showed that asymptotic independence of this sort leads to a reduction in the order of bootstrap size distortion. We characterize a class of drifting DGPs that serves to extend this result to the bootstrap discrepancy.
In Section 4, we extend to power analysis a procedure given in Davidson and MacKinnon (2001) for estimating the RP of a bootstrap test by simulation. This procedure, which is conceptually simple and computationally inexpensive, makes it feasible to estimate the power of bootstrap and asymptotic tests in Monte Carlo experiments. In Section 5, we present some Monte Carlo results for tests of omitted variables in a logit model. Section 6 concludes.
Power of bootstrap and asymptotic tests
Suppose that a test statistic t has a fixed known asymptotic distribution under the null hypothesis, represented by a probability measure $P$ defined on the real line and absolutely continuous with respect to Lebesgue measure. It is convenient to replace t by another test statistic, which we denote by $p$, of which the nominal asymptotic distribution is uniform on $[0,1]$. This is most conveniently done by replacing t by its asymptotic P value, so that $p$ is given by $P\bigl(\{x : x > t\}\bigr)$, the probability mass …
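The mapping from t to $p$ is just the upper-tail survival function of the asymptotic null distribution. A minimal sketch, assuming for concreteness a chi-squared asymptotic distribution (the measure $P$ in the text is generic; chi-squared is our illustrative choice):

```python
from scipy.stats import chi2

def asymptotic_p_value(t, df=1):
    """Map a statistic with a chi-squared(df) asymptotic null
    distribution to its upper-tail asymptotic P value, which is
    uniform on [0, 1] under the null."""
    return chi2.sf(t, df)
```

For example, the 5% critical value of the chi-squared distribution with one degree of freedom is about 3.841, so `asymptotic_p_value(3.841)` returns approximately 0.05.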
Choice of a drifting DGP
We are usually interested in practice in a particular sample size, say $n$, and we conduct asymptotic analysis as an approximation to what happens for a DGP defined just for that sample size. The drifting DGP used in asymptotic power analysis is a theoretical construct, but, as we will see in the simulations presented in Section 5, the bootstrap discrepancy can vary greatly with the specific choice of drifting DGP.
The parametrization of the extended model is not necessarily well …
Approximate bootstrap rejection probabilities
The quantity $P(\alpha, \mu, \mu_0)$, which is the power of an asymptotic test based on $\hat\tau$ against the DGP $\mu$ at level $\alpha$ when level adjustment is based on the null DGP $\mu_0$, can be straightforwardly estimated by simulation. For each of M replications, compute two test statistics, one of them generated by $\mu$ and the other by $\mu_0$. Find the critical value $\hat c_\alpha$ such that the rejection frequency in the M replications under $\mu_0$ is $\alpha$; $\hat c_\alpha$ is our estimate of $c_\alpha$. $P(\alpha, \mu, \mu_0)$ is then estimated by the rejection …
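The two-sample simulation scheme just described can be sketched as follows. This is an illustration of the idea, not the authors' exact experimental code; `simulate_null`, `simulate_alt`, and `compute_stat` are hypothetical helpers standing for $\mu_0$, $\mu$, and the statistic.

```python
import numpy as np

def level_adjusted_power(simulate_null, simulate_alt, compute_stat,
                         alpha=0.05, M=9999):
    """Estimate level-adjusted power by simulation.

    For each of M replications, draw one sample from the null DGP
    mu_0 and one from the alternative DGP mu.  The critical value is
    the empirical (1 - alpha) quantile of the null statistics; power
    is the fraction of alternative statistics that exceed it.
    """
    t0 = np.array([compute_stat(simulate_null()) for _ in range(M)])
    t1 = np.array([compute_stat(simulate_alt()) for _ in range(M)])
    c_hat = np.quantile(t0, 1.0 - alpha)
    return np.mean(t1 > c_hat)
```

For instance, with an asymptotically N(0,1) statistic and a local mean shift of one standard error, the estimated level-adjusted power at $\alpha = 0.05$ should be near the theoretical value of about 0.26.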
Testing for omitted variables in a logit model
In this section, we present the results of several Monte Carlo experiments, which deal with Lagrange multiplier tests for omitted variables in the logit model. We chose to examine the logit model for several reasons: it is not a regression model; the results of Horowitz (1994) and Davidson and MacKinnon (1998) suggest that, for information matrix tests in the closely related probit model, bootstrapping may greatly improve the finite-sample properties of one form of the LM test; and, in contrast …
Conclusions
Level adjustment of the power of tests based on nonpivotal statistics yields results that depend on the DGP in the null hypothesis used to provide a critical value. For a given choice of this null DGP, we show that the power of a bootstrap test differs from the level-adjusted power of the asymptotic test on which it is based by an amount that we call the bootstrap discrepancy. This discrepancy is of the same order, in the sample size n, as the size distortion of the bootstrap test itself.
Since …
Acknowledgements
This research was supported, in part, by grants from the Social Sciences and Humanities Research Council of Canada. We are grateful to Don Andrews, Joel Horowitz, two referees, the editors, and numerous seminar participants for comments on earlier versions.
References
- Abramovitch, L., Singh, K., 1985. Edgeworth corrected pivotal statistics and the bootstrap. Annals of Statistics.
- Beran, R., 1988. Prepivoting test statistics: a bootstrap view of asymptotic refinements. Journal of the American Statistical Association.
- Beran, R., 1997. Diagnosing bootstrap success. Annals of the Institute of Statistical Mathematics.
- Davidson, R., MacKinnon, J.G., 1984. Convenient specification tests for logit and probit models. Journal of Econometrics.
- Davidson, R., MacKinnon, J.G., 1987. Implicit alternatives and the local power of test statistics. Econometrica.
- Davidson, R., MacKinnon, J.G., 1993. Estimation and Inference in Econometrics.
- Davidson, R., MacKinnon, J.G., 1998. Graphical methods for investigating the size and power of hypothesis tests. The Manchester School.
- Horowitz, J.L., 1994. Bootstrap-based critical values for the information matrix test. Journal of Econometrics.
- Horowitz, J.L., Savin, N.E., 2000. Empirically relevant critical values for hypothesis tests. Journal of Econometrics.