In this paper we develop some econometric theory for factor models of large dimensions. The focus is the determination of the number of factors (r), which is an unresolved issue in the rapidly growing literature on multifactor models. We first establish the convergence rate for the factor estimates that will allow for consistent estimation of r. We then propose some panel criteria and show that the number of factors can be consistently estimated using the criteria. The theory is developed under the framework of large cross‐sections (N) and large time dimensions (T). No restriction is imposed on the relation between N and T. Simulations show that the proposed criteria have good finite sample properties in many configurations of the panel data encountered in practice.
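
For orientation, the panel criteria mentioned in this abstract are usually written in the following form in the factor-model literature. The abstract itself does not state them, so the notation here (X_{it}, \lambda_i^k, F_t^k, k_max) is an assumption taken from that wider literature and should be read as a sketch rather than as the authors' exact statement.

```latex
% V(k): average squared residual after fitting k principal-component factors
V(k) = \min_{\Lambda, F^k} \frac{1}{NT} \sum_{i=1}^{N} \sum_{t=1}^{T}
       \left( X_{it} - \lambda_i^{k\prime} F_t^{k} \right)^2

% Two commonly used information criteria; the penalty depends on both N and T
IC_{p1}(k) = \ln V(k) + k \, \frac{N+T}{NT} \, \ln\!\left( \frac{NT}{N+T} \right)
IC_{p2}(k) = \ln V(k) + k \, \frac{N+T}{NT} \, \ln\!\left( \min\{N, T\} \right)

% Estimated number of factors
\hat{k} = \arg\min_{0 \le k \le k_{\max}} IC(k)
```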

We consider the situation when there is a large number of series, N, each with T observations, and each series has some predictive ability for some variable of interest. A methodology of growing interest is first to estimate common factors from the panel of data by the method of principal components and then to augment an otherwise standard regression with the estimated factors. In this paper, we show that the least squares estimates obtained from these factor‐augmented regressions are consistent and asymptotically normal if . The conditional mean predicted by the estimated factors is consistent and asymptotically normal. Except when T/N goes to zero, inference should take into account the effect of “estimated regressors” on the estimated conditional mean. We present analytical formulas for prediction intervals that are valid regardless of the magnitude of N/T and that can also be used when the factors are nonstationary.

This paper considers large N and large T panel data models with unobservable multiple interactive effects, which are correlated with the regressors. In earnings studies, for example, workers' motivation, persistence, and diligence combined to influence the earnings in addition to the usual argument of innate ability. In macroeconomics, interactive effects represent unobservable common shocks and their heterogeneous impacts on cross sections. We consider identification, consistency, and the limiting distribution of the interactive‐effects estimator. Under both large N and large T, the estimator is shown to be consistent, which is valid in the presence of correlations and heteroskedasticities of unknown form in both dimensions. We also derive the constrained estimator and its limiting distribution, imposing additivity coupled with interactive effects. The problem of testing additive versus interactive effects is also studied. In addition, we consider identification and estimation of models in the presence of a grand mean, time‐invariant regressors, and common regressors. Given identification, the rate of convergence and limiting results continue to hold.

This paper presents a new approach to estimation and inference in panel data models with a general multifactor error structure. The unobserved factors and the individual‐specific errors are allowed to follow arbitrary stationary processes, and the number of unobserved factors need not be estimated. The basic idea is to filter the individual‐specific regressors by means of cross‐section averages such that asymptotically as the cross‐section dimension (N) tends to infinity, the differential effects of unobserved common factors are eliminated. The estimation procedure has the advantage that it can be computed by least squares applied to auxiliary regressions where the observed regressors are augmented with cross‐sectional averages of the dependent variable and the individual‐specific regressors. A number of estimators (referred to as common correlated effects (CCE) estimators) are proposed and their asymptotic distributions are derived. The small sample properties of mean group and pooled CCE estimators are investigated by Monte Carlo experiments, showing that the CCE estimators have satisfactory small sample properties even under a substantial degree of heterogeneity and dynamics, and for relatively small values of N and T.
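
The auxiliary-regression idea described above can be illustrated with a small sketch of a mean group CCE estimator. It assumes a single scalar regressor, a balanced panel, and one common factor. The helper names (`solve`, `ols`, `cce_mean_group`) and the simulated data are hypothetical and are not taken from the paper. Each unit's regression of y on x is augmented with an intercept and the cross-section averages of y and x, and the unit-level slopes are then averaged.

```python
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ols(X, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    return solve(XtX, Xty)


def cce_mean_group(y, x):
    """Average of unit-level slopes from regressions augmented with cross-section averages.

    y and x are lists of N series, each of length T (scalar regressor).
    """
    N, T = len(y), len(y[0])
    y_bar = [sum(y[i][t] for i in range(N)) / N for t in range(T)]
    x_bar = [sum(x[i][t] for i in range(N)) / N for t in range(T)]
    slopes = []
    for i in range(N):
        X = [[1.0, x[i][t], y_bar[t], x_bar[t]] for t in range(T)]
        slopes.append(ols(X, y[i])[1])  # coefficient on x_it
    return sum(slopes) / N


# Illustrative data (an assumption): one common factor f_t, no idiosyncratic error in y.
rng = random.Random(0)
N, T, beta = 5, 12, 2.0
f = [rng.gauss(0, 1) for _ in range(T)]
y, x = [], []
for i in range(N):
    alpha = rng.gauss(0, 1)
    gamma = 1.0 + rng.random()   # factor loading in y
    load = rng.gauss(0, 1)       # factor loading in x
    xi = [load * f[t] + rng.gauss(0, 1) for t in range(T)]
    yi = [alpha + beta * xi[t] + gamma * f[t] for t in range(T)]
    x.append(xi)
    y.append(yi)

beta_hat = cce_mean_group(y, x)
print(beta_hat)
```

In this noise-free design the factor lies exactly in the span of the cross-section averages, so the sketch recovers the slope up to rounding error. With idiosyncratic errors the cross-section averages only proxy the factor, which is why the abstract's results are asymptotic in N.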

In this paper we derive the asymptotic properties of within groups (WG), GMM, and LIML estimators for an autoregressive model with random effects when both T and N tend to infinity. GMM and LIML are consistent and asymptotically equivalent to the WG estimator. When T/N → 0 the fixed T results for GMM and LIML remain valid, but WG, although consistent, has an asymptotic bias in its asymptotic distribution. When T/N tends to a positive constant, the WG, GMM, and LIML estimators exhibit negative asymptotic biases of order 1/T, 1/N, and 1/(2NT), respectively. In addition, the crude GMM estimator that neglects the autocorrelation in first differenced errors is inconsistent as T/N → c > 0, despite being consistent for fixed T. Finally, we discuss the properties of a random effects pseudo MLE with unrestricted initial conditions when both T and N tend to infinity.

Fixed effects estimators of panel models can be severely biased because of the well‐known incidental parameters problem. We show that this bias can be reduced by using a panel jackknife or an analytical bias correction motivated by large T. We give bias corrections for averages over the fixed effects, as well as model parameters. We find large bias reductions from using these approaches in examples. We consider asymptotics where T grows with n, as an approximation to the properties of the estimators in econometric applications. We show that if T grows at the same rate as n, the fixed effects estimator is asymptotically biased, so that asymptotic confidence intervals are incorrect, but that they are correct for the panel jackknife. We show T growing faster than n^{1/3} suffices for correctness of the analytic correction, a property we also conjecture for the jackknife.
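
A panel jackknife of the kind described above can be sketched as follows. The estimator is recomputed with each time period left out in turn, and the result is T times the full-sample estimate minus (T − 1) times the average of the leave-one-period-out estimates. The example estimator (the fixed-effects MLE of an error variance in a model with unit-specific means), the toy data, and the function names are assumptions chosen for illustration. They are not taken from the paper, and a balanced panel is assumed.

```python
def within_variance_mle(panel):
    """Fixed-effects MLE of the error variance: demean each unit, divide by N*T.

    This estimator is biased downward by a factor (T - 1) / T because each
    unit mean is estimated (the incidental parameters problem).
    """
    n_obs = sum(len(unit) for unit in panel)
    ss = 0.0
    for unit in panel:
        mean = sum(unit) / len(unit)
        ss += sum((v - mean) ** 2 for v in unit)
    return ss / n_obs


def panel_jackknife(panel, estimator):
    """Return T * theta_hat - (T - 1) * mean of leave-one-period-out estimates."""
    T = len(panel[0])
    full = estimator(panel)
    leave_one_out = [
        estimator([unit[:t] + unit[t + 1:] for unit in panel]) for t in range(T)
    ]
    return T * full - (T - 1) * sum(leave_one_out) / T


# Toy balanced panel (an assumption): N = 2 units, T = 3 periods.
panel = [[0.0, 0.0, 3.0], [1.0, 2.0, 6.0]]
mle = within_variance_mle(panel)
corrected = panel_jackknife(panel, within_variance_mle)
print(mle, corrected)
```

For this particular estimator the jackknife reproduces the degrees-of-freedom corrected variance, so the bias is removed exactly. In nonlinear models the correction only removes the leading 1/T term of the bias.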

This paper introduces a nonparametric Granger‐causality test for covariance stationary linear processes under, possibly, the presence of long‐range dependence. We show that the test is consistent and has power against contiguous alternatives converging to the parametric rate T^{-1/2}. Since the test is based on estimates of the parameters of the representation of a VAR model as a, possibly, two‐sided infinite distributed lag model, we first show that a modification of Hannan's (1963, 1967) estimator is root‐T consistent and asymptotically normal for the coefficients of such a representation. When the data are long‐range dependent, this method of estimation becomes more attractive than least squares, since the latter can be neither root‐T consistent nor asymptotically normal as is the case with short‐range dependent data.

Many approaches to estimation of panel models are based on an average or integrated likelihood that assigns weights to different values of the individual effects. Fixed effects, random effects, and Bayesian approaches all fall into this category. We provide a characterization of the class of weights (or priors) that produce estimators that are first‐order unbiased. We show that such bias‐reducing weights will depend on the data in general unless an orthogonal reparameterization or an essentially equivalent condition is available. Two intuitively appealing weighting schemes are discussed. We argue that asymptotically valid confidence intervals can be read from the posterior distribution of the common parameters when N and T grow at the same rate. Next, we show that random effects estimators are not bias reducing in general and we discuss important exceptions. Moreover, the bias depends on the Kullback–Leibler distance between the population distribution of the effects and its best approximation in the random effects family. Finally, we show that, in general, standard random effects estimation of marginal effects is inconsistent for large T, whereas the posterior mean of the marginal effect is large‐T consistent, and we provide conditions for bias reduction. Some examples and Monte Carlo experiments illustrate the results.

We consider semiparametric estimation of the memory parameter in a model that includes as special cases both long‐memory stochastic volatility and fractionally integrated exponential GARCH (FIEGARCH) models. Under our general model the logarithms of the squared returns can be decomposed into the sum of a long‐memory signal and a white noise. We consider periodogram‐based estimators using a local Whittle criterion function. We allow the optional inclusion of an additional term to account for possible correlation between the signal and noise processes, as would occur in the FIEGARCH model. We also allow for potential nonstationarity in volatility by allowing the signal process to have a memory parameter d* ≥ 1/2. We show that the local Whittle estimator is consistent for d* ∈ (0,1). We also show that the local Whittle estimator is asymptotically normal for d* ∈ (0,3/4) and essentially recovers the optimal semiparametric rate of convergence for this problem. In particular, if the spectral density of the short‐memory component of the signal is sufficiently smooth, a convergence rate of n^{2/5-δ} for d* ∈ (0,3/4) can be attained, where n is the sample size and δ > 0 is arbitrarily small. This represents a strong improvement over the performance of existing semiparametric estimators of persistence in volatility. We also prove that the standard Gaussian semiparametric estimator is asymptotically normal if d* = 0. This yields a test for long memory in volatility.

In this paper we study identification and estimation of a correlated random coefficients (CRC) panel data model. The outcome of interest varies linearly with a vector of endogenous regressors. The coefficients on these regressors are heterogeneous across units and may covary with them. We consider the average partial effect (APE) of a small change in the regressor vector on the outcome (cf. Chamberlain (1984), Wooldridge (2005a)). Chamberlain (1992) calculated the semiparametric efficiency bound for the APE in our model and proposed a √N‐consistent estimator. Nonsingularity of the APE's information bound, and hence the appropriateness of Chamberlain's (1992) estimator, requires (i) the time dimension of the panel (T) to strictly exceed the number of random coefficients (p) and (ii) strong conditions on the time series properties of the regressor vector. We demonstrate irregular identification of the APE when T = p and for more persistent regressor processes. Our approach exploits the different identifying content of the subpopulations of stayers (units whose regressor values change little across periods) and movers (units whose regressor values change substantially across periods). We propose a feasible estimator based on our identification result and characterize its large sample properties. While irregularity precludes our estimator from attaining parametric rates of convergence, its limiting distribution is normal and inference is straightforward to conduct. Standard software may be used to compute point estimates and standard errors. We use our methods to estimate the average elasticity of calorie consumption with respect to total outlay for a sample of poor Nicaraguan households.

This paper develops an asymptotic theory for time series binary choice models with nonstationary explanatory variables generated as integrated processes. Both logit and probit models are covered. The maximum likelihood (ML) estimator is consistent but a new phenomenon arises in its limit distribution theory. The estimator consists of a mixture of two components, one of which is parallel to and the other orthogonal to the direction of the true parameter vector, with the latter being the principal component. The ML estimator is shown to converge at a rate of n^{3/4} along its principal component but has the slower rate of n^{1/4} convergence in all other directions. This is the first instance known to the authors of multiple convergence rates in models where the regressors have the same (full rank) stochastic order and where the parameters appear in linear forms of these regressors. It is a consequence of the fact that the estimating equations involve nonlinear integrable transformations of linear forms of integrated processes as well as polynomials in these processes, and the asymptotic behavior of these elements is quite different. The limit distribution of the ML estimator is derived and is shown to be a mixture of two mixed normal distributions with mixing variates that are dependent upon Brownian local time as well as Brownian motion. It is further shown that the sample proportion of binary choices follows an arc sine law and therefore spends most of its time in the neighborhood of zero or unity. The result has implications for policy decision making that involves binary choices and where the decisions depend on economic fundamentals that involve stochastic trends. Our limit theory shows that, in such conditions, policy is likely to manifest streams of little intervention or intensive intervention.

In weighted moment condition models, we show a subtle link between identification and estimability that limits the practical usefulness of estimators based on these models. In particular, if it is necessary for (point) identification that the weights take arbitrarily large values, then the parameter of interest, though point identified, cannot be estimated at the regular (parametric) rate and is said to be irregularly identified. This rate depends on relative tail conditions and can be as slow in some examples as n^{-1/4}. This nonstandard rate of convergence can lead to numerical instability and/or large standard errors. We examine two weighted model examples: (i) the binary response model under mean restriction introduced by Lewbel (1997) and further generalized to cover endogeneity and selection, where the estimator in this class of models is weighted by the density of a special regressor, and (ii) the treatment effect model under exogenous selection (Rosenbaum and Rubin (1983)), where the resulting estimator of the average treatment effect is one that is weighted by a variant of the propensity score. Without strong relative support conditions, these models, similar to well known “identified at infinity” models, lead to estimators that converge at slower than parametric rate, since essentially, to ensure point identification, one requires some variables to take values on sets with arbitrarily small probabilities, or thin sets. For the two models above, we derive some rates of convergence and propose that one conducts inference using rate adaptive procedures that are analogous to Andrews and Schafgans (1998) for the sample selection model.

We consider the estimation of dynamic panel data models in the presence of incidental parameters in both dimensions: individual fixed‐effects and time fixed‐effects, as well as incidental parameters in the variances. We adopt the factor analytical approach by estimating the sample variance of individual effects rather than the effects themselves. In the presence of cross‐sectional heteroskedasticity, the factor method estimates the average of the cross‐sectional variances instead of the individual variances. The method thereby eliminates the incidental‐parameter problem in the means and in the variances over the cross‐sectional dimension. We further show that estimating the time effects and heteroskedasticities in the time dimension does not lead to the incidental‐parameter bias even when T and N are comparable. Moreover, efficient and robust estimation is obtained by jointly estimating heteroskedasticities.

I introduce a model of undirected dyadic link formation which allows for assortative matching on observed agent characteristics (homophily) as well as unrestricted agent‐level heterogeneity in link surplus (degree heterogeneity). Like in fixed effects panel data analyses, the joint distribution of observed and unobserved agent‐level characteristics is left unrestricted. Two estimators for the (common) homophily parameter, β_0, are developed and their properties studied under an asymptotic sequence involving a single network growing large. The first, tetrad logit (TL), estimator conditions on a sufficient statistic for the degree heterogeneity. The second, joint maximum likelihood (JML), estimator treats the degree heterogeneity {A_{i0}}_{i=1}^{N} as additional (incidental) parameters to be estimated. The TL estimate is consistent under both sparse and dense graph sequences, whereas consistency of the JML estimate is shown only under dense graph sequences.

This paper develops a regression limit theory for nonstationary panel data with large numbers of cross section (n) and time series (T) observations. The limit theory allows for both sequential limits, wherein T → ∞ followed by n → ∞, and joint limits where T, n → ∞ simultaneously; and the relationship between these multidimensional limits is explored. The panel structures considered allow for no time series cointegration, heterogeneous cointegration, homogeneous cointegration, and near-homogeneous cointegration. The paper explores the existence of long-run average relations between integrated panel vectors when there is no individual time series cointegration and when there is heterogeneous cointegration. These relations are parameterized in terms of the matrix regression coefficient of the long-run average covariance matrix. In the case of homogeneous and near homogeneous cointegrating panels, a panel fully modified regression estimator is developed and studied. The limit theory enables us to test hypotheses about the long run average parameters both within and between subgroups of the full population.

In nonlinear panel data models, the incidental parameter problem remains a challenge to econometricians. Available solutions are often based on ingenious, model‐specific methods. In this paper, we propose a systematic approach to construct moment restrictions on common parameters that are free from the individual fixed effects. This is done by an orthogonal projection that differences out the unknown distribution function of individual effects. Our method applies generally in likelihood models with continuous dependent variables where a condition of non‐surjectivity holds. The resulting method‐of‐moments estimators are root‐N consistent (for fixed T) and asymptotically normal, under regularity conditions that we spell out. Several examples and a small‐scale simulation exercise complete the paper.

We propose an estimation method for models of conditional moment restrictions, which contain finite dimensional unknown parameters (θ) and infinite dimensional unknown functions (h). Our proposal is to approximate h with a sieve and to estimate θ and the sieve parameters jointly by applying the method of minimum distance. We show that: (i) the sieve estimator of h is consistent with a rate faster than n^{-1/4} under a certain metric; (ii) the estimator of θ is √n consistent and asymptotically normally distributed; (iii) the estimator for the asymptotic covariance of the θ estimator is consistent and easy to compute; and (iv) the optimally weighted minimum distance estimator of θ attains the semiparametric efficiency bound. We illustrate our results with two examples: a partially linear regression with an endogenous nonparametric part, and a partially additive IV regression with a link function.

The conventional heteroskedasticity‐robust (HR) variance matrix estimator for cross‐sectional regression (with or without a degrees‐of‐freedom adjustment), applied to the fixed‐effects estimator for panel data with serially uncorrelated errors, is inconsistent if the number of time periods T is fixed (and greater than 2) as the number of entities n increases. We provide a bias‐adjusted HR estimator that is ‐consistent under any sequences (n, T) in which n and/or T increase to ∞. This estimator can be extended to handle serial correlation of fixed order.

This paper establishes that instruments enable the identification of nonparametric regression models in the presence of measurement error by providing a closed form solution for the regression function in terms of Fourier transforms of conditional expectations of observable variables. For parametrically specified regression functions, we propose a root n consistent and asymptotically normal estimator that takes the familiar form of a generalized method of moments estimator with a plugged‐in nonparametric kernel density estimate. Both the identification and the estimation methodologies rely on Fourier analysis and on the theory of generalized functions. The finite‐sample properties of the estimator are investigated through Monte Carlo simulations.

Matching estimators are widely used in empirical economics for the evaluation of programs or treatments. Researchers using matching methods often apply the bootstrap to calculate the standard errors. However, no formal justification has been provided for the use of the bootstrap in this setting. In this article, we show that the standard bootstrap is, in general, not valid for matching estimators, even in the simple case with a single continuous covariate where the estimator is root‐N consistent and asymptotically normally distributed with zero asymptotic bias. Valid inferential methods in this setting are the analytic asymptotic variance estimator of Abadie and Imbens (2006a) as well as certain modifications of the standard bootstrap, like the subsampling methods in Politis and Romano (1994).

