Similar Articles
20 similar articles found (search time: 31 ms)
1.
When some explanatory variables in a regression are correlated with the disturbance term, instrumental variable methods are typically employed to make reliable inferences. Furthermore, to avoid difficulties associated with weak instruments, identification-robust methods are often proposed. However, it is hard to assess whether instrumental variables are valid in practice, because instrument validity rests on the questionable assumption that the instruments are exogenous. In this paper, we focus on structural models and analyze the effects of instrument endogeneity on two identification-robust procedures, the Anderson–Rubin (1949, AR) and the Kleibergen (2002, K) tests, with or without weak instruments. Two main setups are considered: (1) the level of “instrument” endogeneity is fixed (does not depend on the sample size), and (2) the instruments are locally exogenous, i.e. the parameter which controls instrument endogeneity approaches zero as the sample size increases. In the first setup, we show that both test procedures are in general consistent against the presence of invalid instruments (hence asymptotically invalid for the hypothesis of interest), whether the instruments are “strong” or “weak”. We also describe cases where test consistency may not hold, but the asymptotic distribution is modified in a way that would lead to size distortions in large samples. These include, in particular, cases where the 2SLS estimator remains consistent, but the AR and K tests are asymptotically invalid. In the second setup, we find (non-degenerate) asymptotic non-central chi-square distributions in all cases, and describe cases where the non-centrality parameter is zero and the asymptotic distribution remains the same as in the case of valid instruments (despite the presence of invalid instruments). Overall, our results underscore the importance of checking for the presence of possibly invalid instruments when applying “identification-robust” tests.
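For reference, the Anderson–Rubin statistic evaluated at a hypothesized coefficient value has a standard closed form. The sketch below (plain NumPy, illustrative names, not the authors' code) computes it for a single endogenous regressor with no included exogenous regressors and compares it with the F(k, n − k) reference distribution that applies under valid instruments.

```python
import numpy as np
from scipy import stats

def anderson_rubin_test(y, Y, Z, beta0):
    """Anderson-Rubin test of H0: beta = beta0 in y = Y*beta + u,
    with instrument matrix Z (n x k).  Illustrative sketch; assumes a
    single endogenous regressor and no included exogenous regressors."""
    n, k = Z.shape
    u0 = y - Y * beta0                       # restricted residuals under H0
    Pz = Z @ np.linalg.solve(Z.T @ Z, Z.T)   # projection onto the instrument space
    fitted = Pz @ u0
    resid = u0 - fitted
    ar = (fitted @ u0 / k) / (resid @ u0 / (n - k))
    pvalue = stats.f.sf(ar, k, n - k)
    return ar, pvalue
```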

2.
This paper examines the design and performance of sequential experiments where extensive switching is undesirable. Given an objective function to optimize by sampling between Bernoulli populations, two different models are considered. The constraint model restricts the maximum number of switches possible, while the cost model introduces a charge for each switch. Optimal allocation procedures and a new “hyperopic” procedure are discussed and their behavior examined. For the cost model, if one views the costs as control variables then the optimal allocation procedures yield the optimal tradeoff of expected switches vs. expected value of the objective function.

3.
In many areas of application, especially life testing and reliability, it is often of interest to estimate an unknown cumulative distribution function (cdf). A simultaneous confidence band (SCB) of the cdf can be used to assess the statistical uncertainty of the estimated cdf over the entire range of the distribution. Cheng and Iles [1983. Confidence bands for cumulative distribution functions of continuous random variables. Technometrics 25 (1), 77–86] presented an approach to construct an SCB for the cdf of a continuous random variable. For the log-location-scale family of distributions, they gave explicit forms for the upper and lower boundaries of the SCB based on expected information. In this article, we extend the work of Cheng and Iles [1983] in several directions. We study the SCBs based on local information, expected information, and estimated expected information for both the “cdf method” and the “quantile method.” We also study the effects of exceptional cases where a simple SCB does not exist. We describe calibration of the bands to provide exact coverage for complete data and type II censoring and better approximate coverage for other kinds of censoring. We also discuss how to extend these procedures to regression analysis.

4.
This paper addresses the problem of confidence band construction for a standard multiple linear regression model. A “ray” method of construction is developed which generalizes the method of Graybill and Bowden [1967. Linear segment confidence bands for simple linear regression models. J. Amer. Statist. Assoc. 62, 403–408] for a simple linear regression model to a multiple linear regression model. By choosing suitable directions for the rays this method requires only critical points from t-distributions so that the confidence bands are easy to construct. Both one-sided and two-sided confidence bands can be constructed using this method. An illustration of the new method is provided.

5.
Confidence intervals for parameters that can be arbitrarily close to being unidentified are unbounded with positive probability [e.g. Dufour, J.-M., 1997. Some impossibility theorems in econometrics with applications to instrumental variables and dynamic models. Econometrica 65, 1365–1388; Pfanzagl, J., 1998. The nonexistence of confidence sets for discontinuous functionals. Journal of Statistical Planning and Inference 75, 9–20], and the asymptotic risks of their estimators are unbounded [Pötscher, B.M., 2002. Lower risk bounds and properties of confidence sets for ill-posed estimation problems with applications to spectral density and persistence estimation, unit roots, and estimation of long memory parameters. Econometrica 70, 1035–1065]. We extend these “impossibility results” and show that all tests of size α concerning parameters that can be arbitrarily close to being unidentified have power that can be as small as α for any sample size even if the null and the alternative hypotheses are not adjacent. The results are proved for a very general framework that contains commonly used models.

6.
Let X1,…,Xn be an exchangeable sequence of binary trials arranged on a circle with possible values “1” (success) or “0” (failure). In an exchangeable sequence, the joint distribution of X1,X2,…,Xn is invariant under the permutation of its arguments. For the circular sequence, general expressions for the joint distributions of run statistics based on the joint distribution of success and failure run lengths are obtained. As a special case, we present our results for Bernoulli trials. The results presented consist of combinatorial terms and therefore provide easier calculations. For illustration purposes, some numerical examples are given and the reliability of the circular combined k-out-of-n:G and consecutive k_c-out-of-n:G system under a stress–strength setup is evaluated.
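As a small illustration of the circular system mentioned above, the brute-force sketch below evaluates the reliability of a circular consecutive-k-out-of-n:G system, which works when at least k consecutive components around the circle work. It assumes i.i.d. Bernoulli components rather than the exchangeable stress–strength setup of the paper, and is feasible only for small n.

```python
from itertools import product

def circular_consecutive_k_out_of_n_G(n, k, p):
    """Reliability of a circular consecutive-k-out-of-n:G system with
    i.i.d. components working with probability p (enumeration, small n only).
    Illustrative sketch, not the exchangeable setup of the paper."""
    rel = 0.0
    for state in product((0, 1), repeat=n):            # 1 = component works
        # is there a window of k consecutive working positions (wrapping around)?
        works = any(all(state[(i + j) % n] for j in range(k)) for i in range(n))
        if works:
            prob = 1.0
            for s in state:
                prob *= p if s else (1 - p)
            rel += prob
    return rel

# Example: 5 components on a circle, at least 2 consecutive ones must work
print(circular_consecutive_k_out_of_n_G(5, 2, 0.9))
```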

7.
Starting from recently obtained results of Hu and Rosenberger [2003. Optimality, variability, power: evaluating response-adaptive randomization procedures for treatment comparisons. J. Amer. Statist. Assoc. 98, 671–678] and Chen [2006. The power of Efron's biased coin design. J. Statist. Plann. Inference 136, 1824–1835] on the relationship between sequential randomized designs and the power of the usual statistical procedures for testing the equivalence of two competing treatments, the aim of this paper is to provide theoretical proofs of the numerical results of Chen [2006]. Furthermore, we prove that the Adjustable Biased Coin Design [Baldi Antognini A., Giovagnoli, A., 2004. A new “biased coin design” for the sequential allocation of two treatments. J. Roy. Statist. Soc. Ser. C 53, 651–664] is uniformly more powerful than the other “coin” designs proposed in the literature for any sample size.
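For context, Efron's classical biased coin design referenced above allocates each new subject to the currently under-represented treatment with a fixed probability p (classically 2/3) and tosses a fair coin when the groups are balanced. Below is a minimal simulation sketch of that classical rule, not of the Adjustable Biased Coin Design studied in the paper.

```python
import random

def efron_bcd(n, p=2/3, seed=0):
    """Simulate one allocation sequence of Efron's biased coin design for n
    subjects.  Returns a list of assignments (0 = treatment A, 1 = treatment B).
    Minimal sketch of the classical rule."""
    rng = random.Random(seed)
    assignments = []
    for _ in range(n):
        diff = assignments.count(0) - assignments.count(1)
        if diff == 0:            # balanced so far: fair coin
            prob_A = 0.5
        elif diff > 0:           # A ahead: favour B
            prob_A = 1 - p
        else:                    # B ahead: favour A
            prob_A = p
        assignments.append(0 if rng.random() < prob_A else 1)
    return assignments

print(efron_bcd(20))
```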

8.
We wish to test the null hypothesis that the means of N panels remain the same during the observation period of length T. A quasi-likelihood argument leads to self-normalized statistics whose limit distribution under the null hypothesis is double exponential. The main results are derived assuming that each panel is based on independent observations, and then extended to linear processes. The proofs are based on an approximation of the sum of squared CUSUM processes using the Skorokhod embedding scheme. A simulation study illustrates that our results can be used in the case of small and moderate N and T. We apply our results to detect a change in the “corruption index”.
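To make the construction concrete, here is a rough sketch of the building block only: for each panel the CUSUM process of the observations is studentized by a within-panel variance estimate and the squared CUSUMs are summed over panels. The exact self-normalization and the double-exponential limit calibration of the paper are not reproduced.

```python
import numpy as np

def panel_cusum_statistic(X):
    """X: N x T array of panel observations.  For each candidate change point t,
    return the sum over panels of squared studentized CUSUMs.
    Rough illustrative sketch; the paper's exact normalization differs."""
    N, T = X.shape
    stat = np.zeros(T - 1)
    for t in range(1, T):
        for i in range(N):
            x = X[i]
            cusum = x[:t].sum() - (t / T) * x.sum()
            sigma2 = x.var(ddof=1)          # naive within-panel variance estimate
            stat[t - 1] += cusum**2 / (T * sigma2)
    # large values of stat near some t suggest a common change in the panel means
    return stat
```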

9.
In this article we study a linear discriminant function of multiple m-variate observations at u-sites and over v-time points under the assumption of multivariate normality. We assume that the m-variate observations have a separable mean vector structure and a “jointly equicorrelated covariance” structure. The new discriminant function is very effective in discriminating individuals in a small sample scenario. No closed-form expression exists for the maximum likelihood estimates of the unknown population parameters, and their direct computation is nontrivial. An iterative algorithm is proposed to calculate the maximum likelihood estimates of these unknown parameters. A discriminant function is also developed for unstructured mean vectors. The new discriminant functions are applied to simulated data sets as well as to a real data set. Results illustrating the benefits of the new classification methods over the traditional one are presented.

10.
In what follows, we introduce two Bayesian models for feature selection in high-dimensional data, specifically designed for the purpose of classification. We use two approaches to the problem: one which discards the components which have “almost constant” values (Model 1) and another which retains the components for which variations between the groups are larger than those within the groups (Model 2). We assume that p ≫ n, i.e. the number of components p is much larger than the number of samples n, and that only few of those p components are useful for subsequent classification. We show that particular cases of the above two models recover familiar variance or ANOVA-based component selection. When one has only two classes and features are a priori independent, Model 2 reduces to the Feature Annealed Independence Rule (FAIR) introduced by Fan and Fan (2008) and can be viewed as a natural generalization of FAIR to the case of L>2 classes. The performance of the methodology is studied via simulations and using a biological dataset of animal communication signals comprising 43 groups of electric signals recorded from tropical South American electric knife fishes.
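The variance/ANOVA-style selection that Model 2 reduces to can be illustrated with FAIR-type two-sample t-statistic screening (two classes, features treated as a priori independent). This is a generic sketch of that screening rule, not of the Bayesian Models 1 and 2 of the paper.

```python
import numpy as np

def fair_screening(X, y, n_keep):
    """Rank features by the absolute two-sample t-statistic between the two
    classes coded in y (0/1) and keep the n_keep largest.  Generic FAIR-style
    screening sketch, not the paper's Bayesian models."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    num = X1.mean(axis=0) - X0.mean(axis=0)
    den = np.sqrt(X0.var(axis=0, ddof=1) / n0 + X1.var(axis=0, ddof=1) / n1)
    t = np.abs(num / den)
    return np.argsort(t)[::-1][:n_keep]      # indices of the selected features
```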

11.
Sample quantile, rank, and outlyingness functions play long-established roles in univariate exploratory data analysis. In recent years, various multivariate generalizations have been formulated, among which the “spatial” approach has become especially well developed, including fully affine equivariant/invariant versions with only modest computational burden (24, 6, 34, 32 and 25). The only shortcoming of the spatial approach is that its robustness decreases to zero as the quantile or outlyingness level is chosen farther out from the center (Dang and Serfling, 2010). This is especially detrimental to exploratory data analysis procedures such as detection of outliers and delineation of the “middle” 50%, 75%, or 90% of the data set, for example. Here we develop suitably robust versions using a trimming approach. The improvements in robustness are illustrated and characterized using simulated and actual data. Also, as a byproduct of the investigation, a new robust, affine equivariant, and computationally easy scatter estimator is introduced.
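For orientation, the (untrimmed) spatial outlyingness function underlying this approach is simple to compute: it is the norm of the average unit vector pointing from the data points to the evaluation point. The sketch below omits the trimming and the affine-invariant transformation discussed in the paper.

```python
import numpy as np

def spatial_outlyingness(x, data, eps=1e-12):
    """Spatial outlyingness of a point x with respect to an n x d data matrix:
    the norm of the average unit vector from the data points to x.
    Values near 0 are central, values near 1 are extreme.
    Plain (untrimmed, non-affine-invariant) sketch."""
    diffs = np.asarray(x, dtype=float) - np.asarray(data, dtype=float)   # n x d
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > eps                        # ignore data points coinciding with x
    units = diffs[keep] / norms[keep, None]
    return np.linalg.norm(units.mean(axis=0))
```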

12.
Isotones are a deterministic graphical device introduced by Mudholkar et al. [1991. A graphical procedure for comparing goodness-of-fit tests. J. Roy. Statist. Soc. B 53, 221–232], in the context of comparing some tests of normality. An isotone of a test is a contour of p-values of the test applied to “ideal samples”, called profiles, from a two-shape-parameter family representing the null and the alternative distributions of the parameter space. The isotone is an adaptation of Tukey's sensitivity curves, a generalization of Prescott's stylized sensitivity contours, and an alternative to the isodynes of Stephens. The purpose of this paper is twofold. One is to show that the isotones can provide useful qualitative information regarding the behavior of tests of distributional assumptions other than normality. The other is to show that the qualitative conclusions remain the same from one two-parameter family of alternatives to another. Towards this end we construct and interpret the isotones of some tests of the composite hypothesis of exponentiality, using the profiles of two Weibull extensions, the generalized Weibull and the exponentiated Weibull families, which allow IFR, DFR, as well as unimodal and bathtub failure rate alternatives. Thus, as a by-product of the study, it is seen that a test due to Csörgő et al. [1975. Application of characterizations in the area of goodness-of-fit. In: Patil, G.P., Kotz, S., Ord, J.K. (Eds.), Statistical Distributions in Scientific Work, vol. 2. Reidel, Boston, pp. 79–90], and Gnedenko's Q(r) test [1969. Mathematical Methods of Reliability Theory. Academic Press, New York], are appropriate for detecting monotone failure rate alternatives, whereas a bivariate F test due to Lin and Mudholkar [1980. A test of exponentiality based on the bivariate F distribution. Technometrics 22, 79–82] and their entropy test [1984. On two applications of characterization theorems to goodness-of-fit. Colloq. Math. Soc. Janos Bolyai 45, 395–414] can detect all alternatives, but are especially suitable for nonmonotone failure rate alternatives.

13.
The lognormal distribution is currently used extensively to describe the distribution of positive random variables. This is especially the case with data pertaining to occupational health and other biological data. One particular application is statistical inference with regard to the mean of the data. Other authors, namely Zou et al. (2009), have proposed procedures involving the so-called “method of variance estimates recovery” (MOVER), while an alternative approach based on simulation is the so-called generalized confidence interval, discussed by Krishnamoorthy and Mathew (2003). In this paper we compare the performance of the MOVER-based confidence interval estimates and the generalized confidence interval procedure with the coverage of credibility intervals obtained using Bayesian methodology under a variety of prior distributions, in order to assess the appropriateness of each. An extensive simulation study is conducted to evaluate the coverage accuracy and interval width of the proposed methods. For the Bayesian approach both the equal-tail and highest posterior density (HPD) credibility intervals are presented. Various prior distributions (the independence Jeffreys prior; the Jeffreys-rule prior, namely the square root of the determinant of the Fisher information matrix; the reference prior; and probability-matching priors) are evaluated and compared to determine which give the best coverage with the most efficient interval width. The simulation studies show that the constructed Bayesian confidence intervals have satisfactory coverage probabilities and in some cases outperform the MOVER and generalized confidence interval results. The Bayesian inference procedures (hypothesis tests and confidence intervals) are also extended to the difference between two lognormal means, to the case of zero-valued observations, and to confidence intervals for the lognormal variance. In the last section of this paper the bivariate lognormal distribution is discussed and Bayesian confidence intervals are obtained for the difference between two correlated lognormal means as well as for the ratio of lognormal variances, using nine different priors.
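For orientation, the generalized confidence interval of Krishnamoorthy and Mathew (2003) for the lognormal mean exp(μ + σ²/2) can be simulated from the standard generalized pivotal quantities for μ and σ² based on the log-scale sample mean and variance. A minimal Monte Carlo sketch:

```python
import numpy as np

def gci_lognormal_mean(x, alpha=0.05, nsim=100_000, seed=1):
    """Generalized confidence interval for the lognormal mean exp(mu + sigma^2/2)
    built from generalized pivotal quantities for mu and sigma^2.
    Minimal Monte Carlo sketch of the Krishnamoorthy-Mathew approach."""
    rng = np.random.default_rng(seed)
    logx = np.log(np.asarray(x, dtype=float))
    n, ybar, s2 = len(logx), logx.mean(), logx.var(ddof=1)
    z = rng.standard_normal(nsim)
    u = rng.chisquare(n - 1, nsim)
    t_sigma2 = (n - 1) * s2 / u                  # GPQ for sigma^2
    t_mu = ybar - z * np.sqrt(t_sigma2 / n)      # GPQ for mu
    gpq = np.exp(t_mu + t_sigma2 / 2)            # GPQ for the lognormal mean
    return np.quantile(gpq, [alpha / 2, 1 - alpha / 2])
```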

14.
Let τ be an arbitrary lattice path, called in this context a string, consisting of two kinds of steps (rises and falls), and let j be a non-negative integer. In this paper, the explicit formula for the generating function F_j associated with the Dyck path statistic “number of occurrences of τ at height j” is evaluated. For the expression of F_j some basic characteristics of the string are used, namely its number of rises, height, depth and periodicity, as well as the generating function of the Catalan numbers.
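Since the generating function of the Catalan numbers enters the expression for F_j, a quick sanity-check sketch: the Catalan number C_n counts Dyck paths of length 2n, which can be verified by brute-force enumeration for small n.

```python
from math import comb
from itertools import product

def catalan(n):
    """n-th Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def count_dyck_paths(n):
    """Count paths of n rises (+1) and n falls (-1) that never dip below
    the x-axis and end at height 0 -- brute force, small n only."""
    count = 0
    for steps in product((1, -1), repeat=2 * n):
        height, ok = 0, True
        for s in steps:
            height += s
            if height < 0:
                ok = False
                break
        if ok and height == 0:
            count += 1
    return count

assert all(catalan(n) == count_dyck_paths(n) for n in range(7))
```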

15.
Two methods are proposed for selecting the columns of a supersaturated design to which factors are assigned. The focus of interest is the degree of non-orthogonality between the selected columns. One method is the exhaustive enumeration of selections of p columns from all k columns to find the exact optimum, while the other is intended to find an approximate solution by applying techniques used in the corresponding analysis, aiming for ease of use as well as a reduction in the large computing time required for large k with the first method. Numerical illustrations for several typical design matrices reveal that the resulting “approximately” optimal assignments of factors to their columns are exactly optimal for any p. Ordering the columns in E(s²)-optimal designs results in promising new findings, including a large number of E(s²)-optimal designs.
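For reference, the E(s²) criterion for a two-level supersaturated design with columns coded ±1 is the average of the squared inner products s_ij between distinct columns; a short sketch of its computation:

```python
import numpy as np

def e_s2(D):
    """E(s^2) of an n x k two-level design matrix D with entries +-1:
    the average of s_ij^2 = (column_i . column_j)^2 over all pairs i < j."""
    D = np.asarray(D, dtype=float)
    S = D.T @ D                           # k x k matrix of column inner products
    k = D.shape[1]
    off = S[np.triu_indices(k, 1)]        # s_ij for i < j
    return (off**2).mean()
```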

16.
In this note we propose a new kernel density estimator for directly estimating the probability and cumulative distribution function of an L-estimate from a single population, based on the theory in Knight (1985) in conjunction with classic inversion theory. This idea is further developed for a kernel density estimator for the difference of L-estimates from two independent populations. The methodology is developed via a “plug-in” approach, but it is distinct from the classic bootstrap methodology in that it is analytically and computationally feasible to provide an exact estimate of the distribution function, and thus eliminates resampling-related error. The asymptotic and finite sample properties of our estimators are examined. The procedure is illustrated by generating the kernel density estimate for Tukey's trimean from a small data set.
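As a reminder, Tukey's trimean used in the illustration is the L-estimate (Q1 + 2·median + Q3)/4; a one-line sketch (the quartile convention may differ slightly from the paper's):

```python
import numpy as np

def trimean(x):
    """Tukey's trimean: (Q1 + 2*median + Q3) / 4 -- a simple L-estimate."""
    q1, q2, q3 = np.percentile(x, [25, 50, 75])
    return (q1 + 2 * q2 + q3) / 4

print(trimean([2, 4, 4, 5, 7, 9, 12, 20]))
```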

17.
Consider a linear regression model with regression parameter β = (β1,…,βp) and independent normal errors. Suppose the parameter of interest is θ = a^Tβ, where a is specified. Define the s-dimensional parameter vector τ = C^Tβ − t, where C and t are specified. Suppose that we carry out a preliminary F test of the null hypothesis H0: τ = 0 against the alternative hypothesis H1: τ ≠ 0. It is common statistical practice to then construct a confidence interval for θ with nominal coverage 1 − α, using the same data, based on the assumption that the selected model had been given to us a priori (as the true model). We call this the naive 1 − α confidence interval for θ. This assumption is false, and it may lead to this confidence interval having minimum coverage probability far below 1 − α, making it completely inadequate. We provide a new elegant method for computing the minimum coverage probability of this naive confidence interval that works well irrespective of how large s is. A very important practical application of this method is to the analysis of covariance. In this context, τ can be defined so that H0 expresses the hypothesis of “parallelism”. Applied statisticians commonly recommend carrying out a preliminary F test of this hypothesis. We illustrate the application of our method with a real-life analysis of covariance data set and a preliminary F test for “parallelism”. We show that the naive 0.95 confidence interval has minimum coverage probability 0.0846, showing that it is completely inadequate.
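The coverage failure described above is easy to reproduce by simulation in a toy analysis-of-covariance setting: fit a two-group model with an interaction, carry out a preliminary test of parallelism, then build the nominal 95% interval from whichever model was selected. The sketch below uses made-up parameter values and plain Monte Carlo, not the data set or the computational method of the paper.

```python
import numpy as np
from scipy import stats

def ols(X, y):
    """Least-squares fit: coefficients, (X'X)^-1, sigma^2-hat and residual df."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    df = len(y) - X.shape[1]
    return b, XtX_inv, resid @ resid / df, df

def naive_ci_coverage(beta3=0.8, x0=1.0, n=20, nrep=10_000, seed=2):
    """Monte Carlo coverage of the naive 95% CI for theta = beta1 + beta3*x0
    (treatment effect at covariate value x0), built after a preliminary
    5%-level test of 'parallelism' H0: beta3 = 0.  Toy illustration only."""
    rng = np.random.default_rng(seed)
    g = np.repeat([0.0, 1.0], n // 2)                    # group indicator
    x = np.tile(np.linspace(-1.0, 1.0, n // 2), 2)       # covariate
    Xfull = np.column_stack([np.ones(n), g, x, g * x])   # separate slopes
    Xred = Xfull[:, :3]                                  # parallel (common slope)
    theta_true = 0.5 + beta3 * x0
    cover = 0
    for _ in range(nrep):
        y = 1.0 + 0.5 * g + 1.0 * x + beta3 * g * x + rng.standard_normal(n)
        bf, Vf, s2f, dff = ols(Xfull, y)
        t3 = bf[3] / np.sqrt(s2f * Vf[3, 3])             # preliminary test of parallelism
        if abs(t3) > stats.t.ppf(0.975, dff):            # keep the full model
            a, b, V, s2, df = np.array([0.0, 1.0, 0.0, x0]), bf, Vf, s2f, dff
        else:                                            # refit the parallel model
            b, V, s2, df = ols(Xred, y)
            a = np.array([0.0, 1.0, 0.0])
        est, se = a @ b, np.sqrt(s2 * (a @ V @ a))
        half = stats.t.ppf(0.975, df) * se
        cover += (est - half <= theta_true <= est + half)
    return cover / nrep

print(naive_ci_coverage())   # typically well below the nominal 0.95
```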

18.
We consider the bandit problem with an infinite number of Bernoulli arms, of which the unknown parameters are assumed to be i.i.d. random variables with a common distribution F. Our goal is to construct optimal strategies of choosing “arms” so that the expected long-run failure rate is minimized. We first review a class of strategies and establish their asymptotic properties when F is known. Based on the results, we propose a new strategy and prove that it is asymptotically optimal when F is unknown. Finally, we show that the proposed strategy performs well for a number of simulation scenarios.

19.
This paper considers linear and nonlinear regression with a response variable that is allowed to be “missing at random”. The only structural assumptions on the distribution of the variables are that the errors have mean zero and are independent of the covariates. The independence assumption is important. It enables us to construct an estimator for the response density that uses all the observed data, in contrast to the usual local smoothing techniques, and which therefore permits a faster rate of convergence. The idea is to write the response density as a convolution integral which can be estimated by an empirical version, with a weighted residual-based kernel estimator plugged in for the error density. For an appropriate class of regression functions, and a suitably chosen bandwidth, this estimator is consistent and converges at the optimal parametric rate n^{1/2}. Moreover, the estimator is proved to be efficient (in the sense of Hájek and Le Cam) if an efficient estimator is used for the regression parameter.
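The convolution idea can be made concrete in a few lines: with a fitted regression function m_hat and a kernel density estimate of the error density built from the residuals, the response density is estimated by averaging that error-density estimate, shifted by m_hat(x_j), over the observed covariates. The sketch below uses a plain unweighted residual-based kernel estimator; the paper's weighted version and efficiency arguments are not reproduced.

```python
import numpy as np

def response_density(y_grid, x_obs, y_obs, m_hat, bandwidth):
    """Convolution-type estimator of the response density:
    f_Y(y) ~ (1/n) * sum_j f_eps_hat(y - m_hat(x_j)), with f_eps_hat a
    Gaussian-kernel density estimate built from the residuals.
    Plain unweighted sketch of the idea described in the abstract."""
    x_obs, y_obs = np.asarray(x_obs, float), np.asarray(y_obs, float)
    resid = y_obs - m_hat(x_obs)            # residuals from the fitted regression

    def f_eps(e):                           # kernel estimate of the error density
        return np.mean(np.exp(-0.5 * ((e - resid) / bandwidth) ** 2)) / (
            bandwidth * np.sqrt(2 * np.pi))

    centers = m_hat(x_obs)
    return np.array([np.mean([f_eps(y - c) for c in centers]) for y in y_grid])
```

For instance, m_hat could be a simple fitted line, e.g. `m_hat = np.poly1d(np.polyfit(x_obs, y_obs, 1))`, before calling the estimator on a grid of y values.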

20.
We consider a hypothesis-testing problem with directional alternatives. We approach the problem from a Bayesian decision theoretic point of view and consider a situation when one side of the alternatives is more important or more probable than the other. We develop a general Bayesian framework by specifying a mixture prior structure and a loss function related to the Kullback–Leibler divergence. This Bayesian decision method is applied to Normal and Poisson populations. Simulations are performed to compare the performance of the proposed method with that of a method based on a classical z-test and a Bayesian method based on the “0–1” loss.
