Similar literature
 20 similar articles found (search time: 31 ms)
1.
Using a forward selection procedure to select the best subset of regression variables involves calculating critical values (cutoffs) for an F-ratio at each step of a multistep search process. Once the restrictive (unrealistic) assumptions used in previous work are dropped, the null distribution of the F-ratio depends on unknown regression parameters for the variables already included in the subset. For the case of known σ, by conditioning the F-ratio on the set of regressors included so far and also on the observed (estimated) values of their regression coefficients, we obtain a forward selection procedure whose stepwise type I error does not depend on the unknown (nuisance) parameters. A numerical example with an orthogonal design matrix illustrates the difference between conditional cutoffs, cutoffs for the central F-distribution, and cutoffs suggested by Pope and Webster.

2.
In this paper, we recast variable selection for linear regression as a multiple testing problem and select significant variables according to the test results. New variable selection procedures are proposed based on the optimal discovery procedure (ODP) in multiple testing. Owing to the ODP's optimality, for a guaranteed number of significant variables included, it includes fewer nonsignificant variables than methods based on marginal p-values. Consistency of our procedures is established in theory and confirmed by simulation. Simulation results suggest that procedures based on multiple testing improve on procedures based on selection criteria, and that our new procedures perform better than procedures based on marginal p-values.

3.
This paper proposes a variable selection method for detecting abnormal items based on the T² test when observations on abnormal items are available. Based on unbiased estimates of the power for all subsets of variables, the method selects the subset of variables that maximizes the power estimate. Since more than one subset of variables frequently maximizes the power estimate, the averaged p-value of the rejected items is used as a second criterion. Although the performance of the method depends on the sample size for the abnormal items and the true power values for all subsets of variables, numerical experiments show the effectiveness of the proposed method. Since normal and abnormal items are simulated using one-factor and two-factor models, basic properties of the power functions for these models are investigated.

4.
This paper extends the ordinary quasi‐symmetry (QS) model for square contingency tables with commensurable classification variables. The proposed generalised QS model is defined in terms of odds ratios that apply to ordinal variables. In particular, we present QS models based on global, cumulative and continuation odds ratios and discuss their properties. Finally, the conditional generalised QS model is introduced for local and global odds ratios. These models are illustrated through the analysis of two data sets.

5.
We study bias arising from rounding categorical variables following multivariate normal (MVN) imputation. This task has been well studied for binary variables, but not for more general categorical variables. Three methods that assign imputed values to categories based on fixed reference points are compared using 25 specific scenarios covering variables with k=3, …, 7 categories, and five distributional shapes, and for each k=3, …, 7, we examine the distribution of bias arising over 100,000 distributions drawn from a symmetric Dirichlet distribution. We observed, on both empirical and theoretical grounds, that one method (projected-distance-based rounding) is superior to the other two methods, and that the risk of invalid inference with the best method may be too high at sample sizes n≥150 at 50% missingness, n≥250 at 30% missingness and n≥1500 at 10% missingness. Therefore, these methods are generally unsatisfactory for rounding categorical variables (with up to seven categories) following MVN imputation.

6.
Let X(1) ≤ X(2) ≤ ··· ≤ X(n) be the order statistics from independent and identically distributed random variables {Xi, 1 ≤ i ≤ n} with a common absolutely continuous distribution function. In this work, a new characterization of distributions based on order statistics is first presented. Next, we review some conditional expectation properties of order statistics, which can be used to establish equivalent forms for conditional expectations of sums of random variables based on order statistics. Using these equivalent forms, some known results can be extended immediately.

7.
In this paper, a variables tightened-normal-tightened (TNT) two-plan sampling system based on the widely used capability index Cpk is developed for product acceptance determination when the quality characteristic of products has two-sided specification limits and follows a normal distribution. The operating procedure and operating characteristic (OC) function of the variables TNT two-plan sampling system, and the conditions for solving plan parameters are provided. The behavior of OC curves for the variables TNT sampling system under various parameters is also studied, and compared with the variables single tightened inspection plan and single normal inspection plan.
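
For readers unfamiliar with the capability index mentioned in this abstract, the sketch below computes the textbook Cpk for a normally distributed characteristic with two-sided specification limits, Cpk = min(USL − μ, μ − LSL) / (3σ). It illustrates only the index itself, not the sampling system proposed in the paper; the function name `cpk` and the example values are assumptions made for illustration.

```python
def cpk(mean, sigma, lsl, usl):
    """Textbook process capability index for two-sided specification limits.

    mean, sigma: process mean and standard deviation (assumed known here).
    lsl, usl:    lower and upper specification limits.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Distance from the mean to the nearer limit, in units of 3 sigma.
    return min(usl - mean, mean - lsl) / (3 * sigma)


# Hypothetical process: mean 10.0, sigma 0.5, limits [8.0, 13.0].
example_cpk = cpk(10.0, 0.5, 8.0, 13.0)
```

In practice the mean and standard deviation are replaced by sample estimates, which is why acceptance plans based on Cpk need carefully chosen sample sizes and critical values.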

8.
The exact probability density function of a bivariate chi-square distribution with two correlated components is derived. Some moments of the product and ratio of two correlated chi-square random variables have been derived. The ratio of the two correlated chi-square variables is used to compare variability. One such application is referred to. Another application is pinpointed in connection with the distribution of correlation coefficient based on a bivariate t distribution.

9.
Acceptance sampling is a quality assurance tool that provides a rule for the producer and the consumer to make an acceptance or rejection decision about a lot. This paper attempts to develop a more efficient sampling plan, the variables repetitive group sampling plan, based on the total loss to the producer and consumer. To design this model, two constraints are imposed to satisfy the opposing priorities and requirements of the producer and the consumer, using the acceptable quality level (AQL) and limiting quality level (LQL) points on the operating characteristic (OC) curve. The objective function of this model is constructed from the total expected loss. An example is presented to illustrate the application of the proposed model. In addition, the effects of process parameters on the optimal solution and the total expected loss are studied through a sensitivity analysis. Finally, the efficiency of the proposed model is compared with the variables single sampling plan, the variables double sampling plan and the repetitive group sampling plan of Balamurali and Jun (2006) in terms of average sample number, total expected loss and deviation from the ideal OC curve.

10.
We focus on the problem of selection of a subset of the variables so as to preserve the multivariate data structure that a principal-components analysis of the initial variables would reveal. We propose a new method based on some adapted Gaussian graphical models. This method is then compared with those developed by Bonifas et al. (1984) and Krzanowski (1987a, b). It appears that the criteria for all methods consider the same correlation submatrices and often lead to similar results. The proposed approach offers some guidance as to the number of variables to be selected. In particular, Akaike's information criterion is used.

11.
Global sensitivity analysis with variance-based measures suffers from several theoretical and practical limitations, since they focus only on the variance of the output and handle multivariate variables in a limited way. In this paper, we introduce a new class of sensitivity indices based on dependence measures which overcomes these insufficiencies. Our approach originates from the idea to compare the output distribution with its conditional counterpart when one of the input variables is fixed. We establish that this comparison yields previously proposed indices when it is performed with Csiszár f-divergences, as well as sensitivity indices which are well-known dependence measures between random variables. This leads us to investigate completely new sensitivity indices based on recent state-of-the-art dependence measures, such as distance correlation and the Hilbert–Schmidt independence criterion. We also emphasize the potential of feature selection techniques relying on such dependence measures as alternatives to screening in high dimension.

12.
In this paper, we are interested in the estimation of the reliability parameter R = P(X > Y), where X, a component strength, and Y, a component stress, are independent power Lindley random variables. Point and interval estimation of R, based on maximum likelihood, nonparametric and parametric bootstrap methods, is developed. The performance of the point estimate and confidence interval of R under the considered estimation methods is studied through extensive simulation. A numerical example, based on real data, is presented to illustrate the proposed procedure.
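
As a rough illustration of what estimating R = P(X > Y) means, the sketch below computes the simplest distribution-free point estimate: the fraction of (strength, stress) pairs in which the strength exceeds the stress. This is a generic estimator and is not claimed to be the one used in the paper; the function name `stress_strength_estimate` and the sample values are assumptions made for illustration.

```python
from itertools import product


def stress_strength_estimate(strength, stress):
    """Fraction of (x, y) pairs with x > y, an estimate of R = P(X > Y).

    strength: observed strengths (sample from X).
    stress:   observed stresses (sample from Y), independent of strength.
    """
    if not strength or not stress:
        raise ValueError("both samples must be non-empty")
    pairs = list(product(strength, stress))
    # Count the pairs in which the strength observation exceeds the stress.
    return sum(1 for x, y in pairs if x > y) / len(pairs)


# Hypothetical samples: 6 of the 9 pairs have strength > stress.
r_hat = stress_strength_estimate([5.0, 7.0, 9.0], [4.0, 6.0, 8.0])
```

A parametric approach would instead fit a distribution to each sample (power Lindley, in the abstract above) and evaluate R from the fitted parameters.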

13.
In this paper, a new mixed sampling plan based on the process capability index (PCI) Cpk is proposed; the resulting plan is called the mixed variable lot-size chain sampling plan (ChSP). The proposed mixed plan comprises both attribute and variables inspections. The variable lot-size sampling plan is used for the inspection of attribute quality characteristics, and the variables ChSP based on the PCI is used for the inspection of measurable quality characteristics. We consider both symmetric and asymmetric fraction nonconforming cases for the variables ChSP. Tables are developed for determining the optimal parameters of the proposed mixed plan based on the two-points-on-the-operating-characteristic (OC) curve approach. To construct the tables, the problem is formulated as a nonlinear programming problem in which the average sample number function is the objective function to be minimized and the lot acceptance probabilities at the acceptable quality level and limiting quality level under the OC curve are the constraints. The practical implementation of the proposed mixed sampling plan is explained with an illustrative real-time example. Advantages of the proposed sampling plan are also discussed through comparison with other existing sampling plans.

14.
In this paper, we consider the estimation of the population total Y of the study variable y, making use of information on an auxiliary variable x. A class of estimators for the population total Y, using transformations of both the study and auxiliary variables, is suggested under probability proportional to size sampling with replacement (PPSWR). Among many others, the usual PPS estimator, Reddy and Rao's (1977) estimator and Srivenkataramana and Tracy's (1979, 1984, 1986) estimators are shown to be members of the proposed class of estimators. The variance of the proposed class of estimators is obtained. In particular, the properties of 75 estimators based on different known population parameters of the study and auxiliary variables are derived from the proposed class of estimators. Numerical illustrations are given in support of the present study.

15.
This paper discusses the traditional specification problem from a geometric (or co-ordinate-free) viewpoint. While the traditional emphasis is on the properties of estimators, the geometric approach also allows an easy development of corresponding results for inference. Errors arising from artificial inclusion or exclusion of variables are considered in terms of augmentations or restrictions on a given maintained hypothesis, and this allows a corresponding interpretation of tests based upon the Wald and Lagrange Multiplier Principles. It is demonstrated that biases arising from incorrect exclusion of variables do not invalidate the traditional F-test.

16.
In this article, we consider the problem of selecting functional variables using L1 regularization in a functional linear regression model with a scalar response and functional predictors, in the presence of outliers. Since the LASSO is a special case of penalized least-squares regression with an L1 penalty function, it is sensitive to heavy-tailed errors and/or outliers in the data. Recently, the Least Absolute Deviation (LAD) and LASSO methods have been combined (the LAD-LASSO regression method) to carry out robust parameter estimation and variable selection simultaneously for a multiple linear regression model. However, selection of functional predictors based on the LASSO fails, since each functional predictor has multiple parameters. Group LASSO is therefore used for selecting functional predictors, since it selects grouped variables rather than individual variables. In this study, we propose a robust functional predictor selection method, the LAD-group LASSO, for a functional linear regression model with a scalar response and functional predictors. We illustrate the performance of the LAD-group LASSO on both simulated and real data.

17.
In this article, a robust variable selection procedure based on the weighted composite quantile regression (WCQR) is proposed. Compared with the composite quantile regression (CQR), WCQR is robust to heavy-tailed errors and outliers in the explanatory variables. For the choice of the weights in the WCQR, we employ a weighting scheme based on the principal component method. To select variables with grouping effect, we consider WCQR with SCAD-L2 penalization. Furthermore, under some suitable assumptions, the theoretical properties, including the consistency and oracle property of the estimator, are established with a diverging number of parameters. In addition, we study the numerical performance of the proposed method in the case of ultrahigh-dimensional data. Simulation studies and real examples are provided to demonstrate the superiority of our method over the CQR method when there are outliers in the explanatory variables and/or the random error is from a heavy-tailed distribution.

18.
We study the distributions of the random variables Sn and Vr related to a sequence of dependent Bernoulli variables, where Sn denotes the number of successes in n trials and Vr the number of trials necessary to obtain r successes. The purpose of this article is twofold: (1) Generalizing some results on the “nature” of the binomial and negative binomial distributions we show that Sn and Vr can follow any prescribed discrete distribution. The corresponding joint distributions of the Bernoulli variables are characterized as the solutions of systems of linear equations. (2) We consider a specific type of dependence of the Bernoulli variables, where the probability of a success depends only on the number of previous successes. We develop some theory based on new closed-form representations for the probability mass functions of Sn and Vr which enable direct computations of the probabilities.

19.
In this paper, the Rosenthal-type maximal inequalities and Kolmogorov-type exponential inequality for negatively superadditive-dependent (NSD) random variables are presented. By using these inequalities, we study the complete convergence for arrays of rowwise NSD random variables. As applications, the Baum–Katz-type result for arrays of rowwise NSD random variables and the complete consistency for the estimator of nonparametric regression model based on NSD errors are obtained. Our results extend and improve the corresponding ones of Chen et al. [On complete convergence for arrays of rowwise negatively associated random variables. Theory Probab Appl. 2007;52(2):393–397] for arrays of rowwise negatively associated random variables to the case of arrays of rowwise NSD random variables.

20.
We address the problem of robust inference about the stress–strength reliability parameter R = P(X < Y), where X and Y are taken to be independent random variables. Indeed, although classical likelihood based procedures for inference on R are available, it is well-known that they can be badly affected by mild departures from model assumptions, regarding both stress and strength data. The proposed robust method relies on the theory of bounded influence M-estimators. We obtain large-sample test statistics with the standard asymptotic distribution by means of delta-method asymptotics. The finite sample behavior of these tests is investigated by some numerical studies, when both X and Y are independent exponential or normal random variables. An illustrative application in a regression setting is also discussed.
