Similar literature
20 similar articles found (search time: 31 ms)
1.
2.
We study the variable selection problem for a class of generalized linear models with endogenous covariates. Based on the instrumental variable adjustment technique and the smooth-threshold estimating equation (SEE) method, we propose an instrumental variable based variable selection procedure. The proposed variable selection method can attenuate the effect of endogeneity in covariates and is easy to apply in practice. Some theoretical results are also derived, such as the consistency of the proposed variable selection procedure and the convergence rate of the resulting estimator. Further, some simulation studies and a real data analysis are conducted to evaluate the performance of the proposed method, and the simulation results show that the proposed method is workable.

3.
Markov's theorem for an upper bound of the probability related to a nonnegative random variable has been improved using additional information over almost the entire nontrivial range of the variable. In the improvement, Cantelli's inequality is applied to the square root of the original variable, whose expectation is finite when that of the original variable is finite. The improvement has been extended to lower bounds and monotonic transformations of the original variable. The improvements are used in Chebyshev's inequality and its multivariate version.
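
The idea in the abstract above can be sketched with the textbook forms of the two inequalities. This is only an illustration of the general mechanism, not the paper's exact statement; the symbols a, \mu, \sigma^2 and \lambda are notation assumed here, and the range condition \sqrt{a} > \mu is an assumption made so that Cantelli's inequality applies.

```latex
% Markov's inequality for a nonnegative random variable X and a > 0:
\[
  P(X \ge a) \le \frac{E[X]}{a}.
\]
% Cantelli's inequality for Y with mean \mu, variance \sigma^2, and \lambda > 0:
\[
  P(Y - \mu \ge \lambda) \le \frac{\sigma^2}{\sigma^2 + \lambda^2}.
\]
% Take Y = \sqrt{X}, so \mu = E[\sqrt{X}], \sigma^2 = E[X] - \mu^2,
% and \lambda = \sqrt{a} - \mu > 0:
\[
  P(X \ge a) = P\bigl(\sqrt{X} - \mu \ge \sqrt{a} - \mu\bigr)
  \le \frac{E[X] - \mu^2}{E[X] + a - 2\mu\sqrt{a}}.
\]
% This is never worse than Markov's bound on that range, because
\[
  \frac{E[X]}{a} - \frac{E[X] - \mu^2}{E[X] + a - 2\mu\sqrt{a}}
  = \frac{\bigl(E[X] - \mu\sqrt{a}\bigr)^2}{a\,\bigl(E[X] + a - 2\mu\sqrt{a}\bigr)} \ge 0 .
\]
```

The denominator E[X] + a - 2\mu\sqrt{a} is positive because E[X] \ge \mu^2, so it is at least (\sqrt{a} - \mu)^2.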

4.
A Cornish-Fisher expansion is used to approximate the percentiles of a variable of the bivariate normal distribution when the other variable is truncated. The expression is in terms of the bivariate cumulants of a singly truncated bivariate normal distribution. The percentiles are useful in the problem of personnel selection where we use a screening variable to screen applicants for employment and a correlated performance variable to screen employees for rehiring. This paper provides a bivariate cumulants table for determining the cutoff score of the performance variable. The following two problems are also considered: (1) determine the proportion of applicants who would have been successful had no screening been applied, and (2) determine the proportion of individuals being rejected by screening who would have been successful had they been hired. The variable that is used to measure job performance and the variable that measures the outcome of an aptitude test are assumed to be jointly normally distributed with correlation ρ.

5.
In the framework of cluster analysis based on Gaussian mixture models, it is usually assumed that all the variables provide information about the clustering of the sample units. Several variable selection procedures are available in order to detect the structure of interest for the clustering when this structure is contained in a variable sub-vector. Currently, in these procedures a variable is assumed to play one of (up to) three roles: (1) informative, (2) uninformative and correlated with some informative variables, (3) uninformative and uncorrelated with any informative variable. A more general approach for modelling the role of a variable is proposed by taking into account the possibility that the variable vector provides information about more than one structure of interest for the clustering. This approach is developed by assuming that such information is given by non-overlapping and possibly correlated sub-vectors of variables; it is also assumed that the model for the variable vector is equal to a product of conditionally independent Gaussian mixture models (one for each variable sub-vector). Details about model identifiability, parameter estimation and model selection are provided. The usefulness and effectiveness of the described methodology are illustrated using simulated and real datasets.

6.
This paper concerns a robust variable selection method in multiple linear regression: the robust S-nonnegative garrote variable selection method. In this paper the consistency of the method, both in terms of estimation and in terms of variable selection, is established. Moreover, the robustness properties of the method are further investigated by providing a lower bound for the breakdown point, and by deriving the influence function. The provided expressions nicely reveal the impact that the choice of an initial estimator has on the robustness properties of the variable selection method. Illustrative examples of influence functions for the S-nonnegative garrote as well as for the original (non-robust) nonnegative garrote variable selection method are provided.

7.
Stepwise variable selection procedures are computationally inexpensive methods for constructing useful regression models for a single dependent variable. At each step a variable is entered into or deleted from the current model, based on the criterion of minimizing the error sum of squares (SSE). When there is more than one dependent variable, the situation is more complex. In this article we propose variable selection criteria for multivariate regression which generalize the univariate SSE criterion. Specifically, we suggest minimizing some function of the estimated error covariance matrix: the trace, the determinant, or the largest eigenvalue. The computations associated with these criteria may be burdensome. We develop a computational framework based on the use of the SWEEP operator which greatly reduces these calculations for stepwise variable selection in multivariate regression.
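
The three criteria named in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's method: it refits each candidate model with ordinary least squares instead of using the SWEEP operator, it only enters variables (no deletion step), and the names `error_covariance`, `CRITERIA` and `forward_select` are hypothetical.

```python
import numpy as np


def error_covariance(X, Y, cols):
    """Estimated error covariance of Y regressed on the chosen columns of X."""
    n = Y.shape[0]
    if len(cols) == 0:
        resid = Y
    else:
        Xs = X[:, cols]
        coef, *_ = np.linalg.lstsq(Xs, Y, rcond=None)
        resid = Y - Xs @ coef
    return resid.T @ resid / n


# Scalar summaries of the error covariance matrix (assumed names).
CRITERIA = {
    "trace": np.trace,
    "det": np.linalg.det,
    # eigvalsh returns eigenvalues in ascending order, so [-1] is the largest.
    "max_eig": lambda S: np.linalg.eigvalsh(S)[-1],
}


def forward_select(X, Y, n_steps, criterion="trace"):
    """Greedy forward selection: at each step enter the variable that
    minimizes the chosen function of the estimated error covariance."""
    score = CRITERIA[criterion]
    # Center both blocks so that no explicit intercept column is needed.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    selected = []
    for _ in range(n_steps):
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        best = min(
            candidates,
            key=lambda j: score(error_covariance(X, Y, selected + [j])),
        )
        selected.append(best)
    return selected
```

With a single dependent variable the error covariance is a 1 x 1 matrix, so all three criteria reduce to SSE divided by n and the sketch behaves like univariate forward selection. Refitting every candidate from scratch is exactly the cost that a SWEEP-based implementation is meant to avoid.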

8.
This article provides a strategy to identify the existence and direction of a causal effect in a generalized nonparametric and nonseparable model identified by instrumental variables. The causal effect concerns how the outcome depends on the endogenous treatment variable. The outcome variable, treatment variable, other explanatory variables, and the instrumental variable can be essentially any combination of continuous, discrete, or “other” variables. In particular, it is not necessary to have any continuous variables, none of the variables need to have large support, and the instrument can be binary even if the corresponding endogenous treatment variable and/or outcome is continuous. The outcome can be mismeasured or interval-measured, and the endogenous treatment variable need not even be observed. The identification results are constructive, and can be empirically implemented using standard estimation results.

9.
The quality and loss of products are crucial factors separating competitive companies in the global market. Firms widely employ a loss function to measure the loss caused by a deviation of the quality variable from the target value. Monitoring this deviation from the process target value is important from the viewpoint of Taguchi's philosophy. In reality, there are many situations where the distribution of the quality variable may not be normal but skewed. This paper aims at developing a median loss (ML) control chart for monitoring quality loss under skewed distributions. Both the fixed and the variable sampling interval cases are considered. Numerical results show that the ML chart with (optimal) variable sampling intervals performs better than the ML chart in detecting small to moderate shifts in the process loss centre or in the difference of mean and target and/or variance of a process variable. The ML chart and the ML chart with variable sampling intervals also show the best performance in detecting an out-of-control process for a process quality variable with a left-skewed distribution. A numerical example illustrates the application of the proposed control chart.

10.
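Chen Ji et al., Statistical Research (《统计研究》), 2019, 36(4): 106-118
Group evaluation tends to produce rigid individual weights, because weights are assigned to individual evaluators without regard to disagreement within the group or to the "instability" of individual rating scales. To address this, a group evaluation method based on adaptive variable weights is proposed. First, the theoretical basis and basic idea of group variable weighting are described, and a variable-weight mechanism based on opinion deviation is designed from the perspective of disagreement between individual and group opinions. Second, taking a "satisfactory level of relative group consensus" as the control condition, an adaptive mechanism for group variable weighting is designed, which performs adaptive variable-weight allocation of individual weights and variable-weight aggregation without adjusting the quantitative group evaluation data. Finally, a real case is used to demonstrate the procedure, and the usability of the method is analyzed by comparing its dynamic behavior under different learning-rate values.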

11.
A harmonic new better than used in expectation (HNBUE) variable is a random variable which is dominated by an exponential distribution in the convex stochastic order. We use a recently obtained condition on stochastic equality under convex domination to derive characterizations of the exponential distribution and bounds for HNBUE variables based on the mean values of the order statistics of the variable. We apply the results to generate discrepancy measures to test whether a random variable is exponential against the alternative that it is HNBUE but not exponential.

12.
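To address the weak instrumental variable problem that may arise in estimating the returns to education, this paper uses data from the 2006 China Health and Nutrition Survey, together with various model specification tests within the instrumental variable estimation framework, to estimate the returns to education for formally employed workers in China. The tests and estimates show that the education variable is endogenous, that the years of schooling of an individual's spouse is a strong instrument for the endogenous education variable, and that the individual's quarter of birth is a weak instrument. Generalized method of moments estimates put the returns to education for formally employed workers in China at 10.1%.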
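
The instrument-strength check described in the abstract above can be illustrated with a small sketch. It is not the paper's procedure: the paper reports generalized method of moments estimates, whereas this sketch uses plain two-stage least squares with one endogenous regressor and one instrument (the just-identified case) as a simpler stand-in. The function names are hypothetical, and the first-stage F statistic with the common "F above 10" rule of thumb is an assumption about how strength might be screened, not something the abstract states.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_least_squares(y, x, z):
    """2SLS with an intercept, one endogenous regressor x, one instrument z.
    Returns [intercept, slope]."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    x_hat = Z @ ols(Z, x)                       # first stage: fitted x
    X_hat = np.column_stack([np.ones(n), x_hat])
    return ols(X_hat, y)                        # second stage


def first_stage_f(x, z):
    """F statistic for the instrument in the regression of x on [1, z]."""
    n = len(x)
    Z = np.column_stack([np.ones(n), z])
    rss_full = np.sum((x - Z @ ols(Z, x)) ** 2)
    rss_null = np.sum((x - x.mean()) ** 2)      # intercept-only model
    return (rss_null - rss_full) / (rss_full / (n - 2))
```

A small first-stage F suggests that the instrument explains little of the endogenous regressor, which is the situation the abstract reports for quarter of birth; a large value is what one would expect for a strong instrument such as the spouse's years of schooling.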

13.
In this article, we propose a new mixture model induced by the model of proportional mean residual life. Under some appropriate assumptions, it is shown that the mixing and overall variables in the model admit the positive likelihood ratio dependence structure. To see how the overall variable is affected by the stochastic variation of the mixing variable, we study some stochastic comparisons using these variables. Finally, some useful bounds for tail probability of the overall variable for large values of the mixing variable are derived.

14.
In this paper, we propose a novel Max-Relevance and Min-Common-Redundancy criterion for variable selection in linear models. Considering that the ensemble approach for variable selection has been proven to be quite effective in linear regression models, we construct a variable selection ensemble (VSE) by combining the presented stochastic correlation coefficient algorithm with a stochastic stepwise algorithm. We conduct an extensive experimental comparison of our algorithm and other methods using two simulation studies and four real-life data sets. The results confirm that the proposed VSE leads to promising improvement in variable selection and regression accuracy.

15.
Semiparametric regression models with multiple covariates are commonly encountered. When there are covariates not associated with the response variable, variable selection may lead to sparser models, more lucid interpretations and more accurate estimation. In this study, we adopt a sieve approach for the estimation of nonparametric covariate effects in semiparametric regression models. We adopt a two-step iterated penalization approach for variable selection. In the first step, a mixture of the Lasso and group Lasso penalties is employed to conduct the first-round variable selection and obtain the initial estimate. In the second step, a mixture of the weighted Lasso and weighted group Lasso penalties, with weights constructed using the initial estimate, is employed for variable selection. We show that the proposed iterated approach has the variable selection consistency property, even when the number of unknown parameters diverges with the sample size. Numerical studies, including simulation and analysis of a diabetes dataset, show satisfactory performance of the proposed approach.

16.
Selection of the important variables is one of the most important model selection problems in statistical applications. In this article, we address variable selection in finite mixtures of generalized semiparametric models. To overcome the computational burden, we introduce a class of variable selection procedures for finite mixtures of generalized semiparametric models using a penalized approach for variable selection. Estimation of the nonparametric component is carried out via multivariate kernel regression. It is shown that the new method is consistent for variable selection, and the performance of the proposed method is assessed via simulation.

17.
We propose a latent variable model for informative missingness in longitudinal studies which is an extension of the latent dropout class model. In our model, the value of the latent variable is affected by the missingness pattern and it is also used as a covariate in modeling the longitudinal response. So the latent variable links the longitudinal response and the missingness process. In our model, the latent variable is continuous instead of categorical and we assume that it is from a normal distribution. The EM algorithm is used to obtain the estimates of the parameters of interest, and Gauss–Hermite quadrature is used to approximate the integration over the latent variable. The standard errors of the parameter estimates can be obtained from the bootstrap method or from the inverse of the Fisher information matrix of the final marginal likelihood. Comparisons are made to the mixed model and complete-case analysis using a clinical trial dataset from the Weight Gain Prevention among Women (WGPW) study. We use the generalized Pearson residuals to assess the fit of the proposed latent variable model.
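
The quadrature step mentioned in the abstract above can be sketched generically. The abstract does not give the model's likelihood, so the integrand `g` here is a placeholder and `normal_expectation` is a hypothetical helper: it approximates the expectation of g(b) for a normally distributed latent variable b, which is the kind of integral that appears when a normal latent variable is integrated out of a likelihood.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def normal_expectation(g, mu=0.0, sigma=1.0, n_nodes=20):
    """Approximate E[g(b)] for b ~ N(mu, sigma^2) by Gauss-Hermite quadrature.

    hermgauss gives nodes x_i and weights w_i for the weight function
    exp(-x^2); the change of variable b = mu + sqrt(2) * sigma * x maps
    that weight onto the normal density, leaving a factor 1 / sqrt(pi).
    """
    nodes, weights = hermgauss(n_nodes)
    b = mu + np.sqrt(2.0) * sigma * nodes
    return float(np.sum(weights * g(b)) / np.sqrt(np.pi))
```

For example, `normal_expectation(lambda b: b**2, mu=1.0, sigma=2.0)` approximates the second moment, whose exact value is mu^2 + sigma^2. The rule is exact for polynomial integrands of degree up to 2 * n_nodes - 1, and in an EM setting `g` would be replaced by the conditional likelihood of one subject's data given the latent value.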

18.
A polynomial functional relationship with errors in both variables can be consistently estimated by constructing an ordinary least squares estimator for the regression coefficients, assuming hypothetically the latent true regressor variable to be known, and then adjusting for the errors. If normality of the error variables can be assumed, the estimator can be simplified considerably. Only the variance of the errors in the regressor variable and its covariance with the errors of the response variable need to be known. If the variance of the errors in the dependent variable is also known, another estimator can be constructed.

19.
In this paper we are concerned with variable selection in finite mixtures of semiparametric regression models. This task consists of model selection for the nonparametric component and variable selection for the parametric part. Thus, we encounter separate model selections for the nonparametric component of each sub-model. To overcome this computational burden, we introduce a class of variable selection procedures for finite mixtures of semiparametric regression models using a penalized approach for variable selection. It is shown that the new method is consistent for variable selection. Simulations show that the proposed method performs well, improves on previous work in this area, and requires much less computing power than existing methods.

20.
One of the standard variable selection procedures in multiple linear regression is to use a penalisation technique in least-squares (LS) analysis. In this setting, many different types of penalties have been introduced to achieve variable selection. It is well known that LS analysis is sensitive to outliers, and consequently outliers can present serious problems for the classical variable selection procedures. Since rank-based procedures have desirable robustness properties compared to LS procedures, we propose a rank-based adaptive lasso-type penalised regression estimator and a corresponding variable selection procedure for linear regression models. The proposed estimator and variable selection procedure are robust against outliers in both response and predictor space. Furthermore, since rank regression can yield unstable estimators in the presence of multicollinearity, in order to provide inference that is robust against multicollinearity, we adjust the penalty term in the adaptive lasso function by incorporating the standard errors of the rank estimator. The theoretical properties of the proposed procedures are established and their performances are investigated by means of simulations. Finally, the estimator and variable selection procedure are applied to the Plasma Beta-Carotene Level data set.
