Similar literature
Found 20 similar articles (search time: 0 ms)
1.
We propose a new adaptive L1 penalized quantile regression estimator for high-dimensional sparse regression models with heterogeneous error sequences. We show that under weaker conditions compared with alternative procedures, the adaptive L1 quantile regression selects the true underlying model with probability converging to one, and the unique estimates of nonzero coefficients it provides have the same asymptotic normal distribution as the quantile estimator which uses only the covariates with non-zero impact on the response. Thus, the adaptive L1 quantile regression enjoys oracle properties. We propose a completely data-driven choice of the penalty level λ_n, which ensures good performance of the adaptive L1 quantile regression. Extensive Monte Carlo simulation studies have been conducted to demonstrate the finite sample performance of the proposed method.
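As a rough illustration of the kind of objective an adaptive L1 penalized quantile regression minimizes, the sketch below combines the quantile check (pinball) loss with a weighted L1 penalty. The function names, the averaging over observations, and the weight rule 1/|initial estimate| are assumptions made for illustration; the abstract does not give the exact form used by the authors.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss rho_tau(r) = r * (tau - 1[r < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def adaptive_weights(beta_init, eps=1e-6):
    """Hypothetical adaptive weights: coefficients with small initial
    estimates receive large penalties. eps avoids division by zero."""
    return [1.0 / (abs(b) + eps) for b in beta_init]


def adaptive_l1_objective(y, X, beta, tau, lam, weights):
    """Average check loss plus lam times the weighted L1 norm of beta.

    y: list of responses, X: list of covariate rows, beta: coefficients.
    The scaling of the penalty is an assumption, not taken from the abstract.
    """
    fit = 0.0
    for yi, xi in zip(y, X):
        pred = sum(b * x for b, x in zip(beta, xi))
        fit += check_loss(yi - pred, tau)
    penalty = sum(w * abs(b) for w, b in zip(weights, beta))
    return fit / len(y) + lam * penalty
```

With `tau = 0.5` the check loss is half the absolute error, so the fit term corresponds to median regression.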

2.
In many financial applications, Poisson mixture regression models are commonly used to analyze heterogeneous count data. When fitting these models, the observed counts are supposed to come from two or more subpopulations and parameter estimation is typically performed by means of maximum likelihood via the Expectation–Maximization algorithm. In this study, we discuss briefly the procedure for fitting Poisson mixture regression models by means of maximum likelihood, the model selection and goodness-of-fit tests. These models are applied to a real data set for credit-scoring purposes. We aim to reveal the impact of demographic and financial variables in creating different groups of clients and to predict the group to which each client belongs, as well as the client's expected number of defaulted payments. The fitted model indicates that the population consists of three groups, in contrast to the traditional good versus bad categorization approach of credit-scoring systems.
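The EM idea mentioned above can be sketched for the simplest case, a Poisson mixture without covariates. This is a simplification chosen for illustration: the study fits mixture regression models, where each component mean depends on covariates, and the function names and starting values here are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability P(Y = k) for lam > 0, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def log_likelihood(counts, lams, props):
    """Observed-data log-likelihood of a finite Poisson mixture."""
    return sum(
        math.log(sum(p * poisson_pmf(y, l) for p, l in zip(props, lams)))
        for y in counts
    )


def em_poisson_mixture(counts, lams, props, n_iter=200):
    """EM for a Poisson mixture with component means lams and proportions props."""
    lams, props = list(lams), list(props)
    n = len(counts)
    for _ in range(n_iter):
        # E-step: posterior probability that each count belongs to each component.
        resp = []
        for y in counts:
            joint = [p * poisson_pmf(y, l) for p, l in zip(props, lams)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: update mixing proportions and responsibility-weighted means.
        for g in range(len(lams)):
            mass = sum(r[g] for r in resp)
            props[g] = mass / n
            lams[g] = sum(r[g] * y for r, y in zip(resp, counts)) / mass
    return lams, props
```

A call such as `em_poisson_mixture(counts, [1.0, 8.0], [0.5, 0.5])` returns the fitted means and mixing proportions; each observation can then be assigned to the component with the largest posterior probability.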

3.
4.
Existing literature on quantile regression for panel data models with individual effects advocates the application of penalization to reduce the dynamic panel bias and increase the efficiency of the estimators. In this paper, we consider penalized quantile regression for dynamic panel data with random effects from a Bayesian perspective, where the penalty involves an adaptive Lasso shrinkage of the random effects. We also address the role of initial conditions in dynamic panel data models, emphasizing joint modeling of start-up and subsequent responses. For posterior inference, an efficient Gibbs sampler is developed to simulate the parameters from the posterior distributions. Through simulation studies and analysis of a real data set, we assess the performance of the proposed Bayesian method.

5.
In many regression problems, predictors are naturally grouped. For example, when a set of dummy variables is used to represent categorical variables, or a set of basis functions of continuous variables is included in the predictor set, it is important to carry out a feature selection both at the group level and at individual variable levels within the group simultaneously. To incorporate the group and variables within-group information into a regularized model fitting, several regularization methods have been developed, including the Cox regression and the conditional mean regression. Complementary to earlier works, the simultaneous group and within-group variables selection method is examined in quantile regression. We propose a hierarchically penalized quantile regression, and show that the hierarchical penalty possesses the oracle property in quantile regression, as well as in the Cox regression. The proposed method is evaluated through simulation studies and a real data application.

6.
7.
Errors in measurement frequently occur in observing responses. If case–control data are based on certain reported responses, which may not be the true responses, then we have contaminated case–control data. In this paper, we first show that the ordinary logistic regression analysis based on contaminated case–control data can lead to seriously biased conclusions. This can be concluded from the results of a theoretical argument, one example, and two simulation studies. We next derive the semiparametric maximum likelihood estimate (MLE) of the risk parameter of a logistic regression model when there is a validation subsample. The asymptotic normality of the semiparametric MLE will be shown along with a consistent estimate of the asymptotic variance. Our example and two simulation studies show these estimates to have reasonable performance under finite sample situations.

8.
We study variable selection in quantile regression with multiple responses. Instead of applying conventional penalized quantile regression to each response separately, it is desirable to solve them simultaneously when the sparsity patterns of the regression coefficients for different responses are similar, which is often the case in practice. In this paper, we propose employing a hierarchical penalty that enables us to detect a common sparsity pattern shared between different responses as well as additional sparsity patterns within the selected variables. We establish the oracle property of the proposed method and demonstrate that it offers better performance than existing approaches.

9.
To perform variable selection in expectile regression, we introduce the elastic-net penalty into expectile regression and propose an elastic-net penalized expectile regression (ER-EN) model. We then adopt the semismooth Newton coordinate descent (SNCD) algorithm to solve the proposed ER-EN model in high-dimensional settings. The advantages of the ER-EN model are illustrated via extensive Monte Carlo simulations. The numerical results show that the ER-EN model outperforms the elastic-net penalized least squares regression (LSR-EN), the elastic-net penalized Huber regression (HR-EN), the elastic-net penalized quantile regression (QR-EN) and conventional expectile regression (ER) in terms of variable selection and predictive ability, especially for asymmetric distributions. We also apply the ER-EN model to two real-world applications: relative location of CT slices on the axial axis and metabolism of tacrolimus (Tac) drug. Empirical results also demonstrate the superiority of the ER-EN model.
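To make the expectile idea concrete, the sketch below computes a sample expectile by minimizing the asymmetric squared loss with a simple reweighted-mean iteration. It covers only the unpenalized, intercept-only case; it is not the SNCD algorithm or the ER-EN model from the abstract, and the function name and iteration scheme are assumptions for illustration.

```python
def expectile(values, tau, n_iter=100, tol=1e-12):
    """Sample tau-expectile: the minimizer over mu of
    sum_i |tau - 1[y_i < mu]| * (y_i - mu)^2.

    Observations above the current estimate get weight tau and the others
    weight 1 - tau; the estimate is the weighted mean, iterated until it
    stops changing.
    """
    mu = sum(values) / len(values)  # tau = 0.5 gives the ordinary mean
    for _ in range(n_iter):
        weights = [tau if v > mu else 1.0 - tau for v in values]
        new_mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu
```

For a right-skewed sample, expectiles with `tau` above 0.5 are pulled toward the large observations more strongly than the mean, which is one reason expectile-based losses respond to asymmetry in the data.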

10.
Count data have emerged in many applied research areas. In recent years, there has been a considerable interest in models for count data. In modelling such data, it is common to face a large frequency of zeroes. The data are regarded as zero-inflated when the frequency of observed zeroes is larger than what is expected from a theoretical distribution such as the Poisson distribution, the standard model for analysing count data. Data analysis using the simple Poisson model may lead to over-dispersion. Several classes of mixture models have been proposed for handling zero-inflated data, but they do not apply when inflated counts occur at some other point in addition to zero. For these cases, a doubly-inflated Poisson model has been suggested, which can only be used for cross-sectional data and cannot account for correlations between observations. However, correlated count data have many applications, especially in the health and medical fields. The present study aims to introduce a doubly-inflated Poisson model with random effects for correlated doubly-inflated data. The performance of the proposed method is then shown via different simulation scenarios. Finally, the proposed model is applied to a dental study.
KEYWORDS: Count data, doubly-inflated, Poisson regression, zero-inflated, correlated data
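A minimal sketch of a doubly-inflated Poisson probability function follows, assuming the common mixture form in which extra probability mass `p0` is placed at zero and `pk` at a second inflated count `k`. The parameter names are hypothetical, and the random effects used in the study for correlated data are not included.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability P(Y = y) for lam > 0."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def doubly_inflated_poisson_pmf(y, lam, p0, pk, k):
    """P(Y = y) under an assumed doubly-inflated Poisson mixture.

    With probability p0 the count is a structural zero, with probability pk
    it is a structural k, and otherwise it is drawn from Poisson(lam).
    Setting pk = 0 gives the zero-inflated Poisson model.
    """
    prob = (1.0 - p0 - pk) * poisson_pmf(y, lam)
    if y == 0:
        prob += p0
    if y == k:
        prob += pk
    return prob
```

Comparing `doubly_inflated_poisson_pmf(0, lam, p0, pk, k)` with `poisson_pmf(0, lam)` shows how much zero probability the inflation adds relative to the plain Poisson model.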

11.
The penalized logistic regression (PLR) is a powerful statistical tool for classification. It has been commonly used in many practical problems. Despite its success, since the loss function of the PLR is unbounded, resulting classifiers can be sensitive to outliers. To build more robust classifiers, we propose the robust PLR (RPLR) which uses truncated logistic loss functions, and suggest three schemes to estimate conditional class probabilities. Connections of the RPLR with some other existing work on robust logistic regression are discussed. Our theoretical results indicate that the RPLR is Fisher consistent and more robust to outliers. Moreover, we develop estimated generalized approximate cross validation (EGACV) for the tuning parameter selection. Through numerical examples, we demonstrate that truncating the loss function indeed yields better performance in terms of classification accuracy and class probability estimation.
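The effect of truncating an unbounded loss can be illustrated as follows. The sketch assumes the truncated loss is the logistic loss capped at its value at a threshold `s <= 0`; the abstract does not state the exact truncation, so this form and the names used are assumptions.

```python
import math


def logistic_loss(u):
    """Logistic loss log(1 + exp(-u)) of the margin u = y * f(x), with y in {-1, +1}.

    Written in a numerically stable form so that very negative margins
    do not overflow.
    """
    if u >= 0:
        return math.log1p(math.exp(-u))
    return -u + math.log1p(math.exp(u))


def truncated_logistic_loss(u, s=-1.0):
    """Logistic loss capped at its value at margin s (assumed form).

    Points with margin below s, such as gross outliers on the wrong side of
    the boundary, contribute at most logistic_loss(s).
    """
    return min(logistic_loss(u), logistic_loss(s))
```

A badly misclassified point with margin -1000 contributes about 1000 to the ordinary logistic loss but only `logistic_loss(-1.0)` (about 1.31) after truncation, which is the sense in which the truncated loss is bounded.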

12.
13.
This paper studies a fast computational algorithm for variable selection on high-dimensional recurrent event data. Based on the lasso penalized partial likelihood function for the response process of recurrent event data, a coordinate descent algorithm is used to accelerate the estimation of regression coefficients. This algorithm is capable of selecting important predictors for underdetermined problems where the number of predictors far exceeds the number of cases. The selection strength is controlled by a tuning constant that is determined by a generalized cross-validation method. Our numerical experiments on simulated and real data demonstrate the good performance of penalized regression in model building for recurrent event data in high-dimensional settings.
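The coordinate descent idea can be sketched on a simpler problem. The code below applies it to lasso penalized least squares, not to the penalized partial likelihood for recurrent events used in the paper; the least-squares objective is swapped in because its coordinate update has a closed form, and all names are hypothetical.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: shrink z toward zero by gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize (1 / (2n)) * ||y - X beta||^2 + lam * ||beta||_1
    by cycling through the coordinates of beta.

    X is a list of rows, y a list of responses.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta for the current beta
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            norm_sq = sum(c * c for c in col)
            if norm_sq == 0.0:
                continue
            # Inner product of column j with the partial residual,
            # i.e. the residual with coordinate j's contribution added back.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid))
            new_bj = soft_threshold(rho / n, lam) / (norm_sq / n)
            delta = new_bj - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new_bj
    return beta
```

Larger values of `lam` set more coefficients exactly to zero, which is how the tuning constant controls selection strength.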

14.
We present a bivariate regression model for count data that allows for positive as well as negative correlation of the response variables. The covariance structure is based on the Sarmanov distribution and consists of a product of generalised Poisson marginals and a factor that depends on particular functions of the response variables. The closed form of the probability function is derived by means of the moment-generating function. The model is applied to a large real dataset on health care demand. Its performance is compared with alternative models presented in the literature. We find that our model is significantly better than or at least equivalent to the benchmark models. It gives insights into influences on the variance of the response variables.

15.
We apply the univariate sliced inverse regression to survival data. Our approach is different from the other papers on this subject. The right-censored observations are taken into account during the slicing of the survival times by assigning each of them, with equal weight, to all of the slices with longer survival. We test this method with different distributions for the two main survival data models, the accelerated lifetime model and Cox’s proportional hazards model. In both cases and under different conditions of sparsity, sample size and dimension of parameters, this non-parametric approach finds the data structure and can be viewed as a variable selector.

16.
Multivariate Poisson regression with covariance structure   Cited by: 1 (self-citations: 0, citations by others: 1)
In recent years the applications of multivariate Poisson models have increased, mainly because of the gradual increase in computer performance. The multivariate Poisson model used in practice is based on a common covariance term for all the pairs of variables. This is rather restrictive and does not allow for modelling the covariance structure of the data in a flexible way. In this paper we propose inference for a multivariate Poisson model with larger structure, i.e. different covariance for each pair of variables. Maximum likelihood estimation, as well as Bayesian estimation methods are proposed. Both are based on a data augmentation scheme that reflects the multivariate reduction derivation of the joint probability function. In order to enlarge the applicability of the model we allow for covariates in the specification of both the mean and the covariance parameters. Extension to models with complete structure with many multi-way covariance terms is discussed. The method is demonstrated by analyzing a real life data set.
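The multivariate reduction construction can be illustrated in the bivariate case: two counts share a common latent Poisson term, and that shared term determines their covariance. The sketch below assumes the standard common-shock form X1 = Y1 + Y0, X2 = Y2 + Y0 with independent Poisson Y's; the function names and the grid truncation are choices made for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability P(Y = k) for lam > 0."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def bivariate_poisson_pmf(x1, x2, lam1, lam2, lam0):
    """P(X1 = x1, X2 = x2) for X1 = Y1 + Y0 and X2 = Y2 + Y0, where
    Y1, Y2, Y0 are independent Poisson with means lam1, lam2, lam0.

    The sum runs over the possible values s of the shared term Y0.
    """
    return sum(
        poisson_pmf(x1 - s, lam1) * poisson_pmf(x2 - s, lam2) * poisson_pmf(s, lam0)
        for s in range(min(x1, x2) + 1)
    )


def covariance_on_grid(lam1, lam2, lam0, upper=40):
    """Numerical Cov(X1, X2) from the joint pmf, truncated at `upper`."""
    e1 = e2 = e12 = 0.0
    for x1 in range(upper + 1):
        for x2 in range(upper + 1):
            p = bivariate_poisson_pmf(x1, x2, lam1, lam2, lam0)
            e1 += x1 * p
            e2 += x2 * p
            e12 += x1 * x2 * p
    return e12 - e1 * e2
```

In this construction the covariance equals `lam0`, the mean of the shared term, so it cannot be negative. With more than two variables, a single shared term forces the same covariance on every pair, which is the restriction the abstract describes.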

17.
Count data are very often analyzed under the assumption of a Poisson model [(Agresti, A., 1996. An Introduction to Categorical Data Analysis. Wiley, New York; Generalized Linear Models, second ed. Chapman & Hall, New York)]. However, the derived inference is generally erroneous if the underlying distribution is not Poisson (Biometrika 70, 269–274). A parametric robust regression approach is proposed for the analysis of count data. More specifically it will be demonstrated that the Poisson regression model could be properly adjusted to become asymptotically valid for inference about regression parameters, even if the Poisson assumption fails. With large samples the novel robust methodology provides legitimate likelihood functions for regression parameters, so long as the true underlying distributions have finite second moments. Adjustments that robustify the Poisson regression will be given, respectively, under log link and identity link functions. Simulation studies will be used to demonstrate the efficacy of the robust Poisson regression model.

18.
Logistic regression is estimated by maximizing the log-likelihood objective function formulated under the assumption of maximizing the overall accuracy. That does not apply to imbalanced data. The resulting models tend to be biased towards the majority class (i.e. non-event), which can bring great loss in practice. One strategy for mitigating such bias is to penalize the misclassification costs of observations differently in the log-likelihood function. Existing solutions require either difficult hyperparameter estimation or high computational complexity. We propose a novel penalized log-likelihood function by including penalty weights as decision variables for observations in the minority class (i.e. event) and learning them from data along with model coefficients. In the experiments, the proposed logistic regression model is compared with the existing ones on the statistics of area under receiver operating characteristics (ROC) curve from 10 public datasets and 16 simulated datasets, as well as the training time. A detailed analysis is conducted on an imbalanced credit dataset to examine the estimated probability distributions, additional performance measurements (i.e. type I error and type II error) and model coefficients. The results demonstrate that both the discrimination ability and computation efficiency of logistic regression models are improved using the proposed log-likelihood function as the learning objective.

19.
In this paper, we consider the distribution of life length of a series system with random number of components, say Z. Considering the distribution of Z as generalized Poisson, an exponential-generalized Poisson (EGP) distribution is developed. The generalized Poisson distribution is a generalization of the Poisson distribution having one extra parameter. The structural properties of the resulting distribution are presented and the maximum likelihood estimation of the parameters is investigated. Extensive simulation studies are carried out to study the performance of the estimates. The score test is developed to test the importance of the extra parameter. For illustration, two real data sets are examined and it is shown that the EGP model, presented here, fits better than the exponential–Poisson distribution.

20.
In life-testing and survival analysis, sometimes the components are arranged in series or parallel system and the number of components is initially unknown. Thus, the number of components, say Z, is considered as random with an appropriate probability mass function. In this paper, we model the survival data with baseline distribution as Weibull and the distribution of Z as generalized Poisson, giving rise to a model with four parameters that accommodates increasing, decreasing, bathtub and upside-down bathtub failure rates. Two examples are provided and the maximum-likelihood estimation of the parameters is studied. Rao's score test is developed to compare the results with the exponential Poisson model studied by Kus [17] and the exponential-generalized Poisson distribution with baseline distribution as exponential and the distribution of Z as generalized Poisson. Simulation studies are carried out to examine the performance of the estimates.
