Similar literature
Found 20 similar documents; search took 31 ms
1.
The multinomial logit model (MNL) is one of the most frequently used statistical models in marketing applications. It allows one to relate an unordered categorical response variable, for example representing the choice of a brand, to a vector of covariates such as the price of the brand or variables characterising the consumer. In its classical form, all covariates enter in strictly parametric, linear form into the utility function of the MNL model. In this paper, we introduce semiparametric extensions, where smooth effects of continuous covariates are modelled by penalised splines. A mixed model representation of these penalised splines is employed to obtain estimates of the corresponding smoothing parameters, leading to a fully automated estimation procedure. To validate semiparametric models against parametric models, we utilise different scoring rules as well as predicted market share and compare parametric and semiparametric approaches for a number of brand choice data sets.
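As a rough illustration of the classical parametric form described in this abstract, the sketch below computes MNL choice probabilities from a utility that is linear in the covariates. It is not the authors' code: the function name, the two covariates (a brand dummy and a price) and the coefficient values are hypothetical, and the penalised-spline extension is not shown.

```python
import math


def mnl_choice_probabilities(covariates, beta):
    """Return MNL choice probabilities for one choice occasion.

    covariates: one row of covariate values per alternative (e.g. per brand).
    beta: coefficient vector shared by all alternatives (hypothetical values).
    """
    # Linear utility of each alternative: u_j = x_j' beta.
    utilities = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    # Subtract the maximum before exponentiating for numerical stability.
    u_max = max(utilities)
    weights = [math.exp(u - u_max) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: three brands, covariates = (brand dummy, price).
covariates = [[1.0, 2.5], [0.0, 3.0], [0.0, 2.0]]
beta = [0.5, -1.0]
probs = mnl_choice_probabilities(covariates, beta)
```

In this made-up example the first and third brands have the same utility, so they receive equal choice probabilities, while the more expensive second brand is chosen less often.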

2.
In this article, we propose semiparametric methods to estimate the cumulative incidence function of two dependent competing risks for left-truncated and right-censored data. The proposed method is based on work by Huang and Wang (1995). We extend the previous model by allowing for a general parametric truncation distribution and a third competing risk before recruitment. Based on work by Vardi (1989), several iterative algorithms are proposed to obtain the semiparametric estimates of cumulative incidence functions. The asymptotic properties of the semiparametric estimators are derived. Simulation results show that a semiparametric approach, assuming the parametric truncation distribution is correctly specified, produces estimates with smaller mean squared error than those obtained in a fully nonparametric model.

3.
There are a variety of economic areas, such as studies of employment duration and of the durability of capital goods, in which data on important variables typically are censored. The standard techniques for estimating a model from censored data require the distributions of unobservable random components of the model to be specified a priori up to a finite set of parameters, and misspecification of these distributions usually leads to inconsistent parameter estimates. However, economic theory rarely gives guidance about distributions and the standard estimation techniques do not provide convenient methods for identifying distributions from censored data. Recently, several distribution-free or semiparametric methods for estimating censored regression models have been developed. This paper presents the results of using two such methods to estimate a model of employment duration. The paper reports the operating characteristics of the semiparametric estimators and compares the semiparametric estimates with those obtained from a standard parametric model.

4.
In this paper, we propose a new semiparametric heteroscedastic regression model allowing for positive and negative skewness and bimodal shapes using the B-spline basis for nonlinear effects. The proposed distribution is based on the generalized additive models for location, scale and shape framework in order to model any or all parameters of the distribution using parametric linear and/or nonparametric smooth functions of explanatory variables. We motivate the new model by means of Monte Carlo simulations, which show that ignoring the skewness and bimodality of the random errors in semiparametric regression models may introduce biases in the parameter estimates and/or in the estimation of the associated variability measures. An iterative estimation process and some diagnostic methods are investigated. Applications to two real data sets are presented and the method is compared to the usual regression methods.

5.
In the parametric regression model, the problem of missing covariates under the missing-at-random assumption is considered. It is often desirable to use flexible parametric or semiparametric models for the covariate distribution, which can reduce a potential misspecification problem. Recently, a completely nonparametric approach was developed by [H.Y. Chen, Nonparametric and semiparametric models for missing covariates in parameter regression, J. Amer. Statist. Assoc. 99 (2004), pp. 1176–1189; Z. Zhang and H.E. Rockette, On maximum likelihood estimation in parametric regression with missing covariates, J. Statist. Plann. Inference 47 (2005), pp. 206–223]. Although it does not require a model for the covariate distribution or the missing data mechanism, the proposed method assumes that the covariate distribution is supported only by observed values. Consequently, their estimator is a restricted maximum likelihood estimator (MLE) rather than the global MLE. In this article, we show that the restricted semiparametric MLE can be very misleading in some cases. We discuss why this problem occurs and suggest an algorithm to obtain the global MLE. Then, we assess the performance of the proposed method via some simulation experiments.

6.
We consider marginal semiparametric partially linear models for longitudinal/clustered data and propose an estimation procedure based on a spline approximation of the non-parametric part of the model and an extension of the parametric marginal generalized estimating equations (GEE). Our estimates of both the parametric and non-parametric parts of the model have properties parallel to those of parametric GEE, that is, the estimates are efficient if the covariance structure is correctly specified and they are still consistent and asymptotically normal even if the covariance structure is misspecified. By showing that our estimate achieves the semiparametric information bound, we actually establish the efficiency of estimating the parametric part of the model in a stronger sense than what is typically considered for GEE. The semiparametric efficiency of our estimate is obtained by assuming only conditional moment restrictions instead of the strict multivariate Gaussian error assumption.

7.
We investigate the semiparametric smooth coefficient stochastic frontier model for panel data in which the distribution of the composite error term is assumed to be of known form but depends on some environmental variables. We propose multi-step estimators for the smooth coefficient functions as well as the parameters of the distribution of the composite error term and obtain their asymptotic properties. The Monte Carlo study demonstrates that the proposed estimators perform well in finite samples. We also consider an application and perform a model specification test, construct confidence intervals, and estimate efficiency scores that depend on some environmental variables. The application uses a panel data set on 451 large U.S. firms to explore the effects of computerization on productivity. Results show that two popular parametric models used in the stochastic frontier literature are likely to be misspecified. Compared with the parametric estimates, our semiparametric model shows a positive and larger overall effect of computer capital on productivity. The efficiency levels, however, were not much different among the models. Supplementary materials for this article are available online.

8.
This paper is concerned with semiparametric efficient estimation of a generalized partially linear varying coefficient model. The model studied in this paper is very flexible, accommodating various nonlinear relations between the response variable and a set of predictor variables. It is a structured regression model and is particularly useful in dealing with a discrete response variable. We apply the smooth backfitting technique to estimate the nonparametric part of the model and employ the profiling approach to obtain a semiparametric efficient estimator of the parametric part.

9.
A new method of modeling coronary artery calcium (CAC) is needed in order to properly understand the probability of onset and growth of CAC. CAC remains a controversial indicator of cardiovascular disease (CVD) risk, but this may be due to ill-equipped methods of specifying CAC during the analysis phase of studies reporting an analysis where CAC is the primary outcome. The modern method of two-part latent growth modeling may represent a strong alternative to the myriad of existing methods for modeling CAC. We provide a brief overview of existing methods of analysis used for CAC before introducing the general latent growth curve model, how it extends into a two-part (semicontinuous) growth model, and how the ubiquitous problem of missing data can be effectively handled. We then present an example of how to model CAC using this framework. We demonstrate that utilizing this type of modeling strategy can result in traditional predictors of CAC (e.g. age, gender, and high-density lipoprotein cholesterol) exerting a different impact on the two different, yet simultaneous, operationalizations of CAC. This method of analyzing CAC could inform future analyses of CAC and inform subsequent discussions about the nature of its potential to inform long-term CVD risk and heart events.

10.
We propose goodness-of-fit tests for testing generalized linear models and semiparametric regression models against smooth alternatives. The focus is on models having both continuous and factorial covariates. As a smooth extension of a parametric or semiparametric model we use generalized varying-coefficient models as proposed by Hastie and Tibshirani. A likelihood ratio statistic is used for testing. Asymptotic expansions allow us to write the estimates as linear smoothers, which in turn guarantees simple and fast bootstrapping of the test statistic. The test is shown to have √n-power, but in contrast with parametric tests it is powerful against smooth alternatives in general.

11.
Female labor participation models have usually been studied through probit and logit specifications. Little attention has been paid to verifying the assumptions that are used in this sort of model, basically distributional assumptions and homoskedasticity. In this paper we apply semiparametric methods in order to test these hypotheses. We also estimate a Spanish female labor participation model using both parametric and semiparametric approaches. The parametric model includes fixed and random coefficients probit specifications. The estimation procedures are parametric maximum likelihood for both probit and logit models, and semiparametric quasi maximum likelihood following Klein and Spady (1993). The results depend crucially on the assumed model.

12.
We consider large sample inference in a semiparametric logistic/proportional-hazards mixture model. This model has been proposed to model survival data where there exists a positive portion of subjects in the population who are not susceptible to the event under consideration. Previous studies of the logistic/proportional-hazards mixture model have focused on developing point estimation procedures for the unknown parameters. This paper studies large sample inferences based on the semiparametric maximum likelihood estimator. Specifically, we establish existence, consistency and asymptotic normality results for the semiparametric maximum likelihood estimator. We also derive consistent variance estimates for both the parametric and non-parametric components. The results provide a theoretical foundation for making large sample inference under the logistic/proportional-hazards mixture model.

13.
Nonparametric regression models are often used to check or suggest a parametric model. Several methods have been proposed to test the hypothesis of a parametric regression function against an alternative smoothing spline model. Some tests such as the locally most powerful (LMP) test by Cox et al. (Cox, D., Koh, E., Wahba, G. and Yandell, B. (1988). Testing the (parametric) null model hypothesis in (semiparametric) partial and generalized spline models. Ann. Stat., 16, 113–119.), the generalized maximum likelihood (GML) ratio test and the generalized cross validation (GCV) test by Wahba (Wahba, G. (1990). Spline models for observational data. CBMS-NSF Regional Conference Series in Applied Mathematics, SIAM.) were developed from the corresponding Bayesian models. Their frequentist properties have not been studied. We conduct simulations to evaluate and compare finite sample performances. Simulation results show that the performances of these tests depend on the shape of the true function. The LMP and GML tests are more powerful for low frequency functions while the GCV test is more powerful for high frequency functions. For all test statistics, distributions under the null hypothesis are complicated. Computationally intensive Monte Carlo methods can be used to calculate null distributions. We also propose approximations to these null distributions and evaluate their performances by simulations.

14.
Qunfang Xu 《Statistics》2017,51(6):1280-1303
In this paper, semiparametric modelling for longitudinal data with an unstructured error process is considered. We propose a partially linear additive regression model for longitudinal data in which within-subject variances and covariances of the error process are described by unknown univariate and bivariate functions, respectively. We provide an estimating approach in which polynomial splines are used to approximate the additive nonparametric components and the within-subject variance and covariance functions are estimated nonparametrically. Both the asymptotic normality of the resulting parametric component estimators and optimal convergence rate of the resulting nonparametric component estimators are established. In addition, we develop a variable selection procedure to identify significant parametric and nonparametric components simultaneously. We show that the proposed SCAD penalty-based estimators of non-zero components have an oracle property. Some simulation studies are conducted to examine the finite-sample performance of the proposed estimation and variable selection procedures. A real data set is also analysed to demonstrate the usefulness of the proposed method.

15.
Jing Yang  Fang Lu  Hu Yang 《Statistics》2017,51(6):1179-1199
In this paper, we develop a new estimation procedure based on quantile regression for semiparametric partially linear varying-coefficient models. The proposed estimation approach is empirically shown to be much more efficient than the popular least squares estimation method for non-normal error distributions, and to lose almost no efficiency for normal errors. Asymptotic normality of the proposed estimators for both the parametric and nonparametric parts is established. To achieve sparsity when there exist irrelevant variables in the model, two variable selection procedures based on adaptive penalty are developed to select important parametric covariates as well as significant nonparametric functions. Moreover, both variable selection procedures are demonstrated to enjoy the oracle property under some regularity conditions. Some Monte Carlo simulations are conducted to assess the finite sample performance of the proposed estimators, and a real-data example is used to illustrate the application of the proposed methods.

16.
This article proposes a semiparametric nonlinear reproductive dispersion model (SNRDM), which is an extension of the nonlinear reproductive dispersion model and the semiparametric regression model. Maximum penalized likelihood estimators (MPLEs) of unknown parameters and nonparametric functions in SNRDMs are presented. Some novel diagnostic statistics, such as Cook distance and difference deviance for parametric and nonparametric parts, are developed to identify influential observations in SNRDMs on the basis of the case-deletion method, and some formulae readily computed with the MPLEs algorithm for diagnostic measures are given. The equivalency of case-deletion models and mean-shift outlier models in SNRDM is investigated. A simulation study and a real example are used to illustrate the proposed diagnostic measures.

17.
Zero-inflated models are commonly used for modeling count and continuous data with extra zeros. Inflations at one point or two points apart from zero for modeling continuous data have been discussed less than zero inflation. In this article, inflation at an arbitrary point α as a semicontinuous distribution is presented and the mean imputation for a continuous response is discussed as a cause of having semicontinuous data. Also, inflation at two points and generally at k arbitrary points and their relation to cell-mean imputation in the mixture of continuous distributions are studied. To analyze the imputed data, a mixture of semicontinuous distributions is used. The effects of covariates on the dependent variable in a mixture of k semicontinuous distributions with inflation at k points are also investigated. In order to find the parameter estimates, the expectation–maximization (EM) algorithm is used. Using real data from the Iranian Households Income and Expenditure Survey (IHIES), it is shown how to obtain a proper estimate of the population variance when continuous missing at random responses are mean imputed.

18.
A semiparametric approach to model skewed/heteroscedastic regression data is discussed. We work with a semiparametric transform-both-sides regression model, which contains a parametric regression function and a nonparametric transformation. This model is adequate when the relationship between the median response and the explanatory variable has been specified by a theoretical result or a previous empirical study. The transform-both-sides model with a parametric transformation has been studied extensively and applied successfully to a number of data sets. Allowing a nonparametric transformation function increases the flexibility of the model. In this article, we estimate the nonparametric transformation function by the conditional kernel density approach developed by Wang and Ruppert (1995), and then use a pseudo-maximum likelihood estimator to estimate the regression parameters. This estimate of the regression parameters has not been studied previously. In this article, the asymptotic distribution of this pseudo-MLE is derived. We also show that when σ, the standard deviation of the error, goes to zero (small σ asymptotics), this estimator is adaptive. Adaptive means that the regression parameters are estimated as precisely as when the transformation is known exactly. A similar result holds in the parametric approaches of Carroll and Ruppert (1984) and Ruppert and Aldershof (1989). Simulated and real examples are provided to illustrate the performance of the proposed estimator for finite sample size.

19.
Doubly censored failure time data occur in many areas including demographical studies, epidemiology studies, medical studies and tumorigenicity experiments, and correspondingly some inference procedures have been developed in the literature (Biometrika, 91, 2004, 277; Comput. Statist. Data Anal., 57, 2013, 41; J. Comput. Graph. Statist., 13, 2004, 123). In this paper, we discuss regression analysis of such data under a class of flexible semiparametric transformation models, which includes some commonly used models for doubly censored data as special cases. For inference, the non‐parametric maximum likelihood estimation will be developed and in particular, we will present a novel expectation–maximization algorithm with the use of subject‐specific independent Poisson variables. In addition, the asymptotic properties of the proposed estimators are established and an extensive simulation study suggests that the proposed methodology works well for practical situations. The method is applied to an AIDS study.

20.
In this paper, we consider improved estimating equations for semiparametric partial linear models (PLM) for longitudinal data, or clustered data in general. We approximate the non‐parametric function in the PLM by a regression spline, and utilize quadratic inference functions (QIF) in the estimating equations to achieve a more efficient estimation of the parametric part in the model, even when the correlation structure is misspecified. Moreover, we construct a test which is an analogue to the likelihood ratio inference function for inferring the parametric component in the model. The proposed methods perform well in simulation studies and real data analysis conducted in this paper.
