Similar articles
 Found 20 similar articles; search took 15 ms
1.
Coefficient estimation in linear regression models with missing data is routinely carried out in the mean regression framework. However, the mean regression theory breaks down if the error variance is infinite. In addition, correct specification of the likelihood function for existing imputation approaches is often challenging in practice, especially for skewed data. In this paper, we develop a novel composite quantile regression and a weighted quantile average estimation procedure for parameter estimation in linear regression models when some responses are missing at random. Instead of imputing the missing response by randomly drawing from its conditional distribution, we propose to impute both missing and observed responses by their estimated conditional quantiles given the observed data and to use the parametrically estimated propensity scores to weight the check functions that define a regression parameter. Both estimation procedures are resistant to heavy-tailed errors or outliers in the response and achieve good robustness and efficiency. Moreover, we propose adaptive penalization methods to simultaneously select significant variables and estimate unknown parameters. Asymptotic properties of the proposed estimators are carefully investigated. An efficient algorithm is developed for fast implementation of the proposed methodologies. We also discuss a model selection criterion, based on an ICQ-type statistic, to select the penalty parameters. The performance of the proposed methods is illustrated via simulated and real data sets.

2.
This paper presents results from a simulation study motivated by a recent study of the relationships between ambient levels of air pollution and human health in the community of Prince George, British Columbia. The simulation study was designed to evaluate the performance of methods based on overdispersed Poisson regression models for the analysis of series of count data. Aspects addressed include estimation of the dispersion parameter, estimation of regression coefficients and their standard errors, and the performance of model selection tests. The effects of varying amounts of overdispersion and differing underlying variance structure on this performance were of particular interest. This study is related to work reported by Breslow (1990), although the context is quite different. Preliminary work led to the conclusion that estimation of the dispersion parameter should be based on Pearson's chi-square statistic rather than the Poisson deviance. Regression coefficients are well estimated, even in the presence of substantial overdispersion and when the model for the variance function is incorrectly specified. Despite potentially greater variability, the empirical estimator of the covariance matrix is preferred because the model-based estimator is unreliable in general. When the model for the variance function is incorrect, model-based test statistics may perform poorly, in sharp contrast to empirical test statistics, which performed very well in this study.
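The Pearson-based dispersion estimate mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a Poisson-type variance function V(mu) = mu, and the function name `pearson_dispersion` and its arguments are hypothetical. The estimate is the Pearson chi-square statistic divided by the residual degrees of freedom.

```python
def pearson_dispersion(y, mu, n_params):
    """Estimate the dispersion parameter as Pearson chi-square / (n - p).

    y        -- observed counts
    mu       -- fitted means from a Poisson-type regression (all > 0)
    n_params -- number of estimated regression coefficients p
    Assumes the variance function V(mu) = mu (an assumption of this sketch).
    """
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    dof = len(y) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    # Pearson chi-square: squared residuals scaled by the model variance.
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / dof


# Toy data: four counts with a common fitted mean of 3 (intercept-only fit).
phi_hat = pearson_dispersion([0, 2, 4, 6], [3.0, 3.0, 3.0, 3.0], n_params=1)
print(phi_hat)
```

A value well above 1 suggests overdispersion relative to the Poisson model; under this convention, model-based standard errors would be scaled by the square root of the estimate.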

3.
Count data with excess zeros often occur in areas such as public health, epidemiology, psychology, sociology, engineering, and agriculture. Zero-inflated Poisson (ZIP) regression and zero-inflated negative binomial (ZINB) regression are useful for modeling such data, but because of hierarchical study design or the data collection procedure, zero-inflation and correlation may occur simultaneously. To overcome these challenges, ZIP or ZINB may still be used. In this paper, multilevel ZINB regression is used to overcome these problems. The parameters are estimated by an expectation-maximization algorithm in conjunction with penalized likelihood and restricted maximum likelihood estimates for the variance components. An alternative modeling strategy, namely the ZIP distribution, is also considered. An application of the proposed model is shown on decayed, missing, and filled teeth of children aged 12 years.

4.
We consider two estimation schemes based on penalized quasilikelihood and quasi-pseudo-likelihood in Poisson mixed models. The asymptotic bias in regression coefficients and variance components estimated by penalized quasilikelihood (PQL) is studied for small values of the variance components. We show that the PQL estimators of both regression coefficients and variance components in Poisson mixed models have a smaller order of bias compared to those for binomial data. Unbiased estimating equations based on quasi-pseudo-likelihood are proposed and are shown to yield consistent estimators under some regularity conditions. The finite sample performance of these two methods is compared through a simulation study.

5.
Probabilistic matching of records is widely used to create linked data sets for use in health science, epidemiological, economic, demographic and sociological research. Clearly, this type of matching can lead to linkage errors, which in turn can lead to bias and increased variability when standard statistical estimation techniques are used with the linked data. In this paper we develop unbiased regression parameter estimates to be used when fitting a linear model with nested errors to probabilistically linked data. Since estimation of variance components is typically an important objective when fitting such a model, we also develop appropriate modifications to standard methods of variance components estimation in order to account for linkage error. In particular, we focus on three widely used methods of variance components estimation: analysis of variance, maximum likelihood and restricted maximum likelihood. Simulation results show that our estimators perform reasonably well when compared to standard estimation methods that ignore linkage errors.

6.
Qunfang Xu. Statistics, 2017, 51(6): 1280-1303
In this paper, semiparametric modelling for longitudinal data with an unstructured error process is considered. We propose a partially linear additive regression model for longitudinal data in which within-subject variances and covariances of the error process are described by unknown univariate and bivariate functions, respectively. We provide an estimating approach in which polynomial splines are used to approximate the additive nonparametric components and the within-subject variance and covariance functions are estimated nonparametrically. Both the asymptotic normality of the resulting parametric component estimators and optimal convergence rate of the resulting nonparametric component estimators are established. In addition, we develop a variable selection procedure to identify significant parametric and nonparametric components simultaneously. We show that the proposed SCAD penalty-based estimators of non-zero components have an oracle property. Some simulation studies are conducted to examine the finite-sample performance of the proposed estimation and variable selection procedures. A real data set is also analysed to demonstrate the usefulness of the proposed method.

7.
8.
This paper studies estimation of a partially specified spatial panel data linear regression with random effects and spatially correlated error components. Under the assumption of an exogenous spatial weighting matrix and exogenous regressors, the unknown parameter is estimated by instrumental variable estimation. Under some sufficient conditions, the proposed estimator for the finite dimensional parameters is shown to be root-N consistent and asymptotically normally distributed; the proposed estimator for the unknown function is shown to be consistent and asymptotically normally distributed as well, though at a rate slower than root-N. Consistent estimators for the asymptotic variance–covariance matrices of both the parametric and unknown components are provided. The Monte Carlo simulation results suggest that the approach has some practical value.

9.
Parameter design or robust parameter design (RPD) is an engineering methodology intended as a cost-effective approach for improving the quality of products and processes. The goal of parameter design is to choose the levels of the control variables that optimize a defined quality characteristic. An essential component of RPD involves the assumption of well estimated models for the process mean and variance. Traditionally, the modeling of the mean and variance has been done parametrically. It is often the case, particularly when modeling the variance, that nonparametric techniques are more appropriate due to the nature of the curvature in the underlying function. Most response surface experiments involve sparse data. In sparse data situations with unusual curvature in the underlying function, nonparametric techniques often result in estimates with problematic variation whereas their parametric counterparts may result in estimates with problematic bias. We propose the use of semi-parametric modeling within the robust design setting, combining parametric and nonparametric functions to improve the quality of both mean and variance model estimation. The proposed method is illustrated with an example and simulations.

10.
Mixture cure models are widely used when a proportion of patients are cured. The proportional hazards mixture cure model and the accelerated failure time mixture cure model are the most popular models in practice. Usually the expectation–maximisation (EM) algorithm is applied to both models for parameter estimation. Bootstrap methods are used for variance estimation. In this paper we propose a smooth semi‐nonparametric (SNP) approach in which maximum likelihood is applied directly to mixture cure models for parameter estimation. The variance can be estimated by the inverse of the second derivative of the SNP likelihood. A comprehensive simulation study indicates good performance of the proposed method. We investigate stage effects in breast cancer by applying the proposed method to breast cancer data from the South Carolina Cancer Registry.

11.
The maximum likelihood (ML) method is used to estimate the unknown Gamma regression (GR) coefficients. In the presence of multicollinearity, the variance of the ML estimator becomes inflated and inference based on the ML method may not be trustworthy. To combat multicollinearity, the Liu estimator has been used. For this estimator, estimating the Liu parameter d is an important problem. A few methods for estimating this parameter are available in the literature. This study considers some of these methods and also proposes some new methods for estimating d. A Monte Carlo simulation study is conducted to assess the performance of the proposed methods, with the mean squared error (MSE) as the performance criterion. Based on the Monte Carlo simulation and application results, the Liu estimator is shown to be always superior to the ML estimator, and a recommendation is given on which estimate of the Liu parameter should be used in the Liu estimator for the GR model.

12.
In this article, we propose a beta regression model with multiplicative log-normal measurement errors. Three estimation methods are presented, namely, naive, calibration regression, and pseudo likelihood. The nuisance parameters are estimated from a system of estimation equations using replicated data and these estimates are used to propose a pseudo likelihood function. A simulation study was performed to assess some properties of the proposed methods. Results from an example with a real dataset, including diagnostic tools, are also reported.

13.
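Introducing spatial geographic information makes parameter estimation in spatial regression models complicated. Because maximum likelihood is the method mainly used, it is commonly believed that least squares has no role in estimating spatial regression models. By analyzing the parameter estimation techniques for spatial regression models, this study finds that least squares and maximum likelihood are used to estimate different parameters of the model, and that only by combining the two can all parameters be estimated quickly and effectively. Mathematical argument shows that the least squares estimator of the spatial regression model parameters is the best linear unbiased estimator. Significance tests can be carried out for the regression parameters under the condition that the estimators are normally distributed, whereas the spatial effect parameters cannot be tested in this way.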

14.
Because sliced inverse regression (SIR) using the conditional mean of the inverse regression fails to recover the central subspace when the inverse regression mean degenerates, sliced average variance estimation (SAVE) using the conditional variance was proposed in the sufficient dimension reduction literature. However, the efficacy of SAVE depends heavily upon the number of slices. In the present article, we introduce a class of weighted variance estimation (WVE) methods, which, similar to SAVE and simple contour regression (SCR), use the conditional variance of the inverse regression to recover the central subspace. The strong consistency and the asymptotic normality of the kernel estimation of WVE are established under mild regularity conditions. Finite sample studies are carried out for comparison with existing methods, and an application to a real data set is presented for illustration.

15.
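This paper studies parameter estimation for linear regression models with missing skewed data. To overcome the distortion of the sample distribution and to improve estimation of the regression coefficients, scale parameter, and skewness parameter, a modified regression imputation method is proposed for missing data in linear regression models with skewed data. Simulation studies and a real example, with comparisons against mean imputation, regression imputation, and stochastic regression imputation, show that the proposed modified regression imputation method is effective and feasible.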

16.
Summary. We propose a class of semiparametric functional regression models to describe the influence of vector-valued covariates on a sample of response curves. Each observed curve is viewed as the realization of a random process, composed of an overall mean function and random components. The finite dimensional covariates influence the random components of the eigenfunction expansion through single-index models that include unknown smooth link and variance functions. The parametric components of the single-index models are estimated via quasi-score estimating equations with link and variance functions being estimated nonparametrically. We obtain several basic asymptotic results. The functional regression models proposed are illustrated with the analysis of a data set consisting of egg-laying curves for 1000 female Mediterranean fruit flies (medflies).

17.
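Adaptive Lasso quantile regression for panel data   Total citations: 1, self-citations: 0, citations by others: 1
How to select important explanatory variables automatically while estimating parameters has long been one of the most discussed problems in quantile regression for panel data. By constructing a Bayesian hierarchical quantile regression model with multiple random effects, and assuming that the prior for the fixed-effect coefficients follows a new conditional Laplace distribution, a Gibbs sampling algorithm for estimating the model parameters is given. Because coefficients of explanatory variables of differing importance should be shrunk to differing degrees, the constructed prior is adaptive and can accurately select the important explanatory variables in the model automatically, and the slice Gibbs sampling algorithm designed here solves the posterior mean estimation of each model parameter quickly and effectively. Simulation results show that the new method outperforms methods commonly used in the literature in both the accuracy of parameter estimation and the accuracy of variable selection. Modeling panel data on several macroeconomic indicators for regions of China demonstrates the new method's ability to estimate parameters and select variables.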

18.
This paper discusses a model in which the regression lines pass through a common point. This point exists as a focal point in wind-blown sand phenomena. The model is called ‘the focal point regression model’. The focal point moves according to the conditions of the experiments or the measurement site, so it must be estimated together with the regression coefficients. The existence of the focal point has been proved mathematically in the field of coastal engineering, but its physical meaning and an exact estimation method have not been established. Considering the experimental and/or measurement conditions, five models are proposed, covering common or different error variances, passing or not passing through the centroid, and a Bayes-like approach. Moreover, formulae for direct computation of the focal point under some conditions are given for engineering purposes. The models are applied to wind-blown sand data, and the behavior of the models is verified by numerical experiments.

19.
Quantile regression methods have been widely used in many research areas in recent years. However, conventional estimation methods for quantile regression models do not guarantee that the estimated quantile curves will be non-crossing. While there are various methods in the literature to deal with this problem, many of these methods force the model parameters to lie within a subset of the parameter space in order for the required monotonicity to be satisfied. Note that different methods may use different subspaces of the space of model parameters. This paper establishes a relationship between the monotonicity of the estimated conditional quantiles and the comonotonicity of the model parameters. We develop a novel quasi-Bayesian method for parameter estimation which can be used to deal with both time series and independent statistical data. Simulation studies and an application to real financial returns show that the proposed method has the potential to be very useful in practice.
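The crossing problem described in this abstract can be illustrated with a small sketch. It is not the paper's quasi-Bayesian method: it only shows the standard check (pinball) loss that defines a quantile regression fit, plus a hypothetical helper, `quantile_lines_cross`, that tests whether separately fitted linear quantile curves violate monotonicity in the quantile level on a grid of covariate values. The fitted coefficients below are made-up illustrative numbers.

```python
def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_lines_cross(fits, x_grid):
    """Return True if fitted linear quantile curves cross on x_grid.

    fits   -- list of (tau, intercept, slope) tuples, one per quantile level
    x_grid -- covariate values at which to compare the fitted lines
    A crossing means a lower-tau line lies strictly above a higher-tau line.
    """
    ordered = sorted(fits)  # sort by quantile level tau
    for x in x_grid:
        preds = [a + b * x for _, a, b in ordered]
        if any(lo > hi for lo, hi in zip(preds, preds[1:])):
            return True
    return False


# Hypothetical fits: the 0.25 line is steeper than the 0.75 line,
# so the two lines cross once x exceeds 2.
fits = [(0.25, 0.0, 1.0), (0.75, 1.0, 0.5)]
print(quantile_lines_cross(fits, [0, 1, 2]))
print(quantile_lines_cross(fits, [0, 1, 2, 3]))
```

Separately estimated lines with different slopes must cross somewhere on the real line, which is why methods in this area restrict either the covariate range or the parameter space.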

20.
The estimation of mixtures of regression models is usually based on the assumption of normal components, and maximum likelihood estimation of the normal components is sensitive to noise, outliers, or high-leverage points. Missing values are inevitable in many situations, and parameter estimates could be biased if the missing values are not handled properly. In this article, we propose mixtures of regression models for contaminated incomplete heterogeneous data. The proposed models provide robust estimates of regression coefficients varying across latent subgroups even in the presence of missing values. The methodology is illustrated through simulation studies and a real data analysis.
