Similar literature
20 similar articles found
1.
Summary. We examine three pattern–mixture models for making inference about parameters of the distribution of an outcome of interest Y that is to be measured at the end of a longitudinal study when this outcome is missing in some subjects. We show that these pattern–mixture models also have an interpretation as selection models. Because these models make unverifiable assumptions, we recommend that inference about the distribution of Y be repeated under a range of plausible assumptions. We argue that, of the three models considered, only one admits a parameterization that facilitates the examination of departures from the assumption of sequential ignorability. The three models are nonparametric in the sense that they do not impose restrictions on the class of observed data distributions. Owing to the curse of dimensionality, the assumptions that are encoded in these models are sufficient for identification but not for inference. We describe additional flexible and easily interpretable assumptions under which it is possible to construct estimators that are well behaved with moderate sample sizes. These assumptions define semiparametric models for the distribution of the observed data. We describe a class of estimators which, up to asymptotic equivalence, comprise all the consistent and asymptotically normal estimators of the parameters of interest under the postulated semiparametric models. We illustrate our methods with the analysis of data from a randomized clinical trial of contracepting women.

2.
The present article deals with the problem of estimating the parameters of a linear regression model when some data on the response variable are missing and the responses are equi-correlated. The ordinary least squares and optimal homogeneous predictors are employed to find the imputed values of missing observations. Their efficiency properties are analyzed using the small disturbances asymptotic theory. The estimation of regression coefficients using these imputed values is also considered and a comparison of estimators is presented.

3.
Summary. In longitudinal studies, missingness of data is often an unavoidable problem. Estimators from the linear mixed effects model assume that missing data are missing at random. However, estimators are biased when this assumption is not met. In the paper, theoretical results for the asymptotic bias are established under non-ignorable drop-out, drop-in and other missing data patterns. The asymptotic bias is large when the drop-out subjects have only one or no observation, especially for slope-related parameters of the linear mixed effects model. In the drop-in case, intercept-related parameter estimators show substantial asymptotic bias when subjects enter late in the study. Eight other missing data patterns are considered and these produce asymptotic biases of a variety of magnitudes.

4.
In this paper we propose Stein-type shrinkage estimators for the parameter vector of a Poisson regression model when it is suspected that some of the parameters may be restricted to a subspace. We develop the properties of these estimators using the notion of asymptotic distributional risk. The shrinkage estimators are shown to have higher efficiency than the classical estimators for a wide class of models. Furthermore, we consider three different penalty estimators: the LASSO, adaptive LASSO, and SCAD estimators and compare their relative performance with that of the shrinkage estimators. Monte Carlo simulation studies reveal that the shrinkage strategy compares favorably to the use of penalty estimators, in terms of relative mean squared error, when the number of inactive predictors in the model is moderate to large. The shrinkage and penalty strategies are applied to two real data sets to illustrate the usefulness of the procedures in practice.

5.
The odds ratio (OR) has been recommended elsewhere to measure the relative treatment efficacy in a randomized clinical trial (RCT), because it possesses a few desirable statistical properties. In practice, it is not uncommon to come across an RCT in which there are patients who do not comply with their assigned treatments and patients whose outcomes are missing. Under the compound exclusion restriction, latent ignorable and monotonicity assumptions, we derive the maximum likelihood estimator (MLE) of the OR and apply Monte Carlo simulation to compare its performance with those of the other two commonly used estimators for missing completely at random (MCAR) and for the intention-to-treat (ITT) analysis based on patients with known outcomes, respectively. We note that both estimators for MCAR and the ITT analysis may produce a misleading inference of the OR even when the relative treatment effect is equal. We further derive three asymptotic interval estimators for the OR, including the interval estimator using Wald’s statistic, the interval estimator using the logarithmic transformation, and the interval estimator using an ad hoc procedure of combining the above two interval estimators. On the basis of a Monte Carlo simulation, we evaluate the finite-sample performance of these interval estimators in a variety of situations. Finally, we use the data taken from a randomized encouragement design studying the effect of flu shots on the flu-related hospitalization rate to illustrate the use of the MLE and the asymptotic interval estimators for the OR developed here.

6.
A discrete distribution in which the probabilities are expressible as Laguerre polynomials is formulated in terms of a probability generating function involving three parameters. The skewness and kurtosis are given for members of the family corresponding to various parameter values. Several estimators of the parameters are proposed, including some based on minimum chi-square. All the estimators are compared on the basis of asymptotic relative efficiency.

7.
Jingjing Wu 《Statistics》2015,49(4):711-740
The successful application of the Hellinger distance approach to fully parametric models is well known. The corresponding optimal estimators, known as minimum Hellinger distance (MHD) estimators, are efficient and have excellent robustness properties [Beran R. Minimum Hellinger distance estimators for parametric models. Ann Statist. 1977;5:445–463]. This combination of efficiency and robustness makes MHD estimators appealing in practice. However, their application to semiparametric statistical models, which have a nuisance parameter (typically of infinite dimension), has not been fully studied. In this paper, we investigate a methodology to extend the MHD approach to general semiparametric models. We introduce the profile Hellinger distance and use it to construct a minimum profile Hellinger distance estimator of the finite-dimensional parameter of interest. This approach is analogous in some sense to the profile likelihood approach. We investigate the asymptotic properties such as the asymptotic normality, efficiency, and adaptivity of the proposed estimator. We also investigate its robustness properties. We present its small-sample properties using a Monte Carlo study.

8.
The Fisher information is intricately linked to the asymptotic (first-order) optimality of maximum likelihood estimators for parametric complete-data models. When data are missing completely at random in a multivariate setup, it is shown that information in a single observation is well-defined and it plays the same role as in the complete-data model in characterizing the first-order asymptotic optimality properties of associated maximum likelihood estimators; computational aspects are also thoroughly appraised. As an illustration, the logistic regression model with incomplete binary responses and an incomplete categorical covariate is worked out.

9.
Motivated by Shibata’s (1980) asymptotic efficiency results, this paper discusses the asymptotic efficiency of the order selected by a selection procedure for an infinite order autoregressive process with nonzero mean and unobservable errors that constitute a sequence of independent Gaussian random variables with mean zero and variance σ². The asymptotic efficiency is established for AIC-type selection criteria such as AIC’, FPE, and Sn(k). In addition, some asymptotic results about the estimators of the parameters of the process and the error sequence are presented.

10.
The extreme value distribution has been extensively used to model natural phenomena such as rainfall and floods, and also in modeling lifetimes and material strengths. Maximum likelihood estimation (MLE) for the parameters of the extreme value distribution leads to likelihood equations that have to be solved numerically, even when the complete sample is available. In this paper, we discuss point and interval estimation based on progressively Type-II censored samples. Through an approximation in the likelihood equations, we obtain explicit estimators which are approximations to the MLEs. Using these approximate estimators as starting values, we obtain the MLEs using an iterative method and examine numerically their bias and mean squared error. The approximate estimators compare quite favorably to the MLEs in terms of both bias and efficiency. Results of the simulation study, however, show that the probability coverages of the pivotal quantities (for location and scale parameters) based on asymptotic normality are unsatisfactory for both these estimators and particularly so when the effective sample size is small. We, therefore, suggest the use of unconditional simulated percentage points of these pivotal quantities for the construction of confidence intervals. The results are presented for a wide range of sample sizes and different progressive censoring schemes. We conclude with an illustrative example.

11.
Missing data are common in many experiments, including surveys, clinical trials, epidemiological studies, and environmental studies. Unconstrained likelihood inferences for generalized linear models (GLMs) with nonignorable missing covariates have been studied extensively in the literature. However, parameter orderings or constraints may occur naturally in practice, and thus the efficiency of a statistical method may be improved by incorporating parameter constraints into the likelihood function. In this paper, we consider constrained inference for analysing GLMs with nonignorable missing covariates under linear inequality constraints on the model parameters. Specifically, constrained maximum likelihood (ML) estimation is based on the gradient projection expectation maximization approach. Further, we investigate the asymptotic null distribution of the constrained likelihood ratio test (LRT). Simulations study the empirical properties of the constrained ML estimators and LRTs, which demonstrate improved precision of these constrained techniques. An application to contaminant levels in an environmental study is also presented.

12.
The modified zero order approach to estimating coefficients in the face of missing observations treats them as parameters to be estimated simultaneously with the missing observations. The paper then investigates, in the context of Han's generalized regression model, (i) when parameter estimators do not vary between using the partial data points and using only the complete ones (the informationless result), and (ii) large sample properties of the modified zero order estimator. It is found that the sequential cut property is crucial to the informationless result for coefficient estimators; consistency of the modified zero order estimator depends on the percentage of observations with missing elements for large sample sizes or the sequential cut property.

13.
We investigate the semiparametric smooth coefficient stochastic frontier model for panel data in which the distribution of the composite error term is assumed to be of known form but depends on some environmental variables. We propose multi-step estimators for the smooth coefficient functions as well as the parameters of the distribution of the composite error term and obtain their asymptotic properties. The Monte Carlo study demonstrates that the proposed estimators perform well in finite samples. We also consider an application in which we perform a model specification test, construct confidence intervals, and estimate efficiency scores that depend on some environmental variables. The application uses panel data on 451 large U.S. firms to explore the effects of computerization on productivity. Results show that two popular parametric models used in the stochastic frontier literature are likely to be misspecified. Compared with the parametric estimates, our semiparametric model shows a positive and larger overall effect of computer capital on productivity. The efficiency levels, however, were not much different among the models. Supplementary materials for this article are available online.

14.
This article develops a general multivariate additive noise model for synchronized asset prices and provides a multivariate extension of the generalized flat-top realized kernel estimators, analyzed earlier by Varneskov (2014), to estimate its quadratic covariation. The additive noise model allows for α-mixing dependent exogenous noise, random sampling, and an endogenous noise component that encompasses synchronization errors, lead-lag relations, and diurnal heteroscedasticity. The various components may exhibit polynomially decaying autocovariances. In this setting, the class of estimators considered is consistent, asymptotically unbiased, and mixed Gaussian at the optimal rate of convergence, n^(1/4). A simple finite sample correction based on projections of symmetric matrices ensures positive definiteness without altering the asymptotic properties of the estimators. It, thereby, guarantees the existence of nonlinear transformations of the estimated covariance matrix such as correlations and realized betas, which inherit the asymptotic properties from the flat-top realized kernel estimators. An empirically motivated simulation study assesses the choice of sampling scheme and projection rule, and it shows that flat-top realized kernels have a desirable combination of robustness and efficiency relative to competing estimators. Last, an empirical analysis of signal detection and out-of-sample predictions for a portfolio of six stocks of varying size and liquidity illustrates the use and properties of the new estimators.

15.
Linear models are considered in which measurement error is present in the dependent variable. Observed values are related to true values via nonlinear regression models with the parameters in the measurement error models being estimated with the use of independent, external data, collected using standards. Pseudo-maximum likelihood estimators and their asymptotic properties are developed under normality assumptions and the common approach of simply analyzing imputed values obtained from the estimated calibration curves is assessed. A small simulation evaluates the procedures. An example is presented in which urinary neopterin (measured via radioimmunoassay) is being compared between two groups of individuals.

16.
Biao Zhang 《Statistics》2016,50(5):1173-1194
Missing covariate data occurs often in regression analysis. We study methods for estimating the regression coefficients in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866] on regression analyses with missing covariates, in which they pioneered the use of two working models, the working propensity score model and the working conditional score model. A recent approach to missing covariate data analysis is the empirical likelihood method of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503], which effectively combines unbiased estimating equations. In this paper, we consider an alternative likelihood approach based on the full likelihood of the observed data. This full likelihood-based method enables us to generate estimators for the vector of the regression coefficients that are (a) asymptotically equivalent to those of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the working propensity score model is correctly specified, and (b) doubly robust, like the augmented inverse probability weighting (AIPW) estimators of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Thus, the proposed full likelihood-based estimators improve on the efficiency of the AIPW estimators when the working propensity score model is correct but the working conditional score model is possibly incorrect, and also improve on the empirical likelihood estimators of Qin, Zhang and Leung [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the reverse is true, that is, the working conditional score model is correct but the working propensity score model is possibly incorrect. In addition, we consider a regression method for estimation of the regression coefficients when the working conditional score model is correctly specified; the asymptotic variance of the resulting estimator is no greater than the semiparametric variance bound characterized by the theory of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Finally, we compare the finite-sample performance of various estimators in a simulation study.

17.
We review some issues related to the implications of different missing data mechanisms on statistical inference for contingency tables and consider simulation studies to compare the results obtained under such models to those where the units with missing data are disregarded. We confirm that although, in general, analyses under the correct missing at random and missing completely at random models are more efficient even for small sample sizes, there are exceptions where they may not improve the results obtained by ignoring the partially classified data. We show that under the missing not at random (MNAR) model, estimates on the boundary of the parameter space as well as lack of identifiability of the parameters of saturated models may be associated with undesirable asymptotic properties of maximum likelihood estimators and likelihood ratio tests; even in standard cases the bias of the estimators may be low only for very large samples. We also show that the probability of a boundary solution obtained under the correct MNAR model may be large even for large samples and that, consequently, we may not always conclude that a MNAR model is misspecified because the estimate is on the boundary of the parameter space.

18.
Ibrahim (1990) used the EM-algorithm to obtain maximum likelihood estimates of the regression parameters in generalized linear models with partially missing covariates. The technique was termed EM by the method of weights. In this paper, we generalize this technique to Cox regression analysis with missing values in the covariates. We specify a full model letting the unobserved covariate values be random and then maximize the observed likelihood. The asymptotic covariance matrix is estimated by the inverse information matrix. The missing data are allowed to be missing at random but also the non-ignorable non-response situation may in principle be considered. Simulation studies indicate that the proposed method is more efficient than the method suggested by Paik & Tsai (1997). We apply the procedure to a clinical trials example with six covariates with three of them having missing values.

19.
Coefficient estimation in linear regression models with missing data is routinely carried out in the mean regression framework. However, the mean regression theory breaks down if the error variance is infinite. In addition, correct specification of the likelihood function for existing imputation approaches is often challenging in practice, especially for skewed data. In this paper, we develop a novel composite quantile regression and a weighted quantile average estimation procedure for parameter estimation in linear regression models when some responses are missing at random. Instead of imputing the missing response by randomly drawing from its conditional distribution, we propose to impute both missing and observed responses by their estimated conditional quantiles given the observed data and to use the parametrically estimated propensity scores to weight the check functions that define a regression parameter. Both estimation procedures are resistant to heavy-tailed errors or outliers in the response and can achieve good robustness and efficiency. Moreover, we propose adaptive penalization methods to simultaneously select significant variables and estimate unknown parameters. Asymptotic properties of the proposed estimators are carefully investigated. An efficient algorithm is developed for fast implementation of the proposed methodologies. We also discuss a model selection criterion, which is based on an ICQ-type statistic, to select the penalty parameters. The performance of the proposed methods is illustrated via simulated and real data sets.

20.
A family of robust estimators is proposed for the coefficients of Gaussian AR(p) time series subject to two simultaneous types of distortion: outliers and missing values. The estimators are based on special properties of the Cauchy probability distribution; consistency and asymptotic normality of these estimators are proven. An approximate solution to the problem of minimizing the asymptotic variance within the proposed family of estimators is found. Performance of the proposed estimators is illustrated for simulated time series and for real data sets.
