Similar literature
 20 similar articles found.
1.
Summary. In longitudinal studies missing data are the rule not the exception. We consider the analysis of longitudinal binary data with non-monotone missingness that is thought to be non-ignorable. In this setting a full likelihood approach is complicated algebraically and can be computationally prohibitive when there are many measurement occasions. We propose a 'protective' estimator that assumes that the probability that a response is missing at any occasion depends, in a completely unspecified way, on the value of that variable alone. Relying on this 'protectiveness' assumption, we describe a pseudolikelihood estimator of the regression parameters under non-ignorable missingness, without having to model the missing data mechanism directly. The method proposed is applied to CD4 cell count data from two longitudinal clinical trials of patients infected with the human immunodeficiency virus.

2.
In this paper we consider the calibration procedure for a rare sensitive attribute with a Poisson distribution, which was suggested by Land et al. (2012), using auxiliary information associated with the variable of interest. In the calibration procedure, we can use auxiliary information such as socio-demographic variables for the respondents of rare sensitive attribute questions from an external source, and the estimator can then be improved with respect to the problems of non-coverage or non-response. From the efficiency comparison study, we show that the calibrated Poisson RR estimators are more efficient than those of Land et al. (2012) when the known population cell and marginal counts of auxiliary information are used for the calibration procedure.

3.
Simple nonparametric estimates of the conditional distribution of a response variable given a covariate are often useful for data exploration purposes or to help with the specification or validation of a parametric or semi-parametric regression model. In this paper we propose such an estimator in the case where the response variable is interval-censored and the covariate is continuous. Our approach consists in adding weights that depend on the covariate value in the self-consistency equation proposed by Turnbull (J R Stat Soc Ser B 38:290–295, 1976), which results in an estimator that is no more difficult to implement than Turnbull's estimator itself. We show the convergence of our algorithm and that our estimator reduces to the generalized Kaplan–Meier estimator (Beran, Nonparametric regression with randomly censored survival data, 1981) when the data are either complete or right-censored. We demonstrate by simulation that the estimator, bootstrap variance estimation and bandwidth selection (by rule of thumb or cross-validation) all perform well in finite samples. We illustrate the method by applying it to a dataset from a study on the incidence of HIV in a group of female sex workers from Kinshasa.
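
The idea of weighting a Turnbull-type self-consistency equation by covariate proximity can be sketched as follows. This is only an illustration under simplifying assumptions, not the authors' algorithm: the support grid is taken as given, every observed interval is closed and contains at least one support point, and the function names, the Gaussian kernel, and the stopping rule are all hypothetical choices.

```python
import numpy as np

def gaussian_kernel_weights(x0, covariate, bandwidth):
    """Weights that favour observations whose covariate is close to x0 (assumed Gaussian kernel)."""
    u = (np.asarray(covariate, dtype=float) - x0) / bandwidth
    return np.exp(-0.5 * u**2)

def weighted_self_consistency(left, right, support, weights, n_iter=1000, tol=1e-12):
    """Iterate a weighted self-consistency update for interval-censored responses.

    left, right : endpoints of the closed intervals known to contain each response
    support     : candidate mass points (assumed given for this sketch)
    weights     : nonnegative weights, one per observation
    Returns the estimated probability mass at each support point.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    support = np.asarray(support, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the masses sum to one

    # alpha[i, j] = 1 when support point j lies inside interval i
    alpha = ((support[None, :] >= left[:, None]) &
             (support[None, :] <= right[:, None])).astype(float)
    if np.any(alpha.sum(axis=1) == 0):
        raise ValueError("every interval must contain at least one support point")

    p = np.full(support.size, 1.0 / support.size)  # uniform starting value
    for _ in range(n_iter):
        interval_mass = alpha @ p                    # mass currently inside each interval
        new_p = p * ((w / interval_mass) @ alpha)    # weighted self-consistency update
        if np.max(np.abs(new_p - p)) < tol:
            p = new_p
            break
        p = new_p
    return p
```

With exactly observed responses (left equal to right) the iteration returns the normalised weights themselves, which is the weighted empirical distribution.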

4.
Stochastic processes often exhibit sudden systematic changes in pattern a short time before certain failure events. Examples include increase in medical costs before death and decrease in CD4 counts before AIDS diagnosis. To study such terminal behavior of stochastic processes, a natural and direct way is to align the processes using failure events as time origins. This paper studies backward stochastic processes counting time backward from failure events, and proposes one-sample nonparametric estimation of the mean of backward processes when follow-up is subject to left truncation and right censoring. We will discuss benefits of including prevalent cohort data to enlarge the identifiable region and large sample properties of the proposed estimator with related extensions. A SEER-Medicare linked data set is used to illustrate the proposed methodologies.

5.
Interval-grouped data are defined, in general, when the event of interest cannot be directly observed and is only known to have occurred within an interval. In this framework, a nonparametric kernel density estimator is proposed and studied. The approach is based on the classical Parzen–Rosenblatt estimator and on the generalisation of the binned kernel density estimator. The asymptotic bias and variance of the proposed estimator are derived under usual assumptions, and the effect of using non-equally spaced grouped data is analysed. Additionally, a plug-in bandwidth selector is proposed. Through a comprehensive simulation study, the behaviour of both the estimator and the plug-in bandwidth selector considering different scenarios of data grouping is shown. An application to real data confirms the simulation results, revealing the good performance of the estimator whenever data are not heavily grouped.

6.
Nonparametric estimates of the conditional distribution of a response variable given a covariate are important for data exploration purposes. In this article, we propose a nonparametric estimator of the conditional distribution function in the case where the response variable is subject to interval censoring and double truncation. Using the approach of Dehghan and Duchesne (2011), the proposed method consists in adding weights that depend on the covariate value in the self-consistency equation of Turnbull (1976), which results in a nonparametric estimator. We demonstrate by simulation that the estimator, bootstrap variance estimation and bandwidth selection all perform well in finite samples.

7.
A semiparametric approach to model skewed/heteroscedastic regression data is discussed. We work with a semiparametric transform-both-sides regression model, which contains a parametric regression function and a nonparametric transformation. This model is adequate when the relationship between the median response and the explanatory variable has been specified by a theoretical result or a previous empirical study. The transform-both-sides model with a parametric transformation has been studied extensively and applied successfully to a number of data sets. Allowing a nonparametric transformation function increases the flexibility of the model. In this article, we estimate the nonparametric transformation function by the conditional kernel density approach developed by Wang and Ruppert (1995), and then use a pseudo-maximum likelihood estimator to estimate the regression parameters. This estimate of the regression parameters has not been studied previously. In this article, the asymptotic distribution of this pseudo-MLE is derived. We also show that when σ, the standard deviation of the error, goes to zero (small σ asymptotics), this estimator is adaptive. Adaptive means that the regression parameters are estimated as precisely as when the transformation is known exactly. A similar result holds in the parametric approaches of Carroll and Ruppert (1984) and Ruppert and Aldershof (1989). Simulated and real examples are provided to illustrate the performance of the proposed estimator for finite sample size.

8.
Asymptotic Normality in Mixtures of Power Series Distributions (cited 1 time: 0 self-citations, 1 citation by others)
Abstract. The problem of estimating the individual probabilities of a discrete distribution is considered. The true distribution of the independent observations is a mixture of a family of power series distributions. First, we ensure identifiability of the mixing distribution assuming mild conditions. Next, the mixing distribution is estimated by non-parametric maximum likelihood and an estimator for individual probabilities is obtained from the corresponding marginal mixture density. We establish asymptotic normality for the estimator of individual probabilities by showing that, under certain conditions, the difference between this estimator and the empirical proportions is asymptotically negligible. Our framework includes Poisson, negative binomial and logarithmic series as well as binomial mixture models. Simulations highlight the benefit in achieving normality when using the proposed marginal mixture density approach instead of the empirical one, especially for small sample sizes and/or when interest is in the tail areas. A real data example is given to illustrate the use of the methodology.

9.
In this paper, we study a novel robust procedure for simultaneous variable selection and parametric component identification in varying coefficient models. The proposed estimator is based on spline approximation and two smoothly clipped absolute deviation (SCAD) penalties through rank regression, which is robust with respect to heavy-tailed errors or outliers in the response. Furthermore, when the tuning parameter is chosen by a modified BIC criterion, we show that the proposed procedure is consistent both in variable selection and in the separation of varying and constant coefficients. In addition, the estimators of varying coefficients possess the optimal convergence rate under some assumptions, and the estimators of constant coefficients have the same asymptotic distribution as their counterparts obtained when the true model is known. Simulation studies and a real data example are undertaken to assess the finite sample performance of the proposed variable selection procedure.

10.
This paper considers the problem of selecting optimal bandwidths for variable (sample-point adaptive) kernel density estimation. A data-driven variable bandwidth selector is proposed, based on the idea of approximating the log-bandwidth function by a cubic spline. This cubic spline is optimized with respect to a cross-validation criterion. The proposed method can be interpreted as a selector for either integrated squared error (ISE) or mean integrated squared error (MISE) optimal bandwidths. This leads to reflection upon some of the differences between ISE and MISE as error criteria for variable kernel estimation. Results from simulation studies indicate that the proposed method outperforms a fixed kernel estimator (in terms of ISE) when the target density has a combination of sharp modes and regions of smooth undulation. Moreover, some detailed data analyses suggest that the gains in ISE may understate the improvements in visual appeal obtained using the proposed variable kernel estimator. These numerical studies also show that the proposed estimator outperforms existing variable kernel density estimators implemented using piecewise constant bandwidth functions.
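
A sample-point adaptive kernel density estimate gives each observation its own bandwidth. The sketch below shows only that estimator, assuming a Gaussian kernel and bandwidths supplied by the caller; the spline-based, cross-validated bandwidth selector described in the abstract is not reproduced, and the function name is hypothetical.

```python
import numpy as np

def variable_kde(x, data, bandwidths):
    """Sample-point adaptive KDE: f(x) = (1/n) * sum_i K((x - X_i) / h_i) / h_i.

    x          : evaluation points
    data       : observations X_1, ..., X_n
    bandwidths : one positive bandwidth h_i per observation (assumed given)
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    data = np.asarray(data, dtype=float)
    h = np.asarray(bandwidths, dtype=float)
    u = (x[:, None] - data[None, :]) / h[None, :]       # standardised distances
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)  # Gaussian kernel (an assumption)
    return (kernel / h[None, :]).mean(axis=1)
```

Setting all bandwidths to the same value recovers the usual fixed-bandwidth kernel estimator, which is the baseline the abstract compares against.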

11.
In a cross-sectional observational study, time-to-event distribution can be estimated from data on current status or from recalled data on the time of occurrence. In either case, one can treat the data as having been interval censored, and use the nonparametric maximum likelihood estimator proposed by Turnbull (J R Stat Soc Ser B 38:290–295, 1976). However, the chance of recall may depend on the time span between the occurrence of the event and the time of interview. In such a case, the underlying censoring would be informative, rendering the Turnbull estimator inappropriate. In this article, we provide a nonparametric maximum likelihood estimator of the distribution of interest, by using a model adapted to the special nature of the data at hand. We also provide a computationally simple approximation of this estimator, and establish the consistency of both the original and the approximate versions, under mild conditions. Monte Carlo simulations indicate that the proposed estimators have smaller bias than the Turnbull estimator based on incomplete recall data, smaller variance than the Turnbull estimator based on current status data, and smaller mean squared error than both of them. The method is applied to menarcheal data from a recent anthropometric study of adolescent and young adult females in Kolkata, India.

12.
Rao (J. Indian Statist. Assoc. 17 (1979) 125) has given a 'necessary form' for an unbiased mean square error (MSE) estimator to be 'uniformly non-negative'. The MSE is of a homogeneous linear estimator 'subject to a specified constraint', for a survey population total of a real variable of interest. We present a corresponding theorem when the 'constraint' is relaxed. Certain results are added presenting formulae for estimators of MSEs when the variate-values for the sampled individuals are not ascertainable. Though not ascertainable, they are supposed to be suitably estimated either by (1) randomized response techniques covering sensitive issues or by (2) further sampling in 'subsequent' stages in specific ways when the initial sampling units are composed of a number of sub-units. Using live numerical data, practical uses of the proposed alternative MSE estimators are demonstrated.

13.
We propose quantile regression (QR) in the Bayesian framework for a class of nonlinear mixed effects models with a known, parametric model form for longitudinal data. Estimation of the regression quantiles is based on a likelihood-based approach using the asymmetric Laplace density. Posterior computations are carried out via Gibbs sampling and the adaptive rejection Metropolis algorithm. To assess the performance of the Bayesian QR estimator, we compare it with the mean regression estimator using real and simulated data. Results show that the Bayesian QR estimator provides a fuller examination of the shape of the conditional distribution of the response variable. Our approach is proposed for parametric nonlinear mixed effects models, and therefore may not be generalized to models without a given model form.
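
For orientation, the asymmetric Laplace density commonly used as a working likelihood for the $\tau$-th quantile can be written as below. The symbols $\mu$ (location), $\sigma$ (scale) and $\rho_\tau$ (check function) are the usual textbook notation and are assumed here; the abstract does not give the authors' own parameterisation.

```latex
f(y \mid \mu, \sigma, \tau)
  = \frac{\tau(1-\tau)}{\sigma}
    \exp\!\left\{ -\rho_\tau\!\left( \frac{y-\mu}{\sigma} \right) \right\},
\qquad
\rho_\tau(u) = u \,\bigl( \tau - \mathbb{1}\{u < 0\} \bigr).
```

For fixed $\sigma$, maximising $\prod_i f(y_i \mid \mu_i, \sigma, \tau)$ over the parameters of $\mu_i$ is the same as minimising $\sum_i \rho_\tau(y_i - \mu_i)$, which is why this density links likelihood-based and Bayesian computation to classical quantile regression.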

14.
A density estimation method in a Bayesian nonparametric framework is presented when recorded data are not coming directly from the distribution of interest, but from a length biased version. From a Bayesian perspective, efforts to computationally evaluate posterior quantities conditionally on length biased data were hindered by the inability to circumvent the problem of a normalizing constant. In this article, we present a novel Bayesian nonparametric approach to the length bias sampling problem that circumvents the issue of the normalizing constant. Numerical illustrations as well as a real data example are presented and the estimator is compared against its frequentist counterpart, the kernel density estimator for indirect data of Jones.

15.
A general nonparametric imputation procedure, based on kernel regression, is proposed to estimate points as well as set- and function-indexed parameters when the data are missing at random (MAR). The proposed method works by imputing a specific function of a missing value (and not the missing value itself), where the form of this specific function is dictated by the parameter of interest. Both single and multiple imputations are considered. The associated empirical processes provide the right tool to study the uniform convergence properties of the resulting estimators. Our estimators include, as special cases, the imputation estimator of the mean, the estimator of the distribution function proposed by Cheng and Chu [1996. Kernel estimation of distribution functions and quantiles with missing data. Statist. Sinica 6, 63–78], imputation estimators of a marginal density, and imputation estimators of regression functions.

16.
An unknown moment-determinate cumulative distribution function or its density function can be recovered from corresponding moments and estimated from the empirical moments. This method of estimating an unknown density is natural in certain inverse estimation models like multiplicative censoring or biased sampling when the moments of the unobserved distribution can be estimated via the transformed moments of the observed distribution. In this paper, we introduce a new nonparametric estimator of a probability density function defined on the positive real line, motivated by the above. Some fundamental properties of the proposed estimator are studied. The comparison with the traditional kernel density estimator is discussed.

17.
This study focuses on the estimation of the population mean of a sensitive variable in stratified random sampling based on a randomized response technique (RRT) when the observations are contaminated by measurement errors (ME). A generalized estimator of the population mean is proposed by using additively scrambled responses for the sensitive variable. The expressions for the bias and mean square error (MSE) of the proposed estimator are derived. The performance of the proposed estimator is evaluated both theoretically and empirically. Results are also applied to a real data set.
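
As a rough illustration of additive scrambling in a stratified design, suppose each respondent reports Z = Y + S, where S is scrambling noise with a known mean. Subtracting that mean from each stratum average and combining with the stratum weights gives a basic estimate of the population mean. This sketch is the simplest additive model only: it ignores measurement error, it is not the generalized estimator proposed in the abstract, and all names are hypothetical.

```python
def stratified_additive_rrt_mean(scrambled_by_stratum, stratum_weights, scramble_mean):
    """Basic stratified mean estimate from additively scrambled responses.

    scrambled_by_stratum : list of lists, reported values Z = Y + S in each stratum
    stratum_weights      : population share W_h of each stratum (should sum to 1)
    scramble_mean        : known mean E[S] of the scrambling variable (assumed known)
    """
    estimate = 0.0
    for responses, weight in zip(scrambled_by_stratum, stratum_weights):
        stratum_mean = sum(responses) / len(responses)
        # Remove the known scrambling mean, then weight by the stratum share
        estimate += weight * (stratum_mean - scramble_mean)
    return estimate
```

For example, with two strata reporting [12, 14] and [20, 22, 24], weights 0.4 and 0.6, and a scrambling mean of 2, the estimate is 0.4 × 11 + 0.6 × 20 = 16.4.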

18.
Patients infected with the human immunodeficiency virus (HIV) generally experience a decline in their CD4 cell count (a count of certain white blood cells). We describe the use of quantile regression methods to analyse longitudinal data on CD4 cell counts from 1300 patients who participated in clinical trials that compared two therapeutic treatments: zidovudine and didanosine. It is of scientific interest to determine any treatment differences in the CD4 cell counts over a short treatment period. However, the analysis of the CD4 data is complicated by drop-outs: patients with lower CD4 cell counts at the base-line appear more likely to drop out at later measurement occasions. Motivated by this example, we describe the use of 'weighted' estimating equations in quantile regression models for longitudinal data with drop-outs. In particular, the conventional estimating equations for the quantile regression parameters are weighted inversely proportionally to the probability of drop-out. This approach requires the process generating the missing data to be estimable but makes no assumptions about the distribution of the responses other than those imposed by the quantile regression model. This method yields consistent estimates of the quantile regression parameters provided that the model for drop-out has been correctly specified. The methodology proposed is applied to the CD4 cell count data and the results are compared with those obtained from an 'unweighted' analysis. These results demonstrate how an analysis that fails to account for drop-outs can mislead.
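
The effect of inverse probability weighting on a quantile estimate can be seen in the simplest case, an intercept-only model, where the weighted estimating equation reduces to a weighted sample quantile. The sketch below assumes that each observed response comes with an estimated probability of still being under observation; the function name and that input are hypothetical, and the regression setting of the abstract is not reproduced.

```python
import numpy as np

def ipw_quantile(y, prob_observed, tau):
    """Weighted tau-th quantile with weights 1 / P(response is observed).

    y             : observed responses
    prob_observed : estimated probability that each response was observed (assumed given)
    tau           : quantile level in (0, 1]
    """
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(prob_observed, dtype=float)
    order = np.argsort(y)
    y_sorted, w_sorted = y[order], w[order]
    cum = np.cumsum(w_sorted)
    # Smallest observed value whose cumulative weight reaches tau of the total weight
    idx = np.searchsorted(cum, tau * cum[-1], side="left")
    idx = min(idx, y_sorted.size - 1)
    return y_sorted[idx]
```

With responses 1 to 5 and observation probabilities (1, 1, 1, 0.5, 0.5), the two largest responses count double, so the weighted median is 4 whereas the unweighted median is 3. The direction of that shift depends on which subjects are under-represented among those still observed.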

19.
In this paper, we consider the non-penalty shrinkage estimation method of random effect models with autoregressive errors for longitudinal data when there are many covariates and some of them may not be active for the response variable. In observational studies, subjects are followed over equally or unequally spaced visits to determine the continuous response and whether the response is associated with the risk factors/covariates. Measurements from the same subject are usually more similar to each other and thus are correlated with each other but not with observations of other subjects. To analyse these data, we consider a linear model that contains both random effects across subjects and within-subject errors that follow an autoregressive structure of order 1 (AR(1)). Considering the subject-specific random effect as a nuisance parameter, we use two competing models: one includes all the covariates and the other restricts the coefficients based on the auxiliary information. We consider the non-penalty shrinkage estimation strategy that shrinks the unrestricted estimator in the direction of the restricted estimator. We discuss the asymptotic properties of the shrinkage estimators using the notion of asymptotic biases and risks. A Monte Carlo simulation study is conducted to examine the relative performance of the shrinkage estimators with the unrestricted estimator when the shrinkage dimension exceeds two. We also numerically compare the performance of the shrinkage estimators to that of the LASSO estimator. A longitudinal CD4 cell count data set will be used to illustrate the usefulness of shrinkage and LASSO estimators.

20.
When the individual measurements are statistically independent, the maximum likelihood estimator calculated at the end of a sequential procedure overestimates the underlying effect. There are many clinical trials in which we are interested in comparing changes in responses between two treatment groups sequentially. Lee and DeMets (1991, JASA 86, 757–762) proposed a group sequential method for comparing rates of change when a response variable is measured for each patient at successive follow-up visits. They assumed that the response follows the linear mixed effects model and derived the asymptotic joint distribution of the sequentially computed statistics. In this article, we consider the maximum likelihood estimator (MLE), the median unbiased estimator (MUE) and the midpoint of a 100(1-α)% confidence interval as point estimators for the rate of change in the linear mixed effects model, and investigate their properties by Monte Carlo simulation.
