Similar Documents
20 similar documents found (search time: 984 ms).
1.
The Pareto distribution is a simple model for non-negative data with a power-law probability tail. Income and wealth data are typically modeled using some variant of the classical Pareto distribution. In practice, the observed data may well have been truncated with respect to some unobserved covariable. In this paper, a hidden truncation formulation of this scenario is proposed and analyzed. A bivariate Pareto (II) distribution is assumed for the variable of interest and the unobserved covariable. Distributional properties of the resulting model are investigated, and a variety of parameter estimation strategies (under the classical set-up) are explored.
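For orientation, the Pareto (II) family referenced above has, in its standard parametrization with location μ, scale σ, and shape α, the survival function

$$\Pr(X > x) = \left(1 + \frac{x - \mu}{\sigma}\right)^{-\alpha}, \qquad x > \mu,\ \sigma > 0,\ \alpha > 0,$$

so the density is $f(x) = (\alpha/\sigma)\left(1 + (x-\mu)/\sigma\right)^{-(\alpha+1)}$ and the tail decays like the power law $x^{-\alpha}$; the paper's bivariate Pareto (II) model extends this form to the pair (variable of interest, unobserved covariable).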

2.
Regression calibration is a simple method for estimating regression models when covariate data are missing for some study subjects. It consists in replacing an unobserved covariate by an estimator of its conditional expectation given available covariates. Regression calibration has recently been investigated in various regression models such as the linear, generalized linear, and proportional hazards models. The aim of this paper is to investigate the appropriateness of this method for estimating the stratified Cox regression model with missing values of the covariate defining the strata. Despite its practical relevance, this problem has not yet been discussed in the literature. Asymptotic distribution theory is developed for the regression calibration estimator in this setting. A simulation study is also conducted to investigate the properties of this estimator.
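As a minimal sketch of the regression calibration idea (using a plain linear outcome model for illustration rather than the paper's stratified Cox setting; all variable names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)                         # always-observed covariate
x = 0.8 * z + rng.normal(scale=0.5, size=n)    # covariate missing for some subjects
y = 1.5 * x + rng.normal(size=n)               # outcome (linear, for illustration only)
miss = rng.random(n) < 0.3                     # ~30% of x missing at random

# Step 1: on complete cases, regress x on z to estimate E[x | z].
A = np.column_stack([np.ones((~miss).sum()), z[~miss]])
coef, *_ = np.linalg.lstsq(A, x[~miss], rcond=None)

# Step 2: replace each missing x by its calibrated value, the fitted E[x | z].
x_cal = x.copy()
x_cal[miss] = coef[0] + coef[1] * z[miss]

# Step 3: fit the outcome model using the calibrated covariate.
B = np.column_stack([np.ones(n), x_cal])
beta, *_ = np.linalg.lstsq(B, y, rcond=None)
print("slope after regression calibration:", beta[1])
```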

3.
Measurement-error modelling occurs when one cannot observe a covariate, but instead has possibly replicated surrogate versions of this covariate measured with error. The vast majority of the literature on measurement-error modelling assumes (typically with good reason) that, given the value of the true but unobserved (latent) covariate, the replicated surrogates are unbiased for the latent covariate and conditionally independent. In the area of nutritional epidemiology, there is some evidence from biomarker studies that this simple conditional independence model may break down due to two causes: (a) systematic biases depending on a person's body mass index, and (b) an additional random component of bias, so that the error structure is the same as a one-way random-effects model. We investigate this problem in the context of (1) estimating the distribution of usual nutrient intake, (2) estimating the correlation between a nutrient instrument and usual nutrient intake, and (3) estimating the true relative risk from an estimated relative risk based on the error-prone covariate. While systematic bias due to body mass index appears to have little effect, the additional random effect in the variance structure is shown to have a potentially important effect on overall results, both on corrections for relative risk estimates and in estimating the distribution of usual nutrient intake. However, the effect of dietary measurement error on both factors is shown via examples to depend strongly on the data set being used. Indeed, one of our data sets suggests that dietary measurement error may be masking a strong effect of dietary fat on breast cancer risk, while for a second data set this masking is not so clear. Until further understanding of dietary measurement error is available, measurement-error corrections must be done on a study-specific basis, sensitivity analyses should be conducted, and even then the results of nutritional epidemiology studies relating diet to disease risk should be interpreted cautiously.
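A hedged sketch of the error structure described above, writing $W_{ij}$ for replicate $j$ of person $i$'s surrogate and $X_i$ for the latent usual intake (the paper's exact parametrization may differ):

$$W_{ij} = \beta_0 + \beta_1\,\mathrm{BMI}_i + X_i + u_i + \varepsilon_{ij}, \qquad u_i \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2),$$

where $\beta_0 + \beta_1\,\mathrm{BMI}_i$ captures the systematic bias, $u_i$ is the additional person-level random bias (the one-way random-effects component), and $\varepsilon_{ij}$ is within-person error. The classical conditional-independence model corresponds to $\beta_0 = \beta_1 = 0$ and $\sigma_u^2 = 0$.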

4.
In this article, a new model is presented that is based on the Pareto distribution of the second kind, in which the location parameter depends on covariates as well as on unobserved heterogeneity. Bayesian analysis of the model can be performed using Markov chain Monte Carlo techniques. The new procedures are illustrated with artificial data as well as international output data.
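One plausible reading of such a specification (a sketch only; the paper's exact link function and heterogeneity distribution may differ) is

$$\Pr(y_i > y \mid \mu_i) = \left(1 + \frac{y - \mu_i}{\sigma}\right)^{-\alpha}, \quad y > \mu_i, \qquad \mu_i = x_i'\beta + u_i,$$

with $u_i$ an unobserved heterogeneity term (for instance Gaussian, an assumption made here), so that a Markov chain Monte Carlo scheme can alternate between updating $(\beta, \sigma, \alpha)$ and the latent $u_i$.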

5.
Internet traffic data are characterized by some unusual statistical properties, in particular the presence of heavy-tailed variables. A typical model for heavy-tailed distributions is the Pareto distribution, although this is not adequate in many cases. In this article, we consider a mixture of two-parameter Pareto distributions as a model for heavy-tailed data and use a Bayesian approach based on the birth-death Markov chain Monte Carlo algorithm to fit this model. We estimate some measures of interest related to the queueing system k-Par/M/1, where k-Par denotes a mixture of k Pareto distributions. Heavy-tailed variables are difficult to model in such queueing systems because of the lack of a simple expression for the Laplace transform (LT). We use a procedure based on recent LT approximation results for the Pareto/M/1 system. We illustrate our approach with both simulated and real data.
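A minimal sketch of a two-component Pareto mixture of the kind fitted here (the weights and parameter values are illustrative assumptions, and the birth-death MCMC fitting step is not shown):

```python
import numpy as np

rng = np.random.default_rng(1)

def rpareto(n, alpha, sigma, rng):
    """Draw from a two-parameter Pareto: F(x) = 1 - (sigma/x)**alpha, x >= sigma."""
    return sigma * (1.0 - rng.random(n)) ** (-1.0 / alpha)

# Two-component mixture with illustrative weight and parameters
p, (a1, s1), (a2, s2) = 0.7, (2.5, 1.0), (1.2, 3.0)
n = 10_000
comp = rng.random(n) < p
x = np.where(comp, rpareto(n, a1, s1, rng), rpareto(n, a2, s2, rng))

def mix_pdf(x, p, a1, s1, a2, s2):
    """Mixture density: p*f1(x) + (1-p)*f2(x), each a Pareto density."""
    f1 = np.where(x >= s1, a1 * s1**a1 / x**(a1 + 1), 0.0)
    f2 = np.where(x >= s2, a2 * s2**a2 / x**(a2 + 1), 0.0)
    return p * f1 + (1 - p) * f2

print("sample mean:", x.mean(), "tail P(X > 20):", (x > 20).mean())
```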

6.
Survival studies usually collect, on each participant, both the duration until some terminal event and repeated measures of a time-dependent covariate. Such a covariate is referred to as an internal time-dependent covariate. Usually, some subjects drop out of the study before occurrence of the terminal event of interest. One may then wish to evaluate the relationship between time to dropout and the internal covariate. The Cox model is a standard framework for that purpose. Here, we address this problem in situations where the value of the covariate at dropout is unobserved. We suggest a joint model which combines a first-order Markov model for the longitudinally measured covariate with a time-dependent Cox model for the dropout process. We consider maximum likelihood estimation in this model and show how estimation can be carried out via the EM algorithm. We note that the suggested joint model may have applications in the context of longitudinal data with nonignorable dropout. Indeed, it can be viewed as generalizing the model of Diggle and Kenward (1994) to situations where dropout may occur at any point in time and may be censored. Hence we apply both models and compare their results on a data set concerning longitudinal measurements among patients in a cancer clinical trial.
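In symbols, a hedged sketch of such a joint model (the Gaussian form of the Markov transition is an assumption made here for concreteness): for subject $i$ with covariate measurements $X_{i1}, X_{i2}, \ldots$ and dropout hazard $\lambda_i(t)$,

$$X_{ij} \mid X_{i,j-1} \sim N(\alpha_0 + \alpha_1 X_{i,j-1},\ \sigma^2), \qquad \lambda_i(t) = \lambda_0(t)\exp\{\gamma X_i(t)\},$$

so dropout may depend on the current, possibly unobserved, covariate value, which is why the EM algorithm is needed for maximum likelihood estimation.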

7.
In this study, a new extension of the generalized half-normal (GHN) distribution is introduced. Since this new distribution can be viewed as a weighted version of the GHN distribution, it is called the weighted generalized half-normal (WGHN) distribution. It is shown that the WGHN distribution can be represented as a hidden truncation model subject to a single constraint, and the new distribution is therefore more flexible than the GHN distribution. Some statistical properties of the WGHN distribution are studied; in particular, the moments, cumulative distribution function, and hazard rate function are derived. Furthermore, maximum likelihood estimation of the parameters is considered. Some real-life data sets taken from the literature are modelled using the WGHN distribution, and for these data sets the WGHN distribution provides a better fit than the GHN and slashed generalized half-normal (SGHN) distributions.

8.
In two observational studies, one investigating the effects of minimum wage laws on employment and the other the effects of exposures to lead, an estimated treatment effect's sensitivity to hidden bias is examined. The estimate uses the combined quantile averages that were introduced in 1981 by B. M. Brown as simple, efficient, robust estimates of location admitting both exact and approximate confidence intervals and significance tests. Closely related to Gastwirth's estimate and Tukey's trimean, the combined quantile average has asymptotic efficiency for normal data that is comparable with that of a 15% trimmed mean, and higher efficiency than the trimean, but it has resistance to extreme observations or breakdown comparable with that of the trimean and better than the 15% trimmed mean. Combined quantile averages provide consistent estimates of an additive treatment effect in a matched randomized experiment. Sensitivity analyses are discussed for combined quantile averages when used in a matched observational study in which treatments are not randomly assigned. In a sensitivity analysis in an observational study, subjects are assumed to differ with respect to an unobserved covariate that was not adequately controlled by the matching, so that treatments are assigned within pairs with probabilities that are unequal and unknown. The sensitivity analysis proposed here uses significance levels, point estimates and confidence intervals based on combined quantile averages and examines how these inferences change under a range of assumptions about biases due to an unobserved covariate. The procedures are applied in the studies of minimum wage laws and exposures to lead. The first example is also used to illustrate sensitivity analysis with an instrumental variable.
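For intuition, here is a small sketch of the location estimators mentioned above; the levels and weights used for the combined quantile average are illustrative placeholders, not Brown's (1981) recommended choices:

```python
import numpy as np

def quantile_average(x, p):
    """Symmetric quantile average: (Q(p) + Q(1-p)) / 2."""
    return 0.5 * (np.quantile(x, p) + np.quantile(x, 1 - p))

def trimean(x):
    """Tukey's trimean: (Q1 + 2*median + Q3) / 4."""
    q1, med, q3 = np.quantile(x, [0.25, 0.5, 0.75])
    return 0.25 * (q1 + 2 * med + q3)

def gastwirth(x):
    """Gastwirth's estimator: 0.3*Q(1/3) + 0.4*median + 0.3*Q(2/3)."""
    return 0.3 * np.quantile(x, 1/3) + 0.4 * np.quantile(x, 0.5) + 0.3 * np.quantile(x, 2/3)

def combined_quantile_average(x, levels=(0.2, 0.35, 0.5), weights=(0.3, 0.3, 0.4)):
    """Weighted combination of symmetric quantile averages
    (levels/weights here are illustrative, not Brown's 1981 choices)."""
    return sum(w * quantile_average(x, p) for p, w in zip(levels, weights))

rng = np.random.default_rng(2)
x = rng.normal(loc=10, size=200)
print(trimean(x), gastwirth(x), combined_quantile_average(x))
```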

9.
In this paper we extend the structural probit measurement error model by allowing the unobserved covariate to follow a skew-normal distribution. The new model is termed the structural skew-normal probit model. As in the normal case, the likelihood function is obtained analytically and can be maximized using existing statistical software. A Bayesian approach using Markov chain Monte Carlo techniques for generating from the posterior distributions is also developed. A simulation study demonstrates the usefulness of the approach in avoiding the attenuation that arises with the naive procedure. Moreover, a comparison of predicted and true success probabilities indicates that it seems to be more efficient to use the skew probit model when the distribution of the covariate (predictor) is skew. An application to a real data set is also provided.
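For reference, the skew-normal density referred to above, in the standard Azzalini parametrization with location ξ, scale ω, and skewness α, is

$$f(x) = \frac{2}{\omega}\,\phi\!\left(\frac{x-\xi}{\omega}\right)\Phi\!\left(\alpha\,\frac{x-\xi}{\omega}\right),$$

where φ and Φ are the standard normal density and distribution function; α = 0 recovers the normal case, so the structural normal probit model is nested within the skew-normal one.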

10.
The cumulative incidence function plays an important role in assessing treatment and covariate effects with competing-risks data. In this article, we consider an additive hazard model allowing time-varying covariate effects for the subdistribution, and we propose a weighted estimating equation under covariate-dependent censoring, obtained by fitting a Cox-type hazard model for the censoring distribution. When there is some association between the censoring time and the covariates, the proposed coefficient estimators are unbiased, and their large-sample properties are established. The finite-sample properties of the proposed estimators are examined in a simulation study. The proposed Cox-weighted method is applied to a competing-risks dataset from a Hodgkin's disease study.
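In symbols, a hedged sketch of such a model: the subdistribution hazard for the event of interest is taken as

$$\lambda_1(t \mid Z) = \lambda_0(t) + \beta(t)'Z,$$

with $\lambda_0(t)$ an unspecified baseline and $\beta(t)$ time-varying coefficients, and the estimating equation weights each subject by the inverse of an estimated censoring survival probability obtained from a Cox model for the censoring time given $Z$ (the exact weight construction in the paper may differ).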

11.
Measurement error models constitute a wide class of models that includes linear and nonlinear regression models. They are very useful for modelling many real-life phenomena, particularly in the medical and biological areas. The great advantage of these models is that, in some sense, they can be represented as mixed effects models, allowing us to implement well-known techniques such as the EM algorithm for parameter estimation. In this paper, we consider a class of multivariate measurement error models where the observed response and/or covariate are not fully observed, that is, the observations are subject to certain threshold values below or above which the measurements are not quantifiable. Consequently, these observations are considered censored. We assume a Student-t distribution for the unobserved true values of the mismeasured covariate and for the error term of the model, providing a robust alternative for parameter estimation. Our approach relies on likelihood-based inference using an EM-type algorithm. The proposed method is illustrated through simulation studies and the analysis of an AIDS clinical trial dataset.

12.
Shared-frailty survival models specify that systematic unobserved determinants of duration outcomes are identical within groups of individuals. We consider random-effects likelihood-based statistical inference when the duration data are subject to left-truncation. Such inference with left-truncated data can be performed in previous versions of the Stata software package for parametric and semi-parametric shared frailty models. We show that with left-truncated data, these commands ignore the weeding-out process before the left-truncation points, which affects the distribution of unobserved determinants among the group members in the data, namely among those who survive until their truncation points. We critically examine studies in the statistical literature on this issue, as well as published empirical studies that use the commands. Simulations illustrate the size of the (asymptotic) bias and its dependence on the degree of truncation.

13.
Adjustment for covariates is a time-honored tool in statistical analysis and is often implemented by including the covariates one intends to adjust for as additional predictors in a model. This adjustment often does not work well when the underlying model is misspecified. We consider here the situation where a response is compared between two groups. This response may depend on a covariate whose distribution differs between the two groups one intends to compare, creating the potential that observed differences are due to differences in covariate levels rather than "genuine" population differences that cannot be explained by covariate differences. We propose a bootstrap-based adjustment method in which bootstrap weights are constructed with the aim of aligning the bootstrap-weighted empirical distributions of the covariate between the two groups. More generally, the proposed weighted-bootstrap algorithm can be used to align or match the values of an explanatory variable as closely as desired to those of a given target distribution. We illustrate the proposed bootstrap adjustment method in simulations and in the analysis of data on the fecundity of historical cohorts of French-Canadian women.
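The following sketch conveys the flavor of covariate alignment by weighted resampling; the crude histogram-based density-ratio weights are a stand-in assumption, not the weight construction used in the paper, and all variable names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two groups whose covariate distributions differ
x1 = rng.normal(0.0, 1.0, 400); y1 = 2.0 + 1.0 * x1 + rng.normal(size=400)
x2 = rng.normal(0.5, 1.0, 400); y2 = 2.0 + 1.0 * x2 + rng.normal(size=400)

# Crude density-ratio weights on a shared histogram grid, so that
# resampling group 1 mimics group 2's covariate distribution.
bins = np.linspace(min(x1.min(), x2.min()), max(x1.max(), x2.max()), 20)
h1, _ = np.histogram(x1, bins=bins, density=True)
h2, _ = np.histogram(x2, bins=bins, density=True)
idx = np.clip(np.digitize(x1, bins) - 1, 0, len(h1) - 1)
w = h2[idx] / np.maximum(h1[idx], 1e-12)
w /= w.sum()

# Weighted bootstrap: group difference after covariate alignment
B, diffs = 2000, np.empty(2000)
for b in range(B):
    i1 = rng.choice(len(x1), size=len(x1), p=w)   # aligned resample of group 1
    i2 = rng.choice(len(x2), size=len(x2))        # ordinary resample of group 2
    diffs[b] = y2[i2].mean() - y1[i1].mean()

print("adjusted difference:", diffs.mean(),
      "95% CI:", np.percentile(diffs, [2.5, 97.5]))
```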

14.
Using logarithmic and integral transformations, we transform a continuous-covariate frailty model into a polynomial regression model with a random effect. The responses of this mixed model can be 'estimated' via conditional hazard function estimation. The random error in this model does not have zero mean, and its variance is not constant in the covariate; consequently, these two quantities have to be estimated. Since the asymptotic expression for the bias is complicated, the two-large-bandwidth trick is proposed to estimate the bias. The proposed transformation is very useful for clustered incomplete data subject to left truncation and right censoring (and for complex clustered data in general). Indeed, in this case no standard software is available to fit the frailty model, whereas for the transformed model standard software for mixed models can be used to estimate the unknown parameters of the original frailty model. A small simulation study illustrates the good behavior of the proposed method, which is then applied to a bladder cancer data set.

15.
Nonparametric estimates of the conditional distribution of a response variable given a covariate are important for data exploration purposes. In this article, we propose a nonparametric estimator of the conditional distribution function in the case where the response variable is subject to interval censoring and double truncation. Using the approach of Dehghan and Duchesne (2011), the proposed method consists in adding weights that depend on the covariate value in the self-consistency equation of Turnbull (1976), which results in a nonparametric estimator. We demonstrate by simulation that the estimator, bootstrap variance estimation and bandwidth selection all perform well in finite samples.

16.
In this paper, we extend the structural probit measurement error model by considering that the unobserved covariate follows a skew-normal distribution. The new model is termed the structural skew-normal probit model. As in the normal case, the likelihood function is obtained analytically and can be maximized using existing statistical software. A Bayesian approach using Markov chain Monte Carlo techniques to generate from the posterior distributions is also developed. A simulation study demonstrates the usefulness of the approach in avoiding the attenuation that arises with the naive procedure, and suggests that the skew-normal probit model is more efficient than the structural probit model when the distribution of the covariate (predictor) is skew.

17.
In statistical modelling, it is often of interest to evaluate non-negative quantities that capture heterogeneity in the population, such as variances, mixing proportions and dispersion parameters. In instances of covariate-dependent heterogeneity, the implied homogeneity hypotheses are nonstandard and existing inferential techniques are not applicable. In this paper, we develop a quasi-score test statistic to evaluate homogeneity against heterogeneity that varies with a covariate profile through a regression model. We establish the limiting null distribution of the proposed test as a functional of mixtures of chi-square processes. The methodology does not require the full distribution of the data to be specified; instead, a general estimating function for the finite-dimensional component of the model that is of interest is assumed, while other characteristics of the population are left completely unspecified. We apply the methodology to evaluate the excess zero proportion in zero-inflated models for count data. Our numerical simulations show that the proposed test can greatly improve efficiency over tests of homogeneity that neglect covariate information under the alternative hypothesis. An empirical application to dental caries indices demonstrates the importance and practical utility of the methodology in detecting excess zeros in the data.
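In the zero-inflated count application, the competing models can be sketched as (the Poisson form of the count component is an assumption made here for concreteness)

$$\Pr(Y_i = 0 \mid x_i, z_i) = \pi_i + (1 - \pi_i)\,e^{-\lambda_i}, \qquad \lambda_i = \exp(x_i'\beta),$$

where homogeneity corresponds to a constant (or zero) excess-zero proportion $\pi_i$ and heterogeneity lets $\pi_i$ depend on the covariate profile $z_i$ through a regression model; because the null value lies on the boundary of the parameter space, the limiting distribution involves mixtures of chi-square processes rather than a standard chi-square.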

18.
In regression analyses of spatially structured data, it is common practice to introduce spatially correlated random effects into the regression model to reduce, or even avoid, unobserved-variable bias in the estimation of other covariate effects. If, besides the response, the covariates are also spatially correlated, the spatial effects may confound the effect of the covariates, or vice versa. In this case, the model fails to identify the true covariate effect due to multicollinearity. For highly collinear continuous covariates, path analysis and structural equation modeling techniques prove helpful in disentangling direct covariate effects from indirect covariate effects arising from correlation with other variables. This work discusses the applicability of these techniques in regression setups where spatial and covariate effects coincide at least partly and classical geoadditive models fail to separate these effects. Supplementary materials for this article are available online.

19.
We propose a method for estimating parameters in generalized linear models with missing covariates and a non-ignorable missing data mechanism. We use a multinomial model for the missing data indicators and propose a joint distribution for them which can be written as a sequence of one-dimensional conditional distributions, with each one-dimensional conditional distribution consisting of a logistic regression. We allow the covariates to be either categorical or continuous. The joint covariate distribution is also modelled via a sequence of one-dimensional conditional distributions, and the response variable is assumed to be completely observed. We derive the E- and M-steps of the EM algorithm with non-ignorable missing covariate data. For categorical covariates, we derive a closed-form expression for the E- and M-steps of the EM algorithm for obtaining the maximum likelihood estimates (MLEs). For continuous covariates, we use a Monte Carlo version of the EM algorithm to obtain the MLEs via the Gibbs sampler. Computational techniques for Gibbs sampling are proposed and implemented. The parametric form of the assumed missing data mechanism itself is not 'testable' from the data, and thus the non-ignorable modelling considered here can be viewed as a sensitivity analysis concerning a more complicated model. Therefore, although a model may have 'passed' the tests for a certain missing data mechanism, this does not mean that we have captured, even approximately, the correct missing data mechanism. Hence, model checking for the missing data mechanism and sensitivity analyses play an important role in this problem and are discussed in detail. Several simulations are given to demonstrate the methodology. In addition, a real data set from a melanoma cancer clinical trial is presented to illustrate the methods proposed.
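The factorization of the missingness model described above can be written as

$$\Pr(r_1,\ldots,r_p \mid x, y, \phi) = \prod_{j=1}^{p} \Pr(r_j \mid r_1,\ldots,r_{j-1}, x, y, \phi_j),$$

with each conditional specified as a logistic regression, e.g. $\operatorname{logit}\Pr(r_j = 1 \mid \cdot) = \phi_{j0} + \phi_{j1}'(r_1,\ldots,r_{j-1}) + \phi_{j2}'x + \phi_{j3}y$ (the exact linear predictor is a sketch, not the paper's specification); because $y$ enters the missingness model, the mechanism is non-ignorable.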

20.
In this article, we propose a flexible parametric (FP) approach for adjusting for covariate measurement errors in regression, which can accommodate replicated measurements on the surrogate (mismeasured) version of the unobserved true covariate for all study subjects, or for a sub-sample of the study subjects, as error assessment data. We utilize the general framework of the FP approach proposed by Hossain and Gustafson in 2009 for adjusting for covariate measurement errors in regression. The FP approach is first compared with existing non-parametric approaches when error assessment data are available on the entire sample of study subjects (complete error assessment data), considering covariate measurement error in a multiple logistic regression model. We then develop the FP approach for the case where error assessment data are available on a sub-sample of the study subjects (partial error assessment data) and investigate its performance using both simulated and real-life data. Simulation results reveal that, in comparable situations, the FP approach performs as well as or better than the competing non-parametric approaches in eliminating the bias that arises in the estimated regression parameters due to covariate measurement errors, and it yields more efficient parameter estimates. Finally, the FP approach is found to perform adequately well in terms of bias correction, confidence coverage, and achieving appropriate statistical power under partial error assessment data.
