Similar literature
Found 20 similar documents (search time: 31 ms)
1.
Variable selection is an important issue in all regression analysis, and in this paper we discuss it in the context of regression analysis of recurrent event data. Recurrent event data often occur in long-term studies in which individuals may experience the events of interest more than once, and their analysis has recently attracted a great deal of attention (Andersen et al., Statistical models based on counting processes, 1993; Cook and Lawless, Biometrics 52:1311–1323, 1996, The analysis of recurrent event data, 2007; Cook et al., Biometrics 52:557–571, 1996; Lawless and Nadeau, Technometrics 37:158-168, 1995; Lin et al., J R Stat Soc B 69:711–730, 2000). However, there appear to be no established approaches to variable selection for recurrent event data. To address this problem, we adopt the idea behind the nonconcave penalized likelihood approach proposed in Fan and Li (J Am Stat Assoc 96:1348–1360, 2001) and develop a nonconcave penalized estimating function approach. The proposed approach selects variables and estimates regression coefficients simultaneously, and an algorithm is presented for this process. We show that the proposed approach performs as well as the oracle procedure in that it yields the estimates as if the correct submodel were known. Simulation studies are conducted to assess the performance of the proposed approach and suggest that it works well in practical situations. The proposed methodology is illustrated using data from a chronic granulomatous disease study.

2.
Asymptotic theory for the Cox semi-Markov illness-death model (total citations: 1; self-citations: 1; citations by others: 0)
Irreversible illness-death models are used to model disease processes and in cancer studies to model disease recovery. In most applications, a Markov model is assumed for the multistate model. When there are covariates, a Cox (1972, J Roy Stat Soc Ser B 34:187–220) model is used to model the effect of covariates on each transition intensity. Andersen et al. (2000, Stat Med 19:587–599) proposed a Cox semi-Markov model for this problem. In this paper, we study the large sample theory for that model and provide the asymptotic variances of various probabilities of interest. A Monte Carlo study is conducted to investigate the robustness and efficiency of Markov/Semi-Markov estimators. A real data example from the PROVA (1991, Hepatology 14:1016–1024) trial is used to illustrate the theory.

3.
Donor lymphocyte infusion (DLI) for patients who relapse following an allogeneic stem cell transplant has proved remarkably durable. Because of the potential for second remissions with DLI, the current leukemia free survival (CLFS), which is the probability that a patient has not failed the entire course of the treatment, is becoming of interest to clinical investigators. Based on either a multistate Markov model or a linear combination of Kaplan–Meier estimators, we explore regression models for the CLFS. We focus on the two sample problem and we develop confidence bands for the CLFS or for differences in CLFS as well as a Kolmogorov type hypothesis test using a re-sampling technique. We also examine the use of pseudo-values to make inference on the direct effects of covariates on the CLFS function and we develop a score test for the equality of two CLFS. We illustrate these inference methods on a bone marrow transplant dataset.

4.
The second-order least-squares estimator (SLSE) was proposed by Wang (Statistica Sinica 13:1201–1210, 2003) for measurement error models. It was extended and applied to linear and nonlinear regression models by Abarin and Wang (Far East J Theor Stat 20:179–196, 2006) and Wang and Leblanc (Ann Inst Stat Math 60:883–900, 2008). The SLSE is asymptotically more efficient than the ordinary least-squares estimator if the error distribution has a nonzero third moment. However, it lacks robustness against outliers in the data. In this paper, we propose a robust second-order least squares estimator (RSLSE) against X-outliers. The RSLSE is highly efficient with high breakdown point and is asymptotically normally distributed. We compare the RSLSE with other estimators through a simulation study. Our results show that the RSLSE performs very well.

5.
In this paper we discuss inference for skew-normal nonlinear regression models from both a classical and a Bayesian perspective, extending the usual normal nonlinear regression models. The univariate skew-normal distribution used in this work was introduced by Sahu et al. (Can J Stat 29:129–150, 2003); it is attractive because estimation of the skewness parameter does not present the same degree of difficulty as with the Azzalini (Scand J Stat 12:171–178, 1985) distribution and, moreover, it allows easy implementation of the EM algorithm. As an illustration of the proposed methodology, we consider a data set previously analyzed in the literature under normality.

6.
Breslow and Holubkov (J Roy Stat Soc B 59:447–461, 1997a) developed semiparametric maximum likelihood estimation for two-phase studies with a case–control first phase under a logistic regression model and noted that, apart from the overall intercept term, it was the same as the semiparametric estimator for two-phase studies with a prospective first phase developed in Scott and Wild (Biometrika 84:57–71, 1997). In this paper we extend the Breslow–Holubkov result to general binary regression models and show that it has a very simple relationship with its prospective first-phase counterpart. We also explore why the design of the first phase only affects the intercept of a logistic model, simplify the calculation of standard errors, establish the semiparametric efficiency of the Breslow–Holubkov estimator and derive its asymptotic distribution in the general case.

7.
The goal of this paper is to introduce a partially adaptive estimator for the censored regression model based on an error structure described by a mixture of two normal distributions. The model we introduce is easily estimated by maximum likelihood using an EM algorithm adapted from the work of Bartolucci and Scaccia (Comput Stat Data Anal 48:821–834, 2005). A Monte Carlo study is conducted to compare the small sample properties of this estimator to the performance of some common alternative estimators of censored regression models including the usual tobit model, the CLAD estimator of Powell (J Econom 25:303–325, 1984), and the STLS estimator of Powell (Econometrica 54:1435–1460, 1986). In terms of RMSE, our partially adaptive estimator performed well. The partially adaptive estimator is applied to data on wife’s hours worked from Mroz (1987). In this application we find support for the partially adaptive estimator over the usual tobit model.

8.
Regression analysis for competing risks data can be based on generalized estimating equations. For the case with right censored data, pseudo-values were proposed to solve the estimating equations. In this article we investigate robustness of the pseudo-values against violation of the assumption that the probability of not being lost to follow-up (un-censored) is independent of the covariates. Modified pseudo-values are proposed which rely on a correctly specified regression model for the censoring times. Bias and efficiency of these methods are compared in a simulation study. Further illustration of the differences is obtained in an application to bone marrow transplantation data and a corresponding sensitivity analysis.

9.
Typically, regression analysis for multistate models has been based on regression models for the transition intensities. These models lead to highly nonlinear and very complex models for the effects of covariates on state occupation probabilities. We present a technique that models the state occupation or transition probabilities in a multistate model directly. The method is based on the pseudo-values from a jackknife statistic constructed from non-parametric estimators for the probability in question. These pseudo-values are used as outcome variables in a generalized estimating equation to obtain estimates of model parameters. We examine this approach and its properties in detail for two special multistate model probabilities, the cumulative incidence function in competing risks and the current leukaemia-free survival used in bone marrow transplants. The latter is the probability a patient is alive and in either a first or second post-transplant remission. The techniques are illustrated on a dataset of leukaemia patients given a marrow transplant. We also discuss extensions of the model that are of current research interest.
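
As a rough illustration of the jackknife pseudo-value idea, the sketch below computes pseudo-values for a survival probability S(t) from the Kaplan–Meier estimator, using the usual jackknife form n·θ̂ − (n − 1)·θ̂(−i). It is a minimal sketch and not the authors' implementation: the function names `kaplan_meier` and `pseudo_values`, the choice of the survival probability as the target quantity, and the toy data are all assumptions made for illustration.

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t) = P(T > t).

    times  -- observed times (event or censoring)
    events -- 1 if the time is an observed event, 0 if censored
    """
    surv = 1.0
    event_times = sorted({ti for ti, ei in zip(times, events) if ei and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, ei in zip(times, events) if ei and ti == u)
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_values(times, events, t):
    """Jackknife pseudo-values n*theta_hat - (n-1)*theta_hat_(-i) for S(t)."""
    n = len(times)
    full = kaplan_meier(times, events, t)
    values = []
    for i in range(n):
        # Leave observation i out and re-estimate S(t).
        loo_times = times[:i] + times[i + 1:]
        loo_events = events[:i] + events[i + 1:]
        values.append(n * full - (n - 1) * kaplan_meier(loo_times, loo_events, t))
    return values


if __name__ == "__main__":
    # Hypothetical toy data; with censoring the pseudo-values are no longer 0/1.
    toy_times = [1, 2, 3, 4, 5]
    toy_events = [1, 0, 1, 1, 0]
    print(pseudo_values(toy_times, toy_events, 2.5))
```

When no observation is censored, each pseudo-value reduces to the indicator that the subject is still event-free at time t. In the approach described above, the pseudo-values would then serve as outcome variables in a generalized estimating equation; that regression step is not shown here.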

10.
An alternative stochastic restricted Liu estimator in linear regression (total citations: 2; self-citations: 1; citations by others: 1)
In this paper, we introduce an alternative stochastic restricted Liu estimator for the vector of parameters in a linear regression model when additional stochastic linear restrictions on the parameter vector are assumed to hold. The new estimator is a generalization of the ordinary mixed estimator (OME) (Durbin in J Am Stat Assoc 48:799–808, 1953; Theil and Goldberger in Int Econ Rev 2:65–78, 1961; Theil in J Am Stat Assoc 58:401–414, 1963) and the Liu estimator proposed by Liu (Commun Stat Theory Methods 22:393–402, 1993). Necessary and sufficient conditions for the superiority of the new stochastic restricted Liu estimator over the OME, the Liu estimator and the estimator proposed by Hubert and Wijekoon (Stat Pap 47:471–479, 2006) in the mean squared error matrix (MSEM) sense are derived. Furthermore, a numerical example based on the widely analysed dataset on Portland cement (Woods et al. in Ind Eng Chem 24:1207–1241, 1932) and a Monte Carlo evaluation of the estimators are also given to illustrate some of the theoretical results.

11.
Sasabuchi et al. (Biometrika 70(2):465–472, 1983) introduced a multivariate version of the well-known univariate isotonic regression, which plays a key role in the field of statistical inference under order restrictions. Their proposed algorithm for computing the multivariate isotonic regression, however, is guaranteed to converge only under special conditions (Sasabuchi et al., J Stat Comput Simul 73(9):619–641, 2003). In this paper, a more general framework for multivariate isotonic regression is given and an algorithm based on Dykstra’s method is used to compute the multivariate isotonic regression. Two numerical examples are given to illustrate the algorithm and to compare the result with the one published by Fernando and Kulatunga (Comput Stat Data Anal 52:702–712, 2007).
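
For orientation, the univariate isotonic regression that this work generalizes can be computed with the pool-adjacent-violators algorithm. The sketch below is only that univariate baseline, written under the assumption of a simple increasing order and squared-error loss; it is not the multivariate Dykstra-based algorithm of the paper, and the function name `pava` is hypothetical.

```python
def pava(y, w=None):
    """Weighted least-squares fit of a non-decreasing sequence to y
    using the pool-adjacent-violators algorithm."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []  # each block is [weighted mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool neighbouring blocks while they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


if __name__ == "__main__":
    print(pava([1, 3, 2, 4]))
```

Each pooled block is replaced by its weighted mean, so the fitted values are constant on every run of observations that violated the order.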

12.
Shi, Wang, Murray-Smith and Titterington (Biometrics 63:714–723, 2007) proposed a Gaussian process functional regression (GPFR) model to model functional response curves with a set of functional covariates. Two main problems are addressed by their method: modelling nonlinear and nonparametric regression relationships, and modelling the covariance structure and mean structure simultaneously. The method gives very good results for curve fitting and prediction but side-steps the problem of heterogeneity. In this paper we present a new method for modelling functional data with ‘spatially’ indexed data, i.e., the heterogeneity is dependent on factors such as region and individual patient’s information. For data collected from different sources, we assume that the data corresponding to each curve (or batch) follows a Gaussian process functional regression model as a lower-level model, and introduce an allocation model for the latent indicator variables as a higher-level model. This higher-level model is dependent on the information related to each batch. This method takes advantage of both GPFR and mixture models and therefore improves the accuracy of predictions. The mixture model has also been used for curve clustering, but here the focus is on the problem of clustering functional relationships between response curve and covariates, i.e. the clustering is based on the surface shape of the functional response against the set of functional covariates. The model is examined on simulated data and real data.

13.
We develop a Bayesian analysis for the class of Birnbaum–Saunders nonlinear regression models introduced by Lemonte and Cordeiro (Comput Stat Data Anal 53:4441–4452, 2009). This regression model, which is based on the Birnbaum–Saunders distribution (Birnbaum and Saunders in J Appl Probab 6:319–327, 1969a), has been used successfully to model fatigue failure times. We consider a Bayesian analysis under a normal-gamma prior. Due to the complexity of the model, Markov chain Monte Carlo methods are used to develop a Bayesian procedure for the considered model. We describe tools for model determination, which include the conditional predictive ordinate, the logarithm of the pseudo-marginal likelihood and the pseudo-Bayes factor. Additionally, case deletion influence diagnostics are developed for the joint posterior distribution based on the Kullback–Leibler divergence. Two empirical applications are considered in order to illustrate the developed procedures.

14.
In this paper we consider different approaches for estimation and assessment of covariate effects for the cumulative incidence curve in the competing risks model. The classic approach is to model all cause-specific hazards and then estimate the cumulative incidence curve based on these cause-specific hazards. Another recent approach is to directly model the cumulative incidence by a proportional model (Fine and Gray, J Am Stat Assoc 94:496–509, 1999), and then obtain direct estimates of how covariates influence the cumulative incidence curve. We consider a simple and flexible class of regression models that is easy to fit and contains the Fine–Gray model as a special case. One advantage of this approach is that our regression modeling allows for non-proportional hazards. This leads to a new simple goodness-of-fit procedure for the proportional subdistribution hazards assumption that is very easy to use. The test is constructive in the sense that it shows exactly where non-proportionality is present. We illustrate our methods on bone marrow transplant data from the Center for International Blood and Marrow Transplant Research (CIBMTR). Through this data example we demonstrate the use of the flexible regression models to analyze competing risks data when non-proportionality is present in the data.

15.
Scale mixtures of normal distributions form a class of symmetric thick-tailed distributions that includes the normal one as a special case. In this paper we consider local influence analysis for measurement error models (MEM) when the random error and the unobserved value of the covariates jointly follow scale mixtures of normal distributions, providing an appealing robust alternative to the usual Gaussian process in measurement error models. In order to avoid difficulties in estimating the parameter of the mixing variable, we fix it in advance, as recommended by Lange et al. (J Am Stat Assoc 84:881–896, 1989) and Berkane et al. (Comput Stat Data Anal 18:255–267, 1994). The local influence method is used to assess the robustness aspects of the parameter estimates under some usual perturbation schemes. However, as the observed log-likelihood associated with this model involves some integrals, Cook’s well-known approach may be hard to apply to obtain measures of local influence. Instead, we develop local influence measures following the approach of Zhu and Lee (J R Stat Soc Ser B 63:121–126, 2001), which is based on the EM algorithm. Results obtained from a real data set are reported, illustrating the usefulness of the proposed methodology, its relative simplicity, adaptability and practical usage.

16.
Modelling count data with overdispersion and spatial effects (total citations: 1; self-citations: 1; citations by others: 0)
In this paper we consider regression models for count data allowing for overdispersion in a Bayesian framework. We account for unobserved heterogeneity in the data in two ways. On the one hand, we consider more flexible models than a common Poisson model allowing for overdispersion in different ways. In particular, the negative binomial and the generalized Poisson (GP) distribution are addressed where overdispersion is modelled by an additional model parameter. Further, zero-inflated models in which overdispersion is assumed to be caused by an excessive number of zeros are discussed. On the other hand, extra spatial variability in the data is taken into account by adding correlated spatial random effects to the models. This approach allows for an underlying spatial dependency structure which is modelled using a conditional autoregressive prior based on Pettitt et al. (Stat Comput 12(4):353–367, 2002). In an application the presented models are used to analyse the number of invasive meningococcal disease cases in Germany in the year 2004. Models are compared according to the deviance information criterion (DIC) suggested by Spiegelhalter et al. (J R Stat Soc B 64(4):583–640, 2002) and using proper scoring rules; see for example Gneiting and Raftery (Technical Report no. 463, University of Washington, 2004). We observe a rather high degree of overdispersion in the data which is captured best by the GP model when spatial effects are neglected. While the addition of spatial effects to the models allowing for overdispersion gives little or no improvement, spatial Poisson models with spatially correlated or uncorrelated random effects are to be preferred over all other models according to the considered criteria.

17.
This article describes a convenient method of selecting Metropolis–Hastings proposal distributions for multinomial logit models. There are two key ideas involved. The first is that multinomial logit models have a latent variable representation similar to that exploited by Albert and Chib (J Am Stat Assoc 88:669–679, 1993) for probit regression. Augmenting the latent variables replaces the multinomial logit likelihood function with the complete data likelihood for a linear model with extreme value errors. While no conjugate prior is available for this model, a least squares estimate of the parameters is easily obtained. The asymptotic sampling distribution of the least squares estimate is Gaussian with known variance. The second key idea in this paper is to generate a Metropolis–Hastings proposal distribution by conditioning on the estimator instead of the full data set. The resulting sampler has many of the benefits of so-called tailored or approximation Metropolis–Hastings samplers. However, because the proposal distributions are available in closed form they can be implemented without numerical methods for exploring the posterior distribution. The algorithm converges geometrically ergodically, its computational burden is minor, and it requires minimal user input. Improvements to the sampler’s mixing rate are investigated. The algorithm is also applied to partial credit models describing ordinal item response data from the 1998 National Assessment of Educational Progress. Its application to hierarchical models and Poisson regression is briefly discussed.
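
To make the role of the proposal distribution concrete, here is a minimal generic Metropolis–Hastings sketch with a symmetric Gaussian random-walk proposal. It is deliberately not the tailored, closed-form proposal described in the abstract: the function name `metropolis_hastings`, the random-walk proposal, the step size and the standard normal target in the usage lines are all assumptions chosen for illustration.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a one-dimensional target.

    log_target -- function returning the log of an unnormalised density
    step       -- standard deviation of the Gaussian proposal (assumed tuning value)
    """
    rng = random.Random(seed)
    x = x0
    logp = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_target(proposal)
        # The proposal is symmetric, so the acceptance probability
        # is min(1, target(proposal) / target(x)).
        accept_prob = math.exp(min(0.0, logp_new - logp))
        if rng.random() < accept_prob:
            x, logp = proposal, logp_new
        samples.append(x)
    return samples


if __name__ == "__main__":
    # Hypothetical target: standard normal, known only up to a constant.
    draws = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000)
    print(sum(draws) / len(draws))
```

A tailored proposal of the kind the article describes would replace the random-walk step with draws from an approximation to the posterior, in which case the acceptance ratio must also include the ratio of proposal densities.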

18.
Quantile regression, including median regression, is a more complete statistical model than mean regression and is now well known for its widespread applications. Bayesian inference on quantile regression, or Bayesian quantile regression, has attracted much interest recently. Most of the existing research in Bayesian quantile regression focuses on parametric quantile regression, though there are discussions on different ways of modeling the model error by a parametric distribution named the asymmetric Laplace distribution or by a nonparametric alternative named the scale mixture asymmetric Laplace distribution. This paper discusses Bayesian inference for nonparametric quantile regression. This general approach fits quantile regression curves using piecewise polynomial functions with an unknown number of knots at unknown locations, all treated as parameters to be inferred through the reversible jump Markov chain Monte Carlo (RJMCMC) of Green (Biometrika 82:711–732, 1995). Instead of drawing samples from the posterior, we use regression quantiles to create Markov chains for the estimation of the quantile curves. We also use an approximate Bayes factor in the inference. This method extends the work in automatic Bayesian mean curve fitting to quantile regression. Numerical results show that this Bayesian quantile smoothing technique is competitive with the quantile regression/smoothing splines of He and Ng (Comput. Stat. 14:315–337, 1999) and the P-splines (penalized splines) of Eilers and de Menezes (Bioinformatics 21(7):1146–1153, 2005).
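
As background for why quantile regression differs from mean regression, the sketch below uses the standard check (pinball) loss ρ_τ(u) = u·(τ − I(u < 0)): the value minimising the summed check loss over a sample is a sample τ-quantile, just as the mean minimises squared error. This is only the intercept-only case, not the RJMCMC spline method of the paper, and the function names `check_loss` and `sample_quantile` are hypothetical.

```python
def check_loss(u, tau):
    """Check (pinball) loss: tau*u for u >= 0 and (tau - 1)*u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def sample_quantile(data, tau):
    """Return the data point that minimises the total check loss.

    Searching over the observed values is enough because the objective
    is piecewise linear with kinks only at the data points.
    """
    return min(sorted(data),
               key=lambda q: sum(check_loss(y - q, tau) for y in data))


if __name__ == "__main__":
    values = [5, 1, 4, 2, 3]
    print(sample_quantile(values, 0.5))
```

In a regression setting the constant q is replaced by a function of the covariates, and the same loss is minimised over the parameters of that function.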

19.
This paper considers the problem of testing two-sided hypotheses for variance components in mixed linear models. When the uniformly most powerful invariant test does not exist (see e.g. Das and Sinha, in Proceedings of the second international Tampere conference in statistics, 1987; Gnot and Michalski, in Statistics 25:213–223, 1994; Michalski and Zmyślony, in Statistics 27:297–310, 1996), it is desirable to construct a test with locally best properties in order to conduct optimal statistical inference on the model parameters, cf. Michalski (in Tatra Mountains Mathematical Publications 26:1–21, 2003). The main goal of this article is the construction of the locally best invariant unbiased (LBIU) test for a single variance component (or for a ratio of variance components). The result is obtained using Andersson’s and Wijsman’s approach, which is based on a representation of the density function of the maximal invariant (Andersson, in Ann Stat 10:955–961, 1982; Wijsman, in Proceedings of fifth Berk Symp Math Statist Prob 1:389–400, 1967; Wijsman, in Sankhyā A 48:1–42, 1986; Khuri et al., in Statistical tests for mixed linear models, 1998), together with the generalized Neyman–Pearson Lemma (Dantzig and Wald, in Ann Math Stat 22:87–93, 1951; Rao, in Linear statistical inference and its applications, 1973). One selected real example of an unbalanced mixed linear model is given, for which the power functions of the LBIU test and Wald’s test (the F-test in the ANOVA model) are computed and compared with the attainable upper bound of power obtained by using the Neyman–Pearson Lemma.

20.
Recurrent event data occur in many clinical and observational studies (Cook and Lawless, Analysis of recurrent event data, 2007) and in these situations, there may exist a terminal event such as death that is related to the recurrent event of interest (Ghosh and Lin, Biometrics 56:554–562, 2000; Wang et al., J Am Stat Assoc 96:1057–1065, 2001; Huang and Wang, J Am Stat Assoc 99:1153–1165, 2004; Ye et al., Biometrics 63:78–87, 2007). In addition, sometimes there may exist more than one type of recurrent event, that is, one faces multivariate recurrent event data with some dependent terminal event (Chen and Cook, Biostatistics 5:129–143, 2004). It is apparent that for the analysis of such data, one has to take into account the dependence both among different types of recurrent events and between the recurrent and terminal events. In this paper, we propose a joint modeling approach for regression analysis of the data, and both finite-sample and asymptotic properties of the resulting estimates of unknown parameters are established. The methodology is applied to a set of bivariate recurrent event data arising from a study of leukemia patients.
