Found 20 similar articles; search took 31 ms
1.
Variable selection is an important issue in all regression analysis and in this paper, we discuss this in the context of regression
analysis of recurrent event data. Recurrent event data often occur in long-term studies in which individuals may experience
the events of interest more than once and their analysis has recently attracted a great deal of attention (Andersen et al.,
Statistical models based on counting processes, 1993; Cook and Lawless, Biometrics 52:1311–1323, 1996, The analysis of recurrent
event data, 2007; Cook et al., Biometrics 52:557–571, 1996; Lawless and Nadeau, Technometrics 37:158-168, 1995; Lin et al.,
J R Stat Soc B 69:711–730, 2000). However, there appear to be no established approaches to variable selection for
recurrent event data. To address this problem, we adopt the idea behind the nonconcave penalized likelihood approach proposed
in Fan and Li (J Am Stat Assoc 96:1348–1360, 2001) and develop a nonconcave penalized estimating function approach. The proposed
approach selects variables and estimates regression coefficients simultaneously, and an algorithm is presented for this process.
We show that the proposed approach performs as well as the oracle procedure in that it yields the estimates as if the correct
submodel were known. Simulation studies are conducted to assess the performance of the proposed approach and suggest that
it works well in practical situations. The proposed methodology is illustrated using data from a chronic granulomatous
disease study.
2.
Asymptotic theory for the Cox semi-Markov illness-death model (total citations: 1; self-citations: 1; citations by others: 0)
Irreversible illness-death models are used to model disease processes and in cancer studies to model disease recovery. In
most applications, a Markov model is assumed for the multistate model. When there are covariates, a Cox (1972, J Roy Stat
Soc Ser B 34:187–220) model is used to model the effect of covariates on each transition intensity. Andersen et al. (2000,
Stat Med 19:587–599) proposed a Cox semi-Markov model for this problem. In this paper, we study the large sample theory for
that model and provide the asymptotic variances of various probabilities of interest. A Monte Carlo study is conducted to
investigate the robustness and efficiency of Markov/Semi-Markov estimators. A real data example from the PROVA (1991, Hepatology
14:1016–1024) trial is used to illustrate the theory.
3.
Donor lymphocyte infusion (DLI) for patients who relapse following an allogeneic stem cell transplant has proved remarkably
durable. Because of the potential for second remissions with DLI, the current leukemia free survival (CLFS), which is the
probability that a patient has not failed the entire course of the treatment, is becoming of interest to clinical investigators.
Based on either a multistate Markov model or a linear combination of Kaplan–Meier estimators, we explore regression models
for the CLFS. We focus on the two sample problem and we develop confidence bands for the CLFS or for differences in CLFS as
well as a Kolmogorov type hypothesis test using a re-sampling technique. We also examine the use of pseudo-values to make
inference on the direct effects of covariates on the CLFS function and we develop a score test for the equality of two CLFS.
We illustrate these inference methods on a bone marrow transplant dataset.
4.
The second-order least-squares estimator (SLSE) was proposed by Wang (Statistica Sinica 13:1201–1210, 2003) for measurement
error models. It was extended and applied to linear and nonlinear regression models by Abarin and Wang (Far East J Theor Stat
20:179–196, 2006) and Wang and Leblanc (Ann Inst Stat Math 60:883–900, 2008). The SLSE is asymptotically more efficient than
the ordinary least-squares estimator if the error distribution has a nonzero third moment. However, it lacks robustness against
outliers in the data. In this paper, we propose a robust second-order least squares estimator (RSLSE) against X-outliers. The RSLSE is highly efficient with a high breakdown point and is asymptotically normally distributed. We compare
the RSLSE with other estimators through a simulation study. Our results show that the RSLSE performs very well.
5.
In this paper we discuss inference for skew-normal nonlinear regression models following both a classical
and a Bayesian approach, extending the usual normal nonlinear regression models. The univariate skew-normal distribution
used in this work was introduced by Sahu et al. (Can J Stat 29:129–150, 2003), which is attractive because estimation
of the skewness parameter does not present the same degree of difficulty as in the case of the Azzalini (Scand J Stat 12:171–178,
1985) distribution and, moreover, it allows easy implementation of the EM algorithm. As an illustration of the proposed methodology, we
consider a data set previously analyzed in the literature under normality.
6.
Breslow and Holubkov (J Roy Stat Soc B 59:447–461 1997a) developed semiparametric maximum likelihood estimation for two-phase
studies with a case–control first phase under a logistic regression model and noted that, apart from the overall intercept
term, it was the same as the semiparametric estimator for two-phase studies with a prospective first phase developed in Scott
and Wild (Biometrika 84:57–71 1997). In this paper we extend the Breslow–Holubkov result to general binary regression models
and show that it has a very simple relationship with its prospective first-phase counterpart. We also explore why the design
of the first phase only affects the intercept of a logistic model, simplify the calculation of standard errors, establish
the semiparametric efficiency of the Breslow–Holubkov estimator and derive its asymptotic distribution in the general case.
7.
A partially adaptive estimator for the censored regression model based on a mixture of normal distributions (total citations: 1; self-citations: 0; citations by others: 1)
Steven B. Caudill 《Statistical Methods and Applications》2012,21(2):121-137
The goal of this paper is to introduce a partially adaptive estimator for the censored regression model based on an error
structure described by a mixture of two normal distributions. The model we introduce is easily estimated by maximum likelihood
using an EM algorithm adapted from the work of Bartolucci and Scaccia (Comput Stat Data Anal 48:821–834, 2005). A Monte Carlo study is conducted to compare the small sample properties of this estimator to the performance of some common
alternative estimators of censored regression models including the usual tobit model, the CLAD estimator of Powell (J Econom
25:303–325, 1984), and the STLS estimator of Powell (Econometrica 54:1435–1460, 1986). In terms of RMSE, our partially adaptive estimator performed well. The partially adaptive estimator is applied to data
on wife’s hours worked from Mroz (1987). In this application we find support for the partially adaptive estimator over the usual tobit model.
8.
Regression analysis for competing risks data can be based on generalized estimating equations. For the case with right censored data, pseudo-values were proposed to solve the estimating equations. In this article we investigate robustness of the pseudo-values against violation of the assumption that the probability of not being lost to follow-up (uncensored) is independent of the covariates. Modified pseudo-values are proposed which rely on a correctly specified regression model for the censoring times. Bias and efficiency of these methods are compared in a simulation study. Further illustration of the differences is obtained in an application to bone marrow transplantation data and a corresponding sensitivity analysis.
9.
Regression Analysis for Multistate Models Based on a Pseudo-value Approach, with Applications to Bone Marrow Transplantation Studies (total citations: 4; self-citations: 0; citations by others: 4)
Abstract. Typically, regression analysis for multistate models has been based on regression models for the transition intensities. These models lead to highly nonlinear and very complex models for the effects of covariates on state occupation probabilities. We present a technique that models the state occupation or transition probabilities in a multistate model directly. The method is based on the pseudo-values from a jackknife statistic constructed from non-parametric estimators for the probability in question. These pseudo-values are used as outcome variables in a generalized estimating equation to obtain estimates of model parameters. We examine this approach and its properties in detail for two special multistate model probabilities, the cumulative incidence function in competing risks and the current leukaemia-free survival used in bone marrow transplants. The latter is the probability a patient is alive and in either a first or second post-transplant remission. The techniques are illustrated on a dataset of leukaemia patients given a marrow transplant. We also discuss extensions of the model that are of current research interest.
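As a rough illustration of the pseudo-value idea described in the abstract above, the following Python sketch computes jackknife pseudo-values for a survival probability at a fixed time. It is a minimal sketch under stated assumptions, not the authors' implementation: the Kaplan-Meier estimator stands in for the non-parametric estimator of the probability in question, the function names `kaplan_meier` and `pseudo_values` are hypothetical, and the generalized estimating equation step that would use the pseudo-values as outcomes is omitted.

```python
from typing import List, Sequence


def kaplan_meier(times: Sequence[float], events: Sequence[int], t: float) -> float:
    """Kaplan-Meier estimate of the survival probability S(t).

    events[i] is 1 for an observed event and 0 for a censored time.
    """
    surv = 1.0
    # Visit each distinct observed event time up to t, in increasing order.
    for u in sorted({x for x, d in zip(times, events) if d == 1 and x <= t}):
        at_risk = sum(1 for x in times if x >= u)
        deaths = sum(1 for x, d in zip(times, events) if x == u and d == 1)
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_values(times: Sequence[float], events: Sequence[int], t: float) -> List[float]:
    """Jackknife pseudo-values: n * theta_hat - (n - 1) * theta_hat_without_i."""
    n = len(times)
    full = kaplan_meier(times, events, t)
    values = []
    for i in range(n):
        # Recompute the estimator with subject i left out.
        loo_times = [x for j, x in enumerate(times) if j != i]
        loo_events = [d for j, d in enumerate(events) if j != i]
        values.append(n * full - (n - 1) * kaplan_meier(loo_times, loo_events, t))
    return values
```

Without censoring, each pseudo-value reduces to the indicator that the subject is still event-free at time t, which gives a convenient sanity check before the values are passed to a regression routine.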
10.
In this paper, we introduce an alternative stochastic restricted Liu estimator for the vector of parameters in a linear regression
model when additional stochastic linear restrictions on the parameter vector are assumed to hold. The new estimator is a generalization
of the ordinary mixed estimator (OME) (Durbin in J Am Stat Assoc 48:799–808, 1953; Theil and Goldberger in Int Econ Rev 2:65–78,
1961; Theil in J Am Stat Assoc 58:401–414, 1963) and Liu estimator proposed by Liu (Commun Stat Theory Methods 22:393–402,
1993). Necessary and sufficient conditions for the superiority of the new stochastic restricted Liu estimator over the OME,
the Liu estimator and the estimator proposed by Hubert and Wijekoon (Stat Pap 47:471–479, 2006) in the mean squared error
matrix (MSEM) sense are derived. Furthermore, a numerical example based on the widely analysed dataset on Portland cement
(Woods et al. in Ind Eng Chem 24:1207–1241, 1932) and a Monte Carlo evaluation of the estimators are also given to illustrate
some of the theoretical results.
11.
Sasabuchi et al. (Biometrika 70(2):465–472, 1983) introduced a multivariate version of the well-known univariate isotonic
regression which plays a key role in the field of statistical inference under order restrictions. Their proposed algorithm for
computing the multivariate isotonic regression, however, is guaranteed to converge only under special conditions (Sasabuchi
et al., J Stat Comput Simul 73(9):619–641, 2003). In this paper, a more general framework for multivariate isotonic regression
is given and an algorithm based on Dykstra’s method is used to compute the multivariate isotonic regression. Two numerical
examples are given to illustrate the algorithm and to compare the result with the one published by Fernando and Kulatunga
(Comput Stat Data Anal 52:702–712, 2007).
12.
Shi, Wang, Murray-Smith and Titterington (Biometrics 63:714–723, 2007) proposed a Gaussian process functional regression (GPFR)
model to model functional response curves with a set of functional covariates. Two main problems are addressed by their method:
modelling nonlinear and nonparametric regression relationship and modelling covariance structure and mean structure simultaneously.
The method gives very good results for curve fitting and prediction but side-steps the problem of heterogeneity. In this paper
we present a new method for modelling functional data with ‘spatially’ indexed data, i.e., the heterogeneity is dependent
on factors such as region and individual patient’s information. For data collected from different sources, we assume that
the data corresponding to each curve (or batch) follows a Gaussian process functional regression model as a lower-level model,
and introduce an allocation model for the latent indicator variables as a higher-level model. This higher-level model is dependent
on the information related to each batch. This method takes advantage of both GPFR and mixture models and therefore improves
the accuracy of predictions. The mixture model has also been used for curve clustering, but focusing on the problem of clustering
functional relationships between response curve and covariates, i.e. the clustering is based on the surface shape of the functional
response against the set of functional covariates. The model is examined on simulated data and real data.
13.
We develop a Bayesian analysis for the class of Birnbaum–Saunders nonlinear regression models introduced by Lemonte and Cordeiro
(Comput Stat Data Anal 53:4441–4452, 2009). This regression model, which is based on the Birnbaum–Saunders distribution (Birnbaum and Saunders in J Appl Probab 6:319–327,
1969a), has been used successfully to model fatigue failure times. We have considered a Bayesian analysis under a normal-gamma
prior. Due to the complexity of the model, Markov chain Monte Carlo methods are used to develop a Bayesian procedure for the
considered model. We describe tools for model determination, which include the conditional predictive ordinate, the logarithm
of the pseudo-marginal likelihood and the pseudo-Bayes factor. Additionally, case deletion influence diagnostics are developed
for the joint posterior distribution based on the Kullback–Leibler divergence. Two empirical applications are considered in
order to illustrate the developed procedures.
14.
In this paper we consider different approaches for estimation and assessment of covariate effects for the cumulative incidence
curve in the competing risks model. The classic approach is to model all cause-specific hazards and then estimate the cumulative
incidence curve based on these cause-specific hazards. Another recent approach is to directly model the cumulative incidence
by a proportional model (Fine and Gray, J Am Stat Assoc 94:496–509, 1999), and then obtain direct estimates of how covariates
influence the cumulative incidence curve. We consider a simple and flexible class of regression models that is easy to fit
and contains the Fine–Gray model as a special case. One advantage of this approach is that our regression modeling allows
for non-proportional hazards. This leads to a new simple goodness-of-fit procedure for the proportional subdistribution hazards
assumption that is very easy to use. The test is constructive in the sense that it shows exactly where non-proportionality
is present. We illustrate our methods on bone marrow transplant data from the Center for International Blood and Marrow
Transplant Research (CIBMTR). Through this data example we demonstrate the use of the flexible regression models to analyze
competing risks data when non-proportionality is present in the data.
15.
Scale mixtures of normal distributions form a class of symmetric thick-tailed distributions that includes the normal one as
a special case. In this paper we consider local influence analysis for measurement error models (MEM) when the random error
and the unobserved value of the covariates jointly follow scale mixtures of normal distributions, providing an appealing robust
alternative to the usual Gaussian process in measurement error models. In order to avoid difficulties in estimating the parameter
of the mixing variable, we fix it in advance, as recommended by Lange et al. (J Am Stat Assoc 84:881–896, 1989) and Berkane
et al. (Comput Stat Data Anal 18:255–267, 1994). The local influence method is used to assess the robustness aspects of the
parameter estimates under some usual perturbation schemes. However, as the observed log-likelihood associated with this model
involves some integrals, Cook’s well-known approach may be hard to apply to obtain measures of local influence. Instead, we
develop local influence measures following the approach of Zhu and Lee (J R Stat Soc Ser B 63:121–126, 2001), which is based
on the EM algorithm. Results obtained from a real data set are reported, illustrating the usefulness of the proposed methodology,
its relative simplicity, adaptability and practical usage.
16.
Modelling count data with overdispersion and spatial effects (total citations: 1; self-citations: 1; citations by others: 0)
In this paper we consider regression models for count data allowing for overdispersion in a Bayesian framework. We account
for unobserved heterogeneity in the data in two ways. On the one hand, we consider more flexible models than a common Poisson
model allowing for overdispersion in different ways. In particular, the negative binomial and the generalized Poisson (GP)
distribution are addressed where overdispersion is modelled by an additional model parameter. Further, zero-inflated models
in which overdispersion is assumed to be caused by an excessive number of zeros are discussed. On the other hand, extra spatial
variability in the data is taken into account by adding correlated spatial random effects to the models. This approach allows
for an underlying spatial dependency structure which is modelled using a conditional autoregressive prior based on Pettitt
et al. in Stat Comput 12(4):353–367, (2002). In an application the presented models are used to analyse the number of invasive
meningococcal disease cases in Germany in the year 2004. Models are compared according to the deviance information criterion
(DIC) suggested by Spiegelhalter et al. in J R Stat Soc B 64(4):583–640, (2002) and using proper scoring rules, see for example
Gneiting and Raftery in Technical Report no. 463, University of Washington, (2004). We observe a rather high degree of overdispersion
in the data which is captured best by the GP model when spatial effects are neglected. While the addition of spatial effects
to the models allowing for overdispersion gives little or no improvement, spatial Poisson models with spatially correlated
or uncorrelated random effects are to be preferred over all other models according to the considered criteria.
17.
Steven L. Scott 《Statistical Papers》2011,52(1):87-109
This article describes a convenient method of selecting Metropolis–Hastings proposal distributions for multinomial logit
models. There are two key ideas involved. The first is that multinomial logit models have a latent variable representation
similar to that exploited by Albert and Chib (J Am Stat Assoc 88:669–679, 1993) for probit regression. Augmenting the latent
variables replaces the multinomial logit likelihood function with the complete data likelihood for a linear model with extreme
value errors. While no conjugate prior is available for this model, a least squares estimate of the parameters is easily obtained.
The asymptotic sampling distribution of the least squares estimate is Gaussian with known variance. The second key idea in
this paper is to generate a Metropolis–Hastings proposal distribution by conditioning on the estimator instead of the full
data set. The resulting sampler has many of the benefits of so-called tailored or approximation Metropolis–Hastings samplers.
However, because the proposal distributions are available in closed form they can be implemented without numerical methods
for exploring the posterior distribution. The algorithm is geometrically ergodic, its computational burden is minor,
and it requires minimal user input. Improvements to the sampler’s mixing rate are investigated. The algorithm is also applied
to partial credit models describing ordinal item response data from the 1998 National Assessment of Educational Progress.
Its application to hierarchical models and Poisson regression is briefly discussed.
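The latent variable representation mentioned in the abstract above can be checked numerically. The Python sketch below is only an illustration under stated assumptions and does not reproduce the article's sampler: it draws a category as the argmax of a linear predictor plus standard Gumbel (extreme value) noise and relies on the standard result that the resulting choice probabilities are the multinomial logit (softmax) probabilities. The names `softmax` and `draw_choice` and the example predictor values are hypothetical.

```python
import math
import random
from typing import List, Sequence


def softmax(eta: Sequence[float]) -> List[float]:
    """Multinomial logit probabilities for linear predictors eta."""
    m = max(eta)  # subtract the maximum for numerical stability
    weights = [math.exp(e - m) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]


def gumbel(rng: random.Random) -> float:
    """Standard Gumbel draw via -log(-log(U)), U ~ Uniform(0, 1)."""
    u = rng.random()
    while u == 0.0:  # avoid log(0) in the extremely rare boundary case
        u = rng.random()
    return -math.log(-math.log(u))


def draw_choice(eta: Sequence[float], rng: random.Random) -> int:
    """Draw a category as the argmax of the latent utilities eta_k + Gumbel noise."""
    utilities = [e + gumbel(rng) for e in eta]
    return max(range(len(eta)), key=lambda k: utilities[k])


if __name__ == "__main__":
    rng = random.Random(0)
    eta = [0.0, 1.0, -0.5]  # hypothetical linear predictors
    n = 20000
    counts = [0] * len(eta)
    for _ in range(n):
        counts[draw_choice(eta, rng)] += 1
    print("simulated:", [c / n for c in counts])
    print("softmax:  ", softmax(eta))
```

With a large number of draws, the simulated frequencies should be close to the softmax probabilities, which is the equivalence that the data augmentation step exploits.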
18.
Quantile regression, including median regression, is a more complete statistical model than mean regression and is now well
known for its widespread applications. Bayesian inference on quantile regression, or Bayesian quantile regression, has attracted
much interest recently. Most existing research in Bayesian quantile regression focuses on parametric quantile regression,
though there are discussions of different ways of modeling the model error, either by a parametric distribution called the asymmetric Laplace
distribution or by a nonparametric alternative called the scale mixture asymmetric Laplace distribution. This paper discusses Bayesian
inference for nonparametric quantile regression. This general approach fits quantile regression curves using piecewise polynomial
functions with an unknown number of knots at unknown locations, all treated as parameters to be inferred through reversible
jump Markov chain Monte Carlo (RJMCMC) of Green (Biometrika 82:711–732, 1995). Instead of drawing samples from the posterior, we use regression quantiles to create Markov chains for the estimation of
the quantile curves. We also use an approximate Bayes factor in the inference. This method extends the work in automatic Bayesian
mean curve fitting to quantile regression. Numerical results show that this Bayesian quantile smoothing technique is competitive
with quantile regression/smoothing splines of He and Ng (Comput. Stat. 14:315–337, 1999) and P-splines (penalized splines) of Eilers and de Menezes (Bioinformatics 21(7):1146–1153, 2005).
19.
On locally optimal invariant unbiased tests for the variance components ratio in mixed linear models
Andrzej Michalski 《Statistical Papers》2009,50(4):855-868
In this paper, the problem of testing two-sided hypotheses for variance components in mixed linear models is considered.
When the uniformly most powerful invariant test does not exist (see e.g. Das and Sinha, in Proceedings of the second international
Tampere conference in statistics, 1987; Gnot and Michalski, in Statistics 25:213–223, 1994; Michalski and Zmyślony, in Statistics
27:297–310, 1996) then to conduct the optimal statistical inference on model parameters a construction of a test with locally
best properties is desirable, cf. Michalski (in Tatra Mountains Mathematical Publications 26:1–21, 2003). The main goal of
this article is the construction of the locally best invariant unbiased test for a single variance component (or for a ratio
of variance components). The result has been obtained utilizing Andersson’s and Wijsman’s approach connected with a representation
of density function of maximal invariant (Andersson, in Ann Stat 10:955–961, 1982; Wijsman, in Proceedings of fifth Berk Symp
Math Statist Prob 1:389–400, 1967; Wijsman, in Sankhyā A 48:1–42, 1986; Khuri et al., in Statistical tests for mixed linear models, 1998) and the generalized Neyman–Pearson lemma
(Dantzig and Wald, in Ann Math Stat 22:87–93, 1951; Rao, in Linear statistical inference and its applications, 1973). One
selected real example of an unbalanced mixed linear model is given, for which the power functions of the LBIU test and Wald’s
test (the F-test in ANOVA model) are computed and compared with the attainable upper bound of power obtained by using the Neyman–Pearson
lemma.
20.
Recurrent event data occur in many clinical and observational studies (Cook and Lawless, Analysis of recurrent event data,
2007) and in these situations, there may exist a terminal event such as death that is related to the recurrent event of interest
(Ghosh and Lin, Biometrics 56:554–562, 2000; Wang et al., J Am Stat Assoc 96:1057–1065, 2001; Huang and Wang, J Am Stat Assoc
99:1153–1165, 2004; Ye et al., Biometrics 63:78–87, 2007). In addition, sometimes there may exist more than one type of recurrent
event, that is, one faces multivariate recurrent event data with some dependent terminal event (Chen and Cook, Biostatistics
5:129–143, 2004). It is apparent that for the analysis of such data, one has to take into account the dependence both among
different types of recurrent events and between the recurrent and terminal events. In this paper, we propose a joint modeling
approach for regression analysis of the data, and both finite-sample and asymptotic properties of the resulting estimates of unknown
parameters are established. The methodology is applied to a set of bivariate recurrent event data arising from a study of
leukemia patients.