Similar Articles (20 results)
1.
This paper discusses regression analysis of clustered interval-censored failure time data, which often occur in medical follow-up studies, among other areas. For such data, the failure time may sometimes be related to the cluster size, the number of subjects within each cluster; that is, the cluster sizes are informative. For this problem, we present a within-cluster resampling method for the situation where the failure time of interest can be described by a class of linear transformation models. In addition to establishing the asymptotic properties of the proposed estimators of the regression parameters, we conduct an extensive simulation study to assess the finite sample properties of the proposed method; it suggests that the method works well in practical situations. An application to the example that motivated this study is also provided.
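As a rough, generic illustration of the within-cluster resampling step described above (not the authors' estimator for interval-censored transformation models), the Python sketch below draws one subject per cluster in each resample, refits a user-supplied model, and combines the resamples; the `fit_fn` placeholder is hypothetical and stands in for whatever model fit is used.

```python
import numpy as np
import pandas as pd

def within_cluster_resampling(df, cluster_col, fit_fn, n_resamples=500, seed=0):
    """Generic within-cluster resampling (WCR): draw one subject per cluster,
    fit the model, and combine the estimates across resamples.
    fit_fn(sub_df) is a placeholder returning (beta_hat, var_hat) for whatever
    model is fitted to the one-subject-per-cluster data."""
    rng = np.random.default_rng(seed)
    betas, variances = [], []
    for _ in range(n_resamples):
        sub = df.groupby(cluster_col).sample(n=1, random_state=rng)
        b, v = fit_fn(sub)
        betas.append(np.asarray(b))
        variances.append(np.asarray(v))
    betas = np.stack(betas)
    # WCR point estimate: average over resamples; WCR variance: mean of the
    # within-resample variances minus the between-resample variance
    beta_wcr = betas.mean(axis=0)
    var_wcr = np.mean(variances, axis=0) - betas.var(axis=0, ddof=1)
    return beta_wcr, var_wcr
```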

2.
Clustered survival data arise often in clinical trial design, where correlated subunits from the same cluster are randomized to different treatment groups. Under such a design, we consider the problem of constructing a confidence interval for the difference of two median survival times given the covariates. We use the Cox gamma frailty model to account for the within-cluster correlation. Based on the conditional confidence intervals, we can identify the possible range of covariates over which the two groups would provide different median survival times. The associated coverage probability and the expected length of the proposed interval are investigated via a simulation study. The implementation of the confidence intervals is illustrated using a real data set.
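The frailty-based conditional intervals in the abstract are model-specific, but the target quantity is easy to picture. The sketch below instead bootstraps whole clusters and compares Kaplan-Meier medians between the two arms; the column names (`time`, `event`, `arm`, `cluster`) and the use of lifelines are illustrative assumptions, not the paper's procedure.

```python
import numpy as np
import pandas as pd
from lifelines import KaplanMeierFitter

def median_diff(df):
    """Difference of Kaplan-Meier median survival times between the two arms."""
    med = {}
    for arm, g in df.groupby("arm"):
        med[arm] = KaplanMeierFitter().fit(g["time"], g["event"]).median_survival_time_
    return med[1] - med[0]

def cluster_bootstrap_ci(df, n_boot=1000, alpha=0.05, seed=0):
    """Resample whole clusters with replacement so within-cluster correlation
    is carried into every bootstrap replicate."""
    rng = np.random.default_rng(seed)
    clusters = df["cluster"].unique()
    stats = []
    for _ in range(n_boot):
        picked = rng.choice(clusters, size=len(clusters), replace=True)
        boot = pd.concat([df[df["cluster"] == c] for c in picked], ignore_index=True)
        stats.append(median_diff(boot))
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
```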

3.
When modeling correlated binary data in the presence of informative cluster sizes, generalized estimating equations with either resampling or inverse weighting are often used to correct for estimation bias. However, existing methods for the clustered longitudinal setting assume constant cluster sizes over time. We present a subject-weighted generalized estimating equations scheme that provides valid parameter estimation for the clustered longitudinal setting while allowing cluster sizes to change over time. We compare, via simulation, the performance of existing methods to our subject-weighted approach. The subject-weighted approach was the only method that showed negligible bias, with excellent coverage, for all model parameters.
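A plausible reading of the subject-weighting idea is sketched below: each observation is weighted by the inverse of its cluster's size at that time point, so clusters contribute equally even when their sizes change over time. Column names and the binomial GEE specification are assumptions for illustration, not the authors' exact scheme.

```python
import statsmodels.api as sm

# df (assumed to exist): one row per subject per visit, with columns
#   y (binary outcome), x (covariate), cluster (cluster id), visit (time point)
# Weight each row by the inverse of its cluster's size at that visit.
df["w"] = 1.0 / df.groupby(["cluster", "visit"])["y"].transform("size")

model = sm.GEE.from_formula(
    "y ~ x + visit",
    groups="cluster",
    data=df,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Independence(),
    weights=df["w"],          # subject weights
)
print(model.fit().summary())
```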

4.
Most proposed subsampling and resampling methods in the literature assume stationary data. In many empirical applications, however, the hypothesis of stationarity can easily be rejected. In this paper, we demonstrate that moment and variance estimators based on the subsampling methodology can also be employed for different types of non-stationary data. Consistency of the estimators is demonstrated under mild moment and mixing conditions. Rates of convergence are provided, giving guidance for the appropriate choice of subshape size. Results from a small simulation study on finite-sample properties are also reported.
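For intuition, here is a minimal numpy sketch of the standard subsampling variance estimator for a series: the statistic is recomputed on every overlapping block of length b, and the rescaled spread of those subsample values estimates the variance of the full-sample statistic. The block length b and the choice of statistic are up to the user.

```python
import numpy as np

def subsampling_variance(x, b, statistic=np.mean):
    """Subsampling estimate of Var(sqrt(n) * (statistic - truth)) using all
    n - b + 1 overlapping blocks of length b."""
    x = np.asarray(x)
    n = len(x)
    theta_n = statistic(x)
    # statistic recomputed on each overlapping subsample of length b
    sub_stats = np.array([statistic(x[i:i + b]) for i in range(n - b + 1)])
    return b * np.mean((sub_stats - theta_n) ** 2)

# usage: approximate standard error of the sample mean of a dependent series
# se = np.sqrt(subsampling_variance(x, b=25) / len(x))
```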

5.
Importance resampling is an approach that uses exponential tilting to reduce the resampling necessary for the construction of nonparametric bootstrap confidence intervals. The properties of bootstrap importance confidence intervals are well established when the data is a smooth function of means and when there is no censoring. However, in the framework of survival or time-to-event data, the asymptotic properties of importance resampling have not been rigorously studied, mainly because of the unduly complicated theory incurred when data is censored. This paper uses extensive simulation to show that, for parameter estimates arising from fitting Cox proportional hazards models, importance bootstrap confidence intervals can be constructed if the importance resampling probabilities of the records for the n individuals in the study are determined by the empirical influence function for the parameter of interest. Our results show that, compared to uniform resampling, importance resampling improves the relative mean-squared-error (MSE) efficiency by a factor of nine (for n = 200). The efficiency increases significantly with sample size, is mildly associated with the amount of censoring, but decreases slightly as the number of bootstrap resamples increases. The extra CPU time requirement for calculating importance resamples is negligible when compared to the large improvement in MSE efficiency. The method is illustrated through an application to data on chronic lymphocytic leukemia, which highlights that the bootstrap confidence interval is the preferred alternative to large sample inferences when the distribution of a specific covariate deviates from normality. Our results imply that, because of its computational efficiency, importance resampling is recommended whenever bootstrap methodology is implemented in a survival framework. Its use is particularly important when complex covariates are involved or the survival problem to be solved is part of a larger problem; for instance, when determining confidence bounds for models linking survival time with clusters identified in gene expression microarray data.
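A minimal sketch of the tilting idea follows, with the empirical influence function approximated by the jackknife and an arbitrary scalar statistic standing in for the Cox coefficient; the tilting constant and the weighting scheme are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np

def importance_bootstrap(data, statistic, n_boot=500, tilt=1.0, seed=0):
    """Importance resampling sketch: resampling probabilities are exponentially
    tilted by jackknife-approximated empirical influence values, and each
    replicate carries a likelihood-ratio weight back to the uniform bootstrap."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    theta = statistic(data)
    # jackknife approximation to the empirical influence function
    jack = np.array([statistic(np.delete(data, i, axis=0)) for i in range(n)])
    infl = (n - 1) * (theta - jack)
    # exponential tilting of the resampling probabilities
    p = np.exp(tilt * infl / infl.std())
    p /= p.sum()
    reps, logw = [], []
    for _ in range(n_boot):
        idx = rng.choice(n, size=n, replace=True, p=p)
        reps.append(statistic(data[idx]))
        # log importance weight: uniform density over tilted density of the draw
        logw.append(np.sum(np.log(1.0 / n) - np.log(p[idx])))
    w = np.exp(np.array(logw) - np.max(logw))
    return np.array(reps), w / w.sum()   # replicates and normalized weights
```

A confidence interval is then read off the weighted empirical distribution of the replicates rather than the usual unweighted percentiles.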

6.
In a multilevel model for complex survey data, the weight‐inflated estimators of variance components can be biased. We propose a resampling method to correct this bias. The performance of the bias corrected estimators is studied through simulations using populations generated from a simple random effects model. The simulations show that, without lowering the precision, the proposed procedure can reduce the bias of the estimators, especially for designs that are both informative and have small cluster sizes. Application of these resampling procedures to data from an artificial workplace survey provides further evidence for the empirical value of this method. The Canadian Journal of Statistics 40: 150–171; 2012 © 2012 Statistical Society of Canada

7.
Clustered multinomial data with random cluster sizes commonly appear in health, environmental and ecological studies. Traditional approaches for analyzing clustered multinomial data rest on two assumptions: that cluster sizes are fixed, and that cluster sizes are positive. Randomness of the cluster sizes may drive the within-cluster correlation and between-cluster variation. We propose a baseline-category mixed model for clustered multinomial data with random cluster sizes based on Poisson mixed models. Our orthodox best linear unbiased predictor approach to this model depends only on the moment structure of unobserved distribution-free random effects. Our approach also unifies the marginal and conditional modeling interpretations. Unlike the traditional methods, our approach can accommodate both random and zero cluster sizes. Two real-life multinomial data examples, crime data and food contamination data, are used to illustrate the proposed methodology.

8.
The article describes a generalized estimating equations approach that was used to investigate the impact of technology on vessel performance in a trawl fishery during 1988–96, while accounting for spatial and temporal correlations in the catch-effort data. Robust estimation of parameters in the presence of several levels of clustering depended more on the choice of cluster definition than on the choice of correlation structure within the cluster. Models with smaller cluster sizes produced stable results, while models with larger cluster sizes, that may have had complex within-cluster correlation structures and that had within-cluster covariates, produced estimates sensitive to the correlation structure. The preferred model arising from this dataset assumed that catches from a vessel were correlated in the same years and the same areas, but independent in different years and areas. The model that assumed catches from a vessel were correlated in all years and areas, equivalent to a random effects term for vessel, produced spurious results. This was an unexpected finding that highlighted the need to adopt a systematic strategy for modelling. The article proposes a modelling strategy of selecting the best cluster definition first, and the working correlation structure (within clusters) second. The article discusses the selection and interpretation of the model in the light of background knowledge of the data and utility of the model, and the potential for this modelling approach to apply in similar statistical situations.
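The proposed strategy (choose the cluster definition first, the working correlation second) can be prototyped directly. The sketch below compares two cluster definitions for an assumed catch-effort data frame `df` with columns log_catch, tech, effort, vessel, year and area; the model formula and column names are illustrative assumptions.

```python
import statsmodels.api as sm

# Two candidate cluster definitions: vessel-year-area cells versus whole vessels.
definitions = {
    "vessel-year-area": df["vessel"].astype(str) + ":" + df["year"].astype(str) + ":" + df["area"].astype(str),
    "vessel only": df["vessel"],
}
for label, groups in definitions.items():
    model = sm.GEE.from_formula(
        "log_catch ~ tech + effort",
        groups=groups,
        data=df,
        family=sm.families.Gaussian(),
        cov_struct=sm.cov_struct.Exchangeable(),
    )
    res = model.fit()
    print(f"{label:20s} tech effect = {res.params['tech']:.3f} (se {res.bse['tech']:.3f})")
```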

9.
Between–within models are generalized linear mixed models (GLMMs) for clustered data that incorporate a random intercept together with fixed effects for within-cluster and between-cluster covariates; the between-cluster covariates represent the cluster means of the within-cluster covariates. One popular use of these models is to adjust for confounding of the effect of within-cluster covariates due to unmeasured between-cluster covariates. Previous research has shown via simulations that using this approach can yield inconsistent estimators. We present theory and simulations as evidence that a primary cause of the inconsistency is heteroscedasticity of the linearized version of the GLMM used for estimation.
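The between-within decomposition itself is simple to set up, as in the sketch below, which uses a linear mixed model for simplicity (the heteroscedasticity issue discussed in the abstract concerns the non-linear GLMM case); the column names are assumptions.

```python
import statsmodels.formula.api as smf

# df (assumed to exist): one row per subject, columns y (outcome),
# x (within-cluster covariate), cluster (cluster id).
# Split x into its cluster mean (between) and the deviation from that mean
# (within), and give each its own fixed effect next to a random intercept.
df["x_between"] = df.groupby("cluster")["x"].transform("mean")
df["x_within"] = df["x"] - df["x_between"]

result = smf.mixedlm("y ~ x_within + x_between", data=df, groups=df["cluster"]).fit()
print(result.summary())
```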

10.
The gap time between recurrent events is often of primary interest in many fields such as medical studies. In this article, we discuss regression analysis of gap times arising from a general class of additive transformation models. For the problem, we propose two estimation procedures, the modified within-cluster resampling (MWCR) method and the weighted risk-set (WRS) method, and the proposed estimators are shown to be consistent and asymptotically normal. In particular, the estimators have closed forms and can be easily computed, and the methods have the advantage of leaving the correlation among gap times arbitrary. A simulation study is conducted to assess the finite sample performance of the presented methods and suggests that they work well in practical situations. The methods are also applied to a set of real data from a chronic granulomatous disease (CGD) clinical trial.

11.
The semiparametric accelerated failure time (AFT) model is not as widely used as the Cox relative risk model due to computational difficulties. Recent developments in least squares estimation and induced smoothing estimating equations for censored data provide promising tools to make the AFT models more attractive in practice. For multivariate AFT models, we propose a generalized estimating equations (GEE) approach, extending the GEE to censored data. The consistency of the regression coefficient estimator is robust to misspecification of the working covariance, and the efficiency is higher when the working covariance structure is closer to the truth. The marginal error distributions and regression coefficients are allowed to be unique for each margin or partially shared across margins as needed. The initial estimator is a rank-based estimator with Gehan's weight, but obtained from an induced smoothing approach with computational ease. The resulting estimator is consistent and asymptotically normal, with variance estimated through a multiplier resampling method. In a large-scale simulation study, our estimator was up to three times as efficient as the estimator that ignores the within-cluster dependence, especially when the within-cluster dependence was strong. The methods were applied to the bivariate failure times data from a diabetic retinopathy study.

12.
Pan, Wei and Connett, John E. Lifetime Data Analysis 2001, 7(2): 111–123.
We extend Wei and Tanner's (1991) multiple imputation approach in semi-parametric linear regression for univariate censored data to clustered censored data. The main idea is to iterate the following two steps: 1) use data augmentation to impute the censored failure times; 2) fit a linear model to the imputed complete data, taking into account the clustering among failure times. In particular, we propose using generalized estimating equations (GEE) or a linear mixed-effects model to implement the second step. Through simulation studies our proposal compares favorably to the independence approach (Lee et al., 1993), which ignores the within-cluster correlation in estimating the regression coefficients. Our proposal is easy to implement using existing software.
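A crude single-chain sketch of the iterate-impute-then-fit idea follows, with censored log times replaced by draws from the current fitted normal model truncated below at the censoring time and a linear mixed-effects model refit at each pass; the column names are assumptions, and the actual proposal combines several imputations and a GEE or mixed-model variance formula rather than a single chain.

```python
import numpy as np
import statsmodels.formula.api as smf
from scipy.stats import truncnorm

# df columns (assumed names): logt (log follow-up time), event (1 = observed,
# 0 = right-censored), x (covariate), cluster (cluster id).
def impute_and_fit(df, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    work = df.copy()
    work["y"] = work["logt"]                 # start from observed/censoring times
    for _ in range(n_iter):
        fit = smf.mixedlm("y ~ x", data=work, groups=work["cluster"]).fit()
        mu = fit.fittedvalues
        sigma = np.sqrt(fit.scale)           # residual standard deviation
        cens = work["event"] == 0
        # draw imputed log times from N(mu, sigma^2) truncated below at logt
        a = (work.loc[cens, "logt"] - mu[cens]) / sigma
        work.loc[cens, "y"] = truncnorm.rvs(
            a, np.inf, loc=mu[cens], scale=sigma, random_state=rng)
    return fit
```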

13.
14.
Marginal Regression of Gaps Between Recurrent Events
Recurrent event data typically exhibit the phenomenon of intra-individual correlation, owing to not only observed covariates but also random effects. In many applications, the population may be reasonably postulated as a heterogeneous mixture of individual renewal processes, and the inference of interest is the effect of individual-level covariates. In this article, we suggest and investigate a marginal proportional hazards model for gaps between recurrent events. A connection is established between observed gap times and clustered survival data with informative cluster size. We subsequently construct a novel and general inference procedure for the latter, based on a functional formulation of standard Cox regression. Large-sample theory is established for the proposed estimators. Numerical studies demonstrate that the procedure performs well with practical sample sizes. Application to the well-known bladder tumor data is given as an illustration.
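One common device for clustered survival data with informative cluster size, used here purely as a simplified stand-in for the functional Cox procedure in the abstract, is to weight each gap by the inverse of the number of gaps its subject contributes, so subjects count equally. Column names and the use of lifelines are assumptions.

```python
import pandas as pd
from lifelines import CoxPHFitter

# df (assumed): one row per observed gap, with columns gap (gap length),
# event (1 if the gap ended with a recurrence), x (subject-level covariate),
# id (subject identifier).
df["w"] = 1.0 / df.groupby("id")["gap"].transform("size")   # inverse cluster size

cph = CoxPHFitter()
cph.fit(df, duration_col="gap", event_col="event",
        weights_col="w", cluster_col="id", robust=True)      # robust (sandwich) SEs
cph.print_summary()
```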

15.
In this article, we consider the order estimation of autoregressive models with incomplete data using the expectation–maximization (EM) algorithm-based information criteria. The criteria take the form of a penalization of the conditional expectation of the log-likelihood. The evaluation of the penalization term generally involves numerical differentiation and matrix inversion. We introduce a simplification of the penalization term for autoregressive model selection and we propose a penalty factor based on a resampling procedure in the criteria formula. The simulation results show the improvements yielded by the proposed method when compared with the classical information criteria for model selection with incomplete data.

16.
Analysis of tidal data via the blockwise bootstrap
We analyze tidal data from Port Mansfield, TX, using Kunsch's blockwise bootstrap in the regression setting. In particular, we estimate the variability of parameter estimates in a harmonic analysis via block subsampling of residuals from a least-squares fit. We see that naive least-squares variance estimates can be either too large or too small, depending on the strength of correlation and the design matrix. We argue that the block bootstrap is a simple, omnibus method of accounting for correlation in a regression model with correlated errors.
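A minimal sketch of this residual-based moving-block bootstrap for a harmonic regression is given below; the tidal periods, block length and hourly time index are assumptions chosen for illustration.

```python
import numpy as np

def harmonic_design(t, periods):
    """Design matrix: intercept plus a cosine/sine pair for each tidal period."""
    cols = [np.ones_like(t, dtype=float)]
    for p in periods:
        cols += [np.cos(2 * np.pi * t / p), np.sin(2 * np.pi * t / p)]
    return np.column_stack(cols)

def block_bootstrap_se(t, y, periods, block_len=48, n_boot=500, seed=0):
    """Moving-block bootstrap of least-squares residuals: rebuild the series as
    fitted values plus block-resampled residuals, refit, and report the
    bootstrap standard errors of the harmonic coefficients."""
    rng = np.random.default_rng(seed)
    X = harmonic_design(t, periods)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    n = len(y)
    n_blocks = int(np.ceil(n / block_len))
    boot_betas = []
    for _ in range(n_boot):
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        e_star = np.concatenate([resid[s:s + block_len] for s in starts])[:n]
        b, *_ = np.linalg.lstsq(X, X @ beta_hat + e_star, rcond=None)
        boot_betas.append(b)
    return np.std(boot_betas, axis=0, ddof=1)
```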

17.
Classification of high-dimensional data sets is a major challenge for statistical learning and data mining algorithms. To effectively apply classification methods to high-dimensional data sets, feature selection is an indispensable pre-processing step of the learning process. In this study, we consider the problem of constructing an effective feature selection and classification scheme for data sets that have a small sample size with a large number of features. A novel feature selection approach, named Four-Staged Feature Selection, has been proposed to overcome the high-dimensional data classification problem by selecting informative features. The proposed method first selects candidate features with a number of filtering methods based on different metrics, and then applies semi-wrapper, union and voting stages, respectively, to obtain the final feature subsets. Several statistical learning and data mining methods have been applied to verify the efficiency of the selected features. To test the adequacy of the proposed method, 10 different microarray data sets are employed due to their high number of features and small sample sizes.
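A simplified sketch of the filtering and voting stages is given below (the semi-wrapper stage of the four-stage method is not reproduced); the choice of filter metrics and scikit-learn utilities is an assumption for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif, chi2
from sklearn.preprocessing import MinMaxScaler

def filter_and_vote(X, y, k=50, min_votes=2):
    """Rank features with several filter metrics, keep the top k under each,
    then retain the features picked by at least `min_votes` of the filters."""
    Xs = MinMaxScaler().fit_transform(X)          # chi2 needs non-negative inputs
    votes = np.zeros(X.shape[1], dtype=int)
    for score_fn in (f_classif, mutual_info_classif, chi2):
        sel = SelectKBest(score_fn, k=min(k, X.shape[1])).fit(Xs, y)
        votes += sel.get_support().astype(int)
    return votes >= min_votes                      # boolean mask of retained features
```

The retained subset would then be passed to whatever classifier is being evaluated, e.g. via cross-validation on `X[:, mask]`.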

18.
Binary outcome data with small clusters often arise in medical studies, and the size of the clusters might be informative of the outcome. The authors conducted a simulation study to examine the performance of a range of statistical methods. The simulation results showed that all methods performed mostly comparably in the estimation of covariate effects. However, the standard logistic regression approach that ignores the clustering encountered an undercoverage problem when the degree of clustering was nontrivial. The performance of the random-effects logistic regression approach tended to be affected by low disease prevalence, relatively small cluster size, or informative cluster size.

19.
In high throughput genomic studies, an important goal is to identify a small number of genomic markers that are associated with the development and progression of diseases. A representative example is microarray prognostic studies, where the goal is to identify genes whose expressions are associated with disease-free or overall survival. Because of the high dimensionality of gene expression data, standard survival analysis techniques cannot be directly applied. In addition, among the thousands of genes surveyed, only a subset are disease-associated. Gene selection is needed along with estimation. In this article, we model the relationship between gene expressions and survival using accelerated failure time (AFT) models. We use the bridge penalization for regularized estimation and gene selection. An efficient iterative computational algorithm is proposed. Tuning parameters are selected using V-fold cross validation. We use a resampling method to evaluate the prediction performance of the bridge estimator and the relative stability of the identified genes. We show that the proposed bridge estimator is selection consistent under appropriate conditions. Analysis of two lymphoma prognostic studies suggests that the bridge estimator can identify a small number of genes and can have better prediction performance than the Lasso.
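A minimal stand-in for the penalized AFT idea: Kaplan-Meier (Stute) weights turn censored least squares on the log times into a weighted regression, and an L1 penalty (Lasso, used here only in place of the bridge penalty, which needs its own iterative algorithm) performs gene selection. The variable names and the use of scikit-learn are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def stute_km_weights(time, event):
    """Kaplan-Meier (Stute) weights for weighted least-squares AFT estimation."""
    order = np.argsort(time)
    d = np.asarray(event, dtype=float)[order]
    n = len(d)
    i = np.arange(1, n + 1)
    km_factors = np.cumprod(((n - i) / (n - i + 1)) ** d)   # KM survival factors
    w_sorted = np.empty(n)
    w_sorted[0] = d[0] / n
    w_sorted[1:] = d[1:] / (n - i[1:] + 1) * km_factors[:-1]
    w = np.empty(n)
    w[order] = w_sorted        # return weights in the original data order
    return w

# X: n x p expression matrix; time, event: follow-up time and event indicator.
# w = stute_km_weights(time, event)
# fit = Lasso(alpha=0.1).fit(X, np.log(time), sample_weight=w)
# selected_genes = np.flatnonzero(fit.coef_)
```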

20.
Quantile regression is a flexible approach to assessing covariate effects on failure time, and it has attracted considerable interest in survival analysis. When the dimension of the covariates is much larger than the sample size, feature screening and variable selection become extremely important and indispensable. In this article, we introduce a new feature screening method for ultrahigh-dimensional censored quantile regression. The proposed method can work for a general class of survival models, allows for heterogeneity of the data, and enjoys desirable properties, including the sure screening property and the ranking consistency property. Moreover, an iterative version of the screening algorithm is also proposed to accommodate more complex situations. Monte Carlo simulation studies are designed to evaluate the finite sample performance under different model settings. We also illustrate the proposed methods through an empirical analysis.

