Similar literature
1.
ABSTRACT

This paper analyses the behaviour of goodness-of-fit tests for regression models. To this end, it uses statistics based on an estimation of the integrated regression function with missing observations either in the response variable or in some of the covariates. It proposes several versions of one empirical process, constructed from a previous estimation, that use only the complete observations or replace the missing observations with imputed values. In the case of missing covariates, a link model is used to fill the missing observations with other complete covariates. In all situations, bootstrap methodology is used to calibrate the distribution of the test statistics. A broad simulation study compares the different procedures based on empirical regression methodology with smoothed tests previously studied in the literature. The comparison reflects the effect of the correlation between the covariates in the tests based on the imputed sample for missing covariates. In addition, the paper proposes a computational binning strategy to evaluate the tests based on an empirical process for large data sets. Finally, two applications to real data illustrate the performance of the tests.

2.
Abstract

We propose a novel approach to estimate the Cox model with temporal covariates. Our new approach treats the temporal covariates as arising from a longitudinal process which is modeled jointly with the event time. Unlike in the existing literature, the longitudinal process in our model is specified as a bounded variational process and determined by a family of Initial Value Problems associated with an Ordinary Differential Equation. Our specification has the advantage that only the observation of the temporal covariates at the event time and the event time itself are needed to fit the model; additional longitudinal observations can be used but are not required. This fact makes our approach very useful for many medical outcome datasets, such as the SPARCS and NIS, where it is important to find the hazard rate of being discharged given the cumulative cost but only the total cost at the discharge time is available due to the protection of private information. Our estimation procedure is based on maximizing the full information likelihood function. The resulting estimators are shown to be consistent and asymptotically normally distributed. Simulations and a real example illustrate the utility of the proposed model. Finally, a couple of extensions are discussed.

3.

Time-to-event data often violate the proportional hazards assumption inherent in the popular Cox regression model. Such violations are especially common in the sphere of biological and medical data, where latent heterogeneity due to unmeasured covariates or time-varying effects is common. A variety of parametric survival models have been proposed in the literature which make more appropriate assumptions on the hazard function, at least for certain applications. One such model is derived from the First Hitting Time (FHT) paradigm, which assumes that a subject's event time is determined by a latent stochastic process reaching a threshold value. Several random effects specifications of the FHT model have also been proposed which allow for better modeling of data with unmeasured covariates. While often appropriate, these methods can display limited flexibility due to their inability to model a wide range of heterogeneities. To address this issue, we propose a Bayesian model which loosens assumptions on the mixing distribution inherent in the random effects FHT models currently in use. We demonstrate via a simulation study that the proposed model greatly improves both survival and parameter estimation in the presence of latent heterogeneity. We also apply the proposed methodology to data from a toxicology/carcinogenicity study which exhibits nonproportional hazards and contrast the results with both the Cox model and two popular FHT models.


4.
Abstract

In this article, we study variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and is empirically shown to be much more efficient and robust with respect to heavy-tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property. To deal with the missing covariates, we consider inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator using the estimated selection probability has a smaller asymptotic variance than the one using the true selection probability, and is thus more efficient. Therefore, the important Horvitz-Thompson property is verified for the penalized rank estimator with missing covariates in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators.
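
As a rough illustration of the inverse probability weighting idea mentioned in the abstract above, the sketch below applies a Horvitz-Thompson-type weight to a much simpler problem: estimating a mean when the response is missing at random. It is not the authors' penalized rank estimator. The data-generating model, the known selection probability `selection_prob`, and all function names are assumptions made for this example only.

```python
import random

def selection_prob(z):
    """Hypothetical known selection probability: larger z is observed more often."""
    return 0.2 + 0.7 * z

def simulate(n, seed=0):
    """Draw (z, y, observed) triples with y = 2z + noise, so that E[y] = 1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.random()                      # always-observed covariate on [0, 1)
        y = 2.0 * z + rng.gauss(0.0, 0.1)     # response, possibly missing
        observed = rng.random() < selection_prob(z)
        data.append((z, y, observed))
    return data

def complete_case_mean(data):
    """Naive mean over observed responses only (biased under this design)."""
    ys = [y for _, y, obs in data if obs]
    return sum(ys) / len(ys)

def ipw_mean(data):
    """Weight each observed response by 1 / selection probability, divide by full n."""
    total = sum(y / selection_prob(z) for z, y, obs in data if obs)
    return total / len(data)

data = simulate(20000)
cc = complete_case_mean(data)
ipw = ipw_mean(data)
print(f"complete-case mean: {cc:.3f}, IPW mean: {ipw:.3f}, true mean: 1.000")
```

Because responses with large `z` are observed more often, the complete-case mean is pulled upward, while the weighted estimate stays close to the true mean of 1.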

5.
ABSTRACT

Often in data arising out of epidemiologic studies, covariates are subject to measurement error. In addition, ordinal responses may be misclassified into a category that does not reflect the true state of the respondents. The goal of the present work is to develop an ordered probit model that corrects for the classification errors in ordinal responses and/or measurement error in covariates. Estimation is by maximum likelihood. A simulation study reveals the effect of ignoring measurement error and/or classification errors on the estimates of the regression coefficients. The methodology developed is illustrated through a numerical example.

6.
《Econometric Reviews》2012,31(1):92-109
Abstract

This paper provides several new results on identification of the linear factor model. The model allows for correlated latent factors and dependence among the idiosyncratic errors. I also illustrate identification under a dedicated measurement structure and other reduced rank restrictions. I use these results to study identification in a model with both observed covariates and latent factors. The analysis emphasizes the different roles played by restrictions on the error covariance matrix, restrictions on the factor loadings and the factor covariance matrix, and restrictions on the coefficients on covariates. The identification results are simple, intuitive, and directly applicable to many settings.

7.
Shi Yushu, Laud Purushottam, Neuner Joan 《Lifetime data analysis》2021,27(1):156-176

In this paper, we first propose a dependent Dirichlet process (DDP) model using a mixture of Weibull models with each mixture component resembling a Cox model for survival data. We then build a Dirichlet process mixture model for competing risks data without regression covariates. Next we extend this model to a DDP model for competing risks regression data by using a multiplicative covariate effect on subdistribution hazards in the mixture components. Though built on proportional hazards (or subdistribution hazards) models, the proposed nonparametric Bayesian regression models do not require the assumption of constant hazard (or subdistribution hazard) ratio. An external time-dependent covariate is also considered in the survival model. After describing the model, we discuss how both cause-specific and subdistribution hazard ratios can be estimated from the same nonparametric Bayesian model for competing risks regression. For use with the regression models proposed, we introduce an omnibus prior that is suitable when little external information is available about covariate effects. Finally we compare the models’ performance with existing methods through simulations. We also illustrate the proposed competing risks regression model with data from a breast cancer study. An R package “DPWeibull” implementing all of the proposed methods is available at CRAN.


8.
Abstract

In this article we propose a new mixed-effects regression model for fractional bounded response variables. Our model allows us to incorporate covariates directly into the expected value, so we can quantify exactly the influence of these covariates on the mean of the variable of interest rather than on the conditional mean. Estimation is carried out from a Bayesian perspective. Due to the complexity of the augmented posterior distribution, we use a Hamiltonian Monte Carlo algorithm, the No-U-Turn sampler, implemented using the Stan software. A simulation study shows that our model performs better than other traditional longitudinal models for bounded variables. Finally, we apply our beta-inflated mean mixed-effects regression model to real data on the utilization of credit lines in the Peruvian financial system.

9.
The present paper is concerned with statistical models for the dependence of survival time or time to occurrence of an event, such as time to tumor, on a vector X of covariates or prognostic variables such as age, sex, blood pressure, length of exposure to a toxic material, etc., measured on a group of individuals in biomedical investigations. It is assumed that the covariates influence the distribution of time to tumor only through a linear predictor μ = βX.

The object of our paper is to investigate the effect due to the covariates on the Life Expectancy and the Percentile Residual Life (PRL) function of a family of organisms under the proportional hazards and the accelerated life models. The key result is that the families of survival distributions under these models have the 'setting the clock back to zero' property if the family of baseline survival distributions does. This property is a generalization of the lack of memory property of the exponential distribution. Simple examples of the members of this family are the linear hazard exponential, Pareto and Gompertz life distributions.

As a simple application of the main results obtained in the present paper, we have considered a stochastic survival model recently proposed by Chiang and Conforti (1989) for the time-to-tumor distribution in the context of a large-scale serial sacrifice experiment by the National Center for Toxicological Research (NCTR). This involves some mice that were fed 2-AAF from infancy and those that developed bladder and/or liver neoplasms; see Farmer et al. (1980). It is shown that their stochastic model for tumor incidence intensity at time t leads to a family of survival models that has the 'setting the clock back to zero' property. The survival functions and the effect of the vector X of covariates on the PRL and the tumor-free life expectancies are evaluated for the proportional hazards and accelerated life models.
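
The 'setting the clock back to zero' property described in this abstract can be checked numerically for one of the families it names. The sketch below assumes a Gompertz hazard h(t) = a·exp(b·t), so that S(t) = exp(−(a/b)(exp(b·t) − 1)); this parameterization and the function names are choices made for the example, not taken from the paper. Conditioning on survival to time t0 gives S(t + t0)/S(t0) = exp(−(a·exp(b·t0)/b)(exp(b·t) − 1)), which is again a Gompertz survival function with a replaced by a·exp(b·t0).

```python
import math

def gompertz_survival(t, a, b):
    """Survival function for hazard h(t) = a * exp(b * t), with a > 0 and b != 0."""
    return math.exp(-(a / b) * (math.exp(b * t) - 1.0))

def conditional_survival(t, t0, a, b):
    """P(T > t + t0 | T > t0) under the Gompertz model."""
    return gompertz_survival(t + t0, a, b) / gompertz_survival(t0, a, b)

# Illustrative parameter values (assumptions for this example).
a, b, t0 = 0.05, 0.3, 2.0

# "Resetting the clock" at t0 keeps the family but updates the parameter a.
a_star = a * math.exp(b * t0)

for t in (0.5, 1.0, 3.0):
    lhs = conditional_survival(t, t0, a, b)
    rhs = gompertz_survival(t, a_star, b)
    print(f"t={t}: conditional={lhs:.6f}, reset-clock={rhs:.6f}")
```

For the exponential distribution the updated parameter equals the original one, which is the lack of memory property that this property generalizes.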

10.
ABSTRACT

In order to investigate the convergence rate of the asymptotic normality for the estimator of the conditional mode function for the left-truncation model, we derive a Berry–Esseen type bound of the estimator when the lifetime observations with multivariate covariates form a stationary α-mixing sequence. The finite sample performance of the estimator of the conditional mode function is explored through simulations.

11.
Abstract

Type III methods were introduced by SAS to address difficulties in dummy-variable models for effects of multiple factors and covariates. They are widely used in practice; they are the default method in several statistical computing packages. Type III sums of squares (SSs) are defined by a set of instructions; an explicit mathematical formulation does not seem to exist.

An explicit formulation is derived in this paper. It is used to illustrate Type III SSs and to establish their properties in the two-factor ANOVA model.

12.
Abstract

In this article, a new model is presented that is based on the Pareto distribution of the second kind, when the location parameter depends on covariates as well as unobserved heterogeneity. Bayesian analysis of the model can be performed using Markov Chain Monte Carlo techniques. The new procedures are illustrated in the context of artificial data as well as international output data.

13.
Abstract

Goodness-of-fit testing is addressed in the stratified proportional hazards model for survival data. A test statistic based on within-strata cumulative sums of martingale residuals over covariates is proposed and its asymptotic distribution is derived under the null hypothesis of model adequacy. A Monte Carlo procedure is proposed to approximate the critical value of the test. Simulation studies are conducted to examine finite-sample performance of the proposed statistic.

14.
It is often of interest to use regression analysis to study the relationship between occurrence of events in space and spatially-indexed covariates. One model for such regression analysis is the Poisson point process. Here, we develop a method to perform the selection of covariates and the estimation of model parameters simultaneously for this model via a regularization method. We assess the finite-sample properties of our method with a simulation study. In addition, we propose a variant of our method that allows the selection of covariates at multiple pixel resolutions. For illustration, we consider the locations of a tree species, Beilschmiedia pendula, in a study plot at Barro Colorado Island in central Panama. We find that Beilschmiedia pendula occurs in greater abundance at locations with higher elevation and steeper slope. Also, we identify three species to which Beilschmiedia pendula tends to be attracted, two species by which it appears to be repelled, and a species with no apparent relationship.

15.
ABSTRACT

A general class of models for discrete and/or continuous responses is proposed in which joint distributions are constructed via the conditional approach. It is assumed that the distribution of one response and the distribution of the other response given the first one belong to the exponential family of distributions. Furthermore, the marginal means are related to the covariates by link functions and a dependency structure between the responses is inserted into the model. Estimation methods, diagnostic analysis and a simulation study considering a Bernoulli-exponential model, a particular case of the class, are presented. Finally, this model is applied to a real data set.

16.
Lee Chi Hyun, Ning Jing, Shen Yu 《Lifetime data analysis》2019,25(1):79-96

Length-biased data are frequently encountered in prevalent cohort studies. Many statistical methods have been developed to estimate the covariate effects on the survival outcomes arising from such data while properly adjusting for length-biased sampling. Among them, regression methods based on the proportional hazards model have been widely adopted. However, little work has focused on checking the proportional hazards model assumptions with length-biased data, which is essential to ensure the validity of inference. In this article, we propose a statistical tool for testing the assumed functional form of covariates and the proportional hazards assumption graphically and analytically under the setting of length-biased sampling, through a general class of multiparameter stochastic processes. The finite sample performance is examined through simulation studies, and the proposed methods are illustrated with the data from a cohort study of dementia in Canada.


17.
Abstract

Sliced average variance estimation (SAVE) is one of the best methods for estimating the central dimension-reduction (CDR) subspace in semiparametric regression models when covariates are normal. Recently, SAVE has been used to analyze DNA microarray data, especially in tumor classification, but its most important drawback is the requirement that the covariates be normal. In this article, the asymptotic behavior of estimates of the CDR space under varying slice size is studied through simulation, both when the covariates are non-normal but satisfy the linearity condition and when the covariates are slightly perturbed from the normal distribution. We observe that serious error may occur when the normality assumption is violated.

18.
ABSTRACT

We consider variance estimation in a general nonparametric regression model with multiple covariates. We extend difference methods to the multivariate setting by introducing an algorithm that orders the design points in higher dimensions. We also consider an adaptive difference estimator which requires much less strict assumptions on the covariate design and can significantly reduce mean squared error for small sample sizes.
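
For background on the difference methods this abstract refers to, the sketch below implements the classical first-order difference estimator for a single covariate: sort the observations by the covariate, then average the squared successive differences of the responses and divide by 2. It does not reproduce the paper's multivariate ordering algorithm or its adaptive estimator. The regression function, noise level, and function names are assumptions made for this example.

```python
import math
import random

def difference_variance(x, y):
    """First-order difference estimator of the error variance for one covariate.

    Neighbouring responses (after sorting by x) share almost the same regression
    value when the regression function is smooth, so their difference is mostly
    noise with variance 2 * sigma^2.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    ys = [y[i] for i in order]
    n = len(ys)
    squared_diffs = sum((ys[i + 1] - ys[i]) ** 2 for i in range(n - 1))
    return squared_diffs / (2.0 * (n - 1))

# Simulated data: smooth regression function plus Gaussian noise (illustrative).
rng = random.Random(1)
n, sigma = 2000, 0.5
x = [rng.random() for _ in range(n)]
y = [math.sin(2.0 * math.pi * xi) + rng.gauss(0.0, sigma) for xi in x]

estimate = difference_variance(x, y)
print(f"estimated variance: {estimate:.4f}, true variance: {sigma ** 2:.4f}")
```

In one dimension the ordering is unambiguous; with several covariates there is no natural ordering of the design points, which is the difficulty the abstract's ordering algorithm addresses.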

19.
ABSTRACT

In this article, we propose a new distribution obtained by mixing normal and Pareto distributions; the new distribution provides an unusual hazard function. We model the mean and the variance with covariates to account for heterogeneity. The parameters are estimated by the Bayesian method using Markov Chain Monte Carlo (MCMC) algorithms. The proposal distribution in the MCMC algorithm is constructed from a working variable defined in terms of the observations. In simulations, the method performs dependably. By fitting the model to a real dataset, we demonstrate that the proposed model and method can be more suitable than those of a previous report.

20.
ABSTRACT

This paper attempts to model the development of children's reading skills using a negative exponential curve with a mixed-effects model. The model describes the nature of growth in children's reading skills and accounts for intra-individual and inter-individual variations. In addition, we propose methods including cross-validation, regression, and graphing to determine an appropriate curve for the data, to find good initial values for parameters, and to select potential covariates. We illustrate with an example that motivated this research: a longitudinal study of academic reading skills from grade 1 to grade 12 in Connecticut public schools.
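
One common parameterization of a negative exponential growth curve is y(t) = A − (A − y0)·exp(−r·t), where A is the asymptote, y0 the value at t = 0, and r the growth rate; the paper may use a different form. The sketch below evaluates this curve and shows one simple regression-based way to get starting values: given a guess for A, log(A − y) is linear in t, so a least-squares line yields r and y0. The function names, the parameter values, and this particular starting-value recipe are assumptions for illustration, not the authors' procedure, and random effects are ignored.

```python
import math

def neg_exp_curve(t, asymptote, initial, rate):
    """Negative exponential growth: rises from `initial` at t=0 toward `asymptote`."""
    return asymptote - (asymptote - initial) * math.exp(-rate * t)

def starting_values(times, scores, asymptote_guess):
    """Rough starting values from the line log(A - y) = log(A - y0) - rate * t.

    Requires asymptote_guess to exceed every observed score.
    Returns (initial, rate).
    """
    logs = [math.log(asymptote_guess - y) for y in scores]
    n = len(times)
    t_bar = sum(times) / n
    l_bar = sum(logs) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (l - l_bar) for t, l in zip(times, logs))
    slope = sxy / sxx
    intercept = l_bar - slope * t_bar
    return asymptote_guess - math.exp(intercept), -slope

# Noise-free illustrative trajectory over grades 1 to 12 (made-up parameter values).
grades = list(range(1, 13))
scores = [neg_exp_curve(g, asymptote=100.0, initial=20.0, rate=0.25) for g in grades]

initial0, rate0 = starting_values(grades, scores, asymptote_guess=100.0)
print(f"starting values: initial={initial0:.3f}, rate={rate0:.3f}")
```

With real, noisy scores the asymptote guess would itself be uncertain, so values obtained this way are only a starting point for a nonlinear mixed-effects fit.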
