1.

On maximum likelihood estimation of the semi-parametric Cox model with time-varying covariates

Mark Thackham Jun Ma 《Journal of applied statistics》2020,47(9):1511

Including time-varying covariates is a popular extension to the Cox model and a suitable approach for dealing with non-proportional hazards. However, partial likelihood (PL) estimation of this model has three shortcomings: (i) estimated regression coefficients can be less accurate in small samples with heavy censoring; (ii) the baseline hazard is not directly estimated; and (iii) a covariance matrix for both the regression coefficients and the baseline hazard is not easily produced. We address these by developing a maximum likelihood (ML) approach to jointly estimate regression coefficients and baseline hazard using a constrained optimisation ensuring the latter's non-negativity. We demonstrate asymptotic properties of these estimates, show via simulation their increased accuracy compared to PL estimates in small samples, and show that our method produces smoother baseline hazard estimates than the Breslow estimator. Finally, we apply our method to two examples, including an important real-world financial example to estimate time to default for retail home loans. We demonstrate that using our ML estimate for the baseline hazard can give much clearer corroboratory evidence of the ‘humped hazard’, whereby the risk of loan default rises to a peak and then later falls.
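For orientation, the partial likelihood that this abstract contrasts with full ML can be sketched as follows. This is a minimal illustration only, assuming time-fixed covariates and no tied event times (a simplification of the time-varying setting the paper treats); the function name `cox_partial_loglik` and its arguments are hypothetical and not taken from the paper.

```python
import numpy as np

def cox_partial_loglik(beta, times, events, X):
    """Cox partial log-likelihood.

    Assumptions (ours, not the paper's): covariates are fixed in time
    and no two events share the same time.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    X = np.asarray(X, dtype=float).reshape(len(times), -1)
    beta = np.atleast_1d(np.asarray(beta, dtype=float))

    eta = X @ beta  # linear predictor for each subject
    loglik = 0.0
    for i in np.flatnonzero(events):
        # Risk set: subjects still under observation at the i-th event time.
        at_risk = times >= times[i]
        loglik += eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return loglik
```

Note that the baseline hazard cancels out of every term, which is why PL does not estimate it directly, the second shortcoming listed in the abstract.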

2.

Maximum Penalized Likelihood Estimation in a Gamma-Frailty Model

Rondeau V Commenges D Joly P 《Lifetime data analysis》2003,9(2):139-153

The shared frailty models allow for unobserved heterogeneity or for statistical dependence between observed survival data. The most commonly used estimation procedure in frailty models is the EM algorithm, but this approach yields a discrete estimator of the distribution and consequently does not allow direct estimation of the hazard function. We show how maximum penalized likelihood estimation can be applied to nonparametric estimation of a continuous hazard function in a shared gamma-frailty model with right-censored and left-truncated data. We examine the problem of obtaining variance estimators for regression coefficients, the frailty parameter and baseline hazard functions. Some simulations for the proposed estimation procedure are presented. A prospective cohort (Paquid) with grouped survival data serves to illustrate the method which was used to analyze the relationship between environmental factors and the risk of dementia.

3.

Inference in multivariate linear regression models with elliptically distributed errors

M. Qamarul Islam Fetih Yildirim Mehmet Yazici 《Journal of applied statistics》2014,41(8):1746-1766

In this study we investigate the problem of estimation and testing of hypotheses in multivariate linear regression models when the errors involved are assumed to be non-normally distributed. We consider the class of heavy-tailed distributions for this purpose. Although our method is applicable for any distribution in this class, we take the multivariate t-distribution for illustration. This distribution has applications in many fields of applied research such as Economics, Business, and Finance. For estimation purposes, we use the modified maximum likelihood method in order to get the so-called modified maximum likelihood estimates, which are obtained in closed form. We show that these estimates are substantially more efficient than least-squares estimates. They are also found to be robust to reasonable deviations from the assumed distribution and to many data anomalies such as the presence of outliers in the sample. We further provide test statistics for testing the relevant hypotheses regarding the regression coefficients.

4.

Bootstrapping Regression Parameters in Multivariate Survival Analysis

Loughin Thomas M. Koehler Kenneth J. 《Lifetime data analysis》1997,3(2):157-177

Bootstrap methods are proposed for estimating sampling distributions and associated statistics for regression parameters in multivariate survival data. We use an Independence Working Model (IWM) approach, fitting margins independently, to obtain consistent estimates of the parameters in the marginal models. Resampling procedures, however, are applied to an appropriate joint distribution to estimate covariance matrices, make bias corrections, and construct confidence intervals. The proposed methods allow for fixed or random explanatory variables, the latter case using extensions of existing resampling schemes (Loughin, 1995), and they permit the possibility of random censoring. An application is shown for the viral positivity time data previously analyzed by Wei, Lin, and Weissfeld (1989). A simulation study of small-sample properties shows that the proposed bootstrap procedures provide substantial improvements in variance estimation over the robust variance estimator commonly used with the IWM. This revised version was published online in July 2006 with corrections to the Cover Date.
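The general idea of resampling whole observations to estimate a covariance matrix for regression coefficients can be sketched as below. This is only an illustration under strong simplifying assumptions: it uses ordinary least squares on uncensored data rather than the marginal survival models of the paper, and the function name `pairs_bootstrap_cov` and its parameters are hypothetical.

```python
import numpy as np

def pairs_bootstrap_cov(X, y, n_boot=500, seed=0):
    """Pairs (case) bootstrap covariance of least-squares coefficients.

    Assumption (ours): a plain linear model stands in for the marginal
    survival models; rows of (X, y) are resampled jointly, which suits
    random explanatory variables.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)

    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        coef = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
        draws[b] = coef
    # Empirical covariance of the bootstrap replicates.
    return np.cov(draws, rowvar=False)
```

Resampling complete rows keeps each subject's covariates and response together, which is the same reason the abstract resamples from a joint distribution even though the margins are fitted independently.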

5.

On multivariate quantile regression

Biman Chakraborty 《Journal of statistical planning and inference》2003,110(1-2):109-132

To detect the dependence on the covariates in the lower and upper tails of the response distribution, regression quantiles are very useful tools in linear model problems with univariate response. We consider here a notion of regression quantiles for problems with multivariate responses. The approach is based on minimizing a loss function equivalent to that in the case of univariate response. To construct an affine equivariant notion of multivariate regression quantiles, we have considered a transformation retransformation procedure based on ‘data-driven coordinate systems’. We indicate an algorithm to compute the proposed estimates and establish asymptotic normality for them. We also suggest an adaptive procedure to select the optimal data-driven coordinate system. We discuss the performance of our estimates with the help of a finite sample simulation study and, to illustrate our methodology, analyze an interesting data set on blood pressures of a group of women and another one on the dependence of sales performances on creative test scores.

6.

Efficient estimation for the proportional hazards model with bivariate current status data

Wang L Sun J Tong X 《Lifetime data analysis》2008,14(2):134-153

We consider efficient estimation of regression and association parameters jointly for bivariate current status data with the marginal proportional hazards model. Current status data occur in many fields including demographical studies and tumorigenicity experiments, and several approaches have been proposed for regression analysis of univariate current status data. We discuss bivariate current status data and propose an efficient score estimation approach for the problem. In the approach, the copula model is used for the joint survival function with the survival times assumed to follow the proportional hazards model marginally. Simulation studies are performed to evaluate the proposed estimates and suggest that the approach works well in practical situations. A real life data application is provided for illustration.

7.

Cocaine Dependence Treatment Data: Methods for Measurement Error Problems With Predictors Derived From Stationary Stochastic Processes

Guan Y Li Y Sinha R 《Journal of the American Statistical Association》2011,106(493):480-493

In a cocaine dependence treatment study, we use linear and nonlinear regression models to model posttreatment cocaine craving scores and first cocaine relapse time. A subset of the covariates are summary statistics derived from baseline daily cocaine use trajectories, such as baseline cocaine use frequency and average daily use amount. These summary statistics are subject to estimation error and can therefore cause biased estimators for the regression coefficients. Unlike classical measurement error problems, the error we encounter here is heteroscedastic with an unknown distribution, and there are no replicates for the error-prone variables or instrumental variables. We propose two robust methods to correct for the bias: a computationally efficient method-of-moments-based method for linear regression models and a subsampling extrapolation method that is generally applicable to both linear and nonlinear regression models. Simulations and an application to the cocaine dependence treatment data are used to illustrate the efficacy of the proposed methods. Asymptotic theory and variance estimation for the proposed subsampling extrapolation method and some additional simulation results are described in the online supplementary material.

8.

General partially linear varying-coefficient transformation model with right censored data

Jianbo Li Riquan Zhang 《Journal of statistical planning and inference》2012

In this paper, a unified maximum marginal likelihood estimation procedure is proposed for the analysis of right censored data using general partially linear varying-coefficient transformation models (GPLVCTM), which are flexible enough to include many survival models as special cases. Unknown functional coefficients in the models are approximated by cubic B-spline polynomials. We estimate the B-spline coefficients and regression parameters by maximizing the marginal likelihood function. One advantage of this procedure is that it is free of both the baseline and the censoring distribution. Through simulation studies and a real data application (VA data from the Veterans Administration Lung Cancer Study Clinical Trial), we illustrate that the proposed estimation procedure is accurate, stable and practical.

9.

Partial least squares Cox regression for genome-wide data

Nygård S Borgan O Lingjaerde OC Størvold HL 《Lifetime data analysis》2008,14(2):179-195

Most methods for survival prediction from high-dimensional genomic data combine the Cox proportional hazards model with some technique of dimension reduction, such as partial least squares regression (PLS). Applying PLS to the Cox model is not entirely straightforward, and multiple approaches have been proposed. The method of Park et al. (Bioinformatics 18(Suppl. 1):S120–S127, 2002) uses a reformulation of the Cox likelihood to a Poisson type likelihood, thereby enabling estimation by iteratively reweighted partial least squares for generalized linear models. We propose a modification of the method of Park et al. (2002) such that estimates of the baseline hazard and the gene effects are obtained in separate steps. The resulting method has several advantages over the method of Park et al. (2002) and other existing Cox PLS approaches, as it allows for estimation of survival probabilities for new patients, enables a less memory-demanding estimation procedure, and allows for incorporation of lower-dimensional non-genomic variables like disease grade and tumor thickness. We also propose to combine our Cox PLS method with an initial gene selection step in which genes are ordered by their Cox score and only the highest-ranking k% of the genes are retained, obtaining a so-called supervised partial least squares regression method. In simulations, both the unsupervised and the supervised version outperform other Cox PLS methods.

10.

Efficient estimation of variance components in nonparametric mixed-effects models with large samples

Nathaniel E. Helwig 《Statistics and Computing》2016,26(6):1319-1336

Linear mixed-effects (LME) regression models are a popular approach for analyzing correlated data. Nonparametric extensions of the LME regression model have been proposed, but the heavy computational cost makes these extensions impractical for analyzing large samples. In particular, simultaneous estimation of the variance components and smoothing parameters poses a computational challenge when working with large samples. To overcome this computational burden, we propose a two-stage estimation procedure for fitting nonparametric mixed-effects regression models. Our results reveal that, compared to currently popular approaches, our two-stage approach produces more accurate estimates that can be computed in a fraction of the time.

11.

An estimating equation for parametric shared frailty models with marginal additive hazards

Christian Bressen Pipper Torben Martinussen 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2004,66(1):207-220

Multivariate failure time data arise when data consist of clusters in which the failure times may be dependent. A popular approach to such data is the marginal proportional hazards model with estimation under the working independence assumption. In some contexts, however, it may be more reasonable to use the marginal additive hazards model. We derive asymptotic properties of the Lin and Ying estimators for the marginal additive hazards model for multivariate failure time data. Furthermore, we suggest estimating equations for the regression parameters and association parameters in parametric shared frailty models with marginal additive hazards by using the Lin and Ying estimators. We give the large sample properties of the estimators arising from these estimating equations and investigate their small sample properties by Monte Carlo simulation. A real example is provided for illustration.

12.

Copula-based predictions in small area estimation

Kanika Grover Elif F. Acar Mahmoud Torabi 《Revue canadienne de statistique》2020,48(4):685-711

Unit-level regression models are commonly used in small area estimation (SAE) to obtain an empirical best linear unbiased prediction of small area characteristics. The underlying assumptions of these models, however, may be unrealistic in some applications. Previous work developed a copula-based SAE model where the empirical Kendall's tau was used to estimate the dependence between two units from the same area. In this article, we propose a likelihood framework to estimate the intra-class dependence of the multivariate exchangeable copula for the empirical best unbiased prediction (EBUP) of small area means. One appeal of the proposed approach lies in its accommodation of both parametric and semi-parametric estimation approaches. Under each estimation method, we further propose a bootstrap approach to obtain a nearly unbiased estimator of the mean squared prediction error of the EBUP of small area means. The performance of the proposed methods is evaluated through simulation studies and also by a real data application.

13.

Composite likelihood estimation in multivariate data analysis

Yinshan Zhao Harry Joe 《Revue canadienne de statistique》2005,33(3):335-356

The authors propose two composite likelihood estimation procedures for multivariate models with regression/univariate and dependence parameters. One is a two‐stage method based on both univariate and bivariate margins. The other estimates all the parameters simultaneously based on bivariate margins. For some special cases, the authors compare their asymptotic efficiencies with the maximum likelihood method. The performance of the two methods is reasonable, except that the first procedure is inefficient for the regression parameters under strong dependence. The second approach is generally better for the regression parameters, but less efficient for the dependence parameters under weak dependence.

14.

George W. Snedecor, Pioneer Statistician, 1881–1974

Robert V. Hogg 《The American statistician》2013,67(3):108-109

Users of statistical packages need to be aware of the influence that outlying data points can have on their statistical analyses. Robust procedures provide formal methods to spot these outliers and reduce their influence. Although a few robust procedures are mentioned in this article, one is emphasized; it is motivated by maximum likelihood estimation to make it seem more natural. Use of this procedure in regression problems is considered in some detail, and an approximate error structure is stated for the robust estimates of the regression coefficients. A few examples are given. A suggestion of how these techniques should be implemented in practice is included.

15.

Parameter estimation for multivariate diffusion processes with the time inhomogeneously positive semidefinite diffusion matrix

Xiu-Li Du Jin-Guan Lin Xiu-Qing Zhou 《Communications in Statistics - Theory and Methods》2017,46(22):11010-11025

Statistical inference for the diffusion coefficients of multivariate diffusion processes has been well established in recent years; however, this is not the case for the drift coefficients. Furthermore, most existing estimation methods for the drift coefficients are proposed under the assumption that the diffusion matrix is positive definite and time homogeneous. In this article, we put forward two approaches for estimating the drift coefficients of multivariate diffusion models with a time-inhomogeneous positive semidefinite diffusion matrix. They are, respectively, maximum likelihood estimation based on both the martingale representation theorem and conditional characteristic functions, and the generalized method of moments based on conditional characteristic functions. Consistency and asymptotic normality of the generalized method of moments estimation are also proved in this article. Simulation results demonstrate that these methods work well.

16.

A graphical evaluation of logistic ridge estimator in mixture experiments

Kadri Ulas Akay 《Journal of applied statistics》2014,41(6):1217-1232

In comparison to other experimental studies, multicollinearity appears frequently in mixture experiments, a special study area of response surface methodology, due to the constraints on the components composing the mixture. When mixture experiments are analyzed with the logistic regression model, a special generalized linear model, multicollinearity causes precision problems in the maximum-likelihood logistic regression estimate. The effects of multicollinearity can be reduced to a certain extent by using alternative approaches. One of these approaches is to use biased estimators for the estimation of the coefficients. In this paper, we suggest the use of the logistic ridge regression (RR) estimator in cases where there is multicollinearity during the analysis of mixture experiments using logistic regression. Also, for the selection of the biasing parameter, we use fraction of design space plots for evaluating the effect of the logistic RR estimator with respect to the scaled mean squared error of prediction. The suggested graphical approaches are illustrated on the tumor incidence data set.
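The effect of a ridge biasing parameter on logistic regression coefficients can be sketched as follows. This is a minimal illustration, assuming a penalized-likelihood form of ridge (penalty (k/2)·‖β‖² maximized by Newton-Raphson), which is a related formulation and not necessarily the exact estimator studied in the paper; the function name `logistic_ridge` and the parameter `k` are hypothetical.

```python
import numpy as np

def logistic_ridge(X, y, k, n_iter=50):
    """Ridge-penalized logistic regression fitted by Newton-Raphson.

    Assumption (ours): the estimate maximizes the log-likelihood minus
    (k / 2) * ||beta||^2. No intercept is added, as in Scheffe-type
    mixture models where the component proportions sum to one.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))  # fitted probabilities
        w = mu * (1.0 - mu)                     # IRLS weights
        grad = X.T @ (y - mu) - k * beta        # penalized score
        hess = X.T @ (X * w[:, None]) + k * np.eye(p)
        beta = beta + np.linalg.solve(hess, grad)
    return beta
```

Adding `k * I` to the weighted cross-product matrix keeps it well conditioned even when the columns of `X` are nearly collinear, at the cost of shrinking (biasing) the coefficients; choosing `k` is the selection problem the abstract addresses graphically.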

17.

Rabindra Nath Das Anis Chandra Mukhopadhyay 《Journal of applied statistics》2017,44(5):897-915

In regression analysis, it is assumed that the response (or dependent variable) distribution is Normal, and errors are homoscedastic and uncorrelated. However, in practice, these assumptions are rarely satisfied by a real data set. To stabilize the heteroscedastic response variance, a log-transformation is generally suggested. Consequently, the response variable distribution moves closer to the Normal distribution. As a result, the model fit of the data is improved. In practice, a seemingly suitable transformation may not always stabilize the variance, and the response distribution may not reduce to the Normal distribution. The present article assumes that the response distribution is log-normal with compound autocorrelated errors. Under these conditions, estimation and testing of hypotheses regarding regression parameters have been derived. From a set of reduced data, we have derived the best linear unbiased estimators of all the regression coefficients, except the intercept, which is often unimportant in practice. Unknown correlation parameters have been estimated. In this connection, we have derived a test rule for testing any set of linear hypotheses of the unknown regression coefficients. In addition, we have developed the confidence ellipsoids of a set of estimable functions of regression coefficients. For the fitted regression equation, an index of fit has been proposed. A simulation study illustrates the results derived in this report.

18.

Proportional hazards regression with interval-censored and left-truncated data

《Journal of Statistical Computation and Simulation》2012,82(2):264-272

This paper considers the estimation of the regression coefficients in the Cox proportional hazards model with left-truncated and interval-censored data. Using the approaches of Pan [A multiple imputation approach to Cox regression with interval-censored data, Biometrics 56 (2000), pp. 199–203] and Heller [Proportional hazards regression with interval censored data using an inverse probability weight, Lifetime Data Anal. 17 (2011), pp. 373–385], we propose two estimates of the regression coefficients. The first estimate is based on a multiple imputation methodology. The second estimate uses an inverse probability weight to select event time pairs where the ordering is unambiguous. A simulation study is conducted to investigate the performance of the proposed estimators. The proposed methods are illustrated using the Centers for Disease Control and Prevention (CDC) acquired immunodeficiency syndrome (AIDS) Blood Transfusion Data.

19.

Modelling a non-stationary BINAR(1) Poisson process

《Journal of Statistical Computation and Simulation》2012,82(15):3106-3126

Non-stationarity in bivariate time series of counts may be induced by a number of time-varying covariates affecting the bivariate responses, due to which the innovation terms of the individual series as well as the bivariate dependence structure become non-stationary. So far, in the existing models, the innovation terms of individual INAR(1) series and the dependence structure are assumed to be constant even though the individual time series are non-stationary. Under this assumption, the reliability of the regression and correlation estimates is questionable. Besides, the existing estimation methodologies such as the conditional maximum likelihood (CMLE) and the composite likelihood estimation are computationally intensive. To address these issues, this paper proposes a BINAR(1) model where the innovation series follow a bivariate Poisson distribution under some non-stationary distributional assumptions. The method of generalized quasi-likelihood (GQL) is used to estimate the regression effects while the serial and bivariate correlations are estimated using a robust moment estimation technique. The model and estimation method are applied to simulated data. The GQL method is also compared with the CMLE, generalized method of moments (GMM) and generalized estimating equation (GEE) approaches; simulation studies show that GQL yields more efficient estimates than GMM and equally or slightly more efficient estimates than CMLE and GEE.

20.

Bias corrected estimates in multivariate student t regression models

Klaus L. P. Vasconcellos Gauss M. Cordeiro 《Communications in Statistics - Theory and Methods》2013,42(4):797-822

The t distribution has proved to be a useful alternative to the normal distribution, especially when robust estimation is desired. We consider the multivariate nonlinear Student-t regression model and show that the biases of the estimates of the regression coefficients can be computed from an auxiliary generalized linear regression. We give a formula for the biases of the estimates of the parameters in the scale matrix, which also can be computed by means of a generalized linear regression. We briefly discuss some important special cases and present simulation results which indicate that our bias-corrected estimates outperform the uncorrected ones in small samples.