期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Nonparametric estimation in the illness-death model using prevalent data

Bella Vakulenko-Lagun Micha Mandel Yair Goldberg 《Lifetime data analysis》2017,23(1):25-56

We study nonparametric estimation of the illness-death model using left-truncated and right-censored data. The general aim is to estimate the multivariate distribution of a progressive multi-state process. Maximum likelihood estimation under censoring suffers from problems of uniqueness and consistency, so instead we review and extend methods that are based on inverse probability weighting. For univariate left-truncated and right-censored data, nonparametric maximum likelihood estimation can be considerably improved when exploiting knowledge on the truncation distribution. We aim to examine the gain in using such knowledge for inverse probability weighting estimators in the illness-death framework. Additionally, we compare the weights that use truncation variables with the weights that integrate them out, showing, by simulation, that the latter performs more stably and efficiently. We apply the methods to intensive care units data collected in a cross-sectional design, and discuss how the estimators can be easily modified to more general multi-state models. 相似文献

2.

On computing standard errors for marginal structural Cox models

R. Ayesha Ali M. Adnan Ali Zhe Wei 《Lifetime data analysis》2014,20(1):106-131

In recent decades, marginal structural models have gained popularity for proper adjustment of time-dependent confounders in longitudinal studies through time-dependent weighting. When the marginal model is a Cox model, using current standard statistical software packages was thought to be problematic because they were not developed to compute standard errors in the presence of time-dependent weights. We address this practical modelling issue by extending the standard calculations for Cox models with case weights to time-dependent weights and show that the coxph procedure in R can readily compute asymptotic robust standard errors. Through a simulation study, we show that the robust standard errors are rather conservative, though corresponding confidence intervals have good coverage. A second contribution of this paper is to introduce a Cox score bootstrap procedure to compute the standard errors. We show that this method is efficient and tends to outperform the non-parametric bootstrap in small samples. 相似文献

3.

ESTIMATION PROCEDURES FOR CATEGORICAL SURVEY DATA WITH NONIGNORABLE NONRESPONSE

《统计学通讯:理论与方法》2013,42(4):643-663

We consider surveys with one or more callbacks and use a series of logistic regressions to model the probabilities of nonresponse at first contact and subsequent callbacks. These probabilities are allowed to depend on covariates as well as the categorical variable of interest and so the nonresponse mechanism is nonignorable. Explicit formulae for the score functions and information matrices are given for some important special cases to facilitate implementation of the method of scoring for obtaining maximum likelihood estimates of the model parameters. For estimating finite population quantities, we suggest the imputation and prediction approaches as alternatives to weighting adjustment. Simulation results suggest that the proposed methods work well in reducing the bias due to nonresponse. In our study, the imputation and prediction approaches perform better than weighting adjustment and they continue to perform quite well in simulations involving misspecified response models. 相似文献

4.

A Contrasting Study of Likelihood Methods for the Analysis of Longitudinal Binary Data

Weiming Yang 《统计学通讯:理论与方法》2014,43(14):3027-3046

相似文献

5.

Focused information criterion and model averaging based on weighted composite quantile regression

Ganggang Xu Suojin Wang Jianhua Z. Huang 《Scandinavian Journal of Statistics》2014,41(2):365-381

We study the focused information criterion and frequentist model averaging and their application to post‐model‐selection inference for weighted composite quantile regression (WCQR) in the context of the additive partial linear models. With the non‐parametric functions approximated by polynomial splines, we show that, under certain conditions, the asymptotic distribution of the frequentist model averaging WCQR‐estimator of a focused parameter is a non‐linear mixture of normal distributions. This asymptotic distribution is used to construct confidence intervals that achieve the nominal coverage probability. With properly chosen weights, the focused information criterion based WCQR estimators are not only robust to outliers and non‐normal residuals but also can achieve efficiency close to the maximum likelihood estimator, without assuming the true error distribution. Simulation studies and a real data analysis are used to illustrate the effectiveness of the proposed procedure. 相似文献

6.

Case-cohort analysis with semiparametric transformation models

Yi-Hau Chen David M. Zucker 《Journal of statistical planning and inference》2009

Semiparametric transformation models provide flexible regression models for survival analysis, including the Cox proportional hazards and the proportional odds models as special cases. We consider the application of semiparametric transformation models in case-cohort studies, where the covariate data are observed only on cases and on a subcohort randomly sampled from the full cohort. We first propose an approximate profile likelihood approach with full-cohort data, which amounts to the pseudo-partial likelihood approach of Zucker [2005. A pseudo-partial likelihood method for semiparametric survival regression with covariate errors. J. Amer. Statist. Assoc. 100, 1264–1277]. Simulation results show that our proposal is almost as efficient as the nonparametric maximum likelihood estimator. We then extend this approach to the case-cohort design, applying the Horvitz–Thompson weighting method to the estimating equations from the approximated profile likelihood. Two levels of weights can be utilized to achieve unbiasedness and to gain efficiency. The resulting estimator has a closed-form asymptotic covariance matrix, and is found in simulations to be substantially more efficient than the estimator based on martingale estimating equations. The extension to left-truncated data will be discussed. We illustrate the proposed method on data from a cardiovascular risk factor study conducted in Taiwan. 相似文献

7.

Delayed treatment effects,treatment switching and heterogeneous patient populations: How to design and analyze RCTs in oncology

Robin Ristl Nicolás M Ballarini Heiko Götte Armin Schüler Martin Posch Franz König 《Pharmaceutical statistics》2021,20(1):129-145

In the analysis of survival times, the logrank test and the Cox model have been established as key tools, which do not require specific distributional assumptions. Under the assumption of proportional hazards, they are efficient and their results can be interpreted unambiguously. However, delayed treatment effects, disease progression, treatment switchers or the presence of subgroups with differential treatment effects may challenge the assumption of proportional hazards. In practice, weighted logrank tests emphasizing either early, intermediate or late event times via an appropriate weighting function may be used to accommodate for an expected pattern of non-proportionality. We model these sources of non-proportional hazards via a mixture of survival functions with piecewise constant hazard. The model is then applied to study the power of unweighted and weighted log-rank tests, as well as maximum tests allowing different time dependent weights. Simulation results suggest a robust performance of maximum tests across different scenarios, with little loss in power compared to the most powerful among the considered weighting schemes and huge power gain compared to unfavorable weights. The actual sources of non-proportional hazards are not obvious from resulting populationwise survival functions, highlighting the importance of detailed simulations in the planning phase of a trial when assuming non-proportional hazards.We provide the required tools in a software package, allowing to model data generating processes under complex non-proportional hazard scenarios, to simulate data from these models and to perform the weighted logrank tests. 相似文献

8.

On minimax designs when there are two candidate models

《Journal of Statistical Computation and Simulation》2012,82(11):841-862

This work is motivated by the need to find experimental designs which are robust under different model assumptions. We measure robustness by calculating a measure of design efficiency with respect to a design optimality criterion and say that a design is robust if it is reasonably efficient under different model scenarios. We discuss two design criteria and an algorithm which can be used to obtain robust designs. The first criterion employs a Bayesian-type approach by putting a prior or weight on each candidate model and possibly priors on the corresponding model parameters. We define the first criterion as the expected value of the design efficiency over the priors. The second design criterion we study is the minimax design which minimizes the worst value of a design criterion over all candidate models. We establish conditions when these two criteria are equivalent when there are two candidate models. We apply our findings to the area of accelerated life testing and perform sensitivity analysis of designs with respect to priors and misspecification of planning values. 相似文献

9.

Robust minimum distance inference based on combined distances

Chanseok Park Ayanendranath Basu Srabashi Basu 《统计学通讯:模拟与计算》2013,42(3):653-673

The minimum disparity estimators proposed by Lindsay (1994) for discrete models form an attractive subclass of minimum distance estimators which achieve their robustness without sacrificing first order efficiency at the model. Similarly, disparity test statistics are useful robust alternatives to the likelihood ratio test for testing of hypotheses in parametric models; they are asymptotically equivalent to the likelihood ratio test statistics under the null hypothesis and contiguous alternatives. Despite their asymptotic optimality properties, the small sample performance of many of the minimum disparity estimators and disparity tests can be considerably worse compared to the maximum likelihood estimator and the likelihood ratio test respectively. In this paper we focus on the class of blended weight Hellinger distances, a general subfamily of disparities, and study the effects of combining two different distances within this class to generate the family of “combined” blended weight Hellinger distances, and identify the members of this family which generally perform well. More generally, we investigate the class of "combined and penal-ized" blended weight Hellinger distances; the penalty is based on reweighting the empty cells, following Harris and Basu (1994). It is shown that some members of the combined and penalized family have rather attractive properties 相似文献

10.

The Use of Sample Weights in Hot Deck Imputation

Andridge RR Little RJ 《Journal of official statistics》2009,25(1):21-36

A common strategy for handling item nonresponse in survey sampling is hot deck imputation, where each missing value is replaced with an observed response from a "similar" unit. We discuss here the use of sampling weights in the hot deck. The naive approach is to ignore sample weights in creation of adjustment cells, which effectively imputes the unweighted sample distribution of respondents in an adjustment cell, potentially causing bias. Alternative approaches have been proposed that use weights in the imputation by incorporating them into the probabilities of selection for each donor. We show by simulation that these weighted hot decks do not correct for bias when the outcome is related to the sampling weight and the response propensity. The correct approach is to use the sampling weight as a stratifying variable alongside additional adjustment variables when forming adjustment cells. 相似文献

11.

Improving predictive inference under covariate shift by weighting the log-likelihood function 总被引：1，自引：0，他引：1

Hidetoshi Shimodaira 《Journal of statistical planning and inference》2000,90(2):2091-244

A class of predictive densities is derived by weighting the observed samples in maximizing the log-likelihood function. This approach is effective in cases such as sample surveys or design of experiments, where the observed covariate follows a different distribution than that in the whole population. Under misspecification of the parametric model, the optimal choice of the weight function is asymptotically shown to be the ratio of the density function of the covariate in the population to that in the observations. This is the pseudo-maximum likelihood estimation of sample surveys. The optimality is defined by the expected Kullback–Leibler loss, and the optimal weight is obtained by considering the importance sampling identity. Under correct specification of the model, however, the ordinary maximum likelihood estimate (i.e. the uniform weight) is shown to be optimal asymptotically. For moderate sample size, the situation is in between the two extreme cases, and the weight function is selected by minimizing a variant of the information criterion derived as an estimate of the expected loss. The method is also applied to a weighted version of the Bayesian predictive density. Numerical examples as well as Monte-Carlo simulations are shown for polynomial regression. A connection with the robust parametric estimation is discussed. 相似文献

12.

Fitting survival data with penalized Poisson regression

Aris Perperoglou 《Statistical Methods and Applications》2011,20(4):451-462

Cox’s proportional hazards model is the most common way to analyze survival data. The model can be extended in the presence of collinearity to include a ridge penalty, or in cases where a very large number of coefficients (e.g. with microarray data) has to be estimated. To maximize the penalized likelihood, optimal weights of the ridge penalty have to be obtained. However, there is no definite rule for choosing the penalty weight. One approach suggests maximization of the weights by maximizing the leave-one-out cross validated partial likelihood, however this is time consuming and computationally expensive, especially in large datasets. We suggest modelling survival data through a Poisson model. Using this approach, the log-likelihood of a Poisson model is maximized by standard iterative weighted least squares. We will illustrate this simple approach, which includes smoothing of the hazard function and move on to include a ridge term in the likelihood. We will then maximize the likelihood by considering tools from generalized mixed linear models. We will show that the optimal value of the penalty is found simply by computing the hat matrix of the system of linear equations and dividing its trace by a product of the estimated coefficients. 相似文献

13.

Estimators based on ranks for arma models

Nelida E. Ferretti Diana M. Kelmansky Victor J. Yohai 《统计学通讯:理论与方法》2013,42(12):3879-3907

In this paper we introduce a new family of robust estimators for ARMA models. These estimators are defined by replacing the residual sample autocovariances in the least squares equations by autocovariances based on ranks. The asymptotic normality of the proposed estimators is provided. The efficiency and robustness properties of these estimators are studied. An adequate choice of the score functions gives estimators which have high efficiency under normality and robustness in the presence of outliers. The score functions can also be chosen so that the resulting estimators are asymptotically as efficient as the maximum likelihood estimators for a given distribution. 相似文献

14.

Robust estimation in generalized linear mixed models

Kelvin K. W. Yau & Anthony Y. C. Kuk 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2002,64(1):101-117

Generalized linear mixed models (GLMMs) are widely used to analyse non-normal response data with extra-variation, but non-robust estimators are still routinely used. We propose robust methods for maximum quasi-likelihood and residual maximum quasi-likelihood estimation to limit the influence of outlying observations in GLMMs. The estimation procedure parallels the development of robust estimation methods in linear mixed models, but with adjustments in the dependent variable and the variance component. The methods proposed are applied to three data sets and a comparison is made with the nonparametric maximum likelihood approach. When applied to a set of epileptic seizure data, the methods proposed have the desired effect of limiting the influence of outlying observations on the parameter estimates. Simulation shows that one of the residual maximum quasi-likelihood proposals has a smaller bias than those of the other estimation methods. We further discuss the equivalence of two GLMM formulations when the response variable follows an exponential family. Their extensions to robust GLMMs and their comparative advantages in modelling are described. Some possible modifications of the robust GLMM estimation methods are given to provide further flexibility for applying the method. 相似文献

15.

Statistical evidence in contingency tables analysis

M. Kateri N. Balakrishnan 《Journal of statistical planning and inference》2008

The likelihood ratio is used for measuring the strength of statistical evidence. The probability of observing strong misleading evidence along with that of observing weak evidence evaluate the performance of this measure. When the corresponding likelihood function is expressed in terms of a parametric statistical model that fails, the likelihood ratio retains its evidential value if the likelihood function is robust [Royall, R., Tsou, T.S., 2003. Interpreting statistical evidence by using imperfect models: robust adjusted likelihood functions. J. Roy. Statist. Soc. Ser. B 65, 391–404]. In this paper, we extend the theory of Royall and Tsou [2003. Interpreting statistical evidence by using imperfect models: robust adjusted likelihood functions. J. Roy. Statist. Soc., Ser. B 65, 391–404] to the case when the assumed working model is a characteristic model for two-way contingency tables (the model of independence, association and correlation models). We observe that association and correlation models are not equivalent in terms of statistical evidence. The association models are bounded by the maximum of the bump function while the correlation models are not. 相似文献

16.

Second-order least squares estimation of censored regression models

Taraneh Abarin Liqun Wang 《Journal of statistical planning and inference》2009

This paper proposes the second-order least squares estimation, which is an extension of the ordinary least squares method, for censored regression models where the error term has a general parametric distribution (not necessarily normal). The strong consistency and asymptotic normality of the estimator are derived under fairly general regularity conditions. We also propose a computationally simpler estimator which is consistent and asymptotically normal under the same regularity conditions. Finite sample behavior of the proposed estimators under both correctly and misspecified models are investigated through Monte Carlo simulations. The simulation results show that the proposed estimator using optimal weighting matrix performs very similar to the maximum likelihood estimator, and the estimator with the identity weight is more robust against the misspecification. 相似文献

17.

SELECTION OF WEIGHTS FOR WEIGHTED MODEL AVERAGING

Paul H. Garthwaite Emmanuel Mubwandarikwa 《Australian & New Zealand Journal of Statistics》2010,52(4):363-382

We address the task of choosing prior weights for models that are to be used for weighted model averaging. Models that are very similar should usually be given smaller weights than models that are quite distinct. Otherwise, the importance of a model in the weighted average could be increased by augmenting the set of models with duplicates of the model or virtual duplicates of it. Similarly, the importance of a particular model feature (a certain covariate, say) could be exaggerated by including many models with that feature. Ways of forming a correlation matrix that reflects the similarity between models are suggested. Then, weighting schemes are proposed that assign prior weights to models on the basis of this matrix. The weighting schemes give smaller weights to models that are more highly correlated. Other desirable properties of a weighting scheme are identified, and we examine the extent to which these properties are held by the proposed methods. The weighting schemes are applied to real data, and prior weights, posterior weights and Bayesian model averages are determined. For these data, empirical Bayes methods were used to form the correlation matrices that yield the prior weights. Predictive variances are examined, as empirical Bayes methods can result in unrealistically small variances. 相似文献

18.

Using number of failed contact attempts to adjust for non-ignorable non-response

Angela M. Wood Ian R. White Matthew Hotopf 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2006,169(3):525-542

Summary. We present a general method of adjustment for non-ignorable non-response in studies where one or more further attempts are made to contact initial non-responders. A logistic regression model relates the probability of response at each contact attempt to covariates and outcomes of interest. We assume that the effect of these covariates and outcomes on the probability of response is the same at all contact attempts. Knowledge of the number of contact attempts enables estimation of the model by using only information from the respondents and the number of non-responders. Three approaches for fitting the response models and estimating parameters of substantive interest and their standard errors are compared: a modified conditional likelihood method in which the fitted inverse probabilities of response are used in weighted analyses for the outcomes of interest, an EM procedure with the Louis formula and a Bayesian approach using Markov chain Monte Carlo methods. We further propose the creation of several sets of weights to incorporate uncertainty in the probability weights in subsequent analyses. Our methods are applied as a sensitivity analysis to a postal survey of symptoms in Persian Gulf War veterans and other servicemen. 相似文献

19.

A predictive approach to measuring the strength of statistical evidence for single and multiple comparisons

David R. Bickel 《Revue canadienne de statistique》2011,39(4):610-631

The normalized maximum likelihood (NML) is a recent penalized likelihood that has properties that justify defining the amount of discrimination information (DI) in the data supporting an alternative hypothesis over a null hypothesis as the logarithm of an NML ratio, namely, the alternative hypothesis NML divided by the null hypothesis NML. The resulting DI, like the Bayes factor but unlike the P‐value, measures the strength of evidence for an alternative hypothesis over a null hypothesis such that the probability of misleading evidence vanishes asymptotically under weak regularity conditions and such that evidence can support a simple null hypothesis. Instead of requiring a prior distribution, the DI satisfies a worst‐case minimax prediction criterion. Replacing a (possibly pseudo‐) likelihood function with its weighted counterpart extends the scope of the DI to models for which the unweighted NML is undefined. The likelihood weights leverage side information, either in data associated with comparisons other than the comparison at hand or in the parameter value of a simple null hypothesis. Two case studies, one involving multiple populations and the other involving multiple biological features, indicate that the DI is robust to the type of side information used when that information is assigned the weight of a single observation. Such robustness suggests that very little adjustment for multiple comparisons is warranted if the sample size is at least moderate. The Canadian Journal of Statistics 39: 610–631; 2011. © 2011 Statistical Society of Canada 相似文献

20.

Mixed Graphical Models with Missing Data and the Partial Imputation EM Algorithm 总被引：2，自引：0，他引：2

Zhi Geng Kang Wan & Feng Tao 《Scandinavian Journal of Statistics》2000,27(3):433-444

In this paper we discuss graphical models for mixed types of continuous and discrete variables with incomplete data. We use a set of hyperedges to represent an observed data pattern. A hyperedge is a set of variables observed for a group of individuals. In a mixed graph with two types of vertices and two types of edges, dots and circles represent discrete and continuous variables respectively. A normal graph represents a graphical model and a hypergraph represents an observed data pattern. In terms of the mixed graph, we discuss decomposition of mixed graphical models with incomplete data, and we present a partial imputation method which can be used in the EM algorithm and the Gibbs sampler to speed their convergence. For a given mixed graphical model and an observed data pattern, we try to decompose a large graph into several small ones so that the original likelihood can be factored into a product of likelihoods with distinct parameters for small graphs. For the case that a graph cannot be decomposed due to its observed data pattern, we can impute missing data partially so that the graph can be decomposed. 相似文献