Similar documents
 20 similar documents found (search time: 125 ms)
1.
The properties of robust M-estimators with type II censored failure time data are considered. The optimal members within two classes of ψ-functions are characterized. The first optimality result is the censored data analogue of the optimality result described in Hampel et al. (1986); the estimators corresponding to the optimal members within this class are referred to as the optimal robust estimators. The second result pertains to a restricted class of ψ-functions which is the analogue of the class of ψ-functions considered in James (1986) for randomly censored data; the estimators corresponding to the optimal members within this restricted class are referred to as the optimal James-type estimators. We examine the usefulness of the two classes of ψ-functions and find that the breakdown point and efficiency of the optimal James-type estimators compare favourably with those of the corresponding optimal robust estimators. From the computational point of view, the optimal James-type ψ-functions are readily obtainable from the optimal ψ-functions in the uncensored case. The ψ-functions for the optimal robust estimators require a separate algorithm which is provided. A data set illustrates the optimal robust estimators for the parameters of the extreme value distribution.

2.
Learning classification trees
Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. This paper outlines how a tree learning algorithm can be derived using Bayesian statistics. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule is similar to Quinlan's information gain, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, C4 (Quinlan et al., 1987) and CART (Breiman et al., 1984), show that the full Bayesian algorithm can produce more accurate predictions than versions of these other approaches, though it pays a computational price.
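Quinlan-style information gain, which the Bayesian splitting rule above resembles, can be computed as follows (a minimal sketch, not the paper's Bayesian algorithm; function names are illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, split_groups):
    """Entropy reduction from partitioning `labels` into `split_groups`."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in split_groups)

# A pure split of a 50/50 class mix recovers the full 1 bit of entropy.
labels = ["a", "a", "b", "b"]
gain = information_gain(labels, [["a", "a"], ["b", "b"]])  # gain = 1.0
```

A tree learner would evaluate this gain for every candidate split and choose the largest; the Bayesian approach described above replaces this heuristic with a posterior-based score.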

3.
When θ is a multidimensional parameter, the issue of prior dependence or independence of coordinates is a serious concern. This is especially true in robust Bayesian analysis; Lavine et al. (J. Amer. Statist. Assoc. 86, 964–971 (1991)) show that allowing a wide range of prior dependencies among coordinates can result in near vacuous conclusions. It is sometimes possible, however, to make the judgement confidently that the coordinates of θ are independent a priori and, when this can be done, robust Bayesian conclusions improve dramatically. In this paper, it is shown how to incorporate the independence assumption into robust Bayesian analysis involving ε-contamination and density band classes of priors. Attention is restricted to the case θ = (θ1, θ2) for clarity, although the ideas generalize.

4.
The generalized doubly robust estimator is proposed for estimating the average treatment effect (ATE) of multiple treatments based on the generalized propensity score (GPS). In medical research based on observational studies, estimates of ATEs are usually biased since the covariate distributions can be unbalanced across treatments. To overcome this problem, Imbens [The role of the propensity score in estimating dose-response functions, Biometrika 87 (2000), pp. 706–710] and Feng et al. [Generalized propensity score for estimating the average treatment effect of multiple treatments, Stat. Med. (2011), in press. Available at: http://onlinelibrary.wiley.com/doi/10.1002/sim.4168/abstract] proposed weighted estimators that are extensions of a ratio estimator based on the GPS to estimate ATEs with multiple treatments. However, the ratio estimator always produces a larger empirical sample variance than the doubly robust estimator, which estimates an ATE between two treatments based on the estimated propensity score (PS). We conduct a simulation study to compare the performance of our proposed estimator with Imbens' and Feng et al.'s estimators, and the results show that our proposed estimator outperforms theirs in terms of bias, empirical sample variance and mean-squared error of the estimated ATEs.
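As a point of reference, the GPS-based weighted (ratio) estimator that the above builds on can be sketched as follows (assuming a known propensity score; the function name and the toy data-generating process are illustrative, not from the paper):

```python
import numpy as np

def ipw_means(y, t, gps):
    """Weighted (ratio/Hajek) estimate of E[Y(k)] for each treatment level k,
    weighting each unit by the inverse of its generalized propensity score.
    y: outcomes, t: treatment labels, gps: P(T = t_i | X_i) for each unit."""
    out = {}
    for k in np.unique(t):
        w = (t == k) / gps                  # inverse-probability weights
        out[k] = np.sum(w * y) / np.sum(w)  # ratio form
    return out

# Toy binary-treatment example with confounding through x; true ATE = 2.
rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)
p1 = 1 / (1 + np.exp(-x))                   # P(T=1 | X)
t = (rng.uniform(size=n) < p1).astype(int)
y = 2.0 * t + x + rng.normal(size=n)
gps = np.where(t == 1, p1, 1 - p1)          # known GPS for each unit's own arm
means = ipw_means(y, t, gps)
ate = means[1] - means[0]                   # close to 2 despite confounding
```

The doubly robust variant additionally subtracts an outcome-regression term, which is what reduces the variance relative to this plain ratio estimator.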

5.
Most biomedical research is carried out using longitudinal studies. The method of generalized estimating equations (GEEs) introduced by Liang and Zeger [Longitudinal data analysis using generalized linear models, Biometrika 73 (1986), pp. 13–22] and Zeger and Liang [Longitudinal data analysis for discrete and continuous outcomes, Biometrics 42 (1986), pp. 121–130] has become a standard method for analyzing non-normal longitudinal data. Since then, a large variety of GEEs have been proposed. However, the model diagnostic problem has not been explored intensively. Oh et al. [Model diagnostic plots for repeated measures data using the generalized estimating equations approach, Comput. Statist. Data Anal. 53 (2008), pp. 222–232] proposed residual plots based on quantile–quantile (Q–Q) plots of the χ2-distribution for repeated-measures data using the GEE methodology. They considered the Pearson, Anscombe and deviance residuals. In this work, we propose to extend this graphical diagnostic using a generalized residual. A simulation study is presented as well as two examples illustrating the proposed generalized Q–Q plots.

6.
The Breusch–Godfrey LM test is one of the most popular tests for autocorrelation. However, it has been shown that the LM test may be erroneous when there exist heteroskedastic errors in a regression model. Recently, remedies have been proposed by Godfrey and Tremayne [9] and Shim et al. [21]. This paper suggests three wild-bootstrapped variance-ratio (WB-VR) tests for autocorrelation in the presence of heteroskedasticity. We show through a Monte Carlo simulation that our WB-VR tests have better small sample properties and are robust to the structure of heteroskedasticity.
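The wild-bootstrap ingredient of such tests can be sketched as follows. Only the Rademacher resampling step, which preserves each residual's own variance and hence any heteroskedasticity pattern, is shown; the variance-ratio statistic itself is not reproduced, and the function name is illustrative:

```python
import numpy as np

def wild_bootstrap_residuals(resid, rng):
    """One wild-bootstrap draw: multiply each residual by an independent
    Rademacher sign (+1/-1 with equal probability). The magnitude of every
    residual is kept, so unit-specific variances survive the resampling."""
    signs = rng.choice([-1.0, 1.0], size=resid.shape)
    return resid * signs

rng = np.random.default_rng(1)
resid = np.array([0.5, -2.0, 3.0, -0.1])   # heteroskedastic-looking residuals
star = wild_bootstrap_residuals(resid, rng)
```

A bootstrap test would regenerate the dependent variable from these starred residuals under the null, recompute the statistic many times, and compare the observed statistic to the resulting reference distribution.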

7.

Suppose that an order restriction is imposed among several p-variate normal mean vectors. We are interested in the problems of estimating these mean vectors and testing their homogeneity under this restriction. These problems are multivariate extensions of those considered by Bartholomew (1959). For the bivariate case, these problems have been studied by Sasabuchi et al. (1983, 1998), among others. In the present paper we examine the convergence of an iterative algorithm for computing the maximum likelihood estimator when p is larger than two. We also study some test procedures for testing homogeneity when p is larger than two.

8.
The demand for reliable statistics in subpopulations, when only reduced sample sizes are available, has promoted the development of small area estimation methods. In particular, an approach that is now widely used is based on the seminal work by Battese et al. [An error-components model for prediction of county crop areas using survey and satellite data, J. Am. Statist. Assoc. 83 (1988), pp. 28–36] that uses linear mixed models (MM). We investigate alternatives when a linear MM does not hold because, on one side, linearity may not be assumed and/or, on the other, normality of the random effects may not be assumed. In particular, Opsomer et al. [Nonparametric small area estimation using penalized spline regression, J. R. Statist. Soc. Ser. B 70 (2008), pp. 265–283] propose an estimator that extends the linear MM approach to the case in which a linear relationship may not be assumed using penalized splines regression. From a very different perspective, Chambers and Tzavidis [M-quantile models for small area estimation, Biometrika 93 (2006), pp. 255–268] have recently proposed an approach for small-area estimation that is based on M-quantile (MQ) regression. This allows for models robust to outliers and to distributional assumptions on the errors and the area effects. However, when the functional form of the relationship between the qth MQ and the covariates is not linear, it can lead to biased estimates of the small area parameters. Pratesi et al. [Semiparametric M-quantile regression for estimating the proportion of acidic lakes in 8-digit HUCs of the Northeastern US, Environmetrics 19(7) (2008), pp. 687–701] apply an extended version of this approach for the estimation of the small area distribution function using a non-parametric specification of the conditional MQ of the response variable given the covariates [M. Pratesi, M.G. Ranalli, and N. Salvati, Nonparametric m-quantile regression using penalized splines, J. Nonparametric Stat. 21 (2009), pp. 287–304]. 
We will derive the small area estimator of the mean under this model, together with its mean-squared error estimator and compare its performance to the other estimators via simulations on both real and simulated data.

9.
Approximate Bayesian computation (ABC) methods, or likelihood-free methods, have appeared in the past fifteen years as useful methods to perform Bayesian analysis when the likelihood is analytically or computationally intractable. Several ABC methods have been proposed: MCMC methods have been developed by Marjoram et al. (2003) and by Bortot et al. (2007), for instance, and sequential methods have been proposed among others by Sisson et al. (2007), Beaumont et al. (2009) and Del Moral et al. (2012). Recently, sequential ABC methods have appeared as an alternative to ABC-MCMC methods (see for instance McKinley et al., 2009; Sisson et al., 2007). In this paper a new algorithm combining population-based MCMC methods with ABC requirements is proposed, using an analogy with the parallel tempering algorithm (Geyer, 1991). Performance is compared with existing ABC algorithms on simulations and on a real example.
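The likelihood-free idea shared by all of these methods is most easily seen in the basic rejection sampler (a pedagogical sketch, not the population-based MCMC algorithm this paper proposes; all names are illustrative):

```python
import numpy as np

def abc_rejection(observed_stat, prior_sampler, simulate, stat, eps, n_draws, rng):
    """Basic ABC rejection: draw theta from the prior, simulate data, and
    keep the draws whose summary statistic falls within eps of the observed
    one. No likelihood evaluation is needed anywhere."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        if abs(stat(simulate(theta, rng)) - observed_stat) <= eps:
            accepted.append(theta)
    return np.array(accepted)

# Toy example: infer a normal mean, summarizing data by its sample mean.
rng = np.random.default_rng(2)
data = rng.normal(1.5, 1.0, size=100)
obs = data.mean()
post = abc_rejection(
    observed_stat=obs,
    prior_sampler=lambda r: r.uniform(-5, 5),
    simulate=lambda mu, r: r.normal(mu, 1.0, size=100),
    stat=np.mean,
    eps=0.1,
    n_draws=20_000,
    rng=rng,
)
```

MCMC and sequential variants exist precisely because this rejection scheme wastes most of its draws when the prior is diffuse or eps is small.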

10.
The glmnet package by Friedman et al. [Regularization paths for generalized linear models via coordinate descent, J. Statist. Softw. 33 (2010), pp. 1–22] is an extremely fast implementation of the standard coordinate descent algorithm for solving ℓ1-penalized learning problems. In this paper, we consider a family of coordinate majorization descent algorithms for solving ℓ1-penalized learning problems by replacing each coordinate descent step with a coordinate-wise majorization descent operation. Numerical experiments show that this simple modification can lead to substantial improvement in speed when the predictors have moderate or high correlations.
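For standardized predictors, the coordinate descent update underlying this family of algorithms reduces to a soft-thresholding step. A minimal sketch (not glmnet's code, and without the majorization modification studied in the paper; names are illustrative):

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator: the closed-form coordinate update
    for the l1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1,
    assuming the columns of X are standardized (mean 0, unit variance)."""
    n, p = X.shape
    b = np.zeros(p)
    r = y - X @ b
    for _ in range(n_iter):
        for j in range(p):
            r = r + X[:, j] * b[j]       # partial residual excluding b_j
            z = X[:, j] @ r / n          # unpenalized univariate solution
            b[j] = soft_threshold(z, lam)
            r = r - X[:, j] * b[j]       # restore full residual
    return b

# Sparse toy problem: only coefficients 0 and 2 are truly nonzero.
rng = np.random.default_rng(3)
n, p = 200, 5
X = rng.normal(size=(n, p))
X = (X - X.mean(0)) / X.std(0)
beta = np.array([3.0, 0.0, -2.0, 0.0, 0.0])
y = X @ beta + rng.normal(size=n)
b_hat = lasso_cd(X, y, lam=0.5)
```

A majorization step would replace the exact univariate minimization with the minimizer of a quadratic upper bound, which is what yields the speed gains under correlated predictors reported above.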

11.
This study proposes estimators for the mean, and its variance, of the number of respondents possessing a rare sensitive attribute under stratified sampling schemes (stratified sampling and stratified double sampling). It extends the estimation reported in Land et al. [Estimation of a rare sensitive attribute using Poisson distribution, Statistics (2011), in press. DOI: 10.1080/02331888.2010.524300] using a Poisson distribution and the unrelated question randomized response model of Greenberg et al. [The unrelated question randomized response model: Theoretical framework, J. Amer. Statist. Assoc. 64 (1969), 520–539]. Under stratified sampling, estimators are proposed for both known and unknown values of the parameter of the rare unrelated attribute. The variances of the estimators under proportional and optimum allocation are also given. The proposed estimators are evaluated in terms of relative efficiency, comparing their variances with those of the estimators reported in Land et al. for different parameter values and question-selection probabilities; our proposed methods are more efficient than Land et al.'s randomized response model under some conditions. When the sizes of the strata are not known, further estimators based on stratified double sampling are suggested. For proportional allocation, the difference between the two variances under stratified sampling and stratified double sampling is given for the case of a known rare unrelated attribute.

12.

It is well known that prior application of GLS detrending, as advocated by Elliott et al. [Elliott, G., Rothenberg, T., Stock, J. (1996). Efficient tests for an autoregressive unit root. Econometrica 64:813–836], can produce a significant increase in power to reject the unit root null over that obtained from a conventional OLS-based Dickey and Fuller [Dickey, D., Fuller, W. (1979). Distribution of the estimators for autoregressive time series with a unit root. J. Am. Statist. Assoc. 74:427–431] testing equation. However, this paper employs Monte Carlo simulation to demonstrate that this increase in power is not necessarily obtained when breaks occur in either level or trend. It is found that neither OLS- nor GLS-based tests are robust to level or trend breaks, their size and power properties both deteriorating as the break size increases.

13.

The linear mixed-effects model (Verbeke and Molenberghs, 2000) has become a standard tool for the analysis of continuous hierarchical data such as, for example, repeated measures or data from meta-analyses. However, in certain situations the model does pose insurmountable computational problems. Precisely this has been the experience of Buyse et al. (2000a) who proposed an estimation- and prediction-based approach for evaluating surrogate endpoints. Their approach requires fitting linear mixed models to data from several clinical trials. In doing so, these authors built on the earlier, single-trial based, work by Prentice (1989), Freedman et al. (1992), and Buyse and Molenberghs (1998). While Buyse et al. (2000a) claim their approach has a number of advantages over the classical single-trial methods, a solution needs to be found for the computational complexity of the corresponding linear mixed model. In this paper, we propose and study a number of possible simplifications. This is done by means of a simulation study and by applying the various strategies to data from three clinical studies: Pharmacological Therapy for Macular Degeneration Study Group (1977), Ovarian Cancer Meta-analysis Project (1991) and Corfu-A Study Group (1995).

14.
Singh et al. (Stat Trans 6(4):515–522, 2003) proposed a modified unrelated question procedure and demonstrated that it is capable of producing a more efficient estimator of the population parameter πA, namely, the proportion of persons in a community bearing a sensitive character A, when πA < 0.50. The development of Singh et al. (2003) is based on simple random samples with replacement and on the assumption that πB, namely, the proportion of individuals bearing an unrelated innocuous character B, is known. Due to these limitations, Singh et al.'s (2003) procedure cannot be used in practical surveys where the sample units are usually chosen with varying selection probabilities. In this article, following Singh et al. (2003), we propose an alternative RR procedure assuming that the population units are sampled with unequal selection probabilities and that the value of πB is unknown. A numerical example comparing the performance of the proposed RR procedure under alternative sampling designs is also reported.

15.
In this paper, we develop a new forecasting algorithm for value-at-risk (VaR) based on ARMA–GARCH (autoregressive moving average–generalized autoregressive conditional heteroskedastic) models whose innovations follow a Gaussian mixture distribution. For the parameter estimation, we employ the conditional least squares and quasi-maximum-likelihood estimator (QMLE) for ARMA and GARCH parameters, respectively. In particular, Gaussian mixture parameters are estimated based on the residuals obtained from the QMLE of GARCH parameters. Our algorithm provides a handy methodology, spending much less time in calculation than the existing resampling and bias-correction method developed in Hartz et al. [Accurate value-at-risk forecasting based on the normal-GARCH model, Comput. Stat. Data Anal. 50 (2006), pp. 3032–3052]. Through a simulation study and a real-data analysis, it is shown that our method provides an accurate VaR prediction.
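The final step of such a pipeline, turning a fitted Gaussian mixture into a VaR quantile, can be done by numerically inverting the mixture CDF (an illustrative fragment only; the ARMA–GARCH estimation stages are not reproduced, and the function names are assumptions):

```python
import math

def norm_cdf(x, mu, sigma):
    """Standard normal CDF shifted and scaled, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def mixture_quantile(alpha, weights, mus, sigmas, lo=-50.0, hi=50.0, tol=1e-10):
    """alpha-quantile of a Gaussian mixture by bisection on its CDF.
    VaR at level alpha is the negative of the alpha-quantile of returns."""
    def cdf(x):
        return sum(w * norm_cdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Degenerate check: a single N(0,1) component recovers the normal quantile.
q = mixture_quantile(0.05, [1.0], [0.0], [1.0])   # q ≈ -1.645
```

Bisection is used because a Gaussian mixture CDF is monotone but has no closed-form inverse; in the full algorithm the mixture parameters come from the GARCH residuals and the quantile is rescaled by the forecast conditional volatility.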

16.
The main objective of this paper is to implement non-binary splits into Gelfand et al.'s modification of the CART algorithm. Multi-way splits can sometimes better reflect the structure of data than binary splits. Methods for the construction and comparison of such splits are described.

17.
Approximate Bayesian computation (ABC) is a popular approach to address inference problems where the likelihood function is intractable, or expensive to calculate. To improve over Markov chain Monte Carlo (MCMC) implementations of ABC, the use of sequential Monte Carlo (SMC) methods has recently been suggested. Most effective SMC algorithms that are currently available for ABC have a computational complexity that is quadratic in the number of Monte Carlo samples (Beaumont et al., Biometrika 96:983–990, 2009; Peters et al., Technical report, 2008; Toni et al., J. Roy. Soc. Interface 6:187–202, 2009) and require the careful choice of simulation parameters. In this article an adaptive SMC algorithm is proposed which admits a computational complexity that is linear in the number of samples and adaptively determines the simulation parameters. We demonstrate our algorithm on a toy example and on a birth-death-mutation model arising in epidemiology.

18.
The Buckley–James estimator (BJE) [J. Buckley and I. James, Linear regression with censored data, Biometrika 66 (1979), pp. 429–436] has been extended from right-censored (RC) data to interval-censored (IC) data by Rabinowitz et al. [D. Rabinowitz, A. Tsiatis, and J. Aragon, Regression with interval-censored data, Biometrika 82 (1995), pp. 501–513]. The BJE is defined to be a zero-crossing of a modified score function H(b), a point at which H(·) changes its sign. We discuss several approaches (for finding a BJE with IC data) which are extensions of the existing algorithms for RC data. However, these extensions may not be appropriate for some data, in particular, they are not appropriate for a cancer data set that we are analysing. In this note, we present a feasible iterative algorithm for obtaining a BJE. We apply the method to our data.
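The zero-crossing search itself can be illustrated by bisection on a bracketing interval (a generic sketch; the modified score H(b) from the paper is replaced here by a toy function, and the function name is an assumption):

```python
def zero_crossing(H, lo, hi, tol=1e-8):
    """Locate a sign change of H on [lo, hi] by bisection; requires H(lo)
    and H(hi) to have opposite signs. For a step function such as a
    modified score, this returns a point within tol of a crossing even
    when no exact root exists."""
    f_lo = H(lo)
    assert f_lo * H(hi) <= 0, "no sign change bracketed on [lo, hi]"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_lo * H(mid) > 0:
            lo, f_lo = mid, H(mid)   # crossing lies in the right half
        else:
            hi = mid                 # crossing lies in the left half
    return 0.5 * (lo + hi)

# Toy check: x^3 - 2 changes sign at the cube root of 2.
b = zero_crossing(lambda x: x**3 - 2.0, 0.0, 2.0)
```

The practical difficulty the note addresses is that with interval-censored data H(·) need not be monotone and may have multiple sign changes, so the bracketing step itself becomes the hard part.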

19.
The L1-type regularization provides a useful tool for variable selection in high-dimensional regression modeling. Various algorithms have been proposed to solve optimization problems for L1-type regularization. In particular, the coordinate descent algorithm has been shown to be effective in sparse regression modeling. Although the algorithm shows remarkable performance in solving optimization problems for L1-type regularization, it suffers from outliers, since the procedure is based on the inner product of the predictor variables and partial residuals obtained in a non-robust manner. To overcome this drawback, we propose a robust coordinate descent algorithm, focusing especially on high-dimensional regression modeling based on the principal components space. We show that the proposed robust algorithm converges to the minimum value of its objective function. Monte Carlo experiments and real data analysis are conducted to examine the efficiency of the proposed robust algorithm. We observe that our robust coordinate descent algorithm performs effectively for high-dimensional regression modeling even in the presence of outliers.

20.

In this paper, we consider testing for linearity against a well-known class of regime switching models known as the smooth transition autoregressive (STAR) models. Apart from model selection issues, one reason for interest in testing for linearity in time-series models is that non-linear models such as the STAR are considerably more difficult to use. This testing problem is non-standard because a nuisance parameter becomes unidentified under the null hypothesis. In this paper, we further explore the class of tests proposed by Luukkonen, Saikkonen and Teräsvirta (1988), who proposed LM tests for linearity against STAR models. A potential difficulty here is that the linear approximation introduces high leverage points, and hence outliers are likely to be quite influential. To overcome this difficulty, we use the same approximating linear model of Luukkonen et al. (1988), but we apply Wald and F-tests based on ℓ1 and bounded-influence estimates. The efficiency gains of this procedure cannot be easily deduced from the existing theoretical results because the test is based on a misspecified model under H1. Therefore, we carried out a simulation study, in which we observed that the robust tests have desirable properties compared to the test of Luukkonen et al. (1988) for a range of error distributions in the STAR model; in particular, the robust tests have power advantages over the LM test.

