Similar articles
20 similar articles found (search time: 385 ms)
1.
When estimating the treatment effect on a count outcome for a given population, different studies use different models, yielding non-comparable measures of treatment effect. Here we show that the marginal rate differences in these studies are comparable measures of the treatment effect. We estimate the marginal rate differences by log-linear models and show that their finite-sample maximum-likelihood estimates are unbiased and highly robust to the effects of dispersing covariates on the outcome. We obtain approximate finite-sample distributions of these estimates from the asymptotic normal distribution of the estimates of the log-linear model parameters. The method is straightforward to apply in practice.

2.
The standard Tobit model is constructed under the assumption of a normal distribution and has been widely applied in econometrics. Atypical/extreme data have a harmful effect on the maximum likelihood estimates of the standard Tobit model parameters. We therefore need diagnostic tools to evaluate the effect of extreme data and, when such data are detected, a Tobit model that is robust to them. The family of elliptically contoured distributions has the Laplace, logistic, normal, and Student-t distributions as some of its members. This family has been widely used to generalize models based on the normal distribution, with excellent practical results. In particular, because the Student-t distribution has an additional parameter, we can adjust for the kurtosis of the data, obtaining estimates that are robust against extreme data. We propose a methodology based on a generalization of the standard Tobit model with errors following elliptical distributions. Diagnostics for the Tobit model with elliptical errors are developed: we derive residuals and global/local influence methods under several perturbation schemes, which is important because different diagnostic methods can detect different atypical data. We implement the proposed methodology in an R package and illustrate it with real-world econometric data, showing its potential applications. In the application presented, the Tobit model based on the Student-t distribution with a small number of degrees of freedom performs excellently, reducing the influence of extreme cases on the maximum likelihood estimates. This provides new empirical evidence of the capability of the Student-t distribution to accommodate atypical data.
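To make the construction concrete, here is a hedged sketch of a Tobit likelihood with Student-t errors (left-censoring at zero, fixed degrees of freedom, hypothetical simulated data; the paper's full elliptical family, diagnostics, and R package are not reproduced):

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(1)
n = 1000
x = rng.normal(size=n)
ystar = 1.0 + 2.0 * x + stats.t.rvs(df=3, size=n, random_state=rng)
y = np.maximum(ystar, 0.0)                 # latent response, censored at zero
cens = y == 0

def negloglik(theta, nu=3.0):
    b0, b1, log_s = theta
    s = np.exp(log_s)                      # keep the scale positive
    r = (y - b0 - b1 * x) / s
    # Uncensored points contribute the t density, censored points the t CDF.
    ll_unc = stats.t.logpdf(r[~cens], df=nu) - np.log(s)
    ll_cen = stats.t.logcdf(-(b0 + b1 * x[cens]) / s, df=nu)
    return -(ll_unc.sum() + ll_cen.sum())

fit = optimize.minimize(negloglik, x0=[0.0, 1.0, 0.0], method="Nelder-Mead")
b0_hat, b1_hat, _ = fit.x
```

The heavy t tails keep extreme latent responses from dominating the fit, which is the robustness mechanism the abstract describes.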

3.
We use a logistic model to obtain point and interval estimates of the marginal risk difference in observational studies and randomized trials with a dichotomous outcome. We prove that the maximum likelihood estimate of the marginal risk difference is unbiased in finite samples and highly robust to the effects of dispersing covariates. We use the approximate normal distribution of the maximum likelihood estimates of the logistic model parameters to obtain the approximate distribution of the maximum likelihood estimate of the marginal risk difference, and from it an interval estimate of the marginal risk difference. We illustrate the application of the method with a real medical example.

4.
In this paper, we consider the estimation of the parameters of a general nonlinear regression model. An estimator that minimises the weighted Wilcoxon dispersion function is considered, and its asymptotic properties are established under mild regularity conditions similar to those used in least squares and least absolute deviations estimation. As in linear models, the procedure provides estimators that are robust and highly efficient. The estimates depend on the choice of a weight function, and diagnostics which differentiate between nonlinear fits are provided along with appropriate benchmarks. The behavior of these estimates is illustrated on a real data set. A simulation study verifies the robustness, efficiency, and validity of these estimates over several error distributions, including the normal and a family of contaminated normal distributions.
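A rough sketch of the dispersion-minimization idea, using the plain (unweighted) Wilcoxon dispersion Σ_{i<j}|e_i − e_j| and a hypothetical exponential-growth model; the paper's weighted version inserts data-dependent weights into this pairwise sum:

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0.0, 2.0, size=n)
# Nonlinear model y = b0 * exp(b1 * x) with heavy-tailed t(3) errors.
y = 2.0 * np.exp(0.5 * x) + stats.t.rvs(df=3, size=n, random_state=rng)

def dispersion(beta):
    e = y - beta[0] * np.exp(beta[1] * x)           # residuals
    # Wilcoxon (Gini-type) dispersion: sum of all pairwise gaps |e_i - e_j|.
    return np.sum(np.abs(e[:, None] - e[None, :]))

fit = optimize.minimize(dispersion, x0=[1.5, 0.4], method="Nelder-Mead")
b0_hat, b1_hat = fit.x
```

Minimizing pairwise residual gaps rather than squared residuals is what gives the estimator its resistance to heavy-tailed errors.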

5.
We compare minimum Hellinger distance and minimum Hellinger disparity estimates for U-shaped beta distributions. Given suitable density estimates, both methods are known to be asymptotically efficient when the data come from the assumed model family, and robust to small perturbations from the model family. Most implementations use kernel density estimates, which may not be appropriate for U-shaped distributions. We compare fixed-binwidth histograms, percentile mesh histograms, and averaged shifted histograms. Minimum disparity estimates are less sensitive to the choice of density estimate than are minimum distance estimates, and the percentile mesh histogram gives the best results for both minimum distance and minimum disparity estimates. Minimum distance estimates are biased, and a bias-corrected method is proposed. Minimum disparity estimates and bias-corrected minimum distance estimates are comparable to maximum likelihood estimates when the model holds, and give better results than either method of moments or maximum likelihood when the data are discretized or contaminated. Although our results are for the beta density, the implementations are easily modified for other U-shaped distributions such as the Dirichlet or normal generated distribution.
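A minimal sketch of a minimum Hellinger distance fit from a fixed-binwidth histogram (hypothetical simulated data; the percentile mesh, averaged shifted histograms, disparity variant, and bias correction discussed above are not shown):

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(2)
data = stats.beta.rvs(0.5, 0.5, size=2000, random_state=rng)  # U-shaped

# Fixed-binwidth histogram density estimate on (0, 1).
edges = np.linspace(0.0, 1.0, 41)
dens, _ = np.histogram(data, bins=edges, density=True)
mids = 0.5 * (edges[:-1] + edges[1:])
width = edges[1] - edges[0]

def hellinger2(params):
    a, b = params
    if a <= 0 or b <= 0:
        return np.inf
    model = stats.beta.pdf(mids, a, b)
    # Squared Hellinger distance between histogram and model densities,
    # approximated by a Riemann sum over the bin midpoints.
    return np.sum((np.sqrt(dens) - np.sqrt(model)) ** 2) * width

fit = optimize.minimize(hellinger2, x0=[1.0, 1.0], method="Nelder-Mead")
a_hat, b_hat = fit.x
```

The boundary bins of a U-shaped density are exactly where a fixed-binwidth histogram approximates the model worst, which motivates the percentile mesh comparison in the abstract.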

6.
In this study we investigate the problem of estimation and hypothesis testing in multivariate linear regression models when the errors are assumed to be non-normally distributed. We consider the class of heavy-tailed distributions for this purpose. Although our method is applicable to any distribution in this class, we take the multivariate t-distribution for illustration. This distribution has applications in many fields of applied research, such as economics, business, and finance. For estimation, we use the modified maximum likelihood method to obtain the so-called modified maximum likelihood estimates, which are available in closed form. We show that these estimates are substantially more efficient than least-squares estimates. They are also found to be robust to reasonable deviations from the assumed distribution and to many data anomalies, such as the presence of outliers in the sample. We further provide test statistics for testing the relevant hypotheses regarding the regression coefficients.

7.
We consider the problem of making inferences about extreme values from a sample. The underlying model distribution is the generalized extreme-value (GEV) distribution, and our interest is in estimating the parameters and quantiles of the distribution robustly. In doing this we find estimates for the GEV parameters based on that part of the data which is well fitted by a GEV distribution. The robust procedure assigns weights between 0 and 1 to each data point. A weight near 0 indicates that the data point is not well modelled by the GEV distribution which fits the points with weights at or near 1. On the basis of these weights we are able to assess the validity of a GEV model for our data. It is important that the observations with low weights be carefully assessed to determine whether they are valid observations or not. If they are, we must examine whether our data could be generated by a mixture of GEV distributions or whether some other process is involved in generating the data. This will require careful consideration of the subject matter area which led to the data. The robust estimation techniques are based on optimal B-robust estimates. Their performance is compared to the probability-weighted moment estimates of Hosking et al. (1985) on both simulated and real data.

8.
In this article, we consider a linear regression model with AR(p) error terms under the assumption that the errors follow a t distribution, a heavy-tailed alternative to the normal distribution. We obtain the estimators of the model parameters by the conditional maximum likelihood (CML) method and use an iterative reweighting algorithm (IRA) to compute the estimates of the parameters of interest. We provide a simulation study and three real data examples to illustrate the performance of the proposed robust estimators based on the t distribution.
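The flavor of such an iterative reweighting algorithm can be sketched for the simpler case of i.i.d. t-distributed errors (the AR(p) error structure and CML details of the paper are omitted; the data are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n, nu = 500, 4.0
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.standard_t(nu, size=n)   # heavy-tailed errors

beta = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS starting values
sigma2 = float(np.var(y - X @ beta))
for _ in range(100):
    r = y - X @ beta
    # t-model weights: observations with large residuals get small weights.
    w = (nu + 1.0) / (nu + r**2 / sigma2)
    Xw = X * w[:, None]
    beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)   # weighted least squares
    sigma2 = float(np.sum(w * (y - X @ beta) ** 2) / n)
```

Each pass solves a weighted least-squares problem, so the algorithm is just OLS machinery applied repeatedly with data-driven weights.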

9.
In this paper, we consider the family of skew generalized t (SGT) distributions originally introduced by Theodossiou [P. Theodossiou, Financial data and the skewed generalized t distribution, Manage. Sci. 44(12) (1998), pp. 1650–1661] as a skew extension of the generalized t (GT) distribution. The SGT family warrants special attention because it encompasses distributions having both heavy tails and skewness, and many widely used distributions, such as Student's t, normal, Hansen's skew t, exponential power, and skew exponential power (SEP) distributions, are included as limiting or special cases. We show that the SGT distribution can be obtained as a scale mixture of the SEP and generalized gamma distributions. We investigate several properties of the SGT distribution and consider maximum likelihood estimation of the location, scale, and skewness parameters under the assumption that the shape parameters are known. We show that if the shape parameters are estimated along with the location, scale, and skewness parameters, the influence function of the maximum likelihood estimators becomes unbounded. We obtain the conditions necessary to ensure the uniqueness of the maximum likelihood estimators of the location, scale, and skewness parameters with known shape parameters. We provide a simple iterative re-weighting algorithm to compute the maximum likelihood estimates of the location, scale, and skewness parameters and show that this simple algorithm can be identified as an EM-type algorithm. We finally present two applications of the SGT distributions in robust estimation.

10.
In this paper, we argue that replacing the expectation of the loss in statistical decision theory with the median of the loss leads to a viable and useful alternative to conventional risk minimization, particularly because it can be used with heavy-tailed distributions. We investigate three possible definitions of such medloss estimators and derive examples of them in several standard settings. We argue that the medloss definition based on the posterior distribution is better than the other two definitions, which do not permit optimization over large classes of estimators. Median-loss-minimizing estimates often yield improved performance, are as resistant to outliers as the usual robust estimates, and are resistant to the specific loss used to form them. In simulations with the posterior medloss formulation, we show how the estimates can be obtained numerically and that they can have better robustness properties than estimates derived from risk minimization.
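A hedged numerical sketch of the posterior medloss idea (hypothetical posterior draws; here the posterior is a heavy-tailed Cauchy, for which the posterior expected squared-error loss is infinite while the median of the loss remains finite):

```python
import numpy as np

rng = np.random.default_rng(4)
# Pretend these are draws from a heavy-tailed posterior for theta.
draws = rng.standard_cauchy(size=10000) + 3.0

grid = np.linspace(0.0, 6.0, 601)
# For each candidate action a, compute the MEDIAN (not the mean) of the
# squared-error loss over the posterior draws, then minimize over a.
medloss = [np.median((draws - a) ** 2) for a in grid]
a_hat = grid[int(np.argmin(medloss))]
```

Because the median of the loss exists even when its expectation does not, the minimizer is well defined for this Cauchy posterior, illustrating the heavy-tail motivation in the abstract.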

11.
The t distribution has proved to be a useful alternative to the normal distribution, especially when robust estimation is desired. We consider the multivariate nonlinear Student-t regression model and show that the biases of the estimates of the regression coefficients can be computed from an auxiliary generalized linear regression. We give a formula for the biases of the estimates of the parameters in the scale matrix, which can also be computed by means of a generalized linear regression. We briefly discuss some important special cases and present simulation results indicating that our bias-corrected estimates outperform the uncorrected ones in small samples.

12.
In this paper we present two robust estimates for GARCH models. The first is defined by the minimization of a conveniently modified likelihood; the second is similarly defined, but includes an additional mechanism restricting the propagation of the effect of one outlier to subsequent estimated conditional variances. We study the asymptotic properties of our estimates, proving consistency and asymptotic normality. A Monte Carlo study shows that the proposed estimates compare favorably with other robust estimates. Moreover, we consider some real examples with financial data that illustrate the behavior of these estimates.

13.
In this paper, we study the estimation of p-values for robust tests for the linear regression model. The asymptotic distribution of these tests has only been studied under the restrictive assumption of errors with known scale or symmetric distribution. Since these robust tests are based on robust regression estimates, Efron's bootstrap (1979) presents a number of problems. In particular, it is computationally very expensive, and it is not resistant to outliers in the data. In other words, the tails of the bootstrap distribution estimates obtained by re-sampling the data may be severely affected by outliers. We show how to adapt the Robust Bootstrap (Ann. Statist. 30 (2002) 556; Bootstrapping MM-estimators for linear regression with fixed designs, http://mathstat.carleton.ca/~matias/pubs.html) to this problem. This method is very fast to compute, resistant to outliers in the data, and asymptotically correct under weak regularity assumptions, and it yields asymptotically correct, computationally simple p-value estimates. A simulation study indicates that the tests whose p-values are estimated with the Robust Bootstrap have better finite-sample significance levels than those obtained from the asymptotic theory based on the symmetry assumption. Although this paper is focused on robust scores-type tests (in: Directions in Robust Statistics and Diagnostics, Part I, Springer, New York), our approach can be applied to other robust tests (for example, the Wald- and dispersion-type tests also discussed in Markatou et al., 1991).

14.
We propose here a robust multivariate extension of the bivariate Birnbaum–Saunders (BS) distribution derived by Kundu et al. [Bivariate Birnbaum–Saunders distribution and associated inference. J Multivariate Anal. 2010;101:113–125], based on scale mixtures of normal (SMN) distributions, which are used for modelling symmetric data. The resulting multivariate BS-type distribution is absolutely continuous, and its marginal and conditional distributions are of BS type as in Balakrishnan et al. [Estimation in the Birnbaum–Saunders distribution based on scale mixtures of normals and the EM algorithm. Stat Oper Res Trans. 2009;33:171–192]. Due to the complexity of the likelihood function, parameter estimation by direct maximization is very difficult. For this reason, we exploit the convenient hierarchical representation of the proposed distribution to develop a fast and accurate EM algorithm for computing the maximum likelihood (ML) estimates of the model parameters. We then evaluate the finite-sample performance of the developed EM algorithm and the asymptotic properties of the ML estimates through empirical experiments. Finally, we illustrate the results with a real data set and display the robustness of the estimation procedure developed here.

15.
In this article, the least squares (LS) estimates of the parameters of periodic autoregressive (PAR) models are investigated for various distributions of the error terms via Monte Carlo simulation. Besides the Gaussian distribution, this study covers the exponential, gamma, Student-t, and Cauchy distributions. The estimates are compared across distributions by the bias and MSE criteria. The effects of other factors are also examined: non-constancy of the model orders, non-constancy of the variances of the seasonal white noise, the period length, and the length of the time series. The simulation results indicate that the method is in general robust for the estimation of the AR parameters with respect to the distribution of the error terms and the other factors. However, the estimates of those parameters were, in some cases, noticeably poor for the Cauchy distribution. It is also noticed that the variances of the estimates of the white noise variances are highly affected by the degree of skewness of the distribution of the error terms.
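A small Monte Carlo sketch in the same spirit, for an ordinary AR(1) model rather than a PAR model (hypothetical settings; the bias of the LS estimate is compared under normal and Cauchy errors):

```python
import numpy as np

rng = np.random.default_rng(3)
phi, n, reps = 0.5, 200, 500

def ls_ar1(e):
    # Simulate an AR(1) series driven by the innovations e,
    # then return the least-squares estimate of phi.
    y = np.zeros(e.size)
    for t in range(1, e.size):
        y[t] = phi * y[t - 1] + e[t]
    return np.sum(y[:-1] * y[1:]) / np.sum(y[:-1] ** 2)

est_normal = np.array([ls_ar1(rng.normal(size=n)) for _ in range(reps)])
est_cauchy = np.array([ls_ar1(rng.standard_cauchy(size=n)) for _ in range(reps)])
bias_normal = est_normal.mean() - phi
bias_cauchy = est_cauchy.mean() - phi
```

Comparing the empirical bias and spread of `est_normal` and `est_cauchy` mirrors, in miniature, the bias/MSE comparison across error distributions described in the abstract.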

16.
In this article, we propose a family of bounded-influence robust estimates for the parametric and non-parametric components of a generalized partially linear mixed model subject to censored responses and missing covariates. The asymptotic properties of the proposed estimates are investigated. The estimates are obtained by using a Monte Carlo expectation–maximization algorithm. An approximate method which greatly reduces the computational time is also proposed. A simulation study shows that the performances of the two approaches are similar in terms of bias and mean squared error. The analysis is illustrated through a study on the effect of environmental factors on phytoplankton cell counts.

17.
Nonlinear mixed-effects models are being widely used for the analysis of longitudinal data, especially from pharmaceutical research. They use random effects, which are latent and unobservable variables, so the random-effects distribution is subject to misspecification in practice. In this paper, we first study the consequences of misspecifying the random-effects distribution in nonlinear mixed-effects models. Our study focuses on Gauss-Hermite quadrature, which is now the routine method for calculating the marginal likelihood in mixed models. We then present a formal diagnostic test to check the appropriateness of the assumed random-effects distribution in nonlinear mixed-effects models, which is very useful for real data analysis. Our findings show that the estimates of the fixed-effects parameters in nonlinear mixed-effects models are generally robust to deviations from normality of the random-effects distribution, but the estimates of the variance components are very sensitive to the distributional assumption on the random effects. Furthermore, a misspecified random-effects distribution will either overestimate or underestimate the predictions of the random effects. We illustrate the results using a real data application from an intensive pharmacokinetic study.
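A minimal sketch of Gauss-Hermite quadrature for a marginal likelihood, using a hypothetical Poisson random-intercept model (not the paper's pharmacokinetic model): with b ~ N(0, σ²), the integral ∫ p(y|b) φ(b) db is approximated by π^{-1/2} Σ_k w_k p(y | √2 σ z_k).

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss
from scipy import stats

nodes, weights = hermgauss(20)               # 20-point Gauss-Hermite rule

def marginal_loglik(y, mu, sigma):
    """log of ∫ Π_j Poisson(y_j | exp(mu + b)) · N(b | 0, sigma^2) db."""
    b = np.sqrt(2.0) * sigma * nodes         # change of variables for e^{-z^2}
    # Likelihood of this subject's counts at each quadrature node.
    like = np.array([np.prod(stats.poisson.pmf(y, np.exp(mu + bk)))
                     for bk in b])
    return float(np.log(np.sum(weights * like) / np.sqrt(np.pi)))

y_i = np.array([2, 3, 1, 4])                 # one subject's repeated counts
ll = marginal_loglik(y_i, mu=1.0, sigma=0.5)
```

As a sanity check, setting `sigma=0` collapses the quadrature to the plain Poisson log-likelihood at `b = 0`, since the Gauss-Hermite weights sum to √π.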

18.
In this paper, we propose a robust bandwidth selection method for local M-estimates used in nonparametric regression. We study the asymptotic behavior of the resulting estimates. We use the results of a Monte Carlo study to compare the performance of various competitors for moderate sample sizes. The robust plug-in bandwidth selector we propose compares favorably to its competitors, despite the need to select a pilot bandwidth. The Monte Carlo study also shows that the robust plug-in bandwidth selector is very stable and relatively insensitive to the choice of the pilot.

19.
M-quantile models with application to poverty mapping
Over the last decade there has been growing demand for estimates of population characteristics at the small area level. Unfortunately, cost constraints in the design of sample surveys lead to small sample sizes within these areas, and as a result direct estimation, using only the survey data, is inappropriate since it yields estimates with unacceptable levels of precision. Small area models are designed to tackle the small sample size problem. The most popular class of models for small area estimation is random effects models, which include random area effects to account for between-area variation. However, such models depend on strong distributional assumptions, require a formal specification of the random part of the model, and do not easily allow for outlier-robust inference. An alternative approach to small area estimation based on M-quantile models was recently proposed by Chambers and Tzavidis (Biometrika 93(2):255–268, 2006) and Tzavidis and Chambers (Robust prediction of small area means and distributions. Working paper, 2007). Unlike traditional random effects models, M-quantile models do not depend on strong distributional assumptions and automatically provide outlier-robust inference. In this paper we illustrate for the first time how M-quantile models can be practically employed to derive small area estimates of poverty and inequality. The methodology we propose improves on traditional poverty mapping methods in the following ways: (a) it enables the estimation of the distribution function of the study variable within the small area of interest under both an M-quantile and a random effects model, (b) it provides analytical, instead of empirical, estimation of the mean squared error of the M-quantile small area mean estimates, and (c) it employs an outlier-robust estimation method.
The methodology is applied to data from the 2002 Living Standards Measurement Survey (LSMS) in Albania for estimating (a) district-level estimates of the incidence of poverty in Albania, (b) district-level inequality measures, and (c) the distribution function of household per-capita consumption expenditure in each district. Small area estimates of poverty and inequality show that the poorest Albanian districts are in the mountainous regions (north and north-east), with the wealthiest districts, which are also linked with high levels of inequality, in the coastal (south-west) and southern parts of the country. We discuss the practical advantages of our methodology and note the consistency of our results with results from previous studies. We further demonstrate the usefulness of the M-quantile estimation framework through design-based simulations based on two realistic survey data sets containing small area information, and show that the M-quantile approach may be preferable when the aim is to estimate the small area distribution function.

20.
When incomplete repeated failure times are collected from a large number of independent individuals, interest is focused primarily on the consistent and efficient estimation of the effects of the associated covariates on the failure times. Since repeated failure times are likely to be correlated, it is important to exploit the correlation structure of the failure data in order to obtain such consistent and efficient estimates. However, it may be difficult to specify an appropriate correlation structure for a real-life data set. We propose a robust correlation structure that can be used irrespective of the true correlation structure. This structure is used in constructing an estimating equation for the hazard ratio parameter, under the assumption that the number of repeated failure times for an individual is random. The consistency and efficiency of the estimates are examined through a simulation study, in which the failure times marginally follow an exponential distribution and a Poisson distribution is assumed for the random number of repeated failure times. We conclude by using the proposed method to analyze a bladder cancer data set.
