Similar Documents
1.
Observational data analysis is often based on tacit assumptions of ignorability or randomness. The paper develops a general approach to local sensitivity analysis for selectivity bias, which aims to study the sensitivity of inference to small departures from such assumptions. If M is a model assuming ignorability, we surround M by a small neighbourhood N defined in the sense of Kullback–Leibler divergence and then compare the inference for models in N with that for M. Interpretable bounds for such differences are developed. Applications to missing data and to observational comparisons are discussed. Local approximations to sensitivity analysis are model robust and can be applied to a wide range of statistical problems.
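The Kullback–Leibler neighbourhood at the heart of this approach is easy to compute for discrete distributions. A minimal sketch (the function names, the example distributions, and the ε threshold are illustrative, not taken from the paper):

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def in_neighbourhood(p, q, eps):
    """Does q lie in the KL eps-neighbourhood of p?"""
    return kl_divergence(p, q) <= eps

# model M assumes ignorable selection; a perturbed model departs slightly
ignorable = [0.25, 0.25, 0.25, 0.25]
perturbed = [0.28, 0.24, 0.24, 0.24]
```

Comparing inference across all models q with D(p ‖ q) ≤ ε, rather than at a single alternative, is what yields the interpretable bounds described above.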

2.
The most popular and perhaps universal estimator of location and scale in robust estimation, where the population is normal with possible small departures, is Huber's Proposal-2 M-estimator. This paper gives the first-order small sample bias correction for the scale estimator, verifying the calculation through theory and simulation. Other ways of reducing small sample bias, say by jackknifing or bootstrapping, can be computationally intensive, and would not be routinely used with this iteratively derived estimator. It is suggested that bias-reduced estimates of scale are most useful when forming confidence intervals for location and/or scale based on the asymptotic distribution.
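Huber's Proposal 2 solves two estimating equations jointly, one for location and one for scale. A minimal iterative sketch, without the paper's small-sample bias correction (the tuning constant k = 1.345 and the simple fixed-point iteration are illustrative choices):

```python
import math

def huber_proposal2(x, k=1.345, n_iter=200):
    """Joint Huber Proposal-2 M-estimates of location mu and scale s.

    Solves sum(psi(u_i)) = 0 and sum(psi(u_i)^2) = n * beta,
    where u_i = (x_i - mu) / s and psi clips its argument at +/- k.
    """
    n = len(x)
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    phi = lambda z: math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)
    # beta = E[psi(Z)^2] under a standard normal, making s consistent there
    beta = (2 * Phi(k) - 1) - 2 * k * phi(k) + 2 * k * k * (1 - Phi(k))
    mu = sorted(x)[n // 2]                          # start at the median
    s = sum(abs(xi - mu) for xi in x) / n or 1.0    # crude starting scale
    for _ in range(n_iter):
        u = [(xi - mu) / s for xi in x]
        psi = [max(-k, min(k, ui)) for ui in u]
        mu = mu + s * sum(psi) / n                               # location step
        s = s * math.sqrt(sum(p * p for p in psi) / (n * beta))  # scale step
    return mu, s
```

In very small samples the scale estimate returned by such an iteration carries exactly the first-order bias that the paper's correction targets.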

3.
For estimation of population totals, dual system estimation (DSE) is often used. Such a procedure is known to suffer from bias under certain conditions. In the following, a simple model is proposed that combines three conditions under which bias of the DSE can result. The conditions relate to response correlation, classification error and matching error. The resulting bias is termed model bias. The effects of model bias and synthetic bias in a small area estimation application are illustrated. The illustration uses simulated population data.
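The dual system estimator itself is the classical Lincoln–Petersen capture–recapture formula. A minimal sketch (the counts are invented for illustration):

```python
def dual_system_estimate(n1, n2, m):
    """Lincoln-Petersen dual system estimate of a population total.

    n1: individuals counted by the first source (e.g. a census)
    n2: individuals counted by the second source (e.g. a coverage survey)
    m:  individuals matched in both sources
    """
    if m == 0:
        raise ValueError("no matched individuals: estimate is undefined")
    return n1 * n2 / m
```

The three bias conditions enter through m: matching error that misses true matches shrinks m and pushes the estimate up, while positive response correlation between the two sources inflates m and pushes it down.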

4.
Often in observational studies of time to an event, the study population is a biased (i.e., unrepresentative) sample of the target population. In the presence of biased samples, it is common to weight subjects by the inverse of their respective selection probabilities. Pan and Schaubel (Can J Stat 36:111–127, 2008) recently proposed inference procedures for an inverse selection probability weighted (ISPW) Cox model, applicable when selection probabilities are not treated as fixed but estimated empirically. The proposed weighting procedure requires auxiliary data to estimate the weights and is computationally more intense than unweighted estimation. The ignorability of the sample selection process, in terms of parameter estimators and predictions, is often of interest from several perspectives: e.g., to determine whether weighting makes a significant difference to the analysis at hand, which would in turn address whether the collection of auxiliary data is required in future studies; or to evaluate previous studies which did not correct for selection bias. In this article, we propose methods to quantify the degree of bias corrected by the weighting procedure in the partial likelihood and Breslow–Aalen estimators. Asymptotic properties of the proposed test statistics are derived. The finite-sample significance level and power are evaluated through simulation. The proposed methods are then applied to data from a national organ failure registry to evaluate the bias in a post-kidney-transplant survival model.

5.
Given a large number of test statistics, a small proportion of which represent departures from the relevant null hypothesis, a simple rule is given for choosing those statistics that are indicative of departure. It is based on fitting by moments a mixture model to the set of test statistics and then deriving an estimated likelihood ratio. Simulation suggests that the procedure has good properties when the departure from an overall null hypothesis is not too small.

6.
In the empirical literature on assortative matching using linked employer–employee data, unobserved worker quality appears to be negatively correlated with unobserved firm quality. We show that this can be caused by standard estimation error. We develop formulae showing that the estimated correlation is biased downwards if there is true positive assortative matching and when any conditioning covariates are uncorrelated with the firm and worker fixed effects. We show that this bias is bigger the fewer movers there are in the data ('limited mobility bias'). This result applies to any two-way (or higher) error components model that is estimated by fixed effects methods. We apply these bias corrections to a large German linked employer–employee data set. We find that, although the biases can be considerable, they are not sufficiently large to remove the negative correlation entirely.

7.
A study is presented on the robustness of adapting the sample size of a phase III trial on the basis of existing phase II data, for the case where the phase III effect size is lower than that of phase II. A criterion of clinical relevance for phase II results is applied in order to launch phase III, where data from phase II cannot be included in the statistical analysis. The adaptation consists of adopting the conservative approach to sample size estimation, which takes into account the variability of phase II data. Several conservative sample size estimation strategies, Bayesian and frequentist, are compared with the calibrated optimal γ conservative strategy (COS), which is the best performer when phase II and phase III effect sizes are equal. The overall power (OP) of these strategies and the mean square error (MSE) of their sample size estimators are computed under different scenarios, in the presence of the structural bias due to the lower phase III effect size, in order to evaluate the robustness of the strategies. When the structural bias is quite small (i.e., the ratio of phase III to phase II effect size is greater than 0.8), and when some operating conditions for applying sample size estimation hold, COS can still provide acceptable results for planning phase III trials, even though the OP is lower than it would be in the absence of bias.

The main results concern the introduction of a correction, which affects only sample size estimates and not launch probabilities, for balancing the structural bias. In particular, the correction is based on a postulated value of the structural bias; hence, it is more intuitive and easier to use than corrections based on modifying the Type I and/or Type II errors. A comparison of corrected conservative sample size estimation strategies is performed in the presence of a quite small bias. When the postulated correction is right, COS provides good OP and the lowest MSE. Moreover, the OPs of COS are even higher than those observed without bias, thanks to a higher launch probability and similar estimation performance. The structural bias can therefore be exploited to improve sample size estimation performance. When the postulated correction is smaller than necessary, COS is still the best performer and continues to work well. A higher than necessary correction should be avoided.
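The conservative principle can be sketched by shrinking the phase II effect estimate toward zero before applying the standard two-arm sample size formula; the γ quantile plays the role of the conservativeness parameter. This is only a simplified illustration of the idea — the paper's calibrated COS strategy is more involved — and all numeric inputs are invented:

```python
import math

def norm_ppf(p, lo=-10.0, hi=10.0):
    """Standard normal quantile by bisection on the erf-based CDF."""
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    for _ in range(200):
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if cdf(mid) < p else (lo, mid)
    return (lo + hi) / 2.0

def conservative_sample_size(d_hat, se_d, gamma, alpha=0.05, power=0.80):
    """Per-arm n for a two-arm trial, shrinking the phase II effect
    estimate d_hat by the gamma-quantile of its sampling error."""
    d_cons = d_hat - norm_ppf(gamma) * se_d   # gamma = 0.5 means no shrinkage
    n = 2.0 * ((norm_ppf(1 - alpha / 2) + norm_ppf(power)) / d_cons) ** 2
    return math.ceil(n)
```

Larger γ guards against an optimistic phase II estimate at the price of a larger trial, which is the OP-versus-MSE trade-off the strategies above navigate.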

8.
Hea-Jung Kim & Taeyoung Roh, Statistics, 2013, 47(5): 1082–1111
In regression analysis, a sample selection scheme often applies to the response variable, resulting in observations that are missing not at random. In this case, a regression analysis using only the selected cases would lead to biased results. This paper proposes a Bayesian methodology to correct this bias, based on a semiparametric Bernstein polynomial regression model that incorporates the sample selection scheme through a stochastic monotone trend constraint, together with variable selection and robustness against departures from the normality assumption. We present the basic theoretical properties of the proposed model, including its stochastic representation, quantification of the sample selection bias, and a hierarchical model specification that handles the stochastic monotone trend constraint in the nonparametric component, simple bias-corrected estimation, and variable selection for the linear components. We then develop computationally feasible Markov chain Monte Carlo methods for semiparametric Bernstein polynomial functions with stochastically constrained parameter estimation and variable selection procedures. We demonstrate the finite-sample performance of the proposed model compared to existing methods using simulation studies, and illustrate its use with two real data applications.

9.
The gamma frailty model is a natural extension of the Cox proportional hazards model in survival analysis. Because the frailties are unobserved, an EM approach is often used for estimation. Such an approach is shown to lead to finite sample underestimation of the frailty variance, with the corresponding regression parameters also being underestimated as a result. For the univariate case, we investigate the source of the bias with simulation studies and a complete enumeration. The rank-based EM approach, we note, identifies frailty only through the order in which failures occur; additional frailty which is evident in the survival times is ignored, and as a result the frailty variance is underestimated. An adaptation of the standard EM approach is suggested, whereby the non-parametric Breslow estimate is replaced by a local likelihood formulation for the baseline hazard, which allows the survival times themselves to enter the model. Simulations demonstrate that this approach substantially reduces the bias, even at small sample sizes. The method developed is applied to survival data from the North West Regional Leukaemia Register.

10.
An extensive simulation study is conducted to compare the performance between balanced and antithetic resampling for the bootstrap in estimation of bias, variance, and percentiles when the statistic of interest is the median, the square root of the absolute value of the mean, or the median absolute deviation from the median. Simulation results reveal that balanced resampling provides better efficiencies in most cases; however, antithetic resampling is superior in estimating bias of the median. We also investigate the possibility of combining an existing efficient bootstrap computation of Efron (1990) with balanced or antithetic resampling for percentile estimation. Results indicate that the combination method does indeed offer gains in performance, though the gains are much more dramatic for the bootstrap t statistic than for any of the three statistics of interest as described above.
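Balanced resampling constrains the B bootstrap samples so that every observation appears exactly B times in total, which removes one source of Monte Carlo variability. A minimal sketch of generating balanced index sets:

```python
import random

def balanced_bootstrap_indices(n, B, seed=0):
    """B bootstrap index sets of size n in which each of the n
    observations appears exactly B times overall (balanced resampling)."""
    rng = random.Random(seed)
    pool = list(range(n)) * B      # B copies of every index
    rng.shuffle(pool)              # one random permutation of the whole pool
    return [pool[b * n:(b + 1) * n] for b in range(B)]
```

Antithetic resampling instead pairs each bootstrap sample with a mirror sample that reverses the ranks of the drawn values, inducing negative correlation between the pair's estimates so that their average has reduced variance.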

11.
Classification error can lead to substantial biases in the estimation of gross flows from longitudinal data. We propose a method to adjust flow estimates for bias, based on fitting separate multinomial logistic models to the classification error probabilities and the true state transition probabilities using values of auxiliary variables. Our approach has the advantages that it does not require external information on misclassification rates, it permits the identification of factors that are related to misclassification and true transitions and it does not assume independence between classification errors at successive points in time. Constraining the prediction of the stocks to agree with the observed stocks protects against model misspecification. We apply the approach to data on women from the Panel Study of Income Dynamics with three categories of labour force status. The model fitted is shown to have interpretable coefficient estimates and to provide a good fit. Simulation results indicate good performance of the model in predicting the true flows and robustness against departures from the model postulated.
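The basic distortion mechanism is easy to see in the simplest setting the paper generalizes: with independent classification errors at the two waves, the expected observed flow matrix is Pᵀ F P, where F holds the true flows and P the classification error probabilities. A sketch with a fixed, known P (all numbers invented; the paper instead models both P and the transitions with multinomial logits on auxiliary variables):

```python
import numpy as np

# P[i, j]: probability that true state i is recorded as state j
P = np.array([[0.95, 0.05],
              [0.10, 0.90]])

# true gross flows between two labour force states across two waves
F_true = np.array([[50.0, 10.0],
                   [5.0, 35.0]])

# expected observed flows under independent misclassification at both waves
F_obs = P.T @ F_true @ P

# recovering the true flows by inverting the error mechanism
F_recovered = np.linalg.solve(P.T, F_obs) @ np.linalg.inv(P)
```

Even modest off-diagonal mass in P visibly inflates the observed off-diagonal flows, which is why unadjusted gross flow estimates overstate mobility.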

12.
A methodology is developed for estimating consumer acceptance limits on a sensory attribute of a manufactured product. In concept these limits are analogous to engineering tolerances. The method is based on a generalization of Stevens' Power Law. This generalized law is expressed as a nonlinear statistical model. Instead of restricting the analysis to this particular case, a strategy is discussed for evaluating nonlinear models in general, since scientific models are frequently of nonlinear form. The strategy focuses on understanding the geometrical contrasts between linear and nonlinear model estimation and on assessing the bias in estimation and the departures from a Gaussian sampling distribution. Computer simulation is employed to examine the behavior of nonlinear least squares estimation. In addition to the usual Gaussian assumption, a bootstrap sample reuse procedure and a general triangular distribution are introduced for evaluating the effects of a non-Gaussian or asymmetrical error structure. Recommendations are given for further model analysis based on the simulation results. In the case of a model for which estimation bias is not a serious issue, estimating functions of the model are considered. Application of these functions to the generalization of Stevens' Power Law leads to a means for defining and estimating consumer acceptance limits. The statistical form of the law and the model evaluation strategy are applied to consumer research data. Estimation of consumer acceptance limits is illustrated and discussed.
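Stevens' Power Law, ψ = k·φⁿ, becomes linear on the log–log scale, so a quick OLS fit of the exponent is possible before committing to the full nonlinear least squares analysis the article describes. A minimal sketch (the data are exact, invented values; note that log-linearization changes the error structure, which is precisely why the article studies the nonlinear model's estimation bias and non-Gaussian sampling behaviour directly):

```python
import math

def fit_power_law(stimulus, sensation):
    """Fit psi = k * phi**n by OLS on log(psi) = log(k) + n*log(phi)."""
    xs = [math.log(v) for v in stimulus]
    ys = [math.log(v) for v in sensation]
    m = len(xs)
    xbar, ybar = sum(xs) / m, sum(ys) / m
    n_hat = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    k_hat = math.exp(ybar - n_hat * xbar)   # intercept back on original scale
    return k_hat, n_hat
```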

13.
Supremum score test statistics are often used to evaluate hypotheses with unidentifiable nuisance parameters under the null hypothesis. Although these statistics provide an attractive framework to address non-identifiability under the null hypothesis, little attention has been paid to their distributional properties in small to moderate sample size settings. In situations where there are identifiable nuisance parameters under the null hypothesis, these statistics may behave erratically in realistic samples as a result of a non-negligible bias induced by substituting these nuisance parameters by their estimates under the null hypothesis. In this paper, we propose an adjustment to the supremum score statistics by subtracting the expected bias from the score processes and show that this adjustment does not alter the limiting null distribution of the supremum score statistics. Using a simple example from the class of zero-inflated regression models for count data, we show empirically and theoretically that the adjusted tests are superior in terms of size and power. The practical utility of this methodology is illustrated using count data in HIV research.

14.
This article considers identification and estimation of social network models in a system of simultaneous equations. We show that, with or without row-normalization of the social adjacency matrix, the network model has different equilibrium implications, needs different identification conditions, and requires different estimation strategies. When the adjacency matrix is not row-normalized, the variation in the Bonacich centrality across nodes in a network can be used as an IV to identify social interaction effects and improve estimation efficiency. The number of such IVs depends on the number of networks. When there are many networks in the data, the proposed estimators may have an asymptotic bias due to the presence of many IVs. We propose a bias-correction procedure for the many-instrument bias. Simulation experiments show that the bias-corrected estimators perform well in finite samples. We also provide an empirical example to illustrate the proposed estimation procedure.
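The Bonacich centrality that supplies the instrument is b(α, β) = (I − βA)⁻¹ αA·1. A minimal sketch on a three-node line network (the values of α and β are invented):

```python
import numpy as np

def bonacich_centrality(A, beta, alpha=1.0):
    """Bonacich centrality b = (I - beta*A)^(-1) * alpha * A * 1."""
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) - beta * A, alpha * A @ np.ones(n))

# three nodes in a line: 0 -- 1 -- 2 (adjacency not row-normalized)
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
b = bonacich_centrality(A, beta=0.1)   # middle node is the most central
```

With a row-normalized adjacency matrix, A·1 = 1 for every node, so b collapses to the constant α/(1 − β) vector; this is why cross-node variation in Bonacich centrality is only available as an instrument in the non-normalized case.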

15.
We study the bias that arises from using censored regressors in estimation of linear models. We present results on bias in ordinary least squares (OLS) regression estimators with exogenous censoring and in instrumental variable (IV) estimators when the censored regressor is endogenous. Bound censoring, such as top-coding, results in expansion bias, that is, effects that are too large. Independent censoring results in bias that varies with the estimation method: attenuation bias in OLS estimators and expansion bias in IV estimators. Severe biases can result when there are several regressors and when a 0–1 variable is used in place of a continuous regressor.

16.
This article investigates the consequences of departures from independence when the component lifetimes in a series system are exponentially distributed. Such departures are studied when the joint distribution is assumed to follow either one of the three Gumbel bivariate exponential models, the Downton bivariate exponential model, or the Oakes bivariate exponential model. Two distinct situations are considered. First, in theoretical modeling of series systems, when the distribution of the component lifetimes is assumed, one wishes to compute system reliability and mean system life. Second, errors in parametric and nonparametric estimation of component reliability and component mean life are studied based on life-test data collected on series systems when the assumption of independence is made.

17.
Modern systems of official statistics require the estimation and publication of business statistics for disaggregated domains, for example, industry domains and geographical regions. Outlier robust methods have proven to be useful for small-area estimation. Recently proposed outlier robust model-based small-area methods assume, however, uncorrelated random effects. Spatial dependencies, resulting from similar industry domains or geographic regions, often occur. In this paper, we propose an outlier robust small-area methodology that allows for the presence of spatial correlation in the data. In particular, we present a robust predictive methodology that incorporates the potential spatial impact from other areas (domains) on the small area (domain) of interest. We further propose two parametric bootstrap methods for estimating the mean-squared error. Simulations indicate that the proposed methodology may lead to efficiency gains. The paper concludes with an illustrative application by using business data for estimating average labour costs in Italian provinces.

18.
Case–control studies allow efficient estimation of the associations of covariates with a binary response in settings where the probability of a positive response is small. It is well known that covariate–response associations can be consistently estimated using a logistic model by acting as if the case–control (retrospective) data were prospective, and that this result does not hold for other binary regression models. However, in practice an investigator may be interested in fitting a non-logistic link binary regression model, and this paper examines the magnitude of the bias resulting from ignoring the case–control sample design with such models. The paper presents an approximation to the magnitude of this bias in terms of the sampling rates of cases and controls, as well as simulation results showing that the bias can be substantial.
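For the logistic link, the consistency result means that retrospective sampling shifts only the intercept, by the log ratio of the case and control sampling rates, while the slope coefficients are untouched. A sketch of that intercept correction (the sampling rates in the example are invented):

```python
import math

def correct_intercept(beta0_retrospective, rate_cases, rate_controls):
    """Recover the prospective logistic intercept from a fit to
    case-control data: beta0 = beta0* - log(rate_cases / rate_controls)."""
    return beta0_retrospective - math.log(rate_cases / rate_controls)
```

Sampling all cases but only 1% of controls, for instance, requires subtracting log(100) ≈ 4.61 from the fitted intercept. No correction this simple exists for non-logistic links, which is why the bias approximation above is needed.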

19.
The logistic regression model has been widely used in the social and natural sciences and results from studies using this model can have significant policy impacts. Thus, confidence in the reliability of inferences drawn from these models is essential. The robustness of such inferences is dependent on sample size. The purpose of this article is to examine the impact of alternative data sets on the mean estimated bias and efficiency of parameter estimation and inference for the logistic regression model with observational data. A number of simulations are conducted examining the impact of sample size, nonlinear predictors, and multicollinearity on substantive inferences (e.g. odds ratios, marginal effects) when using logistic regression models. Findings suggest that small sample size can negatively affect the quality of parameter estimates and inferences in the presence of rare events, multicollinearity, and nonlinear predictor functions, but marginal effects estimates are relatively more robust to sample size.

20.
In longitudinal data where the timing and frequency of the measurement of outcomes may be associated with the value of the outcome, significant bias can occur. Previous results depended on correct specification of the outcome process and a somewhat unrealistic visit process model. In practice, this will never exactly be the case, so it is important to understand to what degree the results hold when those assumptions are violated in order to guide practical use of the methods. This paper presents theory and the results of simulation studies to extend our previous work to more realistic visit process models, as well as Poisson outcomes. We also assess the effects of several types of model misspecification. The estimated bias in these new settings generally mirrors the theoretical and simulation results of our previous work and provides confidence in using maximum likelihood methods in practice. Even when the assumptions about the outcome process did not hold, mixed effects models fit by maximum likelihood produced at most small bias in estimated regression coefficients, illustrating the robustness of these methods. This contrasts with generalised estimating equations approaches, where bias increased in the settings of this paper. The analysis of data from a study of change in neurological outcomes following microsurgery for a brain arteriovenous malformation further illustrates the results.
