Similar Articles
20 similar articles found.
1.
In this paper, we develop a numerical method for evaluating the large-sample bias in estimated regression coefficients that arises from exposure model misspecification while adjusting for measurement error in errors-in-variables regression. The method is demonstrated for a logistic errors-in-variables regression model. It combines Monte Carlo, numerical and, in some special cases, analytic integration techniques, and makes it possible to investigate the limiting bias in the estimated regression parameters from a single data set rather than from the repeated data sets required by the conventional repeated-sample method. Simulation studies demonstrate that, under exposure model misspecification in logistic errors-in-variables regression, the proposed method gives estimates of the bias in the regression parameters that are very similar to those of the conventional repeated-sample method, but with higher precision.
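A minimal sketch of the underlying idea: one very large simulated sample approximates the limiting (large-sample) bias of the naive logistic fit that ignores measurement error. The exposure model, error variance and coefficients below are hypothetical illustration values, not those of the paper.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Hypothetical setup: true exposure X ~ N(0, 1); classical measurement
# error gives the observed surrogate W = X + U with U ~ N(0, sigma_u^2).
n = 500_000                      # one large sample stands in for n -> infinity
beta0, beta1, sigma_u = -1.0, 1.0, 0.8

x = rng.normal(0.0, 1.0, n)
w = x + rng.normal(0.0, sigma_u, n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x))))

# The naive fit regresses y on the error-prone W; its coefficient converges
# to an attenuated limit, so the gap to beta1 approximates the limiting bias.
fit = sm.Logit(y, sm.add_constant(w)).fit(disp=False)
print(f"approximate limiting bias in beta1: {fit.params[1] - beta1:+.3f}")
```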

2.
Maximum likelihood estimates (MLEs) of logistic regression coefficients are known to be biased in finite samples and consequently may produce misleading inferences. Bias-adjusted estimates can be calculated using the first-order asymptotic bias derived from a Taylor series expansion of the log-likelihood. Jackknifing can also be used to obtain bias-corrected estimates, but the approach is computationally intensive, requiring an additional series of iterations (steps) for each observation in the dataset. Although the one-step jackknife has been shown to be useful in logistic regression diagnostics and in the estimation of classification error rates, it does not effectively reduce bias. The two-step jackknife, however, can reduce computation in moderate-sized samples, provide estimates of dispersion and classification error, and appears to be effective in bias reduction. Another alternative, a two-step closed-form approximation, is found to be similar to the Taylor series method in certain circumstances. Monte Carlo simulations indicate that all the procedures, but particularly the multi-step jackknife, may tend to over-correct in very small samples. A comparison of the various bias-correction procedures in an example from the medical literature illustrates that bias correction can have a considerable impact on inference.
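A sketch of the basic leave-one-out jackknife bias correction for logistic MLEs, assuming every leave-one-out fit converges (in very small samples, separation can make individual fits fail, which is part of why the abstract's multi-step variants over-correct):

```python
import numpy as np
import statsmodels.api as sm

def jackknife_bias_corrected_logit(X, y):
    """Leave-one-out jackknife bias correction of logistic MLEs."""
    n = len(y)
    full = sm.Logit(y, X).fit(disp=False).params
    loo = np.empty((n, X.shape[1]))
    for i in range(n):
        keep = np.arange(n) != i
        # warm-start each leave-one-out fit at the full-sample MLE
        loo[i] = sm.Logit(y[keep], X[keep]).fit(start_params=full,
                                                disp=False).params
    bias = (n - 1) * (loo.mean(axis=0) - full)
    return full - bias

rng = np.random.default_rng(1)
n = 80
X = sm.add_constant(rng.normal(size=(n, 2)))
y = rng.binomial(1, 1 / (1 + np.exp(-X @ np.array([0.5, 1.0, -1.0]))))
print(jackknife_bias_corrected_logit(X, y))
```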

3.
Likelihood-ratio tests (LRTs) are often used for inference on one or more logistic regression coefficients. Conventionally, for given parameters of interest, the nuisance parameters of the likelihood function are replaced by their maximum likelihood estimates; the resulting function, called the profile likelihood, is then used for LRT-based inference. In small samples, the LRT based on the profile likelihood does not follow a χ² distribution, and several corrections have been proposed to improve the LRT for small-sample data. In addition, complete or quasi-complete separation is a common geometric feature of small-sample binary data. In this article, for small-sample binary data, we derive explicit correction factors of the LRT for models with and without separation, and propose an algorithm to construct confidence intervals. We investigate the performance of the different LRT corrections, and of the corresponding confidence intervals, through simulations. Based on the simulation results, we propose an empirical rule of thumb for the use of these methods. Our simulation findings are also supported by real-world data.
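For reference, a sketch of the uncorrected profile-likelihood interval that the paper's corrections refine: invert the LRT, profiling out the nuisance parameters by holding one coefficient fixed through an offset. The ±10 root-search brackets are an assumption that may need widening for a given data set.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2
import statsmodels.api as sm

def profile_lrt_ci(X, y, j, level=0.95):
    """Profile-likelihood CI for coefficient j by inverting the LRT."""
    fam = sm.families.Binomial()
    full = sm.GLM(y, X, family=fam).fit()
    llf_full, bhat = full.llf, full.params[j]
    cut = chi2.ppf(level, df=1)

    def gap(bj):
        # hold beta_j fixed via an offset, re-maximize over the rest
        prof = sm.GLM(y, np.delete(X, j, axis=1), family=fam,
                      offset=X[:, j] * bj).fit()
        return 2 * (llf_full - prof.llf) - cut

    return brentq(gap, bhat - 10, bhat), brentq(gap, bhat, bhat + 10)
```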

4.
Clustering due to unobserved heterogeneity may seriously affect inference from binary regression models. We examine the performance of the logistic and logistic-normal models for data with such clustering. For the logistic model, it is the total variance of the unobserved heterogeneity, rather than the level of clustering, that determines the size of the bias of the maximum likelihood (ML) estimator. With the logistic-normal model, incorrectly specifying the clustering at level 2 yields biased estimates of both the structural and the random parameters, whereas specifying it at level 1 yields unbiased estimates of the former and adequate estimates of the latter. The proposed procedure is applicable in many research areas.
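A small simulation sketch of the first claim: the attenuation of the naive logistic MLE grows with the variance of the unobserved cluster effect. All values are illustrative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n_clusters, m, true_slope = 500, 10, 1.0

for sigma_b in (0.5, 1.0, 2.0):          # s.d. of unobserved heterogeneity
    x = rng.normal(size=(n_clusters, m))
    b = rng.normal(0.0, sigma_b, size=(n_clusters, 1))   # cluster effects
    y = rng.binomial(1, 1 / (1 + np.exp(-(true_slope * x + b))))
    fit = sm.Logit(y.ravel(), sm.add_constant(x.ravel())).fit(disp=False)
    print(f"sigma_b={sigma_b}: naive slope estimate {fit.params[1]:.3f}")
```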

5.
We use the logistic model to obtain point and interval estimates of the marginal risk difference in observational studies and randomized trials with a dichotomous outcome. We prove that the maximum likelihood estimate of the marginal risk difference is unbiased in finite samples and highly robust to the effects of dispersing covariates. We use the approximate normal distribution of the maximum likelihood estimates of the logistic model parameters to obtain an approximate distribution of the maximum likelihood estimate of the marginal risk difference, and from it an interval estimate. We illustrate the application of the method with a real medical example.
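A sketch of one way to carry out that last step: standardize the fitted logistic model to get the marginal risk difference, then propagate the normal approximation of the coefficient MLEs by simulation. The design layout (intercept, binary treatment, covariates) is an assumption.

```python
import numpy as np
import statsmodels.api as sm

def marginal_rd_ci(x, t, y, n_draws=5000, level=0.95, seed=0):
    """Marginal risk difference for binary treatment t via standardization;
    the interval comes from simulating the normal approximation to the MLE."""
    rng = np.random.default_rng(seed)
    D = np.column_stack([np.ones_like(t, dtype=float), t, x])
    fit = sm.Logit(y, D).fit(disp=False)

    def rd(beta):
        D1, D0 = D.copy(), D.copy()
        D1[:, 1], D0[:, 1] = 1.0, 0.0       # set everyone treated / untreated
        p1 = 1 / (1 + np.exp(-D1 @ beta))
        p0 = 1 / (1 + np.exp(-D0 @ beta))
        return np.mean(p1 - p0)

    draws = rng.multivariate_normal(fit.params, fit.cov_params(), n_draws)
    lo, hi = np.quantile([rd(b) for b in draws],
                         [(1 - level) / 2, (1 + level) / 2])
    return rd(fit.params), (lo, hi)
```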

6.
A general class of multiple logistic regression models is reviewed, and an extension is proposed which leads to restricted maximum likelihood estimates of the model parameters. Examples of the general model are given, with emphasis on the interpretation of the parameters in each case.

7.
In this paper, the estimation of the parameters, reliability and hazard functions of an inverted exponentiated half logistic distribution (IEHLD) from progressive Type II censored data is considered. Bayes estimates under symmetric and asymmetric loss functions, such as the squared error, general entropy and LINEX loss functions, are provided, and Bayes estimates of the parameters, reliability and hazard functions are also obtained under balanced loss functions. Because the Bayes estimates cannot be obtained in closed form, the Lindley approximation method and an importance sampling procedure are used to compute them. Furthermore, the asymptotic normality of the maximum likelihood estimates is used to obtain approximate confidence intervals, and highest posterior density credible intervals of the parameters are computed from the importance sampling output. Simulations are performed to assess the performance of the proposed estimates, and two data sets are analyzed for illustrative purposes.
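A generic sketch of the importance-sampling step for a posterior mean: draw from an overdispersed normal proposal centred at the MLE and reweight by posterior over proposal. It is shown for a simple exponential-rate model with a flat prior, not for the IEHLD itself.

```python
import numpy as np
from scipy.stats import norm

def is_posterior_mean(logpost, theta_hat, se, n_draws=20000, seed=0):
    """Self-normalized importance sampling with a N(theta_hat, (2*se)^2)
    proposal; logpost must accept a vector of parameter values."""
    rng = np.random.default_rng(seed)
    th = rng.normal(theta_hat, 2 * se, n_draws)      # overdispersed proposal
    logw = logpost(th) - norm.logpdf(th, theta_hat, 2 * se)
    w = np.exp(logw - logw.max())
    return np.sum(w * th) / w.sum()

# example: exponential rate lam with a flat prior on lam > 0
t = np.array([0.8, 1.3, 0.4, 2.1, 0.9])
S, n = t.sum(), len(t)
logpost = lambda lam: np.where(lam > 0,
                               n * np.log(np.abs(lam) + 1e-300) - lam * S,
                               -np.inf)
mle = n / S
print(is_posterior_mean(logpost, mle, mle / np.sqrt(n)))
```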

8.
Iterative procedures are developed for finding maximum likelihood estimates of the parameters of mixtures of two inverse Gaussian distributions. The performance of the estimates in small samples is studied by simulation experiments. Asymptotic efficiencies relative to estimates based on completely classified samples are also evaluated. Unless the component populations are widely separated, the maximum likelihood estimates perform poorly.
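The abstract does not spell out the iterations; a standard EM scheme is the natural candidate, sketched below with the explicit inverse Gaussian density and weighted closed-form M-step updates (assumed, not taken verbatim from the paper).

```python
import numpy as np

def ig_pdf(x, mu, lam):
    """Inverse Gaussian density f(x; mu, lambda) for x > 0."""
    return np.sqrt(lam / (2 * np.pi * x**3)) * \
        np.exp(-lam * (x - mu) ** 2 / (2 * mu**2 * x))

def em_ig_mixture(x, pi=0.5, mu=(1.0, 3.0), lam=(1.0, 1.0), n_iter=500):
    """EM for a two-component inverse Gaussian mixture."""
    mu, lam = np.array(mu, float), np.array(lam, float)
    for _ in range(n_iter):
        # E-step: posterior probability of component 1 for each observation
        d1 = pi * ig_pdf(x, mu[0], lam[0])
        d2 = (1 - pi) * ig_pdf(x, mu[1], lam[1])
        r = d1 / (d1 + d2)
        # M-step: weighted closed-form inverse Gaussian MLEs per component
        for k, w in enumerate((r, 1 - r)):
            mu[k] = np.sum(w * x) / np.sum(w)
            lam[k] = np.sum(w) / np.sum(w * (1 / x - 1 / mu[k]))
        pi = r.mean()
    return pi, mu, lam
```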

9.
The exponential regression model is important in analyzing data from heterogeneous populations. In this paper we propose a simple method for estimating the regression parameters using binary data. Under certain design distributions for the explanatory variables, including elliptically symmetric distributions, the estimators are shown to be consistent and asymptotically normal in large samples. In finite samples the new estimates behave reasonably well: they are competitive with the maximum likelihood estimates and, more importantly, according to our simulation results the CPU time for computing them is only 1/7 of that required for the usual maximum likelihood estimates. We expect the savings in CPU time to be even more dramatic for larger dimensions of the regression parameter space.

10.
The approximate likelihood function introduced by Whittle has been used to estimate the spectral density and certain parameters of a variety of time series models. In this note we attempt to quantify empirically the loss of efficiency of Whittle's method in nonstandard settings. A recently developed representation of some first-order non-Gaussian stationary autoregressive processes allows a direct comparison of the true likelihood function with Whittle's. The conclusion is that Whittle's likelihood can produce unreliable estimates in the non-Gaussian case, even for moderate sample sizes. Moreover, for small samples, and if the autocorrelation of the process is high, Whittle's approximation is not efficient even in the Gaussian case. While these facts are known to some extent, the present study sheds more light on the degree of efficiency loss incurred by using Whittle's likelihood, in both the Gaussian and non-Gaussian cases.
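A sketch of the Whittle estimator being assessed, for a mean-zero AR(1) process: minimize the Whittle criterion over the positive Fourier frequencies, with the innovation variance profiled out.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def whittle_ar1(x):
    """Whittle estimate of the AR(1) coefficient phi (variance profiled out)."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    k = np.arange(1, (n - 1) // 2 + 1)                   # positive frequencies
    w = 2 * np.pi * k / n
    I = np.abs(np.fft.fft(x)[k]) ** 2 / (2 * np.pi * n)  # periodogram

    def crit(phi):
        g = 1.0 / (1.0 - 2.0 * phi * np.cos(w) + phi**2)  # AR(1) spectral shape
        sigma2 = 2 * np.pi * np.mean(I / g)               # profiled variance
        f = sigma2 * g / (2 * np.pi)
        return np.sum(np.log(f) + I / f)

    return minimize_scalar(crit, bounds=(-0.999, 0.999), method="bounded").x

rng = np.random.default_rng(3)
e = rng.normal(size=500)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.7 * x[t - 1] + e[t]
print(whittle_ar1(x))   # should be near 0.7
```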

11.
This paper discusses the recovery of information about logistic regression parameters when the maximum likelihood estimates of some parameters are infinite. An algorithm for detecting such cases and characterizing the divergence of the parameter estimates is presented, together with a method for fitting the remaining parameters. All of these methods rely only on sufficient statistics rather than on the less aggregated quantities required for inference by the method of Kolassa & Tanner (1994). The results are applied to approximate conditional inference via saddlepoint methods. Specifically, the double saddlepoint method of Skovgaard (1987) is adapted to the case where the solution to the saddlepoint equations exists as a point at infinity.
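Infinite logistic MLEs correspond to separated data. A sketch of one standard detection device (not necessarily the paper's algorithm): a linear program that searches for a direction separating successes from failures; a strictly positive optimum indicates complete separation, while quasi-complete separation requires a slightly finer check.

```python
import numpy as np
from scipy.optimize import linprog

def completely_separated(X, y, tol=1e-8):
    """Is there a direction b with (2y_i - 1) * x_i'b > 0 for every i?"""
    s = 2 * np.asarray(y) - 1
    n, p = X.shape
    # variables z = (b, t): maximize t subject to s_i * x_i'b >= t, |b_j| <= 1
    c = np.zeros(p + 1)
    c[-1] = -1.0                                   # linprog minimizes, so -t
    A_ub = np.hstack([-(s[:, None] * X), np.ones((n, 1))])
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n),
                  bounds=[(-1, 1)] * p + [(0, 1)], method="highs")
    return res.status == 0 and -res.fun > tol
```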

12.
Penalized logistic regression is a useful tool for classifying samples and for feature selection. Although the methodology has been widely used in many fields of research, its performance deteriorates sharply in the presence of outliers, because logistic regression is based on the maximum log-likelihood method, which is sensitive to outliers. As a consequence, we can neither classify samples accurately nor identify the factors that carry crucial information for classification. To overcome this problem, we propose a robust penalized logistic regression based on a weighted likelihood methodology. We also derive an information criterion, in line with generalized information criteria, for choosing the tuning parameters, a vital matter in robust penalized logistic regression modelling. We demonstrate through Monte Carlo simulations and a real-world example that the proposed robust modelling strategies perform well for sparse logistic regression modelling even in the presence of outliers.
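A sketch in the same spirit, assuming a simple Huber-type downweighting of large Pearson residuals inside an L1-penalized fit; the paper's exact weighting scheme and information criterion are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def robust_sparse_logit(X, y, C=1.0, n_iter=10, c0=2.0):
    """Iteratively reweighted L1-penalized logistic fit that downweights
    observations with large Pearson residuals."""
    w = np.ones(len(y), float)
    model = LogisticRegression(penalty="l1", C=C, solver="liblinear")
    for _ in range(n_iter):
        model.fit(X, y, sample_weight=w)
        p = model.predict_proba(X)[:, 1]
        r = (y - p) / np.sqrt(p * (1 - p) + 1e-12)       # Pearson residuals
        w = np.where(np.abs(r) <= c0, 1.0, c0 / np.abs(r))  # Huber-type weights
    return model
```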

13.
When a simulation or Monte Carlo analysis uses the same set of N samples as input for comparing the power of tests, the resulting estimates of power are highly correlated. As a result, the statistical analysis of such results should use weighted least squares or an equivalent procedure. A reanalysis of one simulation study (Thode et al., 1983) found that the weighted least squares estimates had much smaller standard errors than the ordinary least squares estimates; for the tests found to be more powerful, the standard errors of the parameters were reduced by a factor of between 4 and 9. The necessary calculations are described.
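A minimal sketch of the weighting idea with hypothetical power estimates: each estimated power based on N replications gets inverse binomial-variance weight N / (p(1-p)). This captures only the heteroscedasticity; the paper's full analysis also has to account for the correlation induced by reusing the same samples.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical power estimates for two tests at three sample sizes, each
# based on the same N = 1000 Monte Carlo replications.
N = 1000
power = np.array([0.52, 0.61, 0.70, 0.58, 0.69, 0.78])
test = np.array([0, 0, 0, 1, 1, 1])            # indicator: test A vs test B
n_obs = np.array([20, 30, 40, 20, 30, 40])     # per-replication sample size

X = sm.add_constant(np.column_stack([test, n_obs]))
weights = N / (power * (1 - power))            # inverse binomial variance
fit = sm.WLS(power, X, weights=weights).fit()
print(fit.params, fit.bse)
```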

14.
Clustered binary data are common in medical research and can be fitted with a logistic regression model with random effects, which belongs to the wider class of generalized linear mixed models. Likelihood-based estimation of the model parameters often involves intractable integrals, and several estimation methods have been proposed to overcome this difficulty. The penalized quasi-likelihood (PQL) method is popular and computationally efficient in most cases. The expectation-maximization (EM) algorithm yields maximum-likelihood estimates, but its E-step requires a possibly intractable integral, and variants of the algorithm have been introduced to evaluate it: the Monte Carlo EM (MCEM) method approximates the expectation using Monte Carlo samples, while the modified EM (MEM) method approximates it using Laplace's method. All of these methods involve several layers of approximation, so the resulting parameter estimates contain errors, large or small, induced by the approximations; because of this complexity, the discrepancies are difficult to understand and quantify theoretically, even with the focus restricted to clustered binary data. As an alternative competing computational method, we also consider non-parametric maximum likelihood (NPML). We review and compare the PQL, MCEM, MEM and NPML methods for clustered binary data via a simulation study, which should be useful to researchers choosing an estimation method for their analysis.
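A minimal MCEM sketch for a logistic random-intercept model, under two simplifying assumptions: the E-step uses self-normalized importance sampling with the current N(0, σ²) prior as proposal, and the M-step takes a single Newton step on the Monte Carlo expected score rather than fully maximizing.

```python
import numpy as np

def mcem_logit_ri(X, y, cluster, n_mc=300, n_em=50, seed=0):
    """Minimal MCEM for a logistic random-intercept model."""
    rng = np.random.default_rng(seed)
    ids = np.unique(cluster)
    beta, sigma2 = np.zeros(X.shape[1]), 1.0
    for _ in range(n_em):
        score = np.zeros_like(beta)
        info = np.zeros((len(beta), len(beta)))
        eb2 = []
        b = rng.normal(0.0, np.sqrt(sigma2), size=(len(ids), n_mc))
        for j, c in enumerate(ids):
            m = cluster == c
            Xj, yj = X[m], y[m]
            pr = 1 / (1 + np.exp(-((Xj @ beta)[:, None] + b[j])))  # (n_j, n_mc)
            pr = np.clip(pr, 1e-12, 1 - 1e-12)
            logw = (yj[:, None] * np.log(pr)
                    + (1 - yj[:, None]) * np.log(1 - pr)).sum(axis=0)
            w = np.exp(logw - logw.max())
            w /= w.sum()                                  # posterior weights
            eb2.append(w @ b[j] ** 2)                     # E[b_j^2 | y_j]
            score += Xj.T @ ((yj[:, None] - pr) @ w)
            info += Xj.T @ (Xj * ((pr * (1 - pr)) @ w)[:, None])
        beta = beta + np.linalg.solve(info, score)        # one Newton step
        sigma2 = float(np.mean(eb2))
    return beta, sigma2
```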

15.
Outcome-dependent sampling increases the efficiency of studies of rare outcomes, examples being case-control studies in epidemiology and choice-based sampling in econometrics. Two-phase or double sampling is a standard technique for drawing efficient stratified samples. We develop maximum likelihood estimation of logistic regression coefficients for a hybrid two-phase, outcome-dependent sampling design. An algorithm is given for determining the estimates by repeated fitting of ordinary logistic regression models. Simulation results demonstrate the efficiency loss associated with alternative pseudolikelihood and weighted likelihood methods for certain data configurations. These results provide an efficient solution to the measurement error problem with validation sampling based on a discrete surrogate.

16.
We consider the problem of making statistical inference about the unknown parameters of a lognormal distribution under progressive censoring. The maximum likelihood estimates (MLEs) are obtained using the expectation-maximization algorithm, and the observed and expected Fisher information matrices are provided; approximate MLEs of the unknown parameters are also obtained. Bayes and generalized estimates are derived under the squared error loss function and computed using both Lindley's method and an importance sampling method. Highest posterior density credible intervals and asymptotic intervals are constructed for the unknown parameters. A simulation study is conducted to compare the proposed estimates, and a data set is analysed for illustrative purposes. Finally, optimal progressive censoring plans are discussed under different optimality criteria and results are presented.
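A sketch of the EM core for a lognormal fit, shown for ordinary right censoring rather than the full progressive Type-II scheme: the E-step imputes the conditional moments of the censored log-lifetimes from truncated-normal formulas, and the M-step is the usual normal update.

```python
import numpy as np
from scipy.stats import norm

def em_lognormal_censored(t, delta, n_iter=200):
    """EM for lognormal data with right censoring (delta=1 observed,
    delta=0 censored), iterating truncated-normal E-step moments."""
    z = np.log(np.asarray(t, float))
    obs, cen = delta == 1, delta == 0
    mu, sig = z.mean(), z.std() + 1e-6
    for _ in range(n_iter):
        a = (z[cen] - mu) / sig
        h = norm.pdf(a) / norm.sf(a)                     # inverse Mills ratio
        ew = mu + sig * h                                # E[log T | log T > z]
        ew2 = mu**2 + sig**2 + sig * (z[cen] + mu) * h   # E[(log T)^2 | .]
        mu_new = (z[obs].sum() + ew.sum()) / len(z)
        ss = ((z[obs] - mu_new) ** 2).sum() \
             + (ew2 - 2 * mu_new * ew + mu_new**2).sum()
        mu, sig = mu_new, np.sqrt(ss / len(z))
    return mu, sig
```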

17.
The logistic regression model has been widely used in the social and natural sciences and results from studies using this model can have significant policy impacts. Thus, confidence in the reliability of inferences drawn from these models is essential. The robustness of such inferences is dependent on sample size. The purpose of this article is to examine the impact of alternative data sets on the mean estimated bias and efficiency of parameter estimation and inference for the logistic regression model with observational data. A number of simulations are conducted examining the impact of sample size, nonlinear predictors, and multicollinearity on substantive inferences (e.g. odds ratios, marginal effects) when using logistic regression models. Findings suggest that small sample size can negatively affect the quality of parameter estimates and inferences in the presence of rare events, multicollinearity, and nonlinear predictor functions, but marginal effects estimates are relatively more robust to sample size.

18.
We consider large sample inference in a semiparametric logistic/proportional-hazards mixture model. This model has been proposed for survival data in which a positive proportion of subjects in the population is not susceptible to the event under consideration. Previous studies of the logistic/proportional-hazards mixture model have focused on developing point estimation procedures for the unknown parameters. This paper studies large sample inference based on the semiparametric maximum likelihood estimator. Specifically, we establish existence, consistency and asymptotic normality results for the semiparametric maximum likelihood estimator, and derive consistent variance estimates for both the parametric and non-parametric components. The results provide a theoretical foundation for large sample inference under the logistic/proportional-hazards mixture model.

19.
If the observations for fitting a polytomous logistic regression model satisfy certain normality assumptions, the maximum likelihood estimates of the regression coefficients are the discriminant function estimates. This article shows that these estimates, their unbiased counterparts, and associated test statistics for variable selection can be calculated using ordinary least squares regression techniques, thereby providing a convenient method for fitting logistic regression models in the normal case. Evidence is given indicating that the discriminant function estimates and test statistics merit wider use in nonnormal cases, especially in exploratory work on large data sets.
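For the two-group case, a sketch of the classical discriminant-function estimates themselves (the article's OLS route, which reproduces them up to known constants, is not shown here):

```python
import numpy as np

def discriminant_logit(X, y):
    """Discriminant-function estimates of logistic intercept and slopes
    under the normality assumption (two groups, X of shape (n, p), p >= 2)."""
    X1, X0 = X[y == 1], X[y == 0]
    n1, n0 = len(X1), len(X0)
    m1, m0 = X1.mean(axis=0), X0.mean(axis=0)
    S = ((n1 - 1) * np.cov(X1, rowvar=False)
         + (n0 - 1) * np.cov(X0, rowvar=False)) / (n1 + n0 - 2)  # pooled cov
    beta = np.linalg.solve(S, m1 - m0)
    alpha = np.log(n1 / n0) - 0.5 * (m1 + m0) @ beta
    return alpha, beta
```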

20.
We examine the rationale of prospective logistic regression analysis for pair-matched case-control data when the matching variables enter the model through explicit parametric terms. We show that this approach can yield inconsistent estimates of the disease-exposure odds ratio, even in large samples. Some special conditions are given under which the bias in the disease-exposure odds ratio is small; because these conditions are not uncommon, this flawed analytic method can appear (unreasonably) effective.
