1.

Inferences from logistic regression models in the presence of small samples, rare events, nonlinearity, and multicollinearity with observational data

Jason S. Bergtold Elizabeth A. Yeager Allen M. Featherstone 《Journal of applied statistics》2018,45(3):528-546

The logistic regression model has been widely used in the social and natural sciences and results from studies using this model can have significant policy impacts. Thus, confidence in the reliability of inferences drawn from these models is essential. The robustness of such inferences is dependent on sample size. The purpose of this article is to examine the impact of alternative data sets on the mean estimated bias and efficiency of parameter estimation and inference for the logistic regression model with observational data. A number of simulations are conducted examining the impact of sample size, nonlinear predictors, and multicollinearity on substantive inferences (e.g. odds ratios, marginal effects) when using logistic regression models. Findings suggest that small sample size can negatively affect the quality of parameter estimates and inferences in the presence of rare events, multicollinearity, and nonlinear predictor functions, but marginal effects estimates are relatively more robust to sample size.
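As a minimal illustration of the two substantive quantities named in this abstract, the sketch below computes an odds ratio and a marginal effect from a single logistic regression coefficient. The function names and the numeric values are hypothetical and are not taken from the paper.

```python
import math

def logistic(z):
    """Standard logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio(beta):
    """Odds ratio for a one-unit increase in a predictor with coefficient beta."""
    return math.exp(beta)

def marginal_effect(beta, linear_predictor):
    """Derivative of P(y = 1) with respect to the predictor, evaluated at a
    given value of the linear predictor: beta * p * (1 - p)."""
    p = logistic(linear_predictor)
    return beta * p * (1.0 - p)

# Hypothetical fitted values, chosen only for illustration.
beta_hat = 0.8
eta = 0.0
print(odds_ratio(beta_hat))
print(marginal_effect(beta_hat, eta))
```

The odds ratio depends only on the coefficient, whereas the marginal effect also depends on where along the probability curve it is evaluated.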

2.

Ordinal ridge regression with categorical predictors

Faisal M. Zahid Shahla Ramzan 《Journal of applied statistics》2012,39(1):161-171

In multi-category response models, categories are often ordered. In the case of ordinal response models, the usual likelihood approach becomes unstable with an ill-conditioned predictor space or when the number of parameters to be estimated is large relative to the sample size. The likelihood estimates do not exist when the number of observations is less than the number of parameters. The same problem arises if the constraint on the order of intercept values is not met during the iterative procedure. Proportional odds models (POMs) are most commonly used for ordinal responses. In this paper, penalized likelihood with a quadratic penalty is used to address these issues, with a special focus on POMs. To avoid large differences between two parameter values corresponding to consecutive categories of an ordinal predictor, the differences between the parameters of two adjacent categories should be penalized. The considered penalized-likelihood function penalizes the parameter estimates or the differences between the parameter estimates according to the type of predictors. Mean-squared error for parameter estimates, deviance of fitted probabilities and prediction error for ridge regression are compared with the usual likelihood estimates in a simulation study and an application.
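The quadratic penalty on adjacent categories described in this abstract can be written as follows, under assumed notation that does not come from the paper. Here $\ell(\beta)$ is the log-likelihood, $\beta_1, \dots, \beta_k$ are the coefficients of the $k$ ordered categories of one ordinal predictor, and $\lambda \ge 0$ is a tuning parameter; the factor $1/2$ is a common convention and not necessarily the authors' choice.

```latex
\ell_p(\beta) = \ell(\beta) - \frac{\lambda}{2} \sum_{j=1}^{k-1} \left( \beta_{j+1} - \beta_j \right)^2
```

Setting $\lambda = 0$ recovers the usual likelihood, while larger values of $\lambda$ shrink the coefficients of adjacent categories towards each other.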

3.

Asymptotic properties of the complex valued non-linear regression model

Debasis Kundu 《Communications in Statistics - Theory and Methods》2013,42(12):3793-3803

The non-linear regression model with complex-valued parameters is considered here. Jennrich (1969) considered the non-linear regression model with real-valued parameters. He first rigorously proved the existence of the least squares estimator and showed its consistency and asymptotic normality. In this paper we generalise the idea to the case of complex parameters. Large sample properties of the proposed estimator are studied.

4.

Estimation and prediction for the generalized inverted exponential distribution based on progressively first-failure-censored data with application

Essam A. Ahmed 《Journal of applied statistics》2017,44(9):1576-1608

In this paper, the estimation of parameters for a generalized inverted exponential distribution based on a progressively first-failure type-II right-censored sample is studied. An expectation–maximization (EM) algorithm is developed to obtain maximum likelihood estimates of the unknown parameters as well as of the reliability and hazard functions. Using the missing value principle, the Fisher information matrix is obtained for constructing asymptotic confidence intervals. An exact interval and an exact confidence region for the parameters are also constructed. Bayesian procedures based on Markov chain Monte Carlo methods are developed to approximate the posterior distribution of the parameters of interest and to deduce the corresponding credible intervals. The performances of the maximum likelihood and Bayes estimators are compared in terms of their mean-squared errors through a simulation study. Furthermore, Bayes two-sample point and interval predictors are obtained when the future sample consists of ordinary order statistics. The squared error, linear-exponential and general entropy loss functions are considered for obtaining the Bayes estimators and predictors. To illustrate the discussed procedures, a set of real data is analyzed.

5.

The effect of parameter estimation on the performance of one-sided Shewhart control charts for zero-inflated processes

Athanasios C. Rakitzis Philippe Castagliola 《Communications in Statistics - Theory and Methods》2013,42(14):4194-4214

Zero-inflated probability models are used to model count data that have an excessive number of zeros. Shewhart-type control charts have been proposed for the monitoring of zero-inflated processes. Usually their performance is evaluated under the assumption of known process parameters. However, in practice, their values are rarely known and they have to be estimated from an in-control historical Phase I sample. In the present paper, we investigate the performance of Shewhart-type control charts for zero-inflated processes with estimated parameters and propose practical guidelines for the statistical design of the examined charts, when the size of the preliminary sample is predetermined.

6.

A Simple Approach to Inference in Random Coefficient Models

Marcia Gumpertz Sastry G. Pantula 《The American statistician》2013,67(4):203-210

Random coefficient regression models have been used to analyze cross-sectional and longitudinal data in economics and growth-curve data from biological and agricultural experiments. In the literature several estimators, including the ordinary least squares and the estimated generalized least squares (EGLS), have been considered for estimating the parameters of the mean model. Based on the asymptotic properties of the EGLS estimators, test statistics have been proposed for testing linear hypotheses involving the parameters of the mean model. An alternative estimator, the simple mean of the individual regression coefficients, provides estimation and hypothesis-testing procedures that are simple to compute and teach. The large sample properties of this simple estimator are shown to be similar to those of the EGLS estimator. The performance of the proposed estimator is compared with that of the existing estimators by Monte Carlo simulation.
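The "simple mean of the individual regression coefficients" can be sketched as below for the special case of one covariate per individual. This is only an illustration under stated assumptions: the helper names and the toy data are hypothetical, and the standard errors use the sample variance of the individual coefficients, which is one natural choice and is assumed here rather than taken from the paper.

```python
def ols_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def simple_mean_estimator(groups):
    """Fit a separate line to each individual, then average the coefficients.

    groups: list of (x, y) pairs, one per individual.
    Returns the mean [intercept, slope] and standard errors based on the
    sample variance of the individual coefficients (an assumption here).
    """
    fits = [ols_line(x, y) for x, y in groups]
    n = len(fits)
    means = [sum(f[k] for f in fits) / n for k in range(2)]
    ses = [
        (sum((f[k] - means[k]) ** 2 for f in fits) / (n - 1) / n) ** 0.5
        for k in range(2)
    ]
    return means, ses

# Hypothetical toy data: three individuals with intercept 1 and slopes 1, 2, 3.
x = [0.0, 1.0, 2.0, 3.0]
groups = [(x, [1.0 + s * xi for xi in x]) for s in (1.0, 2.0, 3.0)]
means, ses = simple_mean_estimator(groups)
print(means, ses)
```

Each individual needs at least two distinct covariate values for its own fit to exist, and at least two individuals are needed for the standard errors.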

7.

Estimation and prediction of the Kumaraswamy distribution based on record values and inter-record times

《Journal of Statistical Computation and Simulation》2012,82(12):2471-2493

The maximum likelihood and Bayesian approaches for estimating the parameters and predicting future record values for the Kumaraswamy distribution are considered when the lower record values, along with the number of observations following the record values (inter-record times), have been observed. The Bayes estimates are obtained based on a joint bivariate prior for the shape parameters. In this case, Bayes estimates of the parameters are developed by using Lindley's approximation and the Markov chain Monte Carlo (MCMC) method, owing to the lack of explicit forms under the squared error and the linear-exponential loss functions. The MCMC method is also used to construct the highest posterior density credible intervals. The Bayes and the maximum likelihood estimates are compared by using the estimated risk through Monte Carlo simulations. We further consider non-Bayesian and Bayesian prediction of future lower record values arising from the Kumaraswamy distribution, based on record values with their corresponding inter-record times and on record values only. The comparison of the derived predictors is carried out by using Monte Carlo simulations. Real data are analysed to illustrate the findings.

8.

Consistent estimation of regression parameters under replicated ultrastructural model with non-normal errors

《Journal of Statistical Computation and Simulation》2012,82(3):251-274

This article discusses the construction and efficiency properties of consistent estimators of regression parameters under a replicated ultrastructural model with not necessarily normally distributed measurement errors. The variances of measurement errors associated with the study and explanatory variables are estimated from the replicated sample observations and are used for the consistent estimation of regression parameters. The asymptotic efficiency properties of the estimators are derived and analysed. The finite sample performance of the estimators is empirically studied through a Monte Carlo simulation.

9.

Mean-squared errors of small-area estimators under a unit-level multivariate model

Amparo Baíllo 《Statistics》2013,47(6):553-569

This work deals with estimating the vector of means of certain characteristics of small areas. In this context, a unit-level multivariate model with correlated sampling errors is considered. An approximation is obtained for the mean-squared and cross-product errors of the empirical best linear unbiased predictors of the means, when model parameters are estimated either by maximum likelihood (ML) or by restricted ML. This approach has been implemented in a Monte Carlo study using social and labour data from the Spanish Labour Force Survey.

10.

Gaussian Regularized Sliced Inverse Regression

Caroline Bernard-Michel Laurent Gardes Stéphane Girard 《Statistics and Computing》2009,19(1):85-98

Sliced Inverse Regression (SIR) is an effective method for dimension reduction in high-dimensional regression problems. The original method, however, requires the inversion of the predictors' covariance matrix. In case of collinearity between these predictors or small sample sizes compared to the dimension, the inversion is not possible and a regularization technique has to be used. Our approach is based on a Fisher Lecture given by R.D. Cook where it is shown that SIR axes can be interpreted as solutions of an inverse regression problem. We propose to introduce a Gaussian prior distribution on the unknown parameters of the inverse regression problem in order to regularize their estimation. We show that some existing SIR regularizations can enter our framework, which permits a global understanding of these methods. Three new priors are proposed, leading to new regularizations of the SIR method. A comparison on simulated data as well as an application to the estimation of Mars surface physical properties from hyperspectral images are provided.

11.

Bootstrap estimation of the variance of the error term in monotonic regression models

O. Sysoev A. Grimvall O. Burdakov 《Journal of Statistical Computation and Simulation》2013,83(4):627-640

The variance of the error term in ordinary regression models and linear smoothers is usually estimated by adjusting the average squared residual for the trace of the smoothing matrix (the degrees of freedom of the predicted response). However, other types of variance estimators are needed when using monotonic regression (MR) models, which are particularly suitable for estimating response functions with pronounced thresholds. Here, we propose a simple bootstrap estimator to compensate for the over-fitting that occurs when MR models are estimated from empirical data. Furthermore, we show that, in the case of one or two predictors, the performance of this estimator can be enhanced by introducing adjustment factors that take into account the slope of the response function and characteristics of the distribution of the explanatory variables. Extensive simulations show that our estimators perform satisfactorily for a great variety of monotonic functions and error distributions.
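The conventional estimator mentioned in the first sentence of this abstract, the residual sum of squares divided by n minus the trace of the smoothing matrix, can be sketched as follows. This is not the bootstrap estimator the authors propose; the function name and the toy data are hypothetical.

```python
def residual_variance(y, fitted, trace_s):
    """Error-variance estimate for a linear smoother: RSS / (n - tr(S)).

    trace_s is the trace of the smoothing matrix S that maps y to the fitted
    values, i.e. the degrees of freedom used by the fit.
    """
    n = len(y)
    if n <= trace_s:
        raise ValueError("need more observations than degrees of freedom")
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return rss / (n - trace_s)

# Hypothetical example: the "smoother" is the sample mean, for which tr(S) = 1.
y = [1.0, 2.0, 4.0, 3.0]
fitted = [sum(y) / len(y)] * len(y)
sigma2_hat = residual_variance(y, fitted, trace_s=1)
print(sigma2_hat)
```

With the sample mean as the smoother, this reduces to the usual unbiased sample variance.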

12.

The use of predicted values for item parameters in item response theory models: an application in intelligence tests

Mariagiulia Matteucci Stefania Mignani Bernard P. Veldkamp 《Journal of applied statistics》2012,39(12):2665-2683

In testing, item response theory models are widely used in order to estimate item parameters and individual abilities. However, even unidimensional models require a considerable sample size so that all parameters can be estimated precisely. The introduction of empirical prior information about candidates and items might reduce the number of candidates needed for parameter estimation. Using data for IQ measurement, this work shows how empirical information about items can be used effectively for item calibration and in adaptive testing. First, we propose multivariate regression trees to predict the item parameters based on a set of covariates related to the item-solving process. Afterwards, we compare the item parameter estimation when tree-fitted values are included in the estimation or when they are ignored. Model estimation is fully Bayesian, and is conducted via Markov chain Monte Carlo methods. The results are two-fold: (a) in item calibration, it is shown that the introduction of prior information is effective with short test lengths and small sample sizes and (b) in adaptive testing, it is demonstrated that the use of the tree-fitted values instead of the estimated parameters leads to a moderate increase in the test length, but provides a considerable saving of resources.

13.

Prediction of response values in linear regression models from replicated experiments

H. Toutenburg Shalabh 《Statistical Papers》2002,43(3):423-433

This paper considers the problem of prediction in a linear regression model when data sets are available from replicated experiments. Pooling the data sets for the estimation of regression parameters, we present three predictors: one arising from the least squares method and two stemming from the Stein-rule method. Efficiency properties of these predictors are discussed when they are used to predict actual and average values of the response variable within or outside the sample. Received: November 17, 1999; revised version: August 10, 2000

14.

Generalized linear models with functional predictors

Gareth M. James 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2002,64(3):411-432

Summary. We present a technique for extending generalized linear models to the situation where some of the predictor variables are observations from a curve or function. The technique is particularly useful when only fragments of each curve have been observed. We demonstrate, on both simulated and real data sets, how this approach can be used to perform linear, logistic and censored regression with functional predictors. In addition, we show how functional principal components can be used to gain insight into the relationship between the response and functional predictors. Finally, we extend the methodology to apply generalized linear models and principal components to standard missing data problems.

15.

Nonparametric estimation of the link function including variable selection

Gerhard Tutz Sebastian Petry 《Statistics and Computing》2012,22(2):545-561

Nonparametric methods for the estimation of the link function in generalized linear models are able to avoid bias in the regression parameters. However, the estimation of the link has typically used the full model, which includes all predictors. When the number of predictors is large, these methods fail since the full model cannot be estimated. In the present article a boosting-type method is proposed that simultaneously selects predictors and estimates the link function. The method performs quite well in simulations and real data examples.

16.

Large sample inference in random coefficient regression models

Randy L. Carter Mark C.K. Yang 《Communications in Statistics - Theory and Methods》2013,42(8):2507-2525

Random coefficient regression models have been used to describe repeated measures on members of a sample of n individuals. Previous researchers have proposed methods of estimating the mean parameters of such models. Their methods require that each individual be observed under the same settings of independent variables or, less stringently, that the number of observations, r, on each individual be the same. Under the latter restriction, estimators of mean regression parameters exist which are consistent as both r → ∞ and n → ∞ and efficient as r → ∞, and large sample (r large) tests of mean parameters are available. These results are easily extended to the case where not all individuals are observed an equal number of times, provided limits are taken as min(r) → ∞. Existing methods of inference, however, are not justified by the current literature when n is large and r is small, as is the case in many bio-medical applications. The primary contribution of the current paper is a derivation of the asymptotic properties of modifications of existing estimators as n alone tends to infinity, r fixed. From these properties it is shown that existing methods of inference, which are currently justified only when min(r) is large, are also justifiable when n is large and min(r) is small. A secondary contribution is the definition of a positive definite estimator of the covariance matrix for the random coefficients in these models. Use of this estimator avoids computational problems that can otherwise arise.

17.

Borrowing strength from past data in small domain prediction by Kalman filtering - a case

Arijit Chaudhuri Tapabrata Maiti 《Communications in Statistics - Theory and Methods》2013,42(12):3507-3514

Point and interval estimators for small domains based exclusively on current and domain-specific sample observations are generally ineffective because of inadequate sample sizes. So, borrowing strength from sample values for analogous domains and simultaneously from all relevant past and auxiliary data is useful in deriving improved small domain statistics. Postulating for simplicity a linear regression model with a single covariate and a zero intercept but a time-specific, domain-invariant slope, we start with "synthetic" generalized regression predictors for the domain totals. These borrow strength only across domains. For further improvement, a simple autoregressive model is postulated for the slope parameters. Employing Kalman filtering, the previous predictors are revised to borrow supplementary strength across time. As drastic simplifying assumptions are needed in such predictions, the efficacy of the procedure is examined through an empirical exercise using live data as well as simulations. The numerical findings turn out to be encouraging.

18.

Local linear estimation for covariate-adjusted varying-coefficient models

Yiqiang Lu Feng Li Sanying Feng 《Communications in Statistics - Theory and Methods》2019,48(15):3816-3835

We consider local linear estimation of varying-coefficient models in which the data are observed with a multiplicative distortion that depends on an observed confounding variable. First, each distortion function is estimated by nonparametrically regressing the absolute value of the contaminated variable on the confounder. Second, the coefficient functions are estimated by the local least squares method on the basis of the predictors of the latent variables, which are obtained in terms of the estimated distorting functions. We also establish the asymptotic normality of our proposed estimators and discuss inference about the distortion function. Simulation studies are carried out to assess the finite sample performance of the proposed estimators, and a real dataset on Pima Indians diabetes is analyzed for illustration.

19.

The Performance of a Robust Multistage Estimator in Nonlinear Regression with Heteroscedastic Errors

Hossein Riazoshams Habshah BT. Midi 《Communications in Statistics - Simulation and Computation》2016,45(9):3394-3415

In this article, a robust multistage parameter estimator is proposed for nonlinear regression with heteroscedastic variance, where the residual variances are considered as a general parametric function of predictors. The motivation is based on considering the chi-square distribution for the calculated sample variance of the data. It is shown that outliers that are influential in nonlinear regression parameter estimates are not necessarily influential in calculating the sample variance. This motivates us not only to robustify the estimate of the parameters of the models for both the regression function and the variance, but also to replace the sample variance of the data by a robust scale estimate.

20.

Box‐Cox transformations in linear models: Large sample theory and tests of normality

Gemai Chen Richard A. Lockhart Michael A. Stephens 《Revue canadienne de statistique》2002,30(2):177-209

The authors provide a rigorous large sample theory for linear models whose response variable has been subjected to the Box‐Cox transformation. They provide a continuous asymptotic approximation to the distribution of estimators of natural parameters of the model. They show, in particular, that the maximum likelihood estimator of the ratio of slope to residual standard deviation is consistent and relatively stable. The authors further show the importance for inference of normality of the errors and give tests for normality based on the estimated residuals. For non‐normal errors, they give adjustments to the log‐likelihood and to asymptotic standard errors.
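For readers unfamiliar with the transformation this abstract refers to, a minimal sketch of the standard Box-Cox transform of a positive response follows. The function name is hypothetical, and the code illustrates only the transform itself, not the large sample theory or the tests developed in the paper.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam.

    Returns (y**lam - 1) / lam for lam != 0 and log(y) for lam == 0,
    which is the limit of the first expression as lam tends to 0.
    """
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

# Illustrative values only.
print(box_cox(4.0, 0.5))
print(box_cox(4.0, 0.0))
```

A parameter of 1 leaves the response unchanged up to a shift, while a parameter of 0 gives the log transform.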