Similar Documents
20 similar documents found (search time: 15 ms)
1.
The accelerated failure time (AFT) model is an important regression tool to study the association between failure time and covariates. In this paper, we propose a robust weighted generalized M (GM) estimation for the AFT model with right-censored data by appropriately using the Kaplan–Meier weights in the GM-type objective function to estimate the regression coefficients and scale parameter simultaneously. This estimation method is computationally simple and can be implemented with existing software. Asymptotic properties including the root-n consistency and asymptotic normality are established for the resulting estimator under suitable conditions. We further show that the method can be readily extended to handle a class of nonlinear AFT models. Simulation results demonstrate satisfactory finite sample performance of the proposed estimator. The practical utility of the method is illustrated by a real data example.
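The Kaplan–Meier weighting idea above can be made concrete. The sketch below computes Stute-type Kaplan–Meier weights from right-censored data and then runs a plain weighted least-squares fit of log failure times; the weighted LS step is a simplified stand-in for the paper's GM-type objective (the robust score and simultaneous scale estimation are omitted), and all data and function names are hypothetical.

```python
def kaplan_meier_weights(times, deltas):
    """Stute's Kaplan-Meier weights for right-censored data.

    times  : observed times (failure or censoring), assumed distinct here
    deltas : 1 if the time is an observed failure, 0 if censored
    Returns weights aligned with the input order; censored points get weight 0.
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    w = [0.0] * n
    surv = 1.0  # running product prod_{j<i} ((n-j)/(n-j+1))^{delta_(j)}
    for rank, i in enumerate(order, start=1):
        if deltas[i] == 1:
            w[i] = surv / (n - rank + 1)
        surv *= ((n - rank) / (n - rank + 1)) ** deltas[i]
    return w

def weighted_ls(x, logt, w):
    """Weighted least squares of log failure times on one covariate."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, logt)) / sw
    b1 = (sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, logt))
          / sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)))
    return my - b1 * mx, b1
```

With no censoring the weights reduce to 1/n each, so the fit coincides with ordinary least squares, which is a useful sanity check.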

2.
This paper considers an alternative to iterative procedures used to calculate maximum likelihood estimates of regression coefficients in a general class of discrete data regression models. These models can include both marginal and conditional models and also local regression models. The classical estimation procedure is generally via a Fisher-scoring algorithm and can be computationally intensive for high-dimensional problems. The alternative method proposed here is non-iterative and is likely to be more efficient in high-dimensional problems. The method is demonstrated on two different classes of regression models.

3.
In contrast to the common belief that the logit model admits no analytical solution, such a solution can be found in the case of categorical predictors. This paper shows that a binary logistic regression on categorical explanatory variables can be constructed in closed form. No special software and no iterative nonlinear-estimation procedures are needed to obtain the model with all its parameters and characteristics, including the regression coefficients, their standard errors and t-statistics, as well as the residual and null deviances. The derivation is performed for logistic models with one binary or categorical predictor, and with several binary or categorical predictors. The analytical formulae can be used to calculate all the parameters of the logit regression arithmetically. The explicit expressions for the characteristics of logit regression are convenient for the analysis and interpretation of the results of logistic modeling.
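For a single binary predictor the closed form is especially simple: the saturated logistic model reproduces the observed 2x2 table, so the intercept is the log-odds at x = 0 and the slope is the log odds ratio, with the usual Woolf standard errors. The counts below are hypothetical.

```python
import math

# 2x2 table of counts: rows x in {0, 1}, columns y in {0, 1} (hypothetical data)
n00, n01 = 40, 10   # x = 0: 10 successes out of 50
n10, n11 = 20, 30   # x = 1: 30 successes out of 50

b0 = math.log(n01 / n00)          # intercept: log-odds of y = 1 at x = 0
b1 = math.log(n11 / n10) - b0     # slope: log odds ratio
se_b0 = math.sqrt(1 / n00 + 1 / n01)                      # Woolf-type SE
se_b1 = math.sqrt(1 / n00 + 1 / n01 + 1 / n10 + 1 / n11)  # SE of log odds ratio

def prob(x):
    """Fitted success probability at x, which matches the observed proportion."""
    eta = b0 + b1 * x
    return 1 / (1 + math.exp(-eta))
```

Because the model is saturated, `prob(0)` and `prob(1)` equal the raw sample proportions, which is exactly why no iteration is required.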

4.
Fuzzy least-squares regression can be very sensitive to unusual data (e.g., outliers). In this article, we describe how to fit an alternative robust-regression estimator in a fuzzy environment, which attempts to identify and ignore unusual data. The proposed approach draws on classical robust regression and estimation methods that are insensitive to outliers. In this regard, based on the least trimmed squares estimation method, an estimation procedure is proposed for determining the coefficients of the fuzzy regression model for crisp-input, fuzzy-output data. The investigated fuzzy regression model is applied to real-world bedload transport data, forecasting suspended load from discharge. The accuracy of the proposed method is compared with that of the well-known fuzzy least-squares regression model, using a similarity measure between fuzzy sets. The comparison results reveal that the fuzzy robust regression model performs better than the other models in suspended load estimation for this particular dataset. The proposed model is general and can be used for modeling natural phenomena whose available observations are reported as imprecise rather than crisp.
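The least trimmed squares (LTS) core of the method can be sketched for crisp data; the fuzzy-output extension in the paper would replace the squared residuals with a distance between fuzzy numbers. For a tiny sample, LTS can be computed exactly by enumerating h-subsets. The data below are made up, with one gross outlier.

```python
from itertools import combinations

def ols(pts):
    """Ordinary least squares for (x, y) pairs; returns (intercept, slope)."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    b1 = sxy / sxx
    return my - b1 * mx, b1

def lts(pts, h):
    """Exact least trimmed squares by enumerating all h-subsets (tiny n only).

    Minimises the sum of the h smallest squared residuals over all points.
    """
    best, best_obj = None, float("inf")
    for sub in combinations(pts, h):
        b0, b1 = ols(sub)
        res2 = sorted((y - b0 - b1 * x) ** 2 for x, y in pts)
        obj = sum(res2[:h])
        if obj < best_obj:
            best_obj, best = obj, (b0, b1)
    return best

# clean line y = 2x plus one gross outlier (hypothetical data)
pts = [(float(x), 2.0 * x) for x in range(1, 10)] + [(10.0, 100.0)]
b0_ols, b1_ols = ols(pts)       # badly distorted by the outlier
b0_lts, b1_lts = lts(pts, h=8)  # recovers the clean line
```

Production LTS implementations use C-step iterations instead of enumeration, but the objective being minimised is the same.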

5.
When responses are missing at random, we propose a semiparametric direct estimator for the missing probability and density-weighted average derivatives of a general nonparametric multiple regression function. An estimator for the normalized version of the weighted average derivatives is constructed as well using instrumental variables regression. The proposed estimators are computationally simple and asymptotically normal, and provide a solution to the problem of estimating index coefficients of single-index models with responses missing at random. The developed theory generalizes the method of the density-weighted average derivatives estimation of Powell et al. (1989) for the non-missing data case. Monte Carlo simulation studies are conducted to study the performance of the methods.

6.
Inverse regression estimation for censored data
An inverse regression methodology for assessing predictor performance in the censored data setup is developed along with inference procedures and a computational algorithm. The technique developed here allows for conditioning on the unobserved failure time along with a weighting mechanism that accounts for the censoring. The implementation is nonparametric and computationally fast. This provides an efficient methodological tool that can be used especially in cases where the usual modeling assumptions are not applicable to the data under consideration. It can also be a good diagnostic tool that can be used in the model selection process. We have provided theoretical justification of consistency and asymptotic normality of the methodology. Simulation studies and two data analyses are provided to illustrate the practical utility of the procedure.

7.
This paper develops a novel weighted composite quantile regression (CQR) method for estimating a linear model when some covariates are missing at random and the missingness mechanism can be modelled parametrically. By incorporating the unbiased estimating equations of the incomplete data into empirical likelihood (EL), we obtain EL-based weights and then re-adjust the inverse probability weighted CQR for estimating the vector of regression coefficients. Theoretical results show that the proposed method achieves semiparametric efficiency if the selection probability function is correctly specified; the EL-weighted CQR is therefore more efficient than the inverse probability weighted CQR. Moreover, our algorithm is computationally simple and easy to implement. Simulation studies are conducted to examine the finite-sample performance of the proposed procedures. Finally, we apply the new method to analyse the US News college data.
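The inverse probability weighting that the EL step re-adjusts rests on a Horvitz–Thompson identity: weighting each observed value by 1/pi_i makes the estimator unbiased under missingness at random. For a tiny sample this can be verified exactly by enumerating every missingness pattern. The values and selection probabilities below are hypothetical; this illustrates the IPW building block, not the EL re-weighting itself.

```python
from itertools import product

y  = [3.0, 7.0, 1.0, 9.0]   # hypothetical responses
pi = [0.9, 0.6, 0.8, 0.5]   # known selection (observation) probabilities
n = len(y)

# exact expectation of the IPW mean over all 2^n missingness patterns
expected = 0.0
for r in product([0, 1], repeat=n):
    p = 1.0
    for ri, pii in zip(r, pi):
        p *= pii if ri else (1 - pii)          # P(this pattern)
    est = sum(ri * yi / pii for ri, yi, pii in zip(r, y, pi)) / n
    expected += p * est                         # accumulate E[estimator]
```

The enumeration shows `expected` equals the full-data mean exactly, i.e. the IPW estimating equation is unbiased whenever the selection probabilities are correctly specified.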

8.
New robust estimates for variance components are introduced. Two simple models are considered: the balanced one-way classification model with a random factor, and the balanced mixed model with one random factor and one fixed factor; the proposed estimation method, however, can be extended to more complex models. The new method is based on the relationship between the variance components and the coefficients of the least-mean-squared-error predictor of one observation from another observation in the same group. This relationship enables us to transform the problem of estimating the variance components into the problem of estimating the coefficients of a simple linear regression model. The variance-component estimators derived from the least-squares regression estimates are shown to coincide with the maximum-likelihood estimates. Robust estimates of the variance components can be obtained by replacing the least-squares estimates with robust regression estimates. In particular, a Monte Carlo study shows that for outlier-contaminated normal samples, the variance-component estimates derived from GM regression estimates, and the associated test, outperform other robust procedures.
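The key relationship can be demonstrated numerically: in the one-way random model y_ij = a_i + e_ij, the slope of the best linear predictor of one observation from its group-mate is the intraclass correlation rho = sigma_a^2 / (sigma_a^2 + sigma_e^2), so regressing pairs from the same group on each other estimates rho, from which the variance components follow. The simulation below is a seeded sketch with hypothetical parameter values; the robust variants would swap the least-squares slope for a GM regression slope.

```python
import random

random.seed(1)
sigma_a2, sigma_e2 = 1.0, 1.0           # true variance components (hypothetical)
rho = sigma_a2 / (sigma_a2 + sigma_e2)  # intraclass correlation = 0.5

# balanced one-way model, 4000 groups of size 2; use both orderings of each pair
pairs = []
for _ in range(4000):
    a = random.gauss(0, sigma_a2 ** 0.5)
    y1 = a + random.gauss(0, sigma_e2 ** 0.5)
    y2 = a + random.gauss(0, sigma_e2 ** 0.5)
    pairs.append((y1, y2))
    pairs.append((y2, y1))

# least-squares slope of y2 on y1 estimates rho
mx = sum(x for x, _ in pairs) / len(pairs)
my = sum(y for _, y in pairs) / len(pairs)
sxx = sum((x - mx) ** 2 for x, _ in pairs)
sxy = sum((x - mx) * (y - my) for x, y in pairs)
slope = sxy / sxx
```

Given the slope and the total variance, sigma_a^2 = slope * (sigma_a^2 + sigma_e^2), which is how the regression coefficient is converted back into a variance-component estimate.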

9.
Time-series count data with excessive zeros frequently occur in environmental, medical and biological studies. These data have been traditionally handled by conditional and marginal modeling approaches separately in the literature. The conditional modeling approaches are computationally much simpler, whereas marginal modeling approaches can link the overall mean with covariates directly. In this paper, we propose new models that can have conditional and marginal modeling interpretations for zero-inflated time-series counts using compound Poisson distributed random effects. We also develop a computationally efficient estimation method for our models using a quasi-likelihood approach. The proposed method is illustrated with an application to air pollution-related emergency room visits. We also evaluate the performance of our method through simulation studies.
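The simplest marginal model the paper generalises is the zero-inflated Poisson (ZIP), which mixes a point mass at zero with a Poisson component; its closed-form mean and variance show the overdispersion that motivates quasi-likelihood estimation. Parameter values below are hypothetical.

```python
import math

def zip_pmf(k, lam, pi0):
    """Zero-inflated Poisson pmf: extra point mass pi0 at zero."""
    base = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi0 * (k == 0) + (1 - pi0) * base

lam, pi0 = 2.0, 0.3                      # hypothetical Poisson rate and zero-inflation
mean = (1 - pi0) * lam                   # marginal mean E[Y]
var = (1 - pi0) * lam * (1 + pi0 * lam)  # marginal variance > mean (overdispersion)
```

A quasi-likelihood fit only needs these first two moments linked to covariates, which is what makes the estimation computationally light relative to full likelihood.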

10.
When analysing a contingency table, it is often of interest to relate the probabilities that a given individual falls into the different cells to a set of predictors. These conditional probabilities are usually estimated using appropriate regression techniques. In this paper, a semiparametric model is developed: it is only assumed that the effect of the vector of covariates on the probabilities can be entirely captured by a single index, a linear combination of the initial covariates. The estimation is then twofold: the coefficients of the linear combination and the functions linking this index to the related conditional probabilities have to be estimated. Inspired by estimation procedures already proposed in the literature for single-index regression models, four estimators of the index coefficients are proposed and compared, both theoretically and practically with the aid of simulations. Estimation of the link functions is also addressed.

11.
Mixed effects models and Berkson measurement error models are widely used. They share features which the author uses to develop a unified estimation framework. He deals with models in which the random effects (or measurement errors) have a general parametric distribution, whereas the random regression coefficients (or unobserved predictor variables) and error terms have nonparametric distributions. He proposes a second-order least squares estimator and a simulation-based estimator based on the first two moments of the conditional response variable given the observed covariates. He shows that both estimators are consistent and asymptotically normally distributed under fairly general conditions. The author also reports Monte Carlo simulation studies showing that the proposed estimators perform satisfactorily for relatively small sample sizes. Compared to the likelihood approach, the proposed methods are computationally feasible and do not rely on the normality assumption for random effects or other variables in the model.

12.
In a cocaine dependence treatment study, we use linear and nonlinear regression models to model posttreatment cocaine craving scores and first cocaine relapse time. A subset of the covariates are summary statistics derived from baseline daily cocaine use trajectories, such as baseline cocaine use frequency and average daily use amount. These summary statistics are subject to estimation error and can therefore cause biased estimators for the regression coefficients. Unlike classical measurement error problems, the error we encounter here is heteroscedastic with an unknown distribution, and there are no replicates for the error-prone variables or instrumental variables. We propose two robust methods to correct for the bias: a computationally efficient method-of-moments-based method for linear regression models and a subsampling extrapolation method that is generally applicable to both linear and nonlinear regression models. Simulations and an application to the cocaine dependence treatment data are used to illustrate the efficacy of the proposed methods. Asymptotic theory and variance estimation for the proposed subsampling extrapolation method and some additional simulation results are described in the online supplementary material.

13.
High-dimensional sparse modeling with censored survival data is of great practical importance, as exemplified by applications in high-throughput genomic data analysis. In this paper, we propose a class of regularization methods, integrating both the penalized empirical likelihood and pseudoscore approaches, for variable selection and estimation in sparse, high-dimensional additive hazards regression models. When the number of covariates grows with the sample size, we establish asymptotic properties of the resulting estimator and the oracle property of the proposed method. The proposed estimator is shown to be more efficient than that obtained from the non-concave penalized likelihood approach in the literature. Based on a penalized empirical likelihood ratio statistic, we further develop a nonparametric likelihood approach for testing linear hypotheses about the regression coefficients and, consequently, for constructing confidence regions. Simulation studies are carried out to evaluate the performance of the proposed methodology, and two real data sets are analyzed.

14.
This paper considers estimation and prediction in the Aalen additive hazards model when the covariate vector is high-dimensional, as with gene expression measurements. Some form of dimension reduction of the covariate space is needed to obtain useful statistical analyses. We study the partial least squares (PLS) regression method, which turns out to be naturally adapted to this setting via the so-called Krylov sequence. The resulting PLS estimator is shown to be consistent provided that the number of terms included equals the number of relevant components in the regression model. A standard PLS algorithm can also be constructed, but the resulting predictor can only be related to the original covariates via time-dependent coefficients. The methods are applied to a breast cancer data set with gene expression recordings and to the well-known primary biliary cirrhosis clinical data.
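The Krylov-sequence view of PLS can be sketched in the ordinary linear-model setting (the paper works in the additive hazards model, so this is only an analogy): the k-component PLS fit is the least-squares solution restricted to the Krylov space spanned by {X'y, (X'X)X'y, (X'X)^2 X'y, ...}, and with as many components as predictors it reproduces OLS. The design matrix and response below are hypothetical.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def matvec(A, v):
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]

# hypothetical design matrix (n = 6, p = 3) and response
X = [[1.0, 0.2, 0.5], [0.3, 1.1, 0.4], [0.7, 0.6, 1.3],
     [1.2, 0.1, 0.9], [0.5, 0.8, 0.2], [0.9, 0.4, 0.6]]
y = [1.0, 2.0, 0.5, 1.5, 2.5, 0.8]
n, p = len(X), 3

A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)] for a in range(p)]  # X'X
s = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]                         # X'y

# Krylov basis {s, As, A^2 s}
basis = [s]
for _ in range(p - 1):
    basis.append(matvec(A, basis[-1]))

# least squares restricted to the Krylov space: (K'AK) gamma = K's, beta = K gamma
Av = [matvec(A, v) for v in basis]
KtAK = [[sum(ui * wi for ui, wi in zip(u, w)) for w in Av] for u in basis]
Kts = [sum(ui * si for ui, si in zip(u, s)) for u in basis]
gamma = solve(KtAK, Kts)
beta_pls = [sum(g * v[a] for g, v in zip(gamma, basis)) for a in range(p)]
beta_ols = solve(A, s)
```

Truncating the basis to k < p vectors gives the k-component PLS fit, which is where the dimension reduction comes from.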

15.
Estimating parameters in a stochastic volatility (SV) model is a challenging task. Among other estimation methods and approaches, efficient simulation methods based on importance sampling have been developed for the Monte Carlo maximum likelihood estimation of univariate SV models. This paper shows that importance sampling methods can be used in a general multivariate SV setting. The sampling methods are computationally efficient. To illustrate the versatility of this approach, three different multivariate stochastic volatility models are estimated for a standard data set. The empirical results are compared to those from earlier studies in the literature. Monte Carlo simulation experiments, based on parameter estimates from the standard data set, are used to show the effectiveness of the importance sampling methods.  相似文献
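In SV models the likelihood is an integral over the latent volatility path, and importance sampling evaluates it by drawing from a tractable proposal and reweighting. The toy sketch below shows the self-normalised importance sampling identity on a one-dimensional integral (E[X^2] = 1 under a standard normal target, sampled from a heavier-tailed normal proposal); distributions and sample size are illustrative only.

```python
import math
import random

random.seed(7)

def target_logpdf(x):
    """Log-density of the target p = N(0, 1)."""
    return -0.5 * x * x - 0.5 * math.log(2 * math.pi)

def proposal_logpdf(x):
    """Log-density of the heavier-tailed proposal q = N(0, 2^2)."""
    return -0.5 * (x / 2.0) ** 2 - math.log(2.0) - 0.5 * math.log(2 * math.pi)

n = 200_000
num = den = 0.0
for _ in range(n):
    x = random.gauss(0, 2.0)                                   # draw from q
    w = math.exp(target_logpdf(x) - proposal_logpdf(x))        # weight p/q
    num += w * x * x                                           # estimate E_p[X^2]
    den += w

est = num / den   # self-normalised importance sampling estimate, close to 1
```

Choosing a proposal with heavier tails than the target keeps the weights bounded, which is the same design concern that drives the "efficient" proposals in the SV literature.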


17.
The use of biased estimation in data analysis and model building is discussed. A review of the theory of ridge regression and its relation to generalized inverse regression is presented along with the results of a simulation experiment and three examples of the use of ridge regression in practice. Comments on variable selection procedures, model validation, and ridge and generalized inverse regression computation procedures are included. The examples studied here show that when the predictor variables are highly correlated, ridge regression produces coefficients which predict and extrapolate better than least squares and is a safe procedure for selecting variables.
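The ridge estimator has the closed form beta = (X'X + kI)^{-1} X'y, and its stabilising effect is easiest to see with two nearly collinear predictors, where OLS coefficients blow up while ridge coefficients stay near the true values. The two-predictor solver below uses the explicit 2x2 inverse; the data are hypothetical.

```python
def ridge2(X, y, k):
    """Closed-form ridge for two predictors: beta = (X'X + k I)^{-1} X'y."""
    a = sum(x1 * x1 for x1, _ in X) + k
    c = sum(x2 * x2 for _, x2 in X) + k
    b = sum(x1 * x2 for x1, x2 in X)
    s1 = sum(x1 * yi for (x1, _), yi in zip(X, y))
    s2 = sum(x2 * yi for (_, x2), yi in zip(X, y))
    det = a * c - b * b
    return ((c * s1 - b * s2) / det, (a * s2 - b * s1) / det)

# nearly collinear predictors with y roughly x1 + x2 (hypothetical data)
X = [(1.0, 1.01), (2.0, 1.98), (3.0, 3.02), (4.0, 3.97), (5.0, 5.01)]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b_ols = ridge2(X, y, 0.0)    # k = 0 recovers least squares: wild coefficients
b_ridge = ridge2(X, y, 1.0)  # modest k: both coefficients near 1
```

The ridge coefficient norm is non-increasing in k, which is the shrinkage trade-off the review discusses: a little bias in exchange for a large variance reduction when predictors are highly correlated.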

18.
Variable selection over a potentially large set of covariates in a linear model is quite popular. In the Bayesian context, common prior choices can lead to a posterior expectation of the regression coefficients that is a sparse (or nearly sparse) vector with a few nonzero components, those covariates that are most important. This article extends the "global-local" shrinkage idea to a scenario where one wishes to model multiple response variables simultaneously. Here, we have developed a variable selection method for a K-outcome model (multivariate regression) that identifies the most important covariates across all outcomes. The prior for all regression coefficients is a mean zero normal with coefficient-specific variance term that consists of a predictor-specific factor (shared local shrinkage parameter) and a model-specific factor (global shrinkage term) that differs in each model. The performance of our modeling approach is evaluated through simulation studies and a data example.

19.
Both the least squares estimator and M-estimators of regression coefficients are susceptible to distortion when high leverage points occur among the predictor variables in a multiple linear regression model. In this article, a weighting scheme is proposed that bounds the leverage values of a weighted matrix of predictor variables. Bounded-leverage weighting of the predictor variables followed by M-estimation of the regression coefficients is shown to be effective in protecting against distortion due to extreme predictor-variable values, extreme response values, or outlier-induced multicollinearities. Bounded-leverage estimators can also protect against distortion by small groups of high leverage points.
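Leverage is measured by the hat-matrix diagonal h_ii = x_i'(X'X)^{-1}x_i, and a simple bounded-leverage weight is min(1, c/h_ii), which downweights only rows whose leverage exceeds the bound c. The sketch below (intercept plus one predictor, hypothetical data with one extreme x value) shows one pass of this idea; the paper's scheme iterates so that the reweighted design actually satisfies the bound.

```python
def hat_diagonal(X):
    """Leverages h_ii = x_i' (X'X)^{-1} x_i for a two-column design matrix."""
    a = sum(r[0] * r[0] for r in X)
    c = sum(r[1] * r[1] for r in X)
    b = sum(r[0] * r[1] for r in X)
    det = a * c - b * b
    inv = ((c / det, -b / det), (-b / det, a / det))  # explicit 2x2 inverse
    h = []
    for x1, x2 in X:
        v1 = inv[0][0] * x1 + inv[0][1] * x2
        v2 = inv[1][0] * x1 + inv[1][1] * x2
        h.append(x1 * v1 + x2 * v2)
    return h

# intercept column plus one predictor with a single high-leverage point
X = [(1.0, 1.0), (1.0, 2.0), (1.0, 3.0), (1.0, 4.0), (1.0, 20.0)]
h = hat_diagonal(X)

bound = 0.5
weights = [min(1.0, bound / hi) for hi in h]  # downweight rows exceeding the bound
```

Only the extreme row gets a weight below one; M-estimation on the reweighted data then handles response outliers, which is the two-layer protection the abstract describes.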

20.
In this paper, we introduce a new Bayesian nonparametric model for estimating an unknown function in the presence of Gaussian noise. The proposed model places on the wavelet coefficients a mixture of a point mass and an arbitrary (nonparametric) symmetric, unimodal distribution. Posterior simulation uses slice sampling ideas, and consistency under the proposed model is discussed. In particular, the method is shown to be computationally competitive with some of the best empirical wavelet estimation methods.
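The mechanics behind any wavelet-domain estimator of this kind are: transform the noisy signal, shrink or zero out small coefficients, and transform back. The sketch below uses the orthonormal Haar transform with hard thresholding as the simplest stand-in for the paper's point-mass-mixture posterior shrinkage; the signal values and threshold are hypothetical.

```python
import math

def haar_forward(x):
    """Full orthonormal Haar decomposition (len(x) must be a power of two).

    Returns [overall approximation, coarsest details, ..., finest details].
    """
    x = list(x)
    coeffs = []
    while len(x) > 1:
        approx = [(a + b) / math.sqrt(2) for a, b in zip(x[0::2], x[1::2])]
        detail = [(a - b) / math.sqrt(2) for a, b in zip(x[0::2], x[1::2])]
        coeffs = detail + coeffs
        x = approx
    return x + coeffs

def haar_inverse(c):
    """Invert haar_forward exactly."""
    x = c[:1]
    pos = 1
    while pos < len(c):
        detail = c[pos:pos + len(x)]
        nxt = []
        for a, d in zip(x, detail):
            nxt.append((a + d) / math.sqrt(2))
            nxt.append((a - d) / math.sqrt(2))
        x = nxt
        pos += len(detail)
    return x

def threshold(c, t):
    """Hard-threshold the detail coefficients, keeping the approximation."""
    return c[:1] + [ci if abs(ci) > t else 0.0 for ci in c[1:]]
```

A point-mass mixture prior produces a data-driven analogue of this rule: coefficients likely to come from the point mass at zero are shrunk to zero, the rest are largely kept.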


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号