期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Reversible jump Markov chain Monte Carlo algorithms for Bayesian variable selection in logistic mixed models

Jia-Chiun Pan Mei-Hsien Lee 《统计学通讯:模拟与计算》2018,47(8):2234-2247

In this article, to reduce computational load in performing Bayesian variable selection, we used a variant of reversible jump Markov chain Monte Carlo methods, and the Holmes and Held (HH) algorithm, to sample model index variables in logistic mixed models involving a large number of explanatory variables. Furthermore, we proposed a simple proposal distribution for model index variables, and used a simulation study and real example to compare the performance of the HH algorithm with our proposed and existing proposal distributions. The results show that the HH algorithm with our proposed proposal distribution is a computationally efficient and reliable selection method. 相似文献

2.

Tree-structured subgroup analysis for censored survival data: Validation of computationally inexpensive model selection criteria

Abdissa?Negassa Email author Antonio?Ciampi Michal?Abrahamowicz Stanley?Shapiro Jean-Fran?ois?Boivin 《Statistics and Computing》2005,15(3):231-239

The performance of computationally inexpensive model selection criteria in the context of tree-structured subgroup analysis is investigated. It is shown through simulation that no single model selection criterion exhibits a uniformly superior performance over a wide range of scenarios. Therefore, a two-stage approach for model selection is proposed and shown to perform satisfactorily. Applied example of subgroup analysis is presented. Problems associated with tree-structured subgroup analysis are discussed and practical solutions are suggested. 相似文献

3.

A fast algorithm for non-negativity model selection

Cristian Gatu Erricos John Kontoghiorghes 《Statistics and Computing》2013,23(3):403-411

An efficient optimization algorithm for identifying the best least squares regression model under the condition of non-negative coefficients is proposed. The algorithm exposits an innovative solution via the unrestricted least squares and is based on the regression tree and branch-and-bound techniques for computing the best subset regression. The aim is to filling a gap in computationally tractable solutions to the non-negative least squares problem and model selection. The proposed method is illustrated with a real dataset. Experimental results on real and artificial random datasets confirm the computational efficacy of the new strategy and demonstrates its ability to solve large model selection problems that are subject to non-negativity constrains. 相似文献

4.

Bayesian regularisation in structured additive regression: a unifying perspective on shrinkage, smoothing and predictor selection 总被引：1，自引：0，他引：1

Ludwig Fahrmeir Thomas Kneib Susanne Konrath 《Statistics and Computing》2010,20(2):203-219

This paper surveys various shrinkage, smoothing and selection priors from a unifying perspective and shows how to combine them for Bayesian regularisation in the general class of structured additive regression models. As a common feature, all regularisation priors are conditionally Gaussian, given further parameters regularising model complexity. Hyperpriors for these parameters encourage shrinkage, smoothness or selection. It is shown that these regularisation (log-) priors can be interpreted as Bayesian analogues of several well-known frequentist penalty terms. Inference can be carried out with unified and computationally efficient MCMC schemes, estimating regularised regression coefficients and basis function coefficients simultaneously with complexity parameters and measuring uncertainty via corresponding marginal posteriors. For variable and function selection we discuss several variants of spike and slab priors which can also be cast into the framework of conditionally Gaussian priors. The performance of the Bayesian regularisation approaches is demonstrated in a hazard regression model and a high-dimensional geoadditive regression model. 相似文献

5.

Inverse regression estimation for censored data 总被引：1，自引：0，他引：1

Nadkarni NV Zhao Y Kosorok MR 《Journal of the American Statistical Association》2011,106(493):178-190

An inverse regression methodology for assessing predictor performance in the censored data setup is developed along with inference procedures and a computational algorithm. The technique developed here allows for conditioning on the unobserved failure time along with a weighting mechanism that accounts for the censoring. The implementation is nonparametric and computationally fast. This provides an efficient methodological tool that can be used especially in cases where the usual modeling assumptions are not applicable to the data under consideration. It can also be a good diagnostic tool that can be used in the model selection process. We have provided theoretical justification of consistency and asymptotic normality of the methodology. Simulation studies and two data analyses are provided to illustrate the practical utility of the procedure. 相似文献

6.

Variational approximation for heteroscedastic linear models and matching pursuit algorithms

David J. Nott Minh-Ngoc Tran Chenlei Leng 《Statistics and Computing》2012,22(2):497-512

Modern statistical applications involving large data sets have focused attention on statistical methodologies which are both efficient computationally and able to deal with the screening of large numbers of different candidate models. Here we consider computationally efficient variational Bayes approaches to inference in high-dimensional heteroscedastic linear regression, where both the mean and variance are described in terms of linear functions of the predictors and where the number of predictors can be larger than the sample size. We derive a closed form variational lower bound on the log marginal likelihood useful for model selection, and propose a novel fast greedy search algorithm on the model space which makes use of one-step optimization updates to the variational lower bound in the current model for screening large numbers of candidate predictor variables for inclusion/exclusion in a computationally thrifty way. We show that the model search strategy we suggest is related to widely used orthogonal matching pursuit algorithms for model search but yields a framework for potentially extending these algorithms to more complex models. The methodology is applied in simulations and in two real examples involving prediction for food constituents using NIR technology and prediction of disease progression in diabetes. 相似文献

7.

An efficient algorithm for Harrison-Stevens forecasting using the multi-process multivariate dynamic linear model

William M. Bolstad 《统计学通讯:模拟与计算》2013,42(3):819-828

This paper develops a computationally efficient algorithm for Harrison-Stevens forecasting in a multivariate time series which has correlated errors. The algorithm uses the observation vector one component at a time on the multiprocess multivariate dynamic linear model. This gives a computationally efficient, robust, quick adapting forecasting method for non stationary multivariate time series. 相似文献

8.

Approximate penalization path for smoothly clipped absolute deviation

《Journal of Statistical Computation and Simulation》2012,82(5):643-652

Feature selection often constitutes one of the central aspects of many scientific investigations. Among the methodologies for feature selection in penalized regression, the smoothly clipped and absolute deviation seems to be very useful because it satisfies the oracle property. However, its estimation algorithms such as the local quadratic approximation and the concave–convex procedure are not computationally efficient. In this paper, we propose an efficient penalization path algorithm. Through numerical examples on real and simulated data, we illustrate that our path algorithm can be useful for feature selection in regression problems. 相似文献

9.

Empirical likelihood weighted composite quantile regression with partially missing covariates

Jing Sun Yunyan Ma 《Journal of nonparametric statistics》2017,29(1):137-150

This paper develops a novel weighted composite quantile regression (CQR) method for estimation of a linear model when some covariates are missing at random and the probability for missingness mechanism can be modelled parametrically. By incorporating the unbiased estimating equations of incomplete data into empirical likelihood (EL), we obtain the EL-based weights, and then re-adjust the inverse probability weighted CQR for estimating the vector of regression coefficients. Theoretical results show that the proposed method can achieve semiparametric efficiency if the selection probability function is correctly specified, therefore the EL weighted CQR is more efficient than the inverse probability weighted CQR. Besides, our algorithm is computationally simple and easy to implement. Simulation studies are conducted to examine the finite sample performance of the proposed procedures. Finally, we apply the new method to analyse the US news College data. 相似文献

10.

Non-iterative Estimation and Variable Selection in the Single-index Quantile Regression Model

C. N. Kuruwita 《统计学通讯:模拟与计算》2016,45(10):3615-3628

A new estimation procedure is proposed for the single-index quantile regression model. Compared to existing work, this approach is non-iterative and hence, computationally efficient. The proposed method not only estimates the index parameter and the link function but also selects variables simultaneously. The performance of the variable selection is enhanced by a fully adaptive penalty function motivated by the sliced inverse regression technique. Finite sample performance is studied through a simulation study that compares the proposed method with existing work under several criteria. A data analysis is given that highlights the usefulness of the proposed methodology. 相似文献

11.

Statistical inference for oscillation processes

Rainer Dahlhaus Thierry Dumont Sylvain Le Corff Jan C. Neddermeyer 《Statistics》2017,51(1):61-83

ABSTRACT

A new model for time series with a specific oscillation pattern is proposed. The model consists of a hidden phase process controlling the speed of polling and a nonparametric curve characterizing the pattern, leading together to a generalized state space model. Identifiability of the model is proved and a method for statistical inference based on a particle smoother and a nonparametric EM algorithm is developed. In particular, the oscillation pattern and the unobserved phase process are estimated. The proposed algorithms are computationally efficient and their performance is assessed through simulations and an application to human electrocardiogram recordings. 相似文献

12.

Selection of a stroke risk model based on transcranial Doppler ultrasound velocity

S. Mukhopadhyay I. Das K. Das 《Journal of applied statistics》2012,39(12):2699-2712

Increased transcranial Doppler ultrasound (TCD) velocity is an indicator of cerebral infarction in children with sickle cell disease (SCD). In this article, the parallel genetic algorithm (PGA) is used to select a stroke risk model with TCD velocity as the response variable. Development of such a stroke risk model leads to the identification of children with SCD who are at a higher risk of stroke and their treatment in the early stages. Using blood velocity data from SCD patients, it is shown that the PGA is an easy-to-use computationally variable selection tool. The results of the PGA are also compared with those obtained from the stochastic search variable selection method, the Dantzig selector and conventional techniques such as stepwise selection and best subset selection. 相似文献

13.

An efficient and fast algorithm for estimating the parameters of two-dimensional sinusoidal signals

Swagata Nandi Anurag Prasad Debasis Kundu 《Journal of statistical planning and inference》2010,140(1):153-168

In this paper we propose a computationally efficient algorithm to estimate the parameters of a 2-D sinusoidal model in the presence of stationary noise. The estimators obtained by the proposed algorithm are consistent and asymptotically equivalent to the least squares estimators. Monte Carlo simulations are performed for different sample sizes and it is observed that the performances of the proposed method are quite satisfactory and they are equivalent to the least squares estimators. The main advantage of the proposed method is that the estimators can be obtained using only finite number of iterations. In fact it is shown that starting from the average of periodogram estimators, the proposed algorithm converges in three steps only. One synthesized texture data and one original texture data have been analyzed using the proposed algorithm for illustrative purpose. 相似文献

14.

Penalized inverse probability weighted estimators for weighted rank regression with missing covariates

Hu Yang Jing Lv 《统计学通讯:理论与方法》2013,42(5):1388-1402

Abstract

In this article, we study the variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and empirically shown to be much more efficient and robust with respect to heavy tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property. To deal with the covariates missing, we consider the inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator by using estimated selection probability has a smaller asymptotic variance than that with true selection probability, thus is more efficient. Therefore, the important Horvitz-Thompson property is verified for penalized rank estimator with the covariates missing in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators. 相似文献

15.

Variable selection for model-based clustering using the integrated complete-data likelihood

Matthieu Marbac Mohammed Sedki 《Statistics and Computing》2017,27(4):1049-1063

Variable selection in cluster analysis is important yet challenging. It can be achieved by regularization methods, which realize a trade-off between the clustering accuracy and the number of selected variables by using a lasso-type penalty. However, the calibration of the penalty term can suffer from criticisms. Model selection methods are an efficient alternative, yet they require a difficult optimization of an information criterion which involves combinatorial problems. First, most of these optimization algorithms are based on a suboptimal procedure (e.g. stepwise method). Second, the algorithms are often computationally expensive because they need multiple calls of EM algorithms. Here we propose to use a new information criterion based on the integrated complete-data likelihood. It does not require the maximum likelihood estimate and its maximization appears to be simple and computationally efficient. The original contribution of our approach is to perform the model selection without requiring any parameter estimation. Then, parameter inference is needed only for the unique selected model. This approach is used for the variable selection of a Gaussian mixture model with conditional independence assumed. The numerical experiments on simulated and benchmark datasets show that the proposed method often outperforms two classical approaches for variable selection. The proposed approach is implemented in the R package VarSelLCM available on CRAN. 相似文献

16.

Pruning regression trees for censored survival data: the recpam approach

Antonio Ciampi Johanne Thiffault 《统计学通讯:理论与方法》2013,42(9):3373-3388

RECPAM is a methodology, implemented in a computer program of the same name, for the construction of tree-structured models in Biostatistics. In this work we present algorithms for pruning and amalgamating terminal nodes of a tree, within the RECPAM approach. These algorithms construct sequences of nested models and calculate at each step the AIC of the corresponding model and correct significance levels, according to Gabriel's theory of Simultaneous Test Procedures. As an example, the analysis of data from clinical trials involving patients with Small Cell Carcinoma of the Lung is presented. 相似文献

17.

A Thresholding Algorithm for Order Selection in Finite Mixture Models

Chen Xu Jiahua Chen 《统计学通讯:模拟与计算》2015,44(2):433-453

Order selection is an important step in the application of finite mixture models. Classical methods such as AIC and BIC discourage complex models with a penalty directly proportional to the number of mixing components. In contrast, Chen and Khalili propose to link the penalty to two types of overfitting. In particular, they introduce a regularization penalty to merge similar subpopulations in a mixture model, where the shrinkage idea of regularized regression is seamlessly employed. However, the new method requires an effective and efficient algorithm. When the popular expectation-maximization (EM)-algorithm is used, we need to maximize a nonsmooth and nonconcave objective function in the M-step, which is computationally challenging. In this article, we show that such an objective function can be transformed into a sum of univariate auxiliary functions. We then design an iterative thresholding descent algorithm (ITD) to efficiently solve the associated optimization problem. Unlike many existing numerical approaches, the new algorithm leads to sparse solutions and thereby avoids undesirable ad hoc steps. We establish the convergence of the ITD and further assess its empirical performance using both simulations and real data examples. 相似文献

18.

A gradient approach to efficient design and analysis of multivariate EWMA control charts

Wenpo Huang Wei Jiang 《Journal of Statistical Computation and Simulation》2018,88(14):2707-2725

Compared to the grid search approach to optimal design of control charts, the gradient-based approach is more computationally efficient as the gradient information indicates the direction to search the optimal design parameters. However, the optimal parameters of multivariate exponentially weighted moving average (MEWMA) control charts are often obtained by using grid search in the existing literature. Note that the average run length (ARL) performance of the MEWMA chart can be calculated based on a Markov chain model, making it feasible to estimate the ARL gradient from it. Motivated by this, this paper develops an ARL gradient-based approach for the optimal design and sensitivity analysis of MEWMA control charts. It is shown that the proposed method is able to provide a fast, accurate, and easy-to-implement algorithm for the design and analysis of MEWMA charts, as compared to the conventional design approach based on grid search. 相似文献

19.

On-line inference for hidden Markov models via particle filters

Paul Fearnhead Peter Clifford 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2003,65(4):887-899

Summary. We consider the on-line Bayesian analysis of data by using a hidden Markov model, where inference is tractable conditional on the history of the state of the hidden component. A new particle filter algorithm is introduced and shown to produce promising results when analysing data of this type. The algorithm is similar to the mixture Kalman filter but uses a different resampling algorithm. We prove that this resampling algorithm is computationally efficient and optimal, among unbiased resampling algorithms, in terms of minimizing a squared error loss function. In a practical example, that of estimating break points from well-log data, our new particle filter outperforms two other particle filters, one of which is the mixture Kalman filter, by between one and two orders of magnitude. 相似文献

20.

Efficient estimation and model selection in large graphical models

Dag Wedelin 《Statistics and Computing》1996,6(4):313-323

We develop a computationally efficient method to determine the interaction structure in a multidimensional binary sample. We use an interaction model based on orthogonal functions, and give a result on independence properties in this model. Using this result we develop an efficient approximation algorithm for estimating the parameters in a given undirected model. To find the best model, we use a heuristic search algorithm in which the structure is determined incrementally. We also give an algorithm for reconstructing the causal directions, if such exist. We demonstrate that together these algorithms are capable of discovering almost all of the true structure for a problem with 121 variables, including many of the directions. 相似文献