
1.

Performance of Variable Selection Methods in Regression Using Variations of the Bayesian Information Criterion

Tom Burr, Herb Fry, Brian McVey, Eric Sander, Joseph Cavanaugh, Andrew Neath. Communications in Statistics - Simulation and Computation, 2013, 42(3): 507-520

The Bayesian information criterion (BIC) is widely used for variable selection. We focus on the regression setting for which variations of the BIC have been proposed. A version that includes the Fisher Information matrix of the predictor variables performed best in one published study. In this article, we extend the evaluation, introduce a performance measure involving how closely posterior probabilities are approximated, and conclude that the version that includes the Fisher Information often favors regression models having more predictors, depending on the scale and correlation structure of the predictor matrix. In the image analysis application that we describe, we therefore prefer the standard BIC approximation because of its relative simplicity and competitive performance at approximating the true posterior probabilities.
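As a rough illustration of the standard BIC approximation mentioned in this abstract, the sketch below scores every predictor subset of a Gaussian linear regression and converts the BIC values into approximate posterior model probabilities. It is not the authors' code: the function names `bic_linear` and `approx_posterior_probs` are hypothetical, equal prior probabilities across models are assumed, and the Fisher-information variant is not implemented because the abstract does not give its formula.

```python
import itertools

import numpy as np


def bic_linear(X, y, subset):
    """Standard BIC, n*log(RSS/n) + k*log(n), for a Gaussian linear model
    with an intercept plus the predictor columns listed in `subset`."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    # k counts regression coefficients only; the error variance adds the
    # same constant to every model, so it does not affect comparisons.
    k = design.shape[1]
    return n * np.log(rss / n) + k * np.log(n)


def approx_posterior_probs(X, y):
    """Approximate posterior model probabilities, proportional to
    exp(-BIC/2), assuming equal prior probability for every subset."""
    p = X.shape[1]
    subsets = [s for r in range(p + 1)
               for s in itertools.combinations(range(p), r)]
    bics = np.array([bic_linear(X, y, s) for s in subsets])
    # Subtract the minimum before exponentiating for numerical stability.
    weights = np.exp(-0.5 * (bics - bics.min()))
    return dict(zip(subsets, weights / weights.sum()))
```

Exhaustive enumeration is only practical for a small number of predictors, since the number of subsets doubles with each added column.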

2.

Genetic Algorithm in the Wavelet Domain for Large p Small n Regression

Eylem Deniz Howe, Orietta Nicolis. Communications in Statistics - Simulation and Computation, 2015, 44(5): 1144-1157

Many areas of statistical modeling are plagued by the “curse of dimensionality,” in which there are more variables than observations. This is especially true when developing functional regression models where the independent dataset is some type of spectral decomposition, such as data from near-infrared spectroscopy. While we could develop a very complex model by simply taking enough samples (such that n > p), this could prove impossible or prohibitively expensive. In addition, a regression model developed like this could turn out to be highly inefficient, as spectral data usually exhibit high multicollinearity. In this article, we propose a two-part algorithm for selecting an effective and efficient functional regression model. Our algorithm begins by evaluating a subset of discrete wavelet transformations, allowing for variation in both wavelet and filter number. Next, we perform an intermediate processing step to remove variables with low correlation to the response data. Finally, we use the genetic algorithm to perform a stochastic search through the subset regression model space, driven by an information-theoretic objective function. We allow our algorithm to develop the regression model for each response variable independently, so as to optimally model each variable. We demonstrate our method on the familiar biscuit dough dataset, which has been used in a similar context by several researchers. Our results demonstrate both the flexibility and the power of our algorithm. For each response variable, a different subset model is selected, and different wavelet transformations are used. The models developed by our algorithm show an improvement, as measured by lower mean error, over results in the published literature.
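The intermediate step described in this abstract, removing variables with low correlation to the response, might look roughly like the following. This is only a sketch under assumptions: the abstract gives no cutoff, so the `threshold` value is hypothetical, as is the function name `screen_by_correlation`, and the wavelet transform and genetic-algorithm search are not shown.

```python
import numpy as np


def screen_by_correlation(X, y, threshold=0.3):
    """Return the indices of columns of X whose absolute Pearson
    correlation with y is at least `threshold`, plus all correlations.

    `threshold` is an illustrative value, not one taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; treat their correlation as 0.
    corr = np.divide(Xc.T @ yc, denom,
                     out=np.zeros(X.shape[1]), where=denom > 0)
    keep = np.flatnonzero(np.abs(corr) >= threshold)
    return keep, corr
```

Screening like this shrinks the space that a later subset search has to explore, at the cost of possibly discarding variables that matter only jointly with others.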

3.

Consistency and asymptotic normality of the estimated effective doses in bioassay

Rabi Bhattacharya, Maiying Kong. Journal of Statistical Planning and Inference, 2007

In order to estimate the effective dose such as the 0.5 quantile $ED_{50}$ in a bioassay problem various parametric and semiparametric models have been used in the literature. If the true dose–response curve deviates significantly from the model, the estimates will generally be inconsistent. One strategy is to analyze the data making only a minimal assumption on the model, namely, that the dose–response curve is non-decreasing. In the present paper we first define an empirical dose–response curve based on the estimated response probabilities by using the “pool-adjacent-violators” (PAV) algorithm, then estimate effective doses $ED_{100p}$ for a large range of p by taking inverse of this empirical dose–response curve. The consistency and asymptotic distribution of these estimated effective doses are obtained. The asymptotic results can be extended to the estimated effective doses proposed by Glasbey [1987. Tolerance-distribution-free analyses of quantal dose–response data. Appl. Statist. 36 (3), 251–259] and Schmoyer [1984. Sigmoidally constrained maximum likelihood estimation in quantal bioassay. J. Amer. Statist. Assoc. 79, 448–453] under the additional assumption that the dose–response curve is symmetric or sigmoidal. We give some simulations on constructing confidence intervals using different methods.
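To make the PAV-then-invert idea concrete, here is a minimal sketch. It assumes the empirical dose–response curve is the linear interpolation of the PAV-fitted response probabilities between observed doses; the abstract does not specify the construction, so this choice, along with the function names `pava` and `effective_dose`, is an assumption rather than the authors' definition.

```python
import numpy as np


def pava(values, weights):
    """Weighted pool-adjacent-violators: the non-decreasing sequence
    closest to `values` in weighted least squares."""
    blocks = []  # each block holds [mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2),
                           w1 + w2, c1 + c2])
    return np.concatenate([np.full(c, m) for m, _, c in blocks])


def effective_dose(doses, responders, totals, p):
    """Estimate ED_{100p} by inverting the interpolated PAV curve.

    `doses` must be sorted in increasing order. Returns NaN when p lies
    outside the range of the fitted response probabilities.
    """
    doses = np.asarray(doses, dtype=float)
    totals = np.asarray(totals, dtype=float)
    fitted = pava(np.asarray(responders, dtype=float) / totals, totals)
    if p < fitted[0] or p > fitted[-1]:
        return float("nan")
    i = int(np.argmax(fitted >= p))  # first dose whose fitted value >= p
    if i == 0:
        return float(doses[0])
    # Here fitted[i-1] < p <= fitted[i], so the slope is strictly positive.
    frac = (p - fitted[i - 1]) / (fitted[i] - fitted[i - 1])
    return float(doses[i - 1] + frac * (doses[i] - doses[i - 1]))
```

For example, with doses 1 to 4, ten subjects per dose, and 1, 3, 2, 9 responders, the raw proportions 0.3 and 0.2 violate monotonicity and are pooled to 0.25 before the curve is inverted.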

4.

Bayesian hidden Markov models in DNA sequence segmentation using R: the case of Simian Vacuolating virus (SV40)

James A. Totterdell, Kerrie L. Mengersen. Journal of Statistical Computation and Simulation, 2017, 87(14): 2799-2827
