1.

An alternate version of the conceptual predictive statistic based on a symmetrized discrepancy measure

Joseph E. Cavanaugh Andrew A. Neath Simon L. Davies 《Journal of statistical planning and inference》2010

The conceptual predictive statistic, C_p, is a widely used criterion for model selection in linear regression. C_p serves as an estimator of a discrepancy, a measure that reflects the disparity between the generating model and a fitted candidate model. This discrepancy, based on scaled squared error loss, is asymmetric: an alternate measure is obtained by reversing the roles of the two models in the definition of the measure. We propose a variant of the C_p statistic based on estimating a symmetrized version of the discrepancy targeted by C_p. We claim that the resulting criterion provides better protection against overfitting than C_p, since the symmetric discrepancy is more sensitive towards detecting overspecification than its asymmetric counterpart. We illustrate our claim by presenting simulation results. Finally, we demonstrate the practical utility of the new criterion by discussing a modeling application based on data collected in a cardiac rehabilitation program at University of Iowa Hospitals and Clinics.
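For orientation, the classical C_p that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the standard Mallows form, C_p = RSS_p / s^2 - n + 2p, with s^2 taken from the largest candidate model; it is not the symmetrized variant the authors propose. The helper names (`solve`, `rss`, `mallows_cp`) and the simulated data are assumptions made for the example.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rss(X, y, cols):
    """Residual sum of squares of the least-squares fit of y on the chosen columns of X."""
    Z = [[row[j] for j in cols] for row in X]
    p = len(cols)
    xtx = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    xty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * zi for b, zi in zip(beta, z))) ** 2 for z, yi in zip(Z, y))

def mallows_cp(X, y, cols):
    """Classical Mallows C_p for the candidate model using the columns in `cols`."""
    n, k = len(y), len(X[0])
    s2 = rss(X, y, list(range(k))) / (n - k)  # error variance estimated from the full model
    return rss(X, y, cols) / s2 - n + 2 * len(cols)

# Hypothetical simulated data: intercept plus three predictors, only the first one matters.
random.seed(0)
n = 50
X = [[1.0, random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
y = [1.0 + 2.0 * row[1] + random.gauss(0, 1) for row in X]

for cols in ([0], [0, 1], [0, 1, 2], [0, 1, 2, 3]):
    print(cols, round(mallows_cp(X, y, cols), 2))
```

By construction the full model always has C_p equal to its number of parameters, so the criterion only discriminates among the proper submodels; candidates with C_p close to p are the ones usually preferred.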

2.

A minimum description length approach to hidden Markov models with Poisson and Gaussian emissions. Application to order identification

A. Chambaz A. Garivier E. Gassiat 《Journal of statistical planning and inference》2009

We address the issue of order identification for hidden Markov models with Poisson and Gaussian emissions. We prove information-theoretic BIC-like mixture inequalities in the spirit of Finesso [1991. Consistent estimation of the order for Markov and hidden Markov chains. Ph.D. Thesis, University of Maryland]; Liu and Narayan [1994. Order estimation and sequential universal data compression of a hidden Markov source by the method of mixtures. Canad. J. Statist. 30(4), 573–589]; Gassiat and Boucheron [2003. Optimal error exponents in hidden Markov models order estimation. IEEE Trans. Inform. Theory 49(4), 964–980]. These inequalities lead to consistent penalized estimators that need no prior bound on the order. A simulation study and an application to postural analysis in humans are provided.

3.

Equivalence of Certain Chi-Squared Test Statistics

Robert F. Woolson Stephen S. Brier 《The American statistician》2013,67(4):250-253

In likelihood analysis of categorized data, it is well known that within a restricted class of log-linear models the likelihood kernels for multinomial and product multinomial sampling distributions are identical. In practical terms the estimation procedure for one is appropriate for the other. There does not appear to be a widespread realization that a similar result holds for a wide class of the Grizzle, Starmer, and Koch (1969) weighted least squares techniques. In this report such a fundamental relationship is explicitly presented and illustrated through two analyses of Bartlett's (1935) data.

4.

Minimum Sample Size Considerations for Two-Group Linear and Quadratic Discriminant Analysis with Rare Populations

Shannon Zavorka Jamis J. Perrett 《Communications in Statistics - Simulation and Computation》2013,42(7):1726-1739

Linear discriminant analysis and quadratic discriminant analysis are used to predict group membership. Rare populations present situations in which group sizes differ drastically. This article examined k = 2 and k = 4 predictor variables for groups with different levels of rarity and different levels of sensitivity and specificity. Sample size recommendations were generated for both minimum and maximum group overlap using the leave-one-out (L-O-O) method of estimation. Minimum sample size recommendations are provided in tables for immediate implementation by applied researchers.

5.

On elliptical multilevel models

Roberto F. Manghi Francisco José A. Cysneiros 《Journal of applied statistics》2016,43(12):2150-2171

Multilevel models have been widely applied to analyze data sets which present some hierarchical structure. In this paper we propose a generalization of the normal multilevel models, named elliptical multilevel models. This proposal suggests the use of distributions in the elliptical class, thus involving all symmetric continuous distributions, including the normal distribution as a particular case. Elliptical distributions may have lighter or heavier tails than the normal ones. In the case of normal error models with the presence of outlying observations, heavy-tailed error models may be applied to accommodate such observations. In particular, we discuss some aspects of the elliptical multilevel models, such as maximum likelihood estimation and residual analysis to assess features related to the fitting and the model assumptions. Finally, two motivating examples analyzed under normal multilevel models are reanalyzed under Student-t and power exponential multilevel models. Comparisons with the normal multilevel model are performed by using residual analysis.

6.

Consistent model selection and data-driven smooth tests for longitudinal data in the estimating equations approach

Lan Wang Annie Qu 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2009,71(1):177-190

Summary. Model selection for marginal regression analysis of longitudinal data is challenging owing to the presence of correlation and the difficulty of specifying the full likelihood, particularly for correlated categorical data. The paper introduces a novel Bayesian information criterion type model selection procedure based on the quadratic inference function, which does not require the full likelihood or quasi-likelihood. With probability approaching 1, the criterion selects the most parsimonious correct model. Although a working correlation matrix is assumed, there is no need to estimate the nuisance parameters in the working correlation matrix; moreover, the model selection procedure is robust against the misspecification of the working correlation matrix. The criterion proposed can also be used to construct a data-driven Neyman smooth test for checking the goodness of fit of a postulated model. This test is especially useful and often yields much higher power in situations where the classical directional test behaves poorly. The finite sample performance of the model selection and model checking procedures is demonstrated through Monte Carlo studies and analysis of a clinical trial data set.

7.

Model-Selection-Based Detection of Unit Root Allowing for Various Trend-Break Types

Kosei Fukuda 《Communications in Statistics - Simulation and Computation》2013,42(1):154-166

In the conventional hypothesis-testing approach to the detection of a unit root and a trend break, selections of the outlier type (additive or innovational) and of the break type (jump or kink) are carried out arbitrarily, because there is no generally accepted statistical technique. To overcome this problem, a model-selection approach using the modified Bayesian information criterion (MBIC) is proposed. Whether the observed time series contains a unit root and a trend break is determined as a result of model selection from among alternative models with and without unit root and trend break. The efficacy of the proposed approach is verified using comprehensive simulations.

8.

On Model Selection Consistency of Bayesian Method for Normal Linear Models

Shuyun Wang Qin Chang 《Communications in Statistics - Theory and Methods》2013,42(22):4021-4040

9.

Robust Variable Selection in Linear Mixed Models

Yali Fan Guoyou Qin 《Communications in Statistics - Theory and Methods》2014,43(21):4566-4581

In this article, we develop a robust variable selection procedure jointly for fixed and random effects in linear mixed models for longitudinal data. We propose a penalized robust estimator for both the regression coefficients and the variance of random effects based on a re-parametrization of the linear mixed models. Under some regularity conditions, we show the oracle properties of the proposed robust variable selection method. A simulation study shows the robustness of the proposed method against outliers. Finally, the proposed method is illustrated in the analysis of a real data set.

10.

Sample Size and Power Calculations for Left-Truncated Normal Distribution

Shiquan Ren Haitao Chu Shenghan Lai 《Communications in Statistics - Theory and Methods》2013,42(6):847-860

Sample size calculation is an important component in designing an experiment or a survey. In a wide variety of fields, including management science, insurance, and biological and medical science, truncated normal distributions are encountered in many applications. However, the sample size required for the left-truncated normal distribution has not been investigated, because the distribution of the sample mean from the left-truncated normal distribution is complex and difficult to obtain. This paper compares an ad hoc approach to two newly proposed methods based on the Central Limit Theorem and on a high degree saddlepoint approximation for calculating the required sample size with the prespecified power. As shown by use of simulations and an example of health insurance cost in China, the ad hoc approach underestimates the sample size required to achieve prespecified power. The method based on the high degree saddlepoint approximation provides valid sample size and power calculations, and it performs better than the Central Limit Theorem. When the sample size is not too small, the Central Limit Theorem also provides a valid and relatively simple tool to approximate that sample size.

11.

Performance of information criteria for spatial models

《Journal of Statistical Computation and Simulation》2012,82(1):93-106

Model choice is one of the most crucial aspects of any statistical data analysis. It is well known that most models are just an approximation to the true data-generating process but among such model approximations, it is our goal to select the ‘best’ one. Researchers typically consider a finite number of plausible models in statistical applications, and the related statistical inference depends on the chosen model. Hence, model comparison is required to identify the ‘best’ model among several such candidate models. This article considers the problem of model selection for spatial data. The issue of model selection for spatial models has been addressed in the literature by the use of traditional information criteria-based methods, even though such criteria have been developed based on the assumption of independent observations. We evaluate the performance of some of the popular model selection criteria via Monte Carlo simulation experiments using small to moderate samples. In particular, we compare the performance of some of the most popular information criteria such as Akaike information criterion (AIC), Bayesian information criterion, and corrected AIC in selecting the true model. The ability of these criteria to select the correct model is evaluated under several scenarios. This comparison is made using various spatial covariance models ranging from stationary isotropic to nonstationary models.

12.

An m-estimation-based model selection criterion with a data-oriented penalty

《Journal of Statistical Computation and Simulation》2012,82(1):71-87

In Wu and Zen (1999), a linear model selection procedure based on M-estimation is proposed, which includes many classical model selection criteria as its special cases, and it is shown that the selection procedure is strongly consistent for a variety of penalty functions. In this paper, we investigate its small-sample performance for some choices of fixed penalty functions. It can be seen that the performance varies with the choice of the penalty. Hence, a randomized penalty based on observed data is proposed, which preserves the consistency property and provides improved performance over a fixed choice of penalty functions.

13.

Model Selection and Regression t-Statistics

DeWayne Derryberry Ken Aho John Edwards Teri Peterson 《The American statistician》2013,67(4):379-381

It is shown that dropping quantitative variables from a linear regression, based on t-statistics, is mathematically equivalent to dropping variables based on commonly used information criteria.
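The equivalence stated in this abstract can be illustrated with a short derivation. This is a sketch under standard assumptions, not necessarily the argument used by the authors: a Gaussian linear model with n observations, a full model with p regression coefficients, the AIC written up to an additive constant as n log(RSS/n) + 2k, and a single variable with t-statistic t being considered for removal. The symbols RSS_0 (reduced model) and RSS_1 (full model) are notation introduced here.

```latex
% The F-statistic for dropping one variable equals t^2:
\[
t^2 = \frac{\mathrm{RSS}_0 - \mathrm{RSS}_1}{\mathrm{RSS}_1/(n-p)}
\quad\Longrightarrow\quad
\frac{\mathrm{RSS}_0}{\mathrm{RSS}_1} = 1 + \frac{t^2}{n-p}.
\]
% Difference in AIC between the reduced model (p-1 coefficients) and the full model (p coefficients):
\[
\mathrm{AIC}_0 - \mathrm{AIC}_1
= n \log\frac{\mathrm{RSS}_0}{\mathrm{RSS}_1} - 2
= n \log\!\left(1 + \frac{t^2}{n-p}\right) - 2.
\]
% The reduced model is preferred exactly when this difference is negative:
\[
\mathrm{AIC}_0 < \mathrm{AIC}_1
\iff
t^2 < (n-p)\left(e^{2/n} - 1\right).
\]
% With a BIC-type penalty, log n replaces 2:
\[
\mathrm{BIC}_0 < \mathrm{BIC}_1
\iff
t^2 < (n-p)\left(n^{1/n} - 1\right).
\]
```

For large n the AIC threshold approaches t^2 < 2, that is |t| below roughly 1.41, while the BIC threshold behaves like t^2 < log n, so each criterion corresponds to a particular cutoff on the t-statistic.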

14.

Optimum designs for parameter estimation in a mixture experiment with two correlated responses

Manisha Pal Nripes Kumar Mandal 《Communications in Statistics - Simulation and Computation》2017,46(10):7698-7709

In this paper, we investigate a mixture problem with two responses, which are functions of the mixing proportions, and are correlated with known dispersion matrix. We obtain D- and A-optimal designs for estimating the parameters of the response functions, when none or some of the regression coefficients of the two functions are the same. It is shown that when no prior knowledge about the regression coefficients is available, the D-optimal design is independent of the dispersion matrix, while the A-optimal design depends on it, provided the response functions are of different degree. On the other hand, when some of the regression coefficients are known to be the same for both the functions, the D-optimal design depends on the dispersion matrix when the two response functions are not of the same degree.

15.

A simultaneous variable selection methodology for linear mixed models

Juming Pan Junfeng Shang 《Journal of Statistical Computation and Simulation》2018,88(17):3323-3337

Selecting an appropriate structure for a linear mixed model is an appealing problem in a number of applications, such as the modelling of longitudinal or clustered data. In this paper, we propose a variable selection procedure for simultaneously selecting and estimating the fixed and random effects. More specifically, a profile log-likelihood function, along with an adaptive penalty, is utilized for sparse selection. The Newton-Raphson optimization algorithm is used to complete the parameter estimation. By jointly selecting the fixed and random effects, the proposed approach increases selection accuracy compared with two-stage procedures, and the use of the profile log-likelihood can improve computational efficiency in one-stage procedures. We prove that the proposed procedure enjoys model selection consistency. A simulation study and a real data application are conducted to demonstrate the effectiveness of the proposed method.

16.

A Note on Bayesian Analyses of Capture–Recapture Data with Perfect Recaptures

S. A. Sisson Y. V. Chan 《Communications in Statistics - Theory and Methods》2013,42(1):53-62

The present article deals with the problem of misspecifying the disturbance covariance matrix as scalar when it is locally nonscalar. We consider a family of shrinkage estimators based on the OLS estimator and compare its asymptotic properties with those of the OLS estimator. We propose a similar family of estimators based on FGLS and compare its asymptotic properties with those of the shrinkage estimator based on OLS under a Pitman drift process. The effect of misspecifying the disturbance covariance matrix is analyzed with the help of a numerical simulation.

17.

The Estimation of Compensating Wage Differentials: Lessons From the Deadliest Catch

《Journal of Business & Economic Statistics》2012,30(1):165-182

I use longitudinal survey data from commercial fishing deckhands in the Alaskan Bering Sea to provide new insights on empirical methods commonly used to estimate compensating wage differentials and the value of statistical life (VSL). The unique setting exploits intertemporal variation in fatality rates and wages within worker-vessel pairs caused by a combination of weather patterns and policy changes, allowing identification of parameters and biases that it has only been possible to speculate about in more general settings. I show that estimation strategies common in the literature produce biased estimates in this setting, and decompose the bias components due to latent worker, establishment, and job-match heterogeneity. The estimates also remove the confounding effects of endogenous job mobility and dynamic labor market search, narrowing a conceptual gap between search-based hedonic wage theory and its empirical applications. I find that workers’ marginal aversion to fatal risk falls as risk levels rise, which suggests complementarities in the benefits of public safety policies. Supplementary materials for this article are available online.

18.

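A study of sample size for customer satisfaction models (total citations: 2, self-citations: 0, citations by others: 2)

Liang Yan (梁燕) Jin Yongjin (金勇进) 《Statistical Research》(统计研究) 2007,24(7):68-74

Starting from a brief discussion of the customer satisfaction model and its estimation method, PLS (partial least squares), this paper studies in detail the sample size required by PLS estimation of customer satisfaction models. Using data from an actual case study of customer satisfaction research in China, it offers recommendations on the sample size requirements for such models, which can serve as guidance for customer satisfaction practice.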

19.

Sparse group lasso for multiclass functional logistic regression models

Hidetoshi Matsui 《Communications in Statistics - Simulation and Computation》2019,48(6):1784-1797

Sparsity-inducing penalties are useful tools for variable selection and are also effective for regression problems where the data are functions. We consider the problem of selecting not only variables but also decision boundaries in multiclass logistic regression models for functional data, using sparse regularization. The parameters of the functional logistic regression model are estimated in the framework of the penalized likelihood method with the sparse group lasso-type penalty, and then tuning parameters for the model are selected using the model selection criterion. The effectiveness of the proposed method is investigated through simulation studies and the analysis of a gene expression data set.

20.

Variable Selection for Naive Bayes Semisupervised Learning

Byoung-Jeong Choi Kwang-Rae Kim Kyu-Dong Cho Changyi Park 《Communications in Statistics - Simulation and Computation》2013,42(10):2702-2713

This article deals with semisupervised learning based on the naive Bayes assumption. A univariate Gaussian mixture density is used for continuous input variables whereas a histogram type density is adopted for discrete input variables. The EM algorithm is used for the computation of maximum likelihood estimators of parameters in the model when we fix the number of mixing components for each continuous input variable. We carry out model selection to choose a parsimonious model among various fitted models based on an information criterion. A common density method is proposed for the selection of significant input variables. Simulated and real datasets are used to illustrate the performance of the proposed method.