期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Tracking interval for selecting between non-nested models: An investigation for Type II right censored data

Abdolreza Sayyareh 《Journal of statistical planning and inference》2012

In this paper, we consider the setting where the observed data is incomplete. For the general situation where the number of gaps as well as the number of unobserved values in some gaps go to infinity, the asymptotic behavior of maximum likelihood estimator is not clear. We derive and investigate the asymptotic properties of maximum likelihood estimator under censorship and drive a statistic for testing the null hypothesis that the proposed non-nested models are equally close to the true model against the alternative hypothesis that one model is closer when we are faced with a life-time situation. Furthermore rewrite a normalization of a difference of Akaike criterion for estimating the difference of expected Kullback–Leibler risk between the distributions in two different models. 相似文献

2.

Linear Signed Rank Test for Model Selection

Abdolreza Sayyareh 《统计学通讯:理论与方法》2014,43(21):4492-4502

In this article, we consider a linear signed rank test for non-nested distributions in the context of the model selection. Introducing a new test, we show that, it is asymptotically more efficient than the Vuong test and the test statistic based on B statistic introduced by Clarke. However, here, we let the magnitude of the data give a better performance to the test statistic. We have shown that this test is an unbiased one. The results of simulations show that the rank test has the greater statistical power than the Vuong test where the underline distributions is symmetric. 相似文献

3.

Mutual information and redundancy for categorical data

Chong Sun Hong Beom Jun Kim 《Statistical Papers》2011,52(1):17-31

Most methods for describing the relationship among random variables require specific probability distributions and some assumptions concerning random variables. Mutual information, based on entropy to measure the dependency among random variables, does not need any specific distribution and assumptions. Redundancy, which is an analogous version of mutual information, is also proposed as a method. In this paper, the concepts of redundancy and mutual information are explored as applied to multi-dimensional categorical data. We found that mutual information and redundancy for categorical data can be expressed as a function of the generalized likelihood ratio statistic under several kinds of independent log-linear models. As a consequence, mutual information and redundancy can also be used to analyze contingency tables stochastically. Whereas the generalized likelihood ratio statistic to test the goodness-of-fit of the log-linear models is sensitive to the sample size, the redundancy for categorical data does not depend on sample size but depends on its cell probabilities. 相似文献

4.

Estimation of risk ratio in a noncompliance randomized clinical trial with trichotomous dose levels

Kung-Jong Lui Kuang-Chao Chang 《Statistical Methodology》2009,6(2):164-176

In randomized clinical trials (RCTs), we may come across the situation in which some patients do not fully comply with their assigned treatment. For an experimental treatment with trichotomous levels, we derive the maximum likelihood estimator (MLE) of the risk ratio (RR) per level of dose increase in a RCT with noncompliance. We further develop three asymptotic interval estimators for the RR. To evaluate and compare the finite sample performance of these interval estimators, we employ Monte Carlo simulation. When the number of patients per treatment is large, we find that all interval estimators derived in this paper can perform well. When the number of patients is not large, we find that the interval estimator using Wald’s statistic can be liberal, while the interval estimator using the logarithmic transformation of the MLE can lose precision. We note that use of a bootstrap variance estimate in this case may alleviate these concerns. We further note that an interval estimator combining interval estimators using Wald’s statistic and the logarithmic transformation can generally perform well with respect to the coverage probability, and be generally more efficient than interval estimators using bootstrap variance estimates when RR>1. Finally, we use the data taken from a study of vitamin A supplementation to reduce mortality in preschool children to illustrate the use of these estimators. 相似文献

5.

Probability model choice in single samples from exponential families using Poisson log-linear modelling,and model comparison using Bayes and posterior Bayes factors

Murray Aitkin 《Statistics and Computing》1995,5(2):113-120

This paper describes a method due to Lindsey (1974a) for fitting different exponential family distributions for a single population to the same data, using Poisson log-linear modelling of the density or mass function. The method is extended to Efron's (1986) double exponential family, giving exact ML estimation of the two parameters not easily achievable directly. The problem of comparing the fit of the non-nested models is addressed by both Bayes and posterior Bayes factors (Aitkin, 1991). The latter allow direct comparisons of deviances from the fitted distributions. 相似文献

6.

Asymptotic relative efficiency of wald tests in measurement error models

Patricia Gimenez Enrico A. Colosimo Heleno Bolfarine 《统计学通讯:理论与方法》2013,42(3):549-564

In this paper, asymptotic relative efficiency (ARE) of Wald tests for the Tweedie class of models with log-linear mean, is considered when the aux¬iliary variable is measured with error. Wald test statistics based on the naive maximum likelihood estimator and on a consistent estimator which is obtained by using Nakarnura's (1990) corrected score function approach are defined. As shown analytically, the Wald statistics based on the naive and corrected score function estimators are asymptotically equivalents in terms of ARE. On the other hand, the asymptotic relative efficiency of the naive and corrected Wald statistic with respect to the Wald statistic based on the true covariate equals to the square of the correlation between the unobserved and the observed co-variate. A small scale numerical Monte Carlo study and an example illustrate the small sample size situation. 相似文献

7.

Bridge regression: Adaptivity and group selection

Cheolwoo ParkYoung Joo Yoon 《Journal of statistical planning and inference》2011,141(11):3506-3519

In high-dimensional regression problems regularization methods have been a popular choice to address variable selection and multicollinearity. In this paper we study bridge regression that adaptively selects the penalty order from data and produces flexible solutions in various settings. We implement bridge regression based on the local linear and quadratic approximations to circumvent the nonconvex optimization problem. Our numerical study shows that the proposed bridge estimators are a robust choice in various circumstances compared to other penalized regression methods such as the ridge, lasso, and elastic net. In addition, we propose group bridge estimators that select grouped variables and study their asymptotic properties when the number of covariates increases along with the sample size. These estimators are also applied to varying-coefficient models. Numerical examples show superior performances of the proposed group bridge estimators in comparisons with other existing methods. 相似文献

8.

Estimation of tobit-type models with individual specific effects

Bo E. Honoré Ekaterini Kyriazidou J. L. Powell 《Econometric Reviews》2013,32(3):341-366

The aim of this paper is two-fold. First, we review recent estimators for censored regression and sample selection panel data models with unobservable individual specific effects, and show how the idea behind these estimators can be used to construct estimators for a variety of other Tobit-type models. The estimators presented in this paper are semiparametric, in the sense that they do not require the parametrization of the distribution of the unobservables. The second aim of the paper is to introduce a new class of estimators for the censored regression model. The advantage of the new estimators is that they can be applied under a stationarity assumption on the transitory error terms, which is weaker than the exchangeability assumption that is usually made in this literature. A similar generalization does not seem feasible for the estimators of the other models that are considered. 相似文献

9.

Estimation of tobit-type models with individual specific effects

Bo E. Honor Ekaterini Kyriazidou J. L. Powell 《Econometric Reviews》2000,19(3):341-366

The aim of this paper is two-fold. First, we review recent estimators for censored regression and sample selection panel data models with unobservable individual specific effects, and show how the idea behind these estimators can be used to construct estimators for a variety of other Tobit-type models. The estimators presented in this paper are semiparametric, in the sense that they do not require the parametrization of the distribution of the unobservables. The second aim of the paper is to introduce a new class of estimators for the censored regression model. The advantage of the new estimators is that they can be applied under a stationarity assumption on the transitory error terms, which is weaker than the exchangeability assumption that is usually made in this literature. A similar generalization does not seem feasible for the estimators of the other models that are considered. 相似文献

10.

Inference after separated hypotheses testing: an empirical investigation for linear models

《Journal of Statistical Computation and Simulation》2012,82(9):1275-1286

Model selection problems arise while constructing unbiased or asymptotically unbiased estimators of measures known as discrepancies to find the best model. Most of the usual criteria are based on goodness-of-fit and parsimony. They aim to maximize a transformed version of likelihood. For linear regression models with normally distributed error, the situation is less clear when two models are equivalent: are they close to or far from the unknown true model? In this work, based on stochastic simulation and parametric simulation, we study the results of Vuong's test, Cox's test, Akaike's information criterion, Bayesian information criterion, Kullback information criterion and bias corrected Kullback information criterion and the ability of these tests to discriminate between non-nested linear models. 相似文献

11.

Wavelet Analysis of Change Points in Nonparametric Hazard Rate Models Under Random Censorship

Jingle Wang Ming Zheng Wen Yu 《统计学通讯:理论与方法》2014,43(9):1956-1978

In this article, we consider detection and estimation of change points in nonparametric hazard rate models. Wavelet methods are utilized to develop a testing procedure for change points detection. The asymptotic properties of the test statistic are explored. When there exist change points in hazard function, we also propose estimators for the number, the locations, and the jump sizes of the change points. The asymptotic properties of these estimators are systematically derived. Some simulation examples are conducted to assess the finite sample performance of the proposed approach and to make comparisons with some existing methods. A real data analysis is provided to illustrate the new approach. 相似文献

12.

Hypothesis Testing in Functional Comparative Calibration Models

Manuel Galea Heleno Bolfarine Patricia Giménez 《统计学通讯:理论与方法》2013,42(11):2021-2033

The purpose of this article is to investigate hypothesis testing in functional comparative calibration models. Wald type statistics are considered which are asymptotically distributed according to the chi-square distribution. The statistics are based on maximum likelihood, corrected score approach, and method of moment estimators of the model parameters, which are shown to be consistent and asymptotically normally distributed. Results of analytical and simulation studies seem to indicate that the Wald statistics based on the method of moment estimators and the corrected score estimators are, as expected, less efficient than the Wald type statistic based on the maximum likelihood estimators for small n. Wald statistic based on moment estimators are simpler to compute than the other Wald statistics tests and their performance improves significantly as n increases. Comparisons with an alternative F statistics proposed in the literature are also reported. 相似文献

13.

A Robust Test for Non-nested Hypotheses

Maria-Pia Victoria-Feser 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1997,59(3):715-727

We propose a robust version of Cox-type test statistics for the choice between two non-nested hypotheses. We first show that the influence of small amounts of contamination in the data on the test decision can be very large. Secondly, we build a robust test statistic by using the results on robust parametric tests that are available in the literature and show that the level of the robust test is stable. Finally, we show numerically not only the robustness of this new test statistic but also that its asymptotic distribution is a good approximation of its sample distribution, unlike for the classical test statistic. We apply our results to the choice between a Pareto and an exponential distribution as well as between two competing regressors in the simple linear regression model without intercept. 相似文献

14.

Testing homogeneity in discrete mixtures

Richard Charnigo Jiayang Sun 《Journal of statistical planning and inference》2008

This paper introduces W-tests for assessing homogeneity in mixtures of discrete probability distributions. A W-test statistic depends on the data solely through parameter estimators and, if a penalized maximum likelihood estimation framework is used, has a tractable asymptotic distribution under the null hypothesis of homogeneity. The large-sample critical values are quantiles of a chi-square distribution multiplied by an estimable constant for which we provide an explicit formula. In particular, the estimation of large-sample critical values does not involve simulation experiments or random field theory. We demonstrate that W-tests are generally competitive with a benchmark test in terms of power to detect heterogeneity. Moreover, in many situations, the large-sample critical values can be used even with small to moderate sample sizes. The main implementation issue (selection of an underlying measure) is thoroughly addressed, and we explain why W-tests are well-suited to problems involving large and online data sets. Application of a W-test is illustrated with an epidemiological data set. 相似文献

15.

Forecasting time series of economic processes by model averaging across data frames of various lengths

Nikita A. Moiseev 《Journal of Statistical Computation and Simulation》2017,87(16):3111-3131

This paper presents an extension of mean-squared forecast error (MSFE) model averaging for integrating linear regression models computed on data frames of various lengths. Proposed method is considered to be a preferable alternative to best model selection by various efficiency criteria such as Bayesian information criterion (BIC), Akaike information criterion (AIC), F-statistics and mean-squared error (MSE) as well as to Bayesian model averaging (BMA) and naïve simple forecast average. The method is developed to deal with possibly non-nested models having different number of observations and selects forecast weights by minimizing the unbiased estimator of MSFE. Proposed method also yields forecast confidence intervals with a given significance level what is not possible when applying other model averaging methods. In addition, out-of-sample simulation and empirical testing proves efficiency of such kind of averaging when forecasting economic processes. 相似文献

16.

A model averaging approach for the ordered probit and nested logit models with applications

Longmei Chen Geoffrey Tso Xinyu Zhang 《Journal of applied statistics》2018,45(16):3012-3052

This paper considers model averaging for the ordered probit and nested logit models, which are widely used in empirical research. Within the frameworks of these models, we examine a range of model averaging methods, including the jackknife method, which is proved to have an optimal asymptotic property in this paper. We conduct a large-scale simulation study to examine the behaviour of these model averaging estimators in finite samples, and draw comparisons with model selection estimators. Our results show that while neither averaging nor selection is a consistently better strategy, model selection results in the poorest estimates far more frequently than averaging, and more often than not, averaging yields superior estimates. Among the averaging methods considered, the one based on a smoothed version of the Bayesian Information criterion frequently produces the most accurate estimates. In three real data applications, we demonstrate the usefulness of model averaging in mitigating problems associated with the ‘replication crisis’ that commonly arises with model selection. 相似文献

17.

Weighting for unequal selection probabilities in multilevel models

D. Pfeffermann C. J. Skinner D. J. Holmes H. Goldstein & J. Rasbash 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1998,60(1):23-40

When multilevel models are estimated from survey data derived using multistage sampling, unequal selection probabilities at any stage of sampling may induce bias in standard estimators, unless the sources of the unequal probabilities are fully controlled for in the covariates. This paper proposes alternative ways of weighting the estimation of a two-level model by using the reciprocals of the selection probabilities at each stage of sampling. Consistent estimators are obtained when both the sample number of level 2 units and the sample number of level 1 units within sampled level 2 units increase. Scaling of the weights is proposed to improve the properties of the estimators and to simplify computation. Variance estimators are also proposed. In a limited simulation study the scaled weighted estimators are found to perform well, although non-negligible bias starts to arise for informative designs when the sample number of level 1 units becomes small. The variance estimators perform extremely well. The procedures are illustrated using data from the survey of psychiatric morbidity. 相似文献

18.

Nonlinear GCV and quasi-GCV for shrinkage models

《Journal of statistical planning and inference》2005,131(2):333-347

The generalized cross-validation (GCV) method has been a popular technique for the selection of tuning parameters for smoothing and penalty, and has been a standard tool to select tuning parameters for shrinkage models in recent works. Its computational ease and robustness compared to the cross-validation method makes it competitive for model selection as well. It is well known that the GCV method performs well for linear estimators, which are linear functions of the response variable, such as ridge estimator. However, it may not perform well for nonlinear estimators since the GCV emphasizes linear characteristics by taking the trace of the projection matrix. This paper aims to explore the GCV for nonlinear estimators and to further extend the results to correlated data in longitudinal studies. We expect that the nonlinear GCV and quasi-GCV developed in this paper will provide similar tools for the selection of tuning parameters in linear penalty models and penalized GEE models. 相似文献

19.

A New Lens on High School Dropout: Use of Correspondence Analysis and the Statewide Longitudinal Data System

Kathryn Schaefer Ziemer Bianica Pires Vicki Lancaster Sallie Keller Mark Orr Stephanie Shipp 《The American statistician》2018,72(2):191-198

The combination of log-linear models and correspondence analysis have long been used to decompose contingency tables and aid in their interpretation. Until now, this approach has not been applied to the education Statewide Longitudinal Data System (SLDS), which contains administrative school data at the student level. While some research has been conducted using the SLDS, its primary use is for state education administrative reporting. This article uses the combination of log-linear models and correspondence analysis to gain insight into high school dropouts in two discrete regions in Kentucky, Appalachia and non-Appalachia, defined by the American Community Survey. The individual student records from the SLDS were categorized into one of the two regions and a log-linear model was used to identify the interactions between the demographic characteristics and the dropout categories, push-out and pull-out. Correspondence analysis was then used to visualize the interactions with the expanded push-out categories, boredom, course selection, expulsion, failing grade, teacher conflict, and pull-out categories, employment, family problems, illness, marriage, and pregnancy to provide insights into the regional differences. In this article, we demonstrate that correspondence analysis can extend the insights gained from SDLS data and provide new perspectives on dropouts. Supplementary materials for this article are available online. 相似文献

20.

Pattern–mixture and selection models for analysing longitudinal data with monotone missing patterns

Jolene Birmingham rea Rotnitzky Garrett M. Fitzmaurice 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2003,65(1):275-297

Summary. We examine three pattern–mixture models for making inference about parameters of the distribution of an outcome of interest Y that is to be measured at the end of a longitudinal study when this outcome is missing in some subjects. We show that these pattern–mixture models also have an interpretation as selection models. Because these models make unverifiable assumptions, we recommend that inference about the distribution of Y be repeated under a range of plausible assumptions. We argue that, of the three models considered, only one admits a parameterization that facilitates the examination of departures from the assumption of sequential ignorability. The three models are nonparametric in the sense that they do not impose restrictions on the class of observed data distributions. Owing to the curse of dimensionality, the assumptions that are encoded in these models are sufficient for identification but not for inference. We describe additional flexible and easily interpretable assumptions under which it is possible to construct estimators that are well behaved with moderate sample sizes. These assumptions define semiparametric models for the distribution of the observed data. We describe a class of estimators which, up to asymptotic equivalence, comprise all the consistent and asymptotically normal estimators of the parameters of interest under the postulated semiparametric models. We illustrate our methods with the analysis of data from a randomized clinical trial of contracepting women. 相似文献