Similar Literature
20 similar documents found (search time: 31 ms)
1.
We study a problem of model selection for data produced by two different context tree sources. Motivated by linguistic questions, we consider the case where the probabilistic context trees corresponding to the two sources are finite and share many of their contexts. In order to understand the differences between the two sources, it is important to identify which contexts and which transition probabilities are specific to each source. We consider a class of probabilistic context tree models with three types of contexts: those which appear in one, the other, or both sources. We use a BIC penalized maximum likelihood procedure that jointly estimates the two sources. We propose a new algorithm which efficiently computes the estimated context trees. We prove that the procedure is strongly consistent. We also present a simulation study showing the practical advantage of our procedure over a procedure that works separately on each data set.

2.
Empirical likelihood based variable selection (total citations: 1; self-citations: 0; citations by others: 1)
Information criteria form an important class of model/variable selection methods in statistical analysis. Parametric likelihood is a crucial part of these methods. In some applications, such as the generalized linear models, the models are only specified by a set of estimating functions. To overcome the unavailability of a well-defined likelihood function, information criteria under empirical likelihood are introduced. Under this setup, we solve the existence problem of the profile empirical likelihood caused by over-constraint in variable selection problems. The asymptotic properties of the new method are investigated, and the method is shown to be consistent in selecting the variables under mild conditions. Simulation studies find that the proposed method has comparable performance to the parametric information criteria when a suitable parametric model is available, and is superior when the parametric model assumption is violated. A real data set is also used to illustrate the usefulness of the new method.
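The profile empirical likelihood at the heart of such criteria reduces to a small convex dual problem. As a hedged illustration of that ingredient (not the paper's variable selection algorithm), the sketch below computes the empirical log-likelihood ratio for a univariate mean by damped Newton iteration on the Lagrange multiplier; all names are my own.

```python
import numpy as np

def el_log_ratio(x, mu, iters=50):
    """-2 log empirical likelihood ratio for H0: E[X] = mu.
    Solves the dual problem in the Lagrange multiplier by damped Newton;
    the statistic is asymptotically chi-squared with 1 degree of freedom."""
    z = np.asarray(x, dtype=float) - mu
    lam = 0.0
    for _ in range(iters):
        d = 1.0 + lam * z
        grad = np.sum(z / d)              # estimating equation in lambda
        hess = -np.sum(z * z / (d * d))   # its derivative (always negative)
        step = grad / hess
        lam_new = lam - step
        # backtrack so that all implied weights stay positive: 1 + lam*z_i > 0
        while np.any(1.0 + lam_new * z <= 0):
            step *= 0.5
            lam_new = lam - step
        lam = lam_new
    return 2.0 * np.sum(np.log(1.0 + lam * z))
```

At the sample mean the multiplier is zero and the statistic vanishes; an information criterion then adds a penalty term to this log-likelihood ratio.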

3.
We develop an approach to evaluating frequentist model averaging procedures by considering them in a simple situation in which there are two nested linear regression models over which we average. We introduce a general class of model-averaged confidence intervals, obtain exact expressions for the coverage and the scaled expected length of the intervals, and use these to compute these quantities for the model-averaged profile likelihood (MPL) and model-averaged tail area confidence intervals proposed by D. Fletcher and D. Turek. We show that the MPL confidence intervals can perform more poorly than the standard confidence interval used after model selection but ignoring the model selection process. The model-averaged tail area confidence intervals perform better than the MPL and post-model-selection confidence intervals but, for the examples that we consider, offer little advantage over simply using the standard confidence interval for θ under the full model, with the same nominal coverage.

4.
The reconstruction of phylogenetic trees is one of the most important and interesting problems in evolutionary studies. Many methods have been proposed in the literature for constructing phylogenetic trees, each based on different criteria and evolutionary models; however, the topologies of trees constructed by different methods may be quite different. These topological errors may be due to unsuitable criteria or evolutionary models. Since there are many tree construction approaches, we are interested in selecting the tree that best fits the true model. In this study, we propose an adjusted k-means approach and a misclassification error score criterion to solve this problem. The simulation study shows that this method can select better trees among the potential candidates, providing a useful tool for phylogenetic tree selection.

5.
Regression analyses are commonly performed with doubly limited continuous dependent variables; for instance, when modeling the behavior of rates, proportions and income concentration indices. Several models are available in the literature for use with such variables, one of them being the unit gamma regression model. In all such models, parameter estimation is typically performed by the maximum likelihood method, and testing inferences on the model's parameters are usually based on the likelihood ratio test. Such a test can, however, deliver quite imprecise inferences when the sample size is small. In this paper, we propose two modified likelihood ratio test statistics for use with unit gamma regressions that deliver much more accurate inferences when the number of data points is small. Numerical (i.e. simulation) evidence is presented for both fixed dispersion and varying dispersion models, and also for tests that involve nonnested models. We also present and discuss two empirical applications.
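The paper's modified statistics are analytical corrections; a simpler, generic small-sample fix in the same spirit is a bootstrap Bartlett correction, which rescales the LR statistic so that its null mean matches the chi-squared reference. The sketch below illustrates the idea for testing a normal mean, not the unit gamma model; all names and details are illustrative.

```python
import numpy as np

def lr_stat(y, mu0):
    """-2 log likelihood ratio for H0: mu = mu0 (normal model, unknown variance)."""
    n = len(y)
    s0 = np.mean((y - mu0) ** 2)        # variance MLE under H0
    s1 = np.mean((y - y.mean()) ** 2)   # unrestricted variance MLE
    return n * np.log(s0 / s1)

def bartlett_corrected_lr(y, mu0, B=300, df=1, rng=None):
    """Bootstrap Bartlett correction: rescale the LR statistic so its mean
    under H0 (estimated by a parametric bootstrap at the fitted null model)
    matches the chi-squared mean df."""
    rng = np.random.default_rng(rng)
    n = len(y)
    sd0 = np.sqrt(np.mean((y - mu0) ** 2))
    boot = np.array([lr_stat(rng.normal(mu0, sd0, n), mu0) for _ in range(B)])
    return lr_stat(y, mu0) * df / boot.mean()
```

The corrected statistic is then referred to the same chi-squared distribution; in small samples the rescaling typically shrinks the inflated raw statistic.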

6.
The empirical likelihood method is proposed to construct confidence regions for the difference between the coefficients of two-sample linear regression models. Unlike existing empirical likelihood procedures for one-sample linear regression models, the empirical likelihood ratio function here is not concave, so the usual maximum empirical likelihood estimate cannot be obtained directly. To overcome this problem, we incorporate a natural and well-motivated restriction into the likelihood function and obtain a restricted empirical likelihood ratio statistic (RELR). It is shown that RELR has an asymptotic chi-squared distribution. Furthermore, to improve the coverage accuracy of the confidence regions, a Bartlett correction is applied. The effectiveness of the proposed approach is demonstrated by a simulation study.

7.
We consider Whittle likelihood estimation of seasonal autoregressive fractionally integrated moving-average models in the presence of an additional measurement error and show that the spectral maximum Whittle likelihood estimator is asymptotically normal. We illustrate by simulation that ignoring measurement errors may result in incorrect inference. Hence, it is pertinent to test for the presence of measurement errors, which we do by developing a likelihood ratio (LR) test within the framework of Whittle likelihood. We derive the non-standard asymptotic null distribution of this LR test and the limiting distribution of the LR test under a sequence of local alternatives. Because, in practice, we do not know the order of the seasonal autoregressive fractionally integrated moving-average model, we consider three modifications of the LR test that take model uncertainty into account. We study the finite sample properties of the size and the power of the LR test and its modifications. The efficacy of the proposed approach is illustrated by a real-life example.

8.

In this paper we introduce a continuous tree mixture model, a mixture of undirected graphical models with tree-structured graphs, viewed as a nonparametric approach to multivariate analysis. We estimate its parameters, the component edge sets and the mixture proportions, through a regularized maximum likelihood procedure. Our new algorithm, which combines the expectation-maximization algorithm with a modified version of Kruskal's algorithm, simultaneously estimates and prunes the mixture component trees. Simulation studies indicate that this method performs better than the alternative Gaussian graphical mixture model. The proposed method is also applied to a water-level data set and compared with the results of a Gaussian mixture model.

9.
It is well known that multiple roots of the likelihood equations can exist for finite normal mixture models, and selecting a consistent root has long been a challenging problem. Simply using the root with the largest likelihood does not work because of spurious roots. In addition, the likelihood of normal mixture models with unequal variances is unbounded, so the maximum likelihood estimate (MLE) is not well defined. In this paper, we propose a simple root selection method for univariate normal mixture models by incorporating the idea of a goodness-of-fit test. The new method inherits both the consistency properties of distance estimators and the efficiency of the MLE. It is simple to use and its computation can easily be done using existing R packages for mixture models. In addition, the proposed root selection method is very general and can also be applied to other univariate mixture models. We demonstrate the effectiveness of the proposed method and compare it with some existing methods through simulation studies and a real data application.
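A minimal sketch of the root-selection idea, assuming a univariate two-component normal mixture: run EM from several starting values and keep the root whose fitted CDF is closest to the empirical CDF in Cramér-von Mises distance. This is my own illustration of the general strategy, not the authors' implementation, and the specific distance is one of several possible goodness-of-fit choices.

```python
import numpy as np
from scipy.stats import norm

def em_normal_mix(x, mu, sd, w, iters=200):
    """Plain EM for a univariate two-component normal mixture."""
    for _ in range(iters):
        dens = w * norm.pdf(x[:, None], mu, sd)       # (n, 2) weighted densities
        r = dens / dens.sum(axis=1, keepdims=True)    # responsibilities
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        sd = np.maximum(sd, 1e-3)                     # guard against degenerate roots
    return mu, sd, w

def select_root(x, starts):
    """Among EM roots from different starts, keep the one whose fitted CDF
    is closest to the empirical CDF (Cramer-von Mises distance)."""
    xs = np.sort(x)
    ecdf = (np.arange(len(x)) + 0.5) / len(x)
    best = None
    for mu0 in starts:
        mu, sd, w = em_normal_mix(x, np.asarray(mu0, float),
                                  np.ones(2), np.full(2, 0.5))
        fit = (w * norm.cdf(xs[:, None], mu, sd)).sum(axis=1)
        dist = np.mean((fit - ecdf) ** 2)
        if best is None or dist < best[0]:
            best = (dist, np.sort(mu), sd, w)
    return best
```

Spurious or symmetric roots fit the empirical CDF poorly and are discarded, which is the sense in which the distance criterion inherits consistency while the surviving root keeps likelihood efficiency.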

10.
Bootstrap smoothed (bagged) parameter estimators have been proposed as an improvement on estimators found after preliminary data-based model selection. A result of Efron in 2014 is a very convenient and widely applicable formula for a delta method approximation to the standard deviation of the bootstrap smoothed estimator. This approximation provides an easily computed guide to the accuracy of this estimator. In addition, Efron considered a confidence interval centred on the bootstrap smoothed estimator, with width proportional to the estimate of this approximation to the standard deviation. We evaluate this confidence interval in the scenario of two nested linear regression models, the full model and a simpler model, and a preliminary test of the null hypothesis that the simpler model is correct. We derive computationally convenient expressions for the ideal bootstrap smoothed estimator and the coverage probability and expected length of this confidence interval. In terms of coverage probability, this confidence interval outperforms the post-model-selection confidence interval with the same nominal coverage and based on the same preliminary test. We also compare the performance of the confidence interval centred on the bootstrap smoothed estimator, in terms of expected length, to the usual confidence interval, with the same minimum coverage probability, based on the full model.
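A sketch of bootstrap smoothing together with Efron's delta-method standard deviation, in the simplest pretest setting (estimating a mean, with the "simpler model" mu = 0 standing in for the nested regression); the function and its details are illustrative, not the paper's derivations.

```python
import numpy as np

def bagged_pretest_mean(y, B=400, z=1.96, rng=None):
    """Bootstrap-smoothed (bagged) pretest estimator of a mean.
    The raw estimator is ybar if the t-statistic for H0: mu = 0 exceeds z
    in absolute value, and 0 otherwise; we average it over bootstrap
    resamples and compute Efron's (2014) delta-method standard deviation."""
    rng = np.random.default_rng(rng)
    n = len(y)
    t = np.empty(B)
    N = np.empty((B, n))
    for b in range(B):
        idx = rng.integers(0, n, n)
        yb = y[idx]
        N[b] = np.bincount(idx, minlength=n)   # resampling counts per observation
        tstat = np.sqrt(n) * yb.mean() / yb.std(ddof=1)
        t[b] = yb.mean() if abs(tstat) > z else 0.0
    tbar = t.mean()
    # cov_i: bootstrap covariance between the count N_i and the estimator t*
    cov_i = ((N - N.mean(axis=0)) * (t - tbar)[:, None]).mean(axis=0)
    return tbar, np.sqrt(np.sum(cov_i ** 2))
```

Smoothing removes the discontinuity of the pretest estimator, and the returned standard deviation is what the Efron-style interval is centred and scaled by.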

11.
Sparsity-inducing penalties are useful tools for variable selection and are also effective for regression problems where the data are functions. We consider the problem of selecting not only variables but also decision boundaries in multiclass logistic regression models for functional data, using sparse regularization. The parameters of the functional logistic regression model are estimated in the framework of the penalized likelihood method with the sparse group lasso-type penalty, and then tuning parameters for the model are selected using the model selection criterion. The effectiveness of the proposed method is investigated through simulation studies and the analysis of a gene expression data set.

12.
Huggins & Basawa (1999) proposed several extensions of the bifurcating autoregressive model used to model cell lineage trees. These models overcame limitations of the original bifurcating autoregressive model by allowing larger correlations between cousin cells and other cells in the same generation. Huggins & Basawa considered only maximum likelihood inference based on independent trees. This paper examines the asymptotic properties of maximum likelihood estimators based on a single large tree.

13.
Variable selection is an effective methodology for dealing with models with numerous covariates. We consider methods of variable selection for the semiparametric Cox proportional hazards model under the progressive Type-II censoring scheme. The Cox proportional hazards model is used to model the effects of the environmental covariates. By applying Breslow's "least information" idea, we obtain a profile likelihood function to estimate the coefficients. Lasso-type penalized profile likelihood estimation as well as a stepwise variable selection method are explored as means of finding the important covariates. Numerical simulations are conducted and the Veteran's Administration Lung Cancer data are used to evaluate the performance of the proposed method.

14.
Model selection is an important part of spatial econometric modelling and a key step in empirical analysis with spatial econometric models. This paper gives a detailed theoretical analysis of the tools used for spatial econometric model selection: the Moran index test, LM tests, the likelihood function, the three major information criteria, Bayesian posterior probabilities, and Markov chain Monte Carlo (MCMC) methods. Building on this analysis, a simulation study programmed in Matlab shows that, when selecting among an extended family of spatial econometric models, both the Moran index based on OLS residuals and the LM tests have substantial limitations; the maximum log-likelihood principle lacks discriminating power; the LM tests are effective only for distinguishing between the SEM and SAR models; and the information criteria are effective for most models but can still select the wrong one. By contrast, when an appropriate M-H algorithm is specified, the MCMC method, which makes full use of the likelihood function and prior information, achieves higher selection validity, yields completely accurate decisions for larger samples, and is also very effective for selecting among spatial econometric models with spatial adjacency matrices of different orders.
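The information-criterion route can be sketched generically (here for nested OLS models rather than spatial models, which would additionally require a spatial weights matrix and a spatial likelihood): compute each candidate's maximized Gaussian log-likelihood and compare penalized fits. Names are illustrative.

```python
import numpy as np

def gaussian_ols_loglik(y, X):
    """OLS fit and the corresponding maximized Gaussian log-likelihood."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    n = len(y)
    s2 = np.mean((y - X @ beta) ** 2)   # MLE of the error variance
    return -0.5 * n * (np.log(2.0 * np.pi * s2) + 1.0), beta

def bic(loglik, k, n):
    """Bayesian information criterion; smaller is better."""
    return -2.0 * loglik + k * np.log(n)
```

To choose between, say, y ~ x and y ~ x + x^2, one computes `bic(ll, p + 1, n)` for each candidate (the +1 counts the error variance among the k parameters) and keeps the model with the smaller value.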

15.
We consider the use of Monte Carlo methods to obtain maximum likelihood estimates for random effects models and distinguish between the pointwise and functional approaches. We explore the relationship between the two approaches and compare them with the EM algorithm. The functional approach is more ambitious, but the approximation is local in nature, which we demonstrate graphically using two simple examples. A remedy is to obtain successively better approximations of the relative likelihood function near the true maximum likelihood estimate. To save computing time, we use only one Newton iteration to approximate the maximiser of each Monte Carlo likelihood and show that this is equivalent to the pointwise approach. The procedure is applied to fit a latent process model to a set of polio incidence data. The paper ends with a comparison between the marginal likelihood and the recently proposed hierarchical likelihood, which avoids integration altogether.

16.
In this paper we consider the test of dimensionality in the MANOVA model. For this testing problem, the likelihood ratio (LR) test, the Lawley-Hotelling (LH) type test and the Bartlett-Nanda-Pillai (BNP) type test are often used. We obtain the asymptotic expansions of the powers of these tests under local alternatives. The Bahadur exact slopes of these tests are also obtained. Based on these results, we arrive at a unified view of the comparison among the LR test, the LH type test and the BNP type test.

17.
Model selection for marginal regression analysis of longitudinal data is challenging owing to the presence of correlation and the difficulty of specifying the full likelihood, particularly for correlated categorical data. The paper introduces a novel Bayesian information criterion type model selection procedure based on the quadratic inference function, which does not require the full likelihood or quasi-likelihood. With probability approaching 1, the criterion selects the most parsimonious correct model. Although a working correlation matrix is assumed, there is no need to estimate the nuisance parameters in the working correlation matrix; moreover, the model selection procedure is robust against misspecification of the working correlation matrix. The criterion proposed can also be used to construct a data-driven Neyman smooth test for checking the goodness of fit of a postulated model. This test is especially useful and often yields much higher power in situations where the classical directional test behaves poorly. The finite sample performance of the model selection and model checking procedures is demonstrated through Monte Carlo studies and analysis of a clinical trial data set.

18.
Model selection methods are important for identifying the best approximating model. To identify the best meaningful model, the purpose of the model should be clearly stated in advance. The focus of this paper is model selection when the modelling purpose is classification. We propose a new model selection approach designed for logistic regression when the main modelling purpose is classification. The method is based on the distance between two clustering trees. We also question and evaluate the performance of conventional model selection methods, based on information-theoretic concepts, in determining the best logistic regression classifier. An extensive simulation study is used to assess the finite sample performance of the cluster-tree-based and the information-theoretic model selection methods. Simulations are adjusted for whether or not the true model is in the candidate set. Results show that the new approach is highly promising. Finally, the methods are applied to a real data set to select a binary model as a means of classifying subjects with respect to their risk of breast cancer.

19.
We discuss a method of weighting the likelihood equations with the aim of obtaining fully efficient and robust estimators. We discuss the case of discrete probability models using several weighting functions. If the weight functions generate increasing residual adjustment functions, then the method provides a link between the maximum likelihood score equations and minimum disparity estimation, as well as a set of diagnostic weights and a goodness-of-fit criterion. However, when the weights do not generate increasing residual adjustment functions, a selection criterion is needed to obtain the robust root. The weight functions discussed in this paper do not automatically downweight a proportion of the data; an observation is significantly downweighted only if it is inconsistent with the assumed model. At the true model, therefore, the proposed estimating equations behave like the ordinary likelihood equations. We apply our results to several discrete models; in addition, a toxicology experiment illustrates the method in the context of logistic regression.
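A classic instance of the minimum disparity estimation this method links to is the minimum Hellinger distance estimate for a Poisson mean, which downweights observations inconsistent with the model while remaining efficient at the model. The grid-search sketch below is a hedged illustration, not the paper's estimator.

```python
import numpy as np
from scipy.stats import poisson

def hellinger_poisson_mean(counts, grid=None):
    """Minimum (squared) Hellinger distance estimate of a Poisson mean,
    computed by grid search over the observed support."""
    xs, freq = np.unique(counts, return_counts=True)
    d = np.sqrt(freq / freq.sum())                 # sqrt of the empirical pmf
    if grid is None:
        grid = np.linspace(0.1, counts.max() + 1.0, 400)
    # squared Hellinger distance between empirical and model pmf at each lambda
    dists = [np.sum((d - np.sqrt(poisson.pmf(xs, lam))) ** 2) for lam in grid]
    return grid[int(np.argmin(dists))]
```

A gross outlier contributes an almost constant term to the distance near the true parameter, so it barely moves the estimate, whereas the MLE (the sample mean) is pulled toward it.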

20.
We consider the problem of model selection based on quantile analysis, with unknown parameters estimated by quantile least squares. We propose a model selection test for the null hypothesis that the competing models are equivalent against the alternative hypothesis that one model is closer to the true model. We follow with two applications of the proposed model selection test. The first application is to model selection for time series with non-normal innovations. The second application is to model selection in forecasting with the NoVaS (normalizing and variance stabilizing transformation) method. A set of simulation results also lends strong support to the results presented in the paper.
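For quantile-based comparison of competing models, a minimal Vuong-style sketch (my own illustration, not the paper's test): score each model's quantile predictions by the check (pinball) loss and test whether the average loss difference is zero.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (pinball) loss: tau*u for u >= 0, (tau-1)*u for u < 0."""
    return u * (tau - (u < 0))

def quantile_model_test(y, q1, q2, tau=0.5):
    """Vuong-style comparison of two quantile predictions: a t-statistic on
    the mean check-loss difference. Under H0 (equivalent models) it is
    asymptotically N(0,1); large negative values favour model 1, large
    positive values favour model 2."""
    d = check_loss(y - q1, tau) - check_loss(y - q2, tau)
    return np.sqrt(len(d)) * d.mean() / d.std(ddof=1)
```

With tau = 0.5 this compares median forecasts by absolute error; other tau values compare tail forecasts in the same way.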


Copyright © Beijing Qinyun Technology Development Co., Ltd. (京ICP备09084417号)