Similar articles
 Found 20 similar articles (search time: 46 ms)
1.
Summary.  Model selection for marginal regression analysis of longitudinal data is challenging owing to the presence of correlation and the difficulty of specifying the full likelihood, particularly for correlated categorical data. The paper introduces a novel Bayesian information criterion type model selection procedure based on the quadratic inference function, which does not require the full likelihood or quasi-likelihood. With probability approaching 1, the criterion selects the most parsimonious correct model. Although a working correlation matrix is assumed, there is no need to estimate the nuisance parameters in the working correlation matrix; moreover, the model selection procedure is robust against the misspecification of the working correlation matrix. The criterion proposed can also be used to construct a data-driven Neyman smooth test for checking the goodness of fit of a postulated model. This test is especially useful and often yields much higher power in situations where the classical directional test behaves poorly. The finite sample performance of the model selection and model checking procedures is demonstrated through Monte Carlo studies and analysis of a clinical trial data set.

2.
This paper develops a new approach for order selection in autoregressive moving average models using the focused information criterion. This criterion minimizes the asymptotic mean squared error of the estimator of a parameter of interest. Simulation studies indicate that the suggested criterion is quite effective and comparable to the Akaike information criterion, the corrected Akaike information criterion and the Bayesian information criterion in autoregressive moving average order selection. The use of the focused information criterion for the simultaneous selection of regression variables and order of the error process in a linear regression model with autoregressive moving average errors is also considered.

3.
An adaptive variable selection procedure is proposed which uses an adaptive test along with a stepwise procedure to select variables for a multiple regression model. We compared this adaptive stepwise procedure to methods that use Akaike's information criterion, Schwarz's information criterion, and Sawa's information criterion. The simulation studies demonstrated that the adaptive stepwise method is more effective than the traditional variable selection methods if the errors are not normally distributed. If the errors are known to be normally distributed, the variable selection method based on Sawa's information criterion appears to be superior to the other methods. Unless the errors are known to be normally distributed, the adaptive stepwise method is recommended.

4.
In this paper, tests for the skewness parameter of the two-piece double exponential distribution are derived when the location parameter is unknown. Classical tests such as the Neyman structure test and the likelihood ratio test (LRT), which are generally used to test hypotheses in the presence of nuisance parameters, are not feasible for this distribution since the exact distributions of the test statistics become very complicated. As an alternative, we identify a set of statistics that are ancillary for the location parameter. When the scale parameter is known, the Neyman–Pearson lemma is used, and when the scale parameter is unknown, the LRT is applied to the joint density function of the ancillary statistics, in order to obtain a test for the skewness parameter of the distribution. A test for symmetry of the distribution can be deduced as a special case. It is found that the power of the proposed tests for symmetry is only marginally less than the power of the corresponding classical optimum tests when the location parameter is known, especially for moderate and large sample sizes.

5.
Goodness of Fit via Non-parametric Likelihood Ratios   Total citations: 1, self-citations: 0, citations by others: 1
Abstract.  To test if a density $f$ is equal to a specified $f_0$, one knows by the Neyman–Pearson lemma the form of the optimal test at a specified alternative $f_1$. Any non-parametric density estimation scheme allows an estimate of $f$. This leads to estimated likelihood ratios. Properties are studied of tests which for the density estimation ingredient use log-linear expansions. Such expansions are either coupled with subset selectors like the Akaike information criterion and the Bayesian information criterion regimes, or use order growing with sample size. Our tests are generalized to testing the adequacy of general parametric models, and to work also in higher dimensions. The tests are related to, but are different from, the 'smooth tests' that go back to Neyman [Skandinavisk Aktuarietidsskrift 20 (1937) 149] and that have been studied extensively in recent literature. Our tests are large-sample equivalent to such smooth tests under local alternative conditions, but different from the smooth tests and often better under non-local conditions.

6.
The objective of this paper is to investigate exact slopes of test statistics $\{T_n\}$ when the random vectors $X_1, \ldots, X_n$ are distributed according to an unknown member of an exponential family $\{P_\theta;\ \theta \in \Omega\}$. Here $\Omega$ is a parameter set. We will be concerned with the hypothesis testing problem of $H_0: \theta \in \Omega_0$ vs $H_1: \theta \notin \Omega_0$, where $\Omega_0$ is a subset of $\Omega$. It will be shown that for an important class of problems and test statistics the exact slope of $\{T_n\}$ at $\eta$ in $\Omega - \Omega_0$ is determined by the shortest Kullback–Leibler distance from $\{\theta : T_n(\lambda(\theta)) = T_n(\lambda(\eta))\}$ to $\Omega_0$, where $\lambda(\theta) = E_\theta(X)$.

7.
Model selection problems arise while constructing unbiased or asymptotically unbiased estimators of measures known as discrepancies to find the best model. Most of the usual criteria are based on goodness-of-fit and parsimony. They aim to maximize a transformed version of likelihood. For linear regression models with normally distributed error, the situation is less clear when two models are equivalent: are they close to or far from the unknown true model? In this work, based on stochastic simulation and parametric simulation, we study the results of Vuong's test, Cox's test, Akaike's information criterion, Bayesian information criterion, Kullback information criterion and bias corrected Kullback information criterion and the ability of these tests to discriminate between non-nested linear models.

8.
Results from a power study of six statistics for testing that a sample is from a uniform distribution on the unit interval (0,1) are reported. The test statistics are all well known, and each was originally proposed because it was expected to have high power against some alternative distributions. The tests considered are the Pearson probability product test, the Neyman smooth test, the Sukhatme test, the Durbin-Kolmogorov test, the Kuiper test, and the Sherman test. Results are given for each of these tests against each of four classes of alternatives. Also, the most powerful test against each member of the first three alternatives is obtained, and the powers of these tests are given for the same sample sizes as for the six general "omnibus" test statistics. These values constitute a "power envelope" against which all tests can be compared. The Neyman smooth tests with 2nd and 4th degree polynomials are found to have good power and are recommended as general tests for uniformity.
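
As a rough illustration of the Neyman smooth test of uniformity mentioned in this abstract, the sketch below computes the order-k statistic from orthonormal Legendre polynomials on [0, 1] and compares it with its large-sample chi-square reference distribution with k degrees of freedom. This is the textbook form of the statistic, not necessarily the exact variant studied in the paper; the function names `neyman_smooth_statistic` and `chi2_sf_even_df` are hypothetical, and the closed-form tail probability is only valid for even k (such as the 2nd and 4th degree tests the abstract recommends).

```python
import math
import numpy as np
from numpy.polynomial import legendre


def neyman_smooth_statistic(u, k):
    """Order-k Neyman smooth statistic for H0: u ~ Uniform(0, 1) (hypothetical helper)."""
    u = np.asarray(u, dtype=float)
    n = u.size
    stat = 0.0
    for j in range(1, k + 1):
        coeffs = np.zeros(j + 1)
        coeffs[j] = 1.0  # picks out the degree-j Legendre polynomial P_j
        # Orthonormal basis on [0, 1]: pi_j(u) = sqrt(2j + 1) * P_j(2u - 1)
        pi_j = math.sqrt(2 * j + 1) * legendre.legval(2.0 * u - 1.0, coeffs)
        stat += (pi_j.sum() / math.sqrt(n)) ** 2
    return float(stat)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability for even df, using the closed-form Poisson sum."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))
```

A typical call would be `chi2_sf_even_df(neyman_smooth_statistic(sample, 4), 4)`, which gives an approximate p-value; for small samples the chi-square approximation may be poor and simulated critical values would be a safer choice.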

9.
In the problem of selecting variables in a multivariate linear regression model, we derive new Bayesian information criteria based on a prior mixing a smooth distribution and a delta distribution. Each of them can be interpreted as a fusion of the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Inheriting their asymptotic properties, our information criteria are consistent in variable selection in both the large-sample and the high-dimensional asymptotic frameworks. In numerical simulations, variable selection methods based on our information criteria choose the true set of variables with high probability in most cases.

10.
Sparsity-inducing penalties are useful tools for variable selection and are also effective for regression problems where the data are functions. We consider the problem of selecting not only variables but also decision boundaries in multiclass logistic regression models for functional data, using sparse regularization. The parameters of the functional logistic regression model are estimated in the framework of the penalized likelihood method with the sparse group lasso-type penalty, and then tuning parameters for the model are selected using the model selection criterion. The effectiveness of the proposed method is investigated through simulation studies and the analysis of a gene expression data set.

11.
Summary. We obtain the residual information criterion RIC, a selection criterion based on the residual log-likelihood, for regression models including classical regression models, Box–Cox transformation models, weighted regression models and regression models with autoregressive moving average errors. We show that RIC is a consistent criterion, and that simulation studies for each of the four models indicate that RIC provides better model order choices than the Akaike information criterion, corrected Akaike information criterion, final prediction error, $C_p$ and $R^2_{\mathrm{adj}}$, except when the sample size is small and the signal-to-noise ratio is weak. In this case, none of the criteria performs well. Monte Carlo results also show that RIC is superior to the consistent Bayesian information criterion BIC when the signal-to-noise ratio is not weak, and it is comparable with BIC when the signal-to-noise ratio is weak and the sample size is large.

12.
The autoregressive model is a popular method for analysing time-dependent data, for which selection of the order parameter is essential. Two commonly used selection criteria are the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which are known to be prone to overfitting and underfitting, respectively. To our knowledge, no criterion in the literature performs satisfactorily in all situations. Therefore, in this paper, we focus on forecasting the future values of an observed time series and propose an adaptive idea that combines the advantages of AIC and BIC while mitigating their weaknesses, based on the concept of generalized degrees of freedom. Instead of applying a fixed criterion to select the order parameter, we propose an approximately unbiased estimator of the mean squared prediction error, based on a data perturbation technique, for fairly comparing AIC and BIC. We then use the selected criterion to determine the final order parameter. Some numerical experiments are performed to show the superiority of the proposed method, and a real data set of the retail price index of China from 1952 to 2008 is also analysed for illustration.
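
For readers unfamiliar with the two baseline criteria discussed in this abstract, the following minimal sketch selects an autoregressive order by AIC and by BIC. It is not the adaptive method the paper proposes: it assumes a least-squares AR(p) fit with an intercept, the Gaussian forms AIC = n log(σ̂²) + 2k and BIC = n log(σ̂²) + k log(n), and hypothetical helper names (`ar_residual_variance`, `select_ar_order`). All candidate orders are fitted on the same final n − max_p observations so that the criteria are comparable.

```python
import numpy as np


def ar_residual_variance(x, p, max_p):
    """Least-squares AR(p) fit with intercept; returns the residual variance.

    The first max_p observations are reserved as initial lags for every p,
    so all candidate orders are fitted to the same response vector.
    """
    x = np.asarray(x, dtype=float)
    n_eff = len(x) - max_p
    y = x[max_p:]
    columns = [np.ones(n_eff)]
    for lag in range(1, p + 1):
        columns.append(x[max_p - lag: len(x) - lag])  # series lagged by `lag`
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid / n_eff)


def select_ar_order(x, max_p):
    """Return (order chosen by AIC, order chosen by BIC, AIC table, BIC table)."""
    n_eff = len(x) - max_p
    aic, bic = {}, {}
    for p in range(max_p + 1):
        sigma2 = ar_residual_variance(x, p, max_p)
        k = p + 1  # AR coefficients plus the intercept
        aic[p] = n_eff * np.log(sigma2) + 2 * k
        bic[p] = n_eff * np.log(sigma2) + k * np.log(n_eff)
    return min(aic, key=aic.get), min(bic, key=bic.get), aic, bic
```

Because the BIC penalty per parameter, log(n), exceeds the AIC penalty of 2 once n is larger than about 8, the order chosen by BIC is never larger than the one chosen by AIC in this setup, which is the overfit/underfit tension the abstract refers to.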

13.
In applications of generalized order statistics, such as reliability analysis of engineering systems, prior knowledge about the order of the underlying model parameters is often available and may therefore be incorporated in inferential procedures. Taking this information into account, we establish the likelihood ratio test, Rao's score test, and Wald's test for test problems arising from the question of appropriate model selection for ordered data, where simple order restrictions are put on the parameters under the alternative hypothesis. For simple and composite null hypotheses, explicit representations of the corresponding test statistics are obtained along with some properties and their asymptotic distributions. A simulation study is carried out to compare the order restricted tests in terms of their power. In the set-up considered, the adapted tests significantly improve the power of the associated omnibus versions for small sample sizes, especially when testing a composite null hypothesis.

14.
The comparison of an estimated parameter to its standard error, the Wald test, is a well known procedure of classical statistics. Here we discuss its application to graphical Gaussian model selection. First we derive the Fisher information matrix and its inverse about the parameters of any graphical Gaussian model. Both the covariance matrix and its inverse are considered and a comparative analysis of the asymptotic behaviour of their maximum likelihood estimators (m.l.e.s) is carried out. Then we give an example of model selection based on the standard errors. The method is shown to produce almost identical inference to likelihood ratio methods in the example considered.

15.
The focused information criterion for model selection is constructed to select the model that best estimates a particular quantity of interest, the focus, in terms of mean squared error. We extend this focused selection process to the high‐dimensional regression setting with potentially a larger number of parameters than the size of the sample. We distinguish two cases: (i) the case where the considered submodel is of low dimension and (ii) the case where it is of high dimension. In the former case, we obtain an alternative expression of the low‐dimensional focused information criterion that can directly be applied. In the latter case, we use a desparsified estimator that allows us to derive the mean squared error of the focus estimator. We illustrate the performance of the high‐dimensional focused information criterion with a numerical study and a real dataset.

16.
Testing Hypotheses in the Functional Linear Model   Total citations: 2, self-citations: 0, citations by others: 2
The functional linear model with scalar response is a regression model where the predictor is a random function defined on some compact set of $\mathbb{R}$ and the response is scalar. The response is modelled as $Y = \Psi(X) + \varepsilon$, where $\Psi$ is some linear continuous operator defined on the space of square integrable functions and valued in $\mathbb{R}$. The random input $X$ is independent of the noise $\varepsilon$. In this paper, we are interested in testing the null hypothesis of no effect, that is, the nullity of $\Psi$ restricted to the Hilbert space generated by the random variable $X$. We introduce two test statistics based on the norm of the empirical cross-covariance operator of $(X, Y)$. The first test statistic relies on a $\chi^2$ approximation and we show the asymptotic normality of the second one under appropriate conditions on the covariance operator of $X$. The test procedures can be applied to check a given relationship between $X$ and $Y$. The method is illustrated through a simulation study.

17.
We propose a new criterion for model selection in prediction problems. The covariance inflation criterion adjusts the training error by the average covariance of the predictions and responses, when the prediction rule is applied to permuted versions of the data set. This criterion can be applied to general prediction problems (e.g. regression or classification) and to general prediction rules (e.g. stepwise regression, tree-based models and neural nets). As a by-product we obtain a measure of the effective number of parameters used by an adaptive procedure. We relate the covariance inflation criterion to other model selection procedures and illustrate its use in some regression and classification problems. We also revisit the conditional bootstrap approach to model selection.

18.
19.
Non-parametric Estimation of the Residual Distribution   Total citations: 2, self-citations: 0, citations by others: 2
Consider a heteroscedastic regression model $Y = m(X) + \sigma(X)\varepsilon$, where the functions $m$ and $\sigma$ are "smooth", and $\varepsilon$ is independent of $X$. An estimator of the distribution of $\varepsilon$ based on non-parametric regression residuals is proposed and its weak convergence is obtained. Applications to prediction intervals and goodness-of-fit tests are discussed.
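
One plausible reading of "an estimator based on non-parametric regression residuals" is sketched below: estimate m and σ² by kernel smoothing, standardize the residuals, and take their empirical distribution function. This is an assumption made for illustration, not the paper's exact construction; the Gaussian-kernel Nadaraya-Watson smoother, the single shared bandwidth, and the names `nw_smooth` and `residual_ecdf` are all hypothetical choices.

```python
import numpy as np


def nw_smooth(x_train, values, x_eval, bandwidth):
    """Nadaraya-Watson estimate of E[values | x] at x_eval with a Gaussian kernel."""
    scaled = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * scaled ** 2)
    return (weights @ values) / weights.sum(axis=1)


def residual_ecdf(x, y, bandwidth):
    """Standardized residuals and their empirical CDF for Y = m(X) + sigma(X) * eps."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m_hat = nw_smooth(x, y, x, bandwidth)                     # estimate of m(X_i)
    var_hat = nw_smooth(x, (y - m_hat) ** 2, x, bandwidth)    # estimate of sigma^2(X_i)
    eps_hat = (y - m_hat) / np.sqrt(var_hat)                  # standardized residuals
    eps_sorted = np.sort(eps_hat)

    def cdf(t):
        # Fraction of standardized residuals that are <= t
        return np.searchsorted(eps_sorted, t, side="right") / len(eps_sorted)

    return eps_hat, cdf
```

Quantiles of the estimated residual distribution can then be combined with the fitted m and σ to form prediction intervals, one of the applications the abstract mentions; the bandwidth would need to be chosen with care in practice.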

20.
In this paper, a generalized partially linear model (GPLM) with missing covariates is studied, and a Monte Carlo EM (MCEM) algorithm with a penalized-spline (P-spline) technique is developed to estimate the regression coefficients and the nonparametric function, respectively. As classical model selection procedures such as Akaike's information criterion become invalid for the models considered here with incomplete data, some new model selection criteria for GPLMs with missing covariates are proposed under two different missingness mechanisms, namely missing at random (MAR) and missing not at random (MNAR). The most attractive feature of our method is that it is rather general and can be extended to various situations with missing observations based on the EM algorithm; in particular, when no data are missing, our new model selection criteria reduce to the classical AIC. Therefore, we can not only compare models with missing observations under MAR/MNAR settings, but also compare missing-data models with complete-data models simultaneously. Theoretical properties of the proposed estimator, including consistency of the model selection criteria, are investigated. A simulation study and a real example are used to illustrate the proposed methodology.
