Variable selection in the presence of outliers may be performed by using a robust version of Akaike's information criterion (AIC). In this paper, explicit expressions are obtained for such criteria when S- and MM-estimators are used. The performance of these criteria is compared with the existing AIC based on M-estimators and with the classical non-robust AIC. In a simulation study and in data examples, we observe that the proposed AIC with S- and MM-estimators selects more appropriate models when outliers are present.

Motivated by the papers of Woodward and Gray (1979) and Gray, Kelly and McIntire (1978) on the R and S array approach to ARMA modeling, the authors show that the R and S array algorithm is completely equivalent to Levinson recursion. Since entries in the R and S array can be computed by either algorithm, the equivalence provides greater insight into the R and S methodology as well as its links to Akaike's AIC or FPE. Numerical simulations serve to highlight the differences between the various approaches as well as illustrate the problems associated with exact methods. The R and S array approach is shown to be an effective procedure for determining ARMA model orders.
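For readers unfamiliar with the Levinson recursion mentioned in this abstract, the following is a minimal sketch of the textbook Levinson-Durbin recursion for the Yule-Walker equations of a pure autoregressive model. It is not the R and S array algorithm, which the abstract does not spell out; the function name `levinson_durbin` and its arguments are hypothetical choices made for illustration.

```python
def levinson_durbin(r, order):
    """Textbook Levinson-Durbin recursion (illustrative sketch).

    r     : autocovariances r[0], r[1], ..., r[order]
    order : AR order to fit
    Returns (phi, sigma2): phi[k-1] is the coefficient on lag k,
    sigma2 is the innovation variance of the fitted AR(order) model.
    """
    if len(r) < order + 1:
        raise ValueError("need autocovariances up to lag `order`")
    phi = []          # coefficients of the current AR(k-1) fit
    sigma2 = r[0]     # prediction-error variance of the order-0 model
    for k in range(1, order + 1):
        # Reflection coefficient, i.e. the partial autocorrelation at lag k.
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        kappa = acc / sigma2
        # Update the lower-order coefficients, then append the new one.
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        # Each step shrinks the prediction-error variance by (1 - kappa^2).
        sigma2 *= 1.0 - kappa * kappa
    return phi, sigma2


# Autocovariances of an AR(1) process with coefficient 0.5 and unit
# innovation variance: r[h] = 0.5**h / (1 - 0.25).
r = [0.5 ** h / 0.75 for h in range(3)]
phi, sigma2 = levinson_durbin(r, 2)
```

In this example the lag-2 reflection coefficient vanishes, which is the kind of signal that order-selection rules such as AIC or FPE build on: the sequence of `sigma2` values stops improving once the true order is reached.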

A variance components model with response variable depending on both fixed effects of explanatory variables and random components is specified to model longitudinal circular data, in order to study the directional behaviour of small animals, such as insects, crustaceans, amphipods, etc. Unknown parameter estimators are obtained using a simulated maximum likelihood approach. Issues concerning log-likelihood variability and the related problems in the optimization algorithm are also addressed. The procedure is applied to the analysis of directional choices under full natural conditions of Talitrus saltator from Castiglione della Pescaia (Italy) beaches.

We focus on the problem of selection of a subset of the variables so as to preserve the multivariate data structure that a principal-components analysis of the initial variables would reveal. We propose a new method based on some adapted Gaussian graphical models. This method is then compared with those developed by Bonifas et al. (1984) and Krzanowski (1987a, b). It appears that the criteria for all methods consider the same correlation submatrices and often lead to similar results. The proposed approach offers some guidance as to the number of variables to be selected. In particular, Akaike's information criterion is used.

The objective of this paper is to investigate through simulation the possible presence of the incidental parameters problem when performing frequentist model discrimination with stratified data. In this context, model discrimination amounts to considering a structural parameter taking values in a finite space, with k points, k≥2. This setting seems to have not yet been considered in the literature about the Neyman–Scott phenomenon. Here we provide Monte Carlo evidence of the severity of the incidental parameters problem also in the model discrimination setting and propose a remedy for a special class of models. In particular, we focus on models that are scale families in each stratum. We consider traditional model selection procedures, such as the Akaike and Takeuchi information criteria, together with the best frequentist selection procedure based on maximization of the marginal likelihood induced by the maximal invariant, or of its Laplace approximation. Results of two Monte Carlo experiments indicate that when the sample size in each stratum is fixed and the number of strata increases, correct selection probabilities for traditional model selection criteria may approach zero, unlike what happens for model discrimination based on exact or approximate marginal likelihoods. Finally, two examples with real data sets are given.

In regression analysis, the restricted principal components regression estimator has been proposed to deal with the problem of multicollinearity. In this paper, we compare the restricted principal components regression estimator, the principal components regression estimator, and the ordinary least-squares estimator with each other under Pitman's closeness criterion. We show that the restricted principal components regression estimator is always superior to the principal components regression estimator. We also show that, under certain conditions, the restricted principal components regression estimator is superior to the ordinary least-squares estimator under Pitman's closeness criterion and that, under certain conditions, the principal components regression estimator is superior to the ordinary least-squares estimator under the same criterion.

The classical growth curve model is considered when one continuous characteristic is measured at q time points. The covariance adjusted estimator of growth curve parameters is the OLS estimator adjusted using analysis of covariance. The covariates are obtained from functions of within individuals error contrasts. On the other hand, REML estimators emerge from maximization of the likelihood of OLS residuals. We compare the efficiency of estimators of growth curve parameters obtained by REML with that of covariance-adjusted least squares estimators with covariates selected via CAIC.

We investigate the effect of measurement error on principal component analysis in the high‐dimensional setting. The effects of random, additive errors are characterized by the expectation and variance of the changes in the eigenvalues and eigenvectors. The results show that the impact of uncorrelated measurement error on the principal component scores is mainly in terms of increased variability and not bias. In practice, the error‐induced increase in variability is small compared with the original variability for the components corresponding to the largest eigenvalues. This suggests that the impact will be negligible when these component scores are used in classification and regression or for visualizing data. However, the measurement error will contribute to a large variability in component loadings, relative to the loading values, such that interpretation based on the loadings can be difficult. The results are illustrated by simulating additive Gaussian measurement error in microarray expression data from cancer tumours and control tissues.

A multiple regression method based on distance analysis and metric scaling is proposed and studied. This method allows us to predict a continuous response variable from several explanatory variables, is compatible with the general linear model and is found to be useful when the predictor variables are both continuous and categorical. Real data examples are given to illustrate the results obtained.

We consider model selection for linear mixed-effects models with clustered structure, where conditional Kullback–Leibler (CKL) loss is applied to measure the efficiency of the selection. We estimate the CKL loss by substituting the empirical best linear unbiased predictors (EBLUPs) into random effects with model parameters estimated by maximum likelihood. Although the BLUP approach is commonly used in predicting random effects and future observations, selecting random effects to achieve asymptotic loss efficiency concerning CKL loss is challenging and has not been well studied. In this paper, we propose addressing this difficulty using a conditional generalized information criterion (CGIC) with two tuning parameters. We further consider a challenging but practically relevant situation where the number, $m$, of clusters does not go to infinity with the sample size. Hence the random-effects variances are not consistently estimable. We show that via a novel decomposition of the CKL risk, the CGIC achieves consistency and asymptotic loss efficiency, whether $m$ is fixed or increases to infinity with the sample size. We also conduct numerical experiments to illustrate the theoretical findings.

Different longitudinal study designs require different statistical analysis methods and different methods of sample size determination. Statistical power analysis is a flexible approach to sample size determination for longitudinal studies. However, different statistical tests require different power analyses, because the underlying statistical methods differ. In this paper, the simulation-based power calculations of F-tests with Containment, Kenward-Roger or Satterthwaite approximation of degrees of freedom are examined for sample size determination in the context of a special case of linear mixed models (LMMs), which is frequently used in the analysis of longitudinal data. Essentially, the roles of some factors, such as variance–covariance structure of random effects [unstructured UN or factor analytic FA0], autocorrelation structure among errors over time [independent IND, first-order autoregressive AR1 or first-order moving average MA1], parameter estimation methods [maximum likelihood ML and restricted maximum likelihood REML] and iterative algorithms [ridge-stabilized Newton-Raphson and Quasi-Newton] on statistical power of approximate F-tests in the LMM are examined together, which has not been considered previously. The greatest factor affecting statistical power is found to be the variance–covariance structure of random effects in the LMM. It appears that the simulation-based analysis in this study gives an interesting insight into statistical power of approximate F-tests for fixed effects in LMMs for longitudinal data.

A general four-parameter growth curve is presented as a model for the growth curve of a group of mice for which averaged weights of the group are available. Several data sets of mice weights obtained from experiments performed at the National Center for Toxicological Research are analyzed. The results are compared with traditional models for growth curves. Both additive and multiplicative error models are analyzed. It is shown that for these data the four-parameter model gives a much better fit than traditional growth curve models and should be given serious consideration in model fitting.

A growth curve analysis is often applied to estimate patterns of changes in a given characteristic of different individuals. It is also used to find out if the variations in the growth rates among individuals are due to effects of certain covariates. In this paper, a random coefficient linear regression model, as a special case of the growth curve analysis, is generalized to accommodate the situation where the set of influential covariates is not known a priori. Two different approaches for selecting influential covariates (a weighted stepwise selection procedure and a modified version of Rao and Wu's selection criterion) for the random slope coefficient of a linear regression model with unbalanced data are proposed. Performances of these methods are evaluated by means of Monte Carlo simulation. In addition, several methods (Maximum Likelihood, Restricted Maximum Likelihood, Pseudo Maximum Likelihood and Method of Moments) for estimating the parameters of the selected model are compared. Proposed variable selection schemes and estimators are applied to the actual industrial problem which motivated this investigation.

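A comprehensive evaluation method based on nonlinear principal components and cluster analysis (total citations: 5; self-citations: 0; citations by others: 5)
To address the shortcomings of traditional principal component analysis in handling nonlinear problems, this paper describes the drawbacks of "centered standardization" as a way of making data dimensionless and the limitations of the traditional method when the data are treated as "linear". It gives improved methods for making data dimensionless and for handling "nonlinear" data, and establishes a new comprehensive evaluation method based on "log-centered" nonlinear principal component analysis and cluster analysis. Experiments show that the method handles nonlinear data effectively.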
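The abstract does not define its "log-centering" transform, so the sketch below rests on an assumption: each strictly positive value is replaced by its natural logarithm and each variable (column) is then centered at the mean of its logs. The leading principal component is found by plain power iteration to keep the example dependency-free. The function names `log_center`, `covariance` and `leading_component` are hypothetical, and the paper's actual procedure, including its clustering step, may differ.

```python
import math


def log_center(rows):
    """Assumed log-centering: take ln of each entry, then subtract column means.

    Requires strictly positive data, since the logarithm is undefined otherwise.
    """
    logs = [[math.log(x) for x in row] for row in rows]
    n, p = len(logs), len(logs[0])
    means = [sum(row[j] for row in logs) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in logs]


def covariance(centered):
    """Sample covariance matrix of already-centered data (divisor n - 1)."""
    n, p = len(centered), len(centered[0])
    return [[sum(row[a] * row[b] for row in centered) / (n - 1)
             for b in range(p)] for a in range(p)]


def leading_component(cov, iters=500):
    """Largest eigenvalue and unit eigenvector of cov via power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:      # zero matrix: no direction to recover
            break
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return eigenvalue, v


# Two variables linked by the power law y = x**2: nonlinear on the raw
# scale, but exactly linear after taking logs.
data = [[1.0, 1.0], [2.0, 4.0], [4.0, 16.0], [8.0, 64.0]]
centered = log_center(data)
eigenvalue, direction = leading_component(covariance(centered))
```

On this toy data the log transform turns the power-law relation into a straight line, so a single component along the direction (1, 2) carries all the variance. That is the kind of nonlinearity a log-based preprocessing step can linearize; other nonlinear relations would not be handled this way.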

In the absence of quantitative clinical standards to detect serial changes in cardiograms, statistical procedures are proposed as an alternative. These procedures are preceded by a dimension-reducing orthonormal transformation of the original digitized cardiogram into a lower dimensional feature space. In feature space, multivariate test criteria are given for the detection of changes in covariance matrices or mean vectors of the cardiograms. The procedures allow the cardiograms of the same individual to be compared pairwise or simultaneously. Some pertinent remarks are also made about controlling the overall level of significance and its impact on the application of these techniques to cardiograms of USAF pilots.

Summary.  The paper investigates a Bayesian hierarchical model for the analysis of categorical longitudinal data from a large social survey of immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response and then subsequent terms are introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and the explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia.

Tanaka (1988) has derived the influence functions, which are equivalent to the perturbation expansions up to linear terms, of two functions of eigenvalues and eigenvectors of a real symmetric matrix, and applied them to principal component analysis. The present paper deals with the perturbation expansions up to quadratic terms of the same functions and discusses their application to sensitivity analysis in multivariate methods, in particular, principal component analysis and principal factor analysis. Numerical examples are given to show how the approximation improves with the quadratic terms.

The Gompertz distribution has been used as a growth model, especially in epidemiological and biomedical studies. Based on Type I and II censored samples from a heterogeneous population that can be represented by a finite mixture of two-component Gompertz lifetime model, the maximum likelihood and Bayes estimates of the parameters, reliability and hazard rate functions are obtained. An approximation form due to Lindley (1980) is used in obtaining the corresponding Bayes estimates. The maximum likelihood and Bayes estimates are compared via a Monte Carlo simulation study.

This note gives a multivariate version of Rolle's theorem and shows its usefulness in establishing the uniqueness of the root of the maximum likelihood equations, the so-called maximum likelihood equation estimator. The technique is used to prove uniqueness in two situations from the literature where the original proof of uniqueness was in error.
