
We consider multiple regression (MR) model averaging using the focused information criterion (FIC). Our approach is motivated by the problem of implementing a mean-variance portfolio choice rule. The usual approach is to estimate parameters ignoring the intention to use them in portfolio choice. We develop an estimation method that focuses on the trading rule of interest. Asymptotic distributions of submodel estimators in the MR case are derived using a localization framework. The localization is of both regression coefficients and error covariances. Distributions of submodel estimators are used for model selection with the FIC. This allows comparison of submodels using the risk of portfolio rule estimators. FIC model averaging estimators are then characterized. This extension further improves risk properties. We show in simulations that applying these methods in the portfolio choice case results in improved estimates compared with several competitors. An application to futures data shows superior performance as well.

Panel count data arise in many fields, and a number of estimation procedures have been developed along with two procedures for variable selection. In this paper, we discuss model selection and parameter estimation together. For the former, a focused information criterion (FIC) is presented, and for the latter, a frequentist model average (FMA) estimation procedure is developed. A main advantage of the FIC, and its difference from existing model selection methods, is that it emphasizes the accuracy of the estimation of the parameters of interest rather than of all parameters. Further efficiency gains can be achieved by the FMA estimation procedure because, unlike existing methods, it takes into account the variability in the model selection stage. Asymptotic properties of the proposed estimators are established, and a simulation study suggests that the proposed methods work well in practical situations. An illustrative example is also provided. © 2014 Board of the Foundation of the Scandinavian Journal of Statistics

We consider estimation of the tail index parameter from i.i.d. observations in Pareto and Weibull type models, using a local and asymptotic approach. The slowly varying function describing the non-tail behavior of the distribution is considered as an infinite dimensional nuisance parameter. Without further regularity conditions, we derive a local asymptotic normality (LAN) result for suitably chosen parametric submodels of the full semiparametric model. From this result, we immediately obtain the optimal rate of convergence of tail index parameter estimators for more specific models previously studied. On top of the optimal rate of convergence, our LAN result also gives the minimal limiting variance of estimators (regular for our parametric model) through the convolution theorem. We show that the classical Hill estimator is regular for the submodels introduced with limiting variance equal to the induced convolution theorem bound. We also discuss the Weibull model in this respect.
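As a point of reference for the classical Hill estimator mentioned in this abstract, a minimal sketch follows. It uses the textbook form, the mean of log-excesses of the k largest observations over the (k+1)-th largest, as an estimate of gamma = 1/alpha. The function name `hill_estimator`, the choice of k, and the simulated Pareto sample are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def hill_estimator(sample, k):
    """Hill estimate of gamma = 1/alpha from the k largest observations.

    Hypothetical helper for illustration; k is a tuning choice made by the user.
    """
    x = np.sort(np.asarray(sample, dtype=float))[::-1]  # descending order
    if not 1 <= k < x.size:
        raise ValueError("k must satisfy 1 <= k < n")
    if x[k] <= 0:
        raise ValueError("the (k+1)-th largest observation must be positive")
    # Mean log-excess of the top k order statistics over the (k+1)-th largest.
    return float(np.mean(np.log(x[:k]) - np.log(x[k])))


# Illustrative data (an assumption): classical Pareto with tail index alpha = 2,
# so the target value is gamma = 1/alpha = 0.5.
rng = np.random.default_rng(0)
alpha = 2.0
pareto_sample = rng.pareto(alpha, size=5000) + 1.0
gamma_hat = hill_estimator(pareto_sample, k=500)
```

In practice the estimate can be sensitive to k, so it is common to inspect it over a range of k values rather than rely on a single choice.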

In this paper we address the problem of estimating a vector of regression parameters in the Weibull censored regression model. Our main objective is to provide natural adaptive estimators that significantly improve upon the classical procedures in the situation where some of the predictors may or may not be associated with the response. In the context of two competing Weibull censored regression models (full model and candidate submodel), we consider an adaptive shrinkage estimation strategy that shrinks the full model maximum likelihood estimate in the direction of the submodel maximum likelihood estimate. We develop the properties of these estimators using the notion of asymptotic distributional risk. The shrinkage estimators are shown to have higher efficiency than the classical estimators for a wide class of models. Further, we consider a LASSO type estimation strategy and compare its relative performance with that of the shrinkage estimators. Monte Carlo simulations reveal that when the true model is close to the candidate submodel, the shrinkage strategy performs better than the LASSO strategy when, and only when, there are many inactive predictors in the model. Shrinkage and LASSO strategies are applied to a real data set from the Veterans Administration (VA) lung cancer study to illustrate the usefulness of the procedures in practice.

In this article, we propose a procedure for estimating the parameters of a joint model when there is a relationship between cluster size and the clustered failure times of subunits within a cluster. We use a joint random effect model of clustered failure times and cluster size. To investigate the possible association, the two submodels are connected by a common latent variable. The EM algorithm is applied to estimate the parameters of the models. Simulation studies are performed to assess the finite sample properties of the estimators. Sensitivity tests also show the influence of misspecifying the random effect distributions. The methods are applied to a lymphatic filariasis study of adult worm nests.

We consider settings where it is of interest to fit and assess regression submodels that arise as various explanatory variables are excluded from a larger regression model. The larger model is referred to as the full model; the submodels are the reduced models. We show that a computationally efficient approximation to the regression estimates under any reduced model can be obtained from a simple weighted least squares (WLS) approach based on the estimated regression parameters and covariance matrix from the full model. This WLS approach can be considered an extension to unbiased estimating equations of a first-order Taylor series approach proposed by Lawless and Singhal. Using data from the 2010 Nationwide Inpatient Sample (NIS), a 20% weighted, stratified, cluster sample of approximately 8 million hospital stays from approximately 1000 hospitals, we illustrate the WLS approach when fitting interval censored regression models to estimate the effect of type of surgery (robotic versus nonrobotic surgery) on hospital length-of-stay while adjusting for three sets of covariates: patient-level characteristics, hospital characteristics, and zip-code level characteristics. Ordinarily, standard fitting of the reduced models to the NIS data takes approximately 10 hours; using the proposed WLS approach, the reduced models take seconds to fit.
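The idea of recovering reduced-model estimates from the full-model fit alone can be sketched as a constrained weighted least squares problem: minimize the quadratic form in the full-model estimate and its covariance matrix, subject to the dropped coefficients being zero. The closed form below follows from that minimization. It is a sketch under that assumption and is not necessarily the authors' exact procedure; the function name `reduced_from_full` and the simulated data are hypothetical.

```python
import numpy as np


def reduced_from_full(beta_full, cov_full, drop):
    """Approximate reduced-model estimates using only the full-model fit.

    Minimizes (b_hat - b)' V^{-1} (b_hat - b) subject to b[drop] = 0, which gives
    b_keep = b_hat_keep - V_kd V_dd^{-1} b_hat_drop.
    Hypothetical helper for illustration.
    """
    beta_full = np.asarray(beta_full, dtype=float)
    cov_full = np.asarray(cov_full, dtype=float)
    drop = np.asarray(drop, dtype=int)
    keep = np.setdiff1d(np.arange(beta_full.size), drop)
    v_kd = cov_full[np.ix_(keep, drop)]
    v_dd = cov_full[np.ix_(drop, drop)]
    beta_reduced = beta_full[keep] - v_kd @ np.linalg.solve(v_dd, beta_full[drop])
    cov_reduced = cov_full[np.ix_(keep, keep)] - v_kd @ np.linalg.solve(v_dd, v_kd.T)
    return keep, beta_reduced, cov_reduced


# Illustrative check on simulated ordinary least squares data (an assumption),
# where the approximation coincides with an actual refit of the reduced model.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -0.5, 0.0, 0.0]) + rng.normal(size=200)
xtx_inv = np.linalg.inv(X.T @ X)
beta_hat = xtx_inv @ X.T @ y
resid = y - X @ beta_hat
sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
keep, beta_wls, cov_wls = reduced_from_full(beta_hat, sigma2 * xtx_inv, drop=[2, 3])
beta_refit = np.linalg.lstsq(X[:, keep], y, rcond=None)[0]
```

For linear least squares the two answers agree exactly; for nonlinear or censored regression models the formula is only a first-order approximation, which is the setting the abstract addresses.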

In this paper, we extend the focused information criterion (FIC) to copula models. Copulas are often used for applications where the joint tail behavior of the variables is of particular interest, and selecting a copula that captures this well is then essential. Traditional model selection methods such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) aim at finding the overall best‐fitting model, which is not necessarily the one best suited for the application at hand. The FIC, on the other hand, evaluates and ranks candidate models based on the precision of their point estimates of a context‐given focus parameter. This could be any quantity of particular interest, for example, the mean, a correlation, conditional probabilities, or measures of tail dependence. We derive FIC formulae for the maximum likelihood estimator, the two‐stage maximum likelihood estimator, and the so‐called pseudo‐maximum‐likelihood (PML) estimator combined with parametric margins. Furthermore, we confirm the validity of the AIC formula for the PML estimator combined with parametric margins. To study the numerical behavior of FIC, we have carried out a simulation study, and we have also analyzed a multivariate data set pertaining to abalones. The results from the study show that the FIC successfully ranks candidate models in terms of their performance, defined as how well they estimate the focus parameter. In terms of estimation precision, FIC clearly outperforms AIC, especially when the focus parameter relates to only a specific part of the model, such as the conditional upper‐tail probability.

Summary.  We consider the problem of combining inference in related nonparametric Bayes models. Analogous to parametric hierarchical models, the hierarchical extension formalizes borrowing strength across the related submodels. In the nonparametric context, modelling is complicated by the fact that the random quantities over which we define the hierarchy are infinite dimensional. We discuss a formal definition of such a hierarchical model. The approach includes a regression at the level of the nonparametric model. For the special case of Dirichlet process mixtures, we develop a Markov chain Monte Carlo scheme to allow efficient implementation of full posterior inference in the given model.

Linear regression models are useful statistical tools for analyzing data sets in different fields. There are several methods for estimating the parameters of a linear regression model. These methods usually assume normally distributed and uncorrelated errors. If the error terms are correlated, the Conditional Maximum Likelihood (CML) estimation method under a normality assumption is often used to estimate the parameters of interest. The CML estimation method requires a distributional assumption on the error terms. However, in practice, such distributional assumptions on the error terms may not be plausible. In this paper, we propose to estimate the parameters of a linear regression model with an autoregressive error term using the Empirical Likelihood (EL) method, which is a distribution-free estimation method. A small simulation study is provided to compare the performance of the proposed estimation method with that of the CML method. The results of the simulation study show that the proposed estimators based on the EL method are remarkably better than the estimators obtained from the CML method in terms of mean squared error (MSE) and bias in almost all the simulation configurations. These findings are also confirmed by the results of the numerical and real data examples.

We introduce dispersion models with a regression structure to extend the generalized linear models, the exponential family nonlinear models (Cordeiro and Paula, 1989) and the proper dispersion models (Jørgensen, 1997a). We provide a matrix expression for the skewness of the maximum likelihood estimators of the regression parameters in dispersion models. The formula is suitable for computer implementation and can be applied to several important submodels discussed in the literature. Expressions for the skewness of the maximum likelihood estimators of the precision and dispersion parameters are also derived. In particular, our results extend previous formulas obtained by Cordeiro and Cordeiro (2001) and Cavalcanti et al. (2009). A simulation study is performed to show the practical importance of our results.

The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences and give recommendations about the preferred approaches. We focus on variable subset selection for regression and classification and perform several numerical experiments using both simulated and real world data. The results show that the optimization of a utility estimate such as the cross-validation (CV) score is liable to find overfitted models due to relatively high variance in the utility estimates when the data are scarce. This can also lead to substantial selection induced bias and optimism in the performance evaluation for the selected model. From a predictive viewpoint, the best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on the CV score. Overall, the projection method also appears to outperform the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that model selection can greatly benefit from using cross-validation outside the searching process, both for guiding the model size selection and for assessing the predictive performance of the finally selected model.

This paper is concerned with a model averaging procedure for varying-coefficient partially linear models with missing responses. The profile least-squares estimation process and the inverse probability weighted method are employed to estimate the regression coefficients of the partially restricted models, in which the propensity score is estimated by the covariate balancing propensity score method. The estimators of the linear parameters are shown to be asymptotically normal. Then we develop the focused information criterion, formulate the frequentist model averaging estimators and construct the corresponding confidence intervals. Some simulation studies are conducted to examine the finite sample performance of the proposed methods. We find that the covariate balancing propensity score improves the performance of the inverse probability weighted estimator. We also demonstrate the superiority of the proposed model averaging estimators over those of existing strategies in terms of mean squared error and coverage probability. Finally, our approach is further applied to a real data example.

Research concerning hospital readmissions has mostly focused on statistical and machine learning models that attempt to predict this unfortunate outcome for individual patients. These models are useful in certain settings, but their performance in many cases is insufficient for implementation in practice, and the dynamics of how readmission risk changes over time are often ignored. Our objective is to develop a model for aggregated readmission risk over time – using a continuous-time Markov chain – beginning at the point of discharge. We derive point and interval estimators for readmission risk, and find the asymptotic distributions for these probabilities. Finally, we validate our derived estimators using simulation, and apply our methods to estimate readmission risk over time using discharge and readmission data for surgical patients.

Abstract. The short‐term and long‐term hazard ratio model includes the proportional hazards model and the proportional odds model as submodels, and allows a wider range of hazard ratio patterns compared with some of the more traditional models. We propose two omnibus tests for checking this model, based, respectively, on the martingale residuals and the contrast between the non‐parametric and model‐based estimators of the survival function. These tests are shown to be consistent against any departure from the model. The empirical behaviours of the tests are studied in simulations, and the tests are illustrated with some real data examples.

We study model selection and model averaging in semiparametric partially linear models with missing responses. An imputation method is used to estimate the linear regression coefficients and the nonparametric function. We show that the corresponding estimators of the linear regression coefficients are asymptotically normal. Then a focused information criterion and frequentist model average estimators are proposed and their theoretical properties are established. Simulation studies are performed to demonstrate the superiority of the proposed methods over the existing strategies in terms of mean squared error and coverage probability. Finally, the approach is applied to a real data case.

We consider a partially linear model in which the vector of coefficients β in the linear part can be partitioned as (β1, β2), where β1 is the coefficient vector for main effects (e.g. treatment effect, genetic effects) and β2 is a vector for ‘nuisance’ effects (e.g. age, laboratory). In this situation, inference about β1 may benefit from moving the least squares estimate for the full model in the direction of the least squares estimate without the nuisance variables (Steinian shrinkage), or from dropping the nuisance variables if there is evidence that they do not provide useful information (pretesting). We investigate the asymptotic properties of Stein‐type and pretest semiparametric estimators under quadratic loss and show that, under general conditions, a Stein‐type semiparametric estimator improves on the full model conventional semiparametric least squares estimator. The relative performance of the estimators is examined using asymptotic analysis of quadratic risk functions and it is found that the Stein‐type estimator outperforms the full model estimator uniformly. By contrast, the pretest estimator dominates the least squares estimator only in a small part of the parameter space, which is consistent with the theory. We also consider an absolute penalty‐type estimator for partially linear models and give a Monte Carlo simulation comparison of shrinkage, pretest and the absolute penalty‐type estimators. The comparison shows that the shrinkage method performs better than the absolute penalty‐type estimation method when the dimension of the β2 parameter space is large.
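The two strategies described in this abstract can be sketched in their commonly used textbook forms. The Stein-type estimator moves the submodel estimate of β1 toward the full-model estimate with weight 1 − (p2 − 2)/T, where T is a test statistic for β2 = 0 and p2 is the dimension of β2; the pretest estimator picks one of the two estimates depending on whether T exceeds a critical value. The exact statistic and weights used in the paper are not given in the abstract, so the function names and arguments below are assumptions for illustration only.

```python
import numpy as np


def stein_type(beta_full, beta_sub, test_stat, p2, positive_part=True):
    """Stein-type shrinkage of the full-model estimate toward the submodel estimate.

    Hypothetical helper: test_stat is a statistic for beta2 = 0 supplied by the
    caller, and p2 is the number of nuisance coefficients (needs p2 >= 3).
    """
    beta_full = np.asarray(beta_full, dtype=float)
    beta_sub = np.asarray(beta_sub, dtype=float)
    if p2 < 3:
        raise ValueError("Stein-type shrinkage needs p2 >= 3")
    if test_stat <= 0:
        raise ValueError("test_stat must be positive")
    weight = 1.0 - (p2 - 2) / test_stat
    if positive_part:
        # The positive-part version never overshoots past the submodel estimate.
        weight = max(weight, 0.0)
    return beta_sub + weight * (beta_full - beta_sub)


def pretest(beta_full, beta_sub, test_stat, critical_value):
    """Keep the full-model estimate only if the test rejects beta2 = 0."""
    chosen = beta_full if test_stat > critical_value else beta_sub
    return np.asarray(chosen, dtype=float)
```

A large test statistic leaves the estimate close to the full-model fit, while a small one pulls it toward (or, for the pretest rule, replaces it with) the submodel fit.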

Semiparametric accelerated failure time (AFT) models directly relate the expected failure times to covariates and are a useful alternative to models that work on the hazard function or the survival function. For case-cohort data, much less development has been done with AFT models. In addition to the missing covariates outside of the sub-cohort in controls, the challenges of AFT model inference with the full cohort are retained. The regression parameter estimator is hard to compute because the most widely used rank-based estimating equations are not smooth. Further, its variance depends on the unspecified error distribution, and most methods rely on a computationally intensive bootstrap to estimate it. We propose fast rank-based inference procedures for AFT models, applying recent methodological advances to the context of case-cohort data. Parameters are estimated with an induced smoothing approach that smooths the estimating functions and facilitates the numerical solution. Variance estimators are obtained through efficient resampling methods for nonsmooth estimating functions that avoid a full-blown bootstrap. Simulation studies suggest that the recommended procedure provides fast and valid inferences among several competing procedures. Application to a tumor study demonstrates the utility of the proposed method in routine data analysis.

In parametric regression models the sign of a coefficient often plays an important role in its interpretation. One possible approach to model selection in these situations is to consider a loss function that formulates prediction of the sign of a coefficient as a decision problem. Taking a Bayesian approach, we extend this idea of a sign based loss for selection to more complex situations. In generalized additive models we consider prediction of the sign of the derivative of an additive term at a set of predictors. Being able to predict the sign of the derivative at some point (that is, whether a term is increasing or decreasing) is one approach to selection of terms in additive modelling when interpretation is the main goal. For models with interactions, prediction of the sign of a higher order derivative can be used similarly. There are many advantages to our sign-based strategy for selection: one can work in a full or encompassing model without the need to specify priors on a model space and without needing to specify priors on parameters in submodels. Also, avoiding a search over a large model space can simplify computation. We consider shrinkage prior specifications on smoothing parameters that allow for good predictive performance in models with large numbers of terms without the need for selection, and a frequentist calibration of the parameter in our sign-based loss function when it is desired to control a false selection rate for interpretation.

This article investigates the asymptotic properties of quasi-maximum likelihood (QML) estimators for random-effects panel data transformation models where both the response and (some of) the covariates are subject to transformations for inducing normality, flexible functional form, homoskedasticity, and simple model structure. We develop a QML-type procedure for model estimation and inference. We prove the consistency and asymptotic normality of the QML estimators, and propose a simple bootstrap procedure that leads to a robust estimate of the variance-covariance (VC) matrix. Monte Carlo results reveal that the QML estimators perform well in finite samples, and that the gains by using the robust VC matrix estimate for inference can be enormous.

Some quality characteristics are best described by treating them as response variables and identifying their relationship to one or more independent variables. This relationship is called a profile. Parametric models, such as linear models, may be used to model profiles. However, due to the complexity of many processes in practical applications, it is often inappropriate to model the process using parametric models. In these cases, nonparametric methods are used to model the processes. One of the most widely applicable nonparametric methods for modelling complicated profiles is the wavelet. Many authors have considered the use of the wavelet transformation only for monitoring processes in phase II. The problem of estimating the in-control profile in phase I using the wavelet transformation has not been addressed in depth. Usually classical estimators are used in phase I to estimate the in-control profiles, even when the wavelet transformation is used. These estimators are suitable if the data do not contain outliers. However, when outliers exist, these estimators cannot estimate the in-control profile properly. In this research, a robust method of estimating the in-control profiles is proposed, which is insensitive to the presence of outliers and can be applied when the wavelet transformation is used. The proposed estimator is a combination of robust clustering and the S-estimator. This estimator is compared with the classical estimator of the in-control profile in the presence of outliers. The results from a large simulation study show that, using the proposed method, one can estimate the in-control profile precisely when the data are contaminated either locally or globally.
