Similar Documents (20 results)
1.
In data sets with many predictors, algorithms for identifying a good subset of predictors are often used. Most such algorithms do not allow for any relationships between predictors. For example, stepwise regression might select a model containing an interaction AB but neither main effect A nor B. This paper develops mathematical representations of this and other relations between predictors, which may then be incorporated in a model selection procedure. A Bayesian approach that goes beyond the standard independence prior for variable selection is adopted, and preference for certain models is interpreted as prior information. Priors relevant to arbitrary interactions and polynomials, dummy variables for categorical factors, competing predictors, and restrictions on the size of the models are developed. Since the relations developed are for priors, they may be incorporated in any Bayesian variable selection algorithm for any type of linear model. The application of the methods is illustrated via the stochastic search variable selection algorithm of George and McCulloch (1993), which is modified to utilize the new priors. The performance of the approach is illustrated with two constructed examples and a computer performance dataset.

2.
A number of articles have discussed the way lower-order polynomial and interaction terms should be handled in linear regression models. Only if all lower-order terms are included in the model will the regression model be invariant with respect to coding transformations of the variables. If lower-order terms are omitted, the regression model will not be well formulated. In this paper, we extend this work to examine the implications of the ordering of variables in the linear mixed-effects model. We demonstrate how linear transformations of the variables affect the model and tests of significance of fixed effects in the model. We show how the transformations modify the random effects in the model, as well as their covariance matrix and the value of the restricted log-likelihood. We suggest a variable selection strategy for the linear mixed-effects model.

3.
The authors consider the problem of constructing standardized maximin D-optimal designs for weighted polynomial regression models. In particular they show that by following the approach to the construction of maximin designs introduced recently by Dette, Haines & Imhof (2003), such designs can be obtained as weak limits of the corresponding Bayesian q-optimal designs. They further demonstrate that the results are more broadly applicable to certain families of nonlinear models. The authors examine two specific weighted polynomial models in some detail and illustrate their results by means of a weighted quadratic regression model and the Bleasdale–Nelder model. They also present a capstone example involving a generalized exponential growth model.

4.
In this article the problem of the optimal selection and allocation of time points in repeated measures experiments is considered. D-optimal designs for linear regression models with a random intercept and first-order auto-regressive serial correlations are computed numerically and compared with designs having equally spaced time points. When the order of the polynomial is known and the serial correlations are not too small, the comparison shows that for any fixed number of repeated measures, a design with equally spaced time points is almost as efficient as the D-optimal design. When, however, there is no prior knowledge about the order of the underlying polynomial, the best choice in terms of efficiency is a D-optimal design for the highest possible relevant order of the polynomial. A design with equally spaced time points is the second-best choice.

5.
The purpose of this article is to obtain the jackknifed ridge predictors in the linear mixed models and to examine the superiority of linear combinations of the jackknifed ridge predictors over the ridge, principal components regression, r-k class and Henderson's predictors in terms of bias, covariance matrix and mean square error criteria. Numerical analyses illustrate the findings, and a simulation study is conducted to assess the performance of the jackknifed ridge predictors.

6.
Distributions of a response y (height, for example) differ with values of a factor t (such as age). Given a response y* for a subject of unknown t*, the objective of inverse prediction is to infer the value of t* and to provide a defensible confidence set for it. Training data provide values of y observed on subjects at known values of t. Models relating the mean and variance of y to t can be formulated as mixed (fixed and random) models in terms of sets of functions of t, such as polynomial spline functions. A confidence set on t* can then be obtained as those hypothetical values of t for which y* is not detected as an outlier when compared to the model fit to the training data. With nonconstant variance, the p-values for these tests are approximate. This article describes how versatile models for this problem can be formulated in such a way that the computations can be accomplished with widely available software for mixed models, such as SAS PROC MIXED. Coverage probabilities of confidence sets on t* are illustrated in an example.

7.
The author identifies static optimal designs for polynomial regression models with or without intercept. His optimality criterion is an average of the D-optimality criterion for the estimation of low-degree terms and the Ds-optimality criterion for testing the significance of higher-degree terms. His work relies on classical results concerning canonical moments and the theory of continued fractions.

8.
The coefficient of determination, a.k.a. R2, is well-defined in linear regression models, and measures the proportion of variation in the dependent variable explained by the predictors included in the model. To extend it to generalized linear models, we use the variance function to define the total variation of the dependent variable, as well as the remaining variation of the dependent variable after modeling the predictive effects of the independent variables. Unlike other definitions that demand complete specification of the likelihood function, our definition of R2 only requires the mean and variance functions, so it is applicable to more general quasi-likelihood models. It is consistent with the classical measure of uncertainty using variance, and reduces to the classical definition of the coefficient of determination when linear regression models are considered.

9.
In contrast to the common belief that the logit model has no analytical solution, it is possible to find such a solution in the case of categorical predictors. This paper shows that a binary logistic regression on categorical explanatory variables can be constructed in closed form. No special software and no iterative procedures of nonlinear estimation are needed to obtain a model with all its parameters and characteristics, including the coefficients of regression, their standard errors and t-statistics, as well as the residual and null deviances. The derivation is performed for logistic models with one binary or categorical predictor, and with several binary or categorical predictors. The analytical formulae can be used for direct arithmetical calculation of all the parameters of the logit regression. The explicit expressions for the characteristics of logit regression are convenient for the analysis and interpretation of the results of logistic modeling.
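As a hedged illustration of the simplest case (one binary predictor), the closed-form coefficients are the log odds in the reference group and the log odds ratio, with the usual Wald standard errors; this is not the authors' code, and the 2x2 counts below are hypothetical. The sketch checks the closed-form values against an iterative maximum-likelihood fit:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical 2x2 table: a = (x=0, y=1), b = (x=0, y=0), c = (x=1, y=1), d = (x=1, y=0)
a, b, c, d = 30, 70, 55, 45

# Closed-form logistic regression for one binary predictor:
beta0 = np.log(a / b)                         # intercept = log-odds of y=1 when x=0
beta1 = np.log((c / d) / (a / b))             # slope = log odds ratio
se0 = np.sqrt(1 / a + 1 / b)                  # Wald SE of the intercept
se1 = np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Wald SE of the slope
print("closed form:", beta0, beta1, se0, se1)

# Check against an iterative ML fit on the expanded data
x = np.repeat([0, 0, 1, 1], [a, b, c, d])
y = np.repeat([1, 0, 1, 0], [a, b, c, d])
res = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
print("iterative ML:", res.params, res.bse)
```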

10.
In the common linear model with quantitative predictors we consider the problem of designing experiments for estimating the slope of the expected response in a regression. We discuss locally optimal designs, where the experimenter is only interested in the slope at a particular point, and standardized minimax optimal designs, which could be used if precise estimation of the slope over a given region is required. General results on the number of support points of locally optimal designs are derived if the regression functions form a Chebyshev system. For polynomial regression and Fourier regression models of arbitrary degree the optimal designs for estimating the slope of the regression are determined explicitly for many cases of practical interest.
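As a point of reference (my own summary of the standard c-optimality formulation for slope estimation, not a formula quoted from the article): writing the model as $E[y \mid x] = f(x)^{\top}\theta$ with regression functions $f = (f_1, \dots, f_k)^{\top}$, the variance of the estimated slope at a point $x_0$ is proportional to

$$
f'(x_0)^{\top} M(\xi)^{-1} f'(x_0), \qquad M(\xi) = \int f(x) f(x)^{\top}\, d\xi(x),
$$

where $\xi$ denotes the design. A locally optimal design minimizes this quantity at the chosen $x_0$, while a standardized minimax design controls the corresponding efficiency uniformly over the given region.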

11.
Model summaries based on the ratio of fitted and null likelihoods have been proposed for generalised linear models, reducing to the familiar R2 coefficient of determination in the Gaussian model with identity link. In this note I show how to define the Cox–Snell and Nagelkerke summaries under arbitrary probability sampling designs, giving a design-consistent estimator of the population model summary. It is also shown that for logistic regression models under case–control sampling the usual Cox–Snell and Nagelkerke R2 are not design-consistent, but are systematically larger than would be obtained with a cross-sectional or cohort sample from the same population, even in settings where the weighted and unweighted logistic regression estimators are similar or identical. Implementation of the new estimators is straightforward and code is provided in R.
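For orientation, a minimal Python sketch of the standard (unweighted, cross-sectional) Cox–Snell and Nagelkerke summaries is given below; the design-consistent, survey-weighted versions discussed in the article would replace the log-likelihoods and sample size by weighted analogues, which is not shown here. The data-generating step is purely illustrative:

```python
import numpy as np
import statsmodels.api as sm

# Illustrative data (assumed, not from the article)
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 1.2 * x))))

res = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
ll_full, ll_null = res.llf, res.llnull

# Cox-Snell: 1 - (L0 / L1)^(2/n); Nagelkerke rescales it so the maximum is 1
r2_cs = 1 - np.exp(2 * (ll_null - ll_full) / n)
r2_nagelkerke = r2_cs / (1 - np.exp(2 * ll_null / n))
print(r2_cs, r2_nagelkerke)
```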

12.
The local polynomial quasi-likelihood estimator has several good statistical properties, such as high minimax efficiency and adaptation to edge effects. In this paper, we construct a local quasi-likelihood regression estimator for a left-truncated model and establish the asymptotic normality of the proposed estimator when the observations form a stationary and α-mixing sequence, thereby extending the corresponding result of Fan et al. [Local polynomial kernel regression for generalized linear models and quasilikelihood functions, J. Amer. Statist. Assoc. 90 (1995), pp. 141–150] from independent and complete data to dependent and truncated data. Finite-sample behaviour of the estimator is also investigated via simulations.

13.
Principal fitted component (PFC) models are a class of likelihood-based inverse regression methods that yield a so-called sufficient reduction of the random p-vector of predictors X given the response Y. Assuming that a large number of the predictors carry no information about Y, we aim to obtain an estimate of the sufficient reduction that 'purges' these irrelevant predictors, and thus select the most useful ones. We devise a procedure using observed significance values from the univariate fittings to yield a sparse PFC, a purged estimate of the sufficient reduction. The performance of the method is compared to that of penalized forward linear regression models for variable selection in high-dimensional settings.

14.
We propose a new summary tool, the so-called average predictive comparison (APC), which summarizes the effect of a particular predictor in a regression context. Unlike the definition in our earlier work (Liu and Gustafson, 2008), the new definition allows a pointwise evaluation of a predictor's effect at any given value of that predictor. We employ this summary tool to examine the consequences of erroneously omitting interactions in regression models. To accommodate curved relationships between a response variable and predictors, we consider fractional polynomial regression models (Royston and Altman, 1994). We derive the asymptotic properties of the APC estimates under a general setting with p (≥2) predictors. In particular, when there are only two predictors of interest, we find that the APC estimator is robust to model misspecification under certain conditions. We illustrate the application of the proposed summary tool via a real data example. We also conduct simulation experiments to further assess the performance of the APC estimates.

15.
When employing model selection methods with oracle properties such as the smoothly clipped absolute deviation (SCAD) and the Adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example with m = 10. In problems where the true regression function is sparse and the signals large, such cross-validation typically works well. However, in regression modeling of genomic studies involving Single Nucleotide Polymorphisms (SNPs), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems, the number of selected variables using SCAD and the Adaptive Lasso, with 10-fold cross-validation, is a random variable that has considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, and not just the SCAD and Adaptive Lasso.
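To make the variability concrete, here is a hedged Python sketch that repeats 10-fold cross-validated tuning on the same data and records how many variables are selected each time. It uses scikit-learn's plain Lasso as a stand-in for SCAD and the Adaptive Lasso (the abstract notes the same behaviour for the Lasso), and the sparse, weak-signal design is my own assumption, not the article's SNP data:

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold

# Assumed sparse, weak-signal design (illustrative only)
rng = np.random.default_rng(1)
n, p = 200, 500
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 0.3                        # few, small signals
y = X @ beta + rng.normal(size=n)

sizes = []
for seed in range(20):
    cv = KFold(n_splits=10, shuffle=True, random_state=seed)
    fit = LassoCV(cv=cv).fit(X, y)
    sizes.append(int(np.sum(fit.coef_ != 0)))   # number of selected variables

print("selected model sizes over 20 CV runs:", sizes)
```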

16.
17.
This paper sets out to implement the Bayesian paradigm for fractional polynomial models under the assumption of normally distributed error terms. Fractional polynomials widen the class of ordinary polynomials and offer an additive and transportable modelling approach. The methodology is based on a Bayesian linear model with a quasi-default hyper-g prior and combines variable selection with parametric modelling of additive effects. A Markov chain Monte Carlo algorithm for the exploration of the model space is presented. This theoretically well-founded stochastic search constitutes a substantial improvement over ad hoc stepwise procedures for the fitting of fractional polynomial models. The method is applied to a data set on the relationship between ozone levels and meteorological parameters, previously analysed in the literature.
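For readers unfamiliar with fractional polynomials, the sketch below (my own illustration, not the article's code) builds the degree-2 fractional polynomial basis of Royston and Altman (1994), where powers are drawn from the conventional set {-2, -1, -0.5, 0, 0.5, 1, 2, 3}, the power 0 denotes log(x), and a repeated power p contributes x^p and x^p·log(x):

```python
import numpy as np

FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)   # conventional Royston-Altman power set

def fp_term(x, p):
    """Single fractional polynomial term; power 0 means log(x). Requires x > 0."""
    return np.log(x) if p == 0 else x ** p

def fp2_basis(x, p1, p2):
    """Degree-2 FP design matrix for powers (p1, p2); a repeated power adds a log factor."""
    t1 = fp_term(x, p1)
    t2 = fp_term(x, p2) * np.log(x) if p1 == p2 else fp_term(x, p2)
    return np.column_stack([np.ones_like(x), t1, t2])

# Example: FP2 with repeated power -1, i.e. terms 1/x and log(x)/x
x = np.linspace(0.5, 5.0, 10)
print(fp2_basis(x, -1, -1))
```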

18.
Hedonic price models are commonly used in the study of markets for various goods, most notably those for wine, art, and jewelry. These models were developed to estimate implicit prices of product attributes within a given product class, where in the case of some goods, such as wine, substantial product differentiation exists. To address this issue, recent research on wine prices employs local polynomial regression clustering (LPRC) for estimating regression models under class uncertainty. This study demonstrates that a superior empirical approach, estimation of a mixture model, is applicable to a hedonic model of wine prices, provided only that the dependent variable in the model is rescaled. The present study also catalogues several advantages of estimating mixture models over LPRC modeling.

19.
In this article, we consider the problem of selecting functional variables using L1 regularization in a functional linear regression model with a scalar response and functional predictors, in the presence of outliers. Since the LASSO is a special case of penalized least-squares regression with an L1 penalty function, it suffers in the presence of heavy-tailed errors and/or outliers in the data. Recently, Least Absolute Deviation (LAD) and LASSO methods have been combined (the LAD-LASSO regression method) to carry out robust parameter estimation and variable selection simultaneously for multiple linear regression models. However, variable selection of functional predictors based on the LASSO fails, since multiple parameters exist for each functional predictor. Therefore, the group LASSO is used for selecting functional predictors, since it selects grouped variables rather than individual variables. In this study, we propose a robust functional predictor selection method, the LAD-group LASSO, for a functional linear regression model with a scalar response and functional predictors. We illustrate the performance of the LAD-group LASSO on both simulated and real data.

20.
Although the effect of missing data on regression estimates has received considerable attention, their effect on predictive performance has been neglected. We studied the performance of three missing data strategies (omission of records with missing values, replacement with a mean, and imputation based on regression) on the predictive performance of logistic regression (LR), classification tree (CT) and neural network (NN) models in the presence of data missing completely at random (MCAR). Models were constructed using datasets of size 500 simulated from a joint distribution of binary and continuous predictors including nonlinearities, collinearity and interactions between variables. Though omission produced models that fit better on the data from which the models were developed, imputation was superior on average to omission for all models when evaluating the receiver operating characteristic (ROC) curve area, mean squared error (MSE), pooled variance across outcome categories and calibration χ2 on an independently generated test set. However, in about one-third of simulations, omission performed better. Performance was also more variable with omission, including quite a few instances of extremely poor performance. Replacement and imputation generally produced similar results, except with neural networks, for which replacement, the strategy typically used in neural network algorithms, was inferior to imputation. Missing data affected simpler models much less than they did more complex models, such as generalized additive models that focus on local structure. For moderate-sized datasets, logistic regressions that use simple nonlinear structures, such as quadratic terms and piecewise linear splines, appear to be at least as robust to randomly missing values as neural networks and classification trees.
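A minimal Python sketch of this kind of comparison under MCAR is given below; it uses scikit-learn's SimpleImputer (mean replacement) and IterativeImputer (regression-based imputation) with a logistic regression, on a small simulated data set chosen purely for illustration rather than the article's simulation design:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import SimpleImputer, IterativeImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)

def simulate(n=500):
    X = rng.normal(size=(n, 4))
    logit = 0.8 * X[:, 0] - 0.6 * X[:, 1] + 0.5 * X[:, 2] * X[:, 3]
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))
    return X, y

X_tr, y_tr = simulate()
X_te, y_te = simulate()
mask = rng.random(X_tr.shape) < 0.2          # 20% of training predictor values MCAR
X_mis = np.where(mask, np.nan, X_tr)

# Strategy 1: omit records with any missing value
keep = ~np.isnan(X_mis).any(axis=1)
auc_omit = roc_auc_score(
    y_te, LogisticRegression().fit(X_mis[keep], y_tr[keep]).predict_proba(X_te)[:, 1])
print("omission", round(auc_omit, 3))

# Strategies 2 and 3: mean replacement vs regression-based imputation
for name, imp in [("mean", SimpleImputer()), ("regression", IterativeImputer(random_state=0))]:
    Xi = imp.fit_transform(X_mis)
    auc = roc_auc_score(y_te, LogisticRegression().fit(Xi, y_tr).predict_proba(X_te)[:, 1])
    print(name, round(auc, 3))
```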
