Similar literature
Found 20 similar documents (search time: 15 ms)
1.
The weighted generalized estimating equation (WGEE), an extension of the generalized estimating equation (GEE) method, is a method for analyzing incomplete longitudinal data. An inappropriate specification of the working correlation structure results in a loss of efficiency of the GEE estimation. In this study, we evaluated the efficiency of WGEE estimation for incomplete longitudinal data when the working correlation structure was misspecified. We found that the efficiency of the WGEE estimation was lower when an improper working correlation structure was selected, similar to the case of the GEE method. Furthermore, we modified the criterion proposed by Gosho et al. (2011) for selecting a working correlation structure, such that the GEE and WGEE methods can be applied to incomplete longitudinal data, and we investigated the performance of the modified criterion. The results revealed that, when the modified criterion was adopted, the true correlation structure tended to be selected in a higher proportion of cases than with other competing approaches. Reference: Gosho, M., Hamada, C. and Yoshimura, I. (2011). Criterion for the selection of a working correlation structure in the generalized estimating equation approach for longitudinal balanced data. Communications in Statistics - Theory and Methods, 40: 3839-3856.

2.
In this article, we study the stepwise AIC method for variable selection and compare it with other stepwise methods for variable selection, such as partial F, partial correlation, and semi-partial correlation, in linear regression modeling. We then show mathematically that the stepwise AIC method and the other stepwise methods lead to the same method as partial F. Hence, there are more reasons to use the stepwise AIC method than the other stepwise methods for variable selection, since the stepwise AIC method is a model selection method that can be easily managed, can be widely extended to more generalized models, and can be applied to non-normally distributed data. We also treat two problems that always appear in applications: validation of the selected variables and collinearity.
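One way to see the link between a stepwise AIC rule and a partial F rule is sketched below for the Gaussian linear model. The notation is assumed here for illustration and is not taken from the article, whose own derivation may differ: n is the sample size, p is the number of coefficients in the smaller model, and RSS_0 and RSS_1 are the residual sums of squares before and after adding one candidate variable. AIC is written up to an additive constant.

```latex
\begin{align*}
\mathrm{AIC}_0 &= n \ln\!\left(\frac{\mathrm{RSS}_0}{n}\right) + 2p, \qquad
\mathrm{AIC}_1 = n \ln\!\left(\frac{\mathrm{RSS}_1}{n}\right) + 2(p+1), \\
\mathrm{AIC}_1 < \mathrm{AIC}_0
  &\iff \frac{\mathrm{RSS}_0}{\mathrm{RSS}_1} > e^{2/n} \\
  &\iff F = \frac{\mathrm{RSS}_0 - \mathrm{RSS}_1}{\mathrm{RSS}_1/(n-p-1)}
      > (n-p-1)\left(e^{2/n} - 1\right).
\end{align*}
```

Under these assumptions, entering a variable by AIC amounts to a partial F rule whose cutoff approaches 2 as n grows with p fixed.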

3.
ABSTRACT

Stepwise regression building procedures are commonly used applied statistical tools, despite their well-known drawbacks. While many of their limitations have been widely discussed in the literature, other aspects of the use of individual statistical fit measures, especially in high-dimensional stepwise regression settings, have not. Giving primacy to individual fit, as is done with p-values and R², when group fit may be the larger concern, can lead to misguided decision making. One of the most consequential uses of stepwise regression is in health care, where these tools allocate hundreds of billions of dollars to health plans enrolling individuals with different predicted health care costs. The main goal of this “risk adjustment” system is to convey incentives to health plans such that they provide health care services fairly, a component of which is not to discriminate in access or care for persons or groups likely to be expensive. We address some specific limitations of p-values and R² for high-dimensional stepwise regression in this policy problem through an illustrated example by additionally considering a group-level fairness metric.

4.
In “stepwise” regression analysis, the usual procedure enters or removes variables at each “step” on the basis of testing whether certain partial correlation coefficients are zero. An alternative method suggested in this paper involves testing the hypothesis that the mean square error of prediction does not decrease from one step to the next. This is equivalent to testing that the partial correlation coefficient is equal to a certain nonzero constant. For sample sizes sufficiently large, Fisher's z transformation can be used to obtain an asymptotically UMP unbiased test. The two methods are contrasted with an example involving actual data.
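The Fisher z step mentioned above can be illustrated with a minimal sketch. It assumes the standard large-sample result that atanh(r) for a partial correlation with k variables partialled out is approximately normal with variance 1/(n - k - 3). The function name `fisher_z_test`, its arguments, and the two-sided form are illustrative assumptions; the paper's own test (and the specific nonzero constant it derives) may differ.

```python
import math


def fisher_z_test(r, rho0, n, k):
    """Approximate two-sided z test of H0: partial correlation == rho0.

    r    : sample partial correlation, |r| < 1
    rho0 : hypothesized value (may be nonzero), |rho0| < 1
    n    : sample size
    k    : number of variables partialled out (assumed notation)
    """
    dof = n - k - 3
    if dof <= 0:
        raise ValueError("need n - k - 3 > 0 for the normal approximation")
    # Fisher's z transformation is the inverse hyperbolic tangent.
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(dof)
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


z_stat, p_val = fisher_z_test(r=0.6, rho0=0.3, n=103, k=0)
print(f"z = {z_stat:.3f}, p = {p_val:.5f}")
```

Setting `rho0=0` gives the usual test that a partial correlation is zero, so the same function covers both procedures contrasted in the abstract.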

5.
Stepwise methods for variable selection are frequently used to determine the predictors of an outcome in generalized linear models. Although they are widely used within the scientific community, it is well known that the tests on the explained deviance of the selected model are biased. This arises from the fact that the traditional test statistics upon which these methods are based were intended for testing pre-specified hypotheses; instead, the tested model is selected through a data-driven procedure. A multiplicity problem therefore arises. In this work, we define and discuss a nonparametric procedure to adjust the p-value of the selected model of any stepwise selection method. The unbiasedness and consistency of the method are also proved. A simulation study shows the validity of this procedure. Theoretical differences with previous works in the same field are also discussed.

6.
Abstract

Errors-in-variable (EIV) regression is often used to gauge the linear relationship between two variables that both suffer from measurement and other errors, such as in the comparison of two measurement platforms (e.g., RNA sequencing vs. microarray). Scientists are often at a loss as to which EIV regression model to use, for there are infinitely many choices. We provide sound guidelines toward viable solutions to this dilemma by introducing two general nonparametric EIV regression frameworks: the compound regression and the constrained regression. It is shown that these approaches are equivalent to each other and to the general parametric structural modeling approach. The advantages of these methods lie in their intuitive geometric representations, their distribution-free nature, and their ability to offer candidate solutions with various optimal properties when the ratio of the error variances is unknown. Each includes the classic nonparametric regression methods of ordinary least squares, geometric mean regression (GMR), and orthogonal regression as special cases. Under these general frameworks, one can readily uncover some surprising optimal properties of the GMR, and truly comprehend the benefit of data normalization. Supplementary materials for this article are available online.
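The three classical special cases named above can be compared side by side with a short sketch. It uses the textbook slope formulas for ordinary least squares (y on x), geometric mean regression, and orthogonal regression. It does not implement the compound or constrained regression frameworks themselves, and the function name `line_slopes` is a hypothetical helper introduced for illustration.

```python
import math


def line_slopes(x, y):
    """Return the OLS, GMR, and orthogonal regression slopes of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0 or sxy == 0:
        raise ValueError("slopes are undefined for degenerate or uncorrelated data")
    return {
        # OLS attributes all error to y.
        "ols": sxy / sxx,
        # GMR: ratio of standard deviations, signed like the covariance.
        "gmr": math.copysign(math.sqrt(syy / sxx), sxy),
        # Orthogonal regression: minimizes perpendicular distances
        # (equal error variances in x and y).
        "orthogonal": (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2))
        / (2 * sxy),
    }


slopes = line_slopes([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(slopes)
```

On noisy data the OLS slope is never larger in magnitude than the GMR slope, which is one simple way to see that the choice among these lines depends on how error is assumed to be split between the two variables.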

7.
This article aims to put forward a new method for solving linear quantile regression problems based on the EM algorithm, using a location-scale mixture representation of the asymmetric Laplace error distribution. A closed form of the EM-based estimator of the unknown parameter vector β is obtained. In addition, some simulations are conducted to illustrate the performance of the proposed method. Simulation results demonstrate that the proposed algorithm performs well. Finally, the classical Engel data is fitted and bootstrap confidence intervals for the estimators are provided.

8.
An adaptive variable selection procedure is proposed which uses an adaptive test along with a stepwise procedure to select variables for a multiple regression model. We compared this adaptive stepwise procedure to methods that use Akaike's information criterion, Schwarz's information criterion, and Sawa's information criterion. The simulation studies demonstrated that the adaptive stepwise method is more effective than the traditional variable selection methods if the errors are not normally distributed. If the error distribution is known to be normal, the variable selection method based on Sawa's information criterion appears to be superior to the other methods. Unless the error distribution is known to be normal, the adaptive stepwise method is recommended.

9.
Summary. It is well known that in a sequential study the probability that the likelihood ratio for a simple alternative hypothesis H1 versus a simple null hypothesis H0 will ever be greater than a positive constant c will not exceed 1/c under H0. However, for a composite alternative hypothesis, this bound of 1/c will no longer hold when a generalized likelihood ratio statistic is used. We consider a stepwise likelihood ratio statistic which, for each new observation, is updated by cumulatively multiplying the ratio of the conditional likelihoods for the composite alternative hypothesis evaluated at an estimate of the parameter obtained from the preceding observations versus the simple null hypothesis. We show that, under the null hypothesis, the probability that this stepwise likelihood ratio will ever be greater than c will not exceed 1/c. In contrast, under the composite alternative hypothesis, this ratio will generally converge in probability to ∞. These results suggest that a stepwise likelihood ratio statistic can be useful in a sequential study for testing a composite alternative versus a simple null hypothesis. For illustration, we conduct two simulation studies, one for a normal response and one for an exponential response, to compare the performance of a sequential test based on a stepwise likelihood ratio statistic with a constant boundary versus some existing approaches.
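The update rule described above can be sketched for a normal response. The sketch assumes a known standard deviation, a simple null mean `mu0`, the mean of the preceding observations as the parameter estimate, and a factor of 1 for the first observation (no preceding data). These choices and the names `stepwise_log_lr` and `first_crossing` are illustrative assumptions, not the authors' specification.

```python
import math


def stepwise_log_lr(xs, mu0=0.0, sigma=1.0):
    """Running log of a stepwise likelihood ratio for normal data.

    Each new observation multiplies the ratio by
    f(x; mu_hat_prev, sigma) / f(x; mu0, sigma), where mu_hat_prev is the
    mean of the observations seen before x.
    """
    log_lr = 0.0
    running_sum = 0.0
    path = []
    for i, x in enumerate(xs):
        if i > 0:  # the first observation has no preceding estimate
            mu_hat = running_sum / i
            log_lr += ((x - mu0) ** 2 - (x - mu_hat) ** 2) / (2.0 * sigma ** 2)
        path.append(log_lr)
        running_sum += x
    return path


def first_crossing(path, c):
    """1-based index at which the ratio first exceeds c, or None if never."""
    log_c = math.log(c)
    for i, value in enumerate(path):
        if value > log_c:
            return i + 1
    return None


path = stepwise_log_lr([1.0, 1.0, 1.0, 1.0])
print(path, first_crossing(path, c=2.0))
```

Because each factor uses an estimate fixed before the new observation arrives, the running ratio has conditional expectation equal to its previous value under the null, which is the property behind the 1/c bound stated in the abstract.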

10.
This article primarily aims to put forward the linearized restricted ridge regression (LRRR) estimator in linear regression models. Two types of LRRR estimators are investigated under the PRESS criterion, and the optimal LRRR estimators and the optimal restricted generalized ridge regression estimator are obtained. We apply the results to the Hald data and finally conduct a simulation study using the method of McDonald and Galarneau.

11.
Many areas of statistical modeling are plagued by the “curse of dimensionality,” in which there are more variables than observations. This is especially true when developing functional regression models where the independent dataset is some type of spectral decomposition, such as data from near-infrared spectroscopy. While we could develop a very complex model by simply taking enough samples (such that n > p), this could prove impossible or prohibitively expensive. In addition, a regression model developed like this could turn out to be highly inefficient, as spectral data usually exhibit high multicollinearity. In this article, we propose a two-part algorithm for selecting an effective and efficient functional regression model. Our algorithm begins by evaluating a subset of discrete wavelet transformations, allowing for variation in both wavelet and filter number. Next, we perform an intermediate processing step to remove variables with low correlation to the response data. Finally, we use the genetic algorithm to perform a stochastic search through the subset regression model space, driven by an information-theoretic objective function. We allow our algorithm to develop the regression model for each response variable independently, so as to optimally model each variable. We demonstrate our method on the familiar biscuit dough dataset, which has been used in a similar context by several researchers. Our results demonstrate both the flexibility and the power of our algorithm. For each response variable, a different subset model is selected, and different wavelet transformations are used. The models developed by our algorithm show an improvement, as measured by lower mean error, over results in the published literature.

12.
The mode of a distribution provides an important summary of data and is often estimated on the basis of some non‐parametric kernel density estimator. This article develops a new data analysis tool called modal linear regression in order to explore high‐dimensional data. Modal linear regression models the conditional mode of a response Y given a set of predictors x as a linear function of x. Modal linear regression differs from standard linear regression in that standard linear regression models the conditional mean (as opposed to mode) of Y as a linear function of x. We propose an expectation–maximization algorithm in order to estimate the regression coefficients of modal linear regression. We also provide asymptotic properties for the proposed estimator without the symmetric assumption of the error density. Our empirical studies with simulated data and real data demonstrate that the proposed modal regression gives shorter predictive intervals than mean linear regression, median linear regression and MM‐estimators.
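An EM-type iteration for a modal line can be sketched as follows. The sketch assumes a commonly described form of the algorithm: the E-step weights each observation by a Gaussian kernel of its current residual, and the M-step refits by weighted least squares. The single predictor, the fixed bandwidth `h`, the OLS starting value, and the names `weighted_line` and `modal_line` are all assumptions for illustration; the article's estimator and bandwidth choice may differ.

```python
import math


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def modal_line(x, y, h, n_iter=100, tol=1e-10):
    """EM-type iteration for a modal regression line (illustrative sketch)."""
    b0, b1 = weighted_line(x, y, [1.0] * len(x))  # start from the OLS fit
    for _ in range(n_iter):
        # E-step: Gaussian kernel weights of the current residuals.
        w = [math.exp(-0.5 * ((yi - b0 - b1 * xi) / h) ** 2)
             for xi, yi in zip(x, y)]
        if sum(w) == 0.0:  # every weight underflowed; keep the current fit
            break
        # M-step: weighted least squares with those weights.
        new_b0, new_b1 = weighted_line(x, y, w)
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


xs = list(range(10)) + [5]
ys = [2 * v + 1 for v in range(10)] + [50]  # one gross outlier at x = 5
print(modal_line(xs, ys, h=1.0))
```

The iteration can settle on a local mode, so in practice several starting values and a data-driven bandwidth would be worth trying; both are outside this sketch.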

13.
In this article, we study the linearized ridge regression (LRR) estimator in a linear regression model, motivated by the work of Liu (1993). The LRR estimator and the two types of generalized Liu estimators are investigated under the PRESS criterion. The method of obtaining the optimal generalized ridge regression (GRR) estimator is derived from the optimal LRR estimator. We use the Hald data as a numerical example and then conduct a simulation study to show the main results. We conclude that the idea of transforming the GRR estimator, a complicated function of the biasing parameters, into a linearized version deserves more attention in the future.

14.
This paper introduces an alternating conditional expectation (ACE) algorithm: a non-parametric approach for estimating the transformations that lead to the maximal multiple correlation of a response and a set of independent variables in regression and correlation analysis. These transformations can give the data analyst insight into the relationships between these variables, so that the relationships can be best described and non-linear relationships uncovered. Using the Bayesian information criterion (BIC), we show how to find the best closed-form approximations for the optimal ACE transformations. By means of ACE and BIC, the model fit can be considerably improved compared with the conventional linear model, as demonstrated on the two simulated and two real datasets in this paper.

15.
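This paper analyzes how the linear regression model differs between the perspectives of econometrics and statistics in terms of its nature, construction method, and significance, and, based on these differences, proposes a basic approach to specifying regression models. The study shows that recognizing these differences is a fundamental and necessary step in model specification and helps achieve a correct specification of the linear regression model. Classic examples are used to further illustrate and analyze how econometric and statistical regression models differ in application, as well as the model specification problem.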

16.
17.
The asymptotic properties of an Operational Ordinary Ridge Regression estimator are discussed and compared with those of the Operational Generalized Least Squares estimator. Some simulation experiments are also carried out, showing that efficiency gains can be made through the use of the Ridge estimator.

18.
We present a new approach to regression function estimation in which a non-parametric regression estimator is guided by a parametric pilot estimate with the aim of reducing the bias. New classes of parametrically guided kernel weighted local polynomial estimators are introduced and formulae for asymptotic expectation and variance, hence approximated mean squared error and mean integrated squared error, are derived. It is shown that the new classes of estimators have the very same large sample variance as the estimators in the standard non-parametric setting, while there is substantial room for reducing the bias if the chosen parametric pilot function belongs to a wide neighbourhood around the true regression line. Bias reduction is discussed in light of examples and simulations.

19.
Simplifying Regression Models Using Dimensional Analysis (total citations: 1; self-citations: 0; citations by others: 1)
Dimensional analysis can make a contribution to model formulation when some of the measurements in the problem are of physical factors. The analysis constructs a set of independent dimensionless factors that should be used as the variables of the regression in place of the original measurements. There are fewer of these than the originals and they often have a more appropriate interpretation. The technique is described briefly and its proposed role in regression discussed and illustrated with examples. We conclude that dimensional analysis can be effective in the preliminary stages of regression analysis when developing formulations involving continuous variables with several dimensions.

20.
ABSTRACT

In the stepwise procedure of selection of a fixed or a random explanatory variable in a mixed quantitative linear model with errors following a Gaussian stationary autocorrelated process, we have studied the efficiency of five estimators relative to Generalized Least Squares (GLS): Ordinary Least Squares (OLS), Maximum Likelihood (ML), Restricted Maximum Likelihood (REML), First Differences (FD), and First-Difference Ratios (FDR). We have also studied the validity and power of seven derived testing procedures, to assess the significance of the slope of the candidate explanatory variable x2 to enter the model in which there is already one regressor x1. In addition to five testing procedures of the literature, we considered the FDR t-test with n - 3 df and the modified t-test with n̂ - 3 df for partial correlations, where n̂ is Dutilleul's effective sample size. Efficiency, validity, and power were analyzed by Monte Carlo simulations, as functions of the nature, fixed vs. random (purely random or autocorrelated), of x1 and x2, the sample size and the autocorrelation of random terms in the regression model. We report extensive results for the autocorrelation structure of first-order autoregressive [AR(1)] type, and discuss results that we obtained for other autocorrelation structures, such as spherical semivariogram, first-order moving average [MA(1)] and ARMA(1,1), but could not present because of space constraints. Overall, we found that:
  1. the efficiency of slope estimators and the validity of testing procedures depend primarily on the nature of x2, but not on that of x1;

  2. FDR is the most inefficient slope estimator, regardless of the nature of x1 and x2;

  3. REML is the most efficient of the slope estimators compared relative to GLS, provided the specified autocorrelation structure is correct and the sample size is large enough to ensure the convergence of its optimization algorithm;

  4. the FDR t-test, the modified t-test and the REML t-test are the most valid of the testing procedures compared, despite the inefficiency of the FDR and OLS slope estimators for the former two;

  5. the FDR t-test, however, suffers from a lack of power that varies with the nature of x1 and x2; and

  6. the modified t-test for partial correlations, which does not require the specification of an autocorrelation structure, can be recommended when x1 is fixed or random and x2 is random, whether purely random or autocorrelated.

Our results are illustrated by the environmental data that motivated our work.
