Similar literature
20 similar articles found.
1.
Dependence in outcome variables may pose formidable difficulties in analyzing data from longitudinal studies. Most past studies have tried to address this problem using marginal models. However, with marginal models alone it is difficult to specify measures of dependence in outcomes that arise from association between outcomes as well as between outcomes and explanatory variables. In this paper, a generalized approach is demonstrated using both conditional and marginal models. The model uses link functions to test for dependence in outcome variables. The estimation and test procedures are illustrated with an application to the mobility index data from the Health and Retirement Survey, and simulations are performed for correlated binary data generated from bivariate Bernoulli distributions. The results indicate the usefulness of the proposed method.

2.
When an existing risk prediction model is not sufficiently predictive, additional variables are sought for inclusion in the model. This paper addresses study designs to evaluate the improvement in prediction performance that is gained by adding a new predictor to a risk prediction model. We consider studies that measure the new predictor in a case–control subset of the study cohort, a practice that is common in biomarker research. We ask whether matching controls to cases with regard to baseline predictors improves efficiency. A variety of measures of prediction performance are studied. We find through simulation studies that matching improves the efficiency with which most measures are estimated, but can reduce efficiency for some. Efficiency gains are smaller when more controls per case are included in the study. A method that models the distribution of the new predictor in controls appears to improve estimation efficiency considerably.

3.
4.
A problem in logit analysis is the interval estimation of the logistic response curve. Scheffé's method is used to obtain confidence bands for the logistic response function for any number of explanatory variables. This method is computationally easier and more general than a previously reported method.

5.
The geographical relative risk function is a useful tool for investigating the spatial distribution of disease based on case and control data. The most common way of estimating this function is to use the ratio of bivariate kernel density estimates constructed from the locations of cases and controls, respectively. An alternative is to use a local-linear (LL) estimator of the log-relative risk function. In both cases, the choice of bandwidth is critical. In this article, we examine the relative performance of the two estimation techniques using a variety of data-driven bandwidth selection methods, including likelihood cross-validation (CV), least-squares CV, rule-of-thumb reference methods, and a new approximate plug-in (PI) bandwidth for the LL estimator. Our analysis includes a comparison of asymptotic results, a simulation study, and an application of the estimators to two real data sets. Our findings suggest that the density ratio method implemented with the least-squares CV bandwidth selector is generally best, with the LL estimator with PI bandwidth being competitive in applications with strong large-scale trends but much worse in situations with elliptical clusters.
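
The density-ratio estimator described in this abstract can be sketched as follows. This is a minimal illustration only, assuming a Gaussian product kernel and a single fixed bandwidth `h` shared by cases and controls; the function names, the coordinates, and the bandwidth value are hypothetical and are not taken from the article, which selects bandwidths by data-driven methods.

```python
import math

def kde2d(points, x, y, h):
    """Bivariate Gaussian product-kernel density estimate at (x, y)."""
    total = 0.0
    for px, py in points:
        u = (x - px) / h
        v = (y - py) / h
        total += math.exp(-0.5 * (u * u + v * v))
    return total / (len(points) * 2.0 * math.pi * h * h)

def log_relative_risk(cases, controls, x, y, h):
    """Log of the ratio of the case density to the control density at (x, y)."""
    return math.log(kde2d(cases, x, y, h)) - math.log(kde2d(controls, x, y, h))

# Hypothetical locations: cases cluster near (1, 1), controls are spread out.
cases = [(0.9, 1.1), (1.0, 1.0), (1.2, 0.8), (1.1, 1.3), (3.0, 3.0)]
controls = [(0.5, 2.5), (1.0, 1.0), (2.0, 2.0), (3.0, 1.0), (2.5, 3.2), (3.5, 2.8)]

near_cluster = log_relative_risk(cases, controls, 1.0, 1.0, h=0.7)
far_from_cluster = log_relative_risk(cases, controls, 3.0, 2.0, h=0.7)
```

A positive value marks a location where cases are over-represented relative to controls, and a negative value marks the opposite. The result depends strongly on `h`, which is why the abstract treats bandwidth choice as critical.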

6.
The problem of consistent estimation of regression coefficients in a multivariate linear ultrastructural measurement error model is considered in this article when some additional information on the regression coefficients is available a priori. Such additional information is expressible in the form of stochastic linear restrictions. Utilizing the stochastic restrictions given a priori, some methodologies are presented to obtain consistent estimators of the regression coefficients under two types of additional information separately, viz., the covariance matrix of the measurement errors and the reliability matrix associated with the explanatory variables. The measurement errors are not assumed to be normally distributed. The asymptotic properties of the proposed estimators are derived and studied analytically as well as numerically through a Monte Carlo simulation experiment.

7.
A data-driven bandwidth choice for a kernel density estimator, called the critical bandwidth, is investigated. This procedure yields an estimate with as many modes as are assumed for the density being estimated. Both Gaussian and uniform kernels are considered. For the Gaussian kernel, asymptotic results are given. For the uniform kernel, an argument against these properties is mentioned. These theoretical results are illustrated with a simulation study that compares the kernel estimators that rely on the critical bandwidth with another one that uses a plug-in method to select its bandwidth. An estimator that consists of estimates of density contour clusters and takes assumptions on the number of modes into account is also considered. Finally, the methodology is illustrated using environmental monitoring data.
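
The critical-bandwidth idea for the Gaussian kernel can be illustrated with a rough numerical sketch: search for the smallest bandwidth at which the estimate has no more than `k` modes. The grid-based mode count, the bisection bounds, the tolerance, and the sample below are all assumptions made for illustration and are not the procedure analyzed in the article.

```python
import math

def kde(data, x, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    s = sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data)
    return s / (len(data) * h * math.sqrt(2.0 * math.pi))

def count_modes(data, h, grid_size=1000):
    """Count strict local maxima of the estimate on an evenly spaced grid."""
    lo, hi = min(data) - 3.0 * h, max(data) + 3.0 * h
    step = (hi - lo) / (grid_size - 1)
    f = [kde(data, lo + i * step, h) for i in range(grid_size)]
    return sum(1 for i in range(1, grid_size - 1) if f[i - 1] < f[i] > f[i + 1])

def critical_bandwidth(data, k, tol=1e-3):
    """Bisection for the smallest h whose estimate has at most k modes.

    Relies on the mode count not increasing with h, which is assumed here
    for the Gaussian kernel only.
    """
    spread = max(data) - min(data)
    lo, hi = 1e-3 * spread, spread
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_modes(data, mid) <= k:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical sample with two well-separated clusters.
data = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]
h_one_mode = critical_bandwidth(data, 1)
h_two_modes = critical_bandwidth(data, 2)
```

For this sample, forcing a single mode requires a much larger bandwidth than allowing two, which is the behavior the critical bandwidth is meant to expose. The grid resolution limits accuracy near the threshold, so the returned values are approximate.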

8.
We consider a general non-parametric regression model, where the distribution of the error, given the covariate, is modelled by a conditional distribution function. For the estimation, a kernel approach as well as the (kernel based) empirical likelihood method are discussed. The latter method allows for incorporation of additional information on the error distribution into the estimation. We show weak convergence of the corresponding empirical processes to Gaussian processes and compare both approaches in asymptotic theory and by means of a simulation study.

9.
For the functional measurement error model, the true, unobservable explanatory variables when treated as nuisance parameters yield an increase in the number of nuisance parameters corresponding to an increase in sample size. Fisher's information may not exist for all parameters under this scenario. We propose a simple but effective method of deriving Fisher's information by approximating the design matrix of explanatory variables with a quantile design matrix. We illustrate the application of our method with a numerical example. Adaptation of this method shows very good performance for the prediction problem.

10.
We consider the problem of density estimation when the data is in the form of a continuous stream with no fixed length. In this setting, implementations of the usual methods of density estimation such as kernel density estimation are problematic. We propose a method of density estimation for massive datasets that is based upon taking the derivative of a smooth curve that has been fit through a set of quantile estimates. To achieve this, a low-storage, single-pass, sequential method is proposed for simultaneous estimation of multiple quantiles for massive datasets, which forms the basis of this method of density estimation. For comparison, we also consider a sequential kernel density estimator. The proposed methods are shown through a simulation study to perform well and to have several distinct advantages over existing methods.
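
The link between quantiles and density used in this abstract is that the density is the derivative of the distribution function, so the slope of probability against quantile value approximates the density. The sketch below replaces the article's smooth curve with a plain finite difference between adjacent quantiles and does not cover the sequential quantile estimation step; the function name and the input values are hypothetical.

```python
def density_from_quantiles(probs, quantiles):
    """Finite-difference density estimates between adjacent quantiles.

    probs and quantiles are increasing lists of equal length. Returns a list
    of (midpoint, density) pairs, one per adjacent pair of quantiles.
    """
    estimates = []
    for i in range(len(probs) - 1):
        dp = probs[i + 1] - probs[i]          # probability mass in the interval
        dq = quantiles[i + 1] - quantiles[i]  # width of the interval
        midpoint = 0.5 * (quantiles[i] + quantiles[i + 1])
        estimates.append((midpoint, dp / dq))
    return estimates

# Exact quantiles of a Uniform(0, 2) distribution, whose density is 0.5.
probs = [0.1, 0.25, 0.5, 0.75, 0.9]
quantiles = [0.2, 0.5, 1.0, 1.5, 1.8]
estimates = density_from_quantiles(probs, quantiles)
```

Because only a fixed set of quantile estimates has to be stored, the memory needed does not grow with the length of the stream, which is the property the abstract emphasizes.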

11.
In this paper, the ridge estimation method is generalized to median regression. Though the least absolute deviation (LAD) estimation method is robust in the presence of non-Gaussian or asymmetric error terms, it can still suffer from severe multicollinearity when non-orthogonal explanatory variables are involved. The proposed method increases the efficiency of the LAD estimators by reducing variance inflation and allowing some bias in exchange for a smaller mean squared error. The paper also includes an application of the new methodology and a simulation study.

12.
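Dropping the intercept and omitting explanatory variables are two common errors in estimating linear regression models. The intercept is typically dropped because it is found to be insignificant in testing, which distorts both the parameter estimates and the hypothesis tests. Explanatory variables are omitted because analysts wrongly believe that regression analysis is justified whenever variables are correlated and causally related, and therefore ignore other important explanatory variables. A model built this way cannot be used for economic structural analysis or policy evaluation and is at best suitable for prediction.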
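
The effect of dropping the intercept can be seen in a small noise-free example. The data and function names below are hypothetical and chosen only so that the true intercept (2) and slope (3) are known exactly.

```python
def ols_with_intercept(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def ols_through_origin(x, y):
    """Least-squares fit of y = b*x with the intercept forced to zero."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 + 3.0 * a for a in x]  # true intercept 2, true slope 3, no noise

intercept, slope = ols_with_intercept(x, y)
slope_no_intercept = ols_through_origin(x, y)
```

With the intercept included, the fit recovers the slope 3. With the intercept forced to zero, the slope estimate becomes 195/55, roughly 3.55, because the slope has to absorb the missing constant term.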

13.
The author considers the estimation of the common probability density of independent and identically distributed random variables observed with added white noise. She assumes that the unknown density belongs to some class of supersmooth functions, and that the error distribution is ordinarily smooth, meaning that its characteristic function decays polynomially asymptotically. In this context, the author evaluates the minimax rate of convergence of the pointwise risk and describes a kernel estimator having this rate. She computes upper bounds for the L2 risk of this estimator.

14.
A great deal of research has focused on improving the bias properties of kernel estimators. One proposal involves removing the restriction of non-negativity on the kernel to construct “higher-order” kernels that eliminate additional terms in the Taylor series expansion of the bias. This paper considers an alternative that uses a local approach to bandwidth selection not only to reduce the bias but to eliminate it entirely. These so-called “zero-bias bandwidths” are shown to exist for univariate and multivariate kernel density estimation as well as kernel regression. Implications of the existence of such bandwidths are discussed. An estimation strategy is presented, and the extent of the reduction or elimination of bias in practice is studied through simulation and example.

15.
Credit scoring techniques have been developed in recent years to reduce the risk taken by banks and financial institutions in the loans they grant. Credit scoring is the problem of classifying individuals into one of two groups: defaulting borrowers or non-defaulting borrowers. The aim of this paper is to propose a new method of discrimination for the case in which the dependent variable is categorical and a large number of categorical explanatory variables are retained. This method, Categorical Multiblock Linear Discriminant Analysis, computes components that take into account both the relationships between the explanatory categorical variables and the canonical correlation between each explanatory categorical variable and the dependent variable. A comparison with three other techniques and an application to credit scoring data are provided.

16.
An alternative graphical method, called the SSR plot, is proposed for use with a multiple regression model. The new method uses the fact that the sum of squares for regression (SSR) of two explanatory variables can be partitioned into the SSR of one variable and the increment in SSR due to the addition of the second variable. The SSR plot represents each explanatory variable as a vector in a half circle. In the SSR plot, explanatory variables whose vectors lie closer to the horizontal axis have stronger effects on the response variable. Furthermore, for a regression model with two explanatory variables, the magnitude of the angle between the two vectors can be used to identify suppression.
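
The SSR partition that the plot relies on can be checked numerically. The sketch below covers only the partition, not the construction of the plot itself; it computes the increment from the part of the second variable that is orthogonal to the first, and the data and helper names are hypothetical.

```python
def center(values):
    """Subtract the mean from every element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ssr_one(x, y):
    """SSR of the simple regression of y on x (both already centered)."""
    return dot(x, y) ** 2 / dot(x, x)

def ssr_two(x1, x2, y):
    """SSR of the regression of y on x1 and x2 (all already centered)."""
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1 * s1y + b2 * s2y

x1 = center([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = center([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = center([3.0, 4.0, 8.0, 7.0, 12.0, 11.0])

# Increment in SSR from adding x2: regress y on the part of x2 orthogonal to x1.
coef = dot(x1, x2) / dot(x1, x1)
x2_resid = [b - coef * a for a, b in zip(x1, x2)]
increment = ssr_one(x2_resid, y)

ssr_x1 = ssr_one(x1, y)
ssr_full = ssr_two(x1, x2, y)
```

Here `ssr_full` equals `ssr_x1 + increment` up to rounding, which is the decomposition stated in the abstract.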

17.
RATES OF CONVERGENCE IN SEMI-PARAMETRIC MODELLING OF LONGITUDINAL DATA   Cited by: 2 in total (self-citations: 0, citations by others: 2)
We consider the problem of semi-parametric regression modelling when the data consist of a collection of short time series for which measurements within series are correlated. The objective is to estimate a regression function of the form E[Y(t) | x] = x'β + μ(t), where μ(·) is an arbitrary, smooth function of time t, and x is a vector of explanatory variables which may or may not vary with t. For the non-parametric part of the estimation we use a kernel estimator with fixed bandwidth h. When h is chosen without reference to the data we give exact expressions for the bias and variance of the estimators for β and μ(t) and an asymptotic analysis of the case in which the number of series tends to infinity whilst the number of measurements per series is held fixed. We also report the results of a small-scale simulation study to indicate the extent to which the theoretical results continue to hold when h is chosen by a data-based cross-validation method.

18.
This paper examines the problem of assessing local influence on the optimal bandwidth estimation in kernel smoothing based on cross validation. The bandwidth for kernel smoothing plays an important role in the model fitting and is often estimated using the cross-validation criterion. Following the argument of the second-order approach to local influence suggested by Wu and Luo (1993), we develop a new diagnostic statistic to examine the local influence of the observations on the estimation of the optimal bandwidth, where the perturbation may belong to one of three schemes. These are the response perturbation, the perturbation in the explanatory variable, and the case-weight perturbation. The proposed diagnostic is nonparametric and is capable of identifying influential observations with strong influence on the bandwidth estimation. An example is presented to illustrate the application of the proposed diagnostic, and the usefulness of the nonparametric approach is illustrated in comparison with some other approaches to the assessment of local influence.

19.
In this paper we consider the regression problem for random sets of the Boolean-model type. Regression models of Boolean random sets with explanatory variables are classified, according to the type of these variables, as propagation, growth, or propagation-growth models. The maximum likelihood estimation of the parameters for the propagation model is explained in detail for some specific link functions using three methods. These three methods of estimation are also compared in a simulation study.

20.
蒋青嬗 et al., 《统计研究》 (Statistical Research), 2018, 35(11): 105-115
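Ignoring individual effects and spatial effects seriously distorts efficiency measurement: ignoring individual effects shifts the technical inefficiency term, and ignoring spatial correlation makes the estimators biased and inconsistent. Building on the true fixed-effects stochastic frontier model (which introduces individual effects), this paper adds spatial lags of the dependent variable and of the two-sided error term to construct a more widely applicable true fixed-effects spatial stochastic frontier model. A within transformation is applied to the model to eliminate the incidental parameters, and a Bayesian method (which requires deriving the posterior distributions of the unknown parameters and running MCMC sampling) is used to estimate the parameters and technical efficiency. The method genuinely overcomes the incidental parameters problem and is more intuitive and simpler than comparable methods. Numerical simulations show that the method estimates the parameters, the individual intercepts, and the technical inefficiency terms with high accuracy, and that accuracy improves as the sample size increases.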
