Found 20 similar articles (search time: 842 ms)
The mode of a distribution provides an important summary of data and is often estimated on the basis of some non-parametric kernel density estimator. This article develops a new data analysis tool called modal linear regression in order to explore high-dimensional data. Modal linear regression models the conditional mode of a response Y given a set of predictors x as a linear function of x. Modal linear regression differs from standard linear regression in that standard linear regression models the conditional mean (as opposed to mode) of Y as a linear function of x. We propose an expectation–maximization algorithm in order to estimate the regression coefficients of modal linear regression. We also provide asymptotic properties for the proposed estimator without the symmetric assumption of the error density. Our empirical studies with simulated data and real data demonstrate that the proposed modal regression gives shorter predictive intervals than mean linear regression, median linear regression and MM-estimators.
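As a rough illustration of the kind of EM-type iteration the abstract above refers to, the sketch below fits a single-predictor modal line by alternating kernel weights and weighted least squares. The Gaussian kernel, the fixed bandwidth `h`, the OLS starting point and the function names `weighted_line_fit` and `modal_line_fit` are all assumptions made for this example; the abstract does not give these details, so this should not be read as the authors' exact algorithm.

```python
import math
import random


def weighted_line_fit(x, y, w):
    """Weighted least squares for y ~ b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return ym - b1 * xm, b1


def modal_line_fit(x, y, h, max_iter=500, tol=1e-10):
    """EM-type iteration for a modal line (assumed Gaussian kernel, bandwidth h)."""
    # Assumption: start from the ordinary least-squares fit (all weights equal).
    b0, b1 = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(max_iter):
        # E-step: weight each point by the kernel evaluated at its current residual.
        w = [math.exp(-0.5 * ((yi - b0 - b1 * xi) / h) ** 2)
             for xi, yi in zip(x, y)]
        # M-step: weighted least squares with those weights.
        new_b0, new_b1 = weighted_line_fit(x, y, w)
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.uniform(-1, 1) for _ in range(1000)]
    # Skewed errors: exponential noise has mode 0 but mean 1.
    y = [1 + 2 * xi + rng.expovariate(1.0) for xi in x]
    print("mean (OLS) fit:", weighted_line_fit(x, y, [1.0] * len(x)))
    print("modal fit:     ", modal_line_fit(x, y, h=0.5))
```

With skewed errors like these, the mean fit tracks the conditional mean, while the modal fit is pulled toward the conditional mode. The kernel smoothing introduces some bias, so the choice of `h` matters in practice.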

Maximum likelihood estimation is investigated in the context of linear regression models under partial independence restrictions. These restrictions aim to assume a kind of completeness of a set of predictors Z in the sense that they are sufficient to explain the dependencies between an outcome Y and predictors X: ℒ(Y|Z, X) = ℒ(Y|Z), where ℒ(·|·) stands for the conditional distribution. From a practical point of view, the former model is particularly interesting in a double sampling scheme where Y and Z are measured together on a first sample and Z and X on a second separate sample. In that case, estimation procedures are close to those developed in the study of double-regression by Engel & Walstra (1991) and Causeur & Dhorne (1998). Properties of the estimators are derived in a small sample framework and in an asymptotic one, and the procedure is illustrated by an example from the food industry context.

In survey sampling and in stereology, it is often desirable to estimate the ratio of means θ = E(Y)/E(X) from bivariate count data (X, Y) with unknown joint distribution. We review methods that are available for this problem, with particular reference to stereological applications. We also develop new methods based on explicit statistical models for the data, and associated model diagnostics. The methods are tested on a stereological dataset. For point-count data, binomial regression and bivariate binomial models are generally adequate. Intercept-count data are often overdispersed relative to Poisson regression models, but adequately fitted by negative binomial regression.
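For orientation, the simplest estimator of a ratio of means θ = E(Y)/E(X) is the classical ratio estimator, sum(Y)/sum(X), with a linearization (delta-method) standard error. The sketch below shows that baseline only. It is an assumption that this is among the methods the article reviews, and the function name `ratio_of_means` is made up for this example; the model-based methods mentioned in the abstract are not reproduced here.

```python
import math


def ratio_of_means(x, y):
    """Classical ratio estimator theta = sum(y) / sum(x) with a
    linearization standard error (assumes i.i.d. pairs and sum(x) > 0)."""
    n = len(x)
    theta = sum(y) / sum(x)
    x_bar = sum(x) / n
    # Linearized residuals d_i = y_i - theta * x_i sum to zero by construction.
    d = [yi - theta * xi for xi, yi in zip(x, y)]
    s2 = sum(di * di for di in d) / (n - 1)
    se = math.sqrt(s2 / n) / x_bar
    return theta, se


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    x_counts = [4, 6, 10]
    y_counts = [2, 3, 7]
    print(ratio_of_means(x_counts, y_counts))
```

The standard error here makes no distributional assumption beyond independent pairs, which is why it is often used as a reference point before fitting explicit count models.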

In this article, we test the effects of predictors in survival regression through two well-known sufficient dimension reduction methods. Since the usual sufficient dimension reduction methods do not require pre-specified models, the predictor effect tests can be considered model-free. All of the test statistics have χ² distributions. Numerical studies of the proposed predictor effect tests in various simulations and real data application are presented.

In this paper, we estimate the reliability of a component subjected to two different stresses which are independent of the strength of a component. We assume that the distribution of stresses follows a bivariate exponential (BVE) distribution. If X is the strength of a component subjected to two stresses (Y1, Y2), then the reliability of a component is given by R = P[Y1 + Y2 < X]. We estimate R when (Y1, Y2) follow different BVE models proposed by Marshall-Olkin (1967), Block-Basu (1974), Freund (1961) and Proschan-Sullo (1974). The distribution of X is assumed to be exponential. The asymptotic normal (AN) distributions of these estimates of R are obtained.


Parameter reduction can enable otherwise infeasible design and uncertainty studies with modern computational science models that contain several input parameters. In statistical regression, techniques for sufficient dimension reduction (SDR) use data to reduce the predictor dimension of a regression problem. A computational scientist hoping to use SDR for parameter reduction encounters a problem: a computer prediction is best represented by a deterministic function of the inputs, so data comprised of computer simulation queries fail to satisfy the SDR assumptions. To address this problem, we interpret SDR methods sliced inverse regression (SIR) and sliced average variance estimation (SAVE) as estimating the directions of a ridge function, which is a composition of a low-dimensional linear transformation with a nonlinear function. Within this interpretation, SIR and SAVE estimate matrices of integrals whose column spaces are contained in the ridge directions’ span; we analyze and numerically verify convergence of these column spaces as the number of computer model queries increases. Moreover, we show example functions that are not ridge functions but whose inverse conditional moment matrices are low-rank. Consequently, the computational scientist should beware when using SIR and SAVE for parameter reduction, since SIR and SAVE may mistakenly suggest that truly important directions are unimportant.


Suppose one estimates the coefficient β2 in E[Y] = β0 + β1 X1 + β2 X2 by stagewise regression. That is, first the model E[Y] ≌ β0 + β1 X1 is fit using simple linear regression followed by a simple linear regression of the residuals from this model on X2 to yield the estimator β2. The ratio of the squared t statistic for the estimate b2 from multiple regression to the squared t statistic for β2 is greater than or equal to 1.0 and is shown to be a convenient function of correlation coefficients among Y, X1, and X2. Examination of stagewise regression can provide useful insights when introducing concepts of multiple regression.
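The comparison described above can be checked numerically with a small sketch. The abstract does not state which degrees of freedom enter the stagewise t statistic, so the code assumes n − 3 residual degrees of freedom for both fits; under that assumption the ratio of squared t statistics is at least 1. The function name `stagewise_vs_multiple` and the simulated data are hypothetical.

```python
import math
import random


def _center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def stagewise_vs_multiple(y, x1, x2):
    """Compare the stagewise and multiple-regression estimates of the X2 coefficient."""
    n = len(y)
    yc, x1c, x2c = _center(y), _center(x1), _center(x2)
    s11, s22, s12 = _dot(x1c, x1c), _dot(x2c, x2c), _dot(x1c, x2c)

    # Stage 1: residuals from the simple regression of Y on X1.
    slope1 = _dot(x1c, yc) / s11
    e = [yi - slope1 * xi for yi, xi in zip(yc, x1c)]

    # Stage 2: simple regression of those residuals on X2.
    b2_stage = _dot(x2c, e) / s22
    sse_stage = _dot(e, e) - b2_stage ** 2 * s22

    # Multiple regression coefficient via partialling X1 out of X2.
    x2r = [a - (s12 / s11) * b for a, b in zip(x2c, x1c)]
    s22r = _dot(x2r, x2r)
    b2_mult = _dot(x2r, e) / s22r
    sse_mult = _dot(e, e) - b2_mult ** 2 * s22r

    df = n - 3  # assumption: same residual df for both t statistics
    return {
        "b2_stagewise": b2_stage,
        "b2_multiple": b2_mult,
        "t2_stagewise": b2_stage ** 2 * s22 / (sse_stage / df),
        "t2_multiple": b2_mult ** 2 * s22r / (sse_mult / df),
        "r12": s12 / math.sqrt(s11 * s22),
    }


if __name__ == "__main__":
    rng = random.Random(0)
    x1 = [rng.gauss(0, 1) for _ in range(50)]
    x2 = [0.6 * a + rng.gauss(0, 1) for a in x1]
    y = [1 + 2 * a - b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    out = stagewise_vs_multiple(y, x1, x2)
    print(out, "ratio:", out["t2_multiple"] / out["t2_stagewise"])
```

In this setup the stagewise coefficient equals the multiple-regression coefficient multiplied by 1 − r12², where r12 is the sample correlation between X1 and X2, so the two estimates coincide only when the predictors are uncorrelated in the sample.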

We consider the situation where there is a known regression model that can be used to predict an outcome, Y, from a set of predictor variables X. A new variable B is expected to enhance the prediction of Y. A dataset of size n containing Y, X and B is available, and the challenge is to build an improved model for Y|X,B that uses both the available individual level data and some summary information obtained from the known model for Y|X. We propose a synthetic data approach, which consists of creating m additional synthetic data observations, and then analyzing the combined dataset of size n + m to estimate the parameters of the Y|X,B model. This combined dataset of size n + m now has missing values of B for m of the observations, and is analyzed using methods that can handle missing data (e.g., multiple imputation). We present simulation studies and illustrate the method using data from the Prostate Cancer Prevention Trial. Though the synthetic data method is applicable to a general regression context, to provide some justification, we show in two special cases that the asymptotic variances of the parameter estimates in the Y|X,B model are identical to those from an alternative constrained maximum likelihood estimation approach. This correspondence in special cases and the method's broad applicability make it appealing for use across diverse scenarios. The Canadian Journal of Statistics 47: 580–603; 2019 © 2019 Statistical Society of Canada

We introduce multicovariate-adjusted regression (MCAR), an adjustment method for regression analysis, where both the response (Y) and predictors (X1, …, Xp) are not directly observed. The available data have been contaminated by unknown functions of a set of observable distorting covariates, Z1, …, Zs, in a multiplicative fashion. The proposed method substantially extends the current contaminated regression modelling capability, by allowing for multiple distorting covariate effects. MCAR is a flexible generalisation of the recently proposed covariate-adjusted regression method, an effective adjustment method in the presence of a single covariate, Z. For MCAR estimation, we establish a connection between the MCAR models and adaptive varying coefficient models. This connection leads to an adaptation of a hybrid backfitting estimation algorithm. Extensive simulations are used to study the performance and limitations of the proposed iterative estimation algorithm. In particular, the bias and mean square error of the proposed MCAR estimators are examined, relative to a baseline and a consistent benchmark estimator. The method is also illustrated with a Pima Indian diabetes data set, where the response and predictors are potentially contaminated by body mass index and triceps skin fold thickness. Both distorting covariates measure aspects of obesity, an important risk factor in type 2 diabetes.


This article considers nonparametric regression problems and develops a model-averaging procedure for smoothing spline regression problems. Unlike most smoothing parameter selection studies determining an optimum smoothing parameter, our focus here is on the prediction accuracy for the true conditional mean of Y given a predictor X. Our method consists of two steps. The first step is to construct a class of smoothing spline regression models based on nonparametric bootstrap samples, each with an appropriate smoothing parameter. The second step is to average bootstrap smoothing spline estimates of different smoothness to form a final improved estimate. To minimize the prediction error, we estimate the model weights using a delete-one-out cross-validation procedure. A simulation study has been performed by using a program written in R. The simulation study compares the best-known methods, cross-validation (CV) and generalized cross-validation (GCV), with the proposed method. This new method is straightforward to implement and performs reliably in simulations.

It is well known that when a ranked set sampling (RSS) scheme is employed to estimate the mean of a population, it is more efficient than simple random sampling (SRS) with the same sample size. One can use an RSS analog of the SRS regression estimator to estimate the population mean of Y using its concomitant variable X when they are linearly related. Unfortunately, the variance of this estimate cannot be evaluated unless the distribution of X is known. We investigate the use of resampling methods to establish confidence intervals for the regression estimation of the population mean. Simulation studies show that the proposed methods perform well in a variety of situations when the assumption of linearity holds, and decently well under mild non-linearity.

Significance tests on coefficients of lower-order terms in polynomial regression models are affected by linear transformations. For this reason, a polynomial regression model that excludes hierarchically inferior predictors (i.e., lower-order terms) is considered to be not well formulated. Existing variable-selection algorithms do not take into account the hierarchy of predictors and often select as “best” a model that is not hierarchically well formulated. This article proposes a theory of the hierarchical ordering of the predictors of an arbitrary polynomial regression model in m variables, where m is any arbitrary positive integer. Ways of modifying existing algorithms to restrict their search to well-formulated models are suggested. An algorithm that generates all possible well-formulated models is presented.

In this article, a new method named the cumulative slicing principal fitted component (CUPFC) model is proposed to conduct sufficient dimension reduction and prediction in regression. Based on the classical PFC methods, the CUPFC avoids selecting some parameters such as the specific basis function form or the number of slices in slicing estimation. We develop the estimator of the central subspace in the CUPFC method under three error-term structures and establish its consistency. The simulations compare the effectiveness of the new method in prediction and reduction estimation against competing methods. The results indicate that the proposed method generally outperforms the existing PFC methods no matter how the predictors are truly related to the response. The application to real data also verifies the validity of the proposed method.


In a model of the form Y = h(X1, …, Xd) where the goal is to estimate a parameter of the probability distribution of Y, we define new sensitivity indices which quantify the importance of each variable Xi with respect to this parameter of interest. The aim of this paper is to define goal-oriented sensitivity indices, and we show that Sobol indices are sensitivity indices associated with a particular characteristic of the distribution of Y. We name the framework we present Goal Oriented Sensitivity Analysis (GOSA).

Although the concept of sufficient dimension reduction has existed for a long time, studies in the literature have largely focused on properties of estimators of dimension-reduction subspaces in the classical “small p, large n” setting. Rather than the subspace, this paper considers directly the set of reduced predictors, which we believe are more relevant for subsequent analyses. A principled method is proposed for estimating a sparse reduction, which is based on a new, revised representation of an existing well-known method called sliced inverse regression. A fast and efficient algorithm is developed for computing the estimator. The asymptotic behavior of the new method is studied when the number of predictors, p, exceeds the sample size, n, providing a guide for choosing the number of sufficient dimension-reduction predictors. Numerical results, including a simulation study and a cancer-drug-sensitivity data analysis, are presented to examine the performance.

Biplots are useful tools to explore the relationship among variables. In this paper, the specific regression relationship between a set of predictors X and a set of response variables Y by means of partial least-squares (PLS) regression is represented. The PLS biplot provides a single graphical representation of the samples together with the predictor and response variables, as well as their interrelationships in terms of the matrix of regression coefficients.

We present a novel approach to sufficient dimension reduction for the conditional kth moments in regression. The approach provides a computationally feasible test for the dimension of the central kth-moment subspace. In addition, we can test predictor effects without assuming any models. All test statistics proposed in the novel approach have asymptotic chi-squared distributions.

We propose a robust Kalman filter (RKF) to estimate the true but hidden return when microstructure noise is present. Following Zhou's definition, we assume the observed return Yt is the result of adding microstructure noise to the true but hidden return Xt. Microstructure noise is assumed to be independent and identically distributed (i.i.d.); it is also independent of Xt. When Xt is sampled from a geometric Brownian motion process to yield Yt, the Kalman filter can produce optimal estimates of Xt from Yt. However, the covariance matrix of microstructure noise and that of Xt must be known for this claim to hold. In practice, neither covariance matrix is known so they must be estimated. Our RKF, in contrast, does not need the covariance matrices as input. Simulation results show that the RKF gives essentially identical estimates to the Kalman filter, which has access to the covariance matrices. As applications, estimated Xt can be used to estimate the volatility of Xt.

Admissibility of linear estimators is characterized in linear models E(Y) = Xβ, D(Y) = V, with an unknown multidimensional parameter (β, V) varying in the Cartesian product C × ν, where C is a subset of space and ν is a given set of nonnegative definite symmetric matrices. The relation between admissibility of inhomogeneous and homogeneous linear estimators is discussed, and some sufficient and necessary conditions for admissibility of an inhomogeneous linear estimator are given.
