Similar literature
 20 similar documents found (search time: 31 ms)
1.
Summary. Varying-coefficient linear models arise from multivariate nonparametric regression, non-linear time series modelling and forecasting, functional data analysis, longitudinal data analysis and others. It has been a common practice to assume that the varying coefficients are functions of a given variable, which is often called an index. To enlarge the modelling capacity substantially, this paper explores a class of varying-coefficient linear models in which the index is unknown and is estimated as a linear combination of regressors and/or other variables. We search for the index such that the derived varying-coefficient model provides the least squares approximation to the underlying unknown multidimensional regression function. The search is implemented through a newly proposed hybrid backfitting algorithm. The core of the algorithm is the alternating iteration between estimating the index through a one-step scheme and estimating coefficient functions through one-dimensional local linear smoothing. The locally significant variables are selected in terms of a combined use of the t-statistic and the Akaike information criterion. We further extend the algorithm for models with two indices. Simulation shows that the methodology proposed has appreciable flexibility to model complex multivariate non-linear structure and is practically feasible with average modern computers. The methods are further illustrated through the Canadian mink–muskrat data in 1925–1994 and the pound–dollar exchange rates in 1974–1983.
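
The one-dimensional local linear smoothing step mentioned in this abstract can be sketched as follows. This is a minimal illustration for a single scalar regression, not the paper's hybrid backfitting algorithm; the function name `local_linear`, the Gaussian kernel and the bandwidth `h` are assumptions made for the example.

```python
import math

def local_linear(x, y, x0, h):
    """Local linear estimate of E[y | x = x0] with a Gaussian kernel of bandwidth h."""
    # Kernel weights centred at x0.
    w = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in x]
    # Weighted moments of the centred regressor d = x - x0.
    s0 = sum(w)
    s1 = sum(wi * (xi - x0) for wi, xi in zip(w, x))
    s2 = sum(wi * (xi - x0) ** 2 for wi, xi in zip(w, x))
    t0 = sum(wi * yi for wi, yi in zip(w, y))
    t1 = sum(wi * (xi - x0) * yi for wi, xi, yi in zip(w, x, y))
    # Solve the 2x2 weighted least-squares normal equations for the intercept,
    # which is the fitted value at x0.
    det = s0 * s2 - s1 ** 2
    return (s2 * t0 - s1 * t1) / det
```

Because the local fit is a straight line, exactly linear data are reproduced without smoothing bias, which is one common reason to prefer local linear over local constant smoothing.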

2.
By modifying the direct method to solve the overdetermined linear system we are able to present an algorithm for L1 estimation which appears to be superior computationally to any other known algorithm for the simple linear regression problem.
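
For orientation, L1 estimation in simple linear regression can be sketched with a brute-force baseline. It relies on the standard fact that, when the regressor takes at least two distinct values, some optimal L1 line passes through at least two data points. This is not the modified direct method of the abstract; `lad_line` is a hypothetical name and the O(n^3) search is only meant to make the objective concrete.

```python
from itertools import combinations

def lad_line(x, y):
    """Brute-force L1 fit of y = a + b*x: try every line through two data points."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, not a valid regression candidate
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        # Sum of absolute residuals for this candidate line.
        loss = sum(abs(yk - a - b * xk) for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best  # (sum of absolute residuals, intercept, slope)
```

With one gross outlier in otherwise collinear data, the fitted line stays on the collinear points, which illustrates the robustness that motivates L1 estimation.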

3.
The Barrodale and Roberts algorithm for least absolute value (LAV) regression and the algorithm proposed by Bartels and Conn both have the advantage that they are often able to skip across points at which the conventional simplex-method algorithms for LAV regression would be required to carry out an (expensive) pivot operation.

We indicate here that this advantage holds in the Bartels-Conn approach for a wider class of problems: the minimization of piecewise linear functions. We show how LAV regression, restricted LAV regression, general linear programming and least maximum absolute value regression can all be easily expressed as piecewise linear minimization problems.
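
As a reminder of why these problems are piecewise linear, the two regression objectives can be written in their standard textbook form. The symbols $y_i$, $x_i$ and $\beta$ are notation assumed here and are not taken from the paper.

```latex
% LAV regression: a sum of absolute residuals
\min_{\beta} \; \sum_{i=1}^{n} \bigl| y_i - x_i^{\top}\beta \bigr|,
\qquad
\bigl| y_i - x_i^{\top}\beta \bigr|
  = \max\bigl( y_i - x_i^{\top}\beta,\; x_i^{\top}\beta - y_i \bigr)

% Least maximum absolute value regression: the largest absolute residual
\min_{\beta} \; \max_{1 \le i \le n} \bigl| y_i - x_i^{\top}\beta \bigr|
```

Each absolute residual is the maximum of two linear functions of $\beta$, so both objectives are convex and piecewise linear in $\beta$.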

4.
We propose a hybrid two-group classification method that integrates linear discriminant analysis, a polynomial expansion of the basis (or variable space), and a genetic algorithm with multiple crossover operations to select variables from the expanded basis. Using new product launch data from the biochemical industry, we found that the proposed algorithm offers mean percentage decreases in the misclassification error rate of 50%, 56%, 59%, 77%, and 78% in comparison to a support vector machine, artificial neural network, quadratic discriminant analysis, linear discriminant analysis, and logistic regression, respectively. These improvements correspond to annual cost savings of $4.40–$25.73 million.

5.
This paper considers the analysis of linear models where the response variable is a linear function of observable component variables. For example, scores on two or more psychometric measures (the component variables) might be weighted and summed to construct a single response variable in a psychological study. A linear model is then fit to the response variable. The question addressed in this paper is how to optimally transform the component variables so that the response is approximately normally distributed. The transformed component variables, themselves, need not be jointly normal. Two cases are considered; in both cases, the Box-Cox power family of transformations is employed. In Case I, the coefficients of the linear transformation are known constants. In Case II, the linear function is the first principal component based on the matrix of correlations among the transformed component variables. For each case, an algorithm is described for finding the transformation powers that minimize a generalized Anderson-Darling statistic. The proposed transformation procedure is compared to likelihood-based methods by means of simulation. The proposed method rarely performed worse than likelihood-based methods and for many data sets performed substantially better. As an illustration, the algorithm is applied to a problem from rural sociology and social psychology; namely scaling family residences along an urban-rural dimension.
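
The Case I construction described above (transform each component with its own Box-Cox power, then combine with known weights) can be sketched as follows. The function names and argument layout are hypothetical, and the sketch does not include the search over powers that minimizes the generalized Anderson-Darling statistic.

```python
import math

def box_cox(values, lam):
    """Box-Cox power transform of strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lam -> 0 limit of (v**lam - 1) / lam is the natural logarithm.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def weighted_response(components, powers, weights):
    """Transform each component variable with its own power, then form the
    fixed linear combination with known weights (Case I in the abstract)."""
    transformed = [box_cox(col, lam) for col, lam in zip(components, powers)]
    n = len(components[0])
    return [sum(w * col[k] for w, col in zip(weights, transformed))
            for k in range(n)]
```

A power search would wrap `weighted_response` in an optimizer over `powers` and score each candidate response with a normality statistic.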

6.
The authors propose a profile likelihood approach to linear clustering which explores potential linear clusters in a data set. For each linear cluster, an errors‐in‐variables model is assumed. The optimization of the derived profile likelihood can be achieved by an EM algorithm. Its asymptotic properties and its relationships with several existing clustering methods are discussed. Methods to determine the number of components in a data set are adapted to this linear clustering setting. Several simulated and real data sets are analyzed for comparison and illustration purposes. The Canadian Journal of Statistics 38: 716–737; 2010 © 2010 Statistical Society of Canada

7.
Albert and Chib introduced a complete Bayesian method to analyze data arising from the generalized linear model in which they used the Gibbs sampling algorithm facilitated by latent variables. Recently, Cowles proposed an alternative algorithm to accelerate the convergence of the Albert-Chib algorithm. The novelty in this latter algorithm is achieved by using a Hastings algorithm to generate latent variables and bin boundary parameters jointly instead of individually from their respective full conditionals. In the same spirit, we reparameterize the cumulative-link generalized linear model to accelerate the convergence of Cowles’ algorithm even further. One important advantage of our method is that for the three-bin problem it does not require the Hastings algorithm. In addition, for problems with more than three bins, while the Hastings algorithm is required, we provide a proposal density based on the Dirichlet distribution which is more natural than the truncated normal density used in the competing algorithm. Also, using diagnostic procedures recommended in the literature for the Markov chain Monte Carlo algorithm (both single and multiple runs) we show that our algorithm is substantially better than the one recently obtained. Precisely, our algorithm provides faster convergence and smaller autocorrelations between the iterates. Using the probit link function, extensive results are obtained for the three-bin and the five-bin multinomial ordinal data problems.

8.
This paper describes an EM algorithm for maximum likelihood estimation in generalized linear models (GLMs) with continuous measurement error in the explanatory variables. The algorithm is an adaptation of that for nonparametric maximum likelihood (NPML) estimation in overdispersed GLMs described in Aitkin (Statistics and Computing 6: 251–262, 1996). The measurement error distribution can be of any specified form, though the implementation described assumes normal measurement error. Neither the reliability nor the distribution of the true score of the variables with measurement error has to be known, nor are instrumental variables or replication required. Standard errors can be obtained by omitting individual variables from the model, as in Aitkin (1996). Several examples with normal and Bernoulli response variables are given.

9.
Summary. Non-hierarchical clustering methods are frequently based on the idea of forming groups around 'objects'. The main exponent of this class of methods is the k-means method, where these objects are points. However, clusters in a data set may often be due to certain relationships between the measured variables. For instance, we can find linear structures such as straight lines and planes, around which the observations are grouped in a natural way. These structures are not well represented by points. We present a method that searches for linear groups in the presence of outliers. The method is based on the idea of impartial trimming. We search for the 'best' subsample containing a proportion 1 − α of the data and the best k affine subspaces fitting to those non-discarded observations by measuring discrepancies through orthogonal distances. The population version of the sample problem is also considered. We prove the existence of solutions for the sample and population problems together with their consistency. A feasible algorithm for solving the sample problem is described as well. Finally, some examples showing how the method proposed works in practice are provided.

10.
Methods for linear regression with multivariate response variables are well described in statistical literature. In this study we conduct a theoretical evaluation of the expected squared prediction error in bivariate linear regression where one of the response variables contains missing data. We make the assumption of known covariance structure for the error terms. On this basis, we evaluate three well-known estimators: standard ordinary least squares, generalized least squares, and a James–Stein inspired estimator. Theoretical risk functions are worked out for all three estimators to evaluate under which circumstances it is advantageous to take the error covariance structure into account.

11.
In geostatistics, detecting atypical observations is of special interest due to the changes they can cause in environmental and geological patterns. Several methods for detecting them have been already suggested for the univariate spatial case. However, the problem is more complicated when various variables are observed simultaneously and the spatial correlation among them must be taken into account. The aim of this paper is to detect outliers and influential observations in multivariate spatial linear models. For this purpose, we derive and explore two different methods. First, a multivariate version of the forward search algorithm is given, where locations with outliers are detected in the last steps of the procedure. Next, we derive influence measures to assess the impact of the observations on the multivariate spatial linear model. The procedures are easy to compute and to interpret by means of graphical representations. Finally, an example and a Monte Carlo study illustrate the performance of these methods for identification of outliers in multivariate spatial linear models.

12.
A nonconcave penalized estimation method is proposed for partially linear models with longitudinal data when the number of parameters diverges with the sample size. The proposed procedure can simultaneously estimate the parameters and select the important variables. Under some regularity conditions, the rate of convergence and asymptotic normality of the resulting estimators are established. In addition, an iterative algorithm is proposed to implement the proposed estimators. To improve efficiency for regression coefficients, the estimation of the covariance function is integrated in the iterative algorithm. Simulation studies are carried out to demonstrate that the proposed method performs well, and a real data example is analysed to illustrate the proposed procedure.

13.
A reversible jump algorithm for Bayesian model determination among generalised linear models, under relatively diffuse prior distributions for the model parameters, is proposed. Orthogonal projections of the current linear predictor are used so that knowledge from the current model parameters is used to make effective proposals. This idea is generalised to moves of a reversible jump algorithm for model determination among generalised linear mixed models. Therefore, this algorithm exploits the full flexibility available in the reversible jump method. The algorithm is demonstrated via two examples and compared to existing methods.

14.
The restrictive properties of compositional data, that is multivariate data with positive parts that carry only relative information in their components, call for special care to be taken while performing standard statistical methods, for example, regression analysis. Among the special methods suitable for handling this problem is the total least squares procedure (TLS, orthogonal regression, regression with errors in variables, calibration problem), performed after an appropriate log-ratio transformation. The difficulty or even impossibility of deeper statistical analysis (confidence regions, hypotheses testing) using the standard TLS techniques can be overcome by calibration solution based on linear regression. This approach can be combined with standard statistical inference, for example, confidence and prediction regions and bounds, hypotheses testing, etc., suitable for interpretation of results. Here, we deal with the simplest TLS problem where we assume a linear relationship between two errorless measurements of the same object (substance, quantity). We propose an iterative algorithm for estimating the calibration line and also give confidence ellipses for the location of unknown errorless results of measurement. Moreover, illustrative examples from the fields of geology, geochemistry and medicine are included. It is shown that the iterative algorithm converges to the same values as those obtained using the standard TLS techniques. Fitted lines and confidence regions are presented for both original and transformed compositional data. The paper contains basic principles of linear models and addresses many related problems.

15.
The theory and properties of trend-free (TF) and nearly trend-free (NTF) block designs are well developed. Applications have been hampered because a methodology for design construction has not been available.

This article begins with a short review of concepts and properties of TF and NTF block designs. The major contribution is provision of an algorithm for the construction of linear and nearly linear TF block designs. The algorithm is incorporated in a computer program in FORTRAN 77 provided in an appendix for the IBM PC or compatible microcomputer, a program adaptable also to other computers. Three sets of block designs generated by the program are given as examples.

A numerical example of analysis of a linear trend-free balanced incomplete block design is provided.

16.
A new technique is devised to mitigate the errors-in-variables bias in linear regression. The procedure mimics a 2-stage least squares procedure where an auxiliary regression which generates a better behaved predictor variable is derived. The generated variable is then used as a substitute for the error-prone variable in the first-stage model. The performance of the algorithm is tested by simulation and regression analyses. Simulations suggest the algorithm efficiently captures the additive error term used to contaminate the artificial variables. Regressions provide further credit to the simulations as they clearly show that the compact genetic algorithm-based estimate of the true but unobserved regressor yields considerably better results. These conclusions are robust across different sample sizes and different variance structures imposed on both the measurement error and regression disturbances.

17.
Generalized linear mixed models are a widely used tool for modeling longitudinal data. However, their use is typically restricted to few covariates, because the presence of many predictors yields unstable estimates. The presented approach to the fitting of generalized linear mixed models includes an L1-penalty term that enforces variable selection and shrinkage simultaneously. A gradient ascent algorithm is proposed that maximizes the penalized log-likelihood, yielding models with reduced complexity. In contrast to common procedures it can be used in high-dimensional settings where a large number of potentially influential explanatory variables is available. The method is investigated in simulation studies and illustrated by use of real data sets.

18.
The exact distribution of a linear combination of n independent negative exponential random variables, when the coefficients of the linear combination are distinct and positive, is well known. Recently Ali and Obaidullah (1982) extended this result by taking the coefficients to be arbitrary real numbers. They used a lengthy geometrical approach to arrive at the result. This article gives a simple derivation of the result with the help of a generalized partial fraction technique. This technique also works when the variables involved are gamma variables with certain types of parameters. Results are presented in a form which can easily be programmed for computational purposes. Connections of this problem to various problems in different fields are also pointed out.
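
The classical case with distinct positive coefficients is short enough to program directly. The sketch below assumes unit-rate exponentials, which the abstract does not state, and uses the standard partial-fraction form f(y) = sum_i c_i^(n-2) / prod_{j != i} (c_i - c_j) * exp(-y / c_i) for y >= 0. The function name is hypothetical, and the arbitrary-real-coefficient extension discussed in the article is not covered.

```python
import math

def lincomb_exp_pdf(y, coeffs):
    """Density at y of sum(c_i * X_i) for independent unit-rate exponentials
    X_i and distinct positive coefficients c_i (assumed, not checked)."""
    if y < 0:
        return 0.0
    n = len(coeffs)
    total = 0.0
    for i, ci in enumerate(coeffs):
        # Partial-fraction weight of the i-th exponential term.
        denom = 1.0
        for j, cj in enumerate(coeffs):
            if j != i:
                denom *= ci - cj
        total += ci ** (n - 2) / denom * math.exp(-y / ci)
    return total
```

For two coefficients 1 and 2 the formula reduces to exp(-y/2) - exp(-y), which integrates to 1 over the positive half-line. Coefficients that are nearly equal make the weights large with alternating signs, so the sum can lose precision.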

19.
We consider local linear estimation of varying-coefficient models in which the data are observed with multiplicative distortion which depends on an observed confounding variable. First, each distortion function is estimated by nonparametrically regressing the absolute value of the contaminated variable on the confounder. Secondly, the coefficient functions are estimated by the local least squares method on the basis of the predictors of latent variables, which are obtained in terms of the estimated distorting functions. We also establish the asymptotic normality of our proposed estimators and discuss the inference about the distortion function. Simulation studies are carried out to assess the finite sample performance of the proposed estimators and a real dataset of Pima Indians diabetes is analyzed for illustration.

20.
A simple linear algebraic algorithm to generate a basis of the null space of a given integral matrix is used to present a computer algorithm which, in general, reduces the support size of a given design as in a theorem of Foody and Hedayat (Theorem 4.1, 1977) and, in particular, produces a basis for trades. The computations required by this algorithm are of polynomial order.
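
One way to generate a null-space basis of an integer matrix is exact Gauss-Jordan elimination over the rationals, sketched below. This is a generic textbook procedure offered as an assumption about what such an algorithm could look like, not the authors' algorithm, and `null_space_basis` is a hypothetical name.

```python
from fractions import Fraction

def null_space_basis(matrix):
    """Basis of the null space of an integer matrix via exact elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((k for k in range(r, len(rows)) if rows[k][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [v / pivot for v in rows[r]]
        # Clear column c in every other row.
        for k in range(len(rows)):
            if k != r and rows[k][c] != 0:
                factor = rows[k][c]
                rows[k] = [a - factor * b for a, b in zip(rows[k], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    # One basis vector per free (non-pivot) column.
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -rows[row_idx][free]
        basis.append(vec)
    return basis
```

The vectors returned are rational; multiplying each by the least common multiple of its denominators gives integer vectors, which is the form needed when the basis is interpreted as trades.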
