Similar Literature
20 similar documents retrieved (search time: 112 ms)
1.
Feed-forward neural networks, also known as multi-layer perceptrons, are now widely used for regression and classification. In parallel, but slightly earlier, a family of methods for flexible regression and discrimination was developed in multivariate statistics, and tree-induction methods have been developed in both machine learning and statistics. We expound and compare these approaches in the context of a number of examples.
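For readers who want a concrete, if much smaller, version of such a comparison, the hedged scikit-learn sketch below fits a logistic regression, a classification tree and a multi-layer perceptron on a stock dataset; the dataset and the model settings are illustrative assumptions and are not taken from the article's examples.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Three model families compared in the article, illustrated on a stock dataset
# (not one of the article's examples): a linear classifier, an induced tree,
# and a single-hidden-layer perceptron.
X, y = load_breast_cancer(return_X_y=True)
models = {
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)),
    "classification tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "multi-layer perceptron": make_pipeline(
        StandardScaler(), MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    ),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validated accuracy
    print(f"{name}: mean CV accuracy {scores.mean():.3f}")
```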

2.
Multiple regression diagnostic methods have recently been developed to help data analysts identify failures of data to adhere to the assumptions that customarily accompany regression models. However, the mathematical development of regression diagnostics has not generally led to efficient computing formulas. Conflicting terminology and the use of closely related but subtly different statistics has caused confusion. This article attempts to make regression diagnostics more readily available to those who compute regressions with packaged statistics programs. We review regression diagnostic methodology, highlighting ambiguities of terminology and relationships among similar methods. We present new formulas for efficient computing of regression diagnostics. Finally, we offer specific advice on obtaining regression diagnostics from existing statistics programs, with examples drawn from Minitab and SAS.
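As a companion to this summary, here is a minimal NumPy sketch of the standard single-fit identities behind common regression diagnostics (leverages, studentized residuals, Cook's distance). It illustrates the general methodology only; the article's own computing formulas and notation may differ.

```python
import numpy as np

def regression_diagnostics(X, y):
    """Leverage, studentized residuals and Cook's distance for OLS,
    computed from a single fit (no case-deletion loop)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)        # h_ii = x_i' (X'X)^{-1} x_i
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)                       # residual variance estimate
    # deleted ("external") variance estimate s_(i)^2, again without refitting
    s2_del = ((n - p) * s2 - resid**2 / (1 - h)) / (n - p - 1)
    r_int = resid / np.sqrt(s2 * (1 - h))              # internally studentized residuals
    r_ext = resid / np.sqrt(s2_del * (1 - h))          # externally studentized residuals
    cooks_d = r_int**2 * h / (p * (1 - h))             # Cook's distance
    return h, r_int, r_ext, cooks_d
```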

3.
We present several methods for full, partial, and practical adaptation. Selector statistics that are measures of skewness, peakedness, and tailweight are used, primarily in estimating location in some single-sample situations. We note several practical adaptive techniques in current use, including illustrations involving stepwise regression, analysis of variance, ridge regression, and splines. We suggest some areas in which future development of adaptive methods is needed: density estimation; M, R, and L estimation in regression; and dependent data. There is also a need to develop better selector statistics.
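As a toy illustration of practical adaptation, the sketch below selects a location estimator by a tailweight-type selector statistic. The selector (Pearson kurtosis) and the cutoff values are illustrative assumptions, not the selector statistics proposed in the article.

```python
import numpy as np
from scipy import stats

def adaptive_location(x, light=2.5, heavy=5.0):
    """Toy adaptive location estimator: a kurtosis-based selector statistic
    chooses among the mean, a 25% trimmed mean and the median.
    The cutoffs 'light' and 'heavy' are illustrative, not from the article."""
    k = stats.kurtosis(x, fisher=False)   # tailweight selector (Pearson kurtosis)
    if k < light:                         # light tails: the mean is efficient
        return np.mean(x)
    elif k < heavy:                       # moderate tails: trimmed mean
        return stats.trim_mean(x, 0.25)
    else:                                 # heavy tails: median
        return np.median(x)
```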

4.
An innovative algorithm is developed for obtaining spreadsheet regression measures used in computing out-of-sample statistics. The algorithm alleviates the computational complexity and memory-size problems of the leave-one-out simulation approach to these statistics. Hence, the purpose of this article is to describe a computationally enhanced algorithm that gives spreadsheet users advanced regression capabilities, thereby adding a new dimension to spreadsheet regression operations. These statistics include the diagonals of the hat matrix, legitimate forecasting intervals, and PRESS residuals. These computational innovations promote learning while eliminating spreadsheet inadequacies, thereby making spreadsheet regression attractive to academicians in teaching and to practitioners in acquiring further application competence.
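The abstract does not reproduce its formulas, but the leave-one-out statistics it lists rest on a well-known identity, e_(i) = e_i / (1 - h_ii), where h_ii is the i-th hat-matrix diagonal. The following NumPy sketch shows how PRESS residuals and a PRESS-based predicted R2 can be obtained from a single fit; it is a generic illustration, not the article's spreadsheet algorithm.

```python
import numpy as np

def press_statistics(X, y):
    """PRESS (leave-one-out) residuals from a single OLS fit: the identity
    e_(i) = e_i / (1 - h_ii) avoids refitting the model n times."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)   # diagonal of the hat matrix
    e = y - X @ (XtX_inv @ X.T @ y)               # ordinary residuals
    press_resid = e / (1 - h)                     # deleted (out-of-sample) residuals
    press = np.sum(press_resid**2)                # PRESS statistic
    predicted_r2 = 1 - press / np.sum((y - y.mean())**2)
    return press_resid, press, predicted_r2
```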

5.
This paper starts with an introduction which gives definitions of order statistics, makes a clear distinction between order statistics and rank statistics, and gives an overview of the problems in which methods based on order statistics are useful. The following sections deal with the history and role of order statistics in selected areas: central tendency, dispersion, and regression; treatment of outliers and robust estimation; maximum likelihood estimators; best linear unbiased estimators; extreme values; multiple comparisons and the studentized range. The last two sections discuss a basic coverage property and some of its consequences and describe the author's chronological annotated bibliography of order statistics. Finally, there is a list of references.

6.
This paper introduces a novel hybrid regression method (MixReg) combining two linear regression methods, ordinary least squares (OLS) and least squares ratio (LSR) regression. LSR regression finds the regression coefficients that minimize the sum of squared error rates, whereas OLS minimizes the sum of squared errors themselves. The goal of this study is to combine the two methods so that the proposed method is superior to both OLS and LSR regression in terms of the R2 statistic and the relative error rate. Applications of MixReg, on both simulated and real data, show that the MixReg method outperforms both OLS and LSR regression.
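The abstract defines LSR only as minimizing the sum of squared error rates; under that reading LSR is weighted least squares with weights 1/y_i^2 (nonzero responses assumed). The sketch below mixes it with OLS by a simple convex combination of the coefficient vectors chosen to minimize the mean squared relative error. This combination rule is an assumption for illustration; the article's actual MixReg rule may differ.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def lsr(X, y):
    """Least squares ratio: minimise sum_i ((y_i - x_i'b)/y_i)^2, i.e.
    weighted least squares with weights 1/y_i^2 (y_i must be nonzero)."""
    w = 1.0 / y
    return np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]

def mixreg(X, y, grid=np.linspace(0.0, 1.0, 101)):
    """Illustrative MixReg: convex combination of the OLS and LSR coefficient
    vectors that minimises the mean squared relative error on the data."""
    b_ols, b_lsr = ols(X, y), lsr(X, y)
    def rel_err(a):
        b = a * b_ols + (1 - a) * b_lsr
        return np.mean(((y - X @ b) / y) ** 2)
    a_best = min(grid, key=rel_err)
    return a_best * b_ols + (1 - a_best) * b_lsr
```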

7.
Under an assumption that missing values occur randomly in a matrix, formulae are developed for the expected value and variance of six statistics that summarize the number and location of the missing values. For a seventh statistic, a regression model based on simulated data yields an estimate of the expected value. The results can be used in the development of methods to control the Type I error and approximate power and sample size for multilevel and longitudinal studies with missing data.

8.
Influence measures in multivariate regression analysis have been widely developed, especially through use of the case-deletion approach. However, there seem to be few accounts of the influence of observations on test statistics in hypothesis testing. This paper examines four common multivariate tests, namely Wilks' ratio, the Lawley-Hotelling trace, Pillai's trace and Roy's greatest root, for testing a general linear hypothesis on the regression coefficients in multivariate regression. The influence of observations is measured using the case-deletion approach. The proposed diagnostic measures, except that of Roy's greatest root, can be expressed in terms of statistics without involving the actual deletion of observations. An illustrative example is given with satisfactory results.
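The paper's closed-form, deletion-free expressions are not given in the abstract. The brute-force sketch below shows the case-deletion idea directly for Wilks' ratio, here specialised to the hypothesis that all non-intercept coefficients are zero; it is the kind of baseline such closed forms can be checked against, not the article's formulas.

```python
import numpy as np

def wilks_lambda(X, Y):
    """Wilks' ratio Lambda = |E_full| / |E_reduced| for testing that all
    non-intercept coefficients are zero in Y = XB + U (intercept assumed
    to be the first column of X)."""
    def resid_sscp(M):
        B = np.linalg.lstsq(M, Y, rcond=None)[0]
        R = Y - M @ B
        return R.T @ R
    return np.linalg.det(resid_sscp(X)) / np.linalg.det(resid_sscp(X[:, :1]))

def case_deletion_influence(X, Y):
    """Change in Wilks' ratio when each case is deleted in turn (brute force)."""
    lam = wilks_lambda(X, Y)
    n = X.shape[0]
    return np.array([wilks_lambda(np.delete(X, i, 0), np.delete(Y, i, 0)) - lam
                     for i in range(n)])
```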

9.

Finite sample properties of ML and REML estimators in time series regression models with fractional ARIMA noise are examined. In particular, theoretical approximations for the bias of ML and REML estimators of the noise parameters are developed and their accuracy is assessed through simulations. The impact of noise parameter estimation on the performance of t-statistics and likelihood ratio statistics for testing regression parameters is also investigated.

10.
A ratio-correlation (multiple regression) approach for estimating key economic statistics for small areas is proposed and compared with several synthetic estimation methods using various measures of performance. Using published data that are easily accessible, the methods provide a means of estimating, with a reasonable amount of error, statistics that are not currently published.

11.
Recursive and en-bloc approaches to signal extraction (total citations: 1; self-citations: 0; other citations: 1)
In the literature on unobservable component models, three main statistical instruments have been used for signal extraction: fixed interval smoothing (FIS), which derives from Kalman's seminal work on optimal state-space filter theory in the time domain; Wiener-Kolmogorov-Whittle optimal signal extraction (OSE) theory, which is normally set in the frequency domain and dominates the field of classical statistics; and regularization, which was developed mainly by numerical analysts but is referred to as 'smoothing' in the statistical literature (such as smoothing splines, kernel smoothers and local regression). Although some minor recognition of the interrelationship between these methods can be discerned from the literature, no clear discussion of their equivalence has appeared. This paper exposes clearly the interrelationships between the three methods; highlights important properties of the smoothing filters used in signal extraction; and stresses the advantages of the FIS algorithms as a practical solution to signal extraction and smoothing problems. It also emphasizes the importance of the classical OSE theory as an analytical tool for obtaining a better understanding of the problem of signal extraction.
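A minimal sketch of the third instrument, regularization, assuming a second-difference (smoothing-spline / Hodrick-Prescott style) penalty: the trend s minimises ||y - s||^2 + lam * ||D2 s||^2 and has the closed form s = (I + lam * D2'D2)^{-1} y. Under an integrated random-walk trend model a fixed-interval smoother returns the same estimate, which is the kind of equivalence the paper formalises; the penalty choice and lam value here are illustrative.

```python
import numpy as np

def regularised_trend(y, lam=1600.0):
    """Signal extraction by regularisation (penalised least squares) with a
    second-difference penalty. lam = 1600 is the conventional quarterly
    smoothing constant, used here purely as an illustrative default."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)        # (n-2) x n second-difference operator
    return np.linalg.solve(np.eye(n) + lam * D2.T @ D2, y)
```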

12.
It is suggested that inference under the proportional hazards model can be carried out by programs for exact inference under the logistic regression model. The advantages of such inference are that software is available and that multivariate models can be addressed. The method has been evaluated by means of coverage and power calculations in certain situations. In all situations coverage was above the nominal level, but on the other hand rather conservative. A different type of exact inference is developed under Type II censoring. Inference was then less conservative; however, there are limitations with respect to the censoring mechanism and multivariate generalizations, and software is not available. This method also requires extensive computational power. The performance of large-sample Wald, score and likelihood inference was also considered. Large-sample methods work remarkably well with small data sets, but inference by score statistics seems to be the best choice. There seem to be some problems with likelihood ratio inference that may originate from how this method handles infinite estimates of the regression parameter. Inference by Wald statistics can be quite conservative with very small data sets.

13.
ADE-4: a multivariate analysis and graphical display software (total citations: 59; self-citations: 0; other citations: 59)
We present ADE-4, a multivariate analysis and graphical display software. Multivariate analysis methods available in ADE-4 include usual one-table methods like principal component analysis and correspondence analysis, spatial data analysis methods (using a total variance decomposition into local and global components, analogous to Moran and Geary indices), discriminant analysis and within/between groups analyses, many linear regression methods including lowess and polynomial regression, multiple and PLS (partial least squares) regression and orthogonal regression (principal component regression), projection methods like principal component analysis on instrumental variables, canonical correspondence analysis and many other variants, coinertia analysis and the RLQ method, and several three-way table (k-table) analysis methods. Graphical display techniques include an automatic collection of elementary graphics corresponding to groups of rows or to columns in the data table, thus providing a very efficient way for automatic k-table graphics and geographical mapping options. A dynamic graphic module allows interactive operations like searching, zooming, selection of points, and display of data values on factor maps. The user interface is simple and homogeneous among all the programs; this contributes to making the use of ADE-4 very easy for non-specialists in statistics, data analysis or computer science.

14.
15.
This paper extends an analysis of variance for categorical data (CATANOVA) procedure to multidimensional contingency tables involving several factors and a response variable measured on a nominal scale. Using an appropriate measure of total variation for multinomial data, partial and multiple association measures are developed as R2 quantities which parallel the analogous statistics in multiple linear regression for quantitative data. In addition, test statistics are derived in terms of these R2 criteria. Finally, this CATANOVA approach is illustrated within the context of a three-way contingency table from a multicenter clinical trial.

16.
Two methods are suggested for generating R2 measures for a wide class of models. These measures are linked to the R2 of the standard linear regression model through Wald and likelihood ratio statistics for testing the joint significance of the explanatory variables. Some currently used R2's are shown to be special cases of these methods.
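One common construction of this kind, consistent with the abstract's description although the article's exact definitions may differ, maps the joint-significance statistics to the unit interval so that the Gaussian linear model recovers the classical R2 (since there LR = n log(SST/SSE) and the Wald statistic is approximately n R2/(1 - R2)).

```python
import numpy as np

def r2_from_lr(lr_stat, n):
    """Likelihood-ratio-based R^2: 1 - exp(-LR/n). In the Gaussian linear
    model this equals the ordinary coefficient of determination."""
    return 1.0 - np.exp(-lr_stat / n)

def r2_from_wald(wald_stat, n):
    """Wald-based R^2: W / (W + n), which again reduces to the classical R^2
    in the standard linear regression model."""
    return wald_stat / (wald_stat + n)
```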

17.
18.
In a cocaine dependence treatment study, we use linear and nonlinear regression models to model posttreatment cocaine craving scores and first cocaine relapse time. A subset of the covariates are summary statistics derived from baseline daily cocaine use trajectories, such as baseline cocaine use frequency and average daily use amount. These summary statistics are subject to estimation error and can therefore cause biased estimators for the regression coefficients. Unlike classical measurement error problems, the error we encounter here is heteroscedastic with an unknown distribution, and there are no replicates for the error-prone variables or instrumental variables. We propose two robust methods to correct for the bias: a computationally efficient method-of-moments-based method for linear regression models and a subsampling extrapolation method that is generally applicable to both linear and nonlinear regression models. Simulations and an application to the cocaine dependence treatment data are used to illustrate the efficacy of the proposed methods. Asymptotic theory and variance estimation for the proposed subsampling extrapolation method and some additional simulation results are described in the online supplementary material.

19.
To study the relationship between a sensitive binary response variable and a set of non-sensitive covariates, this paper develops a hidden logistic regression to analyse non-randomized response data collected via the parallel model originally proposed by Tian (2014). This is the first paper to employ logistic regression analysis in the field of non-randomized response techniques. Both the Newton-Raphson algorithm and a monotone quadratic lower bound algorithm are developed to derive the maximum likelihood estimates of the parameters of interest. In particular, the proposed logistic parallel model can be used to study the association between a sensitive binary variable and another non-sensitive binary variable via the measure of odds ratio. Simulations are performed and a study on people's sexual practice data in the United States is used to illustrate the proposed methods.

20.
Importance sampling and control variates have been used as variance reduction techniques for estimating bootstrap tail quantiles and moments, respectively. We adapt each method to apply to both quantiles and moments, and combine the methods to obtain variance reductions by factors of 4 to 30 in simulation examples. We use two innovations in control variates: interpreting control variates as a re-weighting method, and implementing control variates using the saddlepoint; the combination requires only the linear saddlepoint but applies to general statistics, and produces estimates with accuracy of order n^(-1/2) B^(-1), where n is the sample size and B is the bootstrap sample size. We discuss two modifications to classical importance sampling: a weighted-average estimate and a mixture design distribution. These modifications make importance sampling robust and allow moments to be estimated from the same bootstrap simulation used to estimate quantiles.
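The sketch below illustrates only the basic importance-sampling ingredient: estimating a bootstrap tail probability for the sample mean by resampling from an exponentially tilted distribution over the data points and re-weighting by the multinomial likelihood ratio. The tilt parameter and the choice of statistic are illustrative assumptions; the paper's weighted-average estimate, mixture design and control-variate refinements are not implemented here.

```python
import numpy as np

def bootstrap_tail_prob_is(x, t, tilt=0.5, B=2000, seed=None):
    """Importance-sampled bootstrap estimate of P(mean(X*) >= t).
    Resamples are drawn with probabilities proportional to exp(tilt * x_i)
    and re-weighted back to the uniform bootstrap distribution."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    q = np.exp(tilt * (x - x.mean()))
    q /= q.sum()                                        # tilted resampling probabilities
    estimates = []
    for _ in range(B):
        idx = rng.choice(n, size=n, p=q)                # tilted bootstrap resample
        counts = np.bincount(idx, minlength=n)
        logw = counts @ (np.log(1.0 / n) - np.log(q))   # uniform / tilted likelihood ratio
        estimates.append((np.mean(x[idx]) >= t) * np.exp(logw))
    return float(np.mean(estimates))
```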

