Similar Documents
20 similar documents found.
1.
We propose new dependence measures for two real random variables not necessarily linearly related. Covariance and linear correlation are expressed in terms of principal components and are generalized for variables distributed along a curve. Properties of these measures are discussed. The new measures are estimated using principal curves and are computed for simulated and real data sets. Finally, we present several statistical applications for the new dependence measures.
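
A minimal sketch of the principal-component view of correlation that this paper generalizes to principal curves: for standardized bivariate data, |corr| equals (λ1 − λ2)/(λ1 + λ2) for the covariance eigenvalues. The curve-fitting step itself is not reproduced, and all names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 5000, 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])
xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)

z = (xy - xy.mean(axis=0)) / xy.std(axis=0)      # standardize both variables
lam = np.linalg.eigvalsh(np.cov(z.T))[::-1]      # eigenvalues, descending
pc_corr = (lam[0] - lam[1]) / (lam[0] + lam[1])  # equals |rho| in theory

print(pc_corr, abs(np.corrcoef(xy.T)[0, 1]))     # both close to 0.6
```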

2.
We consider local linear estimation of varying-coefficient models in which the data are observed with multiplicative distortion that depends on an observed confounding variable. First, each distortion function is estimated by nonparametrically regressing the absolute value of the contaminated variable on the confounder. Second, the coefficient functions are estimated by the local least squares method, based on predictors of the latent variables obtained from the estimated distortion functions. We also establish the asymptotic normality of the proposed estimators and discuss inference about the distortion function. Simulation studies are carried out to assess the finite-sample performance of the proposed estimators, and a real dataset on Pima Indian diabetes is analyzed for illustration.
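
A rough sketch of the first step described above, assuming a distortion with mean one: the distorted variable is divided by a Nadaraya-Watson estimate of E[|contaminated| | confounder], normalized to mean one. The varying-coefficient fit is not reproduced; function and variable names are illustrative.

```python
import numpy as np

def nw_smooth(u, y, h):
    """Nadaraya-Watson regression of y on u, evaluated at the observed u."""
    w = np.exp(-0.5 * ((u[:, None] - u[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(1)
n = 2000
u = rng.uniform(0, 1, n)                 # observed confounder
x = rng.normal(2.0, 1.0, n)              # latent predictor
psi = 1.0 + 0.5 * (u - u.mean())         # distortion function, E[psi(U)] = 1
x_tilde = psi * x                        # observed, contaminated predictor

phi_hat = nw_smooth(u, np.abs(x_tilde), h=0.05)   # estimates E[|X~| | U]
psi_hat = phi_hat / phi_hat.mean()                # normalize to mean one
x_hat = x_tilde / psi_hat                         # calibrated predictor

print(np.corrcoef(x_tilde, x)[0, 1],     # contaminated vs latent
      np.corrcoef(x_hat, x)[0, 1])       # adjusted: much closer to 1
```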

3.
A bivariate distribution whose marginal distributions are truncated Poisson distributions is developed as a product of truncated Poisson distributions and a multiplicative factor. The multiplicative factor accounts for the correlation, either positive or negative, between the two random variables. The distributional properties of this model are studied and the model is fitted to real-life bivariate data.
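
The abstract does not give the exact form of the multiplicative factor, so the sketch below uses a Sarmanov-type factor 1 + λ(e^(−x) − c1)(e^(−y) − c2), a standard construction of this kind that preserves the marginals and yields positive or negative correlation via the sign of λ. Treat it as an illustrative assumption, not the paper's model.

```python
import numpy as np
from scipy.stats import poisson

def zt_poisson_pmf(k, mu):
    """Zero-truncated Poisson pmf on k = 1, 2, ..."""
    return poisson.pmf(k, mu) / (1.0 - np.exp(-mu))

mu1, mu2, lam = 2.0, 3.0, -0.5
ks = np.arange(1, 60)
p1, p2 = zt_poisson_pmf(ks, mu1), zt_poisson_pmf(ks, mu2)
c1, c2 = (p1 * np.exp(-ks)).sum(), (p2 * np.exp(-ks)).sum()  # E[e^-X], E[e^-Y]

# Joint pmf: product of the marginals times the correlation-inducing factor.
joint = np.outer(p1, p2) * (1.0 + lam * np.outer(np.exp(-ks) - c1,
                                                 np.exp(-ks) - c2))
print(joint.sum())                                   # ~1: a valid joint pmf
ex, ey = (ks * p1).sum(), (ks * p2).sum()
exy = (np.outer(ks, ks) * joint).sum()
print(exy - ex * ey)                                 # negative covariance here
```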

4.
In an epidemiological study the regression slope between a response and predictor variable is underestimated when the predictor variable is measured imprecisely. Repeat measurements of the predictor in individuals in a subset of the study, or in a separate study, can be used to estimate a multiplicative factor to correct for this 'regression dilution bias'. In applied statistics publications, various methods have been used to estimate this correction factor. Here we compare six different estimation methods and explain how they fall into two categories, namely regression and correlation-based methods. We provide new asymptotic variance formulae for the optimal correction factors in each category, when these are estimated from the repeat measurements subset alone, and show analytically and by simulation that the correlation method of choice gives uniformly lower variance. The simulations also show that, when the correction factor is not much greater than 1, this correlation method gives a correction factor which is closer to the true value than that from the best regression method on up to 80% of occasions. We also provide a variance formula for a modified correlation method which uses the standard deviation of the predictor variable in the main study; this shows further improved performance provided that the correction factor is not too extreme. A confidence interval for a corrected regression slope in an epidemiological study should reflect the imprecision of both the uncorrected slope and the estimated correction factor. We provide formulae for this and show that, particularly when the correction factor is large and the size of the subset of repeat measurements is small, the effect of allowing for imprecision in the estimated correction factor can be substantial.
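
A minimal sketch of regression dilution and its correlation-based correction, assuming the classical measurement-error model X_obs = X + e: the observed slope is attenuated by the reliability ratio, which the correlation between two repeat measurements estimates. Names and settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta = 5000, 1.0
x = rng.normal(0, 1, n)                       # true predictor
y = beta * x + rng.normal(0, 1, n)            # response
sigma_e = 0.7
x1 = x + rng.normal(0, sigma_e, n)            # first measurement
x2 = x + rng.normal(0, sigma_e, n)            # repeat measurement

naive_slope = np.cov(x1, y)[0, 1] / np.var(x1, ddof=1)
# Correlation-based estimate of the reliability ratio from the repeats:
lam_hat = np.corrcoef(x1, x2)[0, 1]
print(naive_slope, naive_slope / lam_hat)     # attenuated vs corrected (~1.0)
```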

5.
In this paper, we derive explicit expressions for marginal and product moments of a bivariate lognormal distribution when a multiplicative constraint is present. We show that the coefficients of variation always decrease regardless of the multiplicative constraint imposed. We also evaluate the effects of the constraint on the variances and covariance, and present conditions under which the correlation coefficient increases under the presence of such a multiplicative constraint. We finally apply these results to futures hedging analysis and some other financial applications.
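
A Monte Carlo check of the standard bivariate lognormal product-moment formula E[X^a Y^b] = exp(a·m1 + b·m2 + (a²s1² + 2ab·r·s1·s2 + b²s2²)/2), the building block for derivations like those above; the constraint analysis itself is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(3)
m1, m2, s1, s2, r = 0.1, 0.2, 0.3, 0.4, 0.5
cov = [[s1**2, r*s1*s2], [r*s1*s2, s2**2]]
z = rng.multivariate_normal([m1, m2], cov, size=1_000_000)
x, y = np.exp(z[:, 0]), np.exp(z[:, 1])

a, b = 1, 1
theory = np.exp(a*m1 + b*m2 + 0.5*(a**2*s1**2 + 2*a*b*r*s1*s2 + b**2*s2**2))
print(theory, (x**a * y**b).mean())           # should agree closely
```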

6.
We consider varying coefficient models, which extend the classical linear regression models in the sense that the regression coefficients are replaced by functions of certain variables (for example, time); the covariates are also allowed to depend on other variables. Varying coefficient models are popular in longitudinal data and panel data studies, and have been applied in fields such as finance and health sciences. We consider longitudinal data and estimate the coefficient functions by the flexible B-spline technique. An important question in a varying coefficient model is whether an estimated coefficient function is statistically different from a constant (or zero). We develop testing procedures based on the estimated B-spline coefficients by making use of nice properties of a B-spline basis. Our method allows longitudinal data where repeated measurements for an individual can be correlated. We obtain the asymptotic null distribution of the test statistic. The power of the proposed testing procedures is illustrated on simulated data, where we highlight the importance of including the correlation structure of the response variable, and on real data.
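
A rough sketch of testing whether a coefficient function is constant, using a B-spline fit and an F-test against the constant-coefficient model. This ignores the within-subject correlation that the paper's test accounts for; it only shows the mechanics, and all settings are illustrative.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.stats import f as f_dist

rng = np.random.default_rng(4)
n = 400
t = np.sort(rng.uniform(0, 1, n))             # index variable (e.g. time)
x = rng.normal(0, 1, n)                       # covariate
y = (1.0 + 0.8 * np.sin(2 * np.pi * t)) * x + rng.normal(0, 0.5, n)

knots = np.r_[[0.0] * 4, np.linspace(0.2, 0.8, 4), [1.0] * 4]
B = BSpline.design_matrix(t, knots, 3).toarray()   # n x 8 cubic basis matrix
k = B.shape[1]

D_full = B * x[:, None]                       # varying-coefficient design
D_null = x[:, None]                           # constant-coefficient design
rss = lambda D: np.sum((y - D @ np.linalg.lstsq(D, y, rcond=None)[0]) ** 2)
F = ((rss(D_null) - rss(D_full)) / (k - 1)) / (rss(D_full) / (n - k))
print(F, f_dist.sf(F, k - 1, n - k))          # large F, tiny p-value here
```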

7.
We propose serial correlation-robust asymptotic confidence bands for the receiver operating characteristic (ROC) curve and its functional, the area under the ROC curve (AUC), estimated by quasi-maximum likelihood in the binormal model. Our simulation experiments confirm that this new method performs fairly well in finite samples and confers an additional measure of robustness to nonnormality. The conventional procedure is found to be markedly undersized, yielding empirical coverage probabilities lower than the nominal level, especially when the serial correlation is strong. An example from macroeconomic forecasting demonstrates the importance of accounting for serial correlation when probability forecasts for real GDP declines are evaluated using ROC. Supplementary materials for this article are available online.
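
The binormal model behind such estimates: scores are N(mu0, s0²) for non-events and N(mu1, s1²) for events, giving the closed form AUC = Φ((mu1 − mu0)/√(s0² + s1²)). A minimal sketch; the serial-correlation-robust confidence bands are not reproduced.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
mu0, s0, mu1, s1 = 0.0, 1.0, 1.2, 1.5
x0 = rng.normal(mu0, s0, 5000)                # non-event scores
x1 = rng.normal(mu1, s1, 5000)                # event scores

auc_theory = norm.cdf((mu1 - mu0) / np.hypot(s0, s1))
auc_empirical = (x1[:, None] > x0[None, :]).mean()   # Mann-Whitney estimate
print(auc_theory, auc_empirical)              # both around 0.75 here
```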

8.
We consider whether one should transform to estimate nonparametrically a regression curve sampled from data with a constant coefficient of variation, i.e. with multiplicative errors. Kernel-based smoothing methods are used to provide curve estimates from the data both in the original units and after transformation. Comparisons are based on the mean-squared error (MSE) or mean integrated squared error (MISE), calculated in the original units. Even when the data are generated by the simplest multiplicative error model, the asymptotically optimal MSE (or MISE) is, surprisingly, not always obtained by smoothing transformed data, but in many cases by directly smoothing the original data. Which method is optimal depends on both the regression curve and the distribution of the errors. Data-based procedures which could be useful in choosing between transforming and not transforming a particular data set are discussed. The results are illustrated on simulated and real data.
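
A sketch comparing kernel smoothing in the original units against smoothing log-transformed data under a multiplicative model y = m(x)·e with E[e] = 1, with MSE computed in the original units in both cases. The naive back-transform exp(smooth(log y)) targets exp(E[log y | x]) rather than E[y | x], one source of the trade-off discussed above; settings are illustrative.

```python
import numpy as np

def nw(x, y, h):
    """Nadaraya-Watson smoother evaluated at the design points."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(6)
n = 1000
x = np.sort(rng.uniform(0, 1, n))
m = 1.0 + np.sin(2 * np.pi * x) ** 2           # positive regression curve
e = np.exp(rng.normal(-0.08, 0.4, n))          # lognormal error with E[e] = 1
y = m * e

fit_raw = nw(x, y, h=0.05)                     # smooth in original units
fit_log = np.exp(nw(x, np.log(y), h=0.05))     # transform, smooth, back-transform

# MSE in the original units; fit_log is biased toward m * exp(-0.08).
print(np.mean((fit_raw - m) ** 2), np.mean((fit_log - m) ** 2))
```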

9.
The logistic sigmoid curve is widely used in nonlinear regression and in binary response modeling. Some problems exhibit double-sigmoid behavior: a first rise to early saturation at an intermediate level, followed by a second sigmoid rise to the eventual saturation plateau. Double-sigmoid behavior is usually achieved using additive or multiplicative combinations of logit and more complicated functions with numerous parameters. In this work, double-sigmoid functions are constructed as logistic ones with a sign defining the point of inflection and with an additional powering parameter. The resulting models describe rather complicated double-saturation behavior via only four or five parameters, which can be efficiently estimated by nonlinear optimization techniques. Theoretical features and practical applications of the models are discussed.
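
The abstract does not state the functional form, so the curve below is only one plausible reading of "a logistic with a sign defining the point of inflection and an additional powering parameter": the sign(x − c)·|x − c|^p argument flattens the curve around c for p > 1, producing an intermediate plateau between two sigmoid rises. Treat it as an illustrative guess, not the paper's model.

```python
import numpy as np

def double_sigmoid(x, b=1.0, c=5.0, p=2.0):
    """Logistic in sign(x - c) * |x - c|^p: plateaus near x = c when p > 1."""
    return 1.0 / (1.0 + np.exp(-b * np.sign(x - c) * np.abs(x - c) ** p))

x = np.linspace(0, 10, 11)
# Rises, flattens at the intermediate level 0.5 around x = 5, then rises again:
print(np.round(double_sigmoid(x), 3))
```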

10.
We used two statistical methods to identify prognostic factors: a log-linear model (logistic and Cox regression, based on the notions of linearity and multiplicative relative risk), and the CORICO method (ICOnography of CORrelations), based on the geometric significance of the correlation coefficient. We applied the methods to two different situations (a 'case-control study' and a 'historical cohort'). We show that the geometric exploratory tool is particularly suited to the analysis of small samples with a large number of variables. It could save time when setting up new study protocols. In this instance, the geometric approach highlighted, without preconceived ideas, the potential role of multihormonality in the course of pituitary adenoma and the unexpected influence of the date of tumour excision on the risk attached to haemorrhage.

11.
Regression methods for common data types such as measured, count and categorical variables are well understood, but increasingly statisticians need ways to model relationships between variable types such as shapes, curves, trees, correlation matrices and images that do not fit into the standard framework. Data types that lie in metric spaces but not in vector spaces are difficult to use within the usual regression setting, either as the response and/or a predictor. We represent the information in these variables using distance matrices, which requires only the specification of a distance function. A low-dimensional representation of such distance matrices can be obtained using methods such as multidimensional scaling. Once these variables have been represented as scores, an internal model linking the predictors and the responses can be developed using standard methods. We call the transformation from a new observation to a score 'scoring', whereas 'backscoring' is a method to represent a score as an observation in the data space. Both methods are essential for prediction and explanation. We illustrate the methodology for shape data, unregistered curve data and correlation matrices using motion capture data from an experiment to study the motion of children with cleft lip.
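
A sketch of the scoring step: represent non-vector objects by a distance matrix, embed it with classical multidimensional scaling, and feed the resulting scores into a standard regression. The shape/curve distances and the backscoring step are not reproduced; names are illustrative.

```python
import numpy as np

def classical_mds(D, k):
    """Classical MDS: double-center the squared distances, then eigendecompose."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    lam, V = np.linalg.eigh(B)
    idx = np.argsort(lam)[::-1][:k]
    return V[:, idx] * np.sqrt(np.maximum(lam[idx], 0.0))

rng = np.random.default_rng(7)
n = 100
objects = rng.normal(size=(n, 5))              # stand-in for shapes/curves
D = np.linalg.norm(objects[:, None] - objects[None, :], axis=-1)

scores = classical_mds(D, k=2)                 # low-dimensional representation
y = objects @ np.array([1.0, -1.0, 0.5, 0.0, 0.0]) + rng.normal(0, 0.1, n)
X = np.column_stack([np.ones(n), scores])      # internal model on the scores
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)
```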

12.
The aim of this study is to investigate the impact of correlation structure, prevalence and effect size on risk prediction models by using the change in the area under the receiver operating characteristic curve (ΔAUC), net reclassification improvement (NRI), and integrated discrimination improvement (IDI). In a simulation study, datasets are generated under different correlation structures, prevalences and effect sizes. We verify the simulation results with a real-data application. In conclusion, the correlation structure between the variables should be taken into account when composing a multivariable model; a negative correlation structure between independent variables is more beneficial when constructing a model.
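
A sketch of the continuous (category-free) NRI and the IDI for comparing a baseline risk model with an extended one, using their standard formulas on a toy logistic data-generating process; variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n = 5000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
p = 1 / (1 + np.exp(-(-1.0 + 1.0 * x1 + 0.8 * x2)))
d = rng.binomial(1, p)                         # event indicator

p_old = LogisticRegression().fit(x1[:, None], d).predict_proba(x1[:, None])[:, 1]
p_new = LogisticRegression().fit(np.c_[x1, x2], d).predict_proba(np.c_[x1, x2])[:, 1]

up = p_new > p_old                             # risk reclassified upward
ev, ne = d == 1, d == 0
nri = (up[ev].mean() - (~up)[ev].mean()) + ((~up)[ne].mean() - up[ne].mean())
idi = (p_new[ev].mean() - p_old[ev].mean()) - (p_new[ne].mean() - p_old[ne].mean())
print(nri, idi)                                # both positive: x2 adds value
```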

13.
Statements that are inherently multiplicative have historically been justified using ratios of random variables. Although recent work on ratios has extended the classical theory to produce confidence bounds conditioned on a positive denominator, this article offers a novel perspective that eliminates the need for such a condition. Although seemingly trivial, this new perspective leads to improved lower confidence bounds to support multiplicative statements. This perspective is also more satisfying, as it allows comparisons that are inherently multiplicative in nature to be properly analyzed as such.

14.
In the context of spatial linear regression, we discuss detection of a jump location curve, treated as a threshold curve that cannot be expressed through the independent variables but indirectly determines two specific model forms. The threshold curve in this paper is described by a straight line in two location variables, longitude and latitude, and can be estimated by maximizing the coefficient difference between two one-sided linear regression models. Theoretical results show that the estimator is consistent. Numerical studies show that our method performs well.
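
A sketch of the estimation idea: for candidate straight lines in (longitude, latitude), fit one-sided regressions on each side and pick the line maximizing the coefficient difference. A crude grid search stands in for the paper's procedure; all names and settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 2000
lon, lat = rng.uniform(-1, 1, n), rng.uniform(-1, 1, n)
x = rng.normal(size=n)
side = lat - 0.5 * lon - 0.2 > 0               # true threshold line
y = np.where(side, 2.0, -1.0) * x + rng.normal(0, 0.5, n)

def slope(mask):
    """No-intercept least-squares slope of y on x over the masked points."""
    return np.sum(x[mask] * y[mask]) / np.sum(x[mask] ** 2)

def score(a, b):
    s = lat - a * lon - b > 0                  # candidate one-sided split
    if min(s.sum(), (~s).sum()) < 50:          # guard against tiny sides
        return -np.inf
    return abs(slope(s) - slope(~s))           # coefficient difference

grid = np.linspace(-1, 1, 41)
best = max(((a, b) for a in grid for b in grid), key=lambda ab: score(*ab))
print(best)                                    # near the true (0.5, 0.2)
```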

15.
The aggregated worths of the alternatives, when compared with respect to several criteria, are estimated in a hierarchical comparisons model introduced by Saaty (1980). A multiplicative model is used for the paired comparisons data, which are collected on a ratio scale at any level of the hierarchy in this set-up. An iterative scheme is found for the maximum likelihood estimation of the worth parameters in this multiplicative model, and the iterates are shown to converge monotonically to the estimates. We also obtain the asymptotic dispersion matrix of the maximum likelihood estimates of the relative worths of the alternatives according to a single criterion, as well as those according to the overall suitability when compared under several criteria. A numerical example is presented to illustrate the method developed in this paper. Simulation techniques are employed to find the average number of iterations required for convergence of the above iterative scheme.
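
The paper's ML iteration is not reproduced here; instead, this sketch fits the multiplicative paired-comparison model r_ij ≈ w_i/w_j by the row-geometric-mean (log least squares) estimator, a standard alternative for Saaty-type ratio-scale comparison matrices. Names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(10)
w_true = np.array([0.40, 0.30, 0.20, 0.10])    # true worths, summing to one
R = np.outer(w_true, 1 / w_true)               # ideal ratio comparisons
R *= np.exp(rng.normal(0, 0.1, R.shape))       # multiplicative judgement error
R = np.sqrt(R / R.T)                           # enforce reciprocity r_ji = 1/r_ij

w_hat = np.exp(np.log(R).mean(axis=1))         # row geometric means
w_hat /= w_hat.sum()                           # normalize like the true worths
print(w_true, w_hat)                           # estimates close to the truth
```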

16.
There is often more structure in the way two random variables are associated than a single scalar dependence measure, such as correlation, can reflect. Local dependence functions such as that of Holland and Wang (1987) are, therefore, useful. However, it can be argued that estimated local dependence functions convey information that is too detailed to be easily interpretable. We seek to remedy this difficulty, and hence make local dependence a more readily interpretable practical tool, by introducing dependence maps. Via local permutation testing, dependence maps simplify the estimated local dependence structure between two variables by identifying regions of (significant) positive, (not significant) zero and (significant) negative local dependence. When viewed in conjunction with an estimate of the joint density, a comprehensive picture of the joint behaviour of the variables is provided. A little theory, many implementational details and several examples are given.
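
A sketch of the Holland-Wang local dependence function, the mixed partial derivative of the log joint density, estimated here with a Gaussian KDE and finite differences on a grid. The permutation-based significance map is not reproduced; this only shows the quantity being mapped.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(11)
xy = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=2000)
kde = gaussian_kde(xy.T)

g = np.linspace(-2, 2, 41)
X, Y = np.meshgrid(g, g, indexing="ij")
logf = np.log(kde(np.vstack([X.ravel(), Y.ravel()]))).reshape(X.shape)

h = g[1] - g[0]
gamma = np.gradient(np.gradient(logf, h, axis=0), h, axis=1)  # d2 logf / dx dy
# For a unit-variance bivariate normal this is constant, rho/(1-rho^2) ~ 0.94;
# the KDE estimate at the centre should be in that ballpark.
print(gamma[20, 20])
```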

17.
Fitting multiplicative models by robust alternating regressions
In this paper a robust approach for fitting multiplicative models is presented. The focus is on the factor analysis model, where we estimate factor loadings and scores by a robust alternating regression algorithm. The approach is highly robust and also works well when there are more variables than observations. The technique yields a robust biplot, depicting the interaction structure between individuals and variables. This biplot is not distorted by outliers, which can instead be retrieved from the residual plot. Also provided is an accompanying robust R²-plot to determine the appropriate number of factors. The approach is illustrated by real and artificial examples and compared with factor analysis based on robust covariance matrix estimators. The same estimation technique can fit models with both additive and multiplicative effects (FANOVA models) to two-way tables, thereby extending the median polish technique.
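
A bare-bones sketch of robust alternating regressions for a one-factor model X ≈ f l': holding the scores fixed, each loading is a robust (Huber) regression of a column of X on the scores, and vice versa. This is a stand-in for the paper's algorithm; names, scalings and iteration counts are illustrative.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor

rng = np.random.default_rng(12)
n, p = 100, 6
f_true = rng.normal(size=n)
l_true = rng.normal(size=p)
X = np.outer(f_true, l_true) + 0.1 * rng.normal(size=(n, p))
X[:5, 0] += 10.0                               # a few gross outliers

f = X[:, 0].copy()                             # crude starting scores
for _ in range(20):
    l = np.array([HuberRegressor(fit_intercept=False)
                  .fit(f[:, None], X[:, j]).coef_[0] for j in range(p)])
    f = np.array([HuberRegressor(fit_intercept=False)
                  .fit(l[:, None], X[i, :]).coef_[0] for i in range(n)])
    f /= np.linalg.norm(f) / np.sqrt(n)        # fix the scale indeterminacy

print(abs(np.corrcoef(f, f_true)[0, 1]))       # near 1 despite the outliers
```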

18.
We compare the accuracy of five approaches for contour detection in speckled imagery. Images obtained with coherent illumination are affected by a noise called speckle, which is inherent to the imaging process. Such data have been statistically described by a multiplicative model based on the G0 distribution, under which regions with different degrees of roughness can be characterized by the value of a parameter; we use this information to find boundaries between regions with different textures. All five strategies employ active contours using B-spline curves, and some take advantage of the statistical properties of the speckled data: three are based on the data (maximum discontinuity on raw data, fractal dimension and maximum likelihood) and two on estimates of the roughness parameter (maximum discontinuity and anisotropic smoothed roughness estimates). In order to compare these strategies, a Monte Carlo experiment was performed to assess the accuracy of fitting a curve to a region. The probability of finding the correct edge with less than a specified error is estimated and used to compare the techniques. The two best procedures are then compared in terms of their computational cost and, finally, we show that the maximum likelihood approach on the raw data using the G0 law is the best technique.
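
A sketch of the multiplicative speckle model commonly used to motivate the G0 distribution: intensity = texture × speckle, with unit-mean gamma speckle and inverse-gamma texture, where α closer to zero corresponds to rougher regions. Parameter names follow the usual (α, γ, L) convention, but treat the details as assumptions rather than the paper's exact setup.

```python
import numpy as np

def sample_gi0(alpha, gam, L, size, rng):
    """GI0-type intensity samples: inverse-gamma texture times gamma speckle."""
    speckle = rng.gamma(shape=L, scale=1.0 / L, size=size)        # unit mean
    texture = gam / rng.gamma(shape=-alpha, scale=1.0, size=size) # inverse gamma
    return texture * speckle

rng = np.random.default_rng(13)
smooth = sample_gi0(alpha=-8.0, gam=7.0, L=4, size=50_000, rng=rng)
rough = sample_gi0(alpha=-1.5, gam=0.5, L=4, size=50_000, rng=rng)
# The rougher region has a much heavier tail at comparable mean level:
print(np.quantile(smooth, 0.99), np.quantile(rough, 0.99))
```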

19.
Household Expenditure Survey (HES) data are widely reported in grouped form for a number of reasons. Only within-group arithmetic means (AMs) of the household expenditures on various consumption items, total expenditure, income and other variables are reported in tabular form. However, the use of such within-group AMs introduces biases when the parameters of various commonly used non-linear Engel functions are estimated by Aitken's generalized least squares (GLS) method, because the within-group geometric means (GMs)/harmonic means (HMs) are needed to estimate the parameters of those non-linear Engel functions without bias. Kakwani (1977) estimated the within-group GMs/HMs from the Kakwani-Podder (1976) Lorenz curve for Indonesian data. We extend his method to estimate within-group GMs/HMs for a set of variables, based on a general type of concentration curve. It is shown that our estimated within-group GMs/HMs based on concentration curves are not entirely suitable for the Australian HES data. Nevertheless, these GMs/HMs are used to estimate the parameters of various non-linear Engel functions, and the resulting elasticities differ, for some items and certain non-linear Engel functions, from those obtained when the reported within-group AMs are used as proxies for the within-group GMs/HMs. The concept of the average elasticity of a variable-elasticity Engel function is discussed and computed for various Australian household consumption items. It is empirically demonstrated that the average elasticities are more meaningful than traditional elasticity estimates computed at some representative values for certain functions.
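
A toy illustration of the grouping bias discussed above: a log-log (constant-elasticity) Engel curve fitted to grouped data needs within-group geometric means, and substituting arithmetic means shifts the estimate when the within-group spread varies across groups. Purely a numerical illustration; it does not use the HES data or the paper's concentration-curve estimator.

```python
import numpy as np

rng = np.random.default_rng(14)
n_groups, per_group, elasticity = 10, 5000, 0.7
mu = np.linspace(5, 9, n_groups)               # group centres of log(total)
sds = 0.2 + 0.15 * np.arange(n_groups)         # within-group spread varies
groups = np.repeat(np.arange(n_groups), per_group)
log_total = mu[groups] + sds[groups] * rng.normal(size=groups.size)
total = np.exp(log_total)
expend = np.exp(elasticity * log_total + rng.normal(0, 0.3, groups.size))

gm = lambda v: np.exp(np.log(v).mean())        # within-group geometric mean
for label, fn in [("AM", np.mean), ("GM", gm)]:
    lx = np.log([fn(total[groups == g]) for g in range(n_groups)])
    ly = np.log([fn(expend[groups == g]) for g in range(n_groups)])
    print(label, round(np.polyfit(lx, ly, 1)[0], 3))  # GM ~0.7; AM drifts away
```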

20.
We propose covariate adjusted correlation (Cadcor) analysis to target the correlation between two hidden variables that are observed after being multiplied by an unknown function of a common observable confounding variable. The distorting effects of this confounding may alter the correlation relation between the hidden variables. Covariate adjusted correlation analysis enables consistent estimation of this correlation, by targeting the definition of correlation through the slopes of the regressions of the hidden variables on each other and by establishing a connection to varying-coefficient regression. The asymptotic distribution of the resulting adjusted correlation estimate is established. These distribution results, when combined with proposed consistent estimates of the asymptotic variance, lead to the construction of approximate confidence intervals and inference for adjusted correlations. We illustrate our approach through an application to the Boston house price data. Finite sample properties of the proposed procedures are investigated through a simulation study.
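
A simplified stand-in for the covariate-adjustment idea: divide each observed variable by a smooth estimate of its mean given the confounder (normalized to mean one), then correlate the adjusted variables. Cadcor's regression-slope/varying-coefficient construction is not reproduced; names and settings are illustrative.

```python
import numpy as np

def nw(u, y, h):
    """Nadaraya-Watson estimate of E[y | u] at the observed u values."""
    w = np.exp(-0.5 * ((u[:, None] - u[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(15)
n = 2000
u = rng.uniform(0, 1, n)                       # observable confounder
z = rng.multivariate_normal([3.0, 3.0], [[1.0, 0.5], [0.5, 1.0]], size=n)
psi = 1.0 + 0.4 * (u - 0.5)                    # distortions with mean one
phi = 1.0 - 0.3 * (u - 0.5)
x_obs, y_obs = psi * z[:, 0], phi * z[:, 1]    # observed, distorted variables

x_adj = x_obs / (nw(u, x_obs, 0.05) / x_obs.mean())
y_adj = y_obs / (nw(u, y_obs, 0.05) / y_obs.mean())
print(np.corrcoef(x_obs, y_obs)[0, 1],         # distorted correlation (~0.37)
      np.corrcoef(x_adj, y_adj)[0, 1])         # close to the true 0.5
```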

