Similar literature
 20 similar articles found (search time: 31 ms)
1.
Generalized discriminant analysis based on distances   Total citations: 14 (self-citations: 1, citations by others: 13)
This paper describes a method of generalized discriminant analysis based on a dissimilarity matrix to test for differences in a priori groups of multivariate observations. Use of classical multidimensional scaling produces a low-dimensional representation of the data for which Euclidean distances approximate the original dissimilarities. The resulting scores are then analysed using discriminant analysis, giving tests based on the canonical correlations. The asymptotic distributions of these statistics under permutations of the observations are shown to be invariant to changes in the distributions of the original variables, unlike the distributions of the multi-response permutation test statistics which have been considered by other workers for testing differences among groups. This canonical method is applied to multivariate fish assemblage data, with Monte Carlo simulations to make power comparisons and to compare theoretical results and empirical distributions. The paper proposes classification based on distances. Error rates are estimated using cross-validation.
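The classical multidimensional scaling step mentioned in this abstract starts by double-centring the squared dissimilarities. The sketch below shows only that step, under the assumption that the dissimilarity matrix is symmetric; the function name `double_center` is illustrative and does not come from the paper. The scaling scores themselves would then be obtained from an eigendecomposition of the returned matrix, which is not shown here.

```python
def double_center(d):
    """Return B = -1/2 * J * D2 * J, where D2 holds the squared dissimilarities
    and J is the centring matrix. Assumes d is a symmetric square matrix."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    # Because sq is symmetric, row means and column means coincide.
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]


# Usage: for Euclidean distances between points on a line, B reproduces the
# inner products of the mean-centred coordinates.
x = [0.0, 1.0, 3.0]
d = [[abs(a - b) for b in x] for a in x]
b_matrix = double_center(d)
```

When the dissimilarities are Euclidean, the matrix returned here is the Gram matrix of the centred configuration, which is why its leading eigenvectors give coordinates whose distances approximate the original dissimilarities.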

2.
Jan Rataj, Statistics, 2013, 47(4): 377-385
Two classes of random distances (generated as contact distances or free path lengths) external to a stationary random closed set in Euclidean space are introduced. The censored distance distributions within a bounded region are obtained. Unbiased estimators of the random distance distribution functions using only the information from inside the bounded region are constructed.

3.
Regression models that account for main state effects and nested county effects are considered for the assessment of farmland values. Empirical predictors obtained by replacing the unknown variances in the formulas of the optimal predictors by maximum likelihood estimates are presented. The computations are carried out by simple iterations between two SAS procedures. Estimators for the prediction variances are derived, and a modification to secure the robustness of the predictors is proposed. The procedure is applied to data on nonirrigated cropland in the Corn Belt states and is shown to yield predictors with considerably lower prediction mean squared errors than the survey estimators and other regression-type estimators.

4.
This paper addresses the problems of frequentist and Bayesian estimation for the unknown parameters of the generalized Lindley distribution based on lower record values. We first derive the exact explicit expressions for the single and product moments of lower record values, and then use these results to compute the means, variances and covariance between two lower record values. We next obtain the maximum likelihood estimators and associated asymptotic confidence intervals. Furthermore, we obtain Bayes estimators under the assumption of gamma priors on both the shape and the scale parameters of the generalized Lindley distribution, and the associated highest posterior density interval estimates. The Bayesian estimation is studied with respect to both symmetric (squared error) and asymmetric (linear-exponential (LINEX)) loss functions. Finally, we compute Bayesian predictive estimates and predictive interval estimates for the future record values. To illustrate the findings, one real data set is analyzed, and Monte Carlo simulations are performed to compare the performances of the proposed methods of estimation and prediction.

5.
Some new results for a distance-based (DB) model for prediction with mixed variables are presented and discussed. This model can be thought of as a linear model where predictor variables for a response Y are obtained from the observed ones via classical multidimensional scaling. A coefficient is introduced in order to choose the most predictive dimensions, providing a solution to the problem of small variances and a very large number n of observations (the dimensionality increases with n). The problem of missing data is explored and a DB solution is proposed. It is shown that this approach can be regarded as a kind of ridge regression when the usual Euclidean distance is used.

6.
Based on a chi-squared transform of the multivariate normal data set, we propose a test of multivariate normality (MVN) whose statistic is the sum of interpoint squared distances between an ordered set of the transformed observations and the set of the population pth quantiles of the chi-squared distribution. The critical values of the test were evaluated for different sample sizes and random vector dimensions through extensive simulations. The empirical type-I error rates and powers of the proposed test were compared with those of some other well-known tests for MVN, with the proposed test showing excellent results at large sample sizes.

7.
Bayesian Geostatistical Design   Total citations: 6 (self-citations: 1, citations by others: 5)
Abstract.  This paper describes the use of model-based geostatistics for choosing the set of sampling locations, collectively called the design, to be used in a geostatistical analysis. Two types of design situation are considered. These are retrospective design, which concerns the addition of sampling locations to, or deletion of locations from, an existing design, and prospective design, which consists of choosing positions for a new set of sampling locations. We propose a Bayesian design criterion which focuses on the goal of efficient spatial prediction whilst allowing for the fact that model parameter values are unknown. The results show that in this situation a wide range of inter-point distances should be included in the design, and the widely used regular design is often not the best choice.

8.
ABSTRACT

In this paper, we propose a parameter estimation method for the three-parameter lognormal distribution based on Type-II right censored data. In the proposed method, under mild conditions, the estimates always exist uniquely in the entire parameter space, and the estimators are also consistent over the entire parameter space. Through Monte Carlo simulations, we further show that the proposed method performs very well compared to a prominent method of estimation in terms of bias and root mean squared error (RMSE) in small-sample situations. Finally, two examples based on real data sets are presented to illustrate the proposed method.

9.
ABSTRACT

We consider point and interval estimation of the unknown parameters of a generalized inverted exponential distribution in the presence of hybrid censoring. The maximum likelihood estimates are obtained using the EM algorithm. We then compute the Fisher information matrix using the missing value principle. Bayes estimates are derived under squared error and general entropy loss functions. Furthermore, approximate Bayes estimates are obtained using the Tierney and Kadane method as well as an importance sampling approach. Asymptotic and highest posterior density intervals are also constructed. The proposed estimates are compared numerically using Monte Carlo simulations, and a real data set is analyzed for illustrative purposes.

10.
This study proposes a simple way to perform a power analysis of Mantel's test applied to squared Euclidean distance matrices. The general statistical aspects of the simple Mantel's test are reviewed. The Monte Carlo method is used to generate bivariate Gaussian variables in order to create squared Euclidean distance matrices. The power of the parametric correlation t-test applied to raw data is also evaluated and compared with that of Mantel's test. The standard procedure for calculating pointwise power levels is used for validation. The proposed procedure allows one to draw the power curve by running the test only once, dispensing with the time-consuming standard procedure of Monte Carlo simulations. Unlike the standard procedure, it does not depend on knowledge of the distribution of the raw data. The simulated power function has all the properties expected from power analysis theory and is in agreement with the results of the standard procedure.
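For readers unfamiliar with the simple Mantel's test that this abstract builds on, the following is a minimal permutation sketch, not the power-analysis procedure the study proposes. It assumes a one-sided test on the Pearson correlation between the upper triangles of two distance matrices; the function names, the default of 999 permutations, and the fixed seed are all illustrative choices.

```python
import math
import random


def upper_triangle(m):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def squared_distance_matrix(values):
    """Squared Euclidean distances between univariate observations."""
    return [[(a - b) ** 2 for b in values] for a in values]


def mantel_test(a, b, n_perm=999, seed=0):
    """Return (correlation, one-sided permutation p-value).

    Rows and columns of b are permuted together, which keeps b a valid
    distance matrix under every permutation.
    """
    rng = random.Random(seed)
    n = len(a)
    xa = upper_triangle(a)
    observed = pearson(xa, upper_triangle(b))
    count = 1  # the observed arrangement counts as one permutation
    idx = list(range(n))
    for _ in range(n_perm):
        rng.shuffle(idx)
        permuted = [[b[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(xa, upper_triangle(permuted)) >= observed - 1e-12:
            count += 1
    return observed, count / (n_perm + 1)


# Usage: y is an exact linear function of x, so the two squared distance
# matrices are proportional and the Mantel correlation is 1.
x = [0.0, 1.0, 2.5, 4.0, 7.0]
y = [2.0 * v + 1.0 for v in x]
r, p = mantel_test(squared_distance_matrix(x), squared_distance_matrix(y))
```

Estimating power by the standard route would mean repeating such a test over many simulated data sets, which is the cost the abstract says its procedure avoids.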

11.
The problem of estimating the common mean μ of two univariate normal populations with unknown and unequal variances is considered from a decision-theoretic point of view. We restrict our attention to an appropriate class C and its three subclasses C0, C1 and C2 of unbiased estimates of μ. We consider the usual estimate μ0 of μ, which is the weighted linear combination of the sample means with weights as reciprocals of the sample variances. Its admissibility in C0 and extended admissibility in C are proved. Admissible estimates in C1 and C2 are also obtained. The loss is always assumed to be squared error. The question of admissibility of μ0 in the class of all estimators is still open.
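The usual estimate described in this abstract can be sketched as follows, taking the wording literally: each sample mean is weighted by the reciprocal of its sample variance. The function name is illustrative, and the use of the unbiased sample variance (divisor n - 1) is an assumption. When the two sample sizes differ, a common convention weights by sample size divided by sample variance instead; the abstract does not say which form the paper uses.

```python
from statistics import mean, variance


def common_mean_estimate(sample1, sample2):
    """Weighted combination of two sample means, with weights equal to the
    reciprocals of the (unbiased) sample variances."""
    w1 = 1.0 / variance(sample1)
    w2 = 1.0 / variance(sample2)
    return (w1 * mean(sample1) + w2 * mean(sample2)) / (w1 + w2)


# Usage: the first sample has mean 10 and variance 1, the second has mean 12
# and variance 16, so the estimate is pulled strongly towards 10.
estimate = common_mean_estimate([9.0, 10.0, 11.0], [8.0, 12.0, 16.0])
```

Because the weights are themselves random, the estimate is not a fixed linear combination of the sample means, which is part of what makes the admissibility question in the abstract non-trivial.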

12.
We consider the problem of making statistical inference on unknown parameters of a lognormal distribution under the assumption that samples are progressively censored. The maximum likelihood estimates (MLEs) are obtained by using the expectation-maximization algorithm. The observed and expected Fisher information matrices are provided as well. Approximate MLEs of unknown parameters are also obtained. Bayes and generalized estimates are derived under the squared error loss function. We compute these estimates using Lindley's method as well as the importance sampling method. Highest posterior density interval and asymptotic interval estimates are constructed for unknown parameters. A simulation study is conducted to compare the proposed estimates. Further, a data set is analysed for illustrative purposes. Finally, optimal progressive censoring plans are discussed under different optimality criteria and results are presented.

13.
Based on progressively type-II censored data, the problem of estimating the unknown parameters and the reliability function of a two-parameter generalized half-normal distribution is considered. Maximum likelihood estimates are obtained by applying the expectation-maximization algorithm. Since they do not have closed forms, approximate maximum likelihood estimators are proposed. Several Bayesian estimates with respect to different symmetric and asymmetric loss functions such as squared error, LINEX and general entropy are calculated. The Lindley approximation method is applied to determine Bayesian estimates. Monte Carlo simulations are performed to compare the performances of the different methods. Finally, one real data set is analysed.

14.
The commonly used method of small area estimation (SAE) under a linear mixed model may not be efficient if the data contain a substantially larger proportion of zeros than would be expected under standard model assumptions (hereafter zero-inflated data). The authors discuss SAE for zero-inflated data under a two-part random effects model that accounts for excess zeros in the data. Empirical results show that the proposed method for SAE works well and produces an efficient set of small area estimates. An application to real survey data from the National Sample Survey Office of India demonstrates the satisfactory performance of the method. The authors describe a parametric bootstrap method to estimate the mean squared error (MSE) of the proposed small area estimator. The bootstrap estimates of the MSE are compared to the true MSE in a simulation study.

15.
In many clinical studies more than one observer may be rating a characteristic measured on an ordinal scale. For example, a study may involve a group of physicians rating a feature seen on a pathology specimen or a computed tomography scan. In clinical studies of this kind, the weighted κ coefficient is a popular measure of agreement for ordinally scaled ratings. Our research stems from a study in which the severity of inflammatory skin disease was rated. The investigators wished to determine and evaluate the strength of agreement between a variable number of observers taking into account patient-specific (age and gender) as well as rater-specific (whether board certified in dermatology) characteristics. This suggested modelling κ as a function of these covariates. We propose the use of generalized estimating equations to estimate the weighted κ coefficient. This approach also accommodates unbalanced data which arise when some subjects are not judged by the same set of observers. Currently an estimate of overall κ for a simple unbalanced data set without covariates involving more than two observers is unavailable. In the inflammatory skin disease study none of the covariates were significantly associated with κ, thus enabling the calculation of an overall weighted κ for this unbalanced data set. In the second motivating example (multiple sclerosis), geographic location was significantly associated with κ. In addition, we compared the results of our method with current methods of testing for heterogeneity of weighted κ coefficients across strata (geographic location) that are available for balanced data sets.

16.
The power function distribution is often used to study the reliability of electrical components. In this paper, we model a heterogeneous population using the two-component mixture of the power function distribution. A comprehensive simulation scheme including a large number of parameter points is followed to highlight the properties and behavior of the estimates in terms of sample size, censoring rate, parameter size and the proportion of the components of the mixture. The parameters of the power function mixture are estimated and compared using the Bayes estimates. A simulated mixture data set with censored observations is generated by probabilistic mixing for computational purposes. Elegant closed form expressions for the Bayes estimators and their variances are derived for the censored sample as well as for the complete sample. Some interesting comparisons and properties of the estimates are observed and presented. The system of three non-linear equations, required to be solved iteratively for the computation of maximum likelihood (ML) estimates, is derived. The complete sample expressions for the ML estimates and for their variances are also given. The components of the information matrix are constructed as well. Uninformative as well as informative priors are assumed for the derivation of the Bayes estimators. A real-life mixture data example is also discussed. The posterior predictive distribution with the informative Gamma prior is derived, and the equations required to find the lower and upper limits of the predictive intervals are constructed. The Bayes estimates are evaluated under the squared error loss function.

17.
In this work it is shown how the k-means method for clustering objects can be applied in the context of statistical shape analysis. Because the choice of a suitable distance measure is a key issue for shape analysis, the Hartigan and Wong k-means algorithm is adapted for this situation. Simulations on controlled artificial data sets demonstrate that distances on the pre-shape spaces are more appropriate than the Euclidean distance on the tangent space. Finally, results are presented of an application to a real problem of oceanography, which in fact motivated the current work.

18.
ABSTRACT

In a regression model with a random individual and a random time effect, explicit representations of the nonnegative quadratic minimum biased estimators of the corresponding variances are deduced. These estimators always exist and are unique. Moreover, under the assumption of normality of the dependent variable, unbiased estimators of the mean squared errors of the variance estimates are derived. Finally, confidence intervals on the variance components are considered.

19.
This article introduces principal component analysis for multidimensional sparse functional data, utilizing Gaussian basis functions. Our multidimensional model is estimated by maximizing a penalized log-likelihood function, while previous mixed-type models were estimated by maximum likelihood methods for one-dimensional data. The penalized estimation performs well for our multidimensional model, while maximum likelihood methods yield unstable parameter estimates and some of the parameter estimates are infinite. Numerical experiments are conducted to investigate the effectiveness of our method for some types of missing data. The proposed method is applied to handwriting data, which consist of the XY coordinate values of handwriting samples.

20.
Analysis of data in the form of a set of points irregularly distributed within a region of space usually involves the study of some property of the distribution of inter-event distances. One such function is G, the distribution of the distance from an event to its nearest neighbor. In practice, point processes are commonly observed through a bounded window, thus making edge effects an important component in the estimation of G. Several estimators have been proposed, all dealing with the edge effect problem in different ways. This paper proposes a new alternative for estimating the nearest neighbor distribution and compares it to other estimators. The result is an estimator with relatively small mean squared error for a wide variety of stationary processes.
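To make the edge-effect issue in this abstract concrete, the sketch below contrasts two simple estimators of G on a rectangular window: a naive empirical distribution of nearest-neighbour distances, and a border (reduced-sample) correction that only uses events lying at least r from the window boundary. Neither is the new estimator the paper proposes; the function names and the rectangular window with its lower-left corner at the origin are assumptions made for illustration.

```python
import math


def nearest_neighbour_distances(points):
    """Distance from each event to its nearest other event."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def boundary_distances(points, width, height):
    """Distance from each event to the edge of the window [0, width] x [0, height]."""
    return [min(x, width - x, y, height - y) for x, y in points]


def g_naive(points, r):
    """Empirical distribution of nearest-neighbour distances, ignoring edge effects."""
    d = nearest_neighbour_distances(points)
    return sum(di <= r for di in d) / len(d)


def g_border(points, r, width, height):
    """Border-corrected estimate: use only events at least r from the boundary,
    so that their true nearest neighbour within distance r cannot lie outside
    the window. Returns NaN when no event qualifies."""
    d = nearest_neighbour_distances(points)
    b = boundary_distances(points, width, height)
    eligible = [di for di, bi in zip(d, b) if bi >= r]
    if not eligible:
        return float("nan")
    return sum(di <= r for di in eligible) / len(eligible)


# Usage on a small hand-made pattern in a 10 x 10 window.
pattern = [(1.0, 1.0), (1.0, 2.0), (5.0, 5.0), (9.0, 9.0)]
naive_at_1 = g_naive(pattern, 1.0)
border_at_5 = g_border(pattern, 5.0, 10.0, 10.0)
```

The border correction discards more and more events as r grows, which inflates its variance; that trade-off is one reason alternative estimators with smaller mean squared error are of interest.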
