Similar Articles
20 similar articles found.
1.
Model-based clustering typically involves the development of a family of mixture models and the imposition of these models upon data. The best member of the family is then chosen using some criterion and the associated parameter estimates lead to predicted group memberships, or clusterings. This paper describes the extension of the mixtures of multivariate t-factor analyzers model to include constraints on the degrees of freedom, the factor loadings, and the error variance matrices. The result is a family of six mixture models, including parsimonious models. Parameter estimates for this family of models are derived using an alternating expectation-conditional maximization algorithm and convergence is determined based on Aitken’s acceleration. Model selection is carried out using the Bayesian information criterion (BIC) and the integrated completed likelihood (ICL). This novel family of mixture models is then applied to simulated and real data where clustering performance meets or exceeds that of established model-based clustering methods. The simulation studies include a comparison of the BIC and the ICL as model selection techniques for this novel family of models. Application to simulated data with larger dimensionality is also explored.
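As a sketch of the convergence check mentioned above, the following Python function implements a common form of Aitken's acceleration criterion for an EM-type log-likelihood sequence; the function name and tolerance are illustrative, and the exact variant used in the paper may differ.

```python
import numpy as np

def aitken_converged(loglik, tol=1e-8):
    """Aitken's acceleration convergence check for an EM-type run.

    loglik holds successive log-likelihood values l(0), l(1), ...
    The asymptotic estimate of the log-likelihood at iteration k+1 is
        l_inf = l(k) + (l(k+1) - l(k)) / (1 - a(k)),
    with a(k) = (l(k+1) - l(k)) / (l(k) - l(k-1)); convergence is
    declared when |l_inf - l(k+1)| < tol.
    """
    if len(loglik) < 3:
        return False
    l0, l1, l2 = loglik[-3], loglik[-2], loglik[-1]
    if abs(l1 - l0) < np.finfo(float).eps:
        return True                      # likelihood has flattened out
    a = (l2 - l1) / (l1 - l0)
    if a >= 1.0:
        return False                     # acceleration not (yet) informative
    l_inf = l1 + (l2 - l1) / (1.0 - a)
    return abs(l_inf - l2) < tol
```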

2.
In the presence of multicollinearity, the r-k class estimator, a general estimator that includes the ordinary ridge regression (ORR), principal components regression (PCR) and ordinary least squares (OLS) estimators as special cases, is proposed as an alternative to the OLS estimator. Comparison of competing estimators of a parameter in the sense of the mean square error (MSE) criterion is of central interest. An alternative to the MSE criterion is Pitman's (1937) closeness (PC) criterion. In this paper, we compare the r-k class estimator to the OLS estimator in terms of the PC criterion, thereby recovering as special cases the comparison of the ORR estimator to the OLS estimator under the PC criterion by Mason et al. (1990) and the comparison of the PCR estimator to the OLS estimator under the PC criterion by Lin and Wei (2002).
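For intuition, Pitman closeness can be estimated by simulation: PC equals the probability that the first estimator lands closer to the true parameter than the second, and the first is Pitman-closer when this exceeds 1/2. The sketch below (all names hypothetical, scalar estimators only) does this by Monte Carlo; the paper's comparison is analytical, not simulation-based.

```python
import numpy as np

def pitman_closeness(est1, est2, theta, sampler, n_rep=10_000, seed=0):
    """Monte Carlo estimate of P(|est1 - theta| < |est2 - theta|).

    est1/est2 map a dataset to a scalar estimate; sampler(rng) draws
    one dataset under the true parameter theta.
    """
    rng = np.random.default_rng(seed)
    wins = 0
    for _ in range(n_rep):
        data = sampler(rng)
        wins += abs(est1(data) - theta) < abs(est2(data) - theta)
    return wins / n_rep

# Illustration: a shrinkage estimator vs the sample mean for a normal mean.
pc = pitman_closeness(
    est1=lambda x: 0.9 * x.mean(),
    est2=lambda x: x.mean(),
    theta=0.5,
    sampler=lambda rng: rng.normal(0.5, 1.0, size=20),
)
```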

3.
This paper deals with the problem of testing statistical hypotheses when both the hypotheses and the data are fuzzy. To this end, we first introduce the concept of a fuzzy p-value and then develop an approach for testing fuzzy hypotheses by comparing a fuzzy p-value and a fuzzy significance level. Numerical examples are provided to illustrate the approach for different cases.

4.
A finite mixture model using the Student's t distribution has been recognized as a robust extension of normal mixtures. Recently, a mixture of skew normal distributions has been found to be effective in the treatment of heterogeneous data involving asymmetric behaviors across subclasses. In this article, we propose a robust mixture framework based on the skew t distribution to efficiently deal with heavy-tailedness, extra skewness and multimodality in a wide range of settings. Statistical mixture modeling based on normal, Student's t and skew normal distributions can be viewed as special cases of the skew t mixture model. We present analytically simple EM-type algorithms for iteratively computing maximum likelihood estimates. The proposed methodology is illustrated by analyzing a real data example.

5.
In the present paper the distribution theory of the maximum and minimum of the rth concomitants from k independent subgroups, each of the same size m, from the Morgenstern family is investigated. Some applications of the results to the estimation of the scale parameter of a marginal variable in the bivariate uniform distribution and to a selection problem are discussed.

6.
Based on an FQ-System for continuous unimodal distributions, introduced by Scheffner (1998), we propose a purely data-driven method for density estimation that provides good results even for small samples. The procedure avoids the problems and uncertainties, such as bandwidth selection, that arise with kernel density estimates.

7.
For measuring the goodness of \(2^m4^1\) designs, Wu and Zhang (1993) proposed the minimum aberration (MA) criterion. MA \(2^m4^1\) designs have been constructed using the idea of complementary designs when the number of two-level factors, \(m\), exceeds \(n/2\), where \(n\) is the total number of runs. In this paper, the structures of MA \(2^m4^1\) designs are obtained when \(m > 5n/16\). Based on these structures, some methods are developed for constructing MA \(2^m4^1\) designs for \(5n/16 < m < n/2\) as well as for \(n/2 \le m < n\). When \(m \le 5n/16\), there is no general method for constructing MA \(2^m4^1\) designs. In this case, we obtain lower bounds for \(A_{30}\) and \(A_{31}\), where \(A_{30}\) and \(A_{31}\) are the numbers of type 0 and type 1 words of length three, respectively, and a method for constructing weak minimum aberration (WMA) \(2^m4^1\) designs (those for which \(A_{30}\) and \(A_{31}\) achieve the lower bounds) is demonstrated. Some MA or WMA \(2^m4^1\) designs with 32 or 64 runs are tabulated for practical use, supplementing the tables in Wu and Zhang (1993), Zhang and Shao (2001) and Mukerjee and Wu (2001).

8.
This paper (i) discusses the R-chart with asymmetric probability control limits under the assumption that the distribution of the quality characteristic under study is either exponential, Laplace, or logistic, (ii) examines the effect of the estimated probability limits on the performance of the R-chart, and (iii) obtains the desired probability limits of the R-chart that has a specified false alarm rate when probability limits must be estimated from preliminary samples taken from either the exponential, Laplace, or logistic processes.
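The asymmetric probability limits can be illustrated by simulation. The sketch below (function names are illustrative; the paper works with exact and estimated limits rather than simulation) takes the \(\alpha/2\) and \(1-\alpha/2\) quantiles of the subgroup range under an assumed in-control exponential process.

```python
import numpy as np

def range_prob_limits(draw_subgroups, n, alpha=0.002, n_sim=200_000, seed=1):
    """Simulated probability control limits for the sample range R.

    draw_subgroups(rng, size) draws in-control observations; the
    alpha/2 and 1 - alpha/2 quantiles of R give asymmetric LCL/UCL.
    """
    rng = np.random.default_rng(seed)
    samples = draw_subgroups(rng, (n_sim, n))
    ranges = samples.max(axis=1) - samples.min(axis=1)
    return np.quantile(ranges, [alpha / 2, 1 - alpha / 2])

# Exponential(1) process, subgroups of size 5:
lcl, ucl = range_prob_limits(lambda rng, size: rng.exponential(1.0, size), n=5)
```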

9.
A recent comparison of evolutionary, neural network, and scatter search heuristics for solving the p-median problem is completed by (i) gathering or obtaining exact optimal values in order to evaluate errors precisely, and (ii) including results obtained with several variants of a variable neighborhood search (VNS) heuristic. For a first, well-known, series of instances, the average errors of the evolutionary and neural network heuristics are over 10% and more than 1000 times larger than that of VNS. For a second series, this error is about 3% while the errors of the parallel VNS and of a hybrid heuristic are about 0.01% and that of parallel scatter search even smaller.

10.
Traditionally, when applying the two-sample t test, some pre-testing occurs: the theory-based assumptions of normal distributions and of homogeneity of the variances are often tested in applied sciences before the intended t test. But this paper shows that such pre-testing leads to unknown final type-I and type-II risks if the respective statistical tests are performed using the same set of observations. To gauge the extent of the resulting distortion of these risks, some theoretical deductions are given and, in particular, a systematic simulation study is carried out. As a result, we propose applying no pre-tests for the t test and no t test at all, but instead using the Welch test as a standard test: its power comes close to that of the t test when the variances are homogeneous, and for unequal variances and skewness values \(|\gamma_1| < 3\) it keeps the so-called 20% robustness, whereas neither the t test nor Wilcoxon's U test can be recommended for most cases.
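In Python, the recommendation amounts to calling the Welch variant directly, with no variance pre-test; a minimal sketch using SciPy (the data here are synthetic):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=30)
y = rng.normal(0.3, 2.0, size=50)          # unequal variances on purpose

# Welch's t test: equal_var=False drops the homogeneity assumption,
# so no pre-test of the variances is needed.
t_stat, p_value = stats.ttest_ind(x, y, equal_var=False)
```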

11.
The \(K_\phi\)-divergence statistic family for goodness of fit has, under the null hypothesis, an asymptotic chi-squared distribution; however, for small samples the chi-squared approximation does not always agree well with the exact distribution. In this paper, a closer approximation to the exact distribution is obtained by extracting the \(\phi\)-dependent second-order component from the distribution. Moreover, numerical results are presented for moderate sample sizes with a moderate number of cells.

12.
An exact confidence set for the x-coordinate at which a quadratic regression model has a given gradient is derived. The limits of the confidence set are given by mathematical formulae, implemented in Fortran programs that can be downloaded from the web. The confidence set need not be an interval. Its growth and changing shape with increasing confidence level are described in detail and visualized in a figure relating to data from nitrogen-rate trials in Germany. The wheat yields in this example are modeled as quadratic functions of the nitrogen input in order to determine a confidence set for the economically optimal nitrogen fertilization. The disadvantage that the confidence set does not distinguish between concave and convex parabolae, i.e. between profit maxima and minima, is also discussed.
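The point estimate underlying such a set is simple: fit \(y = b_0 + b_1 x + b_2 x^2\) and solve \(b_1 + 2 b_2 x = g_0\) for the target gradient \(g_0\) (for the economic optimum, \(g_0\) would be the fertilizer-to-wheat price ratio). The sketch below gives only this point estimate; the exact confidence set itself requires the Fieller-type formulae in the authors' Fortran programs.

```python
import numpy as np

def x_with_gradient(x, y, g0):
    """x-coordinate where the fitted quadratic has gradient g0."""
    b2, b1, b0 = np.polyfit(x, y, 2)    # coefficients, highest degree first
    return (g0 - b1) / (2.0 * b2)       # solve b1 + 2*b2*x = g0
```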

13.
Exact permutation testing of effects in unreplicated two-level multifactorial designs is developed based on the notion of realigning observations and on paired permutations. This approach preserves the exchangeability of error components for testing up to k effects. Advantages and limitations of exact permutation procedures for unreplicated factorials are discussed and a simulation study on paired permutation testing is presented.

14.
GARCH models and stable Paretian distributions have both been revisited recently, in the papers of Hansen and Lunde (J Appl Econom 20:873–889, 2005) and Bidarkota and McCulloch (Quant Finance 4:256–265, 2004), respectively. In this paper we discuss alternative conditional distributional models for the daily returns of the US, German and Portuguese main stock market indexes, considering ARMA-GARCH models driven by Normal, Student's t and stable Paretian distributed innovations. We find that a GARCH model with stable Paretian innovations fits the returns clearly better than the more popular Normal distribution and slightly better than the Student's t distribution. However, the Student's t outperforms both the Normal and the stable Paretian distributions when out-of-sample density forecasts are considered.
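A sketch of such a comparison using the third-party `arch` package, fitted to synthetic returns. Note the assumption: `arch` offers Normal and Student's t innovations but, to our knowledge, not the stable Paretian law, so only the first two are compared here.

```python
import numpy as np
from arch import arch_model   # pip install arch

rng = np.random.default_rng(0)
returns = rng.standard_t(df=5, size=1000)    # placeholder daily returns

results = {}
for dist in ("normal", "t"):
    # AR(1) mean equation with GARCH(1,1) volatility
    model = arch_model(returns, mean="AR", lags=1,
                       vol="GARCH", p=1, q=1, dist=dist)
    results[dist] = model.fit(disp="off")

for dist, res in results.items():
    print(dist, res.loglikelihood, res.aic, res.bic)
```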

15.
The r largest order statistics approach is widely used in extreme value analysis because it may use more information from the data than just the block maxima. In practice, the choice of r is critical: if r is too large, bias can occur; if too small, the variance of the estimator can be high. The limiting distribution of the r largest order statistics, denoted by GEV\(_r\), extends that of the block maxima. Two specification tests are proposed to select r sequentially. The first is a score test for the GEV\(_r\) distribution. Due to the special characteristics of the GEV\(_r\) distribution, the classical chi-square asymptotics cannot be used. The simplest approach is to use the parametric bootstrap, which is straightforward to implement but computationally expensive; an alternative fast weighted bootstrap, or multiplier, procedure is developed for computational efficiency. The second test uses the difference in estimated entropy between the GEV\(_r\) and GEV\(_{r-1}\) models, applied to the r largest order statistics and the \(r-1\) largest order statistics, respectively. The asymptotic distribution of the difference statistic is derived. In a large-scale simulation study, both tests held their size and had substantial power to detect various misspecification schemes. A new approach to the issue of multiple sequential hypothesis testing is adapted to this setting to control the false discovery rate or the familywise error rate. The utility of the procedures is demonstrated with extreme sea level and precipitation data.
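The parametric bootstrap step is generic and can be sketched as follows (all callables hypothetical: `fit` returns parameter estimates, `simulate` draws a same-size dataset from the fitted GEV\(_r\) model, `statistic` computes the score statistic):

```python
import numpy as np

def parametric_bootstrap_pvalue(data, fit, simulate, statistic, B=999, seed=0):
    """Parametric-bootstrap p-value for a test statistic whose null
    distribution is intractable (as for the GEV_r score test)."""
    rng = np.random.default_rng(seed)
    theta_hat = fit(data)
    t_obs = statistic(data, theta_hat)
    t_boot = np.empty(B)
    for b in range(B):
        boot = simulate(theta_hat, len(data), rng)   # draw under fitted model
        t_boot[b] = statistic(boot, fit(boot))       # refit on each replicate
    return (1 + np.sum(t_boot >= t_obs)) / (B + 1)
```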

16.
A novel projection pursuit method based on projecting the data onto itself is proposed. Using a number of real datasets, it is shown how to obtain interesting one- and two-dimensional projections using only \(O(n)\) evaluations of a one-dimensional projection index.
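The idea can be sketched in a few lines: restrict the candidate directions to the (centred, normalised) data points themselves, so a one-dimensional index is evaluated only n times. The index below (absolute excess kurtosis) is an illustrative stand-in, not necessarily the one used in the paper.

```python
import numpy as np
from scipy.stats import kurtosis

def best_data_direction(X):
    """One-dimensional projection pursuit over data-defined directions."""
    Xc = X - X.mean(axis=0)
    best_score, best_dir = -np.inf, None
    for row in Xc:
        norm = np.linalg.norm(row)
        if norm == 0.0:
            continue
        d = row / norm                      # candidate direction from a point
        score = abs(kurtosis(Xc @ d))       # index of the projected data
        if score > best_score:
            best_score, best_dir = score, d
    return best_dir
```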

17.
In this paper we compare two robust pseudo-likelihoods for a parameter of interest, also in the presence of nuisance parameters. These functions are obtained by computing quasi-likelihood and empirical likelihood from the estimating equations which define robust M-estimators. Application examples in the context of linear transformation models are considered. Monte Carlo studies are performed in order to assess the finite-sample performance of the inferential procedures based on quasi- and empirical likelihood, when the objective is the construction of robust confidence regions.

18.
Let \(X\) be an \(N(\mu, \sigma^2)\)-distributed characteristic with unknown \(\sigma\). We present the minimax version of the two-stage t test, which has minimal maximal average sample size among all two-stage t tests obeying the classical two-point condition on the operating characteristic. We give several examples. Furthermore, the minimax version of the two-stage t test is compared with the corresponding two-stage Gauß test.

19.
The skew t-distribution includes both the skew normal and the normal distributions as special cases. Inference for the skew t-model becomes problematic in these cases because the expected information matrix is singular and the parameter corresponding to the degrees of freedom takes a value at the boundary of its parameter space. In particular, the distributions of the likelihood ratio statistics for testing the null hypotheses of skew normality and normality are not asymptotically \(\chi ^2\). The asymptotic distributions of the likelihood ratio statistics are considered by applying the results of Self and Liang (J Am Stat Assoc 82:605–610, 1987) for boundary-parameter inference in terms of reparameterizations designed to remove the singularity of the information matrix. The Self–Liang asymptotic distributions are mixtures, and it is shown that their accuracy can be improved substantially by correcting the mixing probabilities. Furthermore, although the asymptotic distributions are non-standard, versions of Bartlett correction are developed that afford additional accuracy. Bootstrap procedures for estimating the mixing probabilities and the Bartlett adjustment factors are shown to produce excellent approximations, even for small sample sizes.
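In the canonical Self–Liang boundary case the null distribution of the likelihood ratio statistic is the mixture \(\tfrac12 \chi^2_0 + \tfrac12 \chi^2_1\); a sketch of the corresponding p-value, with the mixing probability left as a parameter so a corrected or bootstrap-estimated value can be plugged in:

```python
from scipy.stats import chi2

def boundary_lr_pvalue(lr_stat, p0=0.5):
    """p-value under the mixture p0 * chi2_0 + (1 - p0) * chi2_1.

    p0 = 0.5 is the canonical boundary value; the paper shows accuracy
    improves when the mixing probability is corrected rather than fixed.
    """
    if lr_stat <= 0.0:
        return 1.0                 # chi2_0 component: point mass at zero
    return (1.0 - p0) * chi2.sf(lr_stat, df=1)
```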

20.
Mixtures of multivariate t distributions provide a robust parametric extension of normal mixtures for the fitting of data. In the presence of a noise component, potential outliers or data with longer-than-normal tails, one way to broaden the model is to consider t distributions. In this framework, the degrees of freedom act as a robustness parameter, tuning the heaviness of the tails and downweighting the effect of outliers on parameter estimation. The aim of this paper is to extend to mixtures of multivariate elliptical distributions some theoretical results about likelihood maximization on constrained parameter spaces. Further, a constrained monotone algorithm implementing maximum likelihood mixture decomposition of multivariate t distributions is proposed, to achieve improved convergence and robustness. Monte Carlo simulations and a real data study illustrate the better performance of the algorithm compared with earlier proposals.
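One common way to implement such constraints, sketched below (the specific clipping rule is illustrative, not necessarily the paper's), is to truncate the eigenvalues of each scatter matrix into a fixed interval inside the M-step, which keeps the likelihood bounded and wards off degenerate solutions:

```python
import numpy as np

def constrain_scatter(sigma, lam_min, lam_max):
    """Project a scatter matrix onto eigenvalues within [lam_min, lam_max]."""
    vals, vecs = np.linalg.eigh(sigma)          # symmetric eigendecomposition
    vals = np.clip(vals, lam_min, lam_max)      # enforce the constraint
    return (vecs * vals) @ vecs.T               # rebuild V diag(vals) V^T
```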
