Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
This paper proposes an intuitive clustering algorithm capable of automatically self-organizing data groups based on the original data structure. Comparisons between the proposed algorithm and the EM [1: A. Banerjee, I.S. Dhillon, J. Ghosh, and S. Sra, Clustering on the unit hypersphere using von Mises–Fisher distribution, J. Mach. Learn. Res. 6 (2005), pp. 1–39] and spherical k-means [7: I.S. Dhillon and D.S. Modha, Concept decompositions for large sparse text data using clustering, Mach. Learn. 42 (2001), pp. 143–175. doi: 10.1023/A:1007612920971] algorithms are given. These numerical results show the effectiveness of the proposed algorithm, using the correct classification rate and the adjusted Rand index as evaluation criteria [5: J.-M. Chiou and P.-L. Li, Functional clustering and identifying substructures of longitudinal data, J. R. Statist. Soc. Ser. B. 69 (2007), pp. 679–699. doi: 10.1111/j.1467-9868.2007.00605.x; 6: J.-M. Chiou and P.-L. Li, Correlation-based functional clustering via subspace projection, J. Am. Statist. Assoc. 103 (2008), pp. 1684–1692. doi: 10.1198/016214508000000814]. In 1995, Mayor and Queloz announced the detection of the first extrasolar planet (exoplanet) around a Sun-like star. Since then, observational efforts of astronomers have led to the detection of more than 1000 exoplanets. These discoveries may provide important information for understanding the formation and evolution of planetary systems. The proposed clustering algorithm is therefore used to study the data gathered on exoplanets. Two main implications are suggested: (1) there are three major clusters, which correspond to the exoplanets in the regimes of disc, ongoing tidal and tidal interactions, respectively, and (2) the stellar metallicity does not play a key role in exoplanet migration.
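The adjusted Rand index named above as an evaluation criterion can be computed from the contingency table of two labelings. The following is a minimal sketch of the standard (Hubert–Arabie) formula, not the authors' code; the function name `adjusted_rand_index` and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # convention: nothing to disagree about

    # Contingency table cells and its row/column margins.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Numbers of item pairs that fall together in a cell, a row, a column.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # chance-level agreement
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_cells - expected) / (maximum - expected)


# Hypothetical labels: the index ignores how clusters are named.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

A value of 1 means the two partitions agree up to relabeling, and values near 0 correspond to chance-level agreement.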

2.
This paper deals with the problem of increasing the number of air pollution monitoring stations in Tehran city for efficient spatial prediction. As the data are multivariate and skewed, we introduce two multivariate skew models by extending the univariate skew Gaussian random field proposed by Zareifard and Jafari Khaledi [21: H. Zareifard and M. Jafari Khaledi, Non-Gaussian modeling of spatial data using scale mixing of a unified skew Gaussian process, J. Multivariate Anal. 114 (2013), pp. 16–28. doi: 10.1016/j.jmva.2012.07.003]. These models provide extensions of the linear model of coregionalization for non-Gaussian data. In the Bayesian framework, the optimal network design is found based on the maximum entropy criterion. A Markov chain Monte Carlo algorithm is developed to implement posterior inference. Finally, the applicability of the two proposed models is demonstrated by analyzing an air pollution data set.

3.
This article is concerned with the minimax estimation of a scale parameter under the quadratic loss function where the family of densities is location-scale type. We obtain results for the case when the scale parameter is bounded below by a known constant. Implications for the estimation of a lower-bounded scale parameter of an exponential distribution are presented under unknown location. Furthermore, classes of improved minimax estimators are derived for the restricted parameter using the Integral Expression for Risk Difference (IERD) approach of Kubokawa (1994) [Kubokawa, T. (1994). A unified approach to improving equivariant estimators. Ann. Stat. 22:290–299]. These classes are shown to include some existing estimators from the literature.

4.
Visuri et al. (2000) [Visuri, S., Koivunen, V., Oja, H. (2000). Sign and rank covariance matrices. J. Stat. Plann. Inference 91:557–575] proposed a technique for robust covariance matrix estimation based on different notions of multivariate sign and rank. Among them, the spatial rank based covariance matrix estimator that utilizes a robust scale estimator is especially appealing due to its high robustness, computational ease, and good efficiency. Also, it is orthogonally equivariant under any distribution and affinely equivariant under elliptically symmetric distributions. In this paper, we study robustness properties of the estimator with respect to two measures: breakdown point and influence function. More specifically, the upper bound of the finite sample breakdown point can be achieved by a proper choice of univariate robust scale estimator. The influence functions for eigenvalues and eigenvectors of the estimator are derived. They are found to be bounded under some assumptions. Moreover, finite sample efficiency comparisons to popular robust MCD, M, and S estimators are reported.

5.
Competing models arise naturally in many research fields, such as survival analysis and economics, when the same phenomenon of interest is explained by different researchers using different theories or according to different experiences. The model selection problem is therefore important because of its consequences for the subsequent inference: inference under a misspecified or inappropriate model is risky. Existing model selection tests such as Vuong's tests [26: Q.H. Vuong, Likelihood ratio test for model selection and non-nested hypothesis, Econometrica 57 (1989), pp. 307–333. doi: 10.2307/1912557] and Shi's non-degenerate tests [21: X. Shi, A non-degenerate Vuong test, Quant. Econ. 6 (2015), pp. 85–121. doi: 10.3982/QE382] suffer from difficulties in variance estimation and from departures from normality of the likelihood ratios. To circumvent these problems, we propose in this paper empirical likelihood ratio (ELR) tests for model selection. Following Shi [21], a bias correction method is proposed for the ELR tests to enhance their performance. A simulation study and a real-data analysis are provided to illustrate the performance of the proposed ELR tests.

6.
This paper presents a new variable weight method, called the singular value decomposition (SVD) approach, for Kohonen competitive learning (KCL) algorithms based on the concept of Varshavsky et al. [18: R. Varshavsky, A. Gottlieb, M. Linial, and D. Horn, Novel unsupervised feature filtering of biological data, Bioinformatics 22 (2006), pp. 507–513]. Integrating the weighted fuzzy c-means (FCM) algorithm with KCL, we propose a weighted fuzzy KCL (WFKCL) algorithm. The goal of the proposed WFKCL algorithm is to reduce the clustering error rate when data contain some noise variables. Compared with k-means, FCM and KCL with existing variable-weight methods, the proposed WFKCL algorithm with the proposed SVD weight method provides better clustering performance based on the error rate criterion. Furthermore, the complexity of the proposed SVD approach is lower than that of the methods of Pal et al. [17: S.K. Pal, R.K. De, and J. Basak, Unsupervised feature evaluation: a neuro-fuzzy approach, IEEE. Trans. Neural Netw. 11 (2000), pp. 366–376], Wang et al. [19: X.Z. Wang, Y.D. Wang, and L.J. Wang, Improving fuzzy c-means clustering based on feature-weight learning, Pattern Recognit. Lett. 25 (2004), pp. 1123–1132] and Hung et al. [9: W.-L. Hung, M.-S. Yang, and D.-H. Chen, Bootstrapping approach to feature-weight selection in fuzzy c-means algorithms with an application in color image segmentation, Pattern Recognit. Lett. 29 (2008), pp. 1317–1325].

7.
This article proposes new symmetric and asymmetric distributions by applying methods analogous to those in Kim (2005) [Kim, H.J. (2005). On a class of two-piece skew-normal distributions. Statist.: J. Theoret. Appl. Statist. 39:537–553] and Arnold et al. (2009) [Arnold, B.C., H.W. Gómez, and H.S. Salinas. (2009). On multiple constraint skewed models. Statist. J. Theoret. Appl. Statist. 43:279–293] to the exponentiated normal distribution studied in Durrans (1992) [Durrans, S.R. (1992). Distributions of fractional order statistics in hydrology. Water Resour. Res. 28:1649–1655], which we call the power-normal (PN) distribution. The proposed bimodal extension, the main focus of the paper, is called the bimodal power-normal model and is denoted by BPN(α), where α is the asymmetry parameter. The authors give some properties including moments and maximum likelihood estimation. Two important features of the proposed model are that its normalizing constant has a simple closed form and that the Fisher information matrix is nonsingular, guaranteeing large sample properties of the maximum likelihood estimators. Finally, simulation studies and real applications reveal that the proposed model can perform well in both situations.

8.
The density power divergence (DPD) measure, defined in terms of a single parameter α, has proved to be a popular tool in the area of robust estimation [1: A. Basu, I.R. Harris, N.L. Hjort and M.C. Jones, Robust and efficient estimation by minimizing a density power divergence, Biometrika 85 (1998), pp. 549–559. doi: 10.1093/biomet/85.3.549]. Recently, Ghosh and Basu [5: A. Ghosh and A. Basu, Robust estimation for independent non-homogeneous observations using density power divergence with applications to linear regression, Electron. J. Stat. 7 (2013), pp. 2420–2456. doi: 10.1214/13-EJS847] rigorously established the asymptotic properties of the minimum DPD estimators (MDPDEs) in the case of independent non-homogeneous observations. In this paper, we present an extensive numerical study to describe the performance of the method in the case of linear regression, the most common setup for non-homogeneous data. In addition, we extend the existing methods for the selection of the optimal robustness tuning parameter from the case of independent and identically distributed (i.i.d.) data to the case of non-homogeneous observations. Proper selection of the tuning parameter is critical to the appropriateness of the resulting analysis. The selection of the optimal robustness tuning parameter is explored in the context of the linear regression problem with an extensive numerical study involving real and simulated data.
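For orientation, the DPD between a data density g and a model density f, as defined by Basu et al. (1998), can be written as follows. The symbols g, f_θ, z and X_i are notation chosen here for illustration; the abstract itself only names the tuning parameter α.

```latex
% Density power divergence for tuning parameter \alpha > 0
d_\alpha(g, f) = \int \left\{ f^{1+\alpha}(z)
    - \left(1 + \frac{1}{\alpha}\right) g(z)\, f^{\alpha}(z)
    + \frac{1}{\alpha}\, g^{1+\alpha}(z) \right\} dz ,
\qquad \alpha > 0 .

% For i.i.d. data X_1, \dots, X_n, the minimum DPD estimator minimizes the
% empirical version (the term that does not involve \theta is dropped)
\hat{\theta}_\alpha = \arg\min_{\theta}
    \left\{ \int f_\theta^{1+\alpha}(z)\, dz
    - \left(1 + \frac{1}{\alpha}\right) \frac{1}{n} \sum_{i=1}^{n} f_\theta^{\alpha}(X_i) \right\} .
```

As α tends to 0 the divergence reduces to the Kullback–Leibler divergence, so the estimator approaches maximum likelihood; larger α gives more robustness at the cost of efficiency, which is why the choice of α matters.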

9.
In this paper, we consider a model for repeated count data, with within-subject correlation and/or overdispersion. It extends both the generalized linear mixed model and the negative-binomial model. This model, proposed in a likelihood context [17: G. Molenberghs, G. Verbeke, and C.G.B. Demétrio, An extended random-effects approach to modeling repeated, overdispersion count data, Lifetime Data Anal. 13 (2007), pp. 457–511; 18: G. Molenberghs, G. Verbeke, C.G.B. Demétrio, and A. Vieira, A family of generalized linear models for repeated measures with normal and conjugate random effects, Statist. Sci. 25 (2010), pp. 325–347. doi: 10.1214/10-STS328], is placed in a Bayesian inferential framework. An important contribution takes the form of Bayesian model assessment based on pivotal quantities, rather than the often less adequate DIC. By means of a real biological data set, we also discuss some Bayesian model selection aspects, using a pivotal quantity proposed by Johnson [12: V.E. Johnson, Bayesian model assessment using pivotal quantities, Bayesian Anal. 2 (2007), pp. 719–734. doi: 10.1214/07-BA229].

10.
This paper considers the estimation of the stress–strength reliability of a multi-state component or of a multi-state system where its states depend on the ratio of the strength and stress variables through a kernel function. The article presents a Bayesian approach assuming the stress and strength as exponentially distributed with a common location parameter but different scale parameters. We show that the limits of the Bayes estimators of both location and scale parameters under suitable priors are the maximum likelihood estimators as given by Ghosh and Razmpour [15: M. Ghosh and A. Razmpour, Estimation of the common location parameter of several exponentials, Sankhyā, Ser. A 46 (1984), pp. 383–394]. We use the Bayes estimators to determine the multi-state stress–strength reliability of a system having states between 0 and 1. We derive the uniformly minimum variance unbiased estimators of the reliability function. Interval estimation using the bootstrap method is also considered. Under the squared error loss function and linex loss function, risk comparison of the reliability estimators is carried out using extensive simulations.

11.
This paper treats the problem of stochastic comparisons for the extreme order statistics arising from heterogeneous beta distributions. Some sufficient conditions involved in majorization-type partial orders are provided for comparing the extreme order statistics in the sense of various magnitude orderings including the likelihood ratio order, the reversed hazard rate order, the usual stochastic order, and the usual multivariate stochastic order. The results established here strengthen and extend those including Kochar and Xu (2007) [Kochar, S.C., Xu, M. (2007). Stochastic comparisons of parallel systems when components have proportional hazard rates. Probab. Eng. Inf. Sci. 21:597–609], Mao and Hu (2010) [Mao, T., Hu, T. (2010). Equivalent characterizations on orderings of order statistics and sample ranges. Probab. Eng. Inf. Sci. 24:245–262], Balakrishnan et al. (2014) [Balakrishnan, N., Barmalzan, G., Haidari, A. (2014). On usual multivariate stochastic ordering of order statistics from heterogeneous beta variables. J. Multivariate Anal. 127:147–150], and Torrado (2015) [Torrado, N. (2015). On magnitude orderings between smallest order statistics from heterogeneous beta distributions. J. Math. Anal. Appl. 426:824–838]. A real application in system assembly and some numerical examples are also presented to illustrate the theoretical results.

12.
Recently, Ebrahimi [1: Ebrahimi, N. 1996. How to measure uncertainty about residual life time. Sankhya Ser. A, 58: 48–57] proposed a dynamic measure based on differential entropy applied to the residual lifetime. This measure has been used for the classification and ordering of survival functions. More recently, Ebrahimi [2: Ebrahimi, N. 1997. Testing whether lifetime distribution is decreasing uncertainty. Journal of Statistical Planning and Inference, 64: 9–19] considered the problem of testing the monotonicity of this measure. We propose and study several kernel type estimators of the entropy of residual life through the estimation of f(x) log f(x). These estimators can be applied to the classification and comparison of lifetime distributions.
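The role of f(x) log f(x) becomes clearer once the residual entropy is written out. The block below states the usual definition of the differential entropy of the residual lifetime at age t; the notation (density f, survival function \bar{F}) is an assumption made here, since the abstract does not fix symbols.

```latex
% Entropy of the residual lifetime at age t, for a lifetime X with
% density f and survival function \bar{F}(t) = P(X > t) > 0
H(f; t) = - \int_t^{\infty} \frac{f(x)}{\bar{F}(t)}
            \log \frac{f(x)}{\bar{F}(t)} \, dx
        = \log \bar{F}(t)
          - \frac{1}{\bar{F}(t)} \int_t^{\infty} f(x) \log f(x) \, dx .

% Example: for the exponential density f(x) = \lambda e^{-\lambda x},
% the residual lifetime is again exponential, so
H(f; t) = 1 - \log \lambda \quad \text{for all } t \ge 0 .
```

The second form shows that, apart from the survival function, the only quantity to be estimated is the integral of f(x) log f(x) over (t, ∞), which is presumably where kernel estimates of the density enter.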

13.
The geometric Brownian motion (GBM) is very popular in modeling the dynamics of stock prices. However, the constant volatility assumption is questionable and many models with nonconstant volatility have been developed. In the papers [7: M.L. Esquível and P.P. Mota, On some auto-induced regime switching double-threshold glued diffusions, J. Stat. Theory Pract. 8 (2014), pp. 760–771. doi: 10.1080/15598608.2013.854184; 12: P.P. Mota and M.L. Esquível, On a continuous time stock price model with regime switching, delay, and threshold, Quant. Financ. 14 (2014), pp. 1479–1488. doi: 10.1080/14697688.2013.879990] the authors introduce a regime switching process where in each regime the process is driven by GBM and the change in regime is defined by the crossing of a threshold. In this paper we use Akaike's and Bayesian information criteria to show that the GBM with regimes provides a better fit than the GBM. We also perform a forecasting comparison of the models for two selected companies.
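As a rough illustration of the model comparison described above, the sketch below computes AIC and BIC for the plain GBM only, using the fact that log-returns under GBM are independent normal variables. It does not implement the regime-switching model, and the function name and price series are hypothetical.

```python
import math


def gbm_information_criteria(prices):
    """AIC and BIC of a plain GBM fitted to equally spaced prices.

    Under GBM the log-returns are i.i.d. normal, so the fit reduces to a
    normal model with k = 2 parameters (mean and variance of log-returns).
    """
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    n = len(returns)
    if n < 2:
        raise ValueError("need at least three prices")

    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / n  # maximum likelihood variance
    if var == 0.0:
        raise ValueError("log-returns are constant; likelihood is degenerate")

    # Maximized normal log-likelihood evaluated at the ML estimates.
    loglik = -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)
    k = 2
    return {
        "loglik": loglik,
        "aic": 2 * k - 2 * loglik,
        "bic": k * math.log(n) - 2 * loglik,
    }


# Hypothetical daily closing prices.
print(gbm_information_criteria([100.0, 102.0, 101.0, 105.0, 107.0, 104.0]))
```

A competing model would be scored the same way with its own maximized log-likelihood and parameter count; the model with the smaller criterion value is preferred.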

14.
Lindeman et al. [12: Lindeman, R. H., Merenda, P. F. and Gold, R. Z. 1980. Introduction to Bivariate and Multivariate Analysis, Glenview, IL: Scott Foresman] provide a unique solution to the relative importance of correlated predictors in multiple regression by averaging squared semi-partial correlations obtained for each predictor across all p! orderings. In this paper, we propose a series of predictor sensitivity statistics that complement the variance decomposition procedure advanced by Lindeman et al. [12]. First, we detail the logic of averaging over orderings as a technique of variance partitioning. Second, we assess predictors by conditional dominance analysis, a qualitative procedure designed to overcome defects in the Lindeman et al. [12] variance decomposition solution. Third, we introduce a suite of indices to assess the sensitivity of a predictor to model specification, advancing a series of sensitivity-adjusted contribution statistics that allow for more definite quantification of predictor relevance. Fourth, we describe the analytic efficiency of our proposed technique against the Budescu conditional dominance solution to the uneven contribution of predictors across all p! orderings.
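The averaging over orderings can be sketched in a few lines. The increase in R² when a predictor enters a model equals its squared semi-partial correlation given the predictors already entered, so the code below averages those increases over all p! orderings. This is a brute-force sketch under the assumption that the predictors are not collinear; the helper names and the toy data are hypothetical, and it does not cover the sensitivity statistics proposed in the paper.

```python
from itertools import permutations


def _solve(a, b):
    """Solve a x = b by Gaussian elimination (assumes a is nonsingular)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(i + 1, n):
            factor = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= factor * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y, subset):
    """R^2 of least squares (with intercept) of y on the chosen columns."""
    if not subset:
        return 0.0
    n = len(y)
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    xc = []
    for j in subset:
        col_mean = sum(columns[j]) / n
        xc.append([v - col_mean for v in columns[j]])
    gram = [[sum(u * v for u, v in zip(a, b)) for b in xc] for a in xc]
    xty = [sum(u * v for u, v in zip(a, yc)) for a in xc]
    beta = _solve(gram, xty)
    explained = sum(b * t for b, t in zip(beta, xty))
    return explained / sum(v * v for v in yc)


def lmg_importance(columns, y):
    """Average R^2 increase of each predictor over all p! entry orders."""
    p = len(columns)
    shares = [0.0] * p
    orderings = list(permutations(range(p)))
    for order in orderings:
        entered, previous = [], 0.0
        for j in order:
            entered.append(j)
            current = r_squared(columns, y, entered)
            shares[j] += current - previous  # squared semi-partial correlation
            previous = current
    return [s / len(orderings) for s in shares]


# Hypothetical data: three correlated predictors and one response.
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    [1.0, 0.0, 1.0, 0.0, 1.0, 1.0],
]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3]
print(lmg_importance(columns, y))
```

The shares add up to the R² of the full model, which is what makes the averaging a variance decomposition. The cost grows as p!, so this direct enumeration is only practical for a handful of predictors.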

15.
In this paper, a new survival cure rate model is introduced considering the Yule–Simon distribution [12: H.A. Simon, On a class of skew distribution functions, Biometrika 42 (1955), pp. 425–440] to model the number of concurrent causes. We study some properties of this distribution and the model arising when the distribution of the competing causes is the Weibull model. We call this distribution the Weibull–Yule–Simon distribution. Maximum likelihood estimation is conducted for model parameters. A small scale simulation study is conducted indicating satisfactory parameter recovery by the estimation approach. Results are applied to a real data set (melanoma) illustrating the fact that the proposed model can outperform traditional alternative models in terms of model fitting.

16.
Since the seminal paper by Cook and Weisberg [9: R.D. Cook and S. Weisberg, Residuals and Influence in Regression, Chapman & Hall, London, 1982], local influence, next to case deletion, has gained popularity as a tool to detect influential subjects and measurements for a variety of statistical models. For the linear mixed model the approach leads to easily interpretable and computationally convenient expressions, not only highlighting influential subjects, but also which aspect of their profile leads to undue influence on the model's fit [17: E. Lesaffre and G. Verbeke, Local influence in linear mixed models, Biometrics 54 (1998), pp. 570–582. doi: 10.2307/3109764]. Ouwens et al. [24: M.J.N.M. Ouwens, F.E.S. Tan, and M.P.F. Berger, Local influence to detect influential data structures for generalized linear mixed models, Biometrics 57 (2001), pp. 1166–1172. doi: 10.1111/j.0006-341X.2001.01166.x] applied the method to the Poisson-normal generalized linear mixed model (GLMM). Given the model's nonlinear structure, these authors did not derive interpretable components but rather focused on a graphical depiction of influence. In this paper, we consider GLMMs for binary, count, and time-to-event data, with the additional feature of accommodating overdispersion whenever necessary. For each situation, three approaches are considered, based on: (1) purely numerical derivations; (2) using a closed-form expression of the marginal likelihood function; and (3) using an integral representation of this likelihood. Unlike when case deletion is used, this leads to interpretable components, allowing not only to identify influential subjects, but also to study the cause thereof. The methodology is illustrated in case studies that range over the three data types mentioned.

17.
This paper proposes an alternative two-stage stratified randomized response model, based on the Tracy and Osahan (1999) [Tracy, D.S., Osahan, S.S. (1999). An improved randomized response technique. Pak. J. Stat. 15(1):1–6] model, that has an optimal allocation and a large gain in precision. It is also shown that the proposed model is more efficient than the Kim and Warde (2004) [Kim, J., Warde, W. (2004). A stratified Warner randomized response model. J. Stat. Plan. Infer. 120:155–165] and Kim and Elam (2005) [Kim, J.M., Elam, M.E. (2005). A two-stage stratified Warner's randomized response model using optimal allocation. Metrika 61:1–7] stratified randomized response models under the conditions presented, both in the case of completely truthful reporting and in that of not completely truthful reporting by the respondents. Numerical illustrations and graphs are also given in support of the present study.

18.
This article proposes an asymptotic expansion for the Studentized linear discriminant function using two-step monotone missing samples under multivariate normality. The asymptotic expansions related to the discriminant function have been obtained for complete data under multivariate normality. The result derived by Anderson (1973) [Anderson, T. W. (1973). An asymptotic expansion of the distribution of the Studentized classification statistic W. The Annals of Statistics 1:964–972] plays an important role in deciding the cut-off point that controls the probabilities of misclassification. This article provides an extension of the result derived by Anderson (1973) to the case of two-step monotone missing samples under multivariate normality. Finally, numerical evaluations by Monte Carlo simulations are also presented.

19.
This paper studies the allocations of two nonidentical active redundancies in series systems in terms of the reversed hazard rate order and hazard rate order, which generalizes some results established in Valdés and Zequeira (2003, 2006) [Valdés, J. E., and R. I. Zequeira 2003. On the optimal allocation of an active redundancy in a two-component series system. Stat. Probab. Lett. 63:325–32; Valdés, J. E., and R. I. Zequeira 2006. On the optimal allocation of two active redundancies in a two-component series system. Oper. Res. Lett. 34:49–52].

20.
Coppi et al. [7: R. Coppi, P. D'Urso, and P. Giordani, Fuzzy and possibilistic clustering for fuzzy data, Comput. Stat. Data Anal. 56 (2012), pp. 915–927. doi: 10.1016/j.csda.2010.09.013] applied Yang and Wu's [20: M.-S. Yang and K.-L. Wu, Unsupervised possibilistic clustering, Pattern Recognit. 30 (2006), pp. 5–21. doi: 10.1016/j.patcog.2005.07.005] idea to propose a possibilistic k-means (PkM) clustering algorithm for LR-type fuzzy numbers. The memberships in the objective function of PkM no longer need to satisfy the constraint in fuzzy k-means that the memberships of a data point across classes sum to one. However, the clustering performance of PkM depends on the initializations and weighting exponent. In this paper, we propose a robust clustering method based on a self-updating procedure. The proposed algorithm not only solves the initialization problems but also obtains a good clustering result. Several numerical examples also demonstrate the effectiveness and accuracy of the proposed clustering method, especially the robustness to initial values and noise. Finally, three real fuzzy data sets are used to illustrate the superiority of this proposed algorithm.
