Similar articles
 Found 20 similar articles; search took 93 ms
1.
For two or more populations whose covariance matrices have a common set of eigenvectors but different sets of eigenvalues, the common principal components (CPC) model is appropriate. Pepler et al. (2015) proposed a regularized CPC covariance matrix estimator and showed that this estimator outperforms the unbiased and pooled estimators in situations where the CPC model is applicable. This article extends their work to the context of discriminant analysis for two groups by plugging the regularized CPC estimator into the ordinary quadratic discriminant function. Monte Carlo simulation results show that CPC discriminant analysis offers significant improvements in misclassification error rates in certain situations, and at worst performs similarly to ordinary quadratic and linear discriminant analysis. Based on these results, CPC discriminant analysis is recommended for situations where the sample size is small compared to the number of variables, in particular for cases where there is uncertainty about the population covariance matrix structures.

2.
There are many situations where n objects are ranked by b>2 independent sources or observers and in which the interest is focused on agreement on the top rankings. Kendall's coefficient of concordance [10] assigns equal weights to all rankings. In this paper, a new coefficient of concordance is introduced which is more sensitive to agreement on the top rankings. The limiting distribution of the new concordance coefficient under the null hypothesis of no association among the rankings is presented, and a summary of the exact and approximate quantiles for this coefficient is provided. A simulation study is carried out to compare the performance of Kendall's, the top-down and the new concordance coefficients in detecting agreement on the top rankings. Finally, examples are given for illustration purposes, including a real data set from financial market indices.
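
As a point of reference for the equal-weights baseline mentioned above, the following is a minimal sketch of the classical Kendall coefficient of concordance for b complete rankings of n objects without ties. It does not implement the new top-weighted coefficient, whose definition is not given in the abstract; the function name `kendalls_w` and the input layout (one list of ranks per observer) are assumptions made for illustration.

```python
def kendalls_w(rankings):
    """Classical Kendall's W for b rankings of n objects (no ties assumed).

    rankings: list of b lists, each a permutation of the ranks 1..n.
    """
    b = len(rankings)        # number of observers (sources)
    n = len(rankings[0])     # number of ranked objects
    # Sum of the ranks that each object received across all observers.
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    # Every rank total has this expected value when ranks are 1..n.
    mean_total = b * (n + 1) / 2
    # Sum of squared deviations of the rank totals from their mean.
    s = sum((t - mean_total) ** 2 for t in totals)
    # Normalise so that W = 1 under perfect agreement.
    return 12 * s / (b ** 2 * (n ** 3 - n))


# Three observers who agree completely on four objects.
w_perfect = kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

Because every rank enters the squared deviations with the same weight, disagreement at the top of the list counts no more than disagreement at the bottom, which is the limitation the abstract sets out to address.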

3.
A set of \(n\)-principal points of a \(p\)-dimensional distribution is an optimal \(n\)-point approximation of the distribution in terms of a squared error loss. It is in general difficult to derive an explicit expression for principal points. Hence, we may have to search the whole space \(R^p\) for \(n\)-principal points. Many efforts have been devoted to establishing results that specify a linear subspace in which principal points lie. However, previous studies focused on elliptically symmetric distributions and location mixtures of spherically symmetric distributions, which may not be suitable for many practical situations. In this paper, we deal with a mixture of elliptically symmetric distributions that form an allometric extension model, which has been widely used in the context of principal component analysis. We give conditions under which principal points lie in the linear subspace spanned by the first several principal components.

4.
Three new weighted rank correlation coefficients are proposed which are sensitive to agreement on both the top and bottom rankings. The first one is based on the weighted rank correlation coefficient proposed by Maturi and Abdelfattah [13]; the second and the third are based on the order statistics and the quantiles of the Laplace distribution, respectively. The limiting distributions of the new correlation coefficients under the null hypothesis of no association between the rankings are presented, and a summary of the exact and approximate quantiles for these coefficients is provided. A simulation study is performed to compare the performance of Kendall's tau, Spearman's rho, and the new weighted rank correlation coefficients in detecting agreement on the top and the bottom rankings simultaneously. Finally, examples are given for illustration purposes, including a real data set from financial market indices.

5.
The main purpose of the present work is to introduce and investigate a simple kernel procedure based on marginal integration that estimates the regression function for stationary and ergodic continuous time processes in the setting of the additive model introduced by Stone (1985). We obtain the uniform almost sure consistency with exact rate and the asymptotic normality of the kernel-type estimators of the components of the additive model. Asymptotic properties of these estimators are obtained, under mild conditions, by means of martingale approaches. Finally, a general notion of the bootstrapped additive components, constructed by exchangeably weighting the sample, is presented.

6.
Omission of relevant explanatory variables and multicollinearity in regression models are very serious problems in applied work. Some papers examine multicollinearity and misspecification due to the omission of relevant explanatory variables concurrently. To remedy the problem of multicollinearity, Kaçıranlar and Sakallıoğlu (2001) proposed the r-d class estimator, which includes the ordinary least squares, principal components regression, and Liu estimators as special cases. The aim of this paper is to examine the performance of the r-d class estimator in misspecified linear models.

7.
Frequently, the main objective of statistically designed simulation experiments is to estimate and validate regression metamodels, where the regressors are functions of the design variables and the dependent variable is the system response. In this article, a weighted least squares procedure for estimating the unknown parameters of a nonlinear regression metamodel is formulated and evaluated. Since the validity of a fitted regression model must be tested, a method for validating nonlinear regression simulation metamodels is presented. This method is a generalization of the cross-validation test proposed by Kleijnen (1983) in the context of linear regression metamodels. One drawback of the cross-validation strategy is the need to perform a large number of nonlinear regressions if the number of experimental points is large. In this article, cross-validation is implemented using only one nonlinear regression. The proposed statistical analysis allows us to obtain Scheffé-type simultaneous confidence intervals for linear combinations of the metamodel's unknown parameters. Using the well-known M/M/1 example, a metamodel is built and validated with the aid of the proposed procedure.

8.
For regression problems with grouped covariates, we adapt the idea of the sparse group lasso (SGL) [10] to the framework of sufficient dimension reduction. Assuming that the regression falls into a single-index structure, we propose a method called sparse group sufficient dimension reduction to conduct group and within-group variable selection simultaneously without assuming a specific link function. Simulation studies show that our method is comparable to the SGL under the regular linear model setting and outperforms the SGL, with higher true positive rates and substantially lower false positive rates, when the regression function is nonlinear. One immediate application of our method is to gene pathway data analysis, where genes naturally fall into groups (pathways). An analysis of a glioblastoma microarray data set is included to illustrate our method.

9.
The Friedman test is often used for a randomized complete block design when the normality assumption is not satisfied or the data are ordinal. The Friedman test can be viewed as an extension of the sign test for multiple measurements within each subject or block. We propose a modified Friedman test based on the Wilcoxon signed-rank approach. Coincidentally, Tukey proposed a test statistic similar to our proposed test, but in which blocks are ranked by the minimum difference within each block. In the proposed test, we use the variance of each block to rank the blocks, with the least variance being ranked the smallest. In both the Tukey test and the modified Friedman test, linear ranks are used for blocks and treatments. The Tukey test belongs to the family of weighted-ranking tests from Quade (1979). The modified Friedman test, the Friedman test and the Tukey test are compared under various conditions, and the results indicate that the proposed test is generally more powerful than the Friedman test and the Tukey test when the number of groups is small.
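
For orientation, the ordinary (unmodified) Friedman statistic that serves as the baseline above can be sketched as follows. This is not the modified test of the abstract, whose block weighting is only described in words; the function name `friedman_statistic`, the input layout (one list of treatment responses per block), and the absence of ties within blocks are assumptions made for illustration.

```python
def friedman_statistic(data):
    """Ordinary Friedman chi-square statistic (no ties within blocks assumed).

    data: list of b blocks, each a list of k treatment responses.
    """
    b = len(data)       # number of blocks
    k = len(data[0])    # number of treatments
    rank_sums = [0] * k
    for block in data:
        # Rank the treatments within this block, 1 = smallest response.
        order = sorted(range(k), key=lambda j: block[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    # Q = 12 / (b k (k + 1)) * sum_j R_j^2 - 3 b (k + 1)
    sum_sq = sum(r * r for r in rank_sums)
    return 12 * sum_sq / (b * k * (k + 1)) - 3 * b * (k + 1)


# Three blocks in which the three treatments are always ordered the same way.
q_example = friedman_statistic([[1.0, 2.0, 3.0], [0.5, 1.5, 2.5], [10, 20, 30]])
```

Every block contributes its within-block ranks with equal weight here; weighted-ranking tests such as those discussed in the abstract differ by also ranking the blocks themselves.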

10.
Double outward box distributed residuals are another type of nonmonotonic heteroscedasticity that severely violates the homoscedasticity assumption. In this study, Çelik's (2015) WCEV (weighting absolute centered external variable) is applied to double outward box distributed residuals to provide homoscedasticity for simple and multiple regression models.

11.
Communications in Statistics - Theory and Methods, 2012, 41(13-14): 2394-2404
Sousa et al. (2010) introduced a ratio estimator for the mean of a sensitive variable and showed that this estimator performs better than the ordinary mean estimator based on a randomized response technique (RRT). In this article, we introduce a regression estimator that performs better than the ratio estimator even for modest correlation between the primary and the auxiliary variables. The underlying assumption is that the primary variable is sensitive in nature but a nonsensitive auxiliary variable exists that is positively correlated with the primary variable. Expressions for the bias and mean square error (MSE) are derived based on the first order of approximation. It is shown that the proposed regression estimator performs better than the ratio estimator and the ordinary RRT mean estimator (which does not utilize the auxiliary information). We also consider a generalized regression-cum-ratio estimator that has even smaller MSE. An extensive simulation study is presented to evaluate the performance of the proposed estimators in relation to other estimators in the study. The procedure is also applied to some financial data: purchase orders (a sensitive variable) and gross turnover (a nonsensitive variable) in 2009 for a population of 5,336 companies in Portugal from a survey on Information and Communication Technologies (ICT) usage.

12.
This article considers some classes of estimators of the population median of the study variable that use information on an auxiliary variable, along with their properties under a large sample approximation. The asymptotic optimum estimator (AOE) in each class of estimators is investigated, along with the approximate mean square error formulae. It is shown that the proposed classes of estimators are better than those considered by Gross (1980), Kuk and Mak (1989), Singh et al. (2003a), and Al and Cingi (2009). An empirical study is carried out to judge the merits of the suggested classes of estimators over other existing estimators.

13.
Robust parameter design has been widely used to improve the quality of products and processes. Although a product array, in which an orthogonal array for control factors is crossed with an orthogonal array for noise factors, is commonly used for parameter design experiments, this may lead to an unacceptably large number of experimental runs. The compound noise strategy proposed by Taguchi [30] can be used to reduce the number of experimental runs. In this strategy, a compound noise factor is formed based on the directionality of the effects of noise factors. However, the directionality is usually unknown in practice. Recently, Singh et al. [28] proposed a random compound noise strategy, in which a compound noise factor is formed by randomly selecting a setting of the levels of noise factors. The present paper evaluates the random compound noise strategy in terms of the precision of the estimators of the response mean and the response variance. In addition, the variances of the estimators in the random compound noise strategy are compared with those in the n-replication design. The random compound noise strategy is shown to have smaller variances of the estimators than the 2-replication design, especially when the control-by-noise interactions are strong.

14.
Best et al. (2000) suggested tests based on partitioning the \(X^2\) statistic into relevant components of location, dispersion, and skewness effects for testing equality of each effect for ordinal preference data. It is known that the chi-square approximation requires large counts for categories. Therefore, in this study, we investigate a permutation approach for these statistics and compare the performance of these tests in a simulation study. In addition, the permutation approach can be used to produce a product map that classifies the products. We illustrate the approach with a real data example.

15.
When a sufficient correlation between the study variable and the auxiliary variable exists, the ranks of the auxiliary variable are also correlated with the study variable, and thus these ranks can be used as an effective tool for increasing the precision of an estimator. In this paper, we propose a new improved estimator of the finite population mean that incorporates the supplementary information in the form of: (i) the auxiliary variable and (ii) ranks of the auxiliary variable. Mathematical expressions for the bias and the mean-squared error of the proposed estimator are derived under the first order of approximation. The theoretical and empirical studies reveal that the proposed estimator always performs better than the usual mean, ratio, product, exponential-ratio and exponential-product, and classical regression estimators, and the Rao (1991), Singh et al. (2009), Shabbir and Gupta (2010), and Grover and Kaur (2011, 2014) estimators.

16.
The concept of negative variance components in linear mixed-effects models, while confusing at first sight, has received considerable attention in the literature for well over half a century, following the early work of Chernoff [7] and Nelder [21]. Broadly, negative variance components in linear mixed models are allowable if inferences are restricted to the implied marginal model. When a hierarchical viewpoint is adopted, in the sense that outcomes are specified conditionally upon random effects, the variance–covariance matrix of the random effects must be positive-definite (positive-semi-definite is also possible, but raises issues of degenerate distributions). Many contemporary software packages allow for this distinction. Less work has been done for generalized linear mixed models. Here, we study such models, with extension to allow for overdispersion, for non-negative outcomes (counts). Using a study of trichome counts on tomato plants, it is illustrated how such negative variance components play a natural role in modeling both the correlation between repeated measures on the same experimental unit and over- or underdispersion.

17.
We consider a nonlinear censored regression problem with a vector of predictors. With censoring, high-dimensional regression analysis becomes much more complicated. Since censoring can cause severe bias in estimation, a modification is needed to adjust for such bias. Based on weight adjustment, we develop a modification of sliced average variance estimation for estimating the lifetime central subspace without requiring a prespecified parametric model. Our proposed method preserves as much regression information as possible. Simulation results are reported and comparisons are made with the sliced inverse regression of Li et al. (1999).

18.
We adopt boosting for classification and selection of high-dimensional binary variables for which classical methods based on normality and nonsingular sample dispersion are inapplicable. Boosting seems particularly well suited for binary variables. We present three methods, two of which combine boosting with the relatively classical variable selection methods developed in Wilbur et al. (2002). Our primary interest is variable selection in classification, with small misclassification error being used as validation of the proposed method for variable selection. Two of the new methods perform uniformly better than Wilbur et al. (2002) in one set of simulated examples and three real-life examples.

19.
In this research, multiple dependent state and repetitive group sampling are used to design a variable sampling plan based on one-sided process capability indices, which consider the quality of the current lot as well as the quality of the preceding lots. The sample size and critical values of the proposed plan are determined by minimizing the average sample number while satisfying the producer's risk and consumer's risk at corresponding quality levels. In addition, comparisons are made with the existing sampling plans [Pearn and Wu (2006a), Yen et al. (2015)] in terms of average sample number and operating characteristic curve. Finally, an example is provided to illustrate the proposed plan.

20.
Non Symmetric Correspondence Analysis (NSCA) (D'Ambra and Lauro, 1989) is a useful technique for analyzing a two-way contingency table.

The key difference between the symmetrical and non symmetrical versions of correspondence analysis rests on the measure of the association used to quantify the relationship between the variables. For a two-way, or multi-way, contingency table, the Pearson chi-squared statistic is commonly used when it can be assumed that the categorical variables are symmetrically related. However, for a two-way table, it may be that one variable can be treated as a predictor variable and the second variable can be considered as a response variable.

Yet, for such a variable structure, the Pearson chi-squared statistic is not an appropriate measure of the association. Instead, one may consider the Goodman-Kruskal tau index. In the case that there are more than two cross-classified variables, multivariate versions of the Goodman-Kruskal tau index can be considered. These include Marcotorchino's index (Marcotorchino, 1985) and the Gray-Williams index (Gray and Williams, 1975).

In this article, Multiple Non Symmetric Correspondence Analysis (MNSCA), along with the Gray-Williams decomposition of the tau index into main effects and interaction (D'Ambra et al., 2011), is used to evaluate the innovative performance of the manufacturing enterprises in Campania.

Finally, to identify categories that are statistically significant, confidence ellipses are proposed for Multiple Non Symmetric Correspondence Analysis, starting from the ellipses suggested by Beh (2010) for the symmetrical analysis.
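
To make the asymmetric measure of association underlying NSCA concrete, here is a minimal sketch of the two-way Goodman-Kruskal tau index, computed from a table of counts with the row variable as predictor and the column variable as response. It covers only the two-way case, not the multivariate Marcotorchino or Gray-Williams indices; the function name `goodman_kruskal_tau` and the row/column orientation are assumptions made for illustration.

```python
def goodman_kruskal_tau(table):
    """Goodman-Kruskal tau for a two-way table of counts.

    table: list of rows; rows index the predictor categories and
    columns index the response categories (an assumed orientation).
    """
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    # Marginal proportions of the response (column) variable.
    col_props = [sum(row[j] for row in table) / total for j in range(n_cols)]
    marginal_term = sum(p * p for p in col_props)
    # Sum over cells of p_ij^2 / p_i+, skipping empty predictor categories.
    conditional_term = 0.0
    for row in table:
        row_total = sum(row)
        if row_total == 0:
            continue
        conditional_term += sum((c / total) ** 2 for c in row) / (row_total / total)
    # Proportional reduction in prediction error for the response.
    # Undefined (division by zero) if the response has a single category.
    return (conditional_term - marginal_term) / (1.0 - marginal_term)


# The predictor determines the response exactly, so tau equals 1.
tau_perfect = goodman_kruskal_tau([[10, 0], [0, 10]])
```

Unlike the Pearson chi-squared statistic, the value generally changes if the roles of rows and columns are swapped, which is what makes the index suitable when one variable is treated as the predictor and the other as the response.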
