20 similar articles found; search took 46 ms.
1.
M. Schemper, Communications in Statistics: Theory and Methods, 2013, 42(13): 1655-1665
The present paper surveys tests with censored survival data for 2- and k-samples and for association, using a framework which classifies them into complete or restricted permutation tests and into tests based on U- or Savage-scores. The formulae of the resulting twelve tests are briefly described for quick reference. Some of the tests have been applied frequently in the past, such as the tests by Mantel, Breslow or Gehan; others have been developed rather recently, partly by the author. The concluding discussion presents the results of a simulation study, clarifying similarities and differences of the restricted and complete permutation approaches, and deals with relative efficiencies of the two scoring systems.
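To make the scoring concrete, here is a minimal sketch of one cell of this classification (a complete permutation test with Savage scores) for uncensored two-sample data. The function names are ours, and the paper's censored-data score formulae are not reproduced; the two-sided p-value below relies on the fact that Savage scores sum to zero.

```python
import numpy as np

def savage_scores(n):
    """Savage (exponential) scores for ranks 1..n of an uncensored sample:
    a(i) = sum_{j=n-i+1}^{n} 1/j - 1; the n scores sum to zero."""
    inv = 1.0 / np.arange(1, n + 1)
    return np.cumsum(inv[::-1]) - 1.0

def complete_permutation_test(scores, labels, n_perm=2000, seed=0):
    """Complete (unrestricted) permutation test.  The statistic is the sum
    of scores in sample 1; because the scores sum to zero, its null mean is
    zero and |statistic| gives a two-sided test."""
    rng = np.random.default_rng(seed)
    stat = scores[labels == 1].sum()
    hits = sum(abs(scores[rng.permutation(labels) == 1].sum()) >= abs(stat)
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

scores = savage_scores(10)                          # scores for pooled ranks 1..10
labels = np.array([0, 0, 1, 0, 1, 0, 1, 1, 0, 1])   # group of each ranked obs
print(complete_permutation_test(scores, labels))
```

The labels vector records the group membership of the observations after pooling and ranking the two samples.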
2.
J. M. C. Santos Silva, Communications in Statistics: Simulation and Computation, 2013, 42(3): 1089-1102
This paper suggests a flexible parametrization of the generalized Poisson regression, which is likely to be particularly useful when the sample is truncated at zero. Suitable specification tests for this case are also studied. The use of the suggested models and tests is illustrated with an application to the number of recreational fishing trips taken by households in Alaska.
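A zero-truncated Poisson regression (a special case of the truncated generalized Poisson family discussed here, without the extra dispersion parameter) can be fitted by maximum likelihood along these lines. The simulated data are purely illustrative, not the Alaska fishing-trip data.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def ztp_negloglik(beta, X, y):
    """Negative log-likelihood of zero-truncated Poisson regression:
    mu_i = exp(x_i' beta), and y_i is observed only when y_i >= 1, so the
    Poisson log-pmf is renormalized by log(1 - exp(-mu_i))."""
    mu = np.exp(X @ beta)
    ll = y * np.log(mu) - mu - gammaln(y + 1) - np.log1p(-np.exp(-mu))
    return -ll.sum()

# Simulate Poisson counts, then drop the zeros (truncation at zero)
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([0.5, 0.8])
y_full = rng.poisson(np.exp(X @ true_beta))
keep = y_full >= 1
X, y = X[keep], y_full[keep]

fit = minimize(ztp_negloglik, x0=np.zeros(2), args=(X, y), method="BFGS")
print(fit.x)   # estimates should lie near the true values [0.5, 0.8]
```

Ignoring the truncation (fitting a plain Poisson regression to the kept observations) would bias the estimates upward, which is why the renormalizing term matters.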
3.
Journal of Statistical Computation and Simulation, 2012, 82(15): 2995-3008
Among the statistical methods used to model the stochastic behaviour of objects, clustering is a preliminary technique for recognizing similar patterns within a group of observations in a data set. Various distances measuring differences among objects can be invoked to cluster data through numerous clustering methods. When the variables at hand contain geometrical information about the objects, such metrics should be adapted accordingly. In fact, statistical methods for such data are endowed with a geometrical paradigm in a multivariate sense. In this paper, a procedure for clustering shape data is suggested, employing appropriate metrics. The best candidate shape distance, as well as a suitable agglomerative method for clustering the simulated shape data, is then selected by means of cluster validation measures. The results are illustrated with a real-life application.
4.
C.H. Proctor, Communications in Statistics: Theory and Methods, 2013, 42(5): 617-638
A method is suggested for detecting spatial pattern in disease data on a rectangular lattice. One first classifies all pairs of points by their distance and orientation of separation and then counts the number of pairs of points of each distance-orientation type for which both points show the disease. A log-linear model is proposed for these counts. Methods of fitting it are suggested that furnish tests of clustering and of anisotropy. Empirical sampling results show that the tests are reasonably powerful against certain alternatives and verify that nominal levels of significance are approximately correct. The method is given in detail for a single four-by-five plot and is then adapted for combining data from many such plots. Examples are given of such counts and of the calculations used to analyze them.
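The pair-counting step can be sketched as follows; the lattice, the lag cutoff, and the function name are illustrative choices of ours, and the log-linear modelling of the resulting counts is not shown.

```python
import numpy as np
from collections import Counter

def pair_counts(grid, max_lag=2):
    """Count pairs of diseased cells (value 1) on a rectangular lattice,
    classified by separation vector (dr, dc), i.e. distance + orientation.
    Pairs farther apart than max_lag in either direction are dropped."""
    rows, cols = np.nonzero(grid)
    pts = list(zip(rows.tolist(), cols.tolist()))
    counts = Counter()
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            dr = pts[j][0] - pts[i][0]
            dc = pts[j][1] - pts[i][1]
            if dr < 0 or (dr == 0 and dc < 0):   # canonical direction
                dr, dc = -dr, -dc
            if max(abs(dr), abs(dc)) <= max_lag:
                counts[(dr, dc)] += 1
    return counts

# A single 4 x 5 plot, echoing the paper's running example (toy data)
g = np.array([[1, 0, 1, 0, 0],
              [0, 1, 0, 0, 1],
              [0, 0, 0, 1, 0],
              [1, 0, 0, 0, 0]])
print(pair_counts(g))
```

Each `(dr, dc)` key is one distance-orientation type; these counts would be the response in the proposed log-linear model.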
5.
The asymptotic efficiencies are computed for several popular two-sample rank tests when the underlying distributions are Poisson, binomial, discrete uniform, and negative binomial. The rank tests examined include the Mann-Whitney test, the van der Waerden test, and the median test. Three methods for handling ties are discussed and compared. The computed asymptotic efficiencies apply also to the k-sample extensions of the above tests, such as the Kruskal-Wallis test, etc.
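For intuition, SciPy's Mann-Whitney implementation already applies the mid-rank tie handling with a tie-corrected asymptotic normal approximation, which is one of the standard tie treatments for discrete data of this kind. A hypothetical Poisson example:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
x = rng.poisson(2.0, size=60)   # small counts, hence many ties
y = rng.poisson(2.8, size=60)

# method="asymptotic" uses the normal approximation with a tie correction;
# ties are resolved by assigning mid-ranks
res = mannwhitneyu(x, y, alternative="two-sided", method="asymptotic")
print(res.statistic, res.pvalue)
```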
6.
This paper proposes a working estimating equation which is computationally easy to use for spatial count data. The proposed estimating equation is a modification of quasi-likelihood estimating equations without the need of correctly specifying the covariance matrix. Under some regularity conditions, we show that the proposed estimator has consistency and asymptotic normality. A simulation comparison also indicates that the proposed method has competitive performance in dealing with over-dispersed data from a parameter-driven model.
7.
We set out IDR as a loglinear-model-based Moran's I test for Poisson count data that resembles the Moran's I residual test for Gaussian data. We evaluate its type I and type II error probabilities via simulations, and demonstrate its utility via a case study. When population sizes are heterogeneous, IDR is effective in detecting local clusters by local association terms with an acceptable type I error probability. When used in conjunction with local spatial association terms in loglinear models, IDR can also indicate the existence of a first-order global cluster that can hardly be removed by local spatial association terms. In this situation, IDR should not be directly applied for local cluster detection. In the case study of St. Louis homicides, we bridge loglinear model methods for parameter estimation to exploratory data analysis, so that a uniform association term can be defined with spatially varied contributions among spatial neighbors. The method makes use of exploratory tools such as Moran's I scatter plots and residual plots to evaluate the magnitude of deviance residuals, and it is effective in modelling the shape, the elevation and the magnitude of a local cluster in the model-based test.
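For readers unfamiliar with the underlying statistic, a minimal ordinary Moran's I computation on a toy weight matrix (not the IDR variant proposed here) looks like this:

```python
import numpy as np

def morans_i(values, W):
    """Moran's I for values z with spatial weight matrix W (zero diagonal):
    I = (n / sum(W)) * sum_ij W_ij z_i z_j / sum_i z_i^2, z centered."""
    z = values - values.mean()
    num = (W * np.outer(z, z)).sum()
    den = (z ** 2).sum()
    n = len(values)
    return (n / W.sum()) * (num / den)

# Adjacency weights on a 1-D chain of 5 cells (toy example)
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
print(morans_i(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), W))   # → 0.5
```

A monotone sequence on a chain is strongly positively autocorrelated, hence the clearly positive value.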
8.
The problem of testing homogeneity in contingency tables when the data are spatially correlated is considered. We derive statistics defined as divergences between unrestricted and restricted estimated joint cell probabilities and we show that they are asymptotically distributed as linear combinations of chi-square random variables under the null hypothesis of homogeneity. Monte Carlo simulation experiments are carried out to investigate the behavior of the new divergence test statistics and to make comparisons with the statistics that do not take into account the spatial correlation. We show that some of the introduced divergence test statistics have a significantly better behavior than the classical chi-square test for the problem under consideration when we compare them on the basis of the simulated sizes and powers.
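The classical baseline the authors compare against, a chi-square homogeneity test that ignores spatial correlation, is a one-liner in SciPy (toy table, for illustration only):

```python
import numpy as np
from scipy.stats import chi2_contingency

# 2 populations x 3 categories; rows are the two samples being compared
table = np.array([[30, 20, 10],
                  [25, 25, 10]])

# Classical Pearson chi-square test of homogeneity; its null chi-square
# distribution is what breaks down under spatial correlation
chi2, p, dof, expected = chi2_contingency(table)
print(chi2, p, dof)
```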
9.
It is common to test the null hypothesis that two samples were drawn from identical distributions; and the Smirnov (sometimes called Kolmogorov–Smirnov) test is conventionally applied. We present simulation results to compare the performance of this test with three recently introduced alternatives. We consider both continuous and discrete data. We show that the alternative methods preserve type I error at the nominal level as well as the Smirnov test but offer superior power. We argue for the routine replacement of the Smirnov test with the modified Baumgartner test according to Murakami (2006), or with the test proposed by Zhang (2006).
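The conventional test being challenged here is available as `scipy.stats.ks_2samp`. A scale-alternative example of the kind where the Smirnov test is known to be comparatively weak (synthetic data, illustrative only):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
a = rng.normal(0.0, 1.0, size=100)
b = rng.normal(0.0, 1.5, size=100)   # same mean, different scale

# Two-sample Smirnov test: statistic is the max gap between empirical CDFs
res = ks_2samp(a, b)
print(res.statistic, res.pvalue)
```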
10.
Somanath D. Pawar, Digambar T. Shirke, Journal of Statistical Computation and Simulation, 2019, 89(9): 1574-1591
Several nonparametric tests for the multivariate multi-sample location problem are proposed in this paper. These tests are based on the notion of data depth, which measures the centrality or outlyingness of a given point with respect to a given distribution or data cloud. The proposed tests are completely nonparametric and are implemented through the idea of permutation tests. Their performance is compared with an existing parametric test and a nonparametric test based on data depth. An extensive simulation study reveals that the proposed tests are superior in power to the existing tests based on data depth. Illustrations with real data are provided.
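A hedged sketch of the general idea, not the authors' exact tests: Mahalanobis depth combined with a Liu–Singh-type permutation test for a two-sample location shift. All function names and the choice of depth are ours.

```python
import numpy as np

def mahalanobis_depth(points, cloud):
    """Mahalanobis depth of each row of `points` w.r.t. the data `cloud`:
    D(x) = 1 / (1 + (x - mu)' S^{-1} (x - mu))."""
    mu = cloud.mean(axis=0)
    Sinv = np.linalg.inv(np.cov(cloud.T))
    d = points - mu
    md2 = np.einsum("ij,jk,ik->i", d, Sinv, d)
    return 1.0 / (1.0 + md2)

def depth_permutation_test(x, y, n_perm=499, seed=0):
    """Statistic: average depth of sample y w.r.t. sample x; small values
    mean y sits in the tails of x (a location shift).  One-sided p-value
    from the permutation distribution of the pooled sample."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n = len(x)
    stat = mahalanobis_depth(y, x).mean()
    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        xs, ys = pooled[idx[:n]], pooled[idx[n:]]
        if mahalanobis_depth(ys, xs).mean() <= stat:
            hits += 1
    return (hits + 1) / (n_perm + 1)

rng = np.random.default_rng(1)
x = rng.normal(size=(40, 2))
y = rng.normal(size=(40, 2)) + 1.5       # location-shifted sample
print(depth_permutation_test(x, y))       # small p-value: shift detected
```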
11.
Journal of Statistical Computation and Simulation, 2012, 82(8): 1544-1553
This paper considers two-sample nonparametric comparison of survival functions when data are subject to left truncation and interval censoring. We propose a class of rank-based tests, which are generalizations of the weighted log-rank tests for right-censored data. Simulation studies indicate that the proposed tests are appropriate for practical use.
12.
Coppi et al. [7] applied Yang and Wu's [20] idea to propose a possibilistic k-means (PkM) clustering algorithm for LR-type fuzzy numbers. The memberships in the objective function of PkM no longer need to satisfy the fuzzy k-means constraint that the memberships of a data point across classes sum to one. However, the clustering performance of PkM depends on the initializations and the weighting exponent. In this paper, we propose a robust clustering method based on a self-updating procedure. The proposed algorithm not only solves the initialization problems but also obtains a good clustering result. Several numerical examples also demonstrate the effectiveness and accuracy of the proposed clustering method, especially its robustness to initial values and noise. Finally, three real fuzzy data sets are used to illustrate the superiority of the proposed algorithm.
13.
Clay King, Journal of Applied Statistics, 2019, 46(4): 580-597
Quantile regression (QR) allows one to model the effect of covariates across the entire response distribution, rather than only at the mean, but QR methods have been almost exclusively applied to continuous response variables and without considering spatial effects. Of the few studies that have performed QR on count data, none have included random spatial effects, which is an integral facet of the Bayesian spatial QR model for areal counts that we propose. Additionally, we introduce a simplifying alternative to the response variable transformation currently employed in the QR for counts literature. The efficacy of the proposed model is demonstrated via simulation study and on a real data application from the Texas Department of Family and Protective Services (TDFPS). Our model outperforms a comparable non-spatial model in both instances, as evidenced by the deviance information criterion (DIC) and coverage probabilities. With the TDFPS data, we identify one of four covariates, along with the intercept, as having a nonconstant effect across the response distribution.
14.
Journal of Statistical Computation and Simulation, 2012, 82(9): 1083-1093
In the last few years, two adaptive tests for paired data have been proposed. One test, proposed by Freidlin et al. [On the use of the Shapiro–Wilk test in two-stage adaptive inference for paired data from moderate to very heavy tailed distributions, Biom. J. 45 (2003), pp. 887–900], is a two-stage procedure that uses a selection statistic to determine which of three rank scores to use in the computation of the test statistic. Another statistic, proposed by O'Gorman [Applied Adaptive Statistical Methods: Tests of Significance and Confidence Intervals, Society for Industrial and Applied Mathematics, Philadelphia, 2004], uses a weighted t-test with the weights determined by the data. These two methods, and an earlier rank-based adaptive test proposed by Randles and Hogg [Adaptive Distribution-free Tests, Commun. Stat. 2 (1973), pp. 337–356], are compared with the t-test and with Wilcoxon's signed-rank test. For sample sizes between 15 and 50, the results show that the adaptive tests proposed by Freidlin et al. and by O'Gorman have higher power than the other tests over a range of moderate to long-tailed symmetric distributions. The results also show that the test proposed by O'Gorman has greater power than the other tests for short-tailed distributions. For sample sizes greater than 50, and for small sample sizes, the adaptive test proposed by O'Gorman has the highest power for most distributions.
15.
Journal of Statistical Computation and Simulation, 2012, 82(12): 853-865
Many spatial data, such as those in climatology or environmental monitoring, are collected over irregular geographical locations. Furthermore, it is common to have multivariate observations at each location. We propose a method of segmentation of a region of interest based on such data that can be carried out in two steps: (1) clustering or classification of the irregularly sampled points and (2) segmentation of the region based on the classified points. We develop a spatially constrained clustering algorithm for segmentation of the sample points by incorporating a geographical constraint into standard clustering methods. Both hierarchical and nonhierarchical methods are considered; the latter is a modification of the seeded region growing method known in image analysis. Both algorithms work on a suitable neighbourhood structure, which can for example be defined by the Delaunay triangulation of the sample points. The number of clusters is estimated by testing the significance of successive changes in the within-cluster sum-of-squares relative to a null permutation distribution. The methodology is validated on simulated data and used in the construction of a climatology map of Ireland based on meteorological data of daily rainfall records from 1294 stations over a period of 37 years.
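The two-step idea can be approximated with standard tools: a Delaunay-based neighbourhood structure feeding connectivity-constrained hierarchical clustering (here scikit-learn's implementation, on synthetic stand-in data; the paper's permutation test for the number of clusters is not reproduced).

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.sparse import lil_matrix
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(3)
# Irregularly located sample points, each with a multivariate observation
# (a synthetic stand-in for meteorological stations)
locs = rng.uniform(0, 10, size=(200, 2))
feats = np.where(locs[:, 0] < 5, 0.0, 5.0)[:, None] + rng.normal(size=(200, 3))

# Step 1: neighbourhood structure from the Delaunay triangulation
tri = Delaunay(locs)
conn = lil_matrix((len(locs), len(locs)))
for simplex in tri.simplices:
    for i in range(3):
        for j in range(i + 1, 3):
            conn[simplex[i], simplex[j]] = 1
            conn[simplex[j], simplex[i]] = 1

# Step 2: hierarchical clustering of the features, constrained so that
# merges only join spatially neighbouring points (contiguous clusters)
labels = AgglomerativeClustering(n_clusters=2, connectivity=conn.tocsr(),
                                 linkage="ward").fit_predict(feats)
print(np.bincount(labels))
```

Because the features jump at the line x = 5, the constrained clustering should recover the two spatially contiguous halves of the region.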
16.
This paper investigates a nonparametric spatial predictor of a stationary multidimensional spatial process observed over a rectangular domain. The proposed predictor depends on two kernels in order to control both the distance between observations and that between spatial locations. The uniform almost complete consistency and the asymptotic normality of the kernel predictor are obtained when the sample considered is an alpha-mixing sequence. Numerical studies were carried out in order to illustrate the behaviour of our methodology both for simulated data and for an environmental data set.
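For intuition, here is a simplified one-kernel version of such a predictor (a Nadaraya-Watson weighted average using only the spatial-distance kernel; the paper's second kernel, on distances between observations, is omitted, and the data are simulated).

```python
import numpy as np

def nw_spatial_predict(s0, locs, z, h):
    """Nadaraya-Watson kernel predictor at site s0: a weighted average of
    the observations z, with Gaussian weights decaying in the distance
    between s0 and each sampling location."""
    d = np.linalg.norm(locs - s0, axis=1)
    w = np.exp(-0.5 * (d / h) ** 2)
    return (w * z).sum() / w.sum()

rng = np.random.default_rng(7)
locs = rng.uniform(0, 1, size=(300, 2))
# Smooth spatial surface plus observation noise
z = np.sin(2 * np.pi * locs[:, 0]) + 0.1 * rng.normal(size=300)
s0 = np.array([0.25, 0.5])
print(nw_spatial_predict(s0, locs, z, h=0.05))   # true surface value is sin(pi/2) = 1
```

The bandwidth h trades bias against variance exactly as in ordinary kernel regression; the paper's asymptotics are derived under alpha-mixing of the spatial sample rather than independence.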
17.
Journal of Statistical Computation and Simulation, 2012, 82(5): 996-1009
Compared to tests for localized clusters, tests for global clustering only collect evidence for clustering throughout the study region, without evaluating the statistical significance of individual clusters. The weighted likelihood ratio (WLR) test, based on the weighted sum of likelihood ratios, represents an important class of tests for global clustering. Song and Kulldorff (Likelihood based tests for spatial randomness. Stat Med. 2006;25(5):825–839) developed a wide variety of weight functions for the WLR test for global clustering. However, these weight functions are often defined based on the cell population size or on geographic information such as area size and distance between cells. They do not make use of the information in the observed count, although the likelihood ratio of a potential cluster depends on both the observed count and its population size. In this paper, we develop a self-adjusted weight function to directly allocate weights onto the likelihood ratios according to their values. The power of the test was evaluated and compared with existing methods based on a benchmark data set. The comparison results favour the suggested test, especially under global chain clustering models.
18.
We consider a class of finite state, two-dimensional Markov chains which can produce a rich variety of patterns and whose simulation is very fast. A parameterization is chosen to make the process nearly spatially homogeneous. We use a form of pseudo-likelihood estimation which results in quick determination of estimates. Parameters associated with boundary cells are estimated separately. We derive the asymptotic distribution of the maximum pseudo-likelihood estimates and show that the usual form of the variance matrix has to be modified to take account of local dependence. Standard error calculations based on the modified asymptotic variance are supported by a simulation study. The procedure is applied to an eight-state permeability pattern from a section of hydrocarbon reservoir rock.
19.
Monte Carlo methods are used to examine the small-sample properties of 11 test statistics that can be used for comparing several treatments with respect to their mortality experiences while adjusting for covariables. The test statistics are investigated under three distinct models: the parametric, semiparametric and rank analysis of covariance (Quade, 1967) models. Four tests (likelihood ratio, Wald, conditional and unconditional score tests) from each of the first two models and three tests (based on rank scores) from the last model are discussed. The empirical size and power of the tests are investigated under a proportional hazards model in three situations: (1) the baseline hazard is correctly assumed to be exponential, (2) the baseline hazard is incorrectly assumed to be exponential, and (3) a treatment-covariate interaction is omitted from the analysis.
20.
Hamid Shahriari, Journal of Applied Statistics, 2015, 42(6): 1183-1205
The first step in statistical analysis is parameter estimation. In multivariate analysis, one of the parameters of interest is the mean vector. In multivariate statistical analysis, it is usually assumed that the data come from a multivariate normal distribution; in this situation the maximum likelihood estimator (MLE), the sample mean vector, is the best estimator. However, when outliers exist in the data, the sample mean vector gives poor estimates, so estimators which are robust to the existence of outliers should be used. The most popular robust multivariate estimator of the mean vector is the S-estimator, which has desirable properties. However, computing this estimator requires a robust estimate of the mean vector as a starting point; usually the minimum volume ellipsoid (MVE) is used. For high-dimensional data, computing the MVE takes too much time; in some cases the time is so large that existing computers cannot perform the computation. In addition to the computation time, for high-dimensional data sets the MVE method is not precise. In this paper, a robust starting point for the S-estimator based on robust clustering is proposed, which can be used to estimate the mean vector of high-dimensional data. The performance of the proposed estimator in the presence of outliers is studied, and the results indicate that it performs precisely and much better than some of the existing robust estimators for high-dimensional data.
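For comparison, a standard fast robust alternative to the MVE starting point, the minimum covariance determinant (MCD) as implemented in scikit-learn, illustrates the outlier problem and its remedy on synthetic data. This is not the paper's clustering-based estimator.

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))    # clean data centered at the zero vector
X[:20] += 10.0                 # 10% of rows contaminated by a large shift

print(X.mean(axis=0))          # sample mean: pulled toward the outliers
mcd = MinCovDet(random_state=0).fit(X)
print(mcd.location_)           # robust location: close to the true mean 0
```

The MCD fits on the subsample of about half the points with the smallest covariance determinant, so the 20 contaminated rows are effectively excluded from the location estimate.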