1.

Prediction bands for the EDF of a future sample

Jesse Frey 《Journal of statistical planning and inference》2012,142(2):506-515

We develop both nonparametric and parametric methods for obtaining prediction bands for the empirical distribution function (EDF) of a future sample. These methods yield simultaneous prediction intervals for all order statistics of the future sample, and they also correspond to tests for the two-sample problem. The nonparametric prediction bands correspond to the two-sample Kolmogorov-Smirnov test and related nonparametric tests, but the parametric prediction bands correspond to entirely new parametric two-sample tests. The parametric prediction bands tend to outperform the nonparametric bands when the parametric assumptions hold, but they may have true coverage probabilities well below their nominal levels when the parametric assumptions fail. A new computational algorithm is used to obtain critical values in the nonparametric case.
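
The abstract above links nonparametric prediction bands to the two-sample Kolmogorov-Smirnov test. As a minimal sketch of that test statistic only (not the article's algorithm for critical values), the largest gap between two EDFs can be computed as follows. The function names `edf` and `ks_two_sample` are illustrative assumptions, not taken from the article.

```python
from bisect import bisect_right


def edf(sample):
    """Return the empirical distribution function of `sample` as a callable."""
    xs = sorted(sample)
    n = len(xs)
    # F(t) = (number of observations <= t) / n
    return lambda t: bisect_right(xs, t) / n


def ks_two_sample(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: sup over t of |F_x(t) - F_y(t)|."""
    fx, fy = edf(x), edf(y)
    # Both EDFs are right-continuous step functions that jump only at observed
    # values, so the supremum is attained at one of the pooled observations.
    return max(abs(fx(t) - fy(t)) for t in list(x) + list(y))
```

A band of half-width d around the first sample's EDF contains the second sample's EDF exactly when this statistic is at most d, which is the sense in which a band corresponds to a test.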

2.

Two-sample tests for sparse high-dimensional binary data

Amanda Plunkett 《Communications in Statistics - Theory and Methods》2017,46(22):11181-11193

In this article, we study methods for two-sample hypothesis testing of high-dimensional data coming from a multivariate binary distribution. We test the random projection method and apply an Edgeworth expansion for improvement. Additionally, we propose new statistics which are especially useful for sparse data. We compare the performance of these tests in various scenarios through simulations run in a parallel computing environment. We also apply these tests to the 20 Newsgroup data, showing that our proposed tests have considerably higher power than the others for differentiating groups of news articles with different topics.

3.

Two-sample tests for survival data from observational studies

Chenxi Li 《Lifetime data analysis》2018,24(3):509-531

When observational data are used to compare treatment-specific survivals, regular two-sample tests, such as the log-rank test, need to be adjusted for the imbalance between treatments with respect to baseline covariate distributions. Besides, the standard assumption that survival time and censoring time are conditionally independent given the treatment, required for the regular two-sample tests, may not be realistic in observational studies. Moreover, treatment-specific hazards are often non-proportional, resulting in small power for the log-rank test. In this paper, we propose a set of adjusted weighted log-rank tests and their supremum versions by inverse probability of treatment and censoring weighting to compare treatment-specific survivals based on data from observational studies. These tests are proven to be asymptotically correct. Simulation studies show that with realistic sample sizes and censoring rates, the proposed tests have the desired Type I error probabilities and are more powerful than the adjusted log-rank test when the treatment-specific hazards differ in non-proportional ways. A real data example illustrates the practical utility of the new methods.

4.

Discretizing a compound distribution with application to categorical modelling

Monique Graf Desislava Nedyalkova 《Statistics》2017,51(3):685-710

Many probability distributions can be represented as compound distributions. Consider some parameter vector as random. The compound distribution is the expected distribution of the variable of interest given the random parameters. Our idea is to define a partition of the domain of definition of the random parameters, so that we can represent the expected density of the variable of interest as a finite mixture of conditional densities. We then model the mixture probabilities of the conditional densities using information on population categories, thus modifying the original overall model. We thus obtain specific models for sub-populations that stem from the overall model. The distribution of a sub-population of interest is thus completely specified in terms of mixing probabilities. All characteristics of interest can be derived from this distribution and the comparison between sub-populations easily proceeds from the comparison of the mixing probabilities. A real example based on EU-SILC data is given. Then the methodology is investigated through simulation.

5.

On a class of partially sequential two-sample test procedures for multivariate continuous data

Gopaldeb Chattopadhyay 《Statistics》2015,49(2):455-473

In a two-sample testing problem, observations from one of the samples are sometimes more difficult and/or costlier to collect than those from the other. Also, it may be that sample observations from one of the populations have been previously collected and, for operational advantages, we do not wish to collect any more observations from the second population than are necessary for reaching a decision. The partially sequential technique is found to be very useful in such situations. The technique gained its popularity in the statistics literature because it capitalizes on the best aspects of both fixed and sequential procedures. The literature is enriched with various types of partially sequential techniques usable under different data set-ups. Nonetheless, there is no mention of a multivariate data framework in this context, although such data are very common in practice. The present paper aims at developing a class of partially sequential nonparametric test procedures for two-sample multivariate continuous data. For this we suggest a suitable stopping rule adopting an inverse sampling technique and propose a class of test statistics based on the samples drawn using the suggested sampling scheme. Various asymptotic properties of the proposed tests are explored. An extensive simulation study is also performed to study the asymptotic performance of the tests. Finally, the benefit of the proposed test procedure is demonstrated with an application to real-life data on liver disease.

6.

Goodness-of-Fit Tests for the Additive Risk Model with (p > 2)-Dimensional Time-Invariant Covariates

Kim Jinheum Song Moon Sup Lee Seungyeoun 《Lifetime data analysis》1998,4(4):405-416

This paper presents methods for checking the goodness-of-fit of the additive risk model with p(> 2)-dimensional time-invariant covariates. The procedures are an extension of Kim and Lee (1996), who developed a test to assess the additive risk assumption for two-sample censored data. We apply the proposed tests to survival data from South Wales nickel refinery workers. Simulation studies are carried out to investigate the performance of the proposed tests for practical sample sizes.

7.

A two-sample empirical likelihood ratio test based on samples entropy

Gregory Gurevich Albert Vexler 《Statistics and Computing》2011,21(4):657-670

Powerful entropy-based tests for normality, uniformity and exponentiality have been well addressed in the statistical literature. The density-based empirical likelihood approach improves the performance of these tests for goodness-of-fit, forming them into approximate likelihood ratios. This method is extended to develop two-sample empirical likelihood approximations to optimal parametric likelihood ratios, resulting in an efficient test based on samples entropy. The proposed and examined distribution-free two-sample test is shown to be very competitive with well-known nonparametric tests. For example, the new test has high and stable power detecting a nonconstant shift in the two-sample problem, when Wilcoxon’s test may break down completely. This is partly due to the inherent structure developed within Neyman-Pearson type lemmas. The outputs of an extensive Monte Carlo analysis and real data example support our theoretical results. The Monte Carlo simulation study indicates that the proposed test compares favorably with the standard procedures, for a wide range of null and alternative distributions.

8.

Checking the censored two-sample accelerated life model using integrated cumulative hazard difference

Lee SH Yang S 《Lifetime data analysis》2007,13(3):371-380

In this paper, new statistical tests for the censored two-sample accelerated life model are discussed. From the estimating functions using integrated cumulative hazard difference, stochastic processes are constructed. They can be described by martingale residuals, and, given the data, conditional distributions can be approximated by zero mean Gaussian processes. The new methods, based on these processes, provide asymptotically consistent tests against a general departure from the model. A graphical method is also discussed. In various numerical studies, the new tests performed better than the existing method, especially when the hazard curves cross. The proposed procedures are illustrated with two real data sets.

9.

Closed testing procedures for all pairwise comparisons in a randomized block design

Taka-Aki Shiraishi Shin-Ichi Matsuda 《Communications in Statistics - Theory and Methods》2018,47(15):3571-3587

We consider multiple comparison test procedures among treatment effects in a randomized block design. We propose closed testing procedures based on maximum values of some two-sample t test statistics and based on F test statistics. It is shown that the proposed procedures are more powerful than single-step procedures and the REGW (Ryan/Einot–Gabriel/Welsch)-type tests. Next, we consider the randomized block design under simple ordered restrictions of treatment effects. We propose closed testing procedures based on maximum values of two-sample one-sided t test statistics and based on Bartholomew’s statistics for all pairwise comparisons of treatment effects. Although single-step multiple comparison procedures are utilized in general, the power of these procedures is low for a large number of groups. The closed testing procedures stated in the present article are more powerful than the single-step procedures. Simulation studies are performed under the null hypothesis and some alternative hypotheses. In these studies, the proposed procedures show good performance.

10.

Šidák-type tests for the two-sample problem based on precedence and exceedance statistics

Eugenia Stoimenova N. Balakrishnan 《Statistics》2017,51(2):247-264

This paper deals with a class of nonparametric two-sample tests for ordered alternatives. The test statistics proposed are based on the number of observations from one sample that precede or exceed a threshold specified by the other sample, and they are extensions of Šidák's test. We derive their exact null distributions and also discuss a large-sample approximation. We then study their power properties exactly against the Lehmann alternative and make some comparative comments. Finally, we present an example to illustrate the proposed tests.
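
The counting idea in the abstract above can be illustrated with a short sketch. It assumes, purely for illustration, that the threshold is an order statistic of the second sample: the `r`-th smallest value for precedences and the `s`-th largest for exceedances. The function names and the parameters `r` and `s` are hypothetical, and the article's actual test statistics and null distributions are not reproduced here.

```python
def precedence_count(x, y, r=1):
    """Count observations in x that fall below the r-th smallest value of y."""
    threshold = sorted(y)[r - 1]
    return sum(1 for xi in x if xi < threshold)


def exceedance_count(x, y, s=1):
    """Count observations in x that fall above the s-th largest value of y."""
    threshold = sorted(y)[-s]
    return sum(1 for xi in x if xi > threshold)
```

With `x = [1, 2, 7, 9]` and `y = [3, 4, 8]`, two observations of `x` precede the smallest `y` value and one exceeds the largest.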

11.

Optimal weighted two-sample t-test with partially paired data in a unified framework

Xu Guo Yan Wang Niwen Zhou Xuehu Zhu 《Journal of applied statistics》2021,48(6):961

In this paper, we provide a unified framework for the two-sample t-test with partially paired data. We show that many existing two-sample t-tests with partially paired data can be viewed as special members in our unified framework. Some shortcomings of these t-tests are discussed. We also propose the asymptotically optimal weighted linear combination of the test statistics comparing all four paired and unpaired data sets. Simulation studies are used to illustrate the performance of our proposed asymptotically optimal weighted combinations of test statistics and to compare them with some existing methods. It is found that our proposed test statistic is generally more powerful. Three real data sets about CD4 count, DNA extraction concentrations, and the quality of sleep are also analyzed by using our newly introduced test statistic.

12.

Power Comparison of Multivariate Wilcoxon-type Tests Based on the Jurečková–Kalina’s Ranks of Distances

Hidetoshi Murakami 《Communications in Statistics - Simulation and Computation》2015,44(8):2176-2194

A multivariate two-sample testing problem is one of the most important topics in nonparametric statistics. One of the multivariate two-sample testing problems based on the Jurečková–Kalina ranks of distances is discussed in this article. Further, a multivariate Wilcoxon-type test is proposed for testing the equality of two continuous distribution functions. Simulations are used to investigate the power of this test for the two-sided alternative with various population distributions. The results show that the proposed test statistic is more suitable than various existing statistics for testing a shift in the location and location-scale parameters.

13.

Regression imputation with Q-mode clustering for rounded zero replacement in high-dimensional compositional data

Jiajia Chen Karel Hron Matthias Templ Shengjia Li 《Journal of applied statistics》2018,45(11):2067-2080

The logratio methodology is not applicable when rounded zeros occur in compositional data. There are many methods to deal with rounded zeros. However, some methods are not suitable for analyzing data sets with high dimensionality. Recently, related methods have been developed, but they cannot balance the calculation time and accuracy. For further improvement, we propose a method based on regression imputation with Q-mode clustering. This method forms the groups of parts and builds partial least squares regression with these groups using centered logratio coordinates. We also prove that using centered logratio coordinates or isometric logratio coordinates in the response of partial least squares regression gives equivalent results for the replacement of rounded zeros. A simulation study and a real example are used to analyze the performance of the proposed method. The results show that the proposed method can reduce the calculation time in higher dimensions and improve the quality of results.
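
To see why zeros block the logratio approach mentioned above, consider the standard centered logratio transform, which subtracts the mean log from the log of each part. The sketch below shows only that transform. The function name `clr` is an illustrative assumption, and the article's imputation method is not implemented here.

```python
import math


def clr(composition):
    """Centered logratio coordinates: log of each part minus the mean log."""
    # math.log raises ValueError for a zero part, which is why rounded
    # zeros must be replaced before any logratio analysis.
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]
```

The resulting coordinates always sum to zero (up to rounding), and a composition such as `[0.5, 0.5, 0.0]` cannot be transformed at all until its zero is replaced.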

14.

A general class of non parametric tests for comparing scale parameters

Narinder Kumar Manish Goyal 《Communications in Statistics - Theory and Methods》2018,47(24):5956-5972

In this paper, a general class of non parametric tests is proposed for the two-sample scale problem. Testing of the scale parameter is very useful in real-life situations commonly faced in engineering, trade, cultivation, industries, medicine, etc. In all these fields, one will prefer the method that gives more consistent results. Thus, it is worthwhile to test the equality of scale parameters. The distribution of the proposed test is established. To assess the performance of the proposed test, the asymptotic efficacies are studied for some underlying distributions and the results are interpreted with useful information. To illustrate the working of the proposed test, an example using a real-life data set is provided. A simulation study is also carried out to find the asymptotic power of the proposed test. An extension of the general class of tests to the multiple-sample problem is also discussed.

15.

A TEST OF HOMOGENEITY AGAINST SCALE ALTERNATIVES USING SUBSAMPLE EXTREMA

A. P. Gore Ashok Shanubhogue 《Australian & New Zealand Journal of Statistics》1985,27(3):252-258

In this paper a new class of non-parametric tests for testing homogeneity of several populations against scale alternatives is proposed. For this, independent samples of fixed sizes are drawn from each population and from these samples, all possible sub-samples of the same size are drawn and their maxima and minima are computed. Using these extrema, the class of tests is obtained. Tests of this type have been offered for the two-sample slippage problem by Kochar (1978). Under certain conditions, this class of tests is shown to be consistent against ‘difference in scale’ alternatives. The test has been compared with Bhapkar's V-test (1961), Deshpande's D-test (1965), Sugiura's D_rs-test (1965) and with a classical test given by Lehmann (1959, pp. 273–275). It is shown that some members of this proposed class of tests are more efficient than the first three tests in the case of uniform, Laplace and normal distributions, when the number of populations compared is small.

16.

Empirical receiver operating characteristic curve for two-sample comparison with cure fractions

Xiaobing Zhao Xian Zhou 《Lifetime data analysis》2010,16(3):316-332

Two-sample comparison of survival times with “cured patients” is of major interest and a challenging issue in many areas, particularly in cancer clinical research. Recently, several authors have proposed various procedures of comparison, including tests of no overall, no short-term and no long-term differences between two samples. In clinical practice, it is often of interest to detect the difference in treatment effects among noncured patients regardless of the difference between cure fractions. In this paper, we propose a statistical test to compare two samples with cured patients and possibly heterogeneous treatment effects based on a class of semi-parametric transformation models, and our main focus is on the survival times of noncured patients. The empirical and quantile processes are used to construct strong approximations for the empirical curves. The two-sample test is then constructed from general least squares estimators derived from these processes. Simulation results show that the proposed test performs well. As an example of application, a set of bladder cancer data is analyzed to illustrate the proposed methods.

17.

Nonparametric tests for left-truncated and interval-censored data

《Journal of Statistical Computation and Simulation》2012,82(8):1544-1553

This paper considers two-sample nonparametric comparison of survival functions when data are subject to left truncation and interval censoring. We propose a class of rank-based tests, which are generalizations of weighted log-rank tests for right-censored data. Simulation studies indicate that the proposed tests are appropriate for practical use.

18.

Distributed inference for two-sample U-statistics in massive data analysis

Bingyao Huang Yanyan Liu Liuhua Peng 《Scandinavian Journal of Statistics》2023,50(3):1090-1115

This paper considers distributed inference for two-sample U-statistics under the massive data setting. In order to reduce the computational complexity, this paper proposes distributed two-sample U-statistics and blockwise linear two-sample U-statistics. The blockwise linear two-sample U-statistic, which requires less communication cost, is more computationally efficient especially when the data are stored in different locations. The asymptotic properties of both types of distributed two-sample U-statistics are established. In addition, this paper proposes bootstrap algorithms to approximate the distributions of distributed two-sample U-statistics and blockwise linear two-sample U-statistics for both nondegenerate and degenerate cases. The distributed weighted bootstrap for the distributed two-sample U-statistic is new in the literature. The proposed bootstrap procedures are computationally efficient and are suitable for distributed computing platforms with theoretical guarantees. Extensive numerical studies illustrate that the proposed distributed approaches are feasible and effective.
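
The computational saving described above can be sketched under simple assumptions: both samples are split into the same number of blocks, a two-sample U-statistic is computed within each pair of blocks, and the local values are averaged. This is an illustrative divide-and-average scheme, not necessarily the article's exact definition. The function names and the Mann-Whitney-type kernel used in the usage note are assumptions.

```python
def u_stat(x, y, kernel):
    """Full two-sample U-statistic: average of kernel(a, b) over all pairs."""
    return sum(kernel(a, b) for a in x for b in y) / (len(x) * len(y))


def distributed_u_stat(x, y, n_blocks, kernel):
    """Average of local U-statistics computed on matching blocks of x and y.

    Assumes len(x) >= n_blocks and len(y) >= n_blocks so no block is empty.
    """
    # Interleaved slicing gives n_blocks blocks of nearly equal size.
    x_blocks = [x[k::n_blocks] for k in range(n_blocks)]
    y_blocks = [y[k::n_blocks] for k in range(n_blocks)]
    local = [u_stat(xb, yb, kernel) for xb, yb in zip(x_blocks, y_blocks)]
    return sum(local) / n_blocks
```

For example, with the kernel `lambda a, b: 1.0 if a < b else 0.0`, the full statistic estimates P(X < Y) using every cross-sample pair, while the block version evaluates only pairs within matching blocks, so each block can be processed where its data are stored.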

19.

Tests for symmetry with right censoring

Ehab F. Abd-Elfattah 《Journal of applied statistics》2011,38(4):683-693

Permutation tests for symmetry are suggested using data that are subject to right censoring. Such tests are directly relevant to the assumptions that underlie the generalized Wilcoxon test since the symmetric logistic distribution for log-errors has been used to motivate Wilcoxon scores in the censored accelerated failure time model. Its principal competitor is the log-rank (LGR) test motivated by an extreme value error distribution that is positively skewed. The proposed one-sided tests for symmetry against the alternative of positive skewness are directly relevant to the choice between usage of these two tests.

The permutation tests use statistics from the weighted LGR class normally used for making two-sample comparisons. From this class, the test using LGR weights (all weights equal) showed the greatest discriminatory power in simulations that compared the possibility of logistic errors versus extreme value errors.

In the test construction, a median estimate, determined by inverting the Kaplan–Meier estimator, is used to divide the data into a “control” group to its left that is compared with a “treatment” group to its right. As an unavoidable consequence of testing symmetry, data in the control group that have been censored become uninformative in performing this two-sample test. Thus, early heavy censoring of data can reduce the effective sample size of the control group and result in diminished power for discriminating symmetry in the population distribution.

20.

A comparison of subgroup identification methods in clinical drug development: Simulation study and regulatory considerations

Cynthia Huber Norbert Benda Tim Friede 《Pharmaceutical statistics》2019,18(5):600-626

With the advancement of technologies such as genomic sequencing, predictive biomarkers have become a useful tool for the development of personalized medicine. Predictive biomarkers can be used to select subsets of patients that are most likely to benefit from a treatment. A number of approaches for subgroup identification have been proposed in recent years. Although overviews of subgroup identification methods are available, systematic comparisons of their performance in simulation studies are rare. Interaction trees (IT), model‐based recursive partitioning, subgroup identification based on differential effect, simultaneous threshold interaction modeling algorithm (STIMA), and adaptive refinement by directed peeling were proposed for subgroup identification. We compared these methods in a simulation study using a structured approach. In order to identify a target population for subsequent trials, a selection of the identified subgroups is needed. Therefore, we propose a subgroup criterion leading to a target subgroup consisting of the identified subgroups with an estimated treatment difference no less than a pre‐specified threshold. In our simulation study, we evaluated these methods by considering measures for binary classification, like sensitivity and specificity. In settings with large effects or huge sample sizes, most methods perform well. For more realistic settings in drug development involving data from a single trial only, however, none of the methods seems suitable for selecting a target population. Using the subgroup criterion as an alternative to the proposed pruning procedures, STIMA and IT can improve their performance in some settings. The methods and the subgroup criterion are illustrated by an application in amyotrophic lateral sclerosis.