
1.

Sure independence screening for analyzing supersaturated designs

K. Drosou 《Communications in Statistics - Simulation and Computation》2013,42(7):1979-1995

ABSTRACT

Supersaturated designs (SSDs) constitute a large class of fractional factorial designs which can be used for screening out the important factors from a large set of potentially active ones. A major advantage of these designs is that they reduce the experimental cost dramatically, but their crucial disadvantage is the confounding involved in the statistical analysis. Identification of active effects in SSDs has been the subject of much recent study. In this article we present a two-stage procedure for analyzing two-level SSDs assuming a main-effect only model, without including any interaction terms. This method combines sure independence screening (SIS) with different penalty functions, such as Smoothly Clipped Absolute Deviation (SCAD), Lasso and MC penalty, achieving both the down-selection and the estimation of the significant effects simultaneously. Insights on using the proposed methodology are provided through various simulation scenarios, and several comparisons with existing approaches, such as stepwise in combination with SCAD and Dantzig Selector (DS), are presented as well. Results of the numerical study and real data analysis reveal that the proposed procedure can be considered as an advantageous tool due to its extremely good performance for identifying active factors.
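The screening stage named in this abstract can be illustrated with a minimal sketch. It assumes that SIS ranks factors by the absolute marginal correlation between each design column and the response and keeps the top few; the function names, the toy design, and the choice of `keep` are hypothetical and not taken from the article. The second stage (a penalized fit such as SCAD or Lasso on the retained factors) is not shown.

```python
import math
import random


def marginal_correlation(x, y):
    """Pearson correlation between one factor column x and the response y."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no marginal information
    return sxy / math.sqrt(sxx * syy)


def sis_screen(columns, y, keep):
    """Return the indices of the `keep` factors with the largest |correlation|."""
    scores = [abs(marginal_correlation(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Toy two-level design (hypothetical): 12 runs, 20 factors, so more factors than runs.
rng = random.Random(0)
n_runs, n_factors = 12, 20
columns = [[rng.choice((-1, 1)) for _ in range(n_runs)] for _ in range(n_factors)]
# Assumed truth for illustration only: factors 2 and 7 are active.
y = [5.0 * columns[2][i] - 4.0 * columns[7][i] + rng.gauss(0, 0.5)
     for i in range(n_runs)]
selected = sis_screen(columns, y, keep=6)
print(selected)
```

Because the columns of a supersaturated design are necessarily correlated, an inert factor can rank highly at this stage, which is why a second, penalized stage on the reduced set is needed.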

2.

An Adaptive Method of Variable Selection in Regression

Thomas W. O'Gorman 《Communications in Statistics - Simulation and Computation》2013,42(6):1129-1142

An adaptive variable selection procedure is proposed which uses an adaptive test along with a stepwise procedure to select variables for a multiple regression model. We compared this adaptive stepwise procedure to methods that use Akaike's information criterion, Schwarz's information criterion, and Sawa's information criterion. The simulation studies demonstrated that the adaptive stepwise method is more effective than the traditional variable selection methods if the error distribution is not normal. If the error distribution is known to be normal, the variable selection method based on Sawa's information criterion appears to be superior to the other methods. Unless the error distribution is known to be normal, the adaptive stepwise method is recommended.

3.

A method for analyzing supersaturated designs inspired by control charts

K. Drosou A. Lappa 《Communications in Statistics - Simulation and Computation》2018,47(4):1134-1145

The identification of active effects in supersaturated designs (SSDs) constitutes a problem of considerable interest to both scientists and engineers. The complicated structure of the design matrix makes the analysis of such designs difficult. Although several methods have been proposed so far, a solution to the problem beyond one or two active factors seems to be inadequate. This article presents a heuristic approach for analyzing SSDs using the cumulative sum control chart (CUSUM) under a sure independence screening approach. Simulations are used to investigate the performance of the proposed method by comparing it with other well-known methods from the literature. The results establish the power of the proposed methodology.

4.

A permutation‐based trend test for the analysis of a mechanistic animal migraine assay with a nonstandard design

Michael Meyners Kirsten Arndt 《Pharmaceutical statistics》2005,4(2):109-118

The effect of a test compound on neurogenically induced vasodilation in marmosets was studied using a non‐standard experimental design with overlapping dosage groups and repeated measurements. In this study, the assumption that the data were normally distributed seemed inappropriate, so no traditional data analyses could be used. As an alternative, a new permutation trend test was designed based on the Jonckheere–Terpstra test statistic. This test protects the type I error without any further assumptions. Statistically significant differences in trend between treatment groups were detected. The effect of the compound was then shown across doses using subsequent Wilcoxon rank‐sum tests against ordered alternatives. In all, the permutation test proved quite useful in this context. This nonparametric approach to the analysis may easily be adapted to other applications. Copyright © 2005 John Wiley & Sons, Ltd.
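The idea of a permutation trend test built on the Jonckheere–Terpstra statistic can be sketched as follows. This is a simplified illustration that assumes independent, non-overlapping dose groups ordered by increasing dose; it does not reproduce the article's design with overlapping groups and repeated measurements. The function names and the Monte Carlo settings are hypothetical.

```python
import random


def jt_statistic(groups):
    """Jonckheere-Terpstra statistic: over every pair of groups (i < j), count
    pairs (x in group i, y in group j) with x < y; a tie counts as 1/2."""
    total = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[j]:
                    if x < y:
                        total += 1.0
                    elif x == y:
                        total += 0.5
    return total


def jt_permutation_pvalue(groups, n_perm=2000, seed=0):
    """One-sided Monte Carlo permutation p-value for an increasing trend."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [v for g in groups for v in g]
    observed = jt_statistic(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reallocate observations to groups at random
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if jt_statistic(shuffled) >= observed:
            hits += 1
    # The add-one correction keeps the Monte Carlo p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical responses for three increasing dose levels.
dose_groups = [[1.1, 0.9, 1.3], [1.8, 2.1, 1.6], [2.9, 3.2, 2.7]]
p_value = jt_permutation_pvalue(dose_groups)
print(jt_statistic(dose_groups), p_value)
```

The reference distribution comes from reshuffling the observed data rather than from a normal approximation, so the test needs no distributional assumption beyond exchangeability under the null hypothesis.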

5.

Model selection with distributed SCAD penalty

Puyu Wang Yong Liang 《Journal of applied statistics》2018,45(11):1938-1955

In this paper, we focus on feature extraction and variable selection for massive data that are divided and stored across different linked computers. Specifically, we study distributed model selection with the Smoothly Clipped Absolute Deviation (SCAD) penalty. Based on the Alternating Direction Method of Multipliers (ADMM) algorithm, we propose a distributed SCAD algorithm and prove its convergence. The variable selection results of the distributed approach are the same as those of the non-distributed approach. Numerical studies show that our method is both effective and efficient, performing well in distributed data analysis.

6.

Empirical Performance of Cross-Validation With Oracle Methods in a Genomics Context

Martinez JG Carroll RJ Müller S Sampson JN Chatterjee N 《The American statistician》2011,65(4):223-228

When employing model selection methods with oracle properties such as the smoothly clipped absolute deviation (SCAD) and the Adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example, m = 10. In problems where the true regression function is sparse and the signals large, such cross-validation typically works well. However, in regression modeling of genomic studies involving Single Nucleotide Polymorphisms (SNP), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems, the number of selected variables using SCAD and the Adaptive Lasso, with 10-fold cross-validation, is a random variable that has considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, and not just the SCAD and Adaptive Lasso.

7.

A power study of goodness-of-fit tests for multivariate normality implemented in R

《Journal of Statistical Computation and Simulation》2012,82(5):1055-1078

Multivariate statistical analysis procedures often require data to be multivariate normally distributed. Many tests have been developed to verify if a sample could indeed have come from a normally distributed population. These tests do not all share the same sensitivity for detecting departures from normality, and thus a choice of test is of central importance. This study investigates through simulated data the power of those tests for multivariate normality implemented in the statistical software R and pits them against the variant of testing each marginal distribution for normality. The results of testing two-dimensional data at a level of significance α=5% showed that almost one-third of those tests implemented in R do not have a type I error below this level. Other tests outperformed the naive variant in terms of power even when the marginals were not normally distributed. Even though no test was consistently better than all alternatives with every alternative distribution, the energy-statistic test always showed relatively good power across all tested sample sizes.

8.

An evaluation of methods for testing hypotheses relating to two endpoints in a single clinical trial

Su TL Glimm E Whitehead J Branson M 《Pharmaceutical statistics》2012,11(2):107-117

The issues and dangers involved in testing multiple hypotheses are well recognised within the pharmaceutical industry. In reporting clinical trials, strenuous efforts are taken to avoid the inflation of type I error, with procedures such as the Bonferroni adjustment and its many elaborations and refinements being widely employed. Typically, such methods are conservative. They tend to be accurate if the multiple test statistics involved are mutually independent and achieve less than the type I error rate specified if these statistics are positively correlated. An alternative approach is to estimate the correlations between the test statistics and to perform a test that is conditional on those estimates being the true correlations. In this paper, we begin by assuming that test statistics are normally distributed and that their correlations are known. Under these circumstances, we explore several approaches to multiple testing, adapt them so that type I error is preserved exactly and then compare their powers over a range of true parameter values. For simplicity, the explorations are confined to the bivariate case. Having described the relative strengths and weaknesses of the approaches under study, we use simulation to assess the accuracy of the approximate theory developed when the correlations are estimated from the study data rather than being known in advance and when data are binary so that test statistics are only approximately normally distributed.
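The Bonferroni adjustment mentioned in this abstract, and the sense in which it is close to accurate for independent tests, can be sketched briefly. The function names are hypothetical; the sketch covers only the plain Bonferroni rule, not the correlation-based approaches the article studies.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H_i when p_i <= alpha / m; this bounds the family-wise type I
    error rate by alpha whatever the dependence between the tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def bonferroni_adjust(p_values):
    """Equivalent adjusted p-values: min(1, m * p_i)."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def fwer_if_independent(m, alpha=0.05):
    """Family-wise error rate of Bonferroni when all m null hypotheses are
    true and the m test statistics are mutually independent."""
    return 1.0 - (1.0 - alpha / m) ** m


# Two endpoints in one trial (illustrative p-values, not from the article).
print(bonferroni_reject([0.01, 0.04]))
print(bonferroni_adjust([0.01, 0.04]))
print(fwer_if_independent(2))
```

For two independent endpoints at alpha = 0.05 the achieved family-wise rate is 1 - 0.975^2 = 0.049375, only slightly below the nominal level; with positively correlated statistics it falls further below alpha, which is the conservatism the abstract describes.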

9.

On the analysis of unbalanced two-level supersaturated designs via generalized linear models

K. Chatterjee C. Parpoula 《Communications in Statistics - Simulation and Computation》2017,46(5):3383-3395

Supersaturated designs (SSDs) are factorial designs in which the number of experimental runs is smaller than the number of parameters to be estimated in the model. While most of the literature on SSDs has focused on balanced designs, the construction and analysis of unbalanced designs has not been developed to a great extent. Recent studies discuss the possible advantages of relaxing the balance requirement in construction or data analysis of SSDs, and that unbalanced designs compare favorably to balanced designs for several optimality criteria and for the way in which the data are analyzed. Moreover, the effect analysis framework of unbalanced SSDs until now is restricted to the central assumption that experimental data come from a linear model. In this article, we consider unbalanced SSDs for data analysis under the assumption of generalized linear models (GLMs), revealing that unbalanced SSDs perform well despite the unbalance property. The examination of Type I and Type II error rates through an extensive simulation study indicates that the proposed method works satisfactorily.

10.

Naive method to test the convergence of simulation and its applications in the computation of bankruptcy probability

《Journal of Statistical Computation and Simulation》2012,82(6):1216-1232

We analyse a naive method using sample mean and sample variance to test the convergence of simulation. We find this method is valid for independent and identically distributed samples, as well as for correlated samples whose correlation vanishes over long periods. Our simulation results on the approximation to bankruptcy probability (BP) show the naive method compares well with the Half-Width, Geweke and CUSUM methods in terms of accuracy and time cost. There is clear evidence of variance reduction from tail-distribution sampling for all convergence test methods when the true BP is very low.

11.

Computer-aided unbalanced supersaturated designs involving interactions

《Journal of Statistical Computation and Simulation》2012,82(4):756-770

Supersaturated designs (SSDs) are defined as fractional factorial designs whose experimental run size is smaller than the number of main effects to be estimated. While most of the literature on SSDs has focused only on main effects designs, the construction and analysis of such designs involving interactions has not been developed to a great extent. In this paper, we propose a backward elimination design-driven optimization (BEDDO) method, with one main goal in mind, to eliminate the factors which are identified to be fully aliased or highly partially aliased with each other in the design. Under the proposed BEDDO method, we implement and combine correlation-based statistical measures taken from classical test theory and design of experiments field, and we also present an optimality criterion which is a modified form of Cronbach's alpha coefficient. In this way, we provide a new class of computer-aided unbalanced SSDs involving interactions, that derive directly from BEDDO optimization.

12.

A Discussion of Permutation Tests Conditional to Observed Responses in Unreplicated 2^M Full Factorial Designs

D. Basso 《Communications in Statistics - Theory and Methods》2013,42(1):83-97

ABSTRACT

In this article we present a new solution to test for effects in unreplicated two-level factorial designs. The proposed test statistic, in case the error components are normally distributed, follows an F distribution, though our attention is on its nonparametric permutation version. The proposed procedure does not require any transformation of data such as residualization and it is exact for each effect and distribution-free. Our main aim is to discuss a permutation solution conditional to the original vector of responses. We give two versions of the same nonparametric testing procedure in order to control both the individual error rate and the experiment-wise error rate. A power comparison with Loughin and Noble's test is provided in the case of an unreplicated 2⁴ full factorial design.

13.

Robust estimation and model identification for longitudinal data varying-coefficient model

Shu Liu Heng Lian 《Communications in Statistics - Theory and Methods》2018,47(11):2701-2719

It is well known that M-estimation is a widely used method for robust statistical inference and that varying coefficient models have been widely applied in many scientific areas. In this paper, we consider M-estimation and model identification of bivariate varying coefficient models for longitudinal data. We make use of bivariate tensor-product B-splines as an approximation of the function and consider M-type regression splines by minimizing the objective convex function. Mean and median regressions are included in this class. Moreover, with a double smoothly clipped absolute deviation (SCAD) penalization, we study the problem of simultaneous structure identification and estimation. Under appropriate conditions, we show that the proposed procedure possesses the oracle property in the sense that it is as efficient as the estimator when the true model is known prior to statistical analysis. Simulation studies are carried out to demonstrate the methodological power of the proposed methods with finite samples. The proposed methodology is illustrated with an analysis of a real data example.

14.

Validation of surrogate end points in multiple randomized clinical trials with failure time end points (total citations: 2; self-citations: 0; citations by others: 2)

Tomasz Burzykowski Geert Molenberghs Marc Buyse Helena Geys & Didier Renard 《Journal of the Royal Statistical Society. Series C, Applied statistics》2001,50(4):405-422

Before a surrogate end point can replace a final (true) end point in the evaluation of an experimental treatment, it must be formally 'validated'. The validation will typically require large numbers of observations. It is therefore useful to consider situations in which data are available from several randomized experiments. For two normally distributed end points Buyse and co-workers suggested a new definition of validity in terms of the quality of both trial level and individual level associations between the surrogate and true end points. This paper extends this approach to the important case of two failure time end points, using bivariate survival modelling. The method is illustrated by using two actual sets of data from cancer clinical trials.

15.

A Stepwise AIC Method for Variable Selection in Linear Regression

Toshie Yamashita Keizo Yamashita Ryotaro Kamimura 《Communications in Statistics - Theory and Methods》2013,42(13):2395-2403

In this article, we study the stepwise AIC method for variable selection, comparing it with other stepwise methods for variable selection, such as Partial F, Partial Correlation, and Semi-Partial Correlation, in linear regression modeling. We then show mathematically that the stepwise AIC method and the other stepwise methods lead to the same method as Partial F. Hence, there are more reasons to use the stepwise AIC method than the other stepwise methods for variable selection, since the stepwise AIC method is a model selection method that can be easily managed, can be widely extended to more generalized models, and can be applied to non-normally distributed data. We also treat problems that always appear in applications, namely validation of the selected variables and collinearity.

16.

Book Reviews

《Journal of Statistical Computation and Simulation》2012,82(6):517-518

The Levene test is a widely used test for detecting differences in dispersion. The modified Levene transformation using sample medians is considered in this article. After Levene's transformation the data are not normally distributed, hence, nonparametric tests may be useful. As the Wilcoxon rank sum test applied to the transformed data cannot control the type I error rate for asymmetric distributions, a permutation test based on reallocations of the original observations rather than the absolute deviations was investigated. Levene's transformation is then only an intermediate step to compute the test statistic. Such a Levene test, however, cannot control the type I error rate when the Wilcoxon statistic is used; with the Fisher–Pitman permutation test it can be extremely conservative. The Fisher–Pitman test based on reallocations of the transformed data seems to be the only acceptable nonparametric test. Simulation results indicate that this test is on average more powerful than applying the t test after Levene's transformation, even when the t test is improved by the deletion of structural zeros.
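The variant this abstract favours, a Fisher–Pitman permutation test applied to the median-based Levene-transformed data, can be sketched as follows. This is an illustrative Monte Carlo version under assumptions of my own: the difference in means as test statistic, a two-sided alternative, and hypothetical function names and sample data. It omits the deletion of structural zeros discussed in the article.

```python
import random
import statistics


def levene_transform(sample):
    """Modified Levene transformation: absolute deviations from the sample median."""
    med = statistics.median(sample)
    return [abs(v - med) for v in sample]


def fisher_pitman_pvalue(a, b, n_perm=5000, seed=0):
    """Two-sided Monte Carlo Fisher-Pitman permutation test on the difference
    in means, reallocating the pooled values to the two groups at random."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            hits += 1
    return (hits + 1) / (n_perm + 1)


def levene_permutation_test(x, y, **kwargs):
    """Transform each sample first, then permute the transformed values."""
    return fisher_pitman_pvalue(levene_transform(x), levene_transform(y), **kwargs)


# Hypothetical samples with similar location but very different spread.
tight = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9]
wide = [10.0, 14.0, 6.0, 15.0, 5.0, 12.0, 8.0, 11.0]
p_spread = levene_permutation_test(tight, wide)
print(p_spread)
```

Permuting the transformed values, rather than the original observations, is the design choice the abstract singles out as the only acceptable nonparametric option among those compared.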

17.

Empirical Performance of Cross-Validation With Oracle Methods in a Genomics Context

《The American statistician》2013,67(4):223-228

When employing model selection methods with oracle properties such as the smoothly clipped absolute deviation (SCAD) and the Adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example, m = 10. In problems where the true regression function is sparse and the signals large, such cross-validation typically works well. However, in regression modeling of genomic studies involving Single Nucleotide Polymorphisms (SNP), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems, the number of selected variables using SCAD and the Adaptive Lasso, with 10-fold cross-validation, is a random variable that has considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, and not just the SCAD and Adaptive Lasso.

18.

ESTIMATION AND TESTING FOR PARTIALLY LINEAR SINGLE-INDEX MODELS

Liang H Liu X Li R Tsai CL 《Annals of statistics》2010,38(6):3811-3836

In partially linear single-index models, we obtain the semiparametrically efficient profile least-squares estimators of regression coefficients. We also employ the smoothly clipped absolute deviation penalty (SCAD) approach to simultaneously select variables and estimate regression coefficients. We show that the resulting SCAD estimators are consistent and possess the oracle property. Subsequently, we demonstrate that a proposed tuning parameter selector, BIC, identifies the true model consistently. Finally, we develop a linear hypothesis test for the parametric coefficients and a goodness-of-fit test for the nonparametric component, respectively. Monte Carlo studies are also presented.

19.

A permutation test for the spread of three-dimensional rotation data

Marissa D. Eckrote 《Journal of nonparametric statistics》2017,29(3):553-560

The permutation test is a nonparametric test that can be used to compare measures of spread for two data sets, but is yet to be explored in the context of three-dimensional rotation data. A permutation test for such data is developed and the statistical power of this test is considered under various conditions. The test is then used in a brief application comparing movement around the calcaneocuboid joint for a human, chimpanzee, and baboon.

20.

A Distribution-free Multivariate Change-point Model for Statistical Process Control

Maoyuan Zhou Xuemin Zi Wei Geng 《Communications in Statistics - Simulation and Computation》2015,44(8):1975-1987

This article develops a new distribution-free multivariate procedure for statistical process control based on the minimal spanning tree (MST), which integrates a multivariate two-sample goodness-of-fit (GOF) test based on the MST with a change-point model. Simulation results show that our proposed procedure is quite robust to non-normally distributed data and, moreover, that it is efficient in detecting process shifts, especially moderate to large shifts, the detection of which is one of the main drawbacks of most distribution-free procedures in the literature. The proposed procedure is particularly useful in start-up situations. Comparison results and a real data example show that our proposed procedure has great potential for application.