
1.

Analysis of supersaturated designs via the Dantzig selector

Frederick K.H. Phoa Yu-Hui Pan Hongquan Xu 《Journal of statistical planning and inference》2009

A supersaturated design is a design whose run size is not enough for estimating all the main effects. It is commonly used in screening experiments, where the goal is to identify sparse and dominant active factors at low cost. In this paper, we study a variable selection method via the Dantzig selector, proposed by Candes and Tao [2007. The Dantzig selector: statistical estimation when p is much larger than n. Annals of Statistics 35, 2313–2351], to screen important effects. A graphical procedure and an automated procedure are suggested to accompany the method. Simulation shows that this method performs well compared to existing methods in the literature and is more efficient at estimating the model size.
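For orientation, the Dantzig selector of Candes and Tao is commonly written as the convex program below. The notation is an assumption added here and is not taken from the abstract: X is the n × p model matrix, y the response vector, and δ ≥ 0 a tuning constant.

```latex
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^{p}} \|\beta\|_{1}
\quad \text{subject to} \quad
\| X^{\top} (y - X\beta) \|_{\infty} \le \delta
```

Both the objective and the constraint are piecewise linear in β, so the problem can be recast as a linear program. The abstract does not say how δ is chosen in the graphical and automated procedures.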

2.

Sure independence screening for analyzing supersaturated designs

K. Drosou 《Communications in Statistics - Simulation and Computation》2013,42(7):1979-1995

ABSTRACT

Supersaturated designs (SSDs) constitute a large class of fractional factorial designs which can be used for screening out the important factors from a large set of potentially active ones. A major advantage of these designs is that they reduce the experimental cost dramatically, but their crucial disadvantage is the confounding involved in the statistical analysis. Identification of active effects in SSDs has been the subject of much recent study. In this article we present a two-stage procedure for analyzing two-level SSDs assuming a main-effects-only model, without any interaction terms. This method combines sure independence screening (SIS) with different penalty functions, such as Smoothly Clipped Absolute Deviation (SCAD), Lasso and MC penalty, achieving both the down-selection and the estimation of the significant effects simultaneously. Insights on using the proposed methodology are provided through various simulation scenarios, and several comparisons with existing approaches, such as stepwise selection in combination with SCAD and the Dantzig Selector (DS), are presented as well. Results of the numerical study and real data analysis reveal that the proposed procedure can be considered an advantageous tool due to its extremely good performance in identifying active factors.
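The screening stage of such a two-stage procedure can be sketched as follows. This is a minimal illustration of sure independence screening in its usual form (rank factors by absolute marginal correlation with the response and keep the top d), not the authors' implementation. The toy design, the response, the function names and the cutoff d are all hypothetical, and the second stage (penalized regression on the retained factors) is not shown.

```python
from math import sqrt


def marginal_correlations(X, y):
    """Absolute Pearson correlation between each column of X and y."""
    n = len(y)
    y_mean = sum(y) / n
    y_dev = [v - y_mean for v in y]
    y_norm = sqrt(sum(d * d for d in y_dev))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        c_dev = [v - c_mean for v in col]
        c_norm = sqrt(sum(d * d for d in c_dev))
        if c_norm == 0 or y_norm == 0:
            scores.append(0.0)  # constant column or response: no signal
        else:
            cross = sum(a * b for a, b in zip(c_dev, y_dev))
            scores.append(abs(cross) / (c_norm * y_norm))
    return scores


def sis_screen(X, y, d):
    """Return the (sorted) indices of the d factors with the largest scores."""
    scores = marginal_correlations(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


# Hypothetical toy design: 6 runs, 7 balanced two-level factors (more factors
# than runs). Each inner list is one factor column.
columns = [
    [1, 1, 1, -1, -1, -1],
    [1, 1, -1, 1, -1, -1],
    [1, 1, -1, -1, 1, -1],
    [1, -1, 1, 1, -1, -1],
    [1, -1, 1, -1, 1, -1],
    [1, -1, -1, 1, 1, -1],
    [1, -1, -1, -1, 1, 1],
]
X = [list(run) for run in zip(*columns)]  # rows = runs, columns = factors

# Noise-free response driven by factors 0 and 5 only (an assumption made
# so that the example is deterministic).
y = [4 * run[0] - 3 * run[5] for run in X]

print(sis_screen(X, y, 2))  # [0, 5]
```

With d = 3 the screen also retains factor 6, which is inactive but partially aliased with factors 0 and 5. That is the kind of confounding the second, penalized stage is meant to resolve.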

3.

Computer-aided unbalanced supersaturated designs involving interactions

《Journal of Statistical Computation and Simulation》2012,82(4):756-770

Supersaturated designs (SSDs) are defined as fractional factorial designs whose experimental run size is smaller than the number of main effects to be estimated. While most of the literature on SSDs has focused only on main effects designs, the construction and analysis of such designs involving interactions has not been developed to a great extent. In this paper, we propose a backward elimination design-driven optimization (BEDDO) method, with one main goal in mind, to eliminate the factors which are identified to be fully aliased or highly partially aliased with each other in the design. Under the proposed BEDDO method, we implement and combine correlation-based statistical measures taken from classical test theory and design of experiments field, and we also present an optimality criterion which is a modified form of Cronbach's alpha coefficient. In this way, we provide a new class of computer-aided unbalanced SSDs involving interactions, that derive directly from BEDDO optimization.

4.

Addition of runs to a two-level supersaturated design

V.K. Gupta Poonam Singh Basudev Kole Rajender Parsad 《Journal of statistical planning and inference》2010

The purpose of this article is to introduce a new class of extended E(s²)-optimal two-level supersaturated designs obtained by adding runs to an existing E(s²)-optimal two-level supersaturated design. The extended design is a union of two optimal SSDs belonging to different classes. A new lower bound on E(s²) has been obtained for the extended supersaturated designs. Some examples and a small catalogue of E(s²)-optimal SSDs are also included.
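As a reminder of what the criterion measures, E(s²) for a two-level design with ±1 coding is the average of the squared inner products between all pairs of distinct columns, so smaller values mean the design is closer to orthogonal. The sketch below computes it for a hypothetical 6-run, 7-factor toy design. The design and the function name are illustrative assumptions and do not come from the article's catalogue.

```python
from itertools import combinations


def e_s2(design):
    """Average squared inner product over all pairs of distinct columns.

    `design` is a list of runs, each run a list of +1/-1 factor levels.
    """
    cols = list(zip(*design))
    pairs = list(combinations(cols, 2))
    total = sum(sum(a * b for a, b in zip(u, v)) ** 2 for u, v in pairs)
    return total / len(pairs)


# Hypothetical toy design: 6 runs (rows) by 7 balanced factors (columns).
design = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, -1, -1, -1, -1],
    [1, -1, -1, 1, 1, -1, -1],
    [-1, 1, -1, 1, -1, 1, -1],
    [-1, -1, 1, -1, 1, 1, 1],
    [-1, -1, -1, -1, -1, -1, 1],
]

print(e_s2(design))  # 4.0
```

Every pair of columns in this toy design has inner product +2 or -2, so each squared term is 4 and the average is 4.0. A fully orthogonal design, such as the 4-run full factorial in two factors, gives 0.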

5.

Large sample interval mapping method for genetic trait loci in finite regression mixture models

Hong Zhang Hanfeng Chen Zhaohai Li 《Journal of statistical planning and inference》2009

This article investigates the large sample interval mapping method for genetic trait loci (GTL) in a finite non-linear regression mixture model. The general model includes most commonly used kernel functions, such as exponential family mixture, logistic regression mixture and generalized linear mixture models, as special cases. The populations derived from either the backcross or intercross design are considered. In particular, unlike all existing results in the literature on finite mixture models, the large sample results presented in this paper do not require the boundedness condition on the parameter space. Therefore, the large sample theory presented in this article possesses general applicability to the interval mapping method of GTL in genetic research. The limiting null distribution of the likelihood ratio test statistics can be utilized easily to determine the threshold values or p-values required in the interval mapping. The limiting distribution is proved to be free of the parameter values of the null model and free of the choice of a kernel function. Extension to the multiple marker interval GTL detection is also discussed. Simulation study results show favorable performance of the asymptotic procedure when sample sizes are moderate.

6.

A method for analyzing supersaturated designs inspired by control charts

K. Drosou A. Lappa 《Communications in Statistics - Simulation and Computation》2018,47(4):1134-1145

The identification of active effects in supersaturated designs (SSDs) constitutes a problem of considerable interest to both scientists and engineers. The complicated structure of the design matrix renders the analysis of such designs a complicated issue. Although several methods have been proposed so far, a solution to the problem beyond one or two active factors seems to be inadequate. This article presents a heuristic approach for analyzing SSDs using the cumulative sum control chart (CUSUM) under a sure independence screening approach. Simulations are used to compare the performance of the proposed method with that of other well-known methods from the literature. The results demonstrate the power of the proposed methodology.

7.

A three-stage variable selection method for supersaturated designs

Ai-Jun Qi Zong-Feng Qi Qiao-Zhen Zhang 《Communications in Statistics - Simulation and Computation》2017,46(4):2601-2610

A supersaturated design (SSD) is a design whose run size is not enough for estimating all main effects. Such a design is commonly used in screening experiments to screen active effects based on the effect sparsity principle. Traditional approaches, such as the ordinary stepwise regression and the best subset variable selection, may not be appropriate in this situation. In this article, a new variable selection method is proposed based on the idea of staged dimensionality reduction. Simulations and several real data studies indicate that the newly proposed method is more effective than the existing data analysis methods.

8.

Wilcoxon–Mann–Whitney test for stratified samples and Efron's paradox dice

Karthinathan Thangavelu Edgar Brunner 《Journal of statistical planning and inference》2007

Two-treatment multi-center clinical trials are the most common type of clinical trials in practice. The aim of this paper is to discuss a curious property of certain standard nonparametric procedures used in the analysis of such clinical trials. Different analyses of a simulated data example are presented, which lead to contrasting and surprising results. The source of the potentially misleading outcome is then explored while relating the simulated data with the concept of Efron's paradox dice and the notion of nontransitivity. With the root of the problem established, an alternate nonparametric method from the literature is shown to address the problem. Finally, pointing out an interpretational concern of using the alternate procedure, a modification to this procedure is also suggested and corresponding theoretical results are presented.
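The nontransitivity behind Efron's dice can be checked by direct enumeration. The sketch below uses the classic set of four dice, which is assumed here as the standard textbook version and may differ from the configuration used in the paper. It computes P(X > Y) exactly for each consecutive pair, the same pairwise quantity that Wilcoxon–Mann–Whitney-type procedures are built on when there are no ties. The names `DICE` and `win_probability` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Classic Efron dice (face values), assumed as the standard textbook set.
DICE = {
    "A": (4, 4, 4, 4, 0, 0),
    "B": (3, 3, 3, 3, 3, 3),
    "C": (6, 6, 2, 2, 2, 2),
    "D": (5, 5, 5, 1, 1, 1),
}


def win_probability(first, second):
    """Exact probability that a roll of `first` exceeds a roll of `second`."""
    wins = sum(1 for a, b in product(first, second) if a > b)
    return Fraction(wins, len(first) * len(second))


# Walk around the cycle A -> B -> C -> D -> A.
for x, y in zip("ABCD", "BCDA"):
    print(f"P({x} > {y}) = {win_probability(DICE[x], DICE[y])}")
```

Each die beats the next one in the cycle with probability 2/3, including D over A, so "tends to be larger than" is not a transitive relation. Pairwise rank comparisons across more than two groups can therefore point in contradictory directions.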

9.

Optimum covariate designs in partially balanced incomplete block (PBIB) design set-ups

Ganesh Dutta Premadhis Das Nripes K. Mandal 《Journal of statistical planning and inference》2009

The use of covariates in block designs is necessary when the covariates cannot be controlled like the blocking factor in the experiment. In this paper, we consider the situation where there is some flexibility for selection in the values of the covariates. The choice of values of the covariates for a given block design attaining minimum variance for estimation of each of the parameters has attracted attention in recent times. Optimum covariate designs in simple set-ups such as completely randomised design (CRD), randomised block design (RBD) and some series of balanced incomplete block design (BIBD) have already been considered. In this paper, optimum covariate designs have been considered for the more complex set-ups of different partially balanced incomplete block (PBIB) designs, which are popular among practitioners. The optimum covariate designs depend much on the methods of construction of the basic PBIB designs. Different combinatorial arrangements and tools such as orthogonal arrays, Hadamard matrices and different kinds of products of matrices viz. Khatri–Rao product, Kronecker product have been conveniently used to construct optimum covariate designs with as many covariates as possible.

10.

Factor screening in nonregular two-level designs based on projection-based variable selection

John Tyssedal Shahrukh Hussain 《Journal of applied statistics》2016,43(3):490-508

In this paper, we focus on the problem of factor screening in nonregular two-level designs through gradually reducing the number of possible sets of active factors. We are particularly concerned with situations when three or four factors are active. Our proposed method works through examining fits of projection models, where variable selection techniques are used to reduce the number of terms. To examine the reliability of the methods in combination with such techniques, a panel of models consisting of three or four active factors with data generated from the 12-run and the 20-run Plackett–Burman (PB) design is used. The dependence of the procedure on the amount of noise, the number of active factors and the number of experimental factors is also investigated. For designs with few runs such as the 12-run PB design, variable selection should be done with care, and default procedures in computer software may not be reliable; we suggest improvements to them. A real example is included to show how we propose that factor screening be done in practice.

11.

Optimal mixed-level k-circulant supersaturated designs

Jie Chen Min-Qian Liu 《Journal of statistical planning and inference》2008

Supersaturated designs (SSDs) offer a potentially useful way to investigate many factors with only a few experiments in the preliminary stages of experimentation. This paper explores how to construct $E(f_{NOD})$-optimal mixed-level SSDs using k-cyclic generators. The necessary and sufficient conditions for the existence of mixed-level k-circulant SSDs with the equal occurrence property are provided. Properties of the mixed-level k-circulant SSDs are investigated; in particular, the sufficient condition under which the generator vector produces an $E(f_{NOD})$-optimal SSD is obtained. Moreover, many new $E(f_{NOD})$-optimal mixed-level SSDs are constructed and listed. The method here generalizes the one proposed by Liu and Dean [2004. k-circulant supersaturated designs. Technometrics 46, 32–43] for two-level SSDs and the one due to Georgiou and Koukouvinos [2006. Multi-level k-circulant supersaturated designs. Metrika 64, 209–220] for the multi-level case.

12.

Model Selection, Transformations and Variance Estimation in Nonlinear Regression

Olaf Bunke Bernd Droge Jörg Polzehl 《Statistics》2013,47(3):197-240

The results of analyzing experimental data using a parametric model may depend heavily on the chosen model for the regression and variance functions, and also on a possible preliminary transformation of the variables. In this paper we propose and discuss a complex procedure which consists of a simultaneous selection of parametric regression and variance models from a relatively rich model class and of Box-Cox variable transformations by minimization of a cross-validation criterion. For this it is essential to introduce modifications of the standard cross-validation criterion adapted to each of the following objectives: 1. estimation of the unknown regression function, 2. prediction of future values of the response variable, 3. calibration or 4. estimation of some parameter with a certain meaning in the corresponding field of application. Our idea of a criterion-oriented combination of procedures (which are usually applied, if at all, independently or sequentially) is expected to lead to more accurate results. We show how the accuracy of the parameter estimators can be assessed by a “moment oriented bootstrap procedure”, which is an essential modification of the “wild bootstrap” of Härdle and Mammen by use of more accurate variance estimates. This new procedure and its refinement by a bootstrap based pivot (“double bootstrap”) are also used for the construction of confidence, prediction and calibration intervals. Programs written in Splus which realize our strategy for nonlinear regression modelling and parameter estimation are described as well. The performance of the selected model is discussed, and the behaviour of the procedures is illustrated, e.g., by an application in radioimmunological assay.

13.

Semiparametric estimation for count data through weighted distributions

C.C. Kokonendji T. Senga Kiessé N. Balakrishnan 《Journal of statistical planning and inference》2009

This paper is concerned with semiparametric discrete kernel estimators when the unknown count distribution can be considered to have a general weighted Poisson form. The estimator is constructed by multiplying the Poisson estimate with a nonparametric discrete kernel-type estimate of the Poisson weight function. Comparisons are then carried out with the ordinary discrete kernel probability mass function estimators. The Poisson weight function is thus a local multiplicative correction factor, and is considered as the uniform measure to detect departures from the equidispersed Poisson distribution. In this way, the effects of dispersion and zero-proportion with respect to the standard Poisson distribution are also minimized. This method of estimation is also applied to the weighted binomial form for the count distribution having a finite support. The proposed estimators, in addition to being simple, easy-to-implement and effective, also outperform the competing nonparametric and parametric estimators in finite-sample situations. Two examples illustrate this new semiparametric estimation.

14.

Robust designs for misspecified logistic models

Adeniyi J. Adewale Douglas P. Wiens 《Journal of statistical planning and inference》2009

We develop criteria that generate robust designs and use such criteria for the construction of designs that insure against possible misspecifications in logistic regression models. The design criteria we propose are different from the classical in that we do not focus on sampling error alone. Instead we use design criteria that account as well for error due to bias engendered by the model misspecification. Our robust designs optimize the average of a function of the sampling error and bias error over a specified misspecification neighbourhood. Examples of robust designs for logistic models are presented, including a case study implementing the methodologies using beetle mortality data.

15.

Projection estimation capacity of Hadamard designs

Yingfu Li M.L. Aggarwal 《Journal of statistical planning and inference》2008

In a screening design, often only a few factors among a large number of potential factors are significantly important. Usually, it is not known which factors will be the important ones. Thus, it is of practical interest to know if each projection of a design onto a small subset of factors is able to entertain and estimate all two-factor interactions along with its main effects, assuming higher order interactions are negligible. In this paper, we investigate the estimation capacity of projections of Hadamard designs with run size up to 60. Possible applications of our results to robust parameter designs are also discussed.

16.

An optimality criterion for supersaturated designs with quantitative factors

Chao Huang Dennis K.J. Lin Min-Qian Liu 《Journal of statistical planning and inference》2012

A supersaturated design (SSD) is a factorial design in which the degrees of freedom for all its main effects exceed the total number of distinct factorial level-combinations (runs) of the design. Designs with quantitative factors, in which level permutation within one or more factors could result in different geometrical structures, are very different from designs with nominal ones which have been treated as traditional designs. In this paper, a new criterion is proposed for SSDs with quantitative factors. Comparison and analysis for this new criterion are made. It is shown that the proposed criterion has a high efficiency in discriminating geometrically nonisomorphic designs and an advantage in computation.

17.

The convolution theorem for estimating linear functionals in indirect nonparametric regression models

Ali Khoujmane Frits Ruymgaart Mikail Shubov 《Journal of statistical planning and inference》2007

Nonparametric regression, directly or indirectly observed, is one of the important statistical models. On one hand it contains two infinite dimensional parameters (the regression function and the error density), and on the other it is of rather simple structure. Therefore, it may serve as an interesting paradigm for illustrating or developing abstract statistical theory for non-Euclidean parameters. In this paper estimation of a linear functional of the indirectly observed regression function is considered, when a deterministic design is used. It should be noted that any Fourier coefficient of an expansion of the regression function in an orthonormal basis is such a functional. Because the design is deterministic the observables are independent but not identically distributed. Local asymptotic normality is established and applied to prove Hájek's convolution theorem for this functional. Pertinent references are Beran [1977. Robust location estimates. Ann. Statist. 5, 431–444] and McNeney and Wellner [2000. Application of convolution theorems in semiparametric models with non-i.i.d. data. J. Statist. Plann. Inference 91, 441–480]. For purposes explained above, however, the paper is kept self-contained and full proofs are provided.

18.

The effect of mixing-distribution misspecification in conjugate mixture models

Paul Gustafson 《Revue canadienne de statistique》1996,24(3):307-318

Parametric mixture models are commonly used in the analysis of clustered data. Parametric families are specified for the conditional distribution of the response variable given a cluster-specific effect, and for the marginal distribution of the cluster-specific effects. This latter distribution is referred to as the mixing distribution. If the form of the mixing distribution is misspecified, then Bayesian and maximum-likelihood estimators of parameters associated with either distribution may be inconsistent. The magnitude of the asymptotic bias is investigated, using an approximation based on infinitesimal contamination of the mixing distribution. The approximation is useful when there is a closed-form expression for the marginal distribution of the response under the assumed mixing distribution, but not under the true mixing distribution. Typically this occurs when the assumed mixing distribution is conjugate, meaning that the conditional distribution of the cluster-specific parameter given the response variable belongs to the same parametric family as the mixing distribution.

19.

Combined multiple testing by censored empirical likelihood

Arne Bathke Mi-Ok Kim Mai Zhou 《Journal of statistical planning and inference》2009

We propose a new procedure for combining multiple tests in samples of right-censored observations. The new method is based on multiple constrained censored empirical likelihood where the constraints are formulated as linear functionals of the cumulative hazard functions. We prove a version of Wilks’ theorem for the multiple constrained censored empirical likelihood ratio, which provides a simple reference distribution for the test statistic of our proposed method. A useful application of the proposed method is, for example, examining the survival experience of different populations by combining different weighted log-rank tests. Real data examples are given using the log-rank and Gehan-Wilcoxon tests. In a simulation study of two-sample survival data, we compare the proposed method of combining tests to previously developed procedures. The results demonstrate that, in addition to its computational simplicity, the combined test performs comparably to, and in some situations more reliably than, previously developed procedures. Statistical software is available in the R package ‘emplik’.

20.

On the analysis of unbalanced two-level supersaturated designs via generalized linear models

K. Chatterjee C. Parpoula 《Communications in Statistics - Simulation and Computation》2017,46(5):3383-3395

Supersaturated designs (SSDs) are factorial designs in which the number of experimental runs is smaller than the number of parameters to be estimated in the model. While most of the literature on SSDs has focused on balanced designs, the construction and analysis of unbalanced designs has not been developed to a great extent. Recent studies discuss the possible advantages of relaxing the balance requirement in the construction or data analysis of SSDs, and suggest that unbalanced designs compare favorably to balanced designs for several optimality criteria and for the way in which the data are analyzed. Moreover, the effect analysis framework of unbalanced SSDs has until now been restricted to the central assumption that experimental data come from a linear model. In this article, we consider unbalanced SSDs for data analysis under the assumption of generalized linear models (GLMs), revealing that unbalanced SSDs perform well despite the unbalance property. The examination of Type I and Type II error rates through an extensive simulation study indicates that the proposed method works satisfactorily.