1.

Nonparametric methods in multivariate factorial designs for large number of factor levels

Arne C. Bathke Solomon W. Harrar 《Journal of statistical planning and inference》2008

We propose different multivariate nonparametric tests for factorial designs and derive their asymptotic distribution for the situation where the number of replications is limited, whereas the number of treatments goes to infinity (large a, small n case). The tests are based on separate rankings for the different variables, and they are therefore invariant under separate monotone transformations of the individual variables.
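The invariance claim can be illustrated with a small sketch. This is not the authors' test statistic; it only shows that ranking each variable separately gives the same ranks before and after a strictly increasing transformation of each variable. The function names and the data are hypothetical.

```python
import math

def ranks(values):
    """Return 1-based midranks of values (tied values share the average rank)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mid = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = mid
        i = j + 1
    return result

def rank_columns(rows):
    """Rank each variable (column) separately across all observations."""
    columns = list(zip(*rows))
    ranked = [ranks(list(col)) for col in columns]
    return [list(row) for row in zip(*ranked)]

# Hypothetical bivariate observations (all values positive).
data = [(1.2, 30.0), (0.7, 12.0), (3.1, 45.0), (2.0, 12.0)]
# Apply a different strictly increasing transformation to each variable.
transformed = [(math.log(x), y ** 3) for x, y in data]

print(rank_columns(data) == rank_columns(transformed))  # → True
```

Any statistic computed only from these per-variable ranks inherits the same invariance.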

2.

Equineighboured balanced nested row-column designs

《Journal of statistical planning and inference》1988,20(2):191-199

This paper presents equineighboured balanced nested row-column designs for v treatments arranged in b blocks each comprising pq units further grouped into p rows and q columns. These are two-dimensional designs with the property that all pairs of treatments are neighbours equally frequently at all levels in both rows and columns. Methods of construction are given both for designs based on Latin squares and those where pq≤v. Cyclic equineighboured designs are defined and tabulated for 5≤v≤10, p≤3, q≤5, where r=bpq/v is the number of replications of each treatment.

3.

TESTING SUBHYPOTHESES AND ESTIMATING σ² IN THE NONREPLICATED THREE-WAY MULTIPLICATIVE INTERACTION MODEL

《Communications in Statistics - Simulation and Computation》2013,42(4):605-618

Standard statistical techniques do not provide methods for analyzing data from nonreplicated factorial experiments. Such experiments occur for several reasons. Many experimenters may prefer conducting experiments having a large number of factor levels with no replications to conducting experiments with a few factor levels with replications, particularly in pilot studies. Such experiments may allow one to identify factor combinations to be used in follow-up experiments. Another possibility is that the experimenter thinks that an experiment is replicated when in fact it is not. This occurs when a naive researcher believes that sub-samples are replicates when in reality they are not. Nonreplicated two-way experiments have been extensively studied. This paper discusses the analysis of nonreplicated three-way experiments. In particular, estimation of σ² is discussed and a test is derived for testing whether three-factor interaction is absent in sub-areas of three-way data using a nonreplicated three-way multiplicative interaction model with a single multiplicative term. The approximate null distribution of the derived test statistic is studied using Monte Carlo studies and results are illustrated through an example.

4.

Selection and screening procedures to determine optimal product designs

《Journal of statistical planning and inference》1998,67(2):311-330

To compare several promising product designs, manufacturers must measure their performance under multiple environmental conditions. In many applications, a product design is considered to be seriously flawed if its performance is poor for any level of the environmental factor. For example, if a particular automobile battery design does not function well under temperature extremes, then a manufacturer may not want to put this design into production. Thus, this paper considers the measure of a product's quality to be its worst performance over the levels of the environmental factor. We develop statistical procedures to identify (a near) optimal product design among a given set of product designs, i.e., the manufacturing design that maximizes the worst product performance over the levels of the environmental variable. We accomplish this by intuitive procedures based on the split-plot experimental design (and the randomized complete block design as a special case); split-plot designs have the essential structure of a product array and the practical convenience of local randomization. Two classes of statistical procedures are provided. In the first, the δ-best formulation of selection problems, we determine the number of replications of the basic split-plot design that are needed to guarantee, with a given confidence level, the selection of a product design whose minimum performance is within a specified amount, δ, of the performance of the optimal product design. In particular, if the difference between the quality of the best and second best manufacturing designs is δ or more, then the procedure guarantees that the best design will be selected with specified probability. 
For applications where a split-plot experiment that involves several product designs has been completed without the planning required of the δ-best formulation, we provide procedures to construct a ‘confidence subset’ of the manufacturing designs; the selected subset contains the optimal product design with a prespecified confidence level. The latter is called the subset selection formulation of selection problems. Examples are provided to illustrate the procedures.
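The quality measure used in this abstract (a design's worst performance over the environmental levels) can be sketched as follows. This is only the maximin criterion applied to observed means; it ignores sampling variability and does not implement the δ-best or subset selection procedures of the paper. The design names and numbers are hypothetical.

```python
def worst_case(performance):
    """Map each design to its minimum performance over the environmental levels."""
    return {design: min(values) for design, values in performance.items()}

def select_maximin(performance):
    """Return the design whose worst-case performance is largest."""
    worst = worst_case(performance)
    return max(worst, key=worst.get)

# Hypothetical mean performance of three designs at three temperature levels.
performance = {
    "A": [8.1, 7.9, 5.2],
    "B": [7.0, 6.8, 6.5],
    "C": [9.0, 4.1, 8.7],
}

print(worst_case(performance))      # → {'A': 5.2, 'B': 6.5, 'C': 4.1}
print(select_maximin(performance))  # → B
```

Design C has the best single result but the poorest worst case, which is why the maximin criterion prefers B.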

5.

A nonparametric procedure for the analysis of balanced crossover designs

Serge Tardif François Bellavance Constance Van Eeden 《Revue canadienne de statistique》2005,33(4):471-488

The authors propose nonparametric tests for the hypothesis of no direct treatment effects, as well as for the hypothesis of no carryover effects, for balanced crossover designs in which the number of treatments equals the number of periods p, where p ≥ 3. They suppose that the design consists of n replications of balanced crossover designs, each formed by m Latin squares of order p. Their tests are permutation tests which are based on the n vectors of least squares estimators of the parameters of interest obtained from the n replications of the experiment. They obtain both the exact and limiting distribution of the test statistics, and they show that the tests have, asymptotically, the same power as the F‐ratio test.

6.

Number of Replications Required in Control Chart Monte Carlo Simulation Studies

Jay R. Schaffer Myoung-Jin Kim 《Communications in Statistics - Simulation and Computation》2013,42(5):1075-1087

Monte Carlo simulations have been used extensively in studying the performance of control charts. Researchers have used various numbers of replications in their studies, but almost none of them provided justifications for the number of replications used. Currently, there are no empirically based recommendations regarding the required number of replications to ensure accurate results. This research examined six recently published studies to develop recommendations for the minimum number of replications necessary to reproduce the reported results within a specified degree of accuracy. The results of this study indicated that using 10,000 replications was unnecessarily large and a smaller number of replications could be used to reproduce the target ARLs within the 2% error bands satisfying the modified Mundfrom's criteria. In many cases, only 5,000 replications or fewer were required. In general, the number of replications required to reproduce the target ARL decreased as the shift size increased. In addition, the results of this study provide general recommendations for the required number of replications to use in future SPC simulation studies.
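How the number of replications affects the precision of a simulated ARL can be sketched as below. The chart here is an assumption made for illustration (a Shewhart individuals chart with known in-control parameters and ±3 standardized limits); it is not one of the six studies examined, and the modified Mundfrom's criteria are not implemented. The function names and seed are hypothetical.

```python
import random
import statistics

def run_length(shift, limit, rng):
    """Count samples until a standardized observation falls outside ±limit."""
    n = 0
    while True:
        n += 1
        if abs(rng.gauss(shift, 1.0)) > limit:
            return n

def estimate_arl(shift, replications, limit=3.0, seed=0):
    """Estimate the ARL and its relative standard error from simulated run lengths."""
    rng = random.Random(seed)
    lengths = [run_length(shift, limit, rng) for _ in range(replications)]
    mean = statistics.fmean(lengths)
    std_err = statistics.stdev(lengths) / replications ** 0.5
    return mean, std_err / mean

# Compare the relative standard error for several replication counts.
for reps in (500, 2000, 5000):
    arl, rel_se = estimate_arl(shift=2.0, replications=reps)
    print(reps, round(arl, 2), round(rel_se, 4))
```

The relative standard error shrinks roughly in proportion to the inverse square root of the replication count, so the count needed for a given error band can be judged from a pilot run.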

7.

On the construction of m-associate PBIB designs I

Kishore Sinha 《Journal of statistical planning and inference》1977,1(2):133-142

The m-associate triangular association scheme has been discussed, and several series of partially balanced incomplete block (PBIB) designs with m-associate triangular association scheme have been obtained in Section 1. In Section 2, an m-associate triangular-group divisible association scheme (T_q-GD m, 1<q<m) has been introduced and several series of PBIB designs with m-associate triangular group divisible association scheme, from m-associate triangular PBIB designs have been constructed. Some numerical values to the three associate triangular designs, and three associate triangular group divisible designs in the range b, v≦100; r, k≦10, with their average efficiencies are given, respectively, in Tables I and II, in Section 3, where as usual v denotes the number of treatments, b the number of blocks, r the number of replications of each treatment, and k the block size.

8.

Generalized neighbor designs with block size 3

Ijaz Iqbal M.H. Tahir M.L. Aggarwal Asghar Ali Iftikhar Ahmed 《Journal of statistical planning and inference》2012,142(3):626-632

A generalized neighbor design relaxes the equality condition on the number of times two treatments appear as neighbors in the design. In this article, we consider the construction of some classes of generalized neighbor designs with block size k=3 by using the method of cyclic shifts. The distinguishing feature of this construction method is that the properties of a design can easily be obtained from the sets of shifts instead of constructing the actual blocks of the design. A catalog of generalized neighbor designs with block size k=3 is compiled for v∈{5,6,…,18} treatments and for different replications. We provide the reader with a simpler method of construction and, more generally, a catalog that gives the experimenter an open choice in selecting any class of neighbor designs.

9.

Bayesian Additive Regression Trees using Bayesian model averaging

Belinda Hernández Adrian E. Raftery Stephen R Pennington Andrew C. Parnell 《Statistics and Computing》2018,28(4):869-890

Bayesian Additive Regression Trees (BART) is a statistical sum of trees model. It can be considered a Bayesian version of machine learning tree ensemble methods where the individual trees are the base learners. However, for datasets where the number of variables p is large the algorithm can become inefficient and computationally expensive. Another method which is popular for high-dimensional data is random forests, a machine learning algorithm which grows trees using a greedy search for the best split points. However, its default implementation does not produce probabilistic estimates or predictions. We propose an alternative fitting algorithm for BART called BART-BMA, which uses Bayesian model averaging and a greedy search algorithm to obtain a posterior distribution more efficiently than BART for datasets with large p. BART-BMA incorporates elements of both BART and random forests to offer a model-based algorithm which can deal with high-dimensional data. We have found that BART-BMA can be run in a reasonable time on a standard laptop for the “small n large p” scenario which is common in many areas of bioinformatics. We showcase this method using simulated data and data from two real proteomic experiments, one to distinguish between patients with cardiovascular disease and controls and another to classify aggressive from non-aggressive prostate cancer. We compare our results to their main competitors. Open source code written in R and Rcpp to run BART-BMA can be found at: https://github.com/BelindaHernandez/BART-BMA.git.

10.

Model robust extrapolation designs

《Journal of statistical planning and inference》1988,18(1):1-24

We seek designs which are optimal in some sense for extrapolation when the true regression function is in a certain class of regression functions. More precisely, the class is defined to be the collection of regression functions such that its (h + 1)-th derivative is bounded. The class can be viewed as representing possible departures from an ‘ideal’ model and thus describes a model robust setting. The estimates are restricted to be linear and the designs are restricted to be with minimal number of points. The design and estimate sought is minimax for mean square error. The optimal designs for cases X = [0, ∞] and X = [-1, 1], where X is the place where observations can be taken, are discussed.

11.

Variable selection for generalized linear mixed models by L₁-penalized estimation

Andreas Groll Gerhard Tutz 《Statistics and Computing》2014,24(2):137-154

Generalized linear mixed models are a widely used tool for modeling longitudinal data. However, their use is typically restricted to few covariates, because the presence of many predictors yields unstable estimates. The presented approach to the fitting of generalized linear mixed models includes an L₁-penalty term that enforces variable selection and shrinkage simultaneously. A gradient ascent algorithm is proposed that maximizes the penalized log-likelihood, yielding models with reduced complexity. In contrast to common procedures it can be used in high-dimensional settings where a large number of potentially influential explanatory variables is available. The method is investigated in simulation studies and illustrated by use of real data sets.

12.

On Nonsmooth Estimating Functions via Jackknife Empirical Likelihood

Zhouping Li Jinfeng Xu Wang Zhou 《Scandinavian Journal of Statistics》2016,43(1):49-69

In many applications, the parameters of interest are estimated by solving non‐smooth estimating functions with U‐statistic structure. Because the asymptotic covariance matrix of the estimator generally involves the underlying density function, resampling methods are often used to bypass the difficulty of non‐parametric density estimation. Despite its simplicity, the resultant covariance matrix estimator depends on the nature of resampling, and the method can be time‐consuming when the number of replications is large. Furthermore, the inferences are based on the normal approximation that may not be accurate for practical sample sizes. In this paper, we propose a jackknife empirical likelihood‐based inferential procedure for non‐smooth estimating functions. Standard chi‐square distributions are used to calculate the p‐value and to construct confidence intervals. Extensive simulation studies and two real examples are provided to illustrate its practical utilities.

13.

Moving Block Bootstrap for Analyzing Longitudinal Data

Hyunsu Ju 《Communications in Statistics - Theory and Methods》2013,42(6):1130-1142

In a longitudinal study subjects are followed over time. I focus on a case where the number of replications over time is large relative to the number of subjects in the study. I investigate the use of moving block bootstrap methods for analyzing such data. Asymptotic properties of the bootstrap methods in this setting are derived. The effectiveness of these resampling methods is also demonstrated through a simulation study.
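A minimal sketch of the generic moving block bootstrap for a single series follows. How the paper applies the resampling across subjects is not stated in the abstract, so this shows only the basic idea of drawing overlapping blocks of consecutive time points with replacement. The function name, block length, and data are hypothetical.

```python
import random

def moving_block_bootstrap(series, block_length, rng):
    """Resample a series by concatenating randomly chosen overlapping blocks."""
    n = len(series)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(series)")
    n_blocks = -(-n // block_length)  # ceiling division
    # Each block starts at a uniformly chosen position and keeps time order.
    starts = [rng.randrange(n - block_length + 1) for _ in range(n_blocks)]
    resampled = []
    for start in starts:
        resampled.extend(series[start:start + block_length])
    return resampled[:n]  # trim so the resample has the original length

rng = random.Random(1)
series = list(range(20))  # hypothetical repeated measurements on one subject
sample = moving_block_bootstrap(series, block_length=4, rng=rng)
print(sample)
```

Keeping consecutive observations together inside each block is what preserves short-range dependence over time, which independent resampling of single observations would destroy.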

14.

Developing Ridge Parameters for SUR Model

M. A. Alkhamisi 《Communications in Statistics - Theory and Methods》2013,42(4):544-564

This paper proposes a number of procedures for developing new biased estimators of the seemingly unrelated regression (SUR) parameters, when the explanatory variables are affected by multicollinearity. Several ridge parameters are proposed and then compared in terms of the trace mean squared error (TMSE) and (PR) criteria. The PR criterion is the proportion of replications (out of 1,000) for which the SUR version of the generalized least squares (SGLS) estimator has a smaller TMSE than others. The study was performed using Monte Carlo simulations where the number of equations in the system, the number of observations, the correlation among equations, and the correlation between explanatory variables have been varied. For each model, we performed 1,000 replications. Our results show that under certain conditions some of the proposed SUR ridge parameters (R_Sgeom, R_Skmed, R_Sqarith, and R_Sqmax) performed well when compared, in terms of TMSE and PR criteria, with other proposed and popular existing ridge parameters. In large samples and when the collinearity between the explanatory variables is not high, the unbiased SUR estimator (SGLS) performed better than the other ridge parameters.

15.

Weakly resolvable IV.3 search designs for the pⁿ factorial experiment

Donald A. Anderson Ann M. Thomas 《Journal of statistical planning and inference》1980,4(3):299-312

A series of weakly resolvable search designs for the pⁿ factorial experiment is given for which the mean and all main effects are estimable in the presence of any number of two-factor interactions and for which any combination of three or fewer pairs of factors that interact may be detected. The designs have N = p(p–1)n+p runs except in one case where additional runs are required for detection and one case where (p–1)² additional runs are needed to estimate all (p–1)² degrees of freedom for each pair of detected interactions. The detection procedure is simple enough that computations can be carried out with hand calculations.

16.

Partition of a query set into minimal number of subsets having consecutive retrieval property

Sumiyasu Yamamoto Kazuhiko Ushio Shinsei Tazawa Hideto Ikeda Fumikazu Tamari Noboru Hamada 《Journal of statistical planning and inference》1977,1(1):41-51

This paper is concerned with information retrieval. The basic problem is how to store large masses of data in such a way that whenever information regarding some particular aspect of the data is needed, such information is easily and efficiently retrieved. Work in this field is thus very important for organizations dealing with large classes of data. The consecutive retrieval (C-R) property defined by S.P. Ghosh is an important relation between a set of queries and a set of records. Its existence enables the design of information retrieval system with a minimal search time and no redundant storage in that the records can be organized in such a way that those pertinent to any query are stored in consecutive storage locations. The C-R property, however, can not exist between every arbitrary query set and every record set. A subset of the query set Q having the C-R property is called a C-R subset and a C-R subset having the maximum cardinality is called the maximal C-R subset. A partition of Q is called the C-R partition if every subset has the C-R property. A C-R partition with minimum number of subsets is called the minimal C-R partition. With respect to the set of all binary queries and the set of all binary records, it is shown that the maximal cardinality of a C-R subset is 2l-1 where l is the number of attributes concerned. A combinatorial characterization of a maximal C-R subset is also given. A lower bound on the number of subsets in a C-R partition and several examples which attain the lower bound are given. A general procedure for obtaining a minimal C-R partition which attains the lower bound is given provided the number of attributes is even.
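The C-R property described above can be checked directly for small examples. The sketch below represents each query by the set of records pertinent to it (a representation assumed here for illustration) and searches by brute force for a storage order in which every query's records are consecutive. It does not implement the paper's partition procedure, and the function names and data are hypothetical.

```python
from itertools import permutations

def has_consecutive_retrieval(order, queries):
    """Check that each query's records occupy consecutive positions in order."""
    position = {record: i for i, record in enumerate(order)}
    for query in queries:
        if not query:
            continue  # an empty query is trivially satisfied
        spots = [position[record] for record in query]
        # Distinct positions are consecutive exactly when they span len(spots) slots.
        if max(spots) - min(spots) + 1 != len(spots):
            return False
    return True

def find_cr_order(records, queries):
    """Return a storage order with the C-R property, or None (brute force)."""
    for order in permutations(records):
        if has_consecutive_retrieval(order, queries):
            return list(order)
    return None

records = ["a", "b", "c", "d"]
chain = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
triangle = [{"a", "b"}, {"b", "c"}, {"a", "c"}]

print(find_cr_order(records, chain))                 # → ['a', 'b', 'c', 'd']
print(find_cr_order(["a", "b", "c"], triangle))      # → None
```

The second query set shows why the property cannot hold in general: three records cannot be placed in a line so that every pair is adjacent, so such a query set has to be partitioned into C-R subsets.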

17.

Independent screening in high-dimensional exponential family predictors’ space

Kofi Placid Adragni 《Journal of applied statistics》2015,42(2):347-359

We present a methodology for screening predictors that, given the response, follow one-parameter exponential family distributions. Screening predictors can be an important step in regressions when the number of predictors p is excessively large or larger than n, the number of observations. We consider instances where a large number of predictors are suspected to be irrelevant because they carry no information about the response. The proposed methodology helps remove these irrelevant predictors while capturing those linearly or nonlinearly related to the response.

18.

Hidden Markov Models with mixtures as emission distributions

Stevenn Volant Caroline Bérard Marie-Laure Martin-Magniette Stéphane Robin 《Statistics and Computing》2014,24(4):493-504

In unsupervised classification, Hidden Markov Models (HMM) are used to account for a neighborhood structure between observations. The emission distributions are often supposed to belong to some parametric family. In this paper, a semiparametric model where the emission distributions are a mixture of parametric distributions is proposed to get a higher flexibility. We show that the standard EM algorithm can be adapted to infer the model parameters. For the initialization step, starting from a large number of components, a hierarchical method to combine them into the hidden states is proposed. Three likelihood-based criteria to select the components to be combined are discussed. To estimate the number of hidden states, BIC-like criteria are derived. A simulation study is carried out both to determine the best combination between the combining criteria and the model selection criteria and to evaluate the accuracy of classification. The proposed method is also illustrated using a biological dataset from the model plant Arabidopsis thaliana. An R package, HMMmix, is freely available on CRAN.

19.

Use of inter-block information to obtain uniformly better estimators of treatment contrasts

S. Mejza 《Statistics》2013,47(3):335-341

In this paper the problem of combining the estimates is reexamined by making use of the theory of basic contrasts. For some basic contrasts, called partially confounded, a general method of finding uniformly better combined estimators of treatment contrast is derived. The method is applicable for all proper block designs, not necessarily connected, with equal or different treatment replications, for which there are multiple efficiency factors ε of multiplicity q > 2 and if ν_e > 2, where ν_e is the number of the error degrees of freedom in the intra-block analysis.

20.

Incomplete block designs for asymmetric parallel line assays

Shashi Shekhar 《Communications in Statistics - Theory and Methods》2013,42(8):2413-2425

A method of construction of A-optimal binary block designs for asymmetrical parallel line assays, i.e., assays in which the numbers of doses for the standard and test preparations are unequal, is considered. The method is illustrated with examples. Two cases of this method have been considered. In the first case, the designs obtained have equal replications of the doses. In the second case, designs with unequal replications are obtained.