Similar Documents
20 similar documents retrieved (search time: 15 ms)
1.
Can we find some common principle in the three comparisons? Lacking adequate time for a thorough exploration, let me suggest that representation is that common principle. I suggested (section 4) that judgment selection of spatial versus temporal extensions distinguishes “longitudinal” local studies from “cross-section” population sampling. We had noted (section 3) that censuses are taken for detailed representation of the spatial dimension, but they depend on judgmental selection of the temporal. Survey sampling lacks spatial detail but is spatially representative with randomization, and it can be made timely. Periodic samples can be designed that are representative of temporal extension. Furthermore, spatial and temporal detail can be obtained either through estimation or through cumulated samples [Purcell and Kish 1979, 1980; Kish 1979b, 1981, 1986 6.6]. Registers and administrative records can have good spatial and temporal representation, but representation may be lacking in population content, and surely in representation of variables. Representation of variables, and of the relations between variables over the population, are the issues in conflict between surveys, experiments, and observations. This subject is too deep to be explored again here; it was treated in section 2. A final point about the limits of randomization for achieving representation through sampling: randomization for selecting samples of variables is generally beyond me, because I cannot conceive of frames for defined populations of variables. Yet we can find attempts at randomized selection of variables: in the selection of items for the consumer price index, and of items for tests of IQ or of achievement. Generally I believe that randomization is the way to achieve representation without complete coverage, and that it can be applied and practised in many dimensions.

2.
Two factors having the same set of levels are said to be homologous. This paper aims to extend the domain of factorial models to designs that include homologous factors. In doing so, it is necessary first to identify the characteristic property of those vector spaces that constitute the standard factorial models. We argue here that essentially every interesting statistical model specified by a vector space is necessarily a representation of some algebraic category. Logical consistency of the sort associated with the standard marginality conditions is guaranteed by category representations, but not by group representations. Marginality is thus interpreted as invariance under selection of factor levels (I-representations) and invariance under replication of levels (S-representations). For designs in which each factor occurs once, the representations of the product category coincide with the standard factorial models. For designs that include homologous factors, the set of S-representations is a subset of the I-representations. It is shown that symmetry and quasi-symmetry are representations in both senses, but that not all representations include the constant functions (intercept). The beginnings of an extended algebra for constructing general I-representations are described and illustrated by a diallel cross design.

3.
The table look-up rule problem can be described by the question: what is a good way for the table to represent the decision regions in the N-dimensional measurement space? This paper describes a quickly implementable table look-up rule based on Ashby’s representation of sets in his constraint analysis. A decision region for category c in the N-dimensional measurement space is considered to be the intersection of the inverse projections of the decision regions determined for category c by Bayes rules in smaller-dimensional projection spaces. Error bounds for this composite decision rule are derived: any entry in the confusion matrix for the composite decision rule is bounded above by the minimum of that entry taken over all the confusion matrices of the Bayes decision rules in the smaller-dimensional projection spaces.

On simulated Gaussian data, the probability of error with the table look-up rule is comparable to that of the optimum Bayes rule.
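A minimal sketch of this composite rule, assuming Gaussian naive Bayes classifiers as the low-dimensional Bayes rules on all 2-D projections of a 4-D space; the simulated data, the choice of projections, and the fallback category are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from itertools import combinations
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 4)), rng.normal(1.5, 1.0, (200, 4))])
y = np.repeat([0, 1], 200)

# One low-dimensional decision rule per 2-D projection of the 4-D space.
projections = list(combinations(range(4), 2))
models = {p: GaussianNB().fit(X[:, list(p)], y) for p in projections}

def composite_predict(x, fallback=0):
    # Intersection of inverse projections: a category survives only if every
    # projection's rule assigns the projected point to it.
    surviving = {0, 1}  # the two categories of this toy problem
    for p, m in models.items():
        surviving &= {int(m.predict(x[list(p)].reshape(1, -1))[0])}
    return surviving.pop() if surviving else fallback  # empty set => fall back

preds = np.array([composite_predict(x) for x in X])
print("training accuracy:", (preds == y).mean())
```

With two categories the intersection is non-empty exactly when all the projection rules agree; the arbitrary fallback above only resolves the abstention case of this sketch.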

4.
We introduce two types of graphical log-linear models: label- and level-invariant models for triangle-free graphs. These models generalise symmetry concepts in graphical log-linear models and provide a tool with which to model symmetry in the discrete case. A label-invariant model is category-invariant and is preserved after permuting some of the vertices according to transformations that maintain the graph, whereas a level-invariant model equates expected frequencies according to a given set of permutations. These new models can both be seen as instances of a new type of graphical log-linear model termed the restricted graphical log-linear model, or RGLL, in which equality restrictions on subsets of main effects and first-order interactions are imposed. Their likelihood equations and graphical representation can be obtained from those derived for the RGLL models.

5.
To address the inability of existing models for selecting new economic growth points to distinguish “existing” growth points from “new” ones, support vector machines are used to mine the latent potential of new economic growth points. The results show that the 38 industrial sectors of Shaanxi Province in 2010 can be divided into two classes, “new economic growth points” and “non-new economic growth points”; the top ten sectors in the first class are consistent with the cultural, high-technology, and new-energy industries promoted in Shaanxi’s 12th Five-Year Plan, which demonstrates the feasibility and reliability of support vector machines for selecting new economic growth points.
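A minimal sketch of the kind of two-class SVM screening described above; the industry indicators, labels, and sample of 38 sectors are wholly hypothetical stand-ins, since the Shaanxi data are not reproduced here:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
# Hypothetical indicators per industry: output growth, profit growth, R&D intensity.
X = rng.normal(size=(38, 3))
# Hypothetical training labels: new economic growth point (1) or not (0).
y = (X @ np.array([1.0, 0.8, 1.2]) + rng.normal(scale=0.5, size=38) > 0).astype(int)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X, y)
print("in-sample accuracy:", clf.score(X, y))
print("predicted classes:", clf.predict(X))
```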

6.
Model selection can be defined as the task of estimating the performance of different models in order to choose the most parsimonious one among a potentially very large set of candidate statistical models. We propose a graphical representation that extends, to the class of mixed models, the deviance plot proposed in the literature for classical and generalized linear models. Once a reduced number of models has been selected, this representation allows important covariates to be identified by focusing only on the fixed-effects component, assuming the random part is properly specified. We also suggest a standalone figure representing the residual random variance ratio: cross-evaluating the two graphical representations allows conclusions to be drawn about the random-part specification of the model and supports a more accurate selection of the final model.

7.
Hea-Jung Kim, Statistics, 2013, 47(5), 421–441
This article develops a class of weighted normal distributions for which the probability density function has the form of a product of a normal density and a weight function. The class constitutes marginal distributions obtained from various kinds of doubly truncated bivariate normal distributions. It strictly includes the normal, skew-normal, and two-piece skew-normal distributions and is useful for selection modelling and inequality-constrained normal mean analysis. Some distributional properties and Bayesian perspectives of the class are given. A probabilistic representation of the distributions is also given; it makes the distributions straightforward to specify and computation straightforward to implement, with output readily adapted for the required analysis. Necessary theories and illustrative examples are provided.
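Since the skew-normal is listed as a member of the class, it gives a concrete instance of the density-times-weight form: a standard normal density multiplied by the weight function Φ(αx), with normalizing constant 2. A small sketch, where the shape value α = 3 is arbitrary:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def skew_normal_pdf(x, alpha):
    # Weighted-normal form: normal density times the weight Phi(alpha * x),
    # with normalizing constant 2 (exact for this particular weight).
    return 2.0 * norm.pdf(x) * norm.cdf(alpha * x)

print(skew_normal_pdf(np.linspace(-2, 2, 5), alpha=3.0))
# Sanity check: the density integrates to 1 for any alpha.
print(quad(lambda t: skew_normal_pdf(t, 3.0), -np.inf, np.inf)[0])
```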

8.
Many tasks in image analysis can be formulated as problems of discrimination or, more generally, of pattern recognition. A pattern-recognition system is normally considered to comprise two processing stages: the feature selection and extraction stage, which attempts to reduce the dimensionality of the pattern to be classified, and the classification stage, the purpose of which is to assign the pattern to its perceptually meaningful category. This paper gives an overview of the various approaches to designing statistical pattern-recognition schemes. The problem of feature selection and extraction is introduced. The discussion then focuses on statistical decision-theoretic rules and their implementation. Both parametric and non-parametric classification methods are covered. The emphasis then switches to decision making in context. Two basic formulations of contextual pattern classification are put forward, and the various methods developed from these two formulations are reviewed. These include the method of hidden Markov chains, the Markov random field approach, Markov meshes, and probabilistic and discrete relaxation.

9.
Monte Carlo methods are used to compare a number of adaptive strategies for deciding which of several covariates to incorporate into the analysis of a randomized experiment. Sixteen selection strategies in three categories are considered: 1) select covariates correlated with the response, 2) select covariates with means differing across groups, and 3) select covariates with means differing across groups that are also correlated with the response. The criteria examined are the type I error rate of the test for equality of adjusted group means and the variance of the estimated treatment effect. These strategies can result in either inflated or deflated type I errors, depending on the method and the population parameters. The adaptive methods in the first category sometimes yield point estimates of the treatment effect more precise than estimators derived using either all or none of the covariates.
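A hedged sketch of this kind of Monte Carlo, for the first category only (keep a covariate when its sample correlation with the response exceeds a threshold); the sample size, threshold, number of covariates, and data-generating process are illustrative assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def one_trial(n=40, p=5, r_cut=0.3):
    g = np.repeat([0.0, 1.0], n // 2)        # randomized groups, no true effect
    Z = rng.normal(size=(n, p))              # candidate covariates
    y = Z[:, 0] + rng.normal(size=n)         # response unrelated to treatment
    keep = [j for j in range(p)
            if abs(np.corrcoef(Z[:, j], y)[0, 1]) > r_cut]  # strategy 1
    Xd = np.column_stack([np.ones(n), g] + [Z[:, j] for j in keep])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    df = n - Xd.shape[1]
    s2 = resid @ resid / df
    se = np.sqrt(s2 * np.linalg.inv(Xd.T @ Xd)[1, 1])
    return abs(beta[1] / se) > stats.t.ppf(0.975, df)  # reject equal means?

print("empirical type I error:", np.mean([one_trial() for _ in range(2000)]))
```

Because the covariates are selected from the same data that enter the test, the empirical rejection rate need not equal the nominal 5%, which is exactly the inflation/deflation effect described above.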

10.
An assumed hypothetical consensus category corresponding to a case being classified provides a basis for assessing the reliability of judges. Equivalent judges are characterised by the joint probability distribution of the judge assignment and the consensus category. Estimates of the conditional probabilities of judge assignment given consensus category, and of consensus category given judge assignments, are indices of reliability. All parameters can be estimated if the data include classifications of a number of cases by three or more judges. Restrictive assumptions are imposed to obtain models for data from classifications by two judges. Maximum likelihood estimation is discussed and illustrated by example for the case of three or more judges.

11.
In the past decades, the number of variables explaining observations in practical applications has increased steadily. This has led to heavy computational tasks, despite the widespread use of provisional variable-selection methods in data processing. More methodological techniques have therefore appeared to reduce the number of explanatory variables without losing much of the information. Among these techniques, two distinct approaches are apparent: ‘shrinkage regression’ and ‘sufficient dimension reduction’. Surprisingly, there has not been any communication or comparison between these two methodological categories, and it is not clear when each of the two approaches is appropriate. In this paper, we fill some of this gap by first reviewing each category in brief, paying special attention to its most commonly used methods. We then compare commonly used methods from both categories in terms of their accuracy, computation time, and ability to select effective variables. A simulation study of the performance of the methods in each category is presented as well. The selected methods are also tested on two sets of real data, which allows us to recommend conditions under which one approach is more appropriate for high-dimensional data.

12.
A Bayesian elastic net approach is presented for variable selection and coefficient estimation in linear regression models. A simple Gibbs sampling algorithm is developed for posterior inference, using a location-scale mixture representation of the Bayesian elastic net prior for the regression coefficients. The penalty parameters are chosen through an empirical method that maximizes the marginal likelihood of the data. Both simulated and real data examples show that the proposed method performs well in comparison with other approaches.

13.
We consider the detection of changes in the mean of a set of time series. The breakpoints are allowed to be series-specific, and the series are assumed to be correlated. The correlation between the series is supposed to be constant along time but is allowed to take an arbitrary form. We show that such a dependence structure can be encoded in a factor model. Thanks to this representation, the inference of the breakpoints can be achieved via dynamic programming, which remains one of the most efficient algorithms. We propose a model selection procedure to determine both the number of breakpoints and the number of factors. The proposed method is implemented in the FASeg R package, which is available on CRAN. We demonstrate the performance of our procedure through simulation experiments and present an application to geodesic data.
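A minimal dynamic-programming sketch for the single-series case with a known number of segments (the paper's factor representation for correlated series and its model selection step are not reproduced):

```python
import numpy as np

def dp_segment(y, K):
    """Split y into K mean-constant segments minimizing within-segment SSE."""
    n = len(y)
    c1 = np.concatenate([[0.0], np.cumsum(y)])
    c2 = np.concatenate([[0.0], np.cumsum(y ** 2)])

    def sse(i, j):  # cost of segment y[i:j] (j exclusive)
        s, m = c1[j] - c1[i], j - i
        return (c2[j] - c2[i]) - s * s / m

    cost = np.full((K, n + 1), np.inf)
    back = np.zeros((K, n + 1), dtype=int)
    for j in range(1, n + 1):
        cost[0, j] = sse(0, j)
    for k in range(1, K):
        for j in range(k + 1, n + 1):
            # Best position t of the last breakpoint before time j.
            cost[k, j], back[k, j] = min(
                (cost[k - 1, t] + sse(t, j), t) for t in range(k, j))
    bps, j = [], n
    for k in range(K - 1, 0, -1):          # trace the breakpoints back
        j = back[k, j]
        bps.append(j)
    return sorted(bps), cost[K - 1, n]

rng = np.random.default_rng(3)
y = np.concatenate([np.zeros(50), np.full(50, 2.0), np.full(50, -1.0)])
y += rng.normal(scale=0.5, size=150)
print(dp_segment(y, K=3))                  # breakpoints near 50 and 100
```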

14.
Kernel smoothing of spatial point data can often be improved using an adaptive, spatially varying bandwidth instead of a fixed bandwidth. However, computation with a varying bandwidth is much more demanding, especially when edge correction and bandwidth selection are involved. This paper proposes several new computational methods for adaptive kernel estimation from spatial point pattern data. A key idea is that a variable-bandwidth kernel estimator for d-dimensional spatial data can be represented as a slice of a fixed-bandwidth kernel estimator in (d+1)-dimensional scale space, enabling fast computation using Fourier transforms. Edge correction factors have a similar representation. Different values of the global bandwidth correspond to different slices of the scale space, so that bandwidth selection is greatly accelerated. Potential applications include estimation of multivariate probability density and spatial or spatiotemporal point process intensity, relative risk, and regression functions. The new methods perform well in simulations and in two real applications concerning the spatial epidemiology of primary biliary cirrhosis and the alarm calls of capuchin monkeys.
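A rough one-dimensional illustration of the slicing idea, not the paper's implementation (which works in (d+1)-dimensional scale space with Fourier transforms and edge correction): here the variable bandwidths are merely quantized into a few levels, each level standing in for one slice, so that fixed-bandwidth code can be reused per slice:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(0.0, 0.5, 300), rng.normal(4.0, 1.5, 100)])

# Abramson-style adaptive bandwidths from a fixed-bandwidth pilot estimate.
h0 = 0.4
pilot = norm.pdf((x[:, None] - x[None, :]) / h0).mean(axis=1) / h0
h = h0 * np.sqrt(np.exp(np.mean(np.log(pilot))) / pilot)

grid = np.linspace(-3.0, 9.0, 400)
edges = np.quantile(h, np.linspace(0.0, 1.0, 6))   # five bandwidth "slices"
slice_id = np.digitize(h, edges[1:-1])

est = np.zeros_like(grid)
for b in range(5):
    pts = x[slice_id == b]
    if pts.size:
        hb = h[slice_id == b].mean()               # one bandwidth per slice
        est += norm.pdf((grid[:, None] - pts[None, :]) / hb).sum(axis=1) / hb
est /= len(x)
print("estimated density at 0 and 4:", est[np.searchsorted(grid, [0.0, 4.0])])
```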

15.
The problems of constructing tolerance intervals for the binomial and Poisson distributions are considered. Closed-form approximate equal-tailed tolerance intervals (that control percentages in both tails) are proposed for both distributions. Exact coverage probabilities and expected widths are evaluated for the proposed equal-tailed tolerance intervals and the existing intervals. Furthermore, an adjustment to the nominal confidence level is suggested so that an equal-tailed tolerance interval can be used as a tolerance interval which includes a specified proportion of the population, but does not necessarily control percentages in both tails. Comparison of such coverage-adjusted tolerance intervals with respect to coverage probabilities and expected widths indicates that the closed-form approximate tolerance intervals are comparable with others, and less conservative, with minimum coverage probabilities close to the nominal level in most cases. The approximate tolerance intervals are simple and easy to compute using a calculator, and they can be recommended for practical applications. The methods are illustrated using two practical examples.
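The paper's closed forms are not reproduced here, but the generic confidence-interval-then-quantile construction behind such equal-tailed intervals can be sketched: form a Wilson interval for p, then guard each tail of Binomial(m, p) at its least favourable endpoint. All settings below (content, confidence, example counts) are arbitrary, and this is an illustration of the general idea rather than the proposed method:

```python
import numpy as np
from scipy import stats

def binom_equal_tailed_ti(x, n, m, content=0.90, conf=0.95):
    """Interval meant to cover the central `content` of Binomial(m, p),
    with confidence `conf`, given x successes observed in n trials."""
    z = stats.norm.ppf(1 - (1 - conf) / 2)
    phat = x / n
    den = 1 + z ** 2 / n                      # Wilson score interval for p
    mid = (phat + z ** 2 / (2 * n)) / den
    half = z * np.sqrt(phat * (1 - phat) / n + z ** 2 / (4 * n ** 2)) / den
    p_lo, p_hi = max(mid - half, 0.0), min(mid + half, 1.0)
    tail = (1 - content) / 2
    # Guard each tail at the least favourable endpoint of the interval for p.
    lower = stats.binom.ppf(tail, m, p_lo)
    upper = stats.binom.ppf(1 - tail, m, p_hi)
    return int(lower), int(upper)

print(binom_equal_tailed_ti(x=23, n=100, m=50))
```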

16.
As a flexible alternative to the Cox model, the accelerated failure time (AFT) model assumes that the event time of interest depends on the covariates through a regression function. The AFT model with non-parametric covariate effects is investigated, when variable selection is desired along with estimation. Formulated in the framework of the smoothing spline analysis of variance model, the proposed method based on the Stute estimate (Stute, 1993 [Consistent estimation under random censorship when covariables are present, J. Multivariate Anal. 45, 89–103]) can achieve a sparse representation of the functional decomposition by utilizing a reproducing kernel Hilbert norm penalty. Computational algorithms and theoretical properties of the proposed method are investigated. The finite-sample performance of the proposed approach is assessed via simulation studies. The primary biliary cirrhosis data are analyzed for demonstration.

17.
Complex load-sharing systems are studied by incorporating dependencies among components through a load-sharing rule. As the load on the system increases, a series of cycles of Phase I/II failures occurs, where a Phase I failure is a single component failure, which then causes a cascade of component failures (Phase II) due to the load transfer as these components fail. A threshold representation for the process of system failure is given. This representation is a gamma-type mixture representation when the component strengths are independent exponentials. In this case, for a given breaking pattern the mixture is over the gamma scale parameter and is based on a convolution of uniforms defined by the load-sharing parameters. Such convolutions can be approximated by normal densities, which reduces the dimension of the parameter space. This representation can be generalized to independent component strengths with arbitrary distributions by transforming the strengths and load-sharing rules to pseudo-strengths and pseudo-load-sharing rules.

18.
Hea-Jung Kim &amp; Taeyoung Roh, Statistics, 2013, 47(5), 1082–1111
In regression analysis, a sample selection scheme often applies to the response variable, resulting in observations that are missing not at random. In this case, a regression analysis using only the selected cases would lead to biased results. This paper proposes a Bayesian methodology to correct this bias, based on a semiparametric Bernstein polynomial regression model that incorporates the sample selection scheme together with a stochastic monotone trend constraint, variable selection, and robustness against departures from the normality assumption. We present the basic theoretical properties of the proposed model, including its stochastic representation, quantification of the sample selection bias, a hierarchical model specification to deal with the stochastic monotone trend constraint in the nonparametric component, simple bias-corrected estimation, and variable selection for the linear components. We then develop computationally feasible Markov chain Monte Carlo methods for semiparametric Bernstein polynomial functions with stochastically constrained parameter estimation and variable selection procedures. We demonstrate the finite-sample performance of the proposed model compared with existing methods using simulation studies and illustrate its use with two real data applications.
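The monotone-trend ingredient has a simple frequentist toy version: a Bernstein polynomial fit is nondecreasing whenever its coefficients are, and rewriting the coefficients as cumulative nonnegative increments turns this into a bound-constrained least-squares problem. A sketch under these assumptions (the Bayesian, selection-corrected machinery of the paper is not reproduced):

```python
import numpy as np
from scipy.stats import binom
from scipy.optimize import lsq_linear

rng = np.random.default_rng(5)
t = np.sort(rng.uniform(size=120))
y = np.sin(1.5 * t) + rng.normal(scale=0.1, size=t.size)  # monotone truth

m = 8
k = np.arange(m + 1)
B = binom.pmf(k[None, :], m, t[:, None])  # Bernstein basis b_{k,m}(t)
L = np.tril(np.ones((m + 1, m + 1)))      # beta = L @ gamma (cumulative sums)
lb = np.r_[-np.inf, np.zeros(m)]          # gamma[1:] >= 0 => beta nondecreasing
res = lsq_linear(B @ L, y, bounds=(lb, np.full(m + 1, np.inf)))
fitted = B @ (L @ res.x)
print("fit is monotone:", bool(np.all(np.diff(fitted) >= -1e-8)))
```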

19.
This article proposes a class of multivariate bilateral selection t distributions useful for analyzing non-normal (skewed and/or bimodal) multivariate data. The class is associated with a bilateral selection mechanism and is obtained from a marginal distribution of the centrally truncated multivariate t. It is flexible enough to include the multivariate t and multivariate skew-t distributions and mathematically tractable enough to account for the central truncation of a hidden t variable. The class, which is closed under linear transformation, marginalization, and conditioning, is studied from several aspects: the shape of the probability density function, conditioning of a distribution, scale mixtures of multivariate normals, and a probabilistic representation. The relationships among these aspects are given, and various properties of the class are also discussed. Necessary theories and two applications are provided.
