期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Bayesian model comparison based on expected posterior priors for discrete decomposable graphical models

Guido Consonni Monia Lupparelli 《Journal of statistical planning and inference》2009,139(12):4154-4164

The implementation of the Bayesian paradigm to model comparison can be problematic. In particular, prior distributions on the parameter space of each candidate model require special care. While it is well known that improper priors cannot be routinely used for Bayesian model comparison, we claim that also the use of proper conventional priors under each model should be regarded as suspicious, especially when comparing models having different dimensions. The basic idea is that priors should not be assigned separately under each model; rather they should be related across models, in order to acquire some degree of compatibility, and thus allow fairer and more robust comparisons. In this connection, the intrinsic prior as well as the expected posterior prior (EPP) methodology represent a useful tool. In this paper we develop a procedure based on EPP to perform Bayesian model comparison for discrete undirected decomposable graphical models, although our method could be adapted to deal also with directed acyclic graph models. We present two possible approaches. One based on imaginary data, and one which makes use of a limited number of actual data. The methodology is illustrated through the analysis of a 2×3×4 contingency table. 相似文献

2.

Copula directed acyclic graphs

Eugen Pircalabelu Gerda Claeskens Irène Gijbels 《Statistics and Computing》2017,27(1):55-78

A new methodology for selecting a Bayesian network for continuous data outside the widely used class of multivariate normal distributions is developed. The ‘copula DAGs’ combine directed acyclic graphs and their associated probability models with copula C/D-vines. Bivariate copula densities introduce flexibility in the joint distributions of pairs of nodes in the network. An information criterion is studied for graph selection tailored to the joint modeling of data based on graphs and copulas. Examples and simulation studies show the flexibility and properties of the method. 相似文献

3.

The stochastic topic block model for the clustering of vertices in networks with textual edges

C. Bouveyron P. Latouche R. Zreik 《Statistics and Computing》2018,28(1):11-31

Due to the significant increase of communications between individuals via social media (Facebook, Twitter, Linkedin) or electronic formats (email, web, e-publication) in the past two decades, network analysis has become an unavoidable discipline. Many random graph models have been proposed to extract information from networks based on person-to-person links only, without taking into account information on the contents. This paper introduces the stochastic topic block model, a probabilistic model for networks with textual edges. We address here the problem of discovering meaningful clusters of vertices that are coherent from both the network interactions and the text contents. A classification variational expectation-maximization algorithm is proposed to perform inference. Simulated datasets are considered in order to assess the proposed approach and to highlight its main features. Finally, we demonstrate the effectiveness of our methodology on two real-word datasets: a directed communication network and an undirected co-authorship network. 相似文献

4.

Criteria for Linear Model Selection Based on Kullback's Symmetric Divergence 总被引：1，自引：0，他引：1

Joseph E. Cavanaugh 《Australian & New Zealand Journal of Statistics》2004,46(2):257-274

Model selection criteria are frequently developed by constructing estimators of discrepancy measures that assess the disparity between the 'true' model and a fitted approximating model. The Akaike information criterion (AIC) and its variants result from utilizing Kullback's directed divergence as the targeted discrepancy. The directed divergence is an asymmetric measure of separation between two statistical models, meaning that an alternative directed divergence can be obtained by reversing the roles of the two models in the definition of the measure. The sum of the two directed divergences is Kullback's symmetric divergence. In the framework of linear models, a comparison of the two directed divergences reveals an important distinction between the measures. When used to evaluate fitted approximating models that are improperly specified, the directed divergence which serves as the basis for AIC is more sensitive towards detecting overfitted models, whereas its counterpart is more sensitive towards detecting underfitted models. Since the symmetric divergence combines the information in both measures, it functions as a gauge of model disparity which is arguably more balanced than either of its individual components. With this motivation, the paper proposes a new class of criteria for linear model selection based on targeting the symmetric divergence. The criteria can be regarded as analogues of AIC and two of its variants: 'corrected' AIC or AICc and 'modified' AIC or MAIC. The paper examines the selection tendencies of the new criteria in a simulation study and the results indicate that they perform favourably when compared to their AIC analogues. 相似文献

5.

NFL Draftnikology: Euclidean Metrics and Other Approaches to Scoring Ranking Predictions

Steven B. Caudill Franklin G. Mixon JR. C. Paul Mixon 《统计学通讯:模拟与计算》2013,42(2):237-248

With viewership of NFL (National Football League) football in the US rising above 20 million, interest in the NFL Draft has also been at all-time highs in recent years. Much of this interest is directed toward the “NFL draftniks” who provide draft predictions—so-called “mock drafts”—leading up to the NFL Draft. Despite increasing interest in “NFL draftnikology,” the scoring methodology used to evaluate mock NFL drafts lags far behind. This study offers a few alternative approaches, including a Euclidean metrics approach to evaluating mock NFL drafts. The usefulness of our methodologies extends to evaluation of economic and financial analysts. 相似文献

6.

De gustibus non est disputandum? A new approach to the estimation of equivalence scales

Mauro Maltagliati Gustavo De Santis 《Statistical Methods and Applications》2001,10(1-3):211-236

We argue that when the household composition changes, consumption patterns vary not only because of the cost effect that equivalence scales try to measure, but also because of a “taste” or “style” effect. This effect can be identified and measured, under a few assumptions, with the use of a new methodology, calledDM ² (Decomposition Model of the effects of Demographic Metamorphosis), that can be viewed as a generalisation of Ray's (1983) price-scaling approach to the construction of equivalent scales. An empirical application to data drawn from the Istat 1995 Italian Household Budget Survey suggests that the proposed method improves our understanding of households' consumption patterns and the reliability of the equivalence scales that we derive. We gratefully acknowledge helpful comments from Giorgio Calzolari, Franco Polverini, Ugo Trivellato and an anonymous referee, but retain full responsibility for all errors, and for the processing of Istat (Italian Institute of Statistics) microdata on Household Budgets. Financial support for this research was provided by the italian MURST (Research project on “Equivalence scales” directed by Prof. Guido Ferrari, University of Firenze, Ref. No. 9913105354; and Research project on “Low fertility in Italy: between economic constraints and value changes”, directed by Prof. Massimo Livi Bacci, University of Firenze, Ref. No. MM13107238). Preliminary findings on this research topic have been presented in a few seminars and conferences: cf., e.g., De Santis and Maltagliati (2000 and 2001). 相似文献

7.

The Effects of Rounding on Likelihood Procedures

L. Pace A. Salvan L. Ventura 《Journal of applied statistics》2004,31(1):29-48

The aim of this paper is to investigate the robustness properties of likelihood inference with respect to rounding effects. Attention is focused on exponential families and on inference about a scalar parameter of interest, also in the presence of nuisance parameters. A summary value of the influence function of a given statistic, the local-shift sensitivity, is considered. It accounts for small fluctuations in the observations. The main result is that the local-shift sensitivity is bounded for the usual likelihood-based statistics, i.e. the directed likelihood, the Wald and score statistics. It is also bounded for the modified directed likelihood, which is a higher-order adjustment of the directed likelihood. The practical implication is that likelihood inference is expected to be robust with respect to rounding effects. Theoretical analysis is supplemented and confirmed by a number of Monte Carlo studies, performed to assess the coverage probabilities of confidence intervals based on likelihood procedures when data are rounded. In addition, simulations indicate that the directed likelihood is less sensitive to rounding effects than the Wald and score statistics. This provides another criterion for choosing among first-order equivalent likelihood procedures. The modified directed likelihood shows the same robustness as the directed likelihood, so that its gain in inferential accuracy does not come at the price of an increase in instability with respect to rounding. 相似文献

8.

Joint regression modeling for missing categorical covariates in generalized linear models

Luis Carlos Pérez-Ruiz Gabriel Escarela 《Journal of applied statistics》2018,45(15):2741-2759

Missing covariates data is a common issue in generalized linear models (GLMs). A model-based procedure arising from properly specifying joint models for both the partially observed covariates and the corresponding missing indicator variables represents a sound and flexible methodology, which lends itself to maximum likelihood estimation as the likelihood function is available in computable form. In this paper, a novel model-based methodology is proposed for the regression analysis of GLMs when the partially observed covariates are categorical. Pair-copula constructions are used as graphical tools in order to facilitate the specification of the high-dimensional probability distributions of the underlying missingness components. The model parameters are estimated by maximizing the weighted log-likelihood function by using an EM algorithm. In order to compare the performance of the proposed methodology with other well-established approaches, which include complete-cases and multiple imputation, several simulation experiments of Binomial, Poisson and Normal regressions are carried out under both missing at random and non-missing at random mechanisms scenarios. The methods are illustrated by modeling data from a stage III melanoma clinical trial. The results show that the methodology is rather robust and flexible, representing a competitive alternative to traditional techniques. 相似文献

9.

A Bayesian Analysis for Accelerated Lifetime Tests Under an Exponential Power Law Model with Threshold Stress

Cynthia Tojeiro Francisco Louzada-Neto Heleno Bolfarine 《Journal of applied statistics》2004,31(6):685-691

In this paper, we present a Bayesian methodology for modelling accelerated lifetime tests under a stress response relationship with a threshold stress. Both Laplace and MCMC methods are considered. The methodology is described in detail for the case when an exponential distribution is assumed to express the behaviour of lifetimes, and a power law model with a threshold stress is assumed as the stress response relationship. We assume vague but proper priors for the parameters of interest. The methodology is illustrated by a accelerated failure test on an electrical insulation film. 相似文献

10.

Comparing treatment strategies using a synthesized clinical trial: an analysis of late versus early use of trimethoprim-sulfamethoxazole for AIDS patients

《Journal of statistical planning and inference》2001,96(1):179-189

This paper applies methodology of Finkelstein and Schoenfeld [Stat. Med. 13 (1994) 1747.] to consider new treatment strategies in a synthetic clinical trial. The methodology is an approach for estimating survival functions as a composite of subdistributions defined by an auxiliary event which is intermediate to the failure. The subdistributions are usually calculated utilizing all subjects in a study, by taking into account the path determined by each individual's auxiliary event. However, the method can be used to get a composite estimate of failure from different subpopulations of patients. We utilize this application of the methodology to test a new treatment strategy, that changes therapy at later stages of disease, by combining subdistributions from different treatment arms of a clinical trial that was conducted to test therapies for prevention of pneumocystis carinii pneumonia. 相似文献

11.

Validation of a longitudinally measured surrogate marker for a time-to-event endpoint

Didier Renard Helena Geys Geert Molenberghs Tomasz Burzykowski Marc Buyse Tony Vangeneugden Luc Bijnens 《Journal of applied statistics》2003,30(2):235-247

The objective of this paper is to extend the surrogate endpoint validation methodology proposed by Buyse et al. (2000) to the case of a longitudinally measured surrogate marker when the endpoint of interest is time to some key clinical event. A joint model for longitudinal and event time data is required. To this end, the model formulation of Henderson et al. (2000) is adopted. The methodology is applied to a set of two randomized clinical trials in advanced prostate cancer to evaluate the usefulness of prostate-specific antigen (PSA) level as a surrogate for survival. 相似文献

12.

Time-discrete beta-process model for interval-censored survival data

Debajyoti Sinha 《Revue canadienne de statistique》1997,25(4):445-456

Grouped survival data with possible interval censoring arise in a variety of settings. This paper presents nonparametric Bayes methods for the analysis of such data. The random cumulative hazard, common to every subject, is assumed to be a realization of a Lévy process. A time-discrete beta process, introduced by Hjort, is considered for modeling the prior process. A sampling-based Monte Carlo algorithm is used to find posterior estimates of several quantities of interest. The methodology presented here is used to check further modeling assumptions. Also, the methodology developed in this paper is illustrated with data for the times to cosmetic deterioration of breast-cancer patients. An extension of the methodology is presented to deal with two interval-censored times in tandem data (as with some AIDS incubation data). 相似文献

13.

Profile likelihood in directed graphical models from BUGS output

Malene Højbjerre 《Statistics and Computing》2003,13(1):57-66

We present a method for using posterior samples produced by the computer program BUGS (Bayesian inference Using Gibbs Sampling) to obtain approximate profile likelihood functions of parameters or functions of parameters in directed graphical models with incomplete data. The method can also be used to approximate integrated likelihood functions. It is easily implemented and it performs a good approximation. The profile likelihood represents an aspect of the parameter uncertainty which does not depend on the specification of prior distributions, and it can be used as a worthwhile supplement to BUGS that enable us to do both Bayesian and likelihood based analyses in directed graphical models. 相似文献

14.

Automatic Bayesian curve fitting

D. G. T. Denison B. K. Mallick & A. F. M. Smith 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1998,60(2):333-350

A method of estimating a variety of curves by a sequence of piecewise polynomials is proposed, motivated by a Bayesian model and an appropriate summary of the resulting posterior distribution. A joint distribution is set up over both the number and the position of the knots defining the piecewise polynomials. Throughout we use reversible jump Markov chain Monte Carlo methods to compute the posteriors. The methodology has been successful in giving good estimates for 'smooth' functions (i.e. continuous and differentiable) as well as functions which are not differentiable, and perhaps not even continuous, at a finite number of points. The methodology is extended to deal with generalized additive models. 相似文献

15.

Econometric methodology and the philosophy of science

《Journal of statistical planning and inference》1996,49(1):9-37

Econometricians have generally used the term ‘methodology’ to be synonymous with ‘methods’ and, consequently, the field of econometric methodology has been dominated by the discussion of econometric techniques. The purpose of this paper is to present an alternative perspective on econometric methodology by relating it to the more general field of economic methodology, particularly through the use of concepts drawn from the philosophy of science. Definitional and conceptual issues surrounding the term ‘methodology’ are clarified. Three methodologies, representing abstractions from the actual approaches found within econometrics, are identified. First, an ‘a priorist’ methodology, which tends to accord axiomatic status to economic theory, is outlined, and the philosophical foundations of this approach are explored with reference to the interpretive strand within the philosophy of the social sciences. A second approach is an ‘instrumentalist’ one emphasising prediction as the primary goal of econometrics, and a third methodology is ‘falsificationism’, which attempts to test economic theories. These are critically evaluated by introducing relevant issues from the philosophy of science, so that the taxonomy presented here can serve as a framework for future discussions of econometric methodology. 相似文献

16.

Total error in PES estimates of population

Mulry MH Spencer BD 《Journal of the American Statistical Association》1991,86(416):839-863

"We describe a methodology for estimating the accuracy of dual systems estimates (DSE's) of population, census estimates of population, and estimates of undercount in the census. The DSE's are based on the census and a post-enumeration survey (PES). We apply the methodology to the 1988 dress rehearsal census of St. Louis and east-central Missouri and we discuss its applicability to the 1990 [U.S.] census and PES. The methodology is based on decompositions of the total (or net) error into components, such as sampling error, matching error, and other nonsampling errors. Limited information about the accuracy of certain components of error, notably failure of assumptions in the 'capture-recapture' model, but others as well, lead us to offer tentative estimates of the errors of the census, DSE, and undercount estimates for 1988. Improved estimates are anticipated for 1990." Comments are included by Eugene P. Ericksen and Joseph B. Kadane (pp. 855-7) and Kenneth W. Wachter and Terence P. Speed (pp. 858-61), as well as a rejoinder by Mulry and Spencer (pp. 861-3). 相似文献

17.

An application of multinomial logistic regression to estimating performance of a multiple-screening test with incomplete verification

Chris J. Lloyd Donald J. Frommer 《Journal of the Royal Statistical Society. Series C, Applied statistics》2008,57(1):89-102

Summary. The paper describes a method of estimating the performance of a multiple-screening test where those who test negatively do not have their true disease status determined. The methodology is motivated by a data set on 49927 subjects who were given K =6 binary tests for bowel cancer. A complicating factor is that individuals may have polyps in the bowel, a condition that the screening test is not designed to detect but which may be worth diagnosing. The methodology is based on a multinomial logit model for Pr( S | R ₆), the probability distribution of patient status S (healthy, polyps or diseased) conditional on the results R ₆ from six binary tests. An advantage of the methodology described is that the modelling is data driven. In particular, we require no assumptions about correlation within subjects, the relative sensitivity of the K tests or the conditional independence of the tests. The model leads to simple estimates of the trade-off between different errors as the number of tests is varied, presented graphically by using receiver operating characteristic curves. Finally, the model allows us to estimate better protocols for assigning subjects to the disease group, as well as the gains in accuracy from these protocols. 相似文献

18.

An Exponential-Family Multidimensional Scaling Mixture Methodology

Michel Wedel Wayne S. Desarbo 《商业与经济统计学杂志》2013,31(4):447-459

A multidimensional scaling methodology (STUNMIX) for the analysis of subjects' preference/choice of stimuli that sets out to integrate the previous work in this area into a single framework, as well as to provide a variety of new options and models, is presented. Locations of the stimuli and the ideal points of derived segments of subjects on latent dimensions are estimated simultaneously. The methodology is formulated in the framework of the exponential family of distributions, whereby a wide range of different data types can be analyzed. Possible reparameterizations of stimulus coordinates by stimulus characteristics, as well as of probabilities of segment membership by subject background variables, are permitted. The models are estimated in a maximum likelihood framework. The performance of the models is demonstrated on synthetic data, and robustness is investigated. An empirical application is provided, concerning intentions to buy portable telephones. 相似文献

19.

Using instrumental variables for selecting the order of arma models

Fahimeh Rezayat Anandalingam G 《统计学通讯:理论与方法》2013,42(9):3029-3065

A methodology is developed for selecting the order of an ARMA representation of a short realization. The methodology is based on an extension of the Instrumental Variables technique and its theoretical logic is supported by the characteristic of extended Yule-Walker equations and Toeplitz matrices. The methodology is a modification of the Cormer Method and tries to identify a set of orders instead of a single order. The strength of the methodology is evaluated by comparing its numerical findings with that from the Corner Method and the Extended Sample Autocorrelation Function Method. The numerical results imply that (i) the proposed method performs, on the average better than the Corner Method and both methods outperform Extended Sample Autocorrelation Function method, and (ii) the selection of a set of orders provides more reliable results than the selection of a single order. 相似文献

20.

A monte carlo study of robust and least squares response surface methods

《Journal of Statistical Computation and Simulation》2012,82(1-3):1-18

Response surface methodology is useful for exploring a response over a region of factor space and in searching for extrema. Its generality, makes it applicable to a variety of areas. Classical response surface methodology for a continuous response variable is generally based on least squares fitting. The sensitivity of least squares to outlying observations carries over to the surface procedures. To overcome this sensitivity, we propose response surface methodology based on robust procedures for continuous response variables. This robust methodology is analogous to the methodology based on least squares, while being much less sensitive to outlying observations. The results of a Monte Carlo study comparing it and classical surface methodologies for normal and contaminated normal errors are presented. The results show that as the proportion of contamination increases, the robust methodology correctly identifies a higher proportion of extrema than the least squares methods and that the robust estimates of extrema tend to be closer to the true extrema than the least squares methods. 相似文献