期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Generic Identifiability of Linear Structural Equation Models by Ancestor Decomposition

Mathias Drton Luca Weihs 《Scandinavian Journal of Statistics》2016,43(4):1035-1045

Linear structural equation models, which relate random variables via linear interdependencies and Gaussian noise, are a popular tool for modelling multivariate joint distributions. The models correspond to mixed graphs that include both directed and bidirected edges representing the linear relationships and correlations between noise terms, respectively. A question of interest for these models is that of parameter identifiability, whether or not it is possible to recover edge coefficients from the joint covariance matrix of the random variables. For the problem of determining generic parameter identifiability, we present an algorithm building upon the half‐trek criterion. Underlying our new algorithm is the idea that ancestral subsets of vertices in the graph can be used to extend the applicability of a decomposition technique. 相似文献

2.

Estimating class-specific parametric models using finite mixtures: an application to a hedonic model of wine prices

Steven B. Caudill 《Journal of applied statistics》2016,43(7):1253-1261

Hedonic price models are commonly used in the study of markets for various goods, most notably those for wine, art, and jewelry. These models were developed to estimate implicit prices of product attributes within a given product class, where in the case of some goods, such as wine, substantial product differentiation exists. To address this issue, recent research on wine prices employs local polynomial regression clustering (LPRC) for estimating regression models under class uncertainty. This study demonstrates that a superior empirical approach – estimation of a mixture model – is applicable to a hedonic model of wine prices, provided only that the dependent variable in the model is rescaled. The present study also catalogues several of the advantages over LPRC modeling of estimating mixture models. 相似文献

3.

On analysis of binary response data in longitudinal factorial studies

Chunpeng Fan 《Journal of Statistical Computation and Simulation》2017,87(1):100-122

Binary data are commonly used as responses to assess the effects of independent variables in longitudinal factorial studies. Such effects can be assessed in terms of the rate difference (RD), the odds ratio (OR), or the rate ratio (RR). Traditionally, the logistic regression seems always a recommended method with statistical comparisons made in terms of the OR. Statistical inference in terms of the RD and RR can then be derived using the delta method. However, this approach is hard to realize when repeated measures occur. To obtain statistical inference in longitudinal factorial studies, the current article shows that the mixed-effects model for repeated measures, the logistic regression for repeated measures, the log-transformed regression for repeated measures, and the rank-based methods are all valid methods that lead to inference in terms of the RD, OR, and RR, respectively. Asymptotic linear relationships between the estimators of the regression coefficients of these models are derived when the weight (working covariance) matrix is an identity matrix. Conditions for the Wald-type tests to be asymptotically equivalent in these models are provided and powers were compared using simulation studies. A phase III clinical trial is used to illustrate the investigated methods with corresponding SAS® code supplied. 相似文献

4.

Bayeslan Credit Ratings

Paola Cerchiello Paolo Guidici 《统计学通讯:理论与方法》2014,43(4):867-878

In this contribution we aim at improving ordinal variable selection in the context of causal models for credit risk estimation. In this regard, we propose an approach that provides a formal inferential tool to compare the explanatory power of each covariate and, therefore, to select an effective model for classification purposes. Our proposed model is Bayesian nonparametric thus keeps the amount of model specification to a minimum. We consider the case in which information from the covariates is at the ordinal level. A noticeable instance of this regards the situation in which ordinal variables result from rankings of companies that are to be evaluated according to different macro and micro economic aspects, leading to ordinal covariates that correspond to various ratings, that entail different magnitudes of the probability of default. For each given covariate, we suggest to partition the statistical units in as many groups as the number of observed levels of the covariate. We then assume individual defaults to be homogeneous within each group and heterogeneous across groups. Our aim is to compare and, therefore select, the partition structures resulting from the consideration of different explanatory covariates. The metric we choose for variable comparison is the calculation of the posterior probability of each partition. The application of our proposal to a European credit risk database shows that it performs well, leading to a coherent and clear method for variable averaging of the estimated default probabilities. 相似文献

5.

A spatial analysis on Italian unemployment differences

Maria Francesca Cracolici Miranda Cuffaro Peter Nijkamp 《Statistical Methods and Applications》2009,18(2):275-291

Using spatial econometric models, this paper focuses attention on the spatial structure of provincial unemployment disparities of Italian provinces for the year 2003. On the basis of findings from the economic literature and of the available socio-economic data, various model specifications including supply- and demand-side variables are tested. Further we use ESDA analysis as equivalent to integration analysis on time series; therefore it is applied on each variable, dependent and independent, involved in the statistical model. The suggestions of ESDA lead us to the most adequate statistical model, which estimates indicate that there is a significant degree of neighbouring effect (i.e. positive spatial correlation) among labour markets at the provincial level in Italy; this effect is present notwithstanding we controlled for local characteristics. The unemployment shows a polarized spatial pattern that is strongly connected to labour demand and to a much lesser extent to the share of young population and economic structural composition. 相似文献

6.

Variable selection strategies in survival models with multiple imputations

Vonta F Karagrigoriou A 《Lifetime data analysis》2007,13(3):295-315

In this paper, the variable selection strategies (criteria) are thoroughly discussed and their use in various survival models is investigated. The asymptotic efficiency property, in the sense of Shibata Ann Stat 8: 147-164, 1980, of a class of variable selection strategies which includes the AIC and all criteria equivalent to it, is established for a general class of survival models, such as parametric frailty or transformation models and accelerated failure time models, under minimum conditions. Furthermore, a multiple imputations method is proposed which is found to successfully handle censored observations and constitutes a competitor to existing methods in the literature. A number of real and simulated data are used for illustrative purposes. 相似文献

7.

On-line expectation–maximization algorithm for latent data models

Olivier Cappé Eric Moulines 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2009,71(3):593-613

Summary. We propose a generic on-line (also sometimes called adaptive or recursive) version of the expectation–maximization (EM) algorithm applicable to latent variable models of independent observations. Compared with the algorithm of Titterington, this approach is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete-data distribution. The resulting algorithm is usually simpler and is shown to achieve convergence to the stationary points of the Kullback–Leibler divergence between the marginal distribution of the observation and the model distribution at the optimal rate, i.e. that of the maximum likelihood estimator. In addition, the approach proposed is also suitable for conditional (or regression) models, as illustrated in the case of the mixture of linear regressions model. 相似文献

8.

Homogeneity testing under finite location-scale mixtures

Jiahua Chen Pengfei Li Guanfu Liu 《Revue canadienne de statistique》2020,48(4):670-684

The testing problem for the order of finite mixture models has a long history and remains an active research topic. Since Ghosh & Sen (1985) revealed the hard-to-manage asymptotic properties of the likelihood ratio test, many successful alternative approaches have been developed. The most successful attempts include the modified likelihood ratio test and the EM-test, which lead to neat solutions for finite mixtures of univariate normal distributions, finite mixtures of single-parameter distributions, and several mixture-like models. The problem remains challenging, and there is still no generic solution for location-scale mixtures. In this article, we provide an EM-test solution for homogeneity for finite mixtures of location-scale family distributions. This EM-test has nonstandard limiting distributions, but we are able to find the critical values numerically. We use computer experiments to obtain appropriate values for the tuning parameters. A simulation study shows that the fine-tuned EM-test has close to nominal type I errors and very good power properties. Two application examples are included to demonstrate the performance of the EM-test. 相似文献

9.

Maximum likelihood estimates in the multivariate normal with patterned mean and covariance via the em algorithm

Dal ton F Andrade Ronald W Helms 《统计学通讯:理论与方法》2013,42(18):2239-2251

The maximum likelihood equations for a multivariate normal model with structured mean and structured covariance matrix may not have an explicit solution. In some cases the model's error term may be decomposed as the sum of two independent error terms, each having a patterned covariance matrix, such that if one of the unobservable error terms is artificially treated as "missing data", the EM algorithm can be used to compute the maximum likelihood estimates for the original problem. Some decompositions produce likelihood equations which do not have an explicit solution at each iteration of the EM algorithm, but within-iteration explicit solutions are shown for two general classes of models including covariance component models used for analysis of longitudinal data. 相似文献

10.

Imputation of Household Survey Data Using Linear Mixed Models

下载免费PDF全文

Luise Patricia Lago Robert Graham Clark 《Australian & New Zealand Journal of Statistics》2015,57(2):169-187

Mixed models are regularly used in the analysis of clustered data, but are only recently being used for imputation of missing data. In household surveys where multiple people are selected from each household, imputation of missing values should preserve the structure pertaining to people within households and should not artificially change the apparent intracluster correlation (ICC). This paper focuses on the use of multilevel models for imputation of missing data in household surveys. In particular, the performance of a best linear unbiased predictor for both stochastic and deterministic imputation using a linear mixed model is compared to imputation based on a single level linear model, both with and without information about household respondents. In this paper an evaluation is carried out in the context of imputing hourly wage rate in the Household, Income and Labour Dynamics of Australia Survey. Nonresponse is generated under various assumptions about the missingness mechanism for persons and households, and with low, moderate and high intra‐household correlation to assess the benefits of the multilevel imputation model under different conditions. The mixed model and single level model with information about the household respondent lead to clear improvements when the ICC is moderate or high, and when there is informative missingness. 相似文献

11.

Location-scale mixed models and goodness-of-fit assessment applied to insect ecology

R. A. Moral J. Hinde E. M. M. Ortega C. G. B. Demtrio W. A. C. Godoy 《Journal of applied statistics》2020,47(10):1776

Survival models have been extensively used to analyse time-until-event data. There is a range of extended models that incorporate different aspects, such as overdispersion/frailty, mixtures, and flexible response functions through semi-parametric models. In this work, we show how a useful tool to assess goodness-of-fit, the half-normal plot of residuals with a simulated envelope, implemented in the hnp package in R, can be used on a location-scale modelling context. We fitted a range of survival models to time-until-event data, where the event was an insect predator attacking a larva in a biological control experiment. We started with the Weibull model and then fitted the exponentiated-Weibull location-scale model with regressors both for the location and scale parameters. We performed variable selection for each model and, by producing half-normal plots with simulated envelopes for the deviance residuals of the model fits, we found that the exponentiated-Weibull fitted the data better. We then included a random effect in the exponentiated-Weibull model to accommodate correlated observations. Finally, we discuss possible implications of the results found in the case study. 相似文献

12.

Statistical models for e-learning data

Silvia Figini Paolo Giudici 《Statistical Methods and Applications》2009,18(2):293-304

In this paper we analyse a real e-learning dataset derived from the e-learning platform of the University of Pavia. The dataset concerns an online learning environment with in-depth teaching materials. The main focus of this paper is to supply a measure of the relative importance of the exercises (test) at the end of each training unit; to build predictive models of student’s performance and finally to personalize the e-learning platform. The methodology employed is based on nonparametric statistical methods for kernel density estimation and generalized linear models and generalized additive models for predictive purposes. 相似文献

13.

The Identifiability of Dependent Competing Risks Models Induced by Bivariate Frailty Models

下载免费PDF全文

Antai Wang Krishnendu Chandra Ruihua Xu Junfeng Sun 《Scandinavian Journal of Statistics》2015,42(2):427-437

In this paper, we propose to use a special class of bivariate frailty models to study dependent censored data. The proposed models are closely linked to Archimedean copula models. We give sufficient conditions for the identifiability of this type of competing risks models. The proposed conditions are derived based on a property shared by Archimedean copula models and satisfied by several well‐known bivariate frailty models. Compared with the models studied by Heckman and Honoré and Abbring and van den Berg, our models are more restrictive but can be identified with a discrete (even finite) covariate. Under our identifiability conditions, expectation–maximization (EM) algorithm provides us with consistent estimates of the unknown parameters. Simulation studies have shown that our estimation procedure works quite well. We fit a dependent censored leukaemia data set using the Clayton copula model and end our paper with some discussions. © 2014 Board of the Foundation of the Scandinavian Journal of Statistics 相似文献

14.

Identifying Guttman Structures in Incomplete Rasch Datasets

Lucio Bertoli-Barsotti Silvia Bacci 《统计学通讯:理论与方法》2014,43(3):470-497

In applications of IRT, it often happens that many examinees omit a substantial proportion of item responses. This can occur for various reasons, though it may well be due to no more than the simple fact of design incompleteness. In such circumstances, literature not infrequently refers to various types of estimation problem, often in terms of generic “convergence problems” in the software used to estimate model parameters. With reference to the Partial Credit Model and the instance of data missing at random, this article demonstrates that as their number increases, so does that of anomalous datasets, intended as those not corresponding to a finite estimate of (the vector parameter that identifies) the model. Moreover, the necessary and sufficient conditions for the existence and uniqueness of the maximum likelihood estimation of the Partial Credit Model (and hence, in particular, the Rasch model) in the case of incomplete data are given – with reference to the model in its more general form, the number of response categories varying according to item. A taxonomy of possible cases of anomaly is then presented, together with an algorithm useful in diagnostics. 相似文献

15.

Bayesian calibration of computer models 总被引：5，自引：0，他引：5

Marc C. Kennedy & Anthony O'Hagan 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2001,63(3):425-464

We consider prediction and uncertainty analysis for systems which are approximated using complex mathematical models. Such models, implemented as computer codes, are often generic in the sense that by a suitable choice of some of the model's input parameters the code can be used to predict the behaviour of the system in a variety of specific applications. However, in any specific application the values of necessary parameters may be unknown. In this case, physical observations of the system in the specific context are used to learn about the unknown parameters. The process of fitting the model to the observed data by adjusting the parameters is known as calibration. Calibration is typically effected by ad hoc fitting, and after calibration the model is used, with the fitted input values, to predict the future behaviour of the system. We present a Bayesian calibration technique which improves on this traditional approach in two respects. First, the predictions allow for all sources of uncertainty, including the remaining uncertainty over the fitted parameters. Second, they attempt to correct for any inadequacy of the model which is revealed by a discrepancy between the observed data and the model predictions from even the best-fitting parameter values. The method is illustrated by using data from a nuclear radiation release at Tomsk, and from a more complex simulated nuclear accident exercise. 相似文献

16.

Modelling growth and decline in lung function in Duchenne's muscular dystrophy with an augmented linear mixed effects model

Marc A. Scott Robert G. Norman Kenneth I. Berger 《Journal of the Royal Statistical Society. Series C, Applied statistics》2004,53(3):507-521

Summary. Longitudinal modelling of lung function in Duchenne's muscular dystrophy is complicated by a mixture of both growth and decline in lung function within each subject, an unknown point of separation between these phases and significant heterogeneity between individual trajectories. Linear mixed effects models can be used, assuming a single changepoint for all cases; however, this assumption may be incorrect. The paper describes an extension of linear mixed effects modelling in which random changepoints are integrated into the model as parameters and estimated by using a stochastic EM algorithm. We find that use of this 'mixture modelling' approach improves the fit significantly. 相似文献

17.

Multivariate Poisson hidden Markov models with a case study of modelling seismicity

下载免费PDF全文

K. Orfanogiannaki D. Karlis 《Australian & New Zealand Journal of Statistics》2018,60(3):301-322

While the literature on multivariate models for continuous data flourishes, there is a lack of models for multivariate counts. We aim to contribute to this framework by extending the well known class of univariate hidden Markov models to the multidimensional case, by introducing multivariate Poisson hidden Markov models. Each state of the extended model is associated with a different multivariate discrete distribution. We consider different distributions with Poisson marginals, starting from the multivariate Poisson distribution and then extending to copula based distributions to allow flexible dependence structures. An EM type algorithm is developed for maximum likelihood estimation. A real data application is presented to illustrate the usefulness of the proposed models. In particular, we apply the models to the occurrence of strong earthquakes (surface wave magnitude ≥5), in three seismogenic subregions in the broad region of the North Aegean Sea for the time period from 1 January 1981 to 31 December 2008. Earthquakes occurring in one subregion may trigger events in adjacent ones and hence the observed time series of events are cross‐correlated. It is evident from the results that the three subregions interact with each other at times differing by up to a few months. This migration of seismic activity is captured by the model as a transition to a state of higher seismicity. 相似文献

18.

Bayesian Poisson models for the graphical combination of dependent expert information

Jim Q. Smith & Álvaro E. Faria Jr 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2000,62(3):525-544

A supra-Bayesian (SB) wants to combine the information from a group of k experts to produce her distribution of a probability θ. Each expert gives his counts of what he thinks are the numbers of successes and failures in a sequence of independent trials, each with probability θ of success. These counts, used as a surrogate for each expert's own individual probability assessment (together with his associated level of confidence in his estimate), allow the SB to build various plausible conjugate models. Such models reflect her beliefs about the reliability of different experts and take account of different possible patterns of overlap of information between them. Corresponding combination rules are then obtained and compared with other more established rules and their properties examined. 相似文献

19.

A comparison of regression models for ordinal data in an analysis of transplanted-kidney function

C. Greenwood V. Farewell 《Revue canadienne de statistique》1988,16(4):325-335

Three regression models for ordinal data, those of Fienberg, McCullagh, and Anderson, are applied to an analysis of kidney function among transplant recipients. The conclusions arising from each model are presented and contrasted. 相似文献

20.

The uncertainty of a selected graphical model

Iris Pigeot Fabian Sobotka Svend Kreiner Ronja Foraita 《Journal of applied statistics》2015,42(11):2335-2352

相似文献