Similar Documents
20 similar documents found (search time: 31 ms)
1.
The class of joint mean‐covariance models uses the modified Cholesky decomposition of the within-subject covariance matrix in order to arrive at an unconstrained, statistically meaningful reparameterisation. The new parameterisation of the covariance matrix has two sets of parameters that separately describe the variances and correlations. Thus, together with the mean or regression parameters, these models have three distinct sets of parameters. In order to alleviate the problem of inefficient estimation and downward bias in the variance estimates, inherent in the maximum likelihood estimation procedure, the usual REML estimation procedure adjusts for the degrees of freedom lost due to the estimation of the mean parameters. Because of the parameterisation of the joint mean‐covariance models, it is possible to adapt the usual REML procedure in order to estimate the variance (correlation) parameters by taking into account the degrees of freedom lost by the estimation of both the mean and correlation (variance) parameters. To this end, here we propose adjustments to the estimation procedures based on the modified and adjusted profile likelihoods. The methods are illustrated by an application to a real data set and simulation studies. The Canadian Journal of Statistics 40: 225–242; 2012 © 2012 Statistical Society of Canada
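By way of illustration (a sketch under our own naming, not code from the paper): the modified Cholesky decomposition T Σ T′ = D can be computed by regressing each measurement on its predecessors, the negated regression coefficients filling the sub-diagonal rows of the unit lower-triangular T and the innovation variances filling the diagonal D.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) with T @ sigma @ T.T == D: T is unit lower-triangular
    (rows hold negated autoregressive coefficients), D is diagonal
    (innovation variances)."""
    n = sigma.shape[0]
    T = np.eye(n)
    D = np.zeros(n)
    D[0] = sigma[0, 0]
    for t in range(1, n):
        S11 = sigma[:t, :t]                 # covariance of the "past"
        s12 = sigma[:t, t]                  # covariance of past with present
        phi = np.linalg.solve(S11, s12)     # generalized autoregressive parameters
        T[t, :t] = -phi
        D[t] = sigma[t, t] - s12 @ phi      # innovation variance
    return T, np.diag(D)

# sanity check on a random positive-definite matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
sigma = A @ A.T + 4 * np.eye(4)
T, D = modified_cholesky(sigma)
assert np.allclose(T @ sigma @ T.T, D, atol=1e-10)
```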

2.
Maximization of an auto-Gaussian log-likelihood function when spatial autocorrelation is present requires numerical evaluation of an n × n matrix determinant. Griffith and Sone proposed a solution to this problem. This article simplifies and then evaluates an alternative approximation that can also be used with massively large georeferenced data sets based upon a regular square tessellation; this makes it particularly relevant to remotely sensed image analysis. Estimation results reported for five data sets found in the literature confirm the utility of this newer approximation.
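For a regular P × Q square lattice under binary rook adjacency, the determinant term has a closed-form spectral shortcut, since the lattice eigenvalues are known analytically; the sketch below shows that generic identity (our illustration, not the authors' specific approximation):

```python
import numpy as np

def log_det_grid(rho, P, Q):
    """log|I - rho*C| for the binary rook-adjacency matrix C of a P x Q
    regular square lattice, via its known eigenvalues
    2cos(pi*p/(P+1)) + 2cos(pi*q/(Q+1)), avoiding an O(n^3) determinant.
    Requires |rho| < 1/max|eigenvalue| (at most 1/4 here)."""
    p = np.arange(1, P + 1)
    q = np.arange(1, Q + 1)
    lam = (2 * np.cos(np.pi * p / (P + 1))[:, None]
           + 2 * np.cos(np.pi * q / (Q + 1))[None, :])
    return np.sum(np.log(1.0 - rho * lam))
```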

3.
The objective of this paper is to investigate through simulation the possible presence of the incidental parameters problem when performing frequentist model discrimination with stratified data. In this context, model discrimination amounts to considering a structural parameter taking values in a finite space, with k points, k≥2. This setting seems not to have been considered in the literature on the Neyman–Scott phenomenon. Here we provide Monte Carlo evidence of the severity of the incidental parameters problem also in the model discrimination setting and propose a remedy for a special class of models. In particular, we focus on models that are scale families in each stratum. We consider traditional model selection procedures, such as the Akaike and Takeuchi information criteria, together with the best frequentist selection procedure based on maximization of the marginal likelihood induced by the maximal invariant, or of its Laplace approximation. Results of two Monte Carlo experiments indicate that when the sample size in each stratum is fixed and the number of strata increases, correct selection probabilities for traditional model selection criteria may approach zero, unlike what happens for model discrimination based on exact or approximate marginal likelihoods. Finally, two examples with real data sets are given.

4.
The authors discuss a class of likelihood functions involving weak assumptions on data generating mechanisms. These likelihoods may be appropriate when it is difficult to propose models for the data. The properties of these likelihoods are given and it is shown how they can be computed numerically by use of the Blahut-Arimoto algorithm. The authors then show how these likelihoods can give useful inferences using a data set for which no plausible physical model is apparent. The plausibility of the inferences is enhanced by the extensive robustness analysis these likelihoods permit.
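In its classic capacity-finding form, the Blahut-Arimoto algorithm alternates between the current input distribution and the induced output distribution; a generic sketch (not the authors' adaptation to likelihood computation):

```python
import numpy as np

def blahut_arimoto(P, tol=1e-10, max_iter=10_000):
    """Capacity (in nats) of a discrete memoryless channel with transition
    matrix P[x, y] = p(y|x), via the classic Blahut-Arimoto iteration."""
    q = np.full(P.shape[0], 1.0 / P.shape[0])         # input distribution
    for _ in range(max_iter):
        r = q @ P                                     # output law r(y)
        with np.errstate(divide="ignore", invalid="ignore"):
            # per-input KL divergence D(p(.|x) || r), with 0*log0 := 0
            kl = np.where(P > 0, P * np.log(P / r), 0.0).sum(axis=1)
        c = np.exp(kl)
        q_new = q * c / (q @ c)
        if np.max(np.abs(q_new - q)) < tol:
            return q_new @ kl, q_new                  # mutual information, argmax
        q = q_new
    return q @ kl, q

# binary symmetric channel, crossover 0.1: capacity = log 2 - H(0.1) ≈ 0.368 nats
capacity, q_opt = blahut_arimoto(np.array([[0.9, 0.1], [0.1, 0.9]]))
```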

5.
ABSTRACT

We propose a multiple imputation method based on principal component analysis (PCA) to deal with incomplete continuous data. To reflect the uncertainty of the parameters from one imputation to the next, we use a Bayesian treatment of the PCA model. Using a simulation study and real data sets, the method is compared to two classical approaches: multiple imputation based on joint modelling and on fully conditional modelling. Unlike the other two, the proposed method can easily be used on data sets where the number of individuals is smaller than the number of variables and where the variables are highly correlated. In addition, it provides unbiased point estimates of quantities of interest, such as an expectation, a regression coefficient or a correlation coefficient, with a smaller mean squared error. Furthermore, the widths of the confidence intervals built for the quantities of interest are often smaller whilst ensuring valid coverage.
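The deterministic core of PCA imputation alternates a low-rank SVD fit with refilling of the missing cells; the sketch below shows that single-imputation building block only (the paper's multiple-imputation procedure additionally draws the PCA parameters from a Bayesian posterior; names are ours):

```python
import numpy as np

def iterative_pca_impute(X, rank=2, n_iter=200):
    """EM-style single imputation: fit a rank-`rank` SVD to the completed
    matrix, refill missing cells with the low-rank reconstruction, repeat."""
    X = X.astype(float).copy()
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[miss] = np.take(col_means, np.where(miss)[1])   # mean-fill start
    for _ in range(n_iter):
        mu = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu
        X[miss] = low_rank[miss]                      # refill step
    return X
```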

6.
The authors consider a formulation of penalized likelihood regression that is sufficiently general to cover canonical and noncanonical links for exponential families as well as accelerated life models with censored survival data. They present an asymptotic analysis of convergence rates to justify a simple approach to the lower‐dimensional approximation of the estimates. Such an approximation allows for much faster numerical calculation, paving the way to the development of algorithms that scale well with large data sets.
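As a toy instance of such a lower-dimensional approximation, with a Gaussian working likelihood and a truncated-power spline basis standing in for the paper's general formulation (basis and penalty here are illustrative assumptions):

```python
import numpy as np

def low_rank_penalized_fit(x, y, knots, lam):
    """Penalized least squares in a low-dimensional spline-type basis:
    the finite knot set truncates the estimation space, and `lam`
    penalizes the knot coefficients (Gaussian working likelihood)."""
    B = np.maximum(x[:, None] - knots[None, :], 0.0) ** 3   # truncated cubics
    B = np.column_stack([np.ones_like(x), x, B])
    P = np.zeros((B.shape[1], B.shape[1]))
    P[2:, 2:] = np.eye(len(knots))                          # penalize knot terms only
    beta = np.linalg.solve(B.T @ B + lam * P, B.T @ y)
    return B @ beta                                         # fitted values
```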

7.
ABSTRACT

This paper proposes a hysteretic autoregressive model with GARCH specification and a skew Student's t-error distribution for financial time series. With an integrated hysteresis zone, this model allows regime switching in both the conditional mean and the conditional volatility to be delayed while the hysteresis variable lies in the hysteresis zone. We perform Bayesian estimation via an adaptive Markov chain Monte Carlo sampling scheme. The proposed Bayesian method allows simultaneous inferences for all unknown parameters, including threshold values and a delay parameter. To implement model selection, we propose a numerical approximation of the marginal likelihoods for computing posterior odds. The proposed methodology is illustrated using simulation studies and two major Asian stock basis series. We conduct a model comparison among variant hysteresis and threshold GARCH models based on the posterior odds ratios, finding strong evidence of the hysteretic effect and some asymmetric heavy-tailedness. Compared with multi-regime threshold GARCH models, this new class of models is more suitable for describing real data sets. Finally, we employ Bayesian forecasting methods in a Value-at-Risk study of the return series.
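The hysteresis zone is the distinguishing mechanism: a regime switch is delayed for as long as the hysteresis variable remains inside the zone. A minimal sketch of the implied regime path (names are ours; the full model attaches regime-specific mean and GARCH equations to this path):

```python
def hysteresis_regimes(z, lower, upper):
    """Regime path of a two-regime hysteretic model: the regime changes
    only when the hysteresis variable z exits the zone [lower, upper];
    inside the zone the previous regime is retained, unlike in a plain
    threshold model where crossing a single threshold switches at once."""
    regime, path = 0, []
    for zt in z:
        if zt > upper:
            regime = 1
        elif zt < lower:
            regime = 0
        # lower <= zt <= upper: keep the previous regime (hysteresis)
        path.append(regime)
    return path
```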

8.
Joint models for longitudinal and time-to-event data have been applied in many different fields of statistics and clinical studies. However, the main difficulty these models face is computational: the requirement for numerical integration becomes severe as the dimension of the random effects increases. In this paper, a modified two-stage approach is proposed to estimate the parameters in joint models. In the first stage, linear mixed-effects models and best linear unbiased predictors are applied to estimate the parameters in the longitudinal submodel. In the second stage, an approximation of the full joint log-likelihood is proposed using the estimated values of these parameters from the longitudinal submodel. Survival parameters are estimated by maximizing this approximation of the full joint log-likelihood. Simulation studies show that the approach performs well, especially when the dimension of the random effects increases. Finally, we implement this approach on AIDS data.
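A heavily simplified second stage, with per-subject summaries standing in for the stage-one LMM/BLUP output and an exponential hazard standing in for the paper's approximate joint log-likelihood (the working model and all names are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize

def stage_two_survival(b, T, delta):
    """Maximize an exponential-hazard survival log-likelihood in which
    each subject's hazard depends on a stage-one longitudinal summary b
    (e.g. an estimated random intercept); T are follow-up times and
    delta the event indicators."""
    b, T, delta = map(np.asarray, (b, T, delta))
    def neg_ll(gamma):
        h = np.exp(gamma[0] + gamma[1] * b)           # subject-specific hazards
        return -np.sum(delta * np.log(h) - h * T)     # exponential model log-lik
    return minimize(neg_ll, x0=np.zeros(2)).x         # estimated (gamma0, gamma1)
```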

9.
Integrating a posterior function with respect to its parameters is required to compare the goodness-of-fit among Bayesian models which may have distinct priors or likelihoods. This paper is concerned with two integration methods for very high-dimensional functions, using Markov chain Monte Carlo simulation or a Gaussian approximation. Numerical applications include analyses of spatial data in epidemiology and seismology.
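The Gaussian route is essentially a Laplace approximation: expand the log posterior to second order at its mode. A minimal scipy sketch (the BFGS inverse Hessian is only a crude curvature estimate; this is our illustration, not the paper's implementation):

```python
import numpy as np
from scipy.optimize import minimize

def laplace_log_evidence(log_post, theta0):
    """Laplace approximation to log ∫ exp(log_post(θ)) dθ:
    log_post at the mode + (d/2) log 2π + (1/2) log|H⁻¹|,
    with H the negative Hessian of log_post at the mode."""
    opt = minimize(lambda th: -log_post(th), theta0, method="BFGS")
    d = opt.x.size
    _, logdet = np.linalg.slogdet(opt.hess_inv)   # BFGS hess_inv ≈ H⁻¹
    return log_post(opt.x) + 0.5 * d * np.log(2 * np.pi) + 0.5 * logdet

# exact for a Gaussian integrand: log ∫ exp(-θ²/2) dθ = 0.5·log(2π) ≈ 0.919
lz = laplace_log_evidence(lambda th: -0.5 * float(th[0]) ** 2, np.array([1.0]))
```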

10.
Clustering chemical compounds of similar structure is important in the pharmaceutical industry. One way of describing the structure is the chemical 'fingerprint'. The fingerprint is a string of binary digits, and typical data sets consist of very large numbers of fingerprints; a suitable clustering procedure must take account of the properties of this method of coding, and must be able to handle large data sets. This paper describes the analysis of a set of fingerprint data. The analysis was based on an appropriate distance measure derived from the fingerprints, followed by metric scaling into a low-dimensional space. An approximation to metric scaling, suitable for very large data sets, was investigated. Cluster analysis using two programs, mclust and AutoClass-C, was carried out on the scaled data.
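By way of illustration (the abstract does not name the distance, so the standard Tanimoto dissimilarity for binary fingerprints is assumed), the distance-then-metric-scaling pipeline can be sketched as:

```python
import numpy as np

def tanimoto_distance(fp):
    """Pairwise 1 - Tanimoto similarity for rows of a 0/1 fingerprint
    matrix, the conventional dissimilarity for chemical fingerprints."""
    fp = fp.astype(float)
    inter = fp @ fp.T                           # shared on-bits
    counts = fp.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    return 1.0 - inter / np.maximum(union, 1.0)

def classical_mds(D, dim=3):
    """Classical (metric) scaling of a distance matrix into `dim` dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J                 # double-centred squared distances
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:dim]
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))
```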

11.
Very often, the likelihoods for circular data sets have quite complicated forms, and the functional forms of the normalising constants, which depend upon the unknown parameters, are unknown. This latter problem generally precludes rigorous, exact inference (both classical and Bayesian) for circular data. Noting the paucity of literature on Bayesian circular data analysis, and also because realistic data analysis is naturally permitted by the Bayesian paradigm, we address the above problem from a Bayesian perspective. In particular, we propose a methodology that combines importance sampling and Markov chain Monte Carlo (MCMC) in a very effective manner to sample from the posterior distribution of the parameters, given the circular data. With a simulation study and real data analysis, we demonstrate the considerable reliability and flexibility of our proposed methodology in analysing circular data.
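The obstacle is the unknown normalizing constant of the circular likelihood. A generic importance-sampling estimate of such a constant (a sketch of the underlying idea only, not the paper's combined importance-sampling/MCMC scheme) looks like:

```python
import numpy as np

def log_norm_const(log_unnorm, draw, log_proposal, n=100_000, seed=0):
    """Importance-sampling estimate of log ∫ exp(log_unnorm(θ)) dθ:
    draw(rng, n) samples from a proposal with log density log_proposal;
    a log-sum-exp average of the log weights keeps things stable."""
    rng = np.random.default_rng(seed)
    th = draw(rng, n)
    lw = log_unnorm(th) - log_proposal(th)        # log importance weights
    m = lw.max()
    return m + np.log(np.mean(np.exp(lw - m)))
```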

12.
The correspondence analysis (CA) method appears to be an effective tool for the analysis of interrelations between rows and columns in two-way contingency data. A discrete version of the method, box clustering, is developed in the paper using an approximation version of the CA model extended to the case when CA factor values are required to be Boolean. Several properties of the proposed SEFIT-BOX algorithm are proved to facilitate interpretation of its output. It is also shown that two known partitioning algorithms (applied within row or column sets only) can be considered locally optimal algorithms for fitting the model, and extensions of these algorithms to the simultaneous row and column partitioning problem are proposed.

13.
The analysis of infectious disease data presents challenges arising from the dependence in the data and the fact that only part of the transmission process is observable. These difficulties are usually overcome by making simplifying assumptions. The paper explores the use of Markov chain Monte Carlo (MCMC) methods for the analysis of infectious disease data, with the hope that they will permit analyses to be made under more realistic assumptions. Two important kinds of data sets are considered, containing temporal and non-temporal information, from outbreaks of measles and influenza. Stochastic epidemic models are used to describe the processes that generate the data. MCMC methods are then employed to perform inference in a Bayesian context for the model parameters. The MCMC methods used include standard algorithms, such as the Metropolis–Hastings algorithm and the Gibbs sampler, as well as a new method that involves likelihood approximation. It is found that standard algorithms perform well in some situations but can exhibit serious convergence difficulties in others. The inferences that we obtain are in broad agreement with estimates obtained by other methods where they are available. However, we can also provide inferences for parameters which have not been reported in previous analyses.
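The standard algorithms mentioned fit in a few lines; for instance, a random-walk Metropolis sampler for a scalar parameter (a generic sketch, not the epidemic-specific samplers of the paper):

```python
import numpy as np

def metropolis(log_post, x0, n_iter=10_000, step=0.5, seed=0):
    """Random-walk Metropolis: propose x + step*N(0,1), accept with
    probability min(1, exp(log_post(prop) - log_post(x)))."""
    rng = np.random.default_rng(seed)
    x, lp = x0, log_post(x0)
    draws = np.empty(n_iter)
    for i in range(n_iter):
        prop = x + step * rng.standard_normal()
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept
            x, lp = prop, lp_prop
        draws[i] = x
    return draws
```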

14.
The permutation test is a nonparametric test that can be used to compare measures of spread for two data sets, but is yet to be explored in the context of three-dimensional rotation data. A permutation test for such data is developed and the statistical power of this test is considered under various conditions. The test is then used in a brief application comparing movement around the calcaneocuboid joint for a human, chimpanzee, and baboon.
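For ordinary scalar data the test takes the following form; the rotation-data version replaces the dispersion score with one suited to 3-D rotations (the score below is one common choice, assumed for illustration):

```python
import numpy as np

def permutation_spread_test(x, y, n_perm=9_999, seed=0):
    """Two-sample permutation test for a difference in spread, scoring
    each group by mean absolute deviation from its median; returns the
    permutation p-value."""
    rng = np.random.default_rng(seed)
    score = lambda a: np.mean(np.abs(a - np.median(a)))
    obs = abs(score(x) - score(y))
    pooled, n = np.concatenate([x, y]), len(x)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        exceed += abs(score(perm[:n]) - score(perm[n:])) >= obs
    return (exceed + 1) / (n_perm + 1)
```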

15.
Confidence intervals for a single parameter are spanned by quantiles of a confidence distribution, and one‐sided p‐values are cumulative confidences. Confidence distributions are thus a unifying format for representing frequentist inference for a single parameter. The confidence distribution, which depends on data, is exact (unbiased) when its cumulative distribution function evaluated at the true parameter is uniformly distributed over the unit interval. A new version of the Neyman–Pearson lemma is given, showing that the confidence distribution based on the natural statistic in exponential models with continuous data is less dispersed than all other confidence distributions, regardless of how dispersion is measured. Approximations are necessary for discrete data, and also in many models with nuisance parameters. Approximate pivots might then be useful. A pivot based on a scalar statistic determines a likelihood in the parameter of interest along with a confidence distribution. This proper likelihood is free of all nuisance parameters, and is appropriate for meta‐analysis and updating of information. The reduced likelihood is generally different from the confidence density. Confidence distributions and reduced likelihoods are rooted in Fisher–Neyman statistics. This frequentist methodology has many of the Bayesian attractions, and the two approaches are briefly compared. Concepts, methods and techniques of this brand of Fisher–Neyman statistics are presented. Asymptotics and bootstrapping are used to find pivots and their distributions, and hence reduced likelihoods and confidence distributions. A simple form of inverting bootstrap distributions to approximate pivots of the abc type is proposed. Our material is illustrated in a number of examples and in an application to multiple capture data for bowhead whales.
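A textbook instance of a confidence distribution, built from the t pivot for a normal mean (illustrative only; the paper's optimality results concern exponential families):

```python
import numpy as np
from scipy import stats

def confidence_distribution(x):
    """Confidence distribution C(mu) = F_t(sqrt(n)(mu - xbar)/s) for a
    normal mean: its quantiles span the usual confidence intervals, and
    C(mu0) is the one-sided p-value for H0: mu <= mu0."""
    n, xbar, s = len(x), np.mean(x), np.std(x, ddof=1)
    return lambda mu: stats.t.cdf(np.sqrt(n) * (mu - xbar) / s, df=n - 1)
```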

16.
Abstract

The problem of testing a Rayleigh distribution against exponentiality, based on a random sample of observations, is considered. This problem arises in survival analysis when testing a linearly increasing hazard function against a constant hazard function. It is shown that for this problem the most powerful invariant test is equivalent to the “ratio of maximized likelihoods” (RML) test. However, since the two families are separate, the RML test statistic does not have the usual asymptotic chi-square distribution. Normal and saddlepoint approximations to the distribution of the RML test statistic are derived. Simulations show that the saddlepoint approximation is more accurate than the normal approximation, especially for the tail probabilities that are the main values of interest in hypothesis testing.
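The RML statistic itself is simple to compute from the two plug-in maximum likelihood estimates; a sketch (the direction of the ratio is a convention here, and the saddlepoint approximation to its null distribution is not reproduced):

```python
import numpy as np

def rml_statistic(x):
    """Log ratio of maximized likelihoods, exponential versus Rayleigh,
    for a positive sample x; large values favour exponentiality."""
    x = np.asarray(x, float)
    n = x.size
    lam = x.mean()                           # exponential MLE of the mean
    sig2 = np.sum(x ** 2) / (2 * n)          # Rayleigh MLE of sigma^2
    ll_exp = -n * np.log(lam) - x.sum() / lam
    ll_ray = np.sum(np.log(x)) - n * np.log(sig2) - np.sum(x ** 2) / (2 * sig2)
    return ll_exp - ll_ray
```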

17.
Summary.  Gaussian Markov random-field (GMRF) models are frequently used in a wide variety of applications. In most cases parts of the GMRF are observed through mutually independent data; hence the full conditional of the GMRF, a hidden GMRF (HGMRF), is of interest. We are concerned with the case where the likelihood is non-Gaussian, leading to non-Gaussian HGMRF models. Several researchers have constructed block sampling Markov chain Monte Carlo schemes based on approximations of the HGMRF by a GMRF, using a second-order expansion of the log-density at or near the mode. This is possible as the GMRF approximation can be sampled exactly with a known normalizing constant. The Markov property of the GMRF approximation yields computational efficiency. The main contribution in the paper is to go beyond the GMRF approximation and to construct a class of non-Gaussian approximations which adapt automatically to the particular HGMRF that is under study. The accuracy can be tuned by intuitive parameters to nearly any precision. These non-Gaussian approximations share the same computational complexity as those which are based on GMRFs and can be sampled exactly with computable normalizing constants. We apply our approximations in spatial disease mapping and model-based geostatistical models with different likelihoods, obtain procedures for block updating and construct Metropolized independence samplers.
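In one dimension the GMRF-approximation idea reduces to a second-order expansion of the log density at its mode, located by Newton steps; a numerical-derivative sketch (the paper's contribution is precisely the non-Gaussian refinement beyond this):

```python
def gaussian_approx_at_mode(log_dens, x0, n_newton=50, h=1e-4):
    """Mean and variance of the Gaussian approximation obtained by a
    second-order expansion of log_dens at its mode: Newton iterations
    with central-difference derivatives."""
    x = x0
    for _ in range(n_newton):
        g = (log_dens(x + h) - log_dens(x - h)) / (2 * h)               # gradient
        H = (log_dens(x + h) - 2 * log_dens(x) + log_dens(x - h)) / h**2
        x -= g / H                                                      # Newton step
    return x, -1.0 / H                       # mode and curvature-based variance
```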

18.
Summary.  The forward–backward algorithm is an exact filtering algorithm which can efficiently calculate likelihoods, and which can be used to simulate from posterior distributions. Using a simple result which relates gamma random variables with different rates, we show how the forward–backward algorithm can be used to calculate the distribution of a sum of gamma random variables, and to simulate from their joint distribution given their sum. One application is to calculating the density of the time of a specific event in a Markov process, as this time is the sum of exponentially distributed interevent times. This enables us to apply the forward–backward algorithm to a range of new problems. We demonstrate our method on three problems: calculating likelihoods and simulating allele frequencies under a non-neutral population genetic model, analysing a stochastic epidemic model and simulating speciation times in phylogenetics.
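For the event-time application, when the interevent rates are distinct the target density even has a classical closed form (the hypoexponential density), which the sketch below uses in place of the forward–backward algorithm itself:

```python
import numpy as np

def hypoexp_pdf(t, rates):
    """Density at t of a sum of independent exponentials with pairwise
    distinct rates -- e.g. the time of a specific event in a Markov
    process, written as a mixture of exponential densities."""
    rates = np.asarray(rates, float)
    dens = 0.0
    for i, li in enumerate(rates):
        others = np.delete(rates, i)
        weight = np.prod(others / (others - li))   # partial-fraction weight
        dens += weight * li * np.exp(-li * t)
    return dens
```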

19.
One important type of question in statistical inference is how to interpret data as evidence. The law of likelihood provides a satisfactory answer for interpreting data as evidence for simple hypotheses, but remains silent for composite hypotheses. This article examines how the law of likelihood can be extended to composite hypotheses within the scope of the likelihood principle. From a system of axioms, we conclude that the strength of evidence for composite hypotheses should be represented by an interval between the lower and upper profile likelihoods. This article is intended to reveal the connection between profile likelihoods and the law of likelihood under the likelihood principle rather than to argue in favor of the use of profile likelihoods in addressing general questions of statistical inference. The interpretation of the result is also discussed.
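For a normal mean with unknown variance, the profile log-likelihood maximizes out the nuisance parameter in closed form; a minimal sketch of that standard example (not the paper's general lower/upper-profile construction):

```python
import numpy as np

def profile_loglik(psi, x):
    """Profile log-likelihood of the mean psi for a normal sample x:
    the nuisance variance is replaced by its psi-specific MLE."""
    x = np.asarray(x, float)
    sig2_hat = np.mean((x - psi) ** 2)          # inner maximization, closed form
    return -0.5 * x.size * (np.log(2 * np.pi * sig2_hat) + 1.0)
```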

20.
We define the maximum relevance-weighted likelihood estimator (MREWLE) using the relevance-weighted likelihood function introduced by Hu and Zidek (1995). Furthermore, we establish the consistency of the MREWLE under a wide range of conditions. Our results generalize those of Wald (1948) to both non-identically distributed random variables and unequally weighted likelihoods (when dealing with independent data sets of varying relevance to the inferential problem of interest). Asymptotic normality is also proven. Application of these results to generalized smoothing models is discussed.
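With a normal working model and fixed relevance weights (both illustrative assumptions), the MREWLE of a common mean maximizes a weighted sum of per-data-set log-likelihoods:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def mrewle_normal_mean(samples, weights):
    """Maximum relevance-weighted likelihood estimate of a common normal
    mean: each data set's log-likelihood enters scaled by its relevance
    weight, so the estimate is a weighted combination of sample means."""
    def neg_wll(mu):
        return -sum(w * np.sum(-0.5 * (x - mu) ** 2)
                    for w, x in zip(weights, samples))
    return minimize_scalar(neg_wll).x
```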
