Similar Documents
20 similar documents found.
1.
Pairwise likelihood functions are convenient surrogates for the ordinary likelihood, useful when the latter is too difficult or even impractical to compute. One drawback of pairwise likelihood inference is that, for a multidimensional parameter of interest, the pairwise likelihood analogue of the likelihood ratio statistic does not have the standard chi-square asymptotic distribution. Invoking the theory of unbiased estimating functions, this paper proposes and discusses a computationally and theoretically attractive approach based on the derivation of empirical likelihood functions from the pairwise scores. This approach produces alternatives to the pairwise likelihood ratio statistic which allow reference to the usual asymptotic chi-square distribution, and which are useful when the elements of the Godambe information are troublesome to evaluate or in the presence of large data sets with relatively small sample sizes. Two Monte Carlo studies are performed in order to assess the finite-sample performance of the proposed empirical pairwise likelihoods.
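For orientation, a standard piece of composite likelihood theory behind the non-standard limit mentioned above (background fact, not a claim of this paper): for independent clusters $y_i = (y_{i1}, \dots, y_{im_i})$ the pairwise log-likelihood is

\[
p\ell(\theta) = \sum_{i=1}^{n} \sum_{j<k} \log f(y_{ij}, y_{ik}; \theta),
\]

and the pairwise likelihood ratio statistic $W = 2\{p\ell(\hat\theta) - p\ell(\theta_0)\}$ converges in distribution to $\sum_{r=1}^{p} \lambda_r Z_r^2$, with $Z_r$ independent standard normals and $\lambda_r$ the eigenvalues of $H^{-1}J$, where $H$ is the sensitivity matrix and $J$ the variability matrix of the pairwise score. The weights $\lambda_r$ involve exactly the Godambe information components that the empirical likelihood route avoids estimating.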

2.
Clustered longitudinal data feature cross-sectional associations within clusters, serial dependence within subjects, and associations between responses at different time points from different subjects within the same cluster. Generalized estimating equations are often used for inference with data of this sort since they do not require full specification of the response model. When data are incomplete, however, they require data to be missing completely at random unless inverse probability weights are introduced based on a model for the missing data process. The authors propose a robust approach for incomplete clustered longitudinal data using composite likelihood. Specifically, pairwise likelihood methods are described for conducting robust estimation with minimal model assumptions. The authors also show that the resulting estimates remain valid for a wide variety of missing data problems, including missing at random mechanisms, so in such cases there is no need to model the missing data process. In addition to describing the asymptotic properties of the resulting estimators, it is shown that the method performs well empirically through simulation studies for complete and incomplete data. Pairwise likelihood estimators are also compared with estimators obtained from inverse probability weighted alternating logistic regression. An application to data from the Waterloo Smoking Prevention Project is provided for illustration. The Canadian Journal of Statistics 39: 34–51; 2011 © 2010 Statistical Society of Canada

3.
For exchangeable binary data with random cluster sizes, we use a pairwise likelihood procedure to give a set of approximately optimal unbiased estimating equations for estimating the mean and variance parameters. Theoretical results are obtained establishing the large-sample properties of the solutions to the estimating equations. An application to a developmental toxicity study is given. Simulation results show that the pairwise likelihood procedure is valid and performs better than the GEE procedure for exchangeable binary data.
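A minimal numerical sketch of this kind of pairwise likelihood estimation (an illustrative toy, not the authors' estimating equations; the beta-binomial simulation and the two-parameter exchangeable model below are assumptions of the sketch):

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)

    # Exchangeable binary clusters via a beta-binomial mechanism:
    # p_i ~ Beta(a, b), Y_ij | p_i ~ Bernoulli(p_i), which gives
    # mean mu = a / (a + b) and pairwise correlation rho = 1 / (a + b + 1).
    a, b = 2.0, 3.0                       # true mu = 0.4, rho = 1/6
    sizes = rng.integers(2, 7, size=300)  # random cluster sizes
    clusters = [rng.binomial(1, rng.beta(a, b), size=m) for m in sizes]

    def neg_pairwise_loglik(par):
        mu, rho = par
        # Bivariate Bernoulli probabilities for an exchangeable pair.
        p11 = mu**2 + rho * mu * (1 - mu)
        p10 = mu * (1 - mu) * (1 - rho)   # same for (0, 1)
        p00 = (1 - mu)**2 + rho * mu * (1 - mu)
        if min(p11, p10, p00) <= 0:
            return 1e10                   # outside the valid region
        ll = 0.0
        for yv in clusters:
            n1 = yv.sum()
            n0 = yv.size - n1
            # Counts of concordant and discordant pairs in the cluster.
            ll += (n1 * (n1 - 1) / 2 * np.log(p11)
                   + n0 * (n0 - 1) / 2 * np.log(p00)
                   + n1 * n0 * np.log(p10))
        return -ll

    fit = minimize(neg_pairwise_loglik, x0=[0.5, 0.1],
                   bounds=[(0.01, 0.99), (0.0, 0.9)])
    print("estimates (mu, rho):", fit.x)  # truth: 0.4 and 1/6 = 0.167

Maximizing this pairwise log-likelihood is equivalent to solving the corresponding unbiased estimating equations, which is the flavor of procedure the abstract describes.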

4.
Composite likelihood inference has gained much popularity thanks to its computational manageability and its theoretical properties. Unfortunately, performing composite likelihood ratio tests is inconvenient because of their awkward asymptotic distribution. There are many proposals for adjusting composite likelihood ratio tests in order to recover an asymptotic chi-square distribution, but they all depend on the sensitivity and variability matrices. The same is true for the Wald-type and score-type counterparts. In realistic applications, the sensitivity and variability matrices usually need to be estimated, but there are no comparisons of the performance of composite likelihood-based statistics in such an instance. This paper compares the accuracy of inference based on these statistics under the two methods typically employed for estimating the sensitivity and variability matrices, namely an empirical method that exploits independent observations, and Monte Carlo simulation. The results in two examples involving the pairwise likelihood show that a very large number of independent observations should be available in order to obtain accurate coverages using empirical estimation, while limited simulation from the full model provides accurate results regardless of the availability of independent observations. This suggests the latter as a default choice whenever simulation from the model is possible.
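To fix notation (standard definitions, stated here for the reader's convenience): with composite log-likelihood $c\ell(\theta)$, the sensitivity matrix, variability matrix and Godambe information are

\[
H(\theta) = -\mathrm{E}\{\nabla^2_\theta\, c\ell(\theta)\}, \qquad
J(\theta) = \mathrm{Var}\{\nabla_\theta\, c\ell(\theta)\}, \qquad
G(\theta) = H(\theta)\, J(\theta)^{-1} H(\theta).
\]

Empirical estimation replaces these expectations by averages over independent replicates of the data, whereas Monte Carlo estimation replaces them by averages over datasets simulated from the model at the fitted parameter value.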

5.
This article proposes a marginalized model for repeated or otherwise hierarchical, overdispersed time-to-event outcomes, adapting the so-called combined model for time-to-event outcomes of Molenberghs, Verbeke, Efendi, Braekers, and Demétrio (in press, 'A combined gamma frailty and normal random-effects model for repeated, overdispersed time-to-event data'), who combined gamma and normal random effects. The two sets of random effects are used to accommodate simultaneously correlation between repeated measures and overdispersion. The proposed version allows for a direct marginal interpretation of all model parameters. The outcomes are allowed to be censored. Two estimation methods are proposed: full likelihood and pairwise likelihood. The proposed model is applied to data from a so-called comet assay and to data on recurrent asthma attacks in children. Both estimation methods perform very well. From simulation results, it follows that the marginalized combined model behaves similarly to the ordinary combined model in terms of point estimation and precision. It is also observed that pairwise likelihood requires more computation time but is less sensitive to starting values and more stable in terms of bias as the sample size and censoring percentage increase than full likelihood, leaving room for both methods in practice.
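A generic form of such a combined model (a sketch consistent with the description above; the exact parameterization used by the authors may differ): for subject $i$ at occasion $j$, the conditional hazard is

\[
\lambda_{ij}(t \mid \theta_{ij}, b_i) = \theta_{ij}\, \lambda_0(t)\, \exp(x_{ij}^\top \beta + z_{ij}^\top b_i),
\]

where unit-mean gamma effects $\theta_{ij} \sim \mathrm{Gamma}(\alpha, 1/\alpha)$ capture overdispersion and normal effects $b_i \sim N(0, D)$ capture association among the repeated event times; marginalization means reparameterizing so that $\beta$ retains a population-averaged interpretation after both sets of effects are integrated out.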

6.
The authors propose a general model for the joint distribution of nominal, ordinal and continuous variables. Their work is motivated by the treatment of various types of data. They show how to construct parameter estimates for their model, based on the maximization of the full likelihood. They provide algorithms to implement it, and present an alternative estimation method based on the pairwise likelihood approach. They also touch upon the issue of statistical inference. They illustrate their methodology using data from a foreign language achievement study.

7.
Effective implementation of likelihood inference in models for high-dimensional data often requires a simplified treatment of nuisance parameters, with these having to be replaced by handy estimates. In addition, the likelihood function may have been simplified by means of a partial specification of the model, as is the case when composite likelihood is used. In such circumstances, tests and confidence regions for the parameter of interest may be constructed using Wald-type and score-type statistics, defined so as to account for nuisance parameter estimation or partial specification of the likelihood. In this paper a general analytical expression for the required asymptotic covariance matrices is derived, and suggestions for obtaining Monte Carlo approximations are presented. The same matrices are involved in a rescaling adjustment of the log likelihood ratio type statistic that we propose. This adjustment restores the usual chi-squared asymptotic distribution, which is generally invalid after the simplifications considered. The practical implication is that, for a wide variety of likelihoods and nuisance parameter estimates, confidence regions for the parameters of interest are readily computable from the rescaled log likelihood ratio type statistic as well as from the Wald-type and score-type statistics. Two examples, a measurement error model with full likelihood and a spatial correlation model with pairwise likelihood, illustrate and compare the procedures. Wald-type and score-type statistics may give rise to confidence regions with unsatisfactory shape in small and moderate samples. In addition to having satisfactory shape, regions based on the rescaled log likelihood ratio type statistic show empirical coverage in reasonable agreement with nominal confidence levels.
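The rescaling idea, schematically (a standard moment-matching device; the paper's adjustment is the full version, not just this first-moment sketch): without adjustment the log likelihood ratio type statistic satisfies

\[
W \;\xrightarrow{d}\; \sum_{r=1}^{p} \lambda_r Z_r^2, \qquad Z_r \sim N(0,1) \text{ independent},
\]

with $\lambda_r$ the eigenvalues of $H^{-1}J$ built from the asymptotic covariance matrices discussed above. Rescaling $W$ by factors involving these eigenvalues (in the simplest version, $W' = pW / \sum_r \lambda_r$) restores agreement with the $\chi^2_p$ reference distribution.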

8.
Missing observations in both responses and covariates arise frequently in longitudinal studies. When missing data are missing not at random, inferences under the likelihood framework often require joint modelling of the response and covariate processes, as well as the missing data processes associated with incompleteness of responses and covariates. Specification of these four joint distributions is a nontrivial issue from the perspectives of both modelling and computation. To get around this problem, we employ pairwise likelihood formulations, which avoid the specification of third- or higher-order association structures. In this paper, we consider three specific missing data mechanisms which lead to further simplified pairwise likelihood (SPL) formulations. Under these missing data mechanisms, inference methods based on SPL formulations are developed. The resultant estimators are consistent, and enjoy improved robustness and computational convenience. The performance is evaluated empirically through simulation studies. Longitudinal data from the National Population Health Survey and the Waterloo Smoking Prevention Project are analysed to illustrate the usage of our methods.

9.
Summary. There is currently great interest in understanding the way in which recombination rates vary, over short scales, across the human genome. Aside from inherent interest, an understanding of this local variation is essential for the sensible design and analysis of many studies aimed at elucidating the genetic basis of common diseases or of human population histories. Standard pedigree-based approaches do not have the fine-scale resolution that is needed to address this issue. In contrast, samples of deoxyribonucleic acid sequences from unrelated chromosomes in the population carry relevant information, but inference from such data is extremely challenging. Although there has been much recent interest in the development of full likelihood inference methods for estimating local recombination rates from such data, they are not currently practicable for data sets of the size being generated by modern experimental techniques. We introduce and study two approximate likelihood methods. The first, a marginal likelihood, ignores some of the data. A careful choice of what to ignore results in substantial computational savings with virtually no loss of relevant information. For larger sequences, we introduce a 'composite' likelihood, which approximates the model of interest by ignoring certain long-range dependences. An informal asymptotic analysis and a simulation study suggest that inference based on the composite likelihood is practicable and performs well. We combine both methods to reanalyse data from the lipoprotein lipase gene, and the results seriously question conclusions from some earlier studies of these data.
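One common way to build such a composite likelihood (a schematic sketch; the authors' exact construction specifies which dependences are dropped): partition the sequence data into small, nearby groups of sites and multiply their likelihoods,

\[
L_C(\rho) = \prod_{k} L\big(\rho\,;\, \text{sites in group } k\big),
\]

so that each factor is computable while long-range dependence between distant groups is ignored; the groups may be single pairs of sites or short windows.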

10.
Multiple-membership logit models with random effects are models for clustered binary data where each statistical unit can belong to more than one group. The likelihood function of these models is analytically intractable. We propose two different approaches for parameter estimation: indirect inference and data cloning (DC). The former is a non-likelihood-based method which uses an auxiliary model to select reasonable estimates. We propose an auxiliary model with the same dimension of parameter space as the target model, which is particularly convenient for reaching good estimates quickly. The latter method computes maximum likelihood estimates through the posterior distribution of an adequate Bayesian model fitted to cloned data. We implement a DC algorithm specifically for multiple-membership models. A Monte Carlo experiment compares the two methods on simulated data. For further comparison, we also report Bayesian posterior means and hybrid DC estimates based on the Integrated Nested Laplace Approximation. Simulations show a negligible loss of efficiency for the indirect inference estimator, offset by a substantial computational gain. The approaches are then illustrated with two real examples on matched paired data.
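Data cloning in brief (standard background on the method, not specific to this paper): the posterior based on $K$ copies of the data,

\[
\pi_K(\theta \mid y) \;\propto\; \{L(\theta; y)\}^{K}\, \pi(\theta),
\]

concentrates around the maximum likelihood estimate as $K$ grows, and $K$ times the posterior covariance approximates the inverse Fisher information. Running MCMC on the cloned model therefore yields maximum likelihood estimates and standard errors without ever evaluating the intractable likelihood directly.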

11.
This article deals with the issue of using a suitable pseudo-likelihood, instead of an integrated likelihood, when performing Bayesian inference about a scalar parameter of interest in the presence of nuisance parameters. The proposed approach has the advantages of avoiding prior elicitation for the nuisance parameters and the computation of multidimensional integrals. Moreover, it is particularly useful when it is difficult, or even impractical, to write the full likelihood function.

We focus on Bayesian inference about a scalar regression coefficient in various regression models. First, in the context of non-normal regression-scale models, we give a theoretical result showing that there is no loss of information about the parameter of interest when using a posterior distribution derived from a pseudo-likelihood instead of the correct posterior distribution. Second, we present nontrivial applications with high-dimensional, or even infinite-dimensional, nuisance parameters in the context of nonlinear normal heteroscedastic regression models, and of models for binary outcomes and count data, accounting also for possible overdispersion. In all these situations, we show that non-Bayesian methods for eliminating nuisance parameters can be usefully incorporated into a one-parameter Bayesian analysis.
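The basic construction (standard in this literature): a pseudo-likelihood $\tilde L(\psi; y)$ depending only on the scalar interest parameter $\psi$ replaces the full likelihood, giving the pseudo-posterior

\[
\tilde\pi(\psi \mid y) \;\propto\; \pi(\psi)\, \tilde L(\psi; y),
\]

which requires neither a prior for the nuisance parameters nor any multidimensional integration; the inferential question is then whether $\tilde\pi$ loses information relative to the marginal posterior obtained from the full model.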

12.
Estimating the parameters of multivariate mixed Poisson models is an important problem in image processing applications, especially for active imaging or astronomy. The classical maximum likelihood approach cannot be used for these models since the corresponding probability masses cannot be expressed in a simple closed form. This paper studies a maximum pairwise likelihood approach to estimate the parameters of multivariate mixed Poisson models when the mixing distribution is a multivariate gamma distribution. The consistency and asymptotic normality of this estimator are derived. Simulations conducted on synthetic data illustrate these results and show that the proposed estimator outperforms classical estimators based on the method of moments. An application to change detection in low-flux images is also investigated.
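For orientation (standard definitions, not the paper's specific parameterization): in a multivariate mixed Poisson model the counts are conditionally independent Poisson given a random intensity vector $\Lambda \sim G$, so the bivariate masses needed by the pairwise likelihood are

\[
P(Y_j = y_j, Y_k = y_k) = \int \frac{e^{-\lambda_j}\lambda_j^{y_j}}{y_j!}\; \frac{e^{-\lambda_k}\lambda_k^{y_k}}{y_k!}\; dG(\lambda_j, \lambda_k),
\]

and the pairwise log-likelihood sums $\log P(Y_j = y_j, Y_k = y_k; \theta)$ over all pairs $j < k$. Only these two-dimensional integrals, rather than the full joint mass, need to be evaluated.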

13.
Summary. Multilevel or mixed effects models are commonly applied to hierarchical data. The level 2 residuals, which are otherwise known as random effects, are often of both substantive and diagnostic interest. Substantively, they are frequently used for institutional comparisons or rankings. Diagnostically, they are used to assess the model assumptions at the group level. Inference on the level 2 residuals, however, typically does not account for 'data snooping', i.e. for the harmful effects of carrying out a multitude of hypothesis tests at the same time. We provide a very general framework that encompasses both of the following inference problems: inference on the 'absolute' level 2 residuals to determine which are significantly different from 0, and inference on any prespecified number of pairwise comparisons. Thus, the user has the choice of testing the comparisons of interest. As our methods are flexible with respect to the estimation method invoked, the user may choose the desired estimation method accordingly. We demonstrate the methods with the London education authority data, the wafer data and the National Educational Longitudinal Study data.
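A minimal sketch of the multiplicity problem being addressed, using a plain Bonferroni correction rather than the authors' more general framework (the residual estimates, standard errors and independence assumption below are illustrative):

    import numpy as np
    from scipy import stats

    # Illustrative inputs: estimated level 2 residuals (e.g. empirical
    # Bayes predictions from any fitted multilevel model) and their
    # standard errors, here treated as independent for simplicity.
    u_hat = np.array([0.42, -0.10, 0.05, -0.55, 0.31])
    se = np.array([0.15, 0.12, 0.18, 0.14, 0.16])
    alpha = 0.05

    # (a) Which residuals differ significantly from 0, with a Bonferroni
    # correction over all groups tested simultaneously.
    z = stats.norm.ppf(1 - alpha / (2 * u_hat.size))
    print("significant groups:", np.where(np.abs(u_hat) > z * se)[0])

    # (b) A prespecified set of pairwise comparisons u_r - u_s, with the
    # correction matched to the number of comparisons actually tested.
    pairs = [(0, 3), (1, 4)]              # chosen before seeing the data
    z2 = stats.norm.ppf(1 - alpha / (2 * len(pairs)))
    for r, s in pairs:
        sed = np.sqrt(se[r]**2 + se[s]**2)
        verdict = "differ" if abs(u_hat[r] - u_hat[s]) > z2 * sed else "no evidence"
        print((r, s), verdict)

Unadjusted per-group tests would use the smaller cutoff stats.norm.ppf(1 - alpha / 2), which is exactly the 'data snooping' the paper warns against.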

14.
The multivariate normal distribution, thanks to its well-established theory, is commonly used to analyse correlated data of various types. However, the resultant inference is often invalid if the model assumption fails. We present a modification that adapts the multivariate normal likelihood to general correlated data. The modified likelihood is asymptotically valid for any true underlying joint distribution, so long as it has finite second moments. One can hence obtain full likelihood inference without knowing the true random mechanisms underlying the data. Simulations and real data analysis are provided to demonstrate the merit of our proposed parametric robust method.

15.
In studies that involve censored time-to-event data, stratification is frequently encountered for various reasons, such as stratified sampling or model adjustment due to violation of model assumptions. Often, the main interest is not in the clustering variables, and the cluster-related parameters are treated as nuisance. When inference is about a parameter of interest in the presence of many nuisance parameters, standard likelihood methods often perform very poorly and may lead to severe bias. This problem is particularly evident in models for clustered data with cluster-specific nuisance parameters, when the number of clusters is relatively high with respect to the within-cluster size. However, it is still unclear how the presence of censoring affects this issue. We consider clustered failure time data with independent censoring, and propose frequentist inference based on an integrated likelihood. We then apply the proposed approach to a stratified Weibull model. Simulation studies show that appropriately defined integrated likelihoods provide very accurate inferential results in all circumstances, such as for highly clustered data or heavy censoring, even in extreme settings where standard likelihood procedures lead to strongly misleading results. We show that the proposed method performs generally as well as the frailty model, but is superior when the frailty distribution is seriously misspecified. An application, which concerns treatments for a frequent disease in late-stage HIV-infected people, illustrates the proposed inferential method in Weibull regression models and compares different inferential conclusions from alternative methods.

16.
The aim of this paper is to investigate the robustness properties of likelihood inference with respect to rounding effects. Attention is focused on exponential families and on inference about a scalar parameter of interest, also in the presence of nuisance parameters. A summary value of the influence function of a given statistic, the local-shift sensitivity, is considered. It accounts for small fluctuations in the observations. The main result is that the local-shift sensitivity is bounded for the usual likelihood-based statistics, i.e. the directed likelihood, the Wald and score statistics. It is also bounded for the modified directed likelihood, which is a higher-order adjustment of the directed likelihood. The practical implication is that likelihood inference is expected to be robust with respect to rounding effects. Theoretical analysis is supplemented and confirmed by a number of Monte Carlo studies, performed to assess the coverage probabilities of confidence intervals based on likelihood procedures when data are rounded. In addition, simulations indicate that the directed likelihood is less sensitive to rounding effects than the Wald and score statistics. This provides another criterion for choosing among first-order equivalent likelihood procedures. The modified directed likelihood shows the same robustness as the directed likelihood, so that its gain in inferential accuracy does not come at the price of an increase in instability with respect to rounding.

17.
We consider the combination of path sampling and perfect simulation in the context of both likelihood inference and non-parametric Bayesian inference for pairwise interaction point processes. Several empirical results based on simulations and analysis of a data set are presented, and the merits of using perfect simulation are discussed.

18.
Biased and truncated data arise in many practical areas, and many efficient statistical methods for such data have been studied in the literature. This paper discusses likelihood-based inference for the two types of data in the presence of auxiliary information, namely a known total sample size. It is shown that this information improves inference about the underlying distribution and the parameters in which we are interested. A semiparametric likelihood ratio confidence interval technique is employed, and some simulation results are also reported.

19.
We consider the problem of detecting a 'bump' in the intensity of a Poisson process or in a density. We analyze two types of likelihood ratio-based statistics, which allow for exact finite-sample inference and asymptotically optimal detection: the maximum of the penalized square root of log likelihood ratios ('penalized scan'), evaluated over a certain sparse set of intervals, and a certain average of log likelihood ratios ('condensed average likelihood ratio'). We show that penalizing the square root of the log likelihood ratio, rather than the log likelihood ratio itself, leads to a simple penalty term that yields optimal power. The penalty derived in this way may prove useful for other problems that involve a Brownian bridge in the limit. The second key tool is an approximating set of intervals that is rich enough to allow for optimal detection, but also sparse enough that the validity of the penalization scheme can be justified simply via the union bound. This results in a considerable simplification in the theoretical treatment compared with the usual approach to this type of penalization technique, which requires establishing an exponential inequality for the variation of the test statistic. Another advantage of using the sparse approximating set is that it allows fast computation in nearly linear time. We present a simulation study that illustrates the superior performance of the penalized scan and of the condensed average likelihood ratio compared with the standard scan statistic.
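Schematically, the penalized scan has the form (a sketch of the general construction; the precise penalty and interval set are as derived in the paper):

\[
T = \max_{I \in \mathcal{I}} \Big\{ \sqrt{2 \log \mathrm{LR}(I)} \;-\; \mathrm{pen}(|I|) \Big\},
\]

where $\mathcal{I}$ is the sparse approximating set of intervals and the penalty grows as $|I|$ shrinks, preventing the many short intervals from dominating the maximum; penalties of the form $\sqrt{2\log(e/|I|)}$ appear in related scan-statistic work.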

20.
Likelihood-based inference with missing data is challenging because the observed log likelihood is often an (intractable) integral over the missing data distribution, which also depends on the unknown parameter. Approximating the integral by Monte Carlo sampling does not necessarily lead to a valid likelihood over the entire parameter space, because the Monte Carlo samples are generated from a distribution with a fixed parameter value. We consider approximating the observed log likelihood based on importance sampling. In the proposed method, the dependency of the integral on the parameter is properly reflected through fractional weights. We discuss constructing a confidence interval using the profile likelihood ratio test. A Newton–Raphson algorithm is employed to find the interval end points. Two limited simulation studies show the advantage of the Wilks inference over the Wald inference in terms of power, parameter space conformity and computational efficiency. A real data example on salamander mating shows that our method also works well with high-dimensional missing data.
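A toy sketch of the general importance-sampling idea on a bivariate normal with one coordinate sometimes missing, where the exact observed log likelihood is available as a check (this is an illustration of the principle, not the authors' algorithm or weighting scheme; the MCAR mechanism and pilot value are assumptions of the toy):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    rho, theta_true = 0.5, 2.0
    cov = np.array([[1.0, rho], [rho, 1.0]])

    # Bivariate normal pairs with common mean theta; the second coordinate
    # is missing completely at random (MCAR only to keep the toy simple).
    y = rng.multivariate_normal([theta_true, theta_true], cov, size=200)
    miss = rng.random(200) < 0.4      # treat y[miss, 1] as unobserved
    y1 = y[:, 0]

    # Importance proposal for each missing coordinate: the conditional
    # density at a fixed pilot value theta0, so one set of draws serves
    # every theta at which the likelihood is evaluated.
    theta0 = y1.mean()
    M = 500
    cmean = theta0 + rho * (y1[miss] - theta0)
    csd = np.sqrt(1 - rho**2)
    draws = cmean + csd * rng.standard_normal((M, miss.sum()))
    log_h = stats.norm.logpdf(draws, loc=cmean, scale=csd)

    def obs_loglik_is(theta):
        # Complete pairs contribute exactly.
        ll = stats.multivariate_normal.logpdf(
            y[~miss], mean=[theta, theta], cov=cov).sum()
        # Incomplete pairs: log of an importance-weighted average of the
        # complete-data density f(y1, y2; theta) = f(y1) f(y2 | y1).
        log_f = (stats.norm.logpdf(y1[miss], loc=theta)
                 + stats.norm.logpdf(draws,
                                     loc=theta + rho * (y1[miss] - theta),
                                     scale=csd))
        lw = log_f - log_h            # log importance weights
        ll += (np.logaddexp.reduce(lw, axis=0) - np.log(M)).sum()
        return ll

    def obs_loglik_exact(theta):      # available here as a check
        return (stats.multivariate_normal.logpdf(
                    y[~miss], mean=[theta, theta], cov=cov).sum()
                + stats.norm.logpdf(y1[miss], loc=theta).sum())

    for t in (1.8, 2.0, 2.2):
        print(t, round(obs_loglik_is(t), 2), round(obs_loglik_exact(t), 2))

Because the weights depend on theta through the complete-data density while the draws stay fixed, the approximation remains a valid function over the whole parameter range, which is the point the abstract makes about reflecting the parameter dependence through the weights.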
