Similar Documents
20 similar documents found (search time: 31 ms)
1.
Many probability distributions can be represented as compound distributions: treating some parameter vector as random, the compound distribution is the expected distribution of the variable of interest given the random parameters. Our idea is to define a partition of the domain of the random parameters, so that the expected density of the variable of interest can be represented as a finite mixture of conditional densities. We then model the mixing probabilities of these conditional densities using information on population categories, thus modifying the original overall model and obtaining specific models for sub-populations that stem from it. The distribution of a sub-population of interest is then completely specified by its mixing probabilities; all characteristics of interest can be derived from this distribution, and sub-populations are easily compared through their mixing probabilities. A real example based on EU-SILC data is given, and the methodology is then investigated through simulation.
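To make the finite-mixture representation concrete, here is a minimal sketch (the Poisson setup, partition cells, and weights are our illustrative choices, not the paper's model): two sub-populations share the same conditional densities over a partition of the rate parameter's domain but differ in their mixing probabilities, from which characteristics such as the mean follow directly.

```python
import math

def poisson_pmf(y, lam):
    return math.exp(-lam) * lam**y / math.factorial(y)

def mixture_pmf(y, weights, rates):
    """Expected pmf of Y as a finite mixture of conditional Poisson pmfs,
    one per cell of the parameter partition."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * poisson_pmf(y, lam) for w, lam in zip(weights, rates))

# Two sub-populations share the conditional densities but differ in
# their mixing probabilities over the same partition.
rates = [0.5, 2.0, 6.0]              # representative rate in each cell
weights_pop_a = [0.6, 0.3, 0.1]
weights_pop_b = [0.2, 0.3, 0.5]

# Characteristics of each sub-population derived from its mixture.
mean_a = sum(y * mixture_pmf(y, weights_pop_a, rates) for y in range(60))
mean_b = sum(y * mixture_pmf(y, weights_pop_b, rates) for y in range(60))
```

Comparing sub-populations then reduces to comparing the weight vectors: here population B puts more mass on the high-rate cell, so its expected count is larger.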

2.
In this paper we study the cure rate survival model involving a competing-risks structure with missing categorical covariates. A parametric distribution that can be written as a sequence of one-dimensional conditional distributions is specified for the missing covariates. We consider the missing-at-random situation, so that the missingness may depend only on the observed covariates. Parameter estimates are obtained using the EM algorithm via the method of weights. Extensive simulation studies are conducted to compare the efficiency of estimates with and without missing data. As expected, the estimation approach that takes the missing covariates into account is much more efficient, in terms of mean squared error, than the complete-case analysis. Effects of increasing the cured fraction and the proportion of censored observations are also reported. We demonstrate the proposed methodology with two real data sets, one involving the length of time to obtain a BS degree in Statistics and the other the time to breast cancer recurrence.

3.
This paper studies non-parametric maximum-likelihood estimation in a semiparametric mixture model for competing-risks data, in which proportional hazards models are specified for the failure times conditional on cause, and a multinomial model is specified for the marginal distribution of cause conditional on covariates. We provide a verifiable identifiability condition and, based on it, establish an asymptotic profile likelihood theory for this model. We also provide efficient algorithms for computing the non-parametric maximum-likelihood estimate and its asymptotic variance. The success of the method is demonstrated in simulation studies and in an analysis of Taiwan severe acute respiratory syndrome data.

4.
Some conditional models for binary longitudinal responses are proposed, extending random effects models to include serial dependence of Markovian form and hence allowing for quite general association structures between repeated observations on the same individual. The presence of both components implies a form of dependence between them, and hence a complicated expression for the resulting likelihood. To handle this problem, we first introduce what Follmann and Wu (1995) called, in a different setting, an approximate conditional model, which represents an optimal choice in the general framework of categorical longitudinal responses. We then define two more formally correct models for the binary case, with no assumption about the distribution of the random effect. All the models discussed are estimated by means of an EM algorithm for nonparametric maximum likelihood. The algorithm, an adaptation of that used by Aitkin (1996) for the analysis of overdispersed generalized linear models, is initially derived as a form of Gaussian quadrature and then extended to a completely unknown mixing distribution. A large-scale simulation study explores the behaviour of the proposed approaches in a number of different situations.

5.
In longitudinal studies, the mixture generalized estimating equation (mix-GEE) approach was proposed to improve the efficiency of the fixed-effects estimator under misspecification of the working correlation structure. When subject-specific effects are of interest, mixed-effects models are widely used to analyze longitudinal data. However, most existing approaches assume a normal distribution for the random effects, which can affect the efficiency of the fixed-effects estimator. In this article, a conditional mixture generalized estimating equation (cmix-GEE) approach is developed, combining the advantages of mix-GEE with the conditional quadratic inference function (CQIF) method. Our new approach does not require the normality assumption for the random effects and can accommodate serial correlation between observations within the same cluster. Its regression parameter estimators are more efficient than those of CQIF even when the working correlation structure is misspecified. In addition, the true working correlation matrix can be identified from the estimates of the mixture proportions. We establish asymptotic results for the fixed-effects parameter estimators, and simulation studies evaluate the proposed method.

6.
Count data are routinely assumed to have a Poisson distribution, especially when there are no straightforward diagnostic procedures for checking this assumption. We reanalyse two data sets from crossover trials of treatments for angina pectoris , in which the outcomes are counts of anginal attacks. Standard analyses focus on treatment effects, averaged over subjects; we are also interested in the dispersion of these effects (treatment heterogeneity). We set up a log-Poisson model with random coefficients to estimate the distribution of the treatment effects and show that the analysis is very sensitive to the distributional assumption; the population variance of the treatment effects is confounded with the (variance) function that relates the conditional variance of the outcomes, given the subject's rate of attacks, to the conditional mean. Diagnostic model checks based on resampling from the fitted distribution indicate that the default choice of the Poisson distribution for the analysed data sets is poorly supported. We propose to augment the data sets with observations of the counts, made possibly outside the clinical setting, so that the conditional distribution of the counts could be established.  相似文献   
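A resampling-based model check of this flavour can be sketched as follows; the dispersion statistic, sampler, and data below are our own illustrative choices, not the authors' exact diagnostic. Replicate data sets are drawn from the fitted Poisson model and the observed dispersion is compared against its replicated distribution.

```python
import math
import random
import statistics

def dispersion(counts):
    """Variance-to-mean ratio; approximately 1 under a Poisson model."""
    m = statistics.mean(counts)
    return statistics.pvariance(counts) / m

def poisson_sample(lam, rng):
    # Knuth's multiplication method; adequate for small rates.
    l, k, p = math.exp(-lam), 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= l:
            return k - 1

def bootstrap_pvalue(counts, n_rep=999, seed=1):
    """Fraction of Poisson-replicated data sets at least as dispersed
    as the observed one (a one-sided check for overdispersion)."""
    rng = random.Random(seed)
    lam = statistics.mean(counts)        # fitted Poisson rate
    observed = dispersion(counts)
    replicated = [dispersion([poisson_sample(lam, rng) for _ in counts])
                  for _ in range(n_rep)]
    return sum(r >= observed for r in replicated) / n_rep

# Illustrative overdispersed counts: many zeros plus occasional bursts.
attacks = [0, 0, 1, 0, 9, 0, 0, 7, 0, 8, 0, 0, 10, 0, 6, 0, 0, 9, 0, 7]
p = bootstrap_pvalue(attacks)
```

A small p indicates that data sets as dispersed as the observed one essentially never arise under the fitted Poisson model, i.e. the Poisson assumption is poorly supported.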

7.
In longitudinal studies or clustered designs, observations for each subject or cluster are dependent and exhibit intra-cluster correlation. To account for this dependency, we consider Bayesian analysis for conditionally specified models, the so-called generalized linear mixed models. In nonlinear mixed models, the maximum likelihood estimator of the regression coefficients is typically a function of the distribution of the random effects, so a misspecified choice of that distribution can bias the estimator. To avoid this misspecification problem, one can resort to nonparametric approaches. We give sufficient conditions for posterior consistency of the distribution of the random effects as well as of the regression coefficients.

8.
Time series of counts occur in many different contexts, usually as counts of certain events or objects in specified time intervals. In this paper we introduce a parameter-driven state-space model to analyse integer-valued time series data. A key property of this model is that the observed counts are conditionally independent given the latent process, although they are marginally correlated. Our simulations show that the Monte Carlo expectation-maximization (MCEM) algorithm and the particle method are useful for parameter estimation in the proposed model. In an application to Malaysian dengue data, our model fits better than several other models, including that of Yang et al. (2015).
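A minimal data generator for a model of this kind might look as follows (the parameter values are ours, not the paper's): a latent stationary AR(1) process drives the log of the Poisson mean, so the counts are conditionally independent given the latent path but marginally autocorrelated.

```python
import math
import random

def poisson(lam, rng):
    # Knuth's multiplication method; adequate for moderate rates.
    l, k, p = math.exp(-lam), 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= l:
            return k - 1

def simulate(n, phi=0.7, sigma=0.3, beta0=1.0, seed=11):
    """Simulate a parameter-driven Poisson state-space series:
    latent AR(1) state x_t, observation y_t ~ Poisson(exp(beta0 + x_t))."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma)   # latent process update
        ys.append(poisson(math.exp(beta0 + x), rng))
    return ys

ys = simulate(2000)
```

Given the simulated latent path, each count is an independent Poisson draw; the marginal correlation enters only through the shared latent process, which is exactly what MCEM or particle methods must integrate out during estimation.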

9.
This is probably the first paper to discuss likelihood inference for a random set using a germ-grain model in which the individual grains are unobservable, edge effects occur and other complications appear. We consider the case where the grains form a disc process modelled by a marked point process, with the germs as centres and the marks as the associated radii of the discs. We propose to use a recent parametric class of interacting disc process models, where the minimal sufficient statistic depends on various geometric properties of the random set and the density is specified with respect to a given marked Poisson model (i.e. a Boolean model). We show how edge effects and other complications can be handled by considering a certain conditional likelihood. Our methodology is illustrated by analysing Peter Diggle's heather data set, where we discuss the results of simulation-based maximum likelihood inference and the effect of specifying different reference Poisson models.

10.
Generalized linear mixed models are widely used for describing overdispersed and correlated data, which arise frequently in studies with clustered and hierarchical designs. A more flexible class of models is developed here through the Dirichlet process mixture; an additional advantage of such mixture models is that observations can be grouped on the basis of the overdispersion present in the data. This paper proposes a partial empirical Bayes method for estimating all the model parameters by adopting a version of the EM algorithm. An augmented model that helps to implement an efficient Gibbs sampling scheme, under the non-conjugate Dirichlet process generalized linear model, generates observations from the conditional predictive distribution of the unobserved random effects and provides an estimate of the average number of mixing components in the Dirichlet process mixture. A simulation study demonstrates the consistency of the proposed method. The approach is also applied to a study of outdoor bacteria concentration in the air and to data from 14 retrospective lung-cancer studies.

11.
Survival data obtained from prevalent cohort study designs are often subject to length-biased sampling. Frequentist methods, including estimating equation approaches as well as full likelihood methods, are available for assessing covariate effects on survival from such data. Bayesian methods allow a probability interpretation of the parameters of interest and can easily provide the predictive distribution for future observations while incorporating weak prior knowledge on the baseline hazard function; however, Bayesian methods for analyzing length-biased data have been lacking. In this paper, we propose Bayesian methods for analyzing length-biased data under a proportional hazards model. The prior distribution for the cumulative hazard function is specified semiparametrically using I-splines. Bayesian conditional and full likelihood approaches are developed and applied to simulated and real data.

12.
Monte Carlo simulation methods are increasingly used to evaluate the properties of statistical estimators in a variety of settings, and their utility depends on the existence of an appropriate data-generating process. Observational studies are increasingly used to estimate the effects of exposures and interventions on outcomes. Conventional regression models allow the estimation of conditional or adjusted treatment effects, but there is growing interest in statistical methods for estimating marginal or average treatment effects. In many settings conditional treatment effects differ from marginal treatment effects, so existing data-generating processes for conditional treatment effects are of little use in assessing the performance of methods for estimating marginal treatment effects. In the current study, we describe and evaluate the performance of two data-generating processes for producing data with a specified marginal odds ratio. The first is based on computing Taylor series expansions of the probabilities of success for treated and untreated subjects; the expansions are then integrated over the distribution of the random variables to determine the marginal probabilities of success. The second is based on an iterative process of evaluating marginal odds ratios using Monte Carlo integration. The second method was found to be computationally simpler and to have superior performance.
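The second, iterative process can be sketched roughly as follows, assuming a logistic model with a normal random intercept; the function names, parameter values, and bisection search are our illustrative choices, not the authors' exact algorithm. The idea is to search for the conditional log-odds ratio whose implied marginal odds ratio, evaluated by Monte Carlo integration over the random effect, matches a target value.

```python
import math
import random

def marginal_or(beta, alpha=-1.0, sigma=1.0, n=50_000, seed=7):
    """Marginal odds ratio implied by conditional log-OR `beta` in a
    logistic model with a Normal(0, sigma^2) random intercept, computed
    by Monte Carlo integration (fixed seed makes the function deterministic)."""
    rng = random.Random(seed)
    p0 = p1 = 0.0
    for _ in range(n):
        u = rng.gauss(0.0, sigma)                     # random effect
        p0 += 1.0 / (1.0 + math.exp(-(alpha + u)))
        p1 += 1.0 / (1.0 + math.exp(-(alpha + beta + u)))
    p0, p1 = p0 / n, p1 / n
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

def solve_beta(target_or, lo=0.0, hi=5.0, tol=1e-3):
    """Bisection: the marginal OR is increasing in beta."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if marginal_or(mid) < target_or:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

beta = solve_beta(2.0)   # conditional log-OR yielding marginal OR = 2
```

Because the random intercept attenuates the marginal effect, the solved conditional log-odds ratio exceeds log(2); data generated with this beta then carry the specified marginal odds ratio.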

13.
This paper proposes a high-dimensional factor multivariate stochastic volatility (MSV) model in which the factor covariance matrices are driven by Wishart random processes. The framework allows an unrestricted specification of intertemporal sensitivities, which can capture persistence in volatilities, kurtosis in returns, and correlation breakdowns and contagion effects in volatilities. The factor structure accommodates the high-dimensional setups used in portfolio analysis and risk management, as well as the modeling of conditional means and conditional variances within the model framework. Owing to the complexity of the model, we perform inference using Markov chain Monte Carlo simulation from the posterior distribution. A simulation study demonstrates the efficiency of the estimation algorithm. We illustrate the model on a data set that includes 88 individual equity returns and the two Fama-French size and value factors, demonstrating its ability to address high-dimensional applications suitable for asset allocation, risk management, and asset pricing.


15.
Ecological studies are based on characteristics of groups of individuals and are common in various disciplines, including epidemiology. Epidemiologists studying the geographical variation of a disease must account for the positive spatial dependence between neighbouring areas, and the choice of scale for the spatial correlation requires much attention. In view of a lack of studies in this area, this study investigates the impact of differing definitions of geographical scale using a multilevel model. We propose a new approach, grid-based partitioning, and compare it with the popular census-region approach. Unexplained geographical variation is accounted for via area-specific unstructured random effects and spatially structured random effects specified as an intrinsic conditional autoregressive process. Using grid-based rather than census-region modelling of the random effects, we illustrate conditions under which improvements are observed in the estimation of the linear predictor, random effects and parameters, and in the identification of the distribution of residual risk and aggregate risk in a study region. The study finds that grid-based modelling is valuable for spatially sparse data, while the statistical-local-area-based and grid-based approaches perform equally well for spatially dense data.

16.
Longitudinal data often require a combination of flexible time trends and individual-specific random effects. Our methodological developments are motivated by a study of longitudinal body mass index profiles of children, collected with the aim of better understanding the factors driving childhood obesity. The high degree of nonlinearity and heterogeneity in these data, together with the complexity of the data set (a large number of observations, long longitudinal profiles, and clusters of observations with specific deviations from the population model), makes the application challenging and prevents the use of standard growth curve models. We propose a fully Bayesian approach based on Markov chain Monte Carlo simulation techniques that allows a semiparametric specification of both the trend function and the random effects distribution: Bayesian penalized splines are used for the former, while a Dirichlet process mixture (DPM) specification allows an adaptive amount of deviation from normality in the latter. The advantages of such DPM priors for random effects are investigated in a simulation study to improve understanding of the model specification before the childhood obesity data are analyzed.

17.
A typical model for geostatistical count data is the spatial generalised linear mixed model. We present a criterion for optimal sampling design under this framework which aims to minimise the error in the prediction of the underlying spatial random effects. The proposed criterion is derived from an asymptotic expansion of the conditional prediction variance. We argue that the mean of the spatial process needs to be taken into account in the construction of the predictive design, which we demonstrate through a simulation study comparing the proposed criterion against the widely used space-filling design. Our results are also applied to the Norway precipitation data and the rhizoctonia disease data.

18.
We extend the family of Poisson and negative binomial models to derive the joint distribution of clustered count outcomes with extra zeros. Two random effects models are formulated. The first assumes a random effects term shared between the conditional probability of perfect zeros and the conditional mean of the imperfect state. The second relaxes the shared random effects assumption by relating the conditional probability of perfect zeros and the conditional mean of the imperfect state to two different but correlated random effects variables. Under the conditional independence and missing-at-random assumptions, a direct optimization of the marginal likelihood and an EM algorithm are proposed to fit the models. The proposed models are fitted to dental caries counts of children under the age of six in the city of Detroit.
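A crude sketch of the first, shared-random-effect formulation (our notation and parameter values; the three-point grid merely stands in for a proper quadrature or the authors' estimation scheme): one random effect b enters both the logit of the perfect-zero probability and the log mean of the imperfect state, and the marginal pmf integrates the conditional zero-inflated Poisson over b.

```python
import math

def zip_pmf(y, pi0, lam):
    """Conditional zero-inflated Poisson pmf: a perfect zero with
    probability pi0, otherwise a Poisson(lam) draw."""
    base = math.exp(-lam) * lam**y / math.factorial(y)
    return pi0 + (1 - pi0) * base if y == 0 else (1 - pi0) * base

def marginal_pmf(y, gamma0, beta0, alpha, nodes):
    """Integrate the conditional ZIP pmf over a discrete grid approximating
    the random-effect distribution; `alpha` scales the shared effect b in
    both the zero-inflation and count components."""
    total = 0.0
    for b, w in nodes:
        pi0 = 1.0 / (1.0 + math.exp(-(gamma0 + alpha * b)))   # perfect-zero prob.
        lam = math.exp(beta0 + alpha * b)                     # imperfect-state mean
        total += w * zip_pmf(y, pi0, lam)
    return total

# Crude three-point grid standing in for a normal random effect.
nodes = [(-1.0, 0.25), (0.0, 0.5), (1.0, 0.25)]
p = [marginal_pmf(y, gamma0=-1.0, beta0=0.5, alpha=0.8, nodes=nodes)
     for y in range(30)]
```

The second formulation would replace the single b with two correlated effects, one per component; the rest of the construction is unchanged.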

19.
In this paper, we employ the parametric bootstrap to approximate the finite-sample distribution of the goodness-of-fit test statistic of Fan (1994). We show that the proposed bootstrap procedure works in the sense that the bootstrap distribution, conditional on the random sample, tends in probability to the asymptotic distribution of the test statistic. A simulation study demonstrates that the bootstrap approximation works extremely well in small samples of only 25 observations and is very robust to the value of the smoothing parameter in the kernel density estimation.
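The general recipe, illustrated here with a Kolmogorov-Smirnov-type statistic rather than Fan's kernel-based statistic (so the statistic, sample size, and seeds are our illustrative choices), is to draw bootstrap samples from the fitted model, refit on each, and read critical values off the resulting bootstrap distribution of the statistic.

```python
import math
import random
import statistics

def ks_stat(sample, mu, sd):
    """KS distance between the empirical cdf and a fitted normal cdf."""
    xs = sorted(sample)
    n = len(xs)
    cdf = lambda x: 0.5 * (1 + math.erf((x - mu) / (sd * math.sqrt(2))))
    return max(max(abs((i + 1) / n - cdf(x)), abs(i / n - cdf(x)))
               for i, x in enumerate(xs))

def bootstrap_critical_value(sample, level=0.95, n_boot=500, seed=3):
    """Parametric bootstrap of the finite-sample null distribution."""
    rng = random.Random(seed)
    mu, sd = statistics.mean(sample), statistics.stdev(sample)
    stats = []
    for _ in range(n_boot):
        boot = [rng.gauss(mu, sd) for _ in sample]
        # Re-estimate on each bootstrap sample, as the parametric
        # bootstrap requires, so estimation error enters the null law.
        stats.append(ks_stat(boot, statistics.mean(boot),
                             statistics.stdev(boot)))
    stats.sort()
    return stats[int(level * n_boot)]

rng = random.Random(0)
sample = [rng.gauss(0, 1) for _ in range(25)]    # small sample, as in the study
crit = bootstrap_critical_value(sample)
```

Refitting the parameters inside the bootstrap loop is the step that makes the finite-sample approximation honest: the null distribution reflects the estimation error present at n = 25, which the asymptotic distribution ignores.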

20.
Overdispersion due to a large proportion of zero observations is common in many fields of research; we consider such scenarios in count panel (longitudinal) data. A well-known and widely implemented technique for handling such data is random effects modeling, which addresses the serial correlation inherent in panel data as well as overdispersion. To deal with the excess zeros, the zero-inflated Poisson distribution has become canonical: it relaxes the equal mean-variance specification of a traditional Poisson model and allows for the larger variance characteristic of overdispersed data. A natural proposal for count panel data with overdispersion due to excess zeros is therefore to combine these two methodologies, deriving a likelihood from the resulting conditional probability. In simulation studies, we find that this approach in fact poses problems of identifiability. In this article, we construct and explain in full detail why a model obtained from the marriage of these two classical, well-established techniques is unidentifiable, and we provide simulation results demonstrating the effect. A discussion of alternative methodologies to resolve the problem is provided in the conclusion.
