Similar literature
 20 similar articles found (search time: 281 ms)
1.
A general Bayesian random effects model for analyzing longitudinal mixed correlated continuous and negative binomial responses with and without missing data is presented. Conditional on some random effects, this Bayesian model uses a normal distribution for the continuous response and a negative binomial distribution for the count response. A Markov chain Monte Carlo sampling algorithm is described for estimating the posterior distribution of the parameters. The model is illustrated by a simulation study. For sensitivity analysis, the use of posterior curvature is proposed to investigate how the parameter estimates change under a perturbation from the missing-at-random to the not-missing-at-random assumption. The model is applied to medical data obtained from an observational study on women, where the correlated responses are the negative binomial response of joint damage and the continuous response of body mass index. The simultaneous effects of some covariates on both responses are also investigated.

2.
A random effects model for analyzing mixed longitudinal count and ordinal data is presented, where the count response is inflated at two points (k and l) and a (k,l)-inflated power series distribution is used as its distribution. A full likelihood-based approach is used to obtain maximum likelihood estimates of the model parameters. For data with non-ignorable missing values, models with a probit model for the missing mechanism are used. The dependence between the longitudinal sequences of responses and the inflation parameters is investigated using a random effects approach. Also, to investigate the correlation between the mixed ordinal and count responses of each individual at each time, a shared random effect is used. To assess the performance of the model, a simulation study is performed for the case in which the count response has a (k,l)-inflated binomial distribution. Performance comparisons of the count-ordinal random effects model, the zero-inflated ordinal random effects model and the (k,l)-inflated ordinal random effects model are also given. The model is applied to a real social data set from the first two waves of the National Longitudinal Study of Adolescent to Adult Health (Add Health study). In this data set, the joint responses are the number of days in a month on which each individual smoked (the count response) and the general health condition of each individual (the ordinal response). The count response shows an excess of the values 0 and 30.
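
As a rough illustration of what a two-point inflated count distribution looks like, the sketch below evaluates a (k,l)-inflated binomial probability mass function. It assumes the usual mixture form (extra point masses at k and l, with the remaining weight on an ordinary binomial), which the abstract does not spell out; the function name and the parameter names `pi_k` and `pi_l` are hypothetical, and the numeric values are made up for the example.

```python
from math import comb


def inflated_binomial_pmf(y, n, p, k, l, pi_k, pi_l):
    """P(Y = y) under an assumed (k,l)-inflated binomial mixture.

    pi_k and pi_l are the extra point masses at k and l (hypothetical names);
    the remaining weight 1 - pi_k - pi_l follows Binomial(n, p).
    """
    if pi_k < 0 or pi_l < 0 or pi_k + pi_l > 1:
        raise ValueError("inflation weights must be non-negative and sum to at most 1")
    if y < 0 or y > n:
        return 0.0
    base = comb(n, y) * p**y * (1 - p) ** (n - y)  # ordinary binomial mass
    mass = (1 - pi_k - pi_l) * base
    if y == k:
        mass += pi_k  # extra mass at the first inflation point
    if y == l:
        mass += pi_l  # extra mass at the second inflation point
    return mass


# Days smoked in a 30-day month, with inflation at 0 and 30 (illustrative values).
probs = [inflated_binomial_pmf(y, 30, 0.2, 0, 30, 0.4, 0.1) for y in range(31)]
```

Setting `pi_l = 0` and `k = 0` reduces this sketch to a zero-inflated binomial, which is one of the comparison models the abstract mentions.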

3.
In this paper, we study the identifiability of a latent random effect model for mixed correlated continuous and ordinal longitudinal responses. We derive conditions for the identifiability of the covariance parameters of the responses. We also propose a sensitivity analysis to investigate the perturbation from the non-identifiability of the covariance parameters, and show how some elements of the covariance structure can be used for this purpose. These elements are associated with the conditions for identifiability of the covariance parameters of the responses. The influence of small perturbations of these elements on the maximal normal curvature is also studied. The model is illustrated using medical data.

4.
In this article, we propose a multivariate random forest method for multiple responses of mixed types with missing responses. Imputation is performed for each bootstrap sample used to build the individual trees that form the forest. The individual trees are built using a weighted splitting rule that allows downweighting of imputed observations. A simulation study shows the benefits of this approach over complete case analysis when responses are missing completely at random or missing at random (MAR). In particular, the gain in prediction accuracy of the proposed method is larger in the MAR case and also increases as the proportion of missing responses increases.

5.
A longitudinal study commonly follows a set of variables, measured for each individual repeatedly over time, and usually suffers from incomplete data. A common approach for dealing with longitudinal categorical responses is to use the Generalized Linear Mixed Model (GLMM). This model induces the potential relation between response variables over time via a vector of random effects, assumed to be shared parameters in the non-ignorable missing mechanism. Most GLMMs assume that the random-effects parameters follow a normal or symmetric distribution, and this leads to serious problems in real applications. In this paper, we propose GLMMs for the analysis of incomplete multivariate longitudinal categorical responses with a non-ignorable missing mechanism based on a shared parameter framework with the less restrictive assumption of skew-normality for the random effects. These models may contain incomplete data with monotone and non-monotone missing patterns. The performance of the model is evaluated using simulation studies, and a well-known longitudinal data set extracted from a fluvoxamine trial is analyzed to determine the profile of fluvoxamine in ambulatory clinical psychiatric practice.

6.
This article addresses issues in creating public-use data files in the presence of missing ordinal responses and subsequent statistical analyses of the dataset by users. The authors propose a fully efficient fractional imputation (FI) procedure for ordinal responses with missing observations. The proposed imputation strategy retrieves the missing values through the full conditional distribution of the response given the covariates and results in a single imputed data file that can be analyzed by different data users with different scientific objectives. The two most critical aspects of statistical analyses based on the imputed data set, validity and efficiency, are examined through regression analysis involving the ordinal response and a selected set of covariates. It is shown through both theoretical development and simulation studies that, when the ordinal responses are missing at random, the proposed FI procedure leads to valid and highly efficient inferences as compared to existing methods. Variance estimation using the fractionally imputed data set is also discussed. The Canadian Journal of Statistics 48: 138–151; 2020 © 2019 Statistical Society of Canada

7.
This article examines methods to efficiently estimate the mean response in a linear model with an unknown error distribution under the assumption that the responses are missing at random. We show how the asymptotic variance is affected by the estimator of the regression parameter and by the imputation method. To estimate the regression parameter, the ordinary least squares estimator is efficient only if the error distribution happens to be normal. If the errors are not normal, then we propose a one-step improvement estimator or a maximum empirical likelihood estimator to efficiently estimate the parameter. To investigate the impact of imputation on the estimation of the mean response, we compare the listwise deletion method and the propensity score method (which do not use imputation at all) with two imputation methods. We demonstrate that listwise deletion and the propensity score method are inefficient. Partial imputation, where only the missing responses are imputed, is compared to full imputation, where both missing and non-missing responses are imputed. Our results reveal that, in general, full imputation is better than partial imputation. However, when the regression parameter is estimated very poorly, partial imputation will outperform full imputation. The efficient estimator for the mean response is the full imputation estimator that utilizes an efficient estimator of the parameter.
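
To make the distinction between listwise deletion, partial imputation and full imputation concrete, here is a minimal sketch for a linear model with a single covariate. It is not the authors' estimator: the helper names, the toy data and the use of a plain complete-case least squares fit are all assumptions made for illustration, and missing responses are encoded as `None`.

```python
def ols_fit(xs, ys):
    """Least squares intercept and slope for one covariate."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_response_estimates(xs, ys, coef):
    """Three estimates of E[Y]; ys holds None where the response is missing.

    coef is any (intercept, slope) estimate of the regression parameter.
    """
    intercept, slope = coef
    fitted = [intercept + slope * x for x in xs]
    observed = [y for y in ys if y is not None]
    # Listwise deletion: average the observed responses only.
    listwise = sum(observed) / len(observed)
    # Partial imputation: keep observed responses, impute only the missing ones.
    partial = sum(f if y is None else y for y, f in zip(ys, fitted)) / len(xs)
    # Full imputation: replace every response, observed or not, by its fitted value.
    full = sum(fitted) / len(xs)
    return {"listwise": listwise, "partial": partial, "full": full}


# Toy data: responses are missing for the larger covariate values.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 4.1, 5.9, None, None, None]
cc_coef = ols_fit(xs[:3], ys[:3])  # complete-case least squares fit
est_ols = mean_response_estimates(xs, ys, cc_coef)
est_other = mean_response_estimates(xs, ys, (0.5, 2.0))  # some other parameter estimate
```

In this toy data the listwise estimate is far lower than the imputed ones because only small covariate values have observed responses. Note also that with a complete-case least squares fit that includes an intercept, the residuals of the observed cases sum to zero, so partial and full imputation coincide; the two differ once another estimator of the regression parameter is plugged in, as `est_other` shows.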

8.
When responses are missing at random, we propose a semiparametric direct estimator for the missing probability and density-weighted average derivatives of a general nonparametric multiple regression function. An estimator for the normalized version of the weighted average derivatives is also constructed using instrumental variables regression. The proposed estimators are computationally simple and asymptotically normal, and provide a solution to the problem of estimating index coefficients of single-index models with responses missing at random. The developed theory generalizes the density-weighted average derivative estimation method of Powell et al. (1989) for the non-missing data case. Monte Carlo simulation studies are conducted to study the performance of the methods.

9.
Regression models with random effects are proposed for the joint analysis of negative binomial and ordinal longitudinal data with nonignorable missing values under a fully parametric framework. The presented model simultaneously considers a multivariate probit regression model for the missing mechanisms, which makes it possible to examine the missing data assumptions, and a multivariate mixed model for the responses. Random effects are used to take into account the correlation between longitudinal responses of the same individual. A full likelihood-based approach that yields maximum likelihood estimates of the model parameters is used. The model is applied to medical data obtained from an observational study on women, where the correlated responses are the ordinal response of osteoporosis of the spine and the negative binomial response of the number of joint damages. The sensitivity of the results to the assumptions is also investigated. The effects of some covariates on all responses are investigated simultaneously.

10.
For the case of a complete sample of univariate predictors and responses, modern nonparametric regression matches results known for parametric and semiparametric regressions. The situation changes dramatically if some values in a sample are missing. This paper develops the theory of nonparametric regression for the classical case of responses missing at random. The main conclusion is that an adaptive estimator, based on a complete-case subsample, is asymptotically sharp minimax over all possible oracle-estimators that know: an underlying sample with missing responses; the probability of observing the response given the predictor; the smoothness of an underlying regression function; the design density of the predictor; and the scale function of the regression error.

11.
We propose a joint model based on a latent variable for analyzing mixed power series and ordinal longitudinal data with and without missing values. A bivariate probit regression model is used for the missing mechanisms. Random effects are used to take into account the correlation between longitudinal responses. A full likelihood-based approach is used to yield maximum-likelihood estimates of the model parameters. Our model is applied to a medical data set, obtained from an observational study on women, where the correlated responses are the ordinal response of osteoporosis of the spine and the power series response of the number of joint damages. Sensitivity analysis is also performed to study the influence of small perturbations of the parameters of the missing mechanisms and of the overdispersion of the model on likelihood displacement.

12.
In this article, a finite mixture model of hurdle Poisson distributions with missing outcomes is proposed, and a stochastic EM algorithm is developed for obtaining the maximum likelihood estimates of the model parameters and mixing proportions. Specifically, missing data are assumed to be missing not at random (MNAR)/non-ignorable missing (NINR), and the corresponding missingness mechanism is modeled through probit regression. To improve the efficiency of the algorithm, a stochastic step is incorporated into the E-step based on data augmentation, whereas the M-step is solved by the method of conditional maximization. A variation on the Bayesian information criterion (BIC) is also proposed to compare models with different numbers of components in the presence of missing values. The model considered is a general framework that captures important characteristics of count data analysis such as zero inflation/deflation, heterogeneity and missingness, providing more insight into the features of the data and allowing dispersion to be investigated more fully and correctly. Since the stochastic step only involves simulating samples from some standard distributions, the computational burden is alleviated. Once missing responses and latent variables are imputed to replace the conditional expectation, our approach works as part of a multiple imputation procedure. A simulation study and a real example illustrate the usefulness and effectiveness of our methodology.
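
For readers unfamiliar with the building block, the sketch below evaluates a hurdle Poisson probability mass function and a finite mixture of such components. It uses the standard textbook form of the hurdle model (a point mass at zero plus a zero-truncated Poisson for positive counts); the function names, the parameter names `pi0` and `lam`, and the numeric values are hypothetical, and the missing-data mechanism and the stochastic EM algorithm from the abstract are not implemented here.

```python
from math import exp, factorial


def hurdle_poisson_pmf(y, pi0, lam):
    """P(Y = y): mass pi0 at zero, zero-truncated Poisson(lam) for y >= 1."""
    if y < 0:
        return 0.0
    if y == 0:
        return pi0
    poisson = exp(-lam) * lam**y / factorial(y)
    # Rescale the Poisson mass so that the positive counts sum to 1 - pi0.
    return (1 - pi0) * poisson / (1 - exp(-lam))


def mixture_pmf(y, weights, components):
    """Finite mixture of hurdle Poisson components given as (pi0, lam) pairs."""
    return sum(w * hurdle_poisson_pmf(y, pi0, lam)
               for w, (pi0, lam) in zip(weights, components))


# Two illustrative components: one with many zeros, one with larger counts.
weights = [0.6, 0.4]
components = [(0.5, 1.5), (0.05, 6.0)]
total = sum(mixture_pmf(y, weights, components) for y in range(60))
```

Because `pi0` is a free parameter, a component can put either more or less mass at zero than a plain Poisson with the same `lam` would (whose zero probability is `exp(-lam)`), which is how this family accommodates both zero inflation and zero deflation.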

13.
In this paper, a semi-parametric regression model is considered in which responses are assumed to be missing at random. From the empirical likelihood function defined on the basis of the rank-based estimating equation, robust confidence intervals/regions for the true regression coefficient are derived. Monte Carlo simulation experiments show that the proposed approach provides more accurate confidence intervals/regions than its normal approximation counterpart under different model error structures. The approach is also compared with the least squares approach, and its superiority is shown whenever the error distribution in the simulation study is heavy tailed or contaminated. Finally, a real data example is given to illustrate our proposed method.

14.
Missing values are common in longitudinal data studies. The missing data mechanism is termed non-ignorable (NI) if the probability of missingness depends on the non-response (missing) observations. This paper presents a model for ordinal categorical longitudinal data with NI non-monotone missing values. We assume two separate models for the response and the missingness process. The response is modeled as ordinal logistic, whereas a binary logistic model is considered for the missingness process. We employ these models in the context of so-called shared-parameter models, where the outcome and missing data models are connected by a common set of random effects. It is commonly assumed that the random effect follows the normal distribution in longitudinal data with or without missing data. This can be extremely restrictive in practice, and it may result in misleading statistical inferences. In this paper, we instead adopt a more flexible alternative, the skew-normal distribution. The methodology is illustrated through an application to Schizophrenia Collaborative Study data [19: D. Hedeker, Generalized linear mixed models, in Encyclopedia of Statistics in Behavioral Science, B. Everitt and D. Howell, eds., John Wiley, London, 2005, pp. 729–738] and a simulation.

15.
The multivariate t linear mixed model (MtLMM) has recently been proposed as a robust tool for analysing multivariate longitudinal data with atypical observations. Missing outcomes frequently occur in longitudinal research even in well controlled situations. As a powerful alternative to the traditional expectation maximization based algorithm employing single imputation, we consider a Bayesian analysis of the MtLMM to account for the uncertainties of model parameters and missing outcomes through multiple imputation. An inverse Bayes formulas sampler coupled with a Metropolis-within-Gibbs scheme is used to draw effectively from the posterior distributions of latent data and model parameters. The techniques for multiple imputation of missing values, estimation of random effects, prediction of future responses, and diagnostics of potential outliers are investigated as well. The proposed methodology is illustrated through a simulation study and an application to AIDS/HIV data.

16.
Missing data are often problematic in social network analysis, since what is missing may alter the conclusions about what we have observed: tie-variables need to be interpreted in relation to their local neighbourhood and the global structure. Some ad hoc methods for dealing with missing data in social networks have been proposed, but here we consider a model-based approach. We discuss various aspects of fitting exponential family random graph (or p-star) models (ERGMs) to networks with missing data and present a Bayesian data augmentation algorithm for the purpose of estimation. This involves drawing from the full conditional posterior distribution of the parameters, something which is made possible by recently developed algorithms. With ERGMs already having complicated interdependencies, it is particularly important to provide inference that adequately describes the uncertainty, something that the Bayesian approach provides. To the extent that we wish to explore the missing parts of the network, the posterior predictive distributions, immediately available at the termination of the algorithm, are at our disposal, which allows us to explore the distribution of what is missing unconditionally on any particular parameter values. Some important features of treating missing data and of the implementation of the algorithm are illustrated using a well-known collaboration network and a variety of missing data scenarios.

17.
This paper considers a semiparametric partially linear single-index model with responses missing at random. An imputation approach is developed to estimate the regression coefficients, the single-index coefficients and the nonparametric function, respectively. The imputation estimators for the regression coefficients and single-index coefficients are obtained by a stepwise approach. These estimators are shown to be asymptotically normal, and the estimator for the nonparametric function is proved to be asymptotically normal at any fixed point. The bandwidth problem is also considered in this paper; a delete-one cross-validation method is used to select the optimal bandwidth. A simulation study is conducted to evaluate the proposed methods.

18.
In this paper, we investigate the asymptotic properties of a non-parametric conditional mode estimator given a functional explanatory variable, when functional stationary ergodic data and missing-at-random responses are observed. First, we establish asymptotic properties for a conditional density estimator, from which we derive almost sure convergence (with rate) and asymptotic normality of a conditional mode estimator. This new estimator takes missing data into account, and a simulation study is performed to illustrate how this yields higher predictive performance than that obtained with standard estimators.

19.
In this paper, a nonlinear model with response variables missing at random is studied. In order to improve the coverage accuracy for model parameters, the empirical likelihood (EL) ratio method is considered. On the complete data, the EL statistic for the parameters and its approximation have a χ² asymptotic distribution. When the responses are reconstituted using a semi-parametric method, the empirical log-likelihood on the response variables associated with the imputed data is also asymptotically χ². The Wilks theorem for EL on the parameters, based on reconstituted data, is also satisfied. These results can be used to construct confidence regions for the model parameters and the response variables. It is shown via Monte Carlo simulations that the EL methods outperform the normal approximation-based method in terms of coverage probability for the unknown parameter, including on the reconstituted data. The advantages of the proposed method are exemplified on real data.

20.
Some conditional models to deal with binary longitudinal responses are proposed, extending random effects models to include serial dependence of Markovian form, and hence allowing for quite general association structures between repeated observations recorded on the same individual. The presence of both these components implies a form of dependence between them, and so a complicated expression for the resulting likelihood. To handle this problem, we introduce, as a first instance, what Follmann and Wu (1995) called, in a different setting, an approximate conditional model, which represents an optimal choice for the general framework of categorical longitudinal responses. Then we define two more formally correct models for the binary case, with no assumption about the distribution of the random effect. All of the discussed models are estimated by means of an EM algorithm for nonparametric maximum likelihood. The algorithm, an adaptation of that used by Aitkin (1996) for the analysis of overdispersed generalized linear models, is initially derived as a form of Gaussian quadrature, and then extended to a completely unknown mixing distribution. A large-scale simulation study is described to explore the behaviour of the proposed approaches in a number of different situations.
