Similar Articles
20 similar articles found.
1.
Information from multiple informants is frequently used to assess psychopathology. We consider marginal regression models with multiple informants as discrete predictors and a time-to-event outcome. We fit these models to data from the Stirling County Study; specifically, the models predict mortality from self-report of psychiatric disorders and from physician report of psychiatric disorders. Previously, Horton et al. found little relationship between self and physician reports of psychopathology, but found that the relationship of self-report of psychopathology with mortality was similar to that of physician report with mortality. Generalized estimating equations (GEE) have been used to fit marginal models with multiple informant covariates; here we develop a maximum likelihood (ML) approach and show how it relates to the GEE approach. In a simple setting with a saturated model, the ML approach can be constructed to provide estimates that match those found using GEE. We extend the ML technique to multiple informant predictors with missingness and compare the method with inverse probability weighted (IPW) GEE. Our simulation study illustrates that IPW GEE loses little efficiency compared with ML in the presence of monotone missingness. Our example data have non-monotone missingness; in this case, ML offers a modest decrease in variance compared with IPW GEE, particularly for estimating covariate effects in the marginal models. In more general settings, e.g., categorical predictors and piecewise exponential models, the likelihood parameters from the ML technique do not have the same interpretation as the GEE parameters. Thus, GEE is recommended for fitting marginal models for its flexibility, ease of interpretation, and efficiency comparable to ML in the presence of missing data.
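A minimal sketch of the GEE approach for a marginal model with two informant reports as predictors. All variable names and the simulated data are hypothetical; a binary mortality indicator stands in for the time-to-event outcome used in the paper, and a working-independence correlation structure is assumed.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 400
# one record per subject per informant; the outcome (death) is repeated
self_rep = rng.binomial(1, 0.25, n)   # self report of psychiatric disorder
phys_rep = rng.binomial(1, 0.20, n)   # physician report
death = rng.binomial(1, 0.10 + 0.15 * self_rep)

df = pd.DataFrame({
    "subject":   np.repeat(np.arange(n), 2),
    "report":    np.column_stack([self_rep, phys_rep]).ravel(),
    "physician": np.tile([0, 1], n),   # informant indicator
    "death":     np.repeat(death, 2),
})

# marginal model: the interaction lets the effect of a reported disorder
# on mortality differ between the two informants
model = sm.GEE.from_formula(
    "death ~ report + physician + report:physician",
    groups="subject", data=df,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Independence(),
)
print(model.fit().summary())
```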

2.
The zero-inflated Poisson regression model is a special case of finite mixture models that is useful for count data containing many zeros. Typically, maximum likelihood (ML) estimation is used for fitting such models. However, it is well known that the ML estimator is highly sensitive to the presence of outliers and can become unstable when the mixture components are poorly separated. In this paper, we propose an alternative robust estimation approach, robust expectation-solution (RES) estimation. We compare the RES approach with an existing robust approach, minimum Hellinger distance (MHD) estimation. Simulation results indicate that both methods improve on ML when outliers are present and/or when the mixture components are poorly separated. However, the RES approach is more efficient in all the scenarios we considered. In addition, the RES method is shown to yield consistent and asymptotically normal estimators and, in contrast to MHD, can be applied quite generally.
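As a point of reference for what RES and MHD are compared against, here is a minimal sketch of the standard ML fit of a zero-inflated Poisson regression on simulated data. The RES and MHD estimators of the paper are not part of statsmodels; the data and names below are illustrative.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
X = sm.add_constant(x)

# with probability pi the count is a structural zero, otherwise Poisson
pi = 0.3
lam = np.exp(0.5 + 0.8 * x)
y = np.where(rng.uniform(size=n) < pi, 0, rng.poisson(lam))

# intercept-only logit model for the zero-inflation part
zip_model = ZeroInflatedPoisson(y, X, exog_infl=np.ones((n, 1)), inflation="logit")
res = zip_model.fit(disp=False)
print(res.params)
```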

3.
Clustered observations such as longitudinal data are often analysed with generalized linear mixed models (GLMMs). Approximate Bayesian inference for GLMMs with normally distributed random effects can be done using integrated nested Laplace approximations (INLA), which generally yields accurate results. However, INLA is less accurate for GLMMs with binary responses. For longitudinal binary response data it is common that patients do not change their health state during the study period. In this case the grouping covariate perfectly predicts a subset of the response, which implies a monotone likelihood with diverging maximum likelihood (ML) estimates for the cluster-specific parameters; this is known as quasi-complete separation. In this paper we demonstrate, based on longitudinal data from a randomized clinical trial and two simulations, that the accuracy of INLA decreases with increasing degree of cluster-specific quasi-complete separation. Comparing parameter estimates from INLA, Markov chain Monte Carlo sampling and ML shows that INLA increasingly deviates from the other methods in such a scenario.
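The monotone-likelihood mechanism is easy to reproduce. A toy sketch with hypothetical data and plain numpy (not INLA or MCMC): when one group's responses are all ones, the binomial log-likelihood increases without bound in that group's coefficient, so the ML estimate diverges.

```python
import numpy as np

# group 1 never changes state: y = 1 whenever x = 1 -> quasi-complete separation
x = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = np.array([0, 1, 0, 1, 1, 1, 1, 1])

def loglik(beta0, beta1):
    """Binomial log-likelihood of a logistic model at (beta0, beta1)."""
    p = 1 / (1 + np.exp(-(beta0 + beta1 * x)))
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# the likelihood keeps increasing as beta1 grows: monotone likelihood
for b1 in [1, 5, 10, 20]:
    print(b1, round(loglik(0.0, b1), 4))
```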

4.
Hierarchical generalized linear models (HGLMs) have become popular in data analysis. However, their maximum likelihood (ML) and restricted maximum likelihood (REML) estimators are often difficult to compute, especially when the random effects are correlated, because obtaining the likelihood function involves high-dimensional integration. Recently, an h-likelihood method that does not involve numerical integration has been proposed. In this study, we show how an h-likelihood method can be implemented by modifying existing ML and REML procedures. A small simulation study investigates the performance of the proposed methods for HGLMs with correlated random effects.

5.
We develop criteria that generate robust designs and use them to construct designs that insure against possible misspecifications in logistic regression models. The design criteria we propose differ from the classical ones in that we do not focus on sampling error alone. Instead we use design criteria that also account for the bias engendered by model misspecification. Our robust designs optimize the average of a function of the sampling error and the bias error over a specified misspecification neighbourhood. Examples of robust designs for logistic models are presented, including a case study implementing the methodologies using beetle mortality data.

6.
A fast and accurate method of confidence interval construction for the smoothing parameter in penalised spline and partially linear models is proposed. The method is akin to a parametric percentile bootstrap in which Monte Carlo simulation is replaced by saddlepoint approximation, and can therefore be viewed as an approximate bootstrap. It is applicable in a quite general setting, requiring only that the underlying estimator be the root of an estimating equation that is a quadratic form in normal random variables. This is the case under a variety of optimality criteria, such as maximum likelihood (ML), restricted ML (REML), generalized cross validation (GCV) and Akaike's information criterion (AIC). Simulation studies reveal that under the ML and REML criteria, the method delivers near-exact performance with computational speeds an order of magnitude faster than existing exact methods, and two orders of magnitude faster than a classical bootstrap. Perhaps most importantly, the proposed method also offers a computationally feasible alternative when no exact or asymptotic methods exist, e.g. for GCV and AIC. The methodology is illustrated with an application to the well-known fossil data, where giving a range of plausible smoothing values can help answer questions about the statistical significance of apparent features in the data.

7.
In testing product reliability, there is often a critical cutoff level that determines whether a specimen is classified as failed. One consequence is that the number of degradation measurements collected varies from specimen to specimen. Information about this random sample size should be included in the model, and our study shows that it can be influential in estimating model parameters. Two-stage least squares (LS) and maximum modified likelihood (MML) estimation, both of which assume fixed sample sizes, are commonly used for estimating parameters in the repeated measurements models typically applied to degradation data. However, the LS estimate is not consistent when sample sizes are random. This article derives the likelihood for the random sample size model and suggests using maximum likelihood (ML) for parameter estimation. Our simulation studies show that ML estimates have smaller biases and variances than the LS and MML estimates. All estimation methods improve greatly if the number of specimens increases from 5 to 10. A data set from a semiconductor application illustrates our methods.

8.
In this article we study two methodologies that identify and specify canonical form VARMA models. The two methodologies are: (1) an extension of the scalar component methodology, which specifies canonical VARMA models by identifying scalar components through canonical correlations analysis; and (2) the Echelon form methodology, which specifies canonical VARMA models through the estimation of Kronecker indices. We compare the forms and the methodologies on three levels. Firstly, we present a theoretical comparison. Secondly, we present a Monte Carlo simulation study that compares the performance of the two methodologies in identifying some pre-specified data generating processes. Lastly, we compare the out-of-sample forecast performance of the two forms when models are fitted to real macroeconomic data.
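Once a canonical form has been identified, the resulting VARMA model is fitted by ML. Neither the scalar-component nor the Echelon-form identification step is available in statsmodels, but a generic Gaussian VARMA(1,1) ML fit looks like the sketch below; the data, orders and coefficients are illustrative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 300
Phi = np.array([[0.5, 0.1], [0.0, 0.4]])   # AR(1) coefficient matrix
eps = rng.normal(size=(T, 2))
y = np.zeros((T, 2))
for t in range(1, T):
    # bivariate VARMA(1,1) data generating process
    y[t] = Phi @ y[t - 1] + eps[t] + 0.3 * eps[t - 1]

mod = sm.tsa.VARMAX(y, order=(1, 1))   # ML estimation via the state-space form
res = mod.fit(disp=False)
print(res.summary())
```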

9.
In this article, we assume that the distribution of the error terms in two-way analysis of variance (ANOVA) is skew t. The skew t distribution is very flexible for modeling both symmetric and skewed datasets, since it reduces to the well-known normal, skew normal, and Student's t distributions. We obtain estimators of the model parameters using the maximum likelihood (ML) and modified maximum likelihood (MML) methodologies. We also propose new test statistics based on these estimators for testing the equality of the treatment and block means as well as the interaction effect. The efficiencies of the ML and MML estimators and the power values of the test statistics based on them are compared with the corresponding normal-theory results via a Monte Carlo simulation study. Simulation results show that the proposed methodologies are preferable. We also show that the test statistics based on the ML estimators are more powerful than those based on the MML estimators, as expected; however, the power values of the MML-based test statistics are very close to those of the corresponding ML-based statistics. A real-life example illustrates the implementation of the proposed methodologies.

10.
A version of the nonparametric bootstrap that resamples entire subjects from the original data, called the case bootstrap, has been increasingly used for estimating the uncertainty of parameters in mixed-effects models. It is usually applied to obtain more robust estimates of the parameters and more realistic confidence intervals (CIs). Alternative bootstrap methods, such as the residual bootstrap and the parametric bootstrap, which resample both random effects and residuals, have been proposed to better take into account the hierarchical structure of multi-level and longitudinal data. However, few studies have compared these approaches. In this study, we used simulation to evaluate bootstrap methods proposed for linear mixed-effects models. We also compared the results obtained by maximum likelihood (ML) and restricted maximum likelihood (REML). Our simulation studies showed good performance of the case bootstrap as well as of the bootstraps of both random effects and residuals. On the other hand, bootstrap methods that resample only the residuals, and bootstraps combining case and residual resampling, performed poorly. REML and ML provided similar bootstrap estimates of uncertainty, but ML showed slightly more bias and poorer coverage rates for variance parameters in the sparse design. We applied the proposed methods to a real dataset from a study investigating the natural evolution of Parkinson's disease and were able to confirm that the methods provide plausible estimates of uncertainty. Given that most real-life datasets tend to exhibit heterogeneity in sampling schedules, the residual bootstraps would be expected to perform better than the case bootstrap.
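A minimal sketch of the case bootstrap for a random-intercept linear mixed model, assuming simulated data and hypothetical variable names. The key step is resampling whole subjects with replacement and relabeling the duplicates so they are treated as distinct clusters when refitting.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_subj, n_obs = 30, 5
subj = np.repeat(np.arange(n_subj), n_obs)
u = rng.normal(0, 1, n_subj)                      # random intercepts
t = np.tile(np.arange(n_obs), n_subj)
y = 1.0 + 0.5 * t + u[subj] + rng.normal(0, 0.5, len(subj))
df = pd.DataFrame({"subject": subj, "time": t, "y": y})

def fit_slope(d):
    m = smf.mixedlm("y ~ time", data=d, groups="subject").fit(reml=True)
    return m.params["time"]

# case bootstrap: resample subjects with replacement, relabel, refit
B, est = 200, []
ids = df["subject"].unique()
for b in range(B):
    draw = rng.choice(ids, size=len(ids), replace=True)
    boot = pd.concat(
        [df[df["subject"] == s].assign(subject=k) for k, s in enumerate(draw)],
        ignore_index=True,
    )
    est.append(fit_slope(boot))

print("bootstrap SE of slope:", np.std(est, ddof=1))
print("95% percentile CI:", np.percentile(est, [2.5, 97.5]))
```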

11.
Maximum likelihood (ML) estimation with spatial econometric models is a long-standing problem that finds application in several areas of economic importance. The problem is particularly challenging in the presence of missing data, since there is an implied dependence between all units, irrespective of whether they are observed. Of the several approaches adopted for ML estimation in this context, that of LeSage and Pace [Models for spatially dependent missing data. J Real Estate Financ Econ. 2004;29(2):233-254] stands out as one of the most commonly used with spatial econometric models, owing to its ability to scale with the number of units. Here, we review their algorithm and consider several similar alternatives that are also suitable for large datasets. We compare the methods through an extensive empirical study and conclude that, while the approximate approaches are suitable for large sampling ratios, for small sampling ratios the only reliable algorithms are those that yield exact ML or restricted ML estimates.

12.
In much of the specialized literature, the monitoring of regression models is treated as a special case of profile monitoring. However, not every regression model appropriately represents a profile data structure. This is clearly the case for the Weibull regression model (WRM) with common shape parameter γ. Although it might be thought that existing methodologies for monitoring generalized linear profiles (especially likelihood-ratio test (LRT)-based methods) can also be applied successfully to monitoring regression models with a time-to-event response, we show in this paper that those methodologies perform acceptably only for data structures with roughly 1000 observations or more. We found that corrections, often referred to as Bartlett adjustments, need to be implemented to improve the accuracy of the asymptotic distributional properties of the LRT statistic when monitoring WRMs with relatively small or moderate dataset sizes. Simulation studies suggest that these corrections make the resulting charts perform quite acceptably when the available data contain at least 30 observations. The detection ability of the proposed schemes improves as the dataset size increases.

13.
This paper reviews five related types of analysis, namely (i) sensitivity or what-if analysis, (ii) uncertainty or risk analysis, (iii) screening, (iv) validation, and (v) optimization. The main questions are: when should which type of analysis be applied, and which statistical techniques may then be used? This paper claims that the proper sequence to follow in the evaluation of simulation models is as follows. 1) Validation, in which the availability of data on the real system determines which type of statistical technique to use. 2) Screening: in the simulation's pilot phase the really important inputs can be identified through a novel technique, called sequential bifurcation, which uses aggregation and sequential experimentation. 3) Sensitivity analysis: the really important inputs should be subjected to a more detailed analysis, which includes interactions between these inputs; relevant statistical techniques are design of experiments (DOE) and regression analysis. 4) Uncertainty analysis: the important environmental inputs may have values that are not precisely known, so the uncertainties of the model outputs that result from the uncertainties in these model inputs should be quantified; relevant techniques are the Monte Carlo method and Latin hypercube sampling (see the sketch below). 5) Optimization: the policy variables should be controlled; a relevant technique is Response Surface Methodology (RSM), which combines DOE, regression analysis, and steepest-ascent hill-climbing. The recommended sequence implies that sensitivity analysis precede uncertainty analysis. Several case studies for each phase are briefly discussed in this paper.
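For step 4, a minimal sketch of uncertainty propagation with Latin hypercube sampling, using scipy's qmc module; the simulation model and the input ranges are hypothetical stand-ins.

```python
import numpy as np
from scipy.stats import qmc

# Latin hypercube sample over two uncertain environmental inputs
sampler = qmc.LatinHypercube(d=2, seed=0)
unit = sampler.random(n=1000)                      # points in the unit square
inputs = qmc.scale(unit, l_bounds=[0.1, 5.0], u_bounds=[0.5, 15.0])

def simulation_model(a, b):
    # stand-in for the real simulation model
    return a * b + np.sqrt(b)

# propagate input uncertainty to the output distribution
outputs = simulation_model(inputs[:, 0], inputs[:, 1])
print("output mean:", outputs.mean())
print("output 5th-95th percentiles:", np.percentile(outputs, [5, 95]))
```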

14.
Seongyoung Kim, Statistics, 2015, 49(6):1189-1203
For categorical data exhibiting nonignorable non-response, it is well known that maximum likelihood (ML) estimates with a boundary solution are implausible and do not provide a perfect fit to the observed data, even for saturated models. We provide the conditions under which ML estimates for the generalized linear model (GLM) with the usual log/logit link function have a boundary solution. These conditions motivate a new GLM with appropriately defined power link functions whose ML estimates resolve the problems arising from a boundary solution and offer useful statistics for identifying the non-response mechanism. This model is applied to a real dataset and compared with Bayesian models.

15.
We consider data generating structures that can be represented as Markov-switching nonlinear autoregressive models with skew-symmetric innovations, in which switching between the states is controlled by a hidden Markov chain. We propose semi-parametric estimators for the nonlinear functions of the proposed model based on a maximum likelihood (ML) approach and study sufficient conditions for geometric ergodicity of the process. An Expectation-Maximization-type optimization for obtaining the ML estimators is also presented. A simulation study and a real-world application illustrate and evaluate the proposed methodology.
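A minimal sketch of ML fitting for a two-regime Markov-switching AR(1) with statsmodels, on simulated data. This uses Gaussian innovations and a linear AR in each regime; the skew-symmetric innovations and semi-parametric nonlinear functions of the paper are not available in statsmodels.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 600
P = np.array([[0.95, 0.05], [0.10, 0.90]])   # hidden-chain transition matrix
state, y = 0, [0.0]
for t in range(1, T):
    # the AR coefficient switches with the hidden state
    state = rng.choice(2, p=P[state])
    phi = 0.2 if state == 0 else 0.8
    y.append(phi * y[-1] + rng.normal(scale=1.0))
y = np.asarray(y)

# ML fit of a Gaussian Markov-switching AR(1)
mod = sm.tsa.MarkovAutoregression(y, k_regimes=2, order=1, switching_ar=True)
res = mod.fit()
print(res.summary())
```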

16.
Finite mixtures of multivariate skew t (MST) distributions have proven useful in modelling heterogeneous data with asymmetric and heavy-tailed behaviour. Recently, they have been exploited as an effective tool for modelling flow cytometric data. A number of algorithms for computing the maximum likelihood (ML) estimates of the model parameters of mixtures of MST distributions have been put forward in recent years. These implementations use various characterizations of the MST distribution, which are similar but not identical. While exact implementation of the expectation-maximization (EM) algorithm can be achieved for 'restricted' characterizations of the component skew t-distributions, Monte Carlo (MC) methods have been used to fit the 'unrestricted' models. In this paper, we review several recent fitting algorithms for finite mixtures of multivariate skew t-distributions, clarifying some of the connections between the various existing proposals. In particular, recent results have shown that the EM algorithm can be implemented exactly for faster computation of ML estimates for mixtures with unrestricted MST components. The gain in computational time is achieved by noting that the semi-infinite integrals in the E-step of the EM algorithm can be put in the form of moments of the truncated multivariate non-central t-distribution, as in the restricted case, which can subsequently be expressed in terms of the non-truncated central t-distribution function, for which fast algorithms are available. We present comparisons to illustrate the relative performance of the restricted and unrestricted models, and demonstrate the usefulness of the recently proposed methodology for the unrestricted MST mixture through applications to three real datasets.

17.
Tweedie regression models (TRMs) provide a flexible family of distributions for non-negative right-skewed data and can handle continuous data with probability mass at zero. Estimation and inference for TRMs based on the maximum likelihood (ML) method are complicated by the presence of an infinite sum in the probability function and non-trivial restrictions on the power parameter space. In this paper, we propose two approaches for fitting TRMs, namely quasi-likelihood (QML) and pseudo-likelihood (PML). We discuss their asymptotic properties and perform simulation studies to compare our methods with the ML method. We show that the QML method provides asymptotically efficient estimation of the regression parameters. Simulation studies showed that the QML and PML approaches yield estimates, standard errors and coverage rates similar to those of the ML method. Furthermore, the second-moment assumptions required by the QML and PML methods enable us to extend TRMs to a class of quasi-TRMs in Wedderburn's style. This eliminates the non-trivial restriction on the power parameter space, and thus provides a flexible regression model for continuous data. We provide an R implementation and illustrate the application of TRMs using three data sets.
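A minimal sketch of a moment-based Tweedie GLM fit in Python, in the spirit of the quasi-likelihood approach: only the mean structure and the power variance function V(mu) = mu^p are assumed. The data are simulated from a compound Poisson-gamma model (continuous, non-negative, with exact zeros), and the power parameter is fixed at p = 1.5 purely for illustration; the paper's QML/PML estimators, including estimation of p, differ in their details.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = sm.add_constant(x)
mu = np.exp(0.3 + 0.6 * x)

# compound Poisson-gamma draw: Poisson number of gamma summands
N = rng.poisson(mu)
y = np.array([rng.gamma(2.0, 1.0, k).sum() for k in N])

# Tweedie GLM: log link, variance function V(mu) = mu^1.5
glm = sm.GLM(y, X, family=sm.families.Tweedie(var_power=1.5))
res = glm.fit()
print(res.params)        # regression coefficients
print(res.scale)         # estimated dispersion
```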

18.
We propose data generating structures that can be represented as nonlinear autoregressive models whose innovations follow single or finite mixtures of scale mixtures of skew-normal distributions. This class of models covers symmetric/asymmetric and light-/heavy-tailed distributions, and so provides a useful generalization of symmetric nonlinear autoregressive models. Since semiparametric and nonparametric curve estimation are natural approaches for exploring the structure of a nonlinear time series, we investigate a semiparametric estimator of the nonlinear function of the model based on the conditional least squares method and a nonparametric kernel approach. We also propose an Expectation-Maximization-type algorithm for maximum likelihood (ML) inference on the unknown parameters of the model, and present strong and weak consistency results for the semiparametric estimator in this class of models. Finally, simulation studies and an application to a real data set illustrate the usefulness of the proposed model.
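A minimal sketch of the nonparametric kernel ingredient, assuming a Nadaraya-Watson estimator of the autoregression function m in X_t = m(X_{t-1}) + eps_t on simulated data with skewed innovations; the paper's conditional-least-squares refinement and EM step are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
x = np.zeros(T)
for t in range(1, T):
    eps = rng.gamma(2.0, 1.0) - 2.0        # centred, right-skewed innovation
    x[t] = np.sin(x[t - 1]) + 0.5 * eps    # nonlinear AR(1)

def nw_estimate(grid, lag, resp, h):
    """Nadaraya-Watson estimate of m in X_t = m(X_{t-1}) + eps."""
    out = np.empty_like(grid)
    for i, g in enumerate(grid):
        w = np.exp(-0.5 * ((lag - g) / h) ** 2)   # Gaussian kernel weights
        out[i] = np.sum(w * resp) / np.sum(w)
    return out

grid = np.linspace(-2, 2, 41)
m_hat = nw_estimate(grid, x[:-1], x[1:], h=0.3)   # should track sin(.) on the grid
print(np.c_[grid[:5], m_hat[:5]])
```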

19.
Prediction on the basis of censored data plays an important role in many fields. This article develops non-Bayesian two-sample prediction under a progressive Type-II right censoring scheme. We obtain the maximum likelihood (ML) prediction in a general form for lifetime models, including the Weibull distribution. The Weibull distribution is used to obtain the ML predictor (MLP), the ML prediction estimate (MLPE), the asymptotic ML prediction interval (AMLPI), and the asymptotic predictive ML intervals of the sth order statistic in a future random sample (Ys) drawn independently from the parent population, for an arbitrary progressive censoring scheme. To this end, we present three ML prediction methods, namely a numerical solution, the EM algorithm, and approximate ML prediction. We compare the performance of the different ML prediction methods under asymptotic normality and bootstrap methods by Monte Carlo simulation, with respect to the biases and mean square prediction errors (MSPEs) of the MLPs of Ys, as well as the coverage probabilities (CPs) and average lengths (ALs) of the AMLPIs. Finally, we give a numerical example and a real data sample to compare these ML prediction methods computationally.
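A minimal sketch of the underlying estimation step: ML fitting of a Weibull distribution under conventional (non-progressive) Type-II censoring, where only the r smallest of n lifetimes are observed. The prediction of future order statistics and the progressive scheme of the paper are not shown; the data are simulated.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, r = 40, 25                               # observe the r smallest of n lifetimes
full = rng.weibull(1.5, n) * 2.0            # true shape 1.5, true scale 2.0
obs = np.sort(full)[:r]                     # Type-II censored sample

def neg_loglik(theta):
    k, lam = np.exp(theta)                  # log-parametrization keeps k, lam > 0
    z = obs / lam
    ll = r * np.log(k / lam) + (k - 1) * np.sum(np.log(z)) - np.sum(z ** k)
    ll -= (n - r) * (obs[-1] / lam) ** k    # survivor contribution of censored units
    return -ll

fit = minimize(neg_loglik, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
print("ML estimates (shape, scale):", np.exp(fit.x))
```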

20.
In many spatial and spatial-temporal models, and more generally in models with complex dependencies, it may be too difficult to carry out full maximum likelihood (ML) analysis. Remedies include the use of pseudo-likelihood (PL) and quasi-likelihood (QL), the latter also called composite likelihood. The present paper studies the ML, PL and QL methods for general Markov chain models, partly motivated by the desire to understand the precise behaviour of the PL and QL methods in settings where this can be analysed. We present limiting normality results and compare performance in different settings. For Markov chain models, the PL and QL methods can be seen as maximum penalized likelihood methods. We find that QL is typically preferable to PL, and that it loses very little relative to ML, while sometimes gaining in model robustness. It also has appeal and potential as a modelling tool. Our methods are illustrated for consonant-vowel transitions in poetry and for the analysis of DNA sequence evolution-type models.
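For a finite-state Markov chain, full ML has a closed form: transition counts normalized by row sums. A minimal sketch on simulated data follows; the PL and QL variants studied in the paper replace the full likelihood with products of lower-dimensional terms and are not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
P_true = np.array([[0.7, 0.3], [0.4, 0.6]])   # true transition matrix

# simulate a two-state chain of length T
T, chain = 2000, [0]
for _ in range(T - 1):
    chain.append(rng.choice(2, p=P_true[chain[-1]]))
chain = np.asarray(chain)

# full ML estimate: transition counts normalized by row sums
counts = np.zeros((2, 2))
np.add.at(counts, (chain[:-1], chain[1:]), 1)
P_ml = counts / counts.sum(axis=1, keepdims=True)
print(P_ml)
```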
