This paper presents a robust extension of the factor analysis model by assuming the multivariate normal mean–variance mixture of the Birnbaum–Saunders distribution for the unobservable factors and errors. A computationally analytical EM-based algorithm is developed to find maximum likelihood estimates of the parameters. The asymptotic standard errors of the parameter estimates are derived under an information-based paradigm. Numerical merits of the proposed methodology are illustrated using both simulated and real datasets.

An objective of record linkage is to link two data files by identifying common elements. A popular model for separating matches from non-matches is the probabilistic one of Fellegi and Sunter. To estimate the parameters needed for the model, a mixture model is usually constructed and the EM algorithm is applied. For simplification, the assumption of conditional independence is often made. This assumption says that if several attributes of elements in the data are compared, then the results of the comparisons on those attributes are independent within the mixture classes. A mixture model constructed with this assumption has often been used. In this article, a straightforward extension of the model is introduced which allows for conditional dependencies but is heavily dependent on the choice of the starting value. Therefore, an estimation procedure for the EM algorithm starting value is also proposed. The two models are compared empirically in a simulation study based on telephone book entries. In particular, the effect of different starting values and conditional dependencies on the matching results is investigated.
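
The conditional-independence mixture described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the article's implementation: the function names (`clip`, `em_linkage`, `simulate_pairs`), the starting values, and the simulated agreement probabilities are all hypothetical, and the comparison outcome for each attribute is reduced to agree (1) or disagree (0).

```python
import math
import random


def clip(x, eps=1e-6):
    """Keep a probability strictly inside (0, 1) so logs and ratios stay finite."""
    return min(max(x, eps), 1.0 - eps)


def em_linkage(patterns, p=0.1, m_start=0.9, u_start=0.1, n_iter=100):
    """EM for a two-class mixture of binary comparison vectors.

    Assumes conditional independence of the attribute comparisons within
    each class. Returns (p, m, u, trace), where p is the estimated match
    proportion, m[j] and u[j] are the agreement probabilities of attribute j
    among matches and non-matches, and trace holds the log-likelihood
    evaluated at the start of every iteration.
    """
    n = len(patterns)
    k = len(patterns[0])
    m = [m_start] * k
    u = [u_start] * k
    trace = []
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a match.
        weights = []
        loglik = 0.0
        for g in patterns:
            like_m, like_u = p, 1.0 - p
            for j in range(k):
                like_m *= m[j] if g[j] else 1.0 - m[j]
                like_u *= u[j] if g[j] else 1.0 - u[j]
            weights.append(like_m / (like_m + like_u))
            loglik += math.log(like_m + like_u)
        trace.append(loglik)
        # M-step: weighted relative frequencies of agreement.
        w_sum = sum(weights)
        p = clip(w_sum / n)
        for j in range(k):
            agree_m = sum(w for w, g in zip(weights, patterns) if g[j])
            agree_all = sum(1 for g in patterns if g[j])
            m[j] = clip(agree_m / w_sum)
            u[j] = clip((agree_all - agree_m) / (n - w_sum))
    return p, m, u, trace


def simulate_pairs(n, p, m, u, seed=0):
    """Draw n comparison vectors from the mixture (illustrative data only)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        probs = m if rng.random() < p else u
        pairs.append(tuple(int(rng.random() < q) for q in probs))
    return pairs


if __name__ == "__main__":
    pairs = simulate_pairs(3000, 0.1, [0.95, 0.9, 0.9, 0.85], [0.05, 0.1, 0.1, 0.2])
    p_hat, m_hat, u_hat, _ = em_linkage(pairs)
    print(p_hat, m_hat, u_hat)
```

Rerunning `em_linkage` with different values of `p`, `m_start` and `u_start` is a simple way to see the sensitivity to starting values that the abstract mentions; the dependence extension itself is not reproduced here.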

The sample selection bias problem occurs when the outcome of interest is only observed according to some selection rule, where there is a dependence structure between the outcome and the selection rule. In a pioneering work, J. Heckman proposed a sample selection model based on a bivariate normal distribution for dealing with this problem. Due to the non-robustness of the normal distribution, many alternatives have been introduced in the literature by assuming extensions of the normal distribution, such as the Student-t and skew-normal models. One common limitation of the existing sample selection models is that they require a transformation of the outcome of interest, which is commonly R+-valued, such as income and wage. As a result, data are analyzed on a non-original scale, which complicates the interpretation of the parameters. In this paper, we propose a sample selection model based on the bivariate Birnbaum–Saunders distribution, which has the same number of parameters as the classical Heckman model. Further, our associated outcome equation is R+-valued. We discuss estimation by maximum likelihood and present some Monte Carlo simulation studies. An empirical application to the ambulatory expenditures data from the 2001 Medical Expenditure Panel Survey is presented.

This article focuses on parameter estimation for experimental items/units from the Weibull Poisson model under progressive type-II censoring with binomial removals (PT-II CBRs). The expectation–maximization algorithm is used to compute the maximum likelihood estimators (MLEs). The MLEs and Bayes estimators are obtained under symmetric and asymmetric loss functions. The performance of the competing estimators is studied through their simulated risks. One-sample Bayes prediction and expected experiment time are also studied. Furthermore, the suitability of the considered model and the proposed methodology is illustrated using a real bladder cancer data set.

Suppose the same nonlinear function involving k parameters is fit to each of t populations. Suppose further it is of interest to compare a specific parameter of the models across the populations. Such comparisons can be expressed as linear hypotheses about the parameters of the nonlinear models. A weighted linear least squares (WLLS) procedure is proposed to test these linear hypotheses. The advantages and disadvantages of the WLLS procedure are discussed. This procedure is also compared to a nonlinear least squares procedure for testing these hypotheses in nonlinear models.
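
As a rough illustration of the idea, the simplest linear hypothesis (the chosen parameter is equal in all t populations) can be tested by weighting each population's estimate by the inverse of its estimated variance. The sketch below assumes independent estimates with known variances and uses the classical weighted least squares statistic; the function name `wlls_equality_test` and its inputs are hypothetical, and the abstract does not confirm that the paper's WLLS procedure takes exactly this form.

```python
def wlls_equality_test(estimates, variances):
    """Weighted least squares test that one parameter is equal across populations.

    estimates: per-population estimates of the same parameter, e.g. taken
               from separate nonlinear fits (assumed independent).
    variances: their estimated variances (assumed known and positive).

    Returns (pooled, q, df): the inverse-variance weighted common estimate,
    the weighted residual sum of squares, and its degrees of freedom.
    Under the hypothesis of equality, q is referred to a chi-square
    distribution with df degrees of freedom in large samples.
    """
    if len(estimates) != len(variances):
        raise ValueError("estimates and variances must have the same length")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    return pooled, q, len(estimates) - 1


if __name__ == "__main__":
    # Three populations with made-up estimates and variances.
    print(wlls_equality_test([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))
```

More general linear hypotheses replace the common-mean model with a design matrix, but the weighting principle is the same.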

Hunt (1996) implemented the finite mixture model approach to clustering in a program called MULTIMIX. The program is designed to cluster multivariate data that have categorical and continuous variables and that possibly contain missing values. This paper describes the approach taken to design MULTIMIX and how some of the statistical problems were dealt with. As an example, the program is used to cluster a large medical dataset.


In this study, a Generalized, Multi-Stage Adjusted, Latent Class Linear Mixed Model is proposed for modeling heterogeneously distributed phenotype and genetic information across the whole genome in the presence of both serial and familial correlations. Genome data were analyzed by applying the proposed model to Genetic Analysis Workshop (GAW) data, and the model results were compared to the results of standard models. Moreover, the potential of the model is discussed using simulated data. In the model comparisons, the information criteria and the genomic control parameter were found to be smaller for the proposed model. The results of a power analysis show that the proposed model is more powerful.

Abstract. Latent variable modelling has gradually become an integral part of mainstream statistics and is currently used for a multitude of applications in different subject areas. Examples of ‘traditional’ latent variable models include latent class models, item–response models, common factor models, structural equation models, mixed or random effects models and covariate measurement error models. Although latent variables have widely different interpretations in different settings, the models have a very similar mathematical structure. This has been the impetus for the formulation of general modelling frameworks which accommodate a wide range of models. Recent developments include multilevel structural equation models with both continuous and discrete latent variables, multiprocess models and nonlinear latent variable models.

Probabilistic matching of records is widely used to create linked data sets for use in health science, epidemiological, economic, demographic and sociological research. Clearly, this type of matching can lead to linkage errors, which in turn can lead to bias and increased variability when standard statistical estimation techniques are used with the linked data. In this paper we develop unbiased regression parameter estimates to be used when fitting a linear model with nested errors to probabilistically linked data. Since estimation of variance components is typically an important objective when fitting such a model, we also develop appropriate modifications to standard methods of variance components estimation in order to account for linkage error. In particular, we focus on three widely used methods of variance components estimation: analysis of variance, maximum likelihood and restricted maximum likelihood. Simulation results show that our estimators perform reasonably well when compared to standard estimation methods that ignore linkage errors.

In this paper, E-Bayesian and hierarchical Bayesian estimations of the shape parameter, when the underlying distribution belongs to the proportional reversed hazard rate model, are considered. Maximum likelihood, Bayesian and E-Bayesian estimates of the unknown parameter and reliability function are obtained based on record values. The Bayesian estimates are derived based on squared error and linear–exponential loss functions. It is pointed out that some previously obtained order relations of E-Bayesian estimates are inadequate and these results are improved. The relationship between E-Bayesian and hierarchical Bayesian estimations is obtained under the same loss functions. The comparison of the derived estimates is carried out by using Monte Carlo simulations. A real data set is analysed for an illustration of the findings.
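
For orientation, the general forms of the estimators named above can be written out. This is a sketch in generic notation, not the paper's derivation: the symbols θ (shape parameter), c (the linear–exponential loss constant), b (a prior hyperparameter) and π(b) (its distribution) are assumptions of this note, and the closed forms for the proportional reversed hazard rate model are not reproduced.

```latex
% Squared error loss: the Bayes estimate is the posterior mean.
\hat{\theta}_{\mathrm{SE}} = \mathrm{E}\left[\theta \mid \text{data}\right]

% Linear-exponential (LINEX) loss with constant c \neq 0 and its Bayes estimate.
L(\hat{\theta}, \theta) = e^{c(\hat{\theta} - \theta)} - c(\hat{\theta} - \theta) - 1,
\qquad
\hat{\theta}_{\mathrm{LINEX}} = -\frac{1}{c} \ln \mathrm{E}\left[e^{-c\theta} \mid \text{data}\right]

% E-Bayesian estimate: the Bayes estimate \hat{\theta}_{B}(b) averaged over
% a distribution \pi(b) placed on the hyperparameter b.
\hat{\theta}_{\mathrm{EB}} = \int \hat{\theta}_{B}(b)\, \pi(b)\, \mathrm{d}b
```

The order relations mentioned in the abstract concern how the E-Bayesian estimates compare under different choices of π(b).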

The family of power series cure rate models provides a flexible modeling framework for survival data of populations with a cure fraction. In this work, we present a simplified estimation procedure for the maximum likelihood (ML) approach. ML estimates are obtained via the expectation-maximization (EM) algorithm, where the expectation step involves computation of the expected number of concurrent causes for each individual. This has the major advantage that the maximization step can be decomposed into separate maximizations of two lower-dimensional functions of the regression and survival distribution parameters, respectively. Two simulation studies are performed: the first to investigate the accuracy of the estimation procedure for different numbers of covariates and the second to compare our proposal with the direct maximization of the observed log-likelihood function. Finally, we illustrate the technique for parameter estimation on a dataset of survival times for patients with malignant melanoma.

The purpose of this paper is to develop a Bayesian approach for the Weibull-Negative-Binomial regression model with cure rate under latent failure causes and in the presence of randomized activation mechanisms. We assume the number of competing causes of the event of interest follows a Negative Binomial (NB) distribution, while the latent lifetimes are assumed to follow a Weibull distribution. Markov chain Monte Carlo (MCMC) methods are used to develop the Bayesian procedure. Model selection to compare the fitted models is discussed. Moreover, we develop case deletion influence diagnostics for the joint posterior distribution based on the ψ-divergence, which has several divergence measures as particular cases. The developed procedures are illustrated with a real data set.


In this paper, we consider the best linear unbiased estimators (BLUEs) based on double ranked set sampling (DRSS) and ordered DRSS (ODRSS) schemes for the simple linear regression model with replicated observations. We assume three symmetric distributions for the random error term, i.e., normal, Laplace and some scale-contaminated normal distributions. The proposed BLUEs under DRSS (BLUEs-DRSS) and ODRSS (BLUEs-ODRSS) are compared with the BLUEs based on ordered simple random sampling (OSRS), ranked set sampling (RSS), and ordered RSS (ORSS) schemes. These estimators are compared in terms of relative efficiency (RE), RE of determinant (RED), and RE of trace (RET). It is found that the BLUEs-ODRSS are uniformly better than the BLUEs based on OSRS, RSS, ORSS, and DRSS schemes. We also compare the estimators based on imperfect RSS (IRSS) schemes. The BLUEs under ordered imperfect DRSS (OIDRSS) are better than their counterparts based on IRSS, ordered IRSS (OIRSS), and imperfect DRSS (IDRSS) methods. Moreover, for sensitivity analysis of the BLUEs, we calculate REs and REDs of the BLUEs under the assumption of normality when in fact the parent distribution is a non-normal symmetric distribution. It turns out that even under violation of the normality assumption, the BLUEs of the intercept and slope parameters are unbiased with equal REs under each sampling scheme. It is also observed that the BLUEs under ODRSS are more efficient than the existing BLUEs.

Failure time data represent a particular case of binary longitudinal data. The corresponding analysis of the effect of explanatory covariates repeatedly collected over time on the failure rate has been largely facilitated by the Cox semi-parametric regression model. However, the interpretation of the estimated parameters associated with time-dependent covariates is not straightforward, nor does this model fully account for the dynamics of the effect of a covariate over time. Markovian regression models appear as complementary tools to address these specific issues from the predictive point of view. We illustrate these aspects using data from the WHO multicenter study, which was designed to analyze the relation between the duration of postpartum lactational amenorrhea and the breastfeeding pattern. One of the main advantages of this approach applied to the field of reproductive epidemiology was to provide a flexible tool, easily and directly understood by clinicians and fieldworkers, for simulating situations which were still unobserved and for predicting their effects on the duration of amenorrhea.

We introduce the log-odd Weibull regression model based on the odd Weibull distribution (Cooray, 2006). We derive some mathematical properties of the log-transformed distribution. The new regression model represents a parametric family of models that includes as sub-models some widely known regression models that can be applied to censored survival data. We employ a frequentist analysis and a parametric bootstrap for the parameters of the proposed model. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and present some ways to assess global influence. Further, for different parameter settings, sample sizes and censoring percentages, some simulations are performed. In addition, the empirical distribution of some modified residuals is given and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be extended to a modified deviance residual in the proposed regression model applied to censored data. We define martingale and deviance residuals to check the model assumptions. The extended regression model is very useful for the analysis of real data.

In recent years, numerous statisticians have focused their attention on the Bayesian analysis of different paired comparison models. Among paired comparison techniques, the Davidson model is one of the best-known models in the available literature. In this article, we introduce an amendment to the Davidson model which is intended to accommodate the option of not distinguishing the effects of two treatments when they are compared pairwise. Having made this amendment, the Bayesian analysis of the Amended Davidson model is performed using noninformative (uniform and Jeffreys’) and informative (Dirichlet–gamma–gamma) priors. To study the model and to perform the Bayesian analysis with the help of an example, we obtain the joint and marginal posterior distributions of the parameters, their posterior estimates, graphical presentations of the marginal densities, preference and predictive probabilities, and the posterior probabilities to compare the treatment parameters.

A longitudinal mixture model for classifying patients into responders and non‐responders is established using both likelihood‐based and Bayesian approaches. The model takes into consideration responders in the control group. Therefore, it is especially useful in situations where the placebo response is strong, or in equivalence trials where the drug in development is compared with a standard treatment. Under our model, a treatment shows evidence of being effective if it increases the proportion of responders or increases the response rate among responders in the treated group compared with the control group. Therefore, the model has the flexibility to accommodate different situations. The proposed method is illustrated using simulation and a depression clinical trial dataset for the likelihood‐based approach, and the same depression clinical trial dataset for the Bayesian approach. The likelihood‐based and Bayesian approaches generated consistent results for the depression trial data. In both the placebo group and the treated group, patients are classified into two components with distinct response rates. The proportion of responders is shown to be significantly higher in the treated group compared with the control group, suggesting the treatment paroxetine is effective. Copyright © 2014 John Wiley & Sons, Ltd.

A random-effects transition model is proposed to model the economic activity status of household members. This model is introduced to take into account two kinds of correlation: one due to the longitudinal nature of the study, which is handled using a transition parameter, and the other due to the correlation between responses of members of the same household, which is taken into account by introducing random coefficients into the model. The results are presented based on homogeneous (no parameter changes over time) and non-homogeneous Markov models with random coefficients. A Bayesian approach via Gibbs sampling is used to perform parameter estimation. Results of using the random-effects transition model are compared, using the deviance information criterion, with those of three other models which exclude random effects and/or transition effects. It is shown that the full model gains more precision because it considers all aspects of the process which generated the data. To illustrate the utility of the proposed model, a longitudinal data set extracted from the Iranian Labour Force Survey is analysed to explore the simultaneous effect of some covariates on the current economic activity as a nominal response. Also, some sensitivity analyses are performed to assess the robustness of the posterior estimation of the transition parameters to perturbations of the prior parameters.
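
The transition component of such a model can be illustrated in its most basic form: a homogeneous first-order Markov chain whose transition probabilities are estimated by row-normalized transition counts. The sketch below is a deliberate simplification and not the proposed model: it has no covariates, no household random coefficients and no Bayesian estimation, and the function name `transition_matrix` and the state labels are hypothetical.

```python
from collections import Counter


def transition_matrix(sequences, states):
    """Estimate homogeneous first-order transition probabilities.

    sequences: one list of observed states per individual, in time order.
    states:    the nominal states, e.g. economic activity categories.

    Returns a nested dict with matrix[a][b] = estimated P(next = b | current = a),
    computed as the relative frequency of a -> b among transitions out of a.
    Rows with no observed transitions are left as zeros.
    """
    counts = Counter()
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[(current, following)] += 1
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0) for b in states
        }
    return matrix


if __name__ == "__main__":
    # Two made-up individuals observed at three occasions each.
    data = [["employed", "employed", "unemployed"],
            ["unemployed", "employed", "employed"]]
    print(transition_matrix(data, ["employed", "unemployed"]))
```

A non-homogeneous version would estimate a separate matrix for each pair of consecutive occasions instead of pooling all transitions.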

