Similar Literature
20 similar records found.
1.
Event history models typically assume that the entire population is at risk of experiencing the event of interest throughout the observation period. However, there will often be individuals, referred to as long-term survivors, who may be considered a priori to have a zero hazard throughout the study period. In this paper, a discrete-time mixture model is proposed in which the probability of long-term survivorship and the timing of event occurrence are modelled jointly. Another feature of event history data that often needs to be considered is that they may come from a population with a hierarchical structure. For example, individuals may be nested within geographical regions and individuals in the same region may have similar risks of experiencing the event of interest due to unobserved regional characteristics. Thus, the discrete-time mixture model is extended to allow for clustering in the likelihood and timing of an event within regions. The model is further extended to allow for unobserved individual heterogeneity in the hazard of event occurrence. The proposed model is applied in an analysis of contraceptive sterilization in Bangladesh. The results show that a woman's religion and education level affect her probability of choosing sterilization, but not when she gets sterilized. There is also evidence of community-level variation in sterilization timing, but not in the probability of sterilization.
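A minimal sketch of the core (single-level) mixture likelihood in Python: a logistic model for the probability of being susceptible and a time-constant logistic discrete-time hazard, maximized jointly with scipy. The multilevel (regional) and individual-frailty extensions described in the abstract are omitted, and all data and names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def neg_loglik(params, X, time, event):
    """X: (n, p) covariates; time: 1-based interval of event or censoring;
    event: 1 if the event occurred, 0 if censored."""
    p = X.shape[1]
    gamma, beta, alpha = params[:p], params[p:2 * p], params[2 * p]
    pi = expit(X @ gamma)                     # P(susceptible)
    h = expit(alpha + X @ beta)               # discrete-time hazard (constant over t here)
    surv_prev = (1.0 - h) ** (time - 1)       # survived the first t-1 intervals
    lik_event = pi * surv_prev * h            # susceptible and failed in interval t
    lik_cens = (1.0 - pi) + pi * surv_prev * (1.0 - h)   # long-term survivor or censored
    lik = np.where(event == 1, lik_event, lik_cens)
    return -np.sum(np.log(lik + 1e-300))

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
pi_true = expit(0.5 + X @ np.array([1.0, -0.5]))
susceptible = rng.random(n) < pi_true
h_true = expit(-1.5 + X @ np.array([0.5, 0.5]))
t_event = rng.geometric(h_true)               # geometric event time for the susceptible
t_cens = rng.integers(1, 15, size=n)
time = np.where(susceptible, np.minimum(t_event, t_cens), t_cens)
event = (susceptible & (t_event <= t_cens)).astype(int)

res = minimize(neg_loglik, np.zeros(2 * X.shape[1] + 1),
               args=(X, time, event), method="BFGS")
print(res.x)  # [gamma, beta, alpha] estimates
```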

2.
Conventional multilevel models assume that the explanatory variables are uncorrelated with the random effects. In some situations, this assumption may be invalid. One such example is the evaluation of a health or social programme that is non-randomly placed and/or in which participation is voluntary. In this case, there may be unobserved factors influencing the placement of the programme and the decision to participate that are correlated with the unobserved factors that influence the outcome of interest. The paper presents an application of a multiprocess multilevel model to assess the difference in rates of discontinuation of contraception between private and Government family planning providers, while accounting for the possibility that there may be unobserved individual and community level factors that influence both a couple's choice of provider and their probability of discontinuation.

3.
Unobserved individual effects in duration models cause estimation bias that affects the structural parameters as well as the duration dependence. The maximum penalized likelihood estimator is examined as an estimator for the survivor model with heterogeneity. Proofs of the existence and uniqueness of the maximum penalized likelihood estimator in duration models with general forms of unobserved heterogeneity are provided. Some small-sample evidence on the behavior of the maximum penalized likelihood estimator is given. The maximum penalized likelihood estimator is shown to be computationally feasible and to provide reasonable estimates in most cases.

4.
The paper deals with discrete-time regression models to analyze multistate-multiepisode models for event history data or failure time data collected in follow-up studies, retrospective studies, or longitudinal panels. The models are applicable if the events are not dated exactly but only a time interval is recorded. The models include individual-specific parameters to account for unobserved heterogeneity. The explanatory variables may be time-varying and random, with distributions depending on the observed history of the process. Different estimation procedures are considered: estimation of structural as well as individual-specific parameters by maximization of a joint likelihood function; estimation of the structural parameters by maximization of a conditional likelihood function, conditioning on a set of sufficient statistics for the individual-specific parameters; and estimation of the structural parameters by maximization of a marginal likelihood function, assuming that the individual-specific parameters follow a distribution. The advantages and limitations of the different approaches are discussed.
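The second estimation strategy, conditioning on sufficient statistics for the individual-specific parameters, can be illustrated in its simplest single-state, binary form with a fixed-effects (conditional) logit. A minimal sketch using statsmodels' ConditionalLogit on simulated data:

```python
import numpy as np
from statsmodels.discrete.conditional_models import ConditionalLogit

rng = np.random.default_rng(1)
n_subjects, n_periods = 300, 6
groups = np.repeat(np.arange(n_subjects), n_periods)
x = rng.normal(size=n_subjects * n_periods)
alpha = np.repeat(rng.normal(scale=1.5, size=n_subjects), n_periods)  # individual effects
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(alpha + 0.8 * x))))

# conditioning on each subject's total count of ones (the sufficient statistic
# for alpha_i) removes the individual-specific parameters from the likelihood
res = ConditionalLogit(y, x[:, None], groups=groups).fit()
print(res.params)  # close to the true structural coefficient 0.8
```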

5.
Tweedie regression models (TRMs) provide a flexible family of distributions to deal with non-negative right-skewed data and can handle continuous data with probability mass at zero. Estimation and inference for TRMs based on the maximum likelihood (ML) method are challenged by the presence of an infinite sum in the probability function and non-trivial restrictions on the power parameter space. In this paper, we propose two approaches for fitting TRMs, namely quasi-likelihood (QML) and pseudo-likelihood (PML). We discuss their asymptotic properties and perform simulation studies to compare our methods with the ML method. We show that the QML method provides asymptotically efficient estimation of the regression parameters. Simulation studies showed that the QML and PML approaches yield estimates, standard errors and coverage rates similar to the ML method. Furthermore, the second-moment assumptions required by the QML and PML methods enable us to extend the TRMs to the class of quasi-TRMs in Wedderburn's style. This eliminates the non-trivial restriction on the power parameter space and thus provides a flexible regression model for continuous data. We provide an R implementation and illustrate the application of TRMs using three data sets.
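Since the QML/PML estimating equations use only the first two moments, a rough analogue can be run with statsmodels' Tweedie GLM, whose IRLS fit solves the quasi-score equations. The power parameter is held fixed here rather than estimated, and the simulated data are illustrative:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500
X = sm.add_constant(rng.normal(size=(n, 1)))
mu = np.exp(X @ np.array([0.5, 0.3]))
# crude compound Poisson-gamma draw (Tweedie-like, 1 < p < 2): a Poisson number
# of gamma jumps gives continuous data with positive mass at zero
N = rng.poisson(mu)
y = rng.gamma(shape=np.maximum(N, 1), scale=1.0) * (N > 0)

# IRLS needs only E(y) = mu and Var(y) = phi * mu**var_power
res = sm.GLM(y, X, family=sm.families.Tweedie(var_power=1.5)).fit()
print(res.params, res.scale)  # regression coefficients and dispersion estimate
```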

6.
The traditional mixture model assumes that a dataset is composed of several populations of Gaussian distributions. In real life, however, data often do not fit the restrictions of normality very well. It is likely that data from a single population exhibiting either asymmetrical or heavy-tail behavior could be erroneously modeled as two populations, resulting in suboptimal decisions. To avoid these pitfalls, we generalize the mixture model using adaptive kernel density estimators. Because kernel density estimators enforce no functional form, we can adapt to non-normal asymmetric, kurtotic, and tail characteristics in each population independently. This, in effect, robustifies mixture modeling. We adapt two computational algorithms, genetic algorithm with regularized Mahalanobis distance and genetic expectation maximization algorithm, to optimize the kernel mixture model (KMM) and use results from robust estimation theory in order to data-adaptively regularize both. Finally, we likewise extend the information criterion ICOMP to score the KMM. We use these tools to simultaneously select the best mixture model and classify all observations without making any subjective decisions. The performance of the KMM is demonstrated on two medical datasets; in both cases, we recover the clinically determined group structure and substantially improve patient classification rates over the Gaussian mixture model.

7.
In high-dimensional linear regression, the number of variables exceeds the sample size. In this situation, the traditional variance estimator based on ordinary least squares exhibits a large bias even under a sparsity assumption, a major reason being the high spurious correlation between the unobserved realized noise and several predictors. To alleviate this problem, a refitted cross-validation (RCV) method has been proposed in the literature. However, for a complicated model, the RCV method has a lower probability of selecting a model that includes the true model in finite samples, which can easily result in a large bias in variance estimation. Thus, a model selection method based on the ranks of the frequency of occurrences in six votes from a blocked 3×2 cross-validation is proposed in this study. The proposed method has a considerably larger probability of including the true model in practice than the RCV method, and the variance estimate obtained using the model it selects shows a lower bias and a smaller variance. Furthermore, theoretical analysis proves the asymptotic normality of the proposed variance estimator.
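For reference, a minimal sketch of plain refitted cross-validation (not the blocked 3×2 voting scheme proposed in the abstract): variables are selected on one half of the data, and the residual variance comes from an OLS refit on the other half, so spurious correlation with the noise does not contaminate the estimate. Data and sizes are illustrative.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(3)
n, p, sigma2 = 200, 500, 1.0
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                   # sparse truth
y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)

idx1, idx2 = np.arange(n // 2), np.arange(n // 2, n)

def rcv_half(sel_idx, fit_idx):
    sel = LassoCV(cv=5).fit(X[sel_idx], y[sel_idx])      # select on one half ...
    support = np.flatnonzero(sel.coef_)
    if support.size == 0:
        return np.var(y[fit_idx], ddof=1)
    Xs = X[np.ix_(fit_idx, support)]                     # ... refit OLS on the other
    coef, *_ = np.linalg.lstsq(Xs, y[fit_idx], rcond=None)
    resid = y[fit_idx] - Xs @ coef
    return resid @ resid / (fit_idx.size - support.size)

print(0.5 * (rcv_half(idx1, idx2) + rcv_half(idx2, idx1)))  # near sigma2 = 1.0
```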

8.
We define a parametric proportional odds frailty model to describe lifetime data incorporating heterogeneity between individuals. An unobserved individual random effect, called frailty, acts multiplicatively on the odds of failure by time t. We investigate fitting by maximum likelihood and by least squares. For the latter, the parametric survivor function is fitted to the nonparametric Kaplan–Meier estimate at the observed failure times. Bootstrap standard errors and confidence intervals are obtained for the least squares estimates. The models are applied successfully to simulated data and to two real data sets. Least squares estimates appear to have smaller bias than maximum likelihood.
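A minimal sketch of the least-squares idea, assuming a log-logistic survivor S(t) = 1/(1 + (λt)^κ) as the parametric proportional-odds form (the frailty structure itself is not reproduced here): the parametric survivor is fitted to the Kaplan-Meier estimate at the observed failure times.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(4)
n = 400
t_event = 2.0 * rng.weibull(1.5, n)
t_cens = rng.exponential(3.0, n)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(int)

# Kaplan-Meier estimate (no ties in continuous simulated data)
order = np.argsort(time)
time, event = time[order], event[order]
km = np.cumprod(1.0 - event / (n - np.arange(n)))
t_fail, km_fail = time[event == 1], km[event == 1]

def survivor(t, lam, kappa):                  # log-logistic survivor function
    return 1.0 / (1.0 + (lam * t) ** kappa)

(lam_hat, kappa_hat), _ = curve_fit(survivor, t_fail, km_fail, p0=[1.0, 1.0])
print(lam_hat, kappa_hat)
# bootstrap standard errors would resample (time, event) pairs and refit
```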

9.
In the empirical literature on assortative matching using linked employer–employee data, unobserved worker quality appears to be negatively correlated with unobserved firm quality. We show that this can be caused by standard estimation error. We develop formulae showing that the estimated correlation is biased downwards if there is true positive assortative matching and when any conditioning covariates are uncorrelated with the firm and worker fixed effects. We show that this bias is bigger the fewer movers there are in the data, a phenomenon known as 'limited mobility bias'. This result applies to any two-way (or higher) error components model that is estimated by fixed effects methods. We apply these bias corrections to a large German linked employer–employee data set. We find that, although the biases can be considerable, they are not sufficiently large to remove the negative correlation entirely.
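A toy simulation of limited mobility bias under illustrative assumptions (sizes, mobility rate, noise scale): worker and firm effects are positively matched in truth, both are estimated by two-way fixed-effects OLS, and the correlation of the estimates is compared with the true one. This is a demonstration of the phenomenon, not the paper's bias-correction formulae.

```python
import numpy as np

rng = np.random.default_rng(5)
n_workers, n_firms, n_periods = 300, 30, 4

alpha = rng.normal(size=n_workers)                       # worker effects
psi = np.sort(rng.normal(size=n_firms))                  # firm effects, sorted by index
# high-alpha workers start in high-psi firms (true positive sorting); moves are rare
base_firm = np.argsort(np.argsort(alpha)) * n_firms // n_workers
rows = []
for i in range(n_workers):
    j = base_firm[i]
    for _ in range(n_periods):
        if rng.random() < 0.1:                           # 10% chance of moving
            j = int(rng.integers(n_firms))
        rows.append((i, j))
worker, firm = np.array(rows).T
y = alpha[worker] + psi[firm] + rng.normal(size=len(rows))

# two-way fixed effects via OLS on worker and firm dummies (firm 0 dropped)
D = np.zeros((len(rows), n_workers + n_firms - 1))
D[np.arange(len(rows)), worker] = 1.0
D[np.flatnonzero(firm > 0), n_workers + firm[firm > 0] - 1] = 1.0
coef, *_ = np.linalg.lstsq(D, y, rcond=None)
alpha_hat = coef[:n_workers]
psi_hat = np.concatenate([[0.0], coef[n_workers:]])

print("true corr:     ", np.corrcoef(alpha[worker], psi[firm])[0, 1])
print("estimated corr:", np.corrcoef(alpha_hat[worker], psi_hat[firm])[0, 1])
# the estimated correlation is typically well below the true one
```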

10.
This paper examines the asymptotic properties of a binary response model estimator based on maximization of the Area Under the receiver operating characteristic Curve (AUC). Given certain assumptions, AUC maximization is a consistent method of binary response model estimation up to normalizations. Because the AUC is equivalent to the Mann-Whitney U statistic and the Wilcoxon rank test, maximization of the area under the ROC curve is equivalent to maximization of the corresponding statistics. Compared to parametric methods such as logit and probit, AUC maximization relaxes the assumptions about the error distribution, but imposes some restrictions on the distribution of the explanatory variables, which can be easily checked, since this information is observable.
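A minimal sketch of AUC-maximization estimation, assuming a smoothed (sigmoid) surrogate for the pairwise indicator in the Mann-Whitney U statistic and a unit-norm normalization, since the index coefficients are identified only up to scale:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(6)
n = 600
X = rng.standard_t(df=5, size=(n, 3))        # non-Gaussian regressors
beta_true = np.array([1.0, -1.0, 0.5])
y = (X @ beta_true + rng.logistic(size=n) > 0).astype(int)
X1, X0 = X[y == 1], X[y == 0]

def neg_smoothed_auc(beta, h=0.1):
    beta = beta / np.linalg.norm(beta)       # identified only up to scale
    s1, s0 = X1 @ beta, X0 @ beta
    # smoothed Mann-Whitney U: sigmoid instead of the pairwise indicator
    return -np.mean(expit((s1[:, None] - s0[None, :]) / h))

res = minimize(neg_smoothed_auc, x0=np.ones(3), method="Nelder-Mead")
beta_hat = res.x / np.linalg.norm(res.x)
print(beta_hat)  # comparable to beta_true / ||beta_true||
```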

11.
Network meta-analysis can be implemented by using arm-based or contrast-based models. Here we focus on arm-based models and fit them using generalized linear mixed model procedures. Full maximum likelihood (ML) estimation leads to biased trial-by-treatment interaction variance estimates for heterogeneity. Thus, our objective is to investigate alternative approaches to variance estimation that reduce bias compared with full ML. Specifically, we use penalized quasi-likelihood/pseudo-likelihood and hierarchical (h) likelihood approaches. In addition, we consider a novel model modification that yields estimators akin to the residual maximum likelihood estimator for linear mixed models. The proposed methods are compared by simulation, and two real data sets are used for illustration. Simulations show that penalized quasi-likelihood/pseudo-likelihood and h-likelihood reduce bias and yield satisfactory coverage rates. Sum-to-zero restriction and baseline contrasts for random trial-by-treatment interaction effects, as well as a residual ML-like adjustment, also reduce bias compared with an unconstrained model when ML is used, but coverage rates are not quite as good. Penalized quasi-likelihood/pseudo-likelihood and h-likelihood are therefore recommended.

12.
13.
Using logarithmic and integral transformations, we transform a continuous covariate frailty model into a polynomial regression model with a random effect. The responses of this mixed model can be 'estimated' via conditional hazard function estimation. The random error in this model does not have zero mean and its variance is not constant along the covariate; consequently, these two quantities have to be estimated. Since the asymptotic expression for the bias is complicated, the two-large-bandwidth trick is proposed to estimate the bias. The proposed transformation is very useful for clustered incomplete data subject to left truncation and right censoring (and for complex clustered data in general). Indeed, in this case no standard software is available to fit the frailty model, whereas for the transformed model standard software for mixed models can be used to estimate the unknown parameters in the original frailty model. A small simulation study illustrates the good behavior of the proposed method. The method is applied to a bladder cancer data set.

14.
We propose a new method for fitting proportional hazards models with error-prone covariates. Regression coefficients are estimated by solving an estimating equation that is the average of the partial likelihood scores based on imputed true covariates. For the purpose of imputation, a linear spline model is assumed on the baseline hazard. We discuss consistency and asymptotic normality of the resulting estimators, and propose a stochastic approximation scheme to obtain the estimates. The algorithm is easy to implement, and reduces to the ordinary Cox partial likelihood approach when the measurement error has a degenerate distribution. Simulations indicate high efficiency and robustness. We consider the special case where error-prone replicates are available on the unobserved true covariates. As expected, increasing the number of replicates for the unobserved covariates increases efficiency and reduces bias. We illustrate the practical utility of the proposed method with an Eastern Cooperative Oncology Group clinical trial in which a genetic marker, c-myc expression level, is subject to measurement error.

15.
Conventional production function specifications are shown to impose restrictions on the probability distribution of output that cannot be tested with the conventional models. These restrictions have important implications for firm behavior under uncertainty. A flexible representation of a firm's stochastic technology is developed based on the moments of the probability distribution of output. These moments are a unique representation of the technology and are functions of inputs. Large-sample estimators are developed for a linear moment model that is sufficiently flexible to test the restrictions implied by conventional production function specifications. The flexible moment-based approach is applied to milk production data. The first three moments of output are statistically significant functions of inputs. The cross-moment restrictions implied by conventional models are rejected.
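A minimal sketch of the moment-based representation on simulated data: the first moment is estimated by regressing output on inputs, and higher central moments by regressing powers of the residuals on the same inputs. The large-sample corrections discussed in the paper are omitted, and the input effects on variance and skewness below are illustrative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 1000
X = sm.add_constant(rng.uniform(0.5, 2.0, size=(n, 2)))     # two inputs
mean_y = X @ np.array([1.0, 0.8, 0.4])
sd_y = np.sqrt(np.exp(X @ np.array([-1.0, 0.6, -0.3])))     # input-dependent risk
y = mean_y + sd_y * rng.normal(size=n)

m1 = sm.OLS(y, X).fit()                       # first moment: mean output
e = m1.resid
m2 = sm.OLS(e ** 2, X).fit()                  # second central moment: variance
m3 = sm.OLS(e ** 3, X).fit()                  # third central moment: skewness
print(m2.params)                              # which inputs are risk-increasing?
print(m3.params)
```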

16.
The paper compares several versions of the likelihood ratio test for exponential homogeneity against mixtures of two exponentials. They are based on different implementations of the likelihood maximization algorithm. We show that global maximization of the likelihood is not appropriate to obtain a good power of the LR test. A simple starting strategy for the EM algorithm, which under the null hypothesis often fails to find the global maximum, results in a rather powerful test. On the other hand, a multiple starting strategy that comes close to global maximization under both the null and the alternative hypotheses leads to inferior power.
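A minimal sketch of the test on simulated data, with the mixture alternative fitted by EM from a single simple starting point, in the spirit of the starting strategy the abstract finds more powerful. The null distribution of the statistic is nonstandard and would be calibrated by simulation.

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.exponential(1.0, size=200)            # data generated under the null

def em_two_exponentials(x, pi, l1, l2, n_iter=500):
    for _ in range(n_iter):
        f1 = pi * l1 * np.exp(-l1 * x)
        f2 = (1 - pi) * l2 * np.exp(-l2 * x)
        w = f1 / (f1 + f2)                    # E-step: posterior weights
        pi = w.mean()                         # M-step updates
        l1 = w.sum() / (w * x).sum()
        l2 = (1 - w).sum() / ((1 - w) * x).sum()
    return pi, l1, l2

# one deliberately simple starting point, rather than a multiple-start search
pi, l1, l2 = em_two_exponentials(x, pi=0.5, l1=0.5 / x.mean(), l2=2.0 / x.mean())
ll_alt = np.log(pi * l1 * np.exp(-l1 * x) + (1 - pi) * l2 * np.exp(-l2 * x)).sum()
ll_null = -len(x) * (np.log(x.mean()) + 1.0)  # exponential MLE log-likelihood
print(2 * (ll_alt - ll_null))                 # calibrate its null law by simulation
```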

17.
蔡明超, 张静. 《统计研究》 (Statistical Research), 2000, 17(12): 21-24
I. Introduction. With the continuing development of China's securities market, the ranks of institutional investors have expanded rapidly, and the clearest effect of this trend on market microstructure is that portfolio management strategy has become increasingly important. A core element of portfolio management is setting asset allocation proportions. An institutional investor's asset allocation typically consists of two parts: (1) passive allocation imposed by policy constraints; for example, Chapter 5, Article 33 of the Interim Measures for the Administration of Securities Investment Funds (《证券投资基金管理暂行办法》) explicitly stipulates that a fund must invest no less than 80% of its total assets in stocks and bonds, and no less than 20% of its net asset value in government bonds; (2) active asset allocation. A correct asset allocation strategy …

18.
In this work, we propose a beta prime kernel estimator for estimating probability density functions with nonnegative support. The beta prime probability density function is used as the kernel of the proposed estimator; it is free of boundary bias, nonnegative, and has a naturally varying shape. We obtain the optimal rate of convergence for the mean squared error (MSE) and the mean integrated squared error (MISE). We also use an adaptive Bayesian bandwidth selection method with Lindley approximation for heavy-tailed distributions and compare its performance with the global least squares cross-validation bandwidth selection method. Simulation studies are performed to evaluate the average integrated squared error (ISE) of the proposed kernel estimator against some asymmetric competitors using Monte Carlo simulations. Moreover, real data sets are presented to illustrate the findings.
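A minimal sketch of an asymmetric-kernel estimator with beta prime kernels on [0, ∞). The shape parametrization below, chosen so that the kernel at x has mean (x + h)/(1 + h) ≈ x for small h, is an illustrative assumption rather than necessarily the paper's exact choice, and the bandwidth is fixed rather than selected by the Bayesian or cross-validation rules discussed.

```python
import numpy as np
from scipy.stats import betaprime

def betaprime_kde(x_grid, data, h=0.1):
    # one beta prime kernel per evaluation point x, averaged over the data;
    # alpha = x/h + 1, beta = 1/h + 2 gives the kernel mean (x + h)/(1 + h)
    a = x_grid[:, None] / h + 1.0
    b = 1.0 / h + 2.0
    return betaprime.pdf(data[None, :], a, b).mean(axis=1)

rng = np.random.default_rng(9)
data = rng.gamma(2.0, 1.5, size=500)          # nonnegative, right-skewed sample
grid = np.linspace(0.01, 12.0, 200)
fhat = betaprime_kde(grid, data)
print(grid[np.argmax(fhat)])                  # rough mode; the true mode is 1.5
```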

19.
In the classical growth curve setting, individuals are repeatedly measured over time on an outcome of interest. The objective of statistical modeling is to fit some function of time, generally a polynomial, that describes the outcome's behavior. The polynomial coefficients are assumed drawn from a multivariate normal mixing distribution. At times, it may be known that each individual's polynomial must follow a restricted form. When the polynomial coefficients lie close to the restriction boundary, or the outcome is subject to substantial measurement error, or relatively few observations per individual are recorded, it can be advantageous to incorporate known restrictions. This paper introduces a class of models where the polynomial coefficients are assumed drawn from a restricted multivariate normal whose support is confined to a theoretically permissible region. The model can handle a variety of restrictions on the space of random parameters. The restricted support ensures that each individual's random polynomial is theoretically plausible. Estimation, posterior calculations, and comparisons with the unrestricted approach are provided.

20.
We propose a hidden Markov model for longitudinal count data in which sources of unobserved heterogeneity arise, making the data overdispersed. The observed process, conditionally on the hidden states, is assumed to follow an inhomogeneous Poisson kernel, with the unobserved heterogeneity modeled in a generalized linear model (GLM) framework by adding individual-specific random effects to the link function. Due to the complexity of the likelihood within the GLM framework, model parameters may be estimated by numerical maximization of the log-likelihood function or by simulation methods; we propose a more flexible approach based on the Expectation-Maximization (EM) algorithm. Parameter estimation is carried out using a non-parametric maximum likelihood (NPML) approach in a finite mixture context. Simulation results and two empirical examples are provided.
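A minimal sketch of the observed-process layer: the log-likelihood of a two-state Poisson hidden Markov model computed with the scaled forward algorithm. The NPML random-effect layer is omitted; in the model above it would add individual-specific mass points to the Poisson means. All parameter values are illustrative.

```python
import numpy as np
from scipy.stats import poisson

def poisson_hmm_loglik(y, pi0, A, lambdas):
    """y: counts; pi0: initial state probabilities; A: transition matrix;
    lambdas: state-specific Poisson means."""
    alpha = pi0 * poisson.pmf(y[0], lambdas)
    loglik = 0.0
    for t in range(len(y)):
        if t > 0:
            alpha = (alpha @ A) * poisson.pmf(y[t], lambdas)
        c = alpha.sum()                       # scale to avoid underflow
        loglik += np.log(c)
        alpha = alpha / c
    return loglik

rng = np.random.default_rng(10)
A = np.array([[0.9, 0.1], [0.2, 0.8]])
lambdas = np.array([1.0, 6.0])
state, y = 0, []
for _ in range(300):                          # simulate a two-state count series
    y.append(rng.poisson(lambdas[state]))
    state = rng.choice(2, p=A[state])

print(poisson_hmm_loglik(np.array(y), np.array([0.5, 0.5]), A, lambdas))
```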
