20 similar articles found.
The measurement of human immunodeficiency virus ribonucleic acid levels over time leads to censored longitudinal data. Suitable models for dynamic modelling of these levels need to take this data characteristic into account. If groups of patients with different developments of the levels over time are suspected, the model class of finite mixtures of mixed effects models with censored data is required. We describe the model specification and derive the estimation with a suitable expectation-maximization algorithm. We propose a convenient implementation using closed form formulae for the expected mean and variance of the truncated multivariate distribution. Only efficient evaluation of the cumulative multivariate normal distribution function is required. Model selection as well as methods for inference are discussed. The application is demonstrated on the clinical trial ACTG 315 data.

The Tweedie compound Poisson distribution is a subclass of the exponential dispersion family with a power variance function, in which the value of the power index lies in the interval (1,2). It is well known that the Tweedie compound Poisson density function is not analytically tractable, and numerical procedures that allow the density to be evaluated quickly and accurately did not appear until fairly recently. Unsurprisingly, there has been little statistical literature devoted to full maximum likelihood inference for Tweedie compound Poisson mixed models. To date, the focus has been on estimation methods in the quasi-likelihood framework. Further, Tweedie compound Poisson mixed models involve an unknown variance function, which has a significant impact on hypothesis tests and predictive uncertainty measures. The estimation of the unknown variance function is thus of independent interest in many applications. However, quasi-likelihood-based methods are not well suited to this task. This paper presents several likelihood-based inferential methods for the Tweedie compound Poisson mixed model that enable estimation of the variance function from the data. These algorithms include the likelihood approximation method, in which both the integral over the random effects and the compound Poisson density function are evaluated numerically; and the latent variable approach, in which maximum likelihood estimation is carried out via the Monte Carlo EM algorithm, without the need for approximating the density function. In addition, we derive the corresponding Markov chain Monte Carlo algorithm for a Bayesian formulation of the mixed model. We demonstrate the use of the various methods through a numerical example, and conduct an array of simulation studies to evaluate the statistical properties of the proposed estimators.
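
The power variance function mentioned in this abstract can be made concrete with a small sketch. It assumes the standard compound Poisson-gamma representation, in which a Poisson number of gamma-distributed terms is summed; the function names and the (mu, phi, p) parameterization are illustrative and are not taken from the paper. Under these assumptions the mean is mu and the variance is phi * mu^p.

```python
import math
import random


def tweedie_cp_params(mu, phi, p):
    """Map (mean mu, dispersion phi, power index p in (1, 2)) to the
    Poisson rate and the gamma shape/scale of the compound Poisson form."""
    if not 1.0 < p < 2.0:
        raise ValueError("power index p must lie in (1, 2)")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))    # Poisson rate
    shape = (2.0 - p) / (p - 1.0)                # gamma shape
    scale = phi * (p - 1.0) * mu ** (p - 1.0)    # gamma scale
    return lam, shape, scale


def compound_moments(lam, shape, scale):
    """Mean and variance of a Poisson(lam) sum of Gamma(shape, scale) terms."""
    mean = lam * shape * scale
    var = lam * shape * (shape + 1.0) * scale ** 2
    return mean, var


def sample_tweedie_cp(mu, phi, p, rng):
    """Draw one value; returns exactly 0 when the Poisson count is 0."""
    lam, shape, scale = tweedie_cp_params(mu, phi, p)
    # Knuth's multiplication method for the Poisson count (fine for small lam).
    limit, count, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return sum(rng.gammavariate(shape, scale) for _ in range(count))
```

The point mass at zero together with a continuous positive part is what makes the density awkward to evaluate, while the first two moments stay simple.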

Millions of smart meters that are able to collect individual load curves, that is, electricity consumption time series, of residential and business customers at fine scale time grids are now deployed by electricity companies all around the world. It may be complex and costly to transmit and exploit such a large quantity of information; it can therefore be relevant to use survey sampling techniques to estimate mean load curves of specific groups of customers. Data collection, like every mass process, may undergo technical problems at every point of the metering and collection chain, resulting in missing values. We consider imputation approaches (linear interpolation, kernel smoothing, nearest neighbours, principal analysis by conditional estimation) that take advantage of the specificities of the data, that is to say the strong relation between the consumption at different instants of time. The performances of these techniques are compared on a real example of Irish electricity load curves under various scenarios of missing data. A general variance approximation of total estimators is also given which encompasses nearest neighbours, kernel smoothers imputation and linear imputation methods. The Canadian Journal of Statistics 47: 65–89; 2019 © 2018 Statistical Society of Canada
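
Of the imputation approaches listed in this abstract, linear interpolation is the simplest to sketch. The function below is a minimal illustration, not the authors' implementation: missing readings are assumed to be marked with `None`, and gaps at either end of the curve copy the nearest observed value, which is an assumption made here for completeness.

```python
def impute_linear(curve):
    """Fill None entries by linear interpolation between the nearest
    observed neighbours; gaps at either end copy the nearest observed value."""
    observed = [i for i, v in enumerate(curve) if v is not None]
    if not observed:
        raise ValueError("curve has no observed values")
    filled = list(curve)
    for i, v in enumerate(curve):
        if v is not None:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = curve[right]      # leading gap
        elif right is None:
            filled[i] = curve[left]       # trailing gap
        else:
            w = (i - left) / (right - left)
            filled[i] = (1 - w) * curve[left] + w * curve[right]
    return filled
```

For example, `impute_linear([None, 1.0, None, None, 4.0, None])` fills the interior gap along the straight line from 1.0 to 4.0 and copies the end values outward.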

Probabilistic matching of records is widely used to create linked data sets for use in health science, epidemiological, economic, demographic and sociological research. Clearly, this type of matching can lead to linkage errors, which in turn can lead to bias and increased variability when standard statistical estimation techniques are used with the linked data. In this paper we develop unbiased regression parameter estimates to be used when fitting a linear model with nested errors to probabilistically linked data. Since estimation of variance components is typically an important objective when fitting such a model, we also develop appropriate modifications to standard methods of variance components estimation in order to account for linkage error. In particular, we focus on three widely used methods of variance components estimation: analysis of variance, maximum likelihood and restricted maximum likelihood. Simulation results show that our estimators perform reasonably well when compared to standard estimation methods that ignore linkage errors.

Recognizing that the efficiency in relative risk estimation for the Cox proportional hazards model is largely constrained by the total number of cases, Prentice (1986) proposed the case-cohort design in which covariates are measured on all cases and on a random sample of the cohort. Subsequent to Prentice, other methods of estimation and sampling have been proposed for these designs. We formalize an approach to variance estimation suggested by Barlow (1994), and derive a robust variance estimator based on the influence function. We consider the applicability of the variance estimator to all the proposed case-cohort estimators, and derive the influence function when known sampling probabilities in the estimators are replaced by observed sampling fractions. We discuss the modifications required when cases are missing covariate information. The missingness may occur by chance, and be completely at random; or may occur as part of the sampling design, and depend upon other observed covariates. We provide an adaptation of S-plus code that allows estimating influence function variances in the presence of such missing covariates. Using examples from our current case-cohort studies on esophageal and gastric cancer, we illustrate how our results are useful in solving design and analytic issues that arise in practice.

Imputation is often used in surveys to treat item nonresponse. It is well known that treating the imputed values as observed values may lead to substantial underestimation of the variance of the point estimators. To overcome the problem, a number of variance estimation methods have been proposed in the literature, including resampling methods such as the jackknife and the bootstrap. In this paper, we consider the problem of doubly robust inference in the presence of imputed survey data. In the doubly robust literature, point estimation has been the main focus. In this paper, using the reverse framework for variance estimation, we derive doubly robust linearization variance estimators in the case of deterministic and random regression imputation within imputation classes. Also, we study the properties of several jackknife variance estimators under both negligible and nonnegligible sampling fractions. A limited simulation study investigates the performance of various variance estimators in terms of relative bias and relative stability. Finally, the asymptotic normality of imputed estimators is established for stratified multistage designs under both deterministic and random regression imputation. The Canadian Journal of Statistics 40: 259–281; 2012 © 2012 Statistical Society of Canada

Empirical Bayes (EB) estimates in general linear mixed models are useful for small area estimation because they increase the precision of the estimates of small area means. However, one potential difficulty of EB is that the overall estimate for a larger geographical area based on a (weighted) sum of EB estimates is not necessarily identical to the corresponding direct estimate such as the overall sample mean. Another difficulty is that EB estimates yield over-shrinking, which results in a sampling variance smaller than the posterior variance. One way to fix these problems is the benchmarking approach based on the constrained empirical Bayes (CEB) estimators, which satisfy the constraints that the aggregated mean and variance are identical to the requested values of mean and variance. In this paper, we treat the general mixed models, derive asymptotic approximations of the mean squared error (MSE) of CEB and provide second-order unbiased estimators of MSE based on the parametric bootstrap method. These results are applied to natural exponential families with quadratic variance functions. As a specific example, the Poisson-gamma model is dealt with, and real mortality data illustrate that the CEB estimates and their MSE estimates work well.

The paper exemplifies with Hsu’s model a general pattern for deriving results on variance component estimation from well-known results on mean estimation, as far as linear model theory is concerned. This ‘dispersion-mean-correspondence’ provides new and short proofs for various theorems from the literature, concerning unbiased invariant quadratic estimators with minimum Bayes risk or minimum variance. For pure variance component models, unbiased non-negative quadratic estimability is characterized in terms of the design matrices.

We propose a heterogeneous time-varying panel data model with a latent group structure that allows the coefficients to vary over both individuals and time. We assume that the coefficients change smoothly over time and form different unobserved groups. When treated as smooth functions of time, the individual functional coefficients are heterogeneous across groups but homogeneous within a group. We propose a penalized-sieve-estimation-based classifier-Lasso (C-Lasso) procedure to identify the individuals’ membership and to estimate the group-specific functional coefficients in a single step. The classification exhibits the desirable property of uniform consistency. The C-Lasso estimators and their post-Lasso versions achieve the oracle property so that the group-specific functional coefficients can be estimated as well as if the individuals’ membership were known. Several extensions are discussed. Simulations demonstrate excellent finite sample performance of the approach in both classification and estimation. We apply our method to study the heterogeneous trending behavior of GDP per capita across 91 countries for the period 1960–2012 and find four latent groups.

We derive the variance constant of continuous-time level dependent quasi-birth-and-death processes by investigating the expected integral functionals of the first return times. As an application, we consider the variance constant for the M/M/c retrial queue with non-persistent customers. For this model, analytical expressions and numerical results are obtained for the cases of single server and multiple servers, respectively. We also apply the obtained result to test the M/M/c vacation model for airport security pre-board screening checkpoint services by constructing a confidence interval for the mean queue length.

It is often critical to accurately model the upper tail behaviour of a random process. Nonparametric density estimation methods are commonly implemented as exploratory data analysis techniques for this purpose and can avoid model specification biases implied by using parametric estimators. In particular, kernel-based estimators place minimal assumptions on the data, and provide improved visualisation over scatterplots and histograms. However, kernel density estimators can perform poorly when estimating tail behaviour above a threshold, and can over-emphasise bumps in the density for heavy tailed data. We develop a transformation kernel density estimator which is able to handle heavy tailed and bounded data, and is robust to threshold choice. We derive closed form expressions for its asymptotic bias and variance, which demonstrate its good performance in the tail region. Finite sample performance is illustrated in numerical studies, and in an expanded analysis of the performance of global climate models.

Due to the escalating growth of big data sets in recent years, new Bayesian Markov chain Monte Carlo (MCMC) parallel computing methods have been developed. These methods partition large data sets by observations into subsets. However, for Bayesian nested hierarchical models, typically only a few parameters are common for the full data set, with most parameters being group specific. Thus, parallel Bayesian MCMC methods that take into account the structure of the model and split the full data set by groups rather than by observations are a more natural approach for analysis. Here, we adapt and extend a recently introduced two-stage Bayesian hierarchical modeling approach, and we partition complete data sets by groups. In stage 1, the group-specific parameters are estimated independently in parallel. The stage 1 posteriors are used as proposal distributions in stage 2, where the target distribution is the full model. Using three-level and four-level models, we show in both simulation and real data studies that results of our method agree closely with the full data analysis, with greatly increased MCMC efficiency and greatly reduced computation times. The advantages of our method versus existing parallel MCMC computing methods are also described.

Quality adjusted survival is increasingly advocated in clinical trials as a synthesis of survival and quality of life. We investigate nonparametric estimation of its expectation for a general multistate process with incomplete follow-up data. Upon establishing a representation of expected quality adjusted survival through marginal distributions of a set of defined events, we propose two estimators for expected quality adjusted survival. Expressed as functions of Nelson-Aalen estimators, the two estimators are strongly consistent and asymptotically normal. We derive their asymptotic variances and propose sample-based variance estimates, along with evaluation of asymptotic relative efficiency. Monte Carlo studies show that these estimation procedures perform well for practical sample sizes. We illustrate the methods using data from a national, multicenter AIDS clinical trial.
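
The estimators in this abstract are built from Nelson-Aalen estimators. As a reminder of that building block only (this is not the quality adjusted survival estimator itself), here is a minimal sketch of the Nelson-Aalen cumulative hazard for right-censored data. The function name and the input convention (an event indicator of 1 for an observed event, 0 for censoring) are assumptions made for illustration.

```python
def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time, where
    H(t) = sum over event times s <= t of d_s / n_s,
    d_s = number of events at s, n_s = number still at risk just before s."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    cum_hazard, steps, i = 0.0, [], 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        # Group all subjects sharing the same observed time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            cum_hazard += deaths / at_risk
            steps.append((t, cum_hazard))
        at_risk -= leaving   # events and censored subjects both leave the risk set
    return steps
```

Censored subjects contribute to the risk set up to their censoring time but never add a jump, which is how incomplete follow-up enters the estimate.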

We consider estimation of the tail index parameter from i.i.d. observations in Pareto and Weibull type models, using a local and asymptotic approach. The slowly varying function describing the non-tail behavior of the distribution is considered as an infinite dimensional nuisance parameter. Without further regularity conditions, we derive a local asymptotic normality (LAN) result for suitably chosen parametric submodels of the full semiparametric model. From this result, we immediately obtain the optimal rate of convergence of tail index parameter estimators for more specific models previously studied. On top of the optimal rate of convergence, our LAN result also gives the minimal limiting variance of estimators (regular for our parametric model) through the convolution theorem. We show that the classical Hill estimator is regular for the submodels introduced with limiting variance equal to the induced convolution theorem bound. We also discuss the Weibull model in this respect.
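
For readers unfamiliar with the classical Hill estimator mentioned in this abstract, a minimal sketch follows. It uses the textbook form based on the k largest observations, which estimates the extreme value index (the reciprocal of the Pareto tail exponent); the function name and argument names are illustrative, and the choice of k is left to the caller.

```python
import math


def hill_estimator(sample, k):
    """Hill estimate from the k largest observations:
    (1/k) * sum_{i=1..k} [log X_(i) - log X_(k+1)],
    where X_(1) >= X_(2) >= ... are the descending order statistics."""
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample, reverse=True)
    if ordered[k] <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    threshold = math.log(ordered[k])          # log X_(k+1)
    return sum(math.log(x) - threshold for x in ordered[:k]) / k
```

Only the upper order statistics enter the estimate, which is consistent with treating the rest of the distribution as a nuisance component.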

We study the use of a Scheffé-style simultaneous confidence band as applied to low-dose risk estimation with quantal response data. We consider two formulations for the dose-response risk function, an Abbott-adjusted Weibull model and an Abbott-adjusted log-logistic model. Using the simultaneous construction, we derive methods for estimating upper confidence limits on predicted extra risk and, by inverting the upper bands on risk, lower bounds on the benchmark dose, or BMD, at a specific level of ‘benchmark risk’. Monte Carlo evaluations explore the operating characteristics of the simultaneous limits.

A flexible semi-parametric regression model is proposed for modelling the relationship between a response and multivariate predictor variables. The proposed multiple-index model includes smooth unknown link and variance functions that are estimated non-parametrically. Data-adaptive methods for automatic smoothing parameter selection and for the choice of the number of indices M are considered. This model adapts to complex data structures and provides efficient adaptive estimation through the variance function component in the sense that the asymptotic distribution is the same as if the non-parametric components were known. We develop iterative estimation schemes, which include a constrained projection method for the case where the regression parameter vectors are mutually orthogonal. The proposed methods are illustrated with the analysis of data from a growth bioassay and a reproduction experiment with medflies. Asymptotic properties of the estimated model components are also obtained.

This paper concerns a method of estimation of variance components in a random effect linear model. It is mainly a resampling method and relies on the Jackknife principle. The derived estimators are presented as least squares estimators in an appropriate linear model, and one of them appears as a MINQUE (Minimum Norm Quadratic Unbiased Estimation) estimator. Our resampling method is illustrated by an example given by C. R. Rao [7] and some optimal properties of our estimator are derived for this example. In the last part, this method is used to derive an estimator of variance components in a random effect linear model when one of the components is assumed to be known.
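
The Jackknife principle this abstract relies on can be illustrated with the generic delete-one jackknife variance estimate. The sketch below is not the paper's variance component estimator; it only shows the resampling idea for an arbitrary statistic, and the function and argument names are illustrative.

```python
def jackknife_variance(data, statistic):
    """Delete-one jackknife estimate of the variance of statistic(data).

    data: a list of observations; statistic: a function of a list."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    # Recompute the statistic with each observation left out in turn.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    return (n - 1) / n * sum((v - centre) ** 2 for v in leave_one_out)
```

When the statistic is the sample mean, this reproduces the familiar s^2 / n, which gives a quick sanity check before applying the idea to less tractable statistics.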

The literature on the sampling properties of the inequality restricted and pre-test estimators typically assumes a properly specified model and focuses on the estimation of the regression coefficient vector. In this paper, we derive and evaluate the risk functions of these estimators for both the prediction vector and the disturbance variance in a model which is mis-specified through the exclusion of relevant regressors. The results suggest that unrestricted estimation is generally preferable to pre-testing or naively imposing restrictions.

Very often, in psychometric research, as in educational assessment, it is necessary to analyze item responses from clustered respondents. The multiple group item response theory (IRT) model proposed by Bock and Zimowski [12] provides a useful framework for analyzing this type of data. In this model, the selected groups of respondents are of specific interest such that group-specific population distributions need to be defined. The usual assumption for parameter estimation in this model, which is that the latent traits are random variables following different symmetric normal distributions, has been questioned in many works found in the IRT literature. Furthermore, when this assumption does not hold, misleading inference can result. In this paper, we consider that the latent traits for each group follow different skew-normal distributions, under the centered parameterization. We call this the skew multiple group IRT model. This modeling extends the works of Azevedo et al. [4], Bazán et al. [11] and Bock and Zimowski [12] (concerning the latent trait distribution). Our approach ensures that the model is identifiable. We propose and compare, concerning convergence issues, two Markov chain Monte Carlo (MCMC) algorithms for parameter estimation. A simulation study was performed in order to evaluate parameter recovery for the proposed model and the selected algorithm concerning convergence issues. Results reveal that the proposed algorithm properly recovers all model parameters. Furthermore, we analyzed a real data set which presents asymmetry concerning the latent traits distribution. The results obtained by using our approach confirmed the presence of negative asymmetry for some latent trait distributions.
