Similar Articles
A total of 20 similar articles were found.
1.
Generalized linear mixed models are widely used for describing overdispersed and correlated data. Such data arise frequently in studies involving clustered and hierarchical designs. A more flexible class of models has been developed here through the Dirichlet process mixture. An additional advantage of using such mixture models is that the observations can be grouped together on the basis of the overdispersion present in the data. This paper proposes a partial empirical Bayes method for estimating all the model parameters by adopting a version of the EM algorithm. An augmented model, which supports an efficient Gibbs sampling scheme under the non‐conjugate Dirichlet process generalized linear model, generates observations from the conditional predictive distribution of unobserved random effects and provides an estimate of the average number of mixing components in the Dirichlet process mixture. A simulation study has been carried out to demonstrate the consistency of the proposed method. The approach is also applied to a study on outdoor bacteria concentration in the air and to data from 14 retrospective lung‐cancer studies.
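The grouping behaviour that the Dirichlet process induces can be illustrated with a draw from the Chinese restaurant process, the partition distribution underlying the mixture. A minimal sketch, illustrative only and not the paper's EM/Gibbs estimation procedure; the concentration parameter `alpha` and the sample size are hypothetical:

```python
import numpy as np

def crp_partition(n, alpha, seed=None):
    """Draw a random partition of n observations from the Chinese
    restaurant process with concentration parameter alpha."""
    rng = np.random.default_rng(seed)
    labels = np.zeros(n, dtype=int)
    counts = [1]                            # first observation opens table 0
    for i in range(1, n):
        # join existing table k w.p. counts[k]/(i+alpha),
        # open a new table w.p. alpha/(i+alpha)
        probs = np.array(counts + [alpha]) / (i + alpha)
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels[i] = k
    return labels

labels = crp_partition(200, alpha=1.0, seed=42)
print("number of mixture components:", labels.max() + 1)
```

Larger `alpha` produces more components on average, which is the quantity the augmented Gibbs scheme in the paper tracks.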

2.
3.
Time‐varying coefficient models are widely used in longitudinal data analysis. These models allow the effects of predictors on the response to vary over time. In this article, we consider a mixed‐effects time‐varying coefficient model to account for the within‐subject correlation in longitudinal data. We show that when kernel smoothing is used to estimate the smooth functions in time‐varying coefficient models for sparse or dense longitudinal data, the asymptotic results in these two situations are essentially different. Therefore, a subjective choice between the sparse and dense cases might lead to erroneous conclusions for statistical inference. In order to solve this problem, we establish a unified self‐normalized central limit theorem, based on which a unified inference is proposed without deciding whether the data are sparse or dense. The effectiveness of the proposed unified inference is demonstrated through a simulation study and an analysis of the Baltimore MACS data.
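To make the kernel-smoothing step concrete, here is a minimal Nadaraya–Watson estimate of a smooth time effect pooled across subjects. The toy data, the random-effect standard deviation and the bandwidth `h` are invented for illustration, and the sketch sidesteps the sparse-versus-dense inference issue the paper addresses:

```python
import numpy as np

def nw_estimate(t_grid, t_obs, y_obs, h):
    """Gaussian-kernel Nadaraya-Watson estimate of E[Y | t] on t_grid,
    pooling all (t_ij, y_ij) pairs across subjects."""
    K = np.exp(-0.5 * ((t_grid[:, None] - t_obs[None, :]) / h) ** 2)
    return K @ y_obs / K.sum(axis=1)

# toy sparse longitudinal data: 50 subjects with 3-6 visits each
rng = np.random.default_rng(0)
t_obs, y_obs = [], []
for _ in range(50):
    t = np.sort(rng.uniform(0, 1, rng.integers(3, 7)))
    b = rng.normal(0, 0.3)                  # subject-level random effect
    y = np.sin(2 * np.pi * t) + b + rng.normal(0, 0.2, t.size)
    t_obs.append(t); y_obs.append(y)
t_obs, y_obs = np.concatenate(t_obs), np.concatenate(y_obs)

grid = np.linspace(0, 1, 101)
fhat = nw_estimate(grid, t_obs, y_obs, h=0.1)
```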

4.
Propensity score methods are increasingly used in medical literature to estimate treatment effect using data from observational studies. Despite many papers on propensity score analysis, few have focused on the analysis of survival data. Even within the framework of the popular proportional hazard model, the choice among marginal, stratified or adjusted models remains unclear. A Monte Carlo simulation study was used to compare the performance of several survival models to estimate both marginal and conditional treatment effects. The impact of accounting or not for pairing when analysing propensity‐score‐matched survival data was assessed. In addition, the influence of unmeasured confounders was investigated. After matching on the propensity score, both marginal and conditional treatment effects could be reliably estimated. Ignoring the paired structure of the data led to an increased test size due to an overestimated variance of the treatment effect. Among the various survival models considered, stratified models systematically showed poorer performance. Omitting a covariate in the propensity score model led to a biased estimation of treatment effect, but replacement of the unmeasured confounder by a correlated one allowed a marked decrease in this bias. Our study showed that propensity scores applied to survival data can lead to unbiased estimation of both marginal and conditional treatment effect, when marginal and adjusted Cox models are used. In all cases, it is necessary to account for pairing when analysing propensity‐score‐matched data, using a robust estimator of the variance.
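As a sketch of the matching step only, the following estimates propensity scores by logistic regression and performs greedy 1:1 nearest-neighbour matching on the logit scale; the 0.2-standard-deviation caliper is a common convention, not the paper's prescription. After matching, a marginal Cox model would be fitted with a variance estimator robust to the pairing, clustering on the pair identifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ps_match(X, treated, caliper=0.2):
    """Greedy 1:1 nearest-neighbour matching on the logit of the
    propensity score, without replacement."""
    ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]
    logit = np.log(ps / (1 - ps))
    cal = caliper * logit.std()              # a common caliper convention
    controls = list(np.flatnonzero(treated == 0))
    pairs = []
    for i in np.flatnonzero(treated == 1):
        if not controls:
            break
        d = np.abs(logit[controls] - logit[i])
        j = int(np.argmin(d))
        if d[j] <= cal:
            pairs.append((i, controls.pop(j)))
    return pairs     # each pair defines a cluster for the robust variance
```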

5.
The authors consider a class of state space models for the analysis of non‐normal longitudinal data whose latent process follows a stationary AR(1) model with exponential dispersion model margins. They propose to estimate parameters through an estimating equation approach based on the Kalman smoother. This allows them to carry out a straightforward analysis of a wide range of non‐normal data. They illustrate their approach via a simulation study and through analyses of Brazilian precipitation and US polio infection data.
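For the Gaussian special case, the Kalman filter and Rauch–Tung–Striebel smoother underlying such estimating equations can be written compactly. This sketch assumes a scalar AR(1) state observed with additive Gaussian noise, whereas the paper works with exponential dispersion model margins; `m0` and `p0` are the prior mean and variance of the pre-sample state:

```python
import numpy as np

def kalman_rts(y, phi, q, r, m0=0.0, p0=1.0):
    """Kalman filter plus Rauch-Tung-Striebel smoother for
    x_t = phi * x_{t-1} + w_t,  y_t = x_t + v_t,
    with Var(w) = q and Var(v) = r. Returns smoothed means/variances."""
    n = len(y)
    mf, pf = np.zeros(n), np.zeros(n)        # filtered moments
    mp, pp = np.zeros(n), np.zeros(n)        # one-step predictions
    m, p = m0, p0
    for t in range(n):
        mp[t], pp[t] = phi * m, phi**2 * p + q       # predict
        k = pp[t] / (pp[t] + r)                      # Kalman gain
        m = mp[t] + k * (y[t] - mp[t])               # update
        p = (1 - k) * pp[t]
        mf[t], pf[t] = m, p
    ms, ps = mf.copy(), pf.copy()                    # backward pass
    for t in range(n - 2, -1, -1):
        c = phi * pf[t] / pp[t + 1]
        ms[t] = mf[t] + c * (ms[t + 1] - mp[t + 1])
        ps[t] = pf[t] + c**2 * (ps[t + 1] - pp[t + 1])
    return ms, ps
```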

6.
To increase the predictive abilities of several plasma biomarkers for coronary artery disease (CAD)‐related vital status over time, our research interest mainly focuses on seeking combinations of these biomarkers with the highest time‐dependent receiver operating characteristic curves. An extended generalized linear model (EGLM) with time‐varying coefficients and an unknown bivariate link function is used to characterize the conditional distribution of time to CAD‐related death. Based on censored survival data, two non‐parametric procedures are proposed to estimate the optimal composite markers, the linear predictors in the EGLM model. Estimation methods for the classification accuracies of the optimal composite markers are also proposed. We establish theoretical results for the estimators and examine the corresponding finite‐sample properties through a series of simulations with different sample sizes, censoring rates and censoring mechanisms. Our optimization procedures and estimators are further shown to be useful through an application to a prospective cohort study of patients undergoing angiography.

7.
In cost‐effectiveness analyses of drugs or health technologies, estimates of life years saved or quality‐adjusted life years saved are required. Randomised controlled trials can provide an estimate of the average treatment effect; for survival data, the treatment effect is the difference in mean survival. However, typically not all patients will have reached the endpoint of interest at the close‐out of a trial, making it difficult to estimate the difference in mean survival. In this situation, it is common to report the more readily estimable difference in median survival. Alternative approaches to estimating the mean have also been proposed. We conducted a simulation study to investigate the bias and precision of the three most commonly used sample measures of absolute survival gain – difference in median, restricted mean and extended mean survival – when used as estimates of the true mean difference, under different censoring proportions, while assuming a range of survival patterns, represented by Weibull survival distributions with constant, increasing and decreasing hazards. Our study showed that the three commonly used methods tended to underestimate the true treatment effect; consequently, the incremental cost‐effectiveness ratio (ICER) would be overestimated. Of the three methods, the least biased is the extended mean survival, which perhaps should be used as the point estimate of the treatment effect to be inputted into the ICER, while the other two approaches could be used in sensitivity analyses. More work on the trade‐offs between simple extrapolation using the exponential distribution and more complicated extrapolation using other methods would be valuable.
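A minimal sketch of the restricted-mean measure: the Kaplan–Meier restricted mean survival time is the area under the estimated survival curve up to a truncation point `tau`, and the treatment effect is the difference in this quantity between arms. The function name and the truncation point are illustrative:

```python
import numpy as np

def km_rmst(time, event, tau):
    """Restricted mean survival time: the area under the Kaplan-Meier
    curve from 0 up to the truncation point tau."""
    order = np.argsort(time)
    time, event = np.asarray(time)[order], np.asarray(event)[order]
    at_risk = len(time) - np.arange(len(time))
    s, prev_t, area = 1.0, 0.0, 0.0
    for t, d, r in zip(time, event, at_risk):
        if t > tau:
            break
        area += s * (t - prev_t)       # rectangle under the current step
        if d:                           # an observed event drops the curve
            s *= 1.0 - 1.0 / r
        prev_t = t
    return area + s * (tau - prev_t)    # final piece up to tau

# difference in RMST between arms (hypothetical arrays, tau in months):
# km_rmst(t_trt, e_trt, tau=36) - km_rmst(t_ctl, e_ctl, tau=36)
```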

8.
During drug development, the calculation of the inhibitory concentration that results in a response of 50% (IC50) is performed thousands of times every day. The nonlinear model most often used to perform this calculation is a four‐parameter logistic, suitably parameterized to estimate the IC50 directly. When performing these calculations in a high‐throughput mode, not every curve can be studied in detail, and outliers in the responses are a common problem. A robust estimation procedure to perform this calculation is desirable. In this paper, a rank‐based estimate of the four‐parameter logistic model that is analogous to least squares is proposed. The rank‐based estimate is based on the Wilcoxon norm. The robust procedure is illustrated with several examples from the pharmaceutical industry. When no outliers are present in the data, the robust estimate of IC50 is comparable with the least squares estimate, and when outliers are present in the data, the robust estimate is more accurate. A robust goodness‐of‐fit test is also proposed. To investigate the impact of outliers on the traditional and robust estimates, a small simulation study was conducted.
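The flavour of the rank-based fit can be sketched with Jaeckel's dispersion function under Wilcoxon scores, minimized in place of the residual sum of squares. The 4PL parameterization, toy data, outlier and starting values below are assumptions for illustration, not the authors' exact implementation:

```python
import numpy as np
from scipy.optimize import curve_fit, minimize
from scipy.stats import rankdata

def fourpl(x, bottom, top, log_ic50, hill):
    """Four-parameter logistic, parameterized in log(IC50)."""
    return bottom + (top - bottom) / (1 + np.exp(hill * (np.log(x) - log_ic50)))

def wilcoxon_dispersion(theta, x, y):
    """Jaeckel's rank-based dispersion with Wilcoxon scores,
    sum_i a(R(e_i)) * e_i, minimized in theta instead of sum e_i^2."""
    e = y - fourpl(x, *theta)
    a = np.sqrt(12) * (rankdata(e) / (len(e) + 1) - 0.5)
    return np.sum(a * e)

rng = np.random.default_rng(1)
x = np.geomspace(1e-3, 1e2, 12)                        # concentrations
y = fourpl(x, 0.05, 1.0, np.log(0.5), 1.2) + rng.normal(0, 0.02, x.size)
y[3] += 0.5                                            # inject one outlier

p0 = [y.min(), y.max(), np.log(np.median(x)), 1.0]
ls, _ = curve_fit(fourpl, x, y, p0=p0, maxfev=10000)   # least squares
rb = minimize(wilcoxon_dispersion, ls, args=(x, y), method="Nelder-Mead").x
print("LS IC50:", np.exp(ls[2]), " robust IC50:", np.exp(rb[2]))
```

With the injected outlier, the rank-based IC50 stays close to the true value of 0.5 while the least squares estimate is pulled away, mirroring the paper's finding.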

9.
Longitudinal data often contain missing observations, and it is in general difficult to justify particular missing data mechanisms, whether random or not, that may be hard to distinguish. The authors describe a likelihood‐based approach to estimating both the mean response and association parameters for longitudinal binary data with drop‐outs. They specify marginal and dependence structures as regression models which link the responses to the covariates. They illustrate their approach using a data set from the Waterloo Smoking Prevention Project. They also report the results of simulation studies carried out to assess the performance of their technique under various circumstances.

10.
Small area estimation has received considerable attention in recent years because of growing demand for small area statistics. Basic area‐level and unit‐level models have been studied in the literature to obtain empirical best linear unbiased prediction (EBLUP) estimators of small area means. Although this classical method is useful for estimating the small area means efficiently under normality assumptions, it can be highly influenced by the presence of outliers in the data. In this article, the authors investigate the robustness properties of the classical estimators and propose a resistant method for small area estimation, which is useful for downweighting any influential observations in the data when estimating the model parameters. To estimate the mean squared errors of the robust estimators of small area means, a parametric bootstrap method is adopted here, which is applicable to models with block diagonal covariance structures. Simulations are carried out to study the behaviour of the proposed robust estimators in the presence of outliers, and these estimators are also compared to the EBLUP estimators. Performance of the bootstrap mean squared error estimator is also investigated in the simulation study. The proposed robust method is also applied to some real data to estimate crop areas for counties in Iowa, using farm‐interview data on crop areas and LANDSAT satellite data as auxiliary information. The Canadian Journal of Statistics 37: 381–399; 2009.
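For context, the classical area-level EBLUP that the robust proposal guards against outliers can be written in a few lines. This sketch assumes the Fay–Herriot model with known sampling variances and uses the Prasad–Rao moment estimator of the random-effect variance, one standard (non-robust) choice; the toy data are invented:

```python
import numpy as np

def fay_herriot_eblup(y, X, D):
    """EBLUP of small-area means under the area-level model
    y_i = x_i' beta + v_i + e_i, with known sampling variances D_i.
    sigma_v^2 is estimated by the Prasad-Rao moment estimator."""
    m, p = X.shape
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_ols
    H = X @ np.linalg.inv(X.T @ X) @ X.T              # hat matrix
    sig2v = max(0.0, (resid @ resid - np.sum(D * (1 - np.diag(H)))) / (m - p))
    V = sig2v + D                                     # total variances
    W = np.diag(1 / V)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)  # GLS estimate
    gamma = sig2v / V                                 # shrinkage weights
    return gamma * y + (1 - gamma) * (X @ beta)

rng = np.random.default_rng(3)
m = 30
X = np.column_stack([np.ones(m), rng.normal(size=m)])
D = rng.uniform(0.5, 2.0, m)                # known sampling variances
theta = X @ np.array([1.0, 2.0]) + rng.normal(0, 1.0, m)
y = theta + rng.normal(0, np.sqrt(D))
print(fay_herriot_eblup(y, X, D)[:5])
```

Because `gamma` enters every prediction through the globally estimated `sig2v`, a single outlying area can distort all small-area estimates, which motivates the article's downweighting approach.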

11.
Estimating higher‐order moments, particularly fourth‐order moments, in linear mixed models is an important but difficult issue. In this article, an orthogonality‐based estimation of moments is proposed. Under moment conditions alone, this method can easily be used to estimate the model parameters and moments, particularly those of order higher than two, and in the resulting estimators the random effects and errors do not affect each other. The asymptotic normality of all the estimators is provided. Moreover, the method is readily extended to handle non‐linear and semiparametric models. A simulation study is carried out to examine the performance of the new method.

12.
The process comparing the empirical cumulative distribution function of the sample with a parametric estimate of the cumulative distribution function is known as the empirical process with estimated parameters and has been extensively employed in the literature for goodness‐of‐fit testing. The simplest way to carry out such goodness‐of‐fit tests, especially in a multivariate setting, is to use a parametric bootstrap. Although very easy to implement, the parametric bootstrap can become very computationally expensive as the sample size, the number of parameters, or the dimension of the data increase. An alternative resampling technique based on a fast weighted bootstrap is proposed in this paper, and is studied both theoretically and empirically. The outcome of this work is a generic and computationally efficient multiplier goodness‐of‐fit procedure that can be used as a large‐sample alternative to the parametric bootstrap. In order to approximately determine how large the sample size needs to be for the parametric and weighted bootstraps to have roughly equivalent powers, extensive Monte Carlo experiments are carried out in dimensions one, two and three, and for models containing up to nine parameters. The computational gains resulting from the use of the proposed multiplier goodness‐of‐fit procedure are illustrated on trivariate financial data. A by‐product of this work is a fast large‐sample goodness‐of‐fit procedure for the bivariate and trivariate t distribution whose degrees of freedom are fixed. The Canadian Journal of Statistics 40: 480–500; 2012.
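The baseline the multiplier procedure competes with is easy to state in code: refit the parameters on each synthetic sample and recompute the statistic. A minimal sketch for a Cramér–von Mises test of normality with estimated parameters, in the univariate case; the distribution, statistic and replication count are illustrative choices, and the nested refitting is exactly what makes this expensive:

```python
import numpy as np
from scipy import stats

def param_bootstrap_gof(x, n_boot=1000, seed=None):
    """Parametric-bootstrap Cramer-von Mises test of normality with
    estimated parameters: refit (mu, sd) on every synthetic sample."""
    rng = np.random.default_rng(seed)

    def cvm_stat(z):
        mu, sd = z.mean(), z.std(ddof=1)
        u = np.sort(stats.norm.cdf(z, mu, sd))
        n = len(z)
        i = np.arange(1, n + 1)
        return 1 / (12 * n) + np.sum((u - (2 * i - 1) / (2 * n)) ** 2)

    t_obs = cvm_stat(x)
    mu, sd = x.mean(), x.std(ddof=1)
    t_boot = np.array([cvm_stat(rng.normal(mu, sd, len(x)))
                       for _ in range(n_boot)])
    return t_obs, np.mean(t_boot >= t_obs)   # statistic and p-value

x = np.random.default_rng(5).standard_t(df=3, size=100)  # heavy tails
print(param_bootstrap_gof(x, seed=5))                    # small p-value expected
```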

13.
Kernel Density Estimation on a Linear Network
This paper develops a statistically principled approach to kernel density estimation on a network of lines, such as a road network. Existing heuristic techniques are reviewed, and their weaknesses are identified. The correct analogue of the Gaussian kernel is the ‘heat kernel’, the occupation density of Brownian motion on the network. The corresponding kernel estimator satisfies the classical time‐dependent heat equation on the network. This ‘diffusion estimator’ has good statistical properties that follow from the heat equation. It is mathematically similar to an existing heuristic technique, in that both can be expressed as sums over paths in the network. However, the diffusion estimate is an infinite sum, which cannot be evaluated using existing algorithms. Instead, the diffusion estimate can be computed rapidly by numerically solving the time‐dependent heat equation on the network. This also enables bandwidth selection using cross‐validation. The diffusion estimate with automatically selected bandwidth is demonstrated on road accident data.
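On a single segment, the simplest "network", the diffusion estimator amounts to placing point masses on a grid and numerically solving the heat equation: solving u_t = ½u_xx up to time σ² reproduces a Gaussian kernel of standard deviation σ on the line. A minimal explicit finite-difference sketch with reflecting ends; the grid size, time step and boundary handling are illustrative simplifications of the true network case:

```python
import numpy as np

def diffusion_kde(data, a, b, sigma, n_grid=400):
    """Diffusion density estimate on [a, b]: deposit point masses on a
    grid, then solve u_t = 0.5 * u_xx up to time sigma**2. Reflecting
    boundaries conserve total probability mass on the segment."""
    dx = (b - a) / n_grid
    u = np.zeros(n_grid + 1)
    idx = np.clip(np.round((data - a) / dx).astype(int), 0, n_grid)
    np.add.at(u, idx, 1 / (len(data) * dx))      # initial point masses
    dt = 0.4 * dx**2                             # stable explicit step
    for _ in range(int(sigma**2 / dt)):
        lap = np.empty_like(u)
        lap[1:-1] = u[2:] - 2 * u[1:-1] + u[:-2]
        lap[0] = 2 * (u[1] - u[0])               # reflecting left end
        lap[-1] = 2 * (u[-2] - u[-1])            # reflecting right end
        u = u + 0.5 * dt * lap / dx**2
    return np.linspace(a, b, n_grid + 1), u

grid, dens = diffusion_kde(np.array([0.2, 0.25, 0.7]), 0.0, 1.0, sigma=0.05)
print((grid[1] - grid[0]) * dens.sum())          # approx 1: mass conserved
```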

14.
The authors consider the estimation of the parametric component of a partially nonlinear semiparametric regression model whose nonparametric component is viewed as a nuisance parameter. They show how estimation can proceed through a nonlinear mixed‐effects model approach. They prove that under certain regularity conditions, the proposed estimate is consistent and asymptotically Gaussian. They investigate its finite‐sample properties through simulations and illustrate its use with data on the relation between the photosynthetically active radiation and the net ecosystem‐atmosphere exchange of carbon dioxide.

15.
A diverse range of non‐cardiovascular drugs are associated with QT interval prolongation, which may be associated with a potentially fatal ventricular arrhythmia known as torsade de pointes. QT interval has been assessed for two recent submissions at GlaxoSmithKline. Meta‐analyses of ECG data from several clinical pharmacology studies were conducted for the two submissions. A general fixed effects meta‐analysis approach using summaries of the individual studies was used to calculate a pooled estimate and 90% confidence interval for the difference between each active dose and placebo following both single and repeat dosing separately. The meta‐analysis approach described provided a pragmatic solution to pooling complex and varied studies, and is a good way of addressing regulatory questions on QTc prolongation.
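The fixed-effects pooling itself is a one-liner: weight each study's active-minus-placebo difference by its inverse variance. A sketch with hypothetical per-study QTc differences and standard errors; the numbers are invented, and the 90% level matches the interval reported for the submissions:

```python
import numpy as np
from scipy.stats import norm

def fixed_effect_meta(est, se, level=0.90):
    """Inverse-variance fixed-effect pooled estimate of a treatment
    minus placebo difference, with a two-sided confidence interval."""
    est, se = np.asarray(est), np.asarray(se)
    w = 1 / se**2
    pooled = np.sum(w * est) / np.sum(w)
    pooled_se = 1 / np.sqrt(np.sum(w))
    z = norm.ppf(0.5 + level / 2)
    return pooled, (pooled - z * pooled_se, pooled + z * pooled_se)

# hypothetical per-study QTc differences (ms) and standard errors
est = [2.1, 3.4, 1.2, 2.8]
se = [1.0, 1.5, 0.9, 1.2]
print(fixed_effect_meta(est, se))
```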

16.
This paper deals with a longitudinal semi‐parametric regression model in a generalised linear model setup for repeated count data collected from a large number of independent individuals. To accommodate the longitudinal correlations, we consider a dynamic model for repeated counts whose auto‐correlations decay as the time lag between the repeated responses increases. The semi‐parametric regression function involved in the model contains a specified regression function in some suitable time‐dependent covariates and a non‐parametric function in some other time‐dependent covariates. Because the non‐parametric function is of secondary interest, we estimate it consistently using the well‐known quasi‐likelihood approach under a working independence assumption. Next, the proposed longitudinal correlation structure and the estimate of the non‐parametric function are used to develop a semi‐parametric generalised quasi‐likelihood approach for consistent and efficient estimation of the regression effects in the parametric regression function. The finite‐sample performance of the proposed estimation approach is examined through an intensive simulation study based on both large and small samples, incorporating both balanced and unbalanced cluster sizes. The asymptotic properties of the estimators are also given. The estimation methodology is illustrated by reanalysing the well‐known health care utilisation data, consisting of counts of yearly visits to a physician by 180 individuals over four years, together with several important primary and secondary covariates.
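A standard dynamic count model with exactly this decaying-correlation property is the binomial-thinning Poisson AR(1), under which corr(y_t, y_{t+k}) = ρ^k. A simulation sketch; the dimensions echo the 180-subject, four-year application, but the parameter values are invented:

```python
import numpy as np

def simulate_count_ar1(mu, rho, n_subjects, n_times, seed=None):
    """Simulate repeated counts from a stationary Poisson AR(1)
    (binomial-thinning) model: Poisson(mu) marginals at every time,
    lag-k autocorrelation rho**k."""
    rng = np.random.default_rng(seed)
    y = np.empty((n_subjects, n_times), dtype=int)
    y[:, 0] = rng.poisson(mu, n_subjects)
    for t in range(1, n_times):
        survivors = rng.binomial(y[:, t - 1], rho)     # thinning step
        y[:, t] = survivors + rng.poisson(mu * (1 - rho), n_subjects)
    return y

y = simulate_count_ar1(mu=3.0, rho=0.6, n_subjects=180, n_times=4, seed=7)
print(np.corrcoef(y[:, 0], y[:, 1])[0, 1])   # approximately 0.6
```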

17.
Much of the small‐area estimation literature focuses on population totals and means. However, users of survey data are often interested in the finite‐population distribution of a survey variable and in the measures (e.g. medians, quartiles, percentiles) that characterize the shape of this distribution at the small‐area level. In this paper we propose a model‐based direct estimator (MBDE; Chandra and Chambers) of the small‐area distribution function. The MBDE is defined as a weighted sum of sample data from the area of interest, with weights derived from the calibrated spline‐based estimate of the finite‐population distribution function introduced by Harms and Duchesne, under an appropriately specified regression model with random area effects. We also discuss the mean squared error estimation of the MBDE. Monte Carlo simulations based on both simulated and real data sets show that the proposed MBDE and its associated mean squared error estimator perform well when compared with alternative estimators of the area‐specific finite‐population distribution function.
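The form of the estimator is simple: a weighted sum of sample indicator functions. A sketch of that final step, with hypothetical weights standing in for the calibrated spline-based weights of Harms and Duchesne; quantiles follow by inverting the estimated distribution function:

```python
import numpy as np

def weighted_cdf(y_sample, w, t_grid):
    """Weighted-sum estimate of a finite-population distribution
    function, F(t) = sum_i w_i 1(y_i <= t) / sum_i w_i."""
    w = np.asarray(w, dtype=float)
    ind = (y_sample[None, :] <= t_grid[:, None]).astype(float)
    return ind @ w / w.sum()

def cdf_quantile(t_grid, F, q):
    """Read a quantile (0 < q < 1) off the estimated CDF."""
    return t_grid[np.searchsorted(F, q)]

rng = np.random.default_rng(9)
y = rng.lognormal(size=50)
w = rng.uniform(0.5, 2.0, 50)        # hypothetical calibration weights
grid = np.linspace(0, y.max(), 200)
F = weighted_cdf(y, w, grid)
print("median estimate:", cdf_quantile(grid, F, 0.5))
```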

18.
We examine the relationships between electoral socio‐demographic characteristics and two‐party preferences in the six Australian federal elections held between 2001 and 2016. Socio‐demographic information is derived from the Australian Census, which is conducted every five years. Since a census is not directly available for each election, an imputation method is employed to estimate census data for the electorates at the time of each election. This accounts for both spatial and temporal changes in electoral characteristics between censuses. To capture any spatial heterogeneity, a spatial error model is estimated for each election, which incorporates a spatially structured random effect vector. Over time, the impact of most socio‐demographic characteristics that affect electoral two‐party preference does not vary, with age distribution, industry of work, incomes, household mobility and relationships having strong effects in each of the six elections. Education and unemployment are among those that have varying effects. All data featured in this study have been contributed to the eechidna R package (available on CRAN).

19.
Remote sensing of the earth with satellites yields datasets that can be massive in size, nonstationary in space, and non‐Gaussian in distribution. To overcome computational challenges, we use the reduced‐rank spatial random effects (SRE) model in a statistical analysis of cloud‐mask data from NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) instrument on board NASA's Terra satellite. Parameterisations of cloud processes are the biggest source of uncertainty and sensitivity in different climate models' future projections of Earth's climate. An accurate quantification of the spatial distribution of clouds, as well as a rigorously estimated pixel‐scale clear‐sky‐probability process, is needed to establish reliable estimates of cloud‐distributional changes and trends caused by climate change. Here we give a hierarchical spatial‐statistical modelling approach for a very large spatial dataset of 2.75 million pixels, corresponding to a granule of MODIS cloud‐mask data, and we use spatial change‐of‐support relationships to estimate cloud fraction at coarser resolutions. Our model is non‐Gaussian; it postulates a hidden process for the clear‐sky probability that makes use of the SRE model, EM‐estimation, and optimal (empirical Bayes) spatial prediction of the clear‐sky‐probability process. Measures of prediction uncertainty are also given.

20.
The Quermass‐interaction model generalizes the classical germ‐grain Boolean model by adding a morphological interaction between the grains. It makes it possible to model random structures with specific morphologies that are unlikely to be generated by a Boolean model. The Quermass‐interaction model depends in particular on an intensity parameter, which cannot be estimated by classical likelihood or pseudo‐likelihood approaches because the number of points is not observable from a germ‐grain set. In this paper, we present a procedure based on the Takacs–Fiksel method that is able to estimate all parameters of the Quermass‐interaction model, including the intensity. An intensive simulation study is conducted to assess the efficiency of the procedure and to provide practical recommendations. It also illustrates that the estimation of the intensity parameter is crucial for identifying the model. The Quermass‐interaction model is finally fitted by our method to P. Diggle's heather data set.

