Similar Articles
20 similar articles found.
1.
In this paper, a simulation study is conducted to systematically investigate the impact of dichotomizing longitudinal continuous outcome variables under various types of missing data mechanisms. Generalized linear models (GLM) with standard generalized estimating equations (GEE) are widely used for longitudinal outcome analysis, but these semi-parametric approaches are only valid when data are missing completely at random (MCAR). Alternatively, weighted GEE (WGEE) and multiple imputation GEE (MI-GEE) were developed to ensure validity under missing at random (MAR). Using a simulation study, the performance of standard GEE, WGEE and MI-GEE for the analysis of incomplete longitudinal dichotomized outcomes is evaluated. For comparison, likelihood-based linear mixed effects models (LMM) are used to analyse the incomplete longitudinal outcomes on their original continuous scale. Focusing on the dichotomized outcome analysis, MI-GEE with the imputation performed on the original continuous scale provides well controlled test sizes and more stable power estimates than any other GEE-based approach. It is also shown that dichotomizing a longitudinal continuous outcome results in a substantial loss of power compared with LMM. Copyright © 2009 John Wiley & Sons, Ltd.
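A minimal sketch of the contrast the abstract draws, assuming simulated data and the statsmodels package: a standard (unweighted) GEE fit to the dichotomized outcome next to a likelihood-based LMM fit to the original continuous outcome. All effect sizes, the random-intercept structure, and the dichotomization threshold are illustrative; the WGEE and MI-GEE variants studied in the paper are not shown.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n, t = 100, 4

# Simulate a longitudinal continuous outcome with a random intercept
# and a small treatment effect (all settings are illustrative).
subj = np.repeat(np.arange(n), t)
time = np.tile(np.arange(t), n)
trt = np.repeat(rng.integers(0, 2, n), t)
b = np.repeat(rng.normal(0, 1, n), t)          # random intercept
y = 0.5 * trt + 0.2 * time + b + rng.normal(0, 1, n * t)
df = pd.DataFrame({"subj": subj, "time": time, "trt": trt, "y": y})
df["y_bin"] = (df["y"] > 0).astype(int)        # dichotomize at a threshold

# Standard GEE on the dichotomized outcome (valid under MCAR only).
gee = sm.GEE.from_formula("y_bin ~ trt + time", groups="subj", data=df,
                          family=sm.families.Binomial(),
                          cov_struct=sm.cov_struct.Exchangeable()).fit()

# Likelihood-based LMM on the original continuous outcome (valid under MAR).
lmm = smf.mixedlm("y ~ trt + time", df, groups=df["subj"]).fit()

print(gee.summary())
print(lmm.summary())
```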

2.
Multiple assessments of an efficacy variable are often conducted prior to the initiation of randomized treatment in clinical trials as baseline information. Two goals are investigated in this article: the first is to investigate how these baselines should be chosen in the analysis of covariance (ANCOVA) to increase statistical power, and the second is to quantify the magnitude of power loss when a continuous efficacy variable is dichotomized into a categorical variable, as commonly reported in the biomedical literature. A statistical power analysis is developed with extensive simulations based on data from clinical trials in study participants with end stage renal disease (ESRD). It is found that the choice of baselines depends primarily on the correlations among the baselines and the efficacy variable, with substantial power gains for correlations greater than 0.6 and negligible gains for correlations less than 0.2. Continuous efficacy variables always give higher statistical power in the ANCOVA modeling, and dichotomizing the efficacy variable generally decreases statistical power by 25%, an important practical consideration in designing clinical trials with realistic sample sizes and budgets. These findings can easily be applied to, and extended for, other clinical trials with similar designs.
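The power comparison described here is easy to reproduce in miniature. The sketch below, with an invented effect size, baseline correlation, and cut-off, estimates by Monte Carlo the power of a continuous ANCOVA versus a logistic model on the dichotomized outcome; it is not the authors' ESRD simulation.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

def one_trial(n=100, effect=0.4, rho=0.6):
    # Baseline and outcome share correlation rho; treatment shifts the outcome.
    trt = rng.integers(0, 2, n)
    base = rng.normal(0, 1, n)
    y = effect * trt + rho * base + np.sqrt(1 - rho**2) * rng.normal(0, 1, n)
    X = sm.add_constant(np.column_stack([trt, base]))
    p_cont = sm.OLS(y, X).fit().pvalues[1]             # continuous ANCOVA
    y_bin = (y > 0).astype(int)                        # dichotomized outcome
    p_bin = sm.Logit(y_bin, X).fit(disp=0).pvalues[1]
    return p_cont < 0.05, p_bin < 0.05

res = np.array([one_trial() for _ in range(2000)])
print("power, continuous ANCOVA:", res[:, 0].mean())
print("power, dichotomized     :", res[:, 1].mean())
```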

3.
When the outcome of interest is semicontinuous and collected longitudinally, efficient testing can be difficult. Daily rainfall data are an excellent example, which we use to illustrate the various challenges. Even under the simplest scenario, the popular 'two-part model', which uses correlated random effects to account for both the semicontinuous and longitudinal characteristics of the data, often requires prohibitively intensive numerical integration and is difficult to interpret. Reducing the data to binary (truncating continuous positive values to equal one), while relatively straightforward, leads to a potentially substantial loss in power. We propose an alternative: a non-parametric rank test recently proposed for joint longitudinal-survival data. We investigate the potential benefits of such a test for the analysis of semicontinuous longitudinal data with regard to power and computational feasibility.
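A simplified illustration of the trade-off, assuming simulated semicontinuous rainfall: a rank test on subject-level totals (which retains both the zero process and the positive magnitudes) versus a test on a binary reduction. The group sizes, gamma parameters, and binary cut-off are invented, and the scipy tests used here stand in for the joint longitudinal-survival rank test the authors actually propose.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, t = 60, 30   # subjects per group, days

def rainfall(n, t, p_wet, scale):
    # Semicontinuous: exact zeros on dry days, gamma amounts on wet days.
    wet = rng.random((n, t)) < p_wet
    amt = rng.gamma(2.0, scale, (n, t))
    return wet * amt

g1 = rainfall(n, t, 0.30, 5.0)
g2 = rainfall(n, t, 0.35, 6.0)

# Rank-based test on subject-level totals keeps both the zero process
# and the positive magnitudes in play.
print(stats.mannwhitneyu(g1.sum(axis=1), g2.sum(axis=1)))

# Binary reduction (rain on more than half the days) discards magnitude.
b1 = (g1 > 0).mean(axis=1) > 0.5
b2 = (g2 > 0).mean(axis=1) > 0.5
table = [[b1.sum(), n - b1.sum()], [b2.sum(), n - b2.sum()]]
print(stats.fisher_exact(table))
```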

4.
This research was motivated by our goal to design an efficient clinical trial to compare two doses of docosahexaenoic acid supplementation for reducing the rates of earliest preterm births (ePTB) and/or preterm births (PTB). Dichotomizing continuous gestational age (GA) data using a classic binomial distribution results in a loss of information and reduced power. A distributional approach is an improved strategy that retains statistical power from the continuous distribution. However, appropriate distributions that fit the data properly, particularly in the tails, must be chosen, especially when the data are skewed. A recent study proposed a skew-normal method. We propose a three-component normal mixture model and introduce separate treatment effects at different components of GA. We evaluate the operating characteristics of the mixture, beta-binomial, and skew-normal models through simulation. We also apply these three methods to data from two completed clinical trials from the USA and Australia. Finite mixture models are shown to have favorable properties in PTB analysis but minimal benefit for ePTB analysis, while normal models on log-transformed data have the largest bias. We therefore recommend the finite mixture model for PTB studies; either the finite mixture model or the beta-binomial model is acceptable for ePTB studies.
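A sketch of the three-component idea using scikit-learn's GaussianMixture on invented gestational-age data: fit the mixture, then read preterm probabilities off the mixture CDF rather than dichotomizing the raw data. Component means, variances, and sample sizes are illustrative only.

```python
import numpy as np
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)

# Illustrative gestational-age data (weeks): a dominant term component
# plus smaller preterm and early-preterm components.
ga = np.concatenate([rng.normal(39.5, 1.2, 850),
                     rng.normal(34.0, 2.0, 120),
                     rng.normal(27.0, 2.5, 30)])

gm = GaussianMixture(n_components=3, random_state=0).fit(ga.reshape(-1, 1))

def mixture_cdf(x):
    # P(GA < x) under the fitted mixture: weighted sum of component CDFs.
    w = gm.weights_
    mu = gm.means_.ravel()
    sd = np.sqrt(gm.covariances_.ravel())
    return np.sum(w * norm.cdf(x, mu, sd))

print("P(PTB):  P(GA < 37) =", mixture_cdf(37.0))
print("P(ePTB): P(GA < 28) =", mixture_cdf(28.0))
```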

5.
6.
The COVID-19 pandemic has manifold impacts on clinical trials. In response, drug regulatory agencies and public health bodies have issued guidance on how to assess potential impacts on ongoing clinical trials, stressing the importance of a risk assessment as a prerequisite for modifications to trial conduct. This article presents a simulation study to assess the impact on the power of an ongoing clinical trial without the need to unblind trial data and compromise trial integrity. In the context of the CANNA-TICS trial, which investigates the effect of nabiximols on reducing the total tic score of the Yale Global Tic Severity Scale (YGTSS-TTS) in patients with chronic tic disorders and Tourette syndrome, the impact of two COVID-19-related intercurrent events handled by a treatment policy strategy is investigated using multiplicative and additive data-generating models. The empirical power is examined for the analysis of the YGTSS-TTS as a continuous and a dichotomized endpoint, using analysis techniques adjusted and unadjusted for the occurrence of the intercurrent event. In the investigated scenarios, the simulation studies showed that substantial power losses are possible, potentially making sample size increases necessary to retain sufficient power; however, scenarios with only limited loss of power were also identified. By adjusting for the occurrence of the intercurrent event, the power loss could be diminished to varying degrees in most scenarios. In summary, the presented risk-assessment approach may support decisions on trial modifications such as sample size increases, while maintaining trial integrity.
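A stripped-down version of such a risk assessment under an additive data-generating model, assuming invented effect sizes and an intercurrent-event (ICE) prevalence: simulate the trial, then estimate power with and without adjusting for the ICE indicator. It does not reproduce the CANNA-TICS simulation settings.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)

def one_trial(n=120, effect=-5.0, p_ice=0.3, shift=4.0):
    # Change score on a YGTSS-TTS-like scale; an additive ICE shift hits
    # a random subset of patients (treatment policy strategy).
    trt = rng.integers(0, 2, n)
    ice = rng.random(n) < p_ice
    y = effect * trt + shift * ice + rng.normal(0, 10, n)
    X0 = sm.add_constant(trt)
    X1 = sm.add_constant(np.column_stack([trt, ice.astype(float)]))
    p_unadj = sm.OLS(y, X0).fit().pvalues[1]
    p_adj = sm.OLS(y, X1).fit().pvalues[1]
    return p_unadj < 0.05, p_adj < 0.05

res = np.array([one_trial() for _ in range(2000)])
print("power, unadjusted  :", res[:, 0].mean())
print("power, ICE-adjusted:", res[:, 1].mean())
```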

7.
Multiple imputation under the multivariate normality assumption has often been regarded as a viable model-based approach for dealing with incomplete continuous data over the last two decades. A situation where the measurements are taken on a continuous scale with an ultimate interest in dichotomized versions through discipline-specific thresholds is not uncommon in applied research, especially in the medical and social sciences. In practice, researchers generally tend to impute missing values for continuous outcomes under a Gaussian imputation model and then dichotomize them via commonly accepted cut-off points. An alternative strategy is creating multiply imputed data sets after dichotomization under a log-linear imputation model that uses a saturated multinomial structure. In this work, the performances of the two imputation methods were examined on a fairly wide range of simulated incomplete data sets that exhibit varying distributional characteristics such as skewness and multimodality. The behavior of efficiency and accuracy measures was explored to determine the extent to which the procedures work properly. The conclusion drawn is that dichotomization before carrying out a log-linear imputation should be the preferred approach except in a few special cases. I recommend that researchers use this atypical second strategy whenever the interest centers on binary quantities that are obtained from underlying continuous measurements. A possible explanation is that erratic or idiosyncratic features that are not accommodated by a Gaussian model are transformed into better-behaved discrete trends in this particular missing-data setting. This premise outweighs the assertion that continuous variables inherently carry more information, leading to a counter-intuitive but potentially useful result for practitioners.
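The first strategy (impute on the continuous scale, then dichotomize) can be sketched with scikit-learn's IterativeImputer, as below; the data-generating model and threshold are invented, and the competing dichotomize-then-log-linear-impute strategy has no off-the-shelf analogue here, so it is omitted.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(5)
n = 500

# Right-skewed continuous outcome plus a correlated covariate; MCAR holes.
x = rng.normal(0, 1, n)
y = np.exp(0.8 * x + rng.normal(0, 0.5, n))
data = np.column_stack([x, y])
data[rng.random(n) < 0.3, 1] = np.nan

# Impute on the continuous scale under a (near-)Gaussian model, then
# dichotomize each completed data set at the substantive threshold.
threshold = 2.0
imputations = []
for m in range(10):
    imp = IterativeImputer(sample_posterior=True, random_state=m)
    completed = imp.fit_transform(data)
    imputations.append((completed[:, 1] > threshold).mean())

# For a proportion, Rubin's point estimate is the mean across imputations.
print("pooled P(Y > threshold):", np.mean(imputations))
```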

8.
Many applications in public health, medical and biomedical or other studies demand modelling of two or more longitudinal outcomes jointly to get better insight into their joint evolution. In this regard, a joint model for a longitudinal continuous and a count sequence, the latter possibly overdispersed and zero-inflated (ZI), is specified that combines aspects of each into a single model. Further, a subject-specific random effect is included to account for the correlation in the continuous outcome. For the count outcome, clustering and overdispersion are accommodated through two distinct sets of random effects in a generalized linear model, as proposed by Molenberghs et al. [A family of generalized linear models for repeated measures with normal and conjugate random effects. Stat Sci. 2010;25:325-347]: one is normally distributed, the other conjugate to the outcome distribution. The association between the two sequences is captured by correlating the normal random effects describing the continuous and count outcome sequences, respectively. An excessive number of zero counts is often accounted for by a so-called ZI or hurdle model: ZI models combine either a Poisson or negative-binomial model with an atom at zero as a mixture, while the hurdle model handles the zero observations and the positive counts separately. This paper proposes a general joint modelling framework in which all these features can appear together. We illustrate the proposed method with a case study and examine it further with simulations.
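Fitting such a joint model requires specialized likelihood code, but the data-generating mechanism it targets is easy to exhibit. A sketch, with all parameters illustrative: correlated normal random effects link the two sequences, a gamma multiplier adds conjugate overdispersion to the counts, and a Bernoulli mask adds structural zeros.

```python
import numpy as np

rng = np.random.default_rng(6)
n, t = 200, 5

# Correlated normal random effects: b1 drives the continuous outcome,
# b2 the count outcome; their correlation links the two sequences.
cov = np.array([[1.0, 0.6],
                [0.6, 1.0]])
b = rng.multivariate_normal([0, 0], cov, size=n)

time = np.arange(t)
y_cont = np.empty((n, t))
y_count = np.empty((n, t), dtype=int)
for i in range(n):
    # Continuous outcome: linear trend + subject effect + noise.
    y_cont[i] = 1.0 + 0.3 * time + b[i, 0] + rng.normal(0, 1, t)
    # Count outcome: log link with a normal random effect (clustering),
    # a mean-one gamma frailty (conjugate overdispersion), and zero
    # inflation via a Bernoulli mask of structural zeros.
    theta = rng.gamma(2.0, 0.5, t)
    mu = theta * np.exp(0.2 + 0.1 * time + b[i, 1])
    counts = rng.poisson(mu)
    zi = rng.random(t) < 0.25
    y_count[i] = np.where(zi, 0, counts)

print("observed zero fraction in counts:", (y_count == 0).mean())
```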

9.
We model daily catches of fishing boats in the Grand Bank fishing grounds. We use data on catches per species for a number of vessels collected by the European Union in the context of the Northwest Atlantic Fisheries Organization. Many variables can be thought to influence the amount caught: a number of ship characteristics (such as the size of the ship, the fishing technique used and the mesh size of the nets) are obvious candidates, but one can also consider the season or the actual location of the catch. Our database leads to 28 possible regressors (arising from six continuous variables and four categorical variables, whose 22 levels are treated separately), resulting in a set of 177 million possible linear regression models for the log-catch. Zero observations are modelled separately through a probit model. Inference is based on Bayesian model averaging, using a Markov chain Monte Carlo approach. Particular attention is paid to the prediction of catches for single and aggregated ships.
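The two-part structure (a probit for whether anything is caught, a linear model for the log-catch when it is) can be sketched as below with statsmodels on invented ship data; the full analysis in the paper additionally averages over millions of candidate regressor subsets by MCMC, which is not attempted here.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 1000

# Illustrative ship/catch data: zeros handled by a probit, positive
# catches by a linear model for the log-catch.
df = pd.DataFrame({
    "size": rng.normal(0, 1, n),        # ship size (standardized)
    "mesh": rng.normal(0, 1, n),        # mesh size
    "season": rng.integers(0, 4, n),    # categorical
})
latent = 0.8 + 0.5 * df["size"] - 0.3 * df["mesh"] + rng.normal(0, 1, n)
df["caught"] = (latent > 0).astype(int)
logcatch = 2.0 + 0.7 * df["size"] + rng.normal(0, 0.8, n)
df["logcatch"] = np.where(df["caught"] == 1, logcatch, np.nan)

# Part 1: probit for zero versus positive catch.
probit = smf.probit("caught ~ size + mesh + C(season)", data=df).fit(disp=0)
# Part 2: regression for the log-catch among positive catches.
ols = smf.ols("logcatch ~ size + mesh + C(season)",
              data=df.dropna(subset=["logcatch"])).fit()
print(probit.params)
print(ols.params)
```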

10.
In outcome-dependent sampling, the continuous or binary outcome variable in a regression model is available in advance to guide selection of a sample on which explanatory variables are then measured. Selection probabilities may either be a smooth function of the outcome variable or be based on a stratification of the outcome. In many cases, only data from the final sample are accessible to the analyst. A maximum likelihood approach for this data configuration is developed here for the first time. The likelihood for fully general outcome-dependent designs is stated, and the special case of Poisson sampling is then examined in more detail. The maximum likelihood estimator differs from the well-known maximum sample likelihood estimator, and an information bound result shows that the former is asymptotically more efficient. A simulation study suggests that the efficiency difference is generally small. Maximum sample likelihood estimation is therefore recommended in practice when only sample data are available. Some new smooth sample designs show considerable promise.
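A small illustration of why outcome-dependent (Poisson) sampling needs care, under an invented smooth selection function: naive OLS on the selected sample is biased, while inverse-inclusion-probability weighting, used here as a simple design-weighted stand-in for the sample-likelihood estimators discussed in the abstract, recovers the population coefficients.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
N = 20000

# Population model: y = 1 + 0.5 x + e.
x = rng.normal(0, 1, N)
y = 1.0 + 0.5 * x + rng.normal(0, 1, N)

# Poisson (independent Bernoulli) sampling with inclusion probability a
# function of the outcome: extreme y oversampled.
pi = 0.02 + 0.28 * (np.abs(y - 1.0) > 1.5)
keep = rng.random(N) < pi
xs, ys, ps = x[keep], y[keep], pi[keep]

X = sm.add_constant(xs)
# Naive OLS on the sample is biased under outcome-dependent selection...
print("naive   :", sm.OLS(ys, X).fit().params)
# ...while weighting by the inverse inclusion probability restores
# consistency for the population coefficients.
print("weighted:", sm.WLS(ys, X, weights=1.0 / ps).fit().params)
```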

11.
Density estimation for pre-binned data is challenging due to the loss of exact position information of the original observations. Traditional kernel density estimation methods cannot be applied when data are pre-binned in unequally spaced bins or when one or more bins are semi-infinite intervals. We propose a novel density estimation approach using the generalized lambda distribution (GLD) for data that have been pre-binned over a sequence of consecutive bins. This method enjoys the high power of the parametric model and the great shape flexibility of the GLD. The performances of the proposed estimators are benchmarked via simulation studies. Both simulation results and a real data application show that the proposed density estimators work well for data of moderate or large sizes.
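A sketch of the binned-fitting idea using the FKML parameterization of the GLD (valid whenever the scale parameter is positive), with invented bin edges and counts: the lambdas are chosen to maximize the multinomial likelihood of the bin counts, the CDF being obtained by inverting the monotone quantile function on a grid. A semi-infinite last bin is approximated here by a large finite edge.

```python
import numpy as np
from scipy.optimize import minimize

# FKML parameterization of the generalized lambda distribution (GLD):
# Q(u) = l1 + ((u**l3 - 1)/l3 - ((1-u)**l4 - 1)/l4) / l2,  requires l2 > 0.
def gld_quantile(u, l1, l2, l3, l4):
    a = (u**l3 - 1) / l3 if abs(l3) > 1e-8 else np.log(u)
    b = ((1 - u)**l4 - 1) / l4 if abs(l4) > 1e-8 else np.log(1 - u)
    return l1 + (a - b) / l2

def gld_cdf(x, params):
    # Invert the monotone quantile function on a fine grid.
    u = np.linspace(1e-6, 1 - 1e-6, 4001)
    q = gld_quantile(u, *params)
    return np.interp(x, q, u, left=0.0, right=1.0)

def negloglik(theta, edges, counts):
    l1, log_l2, l3, l4 = theta
    params = (l1, np.exp(log_l2), l3, l4)    # enforce l2 > 0
    p = np.clip(np.diff(gld_cdf(edges, params)), 1e-12, None)
    return -np.sum(counts * np.log(p))       # multinomial log-likelihood

# Pre-binned data: unequally spaced edges; the open-ended last bin is
# closed at a large value (all numbers illustrative).
edges = np.array([0.0, 1.0, 2.0, 3.5, 5.0, 8.0, 50.0])
counts = np.array([120, 260, 310, 180, 90, 40])

fit = minimize(negloglik, x0=[3.0, 0.0, 0.1, 0.1],
               args=(edges, counts), method="Nelder-Mead")
l1, log_l2, l3, l4 = fit.x
print("fitted (l1, l2, l3, l4):", l1, np.exp(log_l2), l3, l4)
```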

12.
The objective of Taguchi's robust design method is to reduce output variation from the target (the desired output) by making the performance insensitive to noise, such as manufacturing imperfections, environmental variation and deterioration. This objective has been recognized to be very effective in improving product and manufacturing process design. In application, however, Taguchi's analysis approach of modelling the average loss (or signal-to-noise ratios) may lead to non-optimal solutions, efficiency loss and information loss. In addition, since his loss-modelling approach requires a special experimental format containing a cross-product of two separate arrays for control and noise factors, it leads to less flexible and unnecessarily expensive experiments. The response-model approach, an alternative proposed by Welch et al., Box and Jones, Lucas, and Shoemaker et al., does not have these problems, although it has its own. This paper reviews and discusses the potential problems of Taguchi's modelling approach. We illustrate these problems with examples and numerical studies, and compare the advantages and disadvantages of Taguchi's approach and the alternative approach.
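The response-model alternative is easy to sketch: regress the response on control and noise factors including their interaction, then pick the control setting that minimizes the predicted field variance, instead of modelling a signal-to-noise ratio. The single-factor combined-array design and all coefficients below are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)

# A combined-array experiment: one control factor x (settable) and one
# noise factor z (random in the field), with a control-by-noise interaction.
x = np.tile([-1, 0, 1], 20).astype(float)
z = rng.normal(0, 1, 60)
y = 10 + 1.0 * x + 2.0 * z - 1.8 * x * z + rng.normal(0, 0.5, 60)
df = pd.DataFrame({"x": x, "z": z, "y": y})

# Response-model approach: model y directly, including the x:z term.
fit = smf.ols("y ~ x + z + x:z", data=df).fit()
b = fit.params

# If z ~ N(0, 1) in the field, Var(y | x) = (b_z + b_xz * x)^2 + sigma^2;
# choose x to flatten the noise transmission.
x_grid = np.linspace(-1, 1, 201)
var_y = (b["z"] + b["x:z"] * x_grid) ** 2 + fit.mse_resid
print("variance-minimizing control setting:", x_grid[np.argmin(var_y)])
```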

13.
We propose a flexible functional approach for modelling generalized longitudinal data and survival time using principal components. In the proposed model the longitudinal observations can be continuous or categorical, such as Gaussian, binomial or Poisson outcomes. We generalize the traditional joint models, which treat categorical data as continuous via transformations (of CD4 counts, for example). The proposed model is data-adaptive: it does not require pre-specified functional forms for the longitudinal trajectories and automatically detects characteristic patterns. The longitudinal trajectories, observed with measurement or random error, are represented by flexible basis functions through a possibly nonlinear link function, combined with dimension-reduction techniques from functional principal component (FPC) analysis. The relationship between the longitudinal process and the event history is assessed using a Cox regression model. Although the proposed model inherits the flexibility of non-parametric methods, the estimation procedure, based on the EM algorithm, is still parametric in computation and thus simple and easy to implement. Computation is simplified by dimension reduction for the random coefficients, or FPC scores. An iterative selection procedure based on the Akaike information criterion (AIC) is proposed to choose the tuning parameters, such as the knots of the spline basis and the number of FPCs, so that an appropriate degree of smoothness and fluctuation can be achieved. The effectiveness of the proposed approach is illustrated through a simulation study, followed by an application to longitudinal CD4 counts and survival data collected in a recent clinical trial comparing the efficacy and safety of two antiretroviral drugs.
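A Gaussian special case of the pipeline, assuming trajectories on a regular grid and the lifelines package for the Cox step: empirical FPC scores from an SVD of the centered trajectories feed a Cox regression. The paper's generalized outcomes, link functions, EM estimation, and AIC-based tuning are not reproduced.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(10)
n, t = 300, 10

# Longitudinal trajectories on a common grid, driven by two latent scores.
grid = np.linspace(0, 1, t)
phi1, phi2 = np.sin(np.pi * grid), np.cos(np.pi * grid)
scores = rng.normal(0, [2.0, 0.8], (n, 2))
traj = scores @ np.vstack([phi1, phi2]) + rng.normal(0, 0.3, (n, t))

# Empirical FPCA: center, then eigendecompose via SVD; the FPC scores
# are the projections onto the leading right singular vectors.
centered = traj - traj.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
fpc_scores = centered @ Vt[:2].T

# Event times depend on the first FPC score; Cox regression on the scores.
hazard = 0.1 * np.exp(0.3 * fpc_scores[:, 0])
T = rng.exponential(1.0 / hazard)
C = rng.exponential(10.0, n)
df = pd.DataFrame({"T": np.minimum(T, C), "E": (T <= C).astype(int),
                   "fpc1": fpc_scores[:, 0], "fpc2": fpc_scores[:, 1]})
CoxPHFitter().fit(df, duration_col="T", event_col="E").print_summary()
```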

14.
The variational approach to Bayesian inference enables simultaneous estimation of model parameters and model complexity; an interesting feature is that it leads to an automatic choice of model complexity. Empirical results from the analysis of hidden Markov models with Gaussian observation densities illustrate this: if the variational algorithm is initialized with a large number of hidden states, redundant states are eliminated as the method converges to a solution, thereby selecting the number of hidden states. In addition, through the use of a variational approximation, the deviance information criterion for Bayesian model selection can be extended to the hidden Markov model framework, providing a further tool for model selection that can be used in conjunction with the variational approach.

15.
Studies of the behaviors of glaciers, ice sheets, and ice streams rely heavily on both observations and physical models. Data acquired via remote sensing provide critical information on geometry and movement of ice over large sections of Antarctica and Greenland. However, uncertainties are present in both the observations and the models. Hence, there is a need for combining these information sources in a fashion that incorporates uncertainty and quantifies its impact on conclusions. We present a hierarchical Bayesian approach to modeling ice-stream velocities incorporating physical models and observations regarding velocity, ice thickness, and surface elevation from the North East Ice Stream in Greenland. The Bayesian model leads to interesting issues in model assessment and computation.
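The core data/process/prior idea can be reduced to a one-line conjugate update: treat the physical-model output as a prior mean for velocity and the remote-sensing measurement as a noisy observation, so the posterior is a precision-weighted compromise. All variances and the toy velocity field below are invented, and the real model adds spatial dependence and MCMC.

```python
import numpy as np

rng = np.random.default_rng(11)

# Physical model gives a prior mean for ice velocity at each grid point;
# remote sensing gives noisy measurements of the true velocity.
m = 50
v_physics = 100 + 20 * np.sin(np.linspace(0, np.pi, m))  # model output (m/yr)
tau2 = 15.0**2       # variance expressing trust in the physical model
sigma2 = 10.0**2     # observation-error variance

v_true = v_physics + rng.normal(0, np.sqrt(tau2), m)
v_obs = v_true + rng.normal(0, np.sqrt(sigma2), m)

# Conjugate normal-normal update: the posterior mean is a precision-
# weighted average of model output and observation.
w = (1 / sigma2) / (1 / sigma2 + 1 / tau2)
v_post = w * v_obs + (1 - w) * v_physics
v_post_sd = np.sqrt(1 / (1 / sigma2 + 1 / tau2))

print("weight on data:", w, " posterior sd:", v_post_sd)
print("rmse, model only:", np.sqrt(np.mean((v_physics - v_true) ** 2)))
print("rmse, posterior :", np.sqrt(np.mean((v_post - v_true) ** 2)))
```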

16.
The modified zero order approach to estimating coefficients in the face of missing observations treats the missing values as parameters to be estimated simultaneously with the coefficients. The paper then investigates, in the context of Han's generalized regression model, (i) when parameter estimators do not vary between using the partial data points and using only the complete ones (the informationless result), and (ii) the large sample properties of the modified zero order estimator. It is found that the sequential cut property is crucial to the informationless result for coefficient estimators, and that, in large samples, consistency of the modified zero order estimator depends on the percentage of observations with missing elements or on the sequential cut property.
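A stylized demonstration of the informationless result on invented data: alternating between least squares for the coefficients and the SSE-minimizing fill-in for the missing covariate entries drives the affected residuals to zero, so the joint estimate coincides with complete-case OLS.

```python
import numpy as np

rng = np.random.default_rng(12)
n = 40

x1 = rng.normal(0, 1, n)
x2 = rng.normal(0, 1, n)
y = 1.0 * x1 - 2.0 * x2 + rng.normal(0, 0.5, n)

miss = np.zeros(n, dtype=bool)
miss[:8] = True                      # x2 missing for the first 8 rows

# Treat the missing x2 entries as parameters: alternate between (i) least
# squares for the coefficients with the fill-ins held fixed, and (ii) the
# SSE-minimizing fill-in, which zeroes each affected row's residual.
x2f = np.where(miss, 0.0, x2)
for _ in range(100):
    X = np.column_stack([x1, x2f])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    x2f[miss] = (y[miss] - beta[0] * x1[miss]) / beta[1]

cc = ~miss
beta_cc = np.linalg.lstsq(np.column_stack([x1[cc], x2[cc]]), y[cc],
                          rcond=None)[0]
print("joint (modified zero order):", beta)
print("complete-case OLS          :", beta_cc)   # the informationless result
```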

17.
Mild to moderate skew in errors can substantially impact regression mixture model results; one approach to overcoming this is to transform the outcome into an ordered categorical variable and use a polytomous regression mixture model. This is effective for retaining differential effects in the population; however, bias in parameter estimates and model fit warrant further examination of this approach at higher levels of skew. The current study used Monte Carlo simulations: 3000 observations were drawn from each of two subpopulations differing in the effect of X on Y, and 500 simulations were performed in each of 10 scenarios varying in the level of skew in one or both classes. Model comparison criteria supported the accurate two-class model, preserving the differential effects, while parameter estimates were notably biased. The appropriate number of effects can be captured with this approach, but we suggest caution when interpreting the magnitude of the effects.

18.
In longitudinal studies, observation times are often irregular and subject-specific. Frequently they are related to the outcome measure, or to other variables that are associated with the outcome measure but undesirable to condition upon in the outcome model. Regression analyses that are unadjusted for outcome-dependent follow-up then yield biased estimates. The authors propose a class of inverse-intensity rate-ratio weighted estimators in generalized linear models that adjust for outcome-dependent follow-up. The estimators, based on estimating equations, are very simple and easily computed, and can be used under mixtures of continuous and discrete observation times. The predictors of the observation times can be past observed outcomes, cumulative values of outcome-model covariates and other factors associated with the outcome. The authors validate their approach through simulations and illustrate it using data from a supported housing program of the US federal government.
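A simplified discrete-time sketch of inverse-intensity weighting, with an invented visit process: model the probability of being observed from the observable history, weight observed visits by its inverse, and fit a working-independence weighted regression with cluster-robust errors in place of the authors' continuous-time intensity model.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(13)
n, t = 400, 8

# Random-intercept outcome; the chance of attending a visit rises with
# the last observed outcome, i.e. outcome-dependent follow-up.
rows = []
for i in range(n):
    b = rng.normal(0, 1)
    y_prev = 0.0
    for j in range(t):
        y = 1.0 + 0.5 * (j / t) + b + rng.normal(0, 1)
        p_obs = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * y_prev)))
        obs = rng.random() < p_obs
        rows.append((i, j / t, y, y_prev, float(obs)))
        if obs:
            y_prev = y
df = pd.DataFrame(rows, columns=["id", "time", "y", "y_prev", "obs"])

# Step 1: model the observation process from observable history and form
# inverse-intensity weights for the visits that actually occurred.
Z = sm.add_constant(df["y_prev"])
p_hat = sm.GLM(df["obs"], Z, family=sm.families.Binomial()).fit().predict(Z)
df["w"] = 1.0 / p_hat
seen = df[df["obs"] == 1.0]

# Step 2: weighted regression with cluster-robust errors, next to the
# (biased) unweighted fit.
X = sm.add_constant(seen["time"])
cov = dict(cov_type="cluster", cov_kwds={"groups": seen["id"]})
print("unweighted:", sm.OLS(seen["y"], X).fit(**cov).params.values)
print("weighted  :",
      sm.WLS(seen["y"], X, weights=seen["w"]).fit(**cov).params.values)
```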

19.
Ordinal outcomes collected at multiple follow-up visits are common in clinical trials. Sometimes one visit is chosen for the primary analysis and the scale is dichotomized, amounting to a loss of information. Multistate Markov models describe how a process moves between states over time. Here, simulation studies are performed to investigate the Type I error and power characteristics of multistate Markov models for panel data with limited non-adjacent state transitions. The results suggest that multistate Markov models preserve the Type I error, and that adequate power is achieved with modest sample sizes for panel data with limited non-adjacent state transitions.
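For panel data the likelihood only sees the states at visit times, so each interval contributes an entry of the transition-probability matrix P(dt) = expm(Q * dt). A sketch with an invented three-state, adjacent-transition intensity matrix, simulating panel data and recovering the rates by maximum likelihood:

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

rng = np.random.default_rng(14)

def build_Q(log_rates):
    # Adjacent-only transitions among three ordinal states; the log scale
    # keeps the intensities positive.
    q12, q21, q23, q32 = np.exp(log_rates)
    return np.array([[-q12,          q12,         0.0],
                     [ q21, -(q21 + q23),         q23],
                     [ 0.0,          q32,        -q32]])

def simulate_panel(Q, n=300, visits=(0.0, 1.0, 2.0, 3.0)):
    # States are observed only at the visit times (panel observation).
    states = np.zeros((n, len(visits)), dtype=int)
    for k in range(1, len(visits)):
        P = expm(Q * (visits[k] - visits[k - 1]))
        P = np.clip(P, 0.0, None)
        P /= P.sum(axis=1, keepdims=True)
        for i in range(n):
            states[i, k] = rng.choice(3, p=P[states[i, k - 1]])
    return states, np.asarray(visits)

def negloglik(log_rates, states, visits):
    Q = build_Q(log_rates)
    ll = 0.0
    for k in range(len(visits) - 1):
        P = np.clip(expm(Q * (visits[k + 1] - visits[k])), 1e-12, None)
        ll += np.log(P[states[:, k], states[:, k + 1]]).sum()
    return -ll

true = np.log([0.4, 0.2, 0.3, 0.1])
states, visits = simulate_panel(build_Q(true))
fit = minimize(negloglik, np.zeros(4), args=(states, visits),
               method="Nelder-Mead")
print("true rates  :", np.exp(true))
print("fitted rates:", np.exp(fit.x))
```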

20.
Clinical trials and other types of studies often examine the effects of a particular treatment or experimental condition on a number of different response variables. Although the usual approach for analysing such data is to examine each variable separately, this can increase the chance of false positive findings. Bonferroni's inequality or Hotelling's T2 statistic can be employed to control the overall type I error rate, but these tests generally lack power against alternatives in which the treatment improves the outcome on most or all of the endpoints. For the comparison of independent groups, O'Brien (1984) developed a rank-sum-type test that has greater power than the Bonferroni and T2 procedures when one treatment is uniformly better (i.e. for all endpoints) than the other treatment(s). In this paper we adapt the rank-sum test to studies involving paired data and demonstrate that it, too, has power advantages against such alternatives. Simulation results are described, and an example from a study measuring the effects of sleep loss on glucose metabolism is presented to illustrate the methodology.
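One plausible paired adaptation in the spirit of the abstract (not necessarily the authors' exact statistic): rank all 2n observations within each endpoint, sum each subject's ranks across endpoints within condition, and apply a Wilcoxon signed-rank test to the paired rank sums. Data and effect sizes are invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(15)
n, k = 25, 4     # paired subjects, endpoints

# Paired measurements (e.g. before/after sleep loss) on k endpoints with
# a small, consistent shift on every endpoint.
before = rng.normal(0, 1, (n, k))
after = before + 0.4 + rng.normal(0, 1, (n, k))

# O'Brien-style scoring: rank all 2n values within each endpoint, then
# sum each subject's ranks across endpoints within condition.
combined = np.vstack([before, after])                    # (2n, k)
ranks = np.apply_along_axis(stats.rankdata, 0, combined)
score_before = ranks[:n].sum(axis=1)
score_after = ranks[n:].sum(axis=1)
print(stats.wilcoxon(score_after, score_before))

# Bonferroni over per-endpoint paired t-tests, for comparison.
p_each = [stats.ttest_rel(after[:, j], before[:, j]).pvalue for j in range(k)]
print("Bonferroni-adjusted min p:", min(1.0, k * min(p_each)))
```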
