Similar Articles
20 similar articles found.
1.
Paired binary data arise frequently in biomedical studies and have unique features of their own. For instance, in clinical studies involving pairs such as ears or eyes, both the intrapair association parameter and the event probability are often of interest, and the dependence of the association parameter on certain covariates may be of interest as well. Although various methods have been proposed to model paired binary data, this paper proposes a unified approach for estimating various intrapair measures under a generalized linear model, with simultaneous maximum likelihood estimation of the marginal probabilities and the intrapair association. The methods are illustrated with a twin morbidity study.
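As a minimal illustration of the quantities named in this abstract (not the paper's unified GLM framework), the sketch below computes closed-form maximum likelihood estimates of the marginal event probability and the intrapair odds ratio for exchangeable paired binary data; the counts are hypothetical.

```python
import numpy as np

# Sketch only: closed-form MLEs for exchangeable paired binary data,
# given the four pair-outcome counts (both events, discordant, neither).
def paired_binary_mle(n11, n10, n01, n00):
    n = n11 + n10 + n01 + n00
    p11 = n11 / n
    p_disc = (n10 + n01) / n              # pooled discordant probability (exchangeability)
    p00 = n00 / n
    p_marg = p11 + p_disc / 2.0           # marginal event probability per pair member
    odds_ratio = (p11 * p00) / (p_disc / 2.0) ** 2   # intrapair association
    return p_marg, odds_ratio

# Hypothetical counts for 100 pairs.
p_hat, psi_hat = paired_binary_mle(n11=30, n10=8, n01=12, n00=50)
print(f"marginal probability ~ {p_hat:.3f}, intrapair odds ratio ~ {psi_hat:.2f}")
```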

2.
The authors consider regression analysis for binary data collected repeatedly over time on members of numerous small clusters of individuals sharing a common random effect that induces dependence among them. They propose a mixed model that can accommodate both these structural and longitudinal dependencies. They estimate the parameters of the model consistently and efficiently using generalized estimating equations. They show through simulations that their approach yields significant gains in mean squared error when estimating the random effects variance and the longitudinal correlations, while providing estimates of the fixed effects that are just as precise as under a generalized penalized quasi-likelihood approach. Their method is illustrated using smoking prevention data.
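As a point of reference for the estimating-equation machinery mentioned above, here is a hedged sketch of an ordinary exchangeable-correlation GEE fit for clustered binary data using statsmodels; it is not the authors' combined structural/longitudinal formulation, and the simulated data are purely illustrative.

```python
import numpy as np
import statsmodels.api as sm

# Simulate clustered binary data with a shared cluster effect.
rng = np.random.default_rng(0)
n_clusters, m = 100, 4
cluster = np.repeat(np.arange(n_clusters), m)
b = rng.normal(scale=0.8, size=n_clusters)[cluster]     # common random effect per cluster
x = rng.normal(size=n_clusters * m)
p = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x + b)))
y = rng.binomial(1, p)

# Standard GEE with an exchangeable working correlation.
X = sm.add_constant(x)
model = sm.GEE(y, X, groups=cluster, family=sm.families.Binomial(),
               cov_struct=sm.cov_struct.Exchangeable())
result = model.fit()
print(result.params)      # intercept and slope estimates
```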

3.
Fuzzy least-squares regression can be very sensitive to unusual data (e.g., outliers). In this article, we describe how to fit an alternative robust regression estimator in a fuzzy environment that attempts to identify and ignore unusual data. The proposed approach draws on classical robust regression and estimation methods that are insensitive to outliers. In particular, based on the least trimmed squares estimation method, an estimation procedure is proposed for determining the coefficients of a fuzzy regression model for crisp-input, fuzzy-output data. The investigated fuzzy regression model is applied to real-world bedload transport data to forecast suspended load from discharge. The accuracy of the proposed method is compared with that of the well-known fuzzy least-squares regression model, using a similarity measure between fuzzy sets. The comparison reveals that the fuzzy robust regression model outperforms the other models in suspended load estimation for this particular dataset. The proposed model is general and can be used for modelling natural phenomena whose available observations are reported as imprecise rather than crisp.
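The classical least trimmed squares idea underlying this work can be sketched for crisp data as below (the paper's fuzzy-output extension is not reproduced); the repeated "concentration" step refits ordinary least squares on the h observations with the smallest squared residuals.

```python
import numpy as np

# Rough LTS sketch for crisp data: iterate OLS on the h best-fitting points.
def lts_fit(X, y, h=None, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])            # add intercept
    h = h if h is not None else (n + p + 1) // 2     # default trimming size
    idx = rng.choice(n, size=h, replace=False)       # random starting subset
    for _ in range(n_iter):
        beta, *_ = np.linalg.lstsq(Xd[idx], y[idx], rcond=None)
        resid2 = (y - Xd @ beta) ** 2
        idx = np.argsort(resid2)[:h]                 # keep h smallest residuals
    return beta

# Tiny simulated example with a few gross outliers.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 1))
y = 2.0 + 3.0 * X[:, 0] + rng.normal(scale=0.3, size=60)
y[:5] += 20.0                                        # contaminate five observations
print("LTS coefficients ~", lts_fit(X, y))
```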

4.
Motivated by a longitudinal oral health study, the Signal-Tandmobiel® study, a Bayesian approach has been developed to model misclassified ordinal response data. Two regression models have been considered to incorporate misclassification in the categorical response. Specifically, probit and logit models have been developed. The computational difficulties have been avoided by using data augmentation. This idea is exploited to derive efficient Markov chain Monte Carlo methods. Although the method is proposed for ordered categories, it can also be implemented for unordered ones in a simple way. The model performance is shown through a simulation-based example and the analysis of the motivating study.

5.
The varying coefficient (VC) model introduced by Hastie and Tibshirani (Varying-coefficient models, J. R. Statist. Soc. Ser. B 55 (1993), pp. 757–796) is arguably one of the most remarkable recent developments in nonparametric regression theory. The VC model is an extension of the ordinary regression model where the coefficients are allowed to vary as smooth functions of an effect modifier possibly different from the regressors. The VC model reduces the modelling bias with its unique structure while also avoiding the ‘curse of dimensionality’ problem. While the VC model has been applied widely in a variety of disciplines, its application in economics has been minimal. The central goal of this paper is to apply VC modelling to the estimation of a hedonic house price function using data from Hong Kong, one of the world's most buoyant real estate markets. We demonstrate the advantages of the VC approach over traditional parametric and semi-parametric regressions in the face of a large number of regressors. We further combine VC modelling with quantile regression to examine the heterogeneity of the marginal effects of attributes across the distribution of housing prices.
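A minimal kernel-smoothing sketch of a varying coefficient fit is shown below (not the authors' hedonic-price implementation): at each value of the effect modifier, a locally weighted least squares problem is solved so the regression coefficients become smooth functions of that modifier.

```python
import numpy as np

# Local constant-coefficient fit at modifier value u0 via kernel weighting.
def vc_fit(X, y, u, u0, bandwidth=0.3):
    w = np.exp(-0.5 * ((u - u0) / bandwidth) ** 2)       # Gaussian kernel weights
    Xd = np.column_stack([np.ones(len(y)), X])
    W = np.diag(w)
    return np.linalg.solve(Xd.T @ W @ Xd, Xd.T @ W @ y)  # weighted LS at u0

# Simulated data where the slope varies smoothly with the modifier u.
rng = np.random.default_rng(0)
n = 500
u = rng.uniform(0, 1, n)
x = rng.normal(size=n)
y = np.sin(2 * np.pi * u) * x + 0.2 * rng.normal(size=n)
for u0 in (0.1, 0.25, 0.5):
    print(u0, "estimated slope ~", vc_fit(x.reshape(-1, 1), y, u, u0)[1])
```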

6.
A standard approach to robustness in linear models is to assume a mixture of standard and outlier observations, with a different error variance for each class. For generalised linear models (GLMs) the mixture-model approach is more difficult, since for many distributions the error variance has a fixed relationship to the mean. The model is extended to GLMs by redefining the two classes: the standard class is an ordinary GLM, while the outlier class is an overdispersed GLM obtained by including a random-effect term in the linear predictor. The advantages of this method are that it can be extended to any model with a linear predictor and that outlier observations can be easily identified. In simulations the model is compared to an M-estimator and found to have improved bias and coverage. The method is demonstrated on three examples.

7.
In this paper, we provide a method for constructing confidence intervals for accuracy in correlated observations, where one sample of patients is rated by two or more diagnostic tests. Confidence intervals for other measures of diagnostic tests, such as sensitivity, specificity, positive predictive value, and negative predictive value, have already been developed for clustered or correlated observations using the generalized estimating equations (GEE) method. Here, we use the GEE and delta-method to construct confidence intervals for accuracy, the proportion of patients who are correctly classified. Simulation results verify that the estimated confidence intervals exhibit appropriate coverage rates.
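A simplified sketch of the core idea is given below: a cluster-robust (sandwich-type) confidence interval for accuracy when observations are correlated within patients. This is illustrative only and omits the full GEE/delta-method derivation of the paper; the data and function name are hypothetical.

```python
import numpy as np
from scipy import stats

# Cluster-robust Wald interval for the proportion correctly classified.
def clustered_accuracy_ci(correct, cluster, alpha=0.05):
    correct = np.asarray(correct, dtype=float)
    cluster = np.asarray(cluster)
    p_hat = correct.mean()
    n = len(correct)
    resid = correct - p_hat
    # Sandwich-type variance: sum over clusters of squared residual totals.
    cluster_sums = np.array([resid[cluster == c].sum() for c in np.unique(cluster)])
    var = (cluster_sums ** 2).sum() / n ** 2
    half = stats.norm.ppf(1 - alpha / 2) * np.sqrt(var)
    return p_hat, (p_hat - half, p_hat + half)

# Hypothetical data: two ratings per patient for five patients.
correct = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
cluster = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
print(clustered_accuracy_ci(correct, cluster))
```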

8.
The research described herein was motivated by a study of the relationship between the performance of students in senior high schools and at universities in China. A special linear structural equation model is established, in which some parameters are known and both the responses and the covariables are measured with errors. To explore the relationship between the true responses and latent covariables and to estimate the parameters, we suggest a non-iterative estimation approach that can account for the external dependence between the true responses and latent covariables. This approach can also deal with the collinearity problem, because the use of dimension-reduction techniques removes redundant variables. Combining this with the information that some of the parameters are known, we can estimate the remaining unknown parameters. An easily implemented algorithm is provided. A simulation study provides evidence of the performance of the approach and compares it with existing methods. The approach is applied to the education example for illustration and can be readily extended to more general models.

9.
In this paper, we study a new Bayesian approach for the analysis of linearly mixed structures. In particular, we consider the case of hyperspectral images, which have to be decomposed into a collection of distinct spectra, called endmembers, and a set of associated proportions for every pixel in the scene. This problem, often referred to as spectral unmixing, is usually considered on the basis of the linear mixing model (LMM). In unsupervised approaches, the endmember signatures have to be calculated by an endmember extraction algorithm, which generally relies on the assumption that pure (unmixed) pixels are present in the image. In practice, this assumption may not hold for highly mixed data, and consequently the extracted endmember spectra differ from the true ones. A way out of this dilemma is to consider the problem under the normal compositional model (NCM). Contrary to the LMM, the NCM treats the endmembers as random Gaussian vectors rather than deterministic quantities. Existing Bayesian approaches for estimating the proportions under the NCM are restricted to the case in which the covariance matrix of the Gaussian endmembers is a multiple of the identity matrix; such a model is not suitable when the variance differs from one spectral channel to another, which is common in practice. In this paper, we first propose a Bayesian strategy for estimating the mixing proportions under the assumption of varying variances across the spectral bands. We then generalize this model to handle the case of a completely unknown covariance structure. For both algorithms, we present Gibbs sampling strategies and compare their performance with other state-of-the-art unmixing routines on synthetic as well as real hyperspectral fluorescence spectroscopy data.

10.
Quantitative model validation plays an increasingly important role in performance and reliability assessment of complex systems whenever computer modelling and simulation are involved. This paper pursues a Bayesian probabilistic approach to quantitative model validation with non-normal data, accounting for data uncertainty, and investigates the impact of the normality assumption on validation accuracy. The Box–Cox transformation method is employed to convert the non-normal data, with the purpose of facilitating the overall validation assessment of computational models with higher accuracy. Explicit expressions for the interval hypothesis testing-based Bayes factor are derived for the transformed data in the univariate and multivariate cases. A Bayesian confidence measure is presented based on the Bayes factor metric, and a generalized procedure is proposed to implement the probabilistic methodology for model validation of complicated systems. A classical hypothesis testing method is employed for comparison. The impact of the data normality assumption and of decision threshold variation on model assessment accuracy is investigated using both the classical and Bayesian approaches. The proposed methodology and procedure are demonstrated with a univariate stochastic damage accumulation model, a multivariate heat conduction problem and a multivariate dynamic system.
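The data-preprocessing step described above can be illustrated with a short sketch: apply a Box–Cox transformation to skewed validation data before any normality-based comparison. The interval-hypothesis Bayes factor itself is not reproduced here, and the data are simulated.

```python
import numpy as np
from scipy import stats

# Skewed "observed" data, then a maximum-likelihood Box-Cox transformation.
rng = np.random.default_rng(42)
observed = rng.lognormal(mean=0.0, sigma=0.6, size=200)

transformed, lam = stats.boxcox(observed)        # ML estimate of lambda
print(f"estimated Box-Cox lambda ~ {lam:.2f}")
print("Shapiro p-value before:", stats.shapiro(observed).pvalue)
print("Shapiro p-value after :", stats.shapiro(transformed).pvalue)
```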

11.
In this article, Lindley's (1956) measure of average information is used to measure the loss of information due to the unavailability of a set of observations in an experiment. This measure of information loss may be used to detect the set of most informative observations in a given design.
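As a hedged numerical illustration of this idea (the article's exact setup may differ): for a normal linear model with known error variance and a conjugate normal prior, the expected information from a design X is 0.5 * logdet(I + X'X S0 / sigma^2), so the loss from dropping a set of rows is the drop in this quantity.

```python
import numpy as np

# Expected information gain for a conjugate normal linear model (assumption).
def lindley_information(X, prior_cov, sigma2=1.0):
    p = X.shape[1]
    return 0.5 * np.linalg.slogdet(np.eye(p) + (X.T @ X) @ prior_cov / sigma2)[1]

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
S0 = np.eye(3)
full = lindley_information(X, S0)
for i in range(3):     # information loss from removing each of the first three rows
    reduced = lindley_information(np.delete(X, i, axis=0), S0)
    print(f"row {i}: information loss ~ {full - reduced:.4f}")
```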

12.
In spite of the best set of covariates and statistical tools for survival analysis, there are instances when experts do not rule out the existence of many non-observable factors that could influence the survival probability of an individual. The fact that every human body, sick or otherwise, strives to maximize time to death renders stochastic frontier analysis (vide Kumbhakar and Lovell, Stochastic Frontier Analysis, Cambridge University Press, 2003) a meaningful tool for measuring the unobservable individual-specific deficiency factor that accounts for the difference between the optimal and observed survival times. In this paper, given the survival data, an attempt is made to measure the deficiency factor for each individual in the data by adopting stochastic frontier analysis. Such an attempt to quantify the effect of these unobservable factors can provide ample scope for further research in biomedical studies. The utility of these estimates in survival analysis is also highlighted using real-life data.

13.
Liang and Zeger (1986) proposed an extension of generalized linear models to the analysis of longitudinal data. Their formulation requires a common dispersion parameter across observation times, an assumption that is not expected to hold in most situations. Park (1993) proposed a simple extension of Liang and Zeger's formulation that allows a different dispersion parameter for each time point. The proposed model is easy to apply without heavy computation and is useful for handling cases where over-dispersion varies over time. In this paper, we focus on evaluating the effect of the additional dispersion parameters on the estimators of the model parameters. Through a Monte Carlo simulation study, the efficiency of Park's method is compared with that of Liang and Zeger's method.
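A moment-based sketch of the idea behind time-specific dispersion parameters follows: given fitted marginal means and a variance function, estimate a separate dispersion at each occasion from the Pearson residuals. This is illustrative only, not Park's full GEE implementation, and the simulated over-dispersed counts are hypothetical.

```python
import numpy as np

# One dispersion estimate per time point from squared Pearson residuals.
def dispersion_by_time(y, mu, var_fun):
    # y, mu: arrays of shape (n_subjects, n_times)
    pearson_sq = (y - mu) ** 2 / var_fun(mu)
    return pearson_sq.mean(axis=0)

rng = np.random.default_rng(3)
n_subj, n_times = 500, 4
mu = np.full((n_subj, n_times), 2.0)              # constant fitted Poisson mean
phi_true = np.array([1.2, 1.5, 2.0, 3.0])         # over-dispersion growing over time
# Negative-binomial draws with mean mu and variance phi * mu.
nb_r = mu / (phi_true - 1.0)
nb_p = 1.0 / phi_true
y = rng.negative_binomial(nb_r, nb_p, size=(n_subj, n_times))
print("estimated phi_t ~", dispersion_by_time(y, mu, var_fun=lambda m: m))
```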

14.
We consider the problem of selecting a regression model from a large class of possible models in the case where no true model is believed to exist. In practice, few statisticians, or scientists who employ statistical methods, believe that a true model exists, but they nonetheless seek to select a model as a proxy from which to predict. Unlike much of the recent work in this area, we address this problem explicitly. We develop Bayesian predictive model selection techniques when proper conjugate priors are used and obtain an easily computed expression for the model selection criterion. We also derive expressions for updating the value of the statistic when a predictor is dropped from the model and apply this approach to a large, well-known data set.

15.
Massive correlated data with many inputs are often generated from computer experiments to study complex systems. The Gaussian process (GP) model is a widely used tool for the analysis of computer experiments. Although GPs provide a simple and effective approximation to computer experiments, two critical issues remain unresolved. One is the computational issue in GP estimation and prediction where intensive manipulations of a large correlation matrix are required. For a large sample size and with a large number of variables, this task is often unstable or infeasible. The other issue is how to improve the naive plug-in predictive distribution which is known to underestimate the uncertainty. In this article, we introduce a unified framework that can tackle both issues simultaneously. It consists of a sequential split-and-conquer procedure, an information combining technique using confidence distributions (CD), and a frequentist predictive distribution based on the combined CD. It is shown that the proposed method maintains the same asymptotic efficiency as the conventional likelihood inference under mild conditions, but dramatically reduces the computation in both estimation and prediction. The predictive distribution contains comprehensive information for inference and provides a better quantification of predictive uncertainty as compared with the plug-in approach. Simulations are conducted to compare the estimation and prediction accuracy with some existing methods, and the computational advantage of the proposed method is also illustrated. The proposed method is demonstrated by a real data example based on tens of thousands of computer experiments generated from a computational fluid dynamic simulator.
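A crude split-and-conquer sketch of the computational idea is shown below: fit independent GPs on disjoint blocks of a large data set and combine block predictions by inverse-variance weighting. The paper's confidence-distribution combination and frequentist predictive distribution are not reproduced; the weighting rule and data here are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 2000)
y = np.sin(x) + 0.1 * rng.normal(size=x.size)

blocks = np.array_split(rng.permutation(x.size), 8)        # 8 random blocks
x_new = np.linspace(0, 10, 5).reshape(-1, 1)
means, variances = [], []
for idx in blocks:
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(x[idx].reshape(-1, 1), y[idx])
    m, s = gp.predict(x_new, return_std=True)
    means.append(m)
    variances.append(s ** 2)
means, variances = np.array(means), np.array(variances)
w = (1.0 / variances) / (1.0 / variances).sum(axis=0)      # precision weights
print("combined prediction ~", (w * means).sum(axis=0))
print("truth                ", np.sin(x_new.ravel()))
```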

16.
In this article, we propose a resampling method based on perturbing the estimating functions to compute the asymptotic variances of quantile regression estimators under the missing-at-random condition. We prove that the conditional distributions of the resampling estimators are asymptotically equivalent to the distributions of the quantile regression estimators. Our method can deal with complex situations in which the response and part of the covariates are missing. Numerical results based on simulated and real data are provided under several designs.

17.
In this paper, we consider a model for repeated count data with within-subject correlation and/or overdispersion. It extends both the generalized linear mixed model and the negative-binomial model. This model, proposed in a likelihood context [G. Molenberghs, G. Verbeke, and C.G.B. Demétrio, An extended random-effects approach to modeling repeated, overdispersed count data, Lifetime Data Anal. 13 (2007), pp. 457–511; G. Molenberghs, G. Verbeke, C.G.B. Demétrio, and A. Vieira, A family of generalized linear models for repeated measures with normal and conjugate random effects, Statist. Sci. 25 (2010), pp. 325–347, doi:10.1214/10-STS328], is placed in a Bayesian inferential framework. An important contribution takes the form of Bayesian model assessment based on pivotal quantities, rather than the often less adequate DIC. By means of a real biological data set, we also discuss some Bayesian model selection aspects, using a pivotal quantity proposed by Johnson [V.E. Johnson, Bayesian model assessment using pivotal quantities, Bayesian Anal. 2 (2007), pp. 719–734, doi:10.1214/07-BA229].

18.
Model selection for marginal regression analysis of longitudinal data is challenging owing to the presence of correlation and the difficulty of specifying the full likelihood, particularly for correlated categorical data. The paper introduces a novel Bayesian information criterion type model selection procedure based on the quadratic inference function, which does not require the full likelihood or quasi-likelihood. With probability approaching 1, the criterion selects the most parsimonious correct model. Although a working correlation matrix is assumed, there is no need to estimate the nuisance parameters in the working correlation matrix; moreover, the model selection procedure is robust against misspecification of the working correlation matrix. The criterion proposed can also be used to construct a data-driven Neyman smooth test for checking the goodness of fit of a postulated model. This test is especially useful and often yields much higher power in situations where the classical directional test behaves poorly. The finite sample performance of the model selection and model checking procedures is demonstrated through Monte Carlo studies and analysis of a clinical trial data set.

19.
The problem of spurious observations has been dealt with from a Bayesian perspective by, among others, Box and Tiao (1968) and in several papers by Guttman with various co-authors, beginning with Guttman (1973). The main objective of these papers has been to obtain posterior distributions of parameters and to base inference on these distributions. In the current paper, the Bayesian argument is carried one step further by deriving predictive distributions of future observations; inferences are then based on these distributions. We obtain predictive results for several models. First, we consider the univariate normal case with one spurious observation, which is then generalized to several spurious observations. The multivariate normal situation is studied next. Finally, we consider the general linear model with normal errors.

20.
In the area of diagnostics, it is common practice to leverage external data to augment a traditional study of diagnostic accuracy consisting of prospectively enrolled subjects to potentially reduce the time and/or cost needed for the performance evaluation of an investigational diagnostic device. However, the statistical methods currently being used for such leveraging may not clearly separate study design and outcome data analysis, and they may not adequately address possible bias due to differences in clinically relevant characteristics between the subjects constituting the traditional study and those constituting the external data. This paper is intended to draw attention in the field of diagnostics to the recently developed propensity score-integrated composite likelihood approach, which originally focused on therapeutic medical products. This approach applies the outcome-free principle to separate study design and outcome data analysis and can mitigate bias due to imbalance in covariates, thereby increasing the interpretability of study results. While this approach was conceived as a statistical tool for the design and analysis of clinical studies for therapeutic medical products, here, we will show how it can also be applied to the evaluation of sensitivity and specificity of an investigational diagnostic device leveraging external data. We consider two common scenarios for the design of a traditional diagnostic device study consisting of prospectively enrolled subjects, which is to be augmented by external data. The reader will be taken through the process of implementing this approach step-by-step following the outcome-free principle that preserves study integrity.
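A bare-bones sketch of the outcome-free design step is shown below: estimate propensity scores for membership in the traditional study versus the external data using only clinically relevant covariates (no test outcomes), then form strata within which the two sources are comparable. The composite likelihood analysis stage is not shown, and the data and stratification choices are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Simulated covariates: the external source is shifted relative to the study.
rng = np.random.default_rng(7)
n_trad, n_ext = 150, 400
covariates = np.vstack([rng.normal(0.0, 1.0, size=(n_trad, 3)),
                        rng.normal(0.4, 1.0, size=(n_ext, 3))])
source = np.r_[np.ones(n_trad), np.zeros(n_ext)]        # 1 = traditional study

# Propensity scores from covariates only (outcome-free), then five strata.
ps = LogisticRegression(max_iter=1000).fit(covariates, source).predict_proba(covariates)[:, 1]
edges = np.quantile(ps, [0.2, 0.4, 0.6, 0.8])
strata = np.digitize(ps, edges)
for s in range(5):
    in_s = strata == s
    n_t = int(source[in_s].sum())
    n_e = int((1 - source[in_s]).sum())
    print(f"stratum {s}: traditional={n_t}, external={n_e}")
```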
