20 similar documents retrieved; search took 281 ms
1.
Viswanathan Ramakrishnan 《Communications in Statistics - Simulation and Computation》2013,42(3):405-418
In many genetic analyses of dichotomous twin data, odds ratios have been used to test hypotheses on heritability and shared common environment effects of a given disease (Lichtenstein et al., 2000; Ahlbom et al., 1997; Ramakrishnan et al., 1992). However, estimates of these two effects have not been dealt with in the literature. In epidemiology, the attributable fraction (AF), a function of the odds ratio and the prevalence of the risk factor, has been used to describe the contribution of a risk factor to a disease in a given population (Leviton, 1973). In this article, we adapt the AF to quantify the heritability and the shared common environment. Twin data on cancer, gallstone disease and phobia are used to illustrate the applicability of the AF estimate as a measure of heritability.
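The attributable-fraction idea can be sketched numerically. The snippet below uses Levin's classical formula AF = p(OR − 1)/(1 + p(OR − 1)), treating the odds ratio as an approximation to the relative risk; the twin-specific adaptation developed in the paper is not reproduced here, and the prevalence and odds-ratio values are purely illustrative.

```python
def attributable_fraction(odds_ratio, prevalence):
    """Levin's attributable fraction, AF = p(OR-1) / (1 + p(OR-1)),
    using the odds ratio as an approximation to the relative risk
    (a reasonable approximation for rare outcomes)."""
    excess = prevalence * (odds_ratio - 1.0)
    return excess / (1.0 + excess)

# Illustrative numbers only: risk-factor prevalence 0.30, odds ratio 2.0
af = attributable_fraction(odds_ratio=2.0, prevalence=0.30)
```

With these numbers about 23% of the disease burden would be attributed to the risk factor; an odds ratio of 1 gives AF = 0, as it should.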
2.
We first compare correspondence analysis, which uses chi-square distance, and an alternative approach using Hellinger distance, for representing categorical data in a contingency table. We propose a coefficient which globally measures the similarity between these two approaches. This coefficient can be decomposed into several components, one component for each principal dimension, indicating the contribution of the dimensions to the difference between the two representations. We also make comparisons with the logratio approach based on compositional data. These three methods of representation can produce quite similar results. Two illustrative examples are given.
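The two distances being compared can be written down directly for the row profiles of a contingency table. A minimal numpy sketch with an invented 3x3 table; the proposed global similarity coefficient and its per-dimension decomposition are not implemented here.

```python
import numpy as np

# Invented 3x3 contingency table of counts
N = np.array([[30., 10., 20.],
              [10., 40., 10.],
              [20., 10., 30.]])

P = N / N.sum()                # correspondence matrix
r = P.sum(axis=1)              # row masses
c = P.sum(axis=0)              # column masses
profiles = P / r[:, None]      # row profiles (each row sums to 1)

def chi2_dist(i, j):
    """Chi-square distance between row profiles i and j:
    coordinates weighted by inverse column masses."""
    d = profiles[i] - profiles[j]
    return np.sqrt(np.sum(d ** 2 / c))

def hellinger_dist(i, j):
    """Hellinger distance: Euclidean distance between the
    square-rooted row profiles (no column-mass weighting)."""
    d = np.sqrt(profiles[i]) - np.sqrt(profiles[j])
    return np.sqrt(np.sum(d ** 2))
```

The key structural difference is visible in the code: the chi-square metric weights categories by 1/c, so rare columns can dominate, while the Hellinger metric does not depend on the column masses at all.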
3.
Monia Lupparelli, Giovanni M. Marchetti & Wicher P. Bergsma 《Scandinavian Journal of Statistics》2009,36(3):559-576
Abstract. We discuss two parameterizations of models for marginal independencies for discrete distributions which are representable by bi-directed graph models, under the global Markov property. Such models are useful data analytic tools especially if used in combination with other graphical models. The first parameterization, in the saturated case, is also known as the multivariate logistic transformation, the second is a variant that allows, in some (but not all) cases, variation-independent parameters. An algorithm for maximum likelihood fitting is proposed, based on an extension of the Aitchison and Silvey method.
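In the saturated case for just two binary variables, the multivariate logistic transformation reduces to the two marginal logits plus a log odds ratio. A minimal stdlib sketch with made-up joint probabilities; the bi-directed-graph independence constraints and the fitting algorithm are not implemented.

```python
import math

# Invented joint probabilities p[a][b] for binary A, B (sum to 1)
p = [[0.4, 0.2],
     [0.1, 0.3]]

pA = p[1][0] + p[1][1]          # marginal P(A=1)
pB = p[0][1] + p[1][1]          # marginal P(B=1)

logit_A = math.log(pA / (1 - pA))   # marginal logit of A
logit_B = math.log(pB / (1 - pB))   # marginal logit of B
# Highest-order term: log odds ratio of the joint table
log_or = math.log(p[0][0] * p[1][1] / (p[0][1] * p[1][0]))
```

The three numbers (logit_A, logit_B, log_or) recover the joint distribution in this saturated two-variable case; marginal independence of A and B corresponds to log_or = 0.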
4.
By entering the data (y_i, x_i) followed by (-y_i, -x_i), one can obtain an intercept-free regression Y = Xβ + ε from a program package that normally uses an intercept term. There is no bias in the resultant regression coefficients, but a minor postanalysis adjustment is needed to the residual variance and standard errors.
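The augmentation trick can be verified in a few lines. A numpy sketch with simulated data (all values invented): fitting with an intercept column on the doubled data returns a zero intercept and exactly the usual no-intercept slope. As the entry notes, the reported residual variance and standard errors would still need a degrees-of-freedom adjustment, since the package sees 2n observations rather than n.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 * x + rng.normal(scale=0.5, size=50)

# Augment the data with its negation: (y_i, x_i) followed by (-y_i, -x_i)
xa = np.concatenate([x, -x])
ya = np.concatenate([y, -y])

# Fit WITH an intercept column, as an ordinary package would
X = np.column_stack([np.ones(xa.size), xa])
coef, *_ = np.linalg.lstsq(X, ya, rcond=None)

# The augmented data have mean zero, so the fitted intercept is zero
# and the slope equals the direct no-intercept estimate sum(xy)/sum(x^2)
slope_direct = np.sum(x * y) / np.sum(x * x)
```

The intercept vanishes because both augmented variables have exact mean zero, so the fitted line must pass through the origin.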
5.
Conditional Prior Proposals in Dynamic Models
Leonhard Knorr-Held 《Scandinavian Journal of Statistics》1999,26(1):129-144
ABSTRACT. Dynamic models extend state space models to non-normal observations. This paper suggests a specific hybrid Metropolis–Hastings algorithm as a simple device for Bayesian inference via Markov chain Monte Carlo in dynamic models. Hastings proposals from the (conditional) prior distribution of the unknown, time-varying parameters are used to update the corresponding full conditional distributions. It is shown through simulated examples that the methodology has optimal performance in situations where the prior is relatively strong compared to the likelihood. Typical examples include smoothing priors for categorical data. A specific blocking strategy is proposed to ensure good mixing and convergence properties of the simulated Markov chain. It is also shown that the methodology is easily extended to robust transition models using mixtures of normals. The applicability is illustrated with an analysis of a binomial and a binary time series, known in the literature.
6.
Inverse Gaussian first hitting time regression models sometimes provide an attractive representation of lifetime data. Various authors comment that dependence of both parameters on the same covariate may imply multicollinearity. The frequent appearance of conflicting signs for the two coefficients of the same covariate may be related to this. We carry out simulation studies to examine the reality of this possible multicollinearity. Although there is some dependence between estimates, multicollinearity does not seem to be a major problem. Fitting this model to data generated by a Weibull regression suggests that conflicting signs of estimates may be due to model misspecification.
7.
The potency of antiretroviral agents in AIDS clinical trials can be assessed on the basis of a viral response such as viral decay rate or change in viral load (number of HIV RNA copies in plasma). Linear, nonlinear, and nonparametric mixed-effects models have been proposed to estimate such parameters in viral dynamic models. However, there are two critical questions that stand out: whether these models achieve consistent estimates for viral decay rates, and which model is more appropriate for use in practice. Moreover, one often assumes that a model random error is normally distributed, but this assumption may be unrealistic, obscuring important features of within- and among-subject variations. In this article, we develop a skew-normal (SN) Bayesian linear mixed-effects (SN-BLME) model, an SN Bayesian nonlinear mixed-effects (SN-BNLME) model, and an SN Bayesian semiparametric nonlinear mixed-effects (SN-BSNLME) model that relax the normality assumption by considering model random error to have an SN distribution. We compare the performance of these SN models, and also compare their performance with the corresponding normal models. An AIDS dataset is used to test the proposed models and methods. It was found that there is a significant incongruity in the estimated viral decay rates. The results indicate that the SN-BSNLME model is preferred to the other models, implying that an arbitrary data truncation is not necessary. The findings also suggest that it is important to assume a model with an SN distribution in order to achieve reasonable results when the data exhibit skewness.
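The skew-normal error distribution these models rely on can be simulated via the standard convolution representation Z = δ|U₀| + √(1 − δ²)U₁ with δ = α/√(1 + α²), which has mean δ√(2/π). A minimal sketch (the Bayesian mixed-effects machinery itself is not reproduced; the shape value is invented):

```python
import numpy as np

def rskewnorm(alpha, size, rng):
    """Sample the standard skew-normal SN(alpha) via the convolution
    representation Z = delta*|U0| + sqrt(1 - delta^2)*U1,
    where U0, U1 are independent standard normals."""
    delta = alpha / np.sqrt(1.0 + alpha ** 2)
    u0 = rng.normal(size=size)
    u1 = rng.normal(size=size)
    return delta * np.abs(u0) + np.sqrt(1.0 - delta ** 2) * u1

rng = np.random.default_rng(42)
alpha = 4.0
z = rskewnorm(alpha=alpha, size=200_000, rng=rng)
# Theoretical mean: delta * sqrt(2/pi); variance: 1 - (2/pi)*delta^2
```

For alpha = 0 the representation collapses to an ordinary standard normal, which is why these models strictly generalize their normal counterparts.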
8.
There has been much recent work on Bayesian approaches to survival analysis, incorporating features such as flexible baseline hazards, time-dependent covariate effects, and random effects. Some of the proposed methods are quite complicated to implement, and we argue that as good or better results can be obtained via simpler methods. In particular, the normal approximation to the log-gamma distribution yields easy and efficient computational methods in the face of simple multivariate normal priors for baseline log-hazards and time-dependent covariate effects. While the basic method applies to piecewise-constant hazards and covariate effects, it is easy to apply importance sampling to consider smoother functions.
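The approximation referred to is that if X ~ Gamma(a, 1), then log X is approximately N(ψ(a), ψ′(a)), where ψ is the digamma function; the first two moments match exactly, and the normal shape improves as a grows. A stdlib-only check by simulation, with digamma and trigamma obtained by finite differences of math.lgamma (the shape value is invented):

```python
import math
import random

a = 8.0   # gamma shape; the normal approximation improves as a grows
h = 1e-3  # step for finite-difference derivatives of lgamma

# digamma(a) and trigamma(a) by central differences of log-gamma
digamma = (math.lgamma(a + h) - math.lgamma(a - h)) / (2 * h)
trigamma = (math.lgamma(a + h) - 2 * math.lgamma(a)
            + math.lgamma(a - h)) / h ** 2

# Simulate log X for X ~ Gamma(a, 1); E[log X] = psi(a), Var = psi'(a)
random.seed(1)
logs = [math.log(random.gammavariate(a, 1.0)) for _ in range(200_000)]
m = sum(logs) / len(logs)
v = sum((u - m) ** 2 for u in logs) / len(logs)
```

Matching a normal prior on a log-hazard to a gamma distribution through (ψ(a), ψ′(a)) is what makes the conjugate-style computations in this setting cheap.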
9.
Data from complex surveys are being used increasingly to build the same sort of explanatory and predictive models as those used in the rest of statistics. Unfortunately the assumptions underlying standard statistical methods are not even approximately valid for most survey data. The problem of parameter estimation has been largely solved, at least for routine data analysis, through the use of weighted estimating equations, and software for most standard analytical procedures is now available in the major statistical packages. One notable omission from standard software is an analogue of the likelihood ratio test. An exception is the Rao–Scott test for loglinear models in contingency tables. In this paper we show how the Rao–Scott test can be extended to handle arbitrary regression models. We illustrate the process of fitting a model to survey data with an example from NHANES.
10.
Geert Molenberghs & Els Goetghebeur 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1997,59(2):401-414
A popular approach to estimation based on incomplete data is the EM algorithm. For categorical data, this paper presents a simple expression of the observed data log-likelihood and its derivatives in terms of the complete data for a broad class of models and missing data patterns. We show that using the observed data likelihood directly is easy and has some advantages. One can gain considerable computational speed over the EM algorithm and a straightforward variance estimator is obtained for the parameter estimates. The general formulation treats a wide range of missing data problems in a uniform way. Two examples are worked out in full.
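To illustrate working with the observed-data likelihood directly, here is the classic genetic-linkage multinomial (Rao's counts, the standard EM textbook example, not one of this paper's own examples): a one-parameter observed-data log-likelihood maximized by golden-section search, with no E- or M-steps at all.

```python
import math

# Classic genetic-linkage counts (Rao; used by Dempster, Laird & Rubin)
y = (125, 18, 20, 34)

def loglik(theta):
    """Observed-data multinomial log-likelihood (up to a constant)."""
    p = (0.5 + theta / 4, (1 - theta) / 4, (1 - theta) / 4, theta / 4)
    return sum(n * math.log(pi) for n, pi in zip(y, p))

# Direct maximization by golden-section search -- no EM iterations needed
lo, hi = 1e-6, 1 - 1e-6
phi = (math.sqrt(5) - 1) / 2
for _ in range(200):
    a = hi - phi * (hi - lo)
    b = lo + phi * (hi - lo)
    if loglik(a) < loglik(b):
        lo = a
    else:
        hi = b
theta_hat = (lo + hi) / 2   # approximately 0.6268, the familiar MLE
```

The same maximizer is what EM converges to after many iterations; here a generic one-dimensional optimizer reaches it directly, which is the computational point the abstract makes.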
11.
Recently, a least absolute deviations (LAD) estimator for median regression models with doubly censored data was proposed and the asymptotic normality of the estimator was established. However, inference on the regression parameter vectors is difficult, because the asymptotic covariance matrices involve conditional densities of the error terms and are hard to estimate reliably. In this article, three methods, which are based on bootstrap, random weighting, and empirical likelihood, respectively, and do not require density estimation, are proposed for making inference for the doubly censored median regression models. Simulations are also done to assess the performance of the proposed methods.
12.
An Empirical Study of Outlier Detection for Functional Data Based on Graphical Methods
Functional data are inherently complex: their sampling, generation, structure, and degree of dependence all affect how complex and describable the data are, and in some cases even basic visualization is difficult. Building on functional principal component scores and on Tukey's notions of data depth and density, this paper introduces the bagplot and boxplot for functional data and proposes three graphical methods for detecting outliers in functional data. Compared with existing detection methods, the three proposed methods identify outliers in functional data more readily.
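One depth-based ingredient of such graphical displays can be sketched with the modified band depth of López-Pintado and Romo, which underlies the functional boxplot: curves with the lowest depth are outlier candidates. A numpy sketch on synthetic curves with one planted outlier (all settings invented; the bagplot construction itself is not implemented):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
curves = np.array([np.sin(2 * np.pi * t) + rng.normal(0, 0.1, t.size)
                   for _ in range(20)])
curves[0] += 3.0          # plant an obvious shift outlier

def modified_band_depth(X):
    """Modified band depth (pairs, J=2): for each curve, the average
    proportion of the domain on which it lies inside the band spanned
    by a pair of sample curves."""
    n = X.shape[0]
    depth = np.zeros(n)
    for i in range(n):
        for j in range(i + 1, n):
            band_lo = np.minimum(X[i], X[j])
            band_hi = np.maximum(X[i], X[j])
            inside = (X >= band_lo) & (X <= band_hi)   # n x T booleans
            depth += inside.mean(axis=1)
    return depth / (n * (n - 1) / 2)

d = modified_band_depth(curves)
outlier = int(np.argmin(d))    # the shifted curve has the smallest depth
```

A functional boxplot would then draw the 50% central region from the deepest curves and flag low-depth curves such as the planted one.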
13.
This paper is concerned with the analysis of ordinal data through linear models for rank function measures. Primary attention is directed at pairwise Mann-Whitney statistics, for which dimension reduction is managed by use of a Bradley-Terry log-linear structure. The nature of linear models for such quantities is contrasted with that for mean ranks (or ridits). Aspects of application are illustrated with an example for which results of other methods are also given.
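The pairwise Mann-Whitney statistic for two ordinal samples is simply an estimate of P(X > Y), with ties split evenly. A minimal stdlib sketch with invented ordinal data (the Bradley-Terry log-linear dimension reduction is not implemented here):

```python
def mann_whitney_prob(x, y):
    """Estimate P(X > Y) + 0.5 * P(X = Y), the pairwise
    Mann-Whitney measure, by counting all sample pairs."""
    gt = sum(1 for xi in x for yi in y if xi > yi)
    eq = sum(1 for xi in x for yi in y if xi == yi)
    return (gt + 0.5 * eq) / (len(x) * len(y))

# Invented ordinal responses (coded 1..4) from two treatment groups
a = [1, 2, 2, 3, 4, 4]
b = [1, 1, 2, 2, 3, 3]
p_ab = mann_whitney_prob(a, b)   # 2/3: group a tends to respond higher
```

Complementarity holds by construction: mann_whitney_prob(a, b) + mann_whitney_prob(b, a) = 1, which is the property a Bradley-Terry-style logit parameterization of these pairwise measures exploits.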
14.
Many methods have been developed for detecting multiple outliers in a single multivariate sample, but very few for the case where there may be groups in the data. We propose a method of simultaneously determining groups (as in cluster analysis) and detecting outliers, which are points that are distant from every group. Our method is an adaptation of the BACON algorithm proposed by Billor, Hadi and Velleman for the robust detection of multiple outliers in a single group of multivariate data. There are two versions of our method, depending on whether or not the groups can be assumed to have equal covariance matrices. The effectiveness of the method is illustrated by its application to two real data sets and further shown by a simulation study for different sample sizes and dimensions for 2 and 3 groups, with and without planted outliers in the data. When the number of groups is not known in advance, the algorithm could be used as a robust method of cluster analysis, by running it for various numbers of groups and choosing the best solution.
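The single-group BACON iteration that the method adapts can be sketched as follows: start from a small "basic subset" of points closest to the coordinate-wise median, then repeatedly re-estimate the mean and covariance from the subset and admit every point whose Mahalanobis distance falls below a cutoff. The chi-square-based cutoff of Billor, Hadi and Velleman is replaced by a fixed constant here, so this is a sketch of the idea rather than the published algorithm; the grouped extension is not implemented.

```python
import numpy as np

def bacon_sketch(X, cutoff=3.0, m=None):
    """Simplified single-group BACON-style outlier detection.
    Returns indices of points NOT absorbed into the basic subset."""
    n, p = X.shape
    if m is None:
        m = 4 * p                         # initial basic-subset size
    med = np.median(X, axis=0)
    d0 = np.linalg.norm(X - med, axis=1)  # distances from the median
    subset = np.argsort(d0)[:m]
    for _ in range(50):
        mu = X[subset].mean(axis=0)
        cov = np.cov(X[subset].T)
        inv = np.linalg.inv(cov)
        diff = X - mu
        dist = np.sqrt(np.einsum('ij,jk,ik->i', diff, inv, diff))
        new = np.where(dist < cutoff * np.sqrt(p))[0]
        if set(new) == set(subset):       # subset stabilized
            break
        subset = new
    return np.setdiff1d(np.arange(n), subset)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
X[:3] += 10.0                             # plant three gross outliers
flagged = bacon_sketch(X)
```

Because the initial subset is anchored at the median and outliers are never absorbed, the estimates stay uncontaminated even with a cluster of gross outliers, which is the robustness BACON is designed for.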
15.
Laura Ferreira 《Communications in Statistics - Simulation and Computation》2013,42(9):1925-1949
Functional data analysis (FDA)—the analysis of data that can be considered a set of observed continuous functions—is an increasingly common class of statistical analysis. One of the most widely used FDA methods is the cluster analysis of functional data; however, little work has been done to compare the performance of clustering methods on functional data. In this article, a simulation study compares the performance of four major hierarchical methods for clustering functional data. The simulated data varied in three ways: the nature of the signal functions (periodic, non-periodic, or mixed), the amount of noise added to the signal functions, and the pattern of the true cluster sizes. The Rand index was used to compare the performance of each clustering method. As a secondary goal, clustering methods were also compared when the number of clusters has been misspecified. To illustrate the results, a real set of functional data was clustered where the true clustering structure is believed to be known. Comparing the clustering methods for the real data set confirmed the findings of the simulation. This study yields concrete suggestions to future researchers to determine the best method for clustering their functional data.
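The Rand index used to score the clusterings counts pairs of observations on which two partitions agree (both together, or both apart). A minimal stdlib sketch:

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Proportion of point pairs on which two clusterings agree:
    the pair is together in both, or apart in both."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)

ri_same = rand_index([0, 0, 1, 1], [1, 1, 0, 0])   # relabelled but identical
ri_diff = rand_index([0, 0, 1, 1], [0, 1, 0, 1])   # crossed partition
```

Because only the pair structure matters, the index is invariant to relabelling the clusters, which is what makes it suitable for comparing a clustering against a known true structure.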
16.
G. Jacob, F. H. C. Marriott & P. A. Robbins 《Journal of the Royal Statistical Society. Series C, Applied statistics》1997,46(2):235-243
Records of gas flow during breathing are cyclical, with the cycles varying in duration. The shape of these cycles may change with the intensity of respiratory stimulation or the development of respiratory disease, but currently research is hampered by the lack of a fully satisfactory technique for determining the shape of a typical cycle. The approach adopted here is to replace the time series by a 'phase diagram', plotting the time integral of flow against flow itself. Principal curves are then fitted. These are curves 'through the middle of the data' which were introduced by Hastie and Stuetzle. The shapes of these curves are compared, either directly or after reconstructing an average cycle corresponding to each fitted curve. This has the advantage that the shape of the waveform is separated from the amplitude, and from the duration of the breath. A disadvantage is that periods of zero flow are lost, and the reconstructed average cycle may show irregularities at points near zero flow as a result. In practice, the methodology showed clear differences in shape between the protocols, gave reasonable average cycles and ordered the waveform shapes according to the hardness of breathing induced by the protocols.
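The phase-diagram construction is easy to sketch on an idealized flow signal: for a sinusoidal flow, plotting the running time integral (volume) against flow traces a circle, independent of where in time the cycle starts. The signal below is synthetic, not real respiratory data, and the principal-curve fitting of Hastie and Stuetzle is not reproduced.

```python
import numpy as np

# Idealized one-cycle flow signal: a pure sine wave over [0, 2*pi]
t = np.linspace(0, 2 * np.pi, 2000)
flow = np.sin(t)

# Time integral of flow (cumulative volume), by the trapezoidal rule
volume = np.concatenate(
    [[0.0], np.cumsum((flow[1:] + flow[:-1]) / 2 * np.diff(t))])

# Phase diagram: (volume, flow). For this signal volume = 1 - cos(t),
# so the points lie on the circle (volume - 1)^2 + flow^2 = 1.
radius = np.sqrt((volume - 1.0) ** 2 + flow ** 2)
```

Time has been eliminated from the representation, which is exactly how the approach separates waveform shape from breath duration.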
17.
Multivariate survival data arise when each study subject may experience multiple events or when study subjects are clustered into groups. Statistical analyses of such data need to account for the intra-cluster dependence through appropriate modeling. Frailty models are the most popular for such failure time data. However, there are other approaches which model the dependence structure directly. In this article, we compare the frailty models for bivariate data with the models based on bivariate exponential and Weibull distributions. Bayesian methods provide a convenient paradigm for comparing the two sets of models we consider. Our techniques are illustrated using two examples. One simulated example demonstrates model choice methods developed in this paper and the other example, based on a practical data set of onset of blindness among patients with diabetic retinopathy, considers Bayesian inference using different models.
18.
The problem of outliers in statistical data has attracted many researchers for a long time. Consequently, numerous outlier detection methods have been proposed in the statistical literature. However, no consensus has emerged as to which method is uniformly better than the others or which one is recommended for use in practical situations. In this article, we perform an extensive comparative Monte Carlo simulation study to assess the performance of the multiple outlier detection methods that are either recently proposed or frequently cited in the outlier detection literature. Our simulation experiments include a wide variety of realistic and challenging regression scenarios. We give recommendations on which method is superior to others under what conditions.
19.
This article considers a class of estimators for the location and scale parameters in the location-scale model based on 'synthetic data' when the observations are randomly censored on the right. The asymptotic normality of the estimators is established using counting process and martingale techniques when the censoring distribution is known and unknown, respectively. In the case when the censoring distribution is known, we show that the asymptotic variances of this class of estimators depend on the data transformation and have a lower bound which is not achievable by this class of estimators. However, in the case that the censoring distribution is unknown and estimated by the Kaplan–Meier estimator, this class of estimators has the same asymptotic variance and attains the lower bound for variance for the case of known censoring distribution. This is different from censored regression analysis, where asymptotic variances depend on the data transformation. Our method has three valuable advantages over the method of maximum likelihood estimation. First, our estimators are available in a closed form and do not require an iterative algorithm. Second, simulation studies show that our estimators, being moment-based, are comparable to maximum likelihood estimators and outperform them when the sample size is small and the censoring rate is high. Third, our estimators are more robust to model misspecification than maximum likelihood estimators. Therefore, our method can serve as a competitive alternative to the method of maximum likelihood in estimation for location-scale models with censored data. A numerical example is presented to illustrate the proposed method.
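The 'synthetic data' device (in the Koul-Susarla-Van Ryzin style) can be checked by simulation for the known-censoring case: inflating each uncensored response by the censoring survival function restores the uncensored mean. The distributions and rates below are invented for illustration; the location-scale estimators of the paper are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
T = rng.exponential(1.0, n)          # true lifetimes, mean 1
C = rng.exponential(2.0, n)          # censoring times, rate 1/2
Y = np.minimum(T, C)                 # observed (possibly censored) times
delta = (T <= C).astype(float)       # 1 if uncensored

# Known censoring distribution: P(C > y) = exp(-y/2)
surv_C = np.exp(-Y / 2.0)

# Synthetic responses: delta*Y / P(C > Y) has the same mean as T,
# since E[ T * 1{C > T} / S_C(T) ] = E[T]
Y_synth = delta * Y / surv_C
est_mean = Y_synth.mean()            # close to E[T] = 1
```

The naive mean of Y is biased downward by censoring, while the synthetic-data mean is not; this unbiasedness is what lets ordinary uncensored estimating equations be applied to the transformed responses.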
20.
This article considers the estimation and testing of a within-group two-stage least squares (TSLS) estimator for instruments with varying degrees of weakness in a longitudinal (panel) data model. We show that adding the repeated cross-sectional information into a regression model can improve estimation when instruments are weak. Moreover, the consistency and limiting distribution of the TSLS estimator are established when both N and T tend to infinity. Some asymptotically pivotal tests are extended to a longitudinal data model and their asymptotic properties are examined. A Monte Carlo experiment is conducted to evaluate the finite sample performance of the proposed estimators.
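A plain cross-sectional TSLS sketch shows the basic mechanics the panel estimator builds on; the within-group transformation and the (N, T) asymptotics are not implemented, and all data-generating values are invented. With an endogenous regressor, OLS is biased by the common shock while TSLS is not.

```python
import numpy as np

def tsls(y, x, z):
    """Two-stage least squares with one regressor and one instrument
    (constant terms included in both stages)."""
    Z = np.column_stack([np.ones_like(z), z])
    # Stage 1: project the endogenous regressor on the instrument
    g, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ g
    # Stage 2: regress the outcome on the fitted values
    X = np.column_stack([np.ones_like(x_hat), x_hat])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b[1]

rng = np.random.default_rng(3)
n = 50_000
u = rng.normal(size=n)                # common shock: makes x endogenous
z = rng.normal(size=n)                # instrument: drives x, unrelated to u
x = 1.0 * z + u + rng.normal(size=n)
y = 1.0 * x + u + rng.normal(size=n)  # true coefficient is 1

b_ols = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b_iv = tsls(y, x, z)                  # consistent, unlike OLS
```

Here plim(b_ols) = 1 + cov(x, u)/var(x) = 4/3, so the OLS bias is visible even in one draw, while the instrument strength (first-stage coefficient 1) keeps the TSLS estimate well-behaved; weakening that coefficient toward zero is exactly the regime the paper studies.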