Found 20 similar articles (search time: 31 ms)
In the nested error regression model, the intra-cluster correlation is rarely known in practice. This article examines the size of the generalized least squares (GLS) F-test using the Fuller–Battese transformation and of the modified F-test. For the balanced case, the former, using the strictly positive, analysis of covariance (ANCOVA), and analysis of variance (ANOVA) estimators of the intra-cluster correlation, can control the size for moderate intra-cluster correlations. For small intra-cluster correlations, they perform well when the number of clusters is large. The latter, using the ANOVA estimator, performs well except when the number of clusters is small; when the intra-cluster correlation is large, it cannot control the size. For the unbalanced case, the GLS F-test using the Fuller–Battese transformation and the modified F-test, using the strictly positive, ANCOVA, and ANOVA estimators, maintain the significance level for small total sample sizes and small intra-cluster correlations when there is large variation in cluster sizes, but they perform well in controlling the size for large total sample sizes and small variation in cluster sizes. In addition, Henderson's method 3 estimator maintains the significance level in a few situations.

This article develops the theoretical framework needed to study the multinomial regression model for complex sample designs with pseudo-minimum phi-divergence estimators. Through a numerical example and a simulation study, new estimators are proposed for the parameter of the logistic regression model with overdispersed multinomial distributions for the response variables, namely the pseudo-minimum Cressie–Read divergence estimators, as well as new estimators for the intra-cluster correlation coefficient. The simulation study shows that Binder's method for the intra-cluster correlation coefficient performs excellently when the pseudo-minimum Cressie–Read divergence estimator with \(\lambda =\frac{2}{3}\) is plugged in.


Various methods have been proposed to estimate intra-cluster correlation coefficients (ICCs) for correlated binary data, and many are very sensitive to the type of design and the underlying distributional assumptions. We proposed a new method to estimate the ICC and its 95% confidence interval based on resampling principles and U-statistics, in which we resampled, with replacement, pairs of individuals from within and between clusters. We concluded from our simulation study that the resampling-based estimates approximate the population ICC more precisely than the analysis of variance and method of moments techniques for different event rates, numbers of clusters, and cluster sizes.
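
As a point of reference for the comparison above, the following is a minimal Python sketch of the one-way ANOVA estimator of the ICC, one of the two baseline techniques the abstract names. It is not the resampling method the authors propose; the function name `anova_icc` and the toy data are illustrative assumptions.

```python
def anova_icc(clusters):
    """One-way ANOVA estimator of the intra-cluster correlation.

    `clusters` is a list of lists, one inner list of 0/1 (or numeric)
    responses per cluster. Hypothetical helper, for illustration only.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    total = sum(sizes)
    if k < 2 or min(sizes) < 1 or total <= k:
        raise ValueError("need at least 2 non-empty clusters and N > k")

    grand_mean = sum(sum(c) for c in clusters) / total
    means = [sum(c) / len(c) for c in clusters]

    # Between-cluster and within-cluster mean squares.
    msb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means)) / (k - 1)
    msw = sum((y - m) ** 2 for c, m in zip(clusters, means) for y in c) / (total - k)

    # Adjusted average cluster size, which handles unequal cluster sizes.
    n0 = (total - sum(n * n for n in sizes) / total) / (k - 1)

    denom = msb + (n0 - 1) * msw
    if denom == 0:
        raise ValueError("no variation in the data; ICC is undefined")
    return (msb - msw) / denom


if __name__ == "__main__":
    # Toy binary data: three clusters of unequal size (made-up values).
    toy = [[1, 1, 1, 0], [0, 0, 0], [1, 0, 1, 0, 0]]
    print(anova_icc(toy))
```

The estimator can fall below zero when the within-cluster mean square exceeds the between-cluster mean square, which is one reason strictly positive alternatives exist.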

In the health and social sciences, researchers often encounter categorical data whose complexity comes from a nested hierarchy and/or cross-classification in the sampling structure. A common feature of these studies is a non-standard data structure with repeated measurements that may have some degree of clustering. In this paper, methodology is presented for the joint estimation of quantities of interest in the context of a stratified two-stage sample with bivariate dichotomous data. These quantities are the mean value π of an observed dichotomous response for a certain condition or time point, and a set of correlation coefficients for intra-cluster association for each condition or time period and for inter-condition correlation within and among clusters. The methodology uses the cluster means and pairwise joint probability parameters from each cluster. Together they provide appropriate information across clusters for the estimation of the correlation coefficients.

Tests that combine p-values, such as Fisher's product test, are popular for testing the global null hypothesis H0 that each of n component null hypotheses, H1,…,Hn, is true versus the alternative that at least one of H1,…,Hn is false, since they are more powerful than classical multiple tests such as the Bonferroni test and the Simes test. Recent modifications of Fisher's product test that are popular in the analysis of large-scale genetic studies include the truncated product method (TPM) of Zaykin et al. (2002), the rank truncated product (RTP) test of Dudbridge and Koeleman (2003) and, more recently, a permutation-based test, the adaptive rank truncated product (ARTP) method of Yu et al. (2009). The TPM and RTP methods require the user to specify a truncation point. The ARTP method improves the performance of the RTP method by optimizing the selection of the truncation point over a set of pre-specified candidate points. In this paper we extend the ARTP by proposing to use all the possible truncation points {1,…,n} as the candidate truncation points. Furthermore, we derive the theoretical probability distribution of the test statistic under the global null hypothesis H0. Simulations are conducted to compare the performance of the proposed test with the Bonferroni test, the Simes test, the RTP test, and Fisher's product test. The simulation results show that the proposed test has higher power than the Bonferroni test and the Simes test, as well as the RTP method. It is also significantly more powerful than Fisher's product test when the number of truly false hypotheses is small relative to the total number of hypotheses, and has comparable power to Fisher's product test otherwise.
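
The three classical procedures that the proposed test is compared against can be sketched in a few lines. The Python below is a minimal illustration that assumes independent p-values; it does not implement the TPM, RTP, or ARTP methods, and the function names are illustrative rather than taken from the paper.

```python
import math


def _check(pvalues):
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")


def fisher_pvalue(pvalues):
    """Global p-value from Fisher's product test.

    Under H0 with independent p-values, -2 * sum(log p) follows a
    chi-square distribution with 2n degrees of freedom. For even degrees
    of freedom the upper tail has the closed form
    exp(-x/2) * sum_{j=0}^{n-1} (x/2)^j / j!.
    """
    _check(pvalues)
    n = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # equals x / 2
    term, total = 1.0, 1.0
    for j in range(1, n):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


def bonferroni_pvalue(pvalues):
    """Global p-value from the Bonferroni test: n * min(p), capped at 1."""
    _check(pvalues)
    return min(1.0, len(pvalues) * min(pvalues))


def simes_pvalue(pvalues):
    """Global p-value from the Simes test: min over i of n * p_(i) / i."""
    _check(pvalues)
    n = len(pvalues)
    ordered = sorted(pvalues)
    return min(n * p / i for i, p in enumerate(ordered, start=1))


if __name__ == "__main__":
    ps = [0.02, 0.025, 0.5]  # made-up p-values for illustration
    print(fisher_pvalue(ps), bonferroni_pvalue(ps), simes_pvalue(ps))
```

The Simes p-value is never larger than the Bonferroni p-value, because the Bonferroni term n·p₍₁₎ is one of the candidates in the Simes minimum.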

Heteroscedastic two-way ANOVA problems are frequently encountered in real data analysis. In the literature, classical F-tests are often blindly employed although they are often biased even for moderate heteroscedasticity. To overcome this problem, several approximate tests have been proposed. These tests, however, are either too complicated to implement or do not control the size well. In this paper, we propose a simple and accurate approximate degrees of freedom (ADF) test. The ADF test is shown to be invariant under affine transformations, different choices of contrast matrix for the same null hypothesis, and different labeling schemes for the cell means. Moreover, it can be conducted easily using the usual F-distribution with one unknown degree of freedom estimated from the data. Simulations demonstrate that the ADF test works well for various cell sizes and parameter configurations, whereas the classical F-tests work badly when the assumption of homogeneous cell variances is violated. A real data example illustrates the methodology.

We estimate the sib–sib correlation by maximizing the log-likelihood of a Kotz-type distribution. Using extensive simulations, we conclude that estimating the sib–sib correlation with the proposed method has many advantages. Results are illustrated on a real-life data set due to Galton. Hypothesis testing about this correlation is also discussed using the three likelihood-based tests and a test based on Srivastava's estimator. It is concluded that the score test derived using the Kotz-type density performs best.

Clustered failure time data are commonly encountered in biomedical research, where study subjects from the same cluster (e.g., family) share common genetic and/or environmental factors, so that failure times within the same cluster are correlated. Two approaches commonly used to account for the intra-cluster association are frailty models and marginal models. In this paper, we study the marginal proportional hazards model, in which the structure of dependence between individuals within a cluster is unspecified. An estimation procedure is developed based on a pseudo-likelihood approach, and a risk set sampling method is proposed for the formulation of the pseudo-likelihood. The asymptotic properties of the proposed estimators are studied, and related issues regarding statistical efficiency are discussed. The performance of the proposed estimators is demonstrated by simulation studies. A data example from a child vitamin A supplementation trial in Nepal (Nepal Nutrition Intervention Project-Sarlahi, or NNIPS) is used to illustrate this methodology.

The generalized estimating equations (GEE) introduced by Liang and Zeger (Biometrika 73 (1986) 13–22) have been widely used over the past decade to analyze longitudinal data. The method uses a generalized quasi-score function estimate for the regression coefficients, and moment estimates for the correlation parameters. Recently, Crowder (Biometrika 82 (1995) 407–410) has pointed out some pitfalls with the estimation of the correlation parameters in the GEE method. In this paper we present a new method for estimating the correlation parameters which overcomes those pitfalls. For some commonly assumed correlation structures, we obtain unique feasible estimates for the correlation parameters. Large sample properties of our estimates are also established.

In this paper we obtain asymptotic expansions, up to order \(n^{-1/2}\) and under a sequence of Pitman alternatives, for the nonnull distribution functions of the likelihood ratio, Wald, score and gradient test statistics in the class of symmetric linear regression models. This is a wide class of models which encompasses the t model and several other symmetric distributions with longer-than-normal tails. The asymptotic distributions of all four statistics are obtained for testing a subset of regression parameters. Furthermore, in order to compare the finite-sample performance of these tests in this class of models, Monte Carlo simulations are presented. An empirical application to a real data set is considered for illustrative purposes.

Optimal k-circulant supersaturated designs have been constructed in the literature using computer-intensive methods. This paper presents a systematic method of construction for multi-level experiments based on balanced incomplete block designs. The method is also applicable to two-level experiments. Illustrative examples are given.

An improved likelihood-based method based on Fraser et al. (1999) is proposed in this paper to test the significance of the second lag of the stationary AR(2) model. Compared with the test proposed by Fan and Yao (2003) and the signed log-likelihood ratio test, the proposed method has remarkable accuracy. Simulation studies are performed to illustrate the accuracy of the proposed method. An application of the proposed method to historical data is presented to demonstrate its implementation. Furthermore, the method can be extended to the general AR(p) model.

Based on various improved robust covariance estimators in the literature, several modified versions of the well-known correlation information criterion (CIC) for selecting the working intra-cluster correlation structure (ICS) are proposed. The performance of these modified criteria is examined and compared with that of the CIC via simulations. When the response is Gaussian, binary, or Poisson, the modified criteria are shown to have higher detection rates when the true ICS is exchangeable, while the CIC performs better when the true ICS is AR(1). The criteria are applied to a real dataset.

The K-means algorithm and the normal mixture model method are two common clustering methods. The K-means algorithm is a popular heuristic approach which gives reasonable clustering results if the component clusters are ball-shaped. Currently, there are no analytical results for this algorithm when the component distributions deviate from the ball shape. This paper analytically studies how the K-means algorithm changes its classification rule as the normal component distributions become more elongated under the homoscedastic assumption, and compares this rule with the Bayes rule from the mixture model method. We show that the classification rules of both methods are linear, but the slopes of the two classification lines change in opposite directions as the component distributions become more elongated. The classification performance of the K-means algorithm is then compared with that of the mixture model method via simulation. The comparison, which is limited to two clusters, shows that the K-means algorithm consistently provides poor classification performance as the component distributions become more elongated, while the mixture model method can potentially, but not necessarily, take advantage of this change and provide much better classification performance.

Hypothesis testing and confidence regions are considered for the common mean vector of several multivariate normal populations when the covariance matrices are unknown and possibly unequal. A generalized confidence region is derived using the concept of the generalized p-value. The generalized confidence region is illustrated with two numerical examples. The merits of the proposed method are numerically compared with those of existing methods with respect to their expected areas or expected d-dimensional volumes and their coverage probabilities under different scenarios.

In a clinical trial, we may randomize subjects (called clusters) to different treatments (called groups), and make observations from multiple sites (called units) of each subject. In this case, the observations within each subject could be dependent, whereas those from different subjects are independent. If the outcome of interest is the time to an event, we may use the standard rank tests proposed for independent survival data, such as the logrank and Wilcoxon tests, to test the equality of marginal survival distributions, but their standard errors should be modified to accommodate the possible intracluster correlation. In this paper we propose a method of calculating the standard errors of the rank tests for two-sample clustered survival data. The method extends naturally to K-sample tests under dependence.

Clustering gene expression time course data is an important problem in bioinformatics because understanding which genes behave similarly can lead to the discovery of important biological information. Statistically, the problem of clustering time course data is a special case of the more general problem of clustering longitudinal data. In this paper, a very general and flexible model-based technique is used to cluster longitudinal data. Mixtures of multivariate t-distributions are utilized, with a linear model for the mean and a modified Cholesky-decomposed covariance structure. Constraints are placed upon the covariance structure, leading to a novel family of mixture models, including parsimonious models. In addition to model-based clustering, these models are also used for model-based classification, i.e., semi-supervised clustering. Parameters, including the component degrees of freedom, are estimated using an expectation-maximization algorithm, and two different approaches to model selection are considered. The models are applied to simulated data to illustrate their efficacy; this includes a comparison with their Gaussian analogues, whose use with a linear model for the mean is novel in itself. Our family of multivariate t mixture models is then applied to two real gene expression time course data sets and the results are discussed. We conclude with a summary, suggestions for future work, and a discussion about constraining the degrees of freedom parameter.

We propose penalized-likelihood methods for parameter estimation of the high-dimensional t distribution. First, we show that a general class of commonly used shrinkage covariance matrix estimators for the multivariate normal can be obtained as penalized-likelihood estimators with a penalty that is proportional to the entropy loss between the estimate and an appropriately chosen shrinkage target. Motivated by this fact, we then consider applying this penalty to the multivariate t distribution. The penalized estimate can be computed efficiently using the EM algorithm for given tuning parameters. It can also be viewed as an empirical Bayes estimator. Taking advantage of its Bayesian interpretation, we propose a variant of the method of moments to effectively elicit the tuning parameters. Simulations and real data analysis demonstrate the competitive performance of the new methods.

This paper presents a new bivariate discrete distribution that generalizes the bivariate Beta-Binomial distribution. It is generated by the Appell hypergeometric function \(F_1\) and can be obtained as a Binomial mixture with Exton's Generalized Beta distribution. The model has different marginal distributions which are, together with the conditional distributions, more flexible than the Beta-Binomial distribution. It has non-linear regression curves and is useful for random variables with positive correlation. These features make the model well suited to fitting observed data, as the two included applications show.

In this paper, we study M-estimators of regression parameters in semiparametric linear models for censored data. A class of consistent and asymptotically normal M-estimators is constructed. A resampling method is developed for the estimation of the asymptotic covariance matrix of the estimators.
