Similar Articles (20 results)
1.
It has long been known that, for many joint distributions, Kendall's τ and Spearman's ρ take different values, as they measure different aspects of the dependence structure. Although classical inequalities between Kendall's τ and Spearman's ρ for pairs of random variables are available, the joint distributions that attain the bounds between Kendall's τ and Spearman's ρ are difficult to find. We use the simulated annealing method to find the bounds for ρ in terms of τ, together with the corresponding joint distributions that attain those bounds. Furthermore, using the same method, we find improved bounds between τ and ρ, which differ from those given by Durbin and Stuart.
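As a point of reference for the inequalities discussed above, the sketch below computes both rank correlations on a simulated sample and checks Daniels' classical inequality |3τ − 2ρ| ≤ 1. It is a minimal illustration using scipy, not the simulated-annealing search used in the paper; the simulated data are an assumption.

```python
import numpy as np
from scipy.stats import kendalltau, spearmanr

rng = np.random.default_rng(0)

# Draw a bivariate sample with monotone dependence.
x = rng.normal(size=500)
y = 0.7 * x + rng.normal(size=500)

tau, _ = kendalltau(x, y)
rho, _ = spearmanr(x, y)

# Daniels' classical inequality relating the two measures: |3*tau - 2*rho| <= 1.
print(f"tau = {tau:.3f}, rho = {rho:.3f}, |3*tau - 2*rho| = {abs(3*tau - 2*rho):.3f}")
assert abs(3 * tau - 2 * rho) <= 1
```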

2.
Kendall's τ is a non-parametric measure of correlation based on ranks and is used in a wide range of research disciplines. Although methods are available for making inference about Kendall's τ, none has been extended to modeling multiple Kendall's τs arising in longitudinal data analysis. Compounding this problem is the pervasive issue of missing data in such study designs. In this article, we develop a novel approach to provide inference about Kendall's τ within a longitudinal study setting under both complete and missing data. The proposed approach is illustrated with simulated data and applied to an HIV prevention study.

3.
Exploratory methods for determining appropriate lagged variables in a vector nonlinear time series model are investigated. The first is a multivariate extension of the R statistic considered by Granger and Lin (1994), which is based on an estimate of the mutual information criterion. The second method uses Kendall's ρ and partial ρ statistics for lag determination. The methods provide nonlinear analogues of the autocorrelation and partial autocorrelation matrices for a vector time series. Simulation studies indicate that the R statistic reliably identifies appropriate lagged nonlinear moving average terms in a vector time series, while Kendall's ρ and partial ρ statistics have some power in identifying appropriate lagged nonlinear moving average and autoregressive terms, respectively, when the nonlinear relationship between lagged variables is monotonic. For illustration, the methods are applied to a set of annual temperature and tree ring measurements at Campito Mountain in California.
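A rank-based lag scan of the kind described above can be sketched in a few lines: the example below computes Kendall's statistic between y_t and y_{t−k} for a simulated nonlinear moving-average series, so the dependence at the true lag shows up even though it is nonlinear. The series, the lag range, and the use of scipy's kendalltau as a stand-in for the paper's statistics are all assumptions for illustration.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(1)

# A nonlinear MA-type series: y_t depends monotonically on e_{t-2}.
e = rng.normal(size=1000)
y = e[2:] + 0.8 * e[:-2] ** 3

# Rank-correlation analogue of the autocorrelation function:
# scan Kendall's statistic between y_t and y_{t-k} for k = 1..6.
for k in range(1, 7):
    stat, p = kendalltau(y[k:], y[:-k])
    print(f"lag {k}: tau = {stat:+.3f}  (p = {p:.3g})")
```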

4.
Panel datasets have been increasingly used in economics to analyze complex economic phenomena. Panel data form a two-dimensional array that combines cross-sectional and time series data. By constructing a panel data matrix, clustering methods can be applied to panel data analysis; this addresses the heterogeneity of the dependent variable in the panel data before the analysis. Clustering is a widely used statistical tool for determining subsets in a given dataset. In this article, we show how a mixed panel dataset can be clustered by agglomerative hierarchical algorithms based on Gower's distance and by the k-prototypes algorithm. The performance of these algorithms is studied on panel data with mixed numerical and categorical features, and their effectiveness is compared using cluster accuracy. An experimental analysis is illustrated on a real dataset using Stata and R software.
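To make the distance underlying the hierarchical variant concrete, here is a small sketch of Gower's distance on toy mixed-type data, followed by average-linkage clustering with scipy. The features and the equal per-feature weighting are assumptions, and the k-prototypes step is omitted.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Toy mixed panel-style data: two numeric and one categorical feature.
df = pd.DataFrame({
    "gdp":    [1.2, 1.4, 5.0, 5.3, 0.9],
    "infl":   [2.0, 2.2, 7.5, 8.0, 1.8],
    "region": ["N", "N", "S", "S", "N"],
})

num = df[["gdp", "infl"]].to_numpy(float)
num = (num - num.min(0)) / (num.max(0) - num.min(0))       # range-normalize numerics
cat = df[["region"]].to_numpy()

n = len(df)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        d_num = np.abs(num[i] - num[j]).sum()              # Manhattan on scaled numerics
        d_cat = (cat[i] != cat[j]).sum()                   # simple mismatch on categoricals
        D[i, j] = D[j, i] = (d_num + d_cat) / df.shape[1]  # average over all features

# Agglomerative (average-linkage) clustering on the Gower distance matrix.
Z = linkage(squareform(D), method="average")
print(fcluster(Z, t=2, criterion="maxclust"))
```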

5.
Gluing Copulas     
We present a new way of constructing n-copulas, by scaling and gluing finitely many n-copulas. Gluing for bivariate copulas produces a copula that coincides with the independence copula on some grid of horizontal and vertical sections. Examples illustrate how gluing can be applied to build complicated copulas from simple ones. Finally, we investigate the analytical as well as statistical properties of the copulas obtained by gluing, in particular, the behavior of Spearman's ρ and Kendall's τ.
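One concrete form of the construction is vertical gluing at a section u = θ: rescale one copula into [0, θ] × [0, 1] and the other into [θ, 1] × [0, 1]. The sketch below implements this particular form as an assumption about the gluing formula (not necessarily the paper's general n-copula construction) and checks that the result keeps a uniform margin.

```python
import numpy as np

# Two simple bivariate copulas to glue.
def C_indep(u, v):            # independence copula
    return u * v

def C_min(u, v):              # comonotone (Frechet upper bound) copula
    return np.minimum(u, v)

def glue_vertical(C1, C2, theta):
    """Glue C1 and C2 along the vertical section u = theta (assumed form)."""
    def C(u, v):
        u, v = np.asarray(u, float), np.asarray(v, float)
        left  = theta * C1(u / theta, v)
        right = theta * v + (1 - theta) * C2((u - theta) / (1 - theta), v)
        return np.where(u <= theta, left, right)
    return C

C = glue_vertical(C_indep, C_min, theta=0.5)
u = np.linspace(0.01, 0.99, 5)
print(C(u, 0.7))          # glued copula values along v = 0.7
print(C(u, 1.0) - u)      # uniform margin in u: differences should be ~0
```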

6.
The distribution of the sample correlation coefficient is derived when the population is a mixture of two bivariate normal distributions with zero mean but different covariances and mixing proportions 1 - λ and λ respectively; λ will be called the proportion of contamination. The test of ρ = 0 based on Student's t, Fisher's z, arcsine, or Ruben's transformation is shown numerically to be nonrobust when λ, the proportion of contamination, lies between 0.05 and 0.50 and the contaminated population has 9 times the variance of the standard (bivariate normal) population. These tests are also sensitive to the presence of outliers.

7.
We derive best-possible bounds on the class of copulas with known values at several points, under the assumption that the points are either in “increasing order” or in “decreasing order”. These bounds may be used to establish best-possible bounds on Kendall's τ and Spearman's ρ for such copulas. An important special case is when the values of a copula are known at several diagonal points. We also use our results to establish best-possible bounds on the distribution function of the sum of two random variables with known marginal distributions when the values of the joint distribution function are known at several points.

8.
A version of the nonparametric bootstrap that resamples entire subjects from the original data, called the case bootstrap, has been increasingly used for estimating the uncertainty of parameters in mixed-effects models. It is usually applied to obtain more robust estimates of the parameters and more realistic confidence intervals (CIs). Alternative bootstrap methods, such as the residual bootstrap and the parametric bootstrap that resample both random effects and residuals, have been proposed to better take into account the hierarchical structure of multi-level and longitudinal data. However, few studies have compared these different approaches. In this study, we used simulation to evaluate bootstrap methods proposed for linear mixed-effects models. We also compared the results obtained by maximum likelihood (ML) and restricted maximum likelihood (REML). Our simulation studies demonstrated the good performance of the case bootstrap as well as the bootstraps of both random effects and residuals. On the other hand, the bootstrap methods that resample only the residuals and the bootstraps combining case and residuals performed poorly. REML and ML provided similar bootstrap estimates of uncertainty, but there was slightly more bias and a poorer coverage rate for variance parameters with ML in the sparse design. We applied the proposed methods to a real dataset from a study investigating the natural evolution of Parkinson's disease and were able to confirm that the methods provide plausible estimates of uncertainty. Given that most real-life datasets tend to exhibit heterogeneity in sampling schedules, the residual bootstraps would be expected to perform better than the case bootstrap.
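The case bootstrap itself is simple to sketch: resample whole subjects with replacement and refit the mixed model on each resample. The example below does this for a random-intercept linear mixed model using statsmodels; the simulated design, the number of replicates, and the focus on the slope parameter are assumptions for illustration, not the study's actual setup.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Simulated longitudinal data: 30 subjects, 5 visits each, random intercepts.
n_sub, n_vis = 30, 5
sub = np.repeat(np.arange(n_sub), n_vis)
x = np.tile(np.arange(n_vis), n_sub).astype(float)
y = 1.0 + 0.5 * x + rng.normal(0, 1, n_sub)[sub] + rng.normal(0, 0.5, n_sub * n_vis)
df = pd.DataFrame({"y": y, "x": x, "subject": sub})

# Case bootstrap: resample whole subjects with replacement, refit, collect slopes.
B, slopes = 100, []
for b in range(B):
    chosen = rng.choice(n_sub, size=n_sub, replace=True)
    boot = pd.concat(
        [df[df.subject == s].assign(subject=i) for i, s in enumerate(chosen)],
        ignore_index=True,
    )
    fit = smf.mixedlm("y ~ x", boot, groups=boot["subject"]).fit()
    slopes.append(fit.params["x"])

print("bootstrap SE of slope:", np.std(slopes, ddof=1))
print("95% percentile CI:", np.percentile(slopes, [2.5, 97.5]))
```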

9.
Two simple and frequently used capture–recapture estimates of the population size are compared: Chao's lower-bound estimate and Zelterman's estimate allowing for contaminated distributions. In the Poisson case it is shown that if there are only counts of ones and twos, Zelterman's estimator is always bounded above by Chao's estimator. If counts larger than two exist, Zelterman's estimator becomes larger than Chao's provided the ratio of the frequencies of counts of twos and ones is small enough. A similar analysis is provided for the binomial case. For a two-component mixture of Poisson distributions the asymptotic bias of both estimators is derived, and it is shown that the Zelterman estimator can suffer from large overestimation bias. A modified Zelterman estimator is suggested, and the bias-corrected version of Chao's estimator is also considered. All four estimators are compared in a simulation study.
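Both estimators depend only on the singleton and doubleton frequencies f1 and f2 among the observed units. The sketch below implements the standard Poisson-case forms (Chao: n + f1²/(2 f2); Zelterman: n / (1 − e^(−2 f2 / f1))); the toy count data are an assumption for illustration.

```python
import numpy as np

def chao_lower_bound(counts):
    """Chao's lower-bound estimate of the population size: n + f1^2 / (2 * f2)."""
    counts = np.asarray(counts)
    n  = counts.size                     # number of observed (count >= 1) units
    f1 = (counts == 1).sum()
    f2 = (counts == 2).sum()
    return n + f1**2 / (2 * f2)

def zelterman(counts):
    """Zelterman's estimate: n / (1 - exp(-lambda_hat)), lambda_hat = 2*f2/f1."""
    counts = np.asarray(counts)
    n  = counts.size
    f1 = (counts == 1).sum()
    f2 = (counts == 2).sum()
    return n / (1 - np.exp(-2 * f2 / f1))

# Observed capture counts (each entry = times a unit was seen; zeros unobserved).
obs = np.array([1] * 60 + [2] * 20 + [3] * 8 + [4] * 2)
print("Chao:     ", round(chao_lower_bound(obs), 1))
print("Zelterman:", round(zelterman(obs), 1))
```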

10.
Under proper conditions, two independent tests of the null hypothesis of homogeneity of means are provided by a set of sample averages. One test, with tail probability P₁, relates to the variation between the sample averages, while the other, with tail probability P₂, relates to the concordance of the rankings of the sample averages with the anticipated rankings under an alternative hypothesis. The quantity G = P₁P₂ is considered as the combined test statistic and, except for the discreteness in the null distribution of P₂, would correspond to the Fisher statistic for combining probabilities. Illustration is made, for the case of four means, of how to get critical values of G, or critical values of P₁ for each possible value of P₂, taking discreteness into account. Alternative measures of concordance considered are Spearman's ρ and Kendall's τ. In the case of two averages, the concept results in assigning two-thirds of the test size to the concordant tail and one-third to the discordant tail.
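In the idealized continuous case (ignoring the discreteness of P₂ noted above), the null distribution of G = P₁P₂ is available in closed form, which is what makes the link to Fisher's method exact: P(P₁P₂ ≤ g) = g(1 − ln g), and −2 ln G follows a chi-square with 4 degrees of freedom. The sketch below evaluates the combined p-value both ways for assumed values of P₁ and P₂.

```python
import numpy as np
from scipy.stats import chi2

# Two independent tail probabilities from the component tests (assumed values).
P1, P2 = 0.08, 0.04
G = P1 * P2

# For independent Uniform(0,1) p-values, P(P1*P2 <= g) = g * (1 - ln g),
# the exact tail probability of the combined statistic G.
p_exact = G * (1 - np.log(G))

# Equivalent Fisher combination: -2*ln(G) ~ chi-square with 4 df under H0.
p_fisher = chi2.sf(-2 * np.log(G), df=4)

print(f"G = {G:.4f}, exact p = {p_exact:.4f}, Fisher p = {p_fisher:.4f}")
# The two agree because -2*ln(P1*P2) is exactly Fisher's combining statistic.
```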

11.
The central limit theorem says that, provided an estimator fulfills certain weak conditions, its sampling distribution is approximately normal for reasonably large sample sizes. We propose a procedure to find out what a “reasonably large sample size” is. The procedure is based on the properties of the decomposition of Gini's mean difference. We show the results of implementations of the procedure on simulated datasets and on data from the German Socio-Economic Panel.
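The abstract does not spell out the decomposition-based procedure, but its building block, Gini's mean difference, is straightforward to compute. The sketch below uses the standard order-statistics formula; the simulated exponential data (for which the population value is 1) are an assumption for illustration.

```python
import numpy as np

def gini_mean_difference(x):
    """Gini's mean difference: average absolute difference over all pairs,
    computed in O(n log n) via the order-statistics identity
    GMD = 2/(n(n-1)) * sum_i (2i - n - 1) * x_(i),  i = 1..n."""
    x = np.sort(np.asarray(x, float))
    n = x.size
    i = np.arange(1, n + 1)
    return 2.0 / (n * (n - 1)) * np.sum((2 * i - n - 1) * x)

rng = np.random.default_rng(3)
# For Exp(1) data the population mean difference is 1.
print(gini_mean_difference(rng.exponential(size=1000)))
```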

12.
Several methods for generating variates with univariate and multivariate Wallenius' and Fisher's noncentral hypergeometric distributions are developed. Methods for the univariate distributions include: simulation of urn experiments, inversion by binary search, inversion by chop-down search from the mode, the ratio-of-uniforms rejection method, and rejection by sampling in the τ domain. Methods for the multivariate distributions include: simulation of urn experiments, the conditional method, Gibbs sampling, and Metropolis-Hastings sampling. These methods are useful for Monte Carlo simulation of models of biased sampling and models of evolution, and for calculating moments and quantiles of the distributions.
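The first method listed, simulation of urn experiments, is the most direct one for Wallenius' distribution: balls are removed one at a time, each remaining ball chosen with probability proportional to its weight. A minimal univariate sketch, with parameter values assumed for illustration:

```python
import numpy as np

def wallenius_draw(m1, m2, n, omega, rng):
    """Simulate one Wallenius' noncentral hypergeometric variate by running
    the urn experiment: m1 red balls (weight omega) and m2 white balls
    (weight 1); n balls are taken one at a time without replacement."""
    r, w, k = m1, m2, 0          # remaining red/white balls, red balls drawn
    for _ in range(n):
        p_red = omega * r / (omega * r + w)
        if rng.random() < p_red:
            r -= 1
            k += 1
        else:
            w -= 1
    return k

rng = np.random.default_rng(7)
draws = [wallenius_draw(m1=10, m2=15, n=12, omega=2.0, rng=rng) for _ in range(10000)]
print("mean red balls drawn:", np.mean(draws))
```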

13.
Jointness is a Bayesian approach to capturing dependence among regressors in multivariate data. It addresses the general issue of whether explanatory factors for a given empirical phenomenon are complements or substitutes. I ask a number of questions about existing jointness concepts: Are the patterns revealed stable across datasets? Are results robust to prior choice, and do data characteristics affect results? And importantly: What do the answers imply from a practical standpoint? The present study takes an applied, interdisciplinary and comparative perspective, validating jointness concepts on datasets across scientific fields with a focus on the life sciences (Parkinson's disease) and sociology. Simulations complement the study of real-world data. My findings suggest that results depend on which jointness concept is used: some concepts deliver jointness patterns remarkably uniform across datasets, while all concepts are fairly robust to the choice of prior structure. This can be interpreted as a critique of jointness from a practical perspective, given that the patterns revealed are at times very different and no concept emerges as overall advantageous. The composite indicators approach to combining information across jointness concepts is also explored, suggesting an avenue to facilitate the application of the concepts in future research.

14.
Since the early 1990s, there has been increasing interest in statistical methods for detecting global spatial clustering in data sets. Tango's index is one of the most widely used spatial statistics for assessing whether spatially distributed disease rates are independent or clustered. Interestingly, this statistic can be partitioned into the sum of two terms: one term is similar to the usual chi-square statistic, being based on deviation patterns between the observed and expected values, while the other term, similar to Moran's I, is able to detect the proximity of similar values. In this paper, we examine this hybrid nature of Tango's index. The goal is to evaluate the possibility of distinguishing the spatial sources of clustering: lack of fit or spatial autocorrelation. To comply with the aims of the work, a simulation study is performed, providing examples of patterns that drive the goodness-of-fit and spatial autocorrelation components of the statistic. As for the latter aspect, it is worth noting that inducing spatial association among count data without adding lack of fit is not an easy task; in this respect, the overlapping sums method is adopted. The main findings of the simulation experiment are illustrated, and a comparison with previous research on this topic is also highlighted.
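The two-part structure described above can be made explicit for a Tango-type quadratic form (r − p)ᵀA(r − p): the diagonal of the closeness matrix A contributes the chi-square-like goodness-of-fit term, and the off-diagonal part contributes the Moran-like autocorrelation term. The sketch below is schematic only; the toy regions, the exponential closeness a_ij = exp(−d_ij/τ), and the scale τ are all assumptions, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy regions on a line: observed case proportions r and expected proportions p.
coords = np.arange(10, dtype=float)
cases  = rng.poisson(10, size=10).astype(float)
pop    = np.full(10, 100.0)
r = cases / cases.sum()                 # observed proportion of cases per region
p = pop / pop.sum()                     # expected proportion under homogeneity

tau = 2.0                               # scale of the closeness measure (assumed)
d = np.abs(coords[:, None] - coords[None, :])
A = np.exp(-d / tau)                    # one common choice of closeness matrix

delta = r - p
C_total = delta @ A @ delta             # Tango-type clustering index

# Hybrid decomposition highlighted in the abstract (diag(A) = 1 here):
gof = np.sum(delta**2)                  # diagonal part: goodness-of-fit term
sac = C_total - gof                     # off-diagonal part: spatial autocorrelation
print(C_total, gof, sac)
```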

15.
A generalized version of the inverted exponential distribution (IED) is considered in this paper. This lifetime distribution is capable of modeling various shapes of failure rates, and hence various shapes of aging criteria. The model can be considered as another useful two-parameter generalization of the IED. Maximum likelihood and Bayes estimates for the two parameters of the generalized inverted exponential distribution (GIED) are obtained on the basis of a progressively type-II censored sample. We also establish the existence, uniqueness and finiteness of the maximum likelihood estimates of the parameters of the GIED based on progressively type-II censored data. Bayesian estimates are obtained using the squared error loss function. These Bayesian estimates are evaluated by applying Lindley's approximation method and via an importance sampling technique. The importance sampling technique is used to compute the Bayes estimates and the associated credible intervals. We further consider the Bayes prediction problem based on the observed samples, and provide the appropriate predictive intervals. Monte Carlo simulations are performed to compare the performances of the proposed methods, and a data set has been analyzed for illustrative purposes.
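For concreteness, here is a sketch of the maximum-likelihood side of the problem: the GIED has cdf F(x) = 1 − (1 − e^{−λ/x})^α, and under progressive type-II censoring the log-likelihood adds, for each observed failure x_i, R_i copies of the log-survival term. The toy failure times and removal scheme below are assumptions, and the numerical optimizer is a generic stand-in for the paper's analysis.

```python
import numpy as np
from scipy.optimize import minimize

# Progressively type-II censored sample: observed failures x_i and the number
# R_i of surviving units removed at each failure (toy values, assumed).
x = np.array([0.4, 0.7, 1.1, 1.6, 2.4])
R = np.array([1, 0, 2, 0, 1])

def neg_loglik(theta):
    a, lam = np.exp(theta)                      # log-scale ensures positivity
    u = 1.0 - np.exp(-lam / x)                  # F(x) = 1 - u**a, S(x) = u**a
    logf = (np.log(a) + np.log(lam) - 2 * np.log(x)
            - lam / x + (a - 1) * np.log(u))    # GIED log-density
    logS = a * np.log(u)                        # log-survival for removed units
    return -(logf + R * logS).sum()

res = minimize(neg_loglik, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
alpha_hat, lambda_hat = np.exp(res.x)
print(f"MLE: alpha = {alpha_hat:.3f}, lambda = {lambda_hat:.3f}")
```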

16.
To accommodate testing for independence in bivariate data subject to censoring, several modifications of Kendall's τ are discussed. An extensive computer simulation is done to investigate power properties of these modifications under alternatives of the bivariate normal or bivariate exponential types. The statistics are then applied to available heart pacemaker patient survival data.

17.
In this paper, the goal of identifying disease subgroups based on differences in observed symptom profiles is considered. Commonly referred to as phenotype identification, this task is often addressed by applying unsupervised clustering techniques. Here, we investigate the application of a Dirichlet process mixture model. This model is defined by placing a Dirichlet process on the unknown components of a mixture model, allowing the expression of uncertainty about the partitioning of the observed data into homogeneous subgroups. To exemplify this approach, an application to phenotype identification in Parkinson's disease is considered, with symptom profiles collected using the Unified Parkinson's Disease Rating Scale.
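A quick way to experiment with this class of model is scikit-learn's variational, truncated Dirichlet-process Gaussian mixture. Note this approximates the posterior rather than sampling the full distribution over partitions, and the toy two-phenotype data below are an assumption for illustration.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(5)

# Toy symptom profiles: two latent phenotypes in 4 rating-scale dimensions.
X = np.vstack([
    rng.normal(1.0, 0.5, size=(60, 4)),
    rng.normal(3.0, 0.5, size=(40, 4)),
])

# Truncated Dirichlet-process mixture: unused components get ~zero weight,
# so the number of occupied clusters is inferred from the data.
dpm = BayesianGaussianMixture(
    n_components=10,                            # truncation level
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="full",
    random_state=0,
).fit(X)

labels = dpm.predict(X)
print(np.round(dpm.weights_, 3))                # effective number of subgroups
```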

18.
Xing-De Duan, Statistics, 2016, 50(3): 525–539
This paper develops a Bayesian approach to obtain the joint estimates of unknown parameters, nonparametric functions and random effects in generalized partially linear mixed models (GPLMMs), and presents three case deletion influence measures to identify influential observations based on the φ-divergence, Cook's posterior mean distance and Cook's posterior mode distance of parameters. Fisher's iterative scoring algorithm is developed to evaluate the posterior modes of parameters in GPLMMs. The first-order approximation to Cook's posterior mode distance is presented. The computationally feasible formulae for the φ-divergence diagnostic and Cook's posterior mean distance are given. Several simulation studies and an example are presented to illustrate our proposed methodologies.

19.
Inference based on the Central Limit Theorem has only first order accuracy. We give tests and confidence intervals (CIs) of second order accuracy for the shape parameter ρ of a gamma distribution, for both the unscaled and scaled cases.

Tests and CIs based on moment and cumulant estimates are considered, as well as those based on the maximum likelihood estimate (MLE).

For the unscaled case the MLE is the moment estimate of order zero; the most efficient moment estimate of integral order is the sample mean, having asymptotic relative efficiency (ARE) 0.61 when ρ = 1.

For the scaled case the most efficient moment estimate is a function of the mean and variance. Its ARE is 0.39 when ρ = 1.

Our motivation for constructing these tests of ρ = 1 and CIs for ρ is to provide a simple and convenient method for testing whether a distribution is exponential in situations, such as rainfall models, where such an assumption is commonly made.
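As a baseline against which such refined procedures are measured, the standard first-order approach is a likelihood-ratio test of ρ = 1 (exponentiality) within the scaled gamma family. The sketch below uses that plain LRT, explicitly not the paper's second-order-accurate tests, with simulated data as an assumption.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
data = rng.gamma(shape=1.0, scale=2.0, size=200)   # truly exponential sample

# First-order likelihood-ratio test of shape = 1 inside the scaled gamma family.
shape, loc, scale = stats.gamma.fit(data, floc=0)  # full gamma MLE (loc fixed at 0)
ll_gamma = stats.gamma.logpdf(data, shape, loc=0, scale=scale).sum()
ll_expon = stats.expon.logpdf(data, loc=0, scale=data.mean()).sum()  # exponential MLE

lr = 2 * (ll_gamma - ll_expon)                     # ~ chi-square(1) under H0
p = stats.chi2.sf(lr, df=1)
print(f"shape MLE = {shape:.3f}, LR = {lr:.3f}, p = {p:.3f}")
```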

20.
We consider the problem of making statistical inference on the unknown parameters of a lognormal distribution under the assumption that samples are progressively censored. The maximum likelihood estimates (MLEs) are obtained using the expectation-maximization algorithm, and the observed and expected Fisher information matrices are provided as well. Approximate MLEs of the unknown parameters are also obtained. Bayes and generalized estimates are derived under the squared error loss function; we compute these estimates using Lindley's method as well as the importance sampling method. Highest posterior density intervals and asymptotic interval estimates are constructed for the unknown parameters. A simulation study is conducted to compare the proposed estimates, and a data set is analysed for illustrative purposes. Finally, optimal progressive censoring plans are discussed under different optimality criteria and the results are presented.
