Similar Documents
20 similar documents found (search time: 14 ms).
1.
The asymptotic normal approximation to the distribution of the estimated agreement measure κ̂ between two raters has been shown to perform poorly for small sample sizes when the true kappa is nonzero. This paper examines the effect of skewness corrections and transformations of κ̂ on the attained confidence levels. Small-sample simulations demonstrate the improvement in agreement between the nominal and actual levels of confidence intervals and hypothesis tests that incorporate these corrections.
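
Below is a minimal Python sketch of the uncorrected large-sample interval that motivates the paper: Cohen's κ̂ from a hypothetical 2×2 agreement table, with the classic normal-approximation confidence interval using Cohen's (1960) simple standard error — not the paper's skewness-corrected version.

```python
# A minimal sketch (not the paper's corrected interval): Cohen's kappa-hat
# with the classic large-sample normal CI whose small-sample coverage the
# paper improves. The 2x2 table below is hypothetical.
import numpy as np
from scipy.stats import norm

table = np.array([[20.0, 5.0],
                  [4.0, 11.0]])      # rows: rater 1, cols: rater 2 (hypothetical counts)
n = table.sum()
p = table / n
po = np.trace(p)                     # observed agreement
pe = p.sum(axis=1) @ p.sum(axis=0)   # chance agreement from the margins
kappa = (po - pe) / (1 - pe)

# Cohen's (1960) simple large-sample standard error -- the uncorrected
# normal approximation.
se = np.sqrt(po * (1 - po) / (n * (1 - pe) ** 2))
z = norm.ppf(0.975)
print(f"kappa-hat = {kappa:.3f}, 95% CI = ({kappa - z*se:.3f}, {kappa + z*se:.3f})")
```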

2.
We evaluate the estimation performance of the Binary Dynamic Logit model for correlated ordinal variables (BDLCO model) and compare it to GEE and ordinal logistic regression in terms of bias and Mean Absolute Percentage Error (MAPE) via Monte Carlo simulation. Our results indicate that when the proportional-odds assumption does not hold, the proposed BDLCO method is superior to existing models for estimating correlated ordinal data. Moreover, the method is flexible in modeling dependence, allows unequal slopes for each category, and is applied to an apple bloom data set in which the proportional-odds assumption is violated. We also provide an R function implementing BDLCO.
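
As a small illustration of the two evaluation criteria, the following hypothetical Python sketch computes bias and MAPE of a slope estimate across Monte Carlo replications; it is not the BDLCO fitting code itself, and the values are invented.

```python
# A minimal sketch of the comparison metrics used in the simulation study:
# bias and Mean Absolute Percentage Error (MAPE) of a slope estimate
# across Monte Carlo replications. Values are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
beta_true = 0.8
beta_hat = beta_true + rng.normal(0.05, 0.2, size=1000)  # stand-in for 1000 fitted slopes

bias = np.mean(beta_hat - beta_true)
mape = np.mean(np.abs((beta_hat - beta_true) / beta_true)) * 100
print(f"bias = {bias:.4f}, MAPE = {mape:.2f}%")
```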

3.
Inferential methods based on ranks provide robust and powerful alternative methodology for testing and estimation. This article pursues two objectives. First, it develops a general method of simultaneous confidence intervals based on the rank estimates of the parameters of a general linear model and derives the asymptotic distribution of the pivotal quantity. Second, it extends the method to high-dimensional data, such as gene expression data, for which the usual large-sample approximation does not apply. It is common in practice to use the asymptotic distribution to make inference for small samples; the empirical investigation in this article shows that, for methods based on rank estimates, this approach does not produce viable inference and should be avoided. A method based on the bootstrap is outlined and shown to provide a reliable and accurate way of constructing simultaneous confidence intervals based on rank estimates. In particular, the commonly applied normal and t approximations are shown to be unsatisfactory, especially for large-scale inferences. Rank-based methods are uniquely suitable for microarray gene expression data, which often involve large-scale inferences based on small samples that contain many outliers and violate the normality assumption. A real microarray data set is analyzed using the rank-estimate simultaneous confidence intervals, and the viability of the proposed method is assessed through a Monte Carlo simulation study under varied assumptions.
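
The following Python sketch illustrates the general idea of max-type bootstrap simultaneous intervals around a rank-based location estimate (here the Hodges–Lehmann estimator, chosen purely for illustration); the article's procedure for general-linear-model rank estimates is analogous but not identical, and the data are simulated.

```python
# A minimal sketch of bootstrap simultaneous confidence intervals of the
# max-|pivot| type, using the Hodges-Lehmann estimator as a simple
# rank-based location estimate. Illustrative only.
import numpy as np

def hodges_lehmann(x):
    """Median of pairwise Walsh averages -- a rank-based location estimate."""
    i, j = np.triu_indices(len(x))
    return np.median((x[i] + x[j]) / 2.0)

rng = np.random.default_rng(7)
n, p, B = 12, 50, 2000                       # small n, many "genes", as in microarrays
X = rng.standard_t(df=3, size=(n, p))        # heavy-tailed, outlier-prone data
est = np.array([hodges_lehmann(X[:, k]) for k in range(p)])

# Bootstrap the maximum absolute deviation over all p coordinates so the
# resulting band covers every parameter simultaneously.
max_dev = np.empty(B)
for b in range(B):
    Xb = X[rng.integers(0, n, size=n)]
    estb = np.array([hodges_lehmann(Xb[:, k]) for k in range(p)])
    max_dev[b] = np.max(np.abs(estb - est))

c = np.quantile(max_dev, 0.95)
lower, upper = est - c, est + c              # simultaneous 95% intervals
print(f"half-width = {c:.3f}; first interval: ({lower[0]:.3f}, {upper[0]:.3f})")
```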

4.
For a normal distribution with known variance, the standard confidence interval for the location parameter is derived from the classical Neyman procedure. When the parameter space is known to be restricted, the standard confidence interval is arguably unsatisfactory. Recent articles have addressed this problem and proposed confidence intervals for the mean of a normal distribution when the parameter is restricted to be non-negative. In this article, we propose a new confidence interval, the rp interval, and derive the Bayesian credible interval and likelihood ratio interval for a general restricted parameter space. We compare these intervals with the standard interval and the minimax interval, and undertake simulation studies to assess their performance.

5.
The skew-normal model is a class of distributions that extends the Gaussian family by including a skewness parameter. The model presents some inferential problems linked to the estimation of the skewness parameter: in particular, its maximum likelihood estimate can be infinite, especially for moderate sample sizes, and it is not clear how to calculate confidence intervals for this parameter. In this work, we show how these inferential problems can be solved if we are interested in the distribution of extreme statistics of two random variables with a joint normal distribution. Such situations are not uncommon in applications, especially in medical and environmental contexts, where estimating the distribution of extreme statistics can be relevant. A theoretical result of Loperfido [Statistical implications of selectively reported inferential results, Statist. Probab. Lett. 56 (2002), pp. 13–22] proves that such extreme statistics have a skew-normal distribution whose skewness parameter can be expressed as a function of the correlation coefficient between the two initial variables. It is then possible, using theoretical results involving the correlation coefficient, to find approximate confidence intervals for the skewness parameter. These theoretical intervals are compared with parametric bootstrap intervals by means of a simulation study, and two applications are given using real data.
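
The following is a quick Monte Carlo check of the Loperfido-type result the abstract relies on. The specific shape parameter sqrt((1 − ρ)/(1 + ρ)) for max(X, Y) of a standardized bivariate normal pair is stated here as an assumption of the sketch, not quoted from the article.

```python
# A minimal sketch: for a standardized bivariate normal pair with
# correlation rho, check by simulation that max(X, Y) is skew-normal with
# shape parameter sqrt((1 - rho) / (1 + rho)) (assumed form). Illustrative.
import numpy as np
from scipy.stats import skewnorm, kstest

rng = np.random.default_rng(0)
rho = 0.4
cov = [[1.0, rho], [rho, 1.0]]
xy = rng.multivariate_normal([0.0, 0.0], cov, size=100_000)
m = xy.max(axis=1)                          # the extreme (maximum) statistic

alpha = np.sqrt((1 - rho) / (1 + rho))      # implied skewness parameter (assumption)
stat, pval = kstest(m, skewnorm(alpha).cdf)
print(f"alpha = {alpha:.3f}, KS p-value = {pval:.3f}")  # large p: SN fits
```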

6.
7.
In a ground-breaking 1990 paper in the Journal of the Royal Statistical Society, J.R.M. Hosking defined the L-moments of a random variable as expectations of certain linear combinations of order statistics. L-moments are an alternative to conventional moments and have recently seen frequent use in inferential statistics. They have several advantages over conventional moments, including robustness to the presence of outliers, which in some cases leads to more accurate estimates of distributional characteristics. In this contribution, asymptotic theory and L-moments are used to derive confidence intervals for the population parameters and quantiles of the three-parameter generalized Pareto and extreme-value distributions. Computer simulations are performed to assess the performance of the L-moment confidence intervals for population quantiles and to compare them with intervals obtained by traditional estimation techniques. The results show that the L-moment intervals compare well with those from the method of moments and maximum likelihood, and can even be the best when interest is in higher quantiles; L-moments are especially recommended when the tail of the distribution is heavy and the sample size is small. The derived intervals are applied to real economic data, specifically to market-opening asset prices.
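
For concreteness, the following Python sketch computes the first three sample L-moments via probability-weighted moments (Hosking, 1990); the article's confidence-interval machinery is not reproduced, and the sample is simulated.

```python
# A minimal sketch of the first three sample L-moments via
# probability-weighted moments (Hosking, 1990). Data are simulated from
# a heavy-tailed law.
import numpy as np

def l_moments(x):
    """Return (l1, l2, l3) computed from probability-weighted moments b0, b1, b2."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    return b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0

rng = np.random.default_rng(3)
sample = rng.pareto(2.5, size=60)            # small, heavy-tailed sample
l1, l2, l3 = l_moments(sample)
print(f"l1 = {l1:.3f}, l2 = {l2:.3f}, L-skewness t3 = {l3/l2:.3f}")
```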

8.
9.
Dental arch form is an important part of orthodontic practice, and distance-based clustering methods are often used to find standard arch forms. In particular, S-J. Lee, S.I. Lee, J. Lim, H-J. Park, and T. Wheeler [Method to classify human dental arch form, Am. J. Orthod. Dentofacial Orthop. (2010), to appear] propose an ℓ1-type distance that is invariant to location shifts and rotational transformations. Despite the popularity of distance-based methods, little attention has been given to the choice of distance, which has a great influence on the final clusters. This paper has three goals. First, we study the properties of the ℓ1-type distance of Lee et al. (2010). Second, we propose a bootstrap-based procedure to evaluate quantitatively how good the clusters are. Finally, we apply the bootstrap procedure to the Korean standard occlusion study and compare the existing distance-based clustering methods from the previous literature.

10.
It is shown that Strawderman's technique [Minimax estimation of powers of the variance of a normal population under squared error loss, Ann. Statist. 2 (1974), pp. 190–198] for estimating the variance of a normal distribution can be extended to estimating a general scale parameter in the presence of a nuisance parameter. Employing standard monotone likelihood ratio-type conditions, a new class of improved estimators for this scale parameter is derived under quadratic loss; imposing an additional condition yields a broader class of improved estimators. The dominating procedures are analogous in form to those in Strawderman (1974). Application of the general results to the exponential distribution yields new sufficient conditions, other than those of Brewster and Zidek [Improving on equivariant estimators, Ann. Statist. 2 (1974), pp. 21–38] and Kubokawa [A unified approach to improving equivariant estimators, Ann. Statist. 22 (1994), pp. 290–299], for improving on the best affine equivariant estimator of the scale parameter, and a class of estimators satisfying the new conditions is constructed. The results shed new light on Strawderman's (1974) technique.

11.
Superpopulation models are proposed for modelling sample-based audits of Medicare payments and other overpayment situations. Simulations are used to estimate the coverage probabilities of confidence intervals formed using the standard stratified expansion and combined ratio estimators of the total. Despite severe departures from the usual model of normal deviations, these methods have actual coverage probabilities reasonably close to the nominal level specified by the US government's sampling guidelines. An exception occurs when all claims from a single sampling unit are either completely allowed or completely denied, and for this situation an alternative is explored. A balanced sampling design is also examined, but is shown to offer no improvement over ordinary stratified samples used in conjunction with ratio estimators.
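
Below is a minimal sketch of the two point estimators whose interval coverage the article simulates, on a hypothetical three-stratum audit frame; the variance estimators and the confidence intervals themselves are omitted, and all numbers are invented.

```python
# A minimal sketch of the stratified expansion estimator and the combined
# ratio estimator of a total overpayment. Strata, claims, and payment
# amounts are hypothetical.
import numpy as np

rng = np.random.default_rng(9)
N = np.array([400, 250, 100])                 # known stratum sizes (claims)
n = np.array([40, 30, 20])                    # sample sizes per stratum
X_total = 2_000_000.0                         # known total paid amount (hypothetical)

samples = []
for h in range(3):
    paid = rng.gamma(2.0, 500 * (h + 1), size=n[h])      # sampled paid amounts
    over = paid * rng.beta(1, 4, size=n[h])              # audited overpayments
    samples.append((paid, over))

# Stratified expansion estimator: sum over strata of N_h * ybar_h.
exp_est = sum(N[h] * samples[h][1].mean() for h in range(3))

# Combined ratio estimator: (sum N_h ybar_h / sum N_h xbar_h) * X_total.
num = sum(N[h] * samples[h][1].mean() for h in range(3))
den = sum(N[h] * samples[h][0].mean() for h in range(3))
ratio_est = num / den * X_total
print(f"expansion = {exp_est:,.0f}, combined ratio = {ratio_est:,.0f}")
```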

12.
Abrupt changes often occur in environmental and financial time series, most often due to human intervention. Change point analysis is a statistical tool for analyzing sudden changes in observations along a time series. In this paper, we propose a Bayesian model for extremes in environmental and economic datasets that exhibit typical change point behavior, addressing the situation in which more than one change point can occur. Because maxima are analyzed, the distribution within each regime is a generalized extreme value distribution, and the change points are unknown parameters to be estimated. Simulations of extremes with two change points showed that the proposed algorithm recovers the true parameter values and detects the true change points under different configurations. The number of change points itself must also be determined, and the Bayesian estimation correctly identified it in each application. Analyses of environmental and financial data showed the importance of accounting for change points: the regime changes brought about increases in the return levels, and hence in the number of floods in cities along the rivers, while the stock market series required a model with three different regimes.

13.
Motivated by the Singapore Longitudinal Aging Study (SLAS), we propose a Bayesian approach to the estimation of semiparametric varying-coefficient models for longitudinal continuous and cross-sectional binary responses. These models have proved more flexible than simple parametric regression models, and our development is a new contribution towards their Bayesian solution, which eases computational complexity. We also consider adapting several familiar statistical strategies to address the missing data issue in the SLAS. Our simulation results indicate that a Bayesian imputation (BI) approach performs better than complete-case (CC) and available-case (AC) approaches, especially under small sample designs, and may provide more useful results in practice. In the real data analysis of the SLAS, the results for longitudinal outcomes from BI are similar to those from the AC analysis but differ from those of the CC analysis.

14.
Likelihood-based mixed-effects models for repeated measures (MMRMs) are frequently used in primary analyses for group comparisons of incomplete continuous longitudinal data. Although MMRM analysis is generally valid under missing-at-random assumptions, it is invalid under not-missing-at-random (NMAR) assumptions. We consider the possibility of bias in the estimated treatment effect from standard MMRM analysis in a motivating case study, and propose simple, easily implementable pattern mixture models within the framework of mixed-effects modeling to handle NMAR data with differential missingness between treatment groups. The proposed models are a new form of pattern mixture model that employs a categorical time variable when modeling the outcome and a continuous time variable when modeling the missing-data patterns. The models directly provide an overall estimate of the treatment effect of interest by averaging over the distribution of the missingness indicator and the categorical time variable, in the same manner as MMRM analysis. Our simulation results indicate that, under NMAR assumptions, the bias of the treatment effect for MMRM analysis is considerably larger than that for the pattern mixture model analysis. In the case study, it would be dangerous to interpret only the results of the MMRM analysis, and the proposed pattern mixture model is useful as a sensitivity analysis for treatment effect evaluation.

15.
This paper proposes an intuitive clustering algorithm capable of automatically self-organizing data groups based on the original data structure. Comparisons are given between the proposed algorithm, the EM algorithm for von Mises–Fisher mixtures [A. Banerjee, I.S. Dhillon, J. Ghosh, and S. Sra, Clustering on the unit hypersphere using von Mises–Fisher distributions, J. Mach. Learn. Res. 6 (2005)], and spherical k-means [I.S. Dhillon and D.S. Modha, Concept decompositions for large sparse text data using clustering, Mach. Learn. 42 (2001), pp. 143–175]. The numerical results show the effectiveness of the proposed algorithm, using the correct classification rate and the adjusted Rand index as evaluation criteria [J.-M. Chiou and P.-L. Li, Functional clustering and identifying substructures of longitudinal data, J. R. Statist. Soc. Ser. B 69 (2007), pp. 679–699; J.-M. Chiou and P.-L. Li, Correlation-based functional clustering via subspace projection, J. Am. Statist. Assoc. 103 (2008), pp. 1684–1692]. In 1995, Mayor and Queloz announced the detection of the first extrasolar planet (exoplanet) around a Sun-like star; since then, astronomers' observational efforts have led to the detection of more than 1000 exoplanets. These discoveries may provide important information for understanding the formation and evolution of planetary systems, and the proposed clustering algorithm is therefore used to study the data gathered on exoplanets. Two main implications are suggested: (1) there are three major clusters, corresponding to exoplanets in the regimes of disc, ongoing tidal, and tidal interactions, respectively; and (2) stellar metallicity does not play a key role in exoplanet migration.
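
For reference, here is a compact sketch of the spherical k-means baseline (k-means with cosine similarity on unit-normalized vectors) that the proposed algorithm is compared against; the proposed self-organizing algorithm itself is not reproduced, and the data are random.

```python
# A minimal sketch of spherical k-means (Dhillon and Modha, 2001):
# k-means with cosine similarity on unit-normalized vectors. Illustrative.
import numpy as np

def spherical_kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # project onto the unit sphere
    C = X[rng.choice(len(X), k, replace=False)]        # initial concept vectors
    for _ in range(iters):
        labels = np.argmax(X @ C.T, axis=1)            # assign by cosine similarity
        for j in range(k):
            members = X[labels == j]
            if len(members):
                m = members.sum(axis=0)
                C[j] = m / np.linalg.norm(m)           # renormalized mean direction
    return labels, C

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
labels, C = spherical_kmeans(X, k=3)
print(np.bincount(labels))                             # cluster sizes
```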

16.
The use of asymptotic moments to increase the precision of the control variate technique for Monte Carlo estimation is discussed. An application is made to the estimation of the mean and variance of the likelihood-ratio goodness-of-fit statistic, with the Pearson statistic used as a control variate. Estimates of the variance reductions are given.
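
Here is a hedged illustration of the technique on the abstract's own example: since Pearson's X² has exact null mean k − 1 under the multinomial model, it can serve as a control variate for the Monte Carlo mean of the likelihood-ratio statistic G². The multinomial setup and sample sizes are hypothetical.

```python
# A minimal sketch of the control-variate idea: estimate E[G^2] by Monte
# Carlo, using Pearson's X^2 (exact null mean k - 1) as the control.
import numpy as np

rng = np.random.default_rng(5)
k, n, R = 5, 30, 5000
p0 = np.full(k, 1.0 / k)                     # null cell probabilities
counts = rng.multinomial(n, p0, size=R)
expected = n * p0

with np.errstate(divide="ignore", invalid="ignore"):
    terms = counts * np.log(counts / expected)
g2 = 2 * np.nansum(terms, axis=1)            # 0*log(0) treated as 0
x2 = ((counts - expected) ** 2 / expected).sum(axis=1)

beta = np.cov(g2, x2)[0, 1] / np.var(x2, ddof=1)   # estimated optimal coefficient
cv_est = g2.mean() - beta * (x2.mean() - (k - 1))  # E[X^2] = k - 1 exactly
print(f"plain estimate = {g2.mean():.4f}, control-variate estimate = {cv_est:.4f}")
print(f"variance reduction factor ~ {np.var(g2) / np.var(g2 - beta * x2):.1f}")
```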

17.
Multivariate analysis techniques are applied to the two-period repeated measures crossover design. The approach considered in this paper has the advantage over the univariate approach recently proposed by Wallenstein and Fisher (1977) that it does not require any specific structure on the variance–covariance matrix of the repeated measures factor. (Note that sums and differences of observations over periods are used for all tests, so two matrices are under consideration, one for sums and one for differences.) Tests of significance are derived using Wilks' criterion, and the procedure is illustrated with a numerical example from the area of clinical trials.

18.
Closed-form confidence intervals on linear combinations of variance components were developed generically for balanced data and studied mainly for one-way and two-way random effects analysis of variance models. The Satterthwaite approach is easily generalized to unbalanced data and can be modified to increase its coverage probability. These approaches are applied to measures of assay precision in combination with (restricted) maximum likelihood and Henderson III Type 1 and Type 3 estimation. Simulations of interlaboratory studies with unbalanced data and small sample sizes show no clear superiority of any of the possible combinations of estimation methods and Satterthwaite approaches on three measures of assay precision; however, the modified Satterthwaite approach with Henderson III Type 3 estimation is often preferred over the other combinations.
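
Below is a minimal sketch of the classical (unmodified) Satterthwaite approach the abstract builds on: approximate degrees of freedom for a linear combination of mean squares and the resulting chi-square interval. The mean squares, degrees of freedom, and coefficients are hypothetical.

```python
# A minimal sketch of the Satterthwaite approximation: df for a linear
# combination of mean squares and a chi-square interval for the
# corresponding variance component. Numbers are hypothetical.
import numpy as np
from scipy.stats import chi2

ms = np.array([4.2, 1.1])       # mean squares (e.g., between- and within-lab)
df = np.array([7, 24])          # their degrees of freedom
c = np.array([1 / 8, -1 / 8])   # coefficients: (MS_between - MS_within)/m, m = 8

L = c @ ms                                    # point estimate of the component
nu = L ** 2 / np.sum((c * ms) ** 2 / df)      # Satterthwaite degrees of freedom
lo = nu * L / chi2.ppf(0.975, nu)
hi = nu * L / chi2.ppf(0.025, nu)
print(f"estimate = {L:.3f}, df = {nu:.1f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```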

19.
The data collection process and the inherent population structure are the main causes of clustered data. The observations in a given cluster are correlated, and the magnitude of such correlation is often measured by the intra-cluster correlation coefficient. Intra-cluster correlation can inflate the size of the standard F test in a linear model. In this paper, we propose a solution to this problem. Unlike previous adjustments, our method does not require estimation of the intra-cluster correlation, which is problematic especially when the number of clusters is small. Our simulation results show that the new method outperforms existing methods.
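
The following simulation sketch reproduces the phenomenon the abstract describes: with positive intra-cluster correlation, the naive two-sample test (equivalent to the standard F test in this two-group setting) rejects far above the nominal 5% level. The proposed correction is not implemented; cluster sizes and the correlation value are hypothetical.

```python
# A minimal sketch of F-test size inflation under intra-cluster
# correlation: no true group effect, yet the naive test over-rejects.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(11)
clusters_per_group, m, icc = 5, 10, 0.3
reps, rejections = 2000, 0
for _ in range(reps):
    groups = []
    for _g in range(2):
        u = rng.normal(0, np.sqrt(icc), size=clusters_per_group)        # cluster effects
        e = rng.normal(0, np.sqrt(1 - icc), size=(clusters_per_group, m))
        groups.append((u[:, None] + e).ravel())                         # no true group effect
    if ttest_ind(groups[0], groups[1]).pvalue < 0.05:                   # t^2 = F here
        rejections += 1
print(f"empirical size = {rejections / reps:.3f} (nominal 0.05)")
```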

20.
In magazine advertisements for new drugs, it is common to see summary tables that compare the relative frequencies of several side-effects for the drug and for a placebo, based on results from placebo-controlled clinical trials. The paper summarizes ways to conduct a global test of equality between the vector of population proportions for the drug and the vector of population proportions for the placebo. For multivariate normal responses, the Hotelling T²-test is a well-known method for testing equality of a vector of means for two independent samples; the tests in the paper are analogues of this test for vectors of binary responses. The likelihood ratio tests can be computationally intensive or have poor asymptotic performance, and simple quadratic forms comparing the two vectors provide alternative tests. Much better performance results from using a score-type version with a null-estimated covariance matrix than from an ordinary Wald test with the sample covariance matrix. For either type of statistic, asymptotic inference is often inadequate, so we also present alternative, exact permutation tests. Follow-up inferences are also discussed, and our methods are applied to safety data from a phase II clinical trial.
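
As a sketch of the exact permutation approach mentioned at the end of the abstract, the following compares the two vectors of side-effect proportions using a simple quadratic-form statistic (sum of squared differences) on hypothetical 0/1 data; the paper's score-type statistic would replace `stat` below.

```python
# A minimal sketch of a permutation version of a global test comparing
# side-effect proportion vectors between drug and placebo arms. Data are
# hypothetical 0/1 adverse-event indicator matrices.
import numpy as np

rng = np.random.default_rng(2)
n1, n2, k = 40, 40, 6                        # subjects per arm, side-effects
drug = rng.binomial(1, 0.25, size=(n1, k))
placebo = rng.binomial(1, 0.15, size=(n2, k))

def stat(a, b):
    """Sum of squared differences between the two proportion vectors."""
    return np.sum((a.mean(axis=0) - b.mean(axis=0)) ** 2)

obs = stat(drug, placebo)
pooled = np.vstack([drug, placebo])
B, count = 10_000, 0
for _ in range(B):
    perm = rng.permutation(n1 + n2)          # relabel subjects at random
    if stat(pooled[perm[:n1]], pooled[perm[n1:]]) >= obs:
        count += 1
print(f"permutation p-value = {(count + 1) / (B + 1):.4f}")
```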
