Similar Documents
20 similar documents retrieved.
1.
In single-arm clinical trials with survival outcomes, the Kaplan–Meier estimator and its confidence interval are widely used to assess survival probability and median survival time. Because the asymptotic normality of the Kaplan–Meier estimator is a standard result, sample size calculation methods have not been studied in depth. An existing method is founded on the asymptotic normality of the log-transformed Kaplan–Meier estimator. However, the log-transformed estimator behaves poorly in small samples (which are typical of single-arm trials), and the existing method relies on an inappropriate standard normal approximation to calculate sample sizes. These issues can seriously affect the accuracy of the results. In this paper, we propose alternative methods for determining sample sizes based on a valid standard normal approximation, using several transformations that can give an accurate normal approximation even in small samples. In numerical evaluations via simulation, some of the proposed methods gave more accurate results, and the empirical power of the proposed method with the arcsine square-root transformation tended to be closer to the prescribed power than that of the other transformations. These findings were supported when the methods were applied to data from three clinical trials.
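As a concrete illustration of the transformation discussed above, the following sketch (toy data and function names are hypothetical, and this is not the authors' sample-size procedure) computes the Kaplan–Meier estimate with Greenwood's variance and an arcsine square-root confidence interval for the survival probability at the last event time:

```python
import numpy as np
from scipy.stats import norm

def km_with_greenwood(times, events):
    """Kaplan-Meier estimate and Greenwood variance at each event time."""
    order = np.argsort(times)
    t, d = np.asarray(times)[order], np.asarray(events)[order]
    surv, var_term, out = 1.0, 0.0, []
    for u in np.unique(t[d == 1]):
        at_risk = np.sum(t >= u)
        deaths = np.sum((t == u) & (d == 1))
        surv *= 1.0 - deaths / at_risk
        var_term += deaths / (at_risk * (at_risk - deaths))
        out.append((u, surv, surv**2 * var_term))   # Greenwood variance
    return out

def ci_arcsine(s_hat, var, level=0.95):
    """Arcsine square-root transformed CI for a survival probability."""
    z = norm.ppf(0.5 + level / 2)
    half = z * np.sqrt(var) / (2 * np.sqrt(s_hat * (1 - s_hat)))
    lo = np.sin(np.clip(np.arcsin(np.sqrt(s_hat)) - half, 0, np.pi / 2)) ** 2
    hi = np.sin(np.clip(np.arcsin(np.sqrt(s_hat)) + half, 0, np.pi / 2)) ** 2
    return lo, hi

# Toy single-arm data: times and event indicators (1 = event, 0 = censored).
times  = [2, 3, 4, 5, 6, 7, 8, 10, 12, 15]
events = [1, 1, 0, 1, 1, 0, 1, 1, 0, 0]
t, s, v = km_with_greenwood(times, events)[-1]
lo, hi = ci_arcsine(s, v)
print(f"S({t}) = {s:.3f}, 95% arcsine-sqrt CI = ({lo:.3f}, {hi:.3f})")
```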

2.
The focus of geographical studies in epidemiology has recently moved towards looking for effects of exposures based on data taken at local levels of aggregation (i.e. small areas). This paper investigates how regression coefficients measuring covariate effects at the point level are modified under aggregation. Changing the level of aggregation can lead to completely different conclusions about exposure–effect relationships, a phenomenon often referred to as ecological bias. With partial knowledge of the within-area distribution of the exposure variable, the notion of maximum entropy can be used to approximate that part of the distribution that is unknown. From the approximation, an expression for the ecological bias is obtained; simulations and an example show that the maximum-entropy approximation is often better than other commonly used approximations.
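The aggregation effect described above is easy to reproduce by simulation. The sketch below (hypothetical parameter values; it demonstrates the phenomenon of ecological bias only, not the maximum-entropy correction proposed in the paper) compares the individual-level logistic slope with the slope obtained from an area-level (ecological) regression on area-mean exposure:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)

# 100 small areas with 200 individuals each; exposure varies within areas.
n_area, n_ind = 100, 200
area_mean = rng.normal(0.0, 1.0, size=n_area)
x = area_mean[:, None] + rng.normal(0.0, 1.5, size=(n_area, n_ind))
p = 1.0 / (1.0 + np.exp(-(-1.0 + 1.5 * x)))        # individual (point-level) model
y = rng.binomial(1, p)

# Individual-level fit recovers the point-level coefficient ...
ind = sm.Logit(y.ravel(), sm.add_constant(x.ravel())).fit(disp=0)

# ... while the ecological fit regresses area totals on area-mean exposure.
succ = y.sum(axis=1)
eco = sm.GLM(np.column_stack([succ, n_ind - succ]),
             sm.add_constant(x.mean(axis=1)),
             family=sm.families.Binomial()).fit()

print("individual-level slope:", round(ind.params[1], 3),
      "  ecological slope:", round(eco.params[1], 3))
```

With these settings the ecological slope is noticeably attenuated relative to the point-level coefficient, which is the kind of discrepancy the paper's maximum-entropy approximation is designed to quantify.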

3.
Under the null hypothesis, Rukhin's family of goodness-of-fit statistics has an asymptotic chi-squared distribution; however, for small samples the chi-squared approximation does not always agree well with the exact distribution. In this paper we consider this approximation and three others in order to obtain test levels close to the exact level. Moreover, exact power comparisons for several values of the family parameter under specified alternatives show that the classical Pearson statistic, obtained as a particular case of the Rukhin statistic, can be improved upon by choosing other statistics from the family. An explanation is proposed in terms of the effects of individual cell frequencies on the Rukhin statistic. This work was supported in part by DGCYT grants No. PR156/97-7159 and PB96-0635.

4.
A sequential method for estimating the expected value of a random variable is proposed. Using a parametric model, the updating formula is based on the maximum likelihood estimators of the roots of the expected value function. Under certain conditions, the estimators of the roots are shown to be consistent when a two-parameter logit version of the procedure is used for binary data. In addition, the estimators of the logit parameters have an asymptotic normal distribution. A simulation study is performed to evaluate the effectiveness of the new method for small to medium sample sizes. Compared with other sequential approximation methods, the proposed method performed well, especially when estimating several roots simultaneously.

5.
The classical chi-square test of goodness of fit compares the hypothesis that data arise from some parametric family of distributions against the nonparametric alternative that they arise from some other distribution. However, the chi-square test requires continuous data to be grouped into arbitrary categories. Furthermore, because the test is based on an approximation, it can only be used when there are sufficient data. In practice, these requirements are often wasteful of information and overly restrictive. The authors explore the use of the fractional Bayes factor to obtain a Bayesian alternative to the chi-square test when no specific prior information is available. They consider the extent to which their methodology can handle small data sets and continuous data without arbitrary grouping.
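For reference, here is a minimal sketch of the classical chi-square goodness-of-fit test applied to a small continuous sample, with an arbitrary choice of equal-probability bins under the fitted normal (not the fractional-Bayes-factor procedure of the paper; the bin count and data are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=10.0, scale=2.0, size=40)        # small continuous sample

# Arbitrary grouping: k equal-probability bins under the fitted normal.
k = 5
mu, sigma = x.mean(), x.std(ddof=1)
cuts = stats.norm.ppf(np.linspace(0, 1, k + 1)[1:-1], mu, sigma)  # k-1 cut points
observed = np.bincount(np.searchsorted(cuts, x), minlength=k)
expected = np.full(k, len(x) / k)

# Two parameters were estimated, so the degrees of freedom are k - 1 - 2.
chi2 = np.sum((observed - expected) ** 2 / expected)
p = stats.chi2.sf(chi2, df=k - 1 - 2)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}  (the result depends on the chosen bins)")
```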

6.
Tests for unit roots in panel data have become very popular. Two attractive features of panel data unit root tests are the increased power compared to time-series tests, and the often well-behaved limiting distributions of the tests. In this paper we apply Monte Carlo simulations to investigate how well the normal approximation works for a heterogeneous panel data unit root test when there are only a few cross sections in the sample. We find that the normal approximation, which should be valid for large numbers of cross-sectional units, works well, at conventional significance levels, even when the number of cross sections is as small as two. This finding is valuable for the applied researcher since critical values will be easy to obtain and p-values will be readily available.
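A rough Monte Carlo check in the spirit of this study can be coded in a few lines. The sketch below uses a plain Dickey–Fuller t-statistic averaged over cross-sections, not the specific heterogeneous panel test examined by the authors; it simulates the standardized statistic for N = 2 cross-sections under the null and compares its lower tail with the nominal normal 5% point:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def df_tstat(y):
    """Dickey-Fuller t-statistic from regressing dy_t on y_{t-1} (no constant)."""
    dy, ylag = np.diff(y), y[:-1]
    rho = ylag @ dy / (ylag @ ylag)
    resid = dy - rho * ylag
    se = np.sqrt(resid @ resid / (len(dy) - 1) / (ylag @ ylag))
    return rho / se

def panel_stat(N, T):
    """Cross-sectional average of individual unit root t-statistics."""
    return np.mean([df_tstat(np.cumsum(rng.standard_normal(T))) for _ in range(N)])

N, T, reps = 2, 50, 5000
draws = np.array([panel_stat(N, T) for _ in range(reps)])
z = (draws - draws.mean()) / draws.std(ddof=1)      # standardized statistic

# Compare the empirical lower-tail probability with the nominal 5% normal level.
print("P(z < -1.645) =", np.mean(z < stats.norm.ppf(0.05)))
```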

7.
In this paper we examine the small-sample distribution of the likelihood ratio test in the random effects model that is often recommended for meta-analyses. We find that this distribution depends strongly on the true value of the heterogeneity parameter (the between-study variance) of the model, and that the correct p-value may be quite different from its large-sample approximation. We recommend that the dependence on the heterogeneity parameter be examined for the data at hand, and we suggest a simulation method for doing so. Our setup allows for explanatory variables at the study level (meta-regression), and we discuss other possible applications as well. Two data sets are analyzed and two simulation studies are performed for illustration.
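The suggested simulation approach can be sketched as follows for a toy random-effects model with known within-study variances; the likelihood ratio test shown is for the overall mean, and all numbers are hypothetical rather than taken from the paper:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

def nll(tau2, y, v, mu=None):
    """Negative log-likelihood of the random-effects model, profiling mu if None."""
    w = 1.0 / (v + tau2)
    m = np.sum(w * y) / np.sum(w) if mu is None else mu
    return 0.5 * np.sum(np.log(v + tau2) + w * (y - m) ** 2)

def lrt(y, v):
    """Likelihood ratio statistic for H0: mu = 0, with tau^2 estimated by ML."""
    f = lambda restrict: minimize_scalar(
        nll, bounds=(1e-8, 10 * np.var(y) + 1e-8), method="bounded",
        args=(y, v, 0.0 if restrict else None)).fun
    return 2.0 * (f(True) - f(False))

rng = np.random.default_rng(2)
v = rng.uniform(0.05, 0.3, size=8)     # within-study variances (assumed known)
tau2_true = 0.1                        # the null distribution depends on this value

stats_null = [lrt(rng.normal(0.0, np.sqrt(v + tau2_true)), v) for _ in range(2000)]
print("simulated 95th percentile:", round(np.quantile(stats_null, 0.95), 3),
      " vs chi2(1):", round(chi2.ppf(0.95, 1), 3))
```

Repeating the simulation for different values of `tau2_true` shows how strongly the null distribution, and hence the correct p-value, depends on the heterogeneity parameter.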

8.
The estimation of probability densities from available data is a central task in many statistical applications. Especially for large ensembles with many samples or high-dimensional sample spaces, computationally efficient methods are needed. We propose a new method based on a decomposition of the unknown distribution in terms of so-called distribution elements (DEs). These elements enable an adaptive and hierarchical discretization of the sample space, with small elements in regions where the density varies strongly and large elements where it varies smoothly. The novel refinement strategy we propose is based on statistical goodness-of-fit and pairwise (as an approximation to mutual) independence tests that evaluate the local approximation of the distribution in terms of DEs. The capabilities of the new method are examined on several examples of different dimensionality and compared successfully with other state-of-the-art density estimators.

9.
Götze & Künsch (1990) announced that a certain version of the bootstrap percentile-t method, and the blocking method, can be used to improve on the normal approximation to the distribution of a Studentized statistic computed from dependent data. This paper shows that this result depends fundamentally on the method of Studentization. Indeed, if the percentile-t method is implemented naively, for dependent data, then it does not improve by an order of magnitude on the much simpler normal approximation despite all the computational effort that is required to implement it. On the other hand, if the variance estimator used for the percentile-t bootstrap is adjusted appropriately, then percentile-t can improve substantially on the normal approximation.
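A generic moving-block percentile-t sketch is given below. It uses a simple batch-means variance estimate for the Studentization, applied consistently to the original series and to each resample, which is the kind of matching the paper argues is essential; it is not Götze & Künsch's exact construction, and all block sizes and data are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)

def block_var(x, b):
    """Batch-means variance estimate of the sample mean for dependent data."""
    k = len(x) // b
    means = x[: k * b].reshape(k, b).mean(axis=1)
    return means.var(ddof=1) / k

def percentile_t_ci(x, b=5, B=2000, level=0.95):
    """Moving-block bootstrap percentile-t CI for the mean of a dependent series."""
    n, xbar, se = len(x), x.mean(), np.sqrt(block_var(x, b))
    starts = np.arange(n - b + 1)
    tstats = []
    for _ in range(B):
        idx = rng.choice(starts, size=n // b, replace=True)
        xb = np.concatenate([x[s : s + b] for s in idx])
        tstats.append((xb.mean() - xbar) / np.sqrt(block_var(xb, b)))
    lo, hi = np.quantile(tstats, [(1 - level) / 2, (1 + level) / 2])
    return xbar - hi * se, xbar - lo * se    # note the quantile reversal

# Toy AR(1) series with serial dependence.
e = rng.standard_normal(400)
x = np.empty(400)
x[0] = e[0]
for t in range(1, 400):
    x[t] = 0.5 * x[t - 1] + e[t]
print("percentile-t CI for the mean:", percentile_t_ci(x))
```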

10.
Staudte, R.G. & Zhang, J. (1997). Lifetime Data Analysis, 3(4), 383–398.
The p-value evidence for an alternative to a null hypothesis about the mean lifetime can be unreliable if it is based on asymptotic approximations when only a small sample of right-censored exponential data is available. However, a guarded weight of evidence for the alternative can always be obtained without approximation, no matter how small the sample, and it has some other advantages over p-values. Weights of evidence are defined as estimators of 0 when the null hypothesis is true and of 1 when the alternative is true, and they are judged on the basis of the ensuing risks, where risk is the mean squared error of estimation. The evidence is guarded in that a preassigned bound is placed on the risk under the hypothesis. Practical suggestions are given for choosing the bound and for interpreting the magnitude of the weight of evidence. Acceptability profiles are obtained by inverting a family of guarded weights of evidence for two-sided alternatives to point hypotheses, just as confidence intervals are obtained from tests; these profiles are arguably more informative than confidence intervals, and they are easily determined for any level and any sample size, however small. They can help in understanding the effects of different amounts of censoring. They are computed for several small data sets, including a sample of size 12 for post-operative cancer patients. Both singly Type I and singly Type II censored examples are included. An examination of the risk functions of these guarded weights of evidence suggests that if the censoring time is of the same magnitude as the mean lifetime, or larger, then the risks in using a guarded weight of evidence based on a likelihood ratio are not much larger than they would be if the parameter were known.

11.
This article describes a method for computing approximate statistics for large data sets, when exact computations may not be feasible. Such situations arise in applications such as climatology, data mining, and information retrieval (search engines). The key to our approach is a modular approximation to the cumulative distribution function (cdf) of the data. Approximate percentiles (as well as many other statistics) can be computed from this approximate cdf. This enables the reduction of a potentially overwhelming computational exercise into smaller, manageable modules. We illustrate the properties of this algorithm using a simulated data set. We also examine the approximation characteristics of the approximate percentiles, using a von Mises functional-type approach. In particular, it is shown that the maximum error between the approximate cdf and the actual cdf of the data is never more than 1% (or any other preset level). We also show that under assumptions of underlying smoothness of the cdf, the approximation error is much lower in an expected sense. Finally, we derive bounds for the approximation error of the percentiles themselves. Simulation experiments show that these bounds can be quite tight in certain circumstances.
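A much-simplified version of the idea, using a fixed fine grid rather than the paper's adaptive modular cdf, might look like the following (the grid and chunk sizes are hypothetical); the percentile error is bounded by the bin width:

```python
import numpy as np

def approx_cdf_counts(chunks, edges):
    """Accumulate bin counts over data chunks to build an approximate cdf."""
    counts = np.zeros(len(edges) - 1, dtype=np.int64)
    for chunk in chunks:
        counts += np.histogram(chunk, bins=edges)[0]
    return counts

def approx_percentile(counts, edges, q):
    """Invert the binned cdf; the error is bounded by the bin width."""
    cdf = np.cumsum(counts) / counts.sum()
    i = np.searchsorted(cdf, q)
    return edges[i + 1]            # right edge of the bin containing the q-quantile

rng = np.random.default_rng(4)
edges = np.linspace(-6, 6, 1201)                              # 0.01-wide bins
chunks = (rng.standard_normal(100_000) for _ in range(20))    # stream of modules
counts = approx_cdf_counts(chunks, edges)

for q in (0.5, 0.95, 0.99):
    print(q, round(approx_percentile(counts, edges, q), 3))
```

Because only the bin counts are carried between modules, an arbitrarily large data set can be processed chunk by chunk while keeping the cdf (and hence percentile) error within the chosen resolution.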

12.
The beta-binomial distribution, which is generated by a simple mixture model, has been widely applied in the social, physical, and health sciences. Problems of estimation, inference, and prediction have been addressed in the past, but not in a Bayesian framework. This article develops Bayesian procedures for the beta-binomial model and, using a suitable reparameterization, establishes a conjugate-type property for a beta family of priors. The transformed parameters have interesting interpretations, especially in marketing applications, and are likely to be more stable. More specifically, one of these parameters is the market share and the other is a measure of the heterogeneity of the customer population. Analytical results are developed for the posterior and predictive quantities, although their numerical evaluation is not trivial. Since the posterior moments are more easily calculated, we also propose approximating the posterior using the Pearson system. A particular case (two trials), which occurs in taste testing, brand choice, media exposure, and some epidemiological applications, is analyzed in detail. Simulated and real data are used to demonstrate the feasibility of the calculations. The simulation results demonstrate the superiority of the Bayesian estimators, particularly in small samples, even with uniform ("non-informative") priors; informative priors can give even better results. The real data on television viewing behavior are used to illustrate the prediction results. In our analysis, several problems with the maximum likelihood estimators are encountered. The superior properties and performance of the Bayesian estimators and the excellent approximation results strongly indicate that our results will be of high value in small-sample applications of the beta-binomial and in cases where significant prior information exists.
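A minimal grid-posterior sketch in the spirit of the reparameterization described above, with hypothetical data, a uniform prior, and the common choice pi = a/(a+b) (the "market share") and theta = 1/(a+b+1) (a heterogeneity measure); the article's exact development and priors may differ:

```python
import numpy as np
from scipy.stats import betabinom

# Data: number of "successes" out of m = 2 trials per customer (e.g., two exposures).
m = 2
counts = np.array([0, 1, 2, 0, 1, 1, 2, 0, 0, 1, 2, 1, 0, 1, 0])

# Grid over (pi, theta), both in (0, 1); map back to the usual (a, b) shape parameters.
pi_grid = np.linspace(0.01, 0.99, 99)
th_grid = np.linspace(0.01, 0.99, 99)
P, T = np.meshgrid(pi_grid, th_grid, indexing="ij")
a = P * (1 - T) / T
b = (1 - P) * (1 - T) / T

# Log-likelihood on the grid with a uniform prior on (pi, theta).
loglik = sum(betabinom.logpmf(y, m, a, b) for y in counts)
post = np.exp(loglik - loglik.max())
post /= post.sum()

print("posterior mean of pi   :", round(float(np.sum(post * P)), 3))
print("posterior mean of theta:", round(float(np.sum(post * T)), 3))
```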

13.
The Kolassa method implemented in the nQuery Advisor software has been widely used for approximating the power of the Wilcoxon–Mann–Whitney (WMW) test for ordered categorical data; it uses an Edgeworth approximation to estimate the power of an unconditional test based on the WMW U statistic. When the sample size is small or when the two group sizes are unequal, Kolassa's method may yield a rather poor approximation to the power of the conditional WMW test that is commonly implemented in statistical packages. Two modifications of Kolassa's formula are proposed and assessed in simulation studies.
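Rather than the Edgeworth-based formula, a direct simulation of the power of the WMW test for ordered categorical data with small, unequal groups can serve as a benchmark; the category probabilities and sample sizes below are hypothetical:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(5)

# Four ordered categories with different probabilities in the two groups.
p_control   = [0.40, 0.30, 0.20, 0.10]
p_treatment = [0.20, 0.30, 0.30, 0.20]
n1, n2, alpha, reps = 20, 12, 0.05, 4000     # deliberately small, unequal groups

rejections = 0
for _ in range(reps):
    x = rng.choice(4, size=n1, p=p_control)
    y = rng.choice(4, size=n2, p=p_treatment)
    p = mannwhitneyu(x, y, alternative="two-sided").pvalue
    rejections += p < alpha
print("simulated power:", rejections / reps)
```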

14.
Case–control studies allow efficient estimation of the associations of covariates with a binary response in settings where the probability of a positive response is small. It is well known that covariate–response associations can be consistently estimated with a logistic model by acting as if the case–control (retrospective) data were prospective, and that this result does not hold for other binary regression models. In practice, however, an investigator may be interested in fitting a binary regression model with a non-logistic link, and this paper examines the magnitude of the bias that results from ignoring the case–control sample design with such models. The paper presents an approximation to the magnitude of this bias in terms of the sampling rates of cases and controls, as well as simulation results showing that the bias can be substantial.
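The bias is easy to see by simulation. The sketch below (hypothetical coefficients, with probit as the non-logistic link) fits the probit model to the full prospective data and then naively to a case–control sample:

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(6)

# Population with a rare outcome generated from a probit (non-logistic) model.
N = 200_000
x = rng.standard_normal(N)
y = rng.binomial(1, norm.cdf(-2.5 + 0.5 * x))      # true slope = 0.5

# Case-control sampling: keep all cases and an equal number of controls.
cases = np.flatnonzero(y == 1)
controls = rng.choice(np.flatnonzero(y == 0), size=len(cases), replace=False)
idx = np.concatenate([cases, controls])

# The probit fit on the full (prospective) data recovers the slope ...
print("prospective probit slope :",
      round(sm.Probit(y, sm.add_constant(x)).fit(disp=0).params[1], 3))
# ... while naively fitting the retrospective sample as if it were prospective does not.
print("case-control probit slope:",
      round(sm.Probit(y[idx], sm.add_constant(x[idx])).fit(disp=0).params[1], 3))
```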

15.
Whittemore (1981) proposed an approach for calculating the sample size needed to test hypotheses, with specified significance and power against a given alternative, for logistic regression with small response probability. Based on the distribution of the covariate, which may be either discrete or continuous, this approach first provides a simple closed-form approximation to the asymptotic covariance matrix of the maximum likelihood estimates and then uses it to calculate the sample size needed to test a hypothesis about the parameter. Self et al. (1992) described a general approach for power and sample size calculations within the framework of generalized linear models, which include logistic regression as a special case. Their approach is based on an approximation to the distribution of the likelihood ratio statistic. Unlike the Whittemore approach, it is not limited to situations with small response probability; however, it is restricted to models with a finite number of covariate configurations. This study compares the two approaches to see how accurately they calculate power and sample size in logistic regression models with various response probabilities and covariate distributions. The results indicate that the Whittemore approach has a slight advantage in achieving the nominal power only in one case with small response probability, and it is outperformed in all other cases with larger response probabilities. In general, the approach of Self et al. (1992) is recommended for all values of the response probability. Its extension to logistic regression models with an infinite number of covariate configurations, however, involves an arbitrary decision about categorization and leads to a discrete approximation. As shown in this paper, the examined discrete approximations appear to be sufficiently accurate for practical purposes.
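Both approximations can be checked against empirical power obtained by brute-force simulation. The following sketch (hypothetical coefficients giving a small response probability; it is not either of the two approximations compared in the study) estimates the power of the Wald test for the slope at a given sample size:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)

def simulated_power(n, beta0, beta1, reps=1000, alpha=0.05):
    """Empirical power of the Wald test for beta1 = 0 in logistic regression."""
    hits = 0
    for _ in range(reps):
        x = rng.standard_normal(n)
        y = rng.binomial(1, 1 / (1 + np.exp(-(beta0 + beta1 * x))))
        try:
            fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
            hits += fit.pvalues[1] < alpha
        except Exception:        # rare non-convergence when events are sparse
            pass
    return hits / reps

# An intercept of -3 gives a small response probability (roughly 5-6% positives).
print("empirical power at n = 400:", simulated_power(400, -3.0, 0.6))
```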

16.
Based on type II censored data, an exact lower confidence limit is constructed for the reliability function of a two-parameter exponential distribution, using the concept of a generalized confidence interval due to Weerahandi (J. Amer. Statist. Assoc. 88 (1993) 899). It is shown that the interval is exact, i.e., it provides the intended coverage. The confidence limit has to be numerically obtained; however, the required computations are simple and straightforward. An approximation is also developed for the confidence limit and its performance is numerically investigated. The numerical results show that compared to what is currently available, our approximation is more satisfactory in terms of providing the intended coverage, especially for small samples.

17.
A drawback of a new method for integrating abundance and mark–recapture–recovery data is the need to combine likelihoods describing the different data sets. Often these likelihoods will be formed by using specialist computer programs, which is an obstacle to the joint analysis. This difficulty is easily circumvented by the use of a multivariate normal approximation. We show that it is only necessary to make the approximation for the parameters of interest in the joint analysis. The approximation is evaluated on data sets for two bird species and is shown to be efficient and accurate.
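A one-parameter toy version of the idea, combining the exact log-likelihood of one data set with a quadratic (normal) approximation to the likelihood of another that shares the same parameter, is sketched below; the real application involves multivariate parameters and output from specialist mark–recapture software, and all numbers here are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import binom

# Data set A (e.g., output of a specialist program): only the MLE and its
# variance are carried forward, i.e. a normal approximation to its likelihood.
y_a, n_a = 37, 120
p_a = y_a / n_a
var_a = p_a * (1 - p_a) / n_a          # inverse observed information

# Data set B: its exact log-likelihood is available.
y_b, n_b = 9, 40

def neg_joint_loglik(p):
    """Exact log-likelihood of B plus the quadratic approximation for A."""
    quad_a = -0.5 * (p - p_a) ** 2 / var_a
    return -(binom.logpmf(y_b, n_b, p) + quad_a)

fit = minimize_scalar(neg_joint_loglik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print("joint estimate of the shared parameter:", round(fit.x, 4))
```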

18.
Amparo Baíllo (2013). Statistics, 47(6), 553–569.
This work deals with estimating the vector of means of certain characteristics of small areas. In this context, a unit-level multivariate model with correlated sampling errors is considered. An approximation is obtained for the mean-squared and cross-product errors of the empirical best linear unbiased predictors of the means when the model parameters are estimated either by maximum likelihood (ML) or by restricted ML. The approach is illustrated in a Monte Carlo study using social and labour data from the Spanish Labour Force Survey.

19.
The second-order random walk (RW2) model is commonly used for smoothing data and for modelling response functions. It is computationally efficient due to the Markov properties of the joint (intrinsic) Gaussian density. For evenly spaced locations the RW2 model is well established, whereas for irregularly spaced locations there is no well-established construction in the literature. By considering the RW2 model as the solution of a stochastic differential equation (SDE), a discretely observed integrated Wiener process, it is possible to derive the density preserving the Markov properties by augmenting the state space with the velocities. Here, we derive a computationally more efficient RW2 model for irregular locations using a Galerkin approximation to the solution of the SDE, without the need to augment the state space. Numerical comparison with the exact solution demonstrates that the error in the Galerkin approximation is small and negligible in applications.

20.
We investigate regression from multiple reproducing kernel Hilbert spaces by means of an orthogonal greedy algorithm. The greedy algorithm is appealing because it uses only a small portion of the candidate kernels to represent the approximation of the regression function, and it can greatly reduce the computational burden of traditional multi-kernel learning. Satisfactory learning rates are obtained based on the Rademacher chaos complexity and data-dependent hypothesis spaces.
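An orthogonal greedy (OMP-style) selection over a multi-kernel dictionary can be sketched as follows; the kernels, data, and stopping rule are hypothetical, and the code does not reproduce the paper's learning-rate analysis:

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy regression data.
n = 120
x = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(x[:, 0]) + 0.1 * rng.standard_normal(n)

# Candidate dictionary: columns k_j(., x_i) from several Gaussian kernels.
widths = [0.2, 0.5, 1.0, 2.0]
sq = (x - x.T) ** 2
dictionary = np.hstack([np.exp(-sq / (2 * w**2)) for w in widths])   # n x (n*len(widths))

def orthogonal_greedy(D, y, n_atoms=15):
    """Orthogonal greedy selection of dictionary columns with full re-projection."""
    selected, residual = [], y.copy()
    for _ in range(n_atoms):
        scores = np.abs(D.T @ residual) / np.linalg.norm(D, axis=0)
        j = int(np.argmax(scores))
        if j not in selected:
            selected.append(j)
        coef, *_ = np.linalg.lstsq(D[:, selected], y, rcond=None)  # orthogonal projection
        residual = y - D[:, selected] @ coef
    return selected, coef

sel, coef = orthogonal_greedy(dictionary, y)
rmse = float(np.sqrt(np.mean((dictionary[:, sel] @ coef - y) ** 2)))
print("columns used:", len(sel), " training RMSE:", round(rmse, 4))
```

Only the selected columns enter the final least-squares fit, which is the computational saving over fitting with the full multi-kernel dictionary.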

