Similar Literature
1.
Biostatisticians recognize the importance of precise definitions of technical terms in randomized controlled clinical trial (RCCT) protocols, statistical analysis plans, and so on, in part because definitions are a foundation for subsequent actions. Imprecise definitions can be a source of controversies about appropriate statistical methods, interpretation of results, and extrapolations to larger populations. This paper presents precise definitions of some familiar terms and definitions of some new terms, some perhaps controversial. The glossary contains definitions that can be copied into a protocol, statistical analysis plan, or similar document and customized. The definitions were motivated and illustrated in the context of a longitudinal RCCT in which some randomized enrollees are non-adherent, receive a corrupted treatment, or withdraw prematurely. The definitions can be adapted for use in a much wider set of RCCTs. New terms can be used in place of controversial terms, for example, 'subject'. We define terms that specify a person's progress through an RCCT and that precisely define the RCCT's phases and milestones. We define terms that distinguish between subsets of an RCCT's enrollees and a much larger patient population. 'The intention-to-treat (ITT) principle' has multiple interpretations, which can be distilled to the definitions of the 'ITT analysis set of randomized enrollees'. Most differences among interpretations of 'the' ITT principle stem from an RCCT's primary objective (mainly efficacy versus effectiveness). Four different 'authoritative' definitions of the ITT analysis set of randomized enrollees illustrate the variety of interpretations. We propose a separate specification of the analysis set of data that will be used in a specific analysis. Copyright © 2016 John Wiley & Sons, Ltd.
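As a concrete illustration of how such definitions become operational, consider the sketch below. It is not from the paper: the enrollee fields (`randomized`, `treated`, `adherent`) and the particular set definitions are hypothetical stand-ins for whatever a given protocol specifies.

```python
# Hypothetical enrollee records; field names are illustrative only.
enrollees = [
    {"id": 1, "randomized": True,  "treated": True,  "adherent": True},
    {"id": 2, "randomized": True,  "treated": True,  "adherent": False},
    {"id": 3, "randomized": True,  "treated": False, "adherent": False},
    {"id": 4, "randomized": False, "treated": False, "adherent": False},
]

# One common reading of the ITT analysis set: every randomized enrollee,
# regardless of adherence, corrupted treatment, or premature withdrawal.
itt_set = [e["id"] for e in enrollees if e["randomized"]]

# A narrower 'per protocol' style set: randomized, treated, and adherent.
per_protocol_set = [e["id"] for e in enrollees
                    if e["randomized"] and e["treated"] and e["adherent"]]

print("ITT set:", itt_set)                    # [1, 2, 3]
print("Per-protocol set:", per_protocol_set)  # [1]
```

The paper's point is precisely that 'ITT analysis set' admits several such predicates, so the predicate itself belongs in the protocol rather than in the analyst's head.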

2.
Henryk Zähle, Statistics, 2013, 47(5): 951–964
Both Marcinkiewicz–Zygmund strong laws of large numbers (MZ-SLLNs) and ordinary strong laws of large numbers (SLLNs) for plug-in estimators of general statistical functionals are derived. The key observation is that if a statistical functional is 'sufficiently regular', then an (MZ-)SLLN for the estimator of the unknown distribution function yields an (MZ-)SLLN for the corresponding plug-in estimator. It is shown in particular that many L-, V- and risk functionals are 'sufficiently regular' and that known results on the strong convergence of the empirical process of α-mixing random variables can be improved. The approach presented not only covers some known results but also provides some new strong laws for plug-in estimators of particular statistical functionals.
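To fix ideas, a plug-in estimator simply evaluates the functional of interest at the empirical distribution function. A minimal sketch, assuming i.i.d. data and using an expected-shortfall risk functional as the example (my choice of functional, not necessarily one treated in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_shortfall(sample, alpha=0.95):
    # Plug-in estimate of T(F) = E[X | X >= F^{-1}(alpha)]: evaluate the
    # functional at the empirical distribution of the sample.
    x = np.sort(np.asarray(sample))
    k = int(np.ceil(alpha * len(x)))
    return x[k - 1:].mean()

# An SLLN for the plug-in estimator in action: as n grows, the estimate
# settles at the true functional value (here, for a standard normal).
for n in (10**2, 10**4, 10**6):
    print(n, round(expected_shortfall(rng.standard_normal(n)), 4))
```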

3.
In recent years, growing attention has been paid to the pattern of 'clumpy data' in many empirical areas, such as financial market microstructure, criminology, seismology, and digital media consumption, but a well-defined and careful measurement of clumpiness has remained somewhat elusive. The related 'hot hand' effect has long been a widespread belief in sports and has triggered a branch of interesting research that could shed some light on this domain. However, since many concerns have been raised about the low power of the existing 'hot hand' significance tests, we propose a new class of clumpiness measures that are shown to have higher statistical power in extensive simulations under a wide variety of statistical models for repeated outcomes. Finally, an empirical study is provided using a unique dataset obtained from Hulu.com, an increasingly popular video streaming provider. Our results provide evidence that the 'clumpiness' phenomenon is widely prevalent in digital content consumption, which supports the present-day lore of the 'bingeability' of online content.
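For intuition, one entropy-type measure of clumpiness scores the normalized gaps between consecutive events: it is 0 when events are evenly spread over the observation window and approaches 1 as they bunch together. The sketch below is a generic example of this class, not necessarily one of the paper's proposed measures.

```python
import numpy as np

def clumpiness(event_times, horizon):
    # Entropy of the normalized inter-event gaps, rescaled so that evenly
    # spaced events score 0 and tightly clumped events score close to 1.
    t = np.sort(np.asarray(event_times, dtype=float))
    gaps = np.diff(np.concatenate(([0.0], t, [horizon])))
    x = gaps[gaps > 0] / horizon          # normalized gaps sum to 1
    n = len(t)
    return 1.0 + np.sum(x * np.log(x)) / np.log(n + 1)

print(clumpiness([25, 50, 75], 100))      # evenly spread -> 0.0
print(clumpiness([48, 50, 52], 100))      # clumped       -> ~0.38
```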

4.
This paper makes the proposition that the only statistical analyses to achieve widespread popular use in statistical practice are those whose formulations are based on very smooth mathematical functions. The argument is made on an empirical basis, through examples. Given the truth of the proposition, the question 'why should it be so?' is intriguing, and any discussion has to be speculative. To aid that discussion, the paper starts with a list of statistical desiderata, with a view to seeing which properties underlying smoothness provides. This provides some rationale for the proposition. After that, the examples are considered. Methods that are widely used are listed, along with other methods that, despite impressive properties and possible early promise, have languished in the arena of practical application. Whatever the underlying causes may be, the proposition carries a worthwhile message for the formulation of new statistical methods, and for the adaptation of some of the old ones.

5.
Within the context of California's public report of coronary artery bypass graft (CABG) surgery outcomes, we first thoroughly review popular statistical methods for profiling healthcare providers. Extensive simulation studies are then conducted to compare profiling schemes based on hierarchical logistic regression (LR) modeling under various conditions. Both Bayesian and frequentist methods are evaluated in classifying hospitals into 'better', 'normal' or 'worse' service providers. The simulation results suggest that no single method dominates the others on all accounts. Traditional schemes based on LR tend to identify too many false outliers, while those based on hierarchical modeling are relatively conservative. The issue of over-shrinkage in hierarchical modeling is also investigated using the 2005–2006 California CABG data set. The article provides theoretical and empirical evidence for choosing the right methodology for provider profiling.
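The false-outlier behaviour can be seen in a toy version of the traditional non-hierarchical rule. In the simulation below (entirely synthetic data, not the California CABG set), every hospital truly performs at the common rate, yet per-hospital tests at the 5% level still flag a few 'outliers', which is exactly the tendency that hierarchical shrinkage is meant to temper.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Synthetic data: 50 hospitals, all operating at the same true rate.
n_hosp, p = 50, 0.03
volume = rng.integers(100, 800, n_hosp)       # cases per hospital
observed = rng.binomial(volume, p)            # observed deaths
expected = volume * p                         # risk-model expectation

# Simple (non-hierarchical) rule: standardize observed minus expected
# deaths and classify outside +-z_{alpha/2} as 'worse' or 'better'.
z = (observed - expected) / np.sqrt(volume * p * (1 - p))
cut = norm.ppf(0.975)
labels = np.where(z > cut, "worse", np.where(z < -cut, "better", "normal"))
vals, counts = np.unique(labels, return_counts=True)
print(dict(zip(vals, counts)))   # expect ~2-3 false flags among 50
```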

6.
A crucial component in the statistical simulation of a computationally expensive model is a good design of experiments. In this paper we compare the efficiency of columnwise–pairwise (CP) and genetic algorithms for the optimization of Latin hypercubes (LH) for the purpose of sampling in statistical investigations. The experiments indicate, among other results, that CP methods are most efficient for small and medium-sized LHs, while the adopted genetic algorithm performs better for large LHs. Two optimality criteria suggested in the literature are evaluated with respect to statistical properties and efficiency. The results obtained lead us to favor a criterion based on the physical analogy of minimizing the forces between charged particles, suggested in Audze and Eglais (1977, Problems Dyn. Strength 35, 104–107), over the 'maximin distance' criterion of Johnson et al. (1990, J. Statist. Plann. Inference 26, 131–148).
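Both criteria are cheap to compute for a given LH, and the CP idea can be caricatured as 'propose a within-column swap, keep it if the criterion improves'. A minimal sketch under that simplification (the full CP algorithm is more systematic than this greedy loop):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)

def latin_hypercube(n, k):
    # n-point LH in [0,1]^k: one random permutation of strata per column.
    return np.column_stack([(rng.permutation(n) + 0.5) / n for _ in range(k)])

def audze_eglais(X):
    # Audze-Eglais 'potential energy': sum over point pairs of the inverse
    # squared distance, by analogy with repelling charges (minimize).
    return sum(1.0 / np.sum((a - b) ** 2) for a, b in combinations(X, 2))

def maximin(X):
    # Johnson et al.'s criterion: smallest inter-point distance (maximize).
    return min(np.linalg.norm(a - b) for a, b in combinations(X, 2))

X = latin_hypercube(20, 2)
best = audze_eglais(X)
for _ in range(2000):                 # greedy within-column swaps
    j = rng.integers(X.shape[1])
    a, b = rng.choice(X.shape[0], size=2, replace=False)
    X[[a, b], j] = X[[b, a], j]       # propose a swap in column j
    val = audze_eglais(X)
    if val < best:
        best = val                    # keep improving swaps
    else:
        X[[a, b], j] = X[[b, a], j]   # undo the rest
print("AE energy:", round(best, 2), "maximin:", round(maximin(X), 3))
```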

7.
Model choice is one of the most crucial aspects of any statistical data analysis. It is well known that most models are just an approximation to the true data-generating process, but among such model approximations it is our goal to select the 'best' one. Researchers typically consider a finite number of plausible models in statistical applications, and the related statistical inference depends on the chosen model. Hence, model comparison is required to identify the 'best' model among several such candidate models. This article considers the problem of model selection for spatial data. The issue of model selection for spatial models has been addressed in the literature by the use of traditional information criteria-based methods, even though such criteria were developed under the assumption of independent observations. We evaluate the performance of some of the popular model selection criteria via Monte Carlo simulation experiments using small to moderate samples. In particular, we compare the performance of some of the most popular information criteria, such as the Akaike information criterion (AIC), the Bayesian information criterion, and the corrected AIC, in selecting the true model. The ability of these criteria to select the correct model is evaluated under several scenarios. This comparison is made using various spatial covariance models, ranging from stationary isotropic to nonstationary models.
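The three criteria in question are simple functions of the maximized log-likelihood, and the comparison amounts to asking which of them picks the true covariance model most often when the independence assumption behind their derivation is violated. For reference, a sketch with hypothetical fitted values (smaller is better in each case):

```python
import numpy as np

def information_criteria(loglik, k, n):
    # loglik: maximized log-likelihood; k: number of estimated parameters;
    # n: number of observations. AICc adds a small-sample penalty to AIC.
    aic = -2 * loglik + 2 * k
    bic = -2 * loglik + k * np.log(n)
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    return {"AIC": aic, "BIC": bic, "AICc": aicc}

# Hypothetical fits of two spatial covariance models to the same n = 60 sites.
print(information_criteria(loglik=-123.4, k=5, n=60))
print(information_criteria(loglik=-121.9, k=8, n=60))
```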

8.
A note on the correlation structure of transformed Gaussian random fields
Transformed Gaussian random fields can be used to model continuous time series and spatial data when the Gaussian assumption is not appropriate. The main features of these random fields are specified in a transformed scale, while for modelling and parameter interpretation it is useful to establish connections between these features and those of the random field in the original scale. This paper provides evidence that for many 'normalizing' transformations the correlation function of a transformed Gaussian random field depends little on the transformation that is used. Hence many commonly used transformations of correlated data have little effect on the original correlation structure. The property is shown to hold for some kinds of transformed Gaussian random fields, and a statistical explanation based on the concept of parameter orthogonality is provided. The property is also illustrated using two spatial datasets and several 'normalizing' transformations. Some consequences of this property for modelling and inference are also discussed.
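The claim is easy to probe by simulation: generate correlated Gaussian pairs, push both margins through a transformation, and see how much the correlation moves. A minimal sketch (my construction, not the paper's datasets):

```python
import numpy as np

rng = np.random.default_rng(3)

def corr_after_transform(rho, g, n=200_000):
    # Correlation of (g(Z1), g(Z2)) when (Z1, Z2) is standard bivariate
    # normal with correlation rho.
    z1 = rng.standard_normal(n)
    z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    return np.corrcoef(g(z1), g(z2))[0, 1]

for rho in (0.2, 0.5, 0.8):
    print(rho,
          round(corr_after_transform(rho, np.cbrt), 3),   # mild transform
          round(corr_after_transform(rho, np.exp), 3))    # stronger transform
```

Mild 'normalizing' transformations barely move the correlation; even the exponential, a fairly aggressive choice, only attenuates it moderately.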

9.
Forecasting methods are reviewed. They may be classified into univariate, multivariate and judgemental methods, and also by whether an automatic or non-automatic approach is adopted. The choice of 'best' method depends on a wide variety of considerations. The use of forecasting competitions to compare the accuracy of univariate methods is discussed. The strengths and weaknesses of different univariate methods are compared, both in automatic and non-automatic mode. Some general recommendations are made, as well as some suggestions for future research.

10.
Recently, Lad, Sanfilippo, and Agrò [(2015), 'Extropy: Complementary Dual of Entropy', Statistical Science, 30, 40–58] showed that the measure of entropy has a complementary dual, which is termed extropy. The present article introduces some estimators of the extropy of a continuous random variable. Properties of the proposed estimators are stated, and comparisons are made with Qiu and Jia's estimators [(2018a), 'Extropy Estimators with Applications in Testing Uniformity', Journal of Nonparametric Statistics, 30, 182–196]. The results indicate that the proposed estimators have a smaller mean squared error than competing estimators. A real example is presented and analysed.
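The extropy of a continuous variable is J(X) = -(1/2) ∫ f(x)² dx, and since ∫ f² = E[f(X)], one simple estimator plugs a kernel density estimate into that expectation. The sketch below is a generic plug-in estimator given for illustration; it is not one of the proposed or competing estimators from the article.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(4)

def extropy_plugin(sample):
    # J(X) = -(1/2) * integral of f^2 = -(1/2) * E[f(X)], with f replaced
    # by a Gaussian KDE evaluated at the sample points.
    kde = gaussian_kde(sample)
    return -0.5 * np.mean(kde(sample))

x = rng.uniform(0, 1, 5000)
print(extropy_plugin(x))   # true extropy of U(0,1) is -1/2; KDE boundary
                           # bias pulls the estimate slightly toward zero
```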

11.
Statistical disclosure control (SDC) is a balancing act between mandatory data protection and researchers' understandable demand for access to original data. In this paper, a family of methods is defined to 'mask' sensitive variables before data files can be released. In the first step, the variable to be masked is 'cloned' (C). Then, the duplicated variable, as a whole or just a part of it, is 'suppressed' (S). The masking procedure's third step 'imputes' (I) data for these artificial missing values. Then, the original variable can be deleted and its masked substitute has to serve as the basis for the analysis of data. The idea of this general 'CSI framework' is to open the wide field of imputation methods for SDC. The method applied in the I-step can make use of available auxiliary variables, including the original variable. Different members of this family of methods that deliver variance estimators are discussed in some detail. Furthermore, a simulation study analyzes various methods belonging to the family with respect to both the quality of parameter estimation and privacy protection. Based on the results obtained, recommendations are formulated for different estimation tasks.
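A toy run of the C-S-I pipeline makes the three steps concrete. Everything below is illustrative: the data are simulated, and the I-step uses one simple choice (regression on an auxiliary variable plus noise) from the 'wide field of imputation methods' the framework admits.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated microdata: sensitive y, auxiliary x available for imputation.
n = 1000
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=n)

y_masked = y.copy()                        # C: clone the sensitive variable
suppress = rng.random(n) < 0.30            # S: suppress part of the clone
y_masked[suppress] = np.nan

keep = ~suppress                           # I: impute the artificial missings
slope, intercept = np.polyfit(x[keep], y_masked[keep], 1)
resid_sd = np.std(y_masked[keep] - (intercept + slope * x[keep]))
y_masked[suppress] = (intercept + slope * x[suppress]
                      + rng.normal(scale=resid_sd, size=suppress.sum()))

# The original y would now be deleted; analysis uses the masked substitute.
print("mean:", round(y.mean(), 3), "->", round(y_masked.mean(), 3))
```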

12.
张伦俊, 《统计研究》 (Statistical Research), 1997, 14(5): 31–38
Since the Emeishan conference of 1978, the long-confined thinking of China's statistical community has at last been greatly liberated, and a range of differing academic views on fundamental theoretical questions, such as the nature of statistics and its object of study, have sprung up in the spring breeze of reform and opening up. Since the idea of a 'grand statistics' discipline was put forward, it has won the support and welcome of the great majority in the statistical community, and people eagerly await a new era in the development of Chinese statistics. Following the track of time, this paper briefly reviews some influential viewpoints from the debates over Chinese statistics during the past twenty years.

13.
Comparisons of multivariate normal populations are made using a multivariate approach (instead of reducing the problem to a univariate one). A rather negative finding is that, for comparisons with the 'best' of each variate, repeated univariate comparisons appear to be almost as efficient as multivariate comparisons, at least for the bivariate case and, under certain circumstances, for higher-dimensional cases. Comparisons are investigated with the 'MAX-best' population (the one having the largest maximum of the marginal means), the 'MIN-best' (having the largest minimum) and the 'O-best' (being closest to largest in all marginal means). Detailed results are given for the bivariate normal, with extensions indicated for the multivariate case.

14.
Nonstationary panel data analysis: an overview of some recent developments
This paper overviews some recent developments in panel data asymptotics, concentrating on the nonstationary panel case, and gives a new result for models with individual effects. Underlying the recent theory are asymptotics for multi-indexed processes in which both indexes may pass to infinity. We review some of the new limit theory that has been developed, show how it can be applied, and give a new interpretation of individual effects in nonstationary panel data. Fundamental to the interpretation of much of the asymptotics is the concept of a panel regression coefficient, which measures the long-run average relation across a section of the panel. This concept is analogous to the statistical interpretation of the coefficient in a classical regression relation. A variety of nonstationary panel data models are discussed, and the paper reviews the asymptotic properties of estimators in these various models. Some recent developments in panel unit root tests and stationary dynamic panel regression models are also reviewed.

15.
The field of nonparametric function estimation has broadened its appeal in recent years with an array of new tools for statistical analysis. In particular, theoretical and applied research on wavelets has had a noticeable influence on statistical topics such as nonparametric regression, nonparametric density estimation, nonparametric discrimination and many other related topics. This is a survey article that attempts to synthesize a broad variety of work on wavelets in statistics and includes some recent developments in nonparametric curve estimation that have been omitted from review articles and books on the subject. After a short introduction to wavelet theory, wavelets are treated in the familiar context of estimation of 'smooth' functions. Both 'linear' and 'nonlinear' wavelet estimation methods are discussed, and cross-validation methods for choosing the smoothing parameters are addressed. Finally, some areas of related research are mentioned, such as hypothesis testing, model selection, hazard rate estimation for censored data, and nonparametric change-point problems. The closing section formulates some promising research directions relating to wavelets in statistics.
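The core of nonlinear wavelet regression fits in a few lines: transform, shrink the detail coefficients toward zero, transform back. A deliberately minimal sketch with a one-level Haar transform and the universal threshold (practical estimators use multilevel transforms and data-driven thresholds):

```python
import numpy as np

rng = np.random.default_rng(6)

def haar_denoise(y, thresh):
    # One-level orthonormal Haar transform of y (length must be even),
    # soft-thresholding of the detail coefficients, then inversion.
    s = (y[0::2] + y[1::2]) / np.sqrt(2)                  # smooth part
    d = (y[0::2] - y[1::2]) / np.sqrt(2)                  # detail part
    d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0.0)  # soft threshold
    out = np.empty_like(y)
    out[0::2] = (s + d) / np.sqrt(2)
    out[1::2] = (s - d) / np.sqrt(2)
    return out

t = np.linspace(0, 1, 1024)
signal = np.where(t < 0.5, 0.0, 1.0)          # piecewise-constant target
sigma = 0.2
noisy = signal + sigma * rng.standard_normal(t.size)
universal = sigma * np.sqrt(2 * np.log(t.size))
denoised = haar_denoise(noisy, universal)
print("noisy MSE:   ", round(np.mean((noisy - signal) ** 2), 4))
print("denoised MSE:", round(np.mean((denoised - signal) ** 2), 4))
```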

16.
Two analysis-of-means-type randomization tests for testing the equality of I variances for unbalanced designs are presented. Randomization techniques for testing statistical hypotheses can be used when parametric tests are inappropriate. Suppose that I independent samples have been collected. Randomization tests are based on shuffles, or rearrangements, of the (combined) sample. Putting each of the I samples 'in a bowl' forms the combined sample. Drawing samples 'from the bowl' forms a shuffle. Shuffles can be made with replacement (bootstrap shuffling) or without replacement (permutation shuffling). The tests presented offer two advantages: they are robust to non-normality, and they allow the user to present the results graphically via a decision chart similar to a Shewhart control chart. A Monte Carlo study is used to verify that the permutation version of the tests exhibits excellent power when compared with other robust tests. The Monte Carlo study also identifies circumstances under which the popular Levene's test fails.
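A permutation version of such a test is short to write down. The sketch below uses one reasonable statistic (the largest absolute log-ratio of a group variance to the average variance) and, for simplicity, assumes the groups share a common mean; it illustrates the shuffling mechanics rather than the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(7)

def perm_test_variances(samples, n_shuffles=5000):
    # 'Bowl' = the combined sample; each shuffle deals it back out into
    # groups of the original sizes without replacement (permutation shuffle).
    sizes = [len(s) for s in samples]
    bowl = np.concatenate(samples)

    def stat(groups):
        v = np.array([np.var(g, ddof=1) for g in groups])
        return np.max(np.abs(np.log(v / v.mean())))

    observed = stat(samples)
    hits = 0
    for _ in range(n_shuffles):
        groups = np.split(rng.permutation(bowl), np.cumsum(sizes)[:-1])
        hits += stat(groups) >= observed
    return hits / n_shuffles

a = rng.normal(0, 1.0, 25)
b = rng.normal(0, 1.0, 40)
c = rng.normal(0, 2.5, 15)     # one group with an inflated variance
print("p-value:", perm_test_variances([a, b, c]))
```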

17.
Let Y be distributed symmetrically about Xβ. Natural generalizations of odd location statistics, say T(Y), and even location-free statistics, say W(Y), that were used by Hogg (1960, 1967) are introduced. We show that T(Y) is distributed symmetrically about β, and thus E[T(Y)] = β, and that each element of T(Y) is uncorrelated with each element of W(Y). Applications of this result are made to R-estimators, and the result is extended to a multivariate linear model situation.

18.
Rao (J. Indian Statist. Assoc. 17 (1979) 125) has given a 'necessary form' for an unbiased mean square error (MSE) estimator to be 'uniformly non-negative'. The MSE is that of a homogeneous linear estimator, 'subject to a specified constraint', of a survey population total of a real variable of interest. We present a corresponding theorem when the 'constraint' is relaxed. Further results present formulae for MSE estimators when the variate values for the sampled individuals are not ascertainable. Though not ascertainable, these values are supposed to be suitably estimated, either by (1) randomized response techniques covering sensitive issues or by (2) further sampling in 'subsequent' stages in specific ways, when the initial sampling units are composed of a number of sub-units. Using live numerical data, practical uses of the proposed alternative MSE estimators are demonstrated.
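On point (1), Warner's classic randomized response design is the standard concrete instance: each respondent answers the sensitive question with probability p and its complement otherwise, so no individual answer is revealing, yet the population proportion remains estimable. The sketch below implements that textbook estimator as background; it is not the MSE machinery of the paper itself.

```python
import numpy as np

rng = np.random.default_rng(8)

def warner_estimate(answers, p):
    # Warner's model with p != 1/2: P(yes) = (2p - 1) * pi + (1 - p), so
    # pi_hat inverts the observed 'yes' proportion lam.
    lam = np.mean(answers)
    pi_hat = (lam + p - 1) / (2 * p - 1)
    var_hat = lam * (1 - lam) / (len(answers) * (2 * p - 1) ** 2)
    return pi_hat, var_hat

pi_true, p, n = 0.2, 0.7, 5000
sensitive = rng.random(n) < pi_true          # unobservable true status
asked_direct = rng.random(n) < p             # which question was drawn
answers = np.where(asked_direct, sensitive, ~sensitive)
pi_hat, var_hat = warner_estimate(answers, p)
print(round(pi_hat, 3), "+-", round(1.96 * np.sqrt(var_hat), 3))
```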

19.
The goal of the indifference zone formulation of selection (Bechhofer, 1954) consists of selecting the t best variants out of k variants with a probability of at least 1 − β if the parameter difference between the t 'good' variants and the k − t 'bad' variants is not less than Δ. A review of generalized selection goals not using this difference condition is presented. Within some general classes of distributions, the suitable experimental designs for all these selection goals are identical. Similar results are described for the problem of selecting the best variant in comparison with a control, or standard.
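Under the classical normal-means setting, the indifference zone goal translates directly into a sample-size question: how many observations per variant make the probability of correct selection at least 1 − β in the least favourable configuration? A Monte Carlo sketch for t = 1 and common known variance (my simplification of the general t-best problem):

```python
import numpy as np

rng = np.random.default_rng(9)

def prob_correct_selection(n, k, delta, sigma=1.0, reps=20_000):
    # Bechhofer's rule: take n observations per variant and select the
    # largest sample mean. Least favourable configuration: one mean
    # exactly delta above the k - 1 others.
    mu = np.zeros(k)
    mu[-1] = delta
    means = mu + rng.normal(scale=sigma / np.sqrt(n), size=(reps, k))
    return np.mean(means.argmax(axis=1) == k - 1)

beta, k, delta = 0.10, 4, 0.5
n = 1
while prob_correct_selection(n, k, delta) < 1 - beta:
    n += 1
print("observations per variant:", n)
```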
