Similar Articles (20 results)
1.
A class of symmetric bivariate uniform distributions is proposed for use in statistical modeling. The distributions may be constructed to be absolutely continuous, with correlations as close to ±1 as desired. Expressions for the correlations, regressions, and copulas are found. An extension to three dimensions is proposed.

2.
Empirical estimates of source statistical economic data such as trade flows, greenhouse gas emissions, or employment figures are always subject to uncertainty (stemming from measurement errors or confidentiality) but information concerning that uncertainty is often missing. This article uses concepts from Bayesian inference and the maximum entropy principle to estimate the prior probability distribution, uncertainty, and correlations of source data when such information is not explicitly provided. In the absence of additional information, an isolated datum is described by a truncated Gaussian distribution, and if an uncertainty estimate is missing, its prior equals the best guess. When the sum of a set of disaggregate data is constrained to match an aggregate datum, it is possible to determine the prior correlations among disaggregate data. If aggregate uncertainty is missing, all prior correlations are positive. If aggregate uncertainty is available, prior correlations can be either all positive, all negative, or a mix of both. An empirical example is presented, which reports relative uncertainties and correlation priors for the County Business Patterns database. In this example, relative uncertainties range from 1% to 80% and 20% of data pairs exhibit correlations below −0.9 or above 0.9. Supplementary materials for this article are available online.

3.
Multiple assessments of an efficacy variable are often conducted prior to the initiation of randomized treatments in clinical trials as baseline information. Two goals are investigated in this article: first, the choice of these baselines in the analysis of covariance (ANCOVA) to increase statistical power; and second, the magnitude of power loss when a continuous efficacy variable is dichotomized into a categorical variable, as is commonly reported in the biomedical literature. A statistical power analysis is developed with extensive simulations based on data from clinical trials in study participants with end-stage renal disease (ESRD). It is found that the choice of baselines depends primarily on the correlations among the baselines and the efficacy variable, with substantial power gains for correlations greater than 0.6 and negligible gains for correlations below 0.2. Continuous efficacy variables always give higher statistical power in the ANCOVA modeling, and dichotomizing the efficacy variable generally decreases statistical power by 25%, an important practical consideration when setting the sample size and budget in clinical trial design. These findings can be easily applied in and extended to other clinical trials with similar designs.

4.
Over 60 years ago Ronald Fisher demonstrated a number of potential pitfalls with statistical analyses using ratio variables. Nonetheless, these pitfalls are largely overlooked in contemporary clinical and epidemiological research, which routinely uses ratio variables in statistical analyses. This article aims to demonstrate how very different findings can be generated as a result of less than perfect correlations among the data used to generate ratio variables. These imperfect correlations result from measurement error and random biological variation. While the former can often be reduced by improvements in measurement, random biological variation is difficult to estimate and eliminate in observational studies. Moreover, wherever the underlying biological relationships among epidemiological variables are unclear, and hence the choice of statistical model is also unclear, the different findings generated by different analytical strategies can lead to contradictory conclusions. Caution is therefore required when interpreting analyses of ratio variables whenever the underlying biological relationships among the variables involved are unspecified or unclear. Copyright © 2009 John Wiley & Sons, Ltd.

5.
金勇进, 刘展. 《统计研究》2016, 33(3): 11-17
When sampling from big data, constructing a sampling frame is often difficult, so the samples obtained are non-probability samples, and traditional design-based inference theory cannot be applied to them directly. How to conduct statistical inference for non-probability samples is therefore a serious challenge facing survey sampling in the big-data era. This article proposes a basic framework for addressing statistical inference with non-probability samples. First, sampling methods: approaches such as sample selection based on sample matching and link-tracing sampling can make a non-probability sample approximate a probability sample, so that the inference theory for probability samples becomes applicable. Second, weight construction and adjustment: base weights analogous to those of a probability sample can be obtained from pseudo-design-based, model-based, and propensity-score methods. Third, estimation: pseudo-design-based, model-based, and Bayesian composite estimators can be considered. Finally, sample selection based on sample matching is used as an example to illustrate the approach in detail.

6.
A package for the stochastic simulation of discrete variables with assigned marginal distributions and correlation matrix is presented and discussed. The simulating mechanism relies upon the Gaussian copula, linking the discrete distributions together, and an iterative scheme recovering the correlation matrix for the copula that ensures the desired correlations among the discrete variables. Examples of its use are provided as well as three possible applications (related to probability, sampling, and inference), which illustrate the utility of the package as an efficient and easy-to-use tool both in statistical research and for didactic purposes.
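The core of the Gaussian-copula mechanism described above can be sketched in a few lines. This is a minimal illustration, not the package's actual code: the marginals (Poisson and binomial) and the latent correlation are made-up examples, and the iterative correction of the copula correlation is omitted, so the latent value is used directly.

```python
import numpy as np
from scipy.stats import norm, poisson, binom

rng = np.random.default_rng(42)

rho = 0.6        # correlation for the latent Gaussian copula (in the package,
                 # an iterative scheme adjusts this so the *discrete* variables
                 # attain the desired correlation; here we use it as-is)
n = 100_000

# Step 1: draw from a bivariate normal with the copula correlation.
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)

# Step 2: map to uniforms via the standard normal CDF (the Gaussian copula).
u = norm.cdf(z)

# Step 3: invert the discrete marginal CDFs.
x1 = poisson.ppf(u[:, 0], mu=3).astype(int)       # Poisson(3) marginal
x2 = binom.ppf(u[:, 1], n=10, p=0.4).astype(int)  # Binomial(10, 0.4) marginal

# The realized discrete correlation is close to, but slightly below, rho.
print(np.corrcoef(x1, x2)[0, 1])
```

The shrinkage of the realized correlation relative to the latent one is exactly why the package needs its iterative recovery scheme.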

7.
Among the goals of statistical matching, a very important one is the estimation of the joint distribution of variables not jointly observed in a sample survey but separately available from independent sample surveys. The absence of joint information on the variables of interest leads to uncertainty about the data-generating model, since the available sample information is unable to discriminate among a set of plausible joint distributions. The present paper gives a short review of the concept of uncertainty in statistical matching under logical constraints, as well as of how to measure uncertainty for continuous variables. The notion of matching error is related to an appropriate measure of uncertainty, and a criterion for selecting matching variables is introduced: choose the variables that minimize this uncertainty measure. Finally, a method for choosing a plausible joint distribution for the variables of interest via the iterative proportional fitting algorithm is described. The proposed methodology is then applied to household income and expenditure data when extra sample information on the average propensity to consume is available. This leads to a reconstructed complete dataset in which each record includes measures of income and expenditure.

8.
The class of Multivariate BiLinear GARCH (MBL-GARCH) models is proposed and its statistical properties are investigated. The model can be regarded as a generalization to a multivariate setting of the univariate BL-GARCH model proposed by Storti and Vitale (Stat Methods Appl 12:19–40, 2003a; Comput Stat 18:387–400, 2003b). It is shown how MBL-GARCH models make it possible to account for asymmetric effects in both conditional variances and correlations. An EM algorithm for maximum likelihood estimation of the model parameters is derived. Furthermore, in order to test the appropriateness of the conditional variance and covariance specifications, a set of robust conditional-moment test statistics is defined. Finally, the effectiveness of MBL-GARCH models in a risk management setting is assessed by means of an application to the estimation of the optimal hedge ratio in futures hedging.

9.
Partial correlations can be used to statistically control for the effects of unwanted variables. Perhaps the most frequently used test of a partial correlation is the parametric F test, which requires normality of the joint distribution of observations. The possibility that this assumption may not be met in practice suggests a need for procedures that do not require normality. Unfortunately, the statistical literature provides little guidance for choosing other tests when the normality assumption is not satisfied. Several nonparametric tests of partial correlations are investigated using a computer simulation study. Recommendations are made for selecting certain tests under particular conditions.
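To make the quantity under test concrete, here is a minimal sketch of a sample partial correlation (via regression residuals) together with one simple rank-based variant, the partial Spearman correlation. These are illustrative constructions, not the specific nonparametric tests compared in the paper; the simulated data are made up.

```python
import numpy as np
from scipy import stats

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    zc = np.column_stack([np.ones_like(z), z])        # design matrix [1, z]
    rx = x - zc @ np.linalg.lstsq(zc, x, rcond=None)[0]
    ry = y - zc @ np.linalg.lstsq(zc, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
x = z + rng.normal(size=n)   # x and y are correlated only through z
y = z + rng.normal(size=n)

r_xy = np.corrcoef(x, y)[0, 1]   # marginal correlation: clearly positive
r_xy_z = partial_corr(x, y, z)   # partial correlation: near zero

# Rank-based (nonparametric) variant: the same computation on the ranks.
rho_xy_z = partial_corr(stats.rankdata(x), stats.rankdata(y), stats.rankdata(z))
```

Controlling for z removes the spurious association, which is precisely the behavior the competing tests are meant to assess under non-normality.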

10.
We introduce a new multivariate GARCH model with multivariate thresholds in conditional correlations and develop a two-step estimation procedure that is feasible in large dimensional applications. Optimal threshold functions are estimated endogenously from the data and the model conditional covariance matrix is ensured to be positive definite. We study the empirical performance of our model in two applications using U.S. stock and bond market data. In both applications our model has, in terms of statistical and economic significance, higher forecasting power than several other multivariate GARCH models for conditional correlations.

11.
Correlated binary data arise frequently in medical and other scientific disciplines, and statistical methods, such as generalized estimating equations (GEE), have been widely used for their analysis. The need to simulate correlated binary variates arises when evaluating small-sample properties of GEE estimators for such data. One might also generate such data to simulate and study biological phenomena such as tooth decay or periodontal disease. This article introduces a simple method for generating pairs of correlated binary data. A simple algorithm is also provided for generating an arbitrary-dimensional random vector of non-negatively correlated binary variates. The method relies on the idea that correlations among the random variables arise as a result of their sharing some common components that induce such correlations. It then uses some properties of the binary variates to represent each variate in terms of these common components in addition to its own elements. Unlike most previous approaches, which require solving nonlinear equations or use distributional properties of other random variables, this method uses only properties of the binary variate itself. As no intermediate random variables are required for generating the binary variates, the proposed method is shown to be faster than the other methods. To verify this claim, we compare the computational efficiency of the proposed method with those of other procedures.
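One classical shared-component construction in this spirit (though not necessarily the authors' exact algorithm) mixes each variate between a common Bernoulli component and an individual one; choosing the mixing probability as the square root of the target correlation yields correlation rho without any root-finding. All parameter values below are made-up examples.

```python
import numpy as np

def correlated_bernoulli_pair(p, rho, size, rng):
    """Pair of Bernoulli(p) variates with correlation approximately rho
    (rho >= 0), built from a shared common component.

    Each X_i equals the common component Z with probability sqrt(rho),
    and an independent Bernoulli(p) otherwise; then corr(X1, X2) = rho."""
    common = rng.binomial(1, p, size)               # shared component Z
    own = rng.binomial(1, p, (2, size))             # individual components
    mix = rng.binomial(1, np.sqrt(rho), (2, size))  # which component to use
    x1 = np.where(mix[0] == 1, common, own[0])
    x2 = np.where(mix[1] == 1, common, own[1])
    return x1, x2

rng = np.random.default_rng(1)
x1, x2 = correlated_bernoulli_pair(p=0.3, rho=0.4, size=200_000, rng=rng)
```

Because each draw uses only Bernoulli components, no intermediate continuous variables are generated, which is the source of the speed advantage the article claims.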

12.
The technical problems of record linkage are closely related to statistical theory; in particular, building record-linkage classification rules requires statistical models that identify key variables for data matching. Within a Bayesian framework, hierarchical models can be constructed to integrate administrative records, and matching error rates can be estimated through multivariate regression. Moreover, record linkage under a one-to-one restriction allows modules to reflect changes in the sources of record information; the posterior distribution is conveniently computed via MCMC simulation, which helps improve the efficiency of data integration.

13.
Statistical matching consists of estimating the joint characteristics of two variables observed in two distinct and independent sample surveys, respectively. In a parametric setup, ranges of estimates for nonidentifiable parameters are the only estimable items, unless restrictive assumptions on the probabilistic relationship between the non-jointly-observed variables are imposed. These ranges correspond to the uncertainty due to the absence of joint observations on the pair of variables of interest. The aim of this paper is to analyze the uncertainty in statistical matching in a nonparametric setting. A measure of uncertainty is introduced and its properties are studied: the measure captures the "intrinsic" association between the pair of variables, which is constant and equal to 1/6, whatever the form of the marginal distribution functions of the two variables, when the two samples are the only knowledge available. The measure becomes useful when uncertainty is reduced by knowledge beyond the data themselves, as in the case of structural zeros. In this case the proposed measure detects how the introduction of further knowledge shrinks the intrinsic uncertainty from 1/6 to smaller values, zero being the case of no uncertainty. Sampling properties of the uncertainty measure and of the bounds of the uncertainty intervals are also proved.

14.
Many college courses use group work as a part of the learning and evaluation process. Class groups are often selected randomly or by allowing students to organize groups themselves. However, if it is desired to control some aspect of the group structure, such as increasing schedule compatibility within groups, multidimensional scaling can be used to form such groups. This article describes how this has been adopted in an undergraduate statistics course. Resulting groups have been more homogeneous with respect to student schedules than groups selected randomly—an example from winter quarter 2004 increased correlations between student schedules from a mean of .29 before grouping to a within-group mean of .50. Further, the exercise allows opportunities to discuss a wealth of statistical concepts in class, including surveys, association measures, multidimensional scaling, and statistical graphics.
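The grouping idea above can be sketched with classical (Torgerson) multidimensional scaling: embed students in a low-dimensional space from a schedule-distance matrix, so that nearby points have compatible schedules and groups can be cut from the coordinates. This is a generic sketch, not the course's actual procedure; the schedule data and the Hamming-distance choice are made-up assumptions.

```python
import numpy as np

def classical_mds(d, k=2):
    """Classical MDS: embed n points in k dimensions from an n x n distance matrix."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    b = -0.5 * j @ (d ** 2) @ j              # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(b)
    idx = np.argsort(vals)[::-1][:k]         # k largest eigenvalues
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

# Hypothetical data: each row is one student's availability over 30 time slots.
rng = np.random.default_rng(7)
sched = rng.integers(0, 2, size=(12, 30))

# Distance = number of slots on which two schedules disagree (Hamming distance).
d = (sched[:, None, :] != sched[None, :, :]).sum(axis=2).astype(float)

coords = classical_mds(d, k=2)  # nearby points = compatible schedules
```

Groups could then be formed by clustering `coords`, e.g. with k-means, with within-group schedule correlation as the homogeneity check the article reports.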

15.
The study of physical processes is often aided by computer models or codes. Computer models that simulate such processes are sometimes computationally intensive and therefore not very efficient exploratory tools. In this paper, we address computer models characterized by temporal dynamics and propose new statistical correlation structures aimed at modelling their time dependence. These correlations are embedded in regression models with input-dependent design matrix and input-correlated errors that act as fast statistical surrogates for the computationally intensive dynamical codes. The methods are illustrated with an automotive industry application involving a road load data acquisition computer model.

16.
In a recent paper, Leong and Huang [6] proposed a wavelet-correlation-based approach to test for cointegration between two time series. However, correlation and cointegration are two different concepts even when wavelet analysis is used. It is known that statistics based on non-stationary integrated variables have non-standard asymptotic distributions. However, wavelet analysis offsets the integrating order of non-stationary series so that traditional asymptotics on stationary variables suffices to ascertain the statistical properties of wavelet-based statistics. Based on this, this note shows that wavelet correlations cannot be used as a test of cointegration.

17.
Fingerprinting of functional connectomes is an increasingly standard measure of reproducibility in functional magnetic resonance imaging connectomics. In such studies, one attempts to match a subject's first session image with their second, in a blinded fashion, in a group of subjects measured twice. The number or percentage of correct matches is usually reported as a statistic, which is then used in permutation tests. Despite the simplicity and increasing popularity of such procedures, the soundness of the statistical tests, the power, and the factors impacting the test are unstudied. In this article, we investigate the statistical tests of matching based on the exchangeability assumption in the fingerprinting analysis. We show that a nearly universal Poisson(1) approximation applies for different matching schemes. We theoretically investigate the permutation tests and explore the issue that the test is overly sensitive to uninteresting directions in the alternative hypothesis, such as clustering due to familial status or demographics. We perform a numerical study on two functional magnetic resonance imaging (fMRI) resting-state datasets, the Human Connectome Project (HCP) and the Baltimore Longitudinal Study of Aging (BLSA). These datasets are instructive, as the HCP includes technical replications of long scans and includes monozygotic and dizygotic twins, as well as non-twin siblings. In contrast, the BLSA study incorporates more typical length resting-state scans in a longitudinal study. Finally, a study of single regional connections is performed on the HCP data.

18.
Matching and stratification based on confounding factors or propensity scores (PS) are powerful approaches for reducing confounding bias in indirect treatment comparisons. However, implementing these approaches requires pooled individual patient data (IPD). The research presented here was motivated by an indirect comparison between a single-armed trial in acute myeloid leukemia (AML) and two external AML registries with current treatments as controls. For confidentiality reasons, IPD cannot be pooled. Common approaches to adjusting confounding bias, such as PS matching or stratification, cannot be applied because 1) a model for PS, for example a logistic model, cannot be fitted without pooling covariate data; and 2) pooling response data may be necessary for some statistical inference (e.g., estimating the SE of the mean difference of matched pairs) after PS matching. We propose a set of approaches that do not require pooling IPD, using a combination of methods including a linear discriminant for matching and stratification, and secure multiparty computation for estimating the within-pair sample variance and for calculations involving multiple control sources. The approaches only need to share aggregated data offline, rather than the real-time secure data transfer required by typical secure multiparty computation for model fitting. For survival analysis, we propose an approach based on restricted mean survival time. A simulation study was conducted to evaluate this approach in several scenarios, in particular with a mixture of continuous and binary covariates. The results confirmed the robustness and efficiency of the proposed approach. A real data example is also provided for illustration.

19.

Large correlation matrices are hard to look at. In this article we present correlations as elliptical glyphs for a simple intuitive display of large matrices.

20.
Several models have been developed to capture the dynamics of the conditional correlations between time series of financial returns and several studies have shown that the market volatility is a major determinant of the correlations. We extend some models to include explicitly the dependence of the correlations on the market volatility. The models differ by the way—linear or nonlinear, direct or indirect—in which the volatility influences the correlations. Using a wide set of models with two measures of market volatility on two datasets, we find that for some models, the empirical results support to some extent the statistical significance and the economic significance of the volatility effect on the correlations, but the presence of the volatility effect does not improve the forecasting performance of the extended models. Supplementary materials for this article are available online.
