
1.

A proportional-reduction-in-impurity measure of association for categorical variables

Chang Sup Sung Sung Jin Ahn 《Communications in Statistics - Theory and Methods》2013,42(8):2083-2110

This paper presents a proportional-reduction-in-impurity (PRI) measure of categorical association that employs application-dependent loss functions, which make the measure widely applicable. The well-known proportional-reduction-in-error (PRE) measure is shown to be a special case of the new PRI measure. Moreover, the asymptotic variance of the maximum likelihood estimator (MLE) of the measure is derived to facilitate its use for statistical inference. An extension of the PRI measure to compositional association is made to show that it can have a variety of applications. Selected loss functions are treated to illustrate the derivation of the measure.

2.

A new measure of association for bivariate survival data

N. Unnikrishnan Nair P.G. Sankaran 《Journal of statistical planning and inference》2010

Time-dependent association measures between variables are of interest in bivariate survival data. Several such measures have been proposed in the literature for the modelling and analysis of survival data. In this paper, we introduce a new measure of association for bivariate survival data using the product moment residual life function and the mean residual life function. Various properties of the proposed measure and its relationship with existing measures are discussed. We also develop a non-parametric estimator of the measure and study its asymptotic properties. The application of the result is illustrated using a real-life data set. Finally, a simulation study is carried out to assess the performance of the estimator.

3.

A new measure of association

Wallace Franck 《Communications in Statistics - Theory and Methods》2013,42(9):3051-3070

This paper presents a new measure of association. It is applicable to polytomies of either categorical or numerical type. It has the desirable property of being 0 if and only if the polytomies are independent. Its properties are studied and compared to those of existing measures. An interpretation of it is given. One situation where it is particularly useful is in measuring the ability to predict one polytomy given knowledge of the other. An example is given where the proposed measure is more relevant in describing the degree of association between two polytomies than are any of the existing measures. The corresponding sample quantity is presented and its asymptotic properties are studied. A discussion of its use in inference is given. The test for independence based on this measure is contrasted with the chi-square test.

4.

Non-parametric Evaluation of Biomarker Accuracy under Nested Case-control Studies

Cai T Zheng Y 《Journal of the American Statistical Association》2011,106(494):569-580

To evaluate the clinical utility of new risk markers, a crucial step is to measure their predictive accuracy with prospective studies. However, it is often infeasible to obtain marker values for all study participants. The nested case-control (NCC) design is a useful cost-effective strategy for such settings. Under the NCC design, markers are only ascertained for cases and a fraction of controls sampled randomly from the risk sets. The outcome dependent sampling generates a complex data structure and therefore a challenge for analysis. Existing methods for analyzing NCC studies focus primarily on association measures. Here, we propose a class of non-parametric estimators for commonly used accuracy measures. We derived asymptotic expansions for accuracy estimators based on both finite population and Bernoulli sampling and established asymptotic equivalence between the two. Simulation results suggest that the proposed procedures perform well in finite samples. The new procedures were illustrated with data from the Framingham Offspring study.

5.

Least circular distance regression for directional data

Ulric Lund 《Journal of applied statistics》1999,26(6):723-733

Least-squares regression is not appropriate when the response variable is circular, and can lead to erroneous results. The reason for this is that the squared difference is not an appropriate measure of distance on the circle. In this paper, a circular analog to least-squares regression is presented for predicting a circular response variable by another circular variable and a set of linear covariates. An alternative maximum-likelihood formulation yields the same regression parameter estimates. Under the maximum-likelihood model, asymptotic standard errors of the parameter estimates are obtained. As an example, the regression model is used to model data from a marine biology study.
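To make the point about squared differences concrete, here is a minimal Python sketch. It assumes the circular distance 1 - cos(a - b), a common choice for angles that the abstract itself does not spell out; the function names are illustrative only.

```python
import math

def squared_difference(a, b):
    """Naive squared difference of two angles given in radians."""
    return (a - b) ** 2

def circular_distance(a, b):
    """Assumed circular distance 1 - cos(a - b); it lies in [0, 2]."""
    return 1.0 - math.cos(a - b)

# Two directions that are only 2 degrees apart on the circle.
a = math.radians(359.0)
b = math.radians(1.0)

print("squared difference:", squared_difference(a, b))
print("circular distance: ", circular_distance(a, b))
```

The squared difference treats the two nearly identical directions as far apart, while the cosine-based distance stays close to zero and is unchanged when either angle is shifted by a full turn.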

6.

Some maximum-indifference estimators for the slope of a univariate linear model

Claudio G. Borroni D. Michele Cifarelli 《Journal of nonparametric statistics》2016,28(2):395-412

As is well known, the least-squares estimator of the slope of a univariate linear model sets to zero the covariance between the regression residuals and the values of the explanatory variable. To prevent the estimation process from being influenced by outliers, which can be theoretically modelled by a heavy-tailed distribution for the error term, one can replace covariance with a robust measure of association, for example Kendall's tau in the popular Theil–Sen estimator. In a little-known Italian paper, Cifarelli [(1978), ‘La Stima del Coefficiente di Regressione Mediante l'Indice di Cograduazione di Gini’, Rivista di matematica per le scienze economiche e sociali, 1, 7–38. A translation into English is available at http://arxiv.org/abs/1411.4809 and will appear in Decisions in Economics and Finance] shows that a gain of efficiency can be obtained by using Gini's cograduation index instead of Kendall's tau. This paper introduces a new estimator, derived from another recently proposed association measure. This measure is strongly related to Gini's cograduation index, as both are built to vanish in the general framework of indifference. The newly proposed estimator is shown to be unbiased and asymptotically normally distributed. Moreover, all considered estimators are compared via their asymptotic relative efficiency and a small simulation study. Finally, some indications about the performance of the considered estimators in the presence of contaminated normal data are provided.
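The contrast between least squares and the Theil–Sen estimator can be illustrated with a short Python sketch. It uses the standard description of Theil–Sen as the median of all pairwise slopes; the toy data and function name are assumptions made for illustration and do not come from the paper.

```python
from itertools import combinations

import numpy as np

def theil_sen_slope(x, y):
    """Median of the slopes over all pairs of points with distinct x values."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    return float(np.median(slopes))

# Hypothetical data: an exact line y = 2x + 1 with one gross outlier.
x = np.arange(10.0)
y = 2.0 * x + 1.0
y[-1] = 100.0

# Least squares: the residuals have zero sample covariance with x.
ols_slope, ols_intercept = np.polyfit(x, y, 1)
residuals = y - (ols_slope * x + ols_intercept)
residual_cov = float(np.cov(x, residuals)[0, 1])

ts_slope = theil_sen_slope(x, y)

print("OLS slope:", ols_slope)
print("cov(x, OLS residuals):", residual_cov)
print("Theil-Sen slope:", ts_slope)
```

In this toy case the single outlier pulls the least-squares slope well away from 2, while the median of pairwise slopes is unaffected.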

7.

Measuring association between nominal categorical variables: an alternative to the Goodman–Kruskal lambda

Tarald O. Kvålseth 《Journal of applied statistics》2018,45(6):1118-1132

As a measure of association between two nominal categorical variables, the lambda coefficient or Goodman–Kruskal's lambda has become a most popular measure. Its popularity is primarily due to its simple and meaningful definition and interpretation in terms of the proportional reduction in error when predicting a random observation's category for one variable given (versus not knowing) its category for the other variable. It is an asymmetric measure, although a symmetric version is available. The lambda coefficient does, however, have a widely recognized limitation: it can equal zero even when there is no independence between the variables and when all other measures take on positive values. In order to mitigate this problem, an alternative lambda coefficient is introduced in this paper as a slight modification of the Goodman–Kruskal lambda. The properties of the new measure are discussed and a symmetric form is introduced. A statistical inference procedure is developed and a numerical example is provided.
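The limitation described above is easy to reproduce. The following Python sketch computes the classical asymmetric Goodman–Kruskal lambda (not the modified coefficient proposed in the paper) from a contingency table; the function name and the two example tables are hypothetical.

```python
import numpy as np

def goodman_kruskal_lambda(table):
    """Lambda for predicting the column variable from the row variable.

    Prediction errors are counted under modal prediction: without the row
    category we always guess the most frequent column; with it we guess the
    most frequent column within each row.
    """
    t = np.asarray(table, dtype=float)
    n = t.sum()
    correct_without_rows = t.sum(axis=0).max()  # largest column total
    correct_with_rows = t.max(axis=1).sum()     # sum of row-wise maxima
    return (correct_with_rows - correct_without_rows) / (n - correct_without_rows)

# Dependent table (30*15 != 10*20) in which every row has the same modal column.
dependent_but_zero = [[30, 10],
                      [20, 15]]

# Table in which knowing the row changes the best guess.
informative = [[40, 10],
               [10, 40]]

print(goodman_kruskal_lambda(dependent_but_zero))
print(goodman_kruskal_lambda(informative))
```

In the first table the variables are clearly associated, yet lambda is zero because the modal column is the same in every row, so knowing the row never changes the prediction.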

8.

Copula regression spline models for binary outcomes

Rosalba Radice Giampiero Marra Małgorzata Wojtyś 《Statistics and Computing》2016,26(5):981-995

We introduce a framework for estimating the effect that a binary treatment has on a binary outcome in the presence of unobserved confounding. The methodology is applied to a case study which uses data from the Medical Expenditure Panel Survey and whose aim is to estimate the effect of private health insurance on health care utilization. Unobserved confounding arises when variables which are associated with both treatment and outcome are not available (in economics this issue is known as endogeneity). Also, treatment and outcome may exhibit a dependence which cannot be modeled using a linear measure of association, and observed confounders may have a non-linear impact on the treatment and outcome variables. The problem of unobserved confounding is addressed using a two-equation structural latent variable framework, where one equation essentially describes a binary outcome as a function of a binary treatment whereas the other equation determines whether the treatment is received. Non-linear dependence between treatment and outcome is dealt with using copula functions, whereas covariate-response relationships are flexibly modeled using a spline approach. Related model fitting and inferential procedures are developed, and asymptotic arguments are presented.

9.

Quality of Fit Measures in the Framework of Quantile Regression

HOHSUK NOH ANOUAR EL GHOUCH INGRID VAN KEILEGOM 《Scandinavian Journal of Statistics》2013,40(1):105-118

In regression experiments, to learn about the strength of the relationship between a covariate vector and a dependent variable, we propose a ‘coefficient of determination’ based on the quantiles. Such a coefficient is a ‘local’ measure in the sense that the strength is measured at a prespecified quantile level. Once estimated, it can be used, for example, to measure the relative importance of a subset of covariates in the quantile regression context. Related to this coefficient, we also propose a new ‘local’ lack‐of‐fit measure of a given parametric model. We provide some asymptotic results of the proposed measures and carry out a Monte Carlo simulation study to illustrate their use and performance in practice.

10.

Performance of tests of association in misspecified generalized linear models

《Journal of statistical planning and inference》2006,136(9):3090-3100

We examine the effects of modelling errors, such as underfitting and overfitting, on the asymptotic power of tests of association between an explanatory variable x and an outcome in the setting of generalized linear models. The regression function for x is approximated by a polynomial or another simple function, and a chi-square statistic is used to test whether the coefficients of the approximation are simultaneously equal to zero. Adding terms to the approximation increases asymptotic power if and only if the fit of the model increases by a certain quantifiable amount. Although a high degree of freedom approximation offers robustness to the shape of the unknown regression function, a low degree of freedom approximation can yield much higher asymptotic power even when the approximation is very poor. In practice, it is useful to compute the power of competing test statistics across the range of alternatives that are plausible a priori. This approach is illustrated through an application in epidemiology.

11.

Quantile regression and variable selection for the single-index model

Yazhao Lv Weihua Zhao Jicai Liu 《Journal of applied statistics》2014,41(7):1565-1577

In this paper, we propose a new full iteration estimation method for quantile regression (QR) of the single-index model (SIM). The asymptotic properties of the proposed estimator are derived. Furthermore, we propose a variable selection procedure for the QR of SIM by combining the estimation method with the adaptive LASSO penalized method to get sparse estimation of the index parameter. The oracle properties of the variable selection method are established. Simulations with various non-normal errors are conducted to demonstrate the finite sample performance of the estimation method and the variable selection procedure. Furthermore, we illustrate the proposed method by analyzing a real data set.
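As background for the quantile regression entries in this list, the sketch below shows the check (pinball) loss that quantile regression minimizes in its standard formulation. It is not the full iteration method or the adaptive LASSO procedure of the paper; the data and function names are hypothetical.

```python
def check_loss(u, tau):
    """Check (pinball) loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def best_constant(data, tau):
    """Data value that minimizes the total check loss at quantile level tau."""
    candidates = sorted(data)
    return min(candidates, key=lambda c: sum(check_loss(y - c, tau) for y in data))

# Hypothetical sample with one large value.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

print(best_constant(data, 0.5))
print(best_constant(data, 0.9))
```

With tau = 0.5 the minimizer is the sample median, which is why the large value has no influence on it; other values of tau weight positive and negative residuals asymmetrically and so target other quantiles.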

12.

A Copula‐Based Non‐parametric Measure of Regression Dependence

HOLGER DETTE KARL F. SIBURG PAVEL A. STOIMENOV 《Scandinavian Journal of Statistics》2013,40(1):21-41

This article presents a framework for comparing bivariate distributions according to their degree of regression dependence. We introduce the general concept of a regression dependence order (RDO). In addition, we define a new non‐parametric measure of regression dependence and study its properties. Besides being monotone in the new RDOs, the measure takes on its extreme values precisely at independence and almost sure functional dependence, respectively. A consistent non‐parametric estimator of the new measure is constructed and its asymptotic properties are investigated. Finally, the finite sample properties of the estimate are studied by means of a small simulation study.

13.

Conditional mode estimation for functional stationary ergodic data with responses missing at random

Nengxiang Ling Yang Liu Philippe Vieu 《Statistics》2016,50(5):991-1013

In this paper, we investigate the asymptotic properties of a non-parametric conditional mode estimator given a functional explanatory variable, when functional stationary ergodic data and missing at random responses are observed. First of all, we establish asymptotic properties for a conditional density estimator, from which we derive almost sure convergence (with rate) and asymptotic normality of a conditional mode estimator. This new estimator takes missing data into account, and a simulation study is performed to illustrate how this yields better predictive performance than that obtained with standard estimators.

14.

Correlation curve estimation for multiplicative distortion measurement errors data

Zhenghui Feng Yujie Gai 《Journal of nonparametric statistics》2019,31(2):435-450

A correlation curve measures the strength of the association between two variables locally at different values of a covariate. This paper studies how to estimate the correlation curve under the multiplicative distortion measurement errors setting. The unobservable variables are both distorted in a multiplicative fashion by an observed confounding variable. We obtain asymptotic normality results for the estimated correlation curve. We conduct Monte Carlo simulation experiments to examine the performance of the proposed estimator. As an illustration, the estimated correlation curve is applied to analyze a real dataset.

15.

Testing stochastic orders in tails of contingency tables

Chi Tim Ng Kyu S. Hahn 《Journal of applied statistics》2011,38(6):1133-1149

Testing for the difference in the strength of bivariate association in two independent contingency tables is an important issue that finds applications in various disciplines. Currently, many of the commonly used tests are based on single-index measures of association. More specifically, one obtains single-index measurements of association from two tables and compares them based on asymptotic theory. Although they are usually easy to understand and use, often much of the information contained in the data is lost with single-index measures. Accordingly, they fail to fully capture the association in the data. To remedy this shortcoming, we introduce a new summary statistic measuring various types of association in a contingency table. Based on this new summary statistic, we propose a likelihood ratio test comparing the strength of association in two independent contingency tables. The proposed test examines the stochastic order between summary statistics. We derive its asymptotic null distribution and demonstrate that the least favorable distributions are chi-bar distributions. We numerically compare the power of the proposed test to that of the tests based on single-index measures. Finally, we provide two examples illustrating the new summary statistics and the related tests.

16.

A note on the asymptotic variance of the Gray and Williams measure of partial association

Robert J. Anderson J. Richard Landis 《Communications in Statistics - Theory and Methods》2013,42(13):1303-1314

The correct asymptotic variance for the partial association analog of Goodman and Kruskal's τ measure of association based on proportional prediction is derived for both full multinomial sampling and product multinomial sampling. These results are illustrated within the context of an example data set.

17.

Confidence Intervals on Regression Models with Censored Data

Jesus Orbe Vicente Núñez-Antón 《Communications in Statistics - Simulation and Computation》2013,42(9):2140-2159

Stute (1993, Consistent estimation under random censorship when covariables are present. Journal of Multivariate Analysis 45, 89–103) proposed a new method to estimate regression models with a censored response variable using least squares and showed the consistency and asymptotic normality of his estimator. This article proposes a new bootstrap-based methodology that improves the performance of asymptotic interval estimation in the small-sample case. To this end, we compare the behavior of Stute's asymptotic confidence interval with that of several confidence intervals that are based on bootstrap resampling techniques. In order to build these confidence intervals, we propose a new bootstrap resampling method that has been adapted for the case of censored regression models. We use simulations to study how much the proposed bootstrap-based confidence intervals improve on the asymptotic proposal. Simulation results indicate that, for the new proposals, coverage percentages are closer to the nominal values and, in addition, intervals are narrower.

18.

Efficient mean estimation in log-normal linear models

Haipeng Shen Zhengyuan Zhu 《Journal of statistical planning and inference》2008

Log-normal linear models are widely used in applications, and many times it is of interest to predict the response variable or to estimate the mean of the response variable at the original scale for a new set of covariate values. In this paper we consider the problem of efficient estimation of the conditional mean of the response variable at the original scale for log-normal linear models. Several existing estimators are reviewed first, including the maximum likelihood (ML) estimator, the restricted ML (REML) estimator, the uniformly minimum variance unbiased (UMVU) estimator, and a bias-corrected REML estimator. We then propose two estimators that minimize the asymptotic mean squared error and the asymptotic bias, respectively. A parametric bootstrap procedure is also described to obtain confidence intervals for the proposed estimators. Both the new estimators and the bootstrap procedure are very easy to implement. Comparisons of the estimators using simulation studies suggest that our estimators perform better than the existing ones, and the bootstrap procedure yields confidence intervals with good coverage properties. A real application of estimating the mean sediment discharge is used to illustrate the methodology.

19.

Testing time series for interpolability and whiteness

Roberto Baragona Francesco Battaglia 《Communications in Statistics - Theory and Methods》2013,42(11):2623-2644

We propose a test to decide if a time series is represented by its linear interpolator better than by its mean value. The same test can be employed to decide if a time series has to be considered white noise. The test is based on a new estimate of the index of linear determinism (Battaglia, 1983, Inverse autocovariances and a measure of linear determinism for a stationary process, J. Time Series Anal. 4, 79-87) and its asymptotic distribution is derived. Comparison with the popular Ljung-Box portmanteau test has been performed based on both asymptotic power and a simulation experiment. The new test

20.

Consistent Bayesian information criterion based on a mixture prior for possibly high-dimensional multivariate linear regression models

Haruki Kono Tatsuya Kubokawa 《Scandinavian Journal of Statistics》2023,50(3):1022-1047

In the problem of selecting variables in a multivariate linear regression model, we derive new Bayesian information criteria based on a prior mixing a smooth distribution and a delta distribution. Each of them can be interpreted as a fusion of the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Inheriting their asymptotic properties, our information criteria are consistent in variable selection in both the large-sample and the high-dimensional asymptotic frameworks. In numerical simulations, variable selection methods based on our information criteria choose the true set of variables with high probability in most cases.