Similar Literature
20 similar articles found (search time: 250 ms)
1.
Algebraic relationships between Hosmer–Lemeshow (HL), Pigeon–Heyse (J2), and Tsiatis (T) goodness-of-fit statistics for binary logistic regression models with continuous covariates were investigated, and their distributional properties and performances studied using simulations. Groups were formed under deciles-of-risk (DOR) and partition-covariate-space (PCS) methods. Under DOR, HL and T followed reported null distributions, while J2 did not. Under PCS, only T followed its reported null distribution, with HL and J2 dependent on model covariate number and partitioning. Generally, all had similar power. Of the three, T performed best, maintaining Type-I error rates and having a distribution invariant to covariate characteristics, number, and partitioning.
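A minimal numpy-only sketch of the Hosmer–Lemeshow statistic under deciles-of-risk grouping, as described above. The function name and grouping details are illustrative, not taken from the paper; the statistic is compared against a chi-square reference with g − 2 degrees of freedom under the null.

```python
import numpy as np

def hosmer_lemeshow(y, p, g=10):
    """Hosmer-Lemeshow statistic: sort by fitted risk, split into g
    groups (deciles of risk), compare observed vs expected events."""
    order = np.argsort(p)
    y, p = y[order], p[order]
    groups = np.array_split(np.arange(len(y)), g)
    stat = 0.0
    for idx in groups:
        o_k = y[idx].sum()          # observed events in group
        e_k = p[idx].sum()          # expected events in group
        n_k = len(idx)
        pbar = e_k / n_k
        stat += (o_k - e_k) ** 2 / (n_k * pbar * (1 - pbar))
    return stat  # compare to chi-square with g - 2 df under the null

# simulated data where the fitted model is the true model
rng = np.random.default_rng(0)
x = rng.normal(size=500)
p = 1 / (1 + np.exp(-x))            # true event probabilities
y = rng.binomial(1, p)
print(round(hosmer_lemeshow(y, p), 2))
```

Because the probabilities here are the true ones, the statistic should be an unremarkable draw from (roughly) a chi-square with 8 degrees of freedom.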

2.
The odds ratio (OR) is a measure of association used for analysing an I × J contingency table. The total number of ORs to check grows with I and J. Several statistical methods have been developed for summarising them. These methods begin from two different starting points, the I × J contingency table and the two‐way table composed by the ORs. In this paper we focus our attention on the relationship between these methods and point out that, for an exhaustive analysis of association through log ORs, it is necessary to consider all the outcomes of these methods. We also introduce some new methodological and graphical features. In order to illustrate previously used methodologies, we consider a data table of the cross‐classification of the colour of eyes and hair of 5387 children from Scotland. We point out how, through the log OR analysis, it is possible to extract useful information about the association between variables.  
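One common summary mentioned above is the table of local log odds ratios formed from adjacent 2 × 2 subtables of an I × J table. A sketch with a hypothetical 3 × 3 table (the counts are invented for illustration, not the Scottish eye/hair data):

```python
import numpy as np

def local_log_odds_ratios(table):
    """All (I-1) x (J-1) local log odds ratios of an I x J table:
    log OR_ij = log(n_ij * n_{i+1,j+1} / (n_{i,j+1} * n_{i+1,j}))."""
    t = np.asarray(table, dtype=float)
    return (np.log(t[:-1, :-1]) + np.log(t[1:, 1:])
            - np.log(t[:-1, 1:]) - np.log(t[1:, :-1]))

# hypothetical 3 x 3 contingency table of counts
counts = np.array([[30, 10,  5],
                   [12, 20,  8],
                   [ 4,  9, 25]])
print(local_log_odds_ratios(counts).round(2))
```

Positive entries on the diagonal band indicate the positive local association one would expect from a table concentrated on its diagonal.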

3.
In this paper we propose a general cure rate aging model. Our approach enables different underlying activation mechanisms which lead to the event of interest. The number of competing causes of the event of interest is assumed to follow a logarithmic distribution. The model is parameterized in terms of the cured fraction, which is then linked to covariates. We explore the use of Markov chain Monte Carlo methods to develop a Bayesian analysis for the proposed model. Moreover, we discuss model selection criteria for comparing the fitted models, and develop case-deletion influence diagnostics for the joint posterior distribution based on the ψ-divergence, which has several divergence measures as particular cases, such as the Kullback–Leibler (K-L), J-distance, L1 norm, and χ² divergence measures. Simulation studies are performed and experimental results are illustrated based on real malignant melanoma data.

4.
Many wavelet shrinkage methods assume that the data are observed on an equally spaced grid of length of the form 2^J for some J. These methods require serious modification or preprocessed data to cope with irregularly spaced data. The lifting scheme is a recent mathematical innovation that obtains a multiscale analysis for irregularly spaced data. A key lifting component is the “predict” step where a prediction of a data point is made. The residual from the prediction is stored and can be thought of as a wavelet coefficient. This article exploits the flexibility of lifting by adaptively choosing the kind of prediction according to a criterion. In this way the smoothness of the underlying ‘wavelet’ can be adapted to the local properties of the function. Multiple observations at a point can readily be handled by lifting through a suitable choice of prediction. We adapt existing shrinkage rules to work with our adaptive lifting methods. We use simulation to demonstrate the improved sparsity of our techniques and improved regression performance when compared to both wavelet and non-wavelet methods suitable for irregular data. We also exhibit the benefits of our adaptive lifting on the real inductance plethysmography and motorcycle data.

5.
An important statistical problem is to construct a confidence set for some functional T(P) of some unknown probability distribution P. Typically, this involves approximating the sampling distribution Jn(P) of some pivot based on a sample of size n from P. A bootstrap procedure is to estimate Jn(P) by Jn(P̂n), where P̂n is the empirical measure based on a sample of size n from P. Typically, one has that Jn(P) and Jn(P̂n) are close in an appropriate sense. Two questions are addressed in this note. Are Jn(P) and Jn(P̂n) uniformly close as P varies as well? If so, do confidence statements about T(P) possess a corresponding uniformity property? In the case T(P) = P, the answer to the first question is yes; the answer to the second is no. However, bootstrap confidence statements about T(P) can be made uniform over a restricted, though large, class of P. Similar results apply to other functionals T(P).
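The bootstrap idea described above, estimating Jn(P) by Jn(P̂n), amounts in practice to resampling from the empirical measure. A minimal percentile-interval sketch (function names and the normal test data are illustrative assumptions, not from the note):

```python
import numpy as np

def bootstrap_ci(sample, stat, level=0.95, B=2000, seed=1):
    """Percentile bootstrap interval for T(P): approximate the sampling
    distribution J_n(P) of the statistic by J_n(P-hat), i.e. by
    recomputing the statistic on resamples from the empirical measure."""
    rng = np.random.default_rng(seed)
    n = len(sample)
    reps = np.array([stat(rng.choice(sample, size=n, replace=True))
                     for _ in range(B)])
    alpha = 1 - level
    lo, hi = np.quantile(reps, [alpha / 2, 1 - alpha / 2])
    return lo, hi

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, size=200)     # true mean T(P) = 5
lo, hi = bootstrap_ci(data, np.mean)
print(round(hi - lo, 3))                 # interval width
```

The note's point is precisely that such pointwise closeness of Jn(P) and Jn(P̂n) need not hold uniformly in P, so the resulting confidence statements can lack uniformity.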

6.
ABSTRACT

The paper provides a Bayesian analysis for zero-inflated regression models based on the generalized power series distribution. The approach is based on Markov chain Monte Carlo methods. The residual analysis is discussed and case-deletion influence diagnostics are developed for the joint posterior distribution, based on the ψ-divergence, which includes several divergence measures such as the Kullback–Leibler, J-distance, L1 norm, and χ² divergences, in zero-inflated general power series models. The methodology is illustrated with a data set collected by wildlife biologists in a state park in California.

7.
In the receiver operating characteristic (ROC) analysis, the area under the ROC curve (AUC) serves as an overall measure of diagnostic accuracy. Another popular ROC index is the Youden index (J), which corresponds to the maximum sum of sensitivity and specificity minus one. Since the AUC and J describe different aspects of diagnostic performance, we propose to test whether a biomarker beats the pre-specified target values AUC0 and J0 simultaneously, with H0: AUC ≤ AUC0 or J ≤ J0 against Ha: AUC > AUC0 and J > J0. This is a multivariate order-restricted hypothesis with a non-convex space in Ha, and traditional likelihood-ratio-based tests do not apply. The intersection–union test (IUT) and the joint test are proposed for such a test. While the IUT tests the AUC and the Youden index independently, the joint test is constructed from a joint confidence region. Findings from the simulation suggest both tests yield similar power estimates. We also illustrate the tests using a real data example, and the results of both tests are consistent. In conclusion, testing jointly on AUC and J gives more reliable results than using a single index, and the IUT is easy to apply and has power similar to the joint test.
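Empirical versions of the two indices above are easy to compute: the AUC via the Mann–Whitney form, and J by scanning cutoffs. A sketch with invented scores (a toy separable case, not the paper's data):

```python
import numpy as np

def auc_and_youden(scores_pos, scores_neg):
    """Empirical AUC (Mann-Whitney form) and Youden index
    J = max over cutoffs of sensitivity + specificity - 1."""
    pos = np.asarray(scores_pos, float)
    neg = np.asarray(scores_neg, float)
    # AUC = P(score_pos > score_neg) + 0.5 * P(tie)
    diff = pos[:, None] - neg[None, :]
    auc = (diff > 0).mean() + 0.5 * (diff == 0).mean()
    # Youden index over all observed candidate cutoffs
    cuts = np.unique(np.concatenate([pos, neg]))
    j = max((pos >= c).mean() + (neg < c).mean() - 1 for c in cuts)
    return auc, j

# perfectly separated toy scores: both indices attain their maximum
auc, j = auc_and_youden([0.9, 0.8, 0.7, 0.6], [0.5, 0.4, 0.3, 0.2])
print(auc, j)
```

In the paper's setting one would compare these estimates (with their joint sampling variability) against the targets AUC0 and J0.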

8.
Recently, several new robust multivariate estimators of location and scatter have been proposed that provide new and improved methods for detecting multivariate outliers. But for small sample sizes, there are no results on how these new multivariate outlier detection techniques compare in terms of p_n, their outside rate per observation (the expected proportion of points declared outliers) under normality. And there are no results comparing their ability to detect truly unusual points based on the model that generated the data. Moreover, there are no results comparing these methods to two fairly new techniques that do not rely on some robust covariance matrix. It is found that for an approach based on the orthogonal Gnanadesikan–Kettenring estimator, p_n can be very unsatisfactory with small sample sizes, but a simple modification gives much more satisfactory results. Similar problems were found when using the median ball algorithm, but a modification proved to be unsatisfactory. The translated-biweights (TBS) estimator generally performs well with a sample size of n≥20 and when dealing with p-variate data where p≤5. But with p=8 it can be unsatisfactory, even with n=200. A projection method as well as the minimum generalized variance method generally perform best, although for p≤5 conditions are described where the TBS method is preferable. In terms of detecting truly unusual points, the methods can differ substantially depending on where the outliers happen to be, the number of outliers present, and the correlations among the variables.
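The outside rate p_n can be estimated by Monte Carlo: generate normal data, flag points whose squared distance exceeds a chi-square cutoff, and average the flagged proportion. The sketch below uses the classical mean/covariance Mahalanobis distance purely for illustration; the paper's interest is in robust estimators (OGK, TBS, etc.) substituted in its place.

```python
import numpy as np

def outside_rate(n, p, cutoff, reps=500, seed=0):
    """Monte-Carlo estimate of p_n, the expected proportion of points
    declared outliers under multivariate normality. Here classical
    Mahalanobis distances are used; a robust method would replace
    the sample mean and covariance."""
    rng = np.random.default_rng(seed)
    rates = []
    for _ in range(reps):
        x = rng.normal(size=(n, p))
        mu = x.mean(axis=0)
        inv_cov = np.linalg.inv(np.cov(x, rowvar=False))
        c = x - mu
        d2 = np.einsum('ij,jk,ik->i', c, inv_cov, c)  # squared distances
        rates.append((d2 > cutoff).mean())
    return float(np.mean(rates))

# p = 2; chi-square(2) 0.975 quantile = -2*ln(0.025) = 7.378 (hardcoded
# to stay numpy-only), giving a nominal outside rate of 0.025
pn = outside_rate(n=50, p=2, cutoff=7.378)
print(round(pn, 3))
```

With estimated (rather than true) location and scatter the realised rate typically falls somewhat below the nominal 0.025, which is exactly the kind of small-sample discrepancy the paper studies.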

9.
ABSTRACT

We study the asymptotic properties of the standard GMM estimator when additional moment restrictions, weaker than the original ones, are available. We provide conditions under which these additional weaker restrictions improve the efficiency of the GMM estimator. To detect “spurious” identification that may come from invalid moments, we rely on the Hansen J-test that assesses the compatibility between existing restrictions and additional ones. Our simulations reveal that the J-test has good power properties and that its power increases with the weakness of the additional restrictions. Our theoretical characterization of the J-test provides some intuition for why that is.

10.
We employ quantile regression fixed effects models to estimate the income-pollution relationship for NOx (nitrogen oxides) and SO2 (sulfur dioxide) using U.S. data. Conditional median results suggest that conditional mean methods provide overly optimistic estimates of emissions reduction for NOx, while the opposite is found for SO2. Deleting outlier states reverses the absence of a turning point for SO2 in the conditional mean model, while the conditional median model is robust to them. We also document the relationship's sensitivity to including additional covariates for NOx, and undertake simulations to shed light on some estimation issues of the methods employed.
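The conditional median estimates above come from minimising the check (pinball) loss rather than squared error. A rough numpy-only sketch of median regression via iteratively reweighted least squares, on invented data (serious applications use linear programming or a quantile-regression package, and the paper additionally includes fixed effects):

```python
import numpy as np

def quantile_reg_irls(X, y, tau=0.5, iters=200, eps=1e-6):
    """Rough IRLS approximation to quantile regression: minimise the
    check loss sum of rho_tau(r) via weighted least squares with
    weights w_i = (tau if r_i >= 0 else 1 - tau) / |r_i|."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # least-squares start
    for _ in range(iters):
        r = y - X @ beta
        w = np.where(r >= 0, tau, 1 - tau) / np.maximum(np.abs(r), eps)
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta

rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([np.ones(200), x])
y = 1.0 + 2.0 * x + rng.normal(size=200)          # true line: 1 + 2x
beta = quantile_reg_irls(X, y, tau=0.5)           # conditional median fit
print(beta.round(2))
```

Setting tau to, say, 0.9 instead of 0.5 fits an upper conditional quantile, which is how the income-pollution relationship can be traced across the whole conditional distribution.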

11.
Several methods have been suggested to calculate robust M- and G-M-estimators of the regression parameter β and of the error scale parameter σ in a linear model. This paper shows that, for some data sets well known in robust statistics, the nonlinear systems of equations for the simultaneous estimation of β, with an M-estimate with a redescending ψ-function, and σ, with the residual median absolute deviation (MAD), have many solutions. This multiplicity is not caused by the possible lack of uniqueness, for redescending ψ-functions, of the solutions of the system defining β with known σ; rather, it is the simultaneous estimation of β and σ that creates the problem. A way to avoid these multiple solutions is to proceed in two steps. First take σ as the median absolute deviation of the residuals for a uniquely defined robust M-estimate such as Huber's Proposal 2 or the L1-estimate. Then solve the nonlinear system for the M-estimate with σ equal to the value obtained at the first step to get the estimate of β. Analytical conditions for the uniqueness of M- and G-M-estimates are also given.
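The two-step recipe above can be sketched as follows. This is a minimal illustration with a monotone Huber ψ and, for brevity, an ordinary least-squares initial fit in place of the uniquely defined L1 or Proposal 2 fit the paper recommends; all names and tuning constants are illustrative.

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Monotone Huber psi-function (a redescending psi would be
    substituted here in the paper's setting)."""
    return np.clip(r, -c, c)

def two_step_m_estimate(X, y, iters=50):
    """Step 1: fix sigma as the MAD of residuals from an initial fit.
    Step 2: solve the M-estimating equations for beta by IRLS with
    sigma held fixed, avoiding the multiple-solution problem of
    simultaneous (beta, sigma) estimation."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]            # initial fit
    res = y - X @ beta
    sigma = np.median(np.abs(res - np.median(res))) / 0.6745  # MAD scale
    for _ in range(iters):
        r = (y - X @ beta) / sigma
        safe_r = np.where(r == 0, 1.0, r)
        w = np.where(r == 0, 1.0, huber_psi(r) / safe_r)   # IRLS weights
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta, sigma

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=100)
y[:5] += 10.0                                              # gross outliers
beta, sigma = two_step_m_estimate(X, y)
print(beta.round(2))
```

Holding σ fixed at the first-step value makes the second-step system well behaved, which is the point of the paper's recommendation.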

12.
Responses of two groups, measured on the same ordinal scale, are compared through the column effect association model, applied on the corresponding 2 × J contingency table. Monotonic or umbrella shaped ordering for the scores of the model are related to stochastic or umbrella ordering of the underlying response distributions, respectively. An algorithm for testing all possible hypotheses of stochastic ordering and deciding on an appropriate one is proposed.

13.
We propose a methodology to analyse data arising from a curve that, over its domain, switches among J states. We consider a sequence of response variables, where each response y depends on a covariate x according to an unobserved state z. The states form a stochastic process and their possible values are j = 1, …, J. If z equals j, the expected response of y is one of J unknown smooth functions evaluated at x. We call this model a switching nonparametric regression model. We develop an Expectation–Maximisation algorithm to estimate the parameters of the latent state process and the functions corresponding to the J states. We also obtain standard errors for the parameter estimates of the state process. We conduct simulation studies to analyse the frequentist properties of our estimates. We also apply the proposed methodology to the well-known motorcycle dataset, treating the data as coming from more than one simulated accident run with unobserved run labels.

14.
Suppose p + 1 experimental groups correspond to increasing dose levels of a treatment and all groups are subject to right censoring. In such instances, permutation tests for trend can be performed based on statistics derived from the weighted log‐rank class. This article uses saddlepoint methods to determine the mid‐P‐values for such permutation tests for any test statistic in the weighted log‐rank class. Permutation simulations are replaced by analytical saddlepoint computations which provide extremely accurate mid‐P‐values that are exact for most practical purposes and almost always more accurate than normal approximations. The speed of mid‐P‐value computation allows for the inversion of such tests to determine confidence intervals for the percentage increase in mean (or median) survival time per unit increase in dosage. The Canadian Journal of Statistics 37: 5‐16; 2009 © 2009 Statistical Society of Canada

15.
It is assumed that k (k > 2) independent samples of sizes n_i (i = 1, …, k) are available from k lognormal distributions. Four hypothesis cases (H1–H4) are defined. Under H1, all k median parameters as well as all k skewness parameters are equal; under H2, all k skewness parameters are equal but not all k median parameters are equal; under H3, all k median parameters are equal but not all k skewness parameters are equal; under H4, neither the k median parameters nor the k skewness parameters are equal. The Expectation Maximization (EM) algorithm is used to obtain the maximum likelihood (ML) estimates of the lognormal parameters in each of these four hypothesis cases. A polynomial of degree 2k − 1 is solved at each step of the EM algorithm for the H3 case. A two-stage procedure for testing the equality of the medians either under skewness homogeneity or under skewness heterogeneity is also proposed and discussed. A simulation study was performed for the case k = 3.
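Under skewness homogeneity (the H2 backdrop above, with equal σ across groups), equality of the lognormal medians exp(μ_i) is equivalent to equality of the normal means μ_i of the log data, so a one-way ANOVA F statistic on the logs is a natural test in that special case. This simple sketch is an assumption-laden stand-in for the paper's EM/ML machinery, which also handles the heteroskedastic cases:

```python
import numpy as np

def anova_f_on_logs(samples):
    """One-way ANOVA F statistic computed on log-transformed data:
    tests equality of lognormal medians when all groups share the
    same skewness (log-scale sigma)."""
    logs = [np.log(np.asarray(s, float)) for s in samples]
    k = len(logs)
    n = sum(len(g) for g in logs)
    grand = np.mean(np.concatenate(logs))
    ss_between = sum(len(g) * (np.mean(g) - grand) ** 2 for g in logs)
    ss_within = sum(((g - np.mean(g)) ** 2).sum() for g in logs)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# k = 3 groups simulated under the null: common median exp(0) = 1
rng = np.random.default_rng(0)
groups = [rng.lognormal(mean=0.0, sigma=0.5, size=40) for _ in range(3)]
print(round(anova_f_on_logs(groups), 2))
```

Since the groups are generated under the null, the statistic should be an unexceptional draw from an F(2, 117) distribution.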

16.
Consider two independent random samples of size f + 1, one from an N(μ1, σ1²) distribution and the other from an N(μ2, σ2²) distribution, where σ1²/σ2² ∈ (0, ∞). The Welch ‘approximate degrees of freedom’ (‘approximate t‐solution’) confidence interval for μ1 − μ2 is commonly used when it cannot be guaranteed that σ1²/σ2² = 1. Kabaila (2005, Comm. Statist. Theory and Methods 34, 291–302) multiplied the half‐width of this interval by a positive constant so that the resulting interval, denoted by J0, has minimum coverage probability 1 − α. Now suppose that we have uncertain prior information that σ1²/σ2² = 1. We consider a broad class of confidence intervals for μ1 − μ2 with minimum coverage probability 1 − α. This class includes the interval J0, which we use as the standard against which other members of the class will be judged. A confidence interval J utilizes the prior information substantially better than J0 if (expected length of J)/(expected length of J0) is (a) substantially less than 1 (less than 0.96, say) for σ1²/σ2² = 1, and (b) not too much larger than 1 for all other values of σ1²/σ2². For a given f, does there exist a confidence interval in this class that satisfies these conditions? We focus on the question of whether condition (a) can be satisfied. For each given f, we compute a lower bound to the minimum over the class of (expected length of J)/(expected length of J0) when σ1²/σ2² = 1. For 1 − α = 0.95, this lower bound is not substantially less than 1. Thus, there does not exist any confidence interval in this class that utilizes the prior information substantially better than J0.

17.
A main goal of regression is to derive statistical conclusions on the conditional distribution of the output variable Y given the input values x. Two of the most important characteristics of a single distribution are location and scale. Regularised kernel methods (RKMs) – also called support vector machines in a wide sense – are well established to estimate location functions like the conditional median or the conditional mean. We investigate the estimation of scale functions by RKMs when the conditional median is unknown, too. Estimation of scale functions is important, e.g. to estimate the volatility in finance. We consider the median absolute deviation (MAD) and the interquantile range as measures of scale. Our main result shows the consistency of MAD-type RKMs.

18.
A method is suggested to estimate posterior model probabilities and model averaged parameters via MCMC sampling under a Bayesian approach. The estimates use pooled output for J models (J>1) whereby all models are updated at each iteration. Posterior probabilities are based on averages of continuous weights obtained for each model at each iteration, while samples of averaged parameters are obtained from iteration specific averages that are based on these weights. Parallel sampling of models assists in deriving posterior densities for parameter contrasts between models and in assessing hypotheses regarding model averaged parameters. Four worked examples illustrate application of the approach, two involving fixed effect regression, and two involving random effects.

19.
Two processes of importance in statistics and probability are the empirical and partial-sum processes. Based on d-dimensional data X1, …, Xn, the empirical measure Fn is defined for any A ⊂ R^d as the sample proportion of observations in A. When normalized, Fn yields the empirical process Wn := n^{1/2}(Fn − F), where F denotes the “true” probability measure. To define partial-sum processes, one needs data that are assigned to specified locations (in contrast to the above, where specified unit masses are assigned to random locations). A suitable context for many applications is that of data attached to points of a lattice, say {Xj : j ∈ J^d} where J = {1, 2, …}, for which the partial sums are defined for any A ⊂ R^d by S(A) = Σ_{j ∈ A} Xj. Thus S(A) is the sum of the data contained in A. When normalized, S yields the partial-sum process. This paper provides an overview of asymptotic results for empirical and partial-sum processes, including strong laws and central limit theorems, together with some indications of their inferential implications.
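For d = 1 the empirical process Wn reduces to n^{1/2}(Fn(t) − F(t)) with Fn the empirical CDF, and its supremum is the familiar Kolmogorov–Smirnov statistic. A small sketch under the assumption of uniform data, where F(t) = t:

```python
import numpy as np

def empirical_process(x, grid, F):
    """W_n(t) = sqrt(n) * (F_n(t) - F(t)) on a grid, where F_n is the
    empirical CDF, i.e. the empirical measure of (-inf, t]."""
    x = np.sort(np.asarray(x, float))
    Fn = np.searchsorted(x, grid, side='right') / len(x)
    return np.sqrt(len(x)) * (Fn - F(grid))

rng = np.random.default_rng(0)
u = rng.uniform(size=1000)                 # data from U(0,1), so F(t) = t
t = np.linspace(0.0, 1.0, 101)
W = empirical_process(u, t, lambda s: s)
print(round(float(np.abs(W).max()), 2))    # Kolmogorov-Smirnov supremum
```

The central limit theorems surveyed in the paper describe the weak convergence of Wn to a Brownian bridge (in F), which is what keeps this supremum stochastically bounded as n grows.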

20.
Linear functions of order statistics (“L-estimates”) of the form Tn = n^{-1} Σi J(i/n) X(i) are investigated under jackknifing. This paper proves that, with suitable conditions on the weight function J, the jackknifed version T̃n of the L-estimate Tn has the same limit distribution as Tn. It is also shown that the jackknife estimate of the asymptotic variance of n^{1/2}T̃n is consistent. Furthermore, the Berry-Esséen rate associated with asymptotic normality, and a law of the iterated logarithm for a class of jackknife L-estimates, are characterized.
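A concrete L-estimate with its jackknife variance can be sketched as follows. The weight function J here is a simple trimming weight (equivalent to a 10% trimmed mean) chosen purely for illustration; the scaling assumes the standard jackknife variance formula.

```python
import numpy as np

def l_estimate(x, J):
    """L-estimate T_n = (1/n) * sum_i J(i/n) * X_(i), a weighted
    linear function of the order statistics X_(1) <= ... <= X_(n)."""
    x = np.sort(np.asarray(x, float))
    n = len(x)
    return float(np.mean(J(np.arange(1, n + 1) / n) * x))

def jackknife_var(x, stat):
    """Jackknife estimate of the asymptotic variance of sqrt(n)*T_n:
    (n - 1) * sum_i (theta_(i) - theta_bar)^2 over leave-one-out
    recomputations theta_(i)."""
    n = len(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])
    return float((n - 1) * ((loo - loo.mean()) ** 2).sum())

# trimming weight: zero on the extreme 10% of order statistics at each
# end, constant 1.25 in between so the weights average to 1
J = lambda u: np.where((u > 0.1) & (u <= 0.9), 1.25, 0.0)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
T = l_estimate(x, J)
v = jackknife_var(x, lambda s: l_estimate(s, J))
print(round(T, 3), round(v, 3))
```

For N(0, 1) data the asymptotic variance of the 10% trimmed mean is about 1.06, and the jackknife estimate should land in that neighbourhood, illustrating the consistency result stated in the abstract.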

