期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

On variable selection in generalized linear and related regression models

Lennart Nordberg 《统计学通讯:理论与方法》2013,42(21):2427-2449

This paper is concerned with selection of explanatory variables in generalized linear models (GLM). The class of GLM's is quite large and contains e.g. the ordinary linear regression, the binary logistic regression, the probit model and Poisson regression with linear or log-linear parameter structure. We show that, through an approximation of the log likelihood and a certain data transformation, the variable selection problem in a GLM can be converted into variable selection in an ordinary (unweighted) linear regression model. As a consequence no specific computer software for variable selection in GLM's is needed. Instead, some suitable variable selection program for linear regression can be used. We also present a simulation study which shows that the log likelihood approximation is very good in many practical situations. Finally, we mention briefly possible extensions to regression models outside the class of GLM's. 相似文献

2.

Using biweight m-estimates in the two-sample problem part 1: symmetric populations

Karen Kafadar 《统计学通讯:理论与方法》2013,42(17):1883-1901

We propose replacing the usual Student's-t statistic, which tests for equality of means of two distributions and is used to construct a confidence interval for the difference, by a biweight-“t” statistic. The biweight-“t” is a ratio of the difference of the biweight estimates of location from the two samples to an estimate of the standard error of this difference. Three forms of the denominator are evaluated: weighted variance estimates using both pooled and unpooled scale estimates, and unweighted variance estimates using an unpooled scale estimate. Monte Carlo simulations reveal that resulting confidence intervals are highly efficient on moderate sample sizes, and that nominal levels are nearly attained, even when considering extreme percentage points. 相似文献

3.

Comparison of unweighted and weighted rank based tests for an ordered alternative in randomized complete block designs

Hua Zhang Daniel Young 《统计学通讯:模拟与计算》2017,46(6):4452-4464

In randomized complete block designs, a monotonic relationship among treatment groups may already be established from prior information, e.g., a study with different dose levels of a drug. The test statistic developed by Page and another from Jonckheere and Terpstra are two unweighted rank based tests used to detect ordered alternatives when the assumptions in the traditional two-way analysis of variance are not satisfied. We consider a new weighted rank based test by utilizing a weight for each subject based on the sample variance in computing the new test statistic. The new weighted rank based test is compared with the two commonly used unweighted tests with regard to power under various conditions. The weighted test is generally more powerful than the two unweighted tests when the number of treatment groups is small to moderate. 相似文献

4.

Behavior of agreement measures in the presence of zero cells and biased marginal distributions

Viswanathan Shankar Shrikant I. Bangdiwala 《Journal of applied statistics》2008,35(4):445-464

Kappa and B assess agreement between two observers independently classifying N units into k categories. We study their behavior under zero cells in the contingency table and unbalanced asymmetric marginal distributions. Zero cells arise when a cross-classification is never endorsed by both observers; biased marginal distributions occur when some categories are preferred differently between the observers. Simulations studied the distributions of the unweighted and weighted statistics for k=4, under fixed proportions of diagonal agreement and different patterns off-diagonal, with various sample sizes, and under various zero cell count scenarios. Marginal distributions were first uniform and homogeneous, and then unbalanced asymmetric distributions. Results for unweighted kappa and B statistics were comparable to work of Muñoz and Bangdiwala, even with zero cells. A slight increased variation was observed as the sample size decreased. Weighted statistics did show greater variation as the number of zero cells increased, with weighted kappa increasing substantially more than weighted B. Under biased marginal distributions, weighted kappa with Cicchetti weights were higher than with squared weights. Both statistics for observer agreement behaved well under zero cells. The weighted B was less variable than the weighted kappa under similar circumstances and different weights. In general, B's performance and graphical interpretation make it preferable to kappa under the studied scenarios. 相似文献

5.

Modeling longitudinal count data with dropouts

Mohamed Alosh 《Pharmaceutical statistics》2010,9(1):35-45

This paper explores the utility of different approaches for modeling longitudinal count data with dropouts arising from a clinical study for the treatment of actinic keratosis lesions on the face and balding scalp. A feature of these data is that as the disease for subjects on the active arm improves their data show larger dispersion compared with those on the vehicle, exhibiting an over‐dispersion relative to the Poisson distribution. After fitting the marginal (or population averaged) model using the generalized estimating equation (GEE), we note that inferences from such a model might be biased as dropouts are treatment related. Then, we consider using a weighted GEE (WGEE) where each subject's contribution to the analysis is weighted inversely by the subject's probability of dropout. Based on the model findings, we argue that the WGEE might not address the concerns about the impact of dropouts on the efficacy findings when dropouts are treatment related. As an alternative, we consider likelihood‐based inference where random effects are added to the model to allow for heterogeneity across subjects. Finally, we consider a transition model where, unlike the previous approaches that model the log‐link function of the mean response, we model the subject's actual lesion counts. This model is an extension of the Poisson autoregressive model of order 1, where the autoregressive parameter is taken to be a function of treatment as well as other covariates to induce different dispersions and correlations for the two treatment arms. We conclude with a discussion about model selection. Published in 2009 by John Wiley & Sons, Ltd. 相似文献

6.

Forecasting functional time series

Rob J. Hyndman Han Lin Shang 《Journal of the Korean Statistical Society》2009,38(3):199-211

We propose forecasting functional time series using weighted functional principal component regression and weighted functional partial least squares regression. These approaches allow for smooth functions, assign higher weights to more recent data, and provide a modeling scheme that is easily adapted to allow for constraints and other information. We illustrate our approaches using age-specific French female mortality rates from 1816 to 2006 and age-specific Australian fertility rates from 1921 to 2006, and show that these weighted methods improve forecast accuracy in comparison to their unweighted counterparts. We also propose two new bootstrap methods to construct prediction intervals, and evaluate and compare their empirical coverage probabilities. 相似文献

7.

On the existence of maximum likelihood estimates for weighted logistic regression

Guoping Zeng 《统计学通讯:理论与方法》2017,46(22):11194-11203

The problems of existence and uniqueness of maximum likelihood estimates for logistic regression were completely solved by Silvapulle in 1981 and Albert and Anderson in 1984. In this paper, we extend the well-known results by Silvapulle and by Albert and Anderson to weighted logistic regression. We analytically prove the equivalence between the overlap condition used by Albert and Anderson and that used by Silvapulle. We show that the maximum likelihood estimate of weighted logistic regression does not exist if there is a complete separation or a quasicomplete separation of the data points, and exists and is unique if there is an overlap of data points. Our proofs and results for weighted logistic apply to unweighted logistic regression. 相似文献

8.

Design‐Based Inference in a Mixture Model for Ordinal Variables for a Two Stage Stratified Design

R. Gambacorta M. Iannario R. Valliant 《Australian & New Zealand Journal of Statistics》2014,56(2):125-143

In this paper we present methods for inference on data selected by a complex sampling design for a class of statistical models for the analysis of ordinal variables. Specifically, assuming that the sampling scheme is not ignorable, we derive for the class of cub models (Combination of discrete Uniform and shifted Binomial distributions) variance estimates for a complex two stage stratified sample. Both Taylor linearization and repeated replication variance estimators are presented. We also provide design‐based test diagnostics and goodness‐of‐fit measures. We illustrate by means of real data analysis the differences between survey‐weighted and unweighted point estimates and inferences for cub model parameters. 相似文献

9.

Weighted empirical likelihood inference for multiple samples

Yuejiao Fu Xiaogang Wang Changbao Wu 《Journal of statistical planning and inference》2009

We propose a weighted empirical likelihood approach to inference with multiple samples, including stratified sampling, the estimation of a common mean using several independent and non-homogeneous samples and inference on a particular population using other related samples. The weighting scheme and the basic result are motivated and established under stratified sampling. We show that the proposed method can ideally be applied to the common mean problem and problems with related samples. The proposed weighted approach not only provides a unified framework for inference with multiple samples, including two-sample problems, but also facilitates asymptotic derivations and computational methods. A bootstrap procedure is also proposed in conjunction with the weighted approach to provide better coverage probabilities for the weighted empirical likelihood ratio confidence intervals. Simulation studies show that the weighted empirical likelihood confidence intervals perform better than existing ones. 相似文献

10.

Coherent mortality forecasting by the weighted multilevel functional principal component approach

Ruhao Wu 《Journal of applied statistics》2019,46(10):1774-1791

In human mortality modelling, if a population consists of several subpopulations it can be desirable to model their mortality rates simultaneously while taking into account the heterogeneity among them. The mortality forecasting methods tend to result in divergent forecasts for subpopulations when independence is assumed. However, under closely related social, economic and biological backgrounds, mortality patterns of these subpopulations are expected to be non-divergent in the future. In this article, we propose a new method for coherent modelling and forecasting of mortality rates for multiple subpopulations, in the sense of nondivergent life expectancy among subpopulations. The mortality rates of subpopulations are treated as multilevel functional data and a weighted multilevel functional principal component (wMFPCA) approach is proposed to model and forecast them. The proposed model is applied to sex-specific data for nine developed countries, and the results show that, in terms of overall forecasting accuracy, the model outperforms the independent model and the Product-Ratio model as well as the unweighted multilevel functional principal component approach. 相似文献

11.

Considering Taguchi loss function on statistically constrained economic sum of squares exponentially weighted moving average charts

《Journal of Statistical Computation and Simulation》2012,82(3):572-586

In this paper, Duncan's cost model combined Taguchi's quadratic loss function is applied to develop the economic-statistical design of the sum of squares exponentially weighted moving average (SS-EWMA) chart. The genetic algorithm is applied to search for the optimal decision variables of SS-EWMA chart such that the expected cost is minimized. Sensitivity analysis reveals that the optimal sample size and sampling interval decrease; optimal smoothing constant and control limit increase as the mean and/or variance increases. Moreover, the combination of optimal parameter levels in orthogonal array experiment plays an important guideline for monitoring the process mean and/or variance. 相似文献

12.

Goodness-of-fit statistics for log-link regression models

《Journal of Statistical Computation and Simulation》2012,82(12):2533-2545

The use of log binomial regression, regression on binary outcomes using a log link, is becoming increasingly popular because it provides estimates of relative risk. However, little work has been done on model evaluation. We used simulations to compare the performance of five goodness-of-fit statistics applied to different models in a log binomial setting, namely the Hosmer–Lemeshow, the normalized Pearson chi-square, the normalized unweighted sum of squares, Le Cessie and van Howelingen's statistic based on smoothed residuals and the Hjort–Hosmer test. The normalized Pearson chi-square was unsuitable as the rejection rate depended also on the range of predicted probabilities. The Le Cessie and van Howelingen's test statistic had poor sampling properties when evaluating a correct model and was also considered to be unsuitable in this context. The performance of the remaining three statistics was comparable in most simulations. However, using real data the Hjort–Hosmer outperformed the other two statistics. 相似文献

13.

Weighted kappa as a function of unweighted kappas

N. Moradzadeh M. Ganjali 《统计学通讯:模拟与计算》2017,46(5):3769-3780

The kappa coefficient is a widely used measure for assessing agreement on a nominal scale. Weighted kappa is an extension of Cohen's kappa that is commonly used for measuring agreement on an ordinal scale. In this article, it is shown that weighted kappa can be computed as a function of unweighted kappas. The latter coefficients are kappa coefficients that correspond to smaller contingency tables that are obtained by merging categories. 相似文献

14.

Examples of Differing Weighted and Unweighted Estimates from a Sample Survey 总被引：1，自引：0，他引：1

Edward L. Korn Barry I. Graubard 《The American statistician》2013,67(3):291-295

Unweighted estimators using data collected in a sample survey can be badly biased, whereas weighted estimators are approximately unbiased for population parameters. We present four examples using data from the 1988 National Maternal and Infant Health Survey to demonstrate that weighted and unweighted estimators can be quite different, and to show the underlying causes of such differences. 相似文献

15.

GOES-8 X-ray sensor variance stabilization using the multiscale data-driven Haar–Fisz transform

Piotr Fryzlewicz Véronique Delouille Guy P. Nason 《Journal of the Royal Statistical Society. Series C, Applied statistics》2007,56(1):99-116

Summary. We consider the stochastic mechanisms behind the data that were collected by the solar X-ray sensor (XRS) on board the GOES-8 satellite. We discover and justify a non-trivial mean–variance relationship within the XRS data. Transforming such data so that their variance is stable and its distribution is taken closer to the Gaussian distribution is the aim of many techniques (e.g. Anscombe and Box–Cox). Recently, new techniques based on the Haar–Fisz transform have been introduced that use a multiscale method to transform and stabilize data with a known mean–variance relationship. In many practical cases, such as the XRS data, the variance of the data can be assumed to increase with the mean, but other characteristics of the distribution are unknown. We introduce a method, the data-driven Haar–Fisz transform, which uses the Haar–Fisz transform but also estimates the mean–variance relationship. For known noise distributions, the data-driven Haar–Fisz transform is shown to be competitive with the fixed Haar–Fisz methods. We show how our data-driven Haar–Fisz transform method denoises the XRS series where other existing methods fail. 相似文献

16.

Hazard ratio inference in stratified clinical trials with time‐to‐event endpoints and limited sample size

Rengyi Xu Devan V. Mehrotra Pamela A. Shaw 《Pharmaceutical statistics》2019,18(3):366-376

The stratified Cox model is commonly used for stratified clinical trials with time‐to‐event endpoints. The estimated log hazard ratio is approximately a weighted average of corresponding stratum‐specific Cox model estimates using inverse‐variance weights; the latter are optimal only under the (often implausible) assumption of a constant hazard ratio across strata. Focusing on trials with limited sample sizes (50‐200 subjects per treatment), we propose an alternative approach in which stratum‐specific estimates are obtained using a refined generalized logrank (RGLR) approach and then combined using either sample size or minimum risk weights for overall inference. Our proposal extends the work of Mehrotra et al, to incorporate the RGLR statistic, which outperforms the Cox model in the setting of proportional hazards and small samples. This work also entails development of a remarkably accurate plug‐in formula for the variance of RGLR‐based estimated log hazard ratios. We demonstrate using simulations that our proposed two‐step RGLR analysis delivers notably better results through smaller estimation bias and mean squared error and larger power than the stratified Cox model analysis when there is a treatment‐by‐stratum interaction, with similar performance when there is no interaction. Additionally, our method controls the type I error rate while the stratified Cox model does not in small samples. We illustrate our method using data from a clinical trial comparing two treatments for colon cancer. 相似文献

17.

Designing a multiple state repetitive group sampling plan based on the coefficient of variation

Aijun Yan Sanyang Liu Muhammad Azam 《统计学通讯:模拟与计算》2017,46(9):7154-7165

A multiple state repetitive group sampling (MSRGS) plan is developed on the basis of the coefficient of variation (CV) of the quality characteristic which follows a normal distribution with unknown mean and variance. The optimal plan parameters of the proposed plan are solved by a nonlinear optimization model, which satisfies the given producer's risk and consumer's risk at the same time and minimizes the average sample number required for inspection. The advantages of the proposed MSRGS plan over the existing sampling plans are discussed. Finally an example is given to illustrate the proposed plan. 相似文献

18.

Efficient estimation of regression parameters from multistage studies with validation of outcome and covariates

《Journal of statistical planning and inference》1997,65(2):349-374

Often the variables in a regression model are difficult or expensive to obtain so auxiliary variables are collected in a preliminary step of a study and the model variables are measured at later stages on only a subsample of the study participants called the validation sample. We consider a study in which at the first stage some variables, throughout called auxiliaries, are collected; at the second stage the true outcome is measured on a subsample of the first-stage sample, and at the third stage the true covariates are collected on a subset of the second-stage sample. In order to increase efficiency, the probabilities of selection into the second and third-stage samples are allowed to depend on the data observed at the previous stages. In this paper we describe a class of inverse-probability-of-selection-weighted semiparametric estimators for the parameters of the model for the conditional mean of the outcomes given the covariates. We assume that a subject's probability of being sampled at subsequent stages is bounded away from zero and depends only on the subject's data collected at the previous sampling stages. We show that the asymptotic variance of the optimal estimator in our class is equal to the semiparametric variance bound for the model. Since the optimal estimator depends on unknown population parameters it is not available for data analysis. We therefore propose an adaptive estimation procedure for locally efficient inferences. A simulation study is carried out to study the finite sample properties of the proposed estimators. 相似文献

19.

Association models in the weighted log ratio analysis for rates

下载免费PDF全文

A. D'Ambra A. Crisci L. D'Ambra 《Australian & New Zealand Journal of Statistics》2017,59(2):169-185

In this paper we consider Goodman's association models and weighted log ratio analysis (LRA). In particular, by combining these two methods, we obtain different weighted log ratio analyses that we can extend to analyse a rates matrix, obtained by calculating the ratio between two initial multidimensional contingency tables. Our approach is illustrated by an empirical study. The selection of the model, to be analysed through the weighted LRA plot, is carried out by means of Poisson regression on rates. 相似文献

20.

Change Point Estimation with Independent Observations and Piece-Wise Continuous Variance Function

Michael Last 《统计学通讯:理论与方法》2013,42(5):722-736

Grégoire and Hamrouni use locally linear smoothers to find jumps in the mean of a regression function under a set of weak assumptions. This paper extends this work to find jumps in the variance function of a mean zero series of independent observations. We transform this problem into the problem considered by Grégoire and Hamrouni by means of a log transform. We also demonstrate that a bootstrap technique proposed by Gijbels and Goderniaux is valid in this setting. 相似文献