1.

Regression analysis under non-standard situations: a pairwise pseudolikelihood approach

Kung-Yee Liang & Jing Qin 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2000,62(4):773-786

Regression analysis is one of the most used statistical methods for data analysis. There are, however, many situations in which one cannot base inference solely on f(y ∣ x; β), the conditional probability (density) function for the response variable Y, given x, the covariates. Examples include missing data where the missingness is non-ignorable, sampling surveys in which subjects are selected on the basis of the Y-values and meta-analysis where published studies are subject to 'selection bias'. The conventional approaches require the correct specification of the missingness mechanism, sampling probability and probability for being published respectively. In this paper, we propose an alternative estimating procedure for β based on an idea originated by Kalbfleisch. The novelty of this method is that no assumption on the missingness probability mechanisms etc. mentioned above is required to be specified. Asymptotic efficiency calculations and simulation studies were conducted to compare the method proposed with the two existing methods: the conditional likelihood and the weighted estimating function approaches.

2.

Semiparametric inference for estimating equations with nonignorably missing covariates

Ji Chen & Fang Fang 《Journal of nonparametric statistics》2018,30(3):796-812

We consider statistical inference of unknown parameters in estimating equations (EEs) when some covariates have nonignorably missing values, which is quite common in practice but has rarely been discussed in the literature. When an instrument, a fully observed covariate vector that helps identify parameters under nonignorable missingness, is available, the conditional distribution of the missing covariates given other covariates can be estimated by the pseudolikelihood method of Zhao and Shao [(2015), ‘Semiparametric pseudo likelihoods in generalised linear models with nonignorable missing data’, Journal of the American Statistical Association, 110, 1577–1590] and be used to construct unbiased EEs. These modified EEs then constitute a basis for valid inference by empirical likelihood. Our method is applicable to a wide range of EEs used in practice. It is semiparametric since no parametric model for the propensity of missing covariate data is assumed. Asymptotic properties of the proposed estimator and the empirical likelihood ratio test statistic are derived. Some simulation results and a real data analysis are presented for illustration.

3.

ODD MAN OUT—THE MEIXNER HYPERGEOMETRIC DISTRIBUTION

C. D. Lai & D. Vere-Jones 《Australian & New Zealand Journal of Statistics》1979,21(3):256-265

Two open problems are described for the hyperbolic secant distribution, as a special case of the more general Meixner hypergeometric distribution. The first concerns the completeness of the family of functions sech θ x, θ ≥ 0. The second concerns the characterization of the class of canonical correlation sequences in bivariate distributions with marginals sech θ x, sech θ y. In both cases some partial results are put forward.

4.

Exact Short Confidence Intervals from Discrete Data

Paul Kabaila & John Byrne 《Australian & New Zealand Journal of Statistics》2001,43(3):303-309

Suppose that X is a discrete random variable whose possible values are {0, 1, 2, …} and whose probability mass function belongs to a family indexed by the scalar parameter θ. This paper presents a new algorithm for finding a 1 − α confidence interval for θ based on X which possesses the following three properties: (i) the infimum over θ of the coverage probability is 1 − α; (ii) the confidence interval cannot be shortened without violating the coverage requirement; (iii) the lower and upper endpoints of the confidence intervals are increasing functions of the observed value x. This algorithm is applied to the particular case that X has a negative binomial distribution.
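
Property (i) above refers to the exact coverage probability of an interval procedure, which for a discrete X is a finite or countable sum and can be computed directly. The following is a minimal sketch of that computation only, not the algorithm of the paper. It assumes a Poisson model and a naive Wald-type interval x ± 1.96√x purely for illustration; the function names and the truncation point x_max are hypothetical choices.

```python
import math


def poisson_pmf_table(theta, x_max):
    """Return [P(X=0), ..., P(X=x_max)] for X ~ Poisson(theta) (assumed model)."""
    probs = [math.exp(-theta)]
    for x in range(1, x_max + 1):
        # Recurrence P(X=x) = P(X=x-1) * theta / x avoids large factorials.
        probs.append(probs[-1] * theta / x)
    return probs


def wald_interval(x, z=1.96):
    """Naive interval x +/- z*sqrt(x), truncated at 0 (illustrative only)."""
    half = z * math.sqrt(x)
    return max(0.0, x - half), x + half


def coverage(theta, interval, x_max=500):
    """Exact coverage: sum of P(X=x) over all x whose interval contains theta.

    The sum is truncated at x_max, which is adequate for moderate theta.
    """
    total = 0.0
    for x, p in enumerate(poisson_pmf_table(theta, x_max)):
        lo, hi = interval(x)
        if lo <= theta <= hi:
            total += p
    return total


if __name__ == "__main__":
    for theta in (0.05, 1.0, 5.0, 20.0):
        print(theta, round(coverage(theta, wald_interval), 4))
```

Under these assumptions the naive interval collapses to the single point 0 when x = 0, so its coverage for small θ falls far below the nominal level. This is the kind of failure that a requirement on the infimum of the coverage probability rules out.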

5.

Doubly robust empirical likelihood inference in covariate-missing data problems

Biao Zhang 《Statistics》2016,50(5):1173-1194

Missing covariate data occurs often in regression analysis. We study methods for estimating the regression coefficients in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866] on regression analyses with missing covariates, in which they pioneered the use of two working models, the working propensity score model and the working conditional score model. A recent approach to missing covariate data analysis is the empirical likelihood method of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503], which effectively combines unbiased estimating equations. In this paper, we consider an alternative likelihood approach based on the full likelihood of the observed data. This full likelihood-based method enables us to generate estimators for the vector of the regression coefficients that are (a) asymptotically equivalent to those of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the working propensity score model is correctly specified, and (b) doubly robust, like the augmented inverse probability weighting (AIPW) estimators of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Thus, the proposed full likelihood-based estimators improve on the efficiency of the AIPW estimators when the working propensity score model is correct but the working conditional score model is possibly incorrect, and also improve on the empirical likelihood estimators of Qin, Zhang and Leung [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the reverse is true, that is, the working conditional score model is correct but the working propensity score model is possibly incorrect. In addition, we consider a regression method for estimation of the regression coefficients when the working conditional score model is correctly specified; the asymptotic variance of the resulting estimator is no greater than the semiparametric variance bound characterized by the theory of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Finally, we compare the finite-sample performance of various estimators in a simulation study.

6.

Detecting covariates with non-random missing values in a survey of primary education in Madagascar

J. K. Lindsey & P. J. Lindsey 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2001,164(2):327-338

A major survey of the determinants of access to primary education in Madagascar was carried out in 1994. The probability of enrolment, probability of admission, delay before beginning school, probability of repeating a year and probability of dropping out were studied. The results of the survey are briefly described. In the analysis, one major problem was non-random missing values in the covariates. Some simple methods were developed for detecting whether a response variable depends on the missingness of a given covariate and whether eliminating the missing values would distort the resulting model. A way of incorporating covariates with randomly missing values was used such that the individuals having the missing values did not need to be eliminated. These methods are described and examples are given on how they were applied for one of the key covariates that had a large number of non-random missing values and for one for which the values appear to be randomly missing.

7.

Estimation of the additive hazards model with interval-censored data and missing covariates

Huiqiong Li, Han Zhang, Liang Zhu, Ni Li & Jianguo Sun 《Revue canadienne de statistique》2020,48(3):499-517

The additive hazards model is one of the most commonly used regression models in the analysis of failure time data and many methods have been developed for its inference in various situations. However, no established estimation procedure exists when there are covariates with missing values and the observed responses are interval-censored; both types of complications arise in various settings including demographic, epidemiological, financial, medical and sociological studies. To address this deficiency, we propose several inverse probability weight-based and reweighting-based estimation procedures for the situation where covariate values are missing at random. The resulting estimators of regression model parameters are shown to be consistent and asymptotically normal. The numerical results that we report from a simulation study suggest that the proposed methods work well in practical situations. An application to a childhood cancer survival study is provided. The Canadian Journal of Statistics 48: 499–517; 2020 © 2020 Statistical Society of Canada

8.

Joint regression modeling for missing categorical covariates in generalized linear models

Luis Carlos Pérez-Ruiz & Gabriel Escarela 《Journal of applied statistics》2018,45(15):2741-2759

Missing covariate data is a common issue in generalized linear models (GLMs). A model-based procedure arising from properly specifying joint models for both the partially observed covariates and the corresponding missing indicator variables represents a sound and flexible methodology, which lends itself to maximum likelihood estimation as the likelihood function is available in computable form. In this paper, a novel model-based methodology is proposed for the regression analysis of GLMs when the partially observed covariates are categorical. Pair-copula constructions are used as graphical tools in order to facilitate the specification of the high-dimensional probability distributions of the underlying missingness components. The model parameters are estimated by maximizing the weighted log-likelihood function by using an EM algorithm. In order to compare the performance of the proposed methodology with other well-established approaches, which include complete-cases and multiple imputation, several simulation experiments of Binomial, Poisson and Normal regressions are carried out under both missing at random and non-missing at random mechanism scenarios. The methods are illustrated by modeling data from a stage III melanoma clinical trial. The results show that the methodology is rather robust and flexible, representing a competitive alternative to traditional techniques.

9.

Non-ignorable missing covariate data in survival analysis: a case-study of an International Breast Cancer Study Group trial

Amy H. Herring, Joseph G. Ibrahim & Stuart R. Lipsitz 《Journal of the Royal Statistical Society. Series C, Applied statistics》2004,53(2):293-310

Summary. Non-ignorable missing data, a serious problem in both clinical trials and observational studies, can lead to biased inferences. Quality-of-life measures have become increasingly popular in clinical trials. However, these measures are often incompletely observed, and investigators may suspect that missing quality-of-life data are likely to be non-ignorable. Although several recent references have addressed missing covariates in survival analysis, they all required the assumption that missingness is at random or that all covariates are discrete. We present a method for estimating the parameters in the Cox proportional hazards model when missing covariates may be non-ignorable and continuous or discrete. Our method is useful in reducing the bias and improving efficiency in the presence of missing data. The methodology clearly specifies assumptions about the missing data mechanism and, through sensitivity analysis, helps investigators to understand the potential effect of missing data on study results.

10.

Weighting for item non-response in attitude scales by using latent variable models with covariates

Irini Moustaki & Martin Knott 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2000,163(3):445-459

We discuss the use of latent variable models with observed covariates for computing response propensities for sample respondents. A response propensity score is often used to weight item and unit responders to account for item and unit non-response and to obtain adjusted means and proportions. In the context of attitude scaling, we discuss computing response propensity scores by using latent variable models for binary or nominal polytomous manifest items with covariates. Our models allow the response propensity scores to be found for several different items without refitting. They allow any pattern of missing responses for the items. If one prefers, it is possible to estimate population proportions directly from the latent variable models, so avoiding the use of propensity scores. Artificial data sets and a real data set extracted from the 1996 British Social Attitudes Survey are used to compare the various methods proposed.

11.

Validity and efficiency in analyzing ordinal responses with missing observations

Xichen She & Changbao Wu 《Revue canadienne de statistique》2020,48(2):138-151

This article addresses issues in creating public-use data files in the presence of missing ordinal responses and subsequent statistical analyses of the dataset by users. The authors propose a fully efficient fractional imputation (FI) procedure for ordinal responses with missing observations. The proposed imputation strategy retrieves the missing values through the full conditional distribution of the response given the covariates and results in a single imputed data file that can be analyzed by different data users with different scientific objectives. Two most critical aspects of statistical analyses based on the imputed data set, validity and efficiency, are examined through regression analysis involving the ordinal response and a selected set of covariates. It is shown through both theoretical development and simulation studies that, when the ordinal responses are missing at random, the proposed FI procedure leads to valid and highly efficient inferences as compared to existing methods. Variance estimation using the fractionally imputed data set is also discussed. The Canadian Journal of Statistics 48: 138–151; 2020 © 2019 Statistical Society of Canada

12.

The competing risks Cox model with auxiliary case covariates under weaker missing-at-random cause of failure

Daniel Nevo, Reiko Nishihara, Shuji Ogino & Molin Wang 《Lifetime data analysis》2018,24(3):425-442

In the analysis of time-to-event data with multiple causes using a competing risks Cox model, often the cause of failure is unknown for some of the cases. The probability of a missing cause is typically assumed to be independent of the cause given the time of the event and covariates measured before the event occurred. In practice, however, the underlying missing-at-random assumption does not necessarily hold. Motivated by colorectal cancer molecular pathological epidemiology analysis, we develop a method to conduct valid analysis when additional auxiliary variables are available for cases only. We consider a weaker missing-at-random assumption, with missing pattern depending on the observed quantities, which include the auxiliary covariates. We use an informative likelihood approach that will yield consistent estimates even when the underlying model for missing cause of failure is misspecified. The superiority of our method over naive methods in finite samples is demonstrated by simulation study results. We illustrate the use of our method in an analysis of colorectal cancer data from the Nurses’ Health Study cohort, where, apparently, the traditional missing-at-random assumption fails to hold.

13.

Weighted empirical likelihood for quantile regression with non ignorable missing covariates

Xiaohui Yuan & Xiaogang Dong 《Communications in Statistics - Theory and Methods》2019,48(12):3068-3084

In this paper, we propose an empirical likelihood-based weighted estimator of the regression parameter in a quantile regression model with non ignorable missing covariates. The proposed estimator is computationally simple and achieves semiparametric efficiency if the probability of missingness on the fully observed variables is correctly specified. The efficiency gain of the proposed estimator over the complete-case-analysis estimator is quantified theoretically and illustrated via simulation and a real data application.

14.

Analysis of two-phase sampling data with semiparametric additive hazards models

Yanqing Sun, Xiyuan Qian, Qiong Shou & Peter B. Gilbert 《Lifetime data analysis》2017,23(3):377-399

Under the case-cohort design introduced by Prentice (Biometrika 73:1–11, 1986), the covariate histories are ascertained only for the subjects who experience the event of interest (i.e., the cases) during the follow-up period and for a relatively small random sample from the original cohort (i.e., the subcohort). The case-cohort design has been widely used in clinical and epidemiological studies to assess the effects of covariates on failure times. Most statistical methods developed for the case-cohort design use the proportional hazards model, and few methods allow for time-varying regression coefficients. In addition, most methods disregard data from subjects outside of the subcohort, which can result in inefficient inference. Addressing these issues, this paper proposes an estimation procedure for the semiparametric additive hazards model with case-cohort/two-phase sampling data, allowing the covariates of interest to be missing for cases as well as for non-cases. A more flexible form of the additive model is considered that allows the effects of some covariates to be time varying while specifying the effects of others to be constant. An augmented inverse probability weighted estimation procedure is proposed. The proposed method allows utilizing the auxiliary information that correlates with the phase-two covariates to improve efficiency. The asymptotic properties of the proposed estimators are established. An extensive simulation study shows that the augmented inverse probability weighted estimation is more efficient than the widely adopted inverse probability weighted complete-case estimation method. The method is applied to analyze data from a preventive HIV vaccine efficacy trial.

15.

On maximum likelihood estimation in parametric regression with missing covariates

《Journal of statistical planning and inference》2005,134(1):206-223

We consider parametric regression problems with some covariates missing at random. It is shown that the regression parameter remains identifiable under natural conditions. When the always observed covariates are discrete, we propose a semiparametric maximum likelihood method, which does not require parametric specification of the missing data mechanism or the covariate distribution. The global maximum likelihood estimator (MLE), which maximizes the likelihood over the whole parameter set, is shown to exist under simple conditions. For ease of computation, we also consider a restricted MLE which maximizes the likelihood over covariate distributions supported by the observed values. Under regularity conditions, the two MLEs are asymptotically equivalent and strongly consistent for a class of topologies on the parameter set.

16.

Missing covariates in generalized linear models when the missing data mechanism is non-ignorable

J. G. Ibrahim, S. R. Lipsitz & M.-H. Chen 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1999,61(1):173-190

We propose a method for estimating parameters in generalized linear models with missing covariates and a non-ignorable missing data mechanism. We use a multinomial model for the missing data indicators and propose a joint distribution for them which can be written as a sequence of one-dimensional conditional distributions, with each one-dimensional conditional distribution consisting of a logistic regression. We allow the covariates to be either categorical or continuous. The joint covariate distribution is also modelled via a sequence of one-dimensional conditional distributions, and the response variable is assumed to be completely observed. We derive the E- and M-steps of the EM algorithm with non-ignorable missing covariate data. For categorical covariates, we derive a closed form expression for the E- and M-steps of the EM algorithm for obtaining the maximum likelihood estimates (MLEs). For continuous covariates, we use a Monte Carlo version of the EM algorithm to obtain the MLEs via the Gibbs sampler. Computational techniques for Gibbs sampling are proposed and implemented. The parametric form of the assumed missing data mechanism itself is not ‘testable’ from the data, and thus the non-ignorable modelling considered here can be viewed as a sensitivity analysis concerning a more complicated model. Therefore, although a model may have ‘passed’ the tests for a certain missing data mechanism, this does not mean that we have captured, even approximately, the correct missing data mechanism. Hence, model checking for the missing data mechanism and sensitivity analyses play an important role in this problem and are discussed in detail. Several simulations are given to demonstrate the methodology. In addition, a real data set from a melanoma cancer clinical trial is presented to illustrate the methods proposed.

17.

Using auxiliary data for parameter estimation with non-ignorably missing outcomes

Joseph G. Ibrahim, Stuart R. Lipsitz & Nick Horton 《Journal of the Royal Statistical Society. Series C, Applied statistics》2001,50(3):361-373

We propose a method for estimating parameters in generalized linear models when the outcome variable is missing for some subjects and the missing data mechanism is non-ignorable. We assume throughout that the covariates are fully observed. One possible method for estimating the parameters is maximum likelihood with a non-ignorable missing data model. However, caution must be used when fitting non-ignorable missing data models because certain parameters may be inestimable for some models. Instead of fitting a non-ignorable model, we propose the use of auxiliary information in a likelihood approach to reduce the bias, without having to specify a non-ignorable model. The method is applied to a mental health study.

18.

Efficient inverse probability weighting method for quantile regression with nonignorable missing data

Pu-Ying Zhao & De-Peng Jiang 《Statistics》2017,51(2):363-386

Quantile regression (QR) is a popular approach to estimate functional relations between variables for all portions of a probability distribution. Parameter estimation in QR with missing data is one of the most challenging issues in statistics. Regression quantiles can be substantially biased when observations are subject to missingness. We study several inverse probability weighting (IPW) estimators for parameters in QR when covariates or responses are missing not at random. Maximum likelihood and semiparametric likelihood methods are employed to estimate the respondent probability function. To achieve nice efficiency properties, we develop an empirical likelihood (EL) approach to QR with the auxiliary information from the calibration constraints. The proposed methods are less sensitive to misspecified missing mechanisms. Asymptotic properties of the proposed IPW estimators are shown under general settings. The efficiency gain of the EL-based IPW estimator is quantified theoretically. Simulation studies and a data set on the work limitation of injured workers from Canada are used to illustrate our proposed methodologies.

19.

Parametric models for response-biased sampling

Kani Chen 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2001,63(4):775-789

Suppose that subjects in a population follow the model f(y* ∣ x*; θ) where y* denotes a response, x* denotes a vector of covariates and θ is the parameter to be estimated. We consider response-biased sampling, in which a subject is observed with a probability which is a function of its response. Such response-biased sampling frequently occurs in econometrics, epidemiology and survey sampling. The semiparametric maximum likelihood estimate of θ is derived, along with its asymptotic normality, efficiency and variance estimates. The estimate proposed can be used as a maximum partial likelihood estimate in stratified response-selective sampling. Some computation algorithms are also provided.

20.

Penalized inverse probability weighted estimators for weighted rank regression with missing covariates

Hu Yang & Jing Lv 《Communications in Statistics - Theory and Methods》2013,42(5):1388-1402

In this article, we study variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and is empirically shown to be much more efficient and robust with respect to heavy-tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property. To deal with the missing covariates, we consider the inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator using the estimated selection probability has a smaller asymptotic variance than the one using the true selection probability, and is thus more efficient. Therefore, the important Horvitz-Thompson property is verified for the penalized rank estimator with missing covariates in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators.
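
The inverse probability weighting idea mentioned above can be sketched in a few lines. This is a minimal illustration only, not the penalized rank estimator of the article: it assumes ordinary weighted least squares with a single covariate, a known selection probability that depends only on the always-observed response, and simulated data. All function names and simulation settings are hypothetical.

```python
import math
import random


def ipw_slope(xs, ys, observed, probs):
    """Weighted least-squares slope with weights R_i / pi_i.

    Rows with observed == 0 get weight 0, so their x values are never used.
    """
    w = [r / p for r, p in zip(observed, probs)]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, xs)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, xs, ys))
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, xs))
    return sxy / sxx


def simulate(n=20000, seed=0):
    """Simulate y = 1 + 2x + e with x observed with probability pi(y)."""
    rng = random.Random(seed)
    xs, ys, observed, probs = [], [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
        # Assumed known selection probability, bounded between 0.2 and 0.8.
        p = 0.2 + 0.6 / (1.0 + math.exp(-y))
        xs.append(x)
        ys.append(y)
        probs.append(p)
        observed.append(1 if rng.random() < p else 0)
    return xs, ys, observed, probs


xs, ys, observed, probs = simulate()
slope = ipw_slope(xs, ys, observed, probs)

if __name__ == "__main__":
    print("IPW slope estimate:", round(slope, 3))
```

Because the selection probability here depends on the response, an unweighted complete-case fit would generally be biased. Weighting each complete case by 1/π restores an unbiased estimating equation, which is the role the weights play in the estimators described above.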