Similar Articles (20 results)
1.
Variable selection is an important issue in all regression analysis, and in this paper we discuss it in the context of regression analysis of recurrent event data. Recurrent event data often occur in long-term studies in which individuals may experience the events of interest more than once, and their analysis has recently attracted a great deal of attention (Andersen et al., Statistical models based on counting processes, 1993; Cook and Lawless, Biometrics 52:1311–1323, 1996, The analysis of recurrent event data, 2007; Cook et al., Biometrics 52:557–571, 1996; Lawless and Nadeau, Technometrics 37:158–168, 1995; Lin et al., J R Stat Soc B 62:711–730, 2000). However, there seem to be no established approaches to variable selection for recurrent event data. For this problem, we adopt the idea behind the nonconcave penalized likelihood approach proposed in Fan and Li (J Am Stat Assoc 96:1348–1360, 2001) and develop a nonconcave penalized estimating function approach. The proposed approach selects variables and estimates regression coefficients simultaneously, and an algorithm for this process is presented. We show that the proposed approach performs as well as the oracle procedure in that it yields estimates as if the correct submodel were known. Simulation studies conducted to assess the performance of the proposed approach suggest that it works well in practical situations. The proposed methodology is illustrated using data from a chronic granulomatous disease study.
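The penalty behind this estimating-function approach is the SCAD penalty of Fan and Li (2001). As a point of reference, a minimal sketch of that penalty function (with the conventional a = 3.7; the recurrent-event estimating-function machinery itself is not reproduced) might look like:

```python
import numpy as np

# Sketch of the SCAD penalty of Fan and Li (2001), evaluated elementwise;
# a = 3.7 is the value suggested in that paper.
def scad_penalty(theta, lam, a=3.7):
    t = np.abs(theta)
    quad = -(t**2 - 2 * a * lam * t + lam**2) / (2 * (a - 1))  # middle piece
    return np.where(t <= lam, lam * t,
                    np.where(t <= a * lam, quad, (a + 1) * lam**2 / 2))
```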

2.
On MSE of EBLUP
We consider Best Linear Unbiased Predictors (BLUPs) and Empirical Best Linear Unbiased Predictors (EBLUPs) under the general mixed linear model. The BLUP was proposed by Henderson (Ann Math Stat 21:309–310, 1950). Its formula involves unknown elements of the variance–covariance matrix of the random variables. If these elements are replaced by some type of estimator, we obtain the two-stage predictor called the EBLUP, which is model-unbiased (Kackar and Harville in Commun Stat A 10:1249–1261, 1981). Kackar and Harville (J Am Stat Assoc 79:853–862, 1984) give an approximation of the mean squared error (MSE) of the predictor and propose an estimator of the MSE. The MSE and estimators of the MSE are also studied by Prasad and Rao (J Am Stat Assoc 85:163–171, 1990), Datta and Lahiri (Stat Sin 10:613–627, 2000) and Das et al. (Ann Stat 32(2):818–840, 2004). In this paper we consider the BLUP proposed by Royall (J Am Stat Assoc 71:657–664, 1976). Ża̧dło (On unbiasedness of some EBLU predictor. Physica-Verlag, Heidelberg, pp 2019–2026, 2004) shows that Royall's BLUP may be treated as a generalisation of Henderson's and, under some assumptions, proves model unbiasedness of the EBLUP based on Royall's formula. We derive the formula of the approximate MSE of this EBLUP and its estimators. We prove that, under some assumptions, the approximation of the MSE is accurate to terms o(D^{-1}) and that the estimator of the MSE is approximately unbiased in the sense that its bias is o(D^{-1}), where D is the number of domains. The proof is based on the results of Datta and Lahiri (Stat Sin 10:613–627, 2000). Using these results we present an EBLUP based on a special case of the general linear model, together with the formula of its MSE, estimators of its MSE, and their performance in a Monte Carlo simulation study.
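For concreteness, a minimal numerical sketch of the Henderson-style BLUP under the mixed model y = Xb + Zu + e, assuming the covariance matrices G and R are known (plugging in estimated variance components instead is what yields the EBLUP), might be:

```python
import numpy as np

# Minimal sketch: BLUP for y = X b + Z u + e with u ~ N(0, G),
# e ~ N(0, R), assuming G and R known.
def blup(y, X, Z, G, R):
    V = Z @ G @ Z.T + R                                     # Var(y)
    Vinv = np.linalg.inv(V)
    beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)  # GLS fixed effects
    u = G @ Z.T @ Vinv @ (y - X @ beta)                     # BLUP of random effects
    return beta, u
```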

3.
In this paper, we consider the constructive representation of skewed distributions proposed by Ferreira and Steel (J Am Stat Assoc 101:823–829, 2006) and present its basic properties. We study the five versions of skew-normal distributions in this general setting. An appropriate empirical model for a skewed distribution is introduced. In the data analysis, we compare this empirical model with the other four versions of the skew-normal distribution via some reasonable criteria. It is shown that the proposed empirical model provides a better fit for density estimation.
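The Ferreira–Steel construction builds a skewed density as s(y) = f(y) p(F(y)), where f is a symmetric base density with cdf F and p is a density on (0, 1). A small illustrative sketch, with a normal base and a Beta skewing density chosen purely as assumptions for the example (not the paper's empirical model), is:

```python
import numpy as np
from scipy import stats

# Sketch of the Ferreira-Steel (2006) construction s(y) = f(y) * p(F(y)):
# a symmetric base density f (with cdf F) skewed by a density p on (0, 1).
def skewed_pdf(y, a=2.0, b=1.0):
    f = stats.norm.pdf(y)          # symmetric base density
    F = stats.norm.cdf(y)          # its distribution function
    return f * stats.beta.pdf(F, a, b)   # Beta skewing density on (0, 1)
```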

4.
Statistics and Computing - We present a novel method for the estimation of variance parameters in generalised linear mixed models. The method has its roots in Harville (J Am Stat Assoc...

5.
An alternative stochastic restricted Liu estimator in linear regression
In this paper, we introduce an alternative stochastic restricted Liu estimator for the vector of parameters in a linear regression model when additional stochastic linear restrictions on the parameter vector are assumed to hold. The new estimator is a generalization of the ordinary mixed estimator (OME) (Durbin in J Am Stat Assoc 48:799–808, 1953; Theil and Goldberger in Int Econ Rev 2:65–78, 1961; Theil in J Am Stat Assoc 58:401–414, 1963) and the Liu estimator proposed by Liu (Commun Stat Theory Methods 22:393–402, 1993). Necessary and sufficient conditions for the superiority of the new stochastic restricted Liu estimator over the OME, the Liu estimator and the estimator proposed by Hubert and Wijekoon (Stat Pap 47:471–479, 2006) in the mean squared error matrix (MSEM) sense are derived. Furthermore, a numerical example based on the widely analysed dataset on Portland cement (Woods et al. in Ind Eng Chem 24:1207–1241, 1932) and a Monte Carlo evaluation of the estimators are also given to illustrate some of the theoretical results.
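As background, a sketch of the plain (non-stochastic-restricted) Liu estimator of Liu (1993), beta_d = (X'X + I)^{-1}(X'X + dI) beta_OLS, which the proposed estimator generalizes by mixing in stochastic restrictions, might be:

```python
import numpy as np

# Sketch of the Liu (1993) estimator for linear regression; the
# stochastic restricted version in the paper additionally incorporates
# prior information r = R beta + e, which is not shown here.
def liu_estimator(X, y, d=0.5):
    XtX = X.T @ X
    beta_ols = np.linalg.solve(XtX, X.T @ y)
    p = XtX.shape[0]
    return np.linalg.solve(XtX + np.eye(p), (XtX + d * np.eye(p)) @ beta_ols)
```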

6.
Statistical Methods & Applications - Semiparametric likelihoods for regression models with missing at random data (Chen in J Am Stat Assoc 99:1176–1189, 2004, Zhang and Rockette in J Stat...

7.
In empirical Bayes inference one is typically interested in sampling from the posterior distribution of a parameter with a hyper-parameter set to its maximum likelihood estimate. This is often problematic, particularly when the likelihood function of the hyper-parameter is not available in closed form and the posterior distribution is intractable. Previous works have dealt with this problem using a multi-step approach based on the EM algorithm and Markov chain Monte Carlo (MCMC). We propose a framework based on recent developments in adaptive MCMC, where this problem is addressed more efficiently using a single Monte Carlo run. We discuss the convergence of the algorithm and its connection with the EM algorithm. We apply our algorithm to the Bayesian Lasso of Park and Casella (J. Am. Stat. Assoc. 103:681–686, 2008) and to the empirical Bayes variable selection of George and Foster (Biometrika 87:731–747, 2000).
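As a toy illustration of the single-run idea (an assumption-laden sketch, not the paper's algorithm), one can alternate a posterior draw with a Robbins-Monro update of the hyper-parameter in a conjugate normal-means model, where the result can be checked against the closed-form marginal MLE:

```python
import numpy as np

# Toy single-run sketch: empirical Bayes in y_i ~ N(theta_i, 1),
# theta_i ~ N(0, v). Alternate a draw from p(theta | y, v) with a
# stochastic-approximation (SAEM-style) update of v; the run should
# approach the marginal MLE, here mean(y^2) - 1. Model, step sizes
# and iteration count are all assumptions of this sketch.
rng = np.random.default_rng(0)
y = rng.normal(2.0, 1.5, size=200)

v = 1.0
for t in range(1, 10001):
    post_var = v / (v + 1.0)                      # posterior variance of theta_i
    theta = rng.normal(post_var * y, np.sqrt(post_var))
    a_t = t ** -0.6                               # sum a_t = inf, sum a_t^2 < inf
    v = (1 - a_t) * v + a_t * np.mean(theta**2)   # stochastic EM-style update

print(v, np.mean(y**2) - 1.0)                     # the two should be close
```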

8.
Penalized variable selection methods have been extensively studied for standard time-to-event data. Such methods cannot be directly applied when subjects are at risk of multiple mutually exclusive events, known as competing risks. The proportional subdistribution hazard (PSH) model proposed by Fine and Gray (J Am Stat Assoc 94:496–509, 1999) has become a popular semi-parametric model for time-to-event data with competing risks, as it allows direct assessment of covariate effects on the cumulative incidence function. In this paper, we propose a general penalized variable selection strategy that simultaneously handles variable selection and parameter estimation in the PSH model. We rigorously establish the asymptotic properties of the proposed penalized estimators and modify the coordinate descent algorithm for implementation. Simulation studies demonstrate the good performance of the proposed method, and data on deceased-donor kidney transplants from the United Network for Organ Sharing illustrate its utility.
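The modified coordinate descent algorithm operates on the PSH partial likelihood; purely for illustration, the familiar soft-thresholding update on a squared-error loss (an assumption made to keep the sketch short, with standardized columns of X) looks like:

```python
import numpy as np

# Illustrative coordinate-descent LASSO update; the paper adapts this
# style of algorithm to the Fine-Gray partial likelihood. Assumes the
# columns of X are standardized so that X[:, j] @ X[:, j] == n.
def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def cd_lasso(X, y, lam, n_iter=100):
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ beta + X[:, j] * beta[j]      # partial residual
            beta[j] = soft_threshold(X[:, j] @ r_j / n, lam)
    return beta
```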

9.
Rasul A. Khan, Statistics, 2015, 49(3):705–710
Let X1, X2, …, Xn be iid N(μ, aμ²) (a > 0) random variables with an unknown mean μ > 0 and known coefficient of variation (CV) √a. The estimation of μ is revisited, and it is shown that a modified version of the unbiased estimator of μ [cf. Khan RA. A note on estimating the mean of a normal distribution with known CV. J Am Stat Assoc. 1968;63:1039–1041] is more efficient. A certain linear minimum mean square error estimator of Gleser and Healy [Estimating the mean of a normal distribution with known CV. J Am Stat Assoc. 1976;71:977–981] is also modified and improved. These improved estimators are compared with the maximum likelihood estimator under a squared-error loss function. Based on asymptotic considerations, a large-sample confidence interval is also given.
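A quick simulation check of the general phenomenon exploited here (a sketch, not Khan's exact estimator): when the squared CV a is known, shrinking the sample mean by c = n/(n + a) already reduces the MSE below that of the sample mean itself:

```python
import numpy as np

# For N(mu, a*mu^2) with known a, the scaled mean c*xbar with
# c = n/(n + a) minimizes MSE among multiples of xbar.
rng = np.random.default_rng(1)
a, mu, n, reps = 0.25, 10.0, 20, 100_000
x = rng.normal(mu, np.sqrt(a) * mu, size=(reps, n))
xbar = x.mean(axis=1)
c = n / (n + a)
print("MSE(xbar)   :", np.mean((xbar - mu) ** 2))
print("MSE(c*xbar) :", np.mean((c * xbar - mu) ** 2))   # strictly smaller
```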

10.
Four testing procedures are considered for testing the response rate of one-sample correlated binary data with a cluster size of one or two, which often occurs in otolaryngologic and ophthalmologic studies. Although an asymptotic approach is often used for statistical inference, it is criticized for unsatisfactory type I error control in small-sample settings. An alternative is an unconditional approach. The first unconditional approach is based on estimation, also known as the parametric bootstrap (Lee and Young in Stat Probab Lett 71(2):143–153, 2005). The other two unconditional approaches considered in this article are an approach based on maximization (Basu in J Am Stat Assoc 72(358):355–366, 1977) and an approach based on estimation and maximization (Lloyd in Biometrics 64(3):716–723, 2008a). These two unconditional approaches guarantee the test size and are generally more reliable than the asymptotic approach. We compare the four approaches, in conjunction with a test proposed by Lee and Dubin (Stat Med 13(12):1241–1252, 1994) and a likelihood ratio test derived in this article, with regard to type I error rate and power for sample sizes from small to medium. An example from an otolaryngologic study illustrates the various testing procedures. The unconditional approach based on estimation and maximization using the test of Lee and Dubin (Stat Med 13(12):1241–1252, 1994) is preferable due to its power advantage.
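A generic sketch of the estimation-based (parametric bootstrap) approach, with `simulate_null` and `statistic` as hypothetical placeholders for the fitted null model and the chosen test statistic, might be:

```python
import numpy as np

# Generic parametric bootstrap p-value: estimate nuisance parameters
# under H0 (inside `simulate_null`), simulate B datasets from the
# fitted null model, and compare the observed statistic against them.
def bootstrap_pvalue(data, simulate_null, statistic, B=2000, seed=0):
    rng = np.random.default_rng(seed)
    t_obs = statistic(data)
    t_sim = np.array([statistic(simulate_null(rng)) for _ in range(B)])
    return np.mean(t_sim >= t_obs)
```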

11.
The randomized response technique (RRT) is an important tool commonly used to protect a respondent's privacy and avoid biased answers in surveys on sensitive issues. In this work, we consider the joint use of the unrelated-question RRT of Greenberg et al. (J Am Stat Assoc 64:520–539, 1969) and the related-question RRT of Warner (J Am Stat Assoc 60:63–69, 1965) to deal with the innocuous question in the unrelated-question RRT. Unlike the existing unrelated-question RRT of Greenberg et al. (1969), the approach provides more information on the innocuous question by using the related-question RRT of Warner (1965), effectively improving the efficiency of the maximum likelihood estimator of Scheers and Dayton (J Am Stat Assoc 83:969–974, 1988). We can then estimate the prevalence of the sensitive characteristic using logistic regression. For this new design, we propose a transformation method and derive large-sample properties. Motivated by two survey studies, an extramarital relationship study and a cable TV study, we develop a joint conditional likelihood method. As part of this research, we conduct a simulation study of the relative efficiencies of the proposed methods. Furthermore, we use the two survey studies to compare the analysis results under different scenarios.
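As a reminder of the related-question building block, a sketch of Warner's (1965) moment estimator and its usual variance estimate is:

```python
import numpy as np

# Warner (1965): each respondent answers "I belong to A" with probability
# p and "I do not belong to A" with probability 1-p, so
# P(yes) = p*pi + (1-p)*(1-pi), from which pi is unbiasedly recovered.
def warner_estimate(yes, n, p):
    assert p != 0.5, "p = 1/2 leaves pi unidentified"
    theta_hat = yes / n                      # observed proportion of "yes"
    pi_hat = (theta_hat - (1 - p)) / (2 * p - 1)
    var_hat = theta_hat * (1 - theta_hat) / (n * (2 * p - 1) ** 2)
    return pi_hat, var_hat
```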

12.
Recurrent event data occur in many clinical and observational studies (Cook and Lawless, Analysis of recurrent event data, 2007), and in these situations there may exist a terminal event, such as death, that is related to the recurrent event of interest (Ghosh and Lin, Biometrics 56:554–562, 2000; Wang et al., J Am Stat Assoc 96:1057–1065, 2001; Huang and Wang, J Am Stat Assoc 99:1153–1165, 2004; Ye et al., Biometrics 63:78–87, 2007). In addition, there may sometimes exist more than one type of recurrent event; that is, one faces multivariate recurrent event data with a dependent terminal event (Chen and Cook, Biostatistics 5:129–143, 2004). It is apparent that the analysis of such data has to take into account the dependence both among the different types of recurrent events and between the recurrent and terminal events. In this paper, we propose a joint modeling approach for regression analysis of such data, and both finite-sample and asymptotic properties of the resulting parameter estimates are established. The methodology is applied to a set of bivariate recurrent event data arising from a study of leukemia patients.

13.
Clinical studies aimed at identifying effective treatments to reduce the risk of disease or death often require long-term follow-up of participants in order to observe a sufficient number of events to precisely estimate the treatment effect. In such studies, observing the outcome of interest during follow-up may be difficult, and high rates of censoring may be observed, which often leads to reduced power when applying straightforward statistical methods developed for time-to-event data. Alternative methods have been proposed to take advantage of auxiliary information that may potentially improve efficiency when estimating marginal survival and improve power when testing for a treatment effect. Recently, Parast et al. (J Am Stat Assoc 109(505):384–394, 2014) proposed a landmark estimation procedure for the estimation of survival and treatment effects in a randomized clinical trial setting and demonstrated that significant gains in efficiency and power can be obtained by incorporating intermediate event information as well as baseline covariates. However, the procedure requires the assumption that the potential outcomes for each individual under treatment and control are independent of treatment group assignment, which is unlikely to hold in an observational study setting. In this paper we develop the landmark estimation procedure for use in an observational setting. In particular, we incorporate inverse probability of treatment weights (IPTW) in the landmark estimation procedure to account for selection bias on observed baseline (pre-treatment) covariates. We demonstrate that consistent estimates of survival and treatment effects can be obtained by using IPTW, and that efficiency is improved by using auxiliary intermediate event and baseline information. We compare our proposed estimates to those obtained using the Kaplan–Meier estimator, the original landmark estimation procedure, and the IPTW Kaplan–Meier estimator. We illustrate the resulting reduction in bias and gains in efficiency through a simulation study and apply our procedure to an AIDS dataset to examine the effect of previous antiretroviral therapy on survival.
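The IPTW ingredient can be sketched as follows, assuming a logistic propensity model fitted with statsmodels (the landmark-estimation machinery itself is not reproduced):

```python
import numpy as np
import statsmodels.api as sm

# Sketch of the IPTW step: estimate propensity scores with a logistic
# model for treatment given baseline covariates X, then weight each
# subject by the inverse probability of the treatment actually received.
def iptw_weights(treat, X):
    exog = sm.add_constant(X)
    ps = sm.Logit(treat, exog).fit(disp=0).predict(exog)   # propensity scores
    return np.where(treat == 1, 1.0 / ps, 1.0 / (1.0 - ps))
```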

14.
A class of tests due to Shoemaker (Commun Stat Simul Comput 28:189–205, 1999) for differences in scale, valid for a variety of both skewed and symmetric distributions when location is known or unknown, is considered. The class is based on the interquantile range and requires that the population variances be finite. In this paper, we firstly propose a permutation version of it that does not require the condition of finite variances and is remarkably more powerful than the original. Secondly, we address the question of which quantile to choose by proposing a combined interquantile test based on our permutation version of the Shoemaker tests. Shoemaker showed that the more extreme interquantile range tests are more powerful than the less extreme ones, unless the underlying distributions are very highly skewed; since in practice one may not know whether the underlying distributions are very highly skewed, the question of which test to use arises. The combined interquantile test resolves this question, is robust, and is more powerful than the stand-alone tests. Thirdly, we conducted a much more detailed simulation study than that of Shoemaker (1999), which compared his tests with the F test and the squared rank test and showed that his tests are better. Since the F test and the squared rank test perform poorly for differences in scale, his results suffer from this drawback; for this reason, instead of the squared rank test we consider, following the suggestions of several authors, the tests of Brown–Forsythe (J Am Stat Assoc 69:364–367, 1974), Pan (J Stat Comput Simul 63:59–71, 1999), O'Brien (J Am Stat Assoc 74:877–880, 1979) and Conover et al. (Technometrics 23:351–361, 1981).
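A minimal sketch of the permutation idea for an interquantile-range scale test, assuming both samples have already been centred and with the 10%/90% quantile pair chosen arbitrarily for illustration, is:

```python
import numpy as np

# Permutation test for a scale difference based on an interquantile
# range statistic. Assumes both samples are centred (location known or
# removed); the quantile pair is an illustrative assumption, whereas
# the paper studies several pairs and their combination.
def iqr_stat(x, lo=0.10, hi=0.90):
    q = np.quantile(x, [lo, hi])
    return q[1] - q[0]

def perm_scale_test(x, y, B=5000, seed=0):
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    t_obs = iqr_stat(x) / iqr_stat(y)
    count = 0
    for _ in range(B):
        perm = rng.permutation(pooled)
        t = iqr_stat(perm[:len(x)]) / iqr_stat(perm[len(x):])
        count += max(t, 1 / t) >= max(t_obs, 1 / t_obs)   # two-sided via ratio
    return count / B
```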

15.
In estimating the proportion of people in a given community bearing a sensitive attribute A, say, certain randomized response (RR) techniques are available for application, following Warner's (J Am Stat Assoc 60:63–69, 1965) pioneering work. These are intended to ensure efficient and unbiased estimation while protecting a respondent's privacy when the question touches a socially stigmatizing feature such as rash driving, tax evasion, induced abortion, or testing HIV positive. Lanke (Int Stat Rev 44:197–203, 1976), Leysieffer and Warner (J Am Stat Assoc 71:649–656, 1976), Anderson (Int Stat Rev 44:213–217, 1976; Scand J Stat 4:11–19, 1977) and Nayak (Commun Stat Theor Method 23:3303–3321, 1994), among others, have discussed how maintenance of efficiency conflicts with protection of privacy. In the RR literature, sample selection is traditionally by simple random sampling (SRS) with replacement (WR). In this paper, an extension of an essential similarity to general unequal probability sample selection, even without replacement, is reported. Large-scale surveys overwhelmingly employ complex designs other than SRSWR, so extending RR techniques to complex designs is essential, and this paper principally addresses them. The new jeopardy measures presented here for protecting against the revelation of secrets are needed as modifications of those in the literature, which cover SRSWR alone. Observing that multiple responses are feasible in such a dichotomous situation, especially with Kuk's (Biometrika 77:436–438, 1990) and Christofides' (Metrika 57:195–200, 2003) RR devices, an average of the response-specific jeopardy measures is proposed. This measure, which is device dependent, can be regarded as a technical characteristic of the device and should be made known to participants before they agree to use the randomization device. The views expressed are the authors', not of the organizations they work for. Prof Chaudhuri's research is partially supported by CSIR Grant No. 21(0539)/02/EMR-II.

16.
This article deals with new profile empirical-likelihood inference for a class of frequently used single-index-coefficient regression models (SICRM), proposed by Xia and Li (J Am Stat Assoc 94:1275–1285, 1999a). Applying the empirical likelihood method (Owen in Biometrika 75:237–249, 1988), a new estimated empirical log-likelihood ratio statistic for the index parameter of the SICRM is proposed. To increase the accuracy of the confidence region, a new profile empirical likelihood for each component of the relevant parameter is obtained using maximum empirical likelihood estimators (MELE) based on a new and simple estimating equation for the parameters in the SICRM. Hence, the empirical likelihood confidence interval for each component is investigated. Furthermore, corrected empirical likelihoods for functional components are also considered. The resulting statistics are shown to be asymptotically standard chi-squared distributed. Simulation studies are undertaken to assess the finite-sample performance of our method. A study of real data is also reported.
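The building block behind such profile statistics is Owen's empirical log-likelihood ratio; a sketch for a scalar mean (the simplest case, not the SICRM statistic itself) is:

```python
import numpy as np
from scipy.optimize import brentq

# Owen (1988): -2 log empirical likelihood ratio for a mean, which is
# asymptotically chi-squared with 1 df under H0. The weights are
# p_i = 1 / (n * (1 + lam * (x_i - mu))) with lam solving g(lam) = 0.
def el_log_ratio(x, mu):
    z = np.asarray(x, dtype=float) - mu
    if z.min() >= 0 or z.max() <= 0:
        return np.inf                          # mu outside the convex hull
    lo, hi = -1.0 / z.max(), -1.0 / z.min()    # keep all 1 + lam*z_i > 0
    pad = 1e-8 * (hi - lo)
    g = lambda lam: np.sum(z / (1.0 + lam * z))
    lam = brentq(g, lo + pad, hi - pad)        # g is monotone on (lo, hi)
    return 2.0 * np.sum(np.log1p(lam * z))
```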

17.
The simulation-extrapolation (SIMEX) approach of Cook and Stefanski (J Am Stat Assoc 89:1314–1328, 1994) has proved successful in obtaining reliable estimates when variables are measured with (additive) error. In particular for nonlinear models, this approach has advantages over procedures such as the instrumental variable approach when only error-prone variables are available. However, it has always been assumed that measurement errors in the dependent variable are uncorrelated with those in the explanatory variables, although such a scenario is quite likely. In that case the (standard) SIMEX suffers from misspecification even for the simple linear regression model. Our paper reports first results from a generalized SIMEX (GSIMEX) approach that takes this correlation into account. We also demonstrate in our simulation study that neglecting the correlation leads to estimates which may be worse than those from the naive estimator that completely disregards measurement errors.
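A minimal sketch of the standard SIMEX of Cook and Stefanski (1994) for a regression slope under additive measurement error with known variance su2 (the GSIMEX correction for correlated errors is not shown) might be:

```python
import numpy as np

# Standard SIMEX sketch: inflate the measurement error by factor
# (1 + lam), average the naive slope fits over B replicates, then
# extrapolate the trend back to lam = -1 (the no-error case).
def simex_slope(w, y, su2, lams=(0.5, 1.0, 1.5, 2.0), B=200, seed=0):
    rng = np.random.default_rng(seed)
    grid, means = [0.0], [np.polyfit(w, y, 1)[0]]        # naive fit at lam = 0
    for lam in lams:
        fits = [np.polyfit(w + rng.normal(0, np.sqrt(lam * su2), len(w)), y, 1)[0]
                for _ in range(B)]
        grid.append(lam)
        means.append(np.mean(fits))
    coefs = np.polyfit(grid, means, 2)                   # quadratic extrapolant
    return np.polyval(coefs, -1.0)                       # evaluate at lam = -1
```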

18.
In this paper, we study the MDPDE (minimum density power divergence estimator), proposed by Basu et al. (Biometrika 85:549–559, 1998), for mixing distributions whose component densities are members of a known parametric family. As with the ordinary MDPDE, we also consider a penalized version of the estimator, and show that both are consistent in the sense of weak convergence. A simulation result is provided to illustrate the robustness. Finally, we apply the penalized method to an analysis of the red blood cell SLC data presented in Roeder (J Am Stat Assoc 89:487–495, 1994). This research was supported (in part) by KOSEF through the Statistical Research Center for Complex Systems at Seoul National University.
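A sketch of the MDPDE objective of Basu et al. (1998) for a plain normal model (the mixture setting of the paper is more involved, and alpha = 0.5 is an arbitrary illustrative choice) is:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# MDPDE objective: integral(f^(1+alpha)) - (1 + 1/alpha)*mean(f^alpha(X_i)).
# For N(m, s^2), integral(f^(1+alpha)) = (2*pi*s^2)^(-alpha/2) / sqrt(1+alpha).
def mdpde_normal(x, alpha=0.5):
    def obj(par):
        m, log_s = par
        s = np.exp(log_s)                     # keep the scale positive
        integral = (2 * np.pi * s**2) ** (-alpha / 2) / np.sqrt(1 + alpha)
        emp = np.mean(norm.pdf(x, m, s) ** alpha)
        return integral - (1 + 1 / alpha) * emp
    res = minimize(obj, x0=[np.median(x), np.log(np.std(x))])
    m, log_s = res.x
    return m, np.exp(log_s)
```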

19.
This work studies outlier detection and robust estimation with data that are naturally grouped and that follow, approximately, a linear regression model with fixed group effects. Several methods are considered. First, the robust fitting method of Peña and Yohai [A fast procedure for outlier diagnostics in large regression problems. J Am Stat Assoc. 1999;94:434–445], called the principal sensitivity components (PSC) method, is adapted to the grouped data structure and the model above. The robust methods RDL1 of Hubert and Rousseeuw [Robust regression with both continuous and binary regressors. J Stat Plan Inference. 1997;57:153–163] and M-S of Maronna and Yohai [Robust regression with both continuous and categorical predictors. J Stat Plan Inference. 2000;89:197–214] are also considered. These three methods are compared in terms of their effectiveness in outlier detection and their robustness through simulations under several contamination scenarios and growing contamination levels. Results indicate that the adapted PSC procedure is able to detect a high percentage of true outliers and a small number of false outliers. It is appropriate when the contamination is in the error term or in the covariates, also detecting possibly masked high-leverage points. Moreover, in simulations the final robust regression estimator preserved good efficiency under normality while keeping good robustness properties.

20.
In order to guarantee confidentiality and privacy of firm-level data, statistical offices apply various disclosure limitation techniques. However, each anonymization technique has its protection limits, such that the probability of disclosing individual information for some observations is not minimized. To overcome this problem, we propose combining two separate disclosure limitation techniques, blanking and multiplication by independent noise, in order to protect the original dataset. The proposed approach yields a decrease in the probability of re-identifying/disclosing individual information and can be applied to linear and nonlinear regression models. We show how to combine the blanking method with the multiplicative measurement error method, and how to estimate the model by combining the multiplicative Simulation-Extrapolation (M-SIMEX) approach of Nolte (2007) with, on the one side, the Inverse Probability Weighting (IPW) approach going back to Horvitz and Thompson (J Am Stat Assoc 47:663–685, 1952) and, on the other side, matching methods, as an alternative to IPW, like the semiparametric M-estimator proposed by Flossmann (2007). Based on Monte Carlo simulations, we show that multiplicative measurement error combined with blanking as a masking procedure does not necessarily lead to a severe reduction in estimation quality, provided that its effects on the data generating process are known.
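The Horvitz–Thompson idea behind the IPW step can be sketched in a couple of lines: weight each retained (non-blanked) record by the inverse of its inclusion (non-blanking) probability, so that totals stay unbiased despite the blanked records:

```python
import numpy as np

# Horvitz-Thompson (1952) estimator of a population total from the
# observed (non-blanked) records and their inclusion probabilities pi_i.
def horvitz_thompson_total(y_observed, pi_observed):
    return np.sum(y_observed / pi_observed)
```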
