Similar Articles
A total of 20 similar articles were found.
1.
In randomized clinical trials, we are often concerned with comparing two-sample survival data. Although the log-rank test is usually suitable for this purpose, it may result in a substantial loss of power when the two groups have nonproportional hazards. Working in the more general class of survival models of Yang and Prentice (Biometrika 92:1–17, 2005), which includes the log-rank test as a special case, we improve efficiency by incorporating auxiliary covariates that are correlated with the survival times. In a model-free form, we augment the estimating equation with auxiliary covariates and establish the efficiency improvement using the semiparametric theory of Zhang et al. (Biometrics 64:707–715, 2008) and Lu and Tsiatis (Biometrics 95:674–679, 2008). Under minimal assumptions, our approach produces an unbiased, asymptotically normal estimator with an additional efficiency gain. Simulation studies and an application to a leukemia study show the satisfactory performance of the proposed method.

2.
For a confidence interval (L(X), U(X)) of a parameter θ in one-parameter discrete distributions, the coverage probability varies with θ. The confidence coefficient is the infimum of the coverage probabilities, inf_θ P_θ(θ ∈ (L(X), U(X))). Since the point in the parameter space at which the infimum occurs is unknown, the exact confidence coefficient is also unknown. Besides the confidence coefficient, a confidence interval can be evaluated through its average coverage probability. Usually the exact average coverage probability is likewise unknown, and it has been approximated by the mean of the coverage probabilities at some randomly chosen points in the parameter space. In this article, methodologies for computing the exact average coverage probabilities as well as the exact confidence coefficients of confidence intervals for one-parameter discrete distributions are proposed. With these methodologies, both exact values can be derived.
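As a point of reference, the sketch below (a hypothetical illustration, not the exact computation proposed in the article) grid-approximates the coverage probability, the confidence coefficient and the average coverage of the Clopper–Pearson interval for a Binomial(n, p) proportion; the sample size, grid and level are assumed for illustration only.

# Grid approximation of coverage probability, confidence coefficient and
# average coverage for the Clopper-Pearson binomial interval (illustrative only).
import numpy as np
from scipy.stats import binom, beta

def clopper_pearson(x, n, alpha=0.05):
    lo = beta.ppf(alpha / 2, x, n - x + 1) if x > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, x + 1, n - x) if x < n else 1.0
    return lo, hi

def coverage(p, n, alpha=0.05):
    # exact coverage at a single p: sum the binomial pmf over x whose interval covers p
    cov = 0.0
    for x in range(n + 1):
        lo, hi = clopper_pearson(x, n, alpha)
        if lo <= p <= hi:
            cov += binom.pmf(x, n, p)
    return cov

n = 30
grid = np.linspace(0.001, 0.999, 999)          # grid approximation, not an exact search
covs = np.array([coverage(p, n) for p in grid])
print("approx. confidence coefficient:", covs.min())
print("approx. average coverage      :", covs.mean())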

3.
The focus of this article is on simultaneous confidence bands over a rectangular covariate region for a linear regression model with k > 1 covariates, for which only conservative or approximate confidence bands are available in the statistical literature stretching back to Working & Hotelling (J. Amer. Statist. Assoc. 24:73–85, 1929). Formulas for the simultaneous confidence levels of the hyperbolic and constant width bands are provided. These involve only a k-dimensional integral; it is unlikely that the simultaneous confidence levels can be expressed as an integral of dimension less than k. These formulas allow the construction, for the first time, of exact hyperbolic and constant width confidence bands for at least small k (>1) by using numerical quadrature. Comparison between the hyperbolic and constant width bands is then addressed under both the average width and minimum volume confidence set criteria. It is observed that the constant width band can be drastically less efficient than the hyperbolic band when k > 1. Finally, it is pointed out how the methods given in this article can be applied to more general regression models such as fixed-effect or random-effect generalized linear regression models.
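For intuition only, the following Monte Carlo sketch (assumed toy design and grid; it uses a conservative Scheffé-type constant rather than the exact constants derived in the article) estimates the simultaneous coverage of a hyperbolic band over a rectangular region with k = 2 covariates by approximating the supremum over the region on a fine grid.

# Simulated simultaneous coverage of a Scheffe-type hyperbolic band over [-1,1]^2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, k = 40, 2
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, (n, k))])   # assumed design
XtX_inv = np.linalg.inv(X.T @ X)

# grid over the rectangular covariate region [-1, 1]^2, with intercept appended
g = np.linspace(-1, 1, 41)
G = np.column_stack([np.ones(41 * 41), np.repeat(g, 41), np.tile(g, 41)])
half_width = np.sqrt(np.einsum("ij,jk,ik->i", G, XtX_inv, G))    # sqrt(x'(X'X)^{-1}x)

c = np.sqrt((k + 1) * stats.f.ppf(0.95, k + 1, n - k - 1))       # conservative Scheffe constant
cover = 0
for _ in range(5000):
    bhat_err = rng.multivariate_normal(np.zeros(k + 1), XtX_inv) # beta_hat - beta, sigma = 1
    sigma_hat = np.sqrt(rng.chisquare(n - k - 1) / (n - k - 1))
    sup_t = np.max(np.abs(G @ bhat_err) / (sigma_hat * half_width))
    cover += sup_t <= c
print("simulated simultaneous coverage of the Scheffe-type band:", cover / 5000)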

4.
The cumulative incidence function provides intuitive summary information about competing risks data. Via a mixture decomposition of this function, Chang and Wang (Statist. Sinica 19:391–408, 2009) study how covariates affect the cumulative incidence probability of a particular failure type at a chosen time point. Without specifying the corresponding failure time distribution, they propose two estimators and derive their large-sample properties. The first estimator uses weighting to adjust for censoring bias and can be considered an extension of Fine's method (J R Stat Soc Ser B 61:817–830, 1999). The second uses imputation and extends the idea of Wang (J R Stat Soc Ser B 65:921–935, 2003) from a nonparametric setting to the current regression framework. In this article, when covariates take only discrete values, we extend both approaches of Chang and Wang (Statist. Sinica 19:391–408, 2009) by allowing left truncation. Large-sample properties of the proposed estimators are derived, and their finite-sample performance is investigated through a simulation study. We also apply our methods to heart transplant survival data.

5.
In many complex diseases such as cancer, a patient undergoes various disease stages before reaching a terminal state (say, disease free or death). This fits a multistate model framework in which a prognosis may amount to predicting the state occupation at a future time t. With the advent of high-throughput genomic and proteomic assays, a clinician may intend to use such high-dimensional covariates to make better predictions of state occupation. In this article, we offer a practical solution to this problem by combining a useful technique, called pseudo-value (PV) regression, with a latent factor or penalized regression method such as partial least squares (PLS) or the least absolute shrinkage and selection operator (LASSO), or their variants. We explore the predictive performance of these combinations in various high-dimensional settings via extensive simulation studies. Overall, this strategy works fairly well provided the models are tuned properly, and PLS turns out to be slightly better than LASSO in most settings we investigated for the purpose of temporal prediction of future state occupation. We illustrate the utility of these PV-based high-dimensional regression methods using a lung cancer data set in which we use the patients' baseline gene expression values.
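The sketch below is a hypothetical, stripped-down illustration of the PV idea for the simplest two-state (alive/dead) case on simulated data: jackknife pseudo-values of the Kaplan–Meier survival probability at a time point t0 are regressed on high-dimensional covariates with a LASSO. The data, time point and tuning are assumptions; the article's general multistate PV regression and its PLS variant are not reproduced here.

# Pseudo-value regression for a two-state model on toy data (illustrative only).
import numpy as np
from lifelines import KaplanMeierFitter
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, t0 = 200, 50, 2.0
X = rng.normal(size=(n, p))                        # stand-in for gene expression values
time = rng.exponential(scale=np.exp(0.5 * X[:, 0]))
event = rng.random(n) < 0.8                        # toy event indicator, not a realistic censoring scheme

def km_surv_at(t, times, events):
    km = KaplanMeierFitter().fit(times, events)
    return float(km.predict(t))

# jackknife pseudo-values: PV_i = n*S(t0) - (n-1)*S_{-i}(t0)
S_full = km_surv_at(t0, time, event)
pv = np.array([n * S_full - (n - 1) * km_surv_at(t0, np.delete(time, i), np.delete(event, i))
               for i in range(n)])

# regress the pseudo-values on the high-dimensional covariates with a LASSO
model = LassoCV(cv=5).fit(X, pv)
print("number of selected covariates:", np.sum(model.coef_ != 0))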

6.
In view of its ongoing importance for a variety of practical applications, feature selection via ℓ1-regularization methods like the lasso has been subject to extensive theoretical as well as empirical investigation. Despite its popularity, mere ℓ1-regularization has been criticized for being inadequate or ineffective, notably in situations in which additional structural knowledge about the predictors should be taken into account. This has stimulated the development of either systematically different regularization methods or double regularization approaches which combine ℓ1-regularization with a second kind of regularization designed to capture additional problem-specific structure. One instance thereof is the 'structured elastic net', a generalization of the proposal in Zou and Hastie (J. R. Stat. Soc. Ser. B 67:301–320, 2005), studied in Slawski et al. (Ann. Appl. Stat. 4(2):1056–1080, 2010) for the class of generalized linear models.

7.
Clusters of galaxies are a useful proxy to trace the distribution of mass in the universe. By measuring the mass of clusters of galaxies on different scales, one can follow the evolution of the mass distribution (Martínez and Saar, Statistics of the Galaxy Distribution, 2002). It can be shown that finding galaxy clusters is equivalent to finding density contour clusters (Hartigan, Clustering Algorithms, 1975): connected components of the level set S_c ≡ {f > c}, where f is a probability density function. Cuevas et al. (Can. J. Stat. 28:367–382, 2000; Comput. Stat. Data Anal. 36:441–459, 2001) proposed a nonparametric method that finds density contour clusters by the minimal spanning tree. While their algorithm is conceptually simple, it requires intensive computation for large datasets. We propose a more efficient clustering method based on their algorithm using the Fast Fourier Transform (FFT). The method is applied to a study of galaxy clustering on large astronomical sky survey data.
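The following toy sketch (a hypothetical illustration on simulated 2-D data, not the authors' algorithm) conveys the basic idea: estimate the density on a grid with an FFT-based kernel smoother, threshold it at a level c, and take the connected components of {f > c} as density contour clusters. The bandwidth and level are assumed, not data-driven.

# FFT-based density estimation on a grid followed by connected components of {f > c}.
import numpy as np
from scipy.signal import fftconvolve
from scipy.ndimage import label

rng = np.random.default_rng(1)
pts = np.vstack([rng.normal([-2, 0], 0.5, (300, 2)),
                 rng.normal([2, 0], 0.5, (300, 2))])

# bin the points on a regular grid
bins = 128
H, xe, ye = np.histogram2d(pts[:, 0], pts[:, 1], bins=bins, range=[[-5, 5], [-5, 5]])

# Gaussian kernel on the same grid spacing, applied by FFT convolution
h = 0.3                                            # bandwidth (assumed, untuned)
dx = xe[1] - xe[0]
ax = np.arange(-3 * h, 3 * h + dx, dx)
K = np.exp(-0.5 * (ax[:, None] ** 2 + ax[None, :] ** 2) / h ** 2)
f = fftconvolve(H, K, mode="same")
f /= f.sum() * dx * dx                             # normalize to a density estimate

c = 0.5 * f.max()                                  # illustrative level, not data-driven
labels, n_clusters = label(f > c)                  # connected components of {f > c}
print("density contour clusters found:", n_clusters)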

8.
We consider a linear regression model in which the covariates have a group structure. The group LASSO has been proposed for group variable selection, and many nonconvex penalties such as the smoothly clipped absolute deviation (SCAD) and the minimax concave penalty (MCP) have been extended to group variable selection problems. The group coordinate descent (GCD) algorithm is popular for fitting these models. However, GCD algorithms are hard to apply to nonconvex group penalties because of computational complexity unless the design matrix is orthogonal. In this paper, we propose an efficient optimization algorithm for nonconvex group penalties by combining the concave-convex procedure with the group LASSO algorithm. We also extend the proposed algorithm to generalized linear models. We evaluate the numerical efficiency of the proposed algorithm compared to existing GCD algorithms on simulated data and real data sets.
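For concreteness, here is a hypothetical sketch of one building block: a group coordinate descent pass for the convex group LASSO, assuming each group's columns are (approximately) orthonormal so that the block update is a closed-form group soft-threshold. A concave-convex treatment of SCAD or MCP, as in the article, would repeatedly solve weighted problems of roughly this form; that outer loop is not shown.

# Group coordinate descent for the convex group LASSO (illustrative sketch).
import numpy as np

def group_lasso_gcd(X, y, groups, lam, n_iter=100):
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for g in groups:                               # g is an index array for one group
            r = y - X @ beta + X[:, g] @ beta[g]       # partial residual for this group
            z = X[:, g].T @ r / n
            norm_z = np.linalg.norm(z)
            # group soft-threshold (valid when X[:, g].T @ X[:, g] / n is the identity)
            beta[g] = max(0.0, 1.0 - lam * np.sqrt(len(g)) / norm_z) * z if norm_z > 0 else 0.0
    return beta

# toy usage with assumed data
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 6))
X /= np.linalg.norm(X, axis=0) / np.sqrt(100)          # crude column scaling only
y = X[:, :3] @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=100)
groups = [np.arange(0, 3), np.arange(3, 6)]
print(group_lasso_gcd(X, y, groups, lam=0.1))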

9.
In this study, we consider the problem of selecting explanatory variables for the fixed effects in linear mixed models under covariate shift, which occurs when the values of covariates in the model for prediction differ from those in the model for the observed data. We construct a variable selection criterion based on the conditional Akaike information introduced by Vaida & Blanchard (2005). We focus especially on covariate shift in small area estimation and demonstrate the usefulness of the proposed criterion. In addition, numerical performance is investigated through simulations, one of which is a design-based simulation using a real dataset of land prices. The Canadian Journal of Statistics 46: 316–335; 2018 © 2018 Statistical Society of Canada

10.
11.
There are many situations where the usual random sample from a population of interest is not available, because the data have unequal probabilities of entering the sample. The method of weighted distributions models this ascertainment bias by adjusting the probabilities of actual occurrence of events to arrive at a specification of the probabilities of the events as observed and recorded. We consider two different classes of contaminated or mixture weight functions, Γ_a = {w(x): w(x) = (1 − ε)w_0(x) + ε q(x), q ∈ Q} and Γ_g = {w(x): w(x) = w_0^{1−ε}(x) q^{ε}(x), q ∈ Q}, where w_0(x) is the elicited weight function, Q is a class of positive functions and 0 ≤ ε ≤ 1 is a small number. We also study the local variation of the ϕ-divergence over the classes Γ_a and Γ_g. We focus on measuring robustness using divergence measures, based on the Bayesian approach. Two examples are studied.

12.
Releases of GDP data undergo a series of revisions over time. These revisions have an impact on the results of macroeconometric models, as documented by the growing literature on real-time data applications. Revisions of U.S. GDP data can be explained and are partly predictable, according to Faust et al. (J. Money Credit Bank. 37(3):403–419, 2005) and Fixler and Grimm (J. Product. Anal. 25:213–229, 2006). This analysis proposes the inclusion of mixed-frequency data for forecasting GDP revisions. Thereby, the information set available around the first data vintage can be exploited better than with purely quarterly data. In-sample and out-of-sample results suggest that forecasts of GDP revisions can be improved by using mixed-frequency data.

13.
When a (p+q)-variate column vector (x′, y′)′ has a (p+q)-variate normal density with mean vector (μ1′, μ2′)′ and unknown covariance matrix Ω, Schervish (1980) obtains prediction intervals for linear functions of a future y, given x. He bases the prediction interval on the F-distribution. However, for a specified linear function the statistic to be used is Student's t, since the prediction intervals based on t are shorter than those based on F. Similar results hold for the multivariate linear regression model.

14.
In this paper, we consider a non-penalty shrinkage estimation method for random-effects models with autoregressive errors for longitudinal data when there are many covariates and some of them may not be active for the response variable. In observational studies, subjects are followed over equally or unequally spaced visits to determine the continuous response and whether the response is associated with the risk factors/covariates. Measurements from the same subject are usually more similar to each other, and thus are correlated with each other but not with observations of other subjects. To analyse such data, we consider a linear model that contains both random effects across subjects and within-subject errors that follow an autoregressive structure of order 1 (AR(1)). Treating the subject-specific random effect as a nuisance parameter, we use two competing models: one that includes all the covariates and another that restricts the coefficients based on auxiliary information. We consider a non-penalty shrinkage estimation strategy that shrinks the unrestricted estimator in the direction of the restricted estimator. We discuss the asymptotic properties of the shrinkage estimators using the notions of asymptotic bias and risk. A Monte Carlo simulation study is conducted to examine the performance of the shrinkage estimators relative to the unrestricted estimator when the shrinkage dimension exceeds two. We also numerically compare the performance of the shrinkage estimators to that of the LASSO estimator. A longitudinal CD4 cell count data set is used to illustrate the usefulness of the shrinkage and LASSO estimators.
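To fix ideas, here is a hypothetical sketch of a Stein-type (non-penalty, positive-part) shrinkage estimator in an ordinary linear model, ignoring the random effects and AR(1) errors of the article: the unrestricted least-squares estimator is shrunk toward a restricted estimator that sets a candidate set of possibly inactive coefficients to zero. The function name and toy data are assumptions for illustration.

# Positive-part Stein-type shrinkage toward a restricted least-squares estimator.
import numpy as np

def shrinkage_estimator(X, y, restricted_idx):
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_u = XtX_inv @ X.T @ y                             # unrestricted estimator
    keep = [j for j in range(p) if j not in restricted_idx]
    beta_r = np.zeros(p)
    Xk = X[:, keep]
    beta_r[keep] = np.linalg.solve(Xk.T @ Xk, Xk.T @ y)    # restricted estimator

    # Wald-type statistic for the restriction and a positive-part shrinkage factor
    k = len(restricted_idx)                                # assumes k > 2
    sigma2 = np.sum((y - X @ beta_u) ** 2) / (n - p)
    d = beta_u[restricted_idx]
    V = XtX_inv[np.ix_(restricted_idx, restricted_idx)] * sigma2
    T = float(d @ np.linalg.solve(V, d))
    shrink = max(0.0, 1.0 - (k - 2) / T)
    return beta_r + shrink * (beta_u - beta_r)

# toy usage
rng = np.random.default_rng(3)
X = rng.normal(size=(150, 6))
y = X[:, :2] @ np.array([2.0, -1.0]) + rng.normal(size=150)
print(shrinkage_estimator(X, y, restricted_idx=[2, 3, 4, 5]))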

15.
Mixture experiments are commonly encountered in many fields, including the chemical, pharmaceutical and consumer product industries. Because of their wide application, mixture experiments, a special area of response surface methodology, have received greater attention in both model building and the determination of designs than other experimental studies. In this paper, some new approaches are suggested for model building and selection in the analysis of data from mixture experiments, using a special generalized linear model, the logistic regression model proposed by Chen et al. [7]. Generally, the special mixture models, which do not have a constant term, are strongly affected by collinearity when modeling mixture experiments. For this reason, to alleviate the undesired effects of collinearity in the analysis of mixture experiments with logistic regression, a new mixture model is defined with an alternative ratio variable. The deviance analysis table is given for standard mixture polynomial models defined by transformations and for special mixture models used as linear predictors. The effects of the components on the response in the restricted experimental region are given by using an alternative representation of Cox's direction approach. In addition, odds ratios and their confidence intervals are obtained according to the chosen reference and control groups. To compare the suggested models, model selection criteria, graphical odds ratios and confidence intervals for the odds ratios are used. The advantage of the suggested approaches is illustrated on a tumor incidence data set.

16.
This paper considers the problem of modeling migraine severity assessments and their dependence on weather and time characteristics. We take the viewpoint of a patient who is interested in an individual migraine management strategy. Since the factors influencing migraine can differ between patients in number and magnitude, we show how a patient's headache calendar, reporting the severity measurements on an ordinal scale, can be used to determine the dominating factors for this particular patient. One also has to account for dependencies among the measurements. For this, the autoregressive ordinal probit (AOP) model of Müller and Czado (J Comput Graph Stat 14:320–338, 2005) is utilized and fitted to a single patient's migraine data by a grouped move multigrid Monte Carlo (GM-MGMC) Gibbs sampler. Initially, covariates are selected using proportional odds models. Model fit and model comparison are discussed. A comparison with proportional odds specifications shows that the AOP models are preferred.

17.
This paper describes the performance of specific-to-general composition of forecasting models that accord with (approximate) linear autoregressions. Monte Carlo experiments are complemented with ex-ante forecasting results for 97 macroeconomic time series collected for the G7 economies in Stock and Watson (J. Forecast. 23:405–430, 2004). In small samples, the specific-to-general strategy is superior in terms of ex-ante forecasting performance to the commonly applied strategy of successive model reduction according to weakest parameter significance. Applied to real data, the specific-to-general approach also turns out to be preferable. Compared with successive model reduction, successive model expansion is less likely to involve overly large losses in forecast accuracy and is particularly recommended if the diagnosed prediction schemes are characterized by a medium to large number of predictors.

18.
This paper considers a linear regression model with regression parameter vector β. The parameter of interest is θ = a^T β, where a is specified. When, as a first step, a data-based variable selection procedure (e.g. minimum Akaike information criterion) is used to select a model, it is common statistical practice to then carry out inference about θ, using the same data, based on the (false) assumption that the selected model had been provided a priori. The paper considers a confidence interval for θ with nominal coverage 1 − α constructed on this (false) assumption, and calls this the naive 1 − α confidence interval. The minimum coverage probability of this confidence interval can be calculated for simple variable selection procedures involving only a single variable. However, the kinds of variable selection procedures used in practice are typically much more complicated. For the real-life data presented in this paper, there are 20 variables, each of which is to be either included or not, leading to 2^20 different models. The coverage probability at any given value of the parameters provides an upper bound on the minimum coverage probability of the naive confidence interval. This paper derives a new Monte Carlo simulation estimator of the coverage probability, which uses conditioning for variance reduction. For these real-life data, the gain in efficiency of this Monte Carlo simulation due to conditioning ranged from 2 to 6. The paper also presents a simple one-dimensional search strategy for parameter values at which the coverage probability is relatively small. For these real-life data, this search leads to parameter values for which the coverage probability of the naive 0.95 confidence interval is 0.79 for variable selection using the Akaike information criterion and 0.70 for variable selection using the Bayes information criterion, showing that these confidence intervals are completely inadequate.
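To illustrate the phenomenon on a toy scale, the crude Monte Carlo sketch below (a hypothetical example with only two candidate variables, no conditioning-based variance reduction, and assumed parameter values) estimates the coverage probability of the naive 95% t-interval for θ = a^T β after AIC-based subset selection.

# Crude Monte Carlo estimate of the coverage of a naive post-AIC confidence interval.
import itertools
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, beta, a = 50, np.array([1.0, 0.3]), np.array([1.0, 0.0])   # assumed true values

def naive_ci(X, y, alpha=0.05):
    # best-AIC submodel among all non-empty subsets of columns, then the usual t-interval
    best = None
    for S in [s for r in (1, 2) for s in itertools.combinations(range(X.shape[1]), r)]:
        Xs = X[:, S]
        b = np.linalg.solve(Xs.T @ Xs, Xs.T @ y)
        rss = np.sum((y - Xs @ b) ** 2)
        aic = n * np.log(rss / n) + 2 * len(S)
        if best is None or aic < best[0]:
            best = (aic, S, Xs, b, rss)
    _, S, Xs, b, rss = best
    df = n - len(S)
    s2 = rss / df
    a_S = a[list(S)]                               # naive: pretend the model was fixed a priori
    est = a_S @ b
    se = np.sqrt(s2 * a_S @ np.linalg.inv(Xs.T @ Xs) @ a_S)
    t = stats.t.ppf(1 - alpha / 2, df)
    return est - t * se, est + t * se

theta, cover = a @ beta, 0
for _ in range(2000):
    X = rng.normal(size=(n, 2))
    y = X @ beta + rng.normal(size=n)
    lo, hi = naive_ci(X, y)
    cover += (lo <= theta <= hi)
print("estimated coverage of the naive 0.95 interval:", cover / 2000)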

19.
This paper considers the problem of hypothesis testing in a simple panel data regression model with random individual effects and serially correlated disturbances. Following Baltagi et al. (Econom. J. 11:554–572, 2008), we allow for the possibility of non-stationarity in the regressor and/or the disturbance term. While Baltagi et al. (Econom. J. 11:554–572, 2008) focus on the asymptotic properties and distributions of the standard panel data estimators, this paper focuses on hypothesis testing in this setting. One important finding is that, unlike in the time-series case, one does not necessarily need to rely on the "super-efficient" type AR estimator of Perron and Yabu (J. Econom. 151:56–69, 2009) to make inference in panel data. In fact, we show that the simple t-ratio always converges to the standard normal distribution, regardless of whether the disturbances and/or the regressor are stationary.

20.
This paper presents estimates of the parameters of the Block and Basu bivariate lifetime distributions in the presence of covariates and a cure fraction, applied to the analysis of survival data in which some individuals may never experience the event of interest and two lifetimes are associated with each unit. A Bayesian procedure is used to obtain point and interval estimates of the unknown parameters. Posterior summaries of interest are obtained using standard Markov chain Monte Carlo methods in the rjags package for the R software. An illustration of the proposed methodology is given for a Diabetic Retinopathy Study data set.

