Similar Articles
A total of 20 similar articles were found.
1.
In randomized clinical trials, we are often concerned with comparing two-sample survival data. Although the log-rank test is usually suitable for this purpose, it may result in a substantial loss of power when the two groups have nonproportional hazards. Working in the more general class of survival models of Yang and Prentice (Biometrika 92:1–17, 2005), which includes the log-rank test as a special case, we improve efficiency by incorporating auxiliary covariates that are correlated with the survival times. In a model-free form, we augment the estimating equation with auxiliary covariates and establish the efficiency improvement using the semiparametric theory of Zhang et al. (Biometrics 64:707–715, 2008) and Lu and Tsiatis (Biometrics 95:674–679, 2008). Under minimal assumptions, our approach produces an unbiased, asymptotically normal estimator with an additional efficiency gain. Simulation studies and an application to a leukemia study show the satisfactory performance of the proposed method.

2.
For a confidence interval (L(X), U(X)) of a parameter θ in one-parameter discrete distributions, the coverage probability varies with θ. The confidence coefficient is the infimum of the coverage probabilities, inf_θ P_θ(θ ∈ (L(X), U(X))). Since the point in the parameter space at which the infimum occurs is unknown, the exact confidence coefficient is also unknown. Besides the confidence coefficient, a confidence interval can be evaluated through its average coverage probability. Usually the exact average coverage probability is likewise unknown, and it has been approximated by the mean of the coverage probabilities at some randomly chosen points in the parameter space. In this article, methodologies for computing the exact average coverage probabilities as well as the exact confidence coefficients of confidence intervals for one-parameter discrete distributions are proposed. With these methodologies, both exact values can be derived.
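As a point of reference, the sketch below (a hypothetical illustration, not the exact computation proposed in the article) grid-approximates the coverage probability, the confidence coefficient and the average coverage of the Clopper–Pearson interval for a Binomial(n, p) proportion; the sample size, grid and level are assumed for illustration only.

# Grid approximation of coverage probability, confidence coefficient and
# average coverage for the Clopper-Pearson binomial interval (illustrative only).
import numpy as np
from scipy.stats import binom, beta

def clopper_pearson(x, n, alpha=0.05):
    lo = beta.ppf(alpha / 2, x, n - x + 1) if x > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, x + 1, n - x) if x < n else 1.0
    return lo, hi

def coverage(p, n, alpha=0.05):
    # exact coverage at a single p: sum the binomial pmf over x whose interval covers p
    cov = 0.0
    for x in range(n + 1):
        lo, hi = clopper_pearson(x, n, alpha)
        if lo <= p <= hi:
            cov += binom.pmf(x, n, p)
    return cov

n = 30
grid = np.linspace(0.001, 0.999, 999)          # grid approximation, not an exact search
covs = np.array([coverage(p, n) for p in grid])
print("approx. confidence coefficient:", covs.min())
print("approx. average coverage      :", covs.mean())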

3.
The focus of this article is on simultaneous confidence bands over a rectangular covariate region for a linear regression model with k > 1 covariates, for which only conservative or approximate confidence bands are available in the statistical literature stretching back to Working & Hotelling (J. Amer. Statist. Assoc. 24:73–85, 1929). Formulas for the simultaneous confidence levels of the hyperbolic and constant width bands are provided. These involve only a k-dimensional integral; it is unlikely that the simultaneous confidence levels can be expressed as an integral of dimension less than k. These formulas allow the construction, for the first time, of exact hyperbolic and constant width confidence bands for at least small k (>1) by using numerical quadrature. Comparison between the hyperbolic and constant width bands is then addressed under both the average width and minimum volume confidence set criteria. It is observed that the constant width band can be drastically less efficient than the hyperbolic band when k > 1. Finally, it is pointed out how the methods given in this article can be applied to more general regression models such as fixed-effect or random-effect generalized linear regression models.
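For intuition only, the following Monte Carlo sketch (assumed toy design and grid; it uses a conservative Scheffé-type constant rather than the exact constants derived in the article) estimates the simultaneous coverage of a hyperbolic band over a rectangular region with k = 2 covariates by approximating the supremum over the region on a fine grid.

# Simulated simultaneous coverage of a Scheffe-type hyperbolic band over [-1,1]^2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, k = 40, 2
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, (n, k))])   # assumed design
XtX_inv = np.linalg.inv(X.T @ X)

# grid over the rectangular covariate region [-1, 1]^2, with intercept appended
g = np.linspace(-1, 1, 41)
G = np.column_stack([np.ones(41 * 41), np.repeat(g, 41), np.tile(g, 41)])
half_width = np.sqrt(np.einsum("ij,jk,ik->i", G, XtX_inv, G))    # sqrt(x'(X'X)^{-1}x)

c = np.sqrt((k + 1) * stats.f.ppf(0.95, k + 1, n - k - 1))       # conservative Scheffe constant
cover = 0
for _ in range(5000):
    bhat_err = rng.multivariate_normal(np.zeros(k + 1), XtX_inv) # beta_hat - beta, sigma = 1
    sigma_hat = np.sqrt(rng.chisquare(n - k - 1) / (n - k - 1))
    sup_t = np.max(np.abs(G @ bhat_err) / (sigma_hat * half_width))
    cover += sup_t <= c
print("simulated simultaneous coverage of the Scheffe-type band:", cover / 5000)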

4.
The cumulative incidence function provides intuitive summary information about competing risks data. Via a mixture decomposition of this function, Chang and Wang (Statist. Sinica 19:391–408, 2009) study how covariates affect the cumulative incidence probability of a particular failure type at a chosen time point. Without specifying the corresponding failure time distribution, they propose two estimators and derive their large-sample properties. The first estimator uses weighting to adjust for censoring bias and can be considered an extension of Fine's method (J R Stat Soc Ser B 61:817–830, 1999). The second uses imputation and extends the idea of Wang (J R Stat Soc Ser B 65:921–935, 2003) from a nonparametric setting to the current regression framework. In this article, when covariates take only discrete values, we extend both approaches of Chang and Wang (Statist. Sinica 19:391–408, 2009) by allowing left truncation. Large-sample properties of the proposed estimators are derived, and their finite-sample performance is investigated through a simulation study. We also apply our methods to heart transplant survival data.

5.
In many complex diseases such as cancer, a patient undergoes various disease stages before reaching a terminal state (say, disease free or death). This fits a multistate model framework in which a prognosis may amount to predicting the state occupation at a future time t. With the advent of high-throughput genomic and proteomic assays, a clinician may intend to use such high-dimensional covariates to make better predictions of state occupation. In this article, we offer a practical solution to this problem by combining a useful technique, called pseudo-value (PV) regression, with a latent factor or penalized regression method such as partial least squares (PLS) or the least absolute shrinkage and selection operator (LASSO), or their variants. We explore the predictive performance of these combinations in various high-dimensional settings via extensive simulation studies. Overall, this strategy works fairly well provided the models are tuned properly, and PLS turns out to be slightly better than LASSO in most settings we investigated for the purpose of temporal prediction of future state occupation. We illustrate the utility of these PV-based high-dimensional regression methods using a lung cancer data set in which we use the patients' baseline gene expression values.
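The sketch below is a hypothetical, stripped-down illustration of the PV idea for the simplest two-state (alive/dead) case on simulated data: jackknife pseudo-values of the Kaplan–Meier survival probability at a time point t0 are regressed on high-dimensional covariates with a LASSO. The data, time point and tuning are assumptions; the article's general multistate PV regression and its PLS variant are not reproduced here.

# Pseudo-value regression for a two-state model on toy data (illustrative only).
import numpy as np
from lifelines import KaplanMeierFitter
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, t0 = 200, 50, 2.0
X = rng.normal(size=(n, p))                        # stand-in for gene expression values
time = rng.exponential(scale=np.exp(0.5 * X[:, 0]))
event = rng.random(n) < 0.8                        # toy event indicator, not a realistic censoring scheme

def km_surv_at(t, times, events):
    km = KaplanMeierFitter().fit(times, events)
    return float(km.predict(t))

# jackknife pseudo-values: PV_i = n*S(t0) - (n-1)*S_{-i}(t0)
S_full = km_surv_at(t0, time, event)
pv = np.array([n * S_full - (n - 1) * km_surv_at(t0, np.delete(time, i), np.delete(event, i))
               for i in range(n)])

# regress the pseudo-values on the high-dimensional covariates with a LASSO
model = LassoCV(cv=5).fit(X, pv)
print("number of selected covariates:", np.sum(model.coef_ != 0))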

6.
In view of its ongoing importance for a variety of practical applications, feature selection via ℓ1-regularization methods like the lasso has been subject to extensive theoretical as well as empirical investigation. Despite its popularity, mere ℓ1-regularization has been criticized for being inadequate or ineffective, notably in situations in which additional structural knowledge about the predictors should be taken into account. This has stimulated the development of either systematically different regularization methods or double regularization approaches which combine ℓ1-regularization with a second kind of regularization designed to capture additional problem-specific structure. One instance thereof is the 'structured elastic net', a generalization of the proposal in Zou and Hastie (J. R. Stat. Soc. Ser. B 67:301–320, 2005), studied in Slawski et al. (Ann. Appl. Stat. 4(2):1056–1080, 2010) for the class of generalized linear models.

7.
Clusters of galaxies are a useful proxy to trace the distribution of mass in the universe. By measuring the mass of clusters of galaxies on different scales, one can follow the evolution of the mass distribution (Martínez and Saar, Statistics of the Galaxy Distribution, 2002). It can be shown that finding galaxy clusters is equivalent to finding density contour clusters (Hartigan, Clustering Algorithms, 1975): connected components of the level set S_c ≡ {f > c}, where f is a probability density function. Cuevas et al. (Can. J. Stat. 28:367–382, 2000; Comput. Stat. Data Anal. 36:441–459, 2001) proposed a nonparametric method that finds density contour clusters by the minimal spanning tree. While their algorithm is conceptually simple, it requires intensive computation for large datasets. We propose a more efficient clustering method based on their algorithm using the Fast Fourier Transform (FFT). The method is applied to a study of galaxy clustering on large astronomical sky survey data.
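The following toy sketch (a hypothetical illustration on simulated 2-D data, not the authors' algorithm) conveys the basic idea: estimate the density on a grid with an FFT-based kernel smoother, threshold it at a level c, and take the connected components of {f > c} as density contour clusters. The bandwidth and level are assumed, not data-driven.

# FFT-based density estimation on a grid followed by connected components of {f > c}.
import numpy as np
from scipy.signal import fftconvolve
from scipy.ndimage import label

rng = np.random.default_rng(1)
pts = np.vstack([rng.normal([-2, 0], 0.5, (300, 2)),
                 rng.normal([2, 0], 0.5, (300, 2))])

# bin the points on a regular grid
bins = 128
H, xe, ye = np.histogram2d(pts[:, 0], pts[:, 1], bins=bins, range=[[-5, 5], [-5, 5]])

# Gaussian kernel on the same grid spacing, applied by FFT convolution
h = 0.3                                            # bandwidth (assumed, untuned)
dx = xe[1] - xe[0]
ax = np.arange(-3 * h, 3 * h + dx, dx)
K = np.exp(-0.5 * (ax[:, None] ** 2 + ax[None, :] ** 2) / h ** 2)
f = fftconvolve(H, K, mode="same")
f /= f.sum() * dx * dx                             # normalize to a density estimate

c = 0.5 * f.max()                                  # illustrative level, not data-driven
labels, n_clusters = label(f > c)                  # connected components of {f > c}
print("density contour clusters found:", n_clusters)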

8.
We consider a linear regression model in which the covariates have a group structure. The group LASSO has been proposed for group variable selection, and many nonconvex penalties such as the smoothly clipped absolute deviation (SCAD) and the minimax concave penalty (MCP) have been extended to group variable selection problems. The group coordinate descent (GCD) algorithm is popular for fitting these models. However, GCD algorithms are hard to apply to nonconvex group penalties because of computational complexity unless the design matrix is orthogonal. In this paper, we propose an efficient optimization algorithm for nonconvex group penalties by combining the concave-convex procedure with the group LASSO algorithm. We also extend the proposed algorithm to generalized linear models. We evaluate the numerical efficiency of the proposed algorithm compared to existing GCD algorithms on simulated data and real data sets.
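For concreteness, here is a hypothetical sketch of one building block: a group coordinate descent pass for the convex group LASSO, assuming each group's columns are (approximately) orthonormal so that the block update is a closed-form group soft-threshold. A concave-convex treatment of SCAD or MCP, as in the article, would repeatedly solve weighted problems of roughly this form; that outer loop is not shown.

# Group coordinate descent for the convex group LASSO (illustrative sketch).
import numpy as np

def group_lasso_gcd(X, y, groups, lam, n_iter=100):
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for g in groups:                               # g is an index array for one group
            r = y - X @ beta + X[:, g] @ beta[g]       # partial residual for this group
            z = X[:, g].T @ r / n
            norm_z = np.linalg.norm(z)
            # group soft-threshold (valid when X[:, g].T @ X[:, g] / n is the identity)
            beta[g] = max(0.0, 1.0 - lam * np.sqrt(len(g)) / norm_z) * z if norm_z > 0 else 0.0
    return beta

# toy usage with assumed data
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 6))
X /= np.linalg.norm(X, axis=0) / np.sqrt(100)          # crude column scaling only
y = X[:, :3] @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=100)
groups = [np.arange(0, 3), np.arange(3, 6)]
print(group_lasso_gcd(X, y, groups, lam=0.1))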

9.
In this study, we consider the problem of selecting explanatory variables for the fixed effects in linear mixed models under covariate shift, which occurs when the values of covariates in the model for prediction differ from those in the model for the observed data. We construct a variable selection criterion based on the conditional Akaike information introduced by Vaida & Blanchard (2005). We focus especially on covariate shift in small area estimation and demonstrate the usefulness of the proposed criterion. In addition, numerical performance is investigated through simulations, one of which is a design-based simulation using a real dataset of land prices. The Canadian Journal of Statistics 46: 316–335; 2018 © 2018 Statistical Society of Canada

10.
11.
There are many situations where the usual random sample from a population of interest is not available, because the data have unequal probabilities of entering the sample. The method of weighted distributions models this ascertainment bias by adjusting the probabilities of actual occurrence of events to arrive at a specification of the probabilities of the events as observed and recorded. We consider two different classes of contaminated or mixture weight functions, Γ_a = {w(x): w(x) = (1 − ε)w_0(x) + ε q(x), q ∈ Q} and Γ_g = {w(x): w(x) = w_0^{1−ε}(x) q^{ε}(x), q ∈ Q}, where w_0(x) is the elicited weight function, Q is a class of positive functions and 0 ≤ ε ≤ 1 is a small number. We also study the local variation of the ϕ-divergence over the classes Γ_a and Γ_g. We focus on measuring robustness using divergence measures, based on the Bayesian approach. Two examples are studied.

12.
Releases of GDP data undergo a series of revisions over time. These revisions have an impact on the results of macroeconometric models, as documented by the growing literature on real-time data applications. Revisions of U.S. GDP data can be explained and are partly predictable, according to Faust et al. (J. Money Credit Bank. 37(3):403–419, 2005) and Fixler and Grimm (J. Product. Anal. 25:213–229, 2006). This analysis proposes the inclusion of mixed-frequency data for forecasting GDP revisions. Thereby, the information set available around the first data vintage can be exploited better than with purely quarterly data. In-sample and out-of-sample results suggest that forecasts of GDP revisions can be improved by using mixed-frequency data.

13.
When a (p+q)-variate column vector (x′, y′)′ has a (p+q)-variate normal density with mean vector (μ1′, μ2′)′ and unknown covariance matrix Ω, Schervish (1980) obtains prediction intervals for linear functions of a future y, given x. He bases the prediction interval on the F-distribution. However, for a specified linear function the statistic to be used is Student's t, since the prediction intervals based on t are shorter than those based on F. Similar results hold for the multivariate linear regression model.

14.
In this paper, we consider a non-penalty shrinkage estimation method for random-effects models with autoregressive errors for longitudinal data when there are many covariates and some of them may not be active for the response variable. In observational studies, subjects are followed over equally or unequally spaced visits to determine the continuous response and whether the response is associated with the risk factors/covariates. Measurements from the same subject are usually more similar to each other, and thus are correlated with each other but not with observations of other subjects. To analyse such data, we consider a linear model that contains both random effects across subjects and within-subject errors that follow an autoregressive structure of order 1 (AR(1)). Treating the subject-specific random effect as a nuisance parameter, we use two competing models: one that includes all the covariates and another that restricts the coefficients based on auxiliary information. We consider a non-penalty shrinkage estimation strategy that shrinks the unrestricted estimator in the direction of the restricted estimator. We discuss the asymptotic properties of the shrinkage estimators using the notions of asymptotic bias and risk. A Monte Carlo simulation study is conducted to examine the performance of the shrinkage estimators relative to the unrestricted estimator when the shrinkage dimension exceeds two. We also numerically compare the performance of the shrinkage estimators to that of the LASSO estimator. A longitudinal CD4 cell count data set is used to illustrate the usefulness of the shrinkage and LASSO estimators.
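To fix ideas, here is a hypothetical sketch of a Stein-type (non-penalty, positive-part) shrinkage estimator in an ordinary linear model, ignoring the random effects and AR(1) errors of the article: the unrestricted least-squares estimator is shrunk toward a restricted estimator that sets a candidate set of possibly inactive coefficients to zero. The function name and toy data are assumptions for illustration.

# Positive-part Stein-type shrinkage toward a restricted least-squares estimator.
import numpy as np

def shrinkage_estimator(X, y, restricted_idx):
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_u = XtX_inv @ X.T @ y                             # unrestricted estimator
    keep = [j for j in range(p) if j not in restricted_idx]
    beta_r = np.zeros(p)
    Xk = X[:, keep]
    beta_r[keep] = np.linalg.solve(Xk.T @ Xk, Xk.T @ y)    # restricted estimator

    # Wald-type statistic for the restriction and a positive-part shrinkage factor
    k = len(restricted_idx)                                # assumes k > 2
    sigma2 = np.sum((y - X @ beta_u) ** 2) / (n - p)
    d = beta_u[restricted_idx]
    V = XtX_inv[np.ix_(restricted_idx, restricted_idx)] * sigma2
    T = float(d @ np.linalg.solve(V, d))
    shrink = max(0.0, 1.0 - (k - 2) / T)
    return beta_r + shrink * (beta_u - beta_r)

# toy usage
rng = np.random.default_rng(3)
X = rng.normal(size=(150, 6))
y = X[:, :2] @ np.array([2.0, -1.0]) + rng.normal(size=150)
print(shrinkage_estimator(X, y, restricted_idx=[2, 3, 4, 5]))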

15.
Mixture experiments are commonly encountered in many fields, including the chemical, pharmaceutical and consumer product industries. Because of their wide application, mixture experiments, a special area of response surface methodology, have received greater attention in both model building and the determination of designs than other experimental studies. In this paper, some new approaches are suggested for model building and selection in the analysis of data from mixture experiments, using a special generalized linear model, the logistic regression model proposed by Chen et al. [7]. Generally, the special mixture models, which do not have a constant term, are strongly affected by collinearity when modeling mixture experiments. For this reason, to alleviate the undesired effects of collinearity in the analysis of mixture experiments with logistic regression, a new mixture model is defined with an alternative ratio variable. The deviance analysis table is given for standard mixture polynomial models defined by transformations and for special mixture models used as linear predictors. The effects of the components on the response in the restricted experimental region are given by using an alternative representation of Cox's direction approach. In addition, odds ratios and their confidence intervals are obtained according to the chosen reference and control groups. To compare the suggested models, model selection criteria, graphical odds ratios and confidence intervals for the odds ratios are used. The advantage of the suggested approaches is illustrated on a tumor incidence data set.

16.
This paper considers the problem of modeling migraine severity assessments and their dependence on weather and time characteristics. We take the viewpoint of a patient who is interested in an individual migraine management strategy. Since the factors influencing migraine can differ between patients in number and magnitude, we show how a patient's headache calendar, reporting the severity measurements on an ordinal scale, can be used to determine the dominating factors for this particular patient. One also has to account for dependencies among the measurements. For this, the autoregressive ordinal probit (AOP) model of Müller and Czado (J Comput Graph Stat 14:320–338, 2005) is utilized and fitted to a single patient's migraine data by a grouped move multigrid Monte Carlo (GM-MGMC) Gibbs sampler. Initially, covariates are selected using proportional odds models. Model fit and model comparison are discussed. A comparison with proportional odds specifications shows that the AOP models are preferred.

17.
This paper describes the performance of specific-to-general composition of forecasting models that accord with (approximate) linear autoregressions. Monte Carlo experiments are complemented with ex-ante forecasting results for 97 macroeconomic time series collected for the G7 economies in Stock and Watson (J. Forecast. 23:405–430, 2004). In small samples, the specific-to-general strategy is superior in terms of ex-ante forecasting performance to the commonly applied strategy of successive model reduction according to weakest parameter significance. Applied to real data, the specific-to-general approach also turns out to be preferable. Compared with successive model reduction, successive model expansion is less likely to involve overly large losses in forecast accuracy and is particularly recommended if the diagnosed prediction schemes are characterized by a medium to large number of predictors.

18.
This paper considers a linear regression model with regression parameter vector β. The parameter of interest is θ = a^T β, where a is specified. When, as a first step, a data-based variable selection procedure (e.g. minimum Akaike information criterion) is used to select a model, it is common statistical practice to then carry out inference about θ, using the same data, based on the (false) assumption that the selected model had been provided a priori. The paper considers a confidence interval for θ with nominal coverage 1 − α constructed on this (false) assumption, and calls this the naive 1 − α confidence interval. The minimum coverage probability of this confidence interval can be calculated for simple variable selection procedures involving only a single variable. However, the kinds of variable selection procedures used in practice are typically much more complicated. For the real-life data presented in this paper, there are 20 variables, each of which is to be either included or not, leading to 2^20 different models. The coverage probability at any given value of the parameters provides an upper bound on the minimum coverage probability of the naive confidence interval. This paper derives a new Monte Carlo simulation estimator of the coverage probability, which uses conditioning for variance reduction. For these real-life data, the gain in efficiency of this Monte Carlo simulation due to conditioning ranged from 2 to 6. The paper also presents a simple one-dimensional search strategy for parameter values at which the coverage probability is relatively small. For these real-life data, this search leads to parameter values for which the coverage probability of the naive 0.95 confidence interval is 0.79 for variable selection using the Akaike information criterion and 0.70 for variable selection using the Bayes information criterion, showing that these confidence intervals are completely inadequate.
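To illustrate the phenomenon on a toy scale, the crude Monte Carlo sketch below (a hypothetical example with only two candidate variables, no conditioning-based variance reduction, and assumed parameter values) estimates the coverage probability of the naive 95% t-interval for θ = a^T β after AIC-based subset selection.

# Crude Monte Carlo estimate of the coverage of a naive post-AIC confidence interval.
import itertools
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, beta, a = 50, np.array([1.0, 0.3]), np.array([1.0, 0.0])   # assumed true values

def naive_ci(X, y, alpha=0.05):
    # best-AIC submodel among all non-empty subsets of columns, then the usual t-interval
    best = None
    for S in [s for r in (1, 2) for s in itertools.combinations(range(X.shape[1]), r)]:
        Xs = X[:, S]
        b = np.linalg.solve(Xs.T @ Xs, Xs.T @ y)
        rss = np.sum((y - Xs @ b) ** 2)
        aic = n * np.log(rss / n) + 2 * len(S)
        if best is None or aic < best[0]:
            best = (aic, S, Xs, b, rss)
    _, S, Xs, b, rss = best
    df = n - len(S)
    s2 = rss / df
    a_S = a[list(S)]                               # naive: pretend the model was fixed a priori
    est = a_S @ b
    se = np.sqrt(s2 * a_S @ np.linalg.inv(Xs.T @ Xs) @ a_S)
    t = stats.t.ppf(1 - alpha / 2, df)
    return est - t * se, est + t * se

theta, cover = a @ beta, 0
for _ in range(2000):
    X = rng.normal(size=(n, 2))
    y = X @ beta + rng.normal(size=n)
    lo, hi = naive_ci(X, y)
    cover += (lo <= theta <= hi)
print("estimated coverage of the naive 0.95 interval:", cover / 2000)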

19.
This paper considers the problem of hypothesis testing in a simple panel data regression model with random individual effects and serially correlated disturbances. Following Baltagi et al. (Econom. J. 11:554–572, 2008), we allow for the possibility of non-stationarity in the regressor and/or the disturbance term. While Baltagi et al. (Econom. J. 11:554–572, 2008) focus on the asymptotic properties and distributions of the standard panel data estimators, this paper focuses on hypothesis testing in this setting. One important finding is that, unlike in the time-series case, one does not necessarily need to rely on the "super-efficient" type AR estimator of Perron and Yabu (J. Econom. 151:56–69, 2009) to make inference in panel data. In fact, we show that the simple t-ratio always converges to the standard normal distribution, regardless of whether the disturbances and/or the regressor are stationary.

20.
This paper presents estimates of the parameters of the Block and Basu bivariate lifetime distributions in the presence of covariates and a cure fraction, applied to the analysis of survival data in which some individuals may never experience the event of interest and two lifetimes are associated with each unit. A Bayesian procedure is used to obtain point and interval estimates of the unknown parameters. Posterior summaries of interest are obtained using standard Markov chain Monte Carlo methods in the rjags package for the R software. An illustration of the proposed methodology is given for a Diabetic Retinopathy Study data set.

