Similar Literature
20 similar documents found.
1.
Nested case–control (NCC) sampling is widely used in large epidemiological cohort studies for its cost-effectiveness, but the analysis of NCC data has relied primarily on the Cox proportional hazards model. In this paper, we consider a family of linear transformation models for analyzing NCC data and propose an inverse selection probability weighted estimating equation method for inference. Consistency and asymptotic normality of our estimators for the regression coefficients are established. We show that the asymptotic variance has a closed analytic form and can be easily estimated. Numerical studies are conducted to support the theory, and an application to the Wilms’ Tumor Study illustrates the methodology.
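As a toy illustration of the inverse selection probability weighting idea behind such estimating equations (the finite population, Bernoulli selection scheme, and sample size below are invented for the example and are not the paper's NCC design):

```python
import random

def ht_vs_naive(n_reps=200, seed=1):
    """Compare the naive sample mean with the inverse selection
    probability weighted (Hajek-normalized) mean under value-dependent
    Bernoulli sampling from a hypothetical finite population."""
    rng = random.Random(seed)
    population = [float(i) for i in range(1, 101)]      # true mean = 50.5
    # Larger values are twice as likely to enter the sample, mimicking
    # outcome-dependent selection.
    base = [2.0 if x > 50 else 1.0 for x in population]
    scale = 60.0 / sum(base)                            # E[sample size] = 60
    probs = [b * scale for b in base]

    naive_sum = ht_sum = 0.0
    for _ in range(n_reps):
        sample = [(x, p) for x, p in zip(population, probs)
                  if rng.random() < p]
        # Unweighted mean ignores the selection mechanism and is biased up.
        naive_sum += sum(x for x, _ in sample) / len(sample)
        # Weighting each sampled unit by 1/p corrects the bias.
        ht_sum += (sum(x / p for x, p in sample)
                   / sum(1.0 / p for _, p in sample))
    return naive_sum / n_reps, ht_sum / n_reps
```

Averaged over replications, the weighted mean centers on the true population mean while the naive mean does not.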

2.
In stratified case-cohort designs, the case-cohort sample is drawn by stratified random sampling based on covariate information available for the entire cohort. In this paper, we extend the work of Kang & Cai (2009) to a generalized stratified case-cohort study design for failure time data with multiple disease outcomes. Under this design, we develop weighted estimation procedures for the model parameters in marginal multiplicative intensity models and for the cumulative baseline hazard function. The asymptotic properties of the estimators are studied using martingales, modern empirical process theory, and results for finite population sampling.

3.
4.
A nested case–control (NCC) study is an efficient cohort-sampling design in which a subset of controls is sampled from the risk set at each event time. Since covariate measurements are taken only for the sampled subjects, the time and expense of conducting a full-scale cohort study can be saved. In this paper, we consider fitting a semiparametric accelerated failure time model to failure time data from an NCC study. We propose an efficient induced smoothing procedure for rank-based estimation of the regression parameters. For variance estimation, we propose an efficient resampling method that utilizes the robust sandwich form. We extend the proposed methods to a generalized NCC study that allows sampling of cases. Finite sample properties of the proposed estimators are investigated via an extensive simulation study. An application to a tumor study illustrates the utility of the proposed method in routine data analysis.

5.
In this paper, we consider a semiparametric time-varying coefficients regression model in which the influences of some covariates vary non-parametrically with time while the effects of the remaining covariates follow certain parametric functions of time. Weighted least squares type estimators for the unknown parameters of the parametric coefficient functions, as well as estimators for the non-parametric coefficient functions, are developed. We show that kernel smoothing, which avoids modelling of the sampling times, is asymptotically more efficient than single nearest neighbour smoothing, which depends on estimation of the sampling model. The asymptotically optimal bandwidth is also derived. A hypothesis testing procedure is proposed to test whether some covariate effects follow certain parametric forms. Simulation studies compare the finite sample performances of kernel neighbourhood smoothing and single nearest neighbour smoothing and check the empirical sizes and powers of the proposed testing procedures. An application to a data set from an AIDS clinical trial study is provided for illustration.

6.
In many clinical applications, understanding when measurement of new markers is necessary to add accuracy to existing prediction tools could lead to more cost-effective disease management. Many statistical tools for evaluating the incremental value (IncV) of novel markers over routine clinical risk factors have been developed in recent years. However, most of the existing literature focuses primarily on global assessment. Since the IncVs of new markers often vary across subgroups, it would be of great interest to identify subgroups for which the new markers are most or least useful in improving risk prediction. In this paper, we provide novel statistical procedures for systematically identifying potential traditional-marker-based subgroups in whom it might be beneficial to apply a new model with measurements of both the novel and traditional markers. We consider various conditional time-dependent accuracy parameters for a censored failure time outcome to assess the subgroup-specific IncVs, and provide non-parametric kernel-based estimation procedures for the proposed parameters. Simultaneous interval estimation procedures account for sampling variation and adjust for multiple testing. Simulation studies suggest that the proposed procedures work well in finite samples. The procedures are applied to the Framingham Offspring Study to examine the added value of an inflammation marker, C-reactive protein, on top of the traditional Framingham risk score for predicting 10-year risk of cardiovascular disease.

7.
The area under the receiver operating characteristic (ROC) curve (AUC) is one of the most commonly used measures for evaluating or comparing the ability of markers to predict disease status. Motivated by an angiographic coronary artery disease (CAD) study, our main objective is to evaluate and compare the performance of several baseline plasma levels in predicting CAD-related vital status over time. Based on censored survival data, non-parametric estimators are proposed for the time-dependent AUC. The limiting Gaussian processes of the estimators and the estimated asymptotic variance–covariance functions enable us to construct confidence bands and develop testing procedures. Applications and finite sample properties of the proposed estimation methods and inference procedures are demonstrated through CAD-related death data from the British Columbia Vital Statistics Agency and Monte Carlo simulations.
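The time-dependent estimators in that paper build on the empirical AUC, i.e. the proportion of case–control pairs that the marker ranks concordantly. A minimal uncensored sketch of that building block (the censoring-adjusted, time-indexed version additionally restricts and reweights the pairs at each time t):

```python
def empirical_auc(case_scores, control_scores):
    """Empirical AUC: proportion of (case, control) pairs in which the
    case scores strictly higher, with ties counted as 1/2."""
    num = 0.0
    for s in case_scores:
        for t in control_scores:
            num += 1.0 if s > t else (0.5 if s == t else 0.0)
    return num / (len(case_scores) * len(control_scores))
```

Perfect separation gives AUC 1, while an uninformative marker gives 0.5.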

8.
Consider a finite population of P units, each of which takes a specific value of the quantitative variable X. We assume that the range of X is subdivided into k classes and that the sample data come from a two-stage stratified sampling design. The main purpose of this work is to determine estimators, together with their asymptotic distributions, of the partial means of the classes, each of which is defined as a non-linear function of the other parameters. In particular, we are interested in deriving the linear approximation estimators and, via convergence theorems, their asymptotic distribution. We then define the estimator of the vector of partial class means, and its asymptotic convergence to a multivariate normal distribution is established. These results are useful for developing simultaneous inference procedures.

9.
Gap times between recurrent events are often of primary interest in medical and observational studies. The additive hazards model, focusing on risk differences rather than risk ratios, has been widely used in practice. However, the marginal additive hazards model does not take the dependence among gap times into account. In this paper, we propose an additive mixed effect model to analyze gap time data; the proposed model includes a subject-specific random effect to account for the dependence among the gap times. Estimating equation approaches are developed for parameter estimation, and the asymptotic properties of the resulting estimators are established. In addition, some graphical and numerical procedures are presented for model checking. The finite sample behavior of the proposed methods is evaluated through simulation studies, and an application to a data set from a clinical study on chronic granulomatous disease is provided.

10.
The theoretical literature on quantile and distribution function estimation in infinite populations is very rich, and invariance plays an important role in these studies. This is not the case for the commonly occurring problem of estimating quantiles in finite populations. The latter problem is more complicated and interesting because an optimal strategy consists not only of an estimator but also of a sampling design, and the estimator may depend on the design and on the labels of the sampled individuals, whereas in iid sampling, design issues and labels do not exist. We study the estimation of finite population quantiles, with emphasis on estimators that are invariant under the group of monotone transformations of the data, and on suitable invariant loss functions. Invariance under the finite group of permutations of the sample is also considered. We discuss nonrandomized and randomized estimators, best invariant and minimax estimators, and sampling strategies relative to different classes. Invariant loss functions and estimators in finite population sampling have a nonparametric flavor, and various natural combinatorial questions and tools arise as a result.

11.
This article proposes a variable selection procedure for partially linear models with right-censored data via penalized least squares. We apply the SCAD penalty to select significant variables and estimate unknown parameters simultaneously. The sampling properties of the proposed procedure are investigated. The rate of convergence and the asymptotic normality of the proposed estimators are established. Furthermore, the SCAD-penalized estimators of the nonzero coefficients are shown to have the asymptotic oracle property. In addition, an iterative algorithm is proposed to solve the penalized least squares problem. Simulation studies are conducted to examine the finite sample performance of the proposed method.
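For reference, the SCAD penalty of Fan and Li that the article applies has a simple closed form. A sketch, using the conventional default a = 3.7 (a choice of convention, not a value taken from the article):

```python
def scad_penalty(t, lam, a=3.7):
    """SCAD penalty p_lam(|t|): linear (LASSO-like) near zero,
    quadratic in between, and flat beyond a*lam so that large
    coefficients are not shrunk -- the key to the oracle property."""
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2
```

The three pieces join continuously at t = lam and t = a*lam, and the penalty is constant thereafter.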

12.
Exact formulas for the expected value and variance of the median and trimmed mean are found as functions of the elements of a finite population under simple random sampling. A simulation study is performed to compare the performance of the median and trimmed mean versus the mean when sampling from various simulated finite populations. Finally, the asymptotic performance of these estimators, when sampling from infinite populations, is compared with the finite population results.
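A toy version of such a comparison can be sketched as follows; the contaminated population, sample size, and trimming fraction are invented for illustration and are not taken from the paper:

```python
import random
import statistics

def mse_by_estimator(n=15, reps=400, seed=3):
    """Simulated MSEs of the sample mean, median, and 20% trimmed mean
    under simple random sampling (without replacement) from a symmetric
    finite population with gross outliers; the common estimand is 0."""
    pop = list(range(-45, 46)) + [-500] * 5 + [500] * 5  # mean = median = 0
    rng = random.Random(seed)
    errs = {"mean": 0.0, "median": 0.0, "trimmed": 0.0}
    k = max(1, n // 5)                                   # trim 20% per tail
    for _ in range(reps):
        s = sorted(rng.sample(pop, n))                   # SRS without replacement
        errs["mean"] += statistics.mean(s) ** 2
        errs["median"] += statistics.median(s) ** 2
        errs["trimmed"] += statistics.mean(s[k:-k]) ** 2
    return {name: e / reps for name, e in errs.items()}
```

With heavy contamination, the median and trimmed mean come out far more precise than the mean; for a light-tailed population the ordering would reverse.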

13.
An estimator of the Gini coefficient (the well-known income inequality measure) of a finite population is defined for an arbitrary probability sampling design, taking the sampling design into consideration. Alternative estimators of the variance of the estimated Gini coefficient are introduced. The sampling performance of the Gini coefficient estimator and its variance estimators is studied by means of a Monte Carlo study, using stratified sampling from a miniature population of Swedish households with authentic income data.
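A plug-in, design-weighted Gini estimator can be sketched from the pairwise-difference formula. This is a generic sketch: the paper's estimator and its variance estimators are tailored to the sampling design, which this O(n²) formula ignores beyond the weights:

```python
def weighted_gini(y, w=None):
    """Plug-in Gini coefficient with optional design weights w_i
    (e.g. inverse inclusion probabilities).  Equal weights reduce it
    to the usual pairwise mean-difference formula
    G = sum_ij |y_i - y_j| / (2 * N^2 * mean(y))."""
    if w is None:
        w = [1.0] * len(y)
    wt = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / wt
    num = sum(wi * wj * abs(yi - yj)
              for wi, yi in zip(w, y) for wj, yj in zip(w, y))
    return num / (2.0 * wt * wt * mean)
```

Doubling a unit's weight is equivalent to duplicating the unit, which is the behavior one wants from a design-weighted estimator.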

14.
Risk estimation is an important statistical question for the purposes of selecting a good estimator (i.e., model selection) and assessing its performance (i.e., estimating generalization error). This article introduces a general framework for cross-validation and derives distributional properties of cross-validated risk estimators in the context of estimator selection and performance assessment. Arbitrary classes of estimators are considered, including density estimators and predictors for both continuous and polychotomous outcomes. Results are provided for general full data loss functions (e.g., absolute and squared error, indicator, negative log density). A broad definition of cross-validation is used in order to cover leave-one-out cross-validation, V-fold cross-validation, Monte Carlo cross-validation, and bootstrap procedures. For estimator selection, finite sample risk bounds are derived and applied to establish the asymptotic optimality of cross-validation, in the sense that a selector based on a cross-validated risk estimator performs asymptotically as well as an optimal oracle selector based on the risk under the true, unknown data generating distribution. The asymptotic results are derived under the assumption that the size of the validation sets converges to infinity and hence do not cover leave-one-out cross-validation. For performance assessment, cross-validated risk estimators are shown to be consistent and asymptotically linear for the risk under the true data generating distribution and confidence intervals are derived for this unknown risk. Unlike previously published results, the theorems derived in this and our related articles apply to general data generating distributions, loss functions (i.e., parameters), estimators, and cross-validation procedures.
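The V-fold variant of cross-validated risk estimation described above can be sketched as follows; the candidate "estimators" and data are toy stand-ins, not examples from the article:

```python
import random

def v_fold_cv_risk(data, fit, loss, v=5, seed=7):
    """V-fold cross-validated risk: split the observations into v
    validation folds, fit the candidate on each complementary training
    set, and average the loss over the held-out observations."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[i::v] for i in range(v)]
    total, count = 0.0, 0
    for fold in folds:
        held = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held]
        model = fit(train)
        for i in fold:
            total += loss(model, data[i])
            count += 1
    return total / count

# Toy selector: under squared-error loss, cross-validated risk prefers
# the sample-mean predictor over an always-zero predictor.
data = [9.0, 10.0, 11.0, 10.0, 9.0, 11.0, 10.0, 10.0, 9.0, 11.0]
risk_mean = v_fold_cv_risk(data, lambda tr: sum(tr) / len(tr),
                           lambda m, y: (y - m) ** 2)
risk_zero = v_fold_cv_risk(data, lambda tr: 0.0,
                           lambda m, y: (y - m) ** 2)
```

The selector picks the candidate with the smaller cross-validated risk, mirroring the oracle-selector comparison in the article.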

15.
The regression literature contains hundreds of studies on serially correlated disturbances. Most of these studies assume that the structure of the error covariance matrix Ω is known or can be estimated consistently from data. Surprisingly, few studies investigate the properties of estimated generalized least squares (GLS) procedures when the structure of Ω is incorrectly identified and the parameters are inefficiently estimated. We compare the finite sample efficiencies of ordinary least squares (OLS), GLS and incorrect GLS (IGLS) estimators. We also prove new theorems establishing theoretical efficiency bounds for IGLS relative to GLS and OLS. Results from an exhaustive simulation study are used to evaluate the finite sample performance and to demonstrate the robustness of IGLS estimates vis-à-vis OLS and GLS estimates constructed for models with known and estimated (but correctly identified) Ω. Some of our conclusions for finite samples differ from established asymptotic results.
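The OLS-versus-GLS efficiency comparison can be illustrated in the simplest AR(1) setting. This is a toy simulation with known ρ and a single regressor; the paper's IGLS scenarios with a misidentified Ω are not reproduced here:

```python
import random

def ols_vs_gls(beta=2.0, rho=0.8, n=60, reps=300, seed=5):
    """Empirical MSEs of the OLS slope and the known-rho GLS slope in
    y_t = beta*x_t + u_t with AR(1) errors u_t = rho*u_{t-1} + e_t.
    GLS is implemented by quasi-differencing (y_t - rho*y_{t-1}, etc.)
    and running OLS on the transformed data."""
    rng = random.Random(seed)
    mse_ols = mse_gls = 0.0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        u = [0.0]
        for _ in range(n):
            u.append(rho * u[-1] + rng.gauss(0, 1))
        u = u[1:]
        y = [beta * xi + ui for xi, ui in zip(x, u)]
        # OLS through the origin (x has mean zero by construction).
        b_ols = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
        # GLS: quasi-difference away the serial correlation, then OLS.
        xs = [x[t] - rho * x[t - 1] for t in range(1, n)]
        ys = [y[t] - rho * y[t - 1] for t in range(1, n)]
        b_gls = sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)
        mse_ols += (b_ols - beta) ** 2
        mse_gls += (b_gls - beta) ** 2
    return mse_ols / reps, mse_gls / reps
```

With a white-noise regressor and ρ = 0.8, GLS with the correct Ω is several times more efficient than OLS, consistent with the efficiency orderings the paper studies.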

16.
This article investigates the finite sample properties of a range of inference methods for propensity score-based matching and weighting estimators frequently applied to evaluate the average treatment effect on the treated. We analyze both asymptotic approximations and bootstrap methods for computing variances and confidence intervals in our simulation designs, which are based on German register data and U.S. survey data. We vary the designs with respect to treatment selectivity, effect heterogeneity, share of treated, and sample size. The results suggest that, in general, theoretically justified bootstrap procedures (i.e., wild bootstrapping for pair matching and standard bootstrapping for “smoother” treatment effect estimators) dominate the asymptotic approximations in terms of coverage rates for both matching and weighting estimators. Most findings are robust across simulation designs and estimators.
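The "standard bootstrapping for smoother estimators" referred to above is the usual resample-units-with-replacement scheme. A generic sketch, shown here for a simple mean rather than an actual matching or weighting estimator:

```python
import random
import statistics

def bootstrap_se(data, estimator, b=500, seed=6):
    """Standard nonparametric bootstrap standard error: resample the
    units with replacement b times and take the standard deviation of
    the replicated estimates.  Appropriate for smooth estimators; pair
    matching instead calls for a wild bootstrap, per the article."""
    rng = random.Random(seed)
    n = len(data)
    reps = [estimator([data[rng.randrange(n)] for _ in range(n)])
            for _ in range(b)]
    return statistics.stdev(reps)
```

For the sample mean, the bootstrap standard error approximates the familiar s/√n, which makes the sketch easy to sanity-check.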

17.
In a longitudinal study, an individual is followed up over a period of time. Repeated measurements on the response and some time-dependent covariates are taken at a series of sampling times. The sampling times are often irregular and depend on covariates. In this paper, we propose a sampling adjusted procedure for the estimation of the proportional mean model without having to specify a sampling model. Unlike existing procedures, the proposed method is robust to model misspecification of the sampling times. Large sample properties are investigated for the estimators of both regression coefficients and the baseline function. We show that the proposed estimation procedure is more efficient than the existing procedures. Large sample confidence intervals for the baseline function are also constructed by perturbing the estimation equations. A simulation study is conducted to examine the finite sample properties of the proposed estimators and to compare with some of the existing procedures. The method is illustrated with a data set from a recurrent bladder cancer study.

18.
This paper examines strategies for estimating the mean of a finite population in the following situation: a linear regression model is assumed to describe the population scatter. Various estimators β̂ of the vector of regression parameters β are considered, along with several ways of transforming each estimator β̂ into a model-based estimator of the population mean. Some estimators constructed in this way become sensitive to the correctness of the assumed model. The estimators favoured in this paper are the ones in which the observations are weighted to reflect the sampling design, so that asymptotic design unbiasedness is achieved. For these estimators, the randomization distribution gives protection against model breakdown.

19.
Despite the sizeable literature associated with seemingly unrelated regression models, not much is known about the use of Stein-rule estimators in these models. This gap is remedied in this paper, in which two families of Stein-rule estimators in seemingly unrelated regression equations are presented and their large sample asymptotic properties explored and evaluated. One family of estimators uses a shrinkage factor obtained solely from the equation under study, while the other has a shrinkage factor based on all the equations of the model. Using a quadratic loss measure and Monte Carlo sampling experiments, the finite sample risk performance of these estimators is also evaluated and compared with the traditional feasible generalized least squares estimator.

20.
To enhance modeling flexibility, the authors propose a nonparametric hazard regression model, for which the ordinary and weighted least squares estimation and inference procedures are studied. The proposed model does not assume any parametric specifications on the covariate effects, which is suitable for exploring the nonlinear interactions between covariates, time and some exposure variable. The authors propose the local ordinary and weighted least squares estimators for the varying‐coefficient functions and establish the corresponding asymptotic normality properties. Simulation studies are conducted to empirically examine the finite‐sample performance of the new methods, and a real data example from a recent breast cancer study is used as an illustration. The Canadian Journal of Statistics 37: 659–674; 2009 © 2009 Statistical Society of Canada


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号