Similar Documents
20 similar documents found.
1.
To evaluate the clinical utility of new risk markers, a crucial step is to measure their predictive accuracy in prospective studies. However, it is often infeasible to obtain marker values for all study participants. The nested case-control (NCC) design is a useful, cost-effective strategy for such settings. Under the NCC design, markers are ascertained only for cases and for a fraction of controls sampled randomly from the risk sets. This outcome-dependent sampling generates a complex data structure and therefore poses a challenge for analysis. Existing methods for analyzing NCC studies focus primarily on association measures. Here, we propose a class of non-parametric estimators for commonly used accuracy measures. We derive asymptotic expansions for the accuracy estimators under both finite-population and Bernoulli sampling and establish the asymptotic equivalence of the two. Simulation results suggest that the proposed procedures perform well in finite samples. The new procedures are illustrated with data from the Framingham Offspring Study.
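
The abstract does not reproduce the proposed nonparametric estimators; as a minimal sketch of the underlying idea, the snippet below computes inverse-probability-weighted true- and false-positive rates when controls are subsampled by Bernoulli sampling with a known inclusion probability. All function names and the toy data are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def ipw_tpr_fpr(marker, is_case, sampled, p_ctrl, cutoff):
    """IPW estimates of TPR/FPR when cases are always measured and
    controls are measured with known probability p_ctrl (Bernoulli sampling)."""
    w = np.where(is_case, 1.0, 1.0 / p_ctrl)   # inverse inclusion probabilities
    case = is_case & sampled
    ctrl = ~is_case & sampled
    tpr = np.mean(marker[case] > cutoff)       # cases are fully observed
    # the constant weights cancel here, but matter once p_ctrl varies by stratum
    fpr = np.sum(w[ctrl] * (marker[ctrl] > cutoff)) / np.sum(w[ctrl])
    return tpr, fpr

# toy illustration: marker observed only where sampled
rng = np.random.default_rng(0)
n = 5000
is_case = rng.random(n) < 0.1
marker = rng.normal(loc=is_case.astype(float))         # marker shifted for cases
p_ctrl = 0.2
sampled = is_case | (rng.random(n) < p_ctrl)           # Bernoulli control sampling
print(ipw_tpr_fpr(marker, is_case, sampled, p_ctrl, cutoff=0.5))
```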

2.
Suppose that we need to classify a population of subjects into several well-defined, ordered risk categories for disease prevention or management using their "baseline" risk factors/markers. In this article, we present a systematic approach to identifying, from their conventional risk factors/markers, the subjects who would benefit from a new set of risk markers for more accurate classification. Specifically, for each subgroup of individuals with the same conventional risk estimate, we present inference procedures for the reclassification rate and the corresponding correct re-categorization rate under the new markers. We then apply these new tools to data from the Cardiovascular Health Study sponsored by the US National Heart, Lung, and Blood Institute, using the Framingham risk factors plus baseline anti-hypertensive drug use to identify adult American women who may benefit from the measurement of a new blood biomarker, CRP, for better risk classification and, in turn, intensified prevention of coronary heart disease over the subsequent 10 years.
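
As a minimal illustration of the quantities discussed (not the paper's inference procedures), the sketch below cross-tabulates the risk categories assigned with and without the new marker and computes, within a stratum of subjects sharing the same conventional risk estimate, the reclassification rate and a correct re-categorization rate against known category labels; all names and codings are assumptions.

```python
import numpy as np
import pandas as pd

def reclassification_summary(old_cat, new_cat, true_cat):
    """Reclassification rate and correct re-categorization rate for subjects
    sharing the same conventional risk estimate."""
    old_cat, new_cat, true_cat = map(np.asarray, (old_cat, new_cat, true_cat))
    moved = new_cat != old_cat
    reclass_rate = moved.mean()
    # among the reclassified, how many land in the category matching the truth?
    correct_rate = (new_cat[moved] == true_cat[moved]).mean() if moved.any() else 0.0
    table = pd.crosstab(old_cat, new_cat, rownames=["old"], colnames=["new"])
    return reclass_rate, correct_rate, table

# toy stratum: categories coded 0 (low), 1 (medium), 2 (high)
rng = np.random.default_rng(1)
true_cat = rng.integers(0, 3, size=200)
old_cat = np.where(rng.random(200) < 0.6, true_cat, rng.integers(0, 3, size=200))
new_cat = np.where(rng.random(200) < 0.8, true_cat, old_cat)  # better marker set
rate, correct, tab = reclassification_summary(old_cat, new_cat, true_cat)
```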

3.
Many cancers and neuro-related diseases display significant phenotypic and genetic heterogeneity across subjects and subpopulations. Characterizing such heterogeneity could transform our understanding of the etiology of these conditions and inspire new approaches to urgently needed prevention, diagnosis, treatment, and prognosis. However, most existing statistical methods face major challenges in delineating such heterogeneity at both the group and individual levels. The aim of this article is to propose a novel statistical disease-mapping (SDM) framework that addresses some of these challenges. We develop an efficient method to estimate the unknown parameters in SDM and to delineate individual and group disease maps. Statistical inference procedures, such as hypothesis tests for the parameters of interest, are also investigated. Both simulation studies and a real data analysis of the ADNI hippocampal surface dataset show that our SDM not only effectively detects diseased regions in each patient but also provides a group disease-mapping analysis of Alzheimer's disease subgroups.

4.
This paper is concerned with extreme value density estimation. The generalized Pareto distribution (GPD) beyond a given threshold is combined with a nonparametric estimation approach below the threshold. This semiparametric setup is shown to generalize a few existing approaches and enables density estimation over the complete sample space. Estimation is performed via the Bayesian paradigm, which helps identify the model components. Estimation of all model parameters, including the threshold and higher quantiles, and prediction for future observations are provided. Simulation studies suggest a few useful guidelines for evaluating the relevance of the proposed procedures and provide empirical evidence that the proposed methodology improves on existing approaches. The models are then applied to environmental data sets. The paper concludes with a few directions for future work.
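
A frequentist sketch of the model's structure, assuming the threshold u is known (the paper estimates u and all other parameters within the Bayesian fit): a kernel density below u is spliced to a generalized Pareto tail above it, with the two pieces weighted by the empirical tail probability.

```python
import numpy as np
from scipy import stats

def spliced_density(x, u, grid):
    """KDE below threshold u, GPD tail above; pieces weighted by P(X > u)."""
    x = np.asarray(x, dtype=float)
    p_tail = np.mean(x > u)
    kde = stats.gaussian_kde(x[x <= u])
    body_mass = kde.integrate_box_1d(-np.inf, u)   # renormalize the body piece
    c, _, scale = stats.genpareto.fit(x[x > u] - u, floc=0.0)
    below = (1 - p_tail) * kde(grid) / body_mass
    above = p_tail * stats.genpareto.pdf(grid - u, c, loc=0.0, scale=scale)
    return np.where(grid <= u, below, above)

rng = np.random.default_rng(2)
sample = rng.gumbel(size=2000)                     # heavy-ish upper tail
grid = np.linspace(-3, 8, 400)
dens = spliced_density(sample, u=np.quantile(sample, 0.9), grid=grid)
```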

5.
Length-biased sampling, widely encountered in fields including economics, engineering, epidemiology, the health sciences, technology, and wildlife management, generates biased and right-censored data but often provides the best information available for statistical inference. Unlike traditional right-censored data, length-biased data have unique features that result from their sampling procedure. We exploit these features and propose a general imputation-based estimation method for analyzing length-biased data under a class of flexible semiparametric transformation models. We present new computational algorithms that jointly estimate the regression coefficients and the baseline function semiparametrically. Under the transformation model, the imputation-based method yields an unbiased estimator regardless of whether the censoring depends on the covariates. We establish large-sample properties using empirical process methods. Simulation studies show that, for small to moderate sample sizes, the proposed procedure has smaller mean squared errors than two existing estimation procedures. Finally, we demonstrate the estimation procedure with a real data example.
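
The imputation method for censored, length-biased data is beyond an abstract-level sketch, but the bias itself is easy to demonstrate: under length-biased sampling the density is tilted by x/mu, so E[1/X] = 1/mu and the harmonic mean corrects the naive sample mean. The classical correction below applies to the uncensored case only and is not the authors' estimator.

```python
import numpy as np

rng = np.random.default_rng(3)
pop = rng.gamma(shape=2.0, scale=1.5, size=200_000)     # true mean = 3.0

# length-biased draw: selection probability proportional to x
lb = rng.choice(pop, size=50_000, p=pop / pop.sum())

print("naive mean of biased sample:", lb.mean())            # overestimates 3.0
print("harmonic-mean correction:  ", 1.0 / np.mean(1.0 / lb))  # close to 3.0
```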

6.
In family-based longitudinal genetic studies, investigators collect repeated measurements on a trait that changes with time, along with genetic markers. Since repeated measurements are nested within subjects and subjects are nested within families, both the subject-level and the measurement-level correlations must be taken into account in the statistical analysis to achieve more accurate estimation. In such studies, the primary interests include testing for a quantitative trait locus (QTL) effect and estimating the age-specific QTL effect and the residual polygenic heritability function. We propose flexible semiparametric models, along with statistical estimation and hypothesis-testing procedures, for longitudinal genetic designs, employing penalized splines to estimate the nonparametric functions in the models. We find that misspecifying the baseline function or the genetic effect function in a parametric analysis may lead to a substantially inflated or highly conservative type I error rate in testing and a large mean squared error in estimation. We apply the proposed approaches to examine age-specific effects of genetic variants reported in a recent genome-wide association study of blood pressure collected in the Framingham Heart Study.
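
The nonparametric components are estimated with penalized splines; a generic P-spline sketch (B-spline basis plus a second-difference ridge penalty, rather than the paper's longitudinal variance-components formulation) might look as follows, with df and lam as user-chosen tuning parameters:

```python
import numpy as np
from patsy import dmatrix

def pspline_fit(x, y, df=20, lam=10.0):
    """Penalized-spline smooth of y on x: B-spline basis + 2nd-difference penalty."""
    B = np.asarray(dmatrix(f"bs(x, df={df}) - 1", {"x": x}))
    D = np.diff(np.eye(B.shape[1]), n=2, axis=0)      # second-difference matrix
    beta = np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T @ y)
    return B @ beta

rng = np.random.default_rng(4)
age = np.sort(rng.uniform(20, 80, 300))
effect = np.sin(age / 10.0) + rng.normal(scale=0.3, size=age.size)  # noisy age effect
smooth = pspline_fit(age, effect)
```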

7.
Panel count data arise in many fields, and a number of estimation procedures have been developed for them, along with two procedures for variable selection. In this paper, we address model selection and parameter estimation together. For the former, a focused information criterion (FIC) is presented; for the latter, a frequentist model average (FMA) estimation procedure is developed. A main advantage of the FIC, and its difference from existing model selection methods, is that it emphasizes the accuracy of estimation for the parameters of interest rather than for all parameters. The FMA estimation procedure can yield a further efficiency gain because, unlike existing methods, it takes into account the variability arising at the model selection stage. Asymptotic properties of the proposed estimators are established, and a simulation study suggests that the proposed methods work well in practical situations. An illustrative example is also provided. © 2014 Board of the Foundation of the Scandinavian Journal of Statistics

8.
Time-varying coefficient models with an autoregressive moving-average–generalized autoregressive conditional heteroscedasticity (ARMA–GARCH) structure are proposed for examining the time-varying effects of risk factors in longitudinal studies. Compared with existing models in the literature, the proposed models give explicit patterns for the time-varying coefficients. Maximum likelihood and marginal likelihood (based on a Laplace approximation) are used to estimate the parameters. Simulation studies evaluate the performance of these two estimation methods in terms of the Kullback–Leibler divergence and the root mean squared error; the marginal likelihood approach yields more accurate parameter estimates, although it is more computationally intensive. The proposed models are applied to the Framingham Heart Study to investigate the time-varying effects of covariates on coronary heart disease incidence, with the Bayesian information criterion used to specify the time series structures of the coefficients of the risk factors.

9.
The modulated power law process has recently been proposed as a suitable model for describing the failure pattern of repairable systems when both renewal-type behaviour and a time trend are present. Unfortunately, for small or moderate sample sizes the maximum likelihood method provides neither accurate confidence intervals on the model parameters nor predictive intervals on future observations.

This paper proposes a Bayes approach, based on both non-informative and vague priors, as an alternative to the classical method. Point and interval estimation of the parameters, as well as point and interval prediction of future failure times, are given. Monte Carlo simulation studies show that the Bayes estimation and prediction possess good statistical properties in a frequentist context and are thus a valid alternative to the maximum likelihood approach.

Numerical examples illustrate the estimation and prediction procedures.
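
For orientation only (neither the modulated process nor the paper's Bayes machinery is reproduced here), the ordinary power law process, the non-modulated special case, has closed-form maximum likelihood estimates in the time-truncated case:

```python
import numpy as np

def plp_mle(failure_times, T):
    """Time-truncated ML estimates for the ordinary power law process
    with intensity (beta/theta) * (t/theta)**(beta - 1) on (0, T]."""
    t = np.asarray(failure_times, dtype=float)
    n = t.size
    beta = n / np.sum(np.log(T / t))   # shape: <1 improving, >1 deteriorating
    theta = T / n ** (1.0 / beta)      # scale
    return beta, theta

# toy use: failures of a repairable system observed up to T = 1000 hours
beta_hat, theta_hat = plp_mle([80.0, 210.0, 370.0, 610.0, 890.0], T=1000.0)
```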

10.
In this article, a novel hybrid method for forecasting stock prices is proposed. The method combines the wavelet transform, wavelet denoising, linear models (the autoregressive integrated moving average (ARIMA) model and the exponential smoothing (ES) model), and nonlinear models (a BP neural network and an RBF neural network). The wavelet transform provides a set of constitutive series that are better behaved for prediction than the raw stock series, and wavelet denoising removes slight random fluctuations from the stock series. The ARIMA and ES models forecast the linear component of the denoised series, and the BP and RBF neural networks then serve as nonlinear pattern-recognition tools that correct the estimation error of the linear forecasts. The proposed method is tested on the Shanghai and Shenzhen stock markets, and the results are compared with some of the most recent stock price forecasting methods; they show that the hybrid method considerably improves forecasting accuracy. The method can also be applied to analyzing and forecasting the reliability of products or systems, improving the accuracy of reliability engineering.
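
A sketch of the pipeline's first two stages, assuming PyWavelets and statsmodels are available; the wavelet family, threshold rule, and ARIMA order are illustrative choices, and the BP/RBF error-correction stage is omitted:

```python
import numpy as np
import pywt
from statsmodels.tsa.arima.model import ARIMA

def wavelet_denoise(series, wavelet="db4", level=3):
    """Soft-threshold the detail coefficients with the universal threshold."""
    coeffs = pywt.wavedec(series, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745        # robust noise scale
    thr = sigma * np.sqrt(2.0 * np.log(len(series)))
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(series)]

rng = np.random.default_rng(5)
price = 10 + np.cumsum(rng.normal(scale=0.1, size=512))   # simulated price path
denoised = wavelet_denoise(price)
linear_fc = ARIMA(denoised, order=(1, 1, 1)).fit().forecast(steps=5)
# a BP/RBF network would next be trained on the in-sample residuals
```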

11.
In this paper we propose a novel procedure for estimating semiparametric survival functions. The technique adapts penalized likelihood survival models to the context of lifetime value modeling. The method extends the classical Cox model by introducing a smoothing parameter that can be estimated by penalized maximum likelihood. Markov chain Monte Carlo methods are employed to estimate this smoothing parameter effectively, using an algorithm that combines Metropolis–Hastings and Gibbs sampling. Our proposal is contextualized and compared with conventional models in a marketing application involving the prediction of customer lifetime value.

12.
Chandrasekar and Kale (1984) considered the problem of estimating a vector parameter of interest in the presence of nuisance parameters through vector unbiased statistical estimation functions (USEFs) and obtained an extension of the Cramér–Rao inequality. Based on this result, three optimality criteria were proposed and their equivalence established. In this paper, motivated by the uniformly minimum risk criterion for estimators (Zacks, 1971, p. 102), we propose a new optimality criterion for vector USEFs in the nuisance parameter case and show that it is equivalent to the three existing criteria.

13.
We propose a new class of maximum a posteriori estimators for the parameters of the gamma distribution. These estimators have simple closed-form expressions and can be rewritten as bias-corrected versions of the maximum likelihood-type estimators presented by Ye and Chen [Closed-form estimators for the gamma distribution derived from likelihood equations. Am Statist. 2017;71(2):177–181]. A simulation study comparing different estimation procedures reveals that the new estimation scheme outperforms the existing closed-form estimators and produces extremely efficient estimates for both parameters, even for small sample sizes.
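
The MAP estimators themselves are not given in the abstract, but the Ye and Chen closed-form estimators they bias-correct are standard; a sketch under the shape-scale parameterization:

```python
import numpy as np

def ye_chen_gamma(x):
    """Ye & Chen (2017) closed-form estimators for Gamma(shape, scale)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s = n * np.sum(x * np.log(x)) - np.sum(np.log(x)) * np.sum(x)
    theta = s / n**2                   # scale
    alpha = x.sum() / (n * theta)      # shape; note alpha * theta = sample mean
    return alpha, theta

rng = np.random.default_rng(6)
print(ye_chen_gamma(rng.gamma(shape=2.5, scale=0.8, size=100)))  # ~ (2.5, 0.8)
```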

14.
In this article, a new method, the cumulative slicing principal fitted components (CUPFC) model, is proposed for sufficient dimension reduction and prediction in regression. Building on classical PFC methods, the CUPFC avoids having to select quantities such as the specific form of the basis functions or the number of slices in slicing estimation. We develop the estimator of the central subspace in the CUPFC method under three error-term structures and establish its consistency. Simulations compare the new method with competitors in prediction and reduction estimation, and the results indicate that it generally outperforms existing PFC methods no matter how the predictors are truly related to the response. An application to real data also verifies the validity of the proposed method.

15.
To address the shortcomings of covariance estimation in existing temporal hierarchical forecast combination methods, this paper proposes a shrinkage correction of the covariance matrix that makes its estimation more stable. Monte Carlo simulations show that the shrinkage temporal hierarchical forecast combination method is more accurate than existing methods and is insensitive to seasonal factors. Applied to forecasting China's import trade, the new method significantly improves the forecasts at every level of the hierarchy, with the largest gains for the higher-level series. The shrinkage hierarchical combination method thus offers customs and related agencies a new way to produce simultaneous forecasts across different time dimensions.
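
The abstract does not specify the shrinkage target; to illustrate the general idea, the sketch below applies scikit-learn's Ledoit-Wolf estimator, which shrinks the sample covariance of the base-forecast errors toward a scaled identity with an analytically chosen intensity (the resulting matrix would feed a reconciliation step, not shown):

```python
import numpy as np
from sklearn.covariance import LedoitWolf

rng = np.random.default_rng(7)
# in-sample one-step forecast errors at m = 7 nodes of a temporal hierarchy
# (e.g., 4 quarters, 2 half-years, 1 annual), observed over 60 periods
errors = rng.multivariate_normal(np.zeros(7), 0.5 * np.eye(7) + 0.5, size=60)

lw = LedoitWolf().fit(errors)
W_shrunk = lw.covariance_          # shrunk covariance for combination weights
print("shrinkage intensity:", lw.shrinkage_)
```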

16.
Adaptive Lasso Quantile Regression for Panel Data
Automatically selecting the important explanatory variables while estimating the parameters has long been a central question for panel data quantile regression models. We construct a Bayesian hierarchical quantile regression model with multiple random effects and, assuming that the fixed-effect coefficients follow a new conditional Laplace prior, derive a Gibbs sampling algorithm for estimating the model parameters. Because the coefficients of explanatory variables of differing importance should be shrunk to differing degrees, the constructed prior is adaptive: it accurately and automatically selects the important explanatory variables, and the slice Gibbs sampler we design computes the posterior mean of every parameter quickly and effectively. Simulations show that the new method outperforms the methods commonly used in the literature in both estimation accuracy and variable selection accuracy. An analysis of panel data on several macroeconomic indicators across Chinese regions demonstrates the new method's ability to estimate parameters and select variables.

17.
Wu Mengyun et al., Statistical Research (统计研究), 2021, 38(8): 132–145
Multi-category classification analysis is important in empirical research. However, because of high dimensionality, small samples, and low signal-to-noise ratios, existing multi-class methods still suffer from a shortage of information and thus perform poorly. Researchers have therefore been collecting data from additional information sources to characterize practical problems more completely. Unlike multi-source samples that share the same explanatory variables, the currently popular form of multi-source data collects different sets of explanatory variables on the same samples, and the independence of and correlation among these sources pose new challenges for statistical modeling. This article proposes a multi-class vertical integrative analysis based on canonical variate regression, in which penalization is used for variable selection, the association structure across data sources is explicitly accounted for, and an efficient ADMM algorithm is developed for model optimization. Simulations show that the method is superior in both variable selection and classification prediction. Using multi-source stock data on China's SSE 50 constituents, the method is applied to study the factors influencing daily stock returns in 2019. The analysis shows that the proposed multi-class integrative approach selects interpretable variables while delivering better predictive performance.

18.
In this paper, a penalized weighted least squares approach is proposed for small area estimation under the unit-level model. The new method not only unifies the traditional empirical best linear unbiased prediction, which does not take the sampling design into account, and the pseudo-empirical best linear unbiased prediction, which incorporates sampling weights, but is also more robust to model misspecification than existing methods. The empirical small area estimator is given, and the corresponding second-order approximation to the mean squared error estimator is derived. Numerical comparisons based on synthetic and real data sets show the superior performance of the proposed method over currently available estimators in the literature.

19.
The article considers a new approach to small area estimation based on joint modelling of the means and the variances. Model parameters are estimated via the expectation–maximization algorithm, and the conditional mean squared error is used to evaluate the prediction error. Analytical expressions are obtained for the conditional mean squared error and its estimator; our approximations are second-order correct, an unwritten standard in the small area literature. Simulation studies indicate that the proposed method outperforms existing methods in terms of prediction errors and their estimated values.

20.
The Receiver Operating Characteristic (ROC) curve and the Area Under the ROC Curve (AUC) are effective statistical tools for evaluating the accuracy of diagnostic tests on binary-class medical data. However, many real-world biomedical problems involve more than two categories. The Volume Under the ROC Surface (VUS) and the Hypervolume Under the ROC Manifold (HUM) extend the AUC to three-class and multi-class models, and inference methods for these measures have been proposed recently. We develop a method of constructing a linear combination of markers for which the VUS or HUM of the combined marker is maximized. Asymptotic validity of the estimator is justified by extending results for maximum rank correlation estimation that are well known in econometrics. A bootstrap resampling method is then applied to estimate the sampling variability. Simulations and examples are provided to demonstrate our methods.
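
For three ordered classes, VUS is P(X1 < X2 < X3) for markers drawn from classes 1, 2, and 3. A sketch of the empirical estimator, together with a naive one-parameter search over two-marker combinations (the paper's maximum rank correlation machinery is not reproduced):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def empirical_vus(x1, x2, x3):
    """Empirical P(X1 < X2 < X3) over all cross-class triples."""
    lt12 = (np.asarray(x1)[:, None] < np.asarray(x2)[None, :]).astype(float)
    lt23 = (np.asarray(x2)[:, None] < np.asarray(x3)[None, :]).astype(float)
    return (lt12 @ lt23).sum() / (len(x1) * len(x2) * len(x3))

rng = np.random.default_rng(8)
# two markers per subject, three ordered disease classes
M = [rng.normal(loc=mu, size=(80, 2)) for mu in ([0, 0], [0.7, 0.3], [1.4, 0.6])]

def neg_vus(angle):                    # beta = (cos a, sin a): scale-free combo
    b = np.array([np.cos(angle), np.sin(angle)])
    return -empirical_vus(*(m @ b for m in M))

best = minimize_scalar(neg_vus, bounds=(0.0, np.pi), method="bounded")
print("max VUS:", -best.fun, "at beta =", (np.cos(best.x), np.sin(best.x)))
```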
