Similar Articles
20 similar articles found.
1.
Various nonparametric and parametric estimators of extremal dependence have been proposed in the literature. Nonparametric methods commonly suffer from the curse of dimensionality and have been mostly implemented in extreme-value studies up to three dimensions, whereas parametric models can tackle higher-dimensional settings. In this paper, we assess, through a vast and systematic simulation study, the performance of classical and recently proposed estimators in multivariate settings. In particular, we first investigate the performance of nonparametric methods and then compare them with classical parametric approaches under symmetric and asymmetric dependence structures within the commonly used logistic family. We also explore two different ways to make nonparametric estimators satisfy the necessary dependence function shape constraints, finding a general improvement in estimator performance either (i) by substituting the estimator with its greatest convex minorant, developing a computational tool to implement this method for dimensions \(D\ge 2\) or (ii) by projecting the estimator onto a subspace of dependence functions satisfying such constraints and taking advantage of Bernstein–Bézier polynomials. Implementing the convex minorant method leads to better estimator performance as the dimensionality increases.
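The greatest-convex-minorant correction in (i) is easiest to picture in the bivariate case (D = 2), where the Pickands dependence function A(t) is a convex curve on [0, 1]. The sketch below is an illustrative reconstruction, not the authors' tool: it takes a (possibly non-convex) grid estimate and replaces it with its lower convex hull; the logistic dependence function and noise level are arbitrary choices used only to generate input.

```python
import numpy as np

def greatest_convex_minorant(t, a):
    """Greatest convex minorant of a function sampled at points (t, a).

    Builds the lower convex hull of the sample points with Andrew's
    monotone-chain scan, then interpolates the hull back onto the grid t.
    """
    hull = []  # indices of lower-hull vertices
    for i in range(len(t)):
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            # drop the middle point if it lies on or above the chord i0 -> i
            cross = (t[i1] - t[i0]) * (a[i] - a[i0]) - (a[i1] - a[i0]) * (t[i] - t[i0])
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    return np.interp(t, t[hull], a[hull])

# Example input: a noisy estimate of a logistic Pickands dependence function
t = np.linspace(0.0, 1.0, 201)
alpha = 0.5
a_true = (t ** (1 / alpha) + (1 - t) ** (1 / alpha)) ** alpha  # convex, A(0)=A(1)=1
rng = np.random.default_rng(0)
a_noisy = a_true + rng.normal(0.0, 0.01, size=t.size)          # violates convexity
a_gcm = greatest_convex_minorant(t, a_noisy)                   # convex again
```

The resulting curve lies below the raw estimate everywhere and is convex by construction, which is exactly the shape constraint a dependence function must satisfy.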

2.
In this paper, we consider the problem of estimation of semi-linear regression models. Using invariance arguments, Bhowmik and King [2007. Maximal invariant likelihood based testing of semi-linear models. Statist. Papers 48, 357–383] derived the probability density function of the maximal invariant statistic for the non-linear component of these models. Using this density function as a likelihood function allows us to estimate these models in a two-step process. First the non-linear component parameters are estimated by maximising the maximal invariant likelihood function. Then the non-linear component, with the parameter values replaced by estimates, is treated as a regressor and ordinary least squares is used to estimate the remaining parameters. We report the results of a simulation study conducted to compare the accuracy of this approach with full maximum likelihood and maximum profile-marginal likelihood estimation. We find maximising the maximal invariant likelihood function typically results in less biased and lower variance estimates than those from full maximum likelihood.
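The two-step structure (estimate the non-linear parameter first, then plug it into OLS) can be sketched with a profile least-squares analogue; this is not the maximal invariant likelihood itself, and the model, parameter values, and grid below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0.0, 1.0, n)
b0, b1, gamma = 1.0, 0.5, 2.0                      # true values (illustrative)
y = b0 + b1 * np.exp(gamma * x) + rng.normal(0.0, 0.05, n)

def profile_rss(g):
    """Residual sum of squares with the non-linear parameter g held fixed."""
    X = np.column_stack([np.ones(n), np.exp(g * x)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

# Step 1: estimate the non-linear parameter (grid search for simplicity)
grid = np.linspace(0.5, 4.0, 701)
gamma_hat = grid[np.argmin([profile_rss(g) for g in grid])]

# Step 2: treat exp(gamma_hat * x) as an ordinary regressor and run OLS
X = np.column_stack([np.ones(n), np.exp(gamma_hat * x)])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The key design point is that Step 2 is ordinary least squares: once the non-linear component is fixed at its estimate, the remaining parameters enter linearly.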

3.
We investigate here small sample properties of approximate F-tests about fixed effects parameters in nonlinear mixed models. For estimation of population fixed effects parameters as well as variance components, we apply the two-stage approach. This method is useful and popular when the number of observations per sampling unit is large enough. The approximate F-test is constructed based on a large-sample approximation to the distribution of nonlinear least-squares estimates of subject-specific parameters. We recommend a modified test statistic that takes into consideration an approximation to the large-sample Fisher information matrix (see [Volaufova J, Burton JH. Note on hypothesis testing in mixed models. Oral presentation at: LINSTAT 2012/21st IWMS; 2012; Bedlewo, Poland]). Our main focus is on comparing finite sample properties of broadly used approximate tests (the Wald test and likelihood ratio test) and the modified F-test under the null hypothesis, especially the accuracy of p-values (see [Volaufova J, LaMotte L. Comparison of approximate tests of fixed effects in linear repeated measures design models with covariates. Tatra Mountains. 2008;39:17–25]). For that purpose, two extensive simulation studies are conducted based on pharmacokinetic models (see [Hartford A, Davidian M. Consequences of misspecifying assumptions in nonlinear mixed effects models. Comput Stat and Data Anal. 2000;34:139–164; Pinheiro J, Bates D. Approximations to the log-likelihood function in the non-linear mixed-effects model. J Comput Graph Stat. 1995;4(1):12–35]).

4.
Partially linear regression models are semiparametric models that contain both linear and nonlinear components. They are extensively used in many scientific fields for their flexibility and convenient interpretability. In such analyses, testing the significance of the regression coefficients in the linear component is typically a key focus. Under the high-dimensional setting, i.e., “large p, small n,” the conventional F-test strategy does not apply because the coefficients need to be estimated through regularization techniques. In this article, we develop a new test using a U-statistic of order two, relying on a pseudo-estimate of the nonlinear component from the classical kernel method. Using the martingale central limit theorem, we prove the asymptotic normality of the proposed test statistic under some regularity conditions. We further demonstrate our proposed test's finite-sample performance by simulation studies and by analyzing some breast cancer gene expression data.
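A U-statistic of order two is simply the average of a symmetric kernel over all unordered pairs of observations. The sketch below is a generic evaluator, not the paper's test statistic; as a sanity check it uses the classical kernel h(a, b) = (a − b)²/2, whose U-statistic equals the unbiased sample variance.

```python
import numpy as np
from itertools import combinations

def u_statistic(data, kernel):
    """Order-two U-statistic: average of a symmetric kernel over all pairs."""
    vals = [kernel(data[i], data[j]) for i, j in combinations(range(len(data)), 2)]
    return float(np.mean(vals))

x = np.array([1.0, 2.0, 3.0, 4.0])
# kernel (a - b)^2 / 2 gives the unbiased sample variance
var_hat = u_statistic(x, lambda a, b: 0.5 * (a - b) ** 2)  # = 5/3
```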

5.
In this article, we propose a class of mixed models for recurrent event data. The new models include the proportional rates model and Box–Cox transformation rates models as special cases, and allow the effects of covariates on the rate functions of counting processes to be proportional or convergent. For inference on the model parameters, estimating equation approaches are developed. The asymptotic properties of the resulting estimators are established and the finite sample performance of the proposed procedure is evaluated through simulation studies. A real example, with data from a clinical study on chronic granulomatous disease (CGD), is also provided to illustrate the use of the proposed methodology. The Canadian Journal of Statistics 39: 578–590; 2011. © 2011 Statistical Society of Canada

6.
Fitting cross-classified multilevel models with binary response is challenging. In this setting a promising method is Bayesian inference through Integrated Nested Laplace Approximations (INLA), which performs well in several latent variable models. We devise a systematic simulation study to assess the performance of INLA with cross-classified binary data under different scenarios defined by the magnitude of the variances of the random effects, the number of observations, the number of clusters, and the degree of cross-classification. In the simulations INLA is systematically compared with the popular method of Maximum Likelihood via Laplace Approximation. Through an application to the classical salamander mating data, we compare INLA with the best-performing methods. Given the computational speed and the generally good performance, INLA turns out to be a valuable method for fitting logistic cross-classified models.

7.
For small area estimation of area-level data, the Fay–Herriot model is extensively used as a model-based method. In the Fay–Herriot model, it is conventionally assumed that the sampling variances are known, whereas estimators of sampling variances are used in practice. Thus, the assumption of known sampling variances is unrealistic, and several methods have been proposed to overcome this problem. In this paper, we assume the situation where direct estimators of the sampling variances are available as well as the sample means. Using this information, we propose a Bayesian yet objective method that produces shrinkage estimates of both means and variances in the Fay–Herriot model. We consider a hierarchical structure for the sampling variances, and we set a uniform prior on the model parameters to maintain the objectivity of the proposed model. To validate the posterior inference, we show under mild conditions that the posterior distribution is proper and has finite variances. We investigate the numerical performance through simulation and empirical studies.
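The shrinkage mechanism in the Fay–Herriot model can be sketched in its simplest empirical-Bayes form (not the paper's fully Bayesian treatment): with sampling variances D_i, model variance A, and synthetic mean mu all taken as given, each direct estimate is pulled toward mu with weight B_i = D_i / (A + D_i). All numbers below are illustrative.

```python
import numpy as np

def fay_herriot_shrinkage(y, D, A, mu):
    """Shrink direct area estimates y toward the synthetic mean mu.

    B_i = D_i / (A + D_i) is the shrinkage weight: areas with noisier
    direct estimates (large D_i) are pulled harder toward mu.
    """
    B = D / (A + D)
    return (1.0 - B) * y + B * mu

y = np.array([10.0, 12.0, 8.0, 20.0])   # direct survey estimates
D = np.array([1.0, 1.0, 1.0, 25.0])     # the last area has a very noisy survey
A, mu = 4.0, 11.0
theta = fay_herriot_shrinkage(y, D, A, mu)
```

Note how the last area, with D = 25, moves a much larger fraction of the way toward mu than the precisely measured areas.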

8.
The random censorship model (RCM) is commonly used in biomedical science for modeling life distributions. The popular non-parametric Kaplan–Meier estimator and some semiparametric models such as Cox proportional hazard models are extensively discussed in the literature. In this paper, we propose to fit the RCM under the assumption that the actual life distribution and the censoring distribution have a proportional odds relationship. The parametric model is defined using Marshall–Olkin's extended Weibull distribution. We utilize the maximum-likelihood procedure to estimate the model parameters, the survival distribution, the mean residual life function, and the hazard rate. The proportional odds assumption is also justified by the newly proposed bootstrap Kolmogorov–Smirnov type goodness-of-fit test. A simulation study on the MLE of the model parameters and the median survival time is carried out to assess the finite sample performance of the model. Finally, we implement the proposed model on two real-life data sets.
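The proportional odds property of the Marshall–Olkin extension is easy to verify numerically: with baseline Weibull survival S0(t), the extended survival S(t) = αS0(t) / (1 − (1 − α)S0(t)) has survival odds exactly α times the baseline odds at every t. Parameter values below are illustrative.

```python
import numpy as np

def weibull_sf(t, shape, scale):
    """Baseline Weibull survival function."""
    return np.exp(-(t / scale) ** shape)

def mo_extended_weibull_sf(t, alpha, shape, scale):
    """Marshall-Olkin extended Weibull survival function:
    S(t) = alpha * S0(t) / (1 - (1 - alpha) * S0(t))."""
    s0 = weibull_sf(t, shape, scale)
    return alpha * s0 / (1.0 - (1.0 - alpha) * s0)

t = np.linspace(0.1, 5.0, 50)
alpha, shape, scale = 2.0, 1.5, 1.0          # illustrative parameter values
s0 = weibull_sf(t, shape, scale)
s = mo_extended_weibull_sf(t, alpha, shape, scale)
# proportional odds: the survival odds ratio is constant in t, equal to alpha
odds_ratio = (s / (1.0 - s)) / (s0 / (1.0 - s0))
```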

9.
The partial least squares (PLS) approach first constructs new explanatory variables, known as factors (or components), which are linear combinations of available predictor variables. A small subset of these factors is then chosen and retained for prediction. We study the performance of PLS in estimating single-index models, especially when the predictor variables exhibit high collinearity. We show that PLS estimates are consistent up to a constant of proportionality. We present three simulation studies that compare the performance of PLS in estimating single-index models with that of sliced inverse regression (SIR). In the first two studies, we find that PLS performs better than SIR when collinearity exists. In the third study, we learn that PLS performs well even when there are multiple dependent variables, the link function is non-linear and the shape of the functional form is not known.
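The factor-construction step can be sketched with the classical NIPALS deflation for a single response (PLS1); this is a generic illustration, not the paper's implementation, and the collinear test data are an assumption chosen to mimic the setting studied.

```python
import numpy as np

def pls1(X, y, n_components):
    """PLS1 regression via NIPALS: weights, scores, loadings, deflation."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)          # weight: direction of max covariance with y
        t = Xk @ w                      # component (score)
        p = Xk.T @ t / (t @ t)          # X loading
        qk = yk @ t / (t @ t)           # y loading
        Xk = Xk - np.outer(t, p)        # deflate X and y
        yk = yk - qk * t
        W.append(w); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)   # coefficients in original X space
    return beta, X.mean(axis=0), y.mean()

rng = np.random.default_rng(2)
n = 100
z = rng.normal(size=n)
X = np.column_stack([z + 0.01 * rng.normal(size=n) for _ in range(5)])  # collinear
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=n)
beta, xm, ym = pls1(X, y, n_components=1)
yhat = (X - xm) @ beta + ym
```

With highly collinear predictors a single component already captures essentially all of the predictive signal, which is the situation where PLS is expected to shine.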

10.
Markov regression models are useful tools for estimating the impact of risk factors on rates of transition between multiple disease states. Alzheimer's disease (AD) is an example of a multi-state disease process in which great interest lies in identifying risk factors for transition. In this context, non-homogeneous models are required because transition rates change as subjects age. In this report we propose a non-homogeneous Markov regression model that allows for reversible and recurrent disease states, transitions among multiple states between observations, and unequally spaced observation times. We conducted simulation studies to demonstrate performance of estimators for covariate effects from this model and compare performance with alternative models when the underlying non-homogeneous process was correctly specified and under model misspecification. In simulation studies, we found that covariate effects were biased if non-homogeneity of the disease process was not accounted for. However, estimates from non-homogeneous models were robust to misspecification of the form of the non-homogeneity. We used our model to estimate risk factors for transition to mild cognitive impairment (MCI) and AD in a longitudinal study of subjects included in the National Alzheimer's Coordinating Center's Uniform Data Set. Using our model, we found that subjects with MCI affecting multiple cognitive domains were significantly less likely to revert to normal cognition.
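The computational core of such a model is the transition probability matrix P(t0, t1) of a non-homogeneous process, which can be approximated by a product integral over small time steps. The sketch below uses a first-order (Euler) product approximation and a made-up 3-state intensity function whose rates grow with age; it is illustrative only, not the paper's model.

```python
import numpy as np

def transition_probability(q_of_t, t0, t1, n_steps=2000):
    """P(t0, t1) for a non-homogeneous Markov process via a first-order
    product-integral approximation: P ~ product of (I + Q(t) dt) factors."""
    grid = np.linspace(t0, t1, n_steps + 1)
    k = q_of_t(t0).shape[0]
    P = np.eye(k)
    for a, b in zip(grid[:-1], grid[1:]):
        P = P @ (np.eye(k) + q_of_t(0.5 * (a + b)) * (b - a))
    return P

def q_of_t(t):
    # illustrative 3-state intensity matrix (numbers are made up):
    # forward rates increase with age t; state 2 (e.g. AD) is absorbing,
    # state 1 (e.g. MCI) can revert to state 0 (normal cognition)
    lam = 0.1 * (1.0 + t)
    mu = 0.05
    return np.array([[-lam, lam, 0.0],
                     [mu, -(mu + lam), lam],
                     [0.0, 0.0, 0.0]])

P = transition_probability(q_of_t, 0.0, 2.0)
```

Because the product form is built from proper stochastic factors, each row of P sums to one, and multiple transitions between observation times are handled automatically (P[0, 2] is positive even though no direct 0 to 2 rate exists).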

11.
Dynamic regression models are widely used because they express and model the behaviour of a system over time. In this article, two dynamic regression models, the distributed lag (DL) model and the autoregressive distributed lag model, are evaluated with a focus on their lag lengths. From a classical statistics point of view, there are various methods to determine the number of lags, but none of them is best in all situations. This is a serious issue, since wrong choices will provide bad estimates for the effects of the regressors on the response variable. We present an alternative to the aforementioned problems by considering a Bayesian approach. The posterior distributions of the numbers of lags are derived under an improper prior for the model parameters. The fractional Bayes factor technique [A. O'Hagan, Fractional Bayes factors for model comparison (with discussion), J. R. Statist. Soc. B 57 (1995), pp. 99–138] is used to handle the indeterminacy in the likelihood function caused by the improper prior. The zero-one loss function is used to penalize wrong decisions. A naive method using the specified maximum number of DLs is also presented. The proposed and naive methods are verified using simulated data. The results for the proposed method are promising. An illustrative example with a real data set is provided.
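For concreteness, one classical frequentist baseline against which such Bayesian lag selection is usually compared is fitting the DL model for every candidate lag length on a common sample and picking the lag by AIC. The sketch below is that baseline, with illustrative data, not the paper's Bayesian procedure.

```python
import numpy as np

def select_dl_lag(y, x, max_lag):
    """Fit distributed-lag models y_t = a + sum_{j=0}^{q} b_j x_{t-j} + e_t
    for q = 0..max_lag on a common sample and pick q by AIC."""
    n = len(y)
    rows = np.arange(max_lag, n)          # same observations for every q
    best = None
    for q in range(max_lag + 1):
        X = np.column_stack([np.ones(rows.size)] + [x[rows - j] for j in range(q + 1)])
        beta, *_ = np.linalg.lstsq(X, y[rows], rcond=None)
        r = y[rows] - X @ beta
        aic = rows.size * np.log(r @ r / rows.size) + 2 * (X.shape[1] + 1)
        if best is None or aic < best[0]:
            best = (aic, q, beta)
    return best[1], best[2]

# simulated DL(2) data with illustrative coefficients
rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 1.0 + 0.8 * x[t] + 0.5 * x[t - 1] + 0.3 * x[t - 2] + 0.1 * rng.normal()
q_hat, beta_hat = select_dl_lag(y[2:], x[2:], max_lag=6)
```

As the abstract notes, no single classical criterion dominates: AIC in particular can over-select lags, which is part of the motivation for the posterior-distribution-over-lags alternative.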

12.
The prognosis for patients with high grade gliomas is poor, with a median survival of 1 year. Treatment efficacy assessment is typically unavailable until 5-6 months post diagnosis. Investigators hypothesize that quantitative magnetic resonance imaging can assess treatment efficacy 3 weeks after therapy starts, thereby allowing salvage treatments to begin earlier. The purpose of this work is to build a predictive model of treatment efficacy by using quantitative magnetic resonance imaging data and to assess its performance. The outcome is 1-year survival status. We propose a joint, two-stage Bayesian model. In stage I, we smooth the image data with a multivariate spatiotemporal pairwise difference prior. We propose four summary statistics that are functionals of posterior parameters from the first-stage model. In stage II, these statistics enter a generalized non-linear model as predictors of survival status. We use the probit link and a multivariate adaptive regression spline basis. Gibbs sampling and reversible jump Markov chain Monte Carlo methods are applied iteratively between the two stages to estimate the posterior distribution. Through both simulation studies and model performance comparisons we find that we can achieve higher overall correct classification rates by accounting for the spatiotemporal correlation in the images and by allowing for a more complex and flexible decision boundary provided by the generalized non-linear model.  相似文献   

13.
Transition models are an important framework for modelling longitudinal categorical data. A relevant issue in applying these models is the condition of stationarity, or homogeneity of the transition probabilities over time. We propose two tests to assess stationarity in transition models, a Wald test and a likelihood-ratio test, which, in contrast to the classical test available in the literature, do not use the transition probabilities directly but only the estimated parameters of the models. In this paper, we present two motivating studies with ordinal longitudinal data, to which proportional odds transition models are fitted and the two proposed tests, as well as the classical test, are applied. Additionally, their performances are assessed through simulation studies. The results show that the proposed tests perform well, offering better control of the type-I error, and their power functions are asymptotically equivalent. Moreover, the correlations between the Wald, likelihood-ratio, and classical test statistics are positive and large, indicating general concordance. Both proposed tests are also more flexible and can be applied in studies with qualitative and quantitative covariates.
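For orientation, a textbook version of the classical stationarity test (the benchmark the proposed tests are compared against) contrasts block-specific transition matrices with the pooled matrix via a likelihood ratio. Details below — the blocking of time, the handling of zero counts, and the simulated chain — are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def lr_stationarity_test(seqs, n_blocks=2):
    """Classical LR test of stationarity for a Markov chain: compare
    block-specific transition matrices with the pooled one. Under H0 the
    statistic is asymptotically chi-square with (n_blocks-1)*k*(k-1) df."""
    k = max(max(s) for s in seqs) + 1
    counts = np.zeros((n_blocks, k, k))
    for s in seqs:
        T = len(s) - 1
        for t in range(T):
            b = min(n_blocks * t // T, n_blocks - 1)
            counts[b, s[t], s[t + 1]] += 1
    pooled = counts.sum(axis=0)
    p_pool = pooled / pooled.sum(axis=1, keepdims=True)
    stat = 0.0
    for b in range(n_blocks):
        row = counts[b].sum(axis=1, keepdims=True)
        p_b = counts[b] / np.where(row > 0, row, 1.0)
        m = counts[b] > 0                 # skip empty cells
        stat += 2.0 * np.sum(counts[b][m] * np.log(p_b[m] / p_pool[m]))
    return stat, (n_blocks - 1) * k * (k - 1)

# a stationary 2-state chain: the test statistic should look chi-square(2)
rng = np.random.default_rng(6)
P = np.array([[0.8, 0.2], [0.3, 0.7]])
seqs = []
for _ in range(50):
    s = [0]
    for _ in range(20):
        s.append(int(rng.choice(2, p=P[s[-1]])))
    seqs.append(s)
stat, df = lr_stationarity_test(seqs)
```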

14.
Variable selection in the presence of outliers may be performed by using a robust version of Akaike's information criterion (AIC). In this paper, explicit expressions are obtained for such criteria when S- and MM-estimators are used. The performance of these criteria is compared with the existing AIC based on M-estimators and with the classical non-robust AIC. In a simulation study and in data examples, we observe that the proposed AIC with S- and MM-estimators selects more appropriate models when outliers are present.

15.
In this work, we propose a generalization of the classical Markov-switching ARMA models to the periodic time-varying case. Specifically, we propose a Markov-switching periodic ARMA (MS-PARMA) model. In addition to capturing the regime switching often encountered in the study of many economic time series, this new model also captures the periodicity feature in the autocorrelation structure. We first provide some probabilistic properties of this class of models, namely strict periodic stationarity and the existence of higher-order moments. We then propose a procedure for computing the autocovariance function, showing that the autocovariances of the MS-PARMA model satisfy a system of equations similar to the PARMA Yule–Walker equations. We also propose an easily implemented algorithm for obtaining parameter estimates of the MS-PARMA model. Finally, a simulation study of the performance of the proposed estimation method is provided.

16.
One of the main problems that the drug discovery research field confronts is to identify small molecules, modulators of protein function, which are likely to be therapeutically useful. Common practice relies on the screening of vast libraries of small molecules (often 1–2 million molecules) in order to identify a molecule, known as a lead molecule, which specifically inhibits or activates the protein function. To search for the lead molecule, we investigate the molecular structure, which generally consists of an extremely large number of fragments. The presence or absence of particular fragments, or groups of fragments, can strongly affect molecular properties. We study the relationship between a molecule's properties and its fragment composition by building a regression model in which the predictors, represented by binary variables indicating the presence or absence of fragments, are grouped in subsets, and a bi-level penalization term is introduced to handle the high dimensionality of the problem. We evaluate the performance of this model in two simulation studies, comparing different penalization terms and different clustering techniques to derive the best predictor subset structure. Both studies are characterized by small sets of data relative to the number of predictors under consideration. From the results of these simulation studies, we show that our approach can generate models able to identify key features and provide accurate predictions. The good performance of these models is then demonstrated with real data on the MMP-12 enzyme.

17.
Degrees are a classical and relevant way to study the topology of a network. They can be used to assess the goodness of fit of a given random graph model. In this paper, we introduce goodness-of-fit tests for two classes of models. First, we consider the case of independent graph models, such as the heterogeneous Erdős–Rényi model, in which the edges have different connection probabilities. Second, we consider a generic model for exchangeable random graphs called the W-graph. The stochastic block model and the expected degree distribution model fall within this framework. We prove the asymptotic normality of the degree mean square under these independent and exchangeable models and derive formal tests. We study the power of the proposed tests and we prove asymptotic normality under specific sparsity regimes. The tests are illustrated on real networks from the social sciences and ecology, and their performances are assessed via a simulation study.
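The raw ingredient of these tests, the degree mean square, is straightforward to compute from an adjacency matrix. The sketch below checks it on a complete graph (where the value is known in closed form) and on an Erdős–Rényi simulation; the centering and scaling used in the formal tests are omitted, and the simulation parameters are illustrative.

```python
import numpy as np

def degree_mean_square(adj):
    """Mean squared degree of an undirected graph from its adjacency matrix."""
    deg = adj.sum(axis=1)
    return float(np.mean(deg ** 2))

# sanity check on the complete graph K_6: every degree is 5
K = np.ones((6, 6)) - np.eye(6)
dms_complete = degree_mean_square(K)          # (6 - 1)^2 = 25

# Erdos-Renyi simulation: compare with E[deg^2] = (n-1)p(1-p) + ((n-1)p)^2
rng = np.random.default_rng(7)
n, p = 500, 0.1
upper = np.triu(rng.random((n, n)) < p, k=1)  # symmetric, no self-loops
adj = (upper | upper.T).astype(float)
dms_er = degree_mean_square(adj)
expected = (n - 1) * p * (1 - p) + ((n - 1) * p) ** 2
```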

18.
A version of the nonparametric bootstrap that resamples entire subjects from the original data, called the case bootstrap, has been increasingly used for estimating uncertainty of parameters in mixed-effects models. It is usually applied to obtain more robust estimates of the parameters and more realistic confidence intervals (CIs). Alternative bootstrap methods, such as residual bootstrap and parametric bootstrap that resample both random effects and residuals, have been proposed to better take into account the hierarchical structure of multi-level and longitudinal data. However, few studies have been performed to compare these different approaches. In this study, we used simulation to evaluate bootstrap methods proposed for linear mixed-effect models. We also compared the results obtained by maximum likelihood (ML) and restricted maximum likelihood (REML). Our simulation studies showed the good performance of the case bootstrap as well as the bootstraps of both random effects and residuals. On the other hand, the bootstrap methods that resample only the residuals and the bootstraps combining case and residuals performed poorly. REML and ML provided similar bootstrap estimates of uncertainty, but there was slightly more bias and poorer coverage rate for variance parameters with ML in the sparse design. We applied the proposed methods to a real dataset from a study investigating the natural evolution of Parkinson's disease and were able to confirm that the methods provide plausible estimates of uncertainty. Given that most real-life datasets tend to exhibit heterogeneity in sampling schedules, the residual bootstraps would be expected to perform better than the case bootstrap. Copyright © 2013 John Wiley & Sons, Ltd.
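The defining feature of the case bootstrap, resampling whole subjects so that each subject's repeated measurements stay together, can be sketched as follows. The data-generating numbers and the choice of the sample mean as the statistic are illustrative; in the study above the statistic would be a mixed-model parameter estimate.

```python
import numpy as np

def case_bootstrap(ids, values, stat, n_boot=1000, seed=0):
    """Case bootstrap: resample whole subjects with replacement, keeping
    each subject's repeated measurements together."""
    rng = np.random.default_rng(seed)
    subjects = np.unique(ids)
    by_subject = {s: values[ids == s] for s in subjects}
    out = np.empty(n_boot)
    for b in range(n_boot):
        draw = rng.choice(subjects, size=subjects.size, replace=True)
        out[b] = stat(np.concatenate([by_subject[s] for s in draw]))
    return out

# toy longitudinal data: 20 subjects, 4 visits each (all numbers illustrative)
rng = np.random.default_rng(4)
subj = np.repeat(np.arange(20), 4)
effect = np.repeat(rng.normal(0.0, 1.0, 20), 4)      # subject random effect
y = 5.0 + effect + rng.normal(0.0, 0.5, subj.size)   # plus residual error
boot_means = case_bootstrap(subj, y, np.mean)
se_case = boot_means.std(ddof=1)                     # bootstrap SE of the mean
```

Because subjects, not individual observations, are resampled, the between-subject variability is reflected in the standard error, which a naive observation-level bootstrap would understate here.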

19.
The Buckley–James estimator (BJE) is a widely recognized approach for dealing with right-censored linear regression models. There has been much discussion in the literature on the estimation of the BJE as well as its asymptotic distribution. So far, no simulation has been done to directly estimate the asymptotic variance of the BJE. Kong and Yu [Asymptotic distributions of the Buckley–James estimator under nonstandard conditions, Statist. Sinica 17 (2007), pp. 341–360] studied the asymptotic distribution under discontinuous assumptions. Based on their methodology, we recalculate and correct some missing terms in the expression of the asymptotic variance in Theorem 2 of their work. We propose an estimator of the standard deviation of the BJE by using plug-in estimators. The estimator is shown to be consistent. The performance of the estimator is assessed through simulation studies under discrete underlying distributions. We further extend our studies to several continuous underlying distributions through simulation. The estimator is also applied to a real medical data set. The simulation results suggest that our estimate is a good approximation to the true standard deviation, as judged against the empirical standard deviation.

20.
Summary. The classical approach to statistical analysis is usually based upon finding values for model parameters that maximize the likelihood function. Model choice in this context is often also based on the likelihood function, but with the addition of a penalty term for the number of parameters. Though models may be compared pairwise by using likelihood ratio tests for example, various criteria such as the Akaike information criterion have been proposed as alternatives when multiple models need to be compared. In practical terms, the classical approach to model selection usually involves maximizing the likelihood function associated with each competing model and then calculating the corresponding criteria value(s). However, when large numbers of models are possible, this quickly becomes infeasible unless a method that simultaneously maximizes over both parameter and model space is available. We propose an extension to the traditional simulated annealing algorithm that allows for moves that not only change parameter values but also move between competing models. This transdimensional simulated annealing algorithm can therefore be used to locate models and parameters that minimize criteria such as the Akaike information criterion, but within a single algorithm, removing the need for large numbers of simulations to be run. We discuss the implementation of the transdimensional simulated annealing algorithm and use simulation studies to examine its performance in realistically complex modelling situations. We illustrate our ideas with a pedagogic example based on the analysis of an autoregressive time series and two more detailed examples: one on variable selection for logistic regression and the other on model selection for the analysis of integrated recapture–recovery data.
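The flavour of annealing over model space can be conveyed with a toy variable-selection sketch. It is a deliberate simplification of the algorithm described above: moves only add or remove a variable (the trans-dimensional part), and the parameters are refit by OLS at each visited model rather than being updated by their own annealing moves; the data, cooling schedule, and AIC form are illustrative.

```python
import numpy as np

def aic_ols(X, y, subset):
    """AIC of an OLS fit using the predictors in `subset` (plus intercept)."""
    n = len(y)
    Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    r = y - Xs @ beta
    return n * np.log(r @ r / n) + 2 * (len(subset) + 2)

def transdimensional_sa(X, y, n_iter=2000, t0=5.0, seed=0):
    """Simulated annealing whose moves change the model itself: each
    proposal flips one variable in or out; worse models are accepted with
    probability exp(-increase / temperature), which shrinks as we cool."""
    rng = np.random.default_rng(seed)
    state = set()
    cur = aic_ols(X, y, sorted(state))
    best_state, best = set(state), cur
    for i in range(n_iter):
        temp = t0 / (1.0 + i)                  # cooling schedule
        j = int(rng.integers(X.shape[1]))
        prop = state ^ {j}                     # flip variable j in or out
        val = aic_ols(X, y, sorted(prop))
        if val < cur or rng.random() < np.exp((cur - val) / temp):
            state, cur = prop, val
            if cur < best:
                best_state, best = set(state), cur
    return sorted(best_state), best

rng = np.random.default_rng(5)
n, p = 200, 8
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.5 * rng.normal(size=n)  # true model {0, 3}
selected, best_aic = transdimensional_sa(X, y)
```

A single annealing run searches models and (implicitly) their parameters together, which is the point emphasized above: no separate maximization is run per candidate model.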


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号