1.

MARS as an alternative approach of Gaussian graphical model for biochemical networks

Ezgi Ayyıldız Melih Ağraz 《Journal of applied statistics》2017,44(16):2858-2876

The Gaussian graphical model (GGM) is one of the well-known modelling approaches for describing biological networks under the steady-state condition via the precision matrix of the data. In the literature there are different methods to infer model parameters based on GGM. Neighbourhood selection with lasso regression and the graphical lasso are the most common of these estimation methods, but they can be computationally demanding as the system's dimension increases. Here, we suggest a non-parametric statistical approach, multivariate adaptive regression splines (MARS), as an alternative to GGM. To compare the performance of both models, we evaluate the findings on normal and non-normal data via the specificity, precision, F-measure and computational cost. From the outputs, we see that MARS performs well, making it a plausible alternative to GGM in the construction of complex biological systems.
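
As a rough illustration of how a GGM reads a network off the precision matrix, the sketch below converts a precision matrix into partial correlations and lists the non-zero off-diagonal entries as edges. The 3-node matrix `theta` and the helper name `partial_correlations` are illustrative assumptions, not taken from the paper, and no estimation step (lasso or graphical lasso) is shown.

```python
import numpy as np

def partial_correlations(precision):
    """Convert a precision matrix into a matrix of partial correlations."""
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Hypothetical precision matrix of a 3-node chain network 0 - 1 - 2
# (illustrative numbers only, not data from the paper).
theta = np.array([[2.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 2.0]])

pcor = partial_correlations(theta)

# An edge is present when the partial correlation is non-zero.
edges = [(i, j) for i in range(3) for j in range(i + 1, 3)
         if abs(pcor[i, j]) > 1e-8]
print(edges)
```

A zero entry in the precision matrix means the two variables are conditionally independent given the rest, which is why nodes 0 and 2 are not connected here.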

2.

A generalization of the Grizzle model to the estimation of treatment effects in crossover trials with non-compliance

Ali Reza Soltanian Soghrat Faghihzadeh 《Journal of applied statistics》2012,39(5):1037-1048

Compliance with one specified dosing strategy of assigned treatments is a common problem in randomized drug clinical trials. Recently, there has been much interest in methods used for analysing treatment effects in randomized clinical trials that are subject to non-compliance. In this paper, we estimate and compare treatment effects based on the Grizzle model (GM) (ignorable non-compliance) as the custom model and the generalized Grizzle model (GGM) (non-ignorable non-compliance) as the new model. A real data set based on the treatment of knee osteoarthritis is used to compare these models. The results based on the likelihood ratio statistics and a simulation study show the advantage of the proposed model (GGM) over the custom model (GM).

3.

Sparse Bayesian variable selection in multinomial probit regression model with application to high-dimensional data classification

Yang Aijun Xiang Liming Lin Jinguan 《Communications in Statistics - Theory and Methods》2017,46(12):6137-6150

Here we consider a multinomial probit regression model where the number of variables substantially exceeds the sample size and only a subset of the available variables is associated with the response. Thus selecting a small number of relevant variables for classification has received a great deal of attention. Generally, when the number of variables is substantial, sparsity-enforcing priors for the regression coefficients are called for on grounds of predictive generalization and computational ease. In this paper, we propose a sparse Bayesian variable selection method in the multinomial probit regression model for multi-class classification. The performance of our proposed method is demonstrated with one simulated data set and three well-known gene expression profiling data sets: breast cancer data, leukemia data, and small round blue-cell tumors. The results show that, compared with other methods, our method is able to select the relevant variables and can obtain competitive classification accuracy with a small subset of relevant genes.

4.

Efficient sampling schemes for Bayesian MARS models with many predictors

David J. Nott Anthony Y. C. Kuk Hiep Duc 《Statistics and Computing》2005,15(2):93-101

Multivariate adaptive regression spline fitting, or MARS (Friedman 1991), provides a useful methodology for flexible adaptive regression with many predictors. The MARS methodology produces an estimate of the mean response that is a linear combination of adaptively chosen basis functions. Recently, a Bayesian version of MARS has been proposed (Denison, Mallick and Smith 1998a; Holmes and Denison 2002), combining the MARS methodology with the benefits of Bayesian methods for accounting for model uncertainty to achieve improvements in predictive performance. In implementations of the Bayesian MARS approach, Markov chain Monte Carlo methods are used for computations, in which at each iteration of the algorithm it is proposed to change the current model by either (a) adding a basis function (birth step), (b) deleting a basis function (death step), or (c) altering an existing basis function (change step). In the algorithm of Denison, Mallick and Smith (1998a), when a birth step is proposed, the type of basis function is determined by simulation from the prior. This works well in problems with a small number of predictors, is simple to program, and leads to a simple form for Metropolis-Hastings acceptance probabilities. However, in problems with very large numbers of predictors where many of the predictors are useless, it may be difficult to find interesting interactions with such an approach. In the original MARS algorithm of Friedman (1991), a heuristic is used of building up higher order interactions from lower order ones, which greatly reduces the complexity of the search for good basis functions to add to the model. While we do not exactly follow the intuition of the original MARS algorithm in this paper, we nevertheless suggest a similar idea in which the Metropolis-Hastings proposals of Denison, Mallick and Smith (1998a) are altered to allow dependence on the current model. Our modification allows more rapid identification and exploration of important interactions, especially in problems with very large numbers of predictor variables and many useless predictors. Performance of the algorithms is compared in simulation studies.
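
To make "a linear combination of adaptively chosen basis functions" concrete, the sketch below evaluates a fixed MARS-style model whose basis functions are products of hinge functions. The knots, coefficients, and the function names `hinge` and `mars_predict` are hypothetical; the adaptive search (and the birth, death, and change moves of the Bayesian version) is not implemented here.

```python
import numpy as np

def hinge(x, knot, sign=1):
    """MARS-style hinge: max(0, x - knot) for sign=+1, max(0, knot - x) for sign=-1."""
    return np.maximum(0.0, sign * (x - knot))

def mars_predict(X, intercept, terms):
    """Evaluate intercept + sum of coef * product of hinges.

    terms is a list of (coef, [(var_index, knot, sign), ...]); a term with
    two or more factors is an interaction built from lower-order hinges.
    """
    y = np.full(X.shape[0], intercept, dtype=float)
    for coef, factors in terms:
        basis = np.ones(X.shape[0])
        for var, knot, sign in factors:
            basis *= hinge(X[:, var], knot, sign)
        y += coef * basis
    return y

# Hypothetical model: one main effect in x0 and one x0-by-x1 interaction.
X = np.array([[0.0, 1.0],
              [2.0, 3.0]])
terms = [(1.5, [(0, 1.0, 1)]),
         (-2.0, [(0, 1.0, 1), (1, 2.0, 1)])]
pred = mars_predict(X, 0.5, terms)
print(pred)
```

Note that the interaction term reuses the hinge on `x0` from the main effect, which mirrors the heuristic of building higher order interactions from lower order ones.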

5.

A nonlinear Phillips curve model with heterogeneous parameters and its application

Jiang Cuixia (蒋翠侠) Xu Qifa (许启发) 《Statistical Research (统计研究)》2014,31(5):95-101

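Building on quantile regression theory and nonlinear regression methods, this paper proposes a fairly general model, the nonlinear Phillips curve model with heterogeneous parameters, and gives methods for its estimation, testing, and conditional density forecasting. The model can capture typical features of the Phillips curve such as nonlinearity and asymmetry, and it can also reveal how the entire distribution of inflation changes under different economic conditions, so that inflation uncertainty can be assessed accurately and decisions can be better informed. Finally, the model is applied to the Phillips curve in China; the results show that it outperforms other Phillips curve models in goodness of fit, structural analysis, and forecasting ability.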
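
Quantile regression, on which this model is built, replaces squared error with the check (pinball) loss. The sketch below is a minimal illustration under simple assumptions: it fits only a constant, uses a brute-force grid search, and the data values are made up. It shows that minimising the loss at tau = 0.5 recovers the median rather than the mean.

```python
import numpy as np

def pinball_loss(y, pred, tau):
    """Average check loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    u = y - pred
    return np.mean(np.where(u >= 0, tau * u, (tau - 1) * u))

# Made-up sample with one extreme value.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Brute-force search over constant predictions (for illustration only).
grid = np.linspace(0.0, 100.0, 10001)
losses = [pinball_loss(y, c, 0.5) for c in grid]
best = grid[int(np.argmin(losses))]
print(best, np.mean(y))
```

Varying `tau` between 0 and 1 traces out different conditional quantiles, which is how a quantile-based model can describe a whole distribution rather than only its centre.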

6.

Comparisons of some bivariate regression models

《Journal of Statistical Computation and Simulation》2012,82(7):937-949

The bivariate negative binomial regression (BNBR) and the bivariate Poisson log-normal regression (BPLR) models have been used to describe count data that are over-dispersed. In this paper, a new bivariate generalized Poisson regression (BGPR) model is defined. An advantage of the new regression model over the BNBR and BPLR models is that the BGPR can be used to model bivariate count data with either over-dispersion or under-dispersion. In this paper, we carry out a simulation study to compare the three regression models when the true data-generating process exhibits over-dispersion. In the simulation experiment, we observe that the bivariate generalized Poisson regression model performs better than the bivariate negative binomial regression model and the BPLR model.

7.

A robust class of homoscedastic nonlinear regression models

Mohsen Maleki Zahra Barkhordar Zahra Khodadadi Darren Wraith 《Journal of Statistical Computation and Simulation》2019,89(14):2765-2781

In this paper, we examine a nonlinear regression (NLR) model with homoscedastic errors that follow a flexible class of two-piece distributions based on the scale mixtures of normal (TP-SMN) family. The objective of using this family is to develop a robust NLR model. The TP-SMN is a rich class of distributions that covers symmetric/asymmetric and light-/heavy-tailed distributions and is an alternative family to the well-known scale mixtures of skew-normal (SMSN) family studied by Branco and Dey [35]. A key feature of this study is the use of a new, suitable hierarchical representation of the family to obtain maximum-likelihood estimates of the model parameters via an EM-type algorithm. The performance of the proposed robust model is demonstrated using simulated and real datasets and is compared to other well-known NLR models.

8.

A fuzzy robust regression approach applied to bedload transport data

Jalal Chachi 《Communications in Statistics - Simulation and Computation》2017,46(3):1703-1714

Fuzzy least-squares regression can be very sensitive to unusual data (e.g., outliers). In this article, we describe how to fit an alternative robust-regression estimator in a fuzzy environment, which attempts to identify and ignore unusual data. The proposed approach concerns classical robust regression and estimation methods that are insensitive to outliers. In this regard, based on the least trimmed squares estimation method, an estimation procedure is proposed for determining the coefficients of the fuzzy regression model for crisp input-fuzzy output data. The investigated fuzzy regression model is applied to bedload transport data to forecast suspended load from discharge using real-world data. The accuracy of the proposed method is compared with the well-known fuzzy least-squares regression model. The comparison, which is based on a similarity measure between fuzzy sets, reveals that the fuzzy robust regression model performs better than the other models in suspended load estimation for the particular dataset. The proposed model is general and can be used for modeling natural phenomena whose available observations are reported as imprecise rather than crisp.
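
The least trimmed squares idea can be sketched for ordinary (crisp) residuals as follows; this is the classical criterion only, not the fuzzy-output estimator of the article, and the residual values and the function name `lts_objective` are illustrative assumptions.

```python
import numpy as np

def lts_objective(residuals, h):
    """Least trimmed squares criterion: sum of the h smallest squared residuals."""
    r2 = np.sort(np.asarray(residuals, dtype=float) ** 2)
    return r2[:h].sum()

# Made-up residuals from some candidate fit; the last one is an outlier.
res = np.array([0.1, -0.2, 0.05, 10.0])

full = np.sum(res ** 2)            # ordinary least-squares criterion
trimmed = lts_objective(res, h=3)  # ignores the largest squared residual
print(full, trimmed)
```

Because the largest squared residuals are dropped before summing, a single outlier cannot dominate the criterion the way it does under least squares.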

9.

New approaches to model-free dimension reduction for bivariate regression

Xuerong Meggie Wen R. Dennis Cook 《Journal of statistical planning and inference》2009

Dimension reduction with bivariate responses, especially a mix of continuous and categorical responses, can be of special interest. One immediate application is to regressions with censoring. In this paper, we propose two novel methods to reduce the dimension of the covariates of a bivariate regression via a model-free approach. Both methods enjoy a simple asymptotic chi-squared distribution for testing the dimension of the regression, and also allow us to test the contributions of the covariates easily without pre-specifying a parametric model. The new methods outperform the current one both in simulations and in the analysis of a real data set. The well-known PBC data are used to illustrate the application of our method to censored regression.

10.

Multivariate measurement error models based on Student-t distribution under censored responses

Larissa A. Matos Luis M. Castro Celso R. B. Cabral Víctor H. Lachos 《Statistics》2013,47(6):1395-1416

Measurement error models constitute a wide class of models that include linear and nonlinear regression models. They are very useful to model many real-life phenomena, particularly in the medical and biological areas. The great advantage of these models is that, in some sense, they can be represented as mixed effects models, allowing us to implement well-known techniques, like the EM-algorithm for the parameter estimation. In this paper, we consider a class of multivariate measurement error models where the observed response and/or covariate are not fully observed, i.e., the observations are subject to certain threshold values below or above which the measurements are not quantifiable. Consequently, these observations are considered censored. We assume a Student-t distribution for the unobserved true values of the mismeasured covariate and the error term of the model, providing a robust alternative for parameter estimation. Our approach relies on a likelihood-based inference using an EM-type algorithm. The proposed method is illustrated through some simulation studies and the analysis of an AIDS clinical trial dataset.

11.

Consistency of IV parameter estimates in the smooth transition spatial autoregressive model

Guo Guangyuan (郭光远) et al. 《Statistical Research (统计研究)》2018,35(4):117-128

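This paper studies a nonlinear form of the spatial autoregressive model: the smooth transition spatial autoregressive model. In this model the coefficient of the spatial term depends on the form of the transition function and varies with the transition variable, so the model can capture dependence among individuals and also describe how spatial dependence changes with certain factors. Within an instrumental variable (IV) framework, we mainly discuss properties of the spatial autoregressive model with a logistic smooth transition function, give a corresponding comparative analysis of the model with an exponential transition function, and lay out detailed steps for specification, testing, and estimation. Under fairly general assumptions, we prove the consistency of the parameter estimates and verify it by Monte Carlo simulation; the simulation results strongly support the consistency conclusion.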
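
The two transition functions named in this abstract are commonly written in the forms sketched below. The parameter names `gamma` (smoothness) and `c` (location), and the linear form `rho0 + rho1 * G(s)` for the transition-dependent spatial coefficient, are assumptions made for illustration; the paper's exact parameterisation may differ.

```python
import numpy as np

def logistic_transition(s, gamma, c):
    """Logistic transition: rises smoothly from 0 to 1 as s passes the location c."""
    return 1.0 / (1.0 + np.exp(-gamma * (s - c)))

def exponential_transition(s, gamma, c):
    """Exponential transition: 0 at s = c, approaching 1 as s moves away from c."""
    return 1.0 - np.exp(-gamma * (s - c) ** 2)

def spatial_coefficient(s, rho0, rho1, gamma, c, transition=logistic_transition):
    """Hypothetical transition-dependent spatial coefficient rho0 + rho1 * G(s)."""
    return rho0 + rho1 * transition(s, gamma, c)

# Made-up values of the transition variable and of the parameters.
s = np.array([-5.0, 0.0, 5.0])
rho = spatial_coefficient(s, rho0=0.2, rho1=0.4, gamma=2.0, c=0.0)
print(rho)
```

With the logistic form the spatial coefficient moves monotonically between two regimes, while the exponential form treats deviations of `s` on either side of `c` symmetrically.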

12.

Statistical inference for homologous gene pairs between two circular genomes: a new circular–circular regression model

Ashis SenGupta Sungsu Kim 《Statistical Methods and Applications》2016,25(3):421-432

In this paper, we investigate the problem of determining the relationship, represented by similarity of the homologous gene configuration, between paired circular genomes using a regression analysis. We propose a new regression model for studying two circular genomes, where the Möbius transformation naturally arises and is taken as the link function, and propose the least circular distance estimation method, as an appropriate method for analyzing circular variables. The main utility of the new regression model is in identification of a new angular location of one of a homologous gene pair between two circular genomes, for various types of possible gene mutations, given that of the other gene. Furthermore, we demonstrate the utility of our new regression model for grouping of various genomes based on closeness of their relationship. Using angular locations of homologous genes from the five pairs of circular genomes (Horimoto et al. in Bioinformatics 14:789–802, 1998), the new model is compared with the existing models.

13.

A class of finite mixture of quantile regressions with its applications

Yuzhu Tian Maozai Tian 《Journal of applied statistics》2016,43(7):1240-1252

Mixtures of linear regression models provide a popular treatment for modeling nonlinear regression relationships. The traditional estimation of mixtures of regression models is based on a Gaussian error assumption. It is well known that such an assumption is sensitive to outliers and extreme values. To overcome this issue, a new class of finite mixture of quantile regressions (FMQR) is proposed in this article. Compared with the existing Gaussian mixture regression models, the proposed FMQR model can provide a complete specification of the conditional distribution of the response variable for each component. From the likelihood point of view, the FMQR model is equivalent to the finite mixture of regression models based on errors following the asymmetric Laplace distribution (ALD), which can be regarded as an extension of the traditional mixture of regression models with normal error terms. An EM algorithm is proposed to obtain the parameter estimates of the FMQR model by using a hierarchical representation of the ALD. Finally, the iterated weighted least squares estimation for each mixture component of the FMQR model is derived. Simulation studies are conducted to illustrate the finite sample performance of the estimation procedure. Analysis of an aphid data set is used to illustrate our methodologies.

14.

Local quadratic estimation of the curvature in a functional single index model

Zi Ye Giles Hooker 《Scandinavian Journal of Statistics》2020,47(4):1307-1338

The nonlinear responses of species to environmental variability can play an important role in the maintenance of ecological diversity. Nonetheless, many models use parametric nonlinear terms which pre-determine the ecological conclusions. Motivated by this concern, we study the estimate of the second derivative (curvature) of the link function in a functional single index model. Since the coefficient function and the link function are both unknown, the estimate is expressed as a nested optimization. We first estimate the coefficient function by minimizing squared error where the link function is estimated with a Nadaraya-Watson estimator for each candidate coefficient function. The first and second derivatives of the link function are then estimated via local-quadratic regression using the estimated coefficient function. In this paper, we derive a convergence rate for the curvature of the nonlinear response. In addition, we prove that the argument of the linear predictor can be estimated root-n consistently. However, practical implementation of the method requires solving a nonlinear optimization problem, and our results show that the estimates of the link function and the coefficient function are quite sensitive to the choices of starting values.
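
The inner step of the nested optimization, estimating the link function with a Nadaraya-Watson estimator for a given index, can be sketched as below. This is a simplified illustration: the index is a plain scalar `x` rather than a functional projection, the kernel is assumed Gaussian, and the bandwidth and test function are made up.

```python
import numpy as np

def nadaraya_watson(x0, x, y, bandwidth):
    """Kernel-weighted average of y at the points x0, using a Gaussian kernel."""
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / bandwidth) ** 2)
    return (w @ y) / w.sum(axis=1)

# Made-up noiseless example: the "link" is a sine curve on [0, 1].
x = np.linspace(0.0, 1.0, 201)
y = np.sin(2 * np.pi * x)

est = nadaraya_watson([0.25, 0.5], x, y, bandwidth=0.02)
print(est)
```

In the nested scheme described above, this estimate would be recomputed for every candidate coefficient function, which is one reason the outer optimization can be sensitive to starting values.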

15.

The exact risks of some pre-test and Stein-type regression estimators under balanced loss

Judith A. Giles David E.A. Giles Kazuhiro Ohtani 《Communications in Statistics - Theory and Methods》2013,42(12):2901-2924

In regression analysis we are often interested in using an estimator which is “precise” and which simultaneously provides a model with “good fit”. In this paper we consider the risk properties of several estimators of the regression coefficient vector under “balanced” loss. This loss function (Zellner, 1994) reflects both of the described attributes. Under a particular form of balanced loss, we derive the predictive risk of the pre-test estimator which results after a test for exact linear restrictions on the coefficient vector. The corresponding risks of Stein-rule and positive-part Stein-rule estimators are also established. The risks based on loss functions which allow only for estimation precision, or only for goodness of fit, are special cases of our results, and we draw appropriate comparisons. In particular, we show that some of the well-known results under (quadratic) precision-only loss are not robust to our generalization of the loss function.
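
For orientation, balanced loss in the sense of Zellner (1994) is usually written as a weighted sum of a goodness-of-fit term and a precision term. The form below is the commonly cited one for the linear model y = Xβ + ε; the notation and the weight symbol w are assumptions here, and the "particular form" used in the paper may differ in detail.

```latex
L_w(\hat\beta;\beta)
  = w\,(y - X\hat\beta)^{\top}(y - X\hat\beta)
  + (1-w)\,(\hat\beta-\beta)^{\top} X^{\top}X\,(\hat\beta-\beta),
  \qquad 0 \le w \le 1 .
```

Setting w = 0 leaves only the quadratic precision term and w = 1 leaves only the goodness-of-fit term, which matches the two special cases mentioned in the abstract.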

16.

A stochastic approximation algorithm for maximum-likelihood estimation with incomplete data

Ming Gao Gu Shaolin Li 《Revue canadienne de statistique》1998,26(4):567-582

We propose a new stochastic approximation (SA) algorithm for maximum-likelihood estimation (MLE) in the incomplete-data setting. This algorithm is most useful for problems in which the EM algorithm is not possible due to an intractable E-step or M-step. Compared to other algorithms that have been proposed for intractable EM problems, such as the MCEM algorithm of Wei and Tanner (1990), our proposed algorithm appears more generally applicable and efficient. The approach we adopt is inspired by the Robbins-Monro (1951) stochastic approximation procedure, and we show that the proposed algorithm can be used to solve some of the long-standing problems in computing an MLE with incomplete data. We prove that in general O(n) simulation steps are required in computing the MLE with the SA algorithm and O(n log n) simulation steps are required in computing the MLE using the MCEM and/or the MCNR algorithm, where n is the sample size of the observations. Examples include computing the MLE in the nonlinear error-in-variable model and nonlinear regression model with random effects.
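
The Robbins-Monro procedure that inspires this algorithm can be sketched in its simplest textbook form: find the root of an expectation using only noisy evaluations and a decreasing step size. The toy problem below (estimating a mean of 3.0 from normal draws), the 1/k step sizes, and the function name `robbins_monro` are illustrative assumptions; this is not the authors' SA algorithm for incomplete-data likelihoods.

```python
import numpy as np

def robbins_monro(noisy_h, theta0, n_steps, rng):
    """Iterate theta <- theta + a_k * H(theta, noise) with step sizes a_k = 1/k."""
    theta = theta0
    for k in range(1, n_steps + 1):
        theta = theta + (1.0 / k) * noisy_h(theta, rng)
    return theta

# Toy problem: solve E[X - theta] = 0 for X ~ N(3, 1), i.e. estimate the mean.
rng = np.random.default_rng(0)
theta_hat = robbins_monro(lambda t, r: r.normal(3.0, 1.0) - t,
                          theta0=0.0, n_steps=5000, rng=rng)
print(theta_hat)
```

With these particular choices the iterate is exactly the running sample mean, which makes the convergence easy to check; in a likelihood setting the noisy function would instead be a simulated estimate of the score.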

17.

Two-parameter ridge estimator in the binary logistic regression

Yasin Asar Aşır Genç 《Communications in Statistics - Simulation and Computation》2017,46(9):7088-7099

Binary logistic regression is a commonly used statistical method when the outcome variable is dichotomous or binary. In some applications of the logit model the explanatory variables are correlated; this problem is called multicollinearity. It is known that the variance of the maximum likelihood estimator (MLE) is inflated in the presence of multicollinearity. Therefore, in this study, we define a new two-parameter ridge estimator for the logistic regression model to decrease the variance and overcome the multicollinearity problem. We compare the new estimator to other well-known estimators by studying their mean squared error (MSE) properties. Moreover, a Monte Carlo simulation is designed to evaluate the performance of the estimators. Finally, a real data application is presented to show the applicability of the new method. According to the results of the simulation and the real application, the new estimator outperforms the other estimators in all of the situations considered.

18.

Sparse discriminant analysis based on estimation of posterior probabilities

Akinori Hidaka Kenji Watanabe Takio Kurita 《Journal of applied statistics》2019,46(15):2761-2785

Fisher's linear discriminant analysis (FLDA) is known as a method to find a discriminative feature space for multi-class classification. As a theory of extending FLDA to an ultimate nonlinear form, optimal nonlinear discriminant analysis (ONDA) has been proposed. ONDA indicates that the best theoretical nonlinear map for maximizing Fisher's discriminant criterion is formulated by using the Bayesian a posteriori probabilities. In addition, the theory proves that FLDA is equivalent to ONDA when the Bayesian a posteriori probabilities are approximated by linear regression (LR). Due to some limitations of the linear model, there is room to modify FLDA by using stronger approximation/estimation methods. For the purpose of probability estimation, multinomial logistic regression (MLR) is more suitable than LR. Along this line, in this paper, we develop a nonlinear discriminant analysis (NDA) in which the posterior probabilities in ONDA are estimated by MLR. In addition, we develop a way to introduce sparseness into discriminant analysis. By applying L1 or L2 regularization to LR or MLR, we can incorporate sparseness in FLDA and our NDA to increase generalization performance. The performance of these methods is evaluated by benchmark experiments using last_exam17 standard datasets and a face classification experiment.

19.

A Cox-type regression model with change-points in the covariates

Jensen U Lütkebohmert C 《Lifetime data analysis》2008,14(3):267-285

We consider a Cox-type regression model with change-points in the covariates. A change-point specifies the unknown threshold at which the influence of a covariate shifts smoothly, i.e., the regression parameter may change over the range of a covariate and the underlying regression function is continuous but not differentiable. The model can be used to describe change-points in different covariates but also to model more than one change-point in a single covariate. Estimates of the change-points and of the regression parameters are derived and their properties are investigated. It is shown that not only the estimates of the regression parameters are [Formula: see text] -consistent but also the estimates of the change-points in contrast to the conjecture of other authors. Asymptotic normality is shown by using results developed for M-estimators. At the end of this paper we apply our model to an actuarial dataset, the PBC dataset of Fleming and Harrington (Counting processes and survival analysis, 1991) and to a dataset of electric motors.

20.

Parameter estimation of complex mixed models based on meta-model approach

Pierre Barbillon Célia Barthélémy Adeline Samson 《Statistics and Computing》2017,27(4):1111-1128

Complex biological processes are usually observed over time across a collection of individuals, so that longitudinal data are available. The statistical challenge is to better understand the underlying biological mechanisms. A standard statistical approach is the mixed-effects model, in which the regression function is highly developed so as to describe the biological processes precisely (solutions of multi-dimensional ordinary differential equations or of a partial differential equation). A classical estimation method relies on coupling a stochastic version of the EM algorithm with a Markov chain Monte Carlo algorithm. This algorithm requires many evaluations of the regression function, which is clearly prohibitive when the solution is numerically approximated with a time-consuming solver. In this paper a meta-model relying on a Gaussian process emulator is proposed to approximate the regression function, which leads to what is called a mixed meta-model. The uncertainty of the meta-model approximation can be incorporated in the model. A control on the distance between the maximum likelihood estimates of the mixed meta-model and the maximum likelihood estimates of the exact mixed model is guaranteed. Finally, numerical simulations are performed to illustrate the efficiency of this approach.