期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Analysis of rounded data in mixture normal model

Ningning Zhao Zhidong Bai 《Statistical Papers》2012,53(4):895-914

Rounding errors have a considerable impact on statistical inferences, especially when the data size is large and the finite normal mixture model is very important in many applied statistical problems, such as bioinformatics. In this article, we investigate the statistical impacts of rounding errors to the finite normal mixture model with a known number of components, and develop a new estimation method to obtain consistent and asymptotically normal estimates for the unknown parameters based on rounded data drawn from this kind of models. 相似文献

2.

Efficient Estimation of the Partly Linear Additive Hazards Model with Current Status Data

下载免费PDF全文

Xuewen Lu Peter X.‐K. Song 《Scandinavian Journal of Statistics》2015,42(1):306-328

This paper focuses on efficient estimation, optimal rates of convergence and effective algorithms in the partly linear additive hazards regression model with current status data. We use polynomial splines to estimate both cumulative baseline hazard function with monotonicity constraint and nonparametric regression functions with no such constraint. We propose a simultaneous sieve maximum likelihood estimation for regression parameters and nuisance parameters and show that the resultant estimator of regression parameter vector is asymptotically normal and achieves the semiparametric information bound. In addition, we show that rates of convergence for the estimators of nonparametric functions are optimal. We implement the proposed estimation through a backfitting algorithm on generalized linear models. We conduct simulation studies to examine the finite‐sample performance of the proposed estimation method and present an analysis of renal function recovery data for illustration. 相似文献

3.

Semiparametric multinomial logit models for analysing consumer choice behaviour 总被引：1，自引：0，他引：1

Thomas Kneib Bernhard Baumgartner Winfried J. Steiner 《AStA Advances in Statistical Analysis》2007,91(3):225-244

The multinomial logit model (MNL) is one of the most frequently used statistical models in marketing applications. It allows one to relate an unordered categorical response variable, for example representing the choice of a brand, to a vector of covariates such as the price of the brand or variables characterising the consumer. In its classical form, all covariates enter in strictly parametric, linear form into the utility function of the MNL model. In this paper, we introduce semiparametric extensions, where smooth effects of continuous covariates are modelled by penalised splines. A mixed model representation of these penalised splines is employed to obtain estimates of the corresponding smoothing parameters, leading to a fully automated estimation procedure. To validate semiparametric models against parametric models, we utilise different scoring rules as well as predicted market share and compare parametric and semiparametric approaches for a number of brand choice data sets. 相似文献

4.

Simulations on the Jelinski-Moranda model of software reliability; application of some parametric bootstrap methods

Mark Van Pul 《Statistics and Computing》1992,2(3):121-136

In software reliability theory many different models have been proposed and investigated. some of these models intuitively match reality better than others. The properties of certain statistical estimation procedures in connection with these models are also model-dependent. In this paper we investigate how well the maximum likelihood estimation procedure and the parametric bootstrap behave in the case of the very well-known software reliability model suggested by Jelinski and Moranda (1972). For this study we will make use of simulated data. 相似文献

5.

Diagnostics in multivariate generalized Birnbaum-Saunders regression models

Carolina Marchant Francisco José A. Cysneiros Juan F. Vivanco 《Journal of applied statistics》2016,43(15):2829-2849

Birnbaum–Saunders (BS) models are receiving considerable attention in the literature. Multivariate regression models are a useful tool of the multivariate analysis, which takes into account the correlation between variables. Diagnostic analysis is an important aspect to be considered in the statistical modeling. In this paper, we formulate multivariate generalized BS regression models and carry out a diagnostic analysis for these models. We consider the Mahalanobis distance as a global influence measure to detect multivariate outliers and use it for evaluating the adequacy of the distributional assumption. We also consider the local influence approach and study how a perturbation may impact on the estimation of model parameters. We implement the obtained results in the R software, which are illustrated with real-world multivariate data to show their potential applications. 相似文献

6.

Robust variable selection in finite mixture of regression models using the t distribution

Lin Dai Junhui Yin Zhengfen Xie 《统计学通讯:理论与方法》2013,42(21):5370-5386

Abstract

Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for regression error. Such assumptions are unsuitable for a set of data containing a group or groups of observations with heavy tails and outliers. In this paper, we introduce a robust variable selection procedure for FMR models using the t distribution. With appropriate selection of the tuning parameters, the consistency and the oracle property of the regularized estimators are established. To estimate the parameters of the model, we develop an EM algorithm for numerical computations and a method for selecting tuning parameters adaptively. The parameter estimation performance of the proposed model is evaluated through simulation studies. The application of the proposed model is illustrated by analyzing a real data set. 相似文献

7.

Parameter redundancy in capture–recapture–recovery models

《Statistical Methodology》2014

In principle it is possible to use recently derived procedures to determine whether or not all the parameters of particular complex ecological models can be estimated using classical methods of statistical inference. If it is not possible to estimate all the parameters a model is parameter redundant. Furthermore, one can investigate whether derived results hold for such models for all lengths of study, and also how the results might change for specific data sets. In this paper we show how to apply these approaches to entire families of capture–recapture and capture–recapture–recovery models. This results in comprehensive tables, providing the definitive parameter redundancy status for such models. Parameter redundancy can also be caused by the data rather than the model, and how to investigate this is demonstrated through two applications, one to recapture data on dippers, and one to recapture–recovery data on great cormorants. 相似文献

8.

Variable selection in finite mixture of regression models using the skew-normal distribution

Junhui Yin Liucang Wu Lin Dai 《Journal of applied statistics》2020,47(16):2941

Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for regression error. Such assumptions are unsuitable for a set of data containing a group or groups of observations with asymmetric behavior. In this paper, we introduce a variable selection procedure for FMR models using the skew-normal distribution. With appropriate choice of the tuning parameters, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. To estimate the parameters of the model, a modified EM algorithm for numerical computations is developed. The methodology is illustrated through numerical experiments and a real data example. 相似文献

9.

Application of Quasi-Least Squares to Analyse Replicated Autoregressive Time Series Regression Models

Genming Shi N. Rao Chaganty 《Journal of applied statistics》2004,31(10):1147-1156

Time series regression models have been widely studied in the literature by several authors. However, statistical analysis of replicated time series regression models has received little attention. In this paper, we study the application of the quasi-least squares method to estimate the parameters in a replicated time series model with errors that follow an autoregressive process of order p. We also discuss two other established methods for estimating the parameters: maximum likelihood assuming normality and the Yule-Walker method. When the number of repeated measurements is bounded and the number of replications n goes to infinity, the regression and the autocorrelation parameters are consistent and asymptotically normal for all three methods of estimation. Basically, the three methods estimate the regression parameter efficiently and differ in how they estimate the autocorrelation. When p=2, for normal data we use simulations to show that the quasi-least squares estimate of the autocorrelation is undoubtedly better than the Yule-Walker estimate. And the former estimate is as good as the maximum likelihood estimate almost over the entire parameter space. 相似文献

10.

Conditional inference in linear versus nonlinear models for binary time series

《Journal of Statistical Computation and Simulation》2012,82(7):881-897

The modelling of discrete such as binary time series, unlike the continuous time series, is not easy. This is due to the fact that there is no unique way to model the correlation structure of the repeated binary data. Some models may also provide a complicated correlation structure with narrow ranges for the correlations. In this paper, we consider a nonlinear dynamic binary time series model that provides a correlation structure which is easy to interpret and the correlations under this model satisfy the full?1 to 1 range. For the estimation of the parameters of this nonlinear model, we use a conditional generalized quasilikelihood (CGQL) approach which provides the same estimates as those of the well-known maximum likelihood approach. Furthermore, we consider a competitive linear dynamic binary time series model and examine the performance of the CGQL approach through a simulation study in estimating the parameters of this linear model. The model mis-specification effects on estimation as well as forecasting are also examined through simulations. 相似文献

11.

Regression for doubly inflated multivariate Poisson distributions

Ishapathik Das Sumen Sen Pooja Sengupta 《Journal of Statistical Computation and Simulation》2019,89(13):2549-2561

Dependent multivariate count data occur in several research studies. These data can be modelled by a multivariate Poisson or Negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, the number of observations in some cells are much larger than other cells, then the copula-based multivariate Poisson (or Negative binomial) distribution may not fit well and it is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher compared to the other cells and develop a doubly inflated multivariate Poisson distribution function using multivariate Gaussian copula. We also discuss procedures for regression on covariates for the doubly inflated multivariate count data. For illustrating the proposed methodologies, we present real data containing bivariate count observations with inflations in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation to estimate unknown parameters of the models. 相似文献

12.

Non-parametric Estimation for NHPP Software Reliability Models

Zhiguo Wang Jinde Wang Xue Liang 《Journal of applied statistics》2007,34(1):107-119

The non-homogeneous Poisson process (NHPP) model is a very important class of software reliability models and is widely used in software reliability engineering. NHPPs are characterized by their intensity functions. In the literature it is usually assumed that the functional forms of the intensity functions are known and only some parameters in intensity functions are unknown. The parametric statistical methods can then be applied to estimate or to test the unknown reliability models. However, in realistic situations it is often the case that the functional form of the failure intensity is not very well known or is completely unknown. In this case we have to use functional (non-parametric) estimation methods. The non-parametric techniques do not require any preliminary assumption on the software models and then can reduce the parameter modeling bias. The existing non-parametric methods in the statistical methods are usually not applicable to software reliability data. In this paper we construct some non-parametric methods to estimate the failure intensity function of the NHPP model, taking the particularities of the software failure data into consideration. 相似文献

13.

Transformation of the bathtub failure rate data in reliability for using Weibull-model analysis

Govind S. Mudholkar Kobby O. Asubonteng Alan D. Hutson 《Statistical Methodology》2009,6(6):622-633

All statistical methods involve basic model assumptions, which if violated render results of the analysis dubious. A solution to such a contingency is to seek an appropriate model or to modify the customary model by introducing additional parameters. Both of these approaches are in general cumbersome and demand uncommon expertise. An alternative is to transform the data to achieve compatibility with a well understood and convenient customary model with readily available software. The well-known example is the Box–Cox data transformation developed in order to make the normal theory linear model usable even when the assumptions of normality and homoscedasticity are not met.In reliability analysis the model appropriateness is determined by the nature of the hazard function. The well-known Weibull distribution is the most commonly employed model for this purpose. However, this model, which allows only a small spectrum of monotone hazard rates, is especially inappropriate if the data indicate bathtub-shaped hazard rates.In this paper, a new model based on the use of data transformation is presented for modeling bathtub-shaped hazard rates. Parameter estimation methods are studied for this new (transformation) approach. Examples and results of comparisons between the new model and other bathtub-shaped models are shown to illustrate the applicability of this new model. 相似文献

14.

A SAEM algorithm for the estimation of template and deformation parameters in medical image sequences

Frédéric J. P. Richard Adeline M. M. Samson Charles A. Cuénod 《Statistics and Computing》2009,19(4):465-478

This paper is about object deformations observed throughout a sequence of images. We present a statistical framework in which the observed images are defined as noisy realizations of a randomly deformed template image. In this framework, we focus on the problem of the estimation of parameters related to the template and deformations. Our main motivation is the construction of estimation framework and algorithm which can be applied to short sequences of complex and highly-dimensional images. The originality of our approach lies in the representations of the template and deformations, which are defined on a common triangulated domain, adapted to the geometry of the observed images. In this way, we have joint representations of the template and deformations which are compact and parsimonious. Using such representations, we are able to drastically reduce the number of parameters in the model. Besides, we adapt to our framework the Stochastic Approximation EM algorithm combined with a Markov Chain Monte Carlo procedure which was proposed in 2004 by Kuhn and Lavielle. Our implementation of this algorithm takes advantage of some properties which are specific to our framework. More precisely, we use the Markovian properties of deformations to build an efficient simulation strategy based on a Metropolis-Hasting-Within-Gibbs sampler. Finally, we present some experiments on sequences of medical images and synthetic data. 相似文献

15.

M-Estimation for partially functional linear regression model based on splines

Jianjun Zhou Zhimeng Sun 《统计学通讯:理论与方法》2013,42(21):6436-6446

ABSTRACT

M-estimation is a widely used technique for robust statistical inference. In this paper, we study robust partially functional linear regression model in which a scale response variable is explained by a function-valued variable and a finite number of real-valued variables. For the estimation of the regression parameters, which include the infinite dimensional function as well as the slope parameters for the real-valued variables, we use polynomial splines to approximate the slop parameter. The estimation procedure is easy to implement, and it is resistant to heavy-tailederrors or outliers in the response. The asymptotic properties of the proposed estimators are established. Finally, we assess the finite sample performance of the proposed method by Monte Carlo simulation studies. 相似文献

16.

Estimation of Non‐Crossing Quantile Regression Curves

下载免费PDF全文

Yuzhi Cai Tao Jiang 《Australian & New Zealand Journal of Statistics》2015,57(1):139-162

Quantile regression methods have been widely used in many research areas in recent years. However conventional estimation methods for quantile regression models do not guarantee that the estimated quantile curves will be non‐crossing. While there are various methods in the literature to deal with this problem, many of these methods force the model parameters to lie within a subset of the parameter space in order for the required monotonicity to be satisfied. Note that different methods may use different subspaces of the space of model parameters. This paper establishes a relationship between the monotonicity of the estimated conditional quantiles and the comonotonicity of the model parameters. We develope a novel quasi‐Bayesian method for parameter estimation which can be used to deal with both time series and independent statistical data. Simulation studies and an application to real financial returns show that the proposed method has the potential to be very useful in practice. 相似文献

17.

A Bayesian approach for parameter estimation in multi-stage models

Hoa Pham Darfiana Nur Huong T. T. Pham Alan Branford 《统计学通讯:理论与方法》2019,48(10):2459-2482

Multi-stage time evolving models are common statistical models for biological systems, especially insect populations. In stage-duration distribution models, parameter estimation for the models use the Laplace transform method. This method involves assumptions such as known constant shapes, known constant rates or the same overall hazard rate for all stages. These assumptions are strong and restrictive. The main aim of this paper is to weaken these assumptions by using a Bayesian approach. In particular, a Metropolis-Hastings algorithm based on deterministic transformations is used to estimate parameters. We will use two models, one which has no hazard rates, and the other has stage-wise constant hazard rates. These methods are validated in simulation studies followed by a case study of cattle parasites. The results show that the proposed methods are able to estimate the parameters comparably well, as opposed to using the Laplace transform methods. 相似文献

18.

Mixtures of general location model with factor analyzer covariance structure for clustering mixed type data

Leila Amiri Mojtaba Ganjali 《Journal of applied statistics》2019,46(11):2075-2100

Cluster analysis is one of the most widely used method in statistical analyses, in which homogeneous subgroups are identified in a heterogeneous population. Due to the existence of the continuous and discrete mixed data in many applications, so far, some ordinary clustering methods such as, hierarchical methods, k-means and model-based methods have been extended for analysis of mixed data. However, in the available model-based clustering methods, by increasing the number of continuous variables, the number of parameters increases and identifying as well as fitting an appropriate model may be difficult. In this paper, to reduce the number of the parameters, for the model-based clustering mixed data of continuous (normal) and nominal data, a set of parsimonious models is introduced. Models in this set are extended, using the general location model approach, for modeling distribution of mixed variables and applying factor analyzer structure for covariance matrices. The ECM algorithm is used for estimating the parameters of these models. In order to show the performance of the proposed models for clustering, results from some simulation studies and analyzing two real data sets are presented. 相似文献

19.

一类近似因子模型的GMM估计及其统计性质研究

白仲林白强《统计研究》2016,33(3):18-23

对于一类异质性误差项存在截面相关性的近似因子模型,本文首先提出了估计共同因子向量和因子载荷矩阵的广义矩估计方法（GMM）,该方法推广了Doz等（2012）的极大似然估计方法;其次,分别研究了模型参数广义矩估计的渐近性质和有限样本的统计性质,在适当的条件下,证明了参数的GMM估计是具有渐近正态分布的一致估计;最后,利用近似因子模型对我国各类上市公司增长性的共同驱动因素及其差异性进行了实证分析。相似文献

20.

Modified likelihood ratio tests for unit gamma regressions

Ana C. Guedes Francisco Cribari-Neto Patrícia L. Espinheira 《Journal of applied statistics》2020,47(9):1562

Regression analyses are commonly performed with doubly limited continuous dependent variables; for instance, when modeling the behavior of rates, proportions and income concentration indices. Several models are available in the literature for use with such variables, one of them being the unit gamma regression model. In all such models, parameter estimation is typically performed using the maximum likelihood method and testing inferences on the model''s parameters are usually based on the likelihood ratio test. Such a test can, however, deliver quite imprecise inferences when the sample size is small. In this paper, we propose two modified likelihood ratio test statistics for use with the unit gamma regressions that deliver much more accurate inferences when the number of data points in small. Numerical (i.e. simulation) evidence is presented for both fixed dispersion and varying dispersion models, and also for tests that involve nonnested models. We also present and discuss two empirical applications. 相似文献