Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Quantile regression (QR) is a natural alternative for depicting the impact of covariates on the conditional distribution of an outcome variable rather than on its mean. In this paper, we investigate Bayesian regularized QR for linear models with autoregressive errors. LASSO-type penalization priors are imposed on the regression coefficients and autoregressive parameters of the model. A Gibbs sampler is employed to draw from the full posterior distributions of the unknown parameters. Finally, the proposed procedures are illustrated in simulation studies and applied to a real data analysis of electricity consumption.
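The conditional quantile that this abstract's Bayesian sampler targets can be illustrated with a minimal classical (non-Bayesian, unpenalized) sketch based on the check loss; `check_loss` and `fit_quantile` are illustrative names, not the paper's method.

```python
import numpy as np
from scipy.optimize import minimize

def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) that defines the tau-th quantile."""
    return np.where(u >= 0, tau * u, (tau - 1.0) * u)

def fit_quantile(X, y, tau):
    """Minimize the summed check loss, warm-started at the OLS fit."""
    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]
    obj = lambda b: check_loss(y - X @ b, tau).sum()
    return minimize(obj, beta0, method="Nelder-Mead").x

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(0, 1, n)
beta_med = fit_quantile(X, y, tau=0.5)   # conditional-median coefficients
```

With symmetric errors the tau = 0.5 fit recovers roughly the same coefficients as least squares; other tau values trace out the rest of the conditional distribution.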

2.
Models incorporating “latent” variables are commonplace in the financial, social, and behavioral sciences. The factor model, the most popular latent-variable model, explains a set of continuous observed variables through linear relationships with a smaller set of latent variables (factors). However, complex data often simultaneously display asymmetric dependence, asymptotic dependence, and positive (negative) dependence between random variables, features that linearity, Gaussian distributions, and many other extant distributions cannot capture. This article proposes a nonlinear factor model that accommodates these dependence features while retaining a simple factor structure. The random variables, marginally distributed as unit Fréchet, are decomposed into max-linear functions of underlying Fréchet idiosyncratic risks, transformed from a Gaussian copula, and independent shared external Fréchet risks. By allowing the random variables to share underlying (latent) pervasive risks with random impact parameters, various dependence structures are created, yielding a promising new technique for generating families of distributions with simple interpretations. We examine the multivariate extreme value properties of the proposed model and investigate maximum composite likelihood methods for estimating the impact parameters of the latent risks. The estimates are shown to be consistent. The estimation schemes are illustrated on several sets of simulated data, with comparisons of performance. We employ a bootstrap method to obtain standard errors in the real data analysis. Application to financial data reveals inherent dependencies that previous work has not disclosed and demonstrates the model’s interpretability. Supplementary materials for this article are available online.

3.
In this paper, we discuss the class of generalized Birnbaum–Saunders distributions, a very flexible family suitable for modeling lifetime data, as it allows for different degrees of kurtosis and asymmetry, as well as both unimodality and bimodality. We describe the theoretical developments on this model, including properties, transformations and related distributions, lifetime analysis, and shape analysis. We also discuss methods of inference based on uncensored and censored data, diagnostic methods, goodness-of-fit tests, and random number generation algorithms for the generalized Birnbaum–Saunders model. Finally, we present some illustrative examples and show that this distribution fits the data better than the classical Birnbaum–Saunders model.

4.
A recently proposed model for describing the distribution of income over a population, based on the Burr distribution, has been shown to fit better than the commonly used lognormal or gamma distributions. The current article extends that analysis by deriving the large-sample properties of the maximum likelihood estimates for this three-parameter model. Consequently, resulting confidence intervals for some measures of income inequality (including the Gini index) are used to further test the model's validity, as well as to examine apparent trends in inequality over time. Since these properties depend on the way the income data are grouped and censored, implications for choosing data-report intervals can be analyzed. Specifically, a choice between two common methods of reporting the data is shown to have an important impact on Gini index estimates.
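The Gini index that these confidence intervals target can be made concrete with a minimal empirical sketch, using the sorted-values formula rather than the fitted-Burr expression from the article.

```python
import numpy as np

def gini(incomes):
    """Sample Gini index via the sorted-values (mean-difference) formula."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1.0) / n

g_equal = gini(np.ones(100))             # perfect equality -> 0.0
g_conc = gini(np.r_[np.zeros(99), 1.0])  # extreme concentration -> 0.99
```

Perfect equality gives 0, and concentrating all income in one member of the population pushes the index toward 1, which is the scale on which apparent trends in inequality are read off.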

5.
In this work, we develop a modeling and estimation approach for the analysis of cross-sectional clustered data with multimodal conditional distributions, where the main interest is in the analysis of subpopulations. We propose to model such data hierarchically, with the conditional distributions viewed as finite mixtures of normal components. With a large number of observations in the lowest-level clusters, a two-stage estimation approach is used. In the first stage, the normal mixture parameters in each lowest-level cluster are estimated using robust methods; robust alternatives to maximum likelihood estimation provide stable results even when the mixture components do not quite meet normality assumptions. The lowest-level cluster-specific means and standard deviations are then modeled in a mixed effects model in the second stage. A small simulation study compares the performance of finite normal mixture parameter estimates based on robust and maximum likelihood estimation in stage 1. The proposed modeling approach is illustrated through the analysis of mouse tendon fibril diameter data. The results address genotype differences between corresponding components in the mixtures and demonstrate the advantages of robust estimation in stage 1.

6.
In this paper, the indicator approach in spatial data analysis is presented for determining probability distributions that characterize the uncertainty about any unknown value. Such an analysis is non-parametric and is carried out independently of the estimate retained. These distributions are given through a series of quantile estimates and are not tied to any particular prior model or shape. Moreover, their determination accounts for both the data configuration and the data values. An application is discussed, and some properties related to the Gaussian model are presented.
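The core of the indicator idea can be sketched in a minimal, non-spatial form: for each cutoff z_k, the indicator data 1{z <= z_k} are averaged to estimate the distribution quantile by quantile (indicator kriging would replace the plain mean below with spatially derived weights, so this is only an illustration).

```python
import numpy as np

def indicator_cdf(values, cutoffs):
    """Estimated P(Z <= z_k) for each cutoff z_k, via averaged indicators."""
    v = np.asarray(values, dtype=float)
    return np.array([np.mean(v <= z) for z in cutoffs])

data = np.array([1.0, 2.0, 2.5, 3.0, 7.0, 9.0])
cutoffs = np.array([2.0, 5.0, 10.0])
probs = indicator_cdf(data, cutoffs)   # -> [1/3, 2/3, 1.0]
```

Because the estimated distribution is assembled cutoff by cutoff, no prior distributional shape is imposed, which is the non-parametric property the abstract emphasizes.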

7.
In this paper, we consider the constructive representation of skewed distributions proposed by Ferreira and Steel (J Am Stat Assoc 101:823–829, 2006) and present its basic properties. We study five versions of skew-normal distributions in this general setting. An appropriate empirical model for a skewed distribution is introduced. In the data analysis, we compare this empirical model with the other four versions of skew-normal distributions via some reasonable criteria. It is shown that the proposed empirical model provides a better fit for density estimation.

8.
Categorical data frequently arise in applications in the Social Sciences. In such applications, the class of log-linear models, based on either a Poisson or (product) multinomial response distribution, is a flexible model class for inference and prediction. In this paper we consider the Bayesian analysis of both Poisson and multinomial log-linear models. It is often convenient to model multinomial or product multinomial data as observations of independent Poisson variables. For multinomial data, Lindley (1964) [20] showed that this approach leads to valid Bayesian posterior inferences when the prior density for the Poisson cell means factorises in a particular way. We develop this result to provide a general framework for the analysis of multinomial or product multinomial data using a Poisson log-linear model. Valid finite population inferences are also available, which can be particularly important in modelling social data. We then focus particular attention on multivariate normal prior distributions for the log-linear model parameters. Here, an improper prior distribution for certain Poisson model parameters is required for valid multinomial analysis, and we derive conditions under which the resulting posterior distribution is proper. We also consider the construction of prior distributions across models, and for model parameters, when uncertainty exists about the appropriate form of the model. We present classes of Poisson and multinomial models, invariant under certain natural groups of permutations of the cells. We demonstrate that, if prior belief concerning the model parameters is also invariant, as is the case in a ‘reference’ analysis, then the choice of prior distribution is considerably restricted. The analysis of multivariate categorical data in the form of a contingency table is considered in detail. We illustrate the methods with two examples.
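The Poisson/multinomial connection behind Lindley's result can be sketched numerically for a two-way table: under the independence log-linear model, the Poisson MLE of the cell means is the familiar row total times column total over n, the same fitted values a multinomial analysis yields. The counts below are made up for illustration.

```python
import numpy as np

# Hypothetical 2x2 contingency table of counts.
counts = np.array([[20.0, 10.0],
                   [15.0, 55.0]])
n = counts.sum()

# Independence-model fitted cell means: row_total * column_total / n.
fitted = np.outer(counts.sum(axis=1), counts.sum(axis=0)) / n

# Pearson chi-squared statistic against independence.
chi2 = ((counts - fitted) ** 2 / fitted).sum()
```

The fitted means reproduce the observed margins exactly, which is why the Poisson and (product) multinomial analyses agree on this model.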

9.
Many time series encountered in practice are nonstationary, and instead are often generated from a process with a unit root. Because of the process of data collection or the practice of researchers, time series used in analysis and modeling are frequently obtained through temporal aggregation. As a result, the series used in testing for a unit root are often time series aggregates. In this paper, we study the effects of the use of aggregate time series on the Dickey–Fuller test for a unit root. We start by deriving a proper model for the aggregate series. Based on this model, we find the limiting distributions of the test statistics and illustrate how the tests are affected by the use of aggregate time series. The results show that those distributions shift to the right and that this effect increases with the order of aggregation, causing a strong impact both on the empirical significance level and on the power of the test. To correct this problem, we present tables of critical points appropriate for the tests based on aggregate time series and demonstrate their adequacy. Examples illustrate the conclusions of our analysis.
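The mechanics involved can be sketched with made-up data: aggregate a unit-root series over non-overlapping blocks and compute the no-constant Dickey–Fuller t-statistic by hand. The paper derives the limiting distributions and corrected critical values formally; this only illustrates the test regression Δy_t = (ρ − 1) y_{t−1} + ε_t and what "order of aggregation" means.

```python
import numpy as np

def df_tstat(y):
    """t-statistic on y_{t-1} in the no-constant Dickey-Fuller regression."""
    dy, ylag = np.diff(y), y[:-1]
    gamma = (ylag @ dy) / (ylag @ ylag)        # OLS estimate of rho - 1
    resid = dy - gamma * ylag
    s2 = resid @ resid / (len(dy) - 1)
    return gamma / np.sqrt(s2 / (ylag @ ylag))

def aggregate(y, m):
    """Non-overlapping sums of m consecutive observations (order-m aggregation)."""
    T = (len(y) // m) * m
    return y[:T].reshape(-1, m).sum(axis=1)

rng = np.random.default_rng(1)
walk = np.cumsum(rng.normal(size=3000))        # random walk: true unit root
t_raw = df_tstat(walk)                         # statistic on the original series
t_agg = df_tstat(aggregate(walk, 3))           # statistic on the order-3 aggregate
```

Comparing `t_agg` against the standard Dickey–Fuller critical points, rather than aggregation-adjusted ones, is exactly the practice the article shows to distort size and power.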

10.
While excess zeros are often thought to cause data over-dispersion (i.e. when the variance exceeds the mean), this implication is not absolute. One should instead consider a flexible class of distributions that can address data dispersion along with excess zeros. This work develops a zero-inflated sum-of-Conway-Maxwell-Poissons (ZISCMP) regression as a flexible analysis tool to model count data that express significant data dispersion and contain excess zeros. This class of models contains several special case zero-inflated regressions, including zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-inflated binomial (ZIB), and the zero-inflated Conway-Maxwell-Poisson (ZICMP). Through simulated and real data examples, we demonstrate class flexibility and usefulness. We further utilize it to analyze shark species data from Australia's Great Barrier Reef to assess the environmental impact of human action on the number of various species of sharks.
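The simplest special case in this class, the zero-inflated Poisson (ZIP), can be sketched directly: with probability pi an observation is a structural zero, otherwise it is Poisson(lam). Parameter names here are illustrative; the ZISCMP model generalizes the count component.

```python
import numpy as np
from scipy.stats import poisson

def zip_pmf(k, pi, lam):
    """P(Y = k) under a ZIP(pi, lam) model."""
    base = (1.0 - pi) * poisson.pmf(k, lam)
    return np.where(k == 0, pi + base, base)

k = np.arange(0, 20)
p = zip_pmf(k, pi=0.3, lam=2.0)
mean = np.sum(k * p)                  # (1 - pi) * lam = 1.4
var = np.sum(k**2 * p) - mean**2      # exceeds the mean: over-dispersed
```

Here the excess zeros do inflate the variance above the mean; the abstract's point is that this need not happen in general, which motivates a count component flexible enough to capture under-dispersion as well.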

11.
Internet traffic data is characterized by some unusual statistical properties, in particular, the presence of heavy-tailed variables. A typical model for heavy-tailed distributions is the Pareto distribution although this is not adequate in many cases. In this article, we consider a mixture of two-parameter Pareto distributions as a model for heavy-tailed data and use a Bayesian approach based on the birth-death Markov chain Monte Carlo algorithm to fit this model. We estimate some measures of interest related to the queueing system k-Par/M/1 where k-Par denotes a mixture of k Pareto distributions. Heavy-tailed variables are difficult to model in such queueing systems because of the lack of a simple expression for the Laplace Transform (LT). We use a procedure based on recent LT approximating results for the Pareto/M/1 system. We illustrate our approach with both simulated and real data.
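Sampling from such a k = 2 Pareto mixture is straightforward and conveys the heavy-tail behavior: pick a component by its weight, then invert the Pareto cdf F(x) = 1 − (s/x)^a for x ≥ s. The parameter values below are made up for illustration.

```python
import numpy as np

def sample_pareto_mixture(n, weights, shapes, scales, rng):
    """Draw n values from a mixture of Pareto(shape a_j, scale s_j) components."""
    comp = rng.choice(len(weights), size=n, p=weights)
    u = rng.uniform(size=n)
    a = np.asarray(shapes)[comp]
    s = np.asarray(scales)[comp]
    return s * u ** (-1.0 / a)          # inverse-cdf sampling

rng = np.random.default_rng(7)
x = sample_pareto_mixture(10_000, weights=[0.7, 0.3],
                          shapes=[3.0, 1.2], scales=[1.0, 1.0], rng=rng)
# The shape-1.2 component has infinite variance and dominates the upper tail.
```

It is this slowly decaying tail that breaks the usual transform-based queueing analysis, since the Pareto Laplace transform has no simple closed form.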

12.
Introducing a shape parameter to an exponential model is nothing new, and there are many ways to do so; the different methods result in a variety of weighted exponential (WE) distributions. In this article, we introduce a shape parameter to an exponential model using the idea of Azzalini, which results in a new class of WE distributions. This new WE model has a probability density function (PDF) whose shape is very close to those of the Weibull, gamma, or generalized exponential distributions, so it can be used as an alternative to any of them. It is observed that this model can also be obtained as a hidden truncation model. Different properties of the new model are discussed and compared with the corresponding properties of these well-known distributions. Two data sets are analysed for illustrative purposes, and in both cases the new model fits better than the Weibull, gamma, or generalized exponential distributions.
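One common Azzalini-type weighted exponential density (the exact parametrization in the article may differ) takes f(x) proportional to g(x)G(αx), with g and G the exponential pdf and cdf, which normalizes to f(x) = ((α+1)/α) λ e^{−λx} (1 − e^{−αλx}).

```python
import numpy as np

def we_pdf(x, alpha, lam):
    """An Azzalini-type weighted exponential density (assumed form)."""
    x = np.asarray(x, dtype=float)
    c = (alpha + 1.0) / alpha            # normalizing constant for g(x) * G(alpha*x)
    return c * lam * np.exp(-lam * x) * (1.0 - np.exp(-alpha * lam * x))

# Numerical check that the density integrates to one (trapezoid rule).
x = np.linspace(0.0, 80.0, 400001)
f = we_pdf(x, alpha=1.5, lam=0.5)
dx = x[1] - x[0]
area = np.sum((f[1:] + f[:-1]) / 2.0) * dx
```

The factor (1 − e^{−αλx}) vanishes at the origin and tends to one in the tail, which is how the weighting bends the exponential shape toward the unimodal gamma/Weibull-like PDFs the abstract mentions.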

13.
Although devised in 1936 by Fisher, discriminant analysis is still rapidly evolving as the complexity of contemporary data sets grows exponentially. Our classification rules explore these complexities by modeling various correlations in higher-order data. Moreover, our classification rules are suitable for data sets where the number of response variables is comparable to, or larger than, the number of observations. We assume that the higher-order observations have a separable variance-covariance matrix and two different Kronecker product structures on the mean vector. In this article, we develop quadratic classification rules among g different populations where each individual has κth-order (κ ≥ 2) measurements. We also provide computational algorithms to compute the maximum likelihood estimates of the model parameters and, eventually, the sample classification rules.

14.
A general class of models for discrete and/or continuous responses is proposed in which joint distributions are constructed via the conditional approach. It is assumed that the distribution of one response, and that of the other response given the first, belong to the exponential family of distributions. Furthermore, the marginal means are related to the covariates by link functions, and a dependency structure between the responses is inserted into the model. Estimation methods, diagnostic analysis, and a simulation study considering a Bernoulli-exponential model, a particular case of the class, are presented. Finally, the model is applied to a real data set.

15.
Informative dropout is a vexing problem for any biomedical study. Most existing statistical methods attempt to correct estimation bias related to this phenomenon by specifying unverifiable assumptions about the dropout mechanism. We consider a cohort study in Africa that uses an outreach programme to ascertain the vital status for dropout subjects. These data can be used to identify a number of relevant distributions. However, as only a subset of dropout subjects were followed, vital status ascertainment was incomplete. We use semi‐competing risk methods as our analysis framework to address this specific case where the terminal event is incompletely ascertained and consider various procedures for estimating the marginal distribution of dropout and the marginal and conditional distributions of survival. We also consider model selection and estimation efficiency in our setting. Performance of the proposed methods is demonstrated via simulations, asymptotic study and analysis of the study data.

16.
This paper deals with the analysis of multivariate survival data from a Bayesian perspective using Markov-chain Monte Carlo methods. The Metropolis along with the Gibbs algorithm is used to calculate some of the marginal posterior distributions. A multivariate survival model is proposed, since survival times within the same group are correlated as a consequence of a frailty random block effect. The conditional proportional-hazards model of Clayton and Cuzick is used with a martingale structured prior process (Arjas and Gasbarra) for the discretized baseline hazard. Besides the calculation of the marginal posterior distributions of the parameters of interest, this paper presents some Bayesian EDA diagnostic techniques to detect model adequacy. The methodology is exemplified with kidney infection data where the times to infections within the same patients are expected to be correlated.

17.
The Weibull, log-logistic and log-normal distributions are extensively used to model time-to-event data. The Weibull family accommodates only monotone hazard rates, whereas the log-logistic and log-normal are widely used to model unimodal hazard functions. The increasing availability of lifetime data with a wide range of characteristics motivate us to develop more flexible models that accommodate both monotone and nonmonotone hazard functions. One such model is the exponentiated Weibull distribution which not only accommodates monotone hazard functions but also allows for unimodal and bathtub shape hazard rates. This distribution has demonstrated considerable potential in univariate analysis of time-to-event data. However, the primary focus of many studies is rather on understanding the relationship between the time to the occurrence of an event and one or more covariates. This leads to a consideration of regression models that can be formulated in different ways in survival analysis. One such strategy involves formulating models for the accelerated failure time family of distributions. The most commonly used distributions serving this purpose are the Weibull, log-logistic and log-normal distributions. In this study, we show that the exponentiated Weibull distribution is closed under the accelerated failure time family. We then formulate a regression model based on the exponentiated Weibull distribution, and develop large sample theory for statistical inference. We also describe a Bayesian approach for inference. Two comparative studies based on real and simulated data sets reveal that the exponentiated Weibull regression can be valuable in adequately describing different types of time-to-event data.
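The exponentiated Weibull distribution function is F(t) = [1 − exp(−(t/σ)^α)]^θ; the sketch below (with illustrative parameter names) shows the distribution and its hazard, computed numerically. Setting θ = 1 recovers the ordinary Weibull, while other (α, θ) pairs give the unimodal or bathtub hazards the abstract describes.

```python
import numpy as np

def ew_cdf(t, alpha, theta, sigma=1.0):
    """Exponentiated Weibull cdf: [1 - exp(-(t/sigma)^alpha)]^theta."""
    return (1.0 - np.exp(-(t / sigma) ** alpha)) ** theta

def ew_hazard(t, alpha, theta, sigma=1.0, eps=1e-6):
    """Numerical hazard h(t) = f(t) / S(t) via a central difference."""
    f = (ew_cdf(t + eps, alpha, theta, sigma)
         - ew_cdf(t - eps, alpha, theta, sigma)) / (2.0 * eps)
    return f / (1.0 - ew_cdf(t, alpha, theta, sigma))

t = np.linspace(0.1, 3.0, 50)
h_weibull = ew_hazard(t, alpha=2.0, theta=1.0)   # theta = 1: Weibull, h(t) = 2t
```

An accelerated failure time regression then enters by letting the scale σ depend on covariates, e.g. σ = exp(x'β), which rescales time without changing the hazard shape family.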

18.
Shapes of service-time distributions in queueing network models have a great impact on the distribution of system response times. It is essential for the analysis of the response-time distribution that the modeled service-time distributions have the correct shape. Traditionally, modeling of service-time distributions is based on a parametric approach: assuming a specific distribution and estimating its parameters. We introduce an alternative approach based on the principles of exploratory data analysis and nonparametric data modeling. The proposed method applies nonlinear data transformation and resistant curve fitting, and can be used when the available data is a complete sample, a histogram, or the mean and a set of 5-10 quantiles. The reported results indicate that the proposed method approximates the distribution of measured service times well enough that accurate estimates of the quantiles of the response-time distribution are obtained.

19.
A Multivariate Model for Repeated Failure Time Measurements
A parametric multivariate failure time distribution is derived from a frailty-type model with a particular frailty distribution. It covers as special cases certain distributions that have been used for multivariate survival data in recent years. Some properties of the distribution are derived: its marginal and conditional distributions lie within the parametric family, and association between the component variates can be positive or, to a limited extent, negative. The simple closed form of the survivor function is useful for right-censored data, as occur commonly in survival analysis, and for calculating uniform residuals. Also featured is the distribution of ratios of paired failure times. The model is applied to data from the literature.

20.
Receiver operating characteristic (ROC) curves are useful for studying the performance of diagnostic tests. They occur in many fields of application, including psychophysics, quality control, and medical diagnostics. In practical situations, the responses to a diagnostic test are often classified into a number of ordered categories; such data are referred to as ratings data. It is typically assumed that the underlying model is based on a continuous probability distribution, and the ROC curve is then constructed from such data using this probability model, so the curve's properties are inherited from the model. Therefore, understanding the role of different probability distributions in ROC modeling is an interesting and important area of research. In this paper, the Lomax distribution is considered as a model for ratings data and the corresponding ROC curve is derived. The maximum likelihood estimation procedure for the related parameters is discussed and then illustrated in the analysis of a neurological data example.
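A hedged sketch of how a Lomax model induces a closed-form ROC curve: with Lomax survival S(x) = (1 + x/λ)^{−a} and, as a simplifying assumption (not necessarily the paper's setup), a common scale λ in both groups, the false-positive rate at cutoff c is t = S0(c) = (1 + c/λ)^{−a0}, so 1 + c/λ = t^{−1/a0} and the true-positive rate is S1(c) = t^{a1/a0}: a power ROC curve.

```python
import numpy as np

def lomax_roc(t, a0, a1):
    """ROC curve for equal-scale Lomax groups with shapes a0 (healthy), a1 (diseased)."""
    return t ** (a1 / a0)

t = np.linspace(0.0, 1.0, 101)
roc = lomax_roc(t, a0=3.0, a1=1.0)
dt = t[1] - t[0]
auc = np.sum((roc[1:] + roc[:-1]) / 2.0) * dt   # analytic AUC here is 3/4
```

A smaller shape for the diseased group means a heavier tail, lifting the curve above the diagonal; a1 = a0 gives the chance line ROC(t) = t.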
