Similar Literature
20 similar documents retrieved (search time: 31 ms)
1.
Although the variance-gamma distribution is a flexible model for log-returns of financial assets, so far it has found rather limited applications in finance and risk management. One of the reasons is that maximum likelihood estimation of its parameters is not straightforward. We develop an EM-type algorithm based on Nitithumbundit and Chan (An ECM algorithm for skewed multivariate variance gamma distribution in normal mean–variance representation, arXiv:1504.01239, 2015) that bypasses the evaluation of the full likelihood, which may be difficult because the density is not in closed form and is unbounded for small values of the shape parameter. Moreover, we study the relative efficiency of our approach with respect to the maximum likelihood estimation procedures implemented in the VarianceGamma and ghyp R packages. Extensive simulation experiments and real-data analyses suggest that the multicycle ECM algorithm gives the best results in terms of root mean squared error, for both parameter and value-at-risk estimation. The performance of the routines in the ghyp R package is similar but not as good, whereas the VarianceGamma package produces worse results, especially when the shape parameter is small.
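The normal mean–variance representation the cited ECM work builds on writes a variance-gamma variate as X = μ + θG + σ√G·Z with a gamma mixing variable G of unit mean, which is also what makes EM-type algorithms natural here. A minimal simulation sketch of that representation (illustrative only, not the authors' ECM implementation; parameter values are arbitrary):

```python
import math
import random

def simulate_vg(n, mu, theta, sigma, nu, seed=1):
    """Draw n variance-gamma variates via the normal mean-variance
    mixture X = mu + theta*G + sigma*sqrt(G)*Z, with the mixing
    variable G ~ Gamma(shape=1/nu, scale=nu) so that E[G] = 1."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        g = rng.gammavariate(1.0 / nu, nu)   # E[G] = 1, Var(G) = nu
        z = rng.gauss(0.0, 1.0)
        out.append(mu + theta * g + sigma * math.sqrt(g) * z)
    return out

x = simulate_vg(200_000, mu=0.0, theta=0.1, sigma=0.2, nu=0.5)
mean = sum(x) / len(x)   # should be close to E[X] = mu + theta = 0.1
```

Small `nu` concentrates G near 1 (close to Gaussian returns); large `nu` fattens the tails, which is the regime where the abstract reports likelihood-based routines struggling.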

2.
Handling dependence or not in feature selection is still an open question in supervised classification problems where the number of covariates exceeds the number of observations. Some recent papers surprisingly show the superiority of naive Bayes approaches based on an obviously erroneous assumption of independence, whereas others recommend inferring the dependence structure in order to decorrelate the selection statistics. In the classical linear discriminant analysis (LDA) framework, the present paper first highlights the impact of dependence in terms of instability of feature selection. A second objective is to revisit the above issue using a flexible factor model for the covariance. This framework introduces latent components of dependence, conditionally on which a new Bayes consistency is defined. A procedure is then proposed for the joint estimation of the expectation and variance parameters of the model. The present method is compared to recent regularized diagonal discriminant analysis approaches, which assume independence among features, and to regularized LDA procedures, both in terms of classification performance and stability of feature selection. The proposed method is implemented in the R package FADA, freely available from the R repository CRAN.

3.
This article considers explicit and detailed theoretical and empirical Bayesian analysis of the well-known Poisson regression model for count data with unobserved individual effects based on the lognormal, rather than the popular negative binomial distribution. Although the negative binomial distribution leads to analytical expressions for the likelihood function, a Poisson-lognormal model is closer to the concept of regression with normally distributed innovations, and accounts for excess zeros as well. Such models have been considered widely in the literature (Winkelmann, 2008). The article also provides the necessary theoretical results regarding the posterior distribution of the model. Given that the likelihood function involves integrals with respect to the latent variables, numerical methods organized around Gibbs sampling with data augmentation are proposed for likelihood analysis of the model. The methods are applied to the patent-R&D relationship of 70 US pharmaceutical and biomedical companies, and the Poisson-lognormal model is found to perform better than Poisson regression or negative binomial regression models.
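The overdispersion that motivates the latent lognormal effect is easy to see by simulation: conditional on a lognormal rate, counts have variance strictly larger than their mean. A small sketch (illustrative values only, not the article's Gibbs sampler):

```python
import math
import random

def poisson_draw(rng, lam):
    """Knuth's multiplication algorithm for one Poisson(lam) draw
    (adequate for the moderate rates used here)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1

def simulate_poisson_lognormal(n, beta0, sigma, seed=5):
    """y_i ~ Poisson(exp(beta0 + b_i)) with b_i ~ N(0, sigma^2):
    a Poisson-lognormal model with no covariates."""
    rng = random.Random(seed)
    return [poisson_draw(rng, math.exp(beta0 + rng.gauss(0.0, sigma)))
            for _ in range(n)]

y = simulate_poisson_lognormal(50_000, beta0=1.0, sigma=0.7)
m = sum(y) / len(y)
v = sum((yi - m) ** 2 for yi in y) / (len(y) - 1)  # v clearly exceeds m
```

Under this model E[Y] = exp(β₀ + σ²/2) while Var(Y) = E[Y] + (e^{σ²} − 1)E[Y]², so the mean–variance equality of the plain Poisson fails as soon as σ > 0.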

4.
Finite mixture models can adequately model population heterogeneity when this heterogeneity arises from a finite number of relatively homogeneous clusters. An example of such a situation is market segmentation. Order selection in mixture models, i.e. selecting the correct number of components, however, is a problem which has not been satisfactorily resolved. Existing simulation results in the literature do not completely agree with each other. Moreover, it appears that the performance of different selection methods is affected by the type of model and the parameter values. Furthermore, most existing results are based on simulations where the true generating model is identical to one of the models in the candidate set. In order to partly fill this gap we carried out a (relatively) large simulation study for finite mixture models of normal linear regressions. We included several types of model (mis)specification to study the robustness of 18 order selection methods. Furthermore, we compared the performance of these selection methods based on unpenalized and penalized estimates of the model parameters. The results indicate that order selection based on penalized estimates greatly improves the success rates of all order selection methods. The most successful methods were \(MDL2\), \(MRC\), \(MRC_k\), \(ICL\)-\(BIC\), \(ICL\), \(CAIC\), \(BIC\) and \(CLC\), but no single method was consistently good or best for all types of model (mis)specification.

5.
We propose new time-dependent sensitivity, specificity, ROC curves and net reclassification indices that can take into account biomarkers or scores that are repeatedly measured at different time-points. Inference proceeds through inverse probability weighting and resampling. The newly proposed measures exploit the information contained in biomarkers measured at different visits, rather than using only the measurements at the first visits. The contribution is illustrated via simulations and an original application on patients affected by dilated cardiomyopathy. The aim is to evaluate whether repeated binary measurements of right ventricular dysfunction add prognostic information on mortality/urgent heart transplant. It is shown that taking into account the trajectory of the new biomarker improves risk classification, while the first measurement alone might not be sufficiently informative. The methods are implemented in an R package (longROC), freely available on CRAN.
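As a baseline for the time-dependent versions proposed here, recall that the ordinary empirical AUC is just the proportion of case–control pairs ranked correctly (ties counted half). A minimal sketch of that baseline only — not the authors' inverse-probability-weighted, time-dependent estimator:

```python
def empirical_auc(cases, controls):
    """Probability that a randomly chosen case marker value exceeds a
    randomly chosen control value, counting ties as 1/2 (empirical AUC)."""
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

The time-dependent extension replaces the fixed case/control split with event status by time t and reweights for censoring; the pairwise-comparison core stays the same.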

6.
This work proposes a novel method through which local information about the target density can be used to construct an efficient importance sampler. The backbone of the proposed method is the incremental mixture importance sampling (IMIS) algorithm of Raftery and Bao (Biometrics 66(4):1162–1173, 2010), which builds a mixture importance distribution incrementally, by positioning new mixture components where the importance density lacks mass, relative to the target. The key innovation proposed here is to construct the mean vectors and covariance matrices of the mixture components by numerically solving certain differential equations, whose solution depends on the local shape of the target log-density. The new sampler has a number of advantages: (a) it provides an extremely parsimonious parametrization of the mixture importance density, whose configuration effectively depends only on the shape of the target and on a single free parameter representing pseudo-time; (b) it scales well with the dimensionality of the target; (c) it can deal with targets that are not log-concave. The performance of the proposed approach is demonstrated on two synthetic non-Gaussian densities, one being defined on up to eighty dimensions, and on a Bayesian logistic regression model, using the Sonar dataset. The Julia code implementing the importance sampler proposed here can be found at https://github.com/mfasiolo/LIMIS.
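The generic mechanics of mixture importance sampling can be sketched in a few lines: draw from a Gaussian mixture proposal, weight by (unnormalized) target over proposal, and self-normalize. This illustrates only the base machinery — the IMIS/LIMIS component-placement logic is the papers' contribution and is not reproduced here:

```python
import math
import random

def norm_pdf(x, m, s):
    """Density of N(m, s^2) at x."""
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))

def snis_mean_sq(n=200_000, seed=2):
    """Self-normalized importance sampling estimate of E[X^2] under the
    unnormalized target exp(-x^2/2), i.e. a standard normal, using a
    two-component Gaussian mixture proposal."""
    rng = random.Random(seed)
    comps = [(-1.0, 1.5), (1.0, 1.5)]        # (mean, sd) of each component
    num = den = 0.0
    for _ in range(n):
        m, s = comps[rng.randrange(2)]       # pick a component uniformly
        x = rng.gauss(m, s)
        q = 0.5 * norm_pdf(x, *comps[0]) + 0.5 * norm_pdf(x, *comps[1])
        w = math.exp(-0.5 * x * x) / q       # unnormalized target / proposal
        num += w * x * x
        den += w
    return num / den                         # E[X^2] = 1 for N(0,1)

est = snis_mean_sq()
```

Self-normalization means the target normalizing constant is never needed, which is exactly why such samplers suit Bayesian posteriors known only up to a constant.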

7.
In this paper, we utilize normal/independent (NI) distributions as a tool for robust modeling of linear mixed models (LMM) under a Bayesian paradigm. The purpose is to develop a non-iterative sampling method to obtain i.i.d. samples approximately from the observed posterior distribution by combining the inverse Bayes formulae, sampling/importance resampling and posterior mode estimates from the expectation–maximization algorithm to LMMs with NI distributions, as suggested by Tan et al. (2003). The proposed algorithm provides a novel alternative to perfect sampling and eliminates the convergence problems of Markov chain Monte Carlo methods. In order to examine the robust aspects of the NI class, against outlying and influential observations, we present a Bayesian case deletion influence diagnostics based on the Kullback–Leibler divergence. Further, some discussions on model selection criteria are given. The new methodologies are exemplified through a real data set, illustrating the usefulness of the proposed methodology.

8.
In this paper, we use Markov Chain Monte Carlo (MCMC) methods in order to estimate and compare stochastic production frontier models from a Bayesian perspective. We consider a number of competing models in terms of different production functions and the distribution of the asymmetric error term. All MCMC simulations are done using the package JAGS (Just Another Gibbs Sampler), a clone of the classic BUGS package that interfaces closely with R, where all the statistical computations and graphics are done.

9.
In this article, we extend the Halton sequence for generating quasi-random numbers to an optimal sequence which has the inversive property. The newly constructed quasi-random number generator satisfies an extra uniformity condition on [0, 1]. We finally present the performance of this generator in contrast to the former optimal Halton sequence of Chi et al. (2005) and the modified optimal Halton sequence of Fathi et al. (2009).
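For reference, the base-b radical inverse below is the ordinary (unscrambled) Halton construction that the optimal and scrambled variants in the cited papers modify; those scramblings are not reproduced here:

```python
def radical_inverse(i, base):
    """Base-b radical inverse: reflect the base-b digits of i about the
    radix point. halton point d of index i uses the d-th prime as base."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

# First base-2 Halton points: 1/2, 1/4, 3/4, 1/8, ...
points = [radical_inverse(i, 2) for i in range(1, 5)]
```

Each successive point falls in the largest remaining gap, which is the low-discrepancy property the article's extra uniformity condition sharpens.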

10.
Richter and McCann (2007) presented a median-based multiple comparison procedure for assessing evidence of group location differences. The sampling distribution was based on the permutation distribution of the maximum median difference among all pairs, and provides strong control of the FWE. This idea is extended to develop a step-down procedure for comparing group locations. The new step-down procedure exploits logical dependencies between pairwise hypotheses and provides greater power than the single-step procedure, while still maintaining strong FWE control. The new procedure can also be a more powerful alternative to existing methods based on means, especially for heavy-tailed distributions.
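The single-step reference distribution can be sketched as follows: permute the group labels and record the maximum pairwise median difference each time. This is an illustrative Monte-Carlo version with random permutations (the exact permutation distribution, and the step-down refinement, are not reproduced):

```python
import random
from itertools import combinations
from statistics import median

def max_median_diff(groups):
    """Maximum absolute pairwise difference of group medians."""
    meds = [median(g) for g in groups]
    return max(abs(a - b) for a, b in combinations(meds, 2))

def perm_pvalue(groups, n_perm=2000, seed=3):
    """Monte-Carlo permutation p-value for the max median difference."""
    rng = random.Random(seed)
    obs = max_median_diff(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        resampled, start = [], 0
        for s in sizes:
            resampled.append(pooled[start:start + s])
            start += s
        if max_median_diff(resampled) >= obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one correction

p = perm_pvalue([[1, 2, 3, 4, 5], [11, 12, 13, 14, 15]])
```

Because the test statistic is the maximum over all pairs, comparing every pairwise difference against this one reference distribution is what yields strong FWE control.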

11.
We address the issue of recovering the structure of large sparse directed acyclic graphs from noisy observations of the system. We propose a novel procedure based on a specific formulation of the \(\ell _1\)-norm regularized maximum likelihood, which decomposes the graph estimation into two optimization sub-problems: topological structure and node order learning. We provide convergence inequalities for the graph estimator, as well as an algorithm to solve the induced optimization problem, in the form of a convex program embedded in a genetic algorithm. We apply our method to various data sets (including data from the DREAM4 challenge) and show that it compares favorably to state-of-the-art methods. This algorithm is available on CRAN as the R package GADAG.

12.
Biradar and Santosha (2014) proposed maximum ranked set sampling with unequal samples (MRSSU) to estimate the mean of the exponential distribution. In this paper, we consider information measures of MRSSU in terms of Shannon entropy, Rényi entropy and Kullback-Leibler (KL) information. We also compare the uncertainty and information content of MRSSU with simple random sampling and ranked set sampling data. Finally, we develop some characterization results in terms of cumulative entropy and failure entropy of MRSSU.
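The information measures compared in the paper reduce to standard definitions; minimal discrete-case implementations follow (the article's results concern the continuous MRSSU densities — this is only the generic machinery):

```python
import math

def shannon_entropy(p):
    """H(p) = -sum p_i log p_i, in nats; zero-probability terms skipped."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1); recovers
    Shannon entropy in the limit alpha -> 1."""
    return math.log(sum(pi ** alpha for pi in p if pi > 0)) / (1 - alpha)

def kl_divergence(p, q):
    """KL(p || q); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

For the continuous case the sums become integrals over the sampling-scheme densities, which is where the MRSSU comparisons in the paper take place.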

13.
This article estimates the mean number of persons possessing a rare sensitive attribute based on the Mangat (1991) randomization device, utilizing the Poisson distribution in simple random sampling and stratified sampling. Properties of the proposed randomized response (RR) model have been studied along with recommendations. It is also shown that the proposed model is more efficient than that of Land et al. (2011) in simple random sampling and that of Lee et al. (2013) in stratified random sampling when the proportion of persons possessing a rare unrelated attribute is known. Numerical illustrations are also given in support of the present study.

14.
Item response theory (IRT) comprises a set of statistical models which are useful in many fields, especially when there is an interest in studying latent variables (or latent traits). Usually such latent traits are assumed to be random variables and a convenient distribution is assigned to them. A very common choice for such a distribution has been the standard normal. Recently, Azevedo et al. [Bayesian inference for a skew-normal IRT model under the centred parameterization, Comput. Stat. Data Anal. 55 (2011), pp. 353–365] proposed a skew-normal distribution under the centred parameterization (SNCP), as studied in [R.B. Arellano-Valle and A. Azzalini, The centred parametrization for the multivariate skew-normal distribution, J. Multivariate Anal. 99(7) (2008), pp. 1362–1382], to model the latent trait distribution. This approach allows one to represent any asymmetric behaviour of the latent trait distribution. They also developed a Metropolis–Hastings within Gibbs sampling (MHWGS) algorithm based on the density of the SNCP and showed that the algorithm recovers all parameters properly. Their results indicated that, in the presence of asymmetry, the proposed model and estimation algorithm perform better than the usual model and estimation methods. Our main goal in this paper is to propose another type of MHWGS algorithm based on a stochastic representation (hierarchical structure) of the SNCP studied in [N. Henze, A probabilistic representation of the skew-normal distribution, Scand. J. Statist. 13 (1986), pp. 271–275]. Our algorithm has only one Metropolis–Hastings step, in contrast to the algorithm developed by Azevedo et al., which has two such steps. This not only makes the implementation easier but also reduces the number of proposal densities to be used, which can be a problem in the implementation of MHWGS algorithms, as can be seen in [R.J. Patz and B.W. Junker, A straightforward approach to Markov Chain Monte Carlo methods for item response models, J. Educ. Behav. Stat. 24(2) (1999), pp. 146–178; R.J. Patz and B.W. Junker, The applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses, J. Educ. Behav. Stat. 24(4) (1999), pp. 342–366; A. Gelman, G.O. Roberts, and W.R. Gilks, Efficient Metropolis jumping rules, Bayesian Stat. 5 (1996), pp. 599–607]. Moreover, we consider a modified beta prior (which generalizes the one considered by Azevedo et al. (2011)) and a Jeffreys prior for the asymmetry parameter. Furthermore, we study the sensitivity of such priors as well as the use of different kernel densities for this parameter. Finally, we assess the impact of the number of examinees, the number of items and the asymmetry level on parameter recovery. Results of the simulation study indicated that our approach performs as well as that of Azevedo et al. (2011) in terms of parameter recovery, mainly when using the Jeffreys prior. They also indicated that the asymmetry level has the highest impact on parameter recovery, even though it is relatively small. A real data analysis is considered jointly with the development of model fitting assessment tools. The results are compared with those obtained by Azevedo et al. The results indicate that the hierarchical approach makes MCMC algorithms easier to implement, facilitates convergence diagnosis, and can be very useful for fitting more complex skew IRT models.
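Henze's (1986) stochastic representation that the proposed algorithm exploits writes a skew-normal variate as Z = δ|U| + √(1−δ²)·V with U, V independent standard normals. A quick simulation check of that representation (in the direct parameterization, for illustration only — the article works under the centred one):

```python
import math
import random

def skew_normal_sample(n, delta, seed=4):
    """Henze (1986): Z = delta*|U| + sqrt(1-delta^2)*V with U, V iid
    N(0,1) is skew-normal, with E[Z] = delta*sqrt(2/pi)."""
    rng = random.Random(seed)
    c = math.sqrt(1.0 - delta * delta)
    return [delta * abs(rng.gauss(0.0, 1.0)) + c * rng.gauss(0.0, 1.0)
            for _ in range(n)]

z = skew_normal_sample(200_000, delta=0.8)
zbar = sum(z) / len(z)   # close to 0.8*sqrt(2/pi) ~ 0.638
```

Treating |U| as a latent variable is what turns the skew-normal likelihood into a conditionally normal hierarchy, so only one Metropolis–Hastings step remains.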

15.
Near-records of a sequence, as defined in Balakrishnan et al. (2005), are observations lying within a fixed distance of the current record. In this article we study the asymptotic behavior of the number of near-records, among the first n observations in a sequence of independent, identically distributed and absolutely continuous random variables. We give conditions for the finiteness of the total number of near-records as well as laws of large numbers for their counting process. For distributions with a finite number of near-records, we carry out a simulation study suggesting that the total number of near-records has a geometric distribution.
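Under the Balakrishnan et al. (2005) definition — an observation is a near-record if it falls within a fixed distance a below the current record, without being a record itself — the counting process is straightforward to compute along one realization (a sketch under that definition):

```python
def count_near_records(xs, a):
    """Count near-records: x in (M - a, M], where M is the running
    maximum of the preceding observations; records themselves excluded."""
    count = 0
    record = None
    for x in xs:
        if record is None or x > record:
            record = x            # new record, not a near-record
        elif record - a < x <= record:
            count += 1            # within distance a of the current record
    return count

# 4.5 is within 1 of record 5; 5.9 is within 1 of record 6.
n_near = count_near_records([5.0, 3.0, 4.5, 6.0, 5.9], a=1.0)
```

Running this over many simulated i.i.d. sequences is exactly the kind of experiment behind the paper's geometric-distribution observation.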

16.
In this research, multiple dependent state and repetitive group sampling are used to design a variable sampling plan based on one-sided process capability indices, which consider the quality of the current lot as well as the quality of the preceding lots. The sample size and critical values of the proposed plan are determined by minimizing the average sample number while satisfying the producer's risk and consumer's risk at corresponding quality levels. In addition, comparisons are made with the existing sampling plans [Pearn and Wu (2006a); Yen et al. (2015)] in terms of average sample number and operating characteristic curve. Finally, an example is provided to illustrate the proposed plan.

17.
This article evaluates the economic benefit of methods that have been suggested to optimally sample (in an MSE sense) high-frequency return data for the purpose of realized variance/covariance estimation in the presence of market microstructure noise (Bandi and Russell, 2005a, 2008). We compare certainty equivalents derived from volatility-timing trading strategies relying on optimally-sampled realized variances and covariances, on realized variances and covariances obtained by sampling every 5 minutes, and on realized variances and covariances obtained by sampling every 15 minutes. In our sample, we show that a risk-averse investor who is given the option of choosing variance/covariance forecasts derived from MSE-based optimal sampling methods versus forecasts obtained from 5- and 15-minute intervals (as generally proposed in the literature) would be willing to pay up to about 80 basis points per year to achieve the level of utility that is guaranteed by optimal sampling. We find that the gains yielded by optimal sampling are economically large, statistically significant, and robust to realistic transaction costs.

18.
We consider the problem of unbiased estimation of a finite population proportion and compare the relative efficiency of the unequal probability sampling strategies due to Horvitz and Thompson (1952) and Murthy (1957) under a super-population model. It is shown that the model expected variance is smaller for Murthy's (1957) strategy both when these two sampling strategies are based on data obtained from (i) a direct survey, and (ii) a randomized response (RR) survey employing some RR technique following a general RR model.
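The Horvitz–Thompson estimator weights each sampled value by its inverse inclusion probability; its design-unbiasedness can be verified exhaustively for a tiny population under simple random sampling without replacement (illustrative only — the article's comparison is under a super-population model):

```python
from itertools import combinations

def ht_total(sample_values, incl_prob):
    """Horvitz-Thompson estimator of the population total with a common
    first-order inclusion probability."""
    return sum(y / incl_prob for y in sample_values)

population = [1.0, 2.0, 3.0, 4.0]
n = 2
pi = n / len(population)                 # SRSWOR: pi_i = n/N = 0.5

# Average the estimator over every possible sample of size n:
# it equals the true population total exactly.
samples = list(combinations(population, n))
avg_estimate = sum(ht_total(s, pi) for s in samples) / len(samples)
```

Murthy's estimator instead conditions on the (unordered) sample in an ordered draw-by-draw design; both are unbiased, so the paper's comparison is about model expected variance.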

19.
When outliers and/or heavy-tailed errors exist in linear models, the least absolute deviation (LAD) regression is a robust alternative to the ordinary least squares regression. Existing variable-selection methods in linear models based on LAD regression either only consider the finite number of predictors or lack the oracle property associated with the estimator. In this article, we focus on the variable selection via LAD regression with a diverging number of parameters. The rate of convergence of the LAD estimator with the smoothly clipped absolute deviation (SCAD) penalty function is established. Furthermore, we demonstrate that, under certain regularity conditions, the penalized estimator with a properly selected tuning parameter enjoys the oracle property. In addition, the rank correlation screening method originally proposed by Li et al. (2011) is applied to deal with ultrahigh dimensional data. Simulation studies are conducted for revealing the finite sample performance of the estimator. We further illustrate the proposed methodology by a real example.
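The SCAD penalty referred to here is the piecewise function of Fan and Li: linear near zero (like the lasso), a quadratic transition, then constant, so large coefficients escape shrinkage. A direct transcription with the conventional a = 3.7:

```python
def scad_penalty(t, lam, a=3.7):
    """SCAD penalty p_lam(t): lasso-like for |t| <= lam, quadratic
    transition on (lam, a*lam], constant (a+1)*lam^2/2 beyond a*lam."""
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return -(t * t - 2 * a * lam * t + lam * lam) / (2 * (a - 1))
    return (a + 1) * lam * lam / 2
```

The flat tail is what delivers the oracle property the article establishes: sufficiently large true coefficients incur a constant penalty and are estimated as if unpenalized.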

20.
Let \(X_1, X_2, \ldots, X_n\) be a sequence of Markov Bernoulli trials (MBT) and \(\underline{X}_n = (X_{n,k_1}, X_{n,k_2}, \ldots, X_{n,k_r})\) be a random vector, where \(X_{n,k_i}\) represents the number of occurrences of success runs of length \(k_i\) \((i = 1, 2, \ldots, r)\). In this paper the joint distribution of \(\underline{X}_n\) in the sequence of \(n\) MBT is studied using the method of conditional probability generating functions. Five different counting schemes of runs are considered, namely non-overlapping runs, runs of length at least \(k\), overlapping runs, runs of exact length \(k\), and \(\ell\)-overlapping runs (the \(\ell\)-overlapping counting scheme, \(0 \le \ell < k\)). The pgf of the joint distribution of \(\underline{X}_n\) is obtained in terms of a matrix polynomial and an algorithm is developed to obtain the exact probability distribution. Numerical results are included to demonstrate the computational flexibility of the developed results. Various applications of the joint distribution of \(\underline{X}_n\), such as the evaluation of the reliability of \((n,f,k)\!:\!G\) and \(\langle n,f,k \rangle\!:\!G\) systems, the evaluation of quantities related to start-up demonstration tests, and acceptance sampling plans, are also discussed.
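Two of the five counting schemes, in code, to make the distinction concrete (counting along one realized binary sequence; the article's pgf machinery for the Markov-dependent joint distribution itself is not reproduced):

```python
def non_overlapping_runs(seq, k):
    """Count success runs of length k, restarting the counter after each
    completed run (each success belongs to at most one counted run)."""
    count = run = 0
    for x in seq:
        run = run + 1 if x == 1 else 0
        if run == k:
            count += 1
            run = 0
    return count

def overlapping_runs(seq, k):
    """Count every window of k consecutive successes."""
    count = run = 0
    for x in seq:
        run = run + 1 if x == 1 else 0
        if run >= k:
            count += 1
    return count

s = [1, 1, 1, 1, 0, 1, 1]
# non-overlapping with k=2: runs at positions (1,2), (3,4), (6,7) -> 3
# overlapping with k=2: windows (1,2), (2,3), (3,4), (6,7) -> 4
```

The \(\ell\)-overlapping scheme interpolates between these two: consecutive counted runs may share up to \(\ell\) successes, recovering the non-overlapping count at \(\ell = 0\) and the overlapping count at \(\ell = k - 1\).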


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号