Similar Articles
20 similar articles found
1.
The Self-Healing Umbrella Sampling (SHUS) algorithm is an adaptive biasing algorithm proposed in Marsili et al. (J Phys Chem B 110(29):14011–14013, 2006) to efficiently sample a multimodal probability measure. We show that this method can be seen as a variant of the well-known Wang–Landau algorithm (Wang and Landau in Phys Rev E 64:056101, 2001a; Phys Rev Lett 86(10):2050–2053, 2001b). Adapting results on the convergence of the Wang–Landau algorithm obtained in Fort et al. (Math Comput 84(295):2297–2327, 2014a), we prove the convergence of the SHUS algorithm. We also compare the two methods in terms of efficiency. We finally propose a modification of the SHUS algorithm to increase its efficiency, and exhibit some similarities between SHUS and the well-tempered metadynamics method of Barducci et al. (Phys Rev Lett 100:020603, 2008).
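A minimal sketch of the adaptive-biasing idea shared by SHUS and Wang–Landau, assuming a toy double-well target; the potential V, the two-macrostate partition and the 1/n stepsize are illustrative stand-ins, not the authors' exact update rule:

```python
import numpy as np

rng = np.random.default_rng(0)

def V(x):                              # double-well potential with modes at -1 and +1
    return (x**2 - 1.0)**2

beta = 8.0                             # large beta => strong metastability
bin_of = lambda x: 0 if x < 0 else 1   # two macrostates: x < 0 and x >= 0
log_theta = np.zeros(2)                # adaptive biasing weights (log scale)

x = -1.0
for n in range(1, 100_000):
    # Metropolis step targeting the biased density pi(x) / theta(bin(x))
    y = x + 0.25 * rng.standard_normal()
    log_acc = -beta * (V(y) - V(x)) - (log_theta[bin_of(y)] - log_theta[bin_of(x)])
    if np.log(rng.random()) < log_acc:
        x = y
    # Wang-Landau / SHUS-type update: penalize the macrostate just visited;
    # the 1/n decay mimics a self-regulated stepsize schedule
    log_theta[bin_of(x)] += 1.0 / n

print("relative macrostate weights:", np.exp(log_theta - log_theta.max()))
```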

2.
Latent growth curve models as structural equation models are extensively discussed in various research fields (Curran and Muthén in Am. J. Community Psychol. 27:567–595, 1999; Duncan et al. in An introduction to latent variable growth curve modeling. Concepts, issues and applications, 2nd edn., Lawrence Erlbaum, Mahwah, 2006; Muthén and Muthén in Alcohol. Clin. Exp. Res. 24(6):882–891, 2000a; in J. Stud. Alcohol. 61:290–300, 2000b). Recent methodological and statistical extensions focus on unobserved heterogeneity in empirical data. Muthén extended the classic structural equation approach by mixture components, i.e. categorical latent classes (Muthén in Marcoulides, G.A., Schumacker, R.E. (eds.), New developments and techniques in structural equation modeling, pp. 1–33, Lawrence Erlbaum, Mahwah, 2001a; in Behaviormetrika 29(1):81–117, 2002; in Kaplan, D. (ed.), The SAGE handbook of quantitative methodology for the social sciences, pp. 345–368, Sage, Thousand Oaks, 2004). The paper discusses applications of growth mixture models to data on delinquent behavior of adolescents from the German panel study Crime in the modern City (CrimoC) (Boers et al. in Eur. J. Criminol. 7:499–520, 2010; Reinecke in Delinquenzverläufe im Jugendalter: Empirische Überprüfung von Wachstums- und Mischverteilungsmodellen, Institut für sozialwissenschaftliche Forschung e.V., Münster, 2006a; in Methodology 2:100–112, 2006b; in van Montfort, K., Oud, J., Satorra, A. (eds.), Longitudinal models in the behavioral and related sciences, pp. 239–266, Lawrence Erlbaum, Mahwah, 2007). Observed as well as unobserved heterogeneity is considered with growth mixture models. Special attention is given to the distribution of the outcome variables as counts: Poisson and negative binomial distributions with zero inflation are considered in the proposed growth mixture models. Different model specifications are emphasized with respect to their particular parameterizations.
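As a small illustration of the count-outcome piece, a sketch of the zero-inflated Poisson log-likelihood that such growth mixture models build on; the function name and example data are hypothetical:

```python
import numpy as np
from scipy.special import gammaln

def zip_loglik(y, lam, pi):
    """Log-likelihood of a zero-inflated Poisson: with probability pi the count
    is a structural zero, otherwise it is Poisson(lam)."""
    y = np.asarray(y)
    logpois = -lam + y * np.log(lam) - gammaln(y + 1)     # Poisson log-pmf
    ll_zero = np.log(pi + (1 - pi) * np.exp(-lam))        # contribution for y == 0
    ll_pos = np.log(1 - pi) + logpois                     # contribution for y > 0
    return np.where(y == 0, ll_zero, ll_pos).sum()

print(zip_loglik([0, 0, 1, 3, 0, 2], lam=1.2, pi=0.3))
```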

3.
This paper discusses regression analysis of doubly censored failure time data when there may exist a cured subgroup. By doubly censored data, we mean that the failure time of interest denotes the elapsed time between two related events and the observations on both event times can suffer censoring (Sun in The statistical analysis of interval-censored failure time data. Springer, New York, 2006). One typical example of such data is given by an acquired immune deficiency syndrome cohort study. Although many methods have been developed for their analysis (De Gruttola and Lagakos in Biometrics 45:1–12, 1989; Sun et al. in Biometrics 55:909–914, 1999; 60:637–643, 2004; Pan in Biometrics 57:1245–1250, 2001), no established method seems to exist for the situation with a cured subgroup. This paper discusses this latter problem and presents a sieve approximation maximum likelihood approach. In addition, the asymptotic properties of the resulting estimators are established, and an extensive simulation study indicates that the method works well in practical situations. An application is also provided.
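To show where a cured subgroup enters a likelihood, a simplified sketch for the right-censored (not doubly censored) case with exponential latency; this is an illustrative analogue, not the paper's sieve estimator:

```python
import numpy as np

def cure_model_loglik(t, delta, pi_cure, lam):
    """Log-likelihood of a simple mixture cure model with exponential latency:
    S_pop(t) = pi_cure + (1 - pi_cure) * exp(-lam * t).
    t: observed times; delta: 1 = event observed, 0 = right-censored."""
    t, delta = np.asarray(t, float), np.asarray(delta)
    log_f = np.log(1 - pi_cure) + np.log(lam) - lam * t            # uncured density
    log_S = np.log(pi_cure + (1 - pi_cure) * np.exp(-lam * t))     # population survival
    return np.where(delta == 1, log_f, log_S).sum()

print(cure_model_loglik([2.0, 5.0, 7.5], [1, 0, 0], pi_cure=0.4, lam=0.3))
```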

4.
The randomized response technique (RRT) is an important tool that is commonly used to protect a respondent's privacy and avoid biased answers in surveys on sensitive issues. In this work, we consider the joint use of the unrelated-question RRT of Greenberg et al. (J Am Stat Assoc 64:520–539, 1969) and the related-question RRT of Warner (J Am Stat Assoc 60:63–69, 1965) to deal with the issue of an innocuous question in the unrelated-question RRT. Unlike the existing unrelated-question RRT of Greenberg et al. (1969), the approach can provide more information on the innocuous question by using the related-question RRT of Warner (1965), effectively improving the efficiency of the maximum likelihood estimator of Scheers and Dayton (J Am Stat Assoc 83:969–974, 1988). We can then estimate the prevalence of the sensitive characteristic by using logistic regression. For this new design, we propose the transformation method and provide large-sample properties. Motivated by two survey studies, an extramarital relationship study and a cable TV study, we develop a joint conditional likelihood method. As part of this research, we conduct a simulation study of the relative efficiencies of the proposed methods. Furthermore, we use the two survey studies to compare the analysis results under different scenarios.
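A sketch of the classical Warner (1965) moment estimator that the related-question design builds on; the survey counts below are made up:

```python
def warner_estimate(n_yes, n, p):
    """Warner (1965) related-question RRT: each respondent answers the sensitive
    statement with probability p and its negation with probability 1 - p, so
    P(yes) = p*pi + (1 - p)*(1 - pi) and pi is recovered by inversion."""
    lam = n_yes / n                                    # observed proportion of 'yes'
    pi_hat = (lam - (1 - p)) / (2 * p - 1)
    var_hat = lam * (1 - lam) / (n * (2 * p - 1) ** 2)
    return pi_hat, var_hat

pi_hat, var_hat = warner_estimate(n_yes=380, n=1000, p=0.7)
print(f"pi_hat = {pi_hat:.3f}, se = {var_hat ** 0.5:.3f}")
```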

5.
In this work we provide a decomposition by sources of the inequality index \(\zeta \) defined by Zenga (Giornale degli Economisti e Annali di Economia 43(5–6):301–326, 1984). The source contributions are obtained with the method proposed in Zenga et al. (Stat Appl X(1):3–31, 2012) and Zenga (Stat Appl XI(2):133–161, 2013), which allows different inequality measures to be compared. This method is based on the decomposition of inequality curves. To apply this decomposition to the index \(\zeta \) and its inequality curve, we adapt the method to the “cograduation” table. Moreover, we consider the case of linear transformations of sources and analyse the corresponding results.

6.
The skew normal distribution of Azzalini (Scand J Stat 12:171–178, 1985) has been found suitable for unimodal densities with some skewness present. In this article, we introduce a flexible extension of the Azzalini (Scand J Stat 12:171–178, 1985) skew normal distribution based on a symmetric component normal distribution (Gui et al. in J Stat Theory Appl 12(1):55–66, 2013). The proposed model can efficiently capture bimodality, skewness, kurtosis and heavy tails. The paper presents various basic properties of this family of distributions and provides two stochastic representations which are useful for obtaining theoretical properties and for simulating from the distribution. Further, maximum likelihood estimation of the parameters is studied numerically by simulation, and the distribution is investigated through comparative fits to three real datasets.
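A sketch of the Azzalini base model the extension starts from: its density and the standard stochastic representation used for simulation (the bimodal extension itself is not reproduced here):

```python
import numpy as np
from scipy.stats import norm

def skew_normal_pdf(x, alpha):
    """Azzalini (1985) skew normal density: f(x) = 2 * phi(x) * Phi(alpha * x)."""
    return 2.0 * norm.pdf(x) * norm.cdf(alpha * x)

def skew_normal_rvs(alpha, size, rng=np.random.default_rng(0)):
    """Stochastic representation: Z = delta*|U| + sqrt(1 - delta^2)*V with
    U, V iid N(0,1) and delta = alpha / sqrt(1 + alpha^2)."""
    delta = alpha / np.sqrt(1 + alpha ** 2)
    u, v = rng.standard_normal(size), rng.standard_normal(size)
    return delta * np.abs(u) + np.sqrt(1 - delta ** 2) * v

z = skew_normal_rvs(alpha=3.0, size=10_000)
print("sample skewness:", ((z - z.mean()) ** 3).mean() / z.std() ** 3)
```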

7.
Using a wavelet basis, we establish in this paper upper bounds on the \( L^{p}({\mathbb {R}}^{d}) \) risk of wavelet estimators of regression functions with strong mixing data, for \( 1\le p<\infty \). In contrast to the independent case, these upper bounds have different analytic formulae for \(p\in [1, 2]\) and \(p\in (2, +\infty )\). For \(p=2\), our result reduces to a theorem of Chaubey et al. (J Nonparametr Stat 25:53–71, 2013); for \(d=1\) and \(p=2\), it becomes the corresponding theorem of Chaubey and Shirazi (Commun Stat Theory Methods 44:885–899, 2015).

8.
Copula models have become increasingly popular for modelling the dependence structure in multivariate survival data. The two-parameter Archimedean family of Power Variance Function (PVF) copulas includes the Clayton, Positive Stable (Gumbel) and Inverse Gaussian copulas as special or limiting cases, and thus offers a unified approach to fitting these important copulas. Two-stage frequentist procedures for estimating the marginal distributions and the PVF copula have been suggested by Andersen (Lifetime Data Anal 11:333–350, 2005), Massonnet et al. (J Stat Plann Inference 139(11):3865–3877, 2009) and Prenen et al. (J R Stat Soc Ser B 79(2):483–505, 2017), which first estimate the marginal distributions and, conditional on these, estimate the PVF copula parameters in a second step. Here we explore a one-stage Bayesian approach that simultaneously estimates the marginal and the PVF copula parameters. For the marginal distributions, we consider both parametric and semiparametric models. We propose a new method to simulate uniform pairs with PVF dependence structure based on conditional sampling for copulas and on numerical approximation to solve a target equation. In a simulation study, small sample properties of the Bayesian estimators are explored. We illustrate the usefulness of the methodology using data on times to appendectomy for adult twins in the Australian NH&MRC Twin Registry. Parameters of the marginal distributions and the PVF copula are simultaneously estimated in a parametric as well as a semiparametric approach, where the marginal distributions are modelled using Weibull and piecewise exponential distributions, respectively.
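A sketch of conditional sampling for the Clayton copula, the PVF special case for which the conditional inverse is available in closed form; for general PVF copulas the target equation must be solved numerically, as the paper does:

```python
import numpy as np
from scipy.stats import kendalltau

def clayton_pairs(theta, n, rng=np.random.default_rng(1)):
    """Conditional sampling: draw U ~ Unif(0,1), then invert the conditional
    cdf C(v | u) of the Clayton copula at a second uniform draw W."""
    u = rng.random(n)
    w = rng.random(n)
    v = ((w ** (-theta / (1 + theta)) - 1) * u ** (-theta) + 1) ** (-1 / theta)
    return u, v

u, v = clayton_pairs(theta=2.0, n=5000)
# for Clayton, Kendall's tau = theta / (theta + 2) = 0.5 here
print("empirical Kendall tau:", kendalltau(u, v)[0])
```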

9.
We study the computation of Gaussian orthant probabilities, i.e. the probability that a Gaussian variable falls inside a quadrant. The Geweke–Hajivassiliou–Keane (GHK) algorithm (Geweke, Comput Sci Stat 23:571–578, 1991; Keane, Simulation estimation for panel data models with limited dependent variables, 1993; Hajivassiliou, J Econom 72:85–134, 1996; Genz, J Comput Graph Stat 1:141–149, 1992) is currently used for integrals of dimension greater than 10. In this paper, we show that for Markovian covariances GHK can be interpreted as the estimator of the normalizing constant of a state-space model using sequential importance sampling. We show that for an AR(1) model the variance of the GHK estimator, properly normalized, diverges exponentially fast with the dimension. As an improvement we propose using a particle filter. We then generalize this idea to arbitrary covariance matrices using sequential Monte Carlo with properly tailored MCMC moves. We show empirically that this can lead to drastic improvements over currently used algorithms. We also extend the framework to orthants of mixtures of Gaussians (Student, Cauchy, etc.), and to the simulation of truncated Gaussians.
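A minimal sketch of the GHK estimator as sequential importance sampling along the Cholesky factor; the dimension, bounds and number of draws are illustrative:

```python
import numpy as np
from scipy.stats import norm

def ghk(Sigma, lower, upper, n_draws=10_000, rng=np.random.default_rng(0)):
    """GHK estimator of P(lower <= X <= upper) for X ~ N(0, Sigma): draw each
    coordinate from a truncated standard normal given the previous ones and
    accumulate the importance weights."""
    L = np.linalg.cholesky(Sigma)
    d = len(lower)
    log_w = np.zeros(n_draws)
    z = np.zeros((n_draws, d))
    for j in range(d):
        cond = z[:, :j] @ L[j, :j]                    # effect of earlier coordinates
        a = norm.cdf((lower[j] - cond) / L[j, j])
        b = norm.cdf((upper[j] - cond) / L[j, j])
        log_w += np.log(np.maximum(b - a, 1e-300))    # importance weight factor
        u = a + (b - a) * rng.random(n_draws)         # inverse-cdf truncated draw
        z[:, j] = norm.ppf(np.clip(u, 1e-12, 1 - 1e-12))
    return np.exp(log_w).mean()

Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
# P(X1 > 0, X2 > 0) = 1/4 + arcsin(0.8)/(2*pi) ~ 0.398
print(ghk(Sigma, lower=np.array([0.0, 0.0]), upper=np.array([np.inf, np.inf])))
```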

10.
The computation of penalized quantile regression estimates is often computationally intensive in high dimensions. In this paper we propose a coordinate descent algorithm for computing the penalized smooth quantile regression (cdaSQR) with convex and nonconvex penalties. The cdaSQR approach is based on approximating the objective check function, which is not differentiable at zero, by a modified check function which is differentiable at zero. Then, using the majorization-minimization trick of the gcdnet algorithm (Yang and Zou in J Comput Graph Stat 22(2):396–415, 2013), we update each coefficient simply and efficiently. In our implementation, we consider the convex penalties \(\ell _1+\ell _2\) and the nonconvex penalties SCAD (or MCP) \(+ \ell _2\). We establish the convergence property of the cdaSQR with \(\ell _1+\ell _2\) penalty. Using simulations we show that our implementation is an order of magnitude faster than its competitors. Finally, the performance of our algorithm is illustrated on three real data sets from diabetes, leukemia and Bardet–Biedl syndrome gene expression studies.
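A sketch of the cyclic coordinate descent update with soft-thresholding for an \(\ell _1+\ell _2\) penalty; squared loss stands in for the smoothed check loss to keep the example short, so this is an analogue of the cdaSQR update rather than the algorithm itself:

```python
import numpy as np

def soft(z, g):                         # soft-thresholding operator S(z, g)
    return np.sign(z) * np.maximum(np.abs(z) - g, 0.0)

def enet_cd(X, y, lam1, lam2, n_iter=200):
    """Cyclic coordinate descent for (1/2n)||y - X b||^2 + lam1*|b|_1
    + (lam2/2)*||b||^2; each coordinate has a closed-form update."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y - X @ beta
    col_ss = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            zj = X[:, j] @ r / n + col_ss[j] * beta[j]   # partial-residual coordinate
            new = soft(zj, lam1) / (col_ss[j] + lam2)
            r += X[:, j] * (beta[j] - new)               # keep residual in sync
            beta[j] = new
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
y = 2.0 * X[:, 0] + rng.standard_normal(100)
print(enet_cd(X, y, lam1=0.1, lam2=0.01).round(2))
```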

11.
Müller et al. (Stat Methods Appl, 2017) provide an excellent review of several classes of Bayesian nonparametric models which have found widespread application in a variety of contexts, successfully highlighting their flexibility in comparison with parametric families. Particular attention in the paper is dedicated to modelling spatial dependence. Here we contribute by concisely discussing general computational challenges which arise in posterior inference with Bayesian nonparametric models, as well as certain aspects of modelling temporal dependence.

12.
Research on multiple comparisons during the past 60 years or so has focused mainly on the comparison of several population means. Spurrier (J Am Stat Assoc 94:483–488, 1999) and Liu et al. (J Am Stat Assoc 99:395–403, 2004) considered the multiple comparison of several linear regression lines, assuming no functional relationship between the predictor variables. For the polynomial regression model, a functional relationship between the predictor variables does exist, and failing to fully utilize it may have undesirable consequences. In this article we introduce an exact method for the multiple comparison of several polynomial regression models. This method takes full advantage of this feature of the polynomial regression model and can therefore compute the critical constant quickly and accurately. The proposed method allows various types of comparisons, including pairwise, many-to-one and successive, and it allows the predictor variable to be either unconstrained or constrained to a finite interval. Examples from a dose-response study are used to illustrate the method. MATLAB programs have been written for easy implementation of this method.

13.
The rank envelope test (Myllymäki et al. in J R Stat Soc B, doi:10.1111/rssb.12172, 2016) is proposed as a solution to the multiple testing problem for Monte Carlo tests. Three different situations are recognized: (1) a few univariate Monte Carlo tests, (2) a Monte Carlo test with a function as the test statistic, (3) several Monte Carlo tests with functions as test statistics. The rank test has correct (global) type I error in each case, and it is accompanied by a p-value and a graphical interpretation that identifies the subtests and the parts of the test function(s) that lead to rejection at the prescribed significance level. Examples of null hypotheses from point process and random set statistics are used to demonstrate the strength of the rank envelope test. The examples include a goodness-of-fit test with several test functions, a goodness-of-fit test for a group of point patterns, a test of dependence of components in a multi-type point pattern, and a test of the Boolean assumption for random closed sets. A power comparison to classical multiple testing procedures is given.
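A simplified sketch of the extreme-rank ordering underlying the rank envelope test (ties and the p-interval of Myllymäki et al. are ignored); the random-walk null model is a made-up example:

```python
import numpy as np

def extreme_rank_p(T0, T_sims):
    """Monte Carlo test with a function as test statistic via extreme ranks.
    T0: (m,) observed function on a grid; T_sims: (s, m) simulations under
    the null. A curve is extreme if it is among the lowest or highest curves
    at some grid point."""
    allT = np.vstack([T0, T_sims])                   # curve 0 is the observed one
    k_up = (allT[None, :, :] >= allT[:, None, :]).sum(axis=1)   # rank from above
    k_low = (allT[None, :, :] <= allT[:, None, :]).sum(axis=1)  # rank from below
    R = np.minimum(k_up, k_low).min(axis=1)          # extreme rank of each curve
    s = len(allT) - 1
    return (R <= R[0]).sum() / (s + 1)               # small p => observed is extreme

rng = np.random.default_rng(0)
sims = rng.standard_normal((499, 50)).cumsum(axis=1)   # null: random walks
obs = rng.standard_normal(50).cumsum() + 3.0           # shifted observed curve
print("p ~", extreme_rank_p(obs, sims))
```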

14.
In this article, we introduce two new estimates of the normalizing constant (or marginal likelihood) for partially observed diffusion (POD) processes with discrete observations. One estimate is biased but non-negative and the other is unbiased but not almost surely non-negative. Our method uses the multilevel particle filter of Jasra et al. (Multilevel particle filter, arXiv:1510.04977, 2015). We show that, under assumptions, for Euler-discretized PODs and a given \(\varepsilon >0\), in order to obtain a mean square error (MSE) of \({\mathcal {O}}(\varepsilon ^2)\) one requires a work of \({\mathcal {O}}(\varepsilon ^{-2.5})\) for our new estimates, versus a work of \({\mathcal {O}}(\varepsilon ^{-3})\) for a standard particle filter. Our theoretical results are supported by numerical simulations.

15.
The accelerated failure time (AFT) models have proved useful in many contexts, though heavy censoring (as for example in cancer survival) and high dimensionality (as for example in microarray data) cause difficulties for model fitting and model selection. We propose new approaches to variable selection for censored data, based on AFT models optimized using regularized weighted least squares. The regularized technique uses a mixture of \(\ell _1\) and \(\ell _2\) norm penalties under two proposed elastic net type approaches: the adaptive elastic net and the weighted elastic net. These extend the original approaches of Ghosh (Adaptive elastic net: an improvement of elastic net to achieve oracle properties, Technical Report, 2007) and Hong and Zhang (Math Model Nat Phenom 5(3):115–133, 2010), respectively. We further extend the two proposed approaches by adding censored observations as constraints in their model optimization frameworks. The approaches are evaluated on microarray data and by simulation. We compare the performance of these approaches with six other variable selection techniques: three generally used for censored data and three correlation-based greedy methods used for high-dimensional data.

16.
This paper discusses the contribution of Cerioli et al. (Stat Methods Appl, 2018), where robust monitoring based on high breakdown point estimators is proposed for multivariate data. The results follow years of development in robust diagnostic techniques. We discuss the issues of extending data monitoring to other models with complex structure, e.g. factor analysis and mixed linear models, for which S- and MM-estimators exist, or deviating data cells. We emphasise the importance of robust testing, which is often overlooked despite robust tests being readily available once S- and MM-estimators have been defined. We mention open questions, like out-of-sample inference or big data issues, that would benefit from monitoring.

17.
Wage differences between women and men can be divided into an explained part and an unexplained part. The former encompasses differences in the observable characteristics of the members of groups, such as age, education or work experience. The latter includes the part of the difference that is not attributable to objective factors and represents an estimate of the discrimination level. We discuss the original method of Blinder (J Hum Resour 8(4):436–455, 1973) and Oaxaca (Int Econ Rev 14(3):693–709, 1973), the reweighting technique of DiNardo et al. (Econometrica 64(5):1001–1044, 1996) and our approach based on calibration. Using a Swiss dataset from 2012, we compare the estimated explained and unexplained parts of the difference in average wages in the private and public sectors obtained with the three methods. We show that for the private sector, all three methods yield similar results. For the public sector, the reweighting technique estimates a lower value of the unexplained part than the other two methods. The calibration approach and the reweighting technique allow us to estimate the explained and unexplained parts of the wage differences at points other than the mean. Using this, we analyse the assumption that wages are more equitable in the public sector by examining wage differences at different quantiles in both sectors. We show that in the public sector, discrimination occurs quite uniformly in both lower- and higher-paying jobs. In the private sector, by contrast, discrimination is greater in lower-paying jobs than in higher-paying jobs.
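A sketch of the classical two-fold Blinder–Oaxaca decomposition of the mean gap (the paper's calibration and reweighting extensions are not shown); the covariate and wage data below are simulated placeholders:

```python
import numpy as np

def oaxaca_blinder(X_a, y_a, X_b, y_b):
    """Two-fold decomposition of the mean gap, with group a's coefficients as
    the reference wage structure:
    gap = (xbar_a - xbar_b)' beta_a  [explained]
        +  xbar_b' (beta_a - beta_b) [unexplained]."""
    add1 = lambda X: np.column_stack([np.ones(len(X)), X])
    Xa, Xb = add1(X_a), add1(X_b)
    ba = np.linalg.lstsq(Xa, y_a, rcond=None)[0]
    bb = np.linalg.lstsq(Xb, y_b, rcond=None)[0]
    xa, xb = Xa.mean(axis=0), Xb.mean(axis=0)
    explained = (xa - xb) @ ba
    unexplained = xb @ (ba - bb)
    return explained, unexplained          # sums to y_a.mean() - y_b.mean()

rng = np.random.default_rng(0)
X_m = rng.normal(12, 2, (500, 1))                        # e.g. years of education
X_f = rng.normal(11, 2, (500, 1))
y_m = 1.5 + 0.10 * X_m[:, 0] + rng.normal(0, 0.2, 500)   # log wages, group a
y_f = 1.4 + 0.09 * X_f[:, 0] + rng.normal(0, 0.2, 500)   # log wages, group b
print(oaxaca_blinder(X_m, y_m, X_f, y_f))
```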

18.
We introduce extensions of stability selection, a method to stabilise variable selection introduced by Meinshausen and Bühlmann (J R Stat Soc B 72:417–473, 2010). We propose to apply a base selection method repeatedly to random subsamples of the observations and random subsets of the covariates under scrutiny, and to select covariates based on their selection frequency. We analyse the effects and benefits of these extensions. Our analysis generalizes the theoretical results of Meinshausen and Bühlmann (J R Stat Soc B 72:417–473, 2010) from half-samples to subsamples of arbitrary size. We study, in a theoretical manner, the effect of taking random covariate subsets using a simplified score model. Finally we validate these extensions in numerical experiments on both synthetic and real datasets, and compare the results in detail to the original stability selection method.
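A sketch of the extended resampling scheme, assuming the lasso as base selector; the regularization level, subsampling fractions and threshold are illustrative:

```python
import numpy as np
from sklearn.linear_model import Lasso

def stability_selection(X, y, lam, n_reps=100, frac_obs=0.5, frac_cov=0.8,
                        rng=np.random.default_rng(0)):
    """Rerun a base selector on random subsamples of observations AND random
    subsets of covariates; return per-covariate selection frequencies."""
    n, p = X.shape
    counts, tried = np.zeros(p), np.zeros(p)
    for _ in range(n_reps):
        rows = rng.choice(n, int(frac_obs * n), replace=False)
        cols = rng.choice(p, int(frac_cov * p), replace=False)
        fit = Lasso(alpha=lam).fit(X[np.ix_(rows, cols)], y[rows])
        tried[cols] += 1
        counts[cols[fit.coef_ != 0]] += 1   # map back to original indices
    return counts / np.maximum(tried, 1)

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
y = X[:, 0] - 2 * X[:, 1] + rng.standard_normal(200)
freq = stability_selection(X, y, lam=0.1)
print("selected:", np.where(freq > 0.6)[0])
```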

19.
For the first-order autoregressive model, we establish the asymptotic theory of the weighted least squares estimator, whether the underlying autoregressive process is stationary, has a unit root, is near-integrated or is even explosive, under a weaker moment condition on the innovations. The asymptotic limit of this estimator is always normal. It is shown that the empirical log-likelihood ratio at the true parameter converges to the standard chi-square distribution. An empirical likelihood confidence interval is proposed for interval estimation of the autoregressive coefficient. The results improve the corresponding ones of Chan et al. (Econ Theory 28:705–717, 2012). Some simulations are conducted to illustrate the proposed method.
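A sketch of a self-weighted least squares estimator for the AR(1) coefficient; the weight w_t = 1/(1 + y_{t-1}^2) is one common self-weighting choice and may differ from the paper's, and the heavy-tailed simulation is illustrative:

```python
import numpy as np

def weighted_ls_ar1(y):
    """Weighted least squares for the AR(1) coefficient: the weights damp large
    lagged values, which is what allows asymptotic normality under weaker
    moment conditions on the innovations."""
    y0, y1 = y[:-1], y[1:]
    w = 1.0 / (1.0 + y0 ** 2)
    return (w * y0 * y1).sum() / (w * y0 ** 2).sum()

rng = np.random.default_rng(0)
y = np.zeros(2000)
for t in range(1, 2000):                    # AR(1) with heavy-tailed innovations
    y[t] = 0.7 * y[t - 1] + rng.standard_t(df=2)
print("rho_hat =", weighted_ls_ar1(y))
```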

20.
This note shows that the asymptotic properties of the quasi-maximum likelihood estimator for dynamic panel models can be easily derived by following the approach of Grassetti (Stat Methods Appl 20:221–240, 2011) of taking the long difference to remove the time-invariant individual-specific effects.
