Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Abstract

A nonparametric procedure is proposed to estimate multiple change-points of location changes in a univariate data sequence by using ranks instead of the raw data. While existing rank-based multiple change-point detection methods are mostly based on sequential tests, we treat it as a model selection problem. We derive the corresponding Schwarz’s information criterion for rank statistics, prove the consistency of the change-point estimator, and use a pruned dynamic programming algorithm to compute the estimator. Simulation studies show our method’s robustness, effectiveness and efficiency in detecting mean changes. We also apply the method to a gene dataset as an illustration.
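To make the idea of rank-based segmentation concrete, here is a minimal sketch. It is not the criterion or the pruned algorithm from the abstract above: it assumes a plain least-squares cost on the ranks, a hypothetical per-segment penalty `beta` supplied by the caller, and an unpruned O(n^2) dynamic program. The function names `ranks` and `segment_ranks` are illustrative only.

```python
import math


def ranks(x):
    """Return 1-based ranks of x (ties broken by order of appearance)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def segment_ranks(x, beta):
    """Minimise the within-segment squared error of the ranks plus a
    penalty `beta` (an assumed tuning value) for each additional segment.
    Returns the sorted start indices of every segment after the first."""
    n = len(x)
    r = ranks(x)
    # Prefix sums of ranks and squared ranks give O(1) segment costs.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(r):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(a, b):
        # Sum of squared deviations of r[a:b] from its own mean.
        s = s1[b] - s1[a]
        return (s2[b] - s2[a]) - s * s / (b - a)

    best = [math.inf] * (n + 1)
    best[0] = -beta  # so that the first segment carries no penalty
    prev = [0] * (n + 1)
    for t in range(1, n + 1):
        for s in range(t):
            c = best[s] + cost(s, t) + beta
            if c < best[t]:
                best[t] = c
                prev[t] = s
    # Backtrack through the stored segment starts.
    cps = []
    t = n
    while t > 0:
        s = prev[t]
        if s > 0:
            cps.append(s)
        t = s
    return sorted(cps)
```

Because only the ranks enter the cost, a single extreme outlier moves the fit far less than it would in a least-squares segmentation of the raw data. The choice of `beta` controls how many change-points are reported and would need to be calibrated.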

2.
The product partition model (PPM) is a well-established, efficient statistical method for detecting multiple change points in time-evolving univariate data. In this article, we refine the PPM for the purpose of detecting multiple change points in correlated multivariate time-evolving data. Our model detects distributional changes in both the mean and covariance structures of multivariate Gaussian data by exploiting a lower-dimensional representation of correlated multiple time series. The utility of the proposed method is demonstrated through experiments on simulated and real datasets.

3.
We consider the problem of change-point detection in multivariate time-series. The multivariate distribution of the observations is supposed to follow a graphical model, whose graph and parameters are affected by abrupt changes throughout time. We demonstrate that it is possible to perform exact Bayesian inference whenever one considers a simple class of undirected graphs called spanning trees as possible structures. We are then able to integrate on the graph and segmentation spaces at the same time by combining classical dynamic programming with algebraic results pertaining to spanning trees. In particular, we show that quantities such as posterior distributions for change-points or posterior edge probabilities over time can efficiently be obtained. We illustrate our results on both synthetic and experimental data arising from biology and neuroscience.

4.
5.
We demonstrate how to perform direct simulation from the posterior distribution of a class of multiple changepoint models where the number of changepoints is unknown. The class of models assumes independence between the posterior distribution of the parameters associated with segments of data between successive changepoints. This approach is based on the use of recursions, and is related to work on product partition models. The computational complexity of the approach is quadratic in the number of observations, but an approximate version, which introduces negligible error, and whose computational cost is roughly linear in the number of observations, is also possible. Our approach can be useful, for example within an MCMC algorithm, even when the independence assumptions do not hold. We demonstrate our approach on coal-mining disaster data and on well-log data. Our method can cope with a range of models, and exact simulation from the posterior distribution is possible in a matter of minutes.

6.
The paper considers a linear regression model with multiple change-points occurring at unknown times. The LASSO technique is attractive because it simultaneously performs parameter estimation, including estimation of the change-points, and automatic variable selection. The asymptotic properties of the LASSO-type estimator (of which the LASSO estimator is a particular case) and of the adaptive LASSO estimator are studied. For the latter, the oracle properties are proved. In both cases, a model selection criterion is proposed. Numerical examples compare the performance of the adaptive LASSO estimator with that of the least squares estimator.

7.
In this article, we propose a new technique for constructing confidence intervals for the mean of a noisy sequence with multiple change-points. We use the weighted bootstrap to generalize the bootstrap aggregating or bagging estimator. A standard deviation formula for the bagging estimator is introduced, based on which smoothed confidence intervals are constructed. To further improve the performance of the smoothed interval for weak signals, we suggest a strategy of adaptively choosing between the percentile intervals and the smoothed intervals. A new intensity plot is proposed to visualize the pattern of the change-points. We also propose a new change-point estimator based on the intensity plot, which has superior performance in comparison with the state-of-the-art segmentation methods. The finite sample performance of the confidence intervals and the change-point estimator is evaluated through Monte Carlo studies and illustrated with a real data example.

8.
We address the problem of recovering a common set of covariates that are relevant simultaneously to several classification problems. By penalizing the sum of ℓ2 norms of the blocks of coefficients associated with each covariate across different classification problems, similar sparsity patterns in all models are encouraged. To take computational advantage of the sparsity of solutions at high regularization levels, we propose a blockwise path-following scheme that approximately traces the regularization path. As the regularization coefficient decreases, the algorithm maintains and updates concurrently a growing set of covariates that are simultaneously active for all problems. We also show how to use random projections to extend this approach to the problem of joint subspace selection, where multiple predictors are found in a common low-dimensional subspace. We present theoretical results showing that this random projection approach converges to the solution yielded by trace-norm regularization. Finally, we present a variety of experimental results exploring joint covariate selection and joint subspace selection, comparing the path-following approach to competing algorithms in terms of prediction accuracy and running time.
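The block penalty described above can be illustrated with a short sketch. It assumes the coefficients are stored as a list of rows, one row per covariate and one column per classification problem; this layout and the function names are illustrative assumptions. The second function is the standard block soft-thresholding operator, shown only to make the shared-sparsity effect visible. It is a different technique from the path-following scheme in the abstract.

```python
import math


def block_penalty(W):
    """Sum over covariates (rows) of the Euclidean norm of that
    covariate's coefficients across all problems (columns)."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def block_soft_threshold(W, t):
    """Shrink each row towards zero by t in Euclidean norm; rows whose
    norm is at most t become exactly zero for every problem at once."""
    out = []
    for row in W:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max(0.0, 1.0 - t / norm) if norm > 0 else 0.0
        out.append([scale * w for w in row])
    return out
```

Since a whole row is either kept or zeroed, a covariate is dropped from all problems together, which is what produces similar sparsity patterns across the models.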

9.
In this paper, two tests, based on weighted CUSUM of the least squares residuals, are studied to detect in real time a change-point in a nonlinear model. A first test statistic is proposed by extending a method already used in the literature for linear models. At each sequential observation, the null hypothesis that there is no change in the model is tested against the alternative that a change has occurred. The asymptotic distribution of the test statistic under the null hypothesis is given, and its convergence in probability to infinity is proved when a change occurs. These results make it possible to build an asymptotic critical region. Next, in order to decrease the type I error probability, a bootstrapped critical value is proposed and a modified test is studied in a similar way. A generalization of the Hájek–Rényi inequality is established.
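As a rough illustration of sequential monitoring with a weighted CUSUM of residuals, the sketch below assumes the residuals have already been computed from a model fitted on a historical sample. The weight function, the parameter `gamma`, the critical value `c`, and the function name `monitor_cusum` are all assumptions for illustration. They are one commonly used form, not necessarily the statistic studied in the abstract.

```python
import math


def monitor_cusum(hist_resid, new_resid, c, gamma=0.25):
    """Return the 1-based index of the first new observation at which the
    weighted CUSUM of residuals exceeds c, or None if it never does.

    hist_resid: residuals from the historical (change-free) sample
    new_resid:  residuals of incoming observations, in arrival order
    c, gamma:   assumed critical value and weight exponent
    """
    m = len(hist_resid)
    # Scale estimate from the historical residuals.
    sigma = math.sqrt(sum(e * e for e in hist_resid) / m)
    total = 0.0
    for k, e in enumerate(new_resid, start=1):
        total += e
        # Assumed boundary: sqrt(m) * (1 + k/m) * (k / (m + k)) ** gamma
        weight = math.sqrt(m) * (1 + k / m) * (k / (m + k)) ** gamma
        if abs(total) / (sigma * weight) > c:
            return k
    return None
```

In practice the critical value would come from the asymptotic distribution or from a bootstrap, as the abstract describes; here it is simply passed in.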

10.
This work is devoted to the problem of change-point parameter estimation in the presence of multiple changes in the intensity function of a Poisson process. It is supposed that the observations are independent inhomogeneous Poisson processes with the same intensity function and that this intensity function has two jumps separated by a known quantity. The asymptotic behavior of the maximum-likelihood and Bayesian estimators is described. It is shown that these estimators are consistent and have different limit distributions, that the moments converge, and that the Bayesian estimators are asymptotically efficient. Numerical simulations illustrate the obtained results.

11.
A method for efficiently calculating exact marginal, conditional and joint distributions for change points defined by general finite state Hidden Markov Models is proposed. The distributions are not subject to any approximation or sampling error once parameters of the model have been estimated. It is shown that, in contrast to sampling methods, very little computation is needed. The method provides probabilities associated with change points within an interval, as well as at specific points.

12.
There has been significant new work published recently on the subject of model selection. Notably, Rissanen (1986, 1987, 1988) has introduced new criteria based on the notion of stochastic complexity, and Hurvich and Tsai (1989) have introduced a bias-corrected version of Akaike's information criterion. In this paper, a Monte Carlo study is conducted to evaluate the relative performance of these new model selection criteria against the commonly used alternatives. In addition, we compare the performance of all the criteria in a number of situations not considered in earlier studies: robustness to distributional assumptions, collinearity among regressors, and non-stationarity in a time series. The evaluation is based on the number of times the correct model is chosen and the out-of-sample prediction error. The results of this study suggest that Rissanen's criteria are sensitive to the assumptions and choices that need to be made in their application, and so are sometimes unreliable. While many of the criteria often perform satisfactorily, across experiments the Schwarz Bayesian Information Criterion (and the related Bayesian Estimation Criterion of Geweke-Meese) seems to consistently outperform the other alternatives considered.
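For reference, three of the criteria named above can be written in a few lines. This sketch uses the textbook forms in terms of a maximised log-likelihood `loglik`, the number of estimated parameters `k`, and the sample size `n`; the function names are illustrative, and the abstract does not state which exact variants were used in the study.

```python
import math


def aic(loglik, k):
    """Akaike's information criterion: -2 log L + 2k."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    """Schwarz's Bayesian information criterion: -2 log L + k log n."""
    return -2.0 * loglik + k * math.log(n)


def aicc(loglik, k, n):
    """Bias-corrected AIC: AIC + 2k(k + 1) / (n - k - 1).
    Requires n > k + 1."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)
```

In each case the candidate model with the smallest value is selected. The correction term in `aicc` vanishes as `n` grows, so the two Akaike variants agree in large samples.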

13.
14.
15.
The purpose of this paper is threefold. First, we obtain the asymptotic properties of the modified model selection criteria proposed by Hurvich et al. (1990. Improved estimators of Kullback-Leibler information for autoregressive model selection in small samples. Biometrika 77, 709–719) for autoregressive models. Second, we highlight the better performance of these modified criteria. Third, we extend the modification introduced by these authors to model selection criteria commonly used in the class of self-exciting threshold autoregressive (SETAR) time series models. We show the improvements of the modified criteria in their finite sample performance. In particular, for small and medium sample sizes the frequency of selecting the true model improves for the consistent criteria and the root mean square error (RMSE) of prediction improves for the efficient criteria. These results are illustrated via simulation with SETAR models in which we assume that the threshold and the parameters are unknown.

16.
In this article, we propose a new empirical information criterion (EIC) for model selection which penalizes the likelihood of the data by a non-linear function of the number of parameters in the model. It is designed to be used where there are a large number of time series to be forecast. However, a bootstrap version of the EIC can be used where there is a single time series to be forecast. The EIC provides a data-driven model selection tool that can be tuned to the particular forecasting task.

We compare the EIC with other model selection criteria including Akaike’s information criterion (AIC) and Schwarz’s Bayesian information criterion (BIC). The comparisons show that for the M3 forecasting competition data, the EIC outperforms both the AIC and BIC, particularly for longer forecast horizons. We also compare the criteria on simulated data and find that the EIC does better than existing criteria in that case as well.

17.
Given a set of possible models for variables X and a set of possible parameters for each model, the Bayesian estimate of the probability distribution for X given observed data is obtained by averaging over the possible models and their parameters. An often-used approximation for this estimate is obtained by selecting a single model and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the scientific criterion (SC) and engineering criterion (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than those of single models selected by either criterion, and that differences between models selected by the two criteria can be substantial.

18.
The main object of Bayesian statistical inference is the determination of posterior distributions. Sometimes these laws are given for quantities devoid of empirical value. This serious drawback vanishes when one confines oneself to considering a finite horizon framework. However, assuming infinite exchangeability gives rise to fairly tractable a posteriori quantities, which is very attractive in applications. Hence, with a view to a reconciliation between these two aspects of the Bayesian way of reasoning, in this paper we provide quantitative comparisons between posterior distributions of finitary parameters and posterior distributions of allied parameters appearing in usual statistical models.

19.
Let R = R_n denote the total (and unconditional) number of runs of successes or failures in a sequence of n Bernoulli(p) trials, where p is assumed to be known throughout. The exact distribution of R is related to a convolution of two negative binomial random variables with parameters p and q (= 1 - p). Using the representation of R as the sum of 1-dependent indicators, a Berry-Esséen theorem is derived; the obtained rate of sup norm convergence is O(n). This yields an unconditional version of the classical result of Wald and Wolfowitz (1940). The Stein-Chen method for m-dependent random variables is used, together with a suitable coupling, to prove a Poisson limit theorem for R, but with the limiting support set being the set of odd integers. Total variation error bounds (of order O(p)) are found for the last result. Applications are indicated.
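The representation of R as a sum of 1-dependent indicators can be made concrete with a small sketch. A new run starts whenever two neighbouring trials differ, so R equals one plus the number of such positions, and each indicator depends only on two adjacent trials. The function names below are illustrative; the mean formula follows from the fact that each neighbouring pair differs with probability 2pq under independence.

```python
def count_runs(xs):
    """Total number of runs in a 0/1 sequence:
    1 + number of positions i where xs[i] != xs[i + 1]."""
    if not xs:
        return 0
    return 1 + sum(1 for a, b in zip(xs, xs[1:]) if a != b)


def expected_runs(n, p):
    """E[R_n] = 1 + 2(n - 1)pq for n independent Bernoulli(p) trials."""
    q = 1.0 - p
    return 1.0 + 2.0 * (n - 1) * p * q
```

For example, the sequence 1, 1, 0, 0, 0, 1 contains the runs 11, 000 and 1, which corresponds to two positions where neighbours differ plus one.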

20.
ABSTRACT

In this paper, we investigate the consistency of the Expectation Maximization (EM) algorithm-based information criteria for model selection with missing data. The criteria correspond to a penalization of the conditional expectation of the complete data log-likelihood given the observed data and with respect to the missing data conditional density. We present asymptotic properties related to maximum likelihood estimation in the presence of incomplete data and we provide sufficient conditions for the consistency of model selection by minimizing the information criteria. Their finite sample performance is illustrated through simulation and real data studies.
