首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
ABSTRACT

Nonstandard mixtures are those that result from a mixture of a discrete and a continuous random variable. They arise in practice, for example, in medical studies of exposure. Here, a random variable that models exposure might have a discrete mass point at no exposure, but otherwise may be continuous. In this article we explore estimating the distribution function associated with such a random variable from a nonparametric viewpoint. We assume that the locations of the discrete mass points are known so that we will be able to apply a classical nonparametric smoothing approach to the problem. The proposed estimator is a mixture of an empirical distribution function and a kernel estimate of a distribution function. A simple theoretical argument reveals that existing bandwidth selection algorithms can be applied to the smooth component of this estimator as well. The proposed approach is applied to two example sets of data.  相似文献   

2.
Abstract

A nonparametric procedure is proposed to estimate multiple change-points of location changes in a univariate data sequence by using ranks instead of the raw data. While existing rank-based multiple change-point detection methods are mostly based on sequential tests, we treat it as a model selection problem. We derive the corresponding Schwarz’s information criterion for rank-statistics, theoretically prove the consistency of the change-point estimator and use a pruned dynamic programing algorithm to achieve the change-point estimator. Simulation studies show our method’s robustness, effectiveness and efficiency in detecting mean-changes. We also apply the method to a gene dataset as an illustration.  相似文献   

3.
This paper deals with techniques for obtaining random point samples from spatial databases. We seek random points from a continuous domain (usually 2) which satisfy a spatial predicate that is represented in the database as a collection of polygons. Several applications of spatial sampling (e.g. environmental monitoring, agronomy, forestry, etc) are described. Sampling problems are characterized in terms of two key parameters: coverage (selectivity), and expected stabbing number (overlap). We discuss two fundamental approaches to sampling with spatial predicates, depending on whether we sample first or evaluate the predicate first. The approaches are described in the context of both quadtrees and R-trees, detailing the sample first, acceptance/rejection tree, and partial area tree algorithms. A sequential algorithm, the one-pass spatial reservoir algorithm is also described. The relative performance of the various sampling algorithms is compared and choice of preferred algorithms is suggested. We conclude with a short discussion of possible extensions.  相似文献   

4.
ABSTRACT

Very fast automatic rejection algorithms were developed recently which allow us to generate random variates from large classes of unimodal distributions. They require the choice of several design points which decompose the domain of the distribution into small sub-intervals. The optimal choice of these points is an important but unsolved problem. Therefore, we present an approach that allows us to characterize optimal design points in the asymptotic case (when their number tends to infinity) under mild regularity conditions. We describe a short algorithm to calculate these asymptotically optimal points in practice. Numerical experiments indicate that they are very close to optimal even when only six or seven design points are calculated.  相似文献   

5.
Abstract

We propose a new class of two-stage parameter estimation methods for semiparametric ordinary differential equation (ODE) models. In the first stage, state variables are estimated using a penalized spline approach; In the second stage, form of numerical discretization algorithms for an ODE solver is used to formulate estimating equations. Estimated state variables from the first stage are used to obtain more data points for the second stage. Asymptotic properties for the proposed estimators are established. Simulation studies show that the method performs well, especially for small sample. Real life use of the method is illustrated using Influenza specific cell-trafficking study.  相似文献   

6.
7.
ABSTRACT

In a load-sharing system, the failure of a component affects the residual lifetime of the surviving components. We propose a model for the load-sharing phenomenon in k-out-of-m systems. The model is based on exponentiated conditional distributions of the order statistics formed by the failure times of the components. For an illustration, we consider two component parallel systems with the initial lifetimes of the components having Weibull and linear failure rate distributions. We analyze one data set to show that the proposed model may be a better fit than the model based on sequential order statistics.  相似文献   

8.
ABSTRACT

The aim of this note is to investigate the concentration properties of unbounded functions of geometrically ergodic Markov chains. We derive concentration properties of centred functions with respect to the square of Lyapunov's function in the drift condition satisfied by the Markov chain. We apply the new exponential inequalities to derive confidence intervals for Markov Chain Monte Carlo algorithms. Quantitative error bounds are provided for the regenerative Metropolis algorithm of [Brockwell and Kadane Identification of regeneration times in MCMC simulation, with application to adaptive schemes. J Comput Graphical Stat. 2005;14(2)].  相似文献   

9.

A Bayesian approach is considered to detect the number of change points in simple linear regression models. A normal-gamma empirical prior for the regression parameters based on maximum likelihood estimator (MLE) is employed in the analysis. Under mild conditions, consistency for the number of change points and boundedness between the estimated location and the true location of the change points are established. The Bayesian approach to the detection of the number of change points is suitable whether the switching simple regression is continuous or discontinuous. Some simulation results are given to confirm the accuracy of the proposed estimator.  相似文献   

10.
ABSTRACT

In this paper, we propose modified spline estimators for nonparametric regression models with right-censored data, especially when the censored response observations are converted to synthetic data. Efficient implementation of these estimators depends on the set of knot points and an appropriate smoothing parameter. We use three algorithms, the default selection method (DSM), myopic algorithm (MA), and full search algorithm (FSA), to select the optimum set of knots in a penalized spline method based on a smoothing parameter, which is chosen based on different criteria, including the improved version of the Akaike information criterion (AICc), generalized cross validation (GCV), restricted maximum likelihood (REML), and Bayesian information criterion (BIC). We also consider the smoothing spline (SS), which uses all the data points as knots. The main goal of this study is to compare the performance of the algorithm and criteria combinations in the suggested penalized spline fits under censored data. A Monte Carlo simulation study is performed and a real data example is presented to illustrate the ideas in the paper. The results confirm that the FSA slightly outperforms the other methods, especially for high censoring levels.  相似文献   

11.
The Buckley–James estimator (BJE) [J. Buckley and I. James, Linear regression with censored data, Biometrika 66 (1979), pp. 429–436] has been extended from right-censored (RC) data to interval-censored (IC) data by Rabinowitz et al. [D. Rabinowitz, A. Tsiatis, and J. Aragon, Regression with interval-censored data, Biometrika 82 (1995), pp. 501–513]. The BJE is defined to be a zero-crossing of a modified score function H(b), a point at which H(·) changes its sign. We discuss several approaches (for finding a BJE with IC data) which are extensions of the existing algorithms for RC data. However, these extensions may not be appropriate for some data, in particular, they are not appropriate for a cancer data set that we are analysing. In this note, we present a feasible iterative algorithm for obtaining a BJE. We apply the method to our data.  相似文献   

12.
Clustering algorithms are used in the analysis of gene expression data to identify groups of genes with similar expression patterns. These algorithms group genes with respect to a predefined dissimilarity measure without using any prior classification of the data. Most of the clustering algorithms require the number of clusters as input, and all the objects in the dataset are usually assigned to one of the clusters. We propose a clustering algorithm that finds clusters sequentially, and allows for sporadic objects, so there are objects that are not assigned to any cluster. The proposed sequential clustering algorithm has two steps. First it finds candidates for centers of clusters. Multiple candidates are used to make the search for clusters more efficient. Secondly, it conducts a local search around the candidate centers to find the set of objects that defines a cluster. The candidate clusters are compared using a predefined score, the best cluster is removed from data, and the procedure is repeated. We investigate the performance of this algorithm using simulated data and we apply this method to analyze gene expression profiles in a study on the plasticity of the dendritic cells.  相似文献   

13.
The traditional mixture model assumes that a dataset is composed of several populations of Gaussian distributions. In real life, however, data often do not fit the restrictions of normality very well. It is likely that data from a single population exhibiting either asymmetrical or heavy-tail behavior could be erroneously modeled as two populations, resulting in suboptimal decisions. To avoid these pitfalls, we generalize the mixture model using adaptive kernel density estimators. Because kernel density estimators enforce no functional form, we can adapt to non-normal asymmetric, kurtotic, and tail characteristics in each population independently. This, in effect, robustifies mixture modeling. We adapt two computational algorithms, genetic algorithm with regularized Mahalanobis distance and genetic expectation maximization algorithm, to optimize the kernel mixture model (KMM) and use results from robust estimation theory in order to data-adaptively regularize both. Finally, we likewise extend the information criterion ICOMP to score the KMM. We use these tools to simultaneously select the best mixture model and classify all observations without making any subjective decisions. The performance of the KMM is demonstrated on two medical datasets; in both cases, we recover the clinically determined group structure and substantially improve patient classification rates over the Gaussian mixture model.  相似文献   

14.
The concept of location depth was introduced as a way to extend the univariate notion of ranking to a bivariate configuration of data points. It has been used successfully for robust estimation, hypothesis testing, and graphical display. The depth contours form a collection of nested polygons, and the center of the deepest contour is called the Tukey median. The only available implemented algorithms for the depth contours and the Tukey median are slow, which limits their usefulness. In this paper we describe an optimal algorithm which computes all bivariate depth contours in O(n 2) time and space, using topological sweep of the dual arrangement of lines. Once these contours are known, the location depth of any point can be computed in O(log2 n) time with no additional preprocessing or in O(log n) time after O(n 2) preprocessing. We provide fast implementations of these algorithms to allow their use in everyday statistical practice.  相似文献   

15.
Abstract. Kernel density estimation is an important tool in visualizing posterior densities from Markov chain Monte Carlo output. It is well known that when smooth transition densities exist, the asymptotic properties of the estimator agree with those for independent data. In this paper, we show that because of the rejection step of the Metropolis–Hastings algorithm, this is no longer true and the asymptotic variance will depend on the probability of accepting a proposed move. We find an expression for this variance and apply the result to algorithms for automatic bandwidth selection.  相似文献   

16.
The Barrodale and Roberts algorithm for least absolute value (LAV) regression and the algorithm proposed by Bartels and Conn both have the advantage that they are often able to skip across points at which the conventional simplex-method algorithms for LAV regression would be required to carry out an (expensive) pivot operation.

We indicate here that this advantage holds in the Bartels-Conn approach for a wider class of problems: the minimization of piecewise linear functions. We show how LAV regression, restricted LAV regression, general linear programming and least maximum absolute value regression can all be easily expressed as piecewise linear minimization problems.  相似文献   

17.
ABSTRACT

We propose an extension of parametric product partition models. We name our proposal nonparametric product partition models because we associate a random measure instead of a parametric kernel to each set within a random partition. Our methodology does not impose any specific form on the marginal distribution of the observations, allowing us to detect shifts of behaviour even when dealing with heavy-tailed or skewed distributions. We propose a suitable loss function and find the partition of the data having minimum expected loss. We then apply our nonparametric procedure to multiple change-point analysis and compare it with PPMs and with other methodologies that have recently appeared in the literature. Also, in the context of missing data, we exploit the product partition structure in order to estimate the distribution function of each missing value, allowing us to detect change points using the loss function mentioned above. Finally, we present applications to financial as well as genetic data.  相似文献   

18.
ABSTRACT

In this paper, we consider an effective Bayesian inference for censored Student-t linear regression model, which is a robust alternative to the usual censored Normal linear regression model. Based on the mixture representation of the Student-t distribution, we propose a non-iterative Bayesian sampling procedure to obtain independently and identically distributed samples approximately from the observed posterior distributions, which is different from the iterative Markov Chain Monte Carlo algorithm. We conduct model selection and influential analysis using the posterior samples to choose the best fitted model and to detect latent outliers. We illustrate the performance of the procedure through simulation studies, and finally, we apply the procedure to two real data sets, one is the insulation life data with right censoring and the other is the wage rates data with left censoring, and we get some interesting results.  相似文献   

19.
We describe an image reconstruction problem and the computational difficulties arising in determining the maximum a posteriori (MAP) estimate. Two algorithms for tackling the problem, iterated conditional modes (ICM) and simulated annealing, are usually applied pixel by pixel. The performance of this strategy can be poor, particularly for heavily degraded images, and as a potential improvement Jubb and Jennison (1991) suggest the cascade algorithm in which ICM is initially applied to coarser images formed by blocking squares of pixels. In this paper we attempt to resolve certain criticisms of cascade and present a version of the algorithm extended in definition and implementation. As an illustration we apply our new method to a synthetic aperture radar (SAR) image. We also carry out a study of simulated annealing, with and without cascade, applied to a more tractable minimization problem from which we gain insight into the properties of cascade algorithms.  相似文献   

20.
A nonparametric mixture model specifies that observations arise from a mixture distribution, ∫ f(x, θ) dG(θ), where the mixing distribution G is completely unspecified. A number of algorithms have been developed to obtain unconstrained maximum-likelihood estimates of G, but none of these algorithms lead to estimates when functional constraints are present. In many cases, there is a natural interest in functional ?(G), such as the mean and variance, of the mixing distribution, and profile likelihoods and confidence intervals for ?(G) are desired. In this paper we develop a penalized generalization of the ISDM algorithm of Kalbfleisch and Lesperance (1992) that can be used to solve the problem of constrained estimation. We also discuss its use in various different applications. Convergence results and numerical examples are given for the generalized ISDM algorithm, and asymptotic results are developed for the likelihood-ratio test statistics in the multinomial case.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号