Similar Documents (20 results)
1.
We study the non-parametric estimation of a continuous distribution function F based on the partially rank-ordered set (PROS) sampling design. A PROS sampling design first selects a random sample from the underlying population and uses judgement ranking to rank them into partially ordered sets, without measuring the variable of interest. The final measurements are then obtained from one of the partially ordered sets. Considering an imperfect PROS sampling procedure, we first develop the empirical distribution function (EDF) estimator of F and study its theoretical properties. Then, we consider the problem of estimating F, where the underlying distribution is assumed to be symmetric. We also find a unique admissible estimator of F within the class of nondecreasing step functions with jumps at observed values and show the inadmissibility of the EDF. In addition, we introduce a smooth estimator of F and discuss its theoretical properties. Finally, we illustrate our results via several simulation studies and a real data application, and show the advantages of PROS estimates over their counterparts under the simple random and ranked set sampling designs.
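As a rough illustration of the two estimators discussed above (a plain sketch: it ignores the PROS design weights and the judgement-ranking step, so it is not the authors' estimator), the snippet below computes an EDF and a kernel-smoothed CDF from a vector of measured values:

```python
import numpy as np
from scipy.stats import norm

def edf(x, t):
    """Empirical distribution function of the measurements x, evaluated at t."""
    x = np.asarray(x)
    return np.mean(x[:, None] <= np.atleast_1d(t), axis=0)

def smooth_cdf(x, t, h=None):
    """Kernel-smoothed CDF estimator: average of Gaussian-kernel CDFs."""
    x = np.asarray(x)
    h = h or 1.06 * x.std() * len(x) ** (-1 / 5)  # rule-of-thumb bandwidth
    return np.mean(norm.cdf((np.atleast_1d(t)[None, :] - x[:, None]) / h), axis=0)

rng = np.random.default_rng(1)
sample = rng.normal(size=50)        # stand-in for the final PROS measurements
grid = np.linspace(-3, 3, 7)
print(edf(sample, grid))
print(smooth_cdf(sample, grid))
```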

2.
Partially rank-ordered set (PROS) sampling is a generalization of ranked set sampling in which rankers are not required to fully rank the sampling units in each set, and hence have more flexibility to perform the necessary judgemental ranking process. PROS sampling has a wide range of applications in different fields, ranging from environmental and ecological studies to medical research, and it has been shown to be superior to ranked set sampling and simple random sampling for estimating the population mean. We study the Fisher information content and uncertainty structure of PROS samples and compare them with those of simple random sample (SRS) and ranked set sample (RSS) counterparts of the same size from the underlying population. We study the uncertainty structure in terms of the Shannon entropy, Rényi entropy and Kullback–Leibler (KL) discrimination measures.
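A hedged sketch of the kind of entropy comparison made here, using RSS rather than PROS for simplicity: one cycle of balanced RSS of set size k measures one order statistic of each rank, so its Shannon entropy is the sum of the order-statistic entropies, which can be computed by quadrature and compared with an SRS of the same size.

```python
import numpy as np
from math import comb
from scipy.stats import norm
from scipy.integrate import quad

def order_stat_pdf(x, r, k):
    """Density of the r-th order statistic of k iid N(0,1) draws."""
    F, f = norm.cdf(x), norm.pdf(x)
    return r * comb(k, r) * F**(r - 1) * (1 - F)**(k - r) * f

def entropy(pdf, lo=-10, hi=10):
    """Differential Shannon entropy -integral of f log f, by quadrature."""
    integrand = lambda x: -pdf(x) * np.log(pdf(x)) if pdf(x) > 0 else 0.0
    return quad(integrand, lo, hi)[0]

k = 3
h_rss = sum(entropy(lambda x, r=r: order_stat_pdf(x, r, k)) for r in range(1, k + 1))
h_srs = k * entropy(norm.pdf)
print(f"entropy of one RSS cycle: {h_rss:.4f},  SRS of size {k}: {h_srs:.4f}")
```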

3.
We present a maximum likelihood estimation procedure for the multivariate frailty model. The estimation is based on a Monte Carlo EM algorithm. The expectation step is approximated by averaging over random samples drawn from the posterior distribution of the frailties using rejection sampling. The maximization step reduces to a standard partial likelihood maximization. We also propose a simple rule based on the relative change in the parameter estimates to decide on sample size in each iteration and a stopping time for the algorithm. An important new concept is acquiring absolute convergence of the algorithm through sample size determination and an efficient sampling technique. The method is illustrated using a rat carcinogenesis dataset and data on vase lifetimes of cut roses. The estimation results are compared with approximate inference based on penalized partial likelihood using these two examples. Unlike the penalized partial likelihood estimation, the proposed full maximum likelihood estimation method accounts for all the uncertainty while estimating standard errors for the parameters.
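The Monte Carlo EM pattern itself is easy to sketch on a toy problem (right-censored normal data, not the authors' multivariate frailty model): the E-step replaces an intractable conditional expectation by averaging over draws obtained with rejection sampling, and the M-step is a complete-data maximization. The Monte Carlo sample size is held fixed here rather than chosen by the authors' adaptive rule.

```python
import numpy as np

rng = np.random.default_rng(7)
c = 1.0                                   # fixed right-censoring point
full = rng.normal(0.5, 1.0, size=200)
obs = full[full <= c]                     # fully observed values
n_cens = np.sum(full > c)                 # number of values censored at c

def draw_truncated(mu, sigma, size):
    """Rejection sampling from N(mu, sigma^2) truncated to (c, inf)."""
    out = []
    while len(out) < size:
        z = rng.normal(mu, sigma, size=size)
        out.extend(z[z > c])
    return np.array(out[:size])

mu, sigma, m = 0.0, 1.0, 200              # starting values; MC size per censored unit
for it in range(50):
    # E-step: impute each censored value m times by rejection sampling
    imputed = draw_truncated(mu, sigma, n_cens * m)
    completed = np.concatenate([np.repeat(obs, m), imputed])
    # M-step: complete-data MLE for the normal model
    mu, sigma = completed.mean(), completed.std()
print(f"MCEM estimates: mu = {mu:.3f}, sigma = {sigma:.3f}")
```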

4.
Estimators derived from the expectation-maximization (EM) algorithm are not robust, since they are based on maximization of the likelihood function. We propose an iterative proximal-point algorithm based on the EM algorithm to minimize a divergence criterion between a mixture model and the unknown distribution that generates the data. In each iteration, the algorithm estimates the proportions and the parameters of the mixture components in two separate steps. The resulting estimators are generally robust against outliers and misspecification of the model. Convergence properties of our algorithm are studied, and its convergence is illustrated on a two-component Weibull mixture, which yields a condition on the initialization of the EM algorithm needed for the latter to converge. Simulations on Gaussian and Weibull mixture models using different statistical divergences are provided to confirm the validity of our work and the robustness of the resulting estimators against outliers, in comparison to the EM algorithm. An application to a dataset of velocities of galaxies is also presented. The Canadian Journal of Statistics 47: 392–408; 2019 © 2019 Statistical Society of Canada
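As a generic stand-in for the divergence-minimization idea (assumed details: the L2-type criterion and the Nelder-Mead optimizer below are illustrative choices, not the authors' proximal-point algorithm), this sketch fits a two-component Gaussian mixture by minimizing the integrated squared error between the model density and a kernel density estimate, a criterion typically less sensitive to outliers than the likelihood:

```python
import numpy as np
from scipy.stats import norm, gaussian_kde
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 1, 150), rng.normal(5, 1, 150),
                       rng.normal(25, 1, 10)])       # 10 gross outliers

grid = np.linspace(data.min() - 1, data.max() + 1, 800)
dx = grid[1] - grid[0]
kde = gaussian_kde(data)(grid)                       # nonparametric reference density

def mixture_pdf(theta, x):
    w = 1 / (1 + np.exp(-theta[0]))                  # logit-parametrized mixing weight
    return (w * norm.pdf(x, theta[1], np.exp(theta[2]))
            + (1 - w) * norm.pdf(x, theta[3], np.exp(theta[4])))

def l2_divergence(theta):
    """Integrated squared error between mixture and KDE (Riemann sum on the grid)."""
    return np.sum((mixture_pdf(theta, grid) - kde) ** 2) * dx

theta0 = np.array([0.0, -1.0, 0.0, 6.0, 0.0])        # crude starting values
fit = minimize(l2_divergence, theta0, method="Nelder-Mead",
               options={"maxiter": 5000})
w_hat = 1 / (1 + np.exp(-fit.x[0]))
print(f"w = {w_hat:.2f}, means = ({fit.x[1]:.2f}, {fit.x[3]:.2f})")
```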

5.
This paper develops statistical inference for population quantiles based on a partially rank-ordered set (PROS) sample design. A PROS sample design is similar to a ranked set sample, with some clear differences. This design first creates partially rank-ordered subsets by allowing ties whenever the units in a set cannot be ranked with high confidence. It then selects a unit for full measurement at random from one of these partially rank-ordered subsets. The paper develops a point estimator, confidence interval and hypothesis testing procedure for the population quantile of order p. The exact as well as the asymptotic distribution of the test statistic is derived. It is shown that the null distribution of the test statistic is distribution-free, and that statistical inference is reasonably robust against possible errors in the ranking process.
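For orientation: under simple random sampling, the classical distribution-free test of a quantile reduces to a binomial count, as in the sketch below; the PROS version derived in the paper replaces this null distribution with a design-based one that is not reproduced here.

```python
import numpy as np
from scipy.stats import binom

def quantile_sign_test(x, xi0, p):
    """Two-sided distribution-free test of H0: the p-th quantile equals xi0.
    Under H0 the count of observations <= xi0 is Binomial(n, p)."""
    n = len(x)
    t = int(np.sum(np.asarray(x) <= xi0))
    # two-sided p-value from the binomial null distribution
    pval = 2 * min(binom.cdf(t, n, p), binom.sf(t - 1, n, p))
    return t, min(pval, 1.0)

rng = np.random.default_rng(3)
x = rng.exponential(scale=2.0, size=60)
true_median = 2.0 * np.log(2)              # median of Exp(scale=2)
print(quantile_sign_test(x, true_median, p=0.5))
```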

6.
In this paper, we use the idea of order statistics from independent and non-identically distributed random variables to propose ordered partially ordered judgment subset sampling (OPOJSS) and then develop optimal linear parametric inferences. The best linear unbiased and invariant estimators of the location and scale parameters of a location-scale family are developed based on OPOJSS. It is shown that, regardless of the presence or absence of ranking errors, the proposed estimators with OPOJSS are uniformly better than the existing estimators with simple random sampling (SRS), ranked set sampling (RSS), ordered RSS (ORSS) and partially ordered judgment subset sampling (POJSS). Moreover, we also derive the best linear unbiased estimators (BLUEs) of the unknown parameters of the simple linear regression model with replicated observations using POJSS and OPOJSS. It is found that the BLUEs with OPOJSS are more precise than the BLUEs based on SRS, RSS, ORSS and POJSS.
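The linear machinery behind such estimators is Lloyd's generalized least squares applied to ordered data. A minimal sketch for the plain ordered-sample case, using Monte Carlo approximations of the order-statistic moments (the OPOJSS-specific design matrices are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5

# Monte Carlo moments of standard-normal order statistics:
# alpha = E(Z_(r)), Omega = Cov of the ordered vector
sims = np.sort(rng.normal(size=(200_000, n)), axis=1)
alpha, Omega = sims.mean(axis=0), np.cov(sims.T)

def blue_location_scale(y_ordered):
    """Lloyd's BLUE of (mu, sigma) from an ordered sample via generalized least squares."""
    X = np.column_stack([np.ones(n), alpha])
    Oinv = np.linalg.inv(Omega)
    return np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ y_ordered)

y = np.sort(rng.normal(loc=10.0, scale=2.0, size=n))
mu_hat, sigma_hat = blue_location_scale(y)
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
```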

7.
Subgroup detection has received increasing attention recently in different fields such as clinical trials, public management and market segmentation analysis. In these fields, researchers often face time-to-event data, which are commonly subject to right censoring. This paper proposes a semiparametric Logistic-Cox mixture model for subgroup analysis when the outcome of interest is an event time subject to right censoring. The proposed method mainly consists of a likelihood ratio-based testing procedure for testing the existence of subgroups. The expectation–maximization iteration is applied to improve the testing power, and a model-based bootstrap approach is developed to implement the testing procedure. When subgroups exist, one can also use the proposed model to estimate the subgroup effect and construct predictive scores for subgroup membership. The large sample properties of the proposed method are studied. The finite sample performance of the proposed method is assessed by simulation studies. A real data example is also provided for illustration.

8.
In nomination sampling, the largest values from several independent random samples (nominees) are rank ordered, and an estimate of the population median is formed by interpolating between two of these order statistics. The resulting estimate compares favorably to the sample median of a simple random sample from the same population. When historical data sets retain only extreme values, nomination sampling may offer the only practical way to estimate the population median. The approach may also be useful when potential survey respondents will only participate if they can actively influence the selection of cases for analysis.
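A minimal sketch of the interpolation idea, under the assumption that each nominee is the maximum of a set of k iid draws: the nominee CDF is F^k, so the population median (where F = 1/2) corresponds to the (1/2)^k-quantile of the nominees, and np.quantile interpolates between the two adjacent order statistics.

```python
import numpy as np

rng = np.random.default_rng(11)
k, n_sets = 4, 400                        # set size and number of nominated maxima

population_draws = rng.lognormal(mean=0.0, sigma=1.0, size=(n_sets, k))
nominees = population_draws.max(axis=1)   # only the extreme value of each set is retained

# F(median) = 1/2 implies F^k(median) = (1/2)^k, so interpolate at that quantile
median_hat = np.quantile(nominees, 0.5 ** k)
print(f"estimated median = {median_hat:.3f}, true median = {np.exp(0.0):.3f}")
```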

9.
We consider mixtures of general angular central Gaussian distributions as models for multimodal directional data. We prove consistency of the maximum-likelihood estimates of the model parameters and convergence of their numerical approximations based on an expectation–maximization algorithm. Then, we focus on mixtures of special angular central Gaussian distributions and discuss the details of a fast numerical algorithm, which makes it possible to fit multimodal distributions to massive data, occurring, for example, in the study of the microstructure of materials. We illustrate the applicability with data from fibre composites and from ceramic foams.

10.
We consider a bidimensional Ornstein–Uhlenbeck process to describe tissue microvascularization in anti-cancer therapy. Data are discrete, partial and noisy observations of this stochastic differential equation (SDE). Our aim is to estimate the SDE parameters. We exploit the fact that the observation is one-dimensional to compute the exact likelihood via the Kalman filter recursion, which allows a straightforward numerical maximization of the likelihood. Furthermore, we establish the link between the observations and an ARMA process, and we deduce the asymptotic properties of the maximum likelihood estimator. We show that this ARMA property can be generalized to a higher-dimensional underlying Ornstein–Uhlenbeck diffusion. We compare this estimator with the one obtained by the well-known expectation maximization algorithm on simulated data. Our estimation methods can be directly applied to other biological contexts such as drug pharmacokinetics or hormone secretions.
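Assuming a one-dimensional analogue for illustration (not the authors' bidimensional model): an OU process observed discretely with noise is a linear Gaussian state-space model whose exact discretization is an AR(1), so the likelihood falls out of the standard Kalman recursion and can be handed to a numerical optimizer.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
dt, n = 0.1, 500
theta_true, sig_true, tau_true = 1.0, 0.5, 0.2

# simulate an OU path via its exact AR(1) discretization, then add observation noise
a = np.exp(-theta_true * dt)
q = sig_true**2 * (1 - a**2) / (2 * theta_true)
x = np.zeros(n)
for t in range(1, n):
    x[t] = a * x[t - 1] + rng.normal(0, np.sqrt(q))
y = x + rng.normal(0, tau_true, size=n)

def neg_loglik(par):
    theta, sig, tau = np.exp(par)                  # log-parametrization keeps params > 0
    a = np.exp(-theta * dt)
    q = sig**2 * (1 - a**2) / (2 * theta)
    m, P = 0.0, sig**2 / (2 * theta)               # stationary initial state
    ll = 0.0
    for obs in y:                                  # Kalman filter recursion
        S = P + tau**2                             # innovation variance
        ll += -0.5 * (np.log(2 * np.pi * S) + (obs - m) ** 2 / S)
        K = P / S                                  # Kalman gain
        m, P = m + K * (obs - m), (1 - K) * P      # measurement update
        m, P = a * m, a**2 * P + q                 # one-step-ahead prediction
    return -ll

fit = minimize(neg_loglik, np.log([0.5, 1.0, 0.5]), method="Nelder-Mead")
print("theta, sigma, tau =", np.round(np.exp(fit.x), 3))
```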

11.
We give a formal definition of a representative sample, but roughly speaking, it is a scaled-down version of the population, capturing its characteristics. New methods for selecting representative probability samples in the presence of auxiliary variables are introduced. Representative samples are needed for multipurpose surveys, when several target variables are of interest. Such samples also enable estimation of parameters in subspaces and improved estimation of target variable distributions. We describe how two recently proposed sampling designs can be used to produce representative samples. Both designs use distance between population units when producing a sample. We propose a distance function that can calculate distances between units in general auxiliary spaces. We also propose a variance estimator for the commonly used Horvitz–Thompson estimator. Real data as well as illustrative examples show that representative samples are obtained and that the variance of the Horvitz–Thompson estimator is reduced compared with simple random sampling.
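For reference, the Horvitz–Thompson estimator, together with a variance estimator for the simplest case of independent (Poisson) sampling, looks as follows; the paper's variance estimator for its distance-based designs is more involved and is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(9)
N = 1000
y = rng.gamma(shape=2.0, scale=50.0, size=N)       # population values
pi = np.clip(0.02 + y / y.sum() * 30, 0.0, 1.0)    # inclusion probs, larger for large y

# Poisson sampling: each unit enters the sample independently with probability pi_i
in_sample = rng.random(N) < pi
ys, ps = y[in_sample], pi[in_sample]

t_ht = np.sum(ys / ps)                             # Horvitz-Thompson estimate of the total
v_ht = np.sum((1 - ps) * ys**2 / ps**2)            # unbiased variance estimator (Poisson design)
print(f"HT total = {t_ht:.0f} +/- {np.sqrt(v_ht):.0f}, true total = {y.sum():.0f}")
```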

12.
By combining progressive hybrid censoring with the step-stress partially accelerated lifetime test, we propose an adaptive step-stress partially accelerated lifetime test, which allows random changing of the number of step-stress levels according to the pre-fixed censoring number and time points. Thus, the time expenditure and economic cost of the test are greatly reduced. Based on the Lindley-distributed tampered failure rate (TFR) model with masked system lifetime data, the BFGS method is introduced into the expectation maximization (EM) algorithm to obtain the maximum likelihood estimates (MLEs), which overcomes the difficulties of the intractable maximization procedure in the M-step. Asymptotic confidence intervals for the components' distribution parameters are also investigated using the missing information principle. For comparison, Bayesian estimates and highest probability density (HPD) credible intervals are obtained by using adaptive rejection sampling. Furthermore, the reliabilities of the system and its components are estimated at a specified time under usual and severe operating conditions. Finally, a numerical simulation example is presented to illustrate the performance of the proposed method.
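The numerical device highlighted here, running BFGS inside the M-step when no closed form exists, can be demonstrated on a toy two-component Gaussian mixture (an assumed example, not the Lindley TFR model with masked data):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1.5, 200)])

w, params = 0.5, np.array([-1.0, 0.0, 1.0, 0.0])    # (mu1, log s1, mu2, log s2)
for it in range(100):
    mu1, s1 = params[0], np.exp(params[1])
    mu2, s2 = params[2], np.exp(params[3])
    # E-step: posterior probability that each point belongs to component 1
    d1, d2 = w * norm.pdf(x, mu1, s1), (1 - w) * norm.pdf(x, mu2, s2)
    r = d1 / (d1 + d2)
    # M-step: the weight has a closed form; the rest of the Q-function
    # is maximized numerically with BFGS
    w = r.mean()
    def neg_q(p):
        return -(np.sum(r * norm.logpdf(x, p[0], np.exp(p[1])))
                 + np.sum((1 - r) * norm.logpdf(x, p[2], np.exp(p[3]))))
    params = minimize(neg_q, params, method="BFGS").x

print(f"w = {w:.2f}, mu = ({params[0]:.2f}, {params[2]:.2f}),"
      f" sigma = ({np.exp(params[1]):.2f}, {np.exp(params[3]):.2f})")
```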

13.
A computational problem in many fields is to estimate simultaneously multiple integrals and expectations, assuming that the data are generated by some Monte Carlo algorithm. Consider two scenarios in which draws are simulated from multiple distributions but the normalizing constants of those distributions may be known or unknown. For each scenario, existing estimators can be classified as using individual samples separately or using all the samples jointly. The latter pooled-sample estimators are statistically more efficient but computationally more costly to evaluate than the separate-sample estimators. We develop a cluster-sample approach to obtain computationally effective estimators, after draws are generated for each scenario. We divide all the samples into mutually exclusive clusters and combine samples from each cluster separately. Furthermore, we exploit a relationship between estimators based on samples from different clusters to achieve variance reduction. The resulting estimators, compared with the pooled-sample estimators, typically yield similar statistical efficiency but have reduced computational cost. We illustrate the value of the new approach by two examples for an Ising model and a censored Gaussian random field. The Canadian Journal of Statistics 41: 151–173; 2013 © 2012 Statistical Society of Canada

14.
In this article, a non-iterative posterior sampling algorithm for the linear quantile regression model based on the asymmetric Laplace distribution is proposed. The algorithm combines the inverse Bayes formulae, sampling/importance resampling, and the expectation maximization algorithm to obtain approximately independent and identically distributed samples from the observed posterior distribution, which eliminates the convergence problems of iterative Gibbs sampling and overcomes the difficulty of evaluating standard errors in the EM algorithm. The numerical results in simulations and an application to the classical Engel data show that the non-iterative sampling algorithm is more effective than the Gibbs sampling and EM algorithms.
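The sampling/importance resampling building block can be sketched generically (with a toy bimodal target rather than the asymmetric-Laplace posterior of the paper): draw from a proposal, weight by target over proposal, and resample in proportion to the weights.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)

def log_target(x):
    """Unnormalized log posterior (toy bimodal stand-in)."""
    return np.logaddexp(norm.logpdf(x, -1.5, 0.6), norm.logpdf(x, 2.0, 0.8))

m, n = 20_000, 2_000
proposal = norm(loc=0.0, scale=3.0)
draws = proposal.rvs(size=m, random_state=rng)

logw = log_target(draws) - proposal.logpdf(draws)   # importance log-weights
w = np.exp(logw - logw.max())
w /= w.sum()

# importance resampling: approximately iid draws from the target
resampled = rng.choice(draws, size=n, replace=True, p=w)
print(f"posterior mean approx = {resampled.mean():.3f}")
```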

15.
In this article, we utilize a scale mixture of Gaussian random fields as a tool for modeling spatial ordered categorical data with non-Gaussian latent variables. In fact, we assume the categorical random field is created by truncating a Gaussian log-Gaussian latent variable model to accommodate heavy tails. Since the traditional likelihood approach for the considered model involves high-dimensional integrations that are computationally intensive, the maximum likelihood estimates are obtained using a stochastic approximation expectation–maximization algorithm. For this purpose, Markov chain Monte Carlo methods are employed to draw from the posterior distribution of the latent variables. A numerical example illustrates the methodology.
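The generative side of such a model is straightforward to sketch in a simplified, non-spatial form (the thresholds and the log-normal mixing distribution below are assumptions for illustration): a heavy-tailed latent variable is built as a scale mixture of Gaussians and then cut into ordered categories.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1000

# scale mixture of Gaussians: latent = sqrt(s) * z with a log-normal mixing
# variable s, giving heavier tails than a plain Gaussian latent field
s = rng.lognormal(mean=0.0, sigma=0.7, size=n)
latent = np.sqrt(s) * rng.normal(size=n)

cutpoints = np.array([-1.0, 0.0, 1.0])      # assumed thresholds for 4 ordered categories
category = np.digitize(latent, cutpoints)   # labels 0, 1, 2, 3
print(np.bincount(category) / n)            # observed category frequencies
```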

16.
A new family of mixture models for the model-based clustering of longitudinal data is introduced. The covariance structures of eight members of this new family of models are given and the associated maximum likelihood estimates for the parameters are derived via expectation–maximization (EM) algorithms. The Bayesian information criterion is used for model selection and a convergence criterion based on the Aitken acceleration is used to determine the convergence of these EM algorithms. This new family of models is applied to yeast sporulation time course data, where the models give good clustering performance. Further constraints are then imposed on the decomposition to allow a deeper investigation of the correlation structure of the yeast data. These constraints greatly extend this new family of models, with the addition of many parsimonious models. The Canadian Journal of Statistics 38:153–168; 2010 © 2010 Statistical Society of Canada
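A generic version of this workflow, fitting Gaussian mixtures over a range of component counts and selecting by BIC, can be run with scikit-learn (which does not provide the paper's longitudinal covariance structures):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(10)
t = np.linspace(0, 1, 8)                     # 8 time points per subject

# two groups of simulated time-course profiles
group1 = 1.0 + 2.0 * t + rng.normal(0, 0.3, size=(60, t.size))
group2 = 3.0 - 1.5 * t + rng.normal(0, 0.3, size=(40, t.size))
X = np.vstack([group1, group2])

fits = {g: GaussianMixture(n_components=g, covariance_type="diag",
                           random_state=0).fit(X) for g in range(1, 6)}
bics = {g: m.bic(X) for g, m in fits.items()}
best = min(bics, key=bics.get)               # smallest BIC wins
print("BIC by G:", {g: round(b) for g, b in bics.items()}, "-> chosen G =", best)
```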

17.
Likelihood-based inference with missing data is challenging because the observed log likelihood is often an (intractable) integral over the missing data distribution, which also depends on the unknown parameter. Approximating the integral by Monte Carlo sampling does not necessarily lead to a valid likelihood over the entire parameter space because the Monte Carlo samples are generated from a distribution with a fixed parameter value. We consider approximating the observed log likelihood based on importance sampling. In the proposed method, the dependency of the integral on the parameter is properly reflected through fractional weights. We discuss constructing a confidence interval using the profile likelihood ratio test. A Newton–Raphson algorithm is employed to find the interval end points. Two limited simulation studies show the advantage of the Wilks inference over the Wald inference in terms of power, parameter space conformity and computational efficiency. A real data example on salamander mating shows that our method also works well with high-dimensional missing data.
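A minimal sketch of the fractional-weight idea on a toy random-effects model (an assumed example with a closed-form observed likelihood available for checking): the Monte Carlo draws are generated once at a fixed parameter value, and the dependence on theta enters only through the importance weights, so the approximation is a valid function over the whole parameter space.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(12)
n, m, theta0 = 50, 5_000, 0.5             # data size, MC size, fixed sampling value
y = rng.normal(loc=1.0, scale=np.sqrt(2.0), size=n)   # marginally y ~ N(theta, 2), theta = 1

# missing-data draws generated ONCE from the theta0 model and reused for every theta
z = rng.normal(loc=theta0, scale=1.0, size=(n, m))

def loglik_is(theta):
    """Observed log-likelihood via importance sampling; theta enters only the weights."""
    logw = (norm.logpdf(y[:, None], z, 1.0)     # f(y | z)
            + norm.logpdf(z, theta, 1.0)        # f(z; theta)
            - norm.logpdf(z, theta0, 1.0))      # sampling density f(z; theta0)
    return np.sum(np.log(np.mean(np.exp(logw), axis=1)))

for theta in (0.5, 1.0, 1.5):
    exact = np.sum(norm.logpdf(y, theta, np.sqrt(2.0)))  # closed form for this toy model
    print(f"theta = {theta}: IS {loglik_is(theta):8.2f}   exact {exact:8.2f}")
```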

18.
This work provides a class of non-Gaussian spatial Matérn fields which are useful for analysing geostatistical data. The models are constructed as solutions to stochastic partial differential equations driven by generalized hyperbolic noise and are incorporated in a standard geostatistical setting with irregularly spaced observations, measurement errors and covariates. A maximum likelihood estimation technique based on the Monte Carlo expectation-maximization algorithm is presented, and a Monte Carlo method for spatial prediction is derived. Finally, an application to precipitation data is presented, and the performance of the non-Gaussian models is compared with standard Gaussian and transformed Gaussian models through cross-validation.

19.
Recently, Bolfarine et al. [Bimodal symmetric-asymmetric power-normal families. Commun Statist Theory Methods. Forthcoming. doi:10.1080/03610926.2013.765475] introduced a bimodal asymmetric model having the normal and skew normal as special cases. Here, we prove a stochastic representation for their bimodal asymmetric model and use it to generate random numbers from that model. It is shown how the resulting algorithm can be seen as an improvement over the rejection method. We also discuss practical and numerical aspects regarding the estimation of the model parameters by maximum likelihood under simple random sampling. We show that a unique stationary point of the likelihood equations exists except when all observations have the same sign. However, the location-scale extension of the model usually presents two or more roots, and this fact is illustrated here. The standard maximization routines available in the R system (Broyden–Fletcher–Goldfarb–Shanno (BFGS), Trust, Nelder–Mead) were considered in our implementations but exhibited similar performance. We show the usefulness of inspecting profile log-likelihoods as a method to obtain starting values for maximization and illustrate data analysis with the location-scale model in the presence of multiple roots. A simple Bayesian model is discussed in the context of a data set which presents a flat likelihood in the direction of the skewness parameter.
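For context, a rejection sampler for a generic bimodal target (an assumed density, not the power-normal family of the paper) looks like the sketch below; this is the baseline that the stochastic representation is said to improve on.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(13)

def target_pdf(x):
    """Generic bimodal density standing in for the bimodal asymmetric model."""
    return 0.4 * norm.pdf(x, -2.0, 0.8) + 0.6 * norm.pdf(x, 1.5, 1.0)

proposal = norm(loc=0.0, scale=2.5)
grid = np.linspace(-10, 10, 4001)
M = 1.05 * np.max(target_pdf(grid) / proposal.pdf(grid))  # envelope constant (numerical)

def rejection_sample(n):
    out = []
    while len(out) < n:
        x = proposal.rvs(size=n, random_state=rng)
        u = rng.random(n)
        out.extend(x[u < target_pdf(x) / (M * proposal.pdf(x))])
    return np.array(out[:n])

sample = rejection_sample(5000)
print(f"acceptance rate approx 1/M = {1 / M:.2f}; sample mean = {sample.mean():.3f}")
```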

20.
The article considers a new approach to small area estimation based on joint modelling of means and variances. Model parameters are estimated via an expectation–maximization algorithm. The conditional mean squared error is used to evaluate the prediction error. Analytical expressions are obtained for the conditional mean squared error and its estimator. Our approximations are second-order correct, an unwritten standard in the small area literature. Simulation studies indicate that the proposed method outperforms the existing methods in terms of prediction errors and their estimated values.

