Similar documents
 20 similar documents found (search time: 46 ms)
1.
The Monte Carlo method provides estimators for evaluating the expectation [ILM0001] based on samples from either the true density f or from some instrumental density. In this paper, we show that the Riemann estimators introduced by Philippe (1997) can be improved by using the importance sampling method. This approach produces a class of Monte Carlo estimators whose variance is of order O(n^{-2}). The choice of an optimal estimator within this class is discussed. Simulations illustrate the improvement brought by this method. Moreover, we give a criterion to assess the convergence of our optimal estimator to the integral of interest.
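As a rough illustration of the importance-sampling idea (a generic self-normalized estimator, not the paper's Riemann estimators), the sketch below estimates E_f[h(X)]; the target density, instrumental density, and sample size are arbitrary choices for the example:

```python
import math
import random

def importance_sampling_mean(h, f_pdf, g_pdf, g_sample, n, seed=0):
    # Self-normalized importance sampling estimate of E_f[h(X)]:
    # draw from g, reweight by f/g, and normalize by the weight sum.
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        x = g_sample(rng)
        w = f_pdf(x) / g_pdf(x)
        num += w * h(x)
        den += w
    return num / den

def norm_pdf(x, mu=0.0, sd=1.0):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Hypothetical setup: target f = N(0,1), instrumental g = N(0,2^2),
# h(x) = x^2, so the true value E_f[X^2] = 1.
est = importance_sampling_mean(
    h=lambda x: x * x,
    f_pdf=lambda x: norm_pdf(x),
    g_pdf=lambda x: norm_pdf(x, sd=2.0),
    g_sample=lambda rng: rng.gauss(0.0, 2.0),
    n=200_000,
)
```

Choosing g wider than f keeps the weights f/g bounded, which is what makes this instrumental density a safe (if not optimal) choice.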

2.
For ethical reasons, group sequential trials were introduced to allow trials to stop early in the event of extreme results. Endpoints in such trials are usually mortality or irreversible morbidity. For a given endpoint, the norm is to use a single test statistic and to use that same statistic for each analysis. This approach is risky because the test statistic has to be specified before the study is unblinded, and there is a loss in power if the assumptions that ensure optimality for each analysis are not met. To minimize the risk of moderate to substantial loss in power due to a suboptimal choice of statistic, a robust method was developed for nonsequential trials. The concept is analogous to diversifying financial investments to minimize risk. The method is based on combining P values from multiple test statistics for formal inference while controlling the type I error rate at its designated value. This article evaluates the performance of two P value combining methods for group sequential trials. The emphasis is on time-to-event trials, although results from less complex trials are also included. The gain or loss in power with the combination method relative to a single statistic is asymmetric in its favor: depending on the power of each individual test, the combination method can give more power than any single test, or give power that is closer to the test with the most power. The versatility of the method is that it can combine P values from different test statistics for analyses at different times. The robustness of the results suggests that inference from group sequential trials can be strengthened by the use of combined tests.
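The abstract does not specify which two combining methods are evaluated; as a classic example of the general idea, Fisher's method combines k independent P values into one, using the closed-form chi-square survival function for even degrees of freedom:

```python
import math

def fisher_combine(pvals):
    """Fisher's method: T = -2 * sum(log p_i) ~ chi^2 with 2k df under H0."""
    k = len(pvals)
    t = -2.0 * sum(math.log(p) for p in pvals)
    # Chi-square survival function with 2k df has a closed form:
    # P(T > t) = exp(-t/2) * sum_{i=0}^{k-1} (t/2)^i / i!
    half = t / 2.0
    term, s = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        s += term
    return math.exp(-half) * s

combined = fisher_combine([0.04, 0.10, 0.07])
```

Three individually non-significant P values can yield a significant combined P value, which is the diversification effect the abstract describes.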

3.
In this article, we introduce a nonparametric kernel method, starting with the half-normal detection function, for line transect sampling. The new method improves the bias from O(h^2), as the smoothing parameter h → 0, to O(h^3) and in some cases to O(h^4). Properties of the proposed estimator are derived, and an expression for its asymptotic mean square error (AMSE) is given. Minimization of the AMSE leads to an explicit formula for an optimal choice of the smoothing parameter. Small-sample properties of the estimator are investigated and compared with the traditional kernel estimator by simulation. Numerical results show that improvements over the traditional kernel estimator can often be realized even when the true detection function is far from half-normal.

4.
Longitudinal surveys have emerged in recent years as an important data collection tool for population studies where the primary interest is to examine population changes over time at the individual level. Longitudinal data are often analyzed through the generalized estimating equations (GEE) approach. The vast majority of the existing literature on the GEE method, however, is developed under non-survey settings and is inappropriate for data collected through complex sampling designs. In this paper the authors develop a pseudo-GEE approach for the analysis of survey data. They show that survey weights must, and can, be appropriately accounted for in the GEE method under a joint randomization framework. The consistency of the resulting pseudo-GEE estimators is established under the proposed framework. Linearization variance estimators are developed for the pseudo-GEE estimators when the finite population sampling fractions are small or negligible, a scenario that often holds for large-scale surveys. Finite sample performance of the proposed estimators is investigated through an extensive simulation study using data from the National Longitudinal Survey of Children and Youth. The results show that the pseudo-GEE estimators and the linearization variance estimators perform well under several sampling designs and for both continuous and binary responses. The Canadian Journal of Statistics 38: 540–554; 2010 © 2010 Statistical Society of Canada

5.
In a multilevel model for complex survey data, the weight‐inflated estimators of variance components can be biased. We propose a resampling method to correct this bias. The performance of the bias corrected estimators is studied through simulations using populations generated from a simple random effects model. The simulations show that, without lowering the precision, the proposed procedure can reduce the bias of the estimators, especially for designs that are both informative and have small cluster sizes. Application of these resampling procedures to data from an artificial workplace survey provides further evidence for the empirical value of this method. The Canadian Journal of Statistics 40: 150–171; 2012 © 2012 Statistical Society of Canada
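The abstract's specific resampling scheme for multilevel survey weights is not given here; as a minimal sketch of the general bootstrap bias-correction idea it relies on, the example below corrects the downward bias of the plug-in (divide-by-n) variance estimator, with all data and settings chosen for illustration:

```python
import random
import statistics

def bootstrap_bias_corrected(data, estimator, n_boot=2000, seed=0):
    # Bias-corrected estimate: theta_hat - (mean of bootstrap replicates - theta_hat),
    # i.e. 2*theta_hat - mean(bootstrap replicates).
    rng = random.Random(seed)
    theta_hat = estimator(data)
    boot = []
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]
        boot.append(estimator(resample))
    bias = statistics.fmean(boot) - theta_hat
    return theta_hat - bias

def plugin_var(xs):
    # Divide-by-n variance: biased low by a factor (n-1)/n.
    m = statistics.fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(30)]
corrected = bootstrap_bias_corrected(data, plugin_var)
```

Because the bootstrap detects the negative bias, the corrected value is pushed upward, toward the unbiased divide-by-(n-1) estimator.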

6.
Reporting sampling errors of survey estimates is a common problem when compiling a survey report. Because of the vast number of study variables, population characteristics, and domains of interest in a survey, it is almost impossible to calculate and publish the standard errors for each statistic. One way of overcoming this problem is to estimate the sampling errors indirectly using generalized variance functions, which define a statistical relationship between the sampling errors and the corresponding estimates. One difficulty with this approach is that the model specification has to be consistent with a roughly constant design effect. If the design effects vary greatly across estimates, as is the case in business surveys, the prediction model is misspecified and the least-squares estimation is biased. In this paper, we present an extension of the generalized variance functions that addresses these problems and could be used in contexts similar to those encountered in business surveys. The proposed method has been applied to the Italian Structural Business Statistics Survey.
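A generalized variance function in its simplest textbook form is a log-linear regression of variances on estimates, fitted once and then used to predict standard errors for statistics whose variances were never computed directly. The sketch below (with made-up numbers, and ordinary least squares rather than the paper's extension) shows the mechanics:

```python
import math

def fit_gvf(estimates, variances):
    """Fit log(v) = a + b*log(y) by least squares; return a predictor v(y)."""
    xs = [math.log(y) for y in estimates]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda y: math.exp(a + b * math.log(y))

# Hypothetical calibration data following v = 2 * y^0.8 exactly.
ests = [10.0, 100.0, 1000.0]
vars_ = [2.0 * y ** 0.8 for y in ests]
predict_var = fit_gvf(ests, vars_)
```

Once fitted, `predict_var` supplies an indirect variance for any published estimate, which is exactly the reporting shortcut the abstract describes.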

7.
The process capability index Cpk is widely used when measuring the capability of a manufacturing process. A process is defined to be capable if the capability index exceeds a stated threshold value, e.g. Cpk > 4/3. This inequality can be expressed graphically using a process capability plot, which is a plot in the plane defined by the process mean and the process standard deviation, showing the region for a capable process. In the process capability plot, a safety region can be plotted to obtain a simple graphical decision rule for assessing process capability at a given significance level. We consider safety regions to be used for the index Cpk. Under the assumption of normality, we derive elliptical safety regions so that, using a random sample, conclusions about the process capability can be drawn at a given significance level. This simple graphical tool is helpful when trying to understand whether it is the variability, the deviation from target, or both that need to be reduced to improve the capability. Furthermore, using safety regions, several characteristics with different specification limits and different sample sizes can be monitored in the same plot. The proposed graphical decision rule is also investigated with respect to power.
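For readers unfamiliar with the index itself, Cpk = min(USL - mu, mu - LSL) / (3*sigma); a plug-in sample estimate (not the paper's safety-region decision rule) is a few lines:

```python
import statistics

def cpk(sample, lsl, usl):
    """Plug-in estimate of Cpk from a sample, given spec limits lsl < usl."""
    m = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation
    return min(usl - m, m - lsl) / (3.0 * s)

# Toy sample: mean 10, sample sd 1; limits chosen so Cpk = min(3, 6)/3 = 1.
value = cpk([9.0, 10.0, 11.0], lsl=4.0, usl=13.0)
```

The `min` makes the index sensitive to off-center means as well as to spread, which is why the paper's plot separates the two causes of low capability.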

8.
The demand for reliable statistics in subpopulations, when only reduced sample sizes are available, has promoted the development of small area estimation methods. In particular, an approach that is now widely used is based on the seminal work by Battese et al. [An error-components model for prediction of county crop areas using survey and satellite data, J. Am. Statist. Assoc. 83 (1988), pp. 28–36] that uses linear mixed models (MM). We investigate alternatives for when a linear MM does not hold because, on the one hand, linearity may not be assumed and/or, on the other, normality of the random effects may not be assumed. In particular, Opsomer et al. [Nonparametric small area estimation using penalized spline regression, J. R. Statist. Soc. Ser. B 70 (2008), pp. 265–283] propose an estimator that extends the linear MM approach to the case in which a linear relationship may not be assumed, using penalized spline regression. From a very different perspective, Chambers and Tzavidis [M-quantile models for small area estimation, Biometrika 93 (2006), pp. 255–268] have recently proposed an approach for small area estimation that is based on M-quantile (MQ) regression. This allows for models robust to outliers and to distributional assumptions on the errors and the area effects. However, when the functional form of the relationship between the qth MQ and the covariates is not linear, it can lead to biased estimates of the small area parameters. Pratesi et al. [Semiparametric M-quantile regression for estimating the proportion of acidic lakes in 8-digit HUCs of the Northeastern US, Environmetrics 19(7) (2008), pp. 687–701] apply an extended version of this approach to the estimation of the small area distribution function using a nonparametric specification of the conditional MQ of the response variable given the covariates [M. Pratesi, M.G. Ranalli, and N. Salvati, Nonparametric m-quantile regression using penalized splines, J. Nonparametric Stat. 21 (2009), pp. 287–304].
We derive the small area estimator of the mean under this model, together with its mean squared error estimator, and compare its performance with that of the other estimators via simulations on both real and simulated data.

9.
Teaching how to derive minimax decision rules can be challenging because of the lack of examples that are simple enough to be used in the classroom. Motivated by this challenge, we provide a new example that illustrates the use of standard techniques in the derivation of optimal decision rules under the Bayes and minimax approaches. We discuss how to predict the value of an unknown quantity, θ ∈ {0, 1}, given the opinions of n experts. An important example of such a crowdsourcing problem occurs in modern cosmology, where θ indicates whether a given galaxy is merging or not, and Y1, …, Yn are the opinions of n astronomers regarding θ. We use the resulting prediction rules to discuss the advantages and disadvantages of the Bayes and minimax approaches to decision theory. The material presented here is intended to be taught to first-year graduate students.
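The paper's exact decision rules are not reproduced in the abstract; under the common simplifying assumptions of a known prior and conditionally independent experts, each correct with a known probability, the Bayes rule is a weighted majority vote on the log-odds scale. A classroom-style sketch:

```python
import math

def bayes_predict(opinions, accuracies, prior=0.5):
    """Predict theta in {0, 1} from expert opinions.

    Assumes (for illustration) experts are conditionally independent given
    theta and expert j is correct with known probability accuracies[j].
    """
    log_odds = math.log(prior / (1.0 - prior))
    for y, p in zip(opinions, accuracies):
        lr = math.log(p / (1.0 - p))  # log-likelihood ratio of one vote
        log_odds += lr if y == 1 else -lr
    return 1 if log_odds > 0 else 0
```

One highly accurate expert can outvote several mediocre ones, which is the kind of behavior that makes the example useful for contrasting Bayes and minimax rules.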

10.
Asymptotic inferences about a linear combination of K independent binomial proportions are very frequent in applied research. Nevertheless, until quite recently research had focused almost exclusively on cases of K ≤ 2 (particularly on cases of one proportion and the difference of two proportions). This article focuses on cases of K > 2, which have recently begun to receive more attention due to their great practical interest. For making this inference, there are several procedures which have not been compared: on the one hand, the score method (S0) and the adjusted Wald method proposed by Martín Andrés et al. (W3), which is a generalization of the method proposed by Price and Bonett; on the other hand, the method of Zou et al. (N0) based on the Wilson confidence interval, which is a generalization of the Newcombe method. The article describes a new procedure (P0) based on the classic Peskun method, modifies the previous methods by adding a continuity correction (methods S0c, W3c, N0c and P0c, respectively) and, finally, presents a simulation comparing the eight aforementioned procedures (selected from a total of 32 possible methods). The conclusion reached is that the S0c method is the best, although for very small samples (ni ≤ 10 for all i) the W3 method is better. The P0 method would be optimal if one needs a method which is almost never too liberal, but this entails using a method which is too conservative and provides excessively wide CIs. The W3 and P0 methods have the additional advantage of being very easy to apply. A free program which allows the application of the S0 and S0c methods (the most complex ones) can be obtained at http://www.ugr.es/local/bioest/Z_LINEAR_K.EXE.
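The building block of the N0 method is the Wilson score interval for a single proportion; a sketch (of the single-proportion interval only, not the K-proportion generalization of Zou et al.):

```python
import math

def wilson_ci(x, n, z=1.959964):
    """Wilson score interval for a proportion, x successes in n trials."""
    p = x / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half
```

Unlike the plain Wald interval, the Wilson interval behaves sensibly at the boundaries: with 0 successes in 10 trials it still produces a nondegenerate interval starting at 0.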

11.
12.
The use of relevance vector machines to flexibly model hazard rate functions is explored. This technique is adapted to survival analysis problems through the partial logistic approach. The method exploits the Bayesian automatic relevance determination procedure to obtain sparse solutions and it incorporates the flexibility of kernel-based models. Example results are presented on literature data from a head-and-neck cancer survival study using Gaussian and spline kernels. Sensitivity analysis is conducted to assess the influence of hyperprior distribution parameters. The proposed method is then contrasted with other flexible hazard regression methods, in particular the HARE model proposed by Kooperberg et al. [16]. A simulation study is conducted to carry out the comparison. The model developed in this paper exhibited good performance in the prediction of hazard rate. The application of this sparse Bayesian technique to a real cancer data set demonstrated that the proposed method can potentially reveal characteristics of the hazards, associated with the dynamics of the studied diseases, which may be missed by existing modeling approaches based on different perspectives on the bias vs. variance balance.

13.
Unless the preliminary m subgroups of small samples are drawn from a stable process, the control limits estimated in phase I can be erroneous, and the performance of the chart in phase II can be significantly affected as a result. In this work, a quantitative approach based on extracting shape features of control chart patterns is proposed for evaluating the stability of the process mean while the preliminary samples are drawn, thereby eliminating the subjectivity associated with visual analysis of the patterns. The effectiveness of the test procedure is evaluated using simulated data. The results show that the proposed approach can be very effective for m ≥ 48. The power of the test can be improved by identifying a new feature that more efficiently discriminates a cyclic pattern of smaller periodicity from the natural pattern, and by redefining the test statistic.

14.
Stochastic Models, 2013, 29(3), 469–496
We consider a single-commodity, discrete-time, multiperiod (s, S)-policy inventory model with backlog. The cost function may contain holding, shortage, and fixed ordering costs; holding and shortage costs may be nonlinear. We show that the resulting inventory process is quasi-regenerative, i.e., admits a cycle decomposition, and we indicate how to estimate the performance by Monte Carlo simulation. By using a conditioning method, the push-out technique, and the change-of-measure method, estimates of the whole response surface (i.e., the steady-state performance as a function of the parameters s and S) and of its derivatives can be found. Estimates for the optimal (s, S) policy can then be calculated by numerical optimization.
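A plain Monte Carlo simulation of an (s, S) policy with backlogging (without the paper's conditioning, push-out, or change-of-measure refinements) can be sketched as follows; the linear cost rates and uniform demand are arbitrary choices for the example:

```python
import random

def simulate_sS(s, S, periods, demand,
                holding=1.0, shortage=5.0, fixed=10.0, seed=0):
    """Average cost per period of an (s, S) policy with backlogging (sketch).

    Each period: if inventory <= s, order up to S (paying a fixed cost),
    then subtract demand and charge holding or shortage cost on the result.
    """
    rng = random.Random(seed)
    inv, total = S, 0.0
    for _ in range(periods):
        if inv <= s:
            total += fixed
            inv = S
        inv -= demand(rng)  # unmet demand is backlogged (inv goes negative)
        total += holding * inv if inv >= 0 else -shortage * inv
    return total / periods

cost = simulate_sS(s=2, S=10, periods=50_000,
                   demand=lambda rng: rng.randint(0, 4))
```

Evaluating `simulate_sS` on a grid of (s, S) pairs gives a crude picture of the response surface the abstract refers to, at far greater simulation expense than the derivative-based estimates it proposes.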

15.
Suppose that a finite population consists of N distinct units. Associated with the ith unit is a polychotomous response vector, di, and a vector of auxiliary variables, xi. The values xi are known for the entire population, but the di are known only for the units selected in the sample. The problem is to estimate the finite population proportion vector P. One of the fundamental questions in finite population sampling is how to make effective use of the complete auxiliary information at the estimation stage. In this article a predictive estimator is proposed which incorporates the auxiliary information at the estimation stage by invoking a superpopulation model. However, the use of such estimators is often criticized because the working superpopulation model may not be correct. To protect the predictive estimator from possible model failure, a nonparametric regression model is considered for the superpopulation. The asymptotic properties of the proposed estimator are derived, and a bootstrap-based hybrid resampling method for estimating its variance is developed. Results of a simulation study are reported on the performance of the predictive estimator and its resampling-based variance estimator from the model-based viewpoint. Finally, a survey of the opinions of 686 individuals on the causes of addiction is used in an empirical study to investigate the performance of the nonparametric predictive estimator from the design-based viewpoint.

16.
Existing process capability indices (PCIs) assume that the distribution of the process under investigation is normal. For non-normal distributions, PCIs become unreliable in that they may indicate the process is capable when in fact it is not. In this paper, we propose a new index which can be applied to any distribution. The proposed index, Cf, is directly related to the probability of non-conformance of the process. For a given random sample, the estimation of Cf boils down to estimating non-parametrically the tail probabilities of an unknown distribution. The approach discussed in this paper is based on the work of Pickands (1975) and Smith (1987). We also discuss the construction of bootstrap confidence intervals for Cf based on the accelerated bias-correction method (BCa). Several simulations are carried out to demonstrate the flexibility and applicability of Cf. Two real-life data sets are analyzed using the proposed index.
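The quantity behind Cf is the probability of falling outside the specification limits. The naive empirical estimate below shows what is being targeted; the paper instead extrapolates the tails with Pickands-style extreme-value methods, since the empirical fraction is useless when no sample point falls outside the limits:

```python
def nonconformance(sample, lsl, usl):
    """Empirical probability of non-conformance: fraction of the sample
    outside the specification limits [lsl, usl] (naive sketch only)."""
    n = len(sample)
    return sum(1 for x in sample if x < lsl or x > usl) / n
```

For example, with limits (1.5, 4.5), the sample [1, 2, 3, 4, 5] has two non-conforming values out of five.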

17.
In a sample survey, questions requiring personal or controversial assertions often give rise to resistance. A randomised response procedure can be used to help the researcher gather accurate data in this case. This paper describes a new two-stage unrelated randomised response procedure that combines the use of two randomisation devices (Mangat & Singh, 1990) with an unrelated question (Horwitz et al., 1967). It examines the situation where the respondents are not completely truthful in their answers. The efficiency of this new method is compared with that of the original one-stage procedure proposed by Horwitz et al. (1967), and guidelines for choosing the values of the different parameters of the procedures are provided. Results from an empirical study which examines the efficiency and feasibility of the proposed method are given.
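For context, the one-stage unrelated-question design works as follows: each respondent answers the sensitive question with probability p and an innocuous unrelated question (with known "yes" rate pi_u) otherwise, so the sensitive prevalence can be unscrambled from the overall "yes" proportion. A sketch of that moment estimator (the one-stage baseline, not the paper's two-stage procedure):

```python
def rr_estimate(yes_prop, p, pi_u):
    """Unrelated-question randomised response estimator.

    yes_prop: observed proportion of 'yes' answers
    p:        probability the device selects the sensitive question
    pi_u:     known 'yes' probability of the unrelated question
    Solves yes_prop = p*pi_s + (1-p)*pi_u for the sensitive prevalence pi_s.
    """
    return (yes_prop - (1.0 - p) * pi_u) / p

# Consistency check: if the true sensitive prevalence is 0.3, p = 0.7 and
# pi_u = 0.5, the expected 'yes' proportion is 0.7*0.3 + 0.3*0.5 = 0.36.
pi_s_hat = rr_estimate(0.36, p=0.7, pi_u=0.5)
```

Because no one can tell which question a given respondent answered, privacy is preserved while the aggregate remains estimable.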

18.
The problem of statistical calibration of a measuring instrument can be framed both in a statistical context and in an engineering context. In the first, the problem is dealt with by distinguishing between the ‘classical’ approach and the ‘inverse’ regression approach. Both of these models are static and are used to estimate exact measurements from measurements that are affected by error. In the engineering context, the variables of interest are considered as they evolve over the time at which they are observed. The Bayesian time series method of dynamic linear models can be used to monitor the evolution of the measurements, thus introducing a dynamic approach to statistical calibration. The research presented here employs this new approach to performing statistical calibration. A simulation study in the context of microwave radiometry is conducted that compares the dynamic model to traditional static frequentist and Bayesian approaches. The focus of the study is to understand how well the dynamic statistical calibration method performs under various signal-to-noise ratios, r.
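The simplest dynamic linear model is the one-dimensional local-level model, filtered with the Kalman recursions; the sketch below illustrates the kind of sequential updating the abstract refers to (it is not the paper's radiometry model, and the variances are arbitrary):

```python
def local_level_filter(ys, var_obs, var_state, m0=0.0, c0=1e6):
    """Kalman filter for the local-level DLM:
    state:       theta_t = theta_{t-1} + w_t,  w_t ~ N(0, var_state)
    observation: y_t     = theta_t + v_t,      v_t ~ N(0, var_obs)
    Returns the sequence of filtered means E[theta_t | y_1..y_t]."""
    m, c = m0, c0
    out = []
    for y in ys:
        r = c + var_state          # prior variance at time t
        k = r / (r + var_obs)      # Kalman gain
        m = m + k * (y - m)        # filtered mean
        c = (1.0 - k) * r          # filtered variance
        out.append(m)
    return out

# A constant noise-free signal: the filtered mean should lock onto it.
filtered = local_level_filter([5.0] * 20, var_obs=1.0, var_state=0.1)
```

The diffuse initial variance `c0` lets the first observation dominate, after which the gain settles to a steady-state trade-off between the observation and state noise, which is how such a filter tracks a drifting instrument.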

19.
Composite quantile regression models have been shown to be effective techniques for improving prediction accuracy [H. Zou and M. Yuan, Composite quantile regression and the oracle model selection theory, Ann. Statist. 36 (2008), pp. 1108–1126; J. Bradic, J. Fan, and W. Wang, Penalized composite quasi-likelihood for ultrahigh-dimensional variable selection, J. R. Stat. Soc. Ser. B 73 (2011), pp. 325–349; Z. Zhao and Z. Xiao, Efficient regressions via optimally combining quantile information, Econometric Theory 30(6) (2014), pp. 1272–1314]. This paper studies composite Tobit quantile regression (TQReg) from a Bayesian perspective. A simple and efficient MCMC-based computation method is derived for posterior inference, using the representation of the skewed Laplace distribution as a mixture of an exponential and a scaled normal distribution. The approach is illustrated via simulation studies and a real data set. The results show that combining information across different quantiles can provide a useful approach to efficient statistical estimation. This is the first work to discuss composite TQReg from a Bayesian perspective.

20.
We consider the situation where there is a known regression model that can be used to predict an outcome, Y, from a set of predictor variables X. A new variable B is expected to enhance the prediction of Y. A dataset of size n containing Y, X and B is available, and the challenge is to build an improved model for Y|X,B that uses both the available individual-level data and some summary information obtained from the known model for Y|X. We propose a synthetic data approach, which consists of creating m additional synthetic data observations, and then analyzing the combined dataset of size n + m to estimate the parameters of the Y|X,B model. This combined dataset of size n + m now has missing values of B for m of the observations, and is analyzed using methods that can handle missing data (e.g., multiple imputation). We present simulation studies and illustrate the method using data from the Prostate Cancer Prevention Trial. Though the synthetic data method is applicable to a general regression context, to provide some justification, we show in two special cases that the asymptotic variances of the parameter estimates in the Y|X,B model are identical to those from an alternative constrained maximum likelihood estimation approach. This correspondence in special cases, together with the method's broad applicability, makes it appealing for use across diverse scenarios. The Canadian Journal of Statistics 47: 580–603; 2019 © 2019 Statistical Society of Canada


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号