
1.

Subset Selection Toward Optimizing the Best Performance at a Second Stage

Chaim Meyer Ehrman Abba Krieger Klaus J. Miescke 《Journal of Business & Economic Statistics》2013,31(2):295-303

In the search for the best of n candidates, two-stage procedures of the following type are in common use. In a first stage, weak candidates are removed, and the subset of promising candidates is then further examined. At a second stage, the best of the candidates in the subset is selected. In this article, optimization is not aimed at the parameter with largest value but rather at the best performance of the selected candidates at Stage 2. Under a normal model, a new procedure based on posterior percentiles is derived using a Bayes approach, where nonsymmetric normal (proper and improper) priors are applied. Comparisons are made with two other procedures frequently used in selection decisions. The three procedures and their performances are illustrated with data from a recent recruitment process at a Midwestern university.

2.

Analyzing Hybrid Randomized Response Data with a Binomial Selection Procedure

Feiyi Jia Gary C. McDonald 《Communications in Statistics - Theory and Methods》2013,42(6):784-807

The operating characteristics (OCs) of an indifference-zone ranking and selection procedure are derived for randomized response binomial data. The OCs include tables and figures to facilitate tradeoffs between sample size and a stated probability of a correct selection, i.e., correctly identifying the binomial population (out of k ≥ 2) characterized by the largest probability of success. Measures of efficiency are provided to assist the analyst in selection of an appropriate randomized response design for the collection of the data. A hybrid randomized response model, which includes the Warner model and the Greenberg et al. model, is introduced to facilitate comparisons among a wider range of statistical designs than previously available. An example comparing failure rates of contraceptive methods is used to illustrate the use of these new results.
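For orientation, the Warner model named in this abstract can be sketched in a few lines. In Warner's design each respondent answers the sensitive statement with probability p and its complement with probability 1 − p, so the probability of a "yes" is λ = pπ + (1 − p)(1 − π), and the sensitive proportion π is recovered by inverting that relation. The function name and the counts below are illustrative assumptions, not taken from the article, and the hybrid model of the article is not reproduced here.

```python
def warner_estimate(n_yes, n, p):
    """Moment estimator of the sensitive proportion under Warner's design.

    n_yes: number of "yes" answers; n: sample size;
    p: probability that the device selects the sensitive statement (p != 0.5).
    """
    if abs(2 * p - 1) < 1e-12:
        raise ValueError("p = 0.5 makes the proportion unidentifiable")
    lam_hat = n_yes / n                          # observed share of "yes" answers
    pi_hat = (lam_hat - (1 - p)) / (2 * p - 1)   # invert lambda = p*pi + (1-p)*(1-pi)
    # Plug-in variance estimate of pi_hat (binomial variance scaled by the design).
    var_hat = lam_hat * (1 - lam_hat) / (n * (2 * p - 1) ** 2)
    return pi_hat, var_hat

# Hypothetical survey: 380 "yes" answers out of 1000 with p = 0.7.
pi_hat, var_hat = warner_estimate(n_yes=380, n=1000, p=0.7)
```

The factor (2p − 1)² in the denominator shows the usual tradeoff: choosing p closer to 0.5 protects respondents more but inflates the variance, which is the kind of sample-size tradeoff the OC tables are meant to support.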

3.

Estimation of parameters for a Birnbaum–Saunders regression model with censored data

《Journal of Statistical Computation and Simulation》2012,82(11):983-997

Little work has been published on the analysis of censored data for the Birnbaum–Saunders distribution (BISA). In this article, we implement the EM algorithm to fit a regression model with censored data when the failure times follow the BISA. Three approaches to implement the E-Step of the EM algorithm are considered. In two of these implementations, the M-Step is attained by an iterative least-squares procedure. The algorithm is exemplified with a single explanatory variable in the model.

4.

Generalized geometric distribution of order k: A flexible choice to randomize the response

Zawar Hussain Javid Shabbir Zahid Pervez Said Farooq Shah Manzoor Khan 《Communications in Statistics - Simulation and Computation》2017,46(6):4708-4721

This article focuses on the improvement of a celebrated randomized response technique of Kuk. A generalized randomized response technique is suggested. In particular, the generalized geometric distribution of order k is introduced as a randomization device for estimating the population proportion of a rare sensitive attribute. The proposed randomized response technique includes the techniques of Singh and Grewal and of Hussain et al. as special cases. Through numerical illustrations, it is established that the suggested technique is superior to the Kuk, Singh and Grewal, and Hussain et al. techniques. Flexibility of the proposed technique is also discussed.

5.

Efficiency and Validity Analyses of Two-Stage Estimation Procedures and Derived Testing Procedures in Quantitative Linear Models with AR(1) Errors

《Communications in Statistics - Simulation and Computation》2013,42(3):799-833

Abstract

In a quantitative linear model with errors following a stationary Gaussian, first-order autoregressive or AR(1) process, Generalized Least Squares (GLS) on raw data and Ordinary Least Squares (OLS) on prewhitened data are efficient methods of estimation of the slope parameters when the autocorrelation parameter of the error AR(1) process, ρ, is known. In practice, ρ is generally unknown. In the so-called two-stage estimation procedures, ρ is then estimated first before using the estimate of ρ to transform the data and estimate the slope parameters by OLS on the transformed data. Different estimators of ρ have been considered in previous studies. In this article, we study nine two-stage estimation procedures for their efficiency in estimating the slope parameters. Six of them (i.e., three noniterative, three iterative) are based on three estimators of ρ that have been considered previously. Two more (i.e., one noniterative, one iterative) are based on a new estimator of ρ that we propose: it is provided by the sample autocorrelation coefficient of the OLS residuals at lag 1, denoted r(1). Lastly, REstricted Maximum Likelihood (REML) represents a different type of two-stage estimation procedure whose efficiency has not been compared to the others yet. We also study the validity of the testing procedures derived from GLS and the nine two-stage estimation procedures. Efficiency and validity are analyzed in a Monte Carlo study. Three types of explanatory variable x in a simple quantitative linear model with AR(1) errors are considered in the time domain: Case 1, x is fixed; Case 2, x is purely random; and Case 3, x follows an AR(1) process with the same autocorrelation parameter value as the error AR(1) process. In a preliminary step, the number of inadmissible estimates and the efficiency of the different estimators of ρ are compared empirically, whereas their approximate expected value in finite samples and their asymptotic variance are derived theoretically. 
Thereafter, the efficiency of the estimation procedures and the validity of the derived testing procedures are discussed in terms of the sample size and the magnitude and sign of ρ. The noniterative two-stage estimation procedure based on the new estimator of ρ is shown to be more efficient for moderate values of ρ at small sample sizes. With the exception of small sample sizes, REML and its derived F-test perform the best overall. The asymptotic equivalence of two-stage estimation procedures, besides REML, is observed empirically. Differences related to the nature, fixed or random (uncorrelated or autocorrelated), of the explanatory variable are also discussed.
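The noniterative two-stage idea described in this abstract can be illustrated with a minimal sketch: fit OLS, take the lag-1 sample autocorrelation r(1) of the residuals as the estimate of ρ, prewhiten the data, and refit by OLS. The sketch makes several assumptions that are not from the article: a single explanatory variable, a quasi-differencing transform that drops the first observation, and simulated data resembling Case 2 (purely random x). All function names are hypothetical.

```python
import random

def ols(x, y):
    """Intercept and slope of a simple least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def lag1_autocorrelation(e):
    """Sample autocorrelation coefficient r(1) of a series."""
    m = sum(e) / len(e)
    num = sum((e[t] - m) * (e[t - 1] - m) for t in range(1, len(e)))
    den = sum((v - m) ** 2 for v in e)
    return num / den

def two_stage_slope(x, y):
    """Estimate rho by r(1) of the OLS residuals, then run OLS on prewhitened data."""
    b0, b1 = ols(x, y)
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    rho = lag1_autocorrelation(resid)
    # Quasi-differencing; dropping the first observation is an assumption of this sketch.
    xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
    _, slope = ols(xs, ys)
    return rho, slope

# Simulated example: purely random x and AR(1) errors (values chosen for illustration).
rng = random.Random(0)
n, true_rho, true_slope = 500, 0.5, 2.0
x = [rng.gauss(0, 1) for _ in range(n)]
err = [rng.gauss(0, 1)]  # the start-up value is not drawn from the stationary law
for _ in range(n - 1):
    err.append(true_rho * err[-1] + rng.gauss(0, 1))
y = [1.0 + true_slope * xi + ei for xi, ei in zip(x, err)]
rho_hat, slope_hat = two_stage_slope(x, y)
```

An iterative variant would recompute the residuals from the second-stage fit and repeat until the estimate of ρ stabilizes; REML, which the abstract singles out, is a different procedure and is not sketched here.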

6.

Multiple comparisons with a control for exponential location parameters under heteroscedasticity

Vishal Maurya Amar Nath Gill Parminder Singh 《Journal of applied statistics》2013,40(8):1817-1830

In this paper, new design-oriented two-stage two-sided simultaneous confidence intervals are proposed for comparing several exponential populations with a control population in terms of location parameters under heteroscedasticity. If there is prior information that the location parameters of the k exponential populations are not less than the location parameter of the control population, one-sided simultaneous confidence intervals provide more inferential sensitivity than two-sided simultaneous confidence intervals. However, two-sided simultaneous confidence intervals have an advantage over one-sided ones in that they provide both lower and upper bounds for the parameters of interest. The proposed design-oriented two-stage two-sided simultaneous confidence intervals provide the benefits of both the two-stage one-sided and two-sided simultaneous confidence intervals. For cases in which the additional sample at the second stage is not available, because of a budget shortage or other factors in an experiment, one-stage two-sided confidence intervals are proposed, which combine the advantages of one-stage one-sided and two-sided simultaneous confidence intervals. The critical constants are obtained using the techniques given in Lam [9,10]. These critical constants are compared with those obtained by Bonferroni inequality techniques and are found to be less conservative. Implementation of the proposed simultaneous confidence intervals is demonstrated by a numerical example.

7.

Estimation of a rare sensitive attribute in a stratified sample using Poisson distribution

Gi-Sung Lee Jong-Min Kim 《Statistics》2013,47(3):575-589

This study proposes estimators for the mean number of respondents who possess a rare sensitive attribute, and for its variance, based on stratified sampling schemes (stratified sampling and stratified double sampling). This study deals with the extension of the estimation reported in Land et al. [Estimation of a rare sensitive attribute using Poisson distribution, Statistics (2011), in press. DOI: 10.1080/02331888.2010.524300] using a Poisson distribution and an unrelated question randomized response model reported in Greenberg et al. [The unrelated question randomized response model: Theoretical framework, J. Amer. Statist. Assoc. 64 (1969), 520–539]. In the stratified sampling, the estimators are proposed for the cases in which the parameter of the rare unrelated attribute is known and unknown. The variances of the estimators under proportional and optimum allocation are also suggested. The proposed estimators are evaluated by their efficiency relative to the variances of the estimators reported in Land et al., depending on the parameters and the probability of selecting a question. We show that our proposed methods have better efficiencies than Land et al.’s randomized response model under some conditions. When the sizes of the stratified populations are not given, other estimators are suggested using a stratified double sampling. For the proportional allocation, the difference between the two variances in the stratified sampling and the stratified double sampling is given for the case of a known rare unrelated attribute.

8.

Bayesian estimation of rare sensitive attribute

Joon Jin Song 《Communications in Statistics - Simulation and Computation》2017,46(5):4154-4160

Randomized response models have been used to estimate a population proportion of a sensitive attribute. A randomization device is typically employed to protect respondents' privacy in a survey. In addition, an unrelated question is asked to improve the statistical efficiency. In this article, we propose Bayesian estimation of a rare sensitive attribute using a randomized response technique that includes a rare unrelated attribute. Two cases are considered: the proportion of the rare unrelated attribute is either known or unknown. A simulation study is conducted to assess the performance of the models using mean absolute error and coverage probability. The results show that the performance depends on the parameters and is robust to priors.
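As background to the unrelated-question design mentioned above, the following sketch shows the classical moment estimator for the case in which the unrelated proportion is known. It assumes the basic binomial setting in which a respondent answers the sensitive question with probability p and the unrelated question otherwise, so that P(yes) = pπ + (1 − p)π_y. This is a frequentist baseline, not the Bayesian estimator proposed in the article, and it does not use the rare-attribute (Poisson) approximation; the function name and the numbers are illustrative assumptions.

```python
def unrelated_question_estimate(n_yes, n, p, pi_y):
    """Moment estimator of the sensitive proportion with a known unrelated proportion.

    n_yes: number of "yes" answers; n: sample size;
    p: probability of being directed to the sensitive question (0 < p <= 1);
    pi_y: known population proportion of the unrelated attribute.
    """
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    lam_hat = n_yes / n                           # observed share of "yes" answers
    pi_hat = (lam_hat - (1 - p) * pi_y) / p       # invert lambda = p*pi + (1-p)*pi_y
    var_hat = lam_hat * (1 - lam_hat) / (n * p ** 2)  # plug-in variance estimate
    return pi_hat, var_hat

# Hypothetical survey: 150 "yes" answers out of 1000, p = 0.6, pi_y = 0.25.
pi_hat, var_hat = unrelated_question_estimate(n_yes=150, n=1000, p=0.6, pi_y=0.25)
```

When the unrelated proportion is unknown, a second sample with a different p is typically needed to identify both proportions, which is why the abstract treats the two cases separately.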

9.

An application of negative binomial distribution of order k in sensitive surveys

Zahid Pervez Said Farooq Shah Zawar Hussain Andreas N. Philippou 《Communications in Statistics - Simulation and Computation》2017,46(8):6654-6660

This article is an attempt to generalize some of the recent papers on randomized response techniques by using the negative binomial distribution of order k to randomize the responses in the randomization design where respondents can report the outcome of one of two binary devices depending upon their actual status. The relative efficiency results are observed to be better than those of many recent and relevant randomized response techniques. The results are also better than those of the baseline model used in this study, provided the sensitive attribute is rare. An extra advantage of the proposed technique is that it does not require any additional sampling and administrative cost.

10.

Population mixture models and clustering algorithms

Stanley L. Sclove 《Communications in Statistics - Theory and Methods》2013,42(5):417-434

The problem of clustering individuals is considered within the context of a mixture of distributions. A modification of the usual approach to population mixtures is employed. As usual, a parametric family of distributions is considered, a set of parameter values being associated with each population. In addition, with each observation is associated an identification parameter, indicating from which population the observation arose. The resulting likelihood function is interpreted in terms of the conditional probability density of a sample from a mixture of populations, given the identification parameter of each observation. Clustering algorithms are obtained by applying a method of iterated maximum likelihood to this likelihood function.
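The iterated maximum likelihood idea can be illustrated with a small sketch. It assumes univariate normal populations with a common variance, which is an assumption of this example rather than something stated in the abstract. Under that assumption, maximizing the likelihood over an observation's identification parameter amounts to assigning it to the nearest population mean, and maximizing over the means amounts to averaging the observations currently assigned to each population. The function name and data are hypothetical.

```python
def classification_ml(data, initial_means, max_iter=100):
    """Alternate label assignment and mean re-estimation until the labels stop changing."""
    means = list(initial_means)  # copy so the caller's list is not modified
    labels = None
    for _ in range(max_iter):
        # With a common variance, the likelihood-maximizing label is the nearest mean.
        new_labels = [
            min(range(len(means)), key=lambda k: (v - means[k]) ** 2)
            for v in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(means)):
            members = [v for v, g in zip(data, labels) if g == k]
            if members:  # keep the previous mean if a population loses all members
                means[k] = sum(members) / len(members)
    return labels, means

# Toy data with two well-separated groups and rough starting values for the means.
data = [0.9, 1.1, 1.0, 4.8, 5.2, 5.0]
labels, means = classification_ml(data, initial_means=[0.0, 6.0])
```

In this special case the iteration coincides with a one-dimensional k-means procedure; other parametric families or unequal variances would lead to different assignment rules.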

11.

On the validation of fiducial techniques

André Plante 《Revue canadienne de statistique》1979,7(2):217-226

A structured model is essentially a family of random vectors X_θ defined on a probability space with values in a sample space. If, for a given sample value x and for each ω in the probability space, there is at most one parameter value θ for which X_θ(ω) is equal to x, then the model is called additive at x. When a certain conditional distribution exists, a frequency interpretation specific to additive structured models holds, and is summarized in a unique structured distribution for the parameter. Many of the techniques used by Fisher in deriving and handling his fiducial probability distribution are shown to be valid when dealing with a structured distribution.

12.

Optimum mixture designs for the log-logistic dose–response model with mixture of two similar compounds

Manisha Pal 《Communications in Statistics - Simulation and Computation》2018,47(3):800-808

The article studies the log-logistic class of dose–response bioassay models in the binomial set-up. The dose is identified by the potency adjusted mixing proportions of two similar compounds. Models for both absence and presence of interaction between the compounds have been considered. The aim is to investigate the D- and D_s-optimal mixture designs for the estimation of the full set of parameters or for the estimation of potency for a best guess of the parameter values. We also indicate how to find the optimal design to estimate the mixing proportions at which the probability of success attains a given value in the absence of the interaction effect.

13.

Augmented-limited regression models with an application to the study of the risk perceived using continuous scales

Ana R. S. Silva Caio L. N. Azevedo Jorge L. Bazán Juvêncio S. Nobre 《Journal of applied statistics》2021,48(11):1998

Studies of perceived risk using continuous scales on [0,100] were recently introduced in psychometrics. Such responses can be transformed to the unit interval, but zeros and ones are commonly observed. Motivated by this, we introduce a full inferential set of tools that allows for augmented and limited data modeling. We consider parameter estimation, residual analysis, influence diagnostics and model selection for zero-and/or-one augmented beta rectangular (ZOABR) regression models and their particular nested models, which are based on a new parameterization of the beta rectangular distribution. Unlike other alternatives, we perform maximum-likelihood estimation using a combination of the EM algorithm (for the continuous part) and the Fisher scoring algorithm (for the discrete part). We also take an additional step by considering link functions other than the usual logistic link for modeling the response mean. Using randomized quantile residuals, (local) influence diagnostics and model selection tools, we identified the ZOABR regression model as the best one. We also conducted extensive simulation studies, which indicate that all the developed tools work properly. Finally, we discuss the use of this type of model to treat psychometric data. It is worth mentioning that applications of the developed methods go beyond psychometric data: they can be useful whenever the response variable is bounded, whether or not the limits themselves are included.

14.

Tail dependence of skew t-copulas

Tõnu Kollo Gaida Pettere Marju Valge 《Communications in Statistics - Simulation and Computation》2017,46(2):1024-1034

We examine the tail behavior of the skew t-copula in the bivariate case. The tail dependence coefficient is calculated for different skewing parameter values and compared with the corresponding coefficient for the t-copula. It is shown that, depending on the skewing parameter values, the tail dependence coefficient can differ considerably from the tail dependence of the t-copula. The speed of convergence of the estimator of the tail dependence coefficient to its theoretical value is examined in a simulation experiment. The method of moments and the maximum likelihood method are also compared by simulation. In the cases considered, the maximum likelihood method converged faster to the theoretical value.

15.

Estimating the prevalence of sensitive behaviour and cheating with a dual design for direct questioning and randomized response

van den Hout A Böckenholt U van der Heijden PG 《Journal of the Royal Statistical Society. Series C, Applied statistics》2010,59(4):723-736

Randomized response is a misclassification design to estimate the prevalence of sensitive behaviour. Respondents who do not follow the instructions of the design are considered to be cheating. A mixture model is proposed to estimate the prevalence of sensitive behaviour and cheating in the case of a dual sampling scheme with direct questioning and randomized response. The mixing weight is the probability of cheating, where cheating is modelled separately for direct questioning and randomized response. For Bayesian inference, Markov chain Monte Carlo sampling is applied to sample parameter values from the posterior. The model makes it possible to analyse dual sample scheme data in a unified way and to assess cheating for direct questions as well as for randomized response questions. The research is illustrated with randomized response data concerning violations of regulations for social benefit.

16.

The simultaneous confidence intervals for all distances from the extreme populations for two-parameter exponential populations based on the multiply type II censored samples

《Journal of Statistical Computation and Simulation》2012,82(2):137-165

Among k independent two-parameter exponential distributions with a common scale parameter, the lower extreme population (LEP) is the one with the smallest location parameter and the upper extreme population (UEP) is the one with the largest location parameter. Given a multiply type II censored sample from each of these k independent two-parameter exponential distributions, 14 estimators for the unknown location parameters and the common unknown scale parameter are considered. Fourteen simultaneous confidence intervals (SCIs) for all distances from the extreme populations (UEP and LEP) and from the UEP from these k independent exponential distributions under the multiply type II censoring are proposed. The critical values are obtained by the Monte Carlo method. The optimal SCIs among the 14 methods are identified based on the criterion of minimum confidence length for various censoring schemes. Subset selection procedures for the extreme populations are also proposed, and two numerical examples are given for illustration.

17.

Multivariate measurement error models based on Student-t distribution under censored responses

Larissa A. Matos Luis M. Castro Celso R. B. Cabral Víctor H. Lachos 《Statistics》2013,47(6):1395-1416

Measurement error models constitute a wide class of models that include linear and nonlinear regression models. They are very useful to model many real-life phenomena, particularly in the medical and biological areas. The great advantage of these models is that, in some sense, they can be represented as mixed effects models, allowing us to implement well-known techniques, such as the EM algorithm, for parameter estimation. In this paper, we consider a class of multivariate measurement error models where the observed response and/or covariate are not fully observed, i.e., the observations are subject to certain threshold values below or above which the measurements are not quantifiable. Consequently, these observations are considered censored. We assume a Student-t distribution for the unobserved true values of the mismeasured covariate and the error term of the model, providing a robust alternative for parameter estimation. Our approach relies on likelihood-based inference using an EM-type algorithm. The proposed method is illustrated through some simulation studies and the analysis of an AIDS clinical trial dataset.

18.

Divergence-based confidence intervals in false-positive misclassification model

《Journal of Statistical Computation and Simulation》2012,82(6):527-540

In this article, we introduce minimum divergence estimators of parameters of a binary response model when data are subject to false-positive misclassification and obtained using a double-sampling plan. Under this set up, the problem of goodness-of-fit is considered and divergence-based confidence intervals (CIs) for a population proportion parameter are derived. A simulation experiment is carried out to compare the coverage probabilities of the new CIs. An application to real data is also given.

19.

Robust multivariate mixture regression models with incomplete data

Hwa Kyung Lim Naveen N. Narisetty 《Journal of Statistical Computation and Simulation》2017,87(2):328-347

Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in regression function in the presence of outliers and missing values. Along with the robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even under the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

20.

On Some Multiple Decision Procedures for Normal Variances

Ching-Ching Lin 《Communications in Statistics - Simulation and Computation》2013,42(2):265-275

In this article, we propose a multiple decision procedure to test the homogeneity of normal variances. If the null hypothesis is rejected, our goal is to select a subset containing the population associated with the largest variance. An approximation for the critical value is obtained by deriving an approximate distribution for a linear combination of independent log-gamma distributed random variables. A lower bound for the probability of correct decision is obtained. We also study the determination of the common sample size in order to satisfy a given probability of correct decision when the largest variance is “sufficiently” larger than the rest.