Similar Documents
20 similar documents found.
1.
Estimation of the mean of an exponential distribution based on record data has been treated by Samaniego and Whitaker [F.J. Samaniego and L.R. Whitaker, On estimating popular characteristics from record breaking observations I. Parametric results, Naval Res. Logist. Quart. 33 (1986), pp. 531–543] and Doostparast [M. Doostparast, A note on estimation based on record data, Metrika 69 (2009), pp. 69–80]. When a random sample Y1, …, Yn is examined sequentially and successive minimum values are recorded, Samaniego and Whitaker obtained a maximum likelihood estimator of the population mean and showed its convergence in probability. We establish here its convergence in mean square error, which is stronger than convergence in probability. Next, we discuss the optimal sample size for estimating the mean under a criterion involving a cost function as well as the Fisher information contained in records arising from a random sample. Finally, a comparison between complete data and record data is carried out, and some special cases are discussed in detail.
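As a rough, hedged illustration of record-based estimation (a sketch only: it uses the classical upper-record setting with a fixed number m of records, rather than the paper's lower records extracted from a sample of fixed size n), the following Python snippet exploits the fact that upper-record spacings from an exponential distribution are iid exponential, so the MLE of the mean given m record values is Rm/m:

```python
import numpy as np

rng = np.random.default_rng(0)

def first_m_upper_records(rng, theta, m):
    """Observe an iid Exp(theta) stream until m upper records have occurred."""
    recs, cur = [], -np.inf
    while len(recs) < m:
        v = rng.exponential(theta)      # numpy's scale parameter is the mean
        if v > cur:
            cur = v
            recs.append(v)
    return np.array(recs)

# Upper-record spacings from Exp(theta) are iid Exp(theta) (memorylessness),
# so given m record values the MLE of theta is R_m / m.
theta, m = 2.0, 5
est = np.array([first_m_upper_records(rng, theta, m)[-1] / m
                for _ in range(5000)])
print("mean of theta_hat:", est.mean())                   # close to theta
print("MSE  of theta_hat:", ((est - theta) ** 2).mean())  # close to theta^2/m
```

The simulated MSE should come out near θ²/m, consistent with Rm having a Gamma(m, θ) distribution.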

2.
The authors consider a finite population ρ = {(Yk, xk), k = 1, …, N} conforming to a linear superpopulation model with unknown heteroscedastic errors, whose variances are values of a function of the auxiliary variable X that is smooth enough for nonparametric estimation. They describe a method of the Chambers-Dunstan type for estimating the distribution of {Yk, k = 1, …, N} from a sample drawn from ρ without replacement, and determine the asymptotic distribution of its estimation error. They also consider estimation of its mean squared error in particular cases, evaluating both the analytical estimator derived by "plugging in" the asymptotic variance and a bootstrap approach that is also applicable to the estimation of parameters other than the mean squared error. The proposed methods are compared with some common competitors in simulation studies.
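A minimal simulated sketch of a Chambers-Dunstan-type estimator, under a homoscedastic simplification of the paper's setting (the variance function is taken constant, so residuals are pooled without nonparametric variance estimation; the population, sample sizes, and coefficients are invented):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated finite population from a linear superpopulation model
# (homoscedastic simplification; N, n, and coefficients are invented).
N, n = 2000, 200
x = rng.uniform(1, 10, size=N)
y = 2.0 + 1.5 * x + rng.normal(0, 1, size=N)

s = rng.choice(N, size=n, replace=False)      # SRS without replacement
mask = np.zeros(N, dtype=bool)
mask[s] = True

b1, b0 = np.polyfit(x[mask], y[mask], 1)      # working model fit on the sample
resid = y[mask] - (b0 + b1 * x[mask])

def F_cd(t):
    """Chambers-Dunstan-type estimate of the finite-population df at t."""
    in_part = np.sum(y[mask] <= t)
    # For each non-sampled unit, estimate P(y_k <= t) by the empirical df
    # of the sample residuals, shifted by the unit's fitted mean.
    fitted_out = b0 + b1 * x[~mask]
    out_part = np.mean(resid[None, :] <= (t - fitted_out)[:, None], axis=1).sum()
    return (in_part + out_part) / N

t0 = np.median(y)
print("true F_N(t0):", np.mean(y <= t0))
print("CD estimate :", F_cd(t0))
```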

3.
We consider the situation where there is a known regression model that can be used to predict an outcome, Y, from a set of predictor variables X. A new variable B is expected to enhance the prediction of Y. A dataset of size n containing Y, X and B is available, and the challenge is to build an improved model for Y|X,B that uses both the available individual-level data and some summary information obtained from the known model for Y|X. We propose a synthetic data approach, which consists of creating m additional synthetic data observations, and then analyzing the combined dataset of size n + m to estimate the parameters of the Y|X,B model. This combined dataset of size n + m now has missing values of B for m of the observations, and is analyzed using methods that can handle missing data (e.g., multiple imputation). We present simulation studies and illustrate the method using data from the Prostate Cancer Prevention Trial. Though the synthetic data method is applicable to a general regression context, to provide some justification, we show in two special cases that the asymptotic variances of the parameter estimates in the Y|X,B model are identical to those from an alternative constrained maximum likelihood estimation approach. This correspondence in special cases and the method's broad applicability make it appealing for use across diverse scenarios. The Canadian Journal of Statistics 47: 580–603; 2019 © 2019 Statistical Society of Canada
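A toy sketch of the synthetic-data idea (all coefficients are invented, and a single regression imputation stands in for the multiple imputation the authors actually use):

```python
import numpy as np

rng = np.random.default_rng(2)

n, m = 300, 300                      # observed rows, synthetic rows
a0, a1 = 0.8, 1.95                   # "known" Y|X model implied by the truth below

# Individual-level data carrying the new predictor B.
x = rng.normal(size=n)
b = 0.5 * x + rng.normal(size=n)
y = 0.8 + 1.6 * x + 0.7 * b + rng.normal(size=n)   # true Y|X,B model

# Synthetic rows: draw X, generate Y from the known Y|X summary model,
# and leave B missing.
xs = rng.normal(size=m)
ys = a0 + a1 * xs + rng.normal(0, 1.22, size=m)    # sd of Y given X alone

# Single regression imputation of the missing B from the complete cases
# (a crude stand-in for multiple imputation).
gamma, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), x, y]), b, rcond=None)
bs = np.column_stack([np.ones(m), xs, ys]) @ gamma

# Fit Y|X,B on the combined n + m rows.
X_all = np.column_stack([np.ones(n + m),
                         np.concatenate([x, xs]),
                         np.concatenate([b, bs])])
beta, *_ = np.linalg.lstsq(X_all, np.concatenate([y, ys]), rcond=None)
print("estimated (intercept, X, B) coefficients:", beta)
# compare with the true values (0.8, 1.6, 0.7)
```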

4.
Capacity utilization measures have traditionally been constructed as indexes of actual, as compared to “potential,” output. This potential or capacity output (Y*) can be represented within an economic model of the firm as the tangency between the short- and long-run average cost curves. Economic theoretical measures of capacity utilization (CU) can then be characterized as Y/Y* where Y is the realized level of output. These quantity or primal CU measures allow for economic interpretation; they provide explicit inference as to how changes in exogenous variables affect CU. Additional information for analyzing deviations from capacity production can be obtained by assessing the “dual” cost of the gap.

In this article the definitions and representations of primal-output and dual-cost CU measures are formalized within a dynamic model of a monopolistic firm. As an illustration of this approach to characterizing CU measures, a model is estimated for the U.S. automobile industry, 1959–1980, and primal and dual CU indexes are constructed. An application of these indexes to the adjustment of productivity measures for "disequilibrium" is then carried out, using the dual-cost measure.

5.
Results of an exhaustive study of the bias of the least-squares estimator (LSE) of a first-order autoregression coefficient α in a contaminated Gaussian model are presented. The model describes the following situation. The process is defined as Xt = αXt−1 + Yt. Until a specified time T, the Yt are iid normal N(0, 1). At the moment T we start our observations, and from then on the distribution of Yt, t ≥ T, is a Tukey mixture T(ε, σ) = (1 − ε)N(0, 1) + εN(0, σ²). The bias of the LSE as a function of α, ε, and σ² is considered. A rather unexpected fact is revealed: given α and ε, the bias does not change monotonically with σ ("the magnitude of the contaminant"), and similarly, given α and σ, the bias does not grow monotonically with ε ("the amount of contaminants").
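A Monte Carlo sketch of this setup (the sample size, burn-in length, replication count, and parameter values are arbitrary choices): scanning σ for fixed α and ε lets one observe the non-monotone behaviour of the bias reported in the study.

```python
import numpy as np

rng = np.random.default_rng(3)

def lse_bias(alpha, eps, sigma, n=100, burn=200, reps=2000):
    """Monte Carlo bias of the least-squares estimator of alpha when the
    innovations are N(0,1) before observation starts and follow the Tukey
    mixture (1-eps) N(0,1) + eps N(0, sigma^2) afterwards."""
    biases = np.empty(reps)
    for r in range(reps):
        x = 0.0
        for _ in range(burn):                     # pre-observation phase
            x = alpha * x + rng.normal()
        xs = np.empty(n + 1)
        xs[0] = x
        for t in range(1, n + 1):                 # observed, contaminated phase
            s = sigma if rng.random() < eps else 1.0
            xs[t] = alpha * xs[t - 1] + rng.normal(0.0, s)
        a_hat = np.dot(xs[1:], xs[:-1]) / np.dot(xs[:-1], xs[:-1])
        biases[r] = a_hat - alpha
    return biases.mean()

for sig in (1, 3, 5, 10):                         # scan the contaminant scale
    print(f"sigma={sig:>2}: bias = {lse_bias(0.6, 0.1, sig):+.4f}")
```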

6.
Consider k independent observations Yi (i = 1, …, k) from two-parameter exponential populations Πi with location parameters μi and the same scale parameter σ. If the μi are ranked as μp(1) ≤ … ≤ μp(k), consider population Πp(1) as the "worst" population and Πp(k) as the "best" population (with some tagging so that p(1) and p(k) are well defined in the case of equalities). If the Yi are ranked as YR(1) ≤ … ≤ YR(k), we consider the procedure "Select ΠR(k) provided YR(k) − YR(k−1) is sufficiently large, so that ΠR(k) is demonstrably better than the other populations." A similar procedure is studied for selecting the "demonstrably worst" population.

7.
The cost and time of pharmaceutical drug development continue to grow at rates that many say are unsustainable. These trends have enormous impact on what treatments get to patients, when they get them, and how they are used. The statistical framework for supporting decisions in regulated clinical development of new medicines has followed a traditional path of frequentist methodology. Trials using hypothesis tests of "no treatment effect" are done routinely, and the p-value < 0.05 is often the determinant of what constitutes a "successful" trial. Many drugs fail in clinical development, adding to the cost of new medicines, and some evidence points to deficiencies of the frequentist paradigm as part of the blame. An unknown number of effective medicines may have been abandoned because trials were declared "unsuccessful" due to a p-value exceeding 0.05. Recently, the Bayesian paradigm has shown utility in the clinical drug development process for its probability-based inference. We argue for a Bayesian approach that employs data from other trials as a "prior" for Phase 3 trials, so that synthesized evidence across trials can be used to compute probability statements that are valuable for understanding the magnitude of treatment effect. Such a Bayesian paradigm provides a promising framework for improving statistical inference and regulatory decision making.
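A minimal conjugate sketch of the idea (all effect sizes, standard errors, and thresholds are invented): evidence from earlier trials enters as a normal prior, and the Phase 3 update yields direct probability statements about the magnitude of the treatment effect.

```python
from scipy import stats

# Earlier-trial evidence as a normal prior for the treatment effect,
# updated with a Phase 3 estimate (all numbers invented).
prior_mean, prior_sd = 0.30, 0.20     # e.g., synthesized Phase 2 evidence
obs_effect, obs_se = 0.25, 0.10       # Phase 3 estimate and standard error

w = 1 / prior_sd**2 + 1 / obs_se**2   # conjugate normal-normal update
post_mean = (prior_mean / prior_sd**2 + obs_effect / obs_se**2) / w
post_sd = w ** -0.5

# Direct probability statements about the magnitude of the effect.
print("P(effect > 0)    =", 1 - stats.norm.cdf(0.0, post_mean, post_sd))
print("P(effect > 0.15) =", 1 - stats.norm.cdf(0.15, post_mean, post_sd))
```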

8.
Several authors, including the American Statistical Association (ASA) guidelines for undergraduate statistics education [American Statistical Association Undergraduate Guidelines Workgroup (2014), "2014 Curriculum Guidelines for Undergraduate Programs in Statistical Science," Alexandria, VA: American Statistical Association, available at http://www.amstat.org/education/curriculumguidelines.cfm], have noted the challenges facing statisticians when attacking large, complex, and unstructured problems, as opposed to well-defined textbook problems. Clearly, the standard paradigm of selecting the one "correct" statistical method for such problems is not sufficient; a new paradigm is needed. Statistical engineering has been proposed as a discipline that can provide a viable paradigm to attack such problems, used in conjunction with sound statistical science. Of course, to develop as a true discipline, statistical engineering must be clearly defined and articulated. Further, a well-developed underlying theory is needed, one that would prove helpful in addressing such large, complex, and unstructured problems. The purpose of this expository article is to more clearly articulate the current state of statistical engineering, and to make a case for why it merits further study by the profession as a means of addressing such problems. We conclude with a "call to action."

9.
We consider the specific transformation of a Wiener process {X(t), t ≥ 0} in the presence of an absorbing barrier a that results when this process is "time-locked" with respect to its first passage time Ta through a criterion level a, and the evolution of X(t) is considered backwards (retrospectively) from Ta. Formally, we study the random variables defined by Y(t) ≡ X(Ta − t) and derive explicit results for their density and mean, and also for their asymptotic forms. We discuss how our results can aid interpretations of time series "response-locked" to their times of crossing a criterion level.
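A simulation sketch of the time-locked view (the step size, look-back horizon, and path counts are arbitrary; paths that fail to cross within the step budget are simply discarded):

```python
import numpy as np

rng = np.random.default_rng(4)

# Average Wiener paths backwards from their first passage through level a:
# Y(t) = X(T_a - t).
a, dt, lookback, max_steps = 1.0, 0.01, 1.0, 50_000
k = int(lookback / dt)
acc, count = np.zeros(k), 0

for _ in range(2000):
    path = np.concatenate([[0.0],
                           np.cumsum(np.sqrt(dt) * rng.normal(size=max_steps))])
    hits = np.nonzero(path >= a)[0]
    if hits.size and hits[0] > k:          # crossed, with enough history
        i = hits[0]
        acc += path[i - k:i][::-1]         # X(T_a - dt), X(T_a - 2*dt), ...
        count += 1

mean_Y = acc / count
print("E[Y(t)] at t = 0.1, 0.5, 1.0:",
      mean_Y[int(0.1 / dt) - 1], mean_Y[int(0.5 / dt) - 1], mean_Y[-1])
```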

10.
Random samples are assumed for the univariate two-sample problem. Sometimes this assumption may be violated in that an observation in one "sample", of size m, is from a population different from that yielding the remaining m − 1 observations (which are a random sample). Then the interest is in whether this random sample of size m − 1 is from the same population as the other random sample. If such a violation occurs and can be recognized, and the non-conforming observation can also be identified (without imposing conditional effects), then that observation could be removed and a two-sample test applied to the remaining samples. Unfortunately, satisfactory procedures for such a removal do not seem to exist. An alternative approach is to use two-sample tests whose significance levels remain the same when a non-conforming observation occurs, and is removed, as for the case where the samples were both truly random. The equal-tail median test is shown to have this property when the two "samples" are of the same size (and ties do not occur).

11.
Suppose (X, Y) has Downton's bivariate exponential distribution with correlation ρ. For a random sample of size n from (X, Y), let Xr:n be the rth X-order statistic and Y[r:n] be its concomitant. We investigate estimators of ρ when all the parameters are unknown and the available data are an incomplete bivariate sample made up of (i) all the Y-values and the ranks of the associated X-values, i.e. (i, Y[i:n]), 1 ≤ i ≤ n, and (ii) a Type II right-censored bivariate sample consisting of (Xi:n, Y[i:n]), 1 ≤ i ≤ r < n. In both setups, we use simulation to examine the bias and mean square errors of several estimators of ρ and obtain their estimated relative efficiencies. The preferred estimator under (i) is a function of the sample correlation of the (Yi:n, Y[i:n]) values, and under (ii), a method-of-moments estimator involving the regression function is preferred.

12.
Let F(x, y) be the distribution function of a two-dimensional random variable (X, Y). We assume that the distribution function FX(x) of the random variable X is known. The variable X will be called an auxiliary variable. Our purpose is estimation of the expected value m = E(Y) on the basis of the two-dimensional simple sample U = [(X1, Y1), …, (Xn, Yn)] = [X, Y], where X = [X1, …, Xn] and Y = [Y1, …, Yn]. This sample is drawn from a distribution determined by the function F(x, y). Let X(k) be the k-th (k = 1, …, n) order statistic determined on the basis of the sample X. The sample U is truncated by means of this order statistic into two sub-samples: Uk,1, consisting of the pairs with Xi ≤ X(k), and Uk,2, consisting of the remaining pairs. Let Ȳk,1 and Ȳk,2 be the sample means from the sub-samples Uk,1 and Uk,2, respectively. The linear combination m̂k = FX(x(k)) Ȳk,1 + [1 − FX(x(k))] Ȳk,2 of these means is the conditional estimator of the expected value m; the coefficients of this linear combination depend on the distribution function of the auxiliary variable at the point x(k). We can show that this statistic is a conditionally as well as unconditionally unbiased estimator of the average m. The variance of this estimator is derived, and the variance of the statistic m̂k is compared with the variance of the ordinary sample mean. A generalization of the conditional estimation of the mean is considered, too.
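A small numeric sketch of this conditional estimator as reconstructed above (the weights FX(x(k)) and 1 − FX(x(k)) follow the abstract's description of the coefficients; the population, sample size, and k are invented):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

n, k = 50, 25
x = rng.normal(size=n)                    # auxiliary variable; F_X known: N(0,1)
y = 3.0 + 2.0 * x + rng.normal(size=n)    # so m = E(Y) = 3

xk = np.sort(x)[k - 1]                    # k-th order statistic X_(k)
low = x <= xk                             # sub-sample U_{k,1}; the rest is U_{k,2}
w = stats.norm.cdf(xk)                    # F_X(x_(k))

m_hat = w * y[low].mean() + (1 - w) * y[~low].mean()
print("conditional estimate of m:", m_hat)
print("ordinary sample mean     :", y.mean())
```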

13.
The present note explores sources of misplaced criticisms of P-values, such as conflicting definitions of "significance levels" and "P-values" in authoritative sources, and the consequent misinterpretation of P-values as error probabilities. It then discusses several properties of P-values that have been presented as fatal flaws: that P-values exhibit extreme variation across samples (and thus are "unreliable"), confound effect size with sample size, are sensitive to sample size, and depend on investigator sampling intentions. These properties are often criticized from a likelihood or Bayesian framework, yet they are exactly the properties P-values should exhibit when they are constructed and interpreted correctly within their originating framework. Other common criticisms are that P-values force users to focus on irrelevant hypotheses and overstate evidence against those hypotheses. These problems are not, however, properties of P-values themselves but are faults of researchers who focus on null hypotheses and overstate evidence based on the misperception that p = 0.05 represents enough evidence to reject hypotheses. Those problems are easily seen without use of Bayesian concepts by translating the observed P-value p into the Shannon information (S-value or surprisal) −log2(p).
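The S-value translation is a one-line computation; a quick sketch:

```python
import numpy as np

# Surprisal (S-value) translation of a P-value: s = -log2(p) bits.
for p in (0.05, 0.01, 0.005, 0.25):
    print(f"p = {p:<6} ->  S = {-np.log2(p):.2f} bits")
# p = 0.05 carries only about 4.3 bits against the model, roughly the
# surprise of seeing 4-5 heads in a row from a fair coin.
```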

14.
Suppose that we have a nonparametric regression model Y = m(X) + ε with X ∈ R^p, where X is a random design variable and is observed completely, and Y is the response variable with some Y-values missing at random. Based on the "complete" data sets for Y obtained by nonparametric regression imputation and by inverse probability weighted imputation, two estimators of the regression function m(x0) for fixed x0 ∈ R^p are proposed. Asymptotic normality of the two estimators is established and used to construct normal approximation-based confidence intervals for m(x0). We also construct an empirical likelihood (EL) statistic for m(x0), with a limiting chi-squared distribution with one degree of freedom, which is used to construct an EL confidence interval for m(x0).
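A sketch of the regression-imputation half of this setup (the Gaussian kernel, bandwidth, and missingness mechanism are arbitrary choices, and no claim is made about matching the authors' asymptotics): missing Y-values are filled in by a Nadaraya-Watson fit on the observed pairs, and m(x0) is then re-estimated from the completed data.

```python
import numpy as np

rng = np.random.default_rng(6)

def nw(x0, x, y, h):
    """Nadaraya-Watson estimate of m(x0) with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

# Data with Y missing at random (missingness depends only on x).
n = 500
x = rng.uniform(-2, 2, size=n)
y = np.sin(x) + rng.normal(0, 0.3, size=n)
obs = rng.random(n) < 1 / (1 + np.exp(x - 1))   # P(observed | x)

# Regression imputation: fill missing Y with the fit on observed pairs.
h = 0.3
y_imp = y.copy()
y_imp[~obs] = np.array([nw(xi, x[obs], y[obs], h) for xi in x[~obs]])

x0 = 0.5
print("true m(x0)         :", np.sin(x0))
print("imputation estimate:", nw(x0, x, y_imp, h))
```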

15.
Latent class analysis (LCA) has been found to have important applications in social and behavioral sciences for modeling categorical response variables, and nonresponse is typical when collecting data. In this study, the nonresponse mainly included “contingency questions” and real “missing data.” The primary objective of this research was to evaluate the effects of some potential factors on model selection indices in LCA with nonresponse data.

We simulated missing data with contingency questions and evaluated the accuracy rates of eight information criteria for selecting the correct models. The results showed that the main factors are latent class proportions, conditional probabilities, sample size, the number of items, the missing-data rate, and the contingency-data rate. Interactions of the conditional probabilities with class proportions, sample size, and the number of items are also significant. From our simulation results, the impact of missing data and contingency questions can be mitigated by increasing the sample size or the number of items.


16.
Let {X1, …, Xn} and {Y1, …, Ym} be two samples of independent and identically distributed observations with common continuous cumulative distribution functions F(x) = P(X ≤ x) and G(y) = P(Y ≤ y), respectively. In this article, we would like to test the no-quantile-treatment-effect hypothesis H0: F = G. We develop a bootstrap quantile-treatment-effect test procedure for testing H0 under the location-scale shift model. Our test procedure avoids the calculation of the check function (which is non-differentiable at the origin and makes solving for the quantile effects difficult in typical quantile regression analysis). The limiting null distribution of the test procedure is derived, and the procedure is shown to be consistent against a broad family of alternatives. Simulation studies show that our proposed test procedure attains a type I error rate close to the pre-chosen significance level even for small sample sizes. Our test procedure is illustrated with two real data sets on the lifetimes of guinea pigs from a treatment-control experiment.
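A generic pooled-bootstrap stand-in for testing H0: F = G through empirical quantile differences (this is not the authors' statistic or calibration; the quantile grid, bootstrap size, and data are invented):

```python
import numpy as np

rng = np.random.default_rng(7)

def qte_stat(x, y, taus):
    """Max absolute difference of empirical quantiles over a grid."""
    return np.max(np.abs(np.quantile(y, taus) - np.quantile(x, taus)))

def pooled_bootstrap_test(x, y, taus=np.linspace(0.1, 0.9, 9), B=2000):
    """Pooled-bootstrap p-value for H0: F = G (a generic stand-in, not the
    authors' exact statistic or calibration)."""
    t_obs = qte_stat(x, y, taus)
    pooled = np.concatenate([x, y])
    t_boot = np.empty(B)
    for i in range(B):
        xb = rng.choice(pooled, size=len(x), replace=True)
        yb = rng.choice(pooled, size=len(y), replace=True)
        t_boot[i] = qte_stat(xb, yb, taus)
    return np.mean(t_boot >= t_obs)

x = rng.normal(0.0, 1.0, size=80)
y = rng.normal(0.8, 1.0, size=80)        # location shift: expect rejection
print("p-value:", pooled_bootstrap_test(x, y))
```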

17.
The null distribution of Wilks' likelihood-ratio criterion for testing independence of several groups of variables in a multivariate normal population is derived. Percentage points are tabulated for various values of the sample size N and partitions of p, the number of variables. This paper extends Mathai and Katiyar's (1979) "sphericity" results and tables.
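For orientation, a sketch of the criterion itself with a first-order chi-square approximation (the degrees of freedom follow the standard asymptotic result; the exact percentage points are what the paper tabulates):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)

def wilks_independence(X, groups):
    """Wilks' criterion V = |S| / prod |S_gg| for independence of groups of
    variables, with a first-order chi-square approximation to its null law."""
    S = np.cov(X, rowvar=False)
    log_v = np.linalg.slogdet(S)[1]
    start = 0
    for g in groups:                       # subtract log|S_gg| for each group
        idx = slice(start, start + g)
        log_v -= np.linalg.slogdet(S[idx, idx])[1]
        start += g
    p = sum(groups)
    df = (p**2 - sum(g**2 for g in groups)) // 2
    p_value = stats.chi2.sf(-len(X) * log_v, df)
    return np.exp(log_v), p_value

X = rng.normal(size=(200, 5))                # five independent variables
print(wilks_independence(X, groups=[2, 3]))  # V near 1, large p-value
```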

18.
This article deals with the estimation of R = P{X < Y}, where X and Y are independent random variables from the geometric and exponential distributions, respectively. For complete samples, the MLE of R, its asymptotic distribution, and a confidence interval based on it are obtained. The procedure for deriving a bootstrap-p confidence interval is presented. The UMVUE of R and the UMVUE of its variance are derived. The Bayes estimator of R is investigated and its Lindley approximation is obtained. A simulation study is performed in order to compare these estimators. Finally, all point estimators for a right-censored sample from the exponential distribution are obtained.
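A plug-in Monte Carlo sketch of R = P(X < Y) (the support convention X ∈ {1, 2, …} for the geometric and all parameter values are assumptions):

```python
import numpy as np

rng = np.random.default_rng(9)

p_true, lam_true, n = 0.4, 0.5, 2000
x = rng.geometric(p_true, size=n)        # support {1, 2, ...}
y = rng.exponential(1 / lam_true, size=n)

p_hat = 1 / x.mean()                     # MLE of p on support {1, 2, ...}
lam_hat = 1 / y.mean()                   # MLE of the exponential rate

k = np.arange(1, 200)                    # truncated series for R
def R(p, lam):
    # R = sum_k P(X = k) P(Y > k) = sum_k p (1-p)^(k-1) exp(-lam k)
    return np.sum(p * (1 - p) ** (k - 1) * np.exp(-lam * k))

print("empirical R:", np.mean(x < y))
print("plug-in R  :", R(p_hat, lam_hat))
print("true R     :", R(p_true, lam_true))
```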

19.
Let X1, …, Xn and Y1, …, YN be consecutive samples from a distribution function F which is itself randomly chosen according to the Ferguson (1973) Dirichlet-process prior distribution on the space of distribution functions. Typically, prediction intervals employ the observations X1, …, Xn in the first sample in order to predict a specified function of the future sample Y1, …, YN. Here, one- and two-sided prediction intervals for at least q of N future observations are developed for the situation in which, in addition to the previous sample, there is prior information available. The information is specified via the parameter α of the Dirichlet-process prior distribution.
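A simulation sketch of the ingredients (the urn-scheme predictive is the standard Blackwell-MacQueen representation for the Dirichlet process; the calibration shown is a naive Monte Carlo stand-in for the paper's intervals, and all numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(10)

def dp_predictive_sample(rng, x_obs, alpha, base_sampler, N):
    """One future sample of size N from the Dirichlet-process posterior
    predictive (Blackwell-MacQueen urn): a new value comes from the base
    measure w.p. alpha/(alpha + #seen), else repeats a seen value."""
    seen = list(x_obs)
    out = []
    for _ in range(N):
        if rng.random() < alpha / (alpha + len(seen)):
            v = base_sampler()
        else:
            v = seen[rng.integers(len(seen))]
        seen.append(v)
        out.append(v)
    return np.array(out)

x = rng.normal(1.0, 1.0, size=20)        # past sample
alpha, N, q = 2.0, 10, 8                 # prior mass, future size, target count
sims = np.sort([dp_predictive_sample(rng, x, alpha, rng.standard_normal, N)
                for _ in range(4000)], axis=1)

# "At least q of N exceed L" iff the q-th largest future value exceeds L,
# so the 5th percentile of that order statistic is a ~95% lower bound.
qth_largest = sims[:, N - q]
print("~95% lower prediction bound for at least", q, "of", N, ":",
      np.quantile(qth_largest, 0.05))
```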

20.
In this paper we study the procedures of Dudewicz and Dalal (1975), and the modifications suggested by Rinott (1978), for selecting the largest mean from k normal populations with unknown variances. We look at the case k = 2 in detail, because there is an optimal allocation scheme here. We do not really allocate the total number of samples into two groups; rather, we estimate this optimal sample size as well, so as to guarantee a probability of correct selection (written as P(CS)) of at least P*, 1/2 < P* < 1. We prove that the procedure of Rinott is "asymptotically inefficient" (to be defined below) in the sense of Chow and Robbins (1965) for any k ≥ 2. Next, we propose two-stage procedures having all the properties of Rinott's procedure, together with the property of "asymptotic efficiency", which is highly desirable.
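A sketch of Rinott-style two-stage sampling for k = 2 (the constant h must be read from Rinott's tables; the value used here is an assumed placeholder, as are the population parameters):

```python
import numpy as np

rng = np.random.default_rng(11)

def rinott_total_n(first_stage, h, delta):
    """Total sample size N_i = max(n0, ceil((h * s_i / delta)^2)) with s_i
    the first-stage standard deviation (h is read from Rinott's tables)."""
    n0 = len(first_stage)
    s = first_stage.std(ddof=1)
    return max(n0, int(np.ceil((h * s / delta) ** 2)))

# Two normal populations with unknown, unequal variances; select the
# larger mean with P(CS) >= P* whenever the means differ by >= delta.
n0, delta, h = 20, 0.5, 2.8        # h assumed here, not taken from the tables
pops = [(0.0, 1.0), (0.7, 2.0)]    # (mean, sd), invented
means = []
for mu, sd in pops:
    stage1 = rng.normal(mu, sd, size=n0)
    N = rinott_total_n(stage1, h, delta)
    stage2 = rng.normal(mu, sd, size=N - n0)
    means.append(np.concatenate([stage1, stage2]).mean())
print("selected population:", int(np.argmax(means)) + 1)
```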
