Similar Documents
20 similar documents found.
1.
As an alternative to an estimation based on a simple random sample (BLUE-SRS) for the simple linear regression model, Moussa-Hamouda and Leone [E. Moussa-Hamouda and F.C. Leone, The o-blue estimators for complete and censored samples in linear regression, Technometrics, 16(3) (1974), pp. 441–446] discussed the best linear unbiased estimators based on order statistics (BLUE-OS), and showed that BLUE-OS is more efficient than BLUE-SRS for normal data. Using ranked set sampling, Barreto and Barnett [M.C.M. Barreto and V. Barnett, Best linear unbiased estimators for the simple linear regression model using ranked set sampling, Environ. Ecol. Stat. 6 (1999), pp. 119–133] derived the best linear unbiased estimators (BLUE-RSS) for the simple linear regression model and showed that BLUE-RSS is more efficient than BLUE-SRS for estimating the regression parameters (intercept and slope) for normal data, but not for estimating the residual standard deviation when the sample size is small. As an alternative to RSS, this paper considers the best linear unbiased estimators based on order statistics from a ranked set sample (BLUE-ORSS) and shows that BLUE-ORSS is uniformly more efficient than BLUE-RSS and BLUE-OS for normal data.

2.
An internal pilot with interim analysis (IPIA) design combines interim power analysis (an internal pilot) with interim data analysis (two-stage group sequential). We provide IPIA methods for single df hypotheses within the Gaussian general linear model, including one and two group t tests. The design allows early stopping for efficacy and futility while also re-estimating sample size based on an interim variance estimate. Study planning in small samples requires the exact and computable forms reported here. The formulation gives fast and accurate calculations of power, Type I error rate, and expected sample size.

3.
Consider k (k ≥ 1) independent Weibull populations and a control population which is also Weibull. The problem of identifying which of these k populations are better than the control, using the shape parameter as a criterion, is considered. We allow the possibility of making at most m (0 ≤ m < k) incorrect identifications of better populations. This allowance results in significant savings in sample size. Procedures based on simple linear unbiased estimators of the reciprocals of the shape parameters of these populations are proposed. These procedures can be used for both complete and Type II-censored samples. A related problem of confidence intervals for the ratio of ordered shape parameters is also considered. Solutions are obtained by Monte Carlo simulation as well as by chi-square and normal approximations.

4.
In this paper, we translate variable selection for linear regression into a multiple testing problem and select significant variables according to the testing results. New variable selection procedures are proposed based on the optimal discovery procedure (ODP) in multiple testing. Owing to the ODP's optimality, for a given number of significant variables included, it includes fewer nonsignificant variables than marginal p-value based methods. Consistency of our procedures is established both theoretically and in simulation. Simulation results suggest that procedures based on multiple testing improve on procedures based on selection criteria, and that our new procedures perform better than marginal p-value based procedures.

5.
In this article, lower bounds for the expected sample size of sequential selection procedures are constructed for the problem of selecting the most probable event of a k-variate multinomial distribution. The study is based on Volodin's universal lower bounds for the expected sample size of statistical inference procedures. The obtained lower bounds are used to estimate the efficiency of some selection procedures in terms of their expected sample sizes.

6.
Communications in Statistics: Theory and Methods, 2012, 41(13-14): 2465–2489
The Akaike information criterion, AIC, and Mallows' Cp statistic have been proposed for selecting a smaller number of regressors in multivariate regression models with a fully unknown covariance matrix. All of these criteria are, however, based on the implicit assumption that the sample size is substantially larger than the dimension of the covariance matrix. To obtain a stable estimator of the covariance matrix, the dimension of the covariance matrix must be much smaller than the sample size. When the dimension is close to the sample size, it is necessary to use ridge-type estimators for the covariance matrix. In this article, we use a ridge-type estimator for the covariance matrix and obtain the modified AIC and modified Cp statistic under an asymptotic theory in which both the sample size and the dimension go to infinity. It is numerically shown that these modified procedures perform very well, in the sense of selecting the true model, in large-dimensional cases.
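The ridge-type stabilization described above can be sketched in a few lines; the generic estimator S + λI below is an illustration only (the article's specific estimator and the choice of λ may differ):

```python
import numpy as np

def ridge_cov(Y, lam):
    """Ridge-type covariance estimator S + lam * I (a generic form;
    stabilizes the sample covariance S when the dimension p is close
    to the sample size n and S is near-singular)."""
    n, p = Y.shape
    Yc = Y - Y.mean(axis=0)          # center each column
    S = Yc.T @ Yc / n                # sample covariance
    return S + lam * np.eye(p)

rng = np.random.default_rng(0)
Y = rng.normal(size=(30, 25))        # n only slightly above p: S ill-conditioned
S_plain = ridge_cov(Y, 0.0)
S_ridge = ridge_cov(Y, 0.5)
```

Adding λ to every eigenvalue bounds the smallest eigenvalue away from zero, which is what makes log-determinant terms in AIC-type criteria stable when p is close to n.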

7.
ABSTRACT

Despite the popularity of the general linear mixed model for data analysis, power and sample size methods and software are not generally available for commonly used test statistics and reference distributions. Statisticians resort to simulations with homegrown and uncertified programs or rough approximations which are misaligned with the data analysis. For a wide range of designs with longitudinal and clustering features, we provide accurate power and sample size approximations for inference about fixed effects in the linear models we call reversible. We show that under widely applicable conditions, the general linear mixed-model Wald test has noncentral distributions equivalent to well-studied multivariate tests. In turn, exact and approximate power and sample size results for the multivariate Hotelling–Lawley test provide exact and approximate power and sample size results for the mixed-model Wald test. The calculations are easily computed with a free, open-source product that requires only a web browser to use. Commercial software can be used for a smaller range of reversible models. Simple approximations allow accounting for modest amounts of missing data. A real-world example illustrates the methods. Sample size results are presented for a multicenter study on pregnancy. The proposed study, an extension of a funded project, has clustering within clinic. Exchangeability among the participants allows averaging across them to remove the clustering structure. The resulting simplified design is a single-level longitudinal study. Multivariate methods for power provide an approximate sample size. All proofs and inputs for the example are in the supplementary materials (available online).

8.
We propose two new procedures based on multiple hypothesis testing for correct support estimation in high-dimensional sparse linear models. We prove that both procedures are powerful and do not require the sample size to be large. The first procedure tackles the atypical setting of ordered variable selection through an extension of a testing procedure previously developed in the context of a linear hypothesis. The second procedure is the main contribution of this paper. It enables data analysts to perform support estimation in the general high-dimensional framework of non-ordered variable selection. A thorough simulation study and applications to real datasets using the R package mht show that our non-ordered variable procedure produces excellent results in terms of correct support estimation, as well as in terms of mean square error and false discovery rate, when compared to common methods such as the Lasso, the SCAD penalty, forward regression, and the false discovery rate (FDR) procedure.

9.
Sample size calculation is a critical issue in clinical trials because a small sample size leads to biased inference while a large sample size increases cost. With the development of advanced medical technology, some patients can be cured of certain chronic diseases, and the proportional hazards mixture cure model has been developed to handle survival data with potential cure information. Given the needs of survival trials with potential cure proportions, a corresponding sample size formula based on the log-rank test statistic for binary covariates was proposed by Wang et al. [25]. However, a sample size formula based on continuous variables has not been developed. Herein, we present sample size and power calculations for the mixture cure model with continuous variables based on the log-rank method, and further modify them by Ewell's method. The proposed approaches are evaluated using simulation studies on synthetic data from exponential and Weibull distributions. A program for calculating the necessary sample size for continuous covariates in a mixture cure model is implemented in R.
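The cure-model formulas themselves are not reproduced in this abstract; as a point of reference, the classical Schoenfeld log-rank calculation that such formulas extend can be sketched as follows (the function name is illustrative, and no cure fraction is modeled here):

```python
import math
from scipy.stats import norm

def schoenfeld_events(hr, alpha=0.05, power=0.8, p=0.5):
    """Classical Schoenfeld formula: number of events needed by a
    two-sided, two-arm log-rank test to detect hazard ratio `hr`
    (p = allocation proportion in arm 1; no cure fraction)."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return math.ceil(z ** 2 / (p * (1 - p) * math.log(hr) ** 2))

d_events = schoenfeld_events(hr=0.7)   # required events, not patients
```

The formula counts events rather than patients; converting events to a patient sample size requires the anticipated event probability, which is exactly where a cure proportion enters in mixture cure extensions.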

10.
Abstract

Sample size calculation is an important component in designing an experiment or a survey. In a wide variety of fields—including management science, insurance, and biological and medical science—truncated normal distributions are encountered in many applications. However, the sample size required for the left-truncated normal distribution has not been investigated, because the distribution of the sample mean from the left-truncated normal distribution is complex and difficult to obtain. This paper compares an ad hoc approach with two newly proposed methods, based on the Central Limit Theorem and on a high-degree saddlepoint approximation, for calculating the sample size required to attain a prespecified power. As shown by simulations and by an example of health insurance cost in China, the ad hoc approach underestimates the sample size required to achieve the prespecified power. The method based on the high-degree saddlepoint approximation provides valid sample size and power calculations, and it performs better than the Central Limit Theorem. When the sample size is not too small, the Central Limit Theorem also provides a valid, but relatively simple, tool for approximating the sample size.
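A minimal sketch of the CLT-based route described above, assuming a one-sided test of a mean shift `delta` with a known left-truncation point (the function name and test setup are illustrative, not the paper's):

```python
import numpy as np
from scipy.stats import norm, truncnorm

def clt_sample_size(mu, sigma, trunc_left, delta, alpha=0.05, power=0.8):
    """CLT-based sample size for a one-sided test of a mean shift
    `delta` when sampling from N(mu, sigma^2) left-truncated at
    `trunc_left`: plug the truncated variance into the usual formula."""
    a = (trunc_left - mu) / sigma                       # standardized cut point
    _, var_t = truncnorm.stats(a, np.inf, loc=mu, scale=sigma, moments='mv')
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    return int(np.ceil(float(var_t) * z ** 2 / delta ** 2))

# truncation at one sd below the mean shrinks the variance, hence n:
n_trunc = clt_sample_size(mu=0.0, sigma=1.0, trunc_left=-1.0, delta=0.2)
```

Because left truncation reduces the variance, the CLT-based n is smaller than the untruncated-normal answer; the paper's point is that in small samples this normal approximation itself can be inadequate, motivating the saddlepoint refinement.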

11.
Approximations to the noncentral F distribution yield surprisingly accurate results for power and sample size problems arising from linear hypotheses about normal random variables. The approximations are easy to use with a desk (or hand-held) calculator that computes cumulative F probabilities. These approximations are particularly advantageous for testing the hypothesis that differences among the means are small against the alternative that the differences are large.
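The power calculation that such approximations target can be illustrated directly with the noncentral F distribution (computed exactly here via scipy rather than approximated; the helper name is illustrative):

```python
from scipy.stats import f, ncf

def anova_power(alpha, df1, df2, lam):
    """Power of the F test of a linear hypothesis: P(F' > F_crit),
    where F' is noncentral F with noncentrality `lam` under H1."""
    fcrit = f.ppf(1 - alpha, df1, df2)         # critical value under H0
    return 1 - ncf.cdf(fcrit, df1, df2, lam)   # tail probability under H1

# e.g. a one-way layout with 4 groups of 10 (df1 = 3, df2 = 36):
power = anova_power(0.05, 3, 36, 12.0)
```

Power is monotone in the noncentrality parameter, which bundles the effect sizes and the sample size; sample size problems invert this relation numerically.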

12.
Maximum likelihood estimation is investigated in the context of linear regression models under partial independence restrictions. These restrictions assume a kind of completeness of a set of predictors Z, in the sense that Z is sufficient to explain the dependence between an outcome Y and predictors X: ℒ(Y|Z, X) = ℒ(Y|Z), where ℒ(·|·) stands for the conditional distribution. From a practical point of view, this model is particularly interesting in a double sampling scheme where Y and Z are measured together on a first sample, and Z and X on a second, separate sample. In that case, estimation procedures are close to those developed in the study of double regression by Engel & Walstra (1991) and Causeur & Dhorne (1998). Properties of the estimators are derived in both small-sample and asymptotic frameworks, and the procedure is illustrated by an example from the food industry.

13.
Sample Size     
Conventionally, sample size calculations are viewed as calculations determining the right number of subjects needed for a study. Such calculations follow the classical paradigm: “for a difference X, I need sample size Y.” We argue that the paradigm “for a sample size Y, I get information Z” is more appropriate for many studies and reflects the information needed by scientists when planning a study. This approach applies to both physiological studies and Phase I and II interventional studies. We provide actual examples from our own consulting work to demonstrate this. We conclude that sample size should be viewed not as a unique right number, but rather as a factor needed to assess the utility of a study.
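The paradigm "for a sample size Y, I get information Z" can be illustrated by inverting the usual calculation: fix n and report the minimal detectable difference (a normal-approximation sketch, not one of the authors' consulting examples):

```python
from scipy.stats import norm

def detectable_difference(n, sigma, alpha=0.05, power=0.8):
    """Minimal detectable mean difference for a two-arm comparison
    with n subjects per arm and common sd sigma (two-sided test,
    normal approximation to the two-sample t test)."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z * sigma * (2.0 / n) ** 0.5

# "for a sample size of 30 per arm, I can detect a difference of about 0.72 sd"
delta = detectable_difference(30, 1.0)
```

Reading the calculation in this direction tells the scientist what the planned study can and cannot resolve, rather than prescribing a single "right" n.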

14.
G. Aneiros, F. Ferraty & P. Vieu, Statistics, 2015, 49(6): 1322–1347
The problem of variable selection is considered in high-dimensional partial linear regression under a model allowing for a possibly functional variable. The procedure studied is nonconcave-penalized least squares. The existence of a √(n/sn)-consistent estimator for the vector of the pn linear parameters in the model is shown, even when pn tends to ∞ as the sample size n increases (sn denotes the number of influential variables). An oracle property is also obtained for the variable selection method, and the nonparametric rate of convergence is stated for the estimator of the nonlinear functional component of the model. Finally, a simulation study illustrates the finite-sample performance of our procedure.

15.
One of the general problems in clinical trials and mortality studies is the comparison of competing risks. Most of the test statistics used for independent and dependent risks with censored data belong to the class of weighted linear rank tests in its multivariate version. In this paper, we introduce saddlepoint approximations as accurate and fast approximations to the exact p-values of this class of tests, in place of asymptotic calculations and permutation simulations. Real data examples and extensive simulation studies show the accuracy and stability of the saddlepoint approximations over different scenarios of lifetime distributions, sample sizes, and censoring.

16.
Abstract

In a quantitative linear model with errors following a stationary Gaussian, first-order autoregressive or AR(1) process, Generalized Least Squares (GLS) on raw data and Ordinary Least Squares (OLS) on prewhitened data are efficient methods of estimation of the slope parameters when the autocorrelation parameter of the error AR(1) process, ρ, is known. In practice, ρ is generally unknown. In the so-called two-stage estimation procedures, ρ is then estimated first before using the estimate of ρ to transform the data and estimate the slope parameters by OLS on the transformed data. Different estimators of ρ have been considered in previous studies. In this article, we study nine two-stage estimation procedures for their efficiency in estimating the slope parameters. Six of them (i.e., three noniterative, three iterative) are based on three estimators of ρ that have been considered previously. Two more (i.e., one noniterative, one iterative) are based on a new estimator of ρ that we propose: it is provided by the sample autocorrelation coefficient of the OLS residuals at lag 1, denoted r(1). Lastly, REstricted Maximum Likelihood (REML) represents a different type of two-stage estimation procedure whose efficiency has not been compared to the others yet. We also study the validity of the testing procedures derived from GLS and the nine two-stage estimation procedures. Efficiency and validity are analyzed in a Monte Carlo study. Three types of explanatory variable x in a simple quantitative linear model with AR(1) errors are considered in the time domain: Case 1, x is fixed; Case 2, x is purely random; and Case 3, x follows an AR(1) process with the same autocorrelation parameter value as the error AR(1) process. In a preliminary step, the number of inadmissible estimates and the efficiency of the different estimators of ρ are compared empirically, whereas their approximate expected value in finite samples and their asymptotic variance are derived theoretically. 
Thereafter, the efficiency of the estimation procedures and the validity of the derived testing procedures are discussed in terms of the sample size and the magnitude and sign of ρ. The noniterative two-stage estimation procedure based on the new estimator of ρ is shown to be more efficient for moderate values of ρ at small sample sizes. With the exception of small sample sizes, REML and its derived F-test perform the best overall. The asymptotic equivalence of two-stage estimation procedures, besides REML, is observed empirically. Differences related to the nature, fixed or random (uncorrelated or autocorrelated), of the explanatory variable are also discussed.
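A noniterative two-stage procedure of the kind studied here, with ρ estimated by the lag-1 residual autocorrelation r(1), can be sketched as follows (a simplified Cochrane-Orcutt transformation; the article's nine procedures differ in details such as iteration and the choice of estimator of ρ):

```python
import numpy as np

def two_stage_ar1(x, y):
    """Noniterative two-stage slope estimation for y = b0 + b1*x + e
    with AR(1) errors. Stage 1: OLS residuals give r(1), the lag-1
    autocorrelation. Stage 2: OLS on the prewhitened data."""
    X = np.column_stack([np.ones_like(x), x])
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta_ols
    r1 = (e[1:] @ e[:-1]) / (e @ e)     # lag-1 autocorrelation of residuals
    ys = y[1:] - r1 * y[:-1]            # prewhitened response
    Xs = X[1:] - r1 * X[:-1]            # prewhitened design
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return beta, r1
```

The transformed intercept column equals (1 - r1), so the returned coefficients estimate b0 and b1 directly; the sketch drops the first observation rather than applying the Prais-Winsten correction.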

17.
We consider approximate Bayesian inference about scalar parameters of linear regression models with possible censoring. A second-order expansion of their Laplace posterior is seen to have a simple and intuitive form for logconcave error densities with nondecreasing hazard functions. The accuracy of the approximations is assessed for normal and Gumbel errors when the number of regressors increases with sample size. Perturbations of the prior and the likelihood are seen to be easily accommodated within our framework. Links with the work of DiCiccio et al. (1990) and Viveros and Sprott (1987) extend the applicability of our results to conditional frequentist inference based on likelihood-ratio statistics.

18.
The standard Schwarz information criterion for testing a change-point in regression models is considered, and two new test procedures are developed. The case of small sample size is investigated. Numerical approximations to the power against various alternatives are given and compared with the powers of tests based on r-ahead recursive residuals and of the CUSUM of squares test. An application of these procedures to real data is also provided.
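A Schwarz-criterion change-point scan of the general kind considered here can be sketched as follows (a generic single change-point version assuming a common error variance; the article's two new procedures are not reproduced):

```python
import numpy as np

def _rss(xs, ys):
    """Residual sum of squares of a simple linear fit y = a + b*x."""
    X = np.column_stack([np.ones_like(xs), xs])
    beta, *_ = np.linalg.lstsq(X, ys, rcond=None)
    return float(np.sum((ys - X @ beta) ** 2))

def sic_changepoint(x, y, min_seg=5):
    """Schwarz (SIC/BIC) scan for one change-point in y = a + b*x + e.
    Returns (tau, sic) for the best split, or (None, sic) when the
    no-change model wins. Gaussian errors, common variance assumed."""
    n = len(y)
    sic0 = n * np.log(_rss(x, y) / n) + 2 * np.log(n)    # 2 coefficients
    best_tau, best = None, sic0
    for tau in range(min_seg, n - min_seg + 1):
        rss = _rss(x[:tau], y[:tau]) + _rss(x[tau:], y[tau:])
        sic = n * np.log(rss / n) + 5 * np.log(n)        # 4 coefs + change-point
        if sic < best:
            best, best_tau = sic, tau
    return best_tau, best
```

The scan declares a change-point only when the fit improvement outweighs the log(n) penalty for the extra parameters, which is what distinguishes SIC-based tests from raw residual-based scans.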

19.
In this article, we propose a kernel-based estimator for the finite-dimensional parameter of a partially additive linear quantile regression model. For dependent processes that are strictly stationary and absolutely regular, we establish a precise convergent rate and show that the estimator is root-n consistent and asymptotically normal. To help facilitate inferential procedures, a consistent estimator for the asymptotic variance is also provided. In addition to conducting a simulation experiment to evaluate the finite sample performance of the estimator, an application to US inflation is presented. We use the real-data example to motivate how partially additive linear quantile models can offer an alternative modeling option for time-series data.

20.
The statistical inference problem on effect size indices is addressed using a series of independent two-armed experiments from k arbitrary populations. The effect size parameter simply quantifies the difference between two groups. It is a meaningful index to use when data are measured on different scales. In the context of bivariate statistical models, we define estimators of the effect size indices and propose large-sample testing procedures to test the homogeneity of these indices. The null and non-null distributions of the proposed test statistics are derived and their performance is evaluated via Monte Carlo simulation. Further, three types of interval estimation of the proposed indices are considered for both combined and uncombined data. Lower and upper confidence limits for the actual effect size indices are obtained and compared via bootstrapping. It is found that the length of the intervals based on the combined effect size estimator is almost half that of the intervals based on the uncombined effect size estimators. Finally, we illustrate the proposed procedures for hypothesis testing and interval estimation on a real data set.
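A homogeneity test of the general kind described, written here in the familiar inverse-variance (Cochran Q) form for standardized mean differences, can be sketched as follows (an illustration under a large-sample variance approximation, not the article's bivariate-model procedure):

```python
import numpy as np
from scipy.stats import chi2

def homogeneity_test(d, n1, n2):
    """Q test for homogeneity of k standardized mean differences
    (effect sizes) from independent two-arm experiments.
    d: effect size estimates; n1, n2: per-arm sample sizes."""
    d, n1, n2 = map(np.asarray, (d, n1, n2))
    # large-sample variance of Cohen's d
    var = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    w = 1.0 / var
    d_bar = np.sum(w * d) / np.sum(w)      # inverse-variance pooled effect
    Q = np.sum(w * (d - d_bar) ** 2)       # ~ chi2(k-1) under homogeneity
    return Q, chi2.sf(Q, len(d) - 1)
```

When homogeneity is not rejected, pooling to the combined estimator d_bar shrinks the confidence interval, consistent with the roughly halved interval lengths reported above.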
