Similar Literature
1.
We consider variance estimation for the weighted likelihood estimator (WLE) under two-phase stratified sampling without replacement. In many semiparametric models the asymptotic variance of the WLE contains unknown functions or has no closed form, so the standard method of computing inverse probability weighted (IPW) sample variances of an estimated influence function is unavailable. To address this issue, we develop a variance estimation procedure for the WLE in a general semiparametric model. The phase I variance is estimated by taking a numerical derivative of the IPW log likelihood. The phase II variance is estimated by bootstrapping a stratified sample from a finite population. Despite the theoretical difficulty posed by observations that are dependent because of sampling without replacement, we establish the (bootstrap) consistency of our estimators. Finite sample properties of our method are illustrated in a simulation study.
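As a rough illustration of the phase I step only (a sketch, not the authors' code): the variance of an M-estimator can be approximated by numerically differentiating the objective at its maximizer. The IPW log likelihood `ipw_loglik` below is a hypothetical Gaussian working model with stand-in weights.

```python
import numpy as np

def numerical_hessian(f, theta, eps=1e-4):
    """Central-difference Hessian of a scalar function f at theta."""
    k = len(theta)
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k); ei[i] = eps
            ej = np.zeros(k); ej[j] = eps
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * eps**2)
    return H

def ipw_loglik(theta, y, x, w):
    """Hypothetical IPW log likelihood: Gaussian working model, with
    weights w standing in for inverse phase II sampling probabilities."""
    resid = y - x @ theta
    return -0.5 * np.sum(w * resid**2)

rng = np.random.default_rng(0)
n = 500
x = np.column_stack([np.ones(n), rng.normal(size=n)])
y = x @ np.array([1.0, 2.0]) + rng.normal(size=n)
w = rng.choice([1.0, 2.0], size=n)            # stand-in IPW weights
theta_hat = np.linalg.solve(x.T @ (w[:, None] * x), x.T @ (w * y))

H = numerical_hessian(lambda t: ipw_loglik(t, y, x, w), theta_hat)
print(np.linalg.inv(-H))   # numerical-derivative variance approximation
```

The inverse of the negated numerical Hessian plays the role of the phase I variance; the phase II piece would add the stratified finite population bootstrap described above.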

2.
We propose a novel Bayesian nonparametric (BNP) model, built on a class of species sampling models, for estimating density functions of temporal data. In particular, we introduce species sampling mixture models with temporal dependence. To accommodate temporal dependence, we define dependent species sampling models by modeling random support points and weights through an autoregressive model, and we then construct mixture models based on the collection of these dependent species sampling models. We propose an algorithm to generate posterior samples and present simulation studies comparing the performance of the proposed models with competitors based on Dirichlet process mixture models. We apply our method to density estimation for apartment prices in Seoul, the closing price of the Korea Composite Stock Price Index (KOSPI), and climate variables (daily maximum temperature and precipitation) around the Korean peninsula.
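A toy forward simulation of the dependence mechanism (all specifics here — truncation level, AR(1) coefficient, Gaussian kernel — are illustrative assumptions, not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(1)
K, T = 20, 50          # truncation level and number of time points (assumed)
alpha, rho = 1.0, 0.9  # concentration and AR(1) coefficient (assumed)

# Truncated stick-breaking weights, held fixed across time for simplicity.
v = rng.beta(1, alpha, size=K)
w = v * np.concatenate([[1.0], np.cumprod(1 - v)[:-1]])

# Support points (atoms) evolve through a stationary AR(1) across time.
atoms = np.zeros((T, K))
atoms[0] = rng.normal(0, 1, size=K)
for t in range(1, T):
    atoms[t] = rho * atoms[t - 1] + np.sqrt(1 - rho**2) * rng.normal(0, 1, size=K)

def density(y, t, bw=0.5):
    """Mixture-of-normals density at time t, kernels centered at the atoms."""
    return np.sum(w * np.exp(-0.5 * ((y - atoms[t]) / bw) ** 2)
                  / (bw * np.sqrt(2 * np.pi)))

print(density(0.0, t=10))
```

Here only the atoms evolve; the construction described above also lets the weights depend on time.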

3.
In this paper, we discuss fully Bayesian quantile inference using the Markov chain Monte Carlo (MCMC) method for longitudinal data models with random effects. Under the assumption that the error term follows an asymmetric Laplace distribution, we establish a hierarchical Bayesian model and obtain the posterior distribution of the unknown parameters at the τ-th quantile level. We overcome the current computational limitations using two approaches: the general MCMC technique with the Metropolis–Hastings algorithm, and Gibbs sampling from the full conditional distributions. These two methods outperform traditional frequentist methods under a wide array of simulated data models and are flexible enough to easily accommodate changes in the number of random effects and in their assumed distribution. We apply the Gibbs sampling method to analyse mouse growth data and obtain some conclusions that differ from those in the literature.
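For reference (a standard identity in Bayesian quantile regression, not quoted from the paper), the asymmetric Laplace working density for the error ε at quantile level τ is

$$ f_\tau(\varepsilon \mid \sigma) \;=\; \frac{\tau(1-\tau)}{\sigma}\, \exp\!\left\{-\,\rho_\tau\!\left(\frac{\varepsilon}{\sigma}\right)\right\}, \qquad \rho_\tau(u) \;=\; u\,\bigl(\tau - I(u<0)\bigr), $$

so maximizing this likelihood is equivalent to minimizing the check loss Σ_i ρ_τ(y_i − x_i'β − z_i'b), which is what ties the asymmetric Laplace assumption to estimation of the τ-th quantile.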

4.
Cross-classified data are often obtained in controlled experimental situations and in epidemiologic studies. As an example of the latter, occupational health studies sometimes require personal exposure measurements on a random sample of workers from one or more job groups, in one or more plant locations, on several different sampling dates. Because the marginal distributions of exposure data from such studies are generally right-skewed and well approximated as lognormal, researchers in this area often consider the use of ANOVA models after a logarithmic transformation. While it is then of interest to estimate original-scale population parameters (e.g., the overall mean and variance), standard candidates such as maximum likelihood estimators (MLEs) can be unstable and highly biased. Uniformly minimum variance unbiased (UMVU) estimators offer a viable alternative, and are adaptable to sampling schemes that are typical of experimental or epidemiologic studies. In this paper, we provide UMVU estimators for the mean and variance under two random effects ANOVA models for log-transformed data. We illustrate substantial mean squared error gains relative to the MLE when estimating the mean under a one-way classification. We illustrate that the results can readily be extended to encompass a useful class of purely random effects models, provided that the study data are balanced.
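A small simulation sketch of the instability referred to above, comparing the back-transformed MLE exp(μ̂ + σ̂²/2) of a lognormal mean with the plain sample mean (the UMVU estimator involves a power series in σ̂² and is omitted here); the parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 0.0, 1.5, 15, 5000   # illustrative values

mle, naive = np.empty(reps), np.empty(reps)
for r in range(reps):
    x = rng.lognormal(mu, sigma, size=n)
    logx = np.log(x)
    m, s2 = logx.mean(), logx.var()       # MLEs of mu and sigma^2 (ddof=0)
    mle[r] = np.exp(m + s2 / 2)           # back-transformed MLE of E[X]
    naive[r] = x.mean()                   # original-scale sample mean

true_mean = np.exp(mu + sigma**2 / 2)
for name, est in [("back-transformed MLE", mle), ("sample mean", naive)]:
    print(name, "bias:", est.mean() - true_mean,
          "MSE:", np.mean((est - true_mean) ** 2))
```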

5.
Cell counts in contingency tables can be smoothed using log-linear models. Recently, sampling-based methods such as Markov chain Monte Carlo (MCMC) have been introduced, making it possible to sample from posterior distributions. The novelty of the approach presented here is that all conditional distributions can be specified directly, so that straightforward Gibbs sampling is possible. Thus, the model is constructed in a way that makes burn-in and checking convergence a relatively minor issue. The emphasis of this paper is on smoothing cell counts in contingency tables, and not so much on estimation of regression parameters. Therefore, the prior distribution consists of two stages. We rely on a normal nonconjugate prior at the first stage, and a vague prior for hyperparameters at the second stage. The smoothed counts tend to compromise between the observed data and a log-linear model. The methods are demonstrated with a sparse data table taken from a multi-center clinical trial. The research of the first author was supported by the Brain Pool program of the Korean Federation of Science and Technology Societies. The research of the second author was partially supported by KOSEF through the Statistical Research Center for Complex Systems at Seoul National University.

6.
To estimate model parameters from complex sample data, we apply maximum likelihood techniques to the complex sample data from the finite population, which is treated as a sample from an infinite superpopulation. General asymptotic distribution theory is developed and then applied to both logistic regression and discrete proportional hazards models. Data from the Lipid Research Clinics Program are used to illustrate each model, demonstrating the effects on inference of neglecting the sampling design during parameter estimation. These empirical results also shed light on the issue of model-based vs. design-based inferences.
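A minimal sketch of one common way to respect the design when estimating a logistic model: maximize a design-weighted (pseudo) likelihood by Newton–Raphson. The weights and data below are synthetic placeholders, and this illustrates the general idea rather than the paper's exact procedure.

```python
import numpy as np

def weighted_logit_mle(X, y, w, iters=25):
    """Pseudo-MLE: solve the weighted score sum_i w_i (y_i - p_i) x_i = 0."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (w * (y - p))
        info = X.T @ (X * (w * p * (1 - p))[:, None])  # weighted information
        beta += np.linalg.solve(info, score)
    return beta

rng = np.random.default_rng(3)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ np.array([-1.0, 0.5])))))
w = rng.choice([1.0, 3.0], size=n)   # stand-in survey design weights
print(weighted_logit_mle(X, y, w))
```

Comparing this fit with the unweighted fit (all w_i = 1) mimics the paper's demonstration of what neglecting the design does to parameter estimates.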

7.
Prediction of random effects is an important problem with expanding applications. In the simplest context, the problem corresponds to prediction of the latent value (the mean) of a realized cluster selected via two-stage sampling. Recently, Stanek and Singer [Predicting random effects from finite population clustered samples with response error. J. Amer. Statist. Assoc. 99, 119–130] developed best linear unbiased predictors (BLUP) under a finite population mixed model that outperform BLUPs from mixed models and superpopulation models. Their setup, however, does not allow for unequally sized clusters. To overcome this drawback, we consider an expanded finite population mixed model based on a larger set of random variables that span a higher dimensional space than those typically applied to such problems. We show that BLUPs for linear combinations of the realized cluster means derived under such a model have considerably smaller mean squared error (MSE) than those obtained from mixed models, superpopulation models, and finite population mixed models. We motivate our general approach by an example developed for two-stage cluster sampling and show that it faithfully captures the stochastic aspects of sampling in the problem. We also consider simulation studies to illustrate the increased accuracy of the BLUP obtained under the expanded finite population mixed model.
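For orientation (the textbook mixed-model version, not the expanded finite population predictor derived in the paper): in y = Xβ + Zu + e with u ~ (0, G), e ~ (0, R) and V = ZGZ' + R, the BLUP of u is

$$ \hat{u} \;=\; G Z^{\top} V^{-1}\bigl(y - X\hat{\beta}\bigr), \qquad \hat{\beta} \;=\; \bigl(X^{\top} V^{-1} X\bigr)^{-1} X^{\top} V^{-1} y. $$

The paper's gains come from enlarging the set of random variables behind such predictors so that realized cluster means of unequally sized clusters can be targeted directly.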

8.
We consider inference for queues based on inter-departure time data. Calculating the likelihood for such models is difficult, as it involves summing over the (exponentially large) space of realisations of the arrival process. We demonstrate how a likelihood recursion can be used to calculate this likelihood efficiently for the specific cases of M/G/1 and Er/G/1 queues. We compare the sampling properties of the MLEs with the sampling properties of estimators, based on indirect inference, that have previously been suggested for this problem.
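A sketch of how data of this kind arise (simulation only; the likelihood recursion itself is in the paper): for a FIFO M/G/1 queue, departure epochs obey d_i = max(d_{i−1}, a_i) + s_i, so inter-departure times are easy to generate. The arrival rate and gamma service law below are illustrative.

```python
import numpy as np

def mg1_interdepartures(n, lam=0.8,
                        service=lambda rng, n: rng.gamma(2.0, 0.3, n),
                        rng=None):
    """Simulate n inter-departure times from an M/G/1 FIFO queue.

    Arrivals are Poisson(lam); service times come from a general law
    (here gamma, as an illustrative 'G').  Departures follow the
    recursion d[i] = max(d[i-1], a[i]) + s[i].
    """
    if rng is None:
        rng = np.random.default_rng(4)
    a = np.cumsum(rng.exponential(1 / lam, n))   # arrival epochs
    s = service(rng, n)
    d = np.empty(n)
    t = 0.0
    for i in range(n):
        t = max(t, a[i]) + s[i]
        d[i] = t
    return np.diff(d)   # the observed inter-departure times

print(mg1_interdepartures(10))
```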

9.
An exact maximum likelihood method is developed for the estimation of parameters in a non-Gaussian nonlinear density function that depends on a latent Gaussian dynamic process with long-memory properties. Our method relies on importance sampling and on a linear Gaussian approximating model from which the latent process can be simulated. Given the presence of a latent long-memory process, we require a modification of the importance sampling technique. In particular, the long-memory process needs to be approximated by a finite dynamic linear process. Two possible approximations are discussed and compared with each other. We show that an autoregression obtained from minimizing mean squared prediction errors leads to an effective and feasible method. In our empirical study, we analyze ten daily log-return series from the S&P 500 stock index by univariate and multivariate long-memory stochastic volatility models. We compare the in-sample and out-of-sample performance of a number of models within the class of long-memory stochastic volatility models.

10.
We discuss a Matlab-based library for constructing optimal sampling schemes for pharmacokinetic (PK) and pharmacodynamic (PD) studies. The software relies on optimal design theory for nonlinear mixed effects models and, in particular, on the first-order optimization algorithm. The library includes a number of popular compartmental PK and combined PK/PD models and can be extended to include more models. An outline of inputs/outputs is provided, some algorithmic details and examples are presented, and future work is discussed.

11.
We consider Particle Gibbs (PG) for Bayesian analysis of non-linear non-Gaussian state-space models. As a Monte Carlo (MC) approximation of the Gibbs procedure, PG uses sequential MC (SMC) importance sampling inside the Gibbs sampler to update the latent states. We propose to combine PG with Particle Efficient Importance Sampling (PEIS). By using SMC sampling densities that are approximately globally fully adapted to the targeted density of the states, PEIS can substantially improve the simulation efficiency of PG relative to existing PG implementations. The efficiency gains are illustrated in PG applications to a non-linear local-level model and to stochastic volatility models.
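As a building-block sketch (a plain bootstrap particle filter for a linear local-level model, not PG itself — PG additionally conditions the SMC on a retained trajectory, and PEIS would replace the bootstrap proposal with an adapted one); all settings are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
T, N = 100, 500                  # time points and particles (illustrative)
sig_x, sig_y = 0.5, 1.0

# Simulate a local-level model: x_t = x_{t-1} + eta_t, y_t = x_t + eps_t.
x = np.cumsum(rng.normal(0, sig_x, T))
y = x + rng.normal(0, sig_y, T)

particles = rng.normal(0, 1, N)
loglik = 0.0
for t in range(T):
    particles = particles + rng.normal(0, sig_x, N)   # propagate (bootstrap proposal)
    logw = -0.5 * ((y[t] - particles) / sig_y) ** 2   # Gaussian measurement kernel
    w = np.exp(logw - logw.max())
    loglik += (np.log(w.mean()) + logw.max()
               - 0.5 * np.log(2 * np.pi * sig_y**2))  # log p(y_t | y_{1:t-1}) estimate
    idx = rng.choice(N, size=N, p=w / w.sum())        # multinomial resampling
    particles = particles[idx]

print("particle-filter log likelihood:", loglik)
```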

12.
The article details a sampling scheme that can lead to a reduction in sample size and cost in clinical and epidemiological studies of the association between a count outcome and a risk factor. We show that inference in two common generalized linear models for count data, Poisson and negative binomial regression, is improved by using a ranked auxiliary covariate, which guides the sampling procedure. This type of sampling has typically been used to improve inference on a population mean. The novelty of the current work is its extension to log-linear models and derivations showing that the sampling technique results in an increase in information compared with simple random sampling. Specifically, we show that under the proposed sampling strategy the maximum likelihood estimate of the risk factor's coefficient is improved through an increase in Fisher information. A simulation study is performed to compare the mean squared error, bias, variance, and power of the sampling routine with simple random sampling under various data-generating scenarios. We also illustrate the merits of the sampling scheme on a real data set from a clinical setting of males with chronic obstructive pulmonary disease. Empirical results from the simulation study and data analysis coincide with the theoretical derivations, suggesting that a significant reduction in sample size, and hence study cost, can be realized while achieving the same precision as a simple random sample.
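A sketch of the balanced ranked-set selection step (generic ranked set sampling, with set size and data invented for illustration): each set of m candidate units is ranked on the cheap auxiliary covariate, and only the unit of a prescribed rank is measured for the count outcome.

```python
import numpy as np

def ranked_set_sample(aux, m, cycles, rng):
    """Return indices chosen by balanced ranked set sampling.

    Each cycle draws m sets of m units; the i-th set contributes the unit
    whose auxiliary value has rank i, so only m*cycles outcomes are measured.
    """
    chosen = []
    for _ in range(cycles):
        for rank in range(m):
            cand = rng.choice(len(aux), size=m, replace=False)
            ordered = cand[np.argsort(aux[cand])]   # rank on the auxiliary
            chosen.append(ordered[rank])            # measure only this unit
    return np.array(chosen)

rng = np.random.default_rng(6)
aux = rng.normal(size=10_000)                       # cheap, rankable covariate
idx = ranked_set_sample(aux, m=3, cycles=20, rng=rng)
print(len(idx), "units selected for outcome measurement")
```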

13.
In this paper we present methods for inference on data selected by a complex sampling design for a class of statistical models for the analysis of ordinal variables. Specifically, assuming that the sampling scheme is not ignorable, we derive variance estimates for the class of CUB models (Combination of discrete Uniform and shifted Binomial distributions) under a complex two-stage stratified sample. Both Taylor linearization and repeated replication variance estimators are presented. We also provide design-based test diagnostics and goodness-of-fit measures. We illustrate by means of real data analysis the differences between survey-weighted and unweighted point estimates and inferences for CUB model parameters.
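For reference, the standard CUB probability mass function for an ordinal response R on {1, …, m} (as given in the CUB literature, not quoted from this paper) is

$$ \Pr(R = r) \;=\; \pi \binom{m-1}{r-1} (1-\xi)^{\,r-1}\, \xi^{\,m-r} \;+\; (1-\pi)\,\frac{1}{m}, \qquad r = 1, \dots, m, $$

a mixture of a shifted binomial ('feeling' parameter ξ) and a discrete uniform ('uncertainty' component with weight 1 − π).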

14.
The estimation or prediction of population characteristics based on sample information is the key issue in survey sampling. If the sample sizes in subpopulations (domains) are large enough, methods similar to those used for the whole population can be used to estimate or predict subpopulation characteristics as well. To estimate or predict characteristics of domains with small or even zero sample sizes, small area estimation methods that "borrow strength" from other subpopulations or time periods are widely used. We extend this problem and study methods of predicting future population and subpopulation characteristics based on longitudinal data.

15.
SUMMARY We compare properties of parameter estimators under Akaike information criterion (AIC) and 'consistent' AIC (CAIC) model selection in a nested sequence of open population capture-recapture models. These models consist of product multinomials, where the cell probabilities are parameterized in terms of survival (φ_i) and capture (p_i) probabilities for each time interval i. The sequence of models is derived from 'treatment' effects that might be (1) absent, model H_0; (2) only acute, model H_2p; or (3) acute and chronic, lasting several time intervals, model H_3. Using a 3^5 factorial design, 1000 repetitions were simulated for each of 243 cases. The true number of parameters ranged from 7 to 42, and the sample size ranged from approximately 470 to 55 000 per case. We focus on the quality of the inference about the model parameters and model structure that results from the two selection criteria. We use achieved confidence interval coverage as an integrating metric to judge what constitutes a 'properly parsimonious' model, and contrast the performance of these two model selection criteria for a wide range of models, sample sizes, parameter values and study interval lengths. AIC selection resulted in models in which the parameters were estimated with relatively little bias. However, these models exhibited asymptotic sampling variances that were somewhat too small, and achieved confidence interval coverage that was somewhat below the nominal level. In contrast, CAIC-selected models were too simple, the parameter estimators were often substantially biased, the asymptotic sampling variances were substantially too small and the achieved coverage was often substantially below the nominal level. An example case illustrates a pattern: with 20 capture occasions, 300 previously unmarked animals are released at each occasion, and the survival and capture probabilities in the control group on each occasion were 0.9 and 0.8 respectively under model H_3. There was a strong acute treatment effect on the first survival (φ_1) and first capture probability (p_2), and smaller, chronic effects on the second and third survival probabilities (φ_2 and φ_3) as well as on the second capture probability (p_3); the sample size for each repetition was approximately 55 000. CAIC selection led to a model with exactly these effects in only nine of the 1000 repetitions, compared with 467 times under AIC selection. Under CAIC selection, even the two acute effects were detected only 555 times, compared with 998 for AIC selection. AIC selection exhibited a balance between underfitted and overfitted models (270 versus 263), while CAIC tended strongly to select underfitted models. CAIC-selected models were overly parsimonious and poor as a basis for statistical inferences about important model parameters or structure. We recommend the use of the AIC and not the CAIC for analysis and inference from capture-recapture data sets.
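For reference, with K estimable parameters, maximized likelihood L̂ and sample size n, the two criteria are

$$ \mathrm{AIC} \;=\; -2\log\hat{L} + 2K, \qquad \mathrm{CAIC} \;=\; -2\log\hat{L} + K(\log n + 1); $$

CAIC's per-parameter penalty of log n + 1 grows with the large sample sizes used here, which is consistent with its strong tendency toward underfitted models reported above.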

16.
Estimating parameters in a stochastic volatility (SV) model is a challenging task. Among other estimation methods and approaches, efficient simulation methods based on importance sampling have been developed for the Monte Carlo maximum likelihood estimation of univariate SV models. This paper shows that importance sampling methods can be used in a general multivariate SV setting. The sampling methods are computationally efficient. To illustrate the versatility of this approach, three different multivariate stochastic volatility models are estimated for a standard data set. The empirical results are compared to those from earlier studies in the literature. Monte Carlo simulation experiments, based on parameter estimates from the standard data set, are used to show the effectiveness of the importance sampling methods.
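A canonical univariate SV specification of the kind this approach targets (a standard formulation, not necessarily the paper's exact parameterization) is

$$ y_t = \sigma\,\varepsilon_t \exp(h_t/2), \qquad h_{t+1} = \phi h_t + \eta_t, \qquad \varepsilon_t \sim N(0,1), \;\; \eta_t \sim N(0, \sigma_\eta^2); $$

its likelihood requires integrating out the latent log-volatility path h_{1:T}, and importance sampling approximates this integral with draws from a Gaussian approximating model.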

17.
In studies that involve censored time-to-event data, stratification is frequently encountered for different reasons, such as stratified sampling or model adjustment due to violation of model assumptions. Often, the main interest is not in the clustering variables, and the cluster-related parameters are treated as nuisance parameters. When inference concerns a parameter of interest in the presence of many nuisance parameters, standard likelihood methods often perform very poorly and may lead to severe bias. This problem is particularly evident in models for clustered data with cluster-specific nuisance parameters when the number of clusters is relatively high with respect to the within-cluster size. However, it is still unclear how the presence of censoring affects this issue. We consider clustered failure time data with independent censoring, and propose frequentist inference based on an integrated likelihood. We then apply the proposed approach to a stratified Weibull model. Simulation studies show that appropriately defined integrated likelihoods provide very accurate inferential results in all circumstances, such as for highly clustered data or heavy censoring, even in extreme settings where standard likelihood procedures lead to strongly misleading results. We show that the proposed method performs generally as well as the frailty model, but is superior when the frailty distribution is seriously misspecified. An application, concerning treatments for a frequent disease in late-stage HIV-infected people, illustrates the proposed inferential method in Weibull regression models and compares different inferential conclusions from alternative methods.
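Schematically (the generic device, not the paper's exact definition): with parameter of interest ψ and cluster-specific nuisance parameters λ_1, …, λ_q, each λ_i is eliminated by integration against a weight function,

$$ \bar{L}(\psi) \;=\; \prod_{i=1}^{q} \int L_i(\psi, \lambda_i)\, \pi(\lambda_i \mid \psi)\, d\lambda_i, $$

and $\bar{L}$ is then treated like an ordinary likelihood for inference on ψ.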

18.
Long-term temporal trends in water temperature in rivers and streams are typically estimated under the assumption of evenly spaced space-time measurements. However, sampling times and dates associated with historical water temperature datasets and some sampling designs may be haphazard. As a result, trends in temperature may be confounded with trends in the time or space of sampling, which in turn may yield biased trend estimators and thus unreliable conclusions. We address this concern using multilevel (hierarchical) linear models, where time effects are allowed to vary randomly by day and date effects by year. We evaluate the proposed approach by Monte Carlo simulations with imbalance, sparse data and confounding by trend in time and date of sampling. Simulation results indicate unbiased trend estimators, while results from a case study of temperature data from the Illinois River, USA, conform to river thermal assumptions. We also propose a new nonparametric bootstrap inference on multilevel models that allows for a relatively flexible and distribution-free quantification of uncertainties. The proposed multilevel modeling approach may be elaborated to accommodate nonlinearities within days and years when sampling times or dates span temperature extremes.
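A schematic of a top-level (year) nonparametric bootstrap (resample whole years with replacement, refit, take percentile intervals); here a plain OLS slope of yearly means stands in for refitting the multilevel model, and all data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic data: a linear warming trend, a year-level random effect,
# and 50 within-year observations per year.
years = np.arange(1980, 2020)
obs = {yr: 10 + 0.03 * (yr - 1980) + rng.normal(0, 1)
           + rng.normal(0, 0.5, 50)
       for yr in years}

def trend(pairs):
    """Stand-in fit: OLS slope of yearly means (a real analysis would
    refit the multilevel model here)."""
    yrs = np.array([p[0] for p in pairs], dtype=float)
    means = np.array([p[1].mean() for p in pairs])
    return np.polyfit(yrs, means, 1)[0]

pairs = [(y, obs[y]) for y in years]
boot = []
for _ in range(2000):
    idx = rng.integers(0, len(pairs), size=len(pairs))  # resample years
    boot.append(trend([pairs[i] for i in idx]))

print("95% bootstrap CI for trend:", np.percentile(boot, [2.5, 97.5]))
```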

19.
We consider analysis of complex stochastic models based upon partial information. MCMC and reversible jump MCMC are often the methods of choice for such problems, but in some situations they can be difficult to implement and can suffer from problems such as poor mixing and the difficulty of diagnosing convergence. Here we review three alternatives to MCMC methods: importance sampling, the forward-backward algorithm, and sequential Monte Carlo (SMC). We discuss how to design good proposal densities for importance sampling, show some of the range of models for which the forward-backward algorithm can be applied, and show how resampling ideas from SMC can be used to improve the efficiency of the other two methods. We demonstrate these methods on a range of examples, including estimating the transition density of a diffusion and of a discrete-state continuous-time Markov chain, inferring structure in population genetics, and segmenting genetic divergence data.
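A minimal sketch of the resampling idea carried from SMC into importance sampling: draw from a proposal, weight toward the target, monitor the effective sample size, and resample in proportion to the weights so a few large weights do not dominate. The target and proposal below are purely illustrative.

```python
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(8)
N = 10_000

log_target = lambda x: -0.5 * x**2 - np.log1p(x**2)   # unnormalized target
draws = t.rvs(df=3, scale=2.0, size=N, random_state=rng)  # heavy-tailed proposal
logw = log_target(draws) - t.logpdf(draws, df=3, scale=2.0)
w = np.exp(logw - logw.max())
w /= w.sum()

ess = 1.0 / np.sum(w**2)                     # effective sample size
resampled = rng.choice(draws, size=N, p=w)   # multinomial resampling step
print(f"ESS = {ess:.0f}; posterior mean approx. {resampled.mean():.3f}")
```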
