Similar Documents
20 similar records found
1.
The analysis of incomplete contingency tables is a practical and interesting problem. In this paper, we provide characterizations of the various missing mechanisms of a variable in terms of response and non-response odds for two- and three-dimensional incomplete tables. Log-linear parametrization and some distinctive properties of the missing data models for these tables are discussed. All possible cases in which data on one, two or all variables may be missing are considered. We study the missingness of each variable in a model, which is more insightful for analyzing cross-classified data than the missingness of the outcome vector. For sensitivity analysis of the incomplete tables, we propose easily verifiable procedures to evaluate the missing at random (MAR), missing completely at random (MCAR) and not missing at random (NMAR) assumptions of the missing data models. These methods depend only on joint and marginal odds computed from fully and partially observed counts in the tables, respectively. Finally, some real-life datasets are analyzed to illustrate our results, which are further confirmed by simulation studies.
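The odds-based checks described in this abstract can be illustrated with a toy example. The counts below are hypothetical and the computation is only a sketch of the idea, not the paper's actual procedure:

```python
# Hypothetical 2x2 table with nonresponse on the column variable:
# n[i][j] are fully classified counts; m[i] are supplementary counts
# for which only the row category i was recorded.
n = [[40, 10],
     [20, 30]]
m = [10, 15]

# Response odds for each row: fully observed vs column-missing counts.
response_odds = [sum(n[i]) / m[i] for i in range(2)]
print(response_odds)  # [5.0, 3.33...]

# Under MCAR the response odds are constant across rows; variation with
# the observed row variable is consistent with MAR, while dependence on
# the unobserved column value would indicate NMAR.
```

Here the odds differ markedly between the rows (5.0 vs about 3.3), so in this toy data the MCAR assumption would look doubtful.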

2.
Latent class analysis (LCA) has been found to have important applications in social and behavioral sciences for modeling categorical response variables, and nonresponse is typical when collecting data. In this study, the nonresponse mainly included “contingency questions” and real “missing data.” The primary objective of this research was to evaluate the effects of some potential factors on model selection indices in LCA with nonresponse data.

We simulated missing data with contingency questions and evaluated the accuracy rates of eight information criteria for selecting the correct models. The results showed that the main factors are latent class proportions, conditional probabilities, sample size, the number of items, the missing data rate, and the contingency data rate. Interactions of the conditional probabilities with class proportions, sample size, and the number of items are also significant. Our simulation results indicate that the impact of missing data and contingency questions can be mitigated by increasing the sample size or the number of items.
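The study compares eight information criteria; as a reminder of how the two most common ones trade fit against complexity, here is a minimal sketch with hypothetical log-likelihoods for a 2-class and a 3-class model (the fits are invented for illustration, not taken from the study):

```python
import math

def aic(loglik, n_params):
    # Akaike information criterion: -2*loglik + 2k
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    # Bayesian information criterion: -2*loglik + k*log(n)
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical fits: (log-likelihood, number of parameters), n = 500 cases.
fits = {2: (-1520.3, 9), 3: (-1511.8, 14)}
for classes, (ll, k) in fits.items():
    print(classes, aic(ll, k), bic(ll, k, 500))
```

With these numbers AIC prefers the 3-class model while BIC, whose penalty grows with the sample size, prefers the 2-class model, which is exactly why accuracy rates of different criteria can diverge in simulations like this one.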


3.
Latent class analysis (LCA) has been found to have important applications in social and behavioural sciences for modelling categorical response variables, and non-response is typical when collecting data. In this study, the non-response mainly included ‘contingency questions’ and real ‘missing data’. The primary objective of this study was to evaluate the effects of some potential factors on model selection indices in LCA with non-response data. We simulated missing data with contingency questions and evaluated the accuracy rates of eight information criteria for selecting the correct models. The results showed that the main factors are latent class proportions, conditional probabilities, sample size, the number of items, the missing data rate and the contingency data rate. Interactions of the conditional probabilities with class proportions, sample size and the number of items are also significant. From our simulation results, the impact of missing data and contingency questions can be mitigated by increasing the sample size or the number of items.

4.
A two-way contingency table in which both variables have the same categories is termed a symmetric table. In many applications, because of the social processes involved, most of the observations lie on the main diagonal and the off-diagonal counts are small. For these tables, the model of independence is implausible and interest is then focussed on the off-diagonal cells and the models of quasi-independence and quasi-symmetry. For ordinal variables, a linear-by-linear association model can be used to model the interaction structure. For sparse tables, large-sample goodness-of-fit tests are often unreliable and one should use an exact test. In this paper, we review exact tests and the computing problems involved. We propose new recursive algorithms for exact goodness-of-fit tests of quasi-independence, quasi-symmetry, linear-by-linear association and some related models. We propose that all computations be carried out using symbolic computation and rational arithmetic in order to calculate the exact p-values accurately and describe how we implemented our proposals. Two examples are presented.
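As a much simpler cousin of the exact tests discussed in this abstract, Fisher's exact test for a 2×2 table can be computed in exact rational arithmetic with Python's `fractions` module. This illustrates only the rational-arithmetic idea, not the paper's recursive algorithms for quasi-independence or quasi-symmetry:

```python
from fractions import Fraction
from math import comb

def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]],
    computed in exact rational arithmetic (no floating-point rounding)."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, c1)
    def prob(x):  # hypergeometric probability that cell (1,1) equals x
        return Fraction(comb(r1, x) * comb(r2, c1 - x), denom)
    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs)

print(fisher_exact_p(3, 1, 1, 3))  # exact Fraction: 17/35
```

Because the result is a `Fraction`, the p-value is exact, which is the point the paper makes about symbolic computation for sparse tables.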

5.
This paper is concerned with the maximum likelihood estimation and the likelihood ratio test for hierarchical log-linear models of multidimensional contingency tables with missing data. The problems of estimation and testing for a high-dimensional contingency table can be reduced to those for a class of low-dimensional tables. In some cases, the incomplete data in the high-dimensional table become complete in the low-dimensional tables; the reduction can also indicate how much the incomplete data contribute to the estimation and the test.
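A standard tool for maximum likelihood estimation with partially classified counts, related to (but not the same as) the reduction approach above, is the EM algorithm. The following toy 2×2 sketch under MAR uses hypothetical counts:

```python
# Fully classified counts n[i][j]; m[i] are supplementary counts for
# which the column entry is missing. Hypothetical data, MAR assumed.
n = [[30.0, 10.0], [20.0, 40.0]]
m = [8.0, 12.0]

p = [[0.25, 0.25], [0.25, 0.25]]  # initial cell probabilities
for _ in range(200):
    # E-step: allocate each row's missing counts by current conditionals
    filled = [[n[i][j] + m[i] * p[i][j] / (p[i][0] + p[i][1])
               for j in range(2)] for i in range(2)]
    # M-step: MLE of the saturated model is just the sample proportions
    total = sum(sum(row) for row in filled)
    p = [[filled[i][j] / total for j in range(2)] for i in range(2)]

print(p)  # converges to (n_ij + m_i * n_ij / n_i+) / N
```

At the fixed point the missing row counts are spread according to the observed within-row conditionals, so for these numbers the estimate is [[0.3, 0.1], [0.2, 0.4]].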

6.
In this paper, a generalized partially linear model (GPLM) with missing covariates is studied and a Monte Carlo EM (MCEM) algorithm with a penalized-spline (P-spline) technique is developed to estimate the regression coefficients and nonparametric function, respectively. As classical model selection procedures such as Akaike's information criterion become invalid for the considered models with incomplete data, new model selection criteria for GPLMs with missing covariates are proposed under two different missingness mechanisms, namely missing at random (MAR) and missing not at random (MNAR). The most attractive feature of our method is its generality: it can be extended, via the EM algorithm, to various situations with missing observations, and in particular, when no data are missing, our new model selection criteria reduce to the classical AIC. Therefore, we can not only compare models with missing observations under MAR/MNAR settings, but also compare missing-data models with complete-data models simultaneously. Theoretical properties of the proposed estimator, including consistency of the model selection criteria, are investigated. A simulation study and a real example are used to illustrate the proposed methodology.

7.
Bayesian models for relative archaeological chronology building   (Times cited: 1; self-citations: 0; citations by others: 1)
For many years, archaeologists have postulated that the numbers of various artefact types found within excavated features should give insight about their relative dates of deposition even when stratigraphic information is not present. A typical data set used in such studies can be reported as a cross-classification table (often called an abundance matrix or, equivalently, a contingency table) of excavated features against artefact types. Each entry of the table represents the number of a particular artefact type found in a particular archaeological feature. Methodologies for attempting to identify temporal sequences on the basis of such data are commonly referred to as seriation techniques. Several different procedures for seriation including both parametric and non-parametric statistics have been used in an attempt to reconstruct relative chronological orders on the basis of such contingency tables. We develop some possible model-based approaches that might be used to aid in relative, archaeological chronology building. We use the recently developed Markov chain Monte Carlo method based on Langevin diffusions to fit some of the models proposed. Predictive Bayesian model choice techniques are then employed to ascertain which of the models that we develop are most plausible. We analyse two data sets taken from the literature on archaeological seriation.

8.
For square contingency tables with ordered categories, one may wish to analyze collapsed tables in which some adjacent categories of the original table are combined. This paper proposes three kinds of new models which have the structure of point-symmetry (PS), quasi point-symmetry and marginal point-symmetry for collapsed square tables. This paper also gives a decomposition of the PS model for collapsed square tables. Data on occupational mobility between fathers and their daughters are analyzed using the new models.
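The point-symmetry structure underlying these models says that cell probabilities are invariant under reflection through the table's centre. A Pearson-type check of that structure can be sketched as follows; the statistic and the data are illustrative, not taken from the paper:

```python
# Pearson-type statistic for point symmetry (PS) in a rectangular table:
# under PS, p[i][j] = p[R-1-i][C-1-j] (indices reflected through the centre).
def point_symmetry_stat(n):
    R, C = len(n), len(n[0])
    stat = 0.0
    for i in range(R):
        for j in range(C):
            mirror = n[R - 1 - i][C - 1 - j]
            expected = (n[i][j] + mirror) / 2.0  # MLE under PS
            if expected > 0:
                stat += (n[i][j] - expected) ** 2 / expected
    return stat

table = [[10, 5, 3],
         [4, 8, 6],
         [2, 7, 9]]
print(point_symmetry_stat(table))             # > 0: departure from PS
print(point_symmetry_stat([[4, 1], [1, 4]]))  # 0.0: exactly point-symmetric
```

A table that already satisfies point symmetry (like the second example) yields a statistic of exactly zero; any asymmetry about the centre contributes positively.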

9.
The marginal totals of a contingency table can be rearranged to form a new table. If at least two of these totals include the same cell of the original table, the new table cannot be treated as an ordinary contingency table. An iterative method is proposed to calculate maximum likelihood estimators for the expected cell frequencies of the original table under the assumption that some marginal totals (or more generally, some linear functions) of these expected frequencies satisfy a log-linear model. In some cases, a table of correlated marginal totals is treated as if it were an ordinary contingency table. The effects of ignoring the special structure of the marginal table on the distribution of the goodness-of-fit test statistics are discussed and illustrated, with special reference to stationary Markov chains.
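The iterative fitting of marginal constraints described here is reminiscent of classical iterative proportional fitting (IPF), which handles the ordinary case where the margins do not share cells. A minimal IPF sketch (not the paper's method for correlated margins):

```python
# Iterative proportional fitting (IPF): rescale a table until it matches
# target row and column totals. Handles only ordinary, non-overlapping
# margins; the paper's correlated-margin setting is more delicate.
def ipf(x, row_targets, col_targets, iters=100):
    R, C = len(x), len(x[0])
    x = [row[:] for row in x]
    for _ in range(iters):
        for i in range(R):                      # match row totals
            s = sum(x[i])
            x[i] = [v * row_targets[i] / s for v in x[i]]
        for j in range(C):                      # match column totals
            s = sum(x[i][j] for i in range(R))
            for i in range(R):
                x[i][j] *= col_targets[j] / s
    return x

fitted = ipf([[1.0, 1.0], [1.0, 1.0]], [50, 50], [30, 70])
print(fitted)  # independence fit: [[15, 35], [15, 35]]
```

Starting from a uniform table, IPF converges to the independence fit consistent with the given margins.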

10.
This paper studies the multinomial model for 2×2 contingency table data with some cell counts missing. Various hypotheses of interest, including row-column independence, are tested by using Bayes factors, which represent the ratio of the posterior odds to the prior odds for the null hypothesis. The Dirichlet-Beta family of prior distributions is considered for the multinomial parameters conditional on the complement of the null hypothesis. The Bayes factor for the incomplete data is a mixture of the Bayes factors for the different possibilities for the full data.
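For intuition about Bayes factors of this kind, here is a much simpler binomial analogue: a point null p = 1/2 against a uniform Beta(1,1) alternative. This is an illustration of the posterior-odds/prior-odds ratio, not the paper's Dirichlet-Beta setup for incomplete tables:

```python
from math import comb

def bayes_factor_point_null(k, n):
    """BF_01 for H0: p = 1/2 vs H1: p ~ Beta(1,1), given k successes in n
    Bernoulli trials. Under H1 the marginal likelihood of k is uniform:
    integrating C(n,k) p^k (1-p)^(n-k) over p in [0,1] gives 1/(n+1)."""
    m0 = comb(n, k) * 0.5 ** n       # P(data | H0)
    m1 = 1.0 / (n + 1)               # P(data | H1)
    return m0 / m1

print(bayes_factor_point_null(7, 10))  # 1.2890625: data barely favor H0
```

Values near 1 mean the data are nearly uninformative about the hypotheses; 7 successes out of 10 only weakly discriminates p = 1/2 from the uniform alternative.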

11.
ABSTRACT

This paper analyses the behaviour of goodness-of-fit tests for regression models. To this end, it uses statistics based on an estimation of the integrated regression function with missing observations either in the response variable or in some of the covariates. It proposes several versions of one empirical process, constructed from a previous estimation, that uses only the complete observations or replaces the missing observations with imputed values. In the case of missing covariates, a link model is used to fill in the missing observations from other complete covariates. In all the situations, bootstrap methodology is used to calibrate the distribution of the test statistics. A broad simulation study compares the different procedures based on empirical regression methodology with smoothed tests previously studied in the literature. The comparison reflects the effect of the correlation between the covariates in the tests based on the imputed sample for missing covariates. In addition, the paper proposes a computational binning strategy to evaluate the tests based on an empirical process for large data sets. Finally, two applications to real data illustrate the performance of the tests.

12.
In clinical practice, the profile of each subject's CD4 response from a longitudinal study may follow a ‘broken stick’ like trajectory, indicating multiple phases of increase and/or decline in response. Such multiple phases (changepoints) may be important indicators to help quantify treatment effect and improve management of patient care. Although it is common practice in the literature to analyze complex AIDS longitudinal data using nonlinear mixed-effects (NLME) or nonparametric mixed-effects (NPME) models, estimating changepoints with NLME or NPME models is challenging due to their complicated model formulations. In this paper, we propose a changepoint mixed-effects model with random subject-specific parameters, including the changepoint, for the analysis of longitudinal CD4 cell counts for HIV infected subjects following highly active antiretroviral treatment. The longitudinal CD4 data in this study may exhibit departures from symmetry, may encounter missing observations due to various reasons, which are likely to be non-ignorable in the sense that missingness may be related to the missing values, and may be censored at the time of the subject going off study-treatment, which is a potentially informative dropout mechanism. Inferential procedures can be complicated dramatically when longitudinal CD4 data with asymmetry (skewness), incompleteness and informative dropout are observed in conjunction with an unknown changepoint. Our objective is to address the simultaneous impact of skewness, missingness and informative censoring by jointly modeling the CD4 response and dropout time processes under a Bayesian framework. The method is illustrated using a real AIDS data set to compare potential models with various scenarios, and some results of interest are presented.

13.
Strict collapsibility and model collapsibility are two important concepts associated with the dimension reduction of a multidimensional contingency table, without losing the relevant information. In this paper, we obtain some necessary and sufficient conditions for the strict collapsibility of the full model, with respect to an interaction factor or a set of interaction factors, based on the interaction parameters of the conditional/layer log-linear models. For hierarchical log-linear models, we also present necessary and sufficient conditions for the full model to be model collapsible, based on the conditional interaction parameters. We discuss both the case where a single variable is conditioned on and the case where a set of variables is conditioned on. The connections between strict collapsibility and model collapsibility are also pointed out. Our results are illustrated through suitable examples, including a real-life application.

14.
Categorical data frequently arise in applications in the Social Sciences. In such applications, the class of log-linear models, based on either a Poisson or (product) multinomial response distribution, is a flexible model class for inference and prediction. In this paper we consider the Bayesian analysis of both Poisson and multinomial log-linear models. It is often convenient to model multinomial or product multinomial data as observations of independent Poisson variables. For multinomial data, Lindley (1964) [20] showed that this approach leads to valid Bayesian posterior inferences when the prior density for the Poisson cell means factorises in a particular way. We develop this result to provide a general framework for the analysis of multinomial or product multinomial data using a Poisson log-linear model. Valid finite population inferences are also available, which can be particularly important in modelling social data. We then focus particular attention on multivariate normal prior distributions for the log-linear model parameters. Here, an improper prior distribution for certain Poisson model parameters is required for valid multinomial analysis, and we derive conditions under which the resulting posterior distribution is proper. We also consider the construction of prior distributions across models, and for model parameters, when uncertainty exists about the appropriate form of the model. We present classes of Poisson and multinomial models, invariant under certain natural groups of permutations of the cells. We demonstrate that, if prior belief concerning the model parameters is also invariant, as is the case in a ‘reference’ analysis, then the choice of prior distribution is considerably restricted. The analysis of multivariate categorical data in the form of a contingency table is considered in detail. We illustrate the methods with two examples.
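The Poisson-multinomial connection invoked here can be checked numerically: conditional on their total, independent Poisson counts follow a multinomial with probabilities proportional to the Poisson means. A stdlib-only sketch with arbitrary small numbers:

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    return mu ** k * exp(-mu) / factorial(k)

def multinomial_pmf(xs, probs):
    n = sum(xs)
    coef = factorial(n)
    p = 1.0
    for x, q in zip(xs, probs):
        coef //= factorial(x)   # multinomial coefficient, exact integer
        p *= q ** x
    return coef * p

# Independent Poisson cell counts with means mus, conditioned on the total,
# follow a multinomial with probabilities mu_j / sum(mu).
mus = [2.0, 3.0, 5.0]
xs = [1, 2, 4]
joint = 1.0
for x, mu in zip(xs, mus):
    joint *= poisson_pmf(x, mu)
total_pmf = poisson_pmf(sum(xs), sum(mus))
probs = [mu / sum(mus) for mu in mus]
print(joint / total_pmf, multinomial_pmf(xs, probs))  # equal values
```

The two printed numbers agree to floating-point precision, which is the identity that lets multinomial log-linear models be fitted through a Poisson likelihood.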

15.
We review some issues related to the implications of different missing data mechanisms on statistical inference for contingency tables and consider simulation studies to compare the results obtained under such models to those where the units with missing data are disregarded. We confirm that although, in general, analyses under the correct missing at random and missing completely at random models are more efficient even for small sample sizes, there are exceptions where they may not improve the results obtained by ignoring the partially classified data. We show that under the missing not at random (MNAR) model, estimates on the boundary of the parameter space as well as lack of identifiability of the parameters of saturated models may be associated with undesirable asymptotic properties of maximum likelihood estimators and likelihood ratio tests; even in standard cases the bias of the estimators may be low only for very large samples. We also show that the probability of a boundary solution obtained under the correct MNAR model may be large even for large samples and that, consequently, we may not always conclude that a MNAR model is misspecified because the estimate is on the boundary of the parameter space.

16.
This paper presents a matrix formulation for log-linear model analysis of the incomplete contingency table which arises from multiple recapture census data. Explicit matrix product expressions are given for the asymptotic covariance structure of the maximum likelihood estimators of both the log-linear model parameter vector and the predicted value vector for the observed and missing cells. These results are illustrated for data pertaining to a population of children possessing a common congenital anomaly.

17.
In the analysis of non-monotone missing data patterns in multinomial distributions for contingency tables, it is known that explicit MLEs of the unknown parameters cannot be obtained. Iterative procedures such as the EM-algorithm are therefore required to obtain the MLEs. These iterative procedures, however, may offer several potential difficulties. Andersson and Perlman [Ann. Statist. 21 (1993) 1318–1358] introduced lattice conditional independence (LCI) models for multivariate normal distributions, which can be applied to the analysis of non-monotone missing observations in continuous data (Andersson and Perlman, Statist. Probab. Lett. 12 (1991) 465–486). In this paper, we show that LCI models may also be applied to the analysis of categorical data with non-monotone missing data patterns. Under a parsimonious set of LCI assumptions naturally determined by the observed data pattern, the likelihood function for the observed data can be factored as in the monotone case and explicit MLEs can be obtained for the unknown parameters. Furthermore, the LCI assumptions can be tested by explicit likelihood ratio tests.

18.
The application of the Bayesian paradigm to model comparison can be problematic. In particular, prior distributions on the parameter space of each candidate model require special care. While it is well known that improper priors cannot be routinely used for Bayesian model comparison, we claim that the use of proper conventional priors under each model should also be regarded as suspicious, especially when comparing models having different dimensions. The basic idea is that priors should not be assigned separately under each model; rather they should be related across models, in order to acquire some degree of compatibility, and thus allow fairer and more robust comparisons. In this connection, the intrinsic prior as well as the expected posterior prior (EPP) methodology represent a useful tool. In this paper we develop a procedure based on EPP to perform Bayesian model comparison for discrete undirected decomposable graphical models, although our method could be adapted to deal also with directed acyclic graph models. We present two possible approaches: one based on imaginary data, and one that makes use of a limited number of actual observations. The methodology is illustrated through the analysis of a 2×3×4 contingency table.

19.
Pharmacokinetic (PK) data often contain concentration measurements below the quantification limit (BQL). While specific values cannot be assigned to these observations, nevertheless these observed BQL data are informative and generally known to be lower than the lower limit of quantification (LLQ). Setting BQLs as missing data violates the usual missing at random (MAR) assumption applied to the statistical methods, and therefore leads to biased or less precise parameter estimation. By definition, these data lie within the interval [0, LLQ], and can be considered as censored observations. Statistical methods that handle censored data, such as maximum likelihood and Bayesian methods, are thus useful in the modelling of such data sets. The main aim of this work was to investigate the impact of the amount of BQL observations on the bias and precision of parameter estimates in population PK models (non‐linear mixed effects models in general) under maximum likelihood method as implemented in SAS and NONMEM, and a Bayesian approach using Markov chain Monte Carlo (MCMC) as applied in WinBUGS. A second aim was to compare these different methods in dealing with BQL or censored data in a practical situation. The evaluation was illustrated by simulation based on a simple PK model, where a number of data sets were simulated from a one‐compartment first‐order elimination PK model. Several quantification limits were applied to each of the simulated data to generate data sets with certain amounts of BQL data. The average percentage of BQL ranged from 25% to 75%. Their influence on the bias and precision of all population PK model parameters such as clearance and volume distribution under each estimation approach was explored and compared. Copyright © 2009 John Wiley & Sons, Ltd.
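The censoring idea can be illustrated with a left-censored normal log-likelihood: each BQL observation contributes the probability of falling below the LLQ rather than a point density. This is a Tobit-style sketch with hypothetical values; the paper's population PK models are nonlinear mixed-effects models, not this simple:

```python
import math

def normal_logpdf(x, mu, sigma):
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mu) ** 2 / (2 * sigma ** 2))

def normal_logcdf(x, mu, sigma):
    z = (x - mu) / (sigma * math.sqrt(2))
    return math.log(0.5 * (1 + math.erf(z)))

def censored_loglik(observed, n_bql, llq, mu, sigma):
    """Log-likelihood treating BQL values as left-censored at llq:
    each of the n_bql censored observations contributes log P(X < llq)."""
    ll = sum(normal_logpdf(x, mu, sigma) for x in observed)
    ll += n_bql * normal_logcdf(llq, mu, sigma)
    return ll

# Hypothetical (log-)concentrations, 3 of 5 samples below LLQ = 0.5
print(censored_loglik([1.2, 0.8], n_bql=3, llq=0.5, mu=1.0, sigma=0.5))
```

Maximizing this likelihood over (mu, sigma) uses the information in the censored observations instead of discarding them, which is what reduces the bias the paper investigates.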

20.
ABSTRACT

This paper extends the classical methods of analysis of a two-way contingency table to the fuzzy environment for two cases: (1) when the available sample of observations is reported as imprecise data, and (2) the case in which we prefer to categorize the variables based on linguistic terms rather than as crisp quantities. For this purpose, the α-cuts approach is used to extend the usual concepts of the test statistic and p-value to the fuzzy test statistic and fuzzy p-value. In addition, some measures of association are extended to the fuzzy version in order to evaluate the dependence in such contingency tables. Some practical examples are provided to explain the applicability of the proposed methods in real-world problems.
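The α-cut construction can be sketched for a fuzzy chi-square statistic with triangular fuzzy counts. Evaluating the crisp statistic at the interval endpoints of each cut is only a heuristic approximation of the cut of the fuzzy statistic, and the counts here are hypothetical:

```python
from itertools import product

def chi2_stat(table):
    """Crisp Pearson chi-square statistic for independence."""
    R, C = len(table), len(table[0])
    total = sum(map(sum, table))
    rows = [sum(r) for r in table]
    cols = [sum(table[i][j] for i in range(R)) for j in range(C)]
    stat = 0.0
    for i in range(R):
        for j in range(C):
            e = rows[i] * cols[j] / total
            stat += (table[i][j] - e) ** 2 / e
    return stat

# Triangular fuzzy counts (lo, mode, hi); the alpha-cut at level a is the
# interval [lo + a*(mode - lo), hi - a*(hi - mode)].
fuzzy = [[(18, 20, 22), (8, 10, 12)],
         [(9, 10, 11), (28, 30, 32)]]

def stat_range(fuzzy, a):
    """Approximate the alpha-cut of the fuzzy statistic by evaluating the
    crisp statistic at all interval endpoints (heuristic, not exact)."""
    cuts = [[(lo + a * (m - lo), hi - a * (hi - m)) for (lo, m, hi) in row]
            for row in fuzzy]
    vals = []
    for pick in product(*[(0, 1)] * 4):
        t = [[cuts[i][j][pick[2 * i + j]] for j in range(2)]
             for i in range(2)]
        vals.append(chi2_stat(t))
    return min(vals), max(vals)

print(stat_range(fuzzy, 1.0))  # degenerates to the crisp statistic
print(stat_range(fuzzy, 0.0))  # widest interval of statistic values
```

At α = 1 every cut collapses to the mode, so the "fuzzy" statistic reduces to the ordinary crisp chi-square value; lower α gives wider intervals, conveying the imprecision of the data.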


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号