期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Grouped data exponentially weighted moving average control charts 总被引：2，自引：0，他引：2

Stefan H. Steiner 《Journal of the Royal Statistical Society. Series C, Applied statistics》1998,47(2):203-216

In the manufacture of metal fasteners in a progressive die operation, and other industrial situations, important quality dimensions cannot be measured on a continuous scale, and manufactured parts are classified into groups by using a step gauge. This paper proposes a version of exponentially weighted moving average (EWMA) control charts that are applicable to monitoring the grouped data for process shifts. The run length properties of this new grouped data EWMA chart are compared with similar results previously obtained for EWMA charts for variables data and with those for cumulative sum (CUSUM) schemes based on grouped data. Grouped data EWMA charts are shown to be nearly as efficient as variables-based EWMA charts and are thus an attractive alternative when the collection of variables data is not feasible. In addition, grouped data EWMA charts are less affected by the discreteness that is inherent in grouped data than are grouped data CUSUM charts. In the metal fasteners application, grouped data EWMA charts were simple to implement and allowed the rapid detection of undesirable process shifts. 相似文献

2.

A robust pairwise likelihood method for incomplete longitudinal binary data arising in clusters

Grace Y. Yi Leilei Zeng Richard J. Cook 《Revue canadienne de statistique》2011,39(1):34-51

Clustered longitudinal data feature cross‐sectional associations within clusters, serial dependence within subjects, and associations between responses at different time points from different subjects within the same cluster. Generalized estimating equations are often used for inference with data of this sort since they do not require full specification of the response model. When data are incomplete, however, they require data to be missing completely at random unless inverse probability weights are introduced based on a model for the missing data process. The authors propose a robust approach for incomplete clustered longitudinal data using composite likelihood. Specifically, pairwise likelihood methods are described for conducting robust estimation with minimal model assumptions made. The authors also show that the resulting estimates remain valid for a wide variety of missing data problems including missing at random mechanisms and so in such cases there is no need to model the missing data process. In addition to describing the asymptotic properties of the resulting estimators, it is shown that the method performs well empirically through simulation studies for complete and incomplete data. Pairwise likelihood estimators are also compared with estimators obtained from inverse probability weighted alternating logistic regression. An application to data from the Waterloo Smoking Prevention Project is provided for illustration. The Canadian Journal of Statistics 39: 34–51; 2011 © 2010 Statistical Society of Canada 相似文献

3.

Multivariate analysis of variance for functional data

T. Górecki 《Journal of applied statistics》2017,44(12):2172-2189

Functional data are being observed frequently in many scientific fields, and therefore most of the standard statistical methods are being adapted for functional data. The multivariate analysis of variance problem for functional data is considered. It seems to be of practical interest similarly as the one-way analysis of variance for such data. For the MANOVA problem for multivariate functional data, we propose permutation tests based on a basis function representation and tests based on random projections. Their performance is examined in comprehensive simulation studies, which provide an idea of the size control and power of the tests and identify differences between them. The simulation experiments are based on artificial data and real labeled multivariate time series data found in the literature. The results suggest that the studied testing procedures can detect small differences between vectors of curves even with small sample sizes. Illustrative real data examples of the use of the proposed testing procedures in practice are also presented. 相似文献

4.

A likelihood-based approach for multivariate one-sided tests with missing data

Guohai Zhou Lang Wu Rollin Brant J. Mark Ansermino 《Journal of applied statistics》2017,44(11):2000-2016

Inequality-restricted hypotheses testing methods containing multivariate one-sided testing methods are useful in practice, especially in multiple comparison problems. In practice, multivariate and longitudinal data often contain missing values since it may be difficult to observe all values for each variable. However, although missing values are common for multivariate data, statistical methods for multivariate one-sided tests with missing values are quite limited. In this article, motivated by a dataset in a recent collaborative project, we develop two likelihood-based methods for multivariate one-sided tests with missing values, where the missing data patterns can be arbitrary and the missing data mechanisms may be non-ignorable. Although non-ignorable missing data are not testable based on observed data, statistical methods addressing this issue can be used for sensitivity analysis and might lead to more reliable results, since ignoring informative missingness may lead to biased analysis. We analyse the real dataset in details under various possible missing data mechanisms and report interesting findings which are previously unavailable. We also derive some asymptotic results and evaluate our new tests using simulations. 相似文献

5.

Forecasting multivariate longitudinal binary data with marginal and marginally specified models

《Journal of Statistical Computation and Simulation》2012,82(2):414-429

Forecasting with longitudinal data has been rarely studied. Most of the available studies are for continuous response and all of them are for univariate response. In this study, we consider forecasting multivariate longitudinal binary data. Five different models including simple ones, univariate and multivariate marginal models, and complex ones, marginally specified models, are studied to forecast such data. Model forecasting abilities are illustrated via a real-life data set and a simulation study. The simulation study includes a model independent data generation to provide a fair environment for model competitions. Independent variables are forecast as well as the dependent ones to mimic the real-life cases best. Several accuracy measures are considered to compare model forecasting abilities. Results show that complex models yield better forecasts. 相似文献

6.

Inference for Bivariate Survival Data by Copula Models Adjusted for the Boundary Effect

Aidong Adam Ding Weijing Wang 《统计学通讯:理论与方法》2013,42(16):2927-2936

Copula models describe the dependence structure of two random variables separately from their marginal distributions and hence are particularly useful in studying the association for bivariate survival data. Semiparametric inference for bivariate survival data based on copula models has been studied for various types of data, including complete data, right-censored data, and current status data. This article discusses the boundary effect on these inference procedures, a problem that has been neglected in the previous literature. Specifically, asymptotic distribution of the association estimator on the boundary of parameter space is derived for one-dimensional copula models. The boundary properties are applied to test independence and to study the estimation efficiency. Simulation study is conducted for the bivariate right-censored data and current status data. 相似文献

7.

Generating synthetic data to produce public-use microdata for small geographic areas based on complex sample survey data with application to the National Health Interview Survey

Joseph W. Sakshaug Trivellore E. Raghunathan 《Journal of applied statistics》2014,41(10):2103-2122

Small area statistics obtained from sample survey data provide a critical source of information used to study health, economic, and sociological trends. However, most large-scale sample surveys are not designed for the purpose of producing small area statistics. Moreover, data disseminators are prevented from releasing public-use microdata for small geographic areas for disclosure reasons; thus, limiting the utility of the data they collect. This research evaluates a synthetic data method, intended for data disseminators, for releasing public-use microdata for small geographic areas based on complex sample survey data. The method replaces all observed survey values with synthetic (or imputed) values generated from a hierarchical Bayesian model that explicitly accounts for complex sample design features, including stratification, clustering, and sampling weights. The method is applied to restricted microdata from the National Health Interview Survey and synthetic data are generated for both sampled and non-sampled small areas. The analytic validity of the resulting small area inferences is assessed by direct comparison with the actual data, a simulation study, and a cross-validation study. 相似文献

8.

Practical modeling strategies for unbalanced longitudinal data analysis

Enrico A. Colosimo Maria Arlene Fausto Marta Afonso Freitas Jorge Andrade Pinto 《Journal of applied statistics》2012,39(9):2005-2013

In practice, data are often measured repeatedly on the same individual at several points in time. Main interest often relies in characterizing the way the response changes in time, and the predictors of that change. Marginal, mixed and transition are frequently considered to be the main models for continuous longitudinal data analysis. These approaches are proposed primarily for balanced longitudinal design. However, in clinic studies, data are usually not balanced and some restrictions are necessary in order to use these models. This paper was motivated by a data set related to longitudinal height measurements in children of HIV-infected mothers that was recorded at the university hospital of the Federal University in Minas Gerais, Brazil. This data set is severely unbalanced. The goal of this paper is to assess the application of continuous longitudinal models for the analysis of unbalanced data set. 相似文献

9.

Perfect aggregation of Bayesian analysis on compositional data

Tzu-Tsung Wong 《Statistical Papers》2007,48(2):265-282

Sufficiency is a widely used concept for reducing the dimensionality of a data set. Collecting data for a sufficient statistic is generally much easier and less expensive than collecting all of the available data. When the posterior distributions of a quantity of interest given the aggregate and disaggregate data are identical, perfect aggregation is said to hold, and in this case the aggregate data is a sufficient statistic for the quantity of interest. In this paper, the conditions for perfect aggregation are shown to depend on the functional form of the prior distribution. When the quantity of interest is the sum of some parameters in a vector having either a generalized Dirichlet or a Liouville distribution for analyzing compositional data, necessary and sufficient conditions for perfect aggregation are also established. 相似文献

10.

A Bayesian semiparametric method for analyzing length-biased data

Nusrat Harun Bo Cai Yu Shen 《Journal of applied statistics》2021,48(6):977

Survival data obtained from prevalent cohort study designs are often subject to length-biased sampling. Frequentist methods including estimating equation approaches, as well as full likelihood methods, are available for assessing covariate effects on survival from such data. Bayesian methods allow a perspective of probability interpretation for the parameters of interest, and may easily provide the predictive distribution for future observations while incorporating weak prior knowledge on the baseline hazard function. There is lack of Bayesian methods for analyzing length-biased data. In this paper, we propose Bayesian methods for analyzing length-biased data under a proportional hazards model. The prior distribution for the cumulative hazard function is specified semiparametrically using I-Splines. Bayesian conditional and full likelihood approaches are developed for analyzing simulated and real data. 相似文献

11.

Defective data in the analysis of variance

S. C. Pearce 《Journal of applied statistics》1986,13(2):139-147

Several kinds of defective data are considered, the first being that of data completely missing. General expressions for block designs are given for the variances of treatment contrasts. Sometimes even a moderate loss of data will seriously increase an important variance. A similar defect arises when some plots receive the wrong treatment. Again the consequences can be serious though, if the correct form of analysis used, variances of contrasts may be much better estimated than if defective data are regarded as missing. Finally, the problem of the mixed-up plots is considered, the total value for certain plots being known but not the individual data. Formulae are presented for dealing with such a case. Again the correct form of analysis can sometimes retain useful information that would be lost if affected plots are regarded as missing. Finally some considerations are set out for dealing with defective data in general. 相似文献

12.

Assessing the adequacy of Weibull survival models: a simulated envelope approach

Yun Zhao Kelvin K.W. Yau Geoffrey J. McLachlan 《Journal of applied statistics》2011,38(10):2089-2097

The Weibull proportional hazards model is commonly used for analysing survival data. However, formal tests of model adequacy are still lacking. It is well known that residual-based goodness-of-fit measures are inappropriate for censored data. In this paper, a graphical diagnostic plot of Cox–Snell residuals with a simulated envelope added is proposed to assess the adequacy of Weibull survival models. Both single component and two-component mixture models with random effects are considered for recurrent failure time data. The effectiveness of the diagnostic method is illustrated using simulated data sets and data on recurrent urinary tract infections of elderly women. 相似文献

13.

The em algorithm for the quasi-likelihood regression model

Myunghee Cho Paik 《统计学通讯:理论与方法》2013,42(6):1403-1430

The objective of this paper is to present a method which can accommodate certain types of missing data by using the quasi-likelihood function for the complete data. This method can be useful when we can make first and second moment assumptions only; in addition, it can be helpful when the EM algorithm applied to the actual likelihood becomes overly complicated. First we derive a loss function for the observed data using an exponential family density which has the same mean and variance structure of the complete data. This loss function is the counterpart of the quasi-deviance for the observed data. Then the loss function is minimized using the EM algorithm. The use of the EM algorithm guarantees a decrease in the loss function at every iteration. When the observed data can be expressed as a deterministic linear transformation of the complete data, or when data are missing completely at random, the proposed method yields consistent estimators. Examples are given for overdispersed polytomous data, linear random effects models, and linear regression with missing covariates. Simulation results for the linear regression model with missing covariates show that the proposed estimates are more efficient than estimates based on completely observed units, even when outcomes are bimodal or skewed. 相似文献

14.

Nonparametric Applications of Shaffer's Extension of Dennett's Procedure

Kenneth J. Levy 《The American statistician》2013,67(2):99-102

Shaffer's extensions and generalization of Dunnett's procedure are shown to be applicable in several nonparametric data analyses. Applications are considered within the context of the Kruskal-Wallis one-way analysis of variance (ANOVA) test for ranked data, Friedman's two-way ANOVA test for ranked data, and Cochran's test of change for dichotomous data. 相似文献

15.

A robust Parafac model for compositional data

M. A. Di Palma P. Filzmoser M. Gallo K. Hron 《Journal of applied statistics》2018,45(8):1347-1369

Compositional data are characterized by values containing relative information, and thus the ratios between the data values are of interest for the analysis. Due to specific features of compositional data, standard statistical methods should be applied to compositions expressed in a proper coordinate system with respect to an orthonormal basis. It is discussed how three-way compositional data can be analyzed with the Parafac model. When data are contaminated by outliers, robust estimates for the Parafac model parameters should be employed. It is demonstrated how robust estimation can be done in the context of compositional data and how the results can be interpreted. A real data example from macroeconomics underlines the usefulness of this approach. 相似文献

16.

Exploratory data structure comparisons: three new visual tools based on principal component analysis

Anne Helby Petersen Bo Markussen Karl Bang Christensen 《Journal of applied statistics》2021,48(9):1675

Datasets are sometimes divided into distinct subsets, e.g. due to multi-center sampling, or to variations in instruments, questionnaire item ordering or mode of administration, and the data analyst then needs to assess whether a joint analysis is meaningful. The Principal Component Analysis-based Data Structure Comparisons (PCADSC) tools are three new non-parametric, visual diagnostic tools for investigating differences in structure for two subsets of a dataset through covariance matrix comparisons by use of principal component analysis. The PCADCS tools are demonstrated in a data example using European Social Survey data on psychological well-being in three countries, Denmark, Sweden, and Bulgaria. The data structures are found to be different in Denmark and Bulgaria, and thus a comparison of for example mean psychological well-being scores is not meaningful. However, when comparing Denmark and Sweden, very similar data structures, and thus comparable concepts of well-being, are found. Therefore, inter-country comparisons are warranted for these countries. 相似文献

17.

Observable trend-projecting state-space models

E. J. Godolphin 《Journal of applied statistics》2001,28(3):379-389

Much attention has focused in recent years on the use of state-space models for describing and forecasting industrial time series. However, several state-space models that are proposed for such data series are not observable and do not have a unique representation, particularly in situations where the data history suggests marked seasonal trends. This raises major practical difficulties since it becomes necessary to impose one or more constraints and this implies a complicated error structure on the model. The purpose of this paper is to demonstrate that state-space models are useful for describing time series data for forecasting purposes and that there are trend-projecting state-space components that can be combined to provide observable state-space representations for specified data series. This result is particularly useful for seasonal or pseudo-seasonal time series. A well-known data series is examined in some detail and several observable state-space models are suggested and compared favourably with the constrained observable model. 相似文献

18.

Constructing likelihood functions for interval-valued random variables

X. Zhang B. Beranger S. A. Sisson 《Scandinavian Journal of Statistics》2020,47(1):1-35

相似文献

19.

From grouped to de-grouped data: a new approach in distribution fitting for grouped data

Ying-Ju Chen Tatjana Miljkovic 《Journal of Statistical Computation and Simulation》2019,89(2):272-291

Sampling within a given interval with a constraint has not been previously considered. Standard parametric simulation engines require knowledge of the parameters of the distribution from which a sample is drawn. These methods are limited if additional constrains are required for the simulated data. We propose a method that generates the targeted number of individual observations within a given interval with a constraint that the average value of observations is known. This method is further extended to a grouped data setting, as a way of data de-grouping, when the frequency and average value of observations are provided for each group. Several simulation studies are employed to evaluate the performance of the proposed method, in case of both a single interval and grouped data, for different simulation settings. Furthermore, the proposed method is evaluated in the parameter recovery when different distributions are fitted to the de-grouped data. This method is found to be superior to the uniform method previously used in data de-grouping. The results of the simulation study are promising and they show that this method can be used successfully in the applications where data de-grouping requires that the average value of observations is maintained in each group. The application of the proposed method is illustrated on a real data of insurance losses for bodily injury claims. 相似文献

20.

Strategies for Improving Policy-Relevant Data on the Poor

Martin David 《The American statistician》2013,67(3):129-134

This article reexamines the famous barley data that are often used to demonstrate dot plots. Additional sources of supplemental data provide context for interpretation of the original data. Graphical and mixed-model analyses shed new light on the variability in the data and challenge previously held beliefs about the accuracy of the data. Supplementary materials for this article are available online. 相似文献