Similar Documents
20 similar documents found (search time: 968 ms)
1.
Grouped data exponentially weighted moving average control charts
In the manufacture of metal fasteners in a progressive die operation, and in other industrial situations, important quality dimensions cannot be measured on a continuous scale, and manufactured parts are instead classified into groups using a step gauge. This paper proposes a version of exponentially weighted moving average (EWMA) control charts applicable to monitoring such grouped data for process shifts. The run length properties of this new grouped data EWMA chart are compared with similar results previously obtained for EWMA charts for variables data and with those for cumulative sum (CUSUM) schemes based on grouped data. Grouped data EWMA charts are shown to be nearly as efficient as variables-based EWMA charts and are thus an attractive alternative when the collection of variables data is not feasible. In addition, grouped data EWMA charts are less affected by the discreteness inherent in grouped data than are grouped data CUSUM charts. In the metal fasteners application, grouped data EWMA charts were simple to implement and allowed rapid detection of undesirable process shifts.
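To make the chart mechanics concrete, here is a minimal sketch of the standard variables EWMA recursion with its time-varying control limits; in the grouped-data version the paper proposes, each raw measurement would be replaced by a score for its gauge group. The data, target mu0, lambda = 0.2 and L = 3 below are illustrative assumptions, not values from the paper.

```python
import math

def ewma_chart(xs, mu0, sigma, lam=0.2, L=3.0):
    """Return a list of (ewma, lcl, ucl) triples, one per observation."""
    z, out = mu0, []
    for i, x in enumerate(xs, start=1):
        z = lam * x + (1 - lam) * z
        # exact (finite-sample) control-limit half-width at step i
        half = L * sigma * math.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * i)))
        out.append((z, mu0 - half, mu0 + half))
    return out

# In-control start, then a sustained upward shift of roughly two sigma.
points = ewma_chart([0.1, -0.2, 0.0, 2.2, 1.9, 2.1, 2.3, 2.0], mu0=0.0, sigma=1.0)
signals = [i for i, (z, lcl, ucl) in enumerate(points) if not lcl <= z <= ucl]
```

The chart accumulates the shift over several points before signalling, which is exactly the small-shift sensitivity that distinguishes EWMA charts from Shewhart-type charts.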

2.
We consider whether one should transform to estimate nonparametrically a regression curve sampled from data with a constant coefficient of variation, i.e. with multiplicative errors. Kernel-based smoothing methods are used to provide curve estimates from the data both in the original units and after transformation. Comparisons are based on the mean-squared error (MSE) or mean integrated squared error (MISE), calculated in the original units. Even when the data are generated by the simplest multiplicative error model, the asymptotically optimal MSE (or MISE) is surprisingly not always obtained by smoothing transformed data, but in many cases by directly smoothing the original data. Which method is optimal depends on both the regression curve and the distribution of the errors. Data-based procedures which could be useful in choosing between transforming and not transforming a particular data set are discussed. The results are illustrated on simulated and real data.
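The two estimation routes being compared can be sketched with a basic Nadaraya-Watson smoother: smooth the responses directly, or smooth their logarithms and back-transform. The regression curve m(x) = 2 + sin(x), the lognormal multiplicative errors, and the bandwidth below are hypothetical choices for illustration only.

```python
import math, random

def nw(xq, xs, ys, h):
    """Nadaraya-Watson estimate at xq with a Gaussian kernel, bandwidth h."""
    w = [math.exp(-0.5 * ((xq - x) / h) ** 2) for x in xs]
    s = sum(w)
    return sum(wi * yi for wi, yi in zip(w, ys)) / s

random.seed(1)
xs = [i / 50 * 2 * math.pi for i in range(50)]
# multiplicative (lognormal) errors: y = m(x) * exp(eps)
ys = [(2 + math.sin(x)) * math.exp(random.gauss(0, 0.2)) for x in xs]

# route 1: smooth the original data directly
direct = [nw(x, xs, ys, h=0.3) for x in xs]
# route 2: smooth log(y), then back-transform
logged = [math.exp(nw(x, xs, [math.log(y) for y in ys], h=0.3)) for x in xs]

def mse(est):
    """MSE against the true curve, computed in the original units."""
    return sum((e - (2 + math.sin(x))) ** 2 for e, x in zip(est, xs)) / len(xs)
```

Comparing `mse(direct)` and `mse(logged)` over many simulated samples is the kind of original-units comparison the paper formalizes asymptotically.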

3.
Characteristics of Bidders' Bid Data in Online Auctions and Methods for Their Analysis
In traditional statistical analysis, researchers face numerical data in three forms: cross-sectional data, time-series data, and pooled data. These types of data are discrete, evenly spaced, and uniformly dense, and they are the principal objects of analysis in traditional descriptive and inferential statistics. However, data collected from auction websites, such as bidders' bids, do not share these characteristics and pose a challenge to traditional statistical methods. It is therefore necessary to explain the generating mechanism of online auction data and to analyze its characteristics in terms of data volume, mixture of data types, unequal spacing, and data density. Methods and procedures for analyzing such data are presented, illustrated with real online auction records.

4.
In practice, data are often measured repeatedly on the same individual at several points in time. Main interest often lies in characterizing the way the response changes over time and the predictors of that change. Marginal, mixed, and transition models are frequently considered the main models for continuous longitudinal data analysis. These approaches were proposed primarily for balanced longitudinal designs. However, in clinical studies, data are usually not balanced, and some restrictions are necessary in order to use these models. This paper was motivated by a data set of longitudinal height measurements in children of HIV-infected mothers, recorded at the university hospital of the Federal University in Minas Gerais, Brazil. This data set is severely unbalanced. The goal of this paper is to assess the application of continuous longitudinal models to the analysis of unbalanced data sets.

5.
The real-time polymerase chain reaction (rtPCR) provides sensitive and accurate quantitative results and has become a widespread technique for analyzing gene expression. Housekeeping genes are required as references to normalize the data of target genes, but their expression may itself be unstable. This normalization process is similar to the normalization used in analyzing high-density oligonucleotide arrays. This article evaluates the feasibility of applying normalizations developed for high-density oligonucleotide arrays to data collected in rtPCR experiments. Since the data features differ, simulations are used to evaluate the performance of these normalizations on rtPCR data based on five indices. Their feasibility is illustrated with an rtPCR data set.
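The abstract does not specify a particular normalization, so as a concrete instance of reference-gene normalization, here is the familiar 2^(-ddCt) relative-quantification rule with made-up Ct values; in this hypothetical example the target amplifies two cycles earlier in the treated sample than in the control.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene via the 2**(-ddCt) rule."""
    d_ct_sample = ct_target - ct_ref            # target normalized to housekeeping gene
    d_ct_control = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(d_ct_sample - d_ct_control)

# Hypothetical Ct values: ddCt = (22-18) - (24-18) = -2, so fold = 2**2 = 4.0
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
```

The division by the reference gene's Ct difference is exactly the step that breaks down when the housekeeping gene is itself unstable, which motivates the array-style normalizations the article evaluates.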

6.
Functional data are observed frequently in many scientific fields, and therefore most standard statistical methods are being adapted to functional data. The multivariate analysis of variance (MANOVA) problem for functional data is considered, which is of practical interest in much the same way as the one-way analysis of variance for such data. For the MANOVA problem for multivariate functional data, we propose permutation tests based on a basis function representation and tests based on random projections. Their performance is examined in comprehensive simulation studies, which give an idea of the size control and power of the tests and identify differences between them. The simulation experiments are based on artificial data and on real labeled multivariate time series data from the literature. The results suggest that the studied testing procedures can detect small differences between vectors of curves even with small sample sizes. Illustrative real data examples of the use of the proposed testing procedures in practice are also presented.
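A generic permutation-test skeleton of the kind underlying such tests might look as follows; the test statistic used here (squared distance between group mean curves over a common grid) is a simplified stand-in, not the statistic from the paper.

```python
import random

def perm_pvalue(curves_a, curves_b, n_perm=999, seed=0):
    """Permutation p-value for equality of two groups of discretized curves."""
    rng = random.Random(seed)

    def stat(a, b):
        mean = lambda g: [sum(c[j] for c in g) / len(g) for j in range(len(g[0]))]
        ma, mb = mean(a), mean(b)
        return sum((x - y) ** 2 for x, y in zip(ma, mb))

    observed = stat(curves_a, curves_b)
    pooled = list(curves_a) + list(curves_b)
    n_a, count = len(curves_a), 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign curves to groups at random
        if stat(pooled[:n_a], pooled[n_a:]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

Because group labels are exchangeable under the null, the permutation distribution gives exact size control without distributional assumptions, which is why this construction suits functional data so well.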

7.
Missing covariate data combined with censored outcomes pose a challenge in the analysis of clinical data, especially in small-sample settings. Multiple imputation (MI) techniques are popularly used to impute missing covariates, and the data are then analyzed through methods that can handle censoring. MI-based techniques are also available for imputing censored data, but they are not much used in practice. In the present study, we applied a method based on multiple imputation by chained equations to impute missing covariate values and also to impute censored outcomes using restricted survival time in small-sample settings. The completed data were then analyzed using linear regression models. Simulation studies and a real example of CHD data show that, when applied to data with missing covariate values and censored outcomes, the present method produced better estimates and lower standard errors than either an analysis of the censored-outcome data that excludes cases with missing covariates, or a complete case analysis in which cases with missing covariate values or censored outcomes are excluded.
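One elementary chained-equations step can be sketched as a regression imputation with a random draw added to each prediction; a full MICE run cycles such steps over all incomplete variables and repeats the whole process to obtain multiple imputed data sets. The linear model, single predictor, and data below are illustrative assumptions, not the study's setup.

```python
import random

def impute_once(x_complete, y_with_gaps, seed=0):
    """One regression-imputation pass: fill None values in y using x."""
    rng = random.Random(seed)
    pairs = [(x, y) for x, y in zip(x_complete, y_with_gaps) if y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    b = (sum((x - mx) * (y - my) for x, y in pairs)
         / sum((x - mx) ** 2 for x, _ in pairs))
    a = my - b * mx
    # residual noise makes this a "proper" imputation draw, not a point fill
    resid_sd = (sum((y - a - b * x) ** 2 for x, y in pairs) / max(n - 2, 1)) ** 0.5
    return [y if y is not None else a + b * x + rng.gauss(0, resid_sd)
            for x, y in zip(x_complete, y_with_gaps)]

filled = impute_once([1, 2, 3, 4, 5], [2.0, 4.1, None, 8.0, 9.9])
```

Running this with several seeds and pooling the downstream estimates is the multiple-imputation step that propagates the uncertainty of the missing values.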

8.
There are various techniques for dealing with incomplete data; some are computationally highly intensive and others less so, while all may be comparable in their efficiencies. In spite of these developments, analyses using only the complete-data subset are commonly performed in popular statistical software. In an attempt to demonstrate the efficiency and advantages of using all available data, we compared several approaches that are relatively simple but efficient alternatives to complete-data-subset analysis for repeated measures data with missing values, under the assumption of a multivariate normal distribution of the data. We also assumed that the missing values occur in a monotonic pattern and completely at random. The incomplete data procedure is demonstrated to be more powerful than the procedure using the complete data subset, generally when the within-subject correlation is large. One other principal finding is that, even with small samples for which various covariance models may be indistinguishable, the empirical size and power are sensitive to misspecified assumptions about the covariance structure. Overall, testing procedures that do not assume any particular covariance structure are shown to be more robust in keeping the empirical size at the nominal level than those assuming a special structure.

9.
Compositional data are characterized by values containing relative information, and thus the ratios between the data values are of interest for the analysis. Due to specific features of compositional data, standard statistical methods should be applied to compositions expressed in a proper coordinate system with respect to an orthonormal basis. It is discussed how three-way compositional data can be analyzed with the Parafac model. When data are contaminated by outliers, robust estimates for the Parafac model parameters should be employed. It is demonstrated how robust estimation can be done in the context of compositional data and how the results can be interpreted. A real data example from macroeconomics underlines the usefulness of this approach.
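Expressing a composition in log-ratio coordinates, as the abstract prescribes, can be sketched with the centred log-ratio (clr) transform; a specific ilr basis would instead give one fewer coordinate, but clr keeps the illustration short.

```python
import math

def clr(x):
    """Centred log-ratio transform of a composition (all parts > 0)."""
    g = math.exp(sum(math.log(v) for v in x) / len(x))  # geometric mean
    return [math.log(v / g) for v in x]

coords = clr([0.2, 0.3, 0.5])
# clr coordinates always sum to zero (up to rounding)
```

Working with such log-ratio coordinates rather than the raw proportions is what lets standard (and robust) multivariate methods, including Parafac-type decompositions, be applied legitimately.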

10.
Concomitant medications are medications used by patients in a clinical trial other than the investigational drug. These data are routinely collected in clinical trials, usually in a longitudinal manner for the duration of patients' participation. The routine summaries of these data are of incidence type, describing whether or not a medication was ever administered during the study; the longitudinal aspect of the data is essentially ignored. The aim of this article is to suggest exploratory methods for graphically displaying the longitudinal features of the data using a well-established estimator called the 'mean cumulative function'. This estimator permits summary and graphical display of the data and the preparation of statistical tests to compare groups. It can also incorporate information on censoring of patient data. Copyright © 2009 John Wiley & Sons, Ltd.
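A bare-bones version of the mean cumulative function (MCF) estimator might look like this: at each event time the MCF increases by (events at that time) / (subjects still under observation), which is how censoring enters the estimate. The event and censoring times below are made up, ties are processed one event at a time, and each event is assumed to occur while its subject is still under observation.

```python
def mcf(event_times_by_subject, censor_times):
    """Mean cumulative function as a list of (event time, cumulative mean) pairs."""
    events = sorted(t for ts in event_times_by_subject for t in ts)
    m, curve = 0.0, []
    for t in events:
        # subjects whose follow-up extends to this event time
        at_risk = sum(1 for c in censor_times if c >= t)
        m += 1.0 / at_risk
        curve.append((t, m))
    return curve

# two subjects: one censored at t=10 with an event at t=2,
# one censored at t=6 with events at t=5 and t=8 observed on another record
curve = mcf([[2], [5, 8]], [10, 6])
```

Plotting one such step function per treatment group is the kind of exploratory longitudinal display the article advocates over incidence-only summaries.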

11.
Huang Hengjun (黄恒君). Statistical Research (《统计研究》), 2019, 36(7): 3-12
Big data hold great potential for statistical production and can help build a high-quality statistical production system, but the characteristics of data sources suited to statistical-production goals, and their data quality issues, remain to be clarified. Starting from the common ground between big data sources and traditional statistical data sources, this paper discusses big data sources in statistical production and their data quality issues, and then explores the integrated application of big data and traditional statistical production. It first delimits, from the two aspects of the data-generating process and data characteristics, the big data sources usable for statistical production; it then discusses data quality issues in big-data-based statistical production under a generalized data quality framework, identifying the key quality-control points and quality defects along the big data statistical production process; finally, based on the data quality analysis, it proposes an approach for building a statistical system that integrates big data into traditional surveys.

12.
The small sample performance of least median of squares (LMS), reweighted least squares (RLS), least squares, least absolute deviations, and three partially adaptive estimators is compared using Monte Carlo simulations. Two data problems are addressed: (1) data generated from non-normal error distributions and (2) contaminated data. Breakdown plots are used to investigate the sensitivity of partially adaptive estimators to data contamination relative to RLS. One partially adaptive estimator performs especially well when the errors are skewed, while another partially adaptive estimator and RLS perform particularly well when the errors are extremely leptokurtic. In comparison with RLS, partially adaptive estimators are only moderately effective in resisting data contamination; however, they outperform the least squares and least absolute deviations estimators.
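The least median of squares criterion can be illustrated with a toy line fit: among candidate lines through pairs of points, keep the one minimizing the median squared residual. This exhaustive pairwise search is a didactic stand-in for the random-subsampling algorithms used in practice, and the data are invented.

```python
import statistics
from itertools import combinations

def lms_line(pts):
    """Fit y = a + b*x by minimizing the median squared residual."""
    best = None
    for (x1, y1), (x2, y2) in combinations(pts, 2):
        if x1 == x2:
            continue  # vertical candidate line, skip
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        med = statistics.median([(y - a - b * x) ** 2 for x, y in pts])
        if best is None or med < best[0]:
            best = (med, a, b)
    return best[1], best[2]

# Points on y = 1 + 2x plus one gross outlier, which LMS resists.
a, b = lms_line([(0, 1), (1, 3), (2, 5), (3, 7), (4, 100)])
```

Because the median of the squared residuals ignores up to half the points, the outlier at x = 4 has no pull on the fit, in contrast with least squares. RLS then typically refits by ordinary least squares on the points LMS flags as clean.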

13.
This paper compares the performance of weighted generalized estimating equations (WGEEs), multiple imputation based on generalized estimating equations (MI-GEE) and generalized linear mixed models (GLMMs) for analyzing incomplete longitudinal binary data when the underlying study is subject to dropout. The paper aims to explore how these methods perform in handling dropouts that are missing at random (MAR). The methods are compared on simulated data: longitudinal binary data are generated from a logistic regression model under different sample sizes, and incomplete data are created for three different dropout rates. The methods are evaluated in terms of bias, precision and mean square error in the case where data are subject to MAR dropout. In conclusion, across the simulations performed, the MI-GEE method performed better for both small and large sample sizes. Evidently, this should not be seen as formal and definitive proof, but it adds to the body of knowledge about the methods' relative performance. In addition, the methods are compared using data from a randomized clinical trial.

14.
This article reexamines the famous barley data that are often used to demonstrate dot plots. Additional sources of supplemental data provide context for interpretation of the original data. Graphical and mixed-model analyses shed new light on the variability in the data and challenge previously held beliefs about the accuracy of the data. Supplementary materials for this article are available online.

15.
Cramér–von Mises type goodness-of-fit tests for case 2 interval-censored data are proposed based on a resampling method called the leveraged bootstrap, and their asymptotic consistency is shown. The proposed tests are computationally efficient and can in fact be applied to other types of censored data, including right-censored data, doubly censored data and (mixtures of) case k interval-censored data. Some simulation results and an example from AIDS research are presented.

16.
Principal component analysis and correspondence analysis can both be used as exploratory methods for representing multivariate data in two dimensions. Circumstances are noted under which the (possibly inappropriate) application of principal components to untransformed compositional data approximates a correspondence analysis of the raw data. Aitchison (1986) has proposed a method for the principal component analysis of compositional data involving transformation of the raw data. It is shown how this can be approximated by a correspondence analysis of appropriately transformed data. The latter approach may be preferable when there are zeroes in the data.

17.
Several nonparametric tests for the multivariate multi-sample location problem are proposed in this paper. These tests are based on the notion of data depth, which measures the centrality or outlyingness of a given point with respect to a given distribution or data cloud. The proposed tests are completely nonparametric and are implemented through the idea of permutation tests. Their performance is compared with an existing parametric test and an existing nonparametric test based on data depth. An extensive simulation study reveals that the proposed tests are superior to the existing data-depth-based tests with regard to power. Illustrations with real data are provided.
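The depth notion is easiest to see in one dimension, where Tukey's halfspace depth of a point is simply the smaller of the fractions of data lying on either side of it; a sketch:

```python
def halfspace_depth_1d(x, data):
    """Tukey halfspace depth of x relative to a 1-D data cloud."""
    n = len(data)
    return min(sum(1 for d in data if d <= x),
               sum(1 for d in data if d >= x)) / n

data = [1, 2, 3, 4, 5]
# the median (3) is the deepest point; the extremes have depth 1/n
```

In higher dimensions the minimum is taken over all halfspaces containing the point, and depth-based location tests compare how central each sample's points are within the pooled cloud.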

18.
Summary. In spite of widespread criticism, macroeconometric models are still the most popular tools for forecasting and policy analysis. When the most recent data available on both the exogenous and the endogenous variables are preliminary estimates subject to a revision process, the estimators of the coefficients are affected by the presence of the preliminary data, the projections for the exogenous variables are affected by data uncertainty, and the values of lagged dependent variables used as initial values for forecasts are still subject to revision. Since several provisional estimates of the value of a given variable are available before the data are finalized, in this paper they are viewed as repeated predictions of the same quantity (referring to different information sets, not necessarily overlapping with one another) to be exploited in a forecast combination framework. The components of the asymptotic bias and of the asymptotic mean square prediction error related to data uncertainty can be reduced or eliminated by using a forecast combination technique which makes the deterministic and the Monte Carlo predictors no worse than either predictor used with or without provisional data. The precision of the forecast with the nonlinear model can be improved if the provisional data are not rational predictions of the final data and contain systematic effects. Economics Department, European University Institute. Thanks are due to my Ph.D. thesis advisor Bobby Mariano for his guidance and encouragement at various stages of this research. The comments of the participants in the European Meeting of the Econometric Society in Maastricht, Aug. 1994, helped in improving the presentation. A grant from the NSF (SES 8604219) is gratefully acknowledged.
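The combination idea can be sketched in a few lines: treat successive provisional estimates as competing forecasts of the final figure and pool them with weights inversely proportional to their (here assumed known) mean squared errors. The numbers below are invented for illustration.

```python
def combine(forecasts, mses):
    """Pool forecasts with weights proportional to 1 / MSE."""
    w = [1.0 / m for m in mses]
    s = sum(w)
    return sum(wi / s * f for wi, f in zip(w, forecasts))

combined = combine([100.0, 104.0], [4.0, 4.0])  # equal MSEs -> plain average
```

Down-weighting the noisier provisional release in this way is what lets the combined predictor dominate either source used alone, especially when a release carries systematic (non-rational) revision effects that the weights can partially offset.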

19.
Frequently, count data obtained from dilution assays are subject to an upper detection limit, and as such the data are usually censored. Also, counts from the same subject at different dilution levels are correlated. Ignoring the censoring and the correlation may produce unreliable and misleading results; any meaningful modeling therefore requires that the censoring and the correlation be addressed simultaneously. Such comprehensive approaches are not widely used in the analysis of dilution assay data. Traditionally, these data are analyzed with a general linear model applied to a log-transformed average count per subject. However, this traditional approach ignores the between-subject variability and risks producing inconsistent results and unreliable conclusions. In this paper, we propose a censored negative binomial model with normal random effects for such data. This model addresses, in addition to the censoring and the correlation, any overdispersion that may be present in the count data. The model is shown to be widely accessible through several modern statistical software packages.
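A sketch of one observation's likelihood contribution under such a censored negative binomial model (the normal random effect is omitted to keep it short): an observed count contributes its NB probability, while a count at the detection limit contributes the tail probability P(Y >= limit). The NB2 parameterization and all values are illustrative assumptions.

```python
import math

def nb_pmf(y, mu, k):
    # NB2 parameterization: mean mu, dispersion k, variance mu + mu**2 / k
    return (math.gamma(y + k) / (math.gamma(k) * math.factorial(y))
            * (k / (k + mu)) ** k * (mu / (k + mu)) ** y)

def censored_loglik(y, mu, k, limit):
    """Log-likelihood contribution of one count censored above `limit`."""
    if y < limit:
        return math.log(nb_pmf(y, mu, k))
    # at the detection limit: only P(Y >= limit) is known
    return math.log(1.0 - sum(nb_pmf(j, mu, k) for j in range(limit)))
```

Summing such terms over dilution levels, with mu tied to the dilution and a shared subject-level random effect integrated out, gives the full likelihood the paper maximizes.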

20.
We find that existing multiple imputation procedures that are currently implemented in major statistical packages, and thus available to the wide majority of data analysts, are limited with regard to handling incomplete panel data. We review various missing data methods that we deem useful for the analysis of incomplete panel data and discuss how some of the shortcomings of existing procedures can be overcome. In a simulation study based on real panel data, we illustrate these procedures' quality and outline fruitful avenues of future research.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号