Similar Literature
20 similar documents retrieved (search time: 46 ms)
1.
Usual fitting methods for the nested error linear regression model are known to be very sensitive to the effect of even a single outlier. Robust approaches for the unbalanced nested error model with proven robustness and efficiency properties, such as M-estimators, are typically obtained through iterative algorithms. These algorithms are often computationally intensive and require robust estimates of the same parameters as starting values, but so far no robust starting values have been proposed for this model. This paper proposes computationally fast robust estimators for the variance components under an unbalanced nested error model, based on a simple robustification of the fitting-of-constants method, or Henderson method III. These estimators can be used as starting values for other iterative methods. Our simulations show that they are highly robust to various types of contamination of different magnitudes.
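The following is a minimal sketch, not the authors' robust estimator, of classical ANOVA/Henderson-type variance-component estimation for an unbalanced one-way nested error model y_ij = mu + u_i + e_ij; the closing comment marks where robust scale estimates (e.g. MAD-based) could be substituted, as the paper proposes. The function name and data layout are hypothetical.

```python
import numpy as np

def anova_variance_components(y, groups):
    """ANOVA (Henderson-type) estimates of the error and group variance
    components for an unbalanced one-way nested error model."""
    y, groups = np.asarray(y, float), np.asarray(groups)
    labels, counts = np.unique(groups, return_counts=True)
    N, m = y.size, labels.size
    grand_mean = y.mean()

    sse = 0.0   # within-group sum of squares
    ssa = 0.0   # between-group sum of squares
    for lab, n_i in zip(labels, counts):
        yi = y[groups == lab]
        sse += np.sum((yi - yi.mean()) ** 2)
        ssa += n_i * (yi.mean() - grand_mean) ** 2

    sigma_e2 = sse / (N - m)
    c = N - np.sum(counts ** 2) / N          # coefficient of sigma_u^2 in E[SSA]
    sigma_u2 = max((ssa - (m - 1) * sigma_e2) / c, 0.0)
    # A robust version would replace the sums of squares above with
    # (squared) robust scale estimates before solving the moment equations.
    return sigma_e2, sigma_u2
```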

2.
Different quality control charts for the sample mean are developed using ranked set sampling (RSS) and two of its modifications, namely median ranked set sampling (MRSS) and extreme ranked set sampling (ERSS). These new charts are compared to the usual control charts based on simple random sampling (SRS) data. The charts based on RSS or one of its modifications are shown to have a smaller average run length (ARL) than the classical chart when there is a sustained shift in the process mean. When the MRSS and ERSS methods are compared with RSS and SRS, it turns out that MRSS dominates all other methods in terms of out-of-control ARL performance. Real data are collected using RSS, MRSS, and ERSS in cases of perfect and imperfect ranking. These data sets are used to construct the corresponding control charts, which are compared to the usual SRS chart. Throughout this study we assume that the underlying distribution is normal. A check of normality for our example data set indicated that the normality assumption is reasonable.
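As a minimal simulation sketch, assuming a standard normal in-control process and perfect ranking, the snippet below illustrates why an RSS-based mean chart has tighter limits (and hence smaller out-of-control ARL) than an SRS chart; the set size, number of replications, and 3-sigma limits are illustrative choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

def rss_sample(k, rng):
    """One ranked-set sample of size k under perfect ranking:
    draw k sets of k values; keep the r-th order statistic of set r."""
    sets = np.sort(rng.normal(size=(k, k)), axis=1)
    return sets[np.arange(k), np.arange(k)]

k, reps = 5, 20000
srs_means = rng.normal(size=(reps, k)).mean(axis=1)
rss_means = np.array([rss_sample(k, rng).mean() for _ in range(reps)])

print("SD of SRS mean:", srs_means.std())   # ~ 1/sqrt(5) ≈ 0.447
print("SD of RSS mean:", rss_means.std())   # smaller -> tighter chart limits

# Shewhart-type 3-sigma limits for the RSS chart (in-control mean 0)
sigma_rss = rss_means.std()
print("control limits:", (-3 * sigma_rss, 3 * sigma_rss))
```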

3.
It has been proposed that there is a familial relationship in the shape of the spine. This paper describes a pilot study investigating familial shape in the sagittal plane (side view), using three data sets of normal Leeds schoolchildren. The study is exploratory in nature, because only small samples were available. Data acquisition was by means of the Quantec system, which obtains surface shape measurements and extracts a line representing the spinal curve. The coordinates of the spine line in the sagittal plane are then used to investigate familial correlations of spinal shape. The spine lines first undergo some preprocessing, including Procrustes rotations to remove location, rotation and size effects. Smoothed principal component analysis of the curves provides suitable shape variables, and familial correlations between curves are then investigated. The covariates sex and height are also investigated in the analysis. It does appear that there could be some evidence for familial correlations in sagittal spinal shape, although a further large-scale study is required. Finally, the approach and possible alternatives are discussed.

4.
In this paper, we study the change-point inference problem motivated by genomic data collected for the purpose of monitoring DNA copy number changes. DNA copy number changes, or copy number variations (CNVs), correspond to chromosomal aberrations and signify abnormality of a cell. Cancer development and other related diseases are usually associated with DNA copy number changes on the genome. Such data contain inherent random noise; therefore, an appropriate statistical model is needed to identify statistically significant DNA copy number changes. This type of statistical inference is evidently crucial in cancer research, clinical diagnostic applications, and other related genomic research. For the high-throughput genomic data resulting from DNA copy number experiments, a mean and variance change point model (MVCM) for detecting the CNVs is appropriate. We propose a Bayesian approach to the MVCM for the case of a single change and use a sliding window to search for all CNVs on a given chromosome. We carry out simulation studies to evaluate the estimate of the locus of the DNA copy number change using the derived posterior probability. These simulation results show that the approach is suitable for identifying copy number changes. The approach is also illustrated on several chromosomes from data on nine fibroblast cancer cell lines (array-based comparative genomic hybridization data). All DNA copy number aberrations that had been identified and verified by karyotyping are detected by our approach on these cell lines.
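Below is a minimal, likelihood-ratio-flavored sketch of a sliding-window scan for a joint mean/variance change under a normal model; it illustrates the MVCM idea but is not the authors' Bayesian posterior calculation, and the window and step sizes are arbitrary placeholders.

```python
import numpy as np

def mv_change_stat(x):
    """Log-likelihood-ratio statistics for a joint mean/variance change
    at each interior split point of x, under a normal model."""
    x = np.asarray(x, float)
    n = x.size
    stats = np.full(n, -np.inf)
    full_ll = -0.5 * n * np.log(x.var())
    for t in range(2, n - 2):               # keep a few points on each side
        v1, v2 = x[:t].var(), x[t:].var()
        if v1 <= 0 or v2 <= 0:
            continue
        split_ll = -0.5 * (t * np.log(v1) + (n - t) * np.log(v2))
        stats[t] = 2 * (split_ll - full_ll)
    return stats

def scan(signal, window=100, step=50):
    """Slide a window along a long intensity/log-ratio profile and record
    the best candidate change point in each window."""
    hits = []
    for start in range(0, len(signal) - window + 1, step):
        s = mv_change_stat(signal[start:start + window])
        t = int(np.argmax(s))
        hits.append((start + t, s[t]))
    return hits
```

In practice, candidate loci with large statistics would then be assessed against a threshold or, as in the paper, through a posterior probability for the change location.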

5.
In applied work, the two-parameter gamma model gives useful representations of many physical situations. It has a two-dimensional sufficient statistic for the two parameters, which describe shape and scale. This makes it superficially comparable to the normal model, but accurate and simple statistical inference procedures for each parameter have not been available. In this paper, the saddlepoint approximation is applied to approximate observed levels of significance for the shape parameter. An averaging method is proposed to approximate observed levels of significance for the scale parameter. These methods are extended to the two-sample case.

6.
In the regression setting, dimension reduction allows for complicated regression structures to be detected via visualisation in a low‐dimensional framework. However, some popular dimension reduction methodologies fail to achieve this aim when faced with a problem often referred to as symmetric dependency. In this paper we show how vastly superior results can be achieved when carrying out response and predictor transformations for methods such as least squares and sliced inverse regression. These transformations are simple to implement and utilise estimates from other dimension reduction methods that are not faced with the symmetric dependency problem. We highlight the effectiveness of our approach via simulation and an example. Furthermore, we show that ordinary least squares can effectively detect multiple dimension reduction directions. Methods robust to extreme response values are also considered.
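For context, here is a minimal sketch of standard sliced inverse regression, the kind of estimator whose directions the proposed response and predictor transformations build on; it is the textbook algorithm, not the transformation approach of the paper, and the slice count is illustrative.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Basic sliced inverse regression: standardize X, average it within
    slices of y, and eigen-decompose the covariance of the slice means."""
    X = np.asarray(X, float)
    n, p = X.shape
    mu, cov = X.mean(0), np.cov(X, rowvar=False)

    # whiten the predictors
    evals, evecs = np.linalg.eigh(cov)
    cov_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ cov_inv_sqrt

    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        zbar = Z[idx].mean(0)
        M += (len(idx) / n) * np.outer(zbar, zbar)

    w, v = np.linalg.eigh(M)
    dirs = cov_inv_sqrt @ v[:, ::-1][:, :n_dirs]   # back to the original scale
    return dirs / np.linalg.norm(dirs, axis=0)
```

For a symmetric relationship such as y = x1^2 + noise, the slice means are close to zero and this estimator misses the direction entirely, which is the symmetric dependency problem the transformations are designed to address.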

7.
In pre-clinical oncology studies, tumor-bearing animals are treated and observed over a period of time in order to measure and compare the efficacy of one or more cancer-intervention therapies along with a placebo/standard-of-care group. A data analysis is typically carried out by modeling and comparing tumor volumes, functions of tumor volumes, or survival. Data analysis on tumor volumes is complicated because animals under observation may be euthanized prior to the end of the study for one or more reasons, such as when an animal's tumor volume exceeds an upper threshold. In such a case, the tumor volume is missing not-at-random for the time remaining in the study. To work around the non-random missingness issue, several statistical methods have been proposed in the literature, including the rate of change in log tumor volume and the partial area under the curve. In this work, the test size and statistical power of these and other popular methods for the analysis of tumor volume data are examined and compared through realistic Monte Carlo computer simulations. The performance, advantages, and drawbacks of popular statistical methods for animal oncology studies are reported. The recommended methods are applied to a real data set.

8.
Minimum information bivariate distributions with uniform marginals and a specified rank correlation are studied in this paper. These distributions play an important role in a particular way of modeling dependent random variables which has been used in the computer code UNICORN for carrying out uncertainty analyses. It is shown that these minimum information distributions have a particular form which makes simulation of conditional distributions very simple. Approximations to the continuous distributions are discussed and explicit formulae are determined. Finally, a relation to DAD theorems is discussed, and a numerical algorithm with a geometric rate of convergence is given for determining the minimum information distributions.

9.
Statistical experiments, more commonly referred to as Monte Carlo or simulation studies, are used to study the behavior of statistical methods and measures under controlled situations. Whereas recent computing and methodological advances have permitted increased efficiency in the simulation process, known as variance reduction, such experiments remain limited by their finite nature and hence are subject to uncertainty; when a simulation is run more than once, different results are obtained. However, virtually no emphasis has been placed on reporting the uncertainty, referred to here as Monte Carlo error, associated with simulation results in the published literature, or on justifying the number of replications used. These deserve broader consideration. Here we present a series of simple and practical methods for estimating Monte Carlo error as well as determining the number of replications required to achieve a desired level of accuracy. The issues and methods are demonstrated with two simple examples, one evaluating operating characteristics of the maximum likelihood estimator for the parameters in logistic regression and the other in the context of using the bootstrap to obtain 95% confidence intervals. The results suggest that in many settings, Monte Carlo error may be more substantial than traditionally thought.
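A minimal sketch of reporting Monte Carlo error and sizing the number of replications, using an invented coverage study (normal-theory confidence interval for a mean) rather than the paper's logistic-regression and bootstrap examples.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_replicate(n=50):
    """One simulation replicate: does a nominal 95% normal-theory CI
    for the mean cover the true value 1.0?"""
    x = rng.normal(loc=1.0, size=n)
    half = 1.96 * x.std(ddof=1) / np.sqrt(n)
    return (x.mean() - half <= 1.0 <= x.mean() + half)

R = 1000
covers = np.array([one_replicate() for _ in range(R)])
est = covers.mean()
mc_se = np.sqrt(est * (1 - est) / R)          # Monte Carlo standard error
print(f"coverage ≈ {est:.3f} ± {1.96 * mc_se:.3f} (MC error)")

# Replications needed so the MC error of a proportion near 0.95 is below 0.005
print("replications needed:", int(np.ceil(0.95 * 0.05 / 0.005 ** 2)))
```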

10.
An image that is mapped into a bit stream suitable for communication over or storage in a digital medium is said to have been compressed. Using tree-structured vector quantizers (TSVQs) is an approach to image compression in which clustering algorithms are combined with ideas from tree-structured classification to provide code books that can be searched quickly and simply. The overall goal is to optimize the quality of the compressed image subject to a constraint on the communication or storage capacity, i.e. on the allowed bit rate. General goals of image compression and vector quantization are summarized in this paper. There is discussion of methods for code book design, particularly the generalized Lloyd algorithm for clustering, and methods for splitting and pruning that have been extended from the design of classification trees to TSVQs. The resulting codes, called pruned TSVQs, are of variable rate, and yield lower distortion than fixed-rate, full-search vector quantizers for a given average bit rate. They have simple encoders and a natural successive approximation (progressive) property. Applications of pruned TSVQs are discussed, particularly compressing computerized tomography images. In this work, the key issue is not merely the subjective attractiveness of the compressed image but rather whether the diagnostic accuracy is adversely affected by compression. In recent work, TSVQs have been combined with other types of image processing, including segmentation and enhancement. The relationship between vector quantizer performance and the size of the training sequence used to design the code and other asymptotic properties of the codes are discussed.
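As an illustration of the codebook-design step, here is a minimal sketch of flat (non-tree-structured) vector quantization of image blocks using the generalized Lloyd algorithm in its k-means form; it omits the tree structure, pruning, and variable-rate coding described in the paper, and the block size, codebook size, and function name are arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans

def vq_compress(image, block=4, codebook_size=64, seed=0):
    """Flat vector quantization: cut the image into block x block patches,
    fit a codebook with the generalized Lloyd algorithm (k-means), and
    rebuild the image from the nearest codewords."""
    img = np.asarray(image, float)
    h, w = (d - d % block for d in img.shape)   # trim to a multiple of block
    img = img[:h, :w]
    patches = (img.reshape(h // block, block, w // block, block)
                  .swapaxes(1, 2).reshape(-1, block * block))
    km = KMeans(n_clusters=codebook_size, n_init=4, random_state=seed).fit(patches)
    coded = km.cluster_centers_[km.labels_]      # nearest codeword per patch
    recon = (coded.reshape(h // block, w // block, block, block)
                  .swapaxes(1, 2).reshape(h, w))
    bits_per_pixel = np.log2(codebook_size) / block ** 2   # fixed-rate code
    return recon, bits_per_pixel
```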

12.
Although “choose all that apply” questions are common in modern surveys, methods for analyzing associations among responses to such questions have only recently been developed. These methods are generally valid only for simple random sampling, but these types of questions often appear in surveys conducted under more complex sampling plans. The purpose of this article is to provide statistical analysis methods that can be applied to “choose all that apply” questions in complex survey sampling situations. Loglinear models are developed to incorporate the multiple responses inherent in these types of questions. Statistics to compare models and to measure association are proposed and their asymptotic distributions are derived. Monte Carlo simulations show that tests based on adjusted Pearson statistics generally hold their correct size when comparing models. These simulations also show that confidence intervals for odds ratios estimated from loglinear models have good coverage properties, while being shorter than those constructed using empirical estimates. Furthermore, the methods are shown to be applicable to more general problems of modeling associations between elements of two or more binary vectors. The proposed analysis methods are applied to data from the National Health and Nutrition Examination Survey. The Canadian Journal of Statistics © 2009 Statistical Society of Canada

13.
Logistic regression plays an important role in many fields. In practice, we often encounter missing covariates in different applied sectors, particularly in the biomedical sciences. Ibrahim (1990) proposed a method to handle missing covariates in the generalized linear model (GLM) setup. It is well known that logistic regression estimates based on small or medium-sized samples with missing data are biased. Considering missing data that are missing at random, in this paper we reduce the bias by two methods: first, we derive a closed-form bias expression following Cox and Snell (1968); second, we use a likelihood-based modification similar to Firth (1993). We show analytically that the Firth-type likelihood modification applied to Ibrahim's method leads to a second-order bias reduction. The proposed methods are simple to apply to an existing method and require no additional analytical work, apart from a small change in the optimization function. We carry out extensive simulation studies comparing the methods, and our simulation results are also supported by real-world data.
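A minimal sketch, for complete data only, of Firth (1993)-type penalized-likelihood logistic regression, the bias-reduction device discussed above; it is not Ibrahim's missing-covariate method, the Newton-type update uses the unpenalized information, and the convergence settings are arbitrary.

```python
import numpy as np

def firth_logistic(X, y, n_iter=50, tol=1e-8):
    """Firth-type logistic regression: Newton-type iterations with the
    score adjusted by h_i * (1/2 - p_i), where h_i are the leverages of
    the weighted hat matrix."""
    X = np.column_stack([np.ones(len(X)), np.asarray(X, float)])  # add intercept
    y = np.asarray(y, float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        XtWX = X.T @ (W[:, None] * X)
        A = np.sqrt(W)[:, None] * X
        h = np.sum(A * np.linalg.solve(XtWX, A.T).T, axis=1)  # leverages
        score = X.T @ (y - p + h * (0.5 - p))                 # Firth-adjusted score
        step = np.linalg.solve(XtWX, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```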

14.
Many survey questions allow respondents to pick any number out of c possible categorical responses or “items”. These kinds of survey questions often use the terminology “choose all that apply” or “pick any”. Often of interest is determining whether the marginal response distributions of each item differ among r different groups of respondents. Agresti and Liu (1998, 1999) call this a test for multiple marginal independence (MMI). If respondents are allowed to pick only 1 out of c responses, the hypothesis test may be performed using the Pearson chi-square test of independence. However, since respondents may pick more or fewer than one response, the test's assumption that responses are made independently of each other is violated. Recently, a few MMI testing methods have been proposed. Loughin and Scherer (1998) propose using a bootstrap method based on a modified version of the Pearson chi-square test statistic. Agresti and Liu (1998, 1999) propose using marginal logit models, quasi-symmetric loglinear models, and a few methods based on Pearson chi-square test statistics. Decady and Thomas (1999) propose using a Rao-Scott adjusted chi-squared test statistic. There has not been a full investigation of these MMI testing methods. The purpose here is to evaluate the proposed methods and propose a few new methods. Recommendations are given to guide the practitioner in choosing which MMI testing methods to use.

15.
In cost‐effectiveness analyses of drugs or health technologies, estimates of life years saved or quality‐adjusted life years saved are required. Randomised controlled trials can provide an estimate of the average treatment effect; for survival data, the treatment effect is the difference in mean survival. However, typically not all patients will have reached the endpoint of interest at the close‐out of a trial, making it difficult to estimate the difference in mean survival. In this situation, it is common to report the more readily estimable difference in median survival. Alternative approaches to estimating the mean have also been proposed. We conducted a simulation study to investigate the bias and precision of the three most commonly used sample measures of absolute survival gain – difference in median, restricted mean and extended mean survival – when used as estimates of the true mean difference, under different censoring proportions, while assuming a range of survival patterns, represented by Weibull survival distributions with constant, increasing and decreasing hazards. Our study showed that the three commonly used methods tended to underestimate the true treatment effect; consequently, the incremental cost‐effectiveness ratio (ICER) would be overestimated. Of the three methods, the least biased is the extended mean survival, which perhaps should be used as the point estimate of the treatment effect to be inputted into the ICER, while the other two approaches could be used in sensitivity analyses. More work on the trade‐offs between simple extrapolation using the exponential distribution and more complicated extrapolation using other methods would be valuable. Copyright © 2015 John Wiley & Sons, Ltd.
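A minimal sketch, under right censoring, of computing restricted mean survival as the area under a Kaplan-Meier curve up to a truncation time tau; the treatment effect would then be estimated as the difference of two such values. The function names and the choice of tau are the user's, and this is not the extended-mean extrapolation discussed in the paper.

```python
import numpy as np

def km_curve(time, event):
    """Kaplan-Meier survival estimates at the ordered distinct event times."""
    time, event = np.asarray(time, float), np.asarray(event, int)
    uniq = np.unique(time[event == 1])
    surv, s = [], 1.0
    for t in uniq:
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & (event == 1))
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return uniq, np.array(surv)

def rmst(time, event, tau):
    """Restricted mean survival time: area under the KM step function on [0, tau]."""
    t, s = km_curve(time, event)
    grid = np.concatenate(([0.0], t[t < tau], [tau]))
    steps = np.concatenate(([1.0], s[t < tau]))
    return np.sum(np.diff(grid) * steps)

# treatment effect on the restricted-mean scale (hypothetical arrays):
# effect = rmst(t_trt, e_trt, tau) - rmst(t_ctl, e_ctl, tau)
```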

16.
Reuse of controls in a nested case-control (NCC) study has not been considered feasible since the controls are matched to their respective cases. However, in the last decade or so, methods have been developed that break the matching and allow for analyses where the controls are no longer tied to their cases. These methods can be divided into two groups: weighted partial likelihood (WPL) methods and full maximum likelihood methods. The weights in the WPL can be estimated in different ways, and four estimation procedures are discussed. In addition, we address modifications needed to accommodate left truncation. A full likelihood approach is also presented, and we suggest an aggregation technique to decrease the computation time. Furthermore, we generalize calibration for case-cohort designs to NCC studies. We consider a competing risks situation and compare WPL, full likelihood and calibration through simulations and analyses of a real data example.

17.
Composite likelihood methods have been receiving growing interest in a number of different application areas where the likelihood function is too cumbersome to be evaluated. In the present paper, some theoretical properties of the maximum composite likelihood estimate (MCLE) are investigated in more detail. The robustness of consistency of the MCLE is studied in a general setting, and clarified and illustrated through some simple examples. We also carry out a simulation study of the performance of the MCLE in a constructed model suggested by Arnold (2010) that is not multivariate normal but has multivariate normal marginal distributions.

18.
In recent years, statistical profile monitoring has emerged as a relatively new and potentially useful subarea of statistical process control and has attracted the attention of many researchers and practitioners. A profile, waveform, or signature is a function that relates a dependent or response variable to one or more independent variables. Different statistical methods have been proposed to monitor profiles, each requiring its own assumptions. One of the common and implicit assumptions in most of the proposed procedures is that of independent residuals. Violation of this assumption can affect the performance of control procedures and ultimately lead to misleading results. In this article, we study phase II analysis of monitoring multivariate simple linear profiles when the independence assumption is violated. Three time-series-based methods are proposed to eliminate the effect of correlation that exists between multivariate profiles. Performance of the proposed methods is evaluated using the average run length (ARL) criterion. Numerical results indicate satisfactory performance for the proposed methods. A simulated example is also used to show the application of the proposed methods.

19.
Several variations of monotone nonparametric regression have been developed over the past 30 years. One approach is to first apply nonparametric regression to the data and then monotone smooth the initial estimates to “iron out” violations of the assumed order. Here, such estimators are considered, where local polynomial regression is applied first, followed by either least squares isotonic regression or a monotone method using simple averages. The primary focus of this work is to evaluate different types of confidence intervals for these monotone nonparametric regression estimators through Monte Carlo simulation. Most of the confidence intervals use bootstrap or jackknife procedures. Estimation of a response variable as a function of two continuous predictor variables is considered, where the estimation is performed at the observed values of the predictors (instead of on a grid). The methods are then applied to data on subjects who worked at plants using beryllium metal and who developed chronic beryllium disease.
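A minimal sketch of the two-step idea with a single continuous predictor (the paper considers two): local-linear regression followed by least squares isotonic regression to iron out order violations. The simulated data, bandwidth, and function names are illustrative.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def local_linear(x, y, x0, bandwidth):
    """Local-linear (first-order local polynomial) fit at points x0
    using a Gaussian kernel."""
    fits = []
    for t in x0:
        w = np.exp(-0.5 * ((x - t) / bandwidth) ** 2)
        X = np.column_stack([np.ones_like(x), x - t])
        coef, *_ = np.linalg.lstsq(X * np.sqrt(w)[:, None],
                                   y * np.sqrt(w), rcond=None)
        fits.append(coef[0])
    return np.array(fits)

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 200))
y = np.log1p(5 * x) + rng.normal(scale=0.2, size=200)

rough = local_linear(x, y, x, bandwidth=0.08)              # step 1: smooth
monotone = IsotonicRegression().fit_transform(x, rough)    # step 2: iron out violations
```

Bootstrap confidence intervals of the kind evaluated in the paper would resample the (x, y) pairs and repeat both steps on each resample.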

20.
It is now possible to carry out Bayesian image segmentation from a continuum parametric model with an unknown number of regions. However, few suitable parametric models exist. We set out to model processes whose realizations are naturally described by coloured planar triangulations. Triangulations are already used to represent image structure in machine vision and, in finite element analysis, for domain decomposition. However, no normalizable parametric model with realizations that are coloured triangulations has been specified to date. We show how this must be done, and in particular we prove that a normalizable measure on the space of triangulations in the interior of a fixed simple polygon derives from a Poisson point process of vertices. We show how such models may be analysed using Markov chain Monte Carlo methods and we present two case studies, including convergence analysis.
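A minimal sketch of the building block the result concerns: a Poisson point process of vertices, here on the unit square standing in for a fixed simple polygon, together with one convenient (Delaunay) triangulation and an arbitrary colouring. The paper's model ranges over all triangulations of the vertices and is analysed by MCMC, which this sketch does not attempt; the intensity and number of colours are arbitrary.

```python
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(3)

# Homogeneous Poisson point process of vertices on the unit square
intensity = 40.0
n_points = rng.poisson(intensity)              # Poisson number of vertices
pts = rng.uniform(size=(n_points, 2))          # uniform locations

tri = Delaunay(pts)                            # one triangulation of the vertices
colours = rng.integers(0, 3, size=len(tri.simplices))   # arbitrary colouring

print(n_points, "vertices,", len(tri.simplices), "triangles")
```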
