Similar literature
 20 similar articles found (search time: 15 ms)
1.
Several methods have been suggested to calculate robust M- and G-M-estimators of the regression parameter β and of the error scale parameter σ in a linear model. This paper shows that, for some data sets well known in robust statistics, the nonlinear systems of equations for the simultaneous estimation of β, with an M-estimate with a redescending ψ-function, and σ, with the residual median absolute deviation (MAD), have many solutions. This multiplicity is not caused by the possible lack of uniqueness, for redescending ψ-functions, of the solutions of the system defining β with known σ; rather, the simultaneous estimation of β and σ together creates the problem. A way to avoid these multiple solutions is to proceed in two steps. First, take σ as the median absolute deviation of the residuals for a uniquely defined robust M-estimate such as Huber's Proposal 2 or the L1-estimate. Then, with σ fixed at the value obtained in the first step, solve the nonlinear system for the M-estimate to get the estimate of β. Analytical conditions for the uniqueness of M- and G-M-estimates are also given.
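The two-step idea can be sketched in the simplest special case, an intercept-only (location) model, where the L1-estimate is the sample median. This is only an illustration under stated assumptions, not the paper's procedure: Tukey's biweight stands in for the redescending ψ-function, the tuning constant 4.685 and the MAD normalisation 1.4826 are conventional choices not taken from the abstract, and the function names are hypothetical.

```python
from statistics import median


def tukey_biweight_weight(u, c=4.685):
    """Weight w(u) = psi(u) / u for Tukey's biweight, a redescending psi (assumed choice)."""
    if abs(u) >= c:
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


def two_step_location(y, c=4.685, tol=1e-10, max_iter=200):
    """Two-step robust location estimate: fix the scale first, then solve for beta."""
    # Step 1: L1-estimate (the median, for an intercept-only model) and
    # sigma = normalised MAD of its residuals. Sigma is never updated afterwards.
    start = median(y)
    sigma = 1.4826 * median([abs(v - start) for v in y])
    if sigma == 0.0:
        raise ValueError("MAD is zero; the scale step is degenerate for this sample")

    # Step 2: iteratively reweighted averaging for the M-estimate with sigma held fixed.
    beta = start
    for _ in range(max_iter):
        weights = [tukey_biweight_weight((v - beta) / sigma, c) for v in y]
        total = sum(weights)
        if total == 0.0:
            break  # every point was rejected; keep the last estimate
        new_beta = sum(w * v for w, v in zip(weights, y)) / total
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return beta, sigma


sample = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]  # hypothetical data with one gross outlier
beta_hat, sigma_hat = two_step_location(sample)
```

Because σ is computed once from the starting fit and then held fixed, the second step solves a single equation in β rather than a coupled system in β and σ, which is the point of the two-step proposal.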

2.
This note exhibits two independent integer-valued random variables, X1 and X2, such that neither X1 nor X2 has a generalized Poisson distribution, but X1 + X2 does. This contradicts statements made by Professor Consul in his recent book.

3.
We consider moving average processes {Xs} indexed by the sites s of a triangular lattice in the plane R2. To estimate the parameters of such processes, Adjengue & Moore (1993) have considered likelihood and Gaussian pseudo-likelihood methods. We consider here two other methods. The first one is based on the estimation of the correlations and the relation between these correlations and the parameters of the process. The second relies on a linear approximation of the process. The asymptotic properties of the proposed estimators are analyzed and compared. A simulation study allows us to compare the estimators for fixed sample sizes.

4.
Using the ridit as a probability score is a very common way to compare discrete random variables in discrete data analysis. In the present work we formulate ridit reliability functionals for the comparison of K independent binary random variables. We use such functionals to provide a generalized response-adaptive design (GRAD) on K (≥ 2) treatment arms for dichotomous response variables. We exhibit some properties of the proposed design and compare it with some of the existing competitors by computing its various performance measures. We also discuss a possible modification of the GRAD in the presence of covariates.
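For readers unfamiliar with ridits, the classical score can be computed as follows. This is a minimal sketch of the standard ridit (probability of falling below a category plus half the probability of falling in it, under a reference distribution), not of the paper's reliability functionals or of the GRAD allocation rule; the two arms and their success probabilities are hypothetical.

```python
def ridits(reference_probs):
    """Ridit of each ordered category: P(below) + 0.5 * P(equal) under the reference."""
    scores = []
    below = 0.0
    for p in reference_probs:
        scores.append(below + 0.5 * p)
        below += p
    return scores


def mean_ridit(group_probs, reference_probs):
    """Average ridit of a group, scored against the reference distribution."""
    return sum(q * r for q, r in zip(group_probs, ridits(reference_probs)))


# Hypothetical binary responses, ordered as (failure, success).
reference = [0.6, 0.4]  # arm A
treatment = [0.3, 0.7]  # arm B

self_score = mean_ridit(reference, reference)
cross_score = mean_ridit(treatment, reference)
```

A group scored against itself always has mean ridit 0.5, so a mean ridit above 0.5 indicates that the comparison arm tends to produce higher responses than the reference arm.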

5.
Can we find some common principle in the three comparisons? Lacking adequate time for a thorough exploration, let me suggest that representation is that common principle. I suggested (section 4) that judgmental selection of spatial versus temporal extensions distinguishes “longitudinal” local studies from “cross-section” population sampling. We had noted (section 3) that censuses are taken for detailed representation of the spatial dimension but they depend on judgmental selection of the temporal. Survey sampling lacks spatial detail but is spatially representative with randomization, and it can be made timely. Periodic samples can be designed that are representative of temporal extension. Furthermore, spatial and temporal detail can be obtained either through estimation or through cumulated samples [Purcell and Kish 1979, 1980; Kish 1979b, 1981, 1986 6.6]. Registers and administrative records can have good spatial and temporal representation, but representation may be lacking in population content, and surely in representation of variables. Representation of variables and of the relations between variables and over the population are the issues in conflict between surveys, experiments, and observations. This is a deep subject, and too deep to be explored again, as it was in section 2. A final point about limits for randomization to achieve representation through sampling: randomization for selecting samples of variables is beyond me generally, because I cannot conceive of frames for defined populations of variables. Yet we can find attempts at randomized selection of variables: in the selection of items for the consumer price index, and also of items for tests of IQ or of achievement. Generally I believe that randomization is the way to achieve representation without complete coverage, and that it can be applied and practised in many dimensions.

6.
In the literature, statistical estimation of the stress–strength parameter R = P(X > Y) has been intensively investigated under the assumption that the random variables X and Y are independent. However, in some real applications, the strength variable X could be highly dependent on the stress variable Y. In this paper, unlike the common practice in the literature, we discuss estimation of the parameter R in the more realistic case where X and Y are dependent random variables following a bivariate Rayleigh model. We derive the Bayes estimates and highest posterior density credible intervals of the parameters using suitable priors on the parameters. Because there are no closed forms for the Bayes estimates, we use an approximation based on the Laplace method and a Markov chain Monte Carlo technique to obtain the Bayes estimates of R and of the unknown parameters. Finally, simulation studies are conducted to evaluate the performance of the proposed estimators, and analyses of two data sets are provided.

7.
After recalling the framework of minimum-contrast estimation, its consistency and its asymptotic normality, we highlight the fact that these results do not require any stationarity or ergodicity assumptions. The asymptotic distribution of the underlying contrast difference test is a weighted sum of independent chi-square variables having one degree of freedom each. We illustrate these results in three contexts: (1) a nonhomogeneous Markov chain with likelihood contrast; (2) a Markov field with coding, pseudolikelihood or likelihood contrasts; (3) a not necessarily Gaussian time series with Whittle's contrast. In contexts (2) and (3), we compare experimentally the power of the likelihood-ratio test with those of other contrast-difference tests.

8.
Let (X, Y) be a bivariate random vector whose distribution function H(x, y) belongs to the class of bivariate extreme-value distributions. If F1 and F2 are the marginals of X and Y, then H(x, y) = C{F1(x), F2(y)}, where C is a bivariate extreme-value dependence function. This paper gives the joint distribution of the random variables Z = {log F1(X)}/{log F1(X)F2(Y)} and W = C{F1(X), F2(Y)}. Using this distribution, an algorithm to generate random variables having a bivariate extreme-value distribution is presented. Furthermore, it is shown that for any bivariate extreme-value dependence function C, the distribution of the random variable W = C{F1(X), F2(Y)} belongs to a monoparametric family of distributions. This property is used to derive goodness-of-fit statistics to determine whether a copula belongs to an extreme-value family.

9.
The resistance of tests to acceptance and rejection of null hypotheses was defined and studied by Ylvisaker in the context of one-sample problems. This notion provides a measure of a test's resistance to outliers. In this paper, we propose an extension of this notion to rank-based tests of independence for bivariate random variables. We show, among other things, that Kendall's test of independence is more resistant than Spearman's test.

10.
We study the distributions of the random variables Sn and Vr related to a sequence of dependent Bernoulli variables, where Sn denotes the number of successes in n trials and Vr the number of trials necessary to obtain r successes. The purpose of this article is twofold: (1) Generalizing some results on the “nature” of the binomial and negative binomial distributions we show that Sn and Vr can follow any prescribed discrete distribution. The corresponding joint distributions of the Bernoulli variables are characterized as the solutions of systems of linear equations. (2) We consider a specific type of dependence of the Bernoulli variables, where the probability of a success depends only on the number of previous successes. We develop some theory based on new closed-form representations for the probability mass functions of Sn and Vr which enable direct computations of the probabilities.
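The dependence structure in part (2), where the success probability depends only on the number of previous successes, can be illustrated with a simple forward recursion for the probability mass function of Sn. This sketch is not the closed-form representation developed in the article; the function name and the example success-probability rule are hypothetical.

```python
from math import comb


def successes_pmf(n, success_prob):
    """pmf of S_n when P(success | j previous successes) = success_prob(j)."""
    pmf = [1.0]  # before any trial, S_0 = 0 with probability 1
    for _ in range(n):
        nxt = [0.0] * (len(pmf) + 1)
        for j, mass in enumerate(pmf):
            p = success_prob(j)
            nxt[j] += mass * (1.0 - p)  # failure: the success count stays at j
            nxt[j + 1] += mass * p      # success: the success count moves to j + 1
        pmf = nxt
    return pmf


# Constant success probability recovers the ordinary binomial distribution.
binomial_case = successes_pmf(4, lambda j: 0.3)

# Hypothetical reinforcement rule: each success makes the next one more likely.
reinforced = successes_pmf(4, lambda j: (j + 1) / (j + 3))
```

The entry at index k of the returned list is P(Sn = k); the same table also gives the distribution of Vr indirectly, since Vr > n exactly when Sn < r.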

11.
Non-symmetric correspondence analysis (NSCA) is a useful technique for analysing a two-way contingency table. Frequently, there is more than one predictor variable; in this paper, we consider two categorical variables as predictor variables and one response variable. Interaction represents the joint effects of predictor variables on the response variable. When interaction is present, the interpretation of the main effects is incomplete or misleading. To separate the main effects and the interaction term, we introduce a method that, starting from the coordinates of multiple NSCA and using a two-way analysis of variance without interaction, allows a better interpretation of the impact of the predictor variables on the response variable. The proposed method has been applied to a well-known three-way contingency table proposed by Bockenholt and Bockenholt, in which subjects are cross-classified by attitude towards abortion, number of years of education and religion. We analyse the case where the variables education and religion influence a person's attitude towards abortion.

12.
We study bias arising from rounding categorical variables following multivariate normal (MVN) imputation. This task has been well studied for binary variables, but not for more general categorical variables. Three methods that assign imputed values to categories based on fixed reference points are compared using 25 specific scenarios covering variables with k=3, …, 7 categories, and five distributional shapes, and for each k=3, …, 7, we examine the distribution of bias arising over 100,000 distributions drawn from a symmetric Dirichlet distribution. We observed, on both empirical and theoretical grounds, that one method (projected-distance-based rounding) is superior to the other two methods, and that the risk of invalid inference with the best method may be too high at sample sizes n≥150 at 50% missingness, n≥250 at 30% missingness and n≥1500 at 10% missingness. Therefore, these methods are generally unsatisfactory for rounding categorical variables (with up to seven categories) following MVN imputation.

13.
Multiple imputation has emerged as a popular approach to handling data sets with missing values. For incomplete continuous variables, imputations are usually produced using multivariate normal models. However, this approach might be problematic for variables with a strongly non-normal shape, as it would generate imputations incoherent with actual distributions and thus lead to incorrect inferences. For non-normal data, we consider a multivariate extension of Tukey's gh distribution/transformation [38] to accommodate skewness and/or kurtosis and capture the correlation among the variables. We propose an algorithm to fit the incomplete data with the model and generate imputations. We apply the method to a national data set for hospital performance on several standard quality measures, which are highly skewed to the left and substantially correlated with each other. We use Monte Carlo studies to assess the performance of the proposed approach. We discuss possible generalizations and give some advice to practitioners on how to handle non-normal incomplete data.

14.
Multivariate control charts are used to monitor stochastic processes for changes and unusual observations. Hotelling's T² statistic is calculated for each new observation and an out‐of‐control signal is issued if it goes beyond the control limits. However, this classical approach becomes unreliable as the number of variables p approaches the number of observations n, and impossible when p exceeds n. In this paper, we devise an improvement to the monitoring procedure in high‐dimensional settings. We regularise the covariance matrix to estimate the baseline parameter and incorporate a leave‐one‐out re‐sampling approach to estimate the empirical distribution of future observations. An extensive simulation study demonstrates that the new method outperforms the classical Hotelling T² approach in power, and maintains appropriate false positive rates. We demonstrate the utility of the method using a set of quality control samples collected to monitor a gas chromatography–mass spectrometry apparatus over a period of 67 days.

15.
We address the problem of robust inference about the stress–strength reliability parameter R = P(X < Y), where X and Y are taken to be independent random variables. Indeed, although classical likelihood based procedures for inference on R are available, it is well-known that they can be badly affected by mild departures from model assumptions, regarding both stress and strength data. The proposed robust method relies on the theory of bounded influence M-estimators. We obtain large-sample test statistics with the standard asymptotic distribution by means of delta-method asymptotics. The finite sample behavior of these tests is investigated by some numerical studies, when both X and Y are independent exponential or normal random variables. An illustrative application in a regression setting is also discussed.
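For the independent exponential case mentioned above, the target parameter R = P(X < Y) has a simple closed form that can be checked by simulation. The sketch below only illustrates the quantity being estimated, not the robust bounded-influence procedure proposed in the paper; the rate values and function names are hypothetical.

```python
import random


def reliability_exponential(rate_x, rate_y):
    """R = P(X < Y) for independent X ~ Exp(rate_x) and Y ~ Exp(rate_y)."""
    # Integrating rate_x * exp(-rate_x * t) * exp(-rate_y * t) over t > 0
    # gives rate_x / (rate_x + rate_y).
    return rate_x / (rate_x + rate_y)


def reliability_monte_carlo(rate_x, rate_y, draws=100_000, seed=0):
    """Simulation check of R using independent exponential draws."""
    rng = random.Random(seed)
    hits = sum(
        rng.expovariate(rate_x) < rng.expovariate(rate_y) for _ in range(draws)
    )
    return hits / draws


exact = reliability_exponential(2.0, 1.0)   # hypothetical rates
approx = reliability_monte_carlo(2.0, 1.0)
```

A classical plug-in estimate replaces the rates by the reciprocals of the sample means, and a few gross outliers in either sample can move those means substantially, which is the sensitivity that motivates a robust alternative.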

16.
In partly linear models, the dependence of the response y on (xT, t) is modeled through the relationship y = xTβ + g(t) + ε, where ε is independent of (xT, t). We are interested in developing an estimation procedure that retains the flexibility of partly linear models, which have been studied by several authors, while allowing some variables to belong to a non-Euclidean space. The motivating application of this paper deals with the explanation of atmospheric SO2 pollution incidents using these models when some of the predictive variables lie on a cylinder. In this paper, the estimators of β and g are constructed when the explanatory variables t take values on a Riemannian manifold, and the asymptotic properties of the proposed estimators are obtained under suitable conditions. We illustrate the use of this estimation approach using an environmental data set and we explore the performance of the estimators through a simulation study.

17.
Matched case–control designs are commonly used in epidemiological studies for estimating the effect of exposure variables on the risk of a disease by controlling the effect of confounding variables. Due to the retrospective nature of the study, information on a covariate could be missing for some subjects. A straightforward application of the conditional logistic likelihood for analyzing matched case–control data with the partially missing covariate may yield inefficient estimators of the parameters. A robust method has been proposed to handle this problem using an estimated conditional score approach when the missingness mechanism does not depend on the disease status. Within the conditional logistic likelihood framework, an empirical procedure is used to estimate the odds of the disease for the subjects with missing covariate values. The asymptotic distribution and the asymptotic variance of the estimator are derived for the case when the matching variables and the completely observed covariates are categorical. The finite sample performance of the proposed estimator is assessed through a simulation study. Finally, the proposed method has been applied to analyze two matched case–control studies. The Canadian Journal of Statistics 38: 680–697; 2010 © 2010 Statistical Society of Canada

18.
We consider the comparison of two formulations in terms of average bioequivalence using the 2 × 2 cross‐over design. In a bioequivalence study, the primary outcome is a pharmacokinetic measure, such as the area under the plasma concentration by time curve, which is usually assumed to have a lognormal distribution. The criterion typically used for claiming bioequivalence is that the 90% confidence interval for the ratio of the means should lie within the interval (0.80, 1.25), or equivalently the 90% confidence interval for the differences in the means on the natural log scale should be within the interval (−0.2231, 0.2231). We compare the gold standard method for calculation of the sample size based on the non‐central t distribution with those based on the central t and normal distributions. In practice, the differences between the various approaches are likely to be small. Further approximations to the power function are sometimes used to simplify the calculations. These approximations should be used with caution, because the sample size required for a desirable level of power might be under‐ or overestimated compared to the gold standard method. However, in some situations the approximate methods produce very similar sample sizes to the gold standard method. Copyright © 2005 John Wiley & Sons, Ltd.
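The log-scale limits quoted above, and the crudest of the approximations discussed, can be reproduced with a short calculation. The sketch assumes a true ratio of exactly 1, a lognormal outcome so that the within-subject variance on the log scale is ln(1 + CV²), and the normal approximation to the two one-sided tests; it is not the non-central t gold standard, and the function name and default values are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist

# Bioequivalence limits (0.80, 1.25) expressed on the natural log scale.
log_lower, log_upper = log(0.80), log(1.25)


def total_sample_size_normal(cv_within, power=0.80, alpha=0.05):
    """Normal-approximation total sample size for a 2 x 2 cross-over (true ratio 1)."""
    sigma_w2 = log(1.0 + cv_within ** 2)  # within-subject variance on the log scale
    z = NormalDist().inv_cdf
    z_sum = z(1.0 - alpha) + z(1.0 - (1.0 - power) / 2.0)
    # The estimated log-difference has variance 2 * sigma_w2 / n for n subjects in total.
    n = 2.0 * sigma_w2 * (z_sum / log_upper) ** 2
    return 2 * ceil(n / 2.0)  # round up to an even total, half per sequence


n_low_cv = total_sample_size_normal(0.20)
n_high_cv = total_sample_size_normal(0.30)
```

Because the normal quantiles ignore the uncertainty in the variance estimate, this approximation tends to give smaller sample sizes than t-based calculations, which is one reason the abstract urges caution.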

19.
We propose a method to obtain several streams of pseudorandom numbers based on a backbone generator of the generalized shift register type. The method is based on inverting one cycle in a de Bruijn digraph into many sequences in a higher-order de Bruijn graph via an appropriate graph homomorphism. We apply this technique to twisted generalized feedback shift register generators and to the Mersenne Twister MT19937. Positive results of statistical testing are reported.

20.
In high-dimensional data, one often seeks a few interesting low-dimensional projections which reveal important aspects of the data. Projection pursuit for classification finds projections that reveal differences between classes. Even though projection pursuit is used to bypass the curse of dimensionality, most indexes will not work well when there are a small number of observations relative to the number of variables, known as a large p (dimension) small n (sample size) problem. This paper discusses the relationship between the sample size and dimensionality on classification and proposes a new projection pursuit index that overcomes the problem of small sample size for exploratory classification.

