Similar Articles

20 similar articles found.
1.
ABSTRACT

This article suggests a chi-square test of fit for parametric families of bivariate copulas. The marginal distribution functions are assumed to be unknown and are estimated by their empirical counterparts. Therefore, the standard asymptotic theory of the test is not applicable, but we derive a rule for determining the appropriate degrees of freedom of the asymptotic chi-square distribution. The behavior of the test under the null hypothesis and for selected alternatives is investigated by Monte Carlo simulation. The test is applied to investigate the dependence structure of daily German asset returns. It turns out that the Gaussian copula is inappropriate for describing the dependencies in the data; a t-copula with few degrees of freedom performs better.
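A minimal base-R sketch of the kind of binned chi-square fit check described above, assuming a Gaussian copula candidate, pseudo-observations from the empirical margins, a 4×4 partition of the unit square, and Monte Carlo cell probabilities. The degrees-of-freedom rule derived in the article is not reproduced, so the reference distribution below is only a naive placeholder.

```r
set.seed(1)
n <- 500
x <- rnorm(n); y <- 0.6 * x + sqrt(1 - 0.6^2) * rnorm(n)   # dependent pair

## pseudo-observations from the (unknown) margins
u <- rank(x) / (n + 1)
v <- rank(y) / (n + 1)

## 4 x 4 partition of the unit square
k      <- 4
breaks <- seq(0, 1, length.out = k + 1)
obs    <- table(cut(u, breaks), cut(v, breaks))

## cell probabilities under a fitted Gaussian copula, estimated by Monte Carlo
rho <- sin(pi / 2 * cor(u, v, method = "kendall"))   # moment-type estimate of the copula parameter
m   <- 2e5
z1  <- rnorm(m); z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(m)
exp_p <- table(cut(pnorm(z1), breaks), cut(pnorm(z2), breaks)) / m

chi2 <- sum((obs - n * exp_p)^2 / (n * exp_p))
## naive reference: chi-square with k^2 - 1 - 1 df (one copula parameter);
## the article's rule adjusts these degrees of freedom for the estimated margins
pchisq(chi2, df = k^2 - 2, lower.tail = FALSE)
```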

2.
For the multivariate linear model, Wilks' likelihood ratio test (LRT) is one of the cornerstone tools. However, the computation of its quantiles under the null or the alternative hypothesis requires complex analytic approximations, and more importantly, these distributional approximations are feasible only for a moderate dimension of the dependent variable, say p≤20. On the other hand, assuming that the data dimension p as well as the number q of regression variables are fixed while the sample size n grows, several asymptotic approximations have been proposed in the literature for Wilks' Λ, including the widely used chi-square approximation. In this paper, we consider necessary modifications to Wilks' test in a high-dimensional context, specifically assuming a high data dimension p and a large sample size n. Based on recent random matrix theory, the correction we propose to Wilks' test is asymptotically Gaussian under the null hypothesis, and simulations demonstrate that the corrected LRT has very satisfactory size and power, not only in the large-p, large-n context but also for moderately large data dimensions such as p=30 or p=50. As a byproduct, we explain why the standard chi-square approximation fails for high-dimensional data. We also introduce a new procedure for the classical multiple-sample significance test in multivariate analysis of variance which is valid for high-dimensional data.
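A minimal base-R sketch of the classical statistic the article corrects: Wilks' Λ for a one-way MANOVA and its standard Bartlett chi-square approximation. The one-way layout, group sizes, and dimension are illustrative assumptions; the random-matrix-based Gaussian correction itself is not reproduced here.

```r
set.seed(2)
p <- 5; g <- 3; n_per <- 40
grp <- factor(rep(1:g, each = n_per))
X   <- matrix(rnorm(g * n_per * p), ncol = p)      # data generated under H0: equal group means

fit    <- manova(X ~ grp)
lambda <- summary(fit, test = "Wilks")$stats[1, "Wilks"]

## Bartlett's chi-square approximation: -(n - 1 - (p + g)/2) log(Lambda) ~ chi^2 with p(g - 1) df
n    <- g * n_per
stat <- -(n - 1 - (p + g) / 2) * log(lambda)
pchisq(stat, df = p * (g - 1), lower.tail = FALSE)
## this approximation deteriorates when p grows with n, which is what motivates the correction
```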

3.
ABSTRACT

Background: Many exposures in epidemiological studies have nonlinear effects, and the problem is to choose an appropriate functional relationship between such exposures and the outcome. One common approach is to investigate several parametric transformations of the covariate of interest and to select a posteriori the function that fits the data best. However, such an approach may result in an inflated Type I error. Methods: Through a simulation study, we generated data from Cox models with different transformations of a single continuous covariate. We investigated the Type I error rate and the power of the likelihood ratio test (LRT) corresponding to three different procedures that considered the same set of parametric dose-response functions. The first, unconditional, approach did not involve any model selection, while the second, conditional, approach was based on a posteriori selection of the parametric function. The proposed third approach was similar to the second except that it used a corrected critical value for the LRT to ensure a correct Type I error. Results: The Type I error rate of the second approach was twice the nominal size. For simple monotone dose-response relationships, the corrected test had power similar to that of the unconditional approach, while for non-monotone dose-response relationships it had higher power. A real-life application focusing on the effect of body mass index on the risk of coronary heart disease death illustrated the advantage of the proposed approach. Conclusion: Our results confirm that selecting the functional form of the dose-response a posteriori induces Type I error inflation. The corrected procedure, which can be applied in a wide range of situations, may provide a good trade-off between Type I error and power.
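A minimal sketch of the correction idea behind the third procedure, assuming the survival package, exponential survival times without censoring, and an illustrative candidate set of four transformations: simulate under the null, refit every candidate, keep the largest LRT, and take its simulated 95th percentile as the corrected critical value.

```r
library(survival)
set.seed(3)
n    <- 300
cand <- list(identity, log, sqrt, function(z) z^2)   # candidate dose-response functions

max_lrt <- function() {
  x    <- runif(n, 0.5, 3)                 # positive continuous exposure
  time <- rexp(n)                          # survival times unrelated to x (null model)
  stat <- numeric(length(cand))
  for (j in seq_along(cand)) {
    xj      <- cand[[j]](x)
    fit     <- coxph(Surv(time, rep(1, n)) ~ xj)
    stat[j] <- 2 * (fit$loglik[2] - fit$loglik[1])   # LRT against the empty model
  }
  max(stat)                                # a posteriori selection of the best-fitting form
}

null_max <- replicate(500, max_lrt())
quantile(null_max, 0.95)                   # corrected cutoff, larger than qchisq(0.95, df = 1)
```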

4.
ABSTRACT

A simple and efficient goodness-of-fit test for exponentiality is developed by exploiting the characterization of the exponential distribution through the probability integral transformation. We adopt the empirical likelihood methodology in constructing the test statistic. The proposed test statistic has a chi-square limiting distribution. For small to moderate sample sizes, Monte Carlo simulations revealed that the proposed tests are markedly superior under increasing failure rate (IFR) and bathtub-shaped (decreasing-increasing) failure rate (BFR) alternatives. Real data examples are used to demonstrate the robustness and applicability of the proposed tests in practice.
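A minimal sketch of the characterization being exploited, assuming an estimated rate and a plain binned chi-square on the transformed sample as a simple stand-in for the authors' empirical-likelihood statistic: if X is exponential, 1 - exp(-λX) is uniform on (0, 1).

```r
set.seed(4)
x    <- rexp(200, rate = 2)                 # replace with the sample to be tested
rate <- 1 / mean(x)                         # maximum likelihood estimate of the rate
u    <- 1 - exp(-rate * x)                  # probability integral transform

counts <- table(cut(u, breaks = seq(0, 1, by = 0.1)))
chisq.test(counts)   # roughly uniform counts if x is exponential
## note: estimating the rate costs a degree of freedom not reflected in this naive p-value
```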

5.
This paper introduces W-tests for assessing homogeneity in mixtures of discrete probability distributions. A W-test statistic depends on the data solely through parameter estimators and, if a penalized maximum likelihood estimation framework is used, has a tractable asymptotic distribution under the null hypothesis of homogeneity. The large-sample critical values are quantiles of a chi-square distribution multiplied by an estimable constant for which we provide an explicit formula. In particular, the estimation of large-sample critical values does not involve simulation experiments or random field theory. We demonstrate that W-tests are generally competitive with a benchmark test in terms of power to detect heterogeneity. Moreover, in many situations, the large-sample critical values can be used even with small to moderate sample sizes. The main implementation issue (selection of an underlying measure) is thoroughly addressed, and we explain why W-tests are well-suited to problems involving large and online data sets. Application of a W-test is illustrated with an epidemiological data set.

6.
Abstract

In time series analysis, it is essential to check the independence of the data by means of an appropriate statistical test before any further analysis. Among the various independence tests, a powerful and productive test was introduced by Matilla-García and Marín via an m-dimensional vectorial process, in which the value of the process at time t consists of the m-history of the primary process. However, this construction induces dependence among the vectors even when the underlying random variables are independent. Taking this dependence into account, a modified test is obtained in this article by deriving a new asymptotic distribution based on weighted chi-square random variables. Further alterations to the test are made via the bootstrap and by controlling the overlap. Compared with the original test, the modified test is not only more accurate but also more powerful.
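A minimal sketch of the m-history construction, assuming m = 3 and an ordinal-pattern (symbolic) chi-square comparison against the uniform pattern distribution as a simple stand-in for the original statistic. Overlapping m-histories are kept on purpose, since this overlap creates exactly the dependence among the vectors that the modified test accounts for.

```r
set.seed(5)
x <- rnorm(300)      # series to be tested for independence
m <- 3               # embedding dimension; m! = 6 possible ordinal patterns

hist_mat <- embed(x, m)                                     # overlapping m-histories
pattern  <- apply(hist_mat, 1, function(v) paste(order(v), collapse = ""))
counts   <- table(pattern)

## naive comparison with the uniform distribution over the patterns
## (assumes every one of the m! patterns was observed at least once)
chisq.test(counts)
```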

7.

A basic graphical approach for checking normality is the Q-Q plot, which compares sample quantiles against the population quantiles. In univariate analysis, the probability plot correlation coefficient test for normality has been studied extensively. We consider testing multivariate normality by using the correlation coefficient of the Q-Q plot. When multivariate normality holds, the sample squared distances should follow a chi-square distribution for large samples, and the plot should resemble a straight line. A correlation coefficient test can be constructed by using the pairs of points in the probability plot. When the correlation coefficient test does not reject the null hypothesis, the sample data may come from a multivariate normal distribution or from some other distribution. So we use the following two steps to test multivariate normality. First, we check multivariate normality by using the probability plot correlation coefficient test. If the test does not reject the null hypothesis, we then test the symmetry of the distribution and determine whether multivariate normality holds. This test procedure is called the combination test. The size and power of this test are studied, and it is found that the combination test is, in general, more powerful than other tests for multivariate normality.
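A minimal base-R sketch of the chi-square Q-Q correlation check described above: squared Mahalanobis distances are matched with chi-square quantiles, and their correlation is the test statistic. The critical value below is simulated under normality as an illustrative stand-in for published tables.

```r
set.seed(6)
qq_corr <- function(X) {
  d2 <- mahalanobis(X, colMeans(X), cov(X))        # squared sample distances
  cor(sort(d2), qchisq(ppoints(nrow(X)), df = ncol(X)))
}

n <- 100; p <- 4
X <- matrix(rnorm(n * p), n, p)                    # sample to be tested
r_obs <- qq_corr(X)

## simulated 5th percentile of the correlation under multivariate normality
r_null <- replicate(1000, qq_corr(matrix(rnorm(n * p), n, p)))
crit   <- quantile(r_null, 0.05)
r_obs < crit      # TRUE would suggest rejecting multivariate normality
```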

8.
ABSTRACT

We consider multiple regression (MR) model averaging using the focused information criterion (FIC). Our approach is motivated by the problem of implementing a mean-variance portfolio choice rule. The usual approach is to estimate parameters ignoring the intention to use them in portfolio choice. We develop an estimation method that focuses on the trading rule of interest. Asymptotic distributions of submodel estimators in the MR case are derived using a localization framework. The localization is of both regression coefficients and error covariances. Distributions of submodel estimators are used for model selection with the FIC. This allows comparison of submodels using the risk of portfolio rule estimators. FIC model averaging estimators are then characterized. This extension further improves risk properties. We show in simulations that applying these methods in the portfolio choice case results in improved estimates compared with several competitors. An application to futures data shows superior performance as well.

9.
Abstract

In this article, empirical likelihood is applied to the linear regression model with inequality constraints. We prove that the asymptotic distribution of the adjusted empirical likelihood ratio test statistic is a weighted mixture of chi-square distributions.

10.
ABSTRACT

Background: Instrumental variables (IVs) have become much easier to find in the “Big Data era”, which has increased the number of applications of the two-stage least squares (TSLS) model. With the increased availability of IVs, the possibility that these IVs are weak has also increased. Prior work has suggested a ‘rule of thumb’ that IVs with a first-stage F statistic of at least ten will avoid a relative bias in point estimates greater than 10%. We investigated whether this threshold is also an efficient guarantee of low false rejection rates of the null hypothesis test in TSLS applications with many IVs.

Objective: To test how the ‘rule of thumb’ for weak instruments performs in predicting low false rejection rates in the TSLS model when the number of IVs is large.

Method: We used a Monte Carlo approach to create 28 original data sets for different models, with the number of IVs varying from 3 to 30. For each model, we generated 2,000 observations per iteration and conducted 50,000 iterations to reach convergence in the rejection rates. The true coefficient was set to 0, and the probability of rejecting this hypothesis was recorded for each model as a measure of the false rejection rate. The relationship between the endogenous variable and the IVs was carefully adjusted so that the F statistic for the first-stage model equaled ten, thus simulating the ‘rule of thumb.’

Results: We found that the false rejection rates (Type I errors) increased as the number of IVs in the TSLS model increased while the first-stage F statistic was held equal to 10. The false rejection rate exceeded 10% when the TSLS model had 24 IVs and exceeded 15% when it had 30 IVs.

Conclusion: When more instrumental variables were included in the model, the ‘rule of thumb’ was no longer an efficient guarantee of good performance in hypothesis testing. A more restrictive margin for the F statistic is recommended to replace the ‘rule of thumb,’ especially when the number of instrumental variables is large.
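A minimal sketch of one cell of the simulation design described above, assuming a first stage calibrated so that the expected F statistic is near ten, a true coefficient of zero, normal errors with moderate endogeneity, and two-stage least squares computed by hand; the specific error correlation and coefficient values are illustrative choices. In this design the rejection rate of the nominal 5% t-test typically rises well above 0.05 as the number of instruments k grows.

```r
set.seed(7)
one_rep <- function(n = 2000, k = 30, target_F = 10) {
  Z  <- matrix(rnorm(n * k), n, k)
  pi <- rep(sqrt((target_F - 1) / n), k)     # first-stage strength giving E[F] close to target_F
  v  <- rnorm(n)
  u  <- 0.5 * v + sqrt(1 - 0.25) * rnorm(n)  # structural error correlated with v (endogeneity)
  x  <- as.vector(Z %*% pi + v)
  y  <- 0 * x + u                            # true coefficient on x is zero

  xhat <- fitted(lm(x ~ Z))                  # first stage
  b    <- coef(lm(y ~ xhat))                 # second stage: TSLS point estimates
  ## TSLS standard error: residuals use the original x, variance uses the fitted xhat
  s2   <- sum((y - cbind(1, x) %*% b)^2) / (n - 2)
  se   <- sqrt(s2 * solve(crossprod(cbind(1, xhat)))[2, 2])
  abs(b[2] / se) > qnorm(0.975)              # reject H0: coefficient = 0?
}
mean(replicate(2000, one_rep(k = 30)))       # empirical false rejection rate
```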

11.
In hypothesis testing, the robustness of the tests is an important concern. Generally, maximum likelihood-based tests are most efficient under standard regularity conditions, but they are highly non-robust even under small deviations from the assumed conditions. In this paper, we propose generalized Wald-type tests based on minimum density power divergence estimators for parametric hypotheses. This method avoids the use of nonparametric density estimation and bandwidth selection. The trade-off between efficiency and robustness is controlled by a tuning parameter β. The asymptotic distributions of the test statistics are chi-square with appropriate degrees of freedom. The performance of the proposed tests is explored through simulations and real data analysis.
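A minimal sketch of the estimator underlying these Wald-type tests, assuming a one-parameter N(μ, 1) model and a contaminated sample, with the empirical density power divergence objective of Basu et al. minimized numerically; the tuning parameter β trades efficiency for robustness, with β → 0 recovering maximum likelihood. The model choice and starting value are illustrative assumptions.

```r
set.seed(8)
x <- c(rnorm(95), rnorm(5, mean = 10))       # 5% gross outliers

## empirical density power divergence objective for N(mu, sigma) with sigma known
dpd_obj <- function(mu, beta, x, sigma = 1) {
  f     <- dnorm(x, mu, sigma)
  int_f <- (2 * pi * sigma^2)^(-beta / 2) / sqrt(1 + beta)   # integral of f^(1 + beta)
  int_f - (1 + 1 / beta) * mean(f^beta)
}

mdpde <- function(beta)
  optim(median(x), dpd_obj, method = "BFGS", beta = beta, x = x)$par

c(mle = mean(x),                 # pulled towards the outliers
  beta_0.25 = mdpde(0.25),       # robust estimate, close to 0
  beta_0.50 = mdpde(0.50))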

12.
The generalized odds ratio (ORG) is a novel model-free approach to testing association in genetic studies by estimating the overall risk effect based on the complete genotype distribution. However, the power of the ORG has not been explored, particularly in settings where the mode of inheritance is known. A population genetics model was simulated in order to define the mode of inheritance of a pertinent gene–disease association in advance. The power of the ORG was then explored based on this model and compared with the chi-square test for trend. The model considered bi- and tri-allelic gene–disease associations and deviations from the Hardy–Weinberg equilibrium (HWE). The simulations showed that bi- and tri-allelic variants have the same pattern of power results. The power of the ORG increases with the frequency of the mutant allele, the coefficient of selection and, of course, the degree of dominance of the mutant allele. Deviation from HWE has a considerable impact on power only for small values of the above parameters. The ORG showed superior power compared with the chi-square test for trend when there is deviation from HWE; otherwise, the pattern of results was similar for both approaches.

13.

We propose a semiparametric framework based on sliced inverse regression (SIR) to address the issue of variable selection in functional regression. SIR is an effective method for dimension reduction which computes a linear projection of the predictors in a low-dimensional space, without loss of information on the regression. In order to deal with the high dimensionality of the predictors, we consider penalized versions of SIR: ridge and sparse. We extend the approaches of variable selection developed for multidimensional SIR to select intervals that form a partition of the definition domain of the functional predictors. Selecting entire intervals rather than separated evaluation points improves the interpretability of the estimated coefficients in the functional framework. A fully automated iterative procedure is proposed to find the critical (interpretable) intervals. The approach is shown to be efficient on simulated and real data. The method is implemented in the R package SISIR available on CRAN at https://cran.r-project.org/package=SISIR.


14.
ABSTRACT

In clustered survival data, the dependence among individual survival times within a cluster has usually been described using copula models and frailty models. In this paper we propose a profile likelihood approach for semiparametric copula models with different cluster sizes. We also propose a likelihood ratio method based on the profile likelihood for testing the absence of the association parameter (i.e., a test of independence) under the copula models, which leads to a boundary problem for the parameter space. For this purpose, we show via a simulation study that the proposed likelihood ratio method, which uses an asymptotic chi-square mixture distribution, performs well as the sample size increases. We compare the behaviors of the two models using the profile likelihood approach under a semiparametric setting. The proposed method is demonstrated using two well-known data sets.
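A minimal sketch of the boundary adjustment mentioned above, assuming a single association parameter constrained to be non-negative under the alternative: the LRT statistic is referred to an equal-weight mixture of a point mass at zero and a chi-square with one degree of freedom, rather than a plain chi-square. The observed value below is an arbitrary illustrative number.

```r
lrt_stat <- 2.1                                                  # illustrative observed LRT value
p_naive  <- pchisq(lrt_stat, df = 1, lower.tail = FALSE)         # ignores the boundary
p_mix    <- 0.5 * pchisq(lrt_stat, df = 1, lower.tail = FALSE)   # 0.5*chi2_0 + 0.5*chi2_1 mixture
c(naive = p_naive, mixture = p_mix)
```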

15.
This paper investigates a new family of goodness-of-fit tests based on negative exponential disparities. This family includes the popular Pearson's chi-square as a member and is a subclass of the general class of disparity tests (Basu and Sarkar, 1994), which also contains the family of power divergence statistics. Pitman efficiency and finite-sample power comparisons between different members of this new family are made. Three asymptotic approximations of the exact null distributions of the negative exponential disparity family of tests are discussed. Some numerical results on the small-sample performance of this family of tests are presented for the symmetric null hypothesis. It is shown that the negative exponential disparity family, like the power divergence family, produces a new goodness-of-fit test statistic that can be a very attractive alternative to Pearson's chi-square. Some numerical results suggest that application of this test statistic, as an alternative to Pearson's chi-square, could be preferable to the I^(2/3) statistic of Cressie and Read (1984) under the use of chi-square critical values.

16.
ABSTRACT

We propose an extension of parametric product partition models (PPMs). We name our proposal nonparametric product partition models because we associate a random measure, instead of a parametric kernel, with each set within a random partition. Our methodology does not impose any specific form on the marginal distribution of the observations, allowing us to detect shifts of behaviour even when dealing with heavy-tailed or skewed distributions. We propose a suitable loss function and find the partition of the data having minimum expected loss. We then apply our nonparametric procedure to multiple change-point analysis and compare it with PPMs and with other methodologies that have recently appeared in the literature. Also, in the context of missing data, we exploit the product partition structure in order to estimate the distribution function of each missing value, allowing us to detect change points using the loss function mentioned above. Finally, we present applications to financial as well as genetic data.

17.
Econometric Reviews, 2013, 32(4): 341-370
Abstract

The power of Pearson's overall goodness-of-fit test and the components-of-chi-squared or “Pearson analog” tests of Anderson [Anderson, G. (1994). Simple tests of distributional form. J. Econometrics 62:265–276] to detect rejections due to shifts in location, scale, skewness and kurtosis is studied as the number and position of the partition points are varied. Simulations are conducted for small and moderate sample sizes. It is found that smaller numbers of classes than are used in practice may be appropriate, and that the choice of non-equiprobable classes can result in substantial gains in power.
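A minimal simulation sketch of the kind of comparison described above, assuming a Pearson goodness-of-fit test of the N(0, 1) hypothesis against a small location shift, with five classes defined either equiprobably or with more heavily weighted tail classes; the sample size, shift, and particular unequal partition are illustrative choices.

```r
set.seed(9)
power_gof <- function(breaks_u, shift, n = 100, nrep = 2000) {
  p_cell <- diff(breaks_u)                  # class probabilities under the null
  cuts   <- qnorm(breaks_u)                 # class boundaries on the observation scale
  mean(replicate(nrep, {
    x <- rnorm(n, mean = shift)             # data under the location-shift alternative
    chisq.test(table(cut(x, cuts)), p = p_cell)$p.value < 0.05
  }))
}

equi    <- seq(0, 1, length.out = 6)             # 5 equiprobable classes
unequal <- c(0, 0.05, 0.25, 0.75, 0.95, 1)       # non-equiprobable, heavier tail classes
c(equiprobable = power_gof(equi, 0.3),
  non_equiprobable = power_gof(unequal, 0.3))
```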

18.
Statistics, 2012, 46(6): 1306-1328
ABSTRACT

In this paper, we consider testing the homogeneity of risk differences in independent binomial distributions, especially when the data are sparse. Through theoretical and numerical studies, we point out drawbacks of existing tests in either controlling the nominal size or achieving power. The proposed test is designed to avoid these drawbacks. We present the asymptotic null distribution and the asymptotic power function for the proposed test. We also provide numerical studies, including simulations and real data examples, showing that the proposed test gives reliable results compared with existing testing procedures.

19.
Pearson's chi-square (Pe), likelihood ratio (LR), and Fisher–Freeman–Halton (Fi) test statistics are commonly used to test the association in an unordered r×c contingency table. Asymptotically, these test statistics follow a chi-square distribution. For small-sample cases, the asymptotic chi-square approximations are unreliable, so the exact p-value is frequently computed conditional on the row and column sums. One drawback of the exact p-value is that it is conservative. Different adjustments have been suggested, such as Lancaster's mid-p version and randomized tests. In this paper, we consider 3×2, 2×3, and 3×3 tables and compare the exact power and significance level of these tests' standard, mid-p, and randomized versions. The mid-p and randomized versions have approximately the same power as each other, and higher power than the standard versions. The mid-p Type I error probability seldom exceeds the nominal level. For a given set of parameters, the power of Pe, LR, and Fi differs in approximately the same way across the standard, mid-p, and randomized versions. Although there is no general ranking of these tests, in some situations, especially when averaged over the parameter space, Pe and Fi have about the same power and slightly higher power than LR. When the sample sizes (i.e., the row sums) are equal, the differences are small; otherwise, the observed differences can be 10% or more. In some cases, perhaps characterized by poorly balanced designs, LR has the highest power.
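A minimal base-R sketch of the mid-p idea for the simplest 2×2 case with a one-sided alternative, which is the principle the r×c versions above build on: the exact p-value gives the observed table full weight, while the mid-p version gives it half weight. The table entries are an arbitrary illustrative example.

```r
tab <- matrix(c(7, 2,
                3, 8), nrow = 2, byrow = TRUE)
x  <- tab[1, 1]                                                # count in cell (1, 1)
m  <- sum(tab[1, ]); n2 <- sum(tab[2, ]); k <- sum(tab[, 1])   # fixed margins

p_exact <- sum(dhyper(x:min(m, k), m, n2, k))        # P(X >= observed), hypergeometric
p_midp  <- p_exact - 0.5 * dhyper(x, m, n2, k)       # mid-p: half weight on the observed table
c(exact = p_exact, mid_p = p_midp)
```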

20.
Abstract

The coefficient of variation is one of the most commonly used statistical tools across various scientific fields. This paper proposes a use of the coefficient of variation, obtained by sampling, to define the polynomial probability density function (pdf) of a continuous and symmetric random variable on the interval [a, b]. The basic idea behind the first proposed algorithm is the transformation of the interval from [a, b] to [0, b-a]. The chi-square goodness-of-fit test is used to compare the proposed (observed) sample distribution with the expected probability distribution. The experimental results show that the collected data are well approximated by the proposed pdf. The second algorithm proposes a new method to obtain a fast estimate of the degree of the polynomial pdf when the random variable is normally distributed. Using the known percentages of values that lie within one, two, and three standard deviations of the mean, respectively (the so-called three-sigma rule of thumb), we conclude that the degree of the polynomial pdf takes values between 1.8127 and 1.8642. In the case of a Laplace(μ, b) distribution, we conclude that the degree of the polynomial pdf takes values greater than 1. All calculations and graphs are produced using the statistical software R.
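A quick base-R check of the coverage probabilities behind the three-sigma rule used above, together with the corresponding probabilities for a Laplace(μ, b) variable (whose standard deviation is b√2), included because the abstract also treats the Laplace case.

```r
## P(|X - mu| <= k * sd) for the normal and the Laplace distributions, k = 1, 2, 3
sapply(1:3, function(k) pnorm(k) - pnorm(-k))    # 0.6827 0.9545 0.9973
sapply(1:3, function(k) 1 - exp(-k * sqrt(2)))   # 0.7569 0.9409 0.9856
```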
