Recognizing the desirability of simultaneously using both goodness of fit of the model and clustering of estimates around the true parameter values as criteria, an extended version of the balanced loss function is presented and the Bayesian estimation of regression coefficients is discussed. The optimal estimator thus obtained is then compared with the least squares estimator and the posterior mean vector with respect to criteria such as posterior expected loss, Bayes risk, bias vector, mean squared error matrix and risk function.

We present an approximate leaving-one-out technique for estimating the error rate in logistic discrimination. The new measure is based on the one-step approximation of a(i), the maximum likelihood estimate of the parameter vector based on the sample without the ith case. Some inequalities between the resubstitution error rate, the approximate and exact leaving-one-out error rates for the multiple group logistic model are investigated. Monte-Carlo simulations assess the adequacy of the approximate leaving-one-out method as an estimate of the actual error rate. The usefulness of this approach is demonstrated by means of two medical examples.  相似文献   

This paper develops a test for comparing treatment effects when observations are missing at random for repeated measures data on independent subjects. It is assumed that missingness at any occasion follows a Bernoulli distribution. It is shown that the distribution of the vector of linear rank statistics depends on the unknown parameters of the probability law that governs missingness, which is absent in the existing conditional methods employing rank statistics. This dependence is through the variance–covariance matrix of the vector of linear ranks. The test statistic is a quadratic form in the linear rank statistics when the variance–covariance matrix is estimated. The limiting distribution of the test statistic is derived under the null hypothesis. Several methods of estimating the unknown components of the variance–covariance matrix are considered. The estimate that produces stable empirical Type I error rate while maintaining the highest power among the competing tests is recommended for implementation in practice. Simulation studies are also presented to show the advantage of the proposed test over other rank-based tests that do not account for the randomness in the missing data pattern. Our method is shown to have the highest power while also maintaining near-nominal Type I error rates. Our results clearly illustrate that even for an ignorable missingness mechanism, the randomness in the pattern of missingness cannot be ignored. A real data example is presented to highlight the effectiveness of the proposed method.  相似文献   

A method for the national assessment of the biological quality of river sites is developed. Multivariate discrimination, based on site environmental characteristics, is used on a biological classification of reference sites to derive a procedure to predict the fauna to be expected in the absence of environmental stress. Various quality indices, based on a comparison of the observed with the expected fauna, are proposed. The sizes of the various sources of error and variation, and their effects on the rates of misclassification to quality bands, are examined.  相似文献   

We evaluate alternative models of variances and correlations with an economic loss function. We construct portfolios to minimize predicted variance subject to a required return. It is shown that the realized volatility is smallest for the correctly specified covariance matrix for any vector of expected returns. A test of relative performance of two covariance matrices is based on work of Diebold and Mariano. The method is applied to stocks and bonds and then to highly correlated assets. On average, dynamically correct correlations are worth around 60 basis points in annualized terms, but on some days they may be worth hundreds.  相似文献   
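
The portfolio construction described in this abstract (minimize predicted variance subject to a required return) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the only constraint is the required return w'μ = μ0 (no budget constraint), in which case the solution is w = μ0 Σ⁻¹μ / (μ'Σ⁻¹μ). The function name `min_variance_weights` and the numbers are hypothetical.

```python
import numpy as np

def min_variance_weights(cov, mu, required_return):
    """Weights minimizing w' cov w subject to w' mu = required_return.

    Assumption: the return constraint is the only constraint.
    """
    cov = np.asarray(cov, dtype=float)
    mu = np.asarray(mu, dtype=float)
    x = np.linalg.solve(cov, mu)          # cov^{-1} mu without forming the inverse
    return required_return * x / (mu @ x)

# Hypothetical two-asset example
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])
mu = np.array([0.05, 0.08])
w = min_variance_weights(cov, mu, required_return=0.06)
predicted_variance = w @ cov @ w
```

Comparing `predicted_variance` across competing covariance estimates, each applied to the same vector of expected returns, is one way to put an economic value on a covariance model in the spirit of the abstract.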


Analogs of the classical one way MANOVA model have recently been suggested that do not assume that population covariance matrices are equal or that the error vector distribution is known. These tests are based on the sample mean and sample covariance matrix corresponding to each of the p populations. We show how to extend these tests using other measures of location such as the trimmed mean or coordinatewise median. These new bootstrap tests can have some outlier resistance, and can perform better than the tests based on the sample mean if the error vector distribution is heavy tailed.  相似文献   

Three procedures for testing the adequacy of a proposed linear multiresponse regression model against unspecified general alternatives are considered. The model has an error structure with a matrix normal distribution which allows the vector of responses for a particular run to have an unknown covariance matrix while the responses for different runs are uncorrelated. Furthermore, each response variable may be modeled by a separate design matrix. Multivariate statistics corresponding to the classical univariate lack of fit and pure error sums of squares are defined and used to determine the multivariate lack of fit tests. A simulation study was performed to compare the power functions of the test procedures in the case of replication. Generalizations of the tests for the case in which there are no independent replicates on all responses are also presented.  相似文献   

A mixture model is proposed for analyzing bivariate interval-censored data with cure rates. Two types of association arise, one between the bivariate failure times and one between the bivariate cure rates. A correlation coefficient is adopted for the association between the cure rates, and a copula function is applied to the bivariate survival times. The conditional expectations of the unknown quantities attributable to interval censoring and cure rates are calculated in the E-step of the ES (Expectation-Solving) algorithm, and the marginal estimates and association measures are estimated in the S-step through a two-stage procedure. A simulation study is performed to evaluate the suggested method, and data from HIV patients are analyzed as a real-data example.

Most of the methods used to estimate claim frequency rates in general insurance have assumed that data are independent. However, it is not uncommon for information stored in the database of an insurance company to contain previous years' claim data from each policyholder. We consider the application of the generalized linear mixed model approach to the analysis of repeated insurance claim frequency data in which a conditionally fixed random effect vector is incorporated explicitly into the linear predictor to model the inherent correlation. A motor insurance data set is used as the basis for simulation to demonstrate the advantages of the method. Ignoring the underlying association for observations within the same policyholder results in an underestimation of the standard error of the parameter estimates and a remarkable reduction in the prediction accuracy. The method provides a viable alternative for incorporating repeated claim experience that enables the revision of rates in general insurance.  相似文献   

Under an assumption that missing values occur randomly in a matrix, formulae are developed for the expected value and variance of six statistics that summarize the number and location of the missing values. For a seventh statistic, a regression model based on simulated data yields an estimate of the expected value. The results can be used in the development of methods to control the Type I error and approximate power and sample size for multilevel and longitudinal studies with missing data.  相似文献   

In this paper, the problem of estimating the mean vector of the multivariate normal distribution under non-negativity constraints on the location vector is investigated. The value of the wavelet threshold based on Stein's unbiased risk estimator is calculated for the shrinkage estimator in the restricted parameter space. We suppose that the covariance matrix is unknown and find the dominant class of shrinkage estimators under the balanced loss function. The performance of the proposed class of estimators is evaluated through a simulation study using risk and average mean squared error values.

The aim of this study was to investigate the Type I error rate of hypothesis testing based on generalized estimating equations (GEE) for data characteristic of periodontal clinical trials. The data in these studies consist of a large number of binary responses from each subject and a small number of subjects (Haffajee et al., 1983; Goodson, 1986; Jenkins et al., 1988). Computer simulations were employed to investigate GEE based both on an empirical estimate of the variance-covariance matrix and on a model-based estimate. Results from this investigation indicate that hypothesis testing based on GEE resulted in inappropriate Type I error rates when small samples were employed. Only an increase in the number of subjects to the point where it matched the number of observations per subject resulted in appropriate Type I error rates.

A vector autoregression is fit to recent U.S. data on wheat prices, wheat export sales, wheat export shipments, and exchange rates. Forecast error decompositions and out-of-sample forecasts indicate that exchange rates have little influence on wheat sales and shipments.  相似文献   

This paper develops clinical trial designs that compare two treatments with a binary outcome. The imprecise beta class (IBC), a class of beta probability distributions, is used in a robust Bayesian framework to calculate posterior upper and lower expectations for treatment success rates using accumulating data. The posterior expectation for the difference in success rates can be used to decide when there is sufficient evidence for randomized treatment allocation to cease. This design is formally related to the randomized play‐the‐winner (RPW) design, an adaptive allocation scheme where randomization probabilities are updated sequentially to favour the treatment with the higher observed success rate. A connection is also made between the IBC and the sequential clinical trial design based on the triangular test. Theoretical and simulation results are presented to show that the expected sample sizes on the truly inferior arm are lower using the IBC compared with either the triangular test or the RPW design, and that the IBC performs well against established criteria involving error rates and the expected number of treatment failures.  相似文献   
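
The randomized play-the-winner scheme mentioned in this abstract can be illustrated with its usual urn formulation: each arm starts with the same number of balls, a ball is drawn to assign the next patient, a success adds balls for the same arm, and a failure adds balls for the other arm. The sketch below simulates only that RPW urn, not the IBC design of the paper; the function name `rpw_trial`, its parameters, and the success probabilities are assumptions made for illustration.

```python
import random

def rpw_trial(p_success, n_patients, initial_balls=1, added_balls=1, seed=0):
    """Simulate a two-arm randomized play-the-winner urn (illustrative only)."""
    rng = random.Random(seed)
    urn = [initial_balls, initial_balls]   # balls for arm 0 and arm 1
    assigned = [0, 0]
    successes = [0, 0]
    for _ in range(n_patients):
        # Draw a ball: allocation probability is proportional to urn contents
        prob_arm0 = urn[0] / (urn[0] + urn[1])
        arm = 0 if rng.random() < prob_arm0 else 1
        assigned[arm] += 1
        if rng.random() < p_success[arm]:
            successes[arm] += 1
            urn[arm] += added_balls        # success rewards the same arm
        else:
            urn[1 - arm] += added_balls    # failure rewards the other arm
    return assigned, successes

# Hypothetical true success rates of 0.8 and 0.4
assigned, successes = rpw_trial((0.8, 0.4), n_patients=200)
```

Over repeated runs the allocation tends to drift toward the arm with the higher success rate, which is the behaviour the abstract compares against when it reports expected sample sizes on the inferior arm.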

Eight algorithms are considered for the computation of the stationary distribution $\pi'$ of a finite Markov chain with associated probability transition matrix $P$. The recommended algorithm is based on solving $\pi'(I - P + eu') = u'$, where $e$ is the column vector of ones and $u'$ is a row vector satisfying $u'e \neq 0$. An error analysis is presented for any such $u$, including the choices $u' = e_j'P$ and $u' = e_j'$, where $e_j'$ is the $j$th row of the identity matrix. Computational comparisons between five of the algorithms are made based on twenty 8 x 8, twenty 20 x 20, and twenty 40 x 40 transition matrices. The matrix $(I - P + eu')^{-1}$ is shown to be a non-singular generalized inverse of $I - P$ when the unit root of $P$ is simple and $u'e \neq 0$. A simple closed form expression is obtained for the Moore-Penrose inverse of $I - P$ when $I - P$ has nullity one.
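
The linear system in this abstract can be solved directly. The sketch below is a minimal illustration rather than the authors' implementation: it takes u' to be the first row of the identity matrix (one of the choices named in the abstract), writes the stationary row vector as `pi`, and relies on a general-purpose dense solver. The function name `stationary_distribution` and the 2 x 2 example are hypothetical.

```python
import numpy as np

def stationary_distribution(P, u=None):
    """Solve pi' (I - P + e u') = u' for the stationary row vector pi'."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    e = np.ones(n)
    if u is None:
        u = np.zeros(n)
        u[0] = 1.0                      # u' = e_1', so u'e = 1 (nonzero)
    A = np.eye(n) - P + np.outer(e, u)  # I - P + e u'
    # pi' A = u' is equivalent to A' pi = u
    return np.linalg.solve(A.T, u)

# Hypothetical two-state chain
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi = stationary_distribution(P)
```

Because u'e is nonzero, multiplying the system on the right by e forces the entries of `pi` to sum to one, so no separate normalization step is needed.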

This paper focuses on bivariate kernel density estimation that bridges the gap between univariate and multivariate applications. We propose a subsampling-extrapolation bandwidth matrix selector that improves the reliability of the conventional cross-validation method. The proposed procedure combines a U-statistic expression of the mean integrated squared error and asymptotic theory, and can be used in both cases of diagonal bandwidth matrix and unconstrained bandwidth matrix. In the subsampling stage, one takes advantage of the reduced variability of estimating the bandwidth matrix at a smaller subsample size m (m < n); in the extrapolation stage, a simple linear extrapolation is used to remove the incurred bias. Simulation studies reveal that the proposed method reduces the variability of the cross-validation method by about 50% and achieves an expected integrated squared error that is up to 30% smaller than that of the benchmark cross-validation. It shows comparable or improved performance compared to other competitors across six distributions in terms of the expected integrated squared error. We prove that the components of the selected bivariate bandwidth matrix have an asymptotic multivariate normal distribution, and also present the relative rate of convergence of the proposed bandwidth selector.  相似文献   

This paper develops a method of estimating micro-level poverty in cases where data are scarce. The method is applied to estimate district-level poverty using the household level Indian national sample survey data for two states, viz., West Bengal and Madhya Pradesh. The method involves estimation of state-level poverty indices from the data formed by pooling data of all the districts (each time excluding one district) and multiplying this poverty vector with a known weight matrix to obtain the unknown district-level poverty vector. The proposed method is expected to yield reliable estimates at the district level, because the district-level estimate is now based on a much larger sample size obtained by pooling data of several districts. This method can be an alternative to the “small area estimation technique” for estimating poverty at sub-state levels in developing countries.  相似文献   

A common data mining task is the search for associations in large databases. Here we consider the search for “interestingly large” counts in a large frequency table, having millions of cells, most of which have an observed frequency of 0 or 1. We first construct a baseline or null hypothesis expected frequency for each cell, and then suggest and compare screening criteria for ranking the cell deviations of observed from expected count. A criterion based on the results of fitting an empirical Bayes model to the cell counts is recommended. An example compares these criteria for searching the FDA Spontaneous Reporting System database maintained by the Division of Pharmacovigilance and Epidemiology. In the example, each cell count is the number of reports combining one of 1,398 drugs with one of 952 adverse events (total of cell counts = 4.9 million), and the problem is to screen the drug-event combinations for possible further investigation.  相似文献   

In this paper, we introduce the mixed Liu estimator (MLE) for the vector of parameters in linear measurement error models by unifying the sample and the prior information. The MLE is a generalization of the mixed estimator (ME) and the Liu estimator (LE). In particular, asymptotic normality properties of the estimators are discussed, and the performance of the MLE is compared with that of the LE and ME based on the mean squared error matrix (MSEM). Finally, a Monte Carlo simulation and a numerical example are also presented.
