A regression simulation study investigates the behaviour of ICOMP, AIC, and BIC under various levels of collinearity, sample size, and residual variance. When the variation in the design matrix is large, the agreement percentages of all the information criteria decrease monotonically as the collinearity in the design matrix increases, and ICOMP agrees with the Kullback-Leibler model more often. As the residual variance increases, the agreement percentages of all the information criteria decrease. However, as the sample size increases, the agreement percentages of all the information criteria increase. When both the variation in the design matrix and the collinearity are low, the agreement percentages of all the information criteria decrease monotonically as the residual variance increases, and ICOMP agrees with the Kullback-Leibler model more often than both AIC and BIC.

In this article, we assess the suitability of two new ridge estimators by means of a simulation study. We compare these estimators with well-known ridge estimators. We also compare the ordinary least squares (OLS) estimator directly with the ridge estimators, using the ratio of the average total mean square error of the OLS estimator to that of each ridge estimator. We find that the new estimators perform well under certain conditions.

The problem of predicting a future value of a time series is considered in this article. If the series follows a stationary Markov process, this can be done by nonparametric estimation of the autoregression function. Two forecasting algorithms are introduced. They only differ in the nonparametric kernel-type estimator used: the Nadaraya-Watson estimator and the local linear estimator. There are three major issues in the implementation of these algorithms: selection of the autoregressor variables, smoothing parameter selection, and computing prediction intervals. These have been tackled using recent techniques borrowed from the nonparametric regression estimation literature under dependence. The performance of these nonparametric algorithms has been studied by applying them to a collection of 43 well-known time series. Their results have been compared to those obtained using classical Box-Jenkins methods. Finally, the practical behavior of the methods is also illustrated by a detailed analysis of two data sets.
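
The Nadaraya-Watson forecast mentioned in this abstract can be sketched as follows. This is only an illustration under simplifying assumptions, not the article's algorithm: it uses a single lag-1 autoregressor, a Gaussian kernel, and a bandwidth fixed by hand, whereas the article selects the autoregressor variables and the smoothing parameter from the data. The function name `nw_forecast` is hypothetical.

```python
import math


def nw_forecast(series, bandwidth):
    """One-step-ahead forecast via Nadaraya-Watson regression on lag 1.

    Assumptions (not from the article): lag-1 autoregressor only,
    Gaussian kernel, user-supplied bandwidth.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")

    x_last = series[-1]
    weighted_sum = 0.0
    weight_total = 0.0
    # The pairs (X_t, X_{t+1}) form the regression sample.
    for x_prev, x_next in zip(series[:-1], series[1:]):
        u = (x_last - x_prev) / bandwidth
        weight = math.exp(-0.5 * u * u)  # unnormalised Gaussian kernel
        weighted_sum += weight * x_next
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    # Kernel-weighted average of the successors of past values close to x_last.
    return weighted_sum / weight_total
```

For example, `nw_forecast([1, 2, 1, 2, 1, 2, 1], 0.1)` returns a value very close to 2, because every past value near 1 was followed by 2.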

In this article, we present a model-based framework to estimate the educational attainments of students in latent groups defined by unobservable or only partially observed features that are likely to affect the outcome distribution and that are of interest in their own right. We focus our attention on the case of students in the first year of upper secondary school, for whom the teachers’ suggestion at the end of their lower educational level toward the subsequent type of school is available. We use this information to develop latent strata according to the compliance behavior of students, simplifying to the case of binary data for both counseled and attended school (i.e., academic or technical institute). We consider a likelihood-based approach to estimate outcome distributions in the latent groups and propose a set of plausible assumptions with respect to the problem at hand. In order to assess our method and its robustness, we simulate data resembling a real study conducted on pupils of the province of Bologna in the 2007/2008 school year to investigate their success or failure at the end of the first school year.

This article studies robustification strategies for the linear model in the presence of outliers. The advantages of an internal analysis of the robustness of least squares for a given sample are pointed out. The application of this methodology is illustrated by building an explicit model of the determinants of rental housing values in the Madrid Metropolitan Area.

Although multiple indices have been introduced in the area of agreement measurement, the only documented index for linear relational agreement, which applies to interval scale data, is the Pearson product-moment correlation coefficient. Despite its meaningfulness, the Pearson product-moment correlation coefficient does not convey practical information such as what proportion of observations is within a certain boundary of the target value. To address this need, based on inverse regression, we propose the adjusted mean squared deviation (AMSD), adjusted coverage probability (ACP), and adjusted total deviation index (ATDI) for the measurement of relational agreement. They can serve as reasonable and practically meaningful measurements of relational agreement. Real-life data are considered to illustrate the performance of the methods.

The size and power properties of the Cox–Stuart test for detection of a monotonic deterministic trend in hydrological time series are analyzed using the Monte Carlo method. The influence of distribution properties, lengths of series, and trend slopes is studied. Results indicate good size in all cases. The power is high for: length over 60 and strong trend slope, low or medium variation, and medium slope. The power declines if slope and length decrease and if variability increases. The properties are better for skewed distributions than for symmetrical ones. The test is slightly weaker in comparison to the Mann–Kendall test.
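
For readers unfamiliar with the Cox–Stuart test discussed in this abstract, a minimal sketch of its usual sign-test form follows. It is an assumption-laden illustration and not the code used in the study: the series is split into two halves (the middle observation is dropped when the length is odd), ties are discarded, and a two-sided exact binomial p-value is computed. The function name `cox_stuart` is hypothetical.

```python
from math import comb


def cox_stuart(series):
    """Cox-Stuart sign test for a monotonic trend (two-sided, exact).

    Returns (n_increases, n_decreases, p_value). Tied pairs are ignored.
    """
    n = len(series)
    half = n // 2
    first = series[:half]
    second = series[n - half:]  # skips the middle value when n is odd

    increases = sum(1 for a, b in zip(first, second) if b > a)
    decreases = sum(1 for a, b in zip(first, second) if b < a)
    m = increases + decreases
    if m == 0:
        return increases, decreases, 1.0  # no informative pairs

    # Under the no-trend hypothesis the number of increases is Binomial(m, 1/2).
    k = min(increases, decreases)
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m
    return increases, decreases, min(1.0, 2.0 * tail)
```

Calling `cox_stuart(list(range(1, 21)))` pairs the first ten values with the last ten, finds ten increases and no decreases, and gives a p-value of 2/1024.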

In the complete balanced model for the analysis of variance, the equivalence of sums of squares and quadratic forms is seen to imply well-fitting patterns involving Kronecker products of identity matrices and scalar multiples of matrices with all elements equal to 1. The questions of symmetry, idempotency, and orthogonality so central to this topic are answered by simple multiplications; ranks are determined from simple traces. The associations between the forms of the two-factor model are presented here in a way that is accessible to first-year students and makes generalizations to higher order models transparent. The lack of patterns in incomplete or unbalanced models is noted. Additional steps in design and analysis are suggested in the references.  相似文献   


Runs rules are usually used with Shewhart-type charts to enhance the charts' sensitivities toward small and moderate shifts. Abbas et al. (2011) took this a step further by proposing two runs rules schemes applied to the exponentially weighted moving average (EWMA) chart and evaluating their average run length (ARL) performances using simulation. They showed that the proposed schemes are superior to the classical EWMA chart and the other schemes investigated. Besides pointing out some erroneous ARL and standard deviation of the run length (SDRL) computations in Abbas et al., this paper presents a Markov chain approach for computing the ARL, percentiles of the run length (RL) distribution, and SDRL for the two runs rules schemes of Abbas et al. Using the Markov chain approach, we also propose two combined runs rules EWMA schemes to quicken the response of the two schemes of Abbas et al. to large shifts. The runs rules (basic and combined rules) EWMA schemes are compared with some existing control charting methods, and the former are shown to prevail.
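
As background for this abstract, the classical EWMA chart that the runs rules schemes build on can be sketched as below. This is the textbook chart with time-varying control limits, not the runs rules schemes of Abbas et al. or the Markov chain ARL computation of the paper. The function name `ewma_chart` and the default values `lam=0.2` and `L=3.0` are assumptions chosen for illustration.

```python
import math


def ewma_chart(data, mu0, sigma, lam=0.2, L=3.0):
    """Classical EWMA chart with exact (time-varying) control limits.

    Returns one tuple (z, lcl, ucl, signal) per observation, where
    z_t = lam * x_t + (1 - lam) * z_{t-1} and z_0 = mu0.
    """
    if not 0 < lam <= 1:
        raise ValueError("lam must lie in (0, 1]")
    z = mu0
    rows = []
    for t, x in enumerate(data, start=1):
        z = lam * x + (1 - lam) * z
        # Var(z_t) = sigma^2 * lam / (2 - lam) * (1 - (1 - lam)^(2t))
        half_width = L * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        )
        rows.append((z, mu0 - half_width, mu0 + half_width, abs(z - mu0) > half_width))
    return rows
```

With `lam=1` the statistic reduces to the raw observation and the limits to `mu0 ± L * sigma`, which is the Shewhart chart.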

Selection of the cell in a 2×2 factorial design with the greatest mean is considered. A general class of ranking and selection procedures (RSP) is constructed to include methods based on the largest marginal cell means (SP1) or on the largest cell mean (SP3). Using the preference zone approach, the minimum probability of a correct solution is found. In this paper, an RSP which maximizes the minimum probability of a correct solution over the preference zone is found. In this way, selection of the cell with the greatest observed mean is proven to be admissible.

A particular semiparametric model of interest is the generalized partial linear model (GPLM), which extends the generalized linear model (GLM) by a nonparametric component. The paper reviews different estimation procedures based on kernel methods as well as test procedures on the correct specification of this model (vs. a parametric generalized linear model). Simulations and an application to a data set on East–West German migration illustrate similarities and dissimilarities of the estimators and test statistics.

Many analyses in epidemiological and prognostic studies and in studies of event history data require methods that allow for unobserved covariates or “frailties”. We consider the shared frailty model in the framework of the parametric proportional hazard model. There are certain assumptions about the distribution of frailty and the baseline distribution. The exponential distribution is the commonly used distribution for analyzing lifetime data. In this paper, we consider the shared gamma frailty model with the bivariate exponential distribution of Marshall and Olkin (1967) as the baseline hazard for bivariate survival times. We solve the inferential problem in a Bayesian framework with the help of a comprehensive simulation study and a real data example. We fit the model to the real-life bivariate survival data set of diabetic retinopathy data. We introduce a Bayesian estimation procedure using the Markov Chain Monte Carlo (MCMC) technique to estimate the parameters involved in the proposed model and then compare the true values of the parameters with the estimated values for different sample sizes.

This paper presents an expository development of Stein estimation in several distribution families. Considered are both the point estimation and confidence interval cases. Specific results for linear regression models are added. Emphasis is laid on the chronological history and on recent results.

The sequential minimal optimization (SMO) algorithm is effective in solving large-scale support vector machine (SVM) problems. The existing algorithms all assume that the kernels are positive definite (PD) or positive semi-definite (PSD) and meet the Mercer condition. Some kernels, however, such as the sigmoid kernel, which originates from neural networks and is extensively used in SVM, are conditionally PD only in certain circumstances; in addition, in practice it is often difficult to prove whether a kernel is PD or PSD, except for some well-known kernels. The applications of the existing SMO algorithms are therefore limited. Considering this deficiency of the traditional algorithms, an algorithm for solving ε-SVR with non-positive semi-definite (non-PSD) kernels is proposed. Unlike the existing algorithms, which must consider four Lagrange multipliers, the algorithm proposed in this article needs to consider only two Lagrange multipliers in the process of implementation. The proposed algorithm simplifies the implementation by expanding the original dual programming of ε-SVR and solving its KKT conditions, and is thus easily applied to solving ε-SVR with non-PSD kernels. The presented algorithm is evaluated using five benchmark problems and one real-world problem. The results show that ε-SVR with non-PSD kernels provides more accurate predictions than ε-SVR with a PD kernel.

In diagnostic trials, the performance of a product is most frequently measured in terms such as sensitivity, specificity and the area under the ROC-curve (AUC). In multiple-reader trials, correlated data appear in a natural way since the same patient is observed under different conditions by several readers. The repeated measures may have quite an involved correlation structure. Even though sensitivity, specificity and the AUC are all assessments of diagnostic ability, a unified approach to analyze all such measurements allowing for an arbitrary correlation structure does not exist. Thus, a unified approach for these three effect measures of diagnostic ability will be presented in this paper. The fact that sensitivity and specificity are particular AUCs will serve as a basis for our method of analysis. As the presented theory can also be used in set-ups with correlated binomial random variables, it may have a more extensive application than only in diagnostic trials.

It is common in linear regression models that the error variances are not the same for all observations and that there are some high leverage data points. In such situations, the available literature advocates the use of heteroscedasticity consistent covariance matrix estimators (HCCME) for the testing of regression coefficients. Primarily, such estimators are based on the residuals derived from the ordinary least squares (OLS) estimator, which itself can be seriously inefficient in the presence of heteroscedasticity. To obtain efficient estimation, many efficient estimators, namely the adaptive estimators, are available, but their performance has not yet been evaluated when heteroscedasticity is accompanied by high leverage data. In this article, the presence of high leverage data is taken into account to evaluate the performance of the adaptive estimator in terms of efficiency. Furthermore, our numerical work also evaluates the performance of the robust standard errors based on this efficient estimator in terms of interval estimation and null rejection rate (NRR).

Many authors have criticized the use of spreadsheets for statistical data processing and computing because of incorrect statistical functions, no log file or audit trail, inconsistent behavior of computational dialogs, and poor handling of missing values. Some improvements in some spreadsheet processors and the possibility of audit trail facilities suggest that the use of a spreadsheet for some statistical data entry and simple analysis tasks may now be acceptable. A brief outline of some issues and some guidelines for good practice are included.
