Similar Documents
 20 similar documents retrieved (search time: 31 ms).
1.
This paper considers statistical reliability for discrete failure data and the selection of the geometric distribution with the smallest failure probability from among several competitors. Using the Bayesian approach, a Bayes selection rule based on type-I censored data is derived and its associated monotonicity property is established. An early selection rule, which allows a selection to be made earlier than the censoring time of the life-testing experiment, is proposed. This early selection rule can be shown to be equivalent to the Bayes selection rule. An illustrative example is given to demonstrate the use and performance of the early selection rule.
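
As a loose illustration of the kind of Bayes selection described above, the sketch below compares several populations with geometric failure times under type-I censoring, places a conjugate Beta prior on each per-period failure probability, and selects the population most likely (a posteriori) to have the smallest one. The prior, the data layout, and the Monte Carlo selection step are simplifying assumptions for illustration, not the rule derived in the paper.

```python
import numpy as np

def select_smallest_failure_prob(data, tau, a=1.0, b=1.0, draws=100_000, seed=0):
    """Toy Bayes selection for geometric lifetimes under type-I censoring.

    data : list of arrays, one per population; entries <= tau are observed
           failure periods, entries > tau mark items still alive at tau.
    With a Beta(a, b) prior on the per-period failure probability p, the
    posterior is Beta(a + d, b + s), where d counts observed failures and
    s counts survived periods. Select the population with the largest
    Monte Carlo posterior probability of having the smallest p.
    """
    rng = np.random.default_rng(seed)
    post = []
    for times in data:
        t = np.asarray(times)
        failed = t <= tau
        d = failed.sum()
        s = (t[failed] - 1).sum() + (~failed).sum() * tau
        post.append(rng.beta(a + d, b + s, size=draws))
    post = np.vstack(post)                       # shape: (populations, draws)
    winners = post.argmin(axis=0)                # smallest p in each posterior draw
    prob_best = np.bincount(winners, minlength=len(data)) / draws
    return int(prob_best.argmax()), prob_best

# three populations, each censored at tau = 10 periods
data = [np.array([3, 5, 11, 2, 11]),
        np.array([1, 2, 2, 4, 3]),
        np.array([11, 11, 7, 11, 9])]
print(select_smallest_failure_prob(data, tau=10))
```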

2.

In this article, we study the approximately unbiased multi-level pseudo maximum likelihood (MPML) estimation method for general multi-level modeling with sampling weights. We conduct a simulation study to determine the effect various factors have on the estimation method. The factors included in this study are scaling method, size of clusters, invariance of selection, informativeness of selection, intraclass correlation, and variability of standardized weights. The scaling method indicates how the weights are normalized on each level. The invariance of selection indicates whether or not the same selection mechanism is applied across clusters. The informativeness of selection indicates how biased the selection is. We summarize our findings and recommend a multi-stage procedure based on the MPML method that can be used in practical applications.
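
The "scaling method" factor refers to how level-1 sampling weights are normalized within each cluster. Below is a minimal sketch of two scalings commonly discussed in the multilevel pseudo maximum likelihood literature; the function name and the choice of these two particular variants are illustrative assumptions, not necessarily the exact scalings used in the article.

```python
import numpy as np

def scale_weights(w, method="cluster_size"):
    """Rescale level-1 sampling weights within one cluster.

    Two scalings often discussed in the multilevel pseudo-ML literature:
      - "cluster_size":   scaled weights sum to the cluster size n_j
      - "effective_size": scaled weights sum to the effective sample
                          size (sum w)^2 / sum(w^2)
    """
    w = np.asarray(w, dtype=float)
    if method == "cluster_size":
        return w * len(w) / w.sum()
    elif method == "effective_size":
        return w * w.sum() / np.sum(w ** 2)
    raise ValueError("unknown scaling method")

# toy cluster of 5 units with unequal selection probabilities
w = np.array([1.2, 3.5, 0.8, 2.0, 1.5])
print(scale_weights(w, "cluster_size").sum())    # equals 5.0
print(scale_weights(w, "effective_size").sum())  # equals (sum w)^2 / sum w^2
```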

3.
In this article, we consider the problem of variable selection in linear regression when multicollinearity is present in the data. It is well known that in the presence of multicollinearity, the performance of the least squares (LS) estimator of the regression parameters is not satisfactory. Consequently, subset selection methods, such as Mallows' Cp, which are based on LS estimates, lead to the selection of inadequate subsets. To overcome the problem of multicollinearity in subset selection, a new subset selection algorithm based on the ridge estimator is proposed. It is shown that the new algorithm is a better alternative to Mallows' Cp when the data exhibit multicollinearity.
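
For reference, the sketch below computes the classical Mallows' Cp statistic for a candidate subset and a hypothetical ridge-based analogue in which the LS fit is replaced by a ridge fit. The ridge variant only illustrates the general idea of combining ridge estimation with a Cp-type statistic; it is not the algorithm proposed in the article. In both functions, sigma2_full is the residual variance estimate from the full model, as in the usual Cp construction.

```python
import numpy as np

def mallows_cp(X, y, subset, sigma2_full):
    """Classical Mallows' Cp for the LS fit on the given columns:
    Cp = SSE_p / sigma2_full - n + 2 * p   (p counts the intercept)."""
    n = len(y)
    Xs = np.column_stack([np.ones(n), X[:, subset]])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    sse = np.sum((y - Xs @ beta) ** 2)
    return sse / sigma2_full - n + 2 * Xs.shape[1]

def ridge_cp(X, y, subset, sigma2_full, lam=1.0):
    """Hypothetical ridge-based variant: the same Cp-type statistic, but
    the coefficients come from a ridge fit with penalty lam (illustration
    only, not the cited algorithm)."""
    n = len(y)
    Xs = np.column_stack([np.ones(n), X[:, subset]])
    p = Xs.shape[1]
    beta = np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ y)
    sse = np.sum((y - Xs @ beta) ** 2)
    return sse / sigma2_full - n + 2 * p
```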

4.
Model selection is a pervasive problem in generalized linear models. A model selection criterion based on deviance, called the deviance-based criterion (DBC), is proposed. The DBC is obtained by penalizing the difference between the deviance of the fitted model and that of the full model. Under certain weak conditions, the DBC is shown to be a consistent model selection criterion in the sense that, with probability approaching one, the selected model asymptotically equals the optimal model relating the response and the predictors. Further, the use of the DBC in link function selection is also discussed. We compare the proposed model selection criterion with existing methods. The small-sample efficiency of the proposed criterion is evaluated in a simulation study.
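
Below is a minimal sketch of a criterion of the general form described above, using statsmodels to obtain the deviances. The complexity penalty is left as a user-supplied function because the exact penalty of the paper is not reproduced here; the BIC-like charge in the example is an assumption for illustration.

```python
import numpy as np
import statsmodels.api as sm

def dbc(y, X_candidate, X_full, family, penalty):
    """Deviance-based criterion of the general form sketched above:
    (deviance of candidate model - deviance of full model) plus a
    complexity penalty. `penalty(p, n)` is an assumption, e.g.
    lambda p, n: p * np.log(n) for a BIC-like charge."""
    n = len(y)
    d_cand = sm.GLM(y, X_candidate, family=family).fit().deviance
    d_full = sm.GLM(y, X_full, family=family).fit().deviance
    return (d_cand - d_full) + penalty(X_candidate.shape[1], n)

# toy Poisson example: compare a one-covariate model against the full model
rng = np.random.default_rng(0)
n = 200
X = sm.add_constant(rng.normal(size=(n, 3)))
y = rng.poisson(np.exp(0.2 + 0.5 * X[:, 1]))
bic_like = lambda p, n: p * np.log(n)
print(dbc(y, X[:, :2], X, sm.families.Poisson(), bic_like))
```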

5.
We address the issue of model selection in beta regressions with varying dispersion. The model consists of two submodels, one for the mean and one for the dispersion. Our focus is on the selection of the covariates for each submodel. Our Monte Carlo evidence reveals that joint selection of the covariates for the two submodels is not accurate in finite samples. We introduce two new model selection criteria that explicitly account for varying dispersion and propose a fast two-step model selection scheme that is considerably more accurate and computationally less costly than the usual joint model selection. Monte Carlo evidence is presented and discussed. We also present the results of an empirical application.

6.
We investigate the problem of selecting the best population from positive exponential family distributions based on type-I censored data. A Bayes rule is derived and a monotone property of the Bayes selection rule is obtained. Following that property, we propose an early selection rule. Through this early selection rule, one can terminate the experiment on a few populations early and possibly make the final decision before the censoring time. An example is provided in the final part to illustrate the use of the early selection rule.

7.
The problem of selecting the best population from among a finite number of populations in the presence of uncertainty arises in many scientific investigations and has been studied extensively. Many selection procedures have been derived for different selection goals. However, most of these procedures, being frequentist in nature, do not indicate how to incorporate the information in a particular sample to give a data-dependent measure of the correct selection achieved for that sample. They often assign the same decision and probability of correct selection to two different sample values, one of which actually seems intuitively much more conclusive than the other. The methodology of conditional inference offers an approach that achieves both frequentist interpretability and a data-dependent measure of conclusiveness. By partitioning the sample space into a family of subsets, the achieved probability of correct selection is computed by conditioning on which subset the sample falls in. In this paper, the partition considered is the so-called continuum partition, while the selection rules are both the fixed-size and random-size subset selection rules. Under the distributional assumption of a monotone likelihood ratio, results on the least favourable configuration and alpha-correct selection are established. These results are not only useful in themselves, but are also used to design a new sequential procedure with elimination for selecting the best of k binomial populations. Comparisons between this new procedure and some other sequential selection procedures with regard to total expected sample size and some risk functions are carried out by simulation.

8.
Feature selection (FS) is one of the most powerful techniques for coping with the curse of dimensionality. In this study, a new filter approach to feature selection based on distance correlation is presented (DCFS for short), which retains the model-free advantage and requires no pre-specified parameters. Our method consists of two steps: a hard step (forward selection) and a soft step (backward selection). In the hard step, two types of association, between an individual feature and the classes and between a group of features and the classes, are used to pick out the features most relevant to the target classes. Because of the strict screening condition in the first step, some useful features are likely to be removed. Therefore, in the soft step, a feature-relationship gain (a feature score) based on distance correlation is introduced, which involves five kinds of associations. We sort the feature gain values and run the backward selection procedure until the errors stop declining. Simulation results show that our method is competitive on several datasets compared with representative feature selection methods based on several classification models.
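
The sketch below shows the kind of building blocks such a filter uses: a sample distance correlation function and a greedy forward step that repeatedly adds the feature whose inclusion maximizes the distance correlation between the selected group and the class labels. It is only in the spirit of the "hard step"; class labels are assumed to be numeric codes, and the backward "soft step" with its five kinds of associations is not reproduced.

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation between two samples (rows = observations)."""
    x = np.asarray(x, float).reshape(len(x), -1)
    y = np.asarray(y, float).reshape(len(y), -1)

    def centered(z):
        d = np.sqrt(((z[:, None, :] - z[None, :, :]) ** 2).sum(-1))
        return d - d.mean(0) - d.mean(1)[:, None] + d.mean()

    A, B = centered(x), centered(y)
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return 0.0 if denom == 0 else np.sqrt(max(dcov2, 0.0) / denom)

def forward_select(X, y, k):
    """Greedy forward step: add, one at a time, the feature that maximizes
    the distance correlation between the selected group and the labels y."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < k:
        _, best = max((distance_correlation(X[:, selected + [j]], y), j)
                      for j in remaining)
        selected.append(best)
        remaining.remove(best)
    return selected
```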

9.
Model selection methods are important for identifying the best approximating model. To identify the best meaningful model, the purpose of the model should be clearly stated in advance. The focus of this paper is model selection when the modelling purpose is classification. We propose a new model selection approach designed for logistic regression model selection when the main modelling purpose is classification. The method is based on the distance between two clustering trees. We also question and evaluate the performance of conventional model selection methods based on information-theoretic criteria in determining the best logistic regression classifier. An extensive simulation study is used to assess the finite-sample performance of the cluster-tree-based and the information-theoretic model selection methods. Simulations are adjusted for whether or not the true model is in the candidate set. Results show that the new approach is highly promising. Finally, the methods are applied to a real data set to select a binary model as a means of classifying subjects with respect to their risk of breast cancer.

10.
We consider a Bayesian nonignorable model to accommodate a nonignorable selection mechanism for predicting small area proportions. Our main objective is to extend a model for selection bias from a previously published paper, coauthored by four authors, to accommodate small areas. These authors assume that the survey weights (or their reciprocals, which we also call selection probabilities) are available, but that there is no simple relation between the binary responses and the selection probabilities. To capture the nonignorable selection bias within each area, they assume that the binary responses and the selection probabilities are correlated. To accommodate the small areas, we extend their model to a hierarchical Bayesian nonignorable model and use Markov chain Monte Carlo methods to fit it. We illustrate our methodology using a numerical example obtained from data on activity limitation in the U.S. National Health Interview Survey. We also perform a simulation study to assess the effect of the correlation between the binary responses and the selection probabilities.

11.
The problem of sample selection, when a one-stage superpopulation model-based approach is used to predict individual variate values for each unit in a finite population based on a sample of only some of the units, is investigated. The model framework is discussed and a sample selection scheme based on the model is derived. The sample selection scheme is evaluated using actual data. Future research topics, including multiple predictions per unit, are suggested.

12.
The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences, and give recommendations about the preferred approaches. We focus on variable subset selection for regression and classification and perform several numerical experiments using both simulated and real-world data. The results show that optimizing a utility estimate such as the cross-validation (CV) score is liable to find overfitted models, owing to the relatively high variance of the utility estimates when data are scarce. This can also lead to substantial selection-induced bias and optimism in the performance evaluation of the selected model. From a predictive viewpoint, the best results are obtained by accounting for model uncertainty through the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on the CV score. Overall, the projection method also appears to outperform the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that model selection can greatly benefit from using cross-validation outside the search process, both for guiding the model size selection and for assessing the predictive performance of the finally selected model.
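
The selection-induced optimism described above is easy to reproduce in a toy setting: below, pure-noise predictors are searched by CV score and the winning subset's CV error is compared with its error on a large independent test set. This is only an illustrative scikit-learn example of the phenomenon, not the Bayesian machinery or data of the article.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Pure-noise predictors: y is independent of X, so no subset has real
# predictive value. Searching many subsets by CV score still makes the
# winner's CV error look optimistic relative to an independent test set.
rng = np.random.default_rng(1)
n, p = 40, 8
X, y = rng.normal(size=(n, p)), rng.normal(size=n)
X_test, y_test = rng.normal(size=(2000, p)), rng.normal(size=2000)

best_score, best_subset = -np.inf, None
for k in range(1, 4):
    for s in combinations(range(p), k):
        score = cross_val_score(LinearRegression(), X[:, list(s)], y,
                                cv=5, scoring="neg_mean_squared_error").mean()
        if score > best_score:
            best_score, best_subset = score, list(s)

model = LinearRegression().fit(X[:, best_subset], y)
test_mse = np.mean((y_test - model.predict(X_test[:, best_subset])) ** 2)
print("selected columns:", best_subset)
print("CV MSE of winner:", round(-best_score, 3), " test MSE:", round(test_mse, 3))
```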

13.
In this paper, we study the problem of estimation and variable selection for generalised partially linear single-index models based on quasi-likelihood, extending existing studies on variable selection for partially linear single-index models to binary and count responses. To take into account the unit norm constraint of the index parameter, we use the ‘delete-one-component’ approach. The asymptotic normality of the estimates is demonstrated. Furthermore, the smoothly clipped absolute deviation penalty is added for variable selection of parameters both in the nonparametric part and the parametric part, and the oracle property of the variable selection procedure is shown. Finally, some simulation studies are carried out to illustrate the finite sample performance.

14.
This article extends the standard regression discontinuity (RD) design to allow for sample selection or missing outcomes. We deal with both treatment endogeneity and sample selection. Identification in this article does not require any exclusion restrictions in the selection equation, nor does it require specifying any selection mechanism. The results can therefore be applied broadly, regardless of how sample selection is incurred. Identification instead relies on smoothness conditions. Smoothness conditions are empirically plausible, have readily testable implications, and are typically assumed even in the standard RD design. We first provide identification of the “extensive margin” and “intensive margin” effects. Then, based on these identification results and principal stratification, sharp bounds are constructed for the treatment effects among the group of individuals that may be of particular policy interest, that is, always-participating compliers. These results are applied to evaluate the impacts of academic probation on college completion and final GPAs. Our analysis reveals striking gender differences at the extensive versus the intensive margin in response to this negative signal on performance.

15.

In this article, we propose a new penalized-likelihood method to conduct model selection for finite mixture of regression models. The penalties are imposed on mixing proportions and regression coefficients, and hence order selection of the mixture and the variable selection in each component can be simultaneously conducted. The consistency of order selection and the consistency of variable selection are investigated. A modified EM algorithm is proposed to maximize the penalized log-likelihood function. Numerical simulations are conducted to demonstrate the finite sample performance of the estimation procedure. The proposed methodology is further illustrated via real data analysis.

16.
Several studies have shown that at the individual level there exists a negative relationship between age at first birth and completed fertility. Using twin data to control for unobserved heterogeneity as a possible source of bias, Kohler et al. (2001) showed the significant presence of such a "postponement effect" at the micro level. In this paper, we apply sample selection models, where selection is based on having or not having had a first birth at all, to estimate the impact of postponing first births on subsequent fertility for four European nations, three of which now have lowest-low fertility levels. We use data from a set of comparative surveys (Fertility and Family Surveys), and we apply sample selection models to the logarithm of total fertility and to the progression to the second birth. Our results show that postponement effects are only very slightly affected by sample selection biases, so that sample selection models do not significantly improve the results of standard regression techniques on selected samples. Our results confirm that the postponement effect is higher in countries with lowest-low fertility levels.

17.
Variable selection in elliptical linear mixed models (LMMs) with a shrinkage penalty function (SPF) is the main focus of this study. SPFs are applied for parameter estimation and variable selection simultaneously. The smoothly clipped absolute deviation (SCAD) penalty is one such SPF, and it is adapted to the elliptical LMM in this study. The proposed idea is applicable to a wide variety of models set up with different distributions, such as the normal, Student-t, Pearson VII, and power exponential distributions. Simulation studies and a real data example with one of the elliptical distributions show that if variable selection is also a concern, it is worthwhile to carry out variable selection and parameter estimation simultaneously in the elliptical LMM.
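
For reference, below is a minimal NumPy sketch of the SCAD penalty itself, in its standard piecewise form with the conventional default a = 3.7. The elliptical-LMM estimation machinery of the study is not reproduced here.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty evaluated elementwise:
        lam * |t|                                 if |t| <= lam
        (2*a*lam*|t| - t^2 - lam^2) / (2*(a-1))   if lam < |t| <= a*lam
        lam^2 * (a + 1) / 2                       if |t| > a*lam
    """
    t = np.abs(np.asarray(theta, dtype=float))
    small = t <= lam
    mid = (t > lam) & (t <= a * lam)
    pen = np.empty_like(t)
    pen[small] = lam * t[small]
    pen[mid] = (2 * a * lam * t[mid] - t[mid] ** 2 - lam ** 2) / (2 * (a - 1))
    pen[~(small | mid)] = lam ** 2 * (a + 1) / 2
    return pen

# small coefficients are penalized like the lasso; large ones are capped
print(scad_penalty([0.1, 1.0, 5.0], lam=0.5))
```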

18.
We consider the problem of model selection based on quantile analysis, with unknown parameters estimated using quantile least squares. We propose a model selection test for the null hypothesis that the competing models are equivalent against the alternative hypothesis that one model is closer to the true model. We follow with two applications of the proposed model selection test. The first application is model selection for time series with non-normal innovations. The second application is model selection in forecasting with the NoVas (normalizing and variance stabilizing transformation) method. A set of simulation results also lends strong support to the results presented in the paper.

19.
Parametric families of multivariate nonnormal distributions have received considerable attention in the past few decades. The authors propose a new definition of a selection distribution that encompasses many existing families of multivariate skewed distributions. Their work is motivated by examples that involve various forms of selection mechanisms and lead to skewed distributions. They give the main properties of selection distributions and show how various families of multivariate skewed distributions, such as the skew-normal and skew-elliptical distributions, arise as special cases. The authors further introduce several methods of constructing selection distributions based on linear and nonlinear selection mechanisms.
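
A quick simulation illustrating the simplest such selection mechanism: starting from a bivariate normal pair and keeping one coordinate only when the other is positive yields a skew-normal sample. The specific parameter values below are illustrative only.

```python
import numpy as np

# Linear selection mechanism: start from a bivariate normal (X0, X1) with
# correlation delta and retain X1 only when X0 > 0. The selected values of
# X1 follow a skew-normal distribution with shape alpha = delta / sqrt(1 - delta**2),
# whose mean is sqrt(2 / pi) * delta.
rng = np.random.default_rng(42)
delta = 0.8
cov = np.array([[1.0, delta], [delta, 1.0]])
x = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)
selected = x[x[:, 0] > 0, 1]          # keep X1 only when X0 > 0

print("mean of selected sample:   ", selected.mean())
print("theoretical skew-normal mean:", np.sqrt(2 / np.pi) * delta)
```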

20.

In this article, we study variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and is empirically shown to be much more efficient and robust with respect to heavy-tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously; the procedure is shown to possess the oracle property. To deal with the missing covariates, we consider inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator using the estimated selection probability has a smaller asymptotic variance than the one using the true selection probability, and is thus more efficient. Therefore, the important Horvitz-Thompson property is verified for the penalized rank estimator with missing covariates in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators.
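
Below is a bare-bones sketch of the inverse probability weighting idea for complete-case estimation in a linear model, with the selection probability estimated by a logistic regression of the missingness indicator on always-observed variables. The function name and setup are illustrative assumptions; the paper's penalized rank estimator and SCAD selection step are not reproduced.

```python
import numpy as np
import statsmodels.api as sm

def ipw_linear_fit(y, X, complete, Z):
    """Complete-case weighted LS for a linear model with missing covariates.

    complete : 0/1 indicator that a row of X is fully observed
    Z        : always-observed variables used to model P(complete = 1 | Z)
    Each complete case is weighted by the inverse of its estimated selection
    probability (the classical IPW / Horvitz-Thompson idea).
    """
    obs = np.asarray(complete, dtype=bool)
    logit = sm.Logit(obs.astype(float), sm.add_constant(Z)).fit(disp=0)
    p_hat = np.asarray(logit.predict(sm.add_constant(Z)))
    Xc = sm.add_constant(X)
    return sm.WLS(y[obs], Xc[obs], weights=1.0 / p_hat[obs]).fit().params
```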
