Found 20 similar articles; search took 15 ms.
1.
The purpose of this paper is to identify a relationship between pupils' mathematics and reading test scores and the characteristics of the students themselves, stratifying for classes, schools and geographical areas. The data set of interest contains detailed information about more than 500,000 students in the first year of junior secondary school in the year 2012/2013, provided by the Italian Institute for the Evaluation of Educational System. The innovation of this work is in the use of multivariate multilevel models, in which the outcome is bivariate: reading and mathematics achievement. Using the bivariate outcome enables researchers to analyze the correlations between achievement levels in the two fields and to predict statistically significant school and class effects after adjusting for pupils' characteristics. The statistical model employed here explicitly accounts for the potential covariance between the two topics, and at the same time it allows the school effect to vary between them. The results show that while in most cases the direction of the school effect is coherent for reading and mathematics (i.e. positive/negative), there are cases where internal school factors lead to different performances in the two fields.
2.
Shelley B. Bull, Celia M.T. Greenwood, Allan Donner 《Revue canadienne de statistique》1994,22(3):319-334
One feature of the usual polychotomous logistic regression model for categorical outcomes is that a covariate must be included in all the regression equations. If a covariate is not important in all of them, the procedure will estimate unnecessary parameters. More flexible approaches allow different subsets of covariates in different regressions. One alternative uses individualized regressions which express the polychotomous model as a series of dichotomous models. Another uses a model in which a reduced set of parameters is simultaneously estimated for all the regressions. Large-sample efficiencies of these procedures were compared in a variety of circumstances in which there was a common baseline category for the outcome and the covariates were normally distributed. For a correctly specified model, the reduced estimates were over 100% efficient for nonzero slope parameters and up to 500% efficient when the baseline frequency and the effect of interest were small. The individualized estimates could have efficiencies less than 50% when the effect of interest was large, but were also up to 130% efficient when the baseline frequency was large and the effect of interest was small. Efficiency was usually enhanced by correlation among the covariates. For an underspecified reduced model, asymptotic bias in the reduced estimates was approximately proportional to the magnitude of the omitted parameter and to the reciprocal of the baseline frequency.
3.
Folefac D. Atem, Ravi K. Sharma, Stewart J. Anderson 《Journal of applied statistics》2011,38(9):1819-1831
Using data from the National Health Interview Survey from 1997 to 2006, we present a multilevel analysis of change in body mass index (BMI) and number of cigarettes smoked per day in the USA. Smoking and obesity are the leading causes of preventable mortality and morbidity in the USA and most parts of the developed world. A two-stage bivariate model of changes in obesity and number of cigarettes smoked per day is proposed. At the within-subject stage, an individual's BMI status and the number of cigarettes smoked per day are jointly modeled as a function of an individual growth trajectory plus a random error. At the between-subject stage, the parameters of the individual growth trajectories are allowed to vary as a function of differences between subjects with respect to demographic and behavioral characteristics and with respect to the four regions of the USA (Northeast, West, South and North Central). Our two-stage modeling techniques are more informative than standard regression because they characterize both group-level (nomothetic) and individual-level (idiographic) effects, yielding a more complete understanding of the phenomena under study.
4.
Peter C. Austin, George Leckie 《Journal of Statistical Computation and Simulation》2018,88(16):3151-3163
When using multilevel regression models that incorporate cluster-specific random effects, the Wald and the likelihood ratio (LR) tests are used for testing the null hypothesis that the variance of the random effects distribution is equal to zero. We conducted a series of Monte Carlo simulations to examine the effect of the number of clusters and the number of subjects per cluster on the statistical power to detect a non-null random effects variance and to compare the empirical type I error rates of the Wald and LR tests. Statistical power increased with increasing number of clusters and number of subjects per cluster. Statistical power was greater for the LR test than for the Wald test. These results applied to both the linear and logistic regressions, but were more pronounced for the latter. The use of the LR test is preferable to the use of the Wald test.
5.
《Journal of Statistical Computation and Simulation》2012,82(2):252-261
Multilevel models are popular models for analysing data with a hierarchical structure. They are used in diverse fields including the social, medical, economic and biological sciences. Parameter estimation in these models becomes problematic if there are measurement errors in either the explanatory or the response variables. A common approach to tackle this obstacle is to consider pseudo variables and use simulation methods to estimate the parameters. We propose a new algorithm that alternates between iterative and simulation-extrapolation steps. To evaluate the proposed algorithm, various simulation studies are also conducted. Moreover, we apply our method to a real data set concerning the cost and expenditure of households in the city of Tehran in 2007.
6.
This paper develops alternatives to maximum likelihood estimators (MLE) for logistic regression models and compares the mean squared error (MSE) of the estimators. The MLE for the vector of underlying success probabilities has low MSE only when the true probabilities are extreme (i.e., near 0 or 1). Extreme probabilities correspond to logistic regression parameter vectors which are large in norm. A competing “restricted” MLE and an empirical version of it are suggested as estimators with better performance than the MLE for central probabilities. An approximate EM-algorithm for estimating the restriction is described. As in the case of normal theory ridge estimators, the proposed estimators are shown to be formally derivable by Bayes and empirical Bayes arguments. The small sample operating characteristics of the proposed estimators are compared to the MLE via a simulation study; both the estimation of individual probabilities and of logistic parameters are considered.
7.
In this paper, we obtain a new approximation of the Student's t distribution by using the symmetric generalized logistic (SGL) distribution function. The error of this approximation is shown to be O(1/n²), where n is the degrees of freedom of the t distribution. In comparison to similar approximations by George and Ojo and George et al. (1986), this new approximation is much simpler and more accurate. It is also shown that under some conditions, the t distribution is a good approximation of the SGL distribution. Therefore, the complicated expressions for the cumulants and moments of the SGL can be approximated by those of the t distribution. Finally, numerical results are given.
8.
In a multilevel model for complex survey data, the weight-inflated estimators of variance components can be biased. We propose a resampling method to correct this bias. The performance of the bias corrected estimators is studied through simulations using populations generated from a simple random effects model. The simulations show that, without lowering the precision, the proposed procedure can reduce the bias of the estimators, especially for designs that are both informative and have small cluster sizes. Application of these resampling procedures to data from an artificial workplace survey provides further evidence for the empirical value of this method. The Canadian Journal of Statistics 40: 150–171; 2012 © 2012 Statistical Society of Canada
9.
Marta Blangiardo 《Journal of applied statistics》2014,41(10):2312-2322
The Eurovision Song Contest is an annual musical competition held among active members of the European Broadcasting Union since 1956. The event is televised live across Europe. Each participating country presents a song and receives a vote based on a combination of tele-voting and jury. Over the years, this has led to speculation about tactical voting, discriminating against some participants and thus inducing bias in the final results. In this paper we investigate the presence of positive or negative bias (which may roughly indicate favouritism or discrimination) in the votes based on geographical proximity, migration and cultural characteristics of the participating countries through a Bayesian hierarchical model. Our analysis found no evidence of negative bias, although mild positive bias does seem to emerge systematically, linking voters to performers.
10.
It sometimes occurs that one or more components of the data exert a disproportionate influence on the model estimation. We need a reliable tool for identifying such troublesome cases in order to decide whether to eliminate them from the sample, when the data collection was poorly carried out, or otherwise to take care in using the model because the results could be affected by such components. Since a measure for detecting influential cases in the linear regression setting was proposed by Cook [Detection of influential observations in linear regression, Technometrics 19 (1977), pp. 15–18.], several new measures have been suggested as single-case diagnostics, apart from the same measure for other models. For most of them some cutoff values have been recommended (see [D.A. Belsley, E. Kuh, and R.E. Welsch, Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, 2nd ed., John Wiley & Sons, New York, Chichester, Brisbane, (2004).], for instance); however, the lack of a quantile-type cutoff for Cook's statistic has led analysts to rely only on index plots as diagnostic tools. Focusing on logistic regression, the aim of this paper is to provide the asymptotic distribution of Cook's distance in order to look for a meaningful cutoff point for detecting influential and leverage observations.
11.
K. Vaitheeswaran, M. Subbiah, R. Ramakrishnan, T. Kannan 《Journal of applied statistics》2016,43(12):2254-2260
Estimating the risk factors of a disease such as diabetic retinopathy (DR) is one of the important research problems among bio-medical and statistical practitioners as well as epidemiologists. Incidentally, many studies have focused on building models with binary outcomes, which may not exploit the available information. This article has investigated the importance of retaining the ordinal nature of the response variable (e.g. severity level of a disease) while determining the risk factors associated with DR. A generalized linear model approach with appropriate link functions has been studied using both classical and Bayesian frameworks. From the results of this study, it can be observed that the ordinal logistic regression with probit link function could be a more appropriate approach in determining the risk factors of DR. The study has emphasized the ways to handle the ordinal nature of the response variable with better model fit compared to other link functions.
12.
Nilgun Fescioglu-Unver 《Journal of applied statistics》2013,40(4):712-720
A merger proposal discloses a bidder firm's desire to purchase the control rights in a target firm. Predicting who will propose (bidder candidacy) and who will receive (target candidacy) merger bids is important for investigating why firms merge and for measuring the price impact of mergers. This study investigates the performance of artificial neural networks and multinomial logit models in predicting bidder and target candidacy. We use a comprehensive data set that covers the years 1979–2004 and includes all deals with publicly listed bidders and targets. We find that both models perform similarly when predicting target and non-merger firms. The multinomial logit model performs slightly better in predicting bidder firms.
13.
《Journal of Statistical Computation and Simulation》2012,82(8):765-779
We present a variational estimation method for the mixed logistic regression model. The method is based on a lower bound approximation of the logistic function [Jaakkola, J.S. and Jordan, M.I., 2000, Bayesian parameter estimation via variational methods. Statistics & Computing, 10, 25–37.]. Based on the approximation, an EM algorithm can be derived that results in a considerable simplification of the maximization problem in that it does not require the numerical evaluation of integrals over the random effects. We assess the performance of the variational method for the mixed logistic regression model in a simulation study and an empirical data example, and compare it to Laplace's method. The results indicate that the variational method is a viable choice for estimating the fixed effects of the mixed logistic regression model under the condition that the number of outcomes within each cluster is sufficiently high.
14.
Methods for the simultaneous analysis of the relationships of binary variables for efficacy and toxicity to dosage of an experimental drug are developed. Properties of two models of ‘within-dose’ dependence of efficacy and toxicity in parallel designs (one a bivariate analogue of the familiar univariate logistic model, and the other an adaptation of a general model developed by D.R. Cox) are explored. The cell probabilities predicted by these models are often quite similar to those predicted by a model of independence of efficacy and toxicity, but large discrepancies can occur when there is approximate equality of the median effective and median toxic doses. Asymptotic variances of estimates of parameters involved in assessing correlation are large when there is little or no dependence in the data, but parameters can be estimated with good precision in at least some cases of moderate to strong dependence between efficacy and toxicity.
15.
We present an approximate leaving-one-out technique for estimating the error rate in logistic discrimination. The new measure is based on the one-step approximation of a(i), the maximum likelihood estimate of the parameter vector based on the sample without the ith case. Some inequalities between the resubstitution error rate, the approximate and exact leaving-one-out error rates for the multiple group logistic model are investigated. Monte-Carlo simulations assess the adequacy of the approximate leaving-one-out method as an estimate of the actual error rate. The usefulness of this approach is demonstrated by means of two medical examples.
16.
Hu Xuemei 《Communications in Statistics - Simulation and Computation》2017,46(4):2756-2768
This article introduces the robust indirect technique for the slightly contaminated stochastic logistic population models. Based on discrete sampled data with a fixed unit of time between two consecutive observations, we not only construct the robust indirect inference generalized method of moments (GMM) estimator for the model parameters, but also propose a likelihood-ratio-type indirect statistic and a robust indirect GMM saddle-point statistic for testing the parameters of interest. In addition, we develop the robust exponential tilting estimator and the robust exponential tilting test to improve their small sample performances. Finally, their finite-sample properties are studied through Monte Carlo experiments.
17.
Steven B. Caudill 《Journal of applied statistics》2016,43(7):1253-1261
Hedonic price models are commonly used in the study of markets for various goods, most notably those for wine, art, and jewelry. These models were developed to estimate implicit prices of product attributes within a given product class, where in the case of some goods, such as wine, substantial product differentiation exists. To address this issue, recent research on wine prices employs local polynomial regression clustering (LPRC) for estimating regression models under class uncertainty. This study demonstrates that a superior empirical approach – estimation of a mixture model – is applicable to a hedonic model of wine prices, provided only that the dependent variable in the model is rescaled. The present study also catalogues several advantages of estimating mixture models over LPRC modeling.
18.
《Journal of Statistical Computation and Simulation》2012,82(10):771-785
An algorithm is presented for calculating the power for the logistic and proportional hazards models in which some of the covariates are discrete and the remainder are multivariate normal. The mean and covariance matrix of the multivariate normal covariates may depend on the discrete covariates. The algorithm, which finds the power of the Wald test, uses the result that the information matrix can be calculated using univariate numerical integration even when there are several continuous covariates. The algorithm is checked using simulation and in certain situations gives more accurate results than current methods which are based on simple formulae. The algorithm is used to explore properties of these models, in particular, the power gain from a prognostic covariate in the analysis of a clinical trial or observational study. The methods can be extended to determine power for other generalized linear models.
19.
Nicholas J. Horton, Garrett M. Fitzmaurice 《Journal of the Royal Statistical Society. Series C, Applied statistics》2002,51(3):281-295
Summary. Missing observations are a common problem that complicates the analysis of clustered data. In the Connecticut child surveys of childhood psychopathology, it was possible to identify reasons why outcomes were not observed. Of note, some of these causes of missingness may be assumed to be ignorable, whereas others may be non-ignorable. We consider logistic regression models for incomplete bivariate binary outcomes and propose mixture models that permit estimation assuming that there are two distinct types of missingness mechanisms: one that is ignorable; the other non-ignorable. A feature of the mixture modelling approach is that additional analyses to assess the sensitivity to assumptions about the missingness are relatively straightforward to incorporate. The methods were developed for analysing data from the Connecticut child surveys, where there are missing informant reports of child psychopathology and different reasons for missingness can be distinguished.