Similar documents
20 similar documents found (search time: 93 ms)
1.
This paper is intended to assist professors, administrators, librarians and other members of university-level committees who must assess research expectations and research quality in academic fields outside their own expertise. While this is not a problem for field experts, it is a difficulty when people are asked to make decisions in areas of study other than their own, as is commonly the case for senior university professors, librarians and administrators involved in university-wide decisions. The paper investigates this gap through a study of 27 academic fields in 348 highly regarded universities. We find that there are almost always statistically significant differences in activity between academic fields, regardless of the metric one considers. However, it is possible to understand these differences by comparing the distribution of a known academic field to that of an unfamiliar one. Tables and information are provided to assist in comparing different fields of study on metrics such as departmental publications and researcher-level counts of publications, citations, H-index, and total number of co-authors. The information can also be used to support decisions associated with promotion to senior posts such as endowed chairs and professorships. Information regarding specific universities and researchers is included in the data supplement.

2.
This article introduces BestClass, a set of SAS macros, available in mainframe and workstation environments, designed for solving two-group classification problems using a class of recently developed nonparametric classification methods. The criteria used to estimate the classification function are based either on minimizing a function of the absolute deviations from the surface that separates the groups, or on directly minimizing a function of the number of misclassified entities in the training sample. The solution techniques used by BestClass to estimate the classification rule rely on the mathematical programming routines of the SAS/OR software. Recently, a number of research studies have reported that under certain data conditions this class of classification methods can provide more accurate classification results than existing methods, such as Fisher's linear discriminant function and logistic regression. However, these robust classification methods have not yet been implemented in the major statistical packages, and hence are beyond the reach of statistical analysts who are unfamiliar with mathematical programming techniques. We use a limited simulation experiment and an example to compare and contrast properties of the methods included in BestClass with existing parametric and nonparametric methods. We believe that BestClass contributes significantly to the field of nonparametric classification analysis, in that it provides the statistical community with convenient access to this recently developed class of methods. BestClass is available from the authors.
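The first criterion described above (minimizing absolute deviations from the separating surface) can be posed as a linear program. The sketch below is a hedged illustration in Python with `scipy.optimize.linprog`, not the SAS/OR implementation; the function name `lad_discriminant` and the unit-margin formulation are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def lad_discriminant(XA, XB):
    """Find a hyperplane w'x + c separating groups A and B by minimizing
    the sum of deviations d_i, posed as a linear program:
        minimize sum(d_i)
        s.t.  w'x_i + c >= 1 - d_i   for x_i in group A
              w'x_i + c <= -1 + d_i  for x_i in group B
              d_i >= 0
    (unit-margin normalization assumed for illustration)."""
    nA, p = XA.shape
    nB = XB.shape[0]
    n = nA + nB
    # decision variables: [w (p entries), c (1 entry), d (n entries)]
    cvec = np.concatenate([np.zeros(p + 1), np.ones(n)])
    A_ub = np.zeros((n, p + 1 + n))
    A_ub[:nA, :p] = -XA          # -(w'x) - c - d_i <= -1   (group A)
    A_ub[:nA, p] = -1.0
    A_ub[nA:, :p] = XB           #  w'x + c - d_i <= -1     (group B)
    A_ub[nA:, p] = 1.0
    A_ub[:, p + 1:] = -np.eye(n)
    b_ub = -np.ones(n)
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n
    res = linprog(cvec, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:p], res.x[p]
```

For linearly separable data the optimal deviations are all zero and every point ends up on the correct side of the hyperplane.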

3.
Equal values are common when rank methods are applied to rounded data or to data consisting solely of small integers. A popular technique for resolving ties in rank correlation is the mid-rank method: the mean of the rankings remains unaltered, but the variance is reduced, modified according to the number and location of ties. Although other methods for breaking ties were proposed in the literature as early as 1939, none has gained as wide acceptance as mid-ranks. This research analyses various techniques for assigning ranks to tied values, with two objectives: (1) to enable the computation of rank correlation coefficients, such as those of Spearman, Kendall and Gini, using the usual definition applied in the absence of ties, and (2) to determine whether it really makes a difference which technique is selected and, if so, which is most appropriate for a given application.
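A minimal sketch of the mid-rank assignment and the resulting Spearman coefficient, computed with the usual no-ties definition (the Pearson correlation of the ranks); the helper names are illustrative, not from the paper:

```python
def midranks(values):
    """Assign each value the mean of the ordinal ranks of its tie group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mid = (i + j) / 2 + 1          # mean of ordinal ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mid
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the mid-ranks."""
    rx, ry = midranks(x), midranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

Note how `midranks([1, 2, 2, 3])` returns `[1.0, 2.5, 2.5, 4.0]`: the mean rank is unchanged (2.5), but the rank variance is smaller than without ties, exactly the effect described above.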

4.
Identification of influential genes and clinical covariates on the survival of patients is crucial because it can lead to a better understanding of the underlying mechanisms of disease and to better prediction models. Most variable selection methods for penalized Cox models cannot deal properly with categorical variables such as gender and family history. The group lasso penalty can combine clinical and genomic covariates effectively. In this article, we introduce an optimization algorithm for Cox regression with the group lasso penalty. We compare our method with other methods on simulated and real microarray data sets.
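The workhorse inside most group lasso optimizers is the groupwise soft-thresholding (proximal) operator, which zeroes out whole coefficient groups, e.g. all dummy variables encoding one categorical covariate, at once. The sketch below shows that operator only, not the authors' full Cox algorithm; the square-root-of-group-size weighting is a common convention assumed here:

```python
import numpy as np

def group_soft_threshold(beta, groups, lam, step):
    """Proximal operator of the group lasso penalty: shrink each
    coefficient group toward zero by its Euclidean norm; groups whose
    norm falls below the threshold are set exactly to zero."""
    out = np.zeros_like(beta)
    for g in groups:                      # g: list of coefficient indices
        b = beta[g]
        norm = np.linalg.norm(b)
        w = np.sqrt(len(g))               # common group-size weighting
        if norm > lam * step * w:
            out[g] = (1 - lam * step * w / norm) * b
    return out
```

In a proximal-gradient scheme one would alternate a gradient step on the Cox partial log-likelihood with this shrinkage step.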

5.
A New Method for Identifying the Enterprise Life Cycle: The Construct-Deviation Approach
Existing methods for identifying an enterprise's life-cycle stage suffer from problems such as reliance on a single indicator and poor operability, which has become a bottleneck for enterprise management research grounded in the life-cycle perspective. This paper applies the construct-deviation method to life-cycle identification, describes the method's underlying principle and application in detail, and illustrates its operational steps with data from 478 enterprises, providing a scientific basis for life-cycle-based research.

6.
Abstract. We review and extend some statistical tools that have proved useful for analysing functional data. Functional data analysis is primarily designed for the analysis of random trajectories and infinite-dimensional data, and there is a need for adequate statistical estimation and inference techniques. While this field is in flux, some methods have proven useful. These include warping methods, functional principal component analysis, and conditioning under Gaussian assumptions in the case of sparse data. The latter is a recent development that may provide a bridge between functional and more classical longitudinal data analysis. Besides presenting a brief review of functional principal components and functional regression, we develop some concepts for estimating functional principal component scores in the sparse situation. An extension of the so-called generalized functional linear model to the case of sparse longitudinal predictors is proposed. This extension includes functional binary regression models for longitudinal data and is illustrated with data on primary biliary cirrhosis.
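For densely observed trajectories, functional principal components can be sketched by eigen-decomposing the pointwise sample covariance on a grid; the sparse-data conditioning step discussed above is not shown, and the function and variable names are illustrative:

```python
import numpy as np

def functional_pca(curves, n_components=2):
    """Discretized functional PCA: curves is an (n_subjects, n_grid)
    array of densely observed trajectories. Eigenfunctions are the
    leading eigenvectors of the pointwise sample covariance; scores are
    the projections of the centered curves onto them."""
    mean = curves.mean(axis=0)
    R = curves - mean
    cov = R.T @ R / (curves.shape[0] - 1)
    evals, evecs = np.linalg.eigh(cov)
    idx = np.argsort(evals)[::-1][:n_components]
    phi = evecs[:, idx]                 # eigenfunctions on the grid
    scores = R @ phi                    # principal component scores
    return mean, phi, evals[idx], scores
```

For curves generated by a single random amplitude times a fixed shape, the first eigenfunction recovers that shape up to sign.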

7.
In this paper we present a Bayesian analysis for right-censored survival data suitable for populations with a cure rate. We consider a cure rate model based on the negative binomial distribution, encompassing the promotion time cure model as a special case. Bayesian inference is carried out via Markov chain Monte Carlo (MCMC) methods. We also discuss model selection and provide an illustration with a real data set.
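One common parameterization of the negative binomial cure rate model, recalled here from the wider cure rate literature and not quoted from the paper, so treat the exact form as an assumption, gives population survival S_pop(t) = (1 + eta * theta * F(t))^(-1/eta), where F is the latent event-time distribution; letting eta tend to 0 recovers the promotion time model exp(-theta * F(t)), and F(inf) = 1 gives the cure fraction:

```python
import math

def pop_survival(F_t, theta, eta):
    """Population survival of the negative binomial cure model,
    S_pop(t) = (1 + eta*theta*F(t))^(-1/eta); eta = 0 is handled as the
    promotion time limit exp(-theta*F(t)). F_t is F(t) in [0, 1]."""
    if eta == 0.0:
        return math.exp(-theta * F_t)
    return (1.0 + eta * theta * F_t) ** (-1.0 / eta)

# cure fraction = survival evaluated at F(inf) = 1
cure_fraction = pop_survival(1.0, theta=1.5, eta=0.5)
```

The cure fraction above equals (1 + 0.5 * 1.5)^(-2) = 1.75^(-2), roughly 0.327, while the promotion time special case would give exp(-1.5).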

8.
We investigate the impact of some characteristics of friendship networks on the timing of first sexual intercourse. We assume that the gender-segregated composition of such networks explains part of the particularly late age at first intercourse in Italy. We use new data from a survey on the sexual behavior and reproductive health of Italian first- and second-year university students. The survey was carried out in 15 different universities in 2000-2001 and includes retrospective data on age at first intercourse, as well as retrospectively collected time-varying measures of the gender composition of the friendship network at different ages, for almost 5,000 cases. After describing the data as transition frequencies, we fit a Cox proportional hazards model with time-varying covariates. Results are consistent with the hypothesis that having friendship networks that include more members of the other gender, and talking about sex with friends, increase the relative risk of first sexual intercourse.

9.
10.
Statistics, as an applied science, has great impact across a vast range of other sciences. The prediction of protein structures, with great emphasis on their geometrical features as captured by dihedral angles, has invoked a branch of statistics known as directional statistics. One of the available biological techniques for prediction is molecular dynamics simulation, which produces high-dimensional molecular structure data. Hence, it is expected that principal component analysis (PCA) can address some of the related statistical problems, particularly reducing the dimension of the involved variables. Since the dihedral angles are variables on a non-Euclidean space (their locus is the torus), direct application of PCA is not expected to provide much information in this case. Principal geodesic analysis is one of the recent methods for dimension reduction in the non-Euclidean case. A procedure for utilizing this technique to reduce the dimension of a set of dihedral angles is highlighted in this paper. We further propose an extension of this tool, implemented in such a way that the torus is approximated by the product of two unit circles, and evaluate its application on a real data set. A comparison of this technique with some previous methods is also undertaken.
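The product-of-circles approximation can be sketched by embedding each dihedral angle on the unit circle as (cos, sin) and running ordinary PCA on the embedded coordinates; this is an illustrative stand-in, not the paper's principal geodesic analysis:

```python
import numpy as np

def angular_pca(angles):
    """Embed each angle column of an (n, k) array on the unit circle,
    approximating the torus by a product of circles, then run ordinary
    PCA on the 2k embedded coordinates via the SVD."""
    X = np.concatenate([np.cos(angles), np.sin(angles)], axis=1)
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_explained = s ** 2 / np.sum(s ** 2)
    scores = Xc @ Vt.T
    return scores, var_explained
```

A pair of dihedral angles thus becomes four Euclidean coordinates, on which the usual variance-explained diagnostics apply.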

11.
Regression analysis is one of the methods most widely used for prediction problems. Although there are many parameter estimation methods for regression analysis, ordinary least squares (OLS) is the most commonly used among them. However, this technique is highly sensitive to outlier observations, so the literature suggests robust techniques when the data set includes outliers. Moreover, in a prediction problem, techniques that reduce the influence of outliers and that use the median, rather than the mean error, as the target function are more successful in modeling such data. In this study, we propose a new parameter estimation method, based on particle swarm optimization, that minimizes the median of the absolute relative errors, obtained by dividing the difference between observed and predicted values by the observed value. The performance of the proposed method was evaluated in a simulation study, comparing it with OLS and some other robust methods from the literature.
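A hedged sketch of this idea: a basic particle swarm minimizing the median absolute relative error median(|y - Xb| / |y|). The swarm constants below are generic textbook values, not the authors' tuning, and the function name is an assumption:

```python
import numpy as np

def pso_median_fit(X, y, n_particles=30, iters=200, seed=0):
    """Minimize median(|y - Xb| / |y|) over coefficient vectors b with
    a basic particle swarm (inertia 0.7, cognitive/social weights 1.5).
    Assumes all observed y values are nonzero."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pos = rng.normal(0, 1, (n_particles, d))   # particle positions = candidate b
    vel = np.zeros((n_particles, d))

    def cost(b):
        return np.median(np.abs((y - X @ b) / y))

    pbest = pos.copy()
    pbest_c = np.array([cost(b) for b in pos])
    gbest = pbest[np.argmin(pbest_c)].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, d))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        c = np.array([cost(b) for b in pos])
        better = c < pbest_c
        pbest[better], pbest_c[better] = pos[better], c[better]
        gbest = pbest[np.argmin(pbest_c)].copy()
    return gbest
```

Because the objective is a median rather than a sum of squares, it is non-smooth, which is precisely why a derivative-free optimizer such as PSO is a natural fit.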

12.
This article advances a proposal for building adjusted composite indicators of the quality of university courses from students' assessments. The flexible framework of Generalized Item Response Models is adopted here to control the sources of heterogeneity in the data structure that make evaluations across courses not directly comparable. Specifically, it allows us to: jointly model students' ratings on the set of items which define the quality of university courses; explicitly consider the dimensionality of the items composing the evaluation form; evaluate and remove the effect of potential confounding factors which may affect students' evaluations; and model the intra-cluster variability at the course level. The approach simultaneously deals with: (i) a multilevel data structure; (ii) a multidimensional latent trait; (iii) personal explanatory latent regression models. The paper draws attention to the potential of such a flexible approach for analysing students' evaluations of university courses, in order to explore both how the quality of the different aspects (teaching, management, etc.) is perceived by students and how to make meaningful comparisons across courses on the basis of adjusted indicators.

13.
The wide-ranging and rapidly evolving nature of ecological studies means that it is not possible to cover all existing and emerging techniques for analyzing multivariate data. However, two important methods have attracted many followers: Canonical Correspondence Analysis (CCA) and the STATICO analysis. Despite the particular characteristics of each, they have similarities and differences which, when analyzed properly, can together provide important complementary results beyond those usually exploited by researchers. If, on the one hand, the use of CCA is completely generalized and implemented, solving many problems formulated by ecologists, on the other hand, the method has some weaknesses, mainly the restriction it imposes on the number of variables relative to the number of samples. The STATICO method has no such restriction, but requires that the number of variables (species or environmental) be the same at each time or space. Moreover, the STATICO method can present more detailed information, since it allows visualizing the variability within groups (either in time or space). In this study, the data needed for implementing these methods are sketched, and a comparison is made showing the advantages and disadvantages of each method. The ecological data treated are a sequence of pairs of ecological tables, where species abundances and environmental variables are measured at different, specified locations over the course of time.

14.
Two-stage least squares estimation in a simultaneous equations model has several desirable properties under multicollinearity, and various improved estimation techniques have been developed to deal with this problem. One of them is ridge regression estimation, which can be applied at both stages and is defined in Vinod and Ullah [Recent advances in regression methods. New York: Marcel Dekker; 1981]. We propose three different kinds of Liu estimators, named by the stage at which they are applied. Mean square errors are derived to compare the performances of the proposed estimators, and two different choices of the biasing parameter are offered. Moreover, a numerical example is given with a data analysis based on the Klein Model I, and a Monte Carlo experiment is conducted.
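For a single equation, the Liu estimator takes the closed form beta_d = (X'X + I)^(-1)(X'y + d * beta_OLS), with d = 1 recovering OLS; the two-stage variants in the paper apply shrinkage of this kind at one or both stages. A minimal single-equation sketch (the formula is the standard one from the shrinkage literature, stated here as an assumption rather than the paper's exact estimators):

```python
import numpy as np

def liu_estimator(X, y, d):
    """Single-equation Liu estimator:
    beta_d = (X'X + I)^{-1} (X'y + d * beta_OLS).
    The biasing parameter d controls shrinkage; d = 1 gives OLS."""
    XtX = X.T @ X
    beta_ols = np.linalg.solve(XtX, X.T @ y)
    return np.linalg.solve(XtX + np.eye(X.shape[1]), X.T @ y + d * beta_ols)
```

Setting d = 1 makes the right-hand side equal (X'X + I) beta_OLS, so the estimator collapses to OLS, a useful sanity check on any implementation.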

15.
In recent years, different approaches for the analysis of time-to-event data in the presence of competing risks, i.e. when subjects can fail from one of two or more mutually exclusive types of event, have been introduced. Approaches focusing either on cause-specific or on subdistribution hazard rates have been presented in the statistical literature. Many new approaches use complicated weighting techniques or resampling methods that do not allow an analytical evaluation. Simulation studies often replace analytical comparisons, since they can be performed more easily and allow investigation of non-standard scenarios. For adequate simulation studies, the generation of appropriate random numbers is essential. We present an approach to generate competing risks data following flexible, prespecified subdistribution hazards. Event times and types are simulated using possibly time-dependent cause-specific hazards, chosen such that the generated data follow the desired subdistribution hazards or hazard ratios, respectively.
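In the special case of constant cause-specific hazards, the simulation recipe reduces to drawing a total event time and then an event type with probabilities proportional to the hazards. The sketch below covers only this special case, not the time-dependent construction described above:

```python
import numpy as np

def simulate_competing_risks(n, hazards, seed=0):
    """Generate competing risks data under constant cause-specific
    hazards: event time ~ Exp(sum(hazards)); independently, the event
    type k is drawn with probability hazards[k] / sum(hazards)."""
    rng = np.random.default_rng(seed)
    h = np.asarray(hazards, dtype=float)
    total = h.sum()
    times = rng.exponential(1.0 / total, size=n)
    types = rng.choice(len(h), size=n, p=h / total)
    return times, types
```

With hazards (1, 3), a quarter of events are of type 0 and the mean event time is 1/4, which a large simulated sample reproduces closely.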

16.

Motivated by the study of traffic accidents on a road network, we discuss the estimation of the relative risk, the ratio of rates of occurrence of different types of events occurring on a network of lines. Methods developed for two-dimensional spatial point patterns can be adapted to a linear network, but their requirements and performance are very different on a network. Computation is slow and we introduce new techniques to accelerate it. Intensities (occurrence rates) are estimated by kernel smoothing using the heat kernel on the network. The main methodological problem is bandwidth selection. Binary regression methods, such as likelihood cross-validation and least squares cross-validation, perform tolerably well in our simulation experiments, but the Kelsall–Diggle density-ratio cross-validation method does not. We find a theoretical explanation, and propose a modification of the Kelsall–Diggle method which has better performance. The methods are applied to traffic accidents in a regional city, and to protrusions on the dendritic tree of a neuron.
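As a simplified analogue of the estimator described above (a Gaussian kernel on the real line standing in for the heat kernel on a linear network), the relative risk can be sketched as a ratio of kernel intensity estimates of the two point patterns; all names and the kernel choice are illustrative assumptions:

```python
import numpy as np

def relative_risk_1d(cases, controls, grid, bandwidth):
    """Kernel estimate of the relative risk (ratio of occurrence
    intensities) of two point patterns on a line, evaluated at each
    grid location. Gaussian kernel; no edge correction."""
    def intensity(points):
        d = grid[:, None] - points[None, :]
        k = np.exp(-0.5 * (d / bandwidth) ** 2)
        return k.sum(axis=1) / (bandwidth * np.sqrt(2 * np.pi))
    return intensity(np.asarray(cases)) / intensity(np.asarray(controls))
```

As in the network setting, the critical tuning choice is the bandwidth: too small and the ratio is dominated by noise, too large and local risk hot spots are smoothed away.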


17.
Abstract.  Multivariate correlated failure time data arise in many medical and scientific settings. In the analysis of such data, it is important to use models where the parameters have simple interpretations. In this paper, we formulate a model for bivariate survival data based on the Plackett distribution. The model is an alternative to the Gamma frailty model proposed by Clayton and Oakes. The parameter in this distribution has a very appealing odds ratio interpretation for dependence between the two failure times; in addition, it allows for negative dependence. We develop novel semiparametric estimation and inference procedures for the model. The asymptotic results of the estimator are developed. The performance of the proposed techniques in finite samples is examined using simulation studies; in addition, the proposed methods are applied to data from an observational study in cancer.
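The Plackett copula underlying such a model has a closed form in which theta is the constant global odds ratio: theta = 1 gives independence and 0 < theta < 1 gives negative dependence. A small evaluation sketch (the formula is recalled from the copula literature, so treat it as an assumption rather than the paper's notation):

```python
import math

def plackett_copula(u, v, theta):
    """Plackett copula C_theta(u, v) for u, v in [0, 1] and theta > 0.
    theta is the (global) odds ratio; theta = 1 reduces to u*v."""
    if theta == 1.0:
        return u * v
    s = 1.0 + (theta - 1.0) * (u + v)
    disc = s * s - 4.0 * theta * (theta - 1.0) * u * v
    return (s - math.sqrt(disc)) / (2.0 * (theta - 1.0))
```

Two quick checks: on the boundary v = 1 the copula returns u, as any copula must, and for theta = 2 at u = v = 0.5 the value (2 - sqrt(2))/2 exceeds the independence value 0.25, reflecting positive dependence.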

18.
In past decades, the number of variables explaining observations in practical applications has increased steadily. This has led to heavy computational tasks, despite the widespread use of preliminary variable selection methods in data processing. Consequently, methodological techniques have appeared that reduce the number of explanatory variables without losing much information. Among these techniques, two distinct approaches are apparent: 'shrinkage regression' and 'sufficient dimension reduction'. Surprisingly, there has been little communication or comparison between these two methodological categories, and it is not clear when each approach is appropriate. In this paper, we fill some of this gap by first briefly reviewing each category, paying special attention to its most commonly used methods. We then compare commonly used methods from both categories on their accuracy, computation time, and ability to select effective variables. A simulation study of the performance of the methods in each category is presented as well. The selected methods are also tested on two real data sets, which allows us to recommend conditions under which one approach is more appropriate for high-dimensional data.
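A classic representative of the sufficient dimension reduction camp is sliced inverse regression (SIR), sketched below in plain NumPy as an illustration of the category, not as one of the paper's selected methods:

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Sliced inverse regression: slice the response into quantile
    groups, average the whitened predictors within each slice, and
    eigen-decompose the between-slice covariance. The leading
    eigenvectors (back-transformed) span the estimated reduction
    subspace."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # whiten the predictors: Z = Xc @ Sigma^{-1/2}
    cov = Xc.T @ Xc / n
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt
    # between-slice covariance of the slice means of Z
    order = np.argsort(y)
    M = np.zeros((p, p))
    for chunk in np.array_split(order, n_slices):
        m = Z[chunk].mean(axis=0)
        M += (len(chunk) / n) * np.outer(m, m)
    w, v = np.linalg.eigh(M)
    dirs = inv_sqrt @ v[:, ::-1][:, :n_dirs]   # leading eigenvectors
    return dirs / np.linalg.norm(dirs, axis=0)
```

Unlike a shrinkage method such as the lasso, SIR does not zero out individual coefficients; it returns a few linear combinations of all predictors, which is exactly the trade-off the comparison above is about.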

19.
Abstract. We, as statisticians, are living in interesting times. New scientifically significant questions are waiting for our contributions, new data accumulate at a fast rate, and the rapid increase of computing power gives us unprecedented opportunities to meet these challenges. Yet, many members of our community are still turning the old wheel as if nothing dramatic had happened. There are ideas, methods and techniques which are commonly used but outdated and should be replaced by new ones. Can we expect to see, as has been suggested, a consolidation of statistical methodologies towards a new synthesis, or is perhaps an even wider separation and greater divergence the more likely scenario? In this talk these issues are discussed, and some conjectures and suggestions are made.

20.
ADE-4: a multivariate analysis and graphical display software
We present ADE-4, a multivariate analysis and graphical display software package. Multivariate analysis methods available in ADE-4 include usual one-table methods like principal component analysis and correspondence analysis, spatial data analysis methods (using a total variance decomposition into local and global components, analogous to Moran and Geary indices), discriminant analysis and within/between-groups analyses, many linear regression methods including lowess and polynomial regression, multiple and PLS (partial least squares) regression and orthogonal regression (principal component regression), projection methods like principal component analysis on instrumental variables, canonical correspondence analysis and many other variants, coinertia analysis and the RLQ method, and several three-way table (k-table) analysis methods. Graphical display techniques include an automatic collection of elementary graphics corresponding to groups of rows or to columns in the data table, thus providing a very efficient way to produce automatic k-table graphics, along with geographical mapping options. A dynamic graphic module allows interactive operations like searching, zooming, selection of points, and display of data values on factor maps. The user interface is simple and homogeneous among all the programs; this contributes to making the use of ADE-4 very easy for non-specialists in statistics, data analysis or computer science.
