Similar literature
1.
Summary. We present an approach to the construction of clusters of life course trajectories and use it to obtain ideal types of trajectories that can be interpreted and analysed meaningfully. We represent life courses as sequences on a monthly timescale and apply optimal matching analysis to compute dissimilarities between individuals. We introduce a new divisive clustering algorithm which has features in common with both Ward's agglomerative algorithm and classification and regression trees. We analyse British Household Panel Survey data on the employment and family trajectories of women. Our method produces clusters of sequences for which it is straightforward to determine who belongs to each cluster, making it easier to interpret the relative importance of life course factors in distinguishing subgroups of the population. Moreover, our method gives guidance on selecting the number of clusters.
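
The optimal matching step mentioned in this abstract can be illustrated with a small sketch. It computes an edit distance between two state sequences using insertion/deletion and substitution costs. The cost values, the state codes ("E" for employed, "U" for unemployed) and the function name are assumptions made for illustration; the abstract does not state the costs the authors used.

```python
def optimal_matching_distance(seq_a, seq_b, indel=1.0, sub=2.0):
    """Edit distance between two state sequences.

    indel and sub are assumed costs, not values taken from the paper.
    """
    n, m = len(seq_a), len(seq_b)
    # dist[i][j] = cheapest way to turn seq_a[:i] into seq_b[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i * indel
    for j in range(1, m + 1):
        dist[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = seq_a[i - 1] == seq_b[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + indel,                        # delete a state
                dist[i][j - 1] + indel,                        # insert a state
                dist[i - 1][j - 1] + (0.0 if same else sub),   # match or substitute
            )
    return dist[n][m]


# Hypothetical monthly employment sequences for two people.
person_1 = ["E", "E", "U", "U"]
person_2 = ["E", "E", "E", "E"]
print(optimal_matching_distance(person_1, person_2))
```

Computing this distance for every pair of individuals yields the dissimilarity matrix that a clustering algorithm would take as input.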

2.
The wide-ranging and rapidly evolving nature of ecological studies means that it is not possible to cover all existing and emerging techniques for analyzing multivariate data. However, two important methods have attracted many followers: Canonical Correspondence Analysis (CCA) and STATICO analysis. Each has its own characteristics, and their similarities and differences, when analyzed properly, can together provide important results that complement those usually exploited by researchers. On the one hand, CCA is widely used and implemented and solves many problems formulated by ecologists; on the other hand, it has some weaknesses, mainly caused by the constraint imposed on the number of variables required for it to be applied (much higher in comparison with samples). The STATICO method has no such restriction, but requires that the number of variables (species or environment) be the same at each time or location. Yet the STATICO method can present more detailed information, since it allows the variability within groups (in either time or space) to be visualized. In this study, the data needed to implement these methods are outlined, and a comparison is made showing the advantages and disadvantages of each method. The ecological data treated are a sequence of pairs of ecological tables, in which species abundances and environmental variables are measured at different, specified locations over the course of time.

3.
Summary. Sequence analysis has become one of the most used and discussed tools to describe life course trajectories. We introduce a new tool for the graphical exploratory analysis of sequences. Our plots combine standard sequence plots with the results that are provided by multi-dimensional scaling. We apply our procedure to describe work and family careers of Israeli women by using data from the Israel Social Mobility Survey. We first focus on some preliminary choices relative to the definition of the sequences: the age span, the length of the sequences and the set of states registered in each time period. We then describe how our plots can be used to gain insights about the main features of sequences and about the relationships between sequences and external information.

4.
Abstract. We review and extend some statistical tools that have proved useful for analysing functional data. Functional data analysis is designed primarily for the analysis of random trajectories and infinite-dimensional data, and there exists a need for the development of adequate statistical estimation and inference techniques. While this field is in flux, some methods have proven useful. These include warping methods, functional principal component analysis, and conditioning under Gaussian assumptions for the case of sparse data. The latter is a recent development that may provide a bridge between functional and more classical longitudinal data analysis. Besides presenting a brief review of functional principal components and functional regression, we develop some concepts for estimating functional principal component scores in the sparse situation. An extension of the so-called generalized functional linear model to the case of sparse longitudinal predictors is proposed. This extension includes functional binary regression models for longitudinal data and is illustrated with data on primary biliary cirrhosis.

5.
In this study, classical and robust principal component analyses are used to evaluate the socioeconomic development of the regions served by development agencies, which operate with the aim of reducing development differences among regions in Turkey. Because of the large differences between the development levels of regions, an outlier problem occurs; hence robust statistical methods are used. Classical and robust statistical methods are also used to investigate whether there are any outliers in the data set. In classical principal component analysis, the number of observations must be larger than the number of variables; otherwise the determinant of the covariance matrix is zero. With the Robust method for Principal Component Analysis (ROBPCA), a robust approach to principal component analysis for high-dimensional data, principal components are obtained even if the number of variables is larger than the number of observations. In this paper, 26 development agencies are first evaluated with 19 variables by using principal component analysis based on classical and robust scatter matrices, and then these 26 development agencies are evaluated with 46 variables by using the ROBPCA method.
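
The statement that the covariance determinant is zero when there are more variables than observations can be checked on a toy example. The sketch below uses exact rational arithmetic from the Python standard library so that the zero is exact and not a rounding artefact. The two small data sets and the function names are made up for illustration and have nothing to do with the Turkish regional data.

```python
from fractions import Fraction


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of a list of observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(Fraction(r[j]) for r in rows) / n for j in range(p)]
    return [
        [sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (n - 1)
         for k in range(p)]
        for j in range(p)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with exact fractions."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


# Hypothetical data: 3 observations of 4 variables (more variables than rows).
wide = [[1, 2, 0, 3], [2, 0, 1, 1], [0, 1, 4, 2]]
# Hypothetical data: 5 observations of 2 variables.
tall = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 6]]

print(determinant(covariance_matrix(wide)))   # singular covariance matrix
print(determinant(covariance_matrix(tall)))   # non-singular covariance matrix
```

With n observations the centred data matrix has rank at most n - 1, so a covariance matrix of more than n - 1 variables cannot be inverted; this is the restriction the abstract says ROBPCA avoids.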

6.
The effect of nonstationarity in time series columns of input data in principal components analysis is examined. Nonstationarity is very common among economic indicators collected over time. These indicators are subsequently summarized into fewer indices for purposes of monitoring. Because the nonstationary time series drift simultaneously, usually as a result of the trend, the first component averages all the variables without necessarily reducing dimensionality. Sparse principal components analysis can be used, but the attainment of sparsity among the loadings (and hence of dimension reduction) is influenced by the choice of the parameter(s) $\lambda_{1,j}$. Simulated data with more variables than observations and with different patterns of cross-correlations and autocorrelations were used to illustrate the advantages of sparse principal components analysis over ordinary principal components analysis. Sparse component loadings for nonstationary time series data can be achieved provided that appropriate values of $\lambda_{1,j}$ are used. We provide the range of values of $\lambda_{1,j}$ that will ensure convergence of the sparse principal components algorithm and consequently achieve sparsity of component loadings.

7.
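杭斌 (Hang Bin), 修磊 (Xiu Lei). 《统计研究》 (Statistical Research), 2015, 32(12): 54-61
Based on micro-level tracking survey data from the CFPS, this paper analyses, from a status-seeking perspective, the effect of housing comparison behaviour on the consumption of urban households. The results show that: (1) households' subjective assessment of their own social status is significantly positively correlated with factors such as housing area, consumption level and the education of the household head; (2) comparison over housing area markedly suppresses the consumption of urban households; (3) the consumption behaviour of high-status households has a significant demonstration effect on other households, that is, Chinese urban households have an upward-looking status-seeking motive.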

8.
Medical images and genetic assays typically generate data with more variables than subjects. Scientists may use a two-step approach for testing hypotheses about Gaussian mean vectors. In the first step, principal components analysis (PCA) selects a set of sample components fewer in number than the sample size. In the second step, applying classical multivariate analysis of variance (MANOVA) methods to the reduced set of variables provides the desired hypothesis tests. Simulation results presented here indicate that success of the PCA in the first step requires nearly all variation to occur in population components far fewer in number than the number of subjects. In the second step, multivariate tests fail to attain reasonable power except in restrictive, favorable cases. The results encourage using other approaches discussed in the article to provide dependable hypothesis testing with high dimension, low sample size (HDLSS) data.

9.
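High-tech development zones have become an important vehicle for China's indigenous innovation and play a key supporting role in economic development. Taking the number of enterprises and the number of employees as input variables, and gross industrial output value, export earnings and total profits and taxes as output variables, data envelopment analysis is used to evaluate the technical efficiency, scale efficiency and pure technical efficiency of China's high-tech development zones. The results show that from 1998 to 2012 the overall technical efficiency, scale efficiency and pure technical efficiency of the zones followed an upward trend. In addition, a vector autoregression analysis of the main indicator data for China's high-tech development zones is used to build a forecasting model for the main indicators of their development, providing a policy basis for the economic planning of high-tech development zones.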

10.
The use of large-dimensional factor models in forecasting has received much attention in the literature, with the consensus being that forecasts can be improved relative to standard models. However, recent contributions in the literature have demonstrated that care needs to be taken when choosing which variables to include in the model. A number of different approaches to determining these variables have been put forward. These are, however, often based on ad hoc procedures or abandon the underlying theoretical factor model. In this article, we take a different approach to the problem by using the least absolute shrinkage and selection operator (LASSO) as a variable selection method to choose between the possible variables and thus obtain sparse loadings from which factors or diffusion indexes can be formed. This allows us to build a more parsimonious factor model that is better suited for forecasting than the traditional principal components (PC) approach. We provide an asymptotic analysis of the estimator and illustrate its merits empirically in a forecasting experiment based on U.S. macroeconomic data. Overall, we find improvements in forecasting accuracy compared to PC, and thus find the approach to be an important alternative to PC. Supplementary materials for this article are available online.
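
How LASSO produces sparse loadings can be seen in its simplest special case. Assuming orthonormal predictors and the objective (1/2)||y - Xb||^2 + lam * ||b||_1, the LASSO solution is obtained by soft-thresholding each least-squares coefficient. The sketch below shows only that operator; it is not the estimator analysed in the article, and the coefficient values and penalty are made up.

```python
def soft_threshold(value, lam):
    """Shrink value towards zero by lam and set it to zero if |value| <= lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_orthonormal(ols_coefficients, lam):
    """LASSO coefficients when the predictors are orthonormal (assumed case)."""
    return [soft_threshold(b, lam) for b in ols_coefficients]


# Hypothetical least-squares loadings; small ones are dropped entirely.
ols = [3.0, -0.5, 0.2, -3.0]
print(lasso_orthonormal(ols, lam=1.0))   # [2.0, 0.0, 0.0, -2.0]
```

Coefficients whose magnitude does not exceed the penalty become exactly zero, which is what makes the resulting set of variables sparse.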

11.
Path analysis differs from other multivariate analyses in that it can compute indirect effects in addition to direct effects. The aim of this study is to investigate, using generated data, the distribution of indirect effects, which are one of the components of path analysis. To this end, a simulation study was conducted with four different sample sizes, three different numbers of explanatory variables and three different correlation matrices. Each combination was replicated 1000 times. The results show that path coefficients tend to be stable irrespective of sample size. Moreover, path coefficients are not affected by the type of correlation either. Since the number of replications, 1000, is fairly large, the indirect effects from the path models were treated as normally distributed, and their confidence intervals are presented as well. It was also found that path analysis should not be used with three explanatory variables. We think that this study will help scientists working in both the natural and social sciences to determine the sample size and the number of variables in path analysis.
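
The distinction between direct and indirect effects can be made concrete with the smallest possible path model, X -> M -> Y plus a direct path X -> Y, using standardized variables. The sketch below derives the path coefficients from three correlations and forms the indirect effect as the product of the two paths through M. The model, the correlation values and the function name are assumptions for illustration; the study's own simulation design is not reproduced here.

```python
def standardized_paths(r_xm, r_xy, r_my):
    """Path coefficients for X -> M -> Y with a direct X -> Y path.

    Inputs are the pairwise correlations of standardized variables.
    """
    a = r_xm                                   # path X -> M
    denom = 1.0 - r_xm ** 2
    b = (r_my - r_xy * r_xm) / denom           # path M -> Y, controlling for X
    direct = (r_xy - r_my * r_xm) / denom      # path X -> Y, controlling for M
    return {"a": a, "b": b, "direct": direct, "indirect": a * b}


# Hypothetical correlations.
paths = standardized_paths(r_xm=0.5, r_xy=0.4, r_my=0.6)
print(paths)
# The direct and indirect effects add up to the total correlation r_xy.
print(paths["direct"] + paths["indirect"])
```

In this model the direct and indirect effects sum to the correlation between X and Y, which is a convenient check on any implementation.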

12.
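National happiness is the ultimate goal of economic and social development and of public policy. Drawing on happiness index scales from the existing literature, this paper constructs an indicator system suited to measuring the happiness of Chinese nationals and obtains the relevant data through a nationwide questionnaire survey. To select important variables effectively and remove estimation bias, the paper adopts the recently developed LASSO screening method: important variables are first selected from 6 personal-characteristic variables and 40 dimension variables, and then the regression coefficients are estimated and tested for significance. The regression results show that: (1) three personal-characteristic variables, namely gender, marital status and education level, have a significant effect on happiness; (2) nine dimension variables pass the significance test, among which satisfaction with family life, self-worth evaluation, satisfaction with social welfare and security, and evaluation of lifestyle healthiness have the most significant effects on happiness. On this basis, the paper further examines three groupings: men and women, urban and rural, and north and south. Finally, policy measures aimed at raising national happiness are proposed.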

13.
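杭斌 (Hang Bin). 《统计研究》 (Statistical Research), 2014, 31(9): 31-36
Housing not only serves both consumption and investment functions but is also a symbol of a household's social status. Concern for social status leads people to shift more resources from non-positional goods to positional goods, thereby distorting resource allocation. Based on status-seeking theory, this paper is the first to use CHFS data to analyse empirically the housing demand and consumption behaviour of Chinese urban households. The results show that: 1. The housing of Chinese urban households has clear positional characteristics, and the continual expansion of household living area is related to status comparison. 2. For both low- and middle-income households and high-income households, the crowding-out effect of larger housing area on consumption is significant. Rising house prices, however, suppress only the consumption of low- and middle-income households and have no significant effect on high-income households.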

14.
Many neuroscience experiments record sequential trajectories where each trajectory consists of oscillations and fluctuations around zero. Such trajectories can be viewed as zero-mean functional data. When there are structural breaks in higher-order moments, it is not always easy to spot these by mere visual inspection. Motivated by this challenging problem in brain signal analysis, we propose a detection and testing procedure to find the change point in functional covariance. The detection procedure is based on the cumulative sum (CUSUM) statistic. The fully functional testing procedure relies on a null distribution which depends on infinitely many unknown parameters, though in practice only a finite number of these parameters can be included in the hypothesis test for the existence of a change point. This paper provides some theoretical insights on the influence of the number of parameters. Meanwhile, the asymptotic properties of the estimated change point are developed. The effectiveness of the proposed method is numerically validated in simulation studies and in an application investigating changes in rat brain signals following an experimentally-induced stroke.
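
The CUSUM idea behind the detection step can be sketched in a much simpler scalar setting. For a zero-mean series, a change in variance shows up as a change in the mean of the squared observations, so a CUSUM statistic on the squares locates the break. This is a deliberately reduced illustration, not the functional covariance procedure proposed in the paper; the toy signal and the function name are assumptions.

```python
def cusum_change_point(values):
    """Return (k, statistic) maximising |S_k - (k / n) * S_n| over 1 <= k < n.

    S_k is the sum of the first k values; k is the estimated number of
    observations before the change.
    """
    n = len(values)
    total = sum(values)
    best_k, best_stat, running = 0, 0.0, 0.0
    for k in range(1, n):
        running += values[k - 1]
        stat = abs(running - k * total / n)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


# Hypothetical zero-mean signal whose amplitude triples after 20 observations.
signal = [1.0, -1.0] * 10 + [3.0, -3.0] * 10
squared = [x * x for x in signal]
print(cusum_change_point(squared))   # (20, 80.0)
```

A formal test would compare the scaled statistic with a critical value from its null distribution, which is the part the paper develops for the functional case.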

15.
State fragility is a concept that has entered the political discourse in recent decades, with remarkable implications for aid allocation and international policies. The operationalization of this concept has generated a number of composite indices to produce rankings of fragile states. However, the temporal dimension of the driving forces leading to fragility has been rather neglected. This article discusses a statistical procedure that helps to represent the global fragility of a country and the path that a country has followed or will follow in the future when possibly entering into (or escaping from) a fragility condition. Specifically, multiple factor analysis is applied to depict vulnerable and weak countries, and to identify the fundamental forces that determine their overall fragility. Moreover, the trajectories of countries over the years are estimated using partial factor scores. Finally, the path of each country is predicted by means of parsimonious regression models, based on a reduced set of explanatory variables, and according to scenarios elaborated from available international outlooks.

16.
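杭斌 (Hang Bin), 余峰 (Yu Feng). 《统计研究》 (Statistical Research), 2018, 35(7): 102-114
The authors argue that the relationship between income inequality and household consumption depends on the degree of credit constraint and on households' preference for social status. Housing is a typical positional good: when the income gap widens, people strive to improve their living conditions in order to maintain or raise their relative status, and housing comparison ultimately raises housing-area standards across society and pushes up house prices. In an environment where credit is scarce, higher purchase standards and rising house prices mean a greater risk that households will face liquidity constraints in the future; households therefore restrain everyday consumption while increasing their home-purchase budgets. Empirical analysis using micro-level tracking survey data for 2010, 2012 and 2014 supports this view: (1) An increase in the housing area of the surrounding population prompts households to buy larger homes, and the stimulating effect of comparison on housing demand is clearly greater than the suppressing effect of rising house prices. (2) Both the increase in average household housing area and rising house prices are related to housing comparison induced by income inequality. (3) Income inequality has both a stimulating and a suppressing effect on urban household consumption. (4) The suppressing effect of potential liquidity constraints on household consumption depends on the household's status rank.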

17.
Open and closed linear dynamic systems are formulated and considered for stationary and integrated data processes. A typology of linear dynamic systems is developed, extending that available for individual dynamic equations. Then an overlapping typology of models of these systems is examined and methods for analysing econometric models are described. General-to-simple modelling strategies are briefly considered.

18.
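陈卓 (Chen Zhuo), 陈杰 (Chen Jie). 《统计研究》 (Statistical Research), 2018, 35(7): 28-37
In recent years the government has vigorously promoted "renting and purchasing in parallel" (租购并举) to ease the pressure of ever-rising house prices in large cities. However, whether developing the rental housing system can actually restrain house prices, and how this effect relates to who supplies rental housing, has not yet been directly studied empirically in China or abroad. Combining large-sample micro data from the National Bureau of Statistics' Urban Household Survey with city-level statistics into a panel data set, this paper examines empirically, at the prefecture-level city level, the effect of rental-sector development on house prices, focusing on the role of rental suppliers in this effect. A series of econometric analyses, including ones correcting for endogeneity bias, consistently show a significant negative association between the share of renting households among urban resident households and local house prices. Looking further at the structure of rental suppliers, the higher the market-based share of a city's rental supply, the stronger the negative association between that city's share of renting households and house prices. The findings have strong policy implications: in pursuing "renting and purchasing in parallel", attention should go not only to the rent-sale structure itself but also to diversifying rental suppliers, and especially to giving full play to the leading role of the market mechanism in rental housing supply.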

19.
Summary. Data editing is the process by which data that are collected in some way (a statistical survey, for example) are examined for errors and corrected with the help of software. Edits, the logical conditions that should be satisfied by the data, are specified by subject-matter experts through a procedure which can be tedious and can lead to mistakes with practical implications. To make the process of edit specification more efficient we provide a new step, the definition of the so-called abstract data model of a survey, which describes the structure of the phenomenon that is studied in the survey. The existence of this model enables experts to identify all combinations of variables which should be checked by edits and to avoid the definition of conflicting edits. Furthermore, we introduce an automatic data validation strategy, TREEVAL, that consists of fast tree growing to derive automatically the functional form of edits and of a statistical criterion to clean the incoming data. The TREEVAL strategy is cast within a total quality management framework. The methodologies proposed are demonstrated with the help of a real-life application.
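
The notion of edits as logical conditions on the data can be sketched as a set of named rules applied to each survey record. The field names, thresholds and rules below are hypothetical and written by hand; they do not come from the paper, and the sketch does not implement TREEVAL, which derives edits automatically.

```python
# Hypothetical edits: each rule returns True when the record is acceptable.
EDITS = {
    "age_in_range": lambda r: 0 <= r["age"] <= 120,
    "minor_not_married": lambda r: r["age"] >= 16 or r["marital_status"] == "single",
    "income_non_negative": lambda r: r["income"] >= 0,
}


def failed_edits(record, edits=EDITS):
    """Return the names of all edits that the record violates."""
    return [name for name, rule in edits.items() if not rule(record)]


# Hypothetical survey records.
clean = {"age": 34, "marital_status": "married", "income": 21000}
suspect = {"age": 10, "marital_status": "married", "income": -5}

print(failed_edits(clean))     # []
print(failed_edits(suspect))   # ['minor_not_married', 'income_non_negative']
```

Keeping the edits in one named collection makes it easier to see which combinations of variables are checked and to notice rules that contradict one another.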

20.
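The article selects 12 key indicators covering economic, social and cultural aspects closely related to the development of the life insurance market, and applies factor analysis and cluster analysis to assess and classify the life insurance market resource endowments of 30 Chinese provinces, municipalities and autonomous regions. Combined with insurance density and insurance penetration indicators, it compares the development level and market potential of China's regional life insurance markets. The results show that resource endowments differ greatly across China's regional life insurance markets, and that the level of resource endowment is not entirely consistent with the level of market development. Finally, the article offers an economic explanation for these differences and puts forward corresponding suggestions.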
