Similar articles
 20 similar articles found (search time: 15 ms)
1.
Recent work has shown that Lasso-based regularization is very useful for estimating the high-dimensional inverse covariance matrix. A particularly useful scheme is based on penalizing the ℓ1 norm of the off-diagonal elements to encourage sparsity. We embed this type of regularization into high-dimensional classification. A two-stage estimation procedure is proposed which first recovers structural zeros of the inverse covariance matrix and then enforces block sparsity by moving non-zeros closer to the main diagonal. We show that the block-diagonal approximation of the inverse covariance matrix leads to an additive classifier, and demonstrate that accounting for the structure can yield better classification accuracy. The effect of the block size on classification is explored, and a class of asymptotically equivalent structure approximations in a high-dimensional setting is specified. We suggest variable selection at the block level and investigate properties of this procedure in growing-dimension asymptotics. We present a consistency result on the feature selection procedure, establish asymptotic lower and upper bounds for the fraction of separative blocks and specify constraints under which reliable classification with block-wise feature selection can be performed. The relevance and benefits of the proposed approach are illustrated on both simulated and real data.
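
The claim that a block-diagonal inverse covariance matrix yields an additive classifier can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes two Gaussian classes with a common, known precision matrix and equal priors, and all names (`lda_score`, `blockwise_lda_score`, the block sizes) are hypothetical. It only shows that the full linear discriminant score equals the sum of per-block scores.

```python
import numpy as np

def lda_score(x, mu0, mu1, precision):
    """Linear discriminant score (equal priors assumed); positive favours class 1."""
    return (x - 0.5 * (mu0 + mu1)) @ precision @ (mu1 - mu0)

def blockwise_lda_score(x, mu0, mu1, blocks):
    """Sum of per-block scores; `blocks` is a list of (index_array, precision_block)."""
    return sum(
        lda_score(x[idx], mu0[idx], mu1[idx], omega_b) for idx, omega_b in blocks
    )

# Build an illustrative block-diagonal precision matrix (block sizes are arbitrary).
rng = np.random.default_rng(0)
sizes = [2, 3, 1]
dim = sum(sizes)
precision = np.zeros((dim, dim))
blocks, start = [], 0
for s in sizes:
    a = rng.normal(size=(s, s))
    omega_b = a @ a.T + s * np.eye(s)  # symmetric positive definite block
    idx = np.arange(start, start + s)
    precision[np.ix_(idx, idx)] = omega_b
    blocks.append((idx, omega_b))
    start += s

mu0, mu1, x = rng.normal(size=(3, dim))
full = lda_score(x, mu0, mu1, precision)
additive = blockwise_lda_score(x, mu0, mu1, blocks)
```

Because the off-block entries of the precision matrix are zero, the quadratic form splits into one term per block, so `full` and `additive` agree up to floating-point error.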

2.
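Tang Lizhi (唐礼智), Liu Yu (刘玉). Statistical Research (《统计研究》), 2018, 35(2): 119-128
By constructing a random-effects semiparametric varying-coefficient panel model that includes spatial lags of both the dependent variable and the error term, this paper extends the flexibility and adaptability of existing models. Estimators of the parametric and nonparametric components are obtained by the profile maximum likelihood method. The theoretical results show that, under certain regularity conditions, all estimators are consistent and asymptotically normal. Numerical simulations show that the estimators have good small-sample properties and that estimation accuracy improves as the sample size increases. The choice of spatial weight matrix makes no significant difference to the performance of the estimators, but under the Case weight matrix, for a given sample size, the estimation bias of the spatial correlation coefficient grows as the complexity of the spatial weight structure increases.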

3.
Remove unwanted variation (RUV) is an estimation and normalization system in which the underlying correlation structure of a multivariate dataset is estimated from negative control measurements, typically gene expression values, which are assumed to stay constant across experimental conditions. In this paper we derive the weight matrix which is estimated and incorporated into the generalized least squares estimates of RUV-inverse, and show that this weight matrix estimates the average covariance matrix across negative control measurements. RUV-inverse can thus be viewed as an estimation method adjusting for an unknown experimental design. We show that for a balanced incomplete block design (BIBD), RUV-inverse recovers intra- and interblock estimates of the relevant parameters and combines them as a weighted sum just like the best linear unbiased estimator (BLUE), except that the weights are globally estimated from the negative control measurements instead of being individually optimized to each measurement as in the classical, single measurement BIBD BLUE.

4.
We consider a linear regression whose error term obeys an autoregressive model of infinite order, and we estimate the parameters of both models. Because the errors are unobservable, the parameters of the autoregressive model must be estimated from the residuals obtained by ordinary least squares. We show the consistency of the estimators of the coefficients, variance and spectral density of the model obeyed by the error term. Further, we estimate the coefficients of the linear regression by estimated generalized least squares and show the consistency of this estimator as well.
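
The two-step idea in this abstract (fit the error model to OLS residuals, then re-estimate the regression by estimated generalized least squares) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' estimator: the infinite-order autoregression is truncated at a fixed order `p`, the AR coefficients are fitted by least squares, and the GLS step is carried out by filtering `y` and `X` with the fitted AR polynomial. The function names and the simulated data are hypothetical.

```python
import numpy as np

def fit_ar(resid, p):
    """Least-squares AR(p) fit to a residual series; returns the p coefficients."""
    n = len(resid)
    lags = np.column_stack([resid[p - k - 1 : n - k - 1] for k in range(p)])
    coef, *_ = np.linalg.lstsq(lags, resid[p:], rcond=None)
    return coef

def ar_filter(z, coef):
    """Apply (1 - a_1 L - ... - a_p L^p) along axis 0, dropping the first p rows."""
    p = len(coef)
    out = z[p:].copy()
    for k, a in enumerate(coef, start=1):
        out -= a * z[p - k : len(z) - k]
    return out

def feasible_gls(y, X, p):
    """OLS, then AR(p) fit on the OLS residuals, then OLS on the filtered data."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    coef = fit_ar(y - X @ beta_ols, p)
    beta_fgls, *_ = np.linalg.lstsq(
        ar_filter(X, coef), ar_filter(y, coef), rcond=None
    )
    return beta_ols, coef, beta_fgls

# Simulated example; every number here is an illustrative assumption.
rng = np.random.default_rng(1)
n, true_beta, true_ar = 2000, np.array([1.0, 2.0]), 0.6
X = np.column_stack([np.ones(n), rng.normal(size=n)])
shocks = rng.normal(size=n)
errors = np.zeros(n)
for t in range(1, n):
    errors[t] = true_ar * errors[t - 1] + shocks[t]
y = X @ true_beta + errors
beta_ols, ar_coef, beta_fgls = feasible_gls(y, X, p=2)
```

In a setting closer to the abstract, the truncation order `p` would presumably grow with the sample size; here it is fixed only to keep the sketch short.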


5.
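A new method for selecting the spatial weight matrix   Total citations: 1 (self-citations: 0, citations by others: 1)
Ren Yinghua (任英华), You Wanhai (游万海). Statistical Research (《统计研究》), 2012, 29(6): 99-105
Selecting the spatial weight matrix has long been a difficult problem in spatial econometrics, and whether the matrix is chosen correctly bears on the final estimation results of the model. Within the spatial lag model framework, this paper converts the weight matrix selection problem into a variable selection problem and then performs variable selection with the CWB method. An empirical study of the agglomeration mechanism of China's urban service industry shows that the spatial weight matrix selected by the proposed method is reasonable, which in turn reduces the estimation bias caused by misspecification of the spatial weight matrix. In large samples, the method can reduce computational cost very effectively.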

6.
In the framework of model-based cluster analysis, finite mixtures of Gaussian components represent an important class of statistical models widely employed for dealing with quantitative variables. Within this class, we propose novel models in which constraints on the component-specific variance matrices allow us to define Gaussian parsimonious clustering models. Specifically, the proposed models are obtained by assuming that the variables can be partitioned into groups that are conditionally independent within components, thus producing component-specific variance matrices with a block diagonal structure. This approach allows us to extend the methods for model-based cluster analysis and to make them more flexible and versatile. In this paper, Gaussian mixture models are studied under the above-mentioned assumption. Identifiability conditions are proved and the model parameters are estimated through the maximum likelihood method by using the Expectation-Maximization algorithm. The Bayesian information criterion is proposed for selecting the partition of the variables into conditionally independent groups. The consistency of this criterion is proved under regularity conditions. In order to examine and compare models with different partitions of the set of variables, a hierarchical algorithm is suggested. A wide class of parsimonious Gaussian models is also presented by parameterizing the component-variance matrices according to their spectral decomposition. The effectiveness and usefulness of the proposed methodology are illustrated with two examples based on real datasets.

7.
We propose a heterogeneous time-varying panel data model with a latent group structure that allows the coefficients to vary over both individuals and time. We assume that the coefficients change smoothly over time and form different unobserved groups. When treated as smooth functions of time, the individual functional coefficients are heterogeneous across groups but homogeneous within a group. We propose a penalized-sieve-estimation-based classifier-Lasso (C-Lasso) procedure to identify the individuals’ membership and to estimate the group-specific functional coefficients in a single step. The classification exhibits the desirable property of uniform consistency. The C-Lasso estimators and their post-Lasso versions achieve the oracle property so that the group-specific functional coefficients can be estimated as well as if the individuals’ membership were known. Several extensions are discussed. Simulations demonstrate excellent finite sample performance of the approach in both classification and estimation. We apply our method to study the heterogeneous trending behavior of GDP per capita across 91 countries for the period 1960–2012 and find four latent groups.

8.
The investigation of aliases or biases is important for the interpretation of the results from factorial experiments. For two-level fractional factorials this can be facilitated through their group structure. For more general arrays the alias matrix can be used. This tool is traditionally based on the assumption that the error structure is that associated with ordinary least squares. For situations where that is not the case, we provide in this article a generalization of the alias matrix applicable under the generalized least squares assumptions. We also show that for the special case of split plot error structure, the generalized alias matrix simplifies to the ordinary alias matrix.
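
To make the alias matrix concrete, here is a small sketch. The ordinary alias matrix for a fitted model matrix X1 and omitted terms X2 is (X1'X1)^(-1) X1'X2. The generalized form used below, (X1'V^(-1)X1)^(-1) X1'V^(-1)X2, is what the bias of the GLS estimator implies when V is the error covariance; it is written here as an assumption and is not copied from the article. The design (a half fraction of a 2^3 factorial with C = AB) and the function name `alias_matrix` are illustrative.

```python
import numpy as np

def alias_matrix(X1, X2, V=None):
    """Alias matrix of fitted terms X1 with respect to omitted terms X2.

    V=None gives the ordinary (OLS) version; otherwise V is the assumed
    symmetric positive definite error covariance used in the GLS version.
    """
    if V is None:
        V = np.eye(X1.shape[0])
    Vinv_X1 = np.linalg.solve(V, X1)
    return np.linalg.solve(X1.T @ Vinv_X1, Vinv_X1.T @ X2)

# Half fraction of a 2^3 design with generator C = AB.
A = np.array([-1.0, 1.0, -1.0, 1.0])
B = np.array([-1.0, -1.0, 1.0, 1.0])
C = A * B
X1 = np.column_stack([np.ones(4), A, B, C])   # fitted: intercept and main effects
X2 = np.column_stack([A * B, A * C, B * C])   # omitted: two-factor interactions

ols_alias = alias_matrix(X1, X2)

# Illustrative split-plot covariance: runs {0, 1} and {2, 3} share a whole plot.
V = np.eye(4) + 0.5 * np.kron(np.eye(2), np.ones((2, 2)))
gls_alias = alias_matrix(X1, X2, V)
```

Rows of `ols_alias` correspond to the fitted terms and columns to AB, AC and BC, so the entries show that C is aliased with AB, B with AC, and A with BC. In this tiny example X1 is square and invertible, so the GLS and OLS versions coincide for any positive definite V; that is a property of the example and not a demonstration of the article's split-plot result.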

9.
We present an application of subsampling and bootstrap methods for time series to determine the distribution of the zero-crossings estimator. The zero-crossings method provides an alternative estimator of the lag-1 autocorrelation coefficient that reduces data storage requirements and is more robust to outliers than the classical estimator. The main results establish the consistency of subsampling, of the moving block bootstrap (MBB), of the non-overlapping block bootstrap and of the stationary bootstrap for this estimator. Theorems are formulated for Gaussian processes, elliptically symmetric processes and processes which are transformed Gaussian processes. Theoretical results are illustrated by simulations and practical data analysis. We also show that in practice the MBB method behaves better than the subsampling method.
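
A zero-crossings estimator of the lag-1 autocorrelation can be sketched in a few lines. The abstract does not give the formula; the version below uses the classical cosine relation for a stationary Gaussian process, rho = cos(pi * D / (n - 1)), where D is the number of sign changes of the centred series. That choice, the function names and the simulated AR(1) data are assumptions made for illustration.

```python
import numpy as np

def zero_crossing_autocorr(x):
    """Lag-1 autocorrelation estimate from the sign-change rate of a centred series."""
    x = np.asarray(x, dtype=float)
    above = x >= x.mean()                 # only one bit per observation is needed
    crossings = np.count_nonzero(above[1:] != above[:-1])
    rate = crossings / (len(x) - 1)
    return float(np.cos(np.pi * rate))

def sample_autocorr(x):
    """Classical lag-1 sample autocorrelation, for comparison."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[1:], x[:-1]) / np.dot(x, x))

# Simulated Gaussian AR(1) series; the coefficient 0.5 is an arbitrary choice.
rng = np.random.default_rng(2)
n, phi = 50_000, 0.5
noise = rng.normal(size=n)
series = np.zeros(n)
for t in range(1, n):
    series[t] = phi * series[t - 1] + noise[t]

rho_zc = zero_crossing_autocorr(series)
rho_classical = sample_autocorr(series)
```

Since the estimator depends on the data only through signs, a single extreme observation can change at most two sign comparisons, which is one way to read the robustness remark in the abstract.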

10.
We explore the accuracy of the linear and quadratic classifiers for high-dimensional higher-order data, assuming that the class conditional distributions are multivariate normal with locally doubly exchangeable covariance structure. We derive a two-stage procedure for estimating the covariance matrix. At the first stage, Lasso-based structure learning is applied to sparsify the block components within the covariance matrix. At the second stage, the maximum-likelihood estimators of all block-wise parameters are derived assuming the doubly exchangeable within-block covariance structure and a Kronecker product structured mean vector. We also study the effect of the block size on the classification performance in the high-dimensional setting and derive a class of asymptotically equivalent block structure approximations, in the sense that the choice of the block size is asymptotically negligible.

11.
Summary.  Alongside the development of meta-analysis as a tool for summarizing research literature, there is renewed interest in broader forms of quantitative synthesis that are aimed at combining evidence from different study designs or evidence on multiple parameters. These have been proposed under various headings: the confidence profile method, cross-design synthesis, hierarchical models and generalized evidence synthesis. Models that are used in health technology assessment are also referred to as representing a synthesis of evidence in a mathematical structure. Here we review alternative approaches to statistical evidence synthesis, and their implications for epidemiology and medical decision-making. The methods include hierarchical models, models informed by evidence on different functions of several parameters and models incorporating both of these features. The need to check for consistency of evidence when using these powerful methods is emphasized. We develop a rationale for evidence synthesis that is based on Bayesian decision modelling and expected value of information theory, which stresses not only the need for a lack of bias in estimates of treatment effects but also a lack of bias in assessments of uncertainty. The increasing reliance of governmental bodies like the UK National Institute for Clinical Excellence on complex evidence synthesis in decision modelling is discussed.

12.
We consider variance estimation for the weighted likelihood estimator (WLE) under two-phase stratified sampling without replacement. The asymptotic variance of the WLE in many semiparametric models contains unknown functions or does not have a closed form. The standard method, which uses inverse probability weighted (IPW) sample variances of an estimated influence function, is then not available in these models. To address this issue, we develop a variance estimation procedure for the WLE in a general semiparametric model. The phase I variance is estimated by taking a numerical derivative of the IPW log likelihood. The phase II variance is estimated based on the bootstrap for a stratified sample in a finite population. Despite the theoretical difficulty of dependent observations due to sampling without replacement, we establish the (bootstrap) consistency of our estimators. Finite sample properties of our method are illustrated in a simulation study.

13.
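Song Peng (宋鹏) et al. Statistical Research (《统计研究》), 2020, 37(7): 116-128
Estimating high-dimensional covariance matrices has become a basic problem in the statistical analysis of big data. Traditional methods require the data to be normally distributed and do not account for outliers; they no longer meet practical needs, and more robust estimation methods are urgently needed. For high-dimensional covariance matrices, a robust and easily implemented median-of-means estimator based on grouping subsamples has been proposed, but the matrix it produces is neither positive definite nor sparse. To address this problem, this paper introduces a central regularization algorithm that remedies the shortcoming of the original method: by imposing an L1-norm penalty on the off-diagonal elements of the estimated matrix during the solution process, the estimate becomes positive definite and sparse, which markedly increases its practical value. In numerical simulations, the proposed centrally regularized robust estimator achieves higher estimation accuracy and more closely matches the sparsity structure of the true matrix. In the subsequent empirical portfolio analysis, the minimum-variance portfolio built on the centrally regularized robust estimator has less volatile returns than those based on the traditional sample covariance matrix, the median-of-means estimator and the RA-LASSO method.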

14.
This paper proposes a working estimating equation for spatial count data that is computationally easy to use. The proposed estimating equation is a modification of quasi-likelihood estimating equations that does not require the covariance matrix to be correctly specified. Under some regularity conditions, we show that the proposed estimator is consistent and asymptotically normal. A simulation comparison also indicates that the proposed method performs competitively on over-dispersed data from a parameter-driven model.

15.
In this article, a semiparametric time-varying nonlinear vector autoregressive (NVAR) model is proposed to model nonlinear vector time series data. We consider a combination of parametric and nonparametric estimation approaches to estimate the NVAR function for both independent and dependent errors. We use the multivariate Taylor series expansion of the link function up to the second order which has a parametric framework as a representation of the nonlinear vector regression function. After the unknown parameters are estimated by the maximum likelihood estimation procedure, the obtained NVAR function is adjusted by a nonparametric diagonal matrix, where the proposed adjusted matrix is estimated by the nonparametric kernel estimator. The asymptotic consistency properties of the proposed estimators are established. Simulation studies are conducted to evaluate the performance of the proposed semiparametric method. A real data example on short-run interest rates and long-run interest rates of United States Treasury securities is analyzed to demonstrate the application of the proposed approach. The Canadian Journal of Statistics 47: 668–687; 2019 © 2019 Statistical Society of Canada

16.
The pairwise comparison matrix (PCM) is a popular technique used in multi-criteria decision making. The abelian linearly ordered group (alo-group) is a powerful tool for the discussion of PCMs. In this article, a criterion for acceptable consistency of a PCM is introduced, which is independent of the scale and can be intuitively interpreted. The relation of the introduced criterion to weak consistency is investigated. Then, a multiplicative alo-group based hierarchical decision model is proposed. The following approaches are included: (1) the introduced criterion for acceptable consistency is used to check whether or not a PCM is acceptable; (2) the row geometric mean method is used for deriving the local priorities of a multiplicative PCM; (3) a Hierarchy Composition Rule derived from the weighted mean is used for computing the weights of the criteria and subcriteria with regard to the total goal; and (4) the weighted geometric mean is used as the aggregation rule, where the local priorities of the alternatives are min-normalized. The proposed model has the property of preserving rank. Moreover, it has counterparts in the additive case. Finally, the model is applied, using computer software, to a layout planning problem of an aircraft maintenance base.
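
Step (2) of the model, the row geometric mean method, is easy to illustrate. The sketch below is not the article's acceptance criterion: `is_consistent` checks only full multiplicative consistency (a_ij * a_jk = a_ik), and `min_normalize` reads "min-normalized" as division by the smallest priority, which is an assumption. All function names and the example weights are hypothetical.

```python
import numpy as np

def is_consistent(pcm, tol=1e-9):
    """Full multiplicative consistency: a_ij * a_jk == a_ik for all i, j, k."""
    prod = pcm[:, :, None] * pcm[None, :, :]      # prod[i, j, k] = a_ij * a_jk
    target = pcm[:, None, :]                      # target[i, j, k] = a_ik
    return bool(np.all(np.abs(prod - target) <= tol * target))

def row_geometric_mean(pcm):
    """Local priorities of a multiplicative PCM: geometric mean of each row."""
    return np.exp(np.mean(np.log(pcm), axis=1))

def min_normalize(w):
    """Scale priorities so that the smallest one equals 1 (assumed meaning)."""
    return w / np.min(w)

# A perfectly consistent PCM generated from illustrative weights 4 : 2 : 1.
weights = np.array([4.0, 2.0, 1.0])
pcm = weights[:, None] / weights[None, :]
priorities = min_normalize(row_geometric_mean(pcm))

# Perturb one judgement (and its reciprocal) to break consistency.
noisy = pcm.copy()
noisy[0, 2], noisy[2, 0] = 5.0, 1.0 / 5.0
```

For the consistent matrix the method recovers the generating weights exactly; for `noisy` the priorities still exist, but the consistency check fails, which is the situation an acceptance criterion has to grade.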

17.
This article considers a simple test for the correct specification of linear spatial autoregressive models, assuming that the choice of the weight matrix Wn is true. We derive the limiting distributions of the test under the null hypothesis of correct specification and a sequence of local alternatives. We show that the test is free of nuisance parameters asymptotically under the null and prove the consistency of our test. To improve the finite sample performance of our test, we also propose a residual-based wild bootstrap and justify its asymptotic validity. We conduct a small set of Monte Carlo simulations to investigate the finite sample properties of our tests. Finally, we apply the test to two empirical datasets: the vote cast and the economic growth rate. We reject the linear spatial autoregressive model in the vote cast example but fail to reject it in the economic growth rate example. Supplementary materials for this article are available online.

18.
A framework is described for organizing and understanding the computations necessary to obtain the posterior mean of a vector of linear effects in a normal linear model, conditional on the parameters that determine covariance structure. The approach has two major uses: first, as a pedagogical tool in the derivation of formulae, and second, as a practical tool for developing computational strategies without needing complicated matrix formulae that are often unwieldy in complex hierarchical models. The proposed technique is based upon symbolic application of the sweep operator SWP to an appropriate tableau of means and covariances. The method is illustrated with standard linear model specifications, including the so-called mixed model, with both fixed and random effects.
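
A numerical version of the sweep operator helps show why sweeping a covariance tableau produces conditional (and hence posterior) means. The sketch below uses one common sign convention for the sweep, which may differ from the SWP defined in the article, and applies it to an arbitrary 3 x 3 joint covariance matrix of (u1, u2, y). The function name `sweep` and all numbers are illustrative assumptions.

```python
import numpy as np

def sweep(tableau, pivots):
    """Sweep a square tableau on the given pivot indices.

    Convention used here: after sweeping on the index set S, the S x S block
    holds the inverse of the original block, the S x rest block holds the
    regression coefficients, and the rest x rest block holds the Schur
    complement (the conditional covariance in a Gaussian model).
    """
    a = np.array(tableau, dtype=float)
    for k in pivots:
        d = a[k, k]
        a[k, :] /= d
        for i in range(a.shape[0]):
            if i != k:
                b = a[i, k]
                a[i, :] -= b * a[k, :]
                a[i, k] = -b / d
        a[k, k] = 1.0 / d
    return a

# Illustrative joint covariance of (u1, u2, y).
rng = np.random.default_rng(3)
g = rng.normal(size=(3, 3))
sigma = g @ g.T + 3.0 * np.eye(3)

swept = sweep(sigma, pivots=[0, 1])
coef = swept[:2, 2]          # coefficients of y on (u1, u2)
cond_var = swept[2, 2]       # Var(y | u1, u2)

# Conditional mean of y given observed u, with assumed means.
mu = np.array([0.5, -1.0, 2.0])
u_obs = np.array([1.0, 0.0])
cond_mean = mu[2] + coef @ (u_obs - mu[:2])
```

Sweeping on the conditioning variables replaces explicit matrix inversion formulae with a sequence of simple pivots, which is the computational point the abstract makes for more complex hierarchical models.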

19.
Summary.  Non-hierarchical clustering methods are frequently based on the idea of forming groups around 'objects'. The main exponent of this class of methods is the k-means method, where these objects are points. However, clusters in a data set may often be due to certain relationships between the measured variables. For instance, we can find linear structures such as straight lines and planes, around which the observations are grouped in a natural way. These structures are not well represented by points. We present a method that searches for linear groups in the presence of outliers. The method is based on the idea of impartial trimming. We search for the 'best' subsample containing a proportion 1 − α of the data and the best k affine subspaces fitting those non-discarded observations, measuring discrepancies through orthogonal distances. The population version of the sample problem is also considered. We prove the existence of solutions for the sample and population problems together with their consistency. A feasible algorithm for solving the sample problem is described as well. Finally, some examples showing how the proposed method works in practice are provided.

20.
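Chen Jianbao (陈建宝), Sun Lin (孙林). Statistical Research (《统计研究》), 2015, 32(1): 95-101
For the random-effects spatial lag single-index panel model, this paper constructs a profile maximum likelihood estimation method and examines the large-sample properties and small-sample performance of the estimators through theoretical proof and numerical simulation, respectively. The results show that: (1) in large samples, all estimators are consistent, and the parametric estimators are asymptotically normal; (2) in small samples, the estimators still perform well, with accuracy improving as the sample size increases; the structural complexity of the spatial weight matrix has a sizable effect on the estimator of the spatial correlation coefficient but little effect on the other estimators.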
