Similar literature (20 records found)
1.
ABSTRACT

It has been shown that equilibrium restrictions in a search model can be used to identify quantiles of the search cost distribution from observed prices alone. These quantiles can be difficult to estimate in practice. This article uses a minimum distance approach to estimate them that is easy to compute. A version of our estimator is the solution to a nonlinear least-squares problem that can be straightforwardly programmed in software such as Stata. We show our estimator is consistent and has an asymptotically normal distribution. Its distribution can be consistently estimated by a bootstrap. Our estimator can be used to estimate the cost distribution nonparametrically on a larger support when prices from heterogeneous markets are available. We propose a two-step sieve estimator for that case. The first step estimates quantiles from each market. They are used in the second step as generated variables to perform nonparametric sieve estimation. We derive the uniform rate of convergence of the sieve estimator, which can be used to quantify the errors incurred from interpolating data across markets. To illustrate, we use online bookmaking odds for English football league matches (as prices) and find evidence suggesting that consumer search costs have fallen following a change in British law that allows gambling operators to advertise more widely. Supplementary materials for this article are available online.
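The nonlinear least-squares formulation lends itself to a few lines of code in any scientific environment, not only Stata. Below is a minimal sketch in Python, assuming a hypothetical model-implied quantile function and placeholder data; the actual moment conditions come from the search model's equilibrium restrictions and are not reproduced here.

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical setup: empirical price quantiles observed in one market.
probs = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
empirical_q = np.array([1.8, 2.1, 2.6, 3.2, 3.9])  # placeholder data

def model_quantiles(theta, p):
    # Placeholder model-implied quantile function; the real mapping comes
    # from the search model's equilibrium restrictions.
    loc, scale = theta
    return loc + scale * np.log(p / (1.0 - p))  # logistic quantiles as a stand-in

def residuals(theta):
    # Minimum-distance criterion: model quantiles minus empirical quantiles.
    return model_quantiles(theta, probs) - empirical_q

fit = least_squares(residuals, x0=np.array([2.0, 1.0]))
print("estimated parameters:", fit.x)
```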

2.
Regression methods for common data types such as measured, count and categorical variables are well understood, but increasingly statisticians need ways to model relationships between variable types such as shapes, curves, trees, correlation matrices and images that do not fit into the standard framework. Data types that lie in metric spaces but not in vector spaces are difficult to use within the usual regression setting, either as the response and/or a predictor. We represent the information in these variables using distance matrices, which requires only the specification of a distance function. A low-dimensional representation of such distance matrices can be obtained using methods such as multidimensional scaling. Once these variables have been represented as scores, an internal model linking the predictors and the responses can be developed using standard methods. We call scoring the transformation from a new observation to a score, whereas backscoring is a method to represent a score as an observation in the data space. Both methods are essential for prediction and explanation. We illustrate the methodology for shape data, unregistered curve data and correlation matrices using motion capture data from an experiment to study the motion of children with cleft lip.
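As a rough illustration of the scoring step, classical multidimensional scaling can be computed from a distance matrix by double centering and an eigendecomposition, and the resulting scores then enter an ordinary regression. The sketch below uses synthetic data, not the motion-capture data from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic objects in a hidden space; only their distance matrix is "observed".
hidden = rng.normal(size=(30, 4))
D = np.linalg.norm(hidden[:, None, :] - hidden[None, :, :], axis=-1)

def classical_mds(D, k):
    # Double-center the squared distance matrix, then take the top-k eigenpairs.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:k]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

scores = classical_mds(D, k=2)            # low-dimensional representation
y = hidden @ rng.normal(size=4)           # a response related to the objects
beta, *_ = np.linalg.lstsq(
    np.column_stack([np.ones(len(scores)), scores]), y, rcond=None
)
print("regression coefficients on MDS scores:", beta)
```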

3.
A fuzzy multiple-allocation p-hub median problem is proposed, in which transportation costs are defined as fuzzy variables and the objective is to minimize total transportation cost at a given credibility level. For trapezoidal and normal transportation costs, the problem is equivalent to a deterministic mixed-integer linear programming problem. In the empirical analysis, panel data on the coal industry of Liaoning Province are used to determine the number and locations of coal transportation hubs at different credibility levels, and the resulting problems are solved with traditional optimization methods such as branch and bound. The computations show that this model and solution method can be used to solve the hub location problem for coal transportation in Liaoning Province.
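At a fixed credibility level the fuzzy problem reduces to a deterministic mixed-integer linear program. The sketch below solves a much simpler relative, a crisp p-median location problem without multiple allocation through hub pairs, using scipy.optimize.milp on synthetic distances; it illustrates the MILP structure rather than the paper's full hub model.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

rng = np.random.default_rng(1)
n, p_hubs = 6, 2
pts = rng.uniform(size=(n, 2))
d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)  # crisp cost matrix

# Decision vector: x[i, j] (assign node i to hub j) flattened, then y[j] (open hub j).
nx = n * n
c = np.concatenate([d.ravel(), np.zeros(n)])

A_assign = np.zeros((n, nx + n))
for i in range(n):
    A_assign[i, i * n:(i + 1) * n] = 1.0        # each node assigned to one hub

A_link = np.zeros((nx, nx + n))
for i in range(n):
    for j in range(n):
        A_link[i * n + j, i * n + j] = 1.0      # x[i, j] <= y[j]
        A_link[i * n + j, nx + j] = -1.0

A_open = np.zeros((1, nx + n))
A_open[0, nx:] = 1.0                            # exactly p hubs open

res = milp(c=c,
           constraints=[LinearConstraint(A_assign, 1, 1),
                        LinearConstraint(A_link, -np.inf, 0),
                        LinearConstraint(A_open, p_hubs, p_hubs)],
           integrality=np.ones(nx + n),         # all variables binary
           bounds=Bounds(0, 1))
print("open hubs:", np.flatnonzero(np.round(res.x[nx:])), "total cost:", res.fun)
```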

4.
This paper proposes a new variable selection method, MI-SIS, based on the mutual information between the explanatory variables and the response. The method can handle ultrahigh-dimensional problems in which the number of explanatory variables p is far larger than the sample size n, namely p = O(exp(n^ε)) for some ε > 0. Moreover, it is a model-free variable selection method that does not rely on model assumptions. Simulations and an empirical study show that MI-SIS can effectively detect weak signals in small samples.
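A minimal sketch of mutual-information screening, using sklearn's mutual_info_regression as the MI estimator (the paper's estimator may differ) and the n/log(n) screening size common in the sure-independence-screening literature.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
n, p = 100, 2000                      # p >> n, ultrahigh-dimensional setting
X = rng.normal(size=(n, p))
y = 2 * X[:, 0] - X[:, 1] + 0.5 * rng.normal(size=n)  # only 2 active variables

# Rank variables by estimated mutual information with the response.
mi = mutual_info_regression(X, y, random_state=0)
d = int(n / np.log(n))                # a common screening threshold size
keep = np.argsort(mi)[::-1][:d]
print("top-ranked variables:", keep[:10])
```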

5.
Approximate conditional inference is developed for the linear calibration problem. It is shown that this problem can be transformed so that the primary parameter is an angle, the nuisance parameter is a radial distance, and the density is rotationally symmetric. Were the nuisance parameter known, exact confidence intervals would be available by location or structural arguments. A confidence distribution is used to average out the nuisance parameter, yielding an approximate confidence interval that involves a precision indicator derived from the radial distance. Some difficulties with the ordinary solution are avoided by the conditional procedure.

6.
In modern football, various variables, such as the distance a team runs or its percentage of ball possession, are collected throughout a match. However, there is a lack of methods to make use of these on-field variables simultaneously and to connect them with the final result of the match. This paper considers data from the German Bundesliga season 2015/2016. The objective is to identify the on-field variables that are connected to the sporting success or failure of the individual teams. An extended Bradley–Terry model for football matches is proposed that is able to take into account on-field covariates. Penalty terms are used to reduce the complexity of the model and to find clusters of teams with equal covariate effects. The model identifies running distance as the on-field covariate most strongly connected to the match outcome.
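A Bradley–Terry-type model with covariates can be reduced, for two outcomes, to a logistic regression on home-minus-away covariate differences. The sketch below uses synthetic matches, ignores draws, and omits the paper's penalty terms for clustering teams.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic match data: covariate differences (home minus away) per match,
# e.g. running distance and ball possession; outcome 1 = home win.
n_matches = 300
X_diff = rng.normal(size=(n_matches, 2))
true_beta = np.array([1.2, 0.1])        # running distance matters more
prob_home = 1 / (1 + np.exp(-(X_diff @ true_beta)))
y = rng.binomial(1, prob_home)

# The Bradley-Terry-type reduction: logistic regression on differences,
# without intercept so equal covariates imply a 50/50 match.
# (penalty=None requires sklearn >= 1.2; use penalty='none' on older versions)
model = LogisticRegression(fit_intercept=False, penalty=None).fit(X_diff, y)
print("estimated covariate effects:", model.coef_.ravel())
```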

7.
The problem of determining the number of variables to be included in the linear regression model is considered under the assumption that the dependent and independent variables have a joint normal distribution. It is shown that for a given sample size n there exists an optimal number k0 (0 ≤ k0 < n−2) of variables among all independent variables in the model, such that the expectation of the mean squared error corresponding to the prediction equation with k0 variables is minimal. Application of this result to stepwise procedures is discussed.

8.
This article considers objective Bayesian testing in normal regression models with first-order autoregressive residuals. We propose solutions based on a Bayesian model selection procedure for this problem in which no subjective input is required. We construct proper priors for testing the autocorrelation coefficient based on measures of divergence between competing models, called divergence-based (DB) priors, and then propose an objective Bayesian decision-theoretic rule, called the Bayesian reference criterion (BRC). Finally, we derive the intrinsic test statistic for testing the autocorrelation coefficient. The behavior of the Bayes factors based on the DB priors is examined by comparison with the BRC in a simulation study and an example.

9.
The least absolute shrinkage and selection operator (lasso) has been widely used in regression analysis. Based on the piecewise linear property of the solution path, least angle regression provides an efficient algorithm for computing the solution paths of lasso. Group lasso is an important generalization of lasso that can be applied to regression with grouped variables. However, the solution path of group lasso is not piecewise linear and hence cannot be obtained by least angle regression. By transforming the problem into a system of differential equations, we develop an algorithm for efficient computation of group lasso solution paths. Simulation studies are conducted for comparing the proposed algorithm to the best existing algorithm: the groupwise-majorization-descent algorithm.
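For reference, the sketch below solves the group lasso at a single penalty value by proximal gradient descent with block soft-thresholding; it is not the ODE-based path algorithm of the paper, only a simple baseline solver.

```python
import numpy as np

def group_lasso_prox_grad(X, y, groups, lam, n_iter=500):
    """Proximal gradient descent for the group lasso at one penalty value.

    Minimizes (1/2n)||y - Xb||^2 + lam * sum_g ||b_g||.
    `groups` maps each coefficient to a group id.
    """
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2         # 1 / Lipschitz constant
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        for g in np.unique(groups):
            idx = groups == g
            norm_g = np.linalg.norm(z[idx])
            shrink = max(0.0, 1.0 - step * lam / norm_g) if norm_g > 0 else 0.0
            beta[idx] = shrink * z[idx]          # block soft-thresholding
    return beta

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = X[:, :2] @ np.array([1.5, -1.0]) + 0.1 * rng.normal(size=100)
groups = np.array([0, 0, 1, 1, 2, 2])
print(group_lasso_prox_grad(X, y, groups, lam=0.1))
```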

10.
A test of congruence among distance matrices is described. It tests the hypothesis that several matrices, containing different types of variables about the same objects, are congruent with one another, so they can be used jointly in statistical analysis. Raw data tables are turned into similarity or distance matrices prior to testing; they can then be compared to data that naturally come in the form of distance matrices. The proposed test can be seen as a generalization of the Mantel test of matrix correspondence to any number of distance matrices. This paper shows that the new test has the correct rate of Type I error and good power. Power increases as the number of objects and the number of congruent data matrices increase; power is higher when the total number of matrices in the study is smaller. To illustrate the method, the proposed test is used to test the hypothesis that matrices representing different types of organoleptic variables (colour, nose, body, palate and finish) in single-malt Scotch whiskies are congruent.
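The idea of the test can be sketched as follows: vectorize the upper triangles of the matrices, take the mean pairwise correlation as the statistic, and build the null distribution by permuting the object labels of all matrices but one. This is a simplified version of the procedure, with synthetic data, not the exact test described in the paper.

```python
import numpy as np

def congruence_test(matrices, n_perm=999, seed=0):
    """Mantel-like permutation test of congruence among distance matrices."""
    rng = np.random.default_rng(seed)
    n = matrices[0].shape[0]
    iu = np.triu_indices(n, k=1)

    def stat(mats):
        # Mean pairwise Pearson correlation of upper-triangle entries.
        vecs = [m[iu] for m in mats]
        cors = [np.corrcoef(vecs[a], vecs[b])[0, 1]
                for a in range(len(vecs)) for b in range(a + 1, len(vecs))]
        return np.mean(cors)

    observed = stat(matrices)
    count = 1
    for _ in range(n_perm):
        permuted = [matrices[0]]
        for m in matrices[1:]:
            perm = rng.permutation(n)          # relabel objects under the null
            permuted.append(m[np.ix_(perm, perm)])
        count += stat(permuted) >= observed
    return observed, count / (n_perm + 1)      # statistic and p-value

rng = np.random.default_rng(1)
base = rng.normal(size=(20, 3))
mats = [np.linalg.norm(base[:, None] - base[None, :], axis=-1) +
        0.1 * rng.uniform(size=(20, 20)) for _ in range(3)]
mats = [(m + m.T) / 2 for m in mats]           # keep matrices symmetric
print(congruence_test(mats))
```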

11.
As modeling efforts expand to a broader spectrum of areas, the amount of computer time required to exercise the corresponding computer codes has become quite costly (several hours for a single run is not uncommon). This cost can be tied directly to the complexity of the modeling and to the large number of input variables (often numbering in the hundreds). Further, the complexity of the modeling (usually involving systems of differential equations) makes the relationships among the input variables mathematically intractable. In this setting it is desired to perform sensitivity studies of the input-output relationships. Hence, a judicious procedure for selecting values of the input variables is required; Latin hypercube sampling has been shown to work well on this type of problem.

However, a variety of situations require that decisions and judgments be made in the face of uncertainty. The source of this uncertainty may be lack of knowledge about probability distributions associated with input variables, or about different hypothesized future conditions, or may be present as a result of different strategies associated with a decision-making process. In this paper a generalization of Latin hypercube sampling is given that allows these areas to be investigated without making additional computer runs. In particular, it is shown how weights associated with Latin hypercube input vectors may be changed to reflect different probability distribution assumptions on key input variables and yet provide an unbiased estimate of the cumulative distribution function of the output variable. This allows different distribution assumptions on input variables to be studied without additional computer runs and without fitting a response surface. In addition, these same weights can be used in a modified nonparametric Friedman test to compare treatments. Sample size requirements needed to apply the results of the work are also considered. The procedures presented in this paper are illustrated using a model associated with the risk assessment of geologic disposal of radioactive waste.
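A sketch of the reweighting idea: draw one Latin hypercube sample, run the model once, then reweight the existing runs by a density ratio to estimate the output CDF under a different input distribution. The weight construction here is a plain importance ratio and may differ in detail from the paper's.

```python
import numpy as np
from scipy.stats import qmc, norm

# Draw a Latin hypercube sample and push it through an expensive model once.
n, d = 200, 2
u = qmc.LatinHypercube(d=d, seed=0).random(n)       # uniform LHS on [0, 1]^d
x = norm.ppf(u)                                     # sampled under N(0, 1) inputs
output = x[:, 0] ** 2 + np.sin(x[:, 1])             # stand-in for the model

# Re-weight the same runs to reflect a different input distribution
# (here N(0.5, 1) for the first input) without new model evaluations.
w = norm.pdf(x[:, 0], loc=0.5) / norm.pdf(x[:, 0], loc=0.0)
w /= w.sum()

# Weighted empirical CDF of the output under the new input assumption.
order = np.argsort(output)
cdf = np.cumsum(w[order])
print("median under new assumption ~", output[order][np.searchsorted(cdf, 0.5)])
```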

12.
Linear regression with compositional explanatory variables
Compositional explanatory variables should not be used directly in a linear regression model because any inferential statistic can become misleading. While various approaches to this problem have been proposed, here an approach based on the isometric logratio (ilr) transformation is used. It turns out that the resulting model is easy to handle, and that parameter estimation can be done as in usual linear regression. Moreover, it is possible to use the ilr variables for inferential statistics in order to obtain an appropriate interpretation of the model.
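A minimal sketch of the ilr approach using pivot coordinates: transform the compositional predictors and then run ordinary least squares on the coordinates, exactly as in usual linear regression. The data are synthetic.

```python
import numpy as np

def ilr(X):
    """Isometric logratio transform of compositions (rows sum to 1).

    Uses pivot coordinates: for D parts, coordinate j is
    sqrt(j/(j+1)) * log(geometric mean of the first j parts / part j+1).
    """
    n, D = X.shape
    Z = np.empty((n, D - 1))
    logX = np.log(X)
    for j in range(1, D):
        gm = logX[:, :j].mean(axis=1)                 # log geometric mean
        Z[:, j - 1] = np.sqrt(j / (j + 1)) * (gm - logX[:, j])
    return Z

rng = np.random.default_rng(0)
raw = rng.gamma(2.0, size=(50, 3))
comp = raw / raw.sum(axis=1, keepdims=True)           # compositional predictors
y = 1.0 + ilr(comp) @ np.array([0.8, -0.5]) + 0.1 * rng.normal(size=50)

# Ordinary least squares on the ilr coordinates, as in usual linear regression.
A = np.column_stack([np.ones(50), ilr(comp)])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
print("estimated coefficients:", beta)
```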

13.
The exact distribution of a linear combination of n independent negative exponential random variables, when the coefficients of the linear combination are distinct and positive, is well known. Recently Ali and Obaidullah (1982) extended this result by taking the coefficients to be arbitrary real numbers. They used a lengthy geometrical approach to arrive at the result. This article gives a simple derivation of the result with the help of a generalized partial fraction technique. This technique also works when the variables involved are gamma variables with certain types of parameters. Results are presented in a form which can easily be programmed for computational purposes. The connection of this problem to various problems in different fields is also pointed out.
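For the classical case of distinct positive coefficients (not the arbitrary-real-coefficient extension discussed above), the partial-fraction form of the density is short enough to program directly, as the abstract suggests; a Monte Carlo check is included below.

```python
import numpy as np

def linear_comb_exp_pdf(t, coeffs):
    """Density of sum_i c_i * X_i with X_i i.i.d. standard exponential and
    distinct positive coefficients c_i, via partial fractions.

    Equivalent to a hypoexponential density with rates r_i = 1 / c_i:
        f(t) = sum_i [prod_{j != i} r_j / (r_j - r_i)] * r_i * exp(-r_i * t)
    """
    t = np.asarray(t, dtype=float)
    r = 1.0 / np.asarray(coeffs, dtype=float)
    f = np.zeros_like(t)
    for i, ri in enumerate(r):
        weight = np.prod([rj / (rj - ri) for j, rj in enumerate(r) if j != i])
        f += weight * ri * np.exp(-ri * t)
    return f

# Monte Carlo check against the closed form.
rng = np.random.default_rng(0)
coeffs = [1.0, 2.0, 3.5]
sample = rng.exponential(size=(200_000, 3)) @ np.array(coeffs)
t = np.array([2.0, 5.0, 10.0])
print("closed form :", linear_comb_exp_pdf(t, coeffs))
print("Monte Carlo :", [np.mean(np.abs(sample - v) < 0.05) / 0.1 for v in t])
```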

14.
The likelihood function of a general nonlinear, non-Gaussian state space model is a high-dimensional integral with no closed-form solution. In this article, I show how to calculate the likelihood function exactly for a large class of non-Gaussian state space models that include stochastic intensity, stochastic volatility, and stochastic duration models among others. The state variables in this class follow a nonnegative stochastic process that is popular in econometrics for modeling volatility and intensities. In addition to calculating the likelihood, I also show how to perform filtering and smoothing to estimate the latent variables in the model. The procedures in this article can be used for either Bayesian or frequentist estimation of the model’s unknown parameters as well as the latent state variables. Supplementary materials for this article are available online.

15.
Summary. Many contemporary classifiers are constructed to provide good performance for very high-dimensional data. However, an issue that is at least as important as good classification is determining which of the many potential variables provide key information for good decisions. Responding to this issue can help us to determine which aspects of the data-generating mechanism (e.g. which genes in a genomic study) are of greatest importance in terms of distinguishing between populations. We introduce tilting methods for addressing this problem. We apply weights to the components of data vectors, rather than to the data vectors themselves (as is commonly the case in related work). In addition we tilt in a way that is governed by the L2-distance between weight vectors, rather than by the more commonly used Kullback–Leibler distance. It is shown that this approach, together with the added constraint that the weights should be non-negative, produces an algorithm which eliminates vector components that have little influence on the classification decision. In particular, use of the L2-distance in this problem produces properties that are reminiscent of those that arise when L1-penalties are employed to eliminate explanatory variables in very high-dimensional prediction problems, e.g. those involving the lasso. We introduce techniques that can be implemented very rapidly, and we show how to use bootstrap methods to assess the accuracy of our variable ranking and variable elimination procedures.

16.
Let us denote by (n,k,d)-code a binary linear code with code length n, k information symbols, and minimum distance d. It is well known that the problem of obtaining a binary linear code whose code length n is minimum among (n,k,d)-codes for given integers k and d is equivalent to solving a linear programming problem whose solutions correspond to a minimum-redundancy error-correcting code. In this paper it will be shown that for some given integers d, the linear programming problem has no solution except one obtained using a flat in a finite projective geometry.

17.
张波  刘晓倩 《统计研究》2019,36(4):119-128
This paper studies sparse principal component analysis with a fused penalty, suited to data in which adjacent variables are highly correlated or even identical. First, starting from a regression perspective, a simple route to sparse principal components is proposed, yielding a generalized sparse principal component model, GSPCA, together with a solution algorithm; it is proved that when the penalty function is the 1-norm, the model gives the same solutions as the existing sparse principal component model, SPC. Second, the fused penalty is combined with principal component analysis to obtain a fused sparse principal component method, formulated in two ways, from the penalized matrix decomposition and regression perspectives. The two formulations are proved to have identical solutions and are therefore referred to jointly as the FSPCA model. Simulations show that FSPCA performs well on data sets in which adjacent variables are highly correlated or even identical. Finally, FSPCA is applied to handwritten digit recognition; compared with the SPC model, the principal components extracted by FSPCA are more interpretable, which makes the model more useful in practice.

18.
In contrast to the common belief that the logit model has no analytical solution, it is possible to find such a solution in the case of categorical predictors. This paper shows that a binary logistic regression on categorical explanatory variables can be constructed in closed form. No special software and no iterative procedures of nonlinear estimation are needed to obtain a model with all its parameters and characteristics, including coefficients of regression, their standard errors and t-statistics, as well as the residual and null deviances. The derivation is performed for logistic models with one binary or categorical predictor, and with several binary or categorical predictors. The analytical formulae can be used for arithmetical calculation of all the parameters of the logit regression. The explicit expressions for the characteristics of logit regression are convenient for the analysis and interpretation of the results of logistic modeling.
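The simplest instance of such a closed form is a logit with one binary predictor: the intercept is the log odds of the response at x = 0, the slope is the log odds ratio of the 2×2 table, and the Wald standard error of the slope is Woolf's formula. The sketch below verifies this against an iterative fit from statsmodels.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.binomial(1, 0.4, size=500)                   # one binary predictor
y = rng.binomial(1, np.where(x == 1, 0.7, 0.3))      # binary response

# Closed form from the 2x2 contingency table: intercept = log odds of y=1
# at x=0, slope = log odds ratio.
n = np.array([[np.sum((x == i) & (y == j)) for j in (0, 1)] for i in (0, 1)])
b0 = np.log(n[0, 1] / n[0, 0])
b1 = np.log(n[1, 1] * n[0, 0] / (n[1, 0] * n[0, 1]))
se_b1 = np.sqrt((1.0 / n).sum())                     # Woolf's standard error

# The iterative fit reproduces the analytical values.
fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
print("closed form:", b0, b1, se_b1)
print("iterative  :", fit.params, fit.bse[1])
```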

19.
The use of large-dimensional factor models in forecasting has received much attention in the literature, with the consensus being that improvements over standard models can be achieved. However, recent contributions have demonstrated that care needs to be taken when choosing which variables to include in the model. A number of different approaches to determining these variables have been put forward. These are, however, often based on ad hoc procedures or abandon the underlying theoretical factor model. In this article, we take a different approach to the problem by using the least absolute shrinkage and selection operator (LASSO) as a variable selection method to choose between the possible variables and thus obtain sparse loadings from which factors or diffusion indexes can be formed. This allows us to build a more parsimonious factor model that is better suited for forecasting compared to the traditional principal components (PC) approach. We provide an asymptotic analysis of the estimator and illustrate its merits empirically in a forecasting experiment based on U.S. macroeconomic data. Overall we find that, compared to PC, we obtain improvements in forecasting accuracy and thus find it to be an important alternative to PC. Supplementary materials for this article are available online.
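A schematic two-step version of the idea, with synthetic data standing in for the macro panel: LASSO selects the series, and principal components of the selected series form the diffusion indexes used to forecast. This illustrates the pipeline only, not the authors' exact estimator or asymptotic setup.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic panel: two latent factors drive a subset of the series,
# and the target loads on those factors.
T, N = 200, 80
factors = rng.normal(size=(T, 2))
loadings = np.zeros((2, N))
loadings[:, :20] = rng.normal(size=(2, 20))          # only 20 informative series
X = factors @ loadings + rng.normal(size=(T, N))
y = factors @ np.array([1.0, -0.5]) + 0.3 * rng.normal(size=T)

Xs = StandardScaler().fit_transform(X)

# Step 1: LASSO selects the series relevant for the one-step-ahead target.
lasso = LassoCV(cv=5, random_state=0).fit(Xs[:-1], y[1:])
selected = np.flatnonzero(lasso.coef_)

# Step 2: principal components of the selected series act as sparse-loading
# diffusion indexes; regress the target on them to forecast.
pc = PCA(n_components=2).fit_transform(Xs[:, selected])
beta, *_ = np.linalg.lstsq(np.column_stack([np.ones(T - 1), pc[:-1]]), y[1:],
                           rcond=None)
print(f"{selected.size} series selected; forecast coefficients: {beta}")
```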

20.
In pattern classification of sampled vector-valued random variables it is often essential, due to computational and accuracy considerations, to consider certain measurable transformations of the random variable. These transformations are generally of a dimension-reducing nature. In this paper we consider the class of linear dimension-reducing transformations, i.e., the k × n matrices of rank k where k < n and n is the dimension of the range of the sampled vector random variable.

In this connection, we use certain results (Decell and Quirein, 1973) that guarantee, relative to various class separability criteria, the existence of an extremal transformation. These results also guarantee that the extremal transformation can be expressed in the form (Ik | Z)U, where Ik is the k × k identity matrix and U is an orthogonal n × n matrix. These results actually limit the search for the extremal linear transformation to a search over the obviously smaller class of k × n matrices of the form (Ik | Z)U. In this paper these results are refined in the sense that any extremal transformation can be expressed in the form (Ik | Z)Hp … H1, where p ≤ min{k, n−k} and Hi is a Householder transformation, i = 1, …, p. The latter result allows one to construct a sequence of transformations (Ik | Z)H1, (Ik | Z)H2H1, … such that the values of the class separability criterion evaluated at this sequence form a bounded, monotone sequence of real numbers. The construction of the i-th element of the sequence of transformations requires the solution of an n-dimensional optimization problem. The solution, for various class separability criteria, of the optimization problem will be the subject of later papers. We have conjectured (with supporting theorems and empirical results) that, since the bounded monotone sequence of real class separability values converges to its least upper bound, this least upper bound is an extremal value of the class separability criterion.

Several open questions are stated and the practical implications of the results are discussed.
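The parametrized form is easy to generate numerically. The sketch below builds a transformation of the form (Ik | Z)H1 from one Householder factor and evaluates a classical trace-ratio separability criterion on synthetic two-class scatter matrices; the criterion and data are illustrative assumptions, and the optimization over the Householder vector, which is the paper's subject, is omitted.

```python
import numpy as np

def householder(v):
    # Householder reflection H = I - 2 v v^T / (v^T v); H is orthogonal.
    v = v / np.linalg.norm(v)
    return np.eye(len(v)) - 2.0 * np.outer(v, v)

def trace_separability(B, Sw, Sb):
    # A classical class separability criterion: tr((B Sw B^T)^{-1} B Sb B^T).
    W = B @ Sw @ B.T
    return np.trace(np.linalg.solve(W, B @ Sb @ B.T))

rng = np.random.default_rng(0)
n, k = 6, 2

# Within- and between-class scatter matrices for synthetic two-class data.
X0 = rng.normal(size=(40, n))
X1 = rng.normal(size=(40, n)) + np.array([1.5, 0, 0, 0, 0, 0])
Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
d = (X0.mean(0) - X1.mean(0))[:, None]
Sb = d @ d.T

# A transformation of the form (Ik | Z) H1 with one Householder factor.
Z = rng.normal(size=(k, n - k))
H1 = householder(rng.normal(size=n))
B = np.hstack([np.eye(k), Z]) @ H1
print("separability of (Ik | Z)H1:", trace_separability(B, Sw, Sb))
```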
