Similar articles
Found 20 similar articles (search time: 687 ms)
1.
We consider the problem of estimating hybrid frequency moments of two-dimensional data streams. In this model, data is viewed as organized in matrix form \((A_{i,j})_{1\le i,j\le n}\). The entries \(A_{i,j}\) are updated coordinate-wise, in arbitrary order and possibly multiple times. The updates include both increments and decrements to the current value of \(A_{i,j}\). The hybrid frequency moment \(F_{p,q}(A)\) is defined as \(\sum_{j=1}^{n}(\sum_{i=1}^{n}{A_{i,j}}^{p})^{q}\) and is a generalization of the frequency moment of one-dimensional data streams. We present the first \(\tilde{O}(1)\) space algorithm for the problem of estimating \(F_{p,q}\) for p∈[0,2] and q∈[0,1] to within an approximation factor of 1±ε. The \(\tilde{O}\) notation hides poly-logarithmic factors in the size of the stream m, the matrix size n and polynomial factors of \(\varepsilon^{-1}\). We also present the first \(\tilde{O}(n^{1-1/q})\) space algorithm for estimating \(F_{p,q}\) for p∈[0,2] and q∈(1,2].
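
As a small illustration of the definition above, the following sketch evaluates the hybrid frequency moment exactly for a matrix held in memory. It is not the small-space streaming algorithm of the paper. The name `hybrid_moment` is hypothetical, and taking absolute values of the entries is an assumption made here so that negative entries (which decrements can produce) are handled; the case p=0 would need its own convention and is not covered.

```python
def hybrid_moment(matrix, p, q):
    """Exact F_{p,q}(A) = sum_j (sum_i |A[i][j]|**p)**q for a small dense matrix.

    Assumption: absolute values are used so negative entries are well defined.
    """
    n_cols = len(matrix[0])
    total = 0.0
    for j in range(n_cols):
        # Inner sum runs over the rows i of column j.
        column_sum = sum(abs(row[j]) ** p for row in matrix)
        total += column_sum ** q
    return total


example = [[1, 2],
           [3, 4]]
f_2_1 = hybrid_moment(example, p=2, q=1)  # (1 + 9) + (4 + 16)
f_1_2 = hybrid_moment(example, p=1, q=2)  # (1 + 3)**2 + (2 + 4)**2
```

With q=1 the value reduces to the ordinary p-th frequency moment of the matrix entries, which is the sense in which the hybrid moment generalizes the one-dimensional case.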

2.
We study algorithms for clustering data that were recently proposed by Balcan et al. (SODA’09: 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1068–1077, 2009a) and that have already given rise to several follow-up papers. The input for the clustering problem consists of points in a metric space and a number k, specifying the desired number of clusters. The algorithms find a clustering that is provably close to a target clustering, provided that the instance has the “(1+α,ε)-property”, which means that the instance is such that all solutions to the k-median problem for which the objective value is at most (1+α) times the optimal objective value correspond to clusterings that misclassify at most an ε fraction of the points with respect to the target clustering. We investigate the theoretical and practical implications of their results. Our main contributions are as follows. First, we show that instances that have the (1+α,ε)-property and for which, additionally, the clusters in the target clustering are large, are easier than general instances: the algorithm proposed in Balcan et al. (SODA’09: 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1068–1077, 2009a) is a constant factor approximation algorithm with an approximation guarantee that is better than the known hardness of approximation for general instances. Further, we show that it is NP-hard to check if an instance satisfies the (1+α,ε)-property for a given (α,ε); the algorithms in Balcan et al. (SODA’09: 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1068–1077, 2009a) need such α and ε as input parameters, however. We propose ways to use their algorithms even if we do not know values of α and ε for which the assumption holds. Finally, we implement these methods and other popular methods, and test them on real world data sets. 
We find that on these data sets there are no α and ε such that the data set has both the (1+α,ε)-property and sufficiently large clusters in the target solution. For the general case where there are no assumptions about the cluster sizes, we show that on our data sets the performance guarantee proved by Balcan et al. (SODA’09: 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1068–1077, 2009a) is meaningless for the values of α,ε for which the data set has the (1+α,ε)-property. The algorithm nonetheless gives reasonable results, although it is outperformed by other methods.

3.
This paper presents three sets of results about equilibrium bias of technology. First, I show that when the menu of technological possibilities only allows for factor‐augmenting technologies, the increase in the supply of a factor induces technological change relatively biased toward that factor, meaning that the induced technological change increases the relative marginal product of the factor becoming more abundant. Moreover, this induced bias can be strong enough to make the relative marginal product of a factor increase in response to an increase in its supply, thus leading to an upward‐sloping relative demand curve. I also show that these results about relative bias do not generalize when more general menus of technological possibilities are considered. Second, I prove that under mild assumptions, the increase in the supply of a factor induces technological change that is absolutely biased toward that factor, meaning that it increases its marginal product at given factor proportions. The third and most important result in the paper establishes the possibility of and conditions for strong absolute equilibrium bias, whereby the price (marginal product) of a factor increases in response to an increase in its supply. I prove that, under some regularity conditions, there will be strong absolute equilibrium bias if and only if the aggregate production function of the economy fails to be jointly concave in factors and technology. This type of failure of joint concavity is possible in economies where equilibrium factor demands and technologies result from the decisions of different agents.

4.
The largest well predicted subset problem is formulated for comparison of two predicted 3D protein structures from the same sequence. A 3D protein structure is represented by an ordered point set $A=\{a_1,\ldots,a_n\}$, where each $a_i$ is a point in 3D space. Given two ordered point sets $A=\{a_1,\ldots,a_n\}$ and $B=\{b_1,b_2,\ldots,b_n\}$ containing n points, and a threshold d, the largest well predicted subset problem is to find the rigid body transformation T for a largest subset $B_{opt}$ of B such that the distance between $a_i$ and $T(b_i)$ is at most d for every $b_i$ in $B_{opt}$. A meaningful prediction requires that the size of $B_{opt}$ is at least αn for some constant α (Li et al. in CPM 2008, 2008). We use LWPS(A,B,d,α) to denote the largest well predicted subset problem with meaningful prediction. A $(1+\delta_1,1-\delta_2)$-approximation for LWPS(A,B,d,α) is to find a transformation T and a subset $B'\subseteq B$ of size at least $(1-\delta_2)|B_{opt}|$ such that for each $b_i\in B'$, the Euclidean distance satisfies $\mathrm{distance}(a_i,T(b_i))\le(1+\delta_1)d$. We develop a constant time $(1+\delta_1,1-\delta_2)$-approximation algorithm for LWPS(A,B,d,α) for arbitrary positive constants $\delta_1$ and $\delta_2$. To our knowledge, this is the first constant time algorithm in this area. Li et al. (CPM 2008, 2008) showed an $O(n(\log n)^{2}/\delta_{1}^{5})$ time randomized $(1+\delta_1)$-distance approximation algorithm for the largest well predicted subset problem under meaningful prediction. We also study a closely related problem, the bottleneck distance problem, where we are given two ordered point sets $A=\{a_1,\ldots,a_n\}$ and $B=\{b_1,b_2,\ldots,b_n\}$ containing n points and the problem is to find the smallest $d_{opt}$ such that there exists a rigid transformation T with $\mathrm{distance}(a_i,T(b_i))\le d_{opt}$ for every point $b_i\in B$. A (1+δ)-approximation for the bottleneck distance problem is to find a transformation T such that for each $b_i\in B$, $\mathrm{distance}(a_i,T(b_i))\le(1+\delta)d_{opt}$, where δ is a constant. For an arbitrary constant δ, we obtain a linear $O(n/\delta^{6})$ time (1+δ)-approximation algorithm for the bottleneck distance problem. The best known algorithms for both problems require super-linear time (Li et al. in CPM 2008, 2008).

5.
Traditionally in combinatorics on words one studies unavoidable regularities that appear in sufficiently long strings over a fixed size alphabet. Inspired by permutation problems originating from information security, another viewpoint is taken in this paper. We focus on combinatorial properties of long words in which the number of occurrences of any symbol is restricted by a fixed given constant. More precisely, we show that for all positive integers m and q there exists the least positive integer N(m,q) which is smaller than $m^{2^{q-1}}$ and satisfies the following: If α is a word such that
  1. $|\mathrm{alph}(\alpha)|\ge N(m,q)$ (i.e., the cardinality of the alphabet of α is at least N(m,q)); and
  2. $|\alpha|_a\le q$ for each $a\in\mathrm{alph}(\alpha)$ (i.e., the number of occurrences of any symbol of alph(α) in α is at most q),
then there exist a set $A\subseteq\mathrm{alph}(\alpha)$ of cardinality |A|=m, an integer p∈{1,2,…,q}, and permutations $\sigma_1,\sigma_2,\ldots,\sigma_p:\{1,2,\ldots,m\}\to\{1,2,\ldots,m\}$ for which $$\pi_A(\alpha)\in a_{\sigma_1(1)}^+\cdots a_{\sigma_1(m)}^+a_{\sigma_2(1)}^+\cdots a_{\sigma_2(m)}^+\cdots a_{\sigma_p(1)}^+\cdots a_{\sigma_p(m)}^+ .$$ Here $A=\{a_1,a_2,\ldots,a_m\}$ and $\pi_A$ is the projection morphism from $\mathrm{alph}(\alpha)^*$ into $A^*$. The second part of the paper considers information security. We give an introduction to (generalized iterated) hash functions and their security properties; finally we demonstrate how our combinatorial results are connected to constructing multicollision attacks on these functions.

6.
The maximum quasi-biclique problem has been proposed for finding interacting protein group pairs from large protein-protein interaction (PPI) networks. The problem is defined as follows: The Maximum Quasi-biclique Problem: Given a bipartite graph $G=(X\cup Y,E)$ and a number 0<δ≤0.5, find a subset $X_{opt}$ of X and a subset $Y_{opt}$ of Y such that any vertex $x\in X_{opt}$ is incident to at least $(1-\delta)|Y_{opt}|$ vertices in $Y_{opt}$, any vertex $y\in Y_{opt}$ is incident to at least $(1-\delta)|X_{opt}|$ vertices in $X_{opt}$ and $|X_{opt}|+|Y_{opt}|$ is maximized. The problem was proved to be NP-hard. We design a polynomial time approximation scheme to give a quasi-biclique $X'\subseteq X$ and $Y'\subseteq Y$ with $|X'|+|Y'|\ge(1-\varepsilon)(|X_{opt}|+|Y_{opt}|)$ such that any vertex $x\in X'$ is incident to at least $(1-\delta-\varepsilon)|Y'|$ vertices in $Y'$ and any vertex $y\in Y'$ is incident to at least $(1-\delta-\varepsilon)|X'|$ vertices in $X'$ for any ε>0, where $X_{opt}$ and $Y_{opt}$ form the optimal solution.
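
The degree condition in this definition can be checked directly for a candidate pair of subsets. The sketch below only verifies the condition and does not search for a maximum quasi-biclique or implement the approximation scheme. The function name and the edge-list representation (pairs `(x, y)` with x on the X side) are assumptions made for illustration.

```python
def is_quasi_biclique(edges, xs, ys, delta):
    """Return True if (xs, ys) satisfies the delta-quasi-biclique degree condition.

    edges: iterable of (x, y) pairs of the bipartite graph (hypothetical format).
    """
    edge_set = set(edges)
    need_in_ys = (1 - delta) * len(ys)  # required neighbours of each x inside ys
    need_in_xs = (1 - delta) * len(xs)  # required neighbours of each y inside xs
    for x in xs:
        if sum((x, y) in edge_set for y in ys) < need_in_ys:
            return False
    for y in ys:
        if sum((x, y) in edge_set for x in xs) < need_in_xs:
            return False
    return True


edges = [(0, "a"), (0, "b"), (1, "a"), (1, "b"), (2, "a")]
loose = is_quasi_biclique(edges, [0, 1, 2], ["a", "b"], delta=0.5)
strict = is_quasi_biclique(edges, [0, 1, 2], ["a", "b"], delta=0.25)
```

In the example, vertex 2 sees only one of the two Y vertices, so the pair qualifies for δ=0.5 but not for δ=0.25.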

7.
This paper analyzes the complexity of the contraction fixed point problem: compute an ε‐approximation to the fixed point V* = Γ(V*) of a contraction mapping Γ that maps a Banach space B_d of continuous functions of d variables into itself. We focus on quasi linear contractions where Γ is a nonlinear functional of a finite number of conditional expectation operators. This class includes contractive Fredholm integral equations that arise in asset pricing applications and the contractive Bellman equation from dynamic programming. In the absence of further restrictions on the domain of Γ, the quasi linear fixed point problem is subject to the curse of dimensionality, i.e., in the worst case the minimal number of function evaluations and arithmetic operations required to compute an ε‐approximation to a fixed point V*∈B_d increases exponentially in d. We show that the curse of dimensionality disappears if the domain of Γ has additional special structure. We identify a particular type of special structure for which the problem is strongly tractable even in the worst case, i.e., the number of function evaluations and arithmetic operations needed to compute an ε‐approximation of V* is bounded by Cε^{−p}, where C and p are constants independent of d. We present examples of economic problems that have this type of special structure including a class of rational expectations asset pricing problems for which the optimal exponent p = 1 is nearly achieved.

8.
We are given a digraph D=(V,A;w), a length (delay) function $w:A\to R^{+}$, a positive integer d and a set $\mathcal{P}=\{(s_{i},t_{i};B_{i}) | i=1,2,\ldots,k\}$ of k requests, where $s_i\in V$ is called the ith source node, $t_i\in V$ is called the ith sink node and $B_i$ is called the ith length constraint. For a given positive integer d, the subdivision-constrained routing requests problem (SCRR, for short) is to find a directed subgraph D′=(V′,A′) of D, satisfying the two constraints: (1) Each request $(s_i,t_i;B_i)$ has a path $P_i$ from $s_i$ to $t_i$ in D′ with length $w(P_{i})=\sum_{e\in P_{i}} w(e)$ no more than $B_i$; (2) Insert some nodes uniformly on each arc e∈A′ to ensure that each new arc has length no more than d. The objective is to minimize the total number of the nodes inserted on the arcs in A′. We obtain the following three main results: (1) The SCRR problem is at least as hard as the set cover problem even if each request has the same source s, i.e., $s_i=s$ for each i=1,2,…,k; (2) For each request (s,t;B), we design a dynamic programming algorithm to find a path from s to t with length no more than B such that the number of the nodes inserted on such a path is minimized, and as a corollary, we present a k-approximation algorithm to solve the SCRR problem for any k requests; (3) We finally present an optimal algorithm for the case where $\mathcal{P}$ contains all possible requests $(s_i,t_i)$ in V×V and $B_i$ is equal to the length of the shortest path in D from $s_i$ to $t_i$. To the best of our knowledge, this is the first time that the dynamic programming algorithm within polynomial time in (2) is designed for a weighted optimization problem while previous optimal algorithms run in pseudo-polynomial time.
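
Constraint (2) can be made concrete with a small calculation. The sketch below assumes that the fewest nodes needed to split an arc of length w uniformly into pieces of length at most d is ⌈w/d⌉ − 1; this reading of the constraint, and the names `inserted_nodes` and `path_cost`, are assumptions made here and are not taken from the paper. It is not the dynamic programming algorithm of result (2).

```python
import math


def inserted_nodes(length, d):
    """Fewest nodes that cut an arc into equal pieces of length <= d (assumed reading)."""
    return max(math.ceil(length / d) - 1, 0)


def path_cost(arc_lengths, d):
    """Total number of inserted nodes along a path given its arc lengths."""
    return sum(inserted_nodes(w, d) for w in arc_lengths)


# An arc of length 10 with d = 3 needs 4 pieces of length 2.5, hence 3 nodes.
cost = path_cost([10, 6, 2], d=3)
```

Under this reading the objective of SCRR is the sum of `inserted_nodes` over the arcs of the chosen subgraph, which is why a short path is not necessarily a cheap one.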

9.
The induced path number ρ(G) of a graph G is defined as the minimum number of subsets into which the vertex set of G can be partitioned so that each subset induces a path. A Nordhaus-Gaddum-type result is a (tight) lower or upper bound on the sum (or product) of a parameter of a graph and its complement. If G is a subgraph of H, then the graph H−E(G) is the complement of G relative to H. In this paper, we consider Nordhaus-Gaddum-type results for the parameter ρ when the relative complement is taken with respect to the complete bipartite graph $K_{n,n}$.

10.
In this paper we study several geometric problems of color-spanning sets: given n points with m colors in the plane, select m points with m distinct colors such that some geometric property of the m selected points is minimized or maximized. The geometric properties studied in this paper are the maximum diameter, the largest closest pair, the planar smallest minimum spanning tree, the planar largest minimum spanning tree and the planar smallest perimeter convex hull. We propose an $O(n^{1+\varepsilon})$ time algorithm for the maximum diameter color-spanning set problem where ε could be an arbitrarily small positive constant. Then, we present hardness proofs for the other problems and propose two efficient constant factor approximation algorithms for the planar smallest perimeter color-spanning convex hull problem.

11.
In telecommunication network design the problem of obtaining optimal (arc or node) disjoint paths, for increasing network reliability, is extremely important. The problem of calculating $k_c$ disjoint paths from s to t (two distinct nodes), in a network with $k_c$ different (arbitrary) costs on every arc such that the total cost of the paths is minimised, is NP-complete even for $k_c=2$. When $k_c=2$ these networks are usually designated as dual arc cost networks.

12.
Given a directed graph G=(N,A) with arc capacities $u_{ij}$ and a minimum cost flow problem defined on G, the capacity inverse minimum cost flow problem is to find a new capacity vector $\hat{u}$ for the arc set A such that a given feasible flow $\hat{x}$ is optimal with respect to the modified capacities. Among all capacity vectors $\hat{u}$ satisfying this condition, we would like to find one with minimum $\|\hat{u}-u\|$ value. We consider two distance measures for $\|\hat{u}-u\|$, rectilinear ($L_1$) and Chebyshev ($L_\infty$) distances. By reduction from the feedback arc set problem we show that the capacity inverse minimum cost flow problem is $\mathcal{NP}$-hard in the rectilinear case. On the other hand, it is polynomially solvable by a greedy algorithm for the Chebyshev norm. In the latter case we propose a heuristic for the bicriteria problem, where we minimize among all optimal solutions the number of affected arcs. We also present computational results for this heuristic.

13.
In the minimum weighted dominating set problem (MWDS), we are given a unit disk graph with non-negative weight on each vertex. The MWDS seeks a subset of the vertices of the graph with minimum total weight such that each vertex of the graph is either in the subset or adjacent to some node in the subset. A weight function is called smooth if the ratio of the weights of any two adjacent nodes is upper bounded by a constant. MWDS is known to be NP-hard. In this paper, we give the first polynomial time approximation scheme (PTAS) for MWDS with smooth weights on unit disk graphs, which achieves a (1+ε)-approximation for MWDS, for any ε>0.
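
To make the objective concrete, the sketch below checks whether a vertex subset is dominating and computes its total weight on a graph given as an adjacency dictionary. It does not model the unit disk geometry and is not the PTAS of the paper; the function names and the input format are assumptions made for illustration.

```python
def is_dominating(adj, chosen):
    """True if every vertex is in `chosen` or adjacent to a vertex in `chosen`."""
    chosen = set(chosen)
    return all(v in chosen or bool(chosen & set(adj[v])) for v in adj)


def total_weight(weights, chosen):
    """Sum of the vertex weights over the chosen subset."""
    return sum(weights[v] for v in chosen)


# Path a - b - c with a heavy middle vertex (hypothetical data).
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
weights = {"a": 1, "b": 5, "c": 1}
middle_ok = is_dominating(adj, ["b"])
ends_ok = is_dominating(adj, ["a", "c"])
```

Here both `{b}` and `{a, c}` dominate the path, but the second has weight 2 against 5, which shows how the weighted problem differs from minimizing the number of chosen vertices.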

14.
Let p and q be positive integers. An L(p,q)-labeling of a graph G with a span s is a labeling of its vertices by integers between 0 and s such that adjacent vertices of G are labeled using colors at least p apart, and vertices having a common neighbor are labeled using colors at least q apart. We denote by $\lambda_{p,q}(G)$ the least integer k such that G has an L(p,q)-labeling with span k. The maximum average degree of a graph G, denoted by $\operatorname{Mad}(G)$, is the maximum among the average degrees of its subgraphs (i.e. $\operatorname{Mad}(G) = \max\{\frac{2|E(H)|}{|V(H)|} ; H \subseteq G \}$). We consider graphs G with $\operatorname{Mad}(G) < \frac{10}{3}$, 3 and $\frac{14}{5}$. These sets of graphs contain planar graphs with girth 5, 6 and 7 respectively. We prove in this paper that every graph G with maximum average degree m and maximum degree Δ has:
  • $\lambda_{p,q}(G)\le(2q-1)\Delta+6p+10q-8$ if $m < \frac{10}{3}$ and p≥2q.
  • $\lambda_{p,q}(G)\le(2q-1)\Delta+4p+14q-9$ if $m < \frac{10}{3}$ and 2q>p.
  • $\lambda_{p,q}(G)\le(2q-1)\Delta+4p+6q-5$ if m<3.
  • $\lambda_{p,q}(G)\le(2q-1)\Delta+4p+4q-4$ if $m < \frac{14}{5}$.
We also give some refined bounds for specific values of p, q, or Δ. Along the way, we improve results of Lih and Wang (SIAM J. Discrete Math. 17(2):264–275, 2003).
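
The labeling conditions defined at the start of this abstract can be checked mechanically. The following sketch verifies a given labeling and reports its span; it does not compute the least span and is unrelated to the proofs of the bounds above. The adjacency-dictionary format and the function names are assumptions made for illustration.

```python
def is_lpq_labeling(adj, label, p, q):
    """Check the L(p,q) conditions for non-negative integer labels on an undirected graph."""
    for u in adj:
        for v in adj[u]:
            # Adjacent vertices must differ by at least p.
            if abs(label[u] - label[v]) < p:
                return False
            for w in adj[v]:
                # u and w share the neighbour v, so they must differ by at least q.
                if w != u and abs(label[u] - label[w]) < q:
                    return False
    return True


def span(label):
    """Largest label used, assuming labels start at 0."""
    return max(label.values())


path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
good = {"a": 0, "b": 3, "c": 1}
bad = {"a": 0, "b": 2, "c": 0}
```

For the three-vertex path with p=2 and q=1, `good` is a valid labeling of span 3, while `bad` fails because the two end vertices share a neighbor and carry the same label.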

15.
Given real numbers b≥a>0, an (a,b)-Roman dominating function of a graph G=(V,E) is a function f:V→{0,a,b} such that every vertex v with f(v)=0 has a neighbor u with f(u)=b. An independent/connected/total (a,b)-Roman dominating function is an (a,b)-Roman dominating function f such that {v∈V: f(v)≠0} induces a subgraph without edges/that is connected/without isolated vertices. For a weight function $w{:} V\to\Bbb{R}$, the weight of f is $w(f)=\sum_{v\in V} w(v)f(v)$. The weighted (a,b)-Roman domination number $\gamma^{(a,b)}_{R}(G,w)$ is the minimum weight of an (a,b)-Roman dominating function of G. Similarly, we can define the weighted independent (a,b)-Roman domination number $\gamma^{(a,b)}_{Ri}(G,w)$. In this paper, we first prove that for any fixed (a,b) the (a,b)-Roman domination and the total/connected/independent (a,b)-Roman domination problems are NP-complete for bipartite graphs. We also show that for any fixed (a,b) the (a,b)-Roman domination and the total/connected/weighted independent (a,b)-Roman domination problems are NP-complete for chordal graphs. We then give linear-time algorithms for the weighted (a,b)-Roman domination problem with b≥a>0, and the weighted independent (a,b)-Roman domination problem with 2a≥b≥a>0 on strongly chordal graphs with a strong elimination ordering provided.
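
The definition of an (a,b)-Roman dominating function and its weight can be written out as a short check. This sketch only verifies a given function f; it is not one of the linear-time algorithms mentioned above. The function names and the dictionary-based graph format are assumptions made for illustration.

```python
def is_roman_dominating(adj, f, a, b):
    """True if f maps into {0, a, b} and every vertex with f(v) = 0 has a neighbour with value b."""
    if any(f[v] not in (0, a, b) for v in adj):
        return False
    return all(f[v] != 0 or any(f[u] == b for u in adj[v]) for v in adj)


def roman_weight(f, w):
    """w(f) = sum over vertices of w(v) * f(v)."""
    return sum(w[v] * f[v] for v in f)


# Star with centre c and leaves x, y; unit vertex weights (hypothetical data).
star = {"c": ["x", "y"], "x": ["c"], "y": ["c"]}
unit = {"c": 1, "x": 1, "y": 1}
centre_b = {"c": 2, "x": 0, "y": 0}
centre_a = {"c": 1, "x": 0, "y": 0}
```

With a=1 and b=2, placing b on the centre protects both leaves at weight 2, whereas placing only a on the centre leaves them unprotected.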

16.
This paper analyzes the linear regression model y = xβ+ε with a conditional median assumption med(ε | z) = 0, where z is a vector of exogenous instrument random variables. We study inference on the parameter β when y is censored and x is endogenous. We treat the censored model as a model with interval observation on an outcome, thus obtaining an incomplete model with inequality restrictions on conditional median regressions. We analyze the identified features of the model and provide sufficient conditions for point identification of the parameter β. We use a minimum distance estimator to consistently estimate the identified features of the model. We show that under point identification conditions and additional regularity conditions, the estimator based on inequality restrictions is normal and we derive its asymptotic variance. One can use our setup to treat the identification and estimation of endogenous linear median regression models with no censoring. A Monte Carlo analysis illustrates our estimator in the censored and the uncensored case.

17.
In this paper we study identification and estimation of a correlated random coefficients (CRC) panel data model. The outcome of interest varies linearly with a vector of endogenous regressors. The coefficients on these regressors are heterogeneous across units and may covary with them. We consider the average partial effect (APE) of a small change in the regressor vector on the outcome (cf. Chamberlain (1984), Wooldridge (2005a)). Chamberlain (1992) calculated the semiparametric efficiency bound for the APE in our model and proposed a √N‐consistent estimator. Nonsingularity of the APE's information bound, and hence the appropriateness of Chamberlain's (1992) estimator, requires (i) the time dimension of the panel (T) to strictly exceed the number of random coefficients (p) and (ii) strong conditions on the time series properties of the regressor vector. We demonstrate irregular identification of the APE when T = p and for more persistent regressor processes. Our approach exploits the different identifying content of the subpopulations of stayers (units whose regressor values change little across periods) and movers (units whose regressor values change substantially across periods). We propose a feasible estimator based on our identification result and characterize its large sample properties. While irregularity precludes our estimator from attaining parametric rates of convergence, its limiting distribution is normal and inference is straightforward to conduct. Standard software may be used to compute point estimates and standard errors. We use our methods to estimate the average elasticity of calorie consumption with respect to total outlay for a sample of poor Nicaraguan households.

18.
Let j, k and m be positive numbers. A circular m-L(j,k)-labeling of a graph G is a function f:V(G)→[0,m) such that $|f(u)-f(v)|_m\ge j$ if u and v are adjacent, and $|f(u)-f(v)|_m\ge k$ if u and v are at distance two, where $|a-b|_m=\min\{|a-b|,m-|a-b|\}$. The minimum m such that there exists a circular m-L(j,k)-labeling of G is called the circular L(j,k)-labeling number of G and is denoted by $\sigma_{j,k}(G)$. In this paper, for any two positive numbers j and k with j≤k, we give some results about the circular L(j,k)-labeling number of the direct product of a path and a cycle.
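
The circular distance used in this definition, the minimum of the direct difference and the wrap-around difference modulo m, is easy to compute. The sketch below evaluates it and checks a candidate labeling on a small graph. The function names and the adjacency-dictionary format are assumptions made for illustration, and nothing here reproduces the paper's results on direct products.

```python
def circ_dist(x, y, m):
    """Circular distance min(|x - y|, m - |x - y|) on a circle of circumference m."""
    diff = abs(x - y)
    return min(diff, m - diff)


def is_circular_labeling(adj, f, m, j, k):
    """Check a circular m-L(j,k)-labeling of an undirected graph given as adjacency lists."""
    if any(not 0 <= f[v] < m for v in adj):
        return False
    for u in adj:
        for v in adj[u]:
            if circ_dist(f[u], f[v], m) < j:  # adjacent vertices
                return False
            for w in adj[v]:
                # w is at distance two from u: common neighbour v, not adjacent, not u itself
                if w != u and w not in adj[u] and circ_dist(f[u], f[w], m) < k:
                    return False
    return True


path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
labels = {"a": 0, "b": 2, "c": 4}
```

For the three-vertex path with j=2 and k=1, the labels 0, 2, 4 work on a circle of circumference 5, because the wrap-around distance between 0 and 4 is 1.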

19.
A linear extension of a poset P=(X,≺) is a permutation $x_1,x_2,\ldots,x_{|X|}$ of X such that i<j whenever $x_i\prec x_j$. For a given poset P=(X,≺) and a cost function c(x,y) defined on X×X, we want to find a linear extension of P such that the maximum cost is as small as possible. In the general case, this problem is NP-complete. In this paper we consider the linear extension problem with the assumption that c(x,y)=0 whenever x and y are incomparable. First, we prove that the problem is polynomially solvable for a special poset. Then we present a polynomial algorithm to obtain an approximate solution.
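
A linear extension is the same thing as a topological order of the poset's relation, which the sketch below uses to check and to produce one extension. It ignores the cost function entirely, so it is not the optimization algorithm of the paper. The function names and the representation of the order as a list of pairs `(x, y)` meaning x precedes y are assumptions made for illustration; `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def is_linear_extension(order, relations):
    """True if every pair (x, y) with x before y in the poset keeps that order in `order`."""
    pos = {x: i for i, x in enumerate(order)}
    return all(pos[x] < pos[y] for x, y in relations)


def some_linear_extension(elements, relations):
    """Return one linear extension via a topological sort (costs are not considered)."""
    preds = {x: set() for x in elements}
    for x, y in relations:
        preds[y].add(x)  # x must come before y
    return list(TopologicalSorter(preds).static_order())


# Diamond poset: a below b and c, both below d.
rel = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
ext = some_linear_extension("abcd", rel)
```

The diamond poset has exactly two linear extensions, a, b, c, d and a, c, b, d; the optimization problem in the abstract asks which extension minimizes the maximum cost.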

20.
Yrjö Seppälä. Omega, 1980, 8(1): 39-45
A relative value of a management information system (MIS) is defined in this paper by a ratio u1/u0, where u0 is a value of a utility function of an enterprise whose management information system is perfect, and u1 is its value when it is not perfect and may produce inaccurate or out-of-date data among correct information. Our simulation model contains heuristics which describe the operational and strategic information system of an enterprise. The environment of the enterprise may be stable or dynamic. A mathematical formula, based on simulations, is developed. This formula describes how the relative value of an MIS depends on such factors as the accuracy of an operational information system, delays in information flow, the quality of a strategic information system, a reinvestment ratio used in the enterprise, and a number of investment periods. This formula has been found suitable in an enterprise with a strategically stable environment, but not with a turbulent environment.
