首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
In this paper, we consider the problem of enumerating all maximal motifs in an input string for the class of repeated motifs with wild cards. A maximal motif is such a representative motif that is not properly contained in any larger motifs with the same location lists. Although the enumeration problem for maximal motifs with wild cards has been studied in Parida et al. (2001), Pisanti et al. (2003) and Pelfrêne et al. (2003), its output-polynomial time computability has been still open. The main result of this paper is a polynomial space polynomial delay algorithm for the maximal motif enumeration problem for the repeated motifs with wild cards. This algorithm enumerates all maximal motifs in an input string of length n in O(n 3) time per motif with O(n) space, in particular O(n 3) delay. The key of the algorithm is depth-first search on a tree-shaped search route over all maximal motifs based on a technique called prefix-preserving closure extension. We also show an exponential lower bound and a succinctness result on the number of maximal motifs, which indicate the limit of a straightforward approach. The results of the computational experiments show that our algorithm can be applicable to huge string data such as genome data in practice, and does not take large additional computational cost compared to usual frequent motif mining algorithms. This work is done during the Hiroki Arimura’s visit in LIRIS, University Claude-Bernard Lyon 1, France.  相似文献   

2.
We study the problem of separating sublinear time computations via approximating the diameter for a sequence S=p 1 p 2 ⋅⋅⋅ p n of points in a metric space, in which any two consecutive points have the same distance. The computation is considered respectively under deterministic, zero error randomized, and bounded error randomized models. We obtain a class of separations using various versions of the approximate diameter problem based on restrictions on input data. We derive tight sublinear time separations for each of the three computation models via proving that computation with O(n r ) time is strictly more powerful than that with O(n rε ) time, where r and ε are arbitrary parameters in (0,1) and (0,r) respectively. We show that, for any parameter r∈(0,1), the bounded error randomized sublinear time computation in time O(n r ) cannot be simulated by any zero error randomized sublinear time algorithm in o(n) time or queries; and the same is true for zero error randomized computation versus deterministic computation.  相似文献   

3.
Sequence alignment is a central problem in bioinformatics. The classical dynamic programming algorithm aligns two sequences by optimizing over possible insertions, deletions and substitutions. However, other evolutionary events can be observed, such as inversions, tandem duplications or moves (transpositions). It has been established that the extension of the problem to move operations is NP-complete. Previous work has shown that an extension restricted to non-overlapping inversions can be solved in O(n 3) with a restricted scoring scheme. In this paper, we show that the alignment problem extended to non-overlapping moves can be solved in O(n 5) for general scoring schemes, O(n 4log n) for concave scoring schemes and O(n 4) for restricted scoring schemes. Furthermore, we show that the alignment problem extended to non-overlapping moves, inversions and tandem duplications can be solved with the same time complexities. Finally, an example of an alignment with non-overlapping moves is provided. A preliminary version of this paper appeared in the Proceedings of COCOON 2007, LNCS, vol. 4598, pp. 151–164.  相似文献   

4.
The 2-interval pattern problem over its various models and restrictions was proposed by Vialette (2004) for the application of RNA secondary structure prediction. We present an O(n 3logn)-time 2-approximation algorithm for the problem of finding a largest { < ,-structured subset of 2-intervals given an input 2-interval set of size n. This greatly improves the previous best approximation ratio of 6 by Crochemore et al. (2005).  相似文献   

5.
Given a set P of n points and a straight line L, we study three important variations of minimum enclosing circle problem as follows:
  1. Computing k identical circles of minimum radius with centers on L, whose union covers all the points in P.
  2. Computing the minimum radius circle centered on L that can enclose at least k points of P.
  3. If each point in P is associated with one of the k given colors, then compute a minimum radius circle with center on L such that at least one point of each color lies inside it.
We propose three algorithms for Problem (i). The first one runs in O(nklogn) time and O(n) space. The second one is efficient where k?n; it runs in O(nlogn+nk+k 2log3 n) time and O(nlogn) space. The third one is based on parametric search and it runs in O(nlogn+klog4 n) time. For Problem (ii), the time and space complexities of the proposed algorithm are O(nk) and O(n) respectively. For Problem (iii), our proposed algorithm runs in O(nlogn) time and O(n) space.  相似文献   

6.
In this paper, we present a new model for RNA multiple sequence structural alignment based on the longest common subsequence. We consider both the off-line and on-line cases. For the off-line case, i.e., when the longest common subsequence is given as a linear graph with n vertices, we first present a polynomial O(n 2) time algorithm to compute its maximum nested loop. We then consider a slightly different problem—the Maximum Loop Chain problem and present an algorithm which runs in O(n 5) time. For the on-line case, i.e., given m RNA sequences of lengths n, compute the longest common subsequence of them such that this subsequence either induces a maximum nested loop or the maximum number of matches, we present efficient algorithms using dynamic programming when m is small. This research is partially supported by EPSCOR Visiting Scholar's Program and MSU Short-term Professional Development Program.  相似文献   

7.
We present algorithms and lower bounds for the Longest Increasing Subsequence (LIS) and Longest Common Subsequence (LCS) problems in the data-streaming model. To decide if the LIS of a given stream of elements drawn from an alphabet αbet has length at least k, we discuss a one-pass algorithm using O(k log αbetsize) space, with update time either O(log k) or O(log log αbetsize); for αbetsize = O(1), we can achieve O(log k) space and constant-time updates. We also prove a lower bound of Ω(k) on the space requirement for this problem for general alphabets αbet, even when the input stream is a permutation of αbet. For finding the actual LIS, we give a ⌈log (1 + 1/ɛ)-pass algorithm using O(k1+ɛlog αbetsize) space, for any ɛ > 0. For LCS, there is a trivial Θ(1)-approximate O(log n)-space streaming algorithm when αbetsize = O(1). For general alphabets αbet, the problem is much harder. We prove several lower bounds on the LCS problem, of which the strongest is the following: it is necessary to use Ω(n2) space to approximate the LCS of two n-element streams to within a factor of ρ, even if the streams are permutations of each other. A preliminary version of this paper appears in the Proceedings of the 11th International Computing and Combinatorics Conference (COCOON'05), August 2005, pp. 263–272.  相似文献   

8.
In recent years, more and more algorithms related to imprecise data have been proposed. Specifically, some algorithms on computing the maximum area convex hull are designed recently when the imprecise data are modeled as non-overlapping axis-aligned squares or as equal size squares. The time complexity of the best known algorithm based on non-overlapping axis-aligned squares is O(n 7). If the squares have equal size and can overlap, the time complexity of the best known algorithm is O(n 5). In this paper, we improve the former from O(n 7) to O(n 5) and improve the latter from O(n 5) to O(n 2). These results are obtained by exploiting the non-trivial geometric properties of the problems.  相似文献   

9.
The 2-INTERVAL PATTERN problem is to find the largest constrained pattern in a set of 2-intervals. The constrained pattern is a subset of the given 2-intervals such that any pair of them are R-comparable, where model . The problem stems from the study of general representation of RNA secondary structures. In this paper, we give three improved algorithms for different models. Firstly, an O(n{log} n +L) algorithm is proposed for the case , where is the total length of all 2-intervals (density d is the maximum number of 2-intervals over any point). This improves previous O(n 2log n) algorithm. Secondly, we use dynamic programming techniques to obtain an O(nlog n + dn) algorithm for the case R = { <, ⊏ }, which improves previous O(n 2) result. Finally, we present another algorithm for the case with disjoint support(interval ground set), which improves previous O(n 2n) upper bound. A preliminary version of this article appears in Proceedings of the 16th Annual International Symposium on Algorithms and Computation, Springer LNCS, Vol. 3827, pp. 412–421, Hainan, China, December 19–21, 2005.  相似文献   

10.
In this paper, we study the problem of supporting range sum queries on a compressed sequence of values. For a sequence of n k-bit integers, kO(log n), our data structures require asymptotically the same amount of storage as the compressed sequence if compressed using the Lempel-Ziv algorithm. The basic structure supports range sum queries in O(log n) time. With an increase by a constant factor in the storage complexity, the query time can be improved to O(log log n + k). The work described in this paper is fully supported by a grant from the Research Grant Council of the Hong Kong Special Administrative Region, China (CityU 1071/02E). A preliminary version has appeared in 11th International Conference in Computing and Combinatorics (COCOON'05).  相似文献   

11.
We investigated the problem of constructing the maximum consensus tree from rooted triples. We showed the NP-hardness of the problem and developed exact and heuristic algorithms. The exact algorithm is based on the dynamic programming strategy and runs in O((m + n 2)3 n ) time and O(2 n ) space. The heuristic algorithms run in polynomial time and their performances are tested and shown by comparing with the optimal solutions. In the tests, the worst and average relative error ratios are 1.200 and 1.072 respectively. We also implemented the two heuristic algorithms proposed by Gasieniec et al. The experimental result shows that our heuristic algorithm is better than theirs in most of the tests.  相似文献   

12.
Given a k-connected graph G=(V,E) and V V, k-Vertex-Connected Subgraph Augmentation Problem (k-VCSAP) is to find SVV with minimum cardinality such that the subgraph induced by V S is k-connected. In this paper, we study the hardness of k-VCSAP in undirect graphs. We first prove k-VCSAP is APX-hard. Then, we improve the lower bound in two ways by relying on different assumptions. That is, we prove no algorithm for k-VCSAP has a PR better than O(log (log n)) unless P=NP and O(log n) unless NPDTIME(n O(log log n)), where n is the size of an input graph.  相似文献   

13.
Let T = (V,E,w) be an undirected and weighted tree with node set V and edge set E, where w(e) is an edge weight function for e E. The density of a path, say e1, e2,..., ek, is defined as ki = 1 w(ei)/k. The length of a path is the number of its edges. Given a tree with n edges and a lower bound L where 1 L n, this paper presents two efficient algorithms for finding a maximum-density path of length at least L in O(nL) time. One of them is further modified to solve some special cases such as full m-ary trees in O(n) time.  相似文献   

14.
We study the problem of (off-line) broadcast scheduling in minimizing total flow time and propose a dynamic programming approach to compute an optimal broadcast schedule. Suppose the broadcast server has k pages and the last page request arrives at time n. The optimal schedule can be computed in O(k3(n+k)k−1) time for the case that the server has a single broadcast channel. For m channels case, i.e., the server can broadcast m different pages at a time where m < k, the optimal schedule can be computed in O(nkm) time when k and m are constants. Note that this broadcast scheduling problem is NP-hard when k is a variable and will take O(nkm+1) time when k is fixed and m ≥ 1 with the straightforward implementation of the dynamic programming approach. The preliminary version of this paper appeared in Proceedings of the 11th Annual International Computing and Combinatorics Conference as “Off-line Algorithms for Minimizing the Total Flow Time in Broadcast Scheduling”.  相似文献   

15.
Rocchio’s similarity-based relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive supervised learning algorithm from examples. In practice, Rocchio’s algorithm often uses a fixed query updating factor. When this is the case, we strengthen the linear Ω(n) lower bound obtained by Chen and Zhu (Inf. Retr. 5:61–86, 2002) and prove that Rocchio’s algorithm makes Ω(k(nk)) mistakes in searching for a collection of documents represented by a monotone disjunction of k relevant features over the n-dimensional binary vector space {0,1} n , when the inner product similarity measure is used. A quadratic lower bound is obtained when k is linearly proportional to n. We also prove an O(k(nk)3) upper bound for Rocchio’s algorithm with the inner product similarity measure in searching for such a collection of documents with a constant query updating factor and a zero classification threshold.  相似文献   

16.
We consider the bootstrap unit root tests based on finite order autoregressive integrated models driven by iid innovations, with or without deterministic time trends. A general methodology is developed to approximate asymptotic distributions for the models driven by integrated time series, and used to obtain asymptotic expansions for the Dickey–Fuller unit root tests. The second‐order terms in their expansions are of stochastic orders Op(n−1/4) and Op(n−1/2), and involve functionals of Brownian motions and normal random variates. The asymptotic expansions for the bootstrap tests are also derived and compared with those of the Dickey–Fuller tests. We show in particular that the bootstrap offers asymptotic refinements for the Dickey–Fuller tests, i.e., it corrects their second‐order errors. More precisely, it is shown that the critical values obtained by the bootstrap resampling are correct up to the second‐order terms, and the errors in rejection probabilities are of order o(n−1/2) if the tests are based upon the bootstrap critical values. Through simulations, we investigate how effective is the bootstrap correction in small samples.  相似文献   

17.
We study job scheduling on processors capable of running at variable voltage/speed to minimize energy consumption. Each job in a problem instance is specified by its arrival time and deadline, together with required number of CPU cycles. It is known that the minimum energy schedule for n jobs can be computed in O(n3) time, assuming a convex energy function. We investigate more efficient algorithms for computing the optimal schedule when the job sets have certain special structures. When the time intervals are structured as trees, the minimum energy schedule is shown to have a succinct characterization and is computable in time O(P) where P is the tree’s total path length. We also study an on-line average-rate heuristics AVR and prove that its energy consumption achieves a small constant competitive ratio for nested job sets and for job sets with limited overlap. Some simulation results are also given. This work is supported in part by Research Grants Council of Hong Kong under grant No. CityU 1165/04E, National Natural Science Foundation of China under Grant No. 60135010, 60321002 and the Chinese National Key Foundation Research & Development Plan (2004CB318108).  相似文献   

18.
We present an O(n3)-time randomized approximation algorithm for the maximum traveling salesman problem whose expected approximation ratio is asymptotically , where n is the number of vertices in the input (undirected) graph. This improves the previous best.Part of work done while visiting City University of Hong Kong.  相似文献   

19.
In this paper, we are interested in computing the number of edge colourings and total colourings of a connected graph. We prove that the maximum number of k-edge-colourings of a connected k-regular graph on n vertices is k?((k?1)!) n/2. Our proof is constructive and leads to a branching algorithm enumerating all the k-edge-colourings of a connected k-regular graph in time O ?(((k?1)!) n/2) and polynomial space. In particular, we obtain a algorithm to enumerate all the 3-edge-colourings of a connected cubic graph in time O ?(2 n/2)=O ?(1.4143 n ) and polynomial space. This improves the running time of O ?(1.5423 n ) of the algorithm due to Golovach et al. (Proceedings of WG 2010, pp. 39–50, 2010). We also show that the number of 4-total-colourings of a connected cubic graph is at most 3?23n/2. Again, our proof yields a branching algorithm to enumerate all the 4-total-colourings of a connected cubic graph.  相似文献   

20.
We study the probabilistic model in the key tree management problem. Users have different behaviors. Normal users have probability p to issue join/leave request while the loyal users have probability zero. Given the numbers of such users, our objective is to construct a key tree with minimum expected updating cost. We observe that a single LUN (Loyal User Node) is enough to represent all loyal users. When 1−p≤0.57 we prove that the optimal tree that minimizes the cost is a star. When 1−p>0.57, we try to bound the size of the subtree rooted at every non-root node. Based on the size bound, we construct the optimal tree using dynamic programming algorithm in O(nK+K 4) time where K=min {4(log (1−p)−1)−1,n} and n is the number of normal users.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号