harrypotter0's blog

By harrypotter0, history, 7 years ago, In English

Given an int array which might contain duplicates, find the largest subset of it which form a sequence. Ex: {1,6,10,4,7,9,5} then ans is 4,5,6,7

Sorting is an obvious solution. Can this be done in O(n) time ??

Thanks in advance :)

  • Vote: I like it
  • -8
  • Vote: I do not like it

| Write comment?
»
7 years ago, # |
  Vote: I like it 0 Vote: I do not like it

I have a simple idea, but I am not sure whether this is correct or not.

We can adopt a "hash table" to record the integers that have appeared in the set, i.e., for some integer N, if it appears in the set, then hash[N] = 1; otherwise hash[N] = 0. Then, the problem is equivalent to finding out the longest sub-sequence consisting of "1"s. If I were correct, this solution has complexity O(n). However, this approach fails if the maximum integer turns out to be too large, for instance, 109.

»
7 years ago, # |
Rev. 4   Vote: I like it +15 Vote: I do not like it

On the one hand, using a hash table, we can dfs/bfs and find connected components on the path graph (add an edge for each adjacent elements) to solve the problem in Θ(n) expected time.

On the other hand, on the algebraic decision tree model, your problem cannot be solved in O(n) time. To see this, take an instance of the set disjointness problem (A, B) and transform it to [3A1, ..., 3An, 3B1 + 1, ..., 3Bn + 1]. Because the set disjointness problem has lower bound [1], your problem has the same bound.