NAINAAAA's blog

By NAINAAAA, history, 7 months ago, In English

HELLO CF community i'm wondering can we solve below problem w/o using map in O(n)? constraint n < 1e5 & |a[i]| < 1e9

Given an array of integers, find the first repeating element in it. We need to find the element that occurs more than once and whose index of first occurrence is smallest.

  • Vote: I like it
  • +7
  • Vote: I do not like it

»
7 months ago, # |
  Vote: I like it +9 Vote: I do not like it

jo sanjh khwab dekhte the

»
7 months ago, # |
  Vote: I like it -18 Vote: I do not like it

Yes, of course it can be done. Here's an idea for writing: You can save each element of the first array along with its index in an array of pairs. Then we have the Radix Sort algorithm, which sorts in O(n), with which we will sort the array of pairs. Note that so far our asymptotics are O(n). Now we will maintain the response index variable, go through the array of pairs, and while the value is equal to the next one, we will compare the indexes: the answer and the index of the current one (which we just saved in the second value of the pair). Thus, we will go over O(n), the final asymptotics is O(n).

  • »
    »
    7 months ago, # ^ |
    Rev. 4   Vote: I like it 0 Vote: I do not like it

    So, i wrote it. That's implementation. But if you need more speed than my implementation, you can use bitwise radix.

  • »
    »
    7 months ago, # ^ |
      Vote: I like it -9 Vote: I do not like it

    Thanks a ton IceHydra I got some Idea now

    • »
      »
      »
      7 months ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      You may think that my code does not work on negative numbers, and however you are right. But in order to fix this, you only need to do answ = std::max(answ, std::abs(val)) in the FindMaxElement function instead of answ = std::max(answ, val);

  • »
    »
    7 months ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    radix sort is not o(n), it is o(n log n). you can say it is n * number of digits, but number of digits is just log base 10 n so it's a constant difference from log n

    • »
      »
      »
      7 months ago, # ^ |
      Rev. 4   Vote: I like it 0 Vote: I do not like it

      Yes, it really looks like the truth, but if you use a byte radix sort, you can achieve a complexity of O(4 * n), so O(4 * n) ~ O(n). And if you try it on practice, byte radix will be faster in 2 times than std::sort.

      • »
        »
        »
        »
        7 months ago, # ^ |
        Rev. 2   Vote: I like it +24 Vote: I do not like it

        byte radix sort is still technically o(n log n) though, if a is large enough it is much slower than o(n). By that logic you could argue that sieve is linear because n log log n for even 10^7 is less than 4.

    • »
      »
      »
      7 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I don't buy this argument, because if so you can never achieve O(n) as all the numbers have a logarithmic number of digits so reading them is O(n log n). Radix sort is O(n) in terms of the number of inputs, since we take the size of the individual inputs to be O(1) in terms of n.

  • »
    »
    7 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Isn’t radix sort $$$O(n \log {\max{a_i}})$$$? Like it has a constant factor that grows logarithmically with the size of the maximum element.

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

can someone tell me if below code works for this problem or not :

#include <iostream>
#include <vector>
#include <climits>

using namespace std;

void solve() {
    int n;
    cin >> n;

    vector<int> v(n);
    for (int i = 0; i < n; ++i) {
        cin >> v[i];
    }

    vector<int> vis(1e5 + 5, 0);

    int ind = INT_MAX, ele = -1;
    for (int i = 0; i < n; ++i) {
        if (!vis[v[i]]) {
            vis[v[i]] = i + 1;
        } else {
            int i2 = vis[v[i]];
            if (i2 < ind) {
                ind = i2;
                ele = v[i];
            }
        }
    }

    cout << ele << endl;
}

int main() {
    solve();
    return 0;
}
  • »
    »
    7 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    how will this work for negative numbers and if number is >1e5?

    • »
      »
      »
      7 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      yeah , actually it's too wrong I got it

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

I think hashing might work but I'm not sure.

»
7 months ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

.

»
7 months ago, # |
  Vote: I like it +15 Vote: I do not like it

You can use a bitset of size 1E9 (https://stackoverflow.com/questions/5780112/define-a-large-bitset-in-c), which uses 125 MB.

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it
  • »
    »
    7 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    But here we don't need to iterate min. bucket size, we know it's = 1 (d=0)

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

can we use sets??

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Just use gp_hash_table. It's a O(1) hashtable.

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

The best algorithm that I made by this time works in O($$$n\cdot log_n(max A)$$$). But with this restrictions it will work in linear time. First of all, we can make all numbers positive. Just find minimum of the array and add to all elements. We do not care about exact element, but we care about it's index. Then, if maximum of array is less or equal to $$$n$$$, we can do it in O($$$n$$$) just iterating through array and memorizing in the array index of first occurents. Otherwise, we can try to compress elements. Make array of $$$n$$$ elements and name it (let it be b). Then $$$b[x]$$$ is array of elements from the array $$$a$$$ with remainder $$$x$$$ after dividing by $$$n$$$. In other words, $$$b[x]$$$ has all elements $$$a[i]$$$, such that a[i]%n=x. We can do it in linear time. Now we can go in each such $$$x$$$ where $$$b[x]$$$ is non-empty, and if $$$b[x][i]$$$ equal to $$$x+n*k$$$ for some $$$k$$$, we can replace it by $$$k$$$. In such way, all elements that are equal will remain equal and not equal not equal. Also, it make $$$max A$$$ decrease in $$$n$$$ times at least. Then we can apply recursevily this algorithm for each $$$x$$$, such that $$$b[x]$$$ is non-empty.

We cannot use master theorem to analyze this recursion, but we can write out recursive tree and get that branch factor is at most $$$n$$$ but number of elements in each recursion is also at most $$$n$$$, so the time spent on each level is $$$O(n)$$$ and there are at most $$$log_n(maxA)$$$ levels, because we decrease $$$maxA$$$ in $$$n$$$ times at least on each level, so total time is $$$O(n\cdot log_n(maxA))$$$.

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

ok

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

.