A Stupid Data Structure That Does O(log(n)) Range Queries

Правка en7, от estoy-re-sebado, 2023-11-22 06:01:59

TLDR: just take this blog of mine and sprinkle some canonical skew-binary numbers and boom, you have a stupid version of the Fenwick tree that is asymptotically faster that the Fenwick tree on an obscure operation.

The idea

I explained on my previous blog that we can think of the Fenwick tree as a set of intervals, and that we can use those intervals to answer any range query in $$$O(\log(N)^{2})$$$. But there is no real reason to use exactly the same intervals that the Fenwick tree uses. Instead we can try to use some other set of intervals to make things more efficient (at least in an asymptotic sense).

Instead, we use intervals based on the jump pointers described on that one blog from the catalog. That blog claims that we can jump any distance from any node in $$$O(\log(N))$$$ by following a greedy algorithm (the same greedy I used to perform range queries on a Fenwick tree). It follows that we can use the same kind of thing to perform range queries in $$$O(\log(N))$$$ time.

To be clear I put things in bullet points:

  • We want to compute range sum queries on some array A
  • The data structure stores tree[i] = sum of A(i-jmp[i], i] (yes, an open-closed interval)
  • The jumps are defined recursively: if jmp[i] == jmp[i-jmp[i]] then jmp[i+1] = 2*jmp[i]+1 and otherwise jmp[i+1] = 1
  • The greedy algorithm to perform queries goes as follows:
int A[MAXN+1], jmp[MAXN+1], tree[MAXN+1];

int range(int i, int j) { // inclusive, 1-indexed
  int ans = 0;
  while (j-i+1 > 0) {
    if (j-i+1 >= jmp[j]) {
      ans += tree[j];
      j -= jmp[j];
    } else {
      ans += A[j];
      j -= 1;
    }
  }
  return ans;
}

Some observations:

  • The above algorithm performs any query in $$$O(\log(N))$$$ time. This is due to some properties of skew-binary numbers (trust me bro...)
  • The jumps are well-nested and they have power of two (minus one) length
  • It follows that each position is covered by up to $$$\log(N)+O(1)$$$ intervals
  • In other words, when we change a value we only need to update $$$O(\log(N))$$$ intervals

Sadly, I haven't figured out how to compute which intervals cover a particular index (nor the jumps) on-line but we can just precompute them.

Below is the precomputation of the jumps and the intervals that cover each index:

vector<int> covered[MAXN+1];
void init() {
  jmp[1] = jmp[2] = 1;
  for (int i = 2; i+1 <= MAXN; ++i) {
    if (jmp[i] == jmp[i-jmp[i]]) {
      jmp[i+1] = 2 * jmp[i] + 1;
    } else {
      jmp[i+1] = 1;
    }
  }
  for (int i = 1; i <= MAXN; ++i) {
    for (int j = i; i > j-jmp[i]; ++i) {
      covered[j].push_back(i);
    }
  }
}

Since we have precomputed the intervals that cover each index, updating them is trivial.

void update(int i, int x) {
  A[i] += x;
  for (int j : covered[i]) tree[j] += x;
}

We can do this with less memory and time by storing the smallest interval that strictly covers each interval, and walking that as a linked list. It also looks trivial to construct given the way the jumps are constructed but this data structure is too useless for me to care.

Conclusion

We have invented a Fenwick tree-flavored data structure that does $$$O(\log(N))$$$ range queries and updates. I had never heard of this. It might even be novel (but probably known in China for 30 years).

But surprise: it's dog slow in practice. Don't use it to perform range queries. But maybe the idea could be useful in some ad-hoc problem or whatever.

Thanks to ponysalvaje for explaining me how Fenwick tree works.

Теги range query, fenwick tree, binary indexed tree, binary lifting

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en10 Английский estoy-re-sebado 2023-11-22 23:15:33 3715 Make the data structure use O(N) memory, add section on better point updates.
en9 Английский estoy-re-sebado 2023-11-22 20:54:36 32
en8 Английский estoy-re-sebado 2023-11-22 18:07:39 88
en7 Английский estoy-re-sebado 2023-11-22 06:01:59 209
en6 Английский estoy-re-sebado 2023-11-22 05:31:38 13 Tiny change: 'mp[MAXN+1];\nlong long tree[MAXN' -> 'mp[MAXN+1], tree[MAXN'
en5 Английский estoy-re-sebado 2023-11-22 05:30:43 31
en4 Английский estoy-re-sebado 2023-11-22 05:23:45 170
en3 Английский estoy-re-sebado 2023-11-22 05:21:40 97
en2 Английский estoy-re-sebado 2023-11-22 05:20:28 3085 Tiny change: 'onstructed, but this ' -> 'onstructed but this ' (published)
en1 Английский estoy-re-sebado 2023-10-16 19:19:12 1161 Initial revision (saved to drafts)