Easy and (Semi)Efficient Dynamic Segment Trees (with Policy Hash Tables)

#	User	Rating
1	jiangly	3846
2	tourist	3799
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3590
6	Ormlis	3533
7	Benq	3468
8	Radewoosh	3463
9	ecnerwala	3451
9	Um_nik	3451

#	User	Contrib.
1	cry	165
2	-is-this-fft-	161
3	Qingyu	160
4	atcoder_official	157
5	Dominater069	156
6	adamant	154
7	Um_nik	151
7	djm03178	151
9	luogu_official	149
10	awoo	147

MOne of my favorite implementations of segment trees has always been ["Easy and Efficient Segment Trees](http://codeforces.me/blog/entry/18051), by [user:Al.Cash,2018-07-27]. ~~You can implement~~ I used to dread segtree problems, but after reading that blog post and~~ard segtree with it, you can implement lazy propagation, etc. It suffices for the vast majority of segtree probl~~ adapting a super simple implementation I've gotten a lot better with thems. However, there are some types of segtree that you can't implement in that fashion, namely dynamic segtrees and persistent segtrees. See here for [criticism](http://codeforces.me/blog/entry/18051?#comment-287332). With the advent of policy hash tables, however, one can now implement dynamic segtrees in [user:Al.Cash,2018-07-27]'s style with somewhat comparable performance to a custom dynamic segtree.↵
↵
### Standard segtree↵
This is how a standard segtree looks like. You can set a single element, and query for ranges. It's nice and simple, and I think it's a great implementation.↵
↵
~~~~~↵
int N;↵
int seg[2 * MAXN];↵
↵
void modify(int p, int val) {↵
    for (seg[p += N] = val; p > 0; p >>= 1)↵
        seg[p >> 1] = seg[p] + seg[p ^ 1];↵
}↵
↵
int query(int l, int r) {↵
    int res = 0;↵
    for (l += N, r += N; l < r; l >>= 1, r >>= 1) {↵
        if (l & 1)↵
            res += seg[l++];↵
        if (r & 1)↵
            res += seg[--r];↵
    }↵
    return res;↵
}↵
~~~~~↵
↵
### Dynamic Segtree ↵
However, say your underlying array had 1e9 possible locations, but it only contained 1e5 elements. For example, take a look at this [post](https://codeforces.me/blog/entry/19080). Obviously, you can't store all 2e9 elements in your segtree, so what should you do? Here's one solution, replace the array with a hash table. However, as [user:adamant,2018-07-26] [mentions](https://codeforces.me/blog/entry/19080), unordered_map has too much overhead. We'll be benchmarking against the dynamic segtree provided [here](https://codeforces.me/blog/entry/19080?#comment-239925). I'll also be using a custom hash function. So to be clear, the implementation now looks like:↵
↵
<spoiler summary="Code">↵
~~~~~↵
int N;↵
unordered_map<int, int, chash> seg;↵
↵
void modify(int p, int val) {↵
    for (seg[p += N] = val; p > 0; p >>= 1)↵
        seg[p >> 1] = seg[p] + seg[p ^ 1];↵
}↵
↵
int query(int l, int r) {↵
    int res = 0;↵
    for (l += N, r += N; l < r; l >>= 1, r >>= 1) {↵
        if (l & 1)↵
            res += seg[l++];↵
        if (r & 1)↵
            res += seg[--r];↵
    }↵
    return res;↵
}↵
~~~~~↵
</spoiler>↵
↵
And benchmarking it with 1e5 random insertions and 1e5 random queries.↵
↵
~~~~~↵
pointer: 0.171485↵
unordered_map: 2.0646↵
~~~~~↵
↵
Wow. The unordered_map is nearly 12x slower. That's not really feasible for a lot of contests. What if we replace it with a policy hash table, though?↵
↵
<spoiler summary="Code">↵
~~~~~~~~~~↵
int N;↵
unordered_map<int, int, chash> seg;↵
↵
void modify(int p, int val) {↵
    for (seg[p += N] = val; p > 0; p >>= 1)↵
        seg[p >> 1] = seg[p] + seg[p ^ 1];↵
}↵
↵
int query(int l, int r) {↵
    int res = 0;↵
    for (l += N, r += N; l < r; l >>= 1, r >>= 1) {↵
        if (l & 1)↵
            res += seg[l++];↵
        if (r & 1)↵
            res += seg[--r];↵
    }↵
    return res;↵
}↵
~~~~~~~~~~↵
↵
</spoiler>↵
↵
~~~~~↵
pointer: 0.202186↵
policy hash table: 0.384312↵
~~~~~↵
↵
Only a 2x decrease in speed. That's already very feasible. However, one might notice that since maps in C++ create elements if you try to access a key that doesn't exist, we're creating a lot of useless elements. Thus, we can simply wrap a check to make sure the element is in the array before we try to access it.↵
↵
~~~~~↵
gp_hash_table<ll, ll, chash> seg;↵
↵
ll get(ll x) { return (seg.find(x) == seg.end()) ? 0 : seg[x]; }↵
void modify(ll p, ll val) {↵
    for (seg[p += N] = val; p > 0; p >>= 1) {↵
        seg[p >> 1] = get(p) + get(p ^ 1);↵
    }↵
}↵
↵
ll query(ll l, ll r) {↵
    ll res = 0;↵
    for (l += N, r += N; l < r; l >>= 1, r >>= 1) {↵
        if (l & 1)↵
            res += get(l++);↵
        if (r & 1)↵
            res += get(--r);↵
    }↵
    return res;↵
}↵
~~~~~↵
Results:↵
↵
~~~~~↵
pointer: 0.208667↵
policy hash table: 0.310953↵
~~pointer: 0.208667↵~~
~~~~~↵
↵
Only 1.5x slower!↵
↵
One more thing. Although it may seem like this implementation is just 1.5x slower, that's a little bit misleading. If we break down the numbers between insertion/querying, we see this:↵
↵
↵
~~~~~↵
policy hash table insertion: 0.24153↵
pointer insertion: 0.048696↵
~~~~~↵
↵
~~~~~↵
policy hash table queries: 0.08687↵
pointer queries: 0.133678↵
~~~~~↵
↵
So the custom implementation is actually 50% slower than the policy hash table one for queries. Not too bad.↵
↵
↵
↵
↵
↵
↵
↵
↵
↵

Rev.	By	When	Δ	Comment
en11	Chilli	2018-12-10 01:58:10	0	(published)
en10	Chilli	2018-12-10 01:57:34	16	Tiny change: ' queries\n~~~~~\np' -> ' queries\n\n~~~~~\np' (saved to drafts)
en9	Chilli	2018-12-10 01:38:15	1038
en8	Chilli	2018-12-09 10:24:32	0	(published)
en7	Chilli	2018-12-09 10:23:47	18	Tiny change: 'og/entry/19080), unorder' -> 'og/entry/18051?#comment-288074), unorder' (saved to drafts)
en6	Chilli	2018-12-09 10:21:18	212	(published)
en5	Chilli	2018-12-09 10:19:30	6
en4	Chilli	2018-07-27 06:48:45	9	Tiny change: 'tyle with comparabl' -> 'tyle with somewhat comparabl'
en3	Chilli	2018-07-27 06:48:10	7	Tiny change: '="Code">\n\n~~~~~\nint N;\n' -> '="Code">\n~~~~~\n\nint N;\n'
en2	Chilli	2018-07-27 06:46:51	91	Tiny change: 'er:AlCash]'s style w' -> 'er:AlCash] 's style w'
en1	Chilli	2018-07-26 04:18:23	4617	Initial revision (saved to drafts)

History