Abstracting Associative Range Query

Revision en3, by EbTech, 2019-07-16 10:30:18

I want to talk about segment trees or, as I prefer to call them, arqtrees. The "arq" is pronounced "arc", which has a similar meaning to "segment", but it also stands for "associative range query". I believe this name gets to the crux of what this data structure is: you start with a semigroup (S, +), where + stands for any associative binary operation. In other words, it satisfied the following law:

(a + b) + c = a + (b + c) for all a, b, c in S.

In other words, parentheses can be ignored and so it makes sense to talk about range queries of the form:

a_i + a_{i+1} + ... + a_j.

Note that some arqtree introductions talk about monoids instead of semigroups. The only difference is that monoids have an identity element id, satisfying the following law:

a + id = id + a = a for all a in S.

Here's how you would represent Semigroup and Monoid as traits in Rust. Each function must satisfy its corresponding law: ~~~~~ trait Semigroup { fn compose(&self, other: &Self) -> Self; }

trait Monoid: Semigroup { fn identity() -> Self; } ~~~~~ Any semigroup can trivially be extended into a monoid by adding an identity element:

impl<T: Semigroup + Clone> Semigroup for Option<T> {
    fn compose(&self, other: &Self) -> Self {
        match self {
            Some(ref a) => match other {
                Some(ref b) => Some(a.compose(b)),
                None => self.clone()
            },
            None => other.clone()
        }
    }
}

impl<T: Semigroup + Clone> Monoid for Option<T> {
    fn identity() -> Self {
        None
    }
}

Note that if a type implements Monoid, then Clone is not needed since a copy can be generated by composition with the indentity. We could have made our arqtree operate on elements of a type implementing Semigroup + Clone, but I think it's more convenient to work with Monoid. And so, we have our arqtree API:

pub struct ArqTree<T> {
    val: Vec<T>
}

impl<T: Monoid> ArqTree<T> {
    fn modify(&mut self, pos: usize, f: &dyn Fn(&T) -> T) {
        // implement modify
    }
    
    fn query(&self, l: usize, r: usize) -> T {
        // implement query
    }
}

However, this will not suffice when we need to do range updates! In order to update an entire range efficiently, we will need to apply f lazily, storing it in internal nodes of the tree. If multiple updates are performed, we may have to store a composition of updates for postponed application. While one may implement a composition operation f \circle g which simply calls applies g first and then f, this makes the cost of function application no longer O(1)!

Thus, we must switch from function pointers to an explicit, composable representation of our update operator. For instance, instead of storing a function that adds 10, we might store the number 10 to remember that we should add 10 later. The composition of an add-10 with an add-5 is an add-15, which still takes O(1) time.

pub struct ArqTree<T: ArqSpec> {
    app: Vec<Option<T::F>>,
    val: Vec<T::M>,
}

impl<T: ArqSpec> ArqTree<T>
where
    T::F: Clone,
{ ... }

pub trait ArqSpec {
    /// Type of data representing an endomorphism.
    // Note that while a Fn(M) -> M may seem like a more natural representation
    // for an endomorphism, compositions would then have to delegate to each of
    // their parts. This representation is more efficient.
    type F;
    /// Type of monoid elements.
    type M;

    /// For eager updates, compose() ho be unimplemented!(). For lazy updates:
    /// Require for all f,g,a: apply(compose(f, g), a) = apply(f, apply(g, a))
    fn compose(f: &Self::F, g: &Self::F) -> Self::F;
    /// For eager updates, apply() can assume to act on a leaf. For lazy updates:
    /// Require for all f,a,b: apply(f, op(a, b)) = op(apply(f, a), apply(f, b))
    fn apply(f: &Self::F, a: &Self::M) -> Self::M;
    /// Require for all a,b,c: op(a, op(b, c)) = op(op(a, b), c)
    fn op(a: &Self::M, b: &Self::M) -> Self::M;
    /// Require for all a: op(a, identity()) = op(identity(), a) = a
    fn identity() -> Self::M;
}
Tags rust, #segment tree, #lazy propagation, #persistence, monoid

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en30 English EbTech 2020-04-14 07:01:18 30 Tiny change: 's _segments trees_. H' -> 's _segment trees_. H'
en29 English EbTech 2020-04-14 05:55:59 4 Tiny change: 't as with _S_, we can c' -> 't as with $S$, we can c'
en28 English EbTech 2020-04-14 05:55:06 17 Tiny change: ', just as before, can choos' -> ', just as with _S_, we can choos'
en27 English EbTech 2020-04-14 05:46:13 5 Tiny change: 'push()`/`pop()` API ma' -> 'push()`/`pull()` API ma'
en26 English EbTech 2020-04-14 04:59:38 3732 Tiny change: 'ongs to a semigroup $(S, +)$;' -> 'ongs to a _semigroup_ $(S, +)$;'
en25 English EbTech 2020-04-14 03:10:53 63
en24 English EbTech 2020-04-14 03:09:03 169 Tiny change: 'pagation, based on ' -> 'pagation, which I like to call an ARQBIT. It's based on '
en23 English EbTech 2020-04-14 02:53:27 1126 Advanced Usage section
en22 English EbTech 2020-04-13 23:23:05 23 Tiny change: ' ARQ tree based on ' -> ' ARQ tree with lazy propagation, based on '
en21 English EbTech 2020-04-13 23:20:44 647
en20 English EbTech 2020-04-13 23:05:34 304
en19 English EbTech 2020-04-13 22:38:38 1708 Tiny change: '}\n~~~~~\nIn pract' -> '}\n~~~~~\n\n### Equivalence\n\nIn pract'
en18 English EbTech 2020-04-13 02:31:13 18
en17 English EbTech 2020-04-13 01:11:47 2 Tiny change: ' S \times X \rightarr' -> ' S \times S \rightarr'
en16 English EbTech 2020-04-13 01:10:47 81
en15 English EbTech 2020-04-13 01:08:33 1742 Tiny change: ')$, where + stands fo' -> ')$, where $+$ stands fo' (published)
en14 English EbTech 2020-04-13 00:43:22 5983
en13 English EbTech 2020-04-13 00:19:49 705 Tiny change: 'a_{l+1} + ... a_r$.\n\n' -> 'a_{l+1} + \ldots + a_r$.\n\n'
en12 English EbTech 2020-04-12 23:56:40 741 Tiny change: ' language.\n\nIn the' -> ' language. $\sqrt{3}$\n\nIn the'
en11 English EbTech 2019-07-25 22:44:43 67
en10 English EbTech 2019-07-25 22:43:58 6 Tiny change: 'sentation]\n//! (http://co' -> 'sentation](http://co'
en9 English EbTech 2019-07-25 22:43:12 969
en8 English EbTech 2019-07-25 22:38:33 821
en7 English EbTech 2019-07-25 22:31:36 710
en6 English EbTech 2019-07-25 22:19:38 2061
en5 English EbTech 2019-07-25 22:01:00 1493
en4 English EbTech 2019-07-16 10:30:56 2 Tiny change: 'ing law:\n~~~~~\nt' -> 'ing law:\n\n~~~~~\nt'
en3 English EbTech 2019-07-16 10:30:18 1217
en2 English EbTech 2019-07-16 06:20:45 1916 Tiny change: 'usize, f: fn(&T) -> T' -> 'usize, f: &dyn Fn(&T) -> T'
en1 English EbTech 2019-07-16 05:23:54 1106 Initial revision (saved to drafts)