std::generator and recursive lambdas in C++23

Revision en3, by adamant, 2024-11-13 02:21:46

Hi everyone!

As Codeforces now supports C++23, it seems to be the right time to discuss some of the particularly interesting features.

Some noteworthy ones that were already mentioned elsewhere include:

  • views::zip that maps two ranges into pairs (A[i], B[i]).
  • views::enumerate that maps range into pairs (i, a[i]).
  • views::adjacent<k> that maps range into tuples (a[i], ..., a[i+k-1]).
  • views::cartesian_product that maps two ranges into pairs (A[i], B[j]) with all possible i and j.
  • Some more specialized views.
  • ranges::to<Container> that creates a container out of a view, e.g. to<vector>(views::iota(0, n)).
  • ranges::fold_left and ranges::fold_right, range versions of std::accumulate/std::reduce.
  • insert_range/append_range/prepend_range/assign_range for containers (not in GCC yet).
  • print/println for formatted printing (it seems that standard formatters for ranges are not in GCC yet).

But there are also two features that weren't covered in as much detail as they deserve, deducing this and generators.

Deducing this and recursive lambdas

Assume that you want to write a recursive lambda. What are your options? Naturally, you'd try something like this:

    auto fact = [&](int x) {
        return x ? x * fact(x - 1) : 1;
    };

You will not be allowed to do it, because inside the lambda, you use fact before its type is deduced. One way to circumvent it is to write function<int(int)> fact = ... instead. While it is attractive, it makes the calls to fact more expensive, and in certain cases might even lead you to TLE, e.g. if you try to use this for depth-first traversal of a big graph.

Until C++23, the best practice was to do something like this:

    auto fact = [&](auto &&self, int x) -> int {
        return x ? x * self(self, x - 1) : 1;
    };
    cout << fact(fact, n) << endl;

While much more efficient on practice, it looks very ugly. And C++23 helps us, allowing to do this instead:

    auto fact = [&](this auto fact, int x) -> int {
        return x ? x * fact(x - 1) : 1;
    };
    cout << fact(n) << endl;

Just as if it was a proper recursive function!

std::generator

Another interesting feature that I haven't seen mentioned in competitive programming discussions at all are coroutines. Assume that you need to factorize a number. If you depend on Pollard's rho algorithm, your flow probably looks as follows:

    vector<int> factors;
    void factorize(uint64_t m) {
        if(is_prime(m)) {
            factors.push_back(m);
        } else if(m > 1) {
            auto g = proper_divisor(m);
            factorize(g);
            factorize(m / g);
        }
    }

And it's always annoying that to store the result, you have to either keep a global vector, or take output vector as an argument by reference (and pass it around each time). Ideally you'd want to return a vector, but then you might be afraid of accidentally getting a quadratic runtime, and even besides that you'd need to write extra code to merge the results. Coroutines allow us to rewrite it as follows:

    std::generator<uint64_t> factorize(uint64_t m) {
        if(is_prime(m)) {
            co_yield m;
        } else if(m > 1) {
            auto g = proper_divisor(m);
            co_yield std::ranges::elements_of(factorize(g));
            co_yield std::ranges::elements_of(factorize(m / g));
        }
    }
    
    for(int p: factorize(m)) {
        ...
    }

In this manner, factorize will return a generator, which is basically a view wrapper to the function above which generates consecutive elements on the range "on the fly", while also suspending execution of the function between accesses to the resulting range. This way, we avoid the need to store the results of the recursive function somewhere, as well as the need to incorporate external logic or callbacks into the function if we want to do something as soon as you get the next element.

From performance perspective, of course, coroutines may be slightly inferior to having a global vector to store the results. For example, this submission to finding strongly connected components takes 278ms and 108 MB of memory, while its global vector version only needs 229ms and 65 MB. Still I think coroutines are pretty nice concept to keep in mind and might simplify code or make it better structured in a lot of cases.

Tags c++23, coroutines, ranges

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en3 English adamant 2024-11-13 02:21:46 21 Tiny change: ' strongly takes 278' -> ' strongly connected components takes 278'
en2 English adamant 2024-11-13 02:05:28 1027 Tiny change: 'oteworthy features that we' -> 'oteworthy ones that we' (published)
en1 English adamant 2024-11-12 23:53:11 3725 Initial revision (saved to drafts)