Hello, Codeforces community! Today, I want to share my solution to a problem from recent contest: Binary Subsequence Value Sum. Rather than using polynomials and FFT, I tried to approach this problem from a different viewpoint — probability! Now, let's get started with the solution itself.
1. Deterministic definition of the score
Suppose you have a binary string $$$v$$$ of length $$$m$$$. Interpret each character as a “contribution”: assign $$$+1$$$ for a ‘1’ and $$$-1$$$ for a ‘0’. Define
so that $$$d$$$ is the overall “imbalance” of $$$v$$$.
The score of $$$v$$$ is defined as
where $$$F(v,1,i)$$$ is imbalance of prefix and $$$F(v,i+1,m)$$$ is imbalance of suffix; if the prefix imbalance is $$$S$$$, then the suffix is $$$d-S$$$, and the product is
For fixed $$$d$$$, the product $$$S(d-S)$$$ is a quadratic function in $$$S$$$. It is maximized when
However, since $$$S$$$ must be an integer (and may not be exactly $$$d/2$$$ when $$$d$$$ is odd), the best we can do is to take
Thus the maximum product is
So in one compact expression, the score of $$$v$$$ can be written as
2. Counting scores over all subsequences
Now, consider a fixed binary string $$$s$$$ of length $$$n$$$ with $$$R$$$ ones and $$$Z = n-R$$$ zeros.
A nonempty subsequence of $$$s$$$ is obtained by choosing, independently for each position, whether to include that bit. (There are $$$2^n$$$ subsequences in total, including the empty one—but the empty subsequence contributes nothing to the score.)
Imagine that for each bit in $$$s$$$ we make an independent choice with probability $$$1/2$$$ of including it. Then every subsequence occurs with equal probability $$$(\frac{1}{2})^n$$$.
For any subsequence $$$v$$$, let its imbalance be
Then the score of that subsequence is, by our previous argument,
We want to sum the scores over all nonempty subsequences. Rather than summing directly, we use the idea of expectation.
Recall that for any function $$$f(v)$$$ defined on the set of subsequences (with the uniform probability distribution), the average value is:
Multiplying by $$$2^n$$$ recovers the sum over all subsequences:
3. Calculating the Expectation $$$E[(d')^2]$$$
For each bit of $$$s$$$, define a random variable $$$X_i$$$ as follows:
- If the $$$i$$$th bit is a ‘1’ and is chosen, $$$X_i=+1$$$; if it’s not chosen, $$$X_i=0$$$.
- If the $$$i$$$th bit is a ‘0’ and is chosen, $$$X_i=-1$$$; if it’s not chosen, $$$X_i=0$$$.
Then the imbalance for a given subsequence is
- For ‘1’: $$$E[X_i]= \frac{1}{2}$$$ (since it contributes $$$+1$$$ with probability $$$1/2$$$).
- For ‘0’: $$$E[X_i]= -\frac{1}{2}$$$.
Thus, if there are $$$R$$$ ones and $$$Z$$$ zeros,
Each $$$X_i$$$ is a Bernoulli-type variable with two outcomes (nonzero with probability $$$1/2$$$ and zero with probability $$$1/2$$$). In either case (for ‘1’ or ‘0’), one can show that
Since the choices are independent,
Recall that
Thus,
Then the expected value of $$$\frac{(d')^2}{4}$$$ is
So if we (naively) approximated the score of every subsequence by $$$\frac{(d')^2}{4}$$$, then the total over all $$$2^n$$$ subsequences would be:
4. $$$d'$$$ parity correction
The true score of a subsequence with imbalance $$$d'$$$ is
which means that whenever $$$d'$$$ is odd, our naive expression $$$\frac{(d')^2}{4}$$$ is an overestimate by $$$\frac{1}{4}$$$.
A classical combinatorial fact (which one may also derive via the bruteforce check, as I did) is that the total number of subsequences with an odd imbalance is exactly half of all subsequences. Thus, the total “overcount” when summing $$$\frac{(d')^2}{4}$$$ is:
Therefore, by linearity of expectation, the correct total sum $$$S$$$ over all nonempty subsequences is:
Substitute the expectation we computed:
5. Simplifying to the Final Formula
Factor $$$2^{n-4}$$$ out of both terms, and we obtain the final closed-form formula:
My solution: 309852412. Thank you for reading!
Nice solution!
I'd like to
hijack this posttake this opportunity to advertise my own solution for the same problem, which I haven't really seen anywhere else:So, do as the above blog says at first. Then, this reduces to finding the sum of $$$\lfloor \frac{s^2}{4} \rfloor$$$ over all subsequences ($$$s$$$ denotes the sum). We can reduce to finding $$$s^2 - (s^2 \pmod{4})$$$, but the second term depends only on $$$s \pmod{2}$$$, which is classic (the sum of that over all subsequences is $$$2^{n - 1}$$$ if there is an odd number in the array, and $$$0$$$ otherwise), therefore we can reduce it to finding the sum of $$$s^2$$$ over all subsequences. Now, here is the main part: We will use a segment tree to achieve this. How do we make the combine function? Well, we obviously only need to consider $$$(\text{sum}(a) + \text{sum}(b))^2$$$ where $$$a$$$ is a subsequence (possibly empty) of the left child $$$l$$$, and $$$b$$$ is a subsequence (possibly empty) of the right child $$$r$$$. Thus, it's $$$\text{sum}(a)^2 + \text{sum}(b)^2 + 2 \cdot \text{sum}(a) \cdot \text{sum}(b)$$$. What is the contribution of the first term? For a fixed subsequence $$$a$$$ of $$$l$$$, it is counted $$$2^{\text{length}(r)}$$$ times, because we can pair any subsequence in $$$r$$$ (possibly empty) with this term. Thus, it's $$$2^{\text{length}(r)} \cdot \text{answer}(l)$$$. Similarly, the contribution of the second term is $$$2^{\text{length}(l)} \cdot \text{answer}(r)$$$. What is the contribution of the last term? Well, for a fixed subsequence $$$a$$$ of $$$l$$$, we take each subsequence of $$$b$$$ with it, and we take the sum of that over all subsequences $$$a$$$, thus the total sum is $$$2 \cdot (\text{sum of sums of all subsequences of} \; l) \cdot (\text{sum of sums of all subsequences of} \; r)$$$. Both of these are standard to calculate; they are equal to $$$2^{\text{length}(l) - 1} \cdot \text{sum}(l)$$$, and $$$2^{\text{length}(r) - 1} \cdot \text{sum}(r)$$$, which can both be stored in a segment tree.
Code: 309800687
Too many Gods on this site
I know I am no where near comprehending this level of math, but another thing I know is that i will get there someday.
базар жоқ тигр