Блог пользователя lis05

Автор lis05, история, 3 года назад, По-английски

Hello codeforces! Try to solve both of the problems from https://codeforces.me/contestInvitation/fcf424afaf5b7a5d01aebf1a908d6ef3589e4fb3. good luck

Solution:

This is a simple dp problem(very simple), but with a high constraints. Standard knapsack runtime is O(NW), but we can optimize it to run in O(NW/32) using bitset.

#include<bits/stdc++.h>
using namespace std;

const int W=2e5;
bitset<W>b;
 
signed main(){
    int n,w;
    scanf("%d %d",&n,&w);
    b[0]=1;
    while(n--){
        int a;
        scanf("%d",&a);
        b|=(b<<a);
    }
    if(b[w])printf("YES\n");
    else printf("NO\n");
}

How does it work? b[k] contains 0, if it's not possible to get sum k, and 1, if it's possible.

At start, we set b[0] to 1(because we can get sum 0). Next, for each item we left-shift out bitset by a[i]. new_b=b<<a[i] After this move, new bitset contains information about what can we get if we will take current item, but it ignores all previous moves(when we didn't take current item). To fix this, we need to connect two bitsets in one using bitwise or. new_new_b=new_b|b

To do this fast, we just write b|=(b<<a)

Why this works fast? We do N operations of shifting W elements. But bitset works as a long binary number, which constructs of a many 32bit integers. So we don't shift W numbers, we shift only W/32. this is why it work's so fast

But this solution runs in 1s, which is too much. How to improve the improvement? Add some useful pragmas:

#include<bits/stdc++.h>
using namespace std;

// very useful pragmas
#pragma GCC optimize("O3")
#pragma GCC optimize("unroll-loops")
#pragma GCC target("avx2")


const int W=2e5;
bitset<W>b;
 
signed main(){
    int n,w;
    scanf("%d %d",&n,&w);
    b[0]=1;
    while(n--){
        int a;
        scanf("%d",&a);
        b|=(b<<a);
    }
    if(b[w])printf("YES\n");
    else printf("NO\n");
}

Now it runs in ~700ms

P.S This was our first ever problem created on polygon, so we failed it a little with bad tests. Sorry, we will improve ourself for the future!

Also thanks for a great callback!

  • Проголосовать: нравится
  • +35
  • Проголосовать: не нравится

»
3 года назад, # |
Rev. 2   Проголосовать: нравится -10 Проголосовать: не нравится

how is the second problem possible to solve?

  • »
    »
    3 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +5 Проголосовать: не нравится

    Hello! I am going to post an editorial soon.

    • »
      »
      »
      3 года назад, # ^ |
        Проголосовать: нравится +8 Проголосовать: не нравится

      Hi, i was wondering if it was intended to kill FFT solutions for A2 or is just a bad implementation from me.

      • »
        »
        »
        »
        3 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        Your bad :(

        • »
          »
          »
          »
          »
          3 года назад, # ^ |
          Rev. 2   Проголосовать: нравится +3 Проголосовать: не нравится

          Well, i couldn't pass FFT during contest so i used an interesting trick based on the fact that $$$Sum(A_i)$$$ is bounded to 2e5.

          Consider all elements are distinct, we can have at most $$$sqrt(MAX)$$$ distinct elements. Now, imagine we want to insert a new element X with frequency F.

          I claim that we can decompose F into at most $$$logF+1$$$ factors such that every number between 1 and F can be formed by adding up some non-empty subset of those factors.

          That way you can do the basic knapsack with $$$sqrt(M)*log(N)$$$ elements instead of $$$2e5$$$.

          I'm not too sure of how strong the test cases are but it ran in 46 ms.

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    check the tutorial :D

»
3 года назад, # |
  Проголосовать: нравится +9 Проголосовать: не нравится

Ok so isnt it kinda stupid to not be able to see other's submissions, since its obviously some trick? I feel like would be much more productive otherwise

»
3 года назад, # |
  Проголосовать: нравится +1 Проголосовать: не нравится

For 2nd task memory limit is quite tough, using long long was giving mle and int was giving ac.

»
3 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Auto comment: topic has been updated by lis05 (previous revision, new revision, compare).

»
3 года назад, # |
  Проголосовать: нравится +14 Проголосовать: не нравится

$$$\mathcal O(n \sqrt S)$$$ solution where $$$S = a_1 + \dots + a_n$$$. Passes A2 in 46 ms.

»
3 года назад, # |
Rev. 3   Проголосовать: нравится +19 Проголосовать: не нравится

Because of condition $$$a_1 + a_2 + \dots + a_n \le 2 \cdot 10^5$$$ it's possible to solve this problem in $$$O(\frac{w \cdot \sqrt{2 \cdot 10^5}}{32})$$$.

If we have $$$3$$$ or more items with equal value (say $$$x$$$), we can remove $$$2$$$ of this items and instead of them add one item with value $$$2x$$$. Let's do all this possible exchanges. Now we have $$$\le 2$$$ items of each value. And values sum is still $$$\le 2 \cdot 10^5$$$, because exchanges don't change total sum. It's easy to prove then that items count is $$$O(\sqrt{2 \cdot 10^5})$$$. And we just do simple knapsack with bitset on this items.

Source code

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Thanks!

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Are there any sources that mention this trick? Also why do we need >=3 items and not >=2 items when trying to merge?

    • »
      »
      »
      3 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Merging >= 2 won't work in cases like the following: we have 4 copies of an item $$$x$$$. 2 pairs of them merge into $$$2x$$$, then the 2 $$$2x$$$ items merge into one $$$4x$$$, which prevents us from getting sums such as $$$3x$$$. Merging when we have >= 3 fixes that issue because we leave at least one item of size $$$x$$$ behind to make smaller denominations.

      • »
        »
        »
        »
        3 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        Oh, it is somewhat like binary bits. If we merge >=2 pairs, then we won't have the option to set this bit to 1 (if number of items is even). Thanks, I get it now.

    • »
      »
      »
      3 года назад, # ^ |
      Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

      Because we may want to take only one $$$x$$$ in answer, so we need to leave one of them in our set. But when we want to take two $$$x$$$ in answer, we can imagine that we take item with $$$2x$$$ instead.

      I saw this idea in this comment.

»
3 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Btw, your code doesn't work for W = 2e5 because bitsets start indexing at 0. And it's also pretty tight on TL for N, M = 2e5 and all Ai = 1.

»
3 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

In A2, submitting the above code even with the pragmas will TLE in C++ (64). Use C++ (32 bit).