SecondThread's blog

By SecondThread, history, 12 days ago, In English

Meta Hacker Cup Round 3

Round 3 is upon us! Round 3 begins in under 24 hours on the Hacker Cup site. We hope to see you there after Round 984!

As in previous years:

  • To qualify for the human track of Round 3, you must have finished within the top 500 of Round 2.
  • To qualify for the Finals Round, you must have finished within the top 25 of Round 3.
  • For those competing in the human track, the top 200 placing participants have a special badge on their shirt.

Good morning, have fun, and we hope to see you on the scoreboard! Head on over after Codeforces Round 979, and be careful not to make any algorithms mistakes!

  • Vote: I like it
  • +78
  • Vote: I do not like it

»
12 days ago, # |
  Vote: I like it -8 Vote: I do not like it

That's fantastic! I think it will be funny.

»
12 days ago, # |
  Vote: I like it +43 Vote: I do not like it

Has T-shirts for round 2 been released yet?

»
12 days ago, # |
  Vote: I like it +30 Vote: I do not like it

What about the T-shirts for the top 2000 in round 2??

  • »
    »
    11 days ago, # ^ |
      Vote: I like it +2 Vote: I do not like it

    Top 2k in round 2 will get shirts, they just won't have a "Top 200" badge on them.

»
12 days ago, # |
  Vote: I like it 0 Vote: I do not like it

Well I guess time travel is possible nowadays.

»
11 days ago, # |
  Vote: I like it +159 Vote: I do not like it

Problem E is one of the best joke statements I've seen. A brilliant idea to use the name of Dijkstra!

  • »
    »
    11 days ago, # ^ |
      Vote: I like it +33 Vote: I do not like it

    I'm glad you like it :)

  • »
    »
    11 days ago, # ^ |
      Vote: I like it +28 Vote: I do not like it

    Agreed! Though I am confused why anyone went for E2 given its point value.

    • »
      »
      »
      11 days ago, # ^ |
        Vote: I like it +20 Vote: I do not like it

      The contest organizers had some interesting thoughts about E's value and strategy. If the set is easy and likely to get solved, it would make a lot of sense to do E1/E2, that way your penalty gets significantly improved.

      But if solving the set isn't important, then going after E instead of C or D wouldn't be too helpful; because of the double penalty. It's an interesting call you have to make early on in the contest.

      • »
        »
        »
        »
        11 days ago, # ^ |
        Rev. 2   Vote: I like it +10 Vote: I do not like it

        Makes sense. I reasoned after finishing E1 that I almost certainly wouldn't have the time to finish everything, so I ended up going for the larger point values.

        Though I still think it's a bit strange how C = D = E1 + E2; IMO E1 + E2 should be worth more.

      • »
        »
        »
        »
        11 days ago, # ^ |
          Vote: I like it +20 Vote: I do not like it

        I kinda get your point, but I am not the fan of the end result. Solving the whole set is very rarely required to get to the finals. Assuming it is not required, there are basically no advantages of going for E2. Doing so would be ""high risk ... no reward" situation. And even in the unlikely case that the whole set was required ... I think that E2 will still require a lot of effort, even having solving E1 before. I didn't try a lot to solve it, but I'd imagine that B and C were easier for me than E2 would have been

        • »
          »
          »
          »
          »
          11 days ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          E2's existence obviously mattered even without anyone solving set. Without it, tourist wouldn't have qualified for finals, and with it, Errichto and ko_osaga probably would have.

          "Assuming it is not required" is not reliable assumption to make given, for example, last year's results. If it were required, E2 likely would be the most important early-solve in the entire set.

          The logical jump between E1 and E2 is IMO not too large. It's especially easy to stumble across if you already have a bruteforcer for E1 and look at linked lists. Contestants seemed to have proven us right that it was easier than D. Perhaps the judges overestimated the difficulty of the calculus in C, but at large I think the point distribution makes a lot of sense actually.

          If you disagree, vote with the problems you choose to solve (like Benq did). Either you'll be right and perform really well because of it, or you'll learn why you were wrong.

»
11 days ago, # |
  Vote: I like it +27 Vote: I do not like it

FST fiesta incoming, A's validation cases don't check $$$k = 3$$$ and E1 has $$$5$$$ validation cases for a YES / NO problem lol (where I'm pretty sure 3 are the direct samples).

»
11 days ago, # |
  Vote: I like it +25 Vote: I do not like it

Thanks for the weak pretest, I thought I wasn't gonna get a top 200 this year xdd

»
11 days ago, # |
  Vote: I like it +55 Vote: I do not like it

After the contest I felt quite happy, as I had enjoyed competing. Then I looked at the scoreboard.

In any case, thanks to the organizers!

»
11 days ago, # |
  Vote: I like it 0 Vote: I do not like it

So are weak pretests in MHC now going to become a thing? Asking for the next year and future version.

  • »
    »
    11 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    There is a reason why they say validation input rather than pretests.
    These tests exist to just ensure one's output file format is correct.

»
11 days ago, # |
  Vote: I like it +4 Vote: I do not like it

Solved D 12 minutes after the contest ended...

»
11 days ago, # |
Rev. 4   Vote: I like it 0 Vote: I do not like it

Good news: I didn't failed system test this time.

Bad news: I only solve problem A. (And FST only make my ranking drop from 296 to 303)

So saddddd.

  • »
    »
    11 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I FSTed both A and E1, and my B code RTEd on main tests. gg

»
11 days ago, # |
  Vote: I like it +15 Vote: I do not like it

What manually submitting 50+ submissions for B did to my Downloads folder

»
11 days ago, # |
  Vote: I like it 0 Vote: I do not like it

I was doing so well on C until Wolfram Alpha told me to use a digamma function ... Tragic.

»
11 days ago, # |
  Vote: I like it +10 Vote: I do not like it

Are all the correct and incorrect results true now (except for the penalty calculation) ?

  • »
    »
    11 days ago, # ^ |
      Vote: I like it +36 Vote: I do not like it

    They are correct, and penalty is correct. We might DQ some people for plagiarism, but usually there's not too much of that.

»
11 days ago, # |
Rev. 3   Vote: I like it +33 Vote: I do not like it

Ah, I solved B and C, but in B I stupidly did return {set, map} instead of return {move(set), move(map)}, and in C I relied on boost::math::digamma to compute the sum of $$$\frac{1}{A+Bk}$$$ over $$$k$$$ from $$$L$$$ to $$$R$$$, but it turned out to be not precise enough (produces relative error of $$$8 \cdot 10^{-6}$$$). Sadge, two avoidable things landed me 17 places away from a better t-shirt :(

UPD: The bug in C was actually because I only checked $$$D$$$ up to $$$100$$$ instead of up to $$$101$$$. I guess I feel somewhat better now, knowing that it was not a precision issue.

»
11 days ago, # |
  Vote: I like it +10 Vote: I do not like it

What is special with p=0 in problem C?

  • »
    »
    11 days ago, # ^ |
    Rev. 3   Vote: I like it -10 Vote: I do not like it

    Oh, yes, I mixed up which part(prefix or suffix) I need to approximate using integral.

    Sorry for the inconvenience.

    • »
      »
      »
      11 days ago, # ^ |
      Rev. 2   Vote: I like it +18 Vote: I do not like it

      Maybe you missed taking this factor into account. I did the same mistake.

      • »
        »
        »
        »
        11 days ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I had a different typo, but thank you for mentioning this.

        In my approximation I do not need to consider Euler's constant.

    • »
      »
      »
      11 days ago, # ^ |
        Vote: I like it -12 Vote: I do not like it

      With that, you claim that your solution is correct, while tens of other contestants who got ACs, actually have wrong answers. And not just that, but all of their answers have relative error from a peculiar interval of $$$[10^{-2} - 10^{-6}, 10^{-2} + 10^{-6}]$$$! I don't think that passes sanity check :)

      • »
        »
        »
        »
        11 days ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        First, I've never claimed that my solution is correct, only that I struggle to find what is wrong.

        Second, I was more curious whether Jury knows how to calculate the exact answer somehow, as in the editorial the approximation is mentioned to be "good enough".

        Third, last round 95% of submissions failed hidden tests for B, so this scenario(that somehow everybody used the same wrong approximation) is not completely improbable in this type of competition.

        • »
          »
          »
          »
          »
          11 days ago, # ^ |
            Vote: I like it -12 Vote: I do not like it

          Okk, I'll treat this as a miscommunication then :p

        • »
          »
          »
          »
          »
          10 days ago, # ^ |
          Rev. 3   Vote: I like it +10 Vote: I do not like it

          I was more curious whether Jury knows how to calculate the exact answer somehow

          By running an O(N) DP for a loooooong time :)

»
11 days ago, # |
Rev. 3   Vote: I like it +39 Vote: I do not like it
»
11 days ago, # |
  Vote: I like it +18 Vote: I do not like it

What was the issue with B?

  • »
    »
    11 days ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    We're still looking into it. There's nothing fundamentally interesting about the I/O compared to other problems. It has a large input file, but so do other problems, and that doesn't affect uploading the output of course.

    Interestingly, the judges had no issue making any of the submissions, and it's the same mechanism under the hood for judges and contestants.

    • »
      »
      »
      11 days ago, # ^ |
      Rev. 7   Vote: I like it +11 Vote: I do not like it

      Also on the subject of problem B, I'm curious why there was no test nearing max depth (e.g., a line of length $$$3\cdot 10^6$$$).

      I was testing locally before downloading the full test set and noticed that my recursive DFS solution compiled with homebrew GCC would give a runtime error on a line of size $$$3\cdot 10^6$$$ (though it passes on a line of size $$$2\cdot 10^6$$$, or if I switch to clang ...). Anyway, I rewrote my solution to remove the recursive DFS.

      (On that note, does anyone know how to increase the stack size limit on Mac past 512 MB?)

      • »
        »
        »
        »
        9 days ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Oh, that is indeed an oversight. One of the large cases was meant to be such a long line. I see where I've gone wrong in my data generator. I validated a variety of other properties, but not the actual depth.

  • »
    »
    11 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Idk if this only happened to me, but I had to redownload input for 5 times. After that I couldn't create submission and got repetitive error responses. (Probably, I requested too much download) Then, the timer expired and I sent my output and code to clar within a minute and they made a submission for me.

    It's weird that I could attach and send my submission in clar just fine, but not problem B.

    • »
      »
      »
      11 days ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I also had the same submission issue... Judges also helped me created a submission, which is pretty nice :)

»
11 days ago, # |
Rev. 2   Vote: I like it +141 Vote: I do not like it

Thanks for the round! I was (luckily) just right about how careful I should be.

It was evident that A and E1 have "weak" validation if we download them (before solving). What was not evident for me is C's weakness (the case where you use the 100% option with large $$$N$$$, or where the strategy borderline comes at an integer). I was rescued by stress testing for all of them...

By the way my directory and my brain messed up by downloading validation inputs before solving and caring about the problem titles. Perhaps this has been raised before, but I was wondering if the organizers could attach problem ID as a prefix to input file names (like A- or E1_).

  • »
    »
    11 days ago, # ^ |
    Rev. 2   Vote: I like it +26 Vote: I do not like it

    Perhaps this has been raised before, but I was wondering if the organizers could attach problem ID as a prefix to input file names (like A- or E1_).

    Agree that this would be nice. I always end up having to rename the files manually (e.g., E1_val.txt, E1_act.txt).

  • »
    »
    11 days ago, # ^ |
      Vote: I like it +23 Vote: I do not like it

    I definitely support that suggestion!

»
11 days ago, # |
Rev. 2   Vote: I like it +34 Vote: I do not like it

Sad for jiangly. I wanted to see him in the finals.

»
11 days ago, # |
  Vote: I like it +30 Vote: I do not like it

How were the "ground truth" answers found in problem C?

»
11 days ago, # |
  Vote: I like it 0 Vote: I do not like it

I wonder if the setter of problem C played Danganronpa... Literally the same setting is implemented as a "mini-game" there, except for the fact that the formulas are different there, so that it's always optimal to pay 1 dollar...

»
11 days ago, # |
Rev. 2   Vote: I like it +24 Vote: I do not like it

Somehow problem D went relatively untouched. Putting aside the issue that me and PurpleCrayon both cheesed through with (slightly carefully implemeneted) $$$O(n^2)$$$, the actual solution is only one tiny observation away. With all the precisions with C and weak pretests with E, D is the easiest problem by expected value it seems. I got into finals basically by getting the free D plus very fast (+correct) A and B.

I didn't have the foresight to actually go for D though. I am just forced to do that by standings. After messing up on C, I can only get top 25 by D as E1+E2 would not give enough penalty.

I also thought that I would only ever have enough time to implement it if I went for the $$$O(n^2)$$$. This turns out to be false.

  • »
    »
    11 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Unfortunately, there was no way for me to beat properly optimized $$$O(N^{2})$$$ solutions with only ~70MB of input data in total. But I agree that having such a solution required having all the observations in place, only the data structure to perform it with better complexity could be the missing part. I even had multiple solutions that are $$$O(N^{2})$$$ and use most of the obervations, but are not optimized enough and those worked for approximately 10 minutes on my machine. I wonder if anyone expected the data structure in the model solution to be SQRT-decomposition though.

    • »
      »
      »
      11 days ago, # ^ |
      Rev. 5   Vote: I like it 0 Vote: I do not like it

      Hm, I'm looking at arvindf232's solution and it doesn't look like it requires any observations? It clearly does proportional to $$$N\times M$$$ operations per test. I assume "slightly careful" refers to improving the constant factor (e.g., by removing modulus operations) and doing the DP in HLD order so that the memory usage is $$$O(N + M\log N)$$$.

      Tried it in C++ just now and it took just under 6 minutes with multithreading. Though I had to rewrite the DFS to get around the stack size limit of 512MB on Mac (anyone know a way to increase this?).

      benjaminqi:Desktop % g++-14 -std=c++17 -O2 -Wl,-stack_size,30000000 D_threaded.cpp 
      ld: -stack_size must be <= 512MB on arm64 platforms
      
      Code (Less efficient, but same time complexity as below)
      Code (More efficient)
      • »
        »
        »
        »
        11 days ago, # ^ |
          Vote: I like it +10 Vote: I do not like it

        It doesn’t, but I thought the only observation after that is just the “key observation” as in the tutorial, and it doesn’t seem to be particularly hard to find.

        I am saying “carefully Implemented” just as a caution since, I see a few people attempted to download data but cannot make a submission, and I am guessing this might have been what happened.

        It is also totally trivial to do O(n) memory total if you only keep upto the height of the vertex, alongside an easy factor of 2 improvement. I didn’t end up doing so as I believe it is already fast enough to pass, after knowing a max case would run in 1.2 minutes.

        • »
          »
          »
          »
          »
          11 days ago, # ^ |
          Rev. 3   Vote: I like it 0 Vote: I do not like it

          it doesn’t seem to be particularly hard to find.

          I agree — I'm surprised that my solve was the only "legitimate" one. I thought my $$$O(N\log M)$$$ implementation was quite clean, though as evidenced both above and below it seems that many others got caught up in implementation issues.

  • »
    »
    11 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I would've probably tried $$$O(n^2)$$$ if I ran out of time because I literally wrote it and verified it with pretests. What happened instead is that I was able to optimize it to $$$O(n \log n)$$$, along with some stupid bugs that took ages to find :(

»
11 days ago, # |
Rev. 2   Vote: I like it +8 Vote: I do not like it

Might be skill issue but B stack overflows up to N = 300000 (1/10 the biggest tc) on my mac with whatever tricks I find online. Ended up spending 30+ minutes getting my school's clusters to work.. Anyone get it to run on mac?

  • »
    »
    11 days ago, # ^ |
    Rev. 4   Vote: I like it 0 Vote: I do not like it

    Are you compiling with -Wl,-stack_size,20000000 as mentioned here? It will increase the stack size limit to 512MB.

    https://codeforces.me/blog/entry/134678?#comment-1204701

    That being said, 512MB is not always enough (see my comments above) and idk how to increase further. If you are near the stack size limit and don't want to change any code, you can try switching from GCC to clang or vice versa.

    • »
      »
      »
      11 days ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      That's exactly the post I tried. I got the same error as you post above when compiling with -Wl,-stack_size,20000000.

      • »
        »
        »
        »
        11 days ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Can you share your code?

        • »
          »
          »
          »
          »
          11 days ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Made a random subission: 289708610

          • »
            »
            »
            »
            »
            »
            11 days ago, # ^ |
            Rev. 4   Vote: I like it 0 Vote: I do not like it

            Tried running locally on the full data, as well as the $$$N=3\cdot 10^6$$$ line I constructed, and it finishes successfully.

            g++-14 -std=c++17 -O2 -Wl,-stack_size,20000000 B.cpp -o B && ./B < Bact.txt 
            

            About my Mac in case it matters:

            • 14-inch, 2023
            • Chip: Apple M2 Pro
            • Memory: 16GB
  • »
    »
    11 days ago, # ^ |
    Rev. 3   Vote: I like it 0 Vote: I do not like it

    Are you sure it is stack overflow? The trees in the full data aren't particularly deep, so even if you don't increase stack size past the default it should pass.

    Can you include the exact error are you getting?

    • »
      »
      »
      11 days ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      Thanks for running my code btw.

      To clarify I get stack overflow on mine extreme tc when the tree is a chain. Therefore, I didn't even bother running locally the full data during the contest. However, it turns out the full data has max depth < 9000, so when I run it just now it passed. I guess that explained why nobody else mentioned stack overflow for B.

      It's weird though so many people FST'ed other problems but the full data for B is weak in this sense.

»
11 days ago, # |
  Vote: I like it +23 Vote: I do not like it

t shirt designs reveal pls SecondThread

/\____/\
꒰ ˶• ༝ — ˶꒱
./づᡕᠵ᠊ᡃ࡚ࠢ࠘ ⸝່ࠡࠣ᠊߯᠆ࠣ࠘ᡁࠣ࠘᠊᠊°.~♡︎

»
11 days ago, # |
  Vote: I like it +8 Vote: I do not like it

FSTed on round 3 — A.

How about a top 500 badge on the shirt for those who managed to qualify to round 3?

»
11 days ago, # |
  Vote: I like it +94 Vote: I do not like it

What's the correct answer to the 6th sample in problem C? The sample statement says $$$2.203697500670872 \times 10^{10}$$$, but Wolfram Alpha seems to evaluate to something smaller by relative error of $$$\approx 2 \times 10^{-8}$$$, which is quite large.

I believe the formula is

$$$ \displaystyle \sum_{i=1}^{3052126200} 3 \cdot \frac{9988776655 * 100}{i * (100 - (3-1) * 44) + 9988776655 * ((3-1) * 44)} + \sum_{i=3052126201}^{ 9988776655 } 1 \cdot \frac{9988776655}{i} $$$

I evaluated the sums with Wolfram Alpha and it adds up to $$$2.2036974638794566727721073889157678558271191858 \times 10^{10}$$$.

  • »
    »
    11 days ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    Sorry for changing the topic but you surprised me with the "6th sample". I always assumed that MHC has randomized tests that come from a larger pool, selected by some hash for each contestant individually to prevent cheating with multiple accounts (downloading datasets from burnable account, running some bruteforce for like an hour, and submitting from main account within 6 minutes). It seems that's not true then?

    • »
      »
      »
      10 days ago, # ^ |
        Vote: I like it +26 Vote: I do not like it

      Validation input is the same for everyone. Although, full input is randomized.

      • »
        »
        »
        »
        10 days ago, # ^ |
        Rev. 2   Vote: I like it +8 Vote: I do not like it

        Okay, this makes a lot of sense actually, since randomizing validation inputs could give more lucky contestants an unfair edge if they got some corner case to a validation set by chance. And also would enable a cheater to sample the testcase pool by requesting validation inputs from burnable accounts (though this would be much more cumbersome compared to everyone having the same full input)

  • »
    »
    10 days ago, # ^ |
    Rev. 2   Vote: I like it +10 Vote: I do not like it

    The O(N) DP that I used to compute the "ground truth" answers can indeed have some errors up to about 2e-8. We made the judger slightly more lenient to account for this.

    • »
      »
      »
      9 days ago, # ^ |
        Vote: I like it +28 Vote: I do not like it

      You can get extremely accurate estimates of harmonic sums using the Euler-Maclaurin formulas. The first few terms give:

      $$$ \displaystyle \sum_{i=a}^{b} \frac{1}{i} = \log(b) - \log(a) + \frac{1}{2}\left( \frac{1}{a} + \frac{1}{b} \right) + \frac{1}{12} \left(\frac{1}{a^2} - \frac{1}{b^2}\right) + O\left(\frac{1}{a^4}\right) $$$

      with analogous formulas for the linear-progression case.

      In general, I think it's pretty misleading to give so many digits of the sample output that aren't even correct. I know strictly speaking this is a possible correct answer, but for floating point in particular, the sample output really should allow people to accurately check their precision.

      • »
        »
        »
        »
        9 days ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Thanks for the feedback -- with the approach that I took, it would've been worth using BigDecimal or similar to ensure the full 15 digits of precision.

»
11 days ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

I have upsolved problem D with a $$$O(nm)$$$ complexity solution with $$$O(n)$$$ memory, it runs in about a minute on my laptop for the dataset you can download in the practice. (I used multithreading)

»
3 days ago, # |
Rev. 2   Vote: I like it +1 Vote: I do not like it

i think facebuk forgor about the tshirt