Since nobody posted this...
Tasks are here: https://ioi2021.sg/ioi-2021-tasks/
You can solve all Day 2 tasks here: https://oj.uz/problems/source/577
In case you want to solve it in 5 hours, you can try here: https://oj.uz/contest/view/IOI2021DAY2
- Related post: https://codeforces.me/blog/entry/92088
Similarly to Pajaraja a few days ago, let me come out as an author of the Registers problem. I am very happy that this problem made it to the day 2 problemset; kudos to the scientific committee for choosing and preparing it :) Let me share a few thoughts related to this problem.
First of all, I hope that it was interesting to the contestants; it is somewhat similar to the IOI'2012 Odometer problem in its setting: you must generate a program in an artificial programming language with certain desired properties. I participated in that IOI and liked that problem very much, and maybe that's what led me to come up with something similar 9 years later.
In my original proposal there was an extra subtask for sorting $$$n$$$ integers in $$$o(n)$$$ instructions, e.g. in $$$O(\log^2 n)$$$ or even $$$O(\log n)$$$. Such a subtask is heavily related to the notion of a sorting network: almost any efficient sorting network consists of regular comparison patterns which may be executed in parallel using the comparison primitive from the earlier part of the problem. My personal opinion was that including a subtask for an $$$O(\mathrm{poly}(\log n))$$$-depth sorting network would be fine, since coming up with something like the bitonic sorting network is possible with pen and paper without any prior knowledge, but I also believe that the scientific committee had reasons not to include that subtask :) Maybe there are issues with setting proper constraints for such a subtask, I'm not sure. Nevertheless, it seems to me an impressive fact that sorting $$$n$$$ integers in this computation model can be done in $$$o(n)$$$ instructions; that sounds quite counter-intuitive at first glance.
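For a concrete picture of such a network, here is a sketch of the classic Batcher bitonic sort in plain C++ (my own illustration, not tied to the problem's instruction set; it assumes $$$n$$$ is a power of two). Each inner round consists of independent compare-exchange operations, so the whole procedure has $$$O(\log^2 n)$$$ rounds of parallel comparisons:

```cpp
#include <algorithm>
#include <vector>

// Batcher's bitonic sorting network: O(log^2 n) rounds, each round consisting
// of independent compare-exchange operations; n must be a power of two.
void bitonic_sort(std::vector<int>& a) {
    int n = (int)a.size();
    for (int k = 2; k <= n; k <<= 1)           // length of the runs being merged
        for (int j = k >> 1; j > 0; j >>= 1)   // compare distance in this round
            for (int i = 0; i < n; ++i) {
                int l = i ^ j;                 // partner of position i
                if (l > i) {
                    bool ascending = (i & k) == 0;  // direction of i's run
                    if ((a[i] > a[l]) == ascending)
                        std::swap(a[i], a[l]);
                }
            }
}
```

In the problem's model, the whole inner `i` loop would become a constant number of vector instructions, which is exactly why a low-depth network translates into few instructions.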
Finally, let me note that this problem is quite practical: there are actual implementations of similar concepts that allow sorting short vectors of integers using modern SIMD instructions, and they are amazingly fast. For example, here one can find a repository with a quite good implementation of small sorting networks in assembly language: SortingNetworks.
How do you solve subtask 6 using $$$\Omega(n)$$$ instructions? The only way I know to solve that subtask is using bitonic sort.
You may use a simple bubble-sort-like procedure.
On even steps, perform "swap if in wrong order" on the pairs $$$(a[0], a[1]), (a[2], a[3]), ...$$$; on odd steps, perform it on the pairs $$$(a[1], a[2]), (a[3], a[4]), ...$$$. One may show that $$$n$$$ such steps are enough to obtain a sorted sequence.
Refer to the "When allowing for parallel comparators, bubble sort and insertion sort are identical" picture here for an illustration.
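The procedure above can be sketched in plain C++ as follows (a sequential model of the parallel rounds; within one round all compared pairs are disjoint, so each round maps to one batch of parallel "swap if in wrong order" instructions):

```cpp
#include <algorithm>
#include <vector>

// Odd-even transposition sort: n rounds in total. Even rounds compare pairs
// (a[0],a[1]), (a[2],a[3]), ...; odd rounds compare (a[1],a[2]), (a[3],a[4]), ...
// All pairs within one round are disjoint, so each round is one batch of
// parallel "swap if out of order" operations.
void odd_even_transposition_sort(std::vector<int>& a) {
    int n = (int)a.size();
    for (int round = 0; round < n; ++round)
        for (int i = round % 2; i + 1 < n; i += 2)
            if (a[i] > a[i + 1])
                std::swap(a[i], a[i + 1]);
}
```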
This idea was also used in this year's Baltic OI (see subtask 5).
UPD: I didn't notice Benq's comment, sorry :D
This comment is 2 years late, but I think they decided not to include the subtask because sorting networks and the bitonic sorting network appeared in BOI 2021 P4 (swaps) 2 months before the IOI.
Let's compute a binary-lifting kind of $$$dp$$$: $$$dp[i][j][k]$$$ simulates the game starting from opponent $$$i$$$ for $$$2^j$$$ steps with the following rule: if you face an opponent with strength less than $$$2^k$$$, you beat them; otherwise, they beat you. Note that this neither depends on your strength nor changes it. We'll figure out what it returns later.

Suppose you currently have strength $$$z$$$ with its most significant bit at index $$$k$$$, and suppose you simulate using the $$$dp$$$. The simulation will be absolutely correct until the moment you face an opponent with strength at least $$$2^k$$$ and beat them. But notice that after this happens, the most significant bit of your strength must increase, so this can only happen $$$O(\log s_i)$$$ times. So we can afford to simulate up to this opponent with the $$$dp$$$, beat them, then start again.

What's left? We need to figure out when we'll face that opponent. We beat an opponent if (initial strength) + (gained strength) $$$-$$$ (opponent strength) $$$\geq 0$$$, so we can make our $$$dp$$$ return the maximum value of (gained strength) $$$-$$$ (opponent strength) and binary-lift to find the opponent that will move us to a higher most significant bit. Of course, you'd need another $$$dp$$$ that gives you the gained strength in order to compute this one, and a third $$$dp$$$ that gives you the opponent's index, but they all follow the same concept.
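The tables described here are built with the standard binary-lifting construction on a functional graph. Below is a minimal generic sketch of that building block (hypothetical names; the actual solution would keep one such table per threshold $$$k$$$, plus the max-tracking table described above):

```cpp
#include <utility>
#include <vector>

// Generic binary lifting over a functional graph (hypothetical sketch, not the
// full dungeons solution): nxt[j][i] is the node reached from i after 2^j
// steps, and gain[j][i] is the total value collected along those steps.
struct Lift {
    int lg;
    std::vector<std::vector<int>> nxt;
    std::vector<std::vector<long long>> gain;

    Lift(const std::vector<int>& to, const std::vector<long long>& val, int lg)
        : lg(lg),
          nxt(lg, std::vector<int>(to.size())),
          gain(lg, std::vector<long long>(to.size())) {
        int n = (int)to.size();
        for (int i = 0; i < n; ++i) { nxt[0][i] = to[i]; gain[0][i] = val[i]; }
        for (int j = 1; j < lg; ++j)
            for (int i = 0; i < n; ++i) {
                nxt[j][i] = nxt[j - 1][nxt[j - 1][i]];
                gain[j][i] = gain[j - 1][i] + gain[j - 1][nxt[j - 1][i]];
            }
    }

    // Walk `steps` edges starting from node i; returns (final node, total gain).
    std::pair<int, long long> walk(int i, unsigned long long steps) const {
        long long g = 0;
        for (int j = 0; j < lg; ++j)
            if (steps >> j & 1) { g += gain[j][i]; i = nxt[j][i]; }
        return {i, g};
    }
};
```

The real solution replaces the fixed step count with a condition ("lift as far as the simulation stays valid"), but the table construction is the same.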
I did the same. How did this not MLE?
I didn't write code, and I didn't really pay attention to the memory. My fix in rev. 1 is stupid; I'll think of a way to fix it.
UPD: an ugly way to fix it is to work in a base higher than base $$$2$$$. You'll use a bit more time and a bit less memory.
I think your memory problems mostly come from using jump-pointers, which take $$$O(N \log N)$$$ memory for a single tree. Instead, just consider this functional graph for each $$$k$$$, and build something more memory-efficient, like HLD or these jump pointers https://codeforces.me/blog/entry/74847.
Yep, working in a base higher than 2 works. But I TLE'd when I tried to use base 3 for both parts. Instead, I used base 8 for the power-of-two range containing the values and base 2 for the sparse table. This took about 5s and a bit over half of the ML on oj.uz.
You can use log4() :)) as I did.
I have a solution that works in $$$O(N \log X + Q \log X \log N)$$$ time and $$$O(N \log X)$$$ memory. I'm very angry because it fails an assert in a stupid place and I have no idea how that's possible. But the idea is probably correct.
Do HLD on each functional graph (removing an arbitrary cycle edge). If you reach the root of the forest, you can calculate with simple maths how many times you go around the cycle. So the issue is to find the first failure point on each vertex-to-root path. With some prefix sums you can determine in $$$O(1)$$$ whether the failure point is in the prefix of some chain, so you can find the chain containing the failure point in $$$O(\log N)$$$ time.
Once you find that chain, you can binary search for the first failure point. The first criterion is that you simply failed but earned enough money to jump to the next level; this is a partial sum, so of course $$$O(\log N)$$$ time suffices. The second criterion is whether there is some point with $$$S_i \in [2^k, hp]$$$. This can be phrased as finding the largest $$$l$$$ where $$$\min[l, r] > X$$$ for some array, so walking down the RMQ suffices. Since we spend $$$O(\log N) + O(\log N)$$$ time per query, the query complexity is $$$O(\log X \log N)$$$. Since constructing the RMQ takes linear time, initialization is $$$O(N \log X)$$$.
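The "walking down the RMQ" step can be sketched as a standard min segment tree descent. Here is a hypothetical standalone version of that primitive, finding the rightmost position $$$p \le r$$$ with value at most $$$X$$$ (every $$$l > p$$$ then satisfies $$$\min[l, r] > X$$$):

```cpp
#include <algorithm>
#include <vector>

// Min segment tree with a descent that finds the rightmost index p in [0, r]
// with value <= X (or -1 if none). Every l in (p, r] then satisfies
// min(a[l..r]) > X. Hypothetical standalone version of the primitive.
struct MinTree {
    int n;
    std::vector<int> t;

    MinTree(const std::vector<int>& arr) : n((int)arr.size()), t(4 * arr.size()) {
        build(1, 0, n - 1, arr);
    }
    void build(int v, int lo, int hi, const std::vector<int>& arr) {
        if (lo == hi) { t[v] = arr[lo]; return; }
        int mid = (lo + hi) / 2;
        build(2 * v, lo, mid, arr);
        build(2 * v + 1, mid + 1, hi, arr);
        t[v] = std::min(t[2 * v], t[2 * v + 1]);
    }
    // Rightmost index in [0, r] with value <= X, or -1 if there is none.
    int rightmost_leq(int r, int X) const { return go(1, 0, n - 1, r, X); }
    int go(int v, int lo, int hi, int r, int X) const {
        if (lo > r || t[v] > X) return -1;   // out of range, or all values > X
        if (lo == hi) return lo;
        int mid = (lo + hi) / 2;
        int res = go(2 * v + 1, mid + 1, hi, r, X);  // prefer the right half
        if (res != -1) return res;
        return go(2 * v, lo, mid, r, X);
    }
};
```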
Isn't DNA similar to Another IOI task?
Hi, I am the author of the dna problem :)
Indeed, the final version used in the contest ends up fairly similar to that task, now with queries. However, in my original proposal all four DNA bases were used, with no restriction to only three (which makes for a more natural story :) ). It also had updates on the strings, but that is indeed a fairly minor thing; I can only imagine it would make a difference for students who know prefix sums but not segment/Fenwick trees.
The original proposal even suggested inventing an alien or extra DNA base to have 5 bases, as the task can be solved by extending the same idea to that case, and the proof that it works for 5 came as quite a surprising result to me when I noticed it. Especially since for an unrestricted number of different letters the problem becomes NP-hard, and the same natural "greedy" approach that works for up to 5 letters breaks precisely at 6 letters.
In the end, I am very happy that the problem was selected and used, even though it was simplified, and I think the jury made the right call to change it: especially after seeing IOI 2021 Day 1, I think the contest was lacking one easy problem for overall balance.
The original version with four bases would've been much better in many ways, IMHO.
With 3 it was too easy (it didn't affect the medal distribution at all), too similar to that task, and also not very natural.
Hello, is there any complete analysis of this task on the internet? I mean the proofs for the 3-base, 4-base, and 5-base cases, and also the NP-hard version?
Here is an extract from my old submission of the problem https://filebin.net/xovlisuc39fnvdlw/dnaswapsolution.pdf
I had this approach for problem dungeons during the contest but didn't manage to code it in time. Did anyone pass with this approach?
Let's have a magic number $$$x$$$ that equals 11835, and divide the nodes into heavy and light nodes: node $$$u$$$ is heavy iff $$$S_u \geq x$$$ and light otherwise.
Let's construct a graph: if $$$u$$$ is a heavy node, there's a directed edge from $$$w_u$$$ to $$$u$$$; if $$$u$$$ is a light node, there's a directed edge from $$$L_u$$$ to $$$u$$$. It turns out that this is a functional graph consisting of multiple trees linked by cycles.
Let the initial strength be $$$z$$$ with $$$z \geq x$$$. Notice that we need at most $$$\lceil{\frac{10^7}{x}}\rceil$$$ jumps to reach a value $$$\geq 10^7$$$, provided that from every node we can jump to some node such that the strength gained in between is $$$\geq x$$$.
Now, to find for every node the next node to jump to, we can just use binary lifting on this graph. Notice that we need at most $$$x$$$ steps to get to the next node.
Total time complexity: $$$O(Q \cdot \lceil{\frac{10^7}{x}}\rceil \cdot \log x)$$$ (the time limit is 4 seconds).
Is it possible to solve DNA in $$$o(n)$$$ time per query if you're only allowed to swap adjacent elements?
I actually misread the problem and solved this version during the contest.
First of all, we will count the number of inversions among each pair of distinct characters (A and T, A and C, and T and C). We will then sum those counts up, and we should have our final answer.

WLOG, let's assume that the strings contain only the characters A and T. Let's split the given range greedily from left to right into several subranges such that each subrange can be sorted independently. Here, the observation is that the answer for each such subrange equals the absolute value of (the sum of indices that contain A in the first string) minus (the sum of indices that contain A in the second string). We will use binary lifting to calculate these values quickly.

I'm interested to know, if someone from the ISC is reading this, why such a weird difficulty distribution? The numbers of solves per problem are 24, 17, 11, 300+, 15, and 7, respectively. I know that selecting tasks for the IOI is a difficult job, but I'm not sure how you end up with such a contest. While the tasks themselves were interesting, it's hard to see the point in having a difference of 275+ solves between the easiest and the second easiest task.
Did you lack easier task proposals? Was there a misjudgment of task difficulty? Or did you intend it to be like this (with high scores but no ACs)?
Certainly, it's not a new thing for IOI to have a large gap between the easiest and second easiest task, but this year it was especially so. Again, no hate, just curious what happened.
Not speaking about this particular contest (dna is clearly much easier than all the other 5 regardless of the way you measure) but more about the "general trend" and the importance (or not :) ) of measuring by "full score".
I might be completely mistaken, but based on previous commentary during IOI General Assemblies, I am under the impression that "score distribution" is generally prioritized more than "Number of full solves". For example, past members of the ISC have said (when asked about giving difficulty feedback to students / sorting problems by difficulty) that speaking of overall "task difficulty" is quite misleading, as there is no inherent "extra bonus" for solving a problem with 100 points vs 90, and it is very often the case that something more or less like this happens:
1) Achieving 100 points for task A is extremely hard.
2) Achieving, say, 50 points for task A is extremely easy, as it has a "gift" subtask.
3) For another problem B, there is little difference between the difficulty of achieving 50 points and 100 points (the task is more "all or nothing", so to speak, probably relying on one "click" moment or great idea), and solving it is between 1) and 2) in difficulty.
For the previous situation, which problem is easier? By full solves, B is easier by far, but by average score they may be similar, or task A may even be easier. One example is Horses vs Sorting in 2015 (https://stats.ioinformatics.org/tasks/): one problem has twice as many full solves, but they have very similar average scores.
The full-solve measure is probably useful for directly comparing the difficulty of subtasks, as opposed to full tasks. The IOI can be considered a contest made up of many (about 15?) little (sub)problems, each having a different score, with some very related to each other. Thought of in this way (as if each subtask were a separate "ICPC-style problem", say), I guess the distribution of full solves among all subtasks might look like a more balanced "ICPC-style" contest.
Yes, that's something I considered as an option; it's just that I don't agree with that kind of thinking. In some cases I understand it: for instance, for problem Big Prize from 2017, I liked the 90-point vs 100-point distinction, and of course for a problem like that the number of full solves will be deceiving. But for something like Parks, giving 70 points for solutions that don't have much in common with the main solution (and are probably much easier), while at the same time valuing the full solution at 30 points (assuming you already coded the other subtasks as separate ICPC tasks, as you said), is a little absurd for such a difficult task. Since this is obviously intentional, I think it's good to have discussions about it. I don't think most contestants view the IOI the same way the ISC does, and I think that's a problem. At least for me, while I was still competing, collecting subtasks was seen as the easiest way to get a medal, but it was always (and I would say always will be) viewed as less honorable than fully solving tasks.
Also, as a final note: just because the score distribution is good for medal distribution (i.e., there are not too many ties, especially around the medal cutoffs), it does not mean that the scores accurately represent the contestants' coding/problem-solving ability, which I think is more important. With so many options, strategy becomes a much bigger issue than I would like.
I fully agree with the way that you describe the facts. For a long time now, the IOI has been a much more "strategical" contest than ICPC or Codeforces, for example. Is that the way the community wants it to be? Good question, I don't really know but I would expect it to be quite split to different opinions, some loving the IOI format as is, some preferring an ICPC or Codeforces style much more.
The "task A" situation you describe (a hard task with many points from easy subtasks) should never happen. It pretty much takes away the problem-solving part of the competition for everyone except the ones competing for the very top spots, by failing to reward contestants who choose to spend their time thinking about solving problems instead of thinking about which subtasks give most points. I don't understand why anyone who appreciates IOI being a problem-solving competition would consider it desirable to have this, and yet as you said it happens very often.
While it's true that subtasks can help with the difficulty distribution, the subtasks in this contest seem just about as bad: on day 1, there are an absurd number of 11s and 37s on the first two problems and not nearly enough people who scored between that and full score. Dungeons seems better on this front, but the rest are pretty awful.
I heard from the ISC that the intended difficulty of the day 1 tasks was parks < keys < candies (the opposite of what the ACs show).
I think it is very easy to underestimate the difficulty of parks, as the construction is stupidly simple. And they even gave subtask scores in multiples of 5 to hint that it was the "easy" problem :|
Very interesting!
Extending my other comment about overall score rather than only full solves: MANY more people have above 50 points on parks than on either candies or keys, and A LOT have 70 points, which few people have on the other problems. So even though not that many reached the precious 100-point mark, parks gave away a lot of points. I would not dare call it "the absolute hardest of day one" taking this into account, even though it was the hardest to get 100 points on.
Yeah, I can definitely see how you can underestimate a construction problem, especially if you've been looking at it for a while; after a while it starts to look obvious :) And there's always the scary possibility that someone will think of an easier construction than the committee's.
The difficulty for parks was indeed underestimated. When two silver medalists (who tested the contest blind) failed to solve parks, I realized that the difficulty estimation was off.
On the other hand, candies difficulty was over-estimated and I felt that it was closer to being a "medium" rather than a "hard" problem.
The result was having 3 medium problems instead (or maybe you could consider parks to be "hard"). Regardless, I decided not to change the line-up, as the situation did not warrant replacing the questions with backups.
Will Codeforces host those problems one day?
I assume from your name that you can read Vietnamese; you can submit IOI problems on the Vietnamese OJ.
Wow, thanks. It's great to see that VNOI has improved so much.
Will we see the combined standings of the people who participated in the virtual competition?