In this question 822C - Hacker, pack your bags! , when I used set it gave TLE on test case 10 but when I used unordered_set, it got Accepted. I read some blogs where they told not to use unordered_set/map on CF as it can be hacked, so what should I do, should I use unordered set/map or not?
Submission 1 (TLE) : 256948385 Submission 2 (Accepted) : 256948457
It's a very common question, that has already appeared multiple times in Codeforces blogs.
unordered_map gives you O(1) complexity on average, while map gives O(logn). Seems that unordered_map is better in such case, but at what cost does it provide such complexity?
Hashing. What's the problem about hashing?
Hashing leads to collision. Yes, when data is random probability of collision is very small ($$$2^{-n}$$$, where n is the length of the hash when it's a binary number), but when your solution can be viewed by everyone there is a really large change that some enthusiast will appear that knows some cryptography and will find an input that leads to hash collision in no time, what leads to O(n) complexity instead of O(1).
How to prevent this? Everything described here.
And the same thing works for unordered_set and similar data structures as well. As a rule of thumb, if you don't know how to prevent your solution from getting hacked using unstable data structures — use their stable analogues, but not them.
This is a really nice comment
But what the OP said here is the other way around, Unordered got accepted and ordered got TLE
ordered_set and just set are different data structures from STL, don't confuse them with each other.
Solution with set has got TLE because, once again, on random data unstable structures as unordered_set give better performance, but it applies only and only on tasks with closed solutions. If solutions are opened — wait for successful hack.
Very often when your solution can't pass because of the additional logn it's either because your idea is inefficient or because your implementation is inefficient. It's very rare when task requires to use unstable data structures with no alternatives to them
I meant the regular set.
Got your point thanks, but what do you say about using custom hash functions ?
Using custom hash functions allows you to modify unordered_set hashing so that it's less likely to be hacked.
Here's my template:
Please note that not everything is able to be given a hash, so only stick to basic data types such as int, string etc.
ah yes string, my favorite basic data type
If you're using unordered_map or unordered_set you MUST have them anyway. I've already gave a link to the blog that gives comprehensive information about which exactly function you should use, so I won't duplicate it.
About what orz said: I completely agree with him in terms of competitive programming. When you have the prewritten protection code then it's much better to use unordered_map or unordered_set if you only don't need the sorted elements in these structures. Also, most of the time, you really don't need anti-anti-hack protection, because it's very rare now that someone will try to fetch the random and write the test based on it, because it's enormously time consuming and it doesn't cost it.
If you look in my solutions, then you will see that it's very rare when I'm using such structures, and I have an explanation for this as well. Actually, I've a couple of reasons:
What do you mean by “You can't use random hashing in real programming”?
Every time you restart your program your hashing function will have different power and/or module, and therefore with every new launch of the program the hash for the same data will be different, and therefore if you are saving it somewhere it will produce new and new entries without finding the old ones.
If these structures are used just for temporary data, then yes, you can use it even there.
thanks a lot!
I personally believe it is vice versa — use
unordered_map
if you do not really need the elements to be stored in the sorted order.Yes, there is a danger of hacks, but, firstly, the cure is not very complicated (one of the commenters already gave a link to a comprehensive material), secondly, I suppose the problem is maybe exaggerated. I believe I don't even have an anti-anti-hash test protection in my template, but I still have never been hacked this way.
For me personally a logical line of behavior looks like this: just use
unordered_map
like hacks don't exist. And only if someone hacks your solution or it fails system tests, then probably it is worth researching into this problem further.If you want to be secure and prefer to put on an extra layer of protection, better read https://codeforces.me/blog/entry/62393, add some needed lines to your code and never remember again about
unordered_map
being hackable.orz orz
thanks a lot!
I use ordered map sorted by key value pairs