The round 1 of Facebook Hacker Cup 2016 starts in a little over 4 hours at 10am PST.
Everyone who solved at least one problem in the qualification round (CF post) is eligible to compete. Click here to enter Round 1.
Note that advancement criteria to round 2 was changed. The criteria now is to achieve at least 30 points in Round 1, instead of same points as 500th. These changes were published in the competition FAQ.
Moving it into recent actions. Contest will start in 3 minutes.
I already hate laundry and counting number of zeroes in constraints :)
http://codeforces.me/contest/452/problem/D I hate laundry.
Could someone please explain the first sample test of Yatchzee? The only step costs 2 dollars, meaning the amount of money left will never exceed 1 dollar, how can the expected value be 1.666666667?
Why?
EDIT: You people have no respect for the Socratic Method.
The statement said that he will stop when he can't afford the cost of the next step, so if he's left with 2 dollars or more, he will continue paying.
Oh I see it now. The initial money is a real value.
For problem Laundro,Matt which data structure should I use? Can anyone give any suggestions?
NO
That's against the rules. We'd be disqualified for discussing the problems before the contest ends.
Sorry!!I did not know that.
Do you usually discuss questions during the exams with your friends in your school (or when you were at school) ?
why they have this one time 6 minutes rule ? it's so annoying for countries with crappy internet. I couldn't download input file fast enough and it's expired now :| I sent an email to them, and they responded :
"Unfortunately, we cannot extend the time limit — it's your responsibility to be able to download the input in time (we make sure it's not too large)."
it's just so disappointing!
An idea for organizers: encrypt the input (ZIP should be ok I think), make tests downloadable at any moment and show password on request (which should also start countdown).
Same thing works for uploading: make user upload hash of their output (probably, with some salt so sharing hashes does not help) under 6 minutes and then allow more time to upload actual file.
Hashing output have a few drawbacks, since hashing is very sensitive to small changes:
Lower/Upper case: case #1:
Forgot #: Case 1:
Precision: Case #1: 1.2345678910
Trailing spaces: Case #1: 1 2 3 4 (trailing SPACE)
Getting an encrypted input beforehand would be nice though.
In case they use smart checker which can handle all these things — you can ask about sending hash of your output during 6 minutes timespan, and then uploading your output (which should have same hash) in next X hours.
On the other side — output files are small, so it doesn't make much sense, unlike giving encoded input beforehand.
Yeah, I like the idea of uploading a hash, and then allowing arbitrarily long upload times.
(though usually input is by far the larger file)
I really hope that yeputons's idea is applied in the next round.
Not next round, but potentially next year.
That's a good idea, but overcomplicated for those who have fast internet access (unzip inputs and getting hashes for all tasks can be annoying). Maybe you can keep both the current format and yeputons's idea?
Yeah, this is a minority use case. We'd want both formats.
One solution is like this:
For getting input, we can write test case generator in javascript — when you click 'Download data', the server will send the js code, your browser will run it then get an input file. Nothing will look different from current workflow, only backend changes.
For uploading output, we can keep the current UI, when you click 'Upload Output', the browser first send hash to server, then try to upload the actual file, if failed — then it will show 'Please upload output file with hash xxxxxxxx'. So if your internet is good, then nothing will be changed.
I wish you all the best with generating 10MB file with JavaScript in browser. Especially if generator is not that simple.
Also, that would expose generating strategies to participants, which is not the best option. E.g. there are some problems where you have to find, say, Hamiltonian cycle in a graph. Exposing generator is a really bad idea here.
You are right, there may be performance issue for some cases.
For your second concern, we can uglify that code, then it will be very hard to get some useful info under 6 minutes.
I guess by uglify you mean to obfuscate/minify? the code, in this case you can go to one of the many online tools to prettify it, there are many algorithms easily identifiable seeing code structure, I think not many people can do something useful with 6 minutes but I'm sure is not impossible.
it's fun because AFAIK there is special initiative for facebook employees "emulate crappy internet on your workplace", exactly for such cases. I suspect hackercup developers opted to avoid it
I'm wondering why the constraints are written as T ≤ maxt while actually it's always T = maxt :P
In sample inputs T < maxt.
Может кто-нибудь из "прошаренных" выложить свои входные/исходные файлы?
Lets share input/output files!
My Outputs MD5:
a) 8ac11aa9fc477d369246edcfc70b9520
c) 7493cd269e214695e9492e3695b638f7
d) wrong((
output without input doesn't make much sense:)
My in/out files
Apparently, B is incorrect:)
My answers are similar (1-st and 3-rd problems)!
hooooh, submited A and B. Compare As — ok, same answers. Compare Bs — Different answers! And mine is Accepted xD.
You can grab the official I/O here: https://www.dropbox.com/sh/6f6wqzt8372azqo/AABAA8gWT44c_x4LBHnrTkCqa?dl=0
We'll be putting up a solutions post in a bit.
I was lucky that my C passed the tests. I had a potential overflow bug in a very specific case. If Mike adds the problems to the gym, you might want to add the following test case:
Output: 249999.002001000
The test cases are very bad
For C Yachtzee most Ci are less than 1000 and does not contain boundary cases at all. For example
Expected answer is 0.5 while many people, if using
double
instead oflong double
and simply using(f(b) - f(a)) / (b - a)
, will output0
and for D i don't know why they have to repeat N = 2 so many times and there is not even a single
9 9
for N = 16I completely agree. The test cases are extremely weak and certainly not fair to everyone. In problem B, a lot of people complained that the input contained N > 10000 (before they changed the constraints). The input file I received had N < 1000 for all cases. I know that the inputs are generated randomly and are different for every contestant, but what kind of randomly generated input contains N < 1000 for some and N < 10^5 for some others, considering all 50 cases? And how exactly do you call this fair?
Exact same problem repeats in C. It was mentioned that Ci <= 10^9, but in my input file, all Ci < 1000. Even apart from this, the test cases for C are very weak. Already a lot of contestants are mentioning the cases for which their solution fails in the comments. Not to mention those tricky cases where not using long double will lead to precision error for some contestants.
Please take care of these issues in the next round. I can accept somewhat weak test cases, but the input file of two contestants shouldn't differ so significantly in their constraints to ensure a fair competition.
Can you please explain why we need long double? I mean can't double handle 10^18? and only time here precision comes is when we multiply it segment values with 0.5, and finally divide by b-a. In the whole process,there is no extra division, so I'm confused why double won't work. I used only double, and it passes the cases you guys mentioned here :S
double
contains only 52+1 significant bits.I don't mean it will fail everyone. I just mean it will fail some people who are careless enough. And it's the organizer's responsibility to catch those people.
Thanks a lot for the clean explanation :) So when numbers are floating point,instead of integers, the range double can accurately represent decreases,right? because if a 0.5 is added to (2^53)-1, it is no longer accurately represented, 0.5 takes some bits.
Does long double contain 63 significant bits?
Yes.
long double
contains 64 significant bits and therefore can store anylong long
value accuratelyHow to solve Laundro, Matt?
We want to finish washing the laundry as soon as possible. So we put the possible finishing times for the washing machines in a priority queue.
Then, we process the L items and assign each one of them to the earliest possible finishing time. This is just the minimum in the priority queue. Suppose this minimum time is X. Then, this washing machine is able to finish washing another item at time X + WashTimeOfThisMachine.
We can use the same reasoning for the driers, but it's not necessary. We will put an item to dry asap as well. This is X + D is there is a drier available, or D past the next available drier time. If we are processing item i, the next available drier is available at time WashTime[i-M] + D.
L=6,N=3 W=[4,4,7]
Priority queue passes this test?
Yes, why wouldn't it?
A simple priority queue will give this ordering — w1 w2 w3 w1 w2 w3, but the optimal ordering is w1 w2 w3 w1 w2 w1. So, I guess, some extra logic was to be applied to priority queue :)
No, that ordering would be achieved if we just sorted the wash times once. When we pop the minimum from the priority queue, we add the new available time of this machine, which is X + WashTimeOfThisMachine. The priority queue takes care of putting this new element in the right place so that we keep getting the minimum.
I don't understand the last formula. Could you elaborate on that?
All driers spend the same amount of time D to dry an item. So they are indistinguishable and we will obviously put the first M washed items in the M driers.
Note that we have the order of the times the items finish getting washed because we are getting the minimums from the priority queue. WashTime[i] < WashTime[i+1].
So the first item will be washed and dried at time WashTime[0]+D. The second at WashTime[1] + D and the Mth at WashTime[M-1]+D.
Since WashTime[i] < WashTime[i+1], WashTime[i]+D < WashTime[i+1]+D.
When the (M+1)th item finishes washing, we will put it in the first drier because it is the one that is available first. The first drier is available at time WashTime[0]+D. So we either put item M+1 there at time WashTime[M] or WashTime[0]+D, whichever is greater.
Now I get it :) Thank you for your effort.
You're welcome!
I solved it using binary search to find the minimum time to wash everything. If this minimum time is X then I divide X by wi for all w to get the maximum loads to be inserted in a washer and store these in a vector. Then I sort them, and choose the first L values after sorting. If p=(L-1)%m, then this is the dryer number which will have the last load. Thus, I calculate minimum drying time of last load by operating on p-dryer for L/m times using formula time=max(current dryer time, load's wash time).
I am just curious.
Does somebody find the self-documenting style of code clear and intuitive?
I try to write my code in that style, so it can serve a higher purpose and help someone struggling with thier's understanding (rather than just solve a problem). But recently I've heard an opinion that I am wrong and self-documenting code is more confusing than clear.
For example, what do you people think of the solution to the problem.
Do you find it clear or confusing?
Only сoach could see your submission. Use ideone/pastbin. i think it is clear but too slow for competing.
Thank you, I didn't know about that.
As for the writing speed, I think that it shouldn't be a concern unless the coding speed is the bottleneck. I think that the first priority should be the clarity of a thought process and the clarity in the writing style could help in that endevour.
But I may be wrong, that is why I am seeking for a feedback :)
Look at the Petr submissions. Almost all of them are in self-documenting style. I enjoy reading his code and who can say that he is slow? :)
One more thing, using short meaningless variables leads to hard debugging process, especially for big code. Also, you type large name only once, further intellisense helps you when you use some IDE. I hope I'll also use the same style soon.
completely unreadable. Just use variable names introduced in problem description and i, j, k for array indices. Use comments if you need to clarify something.
And, to be honest, nobody cares about your code unless you're red
When the round 2 will starts ?
Next Saturday at 10am PST.
I thought and coded for C and took me about 12 hourse -_- Then after I submitted 3 minutes before the contest end, I found my solution will give wrong ans for the following test case (and it will take two more lines to fix it)
1 2 4 2
I was devastated. But a while ago, I found out it had passed the test :)
Well, I felt after 12 hours of nonstop thinking, I at least deserved something.
The contest is in Gym: 2016 Facebook Hacker Cup, Round 1.
Range 1 ≤ Wi ≤ 109 is not specified in the Problem B.
Thanks. Fixed.
If anyone is interested, here is some statistic on the number of advancers:
2298 — 35 points (by new rules)
560 — 80 points(by old rules)
What the heck was going on with tests to B???
When I was downloading input, there was one announcement, that L<=1e6 not L<=1e5, but only this one (at that time n<=1e4). There was also constraint that m<=1e9. In test I download in all cases n,m<1e3 holds, seriously? Technically, this test fulfills all constraints, however cases with constraints at least close to their maximal value given by constraints should be present!
What is more, after some time constraint for n was enlarged to 1e5 and my friend got test with n=1e5 in his test (and I got all of them <1e3...).
In this particular task, I think that enlarging those constraints weren't that important, but I don't see any explanation for such behavior of organizers (both enlarging constraints during contest and not putting Maximal values of variables in tests).
Yeah, the data weakness was my fault. To be clear, the bounds weren't changed during the contest, in the sense that people were already receiving files with the higher bounds (I just typo'd the number of zeroes in the constraints). Additionally, the large cases weren't forcibly included in everyone's input files.
I'm sure there are a few people who wrote a slow solution but got lucky with a bad input file. Our precedent is to let those solutions stand, especially in a round with no fixed number of advancers. You can be sure I'll be quadruple-checking the Round 2 data.
Good to hear that you will carefully check constraints next time, however I am still not happy about the statement "Additionally, the large cases weren't forcibly included in everyone's input files." and as far as I understood you weren't referring to this. This probably holds for all other types of cases. I don't know what is the exact algorithm of choosing which tests to include, but you should ensure that whatever set of testcases is chosen it should test every possible aspect of solution. It is unacceptable that it is possible that guys A and B both have written solutions with significantly too slow time complexities or solutions which fail for an edge case n=1 and one of them will pass test and the second one will not because first guy got lucky with set of testcases he downloaded.
Swistakk,
I fully agree. The way I would do it is to have several groups of tests (corner cases, maximum constraints, tricky cases, randomized test). Then sample several tests from each group, mix them in randomized way and use for testing. Sampling tests from the whole population (without grouping) is not fair.
Yeah, this is exactly how we've set things up. The problem is that all of the tests were accidentally placed in the random bucket :)
How to solve problem C-Yachtzee ?
My approach: find cumulative sum of the costs. then for each cumulative sums from last to first find the disjoint sets of modulos that can appear. Actually there can be 3 types of them call them L, M , R and they are set by the total cost of all the steps.
take the following test case:
2 9 20
8 2
the total cost is 10
the first number that is divisible by 10 after or equal to A=9 is 10
so L is [9,10)
the first number that is divisible by 10 before or equal to B=20 is 20
So R is [20,20]
and M is the range between L and R [10,20)
So if we modulo it by 10 we can get the following ranges:
1) From L: [9,10) 2) From M: [0,10) 3) From R: [0]
However notice that the modulo range those are above or equal to 8 can be spent by the first cost as the programmer greedily spends his money from first to last step
So the ranges are further edited.
1) From L: [1,2) 2) From M: [0,8)[0,2) 3) From R: [0]
From here you can calculate the expected value by calculating the length and midpoint of each ranges
I guess I got lucky :) Only submitted problem D solution which is basically wrong: start with identity permutation from 1 to N, and in every step swap a random element from the first half with one from the second half (doing this 2000000 times). It passed the tests.
http://attachment.fbsbx.com/hackercup_source.php?sid=960450640657968
That's really interesting :)
There aren't that many distinct tournament trees due to symmetry, so I guess your solution has a high probability to find the right answer in all tests.
It's interesting what's worst possible input for this solution.
Best one I come up with is this (#1 needs a lot of luck to win this one). Still random is good enough to solve it.
If even in Google Codejam finals tomek ALMOST got away with using simulated annealing... Why not? :)
Even I did it in the same way! but my output failed.
For N = 1, 2, 4, 8 we can just iterate over all permutations. Which fits very well in the 6 minutes time.
But for N = 16, I used random_shuffle() doing around 70000 iterations. But sadly my output failed for 1 test case. Where as it worked for 80000 iterations. :(
This random has a bigger chances)))(отжиг)
http://codeforces.me/gym/100875/submission/15430079
I almost qualified ;_;
One teeny tiny error held me back this time :(
Don't worry, you aren't alone ))): http://codeforces.me/gym/100875/submission/15422585
I wrote
It should've been
I would've cried when I realized this if I was alone in my room at that time. But thanks for your consolation. It is always better to know you aren't the only one :)
For problem D, I wrote a backtrack + BPM solution. It seemed like a good idea when I was writing it :p
Please tell me what is BPM?
Bipartite matching.
I have always thought that BPM is "Beats per minute" :).
I was really confused to see so many people solving 2-3 problems. Though I thought the problems were standard. It is really weird to see almost 2000 people qualifying for the round 2. Now it makes sense after reading all the discussions above. :|