Randomized Algorithms and Online Judges

№	Пользователь	Рейтинг
1	tourist	3856
2	jiangly	3747
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3591
6	gamegame	3477
7	Benq	3468
8	Radewoosh	3462
9	ecnerwala	3451
10	heuristica	3431

№	Пользователь	Вклад
1	cry	168
2	-is-this-fft-	162
3	Dominater069	160
4	Um_nik	159
5	atcoder_official	156
6	djm03178	153
6	adamant	153
8	luogu_official	149
9	awoo	148
10	TheScrasse	146

Randomization is known to be a powerful algorithmic technique, and I want to start a discussion about it specifically in the context of competitive programming. What extra considerations should we make about how online judges work when analyzing our randomized algorithms, and approving such problems for an official contest? In particular, I want to discuss algorithms whose running times are uncertain.

Let's say that the running time of an algorithm is a random variable $$$X$$$. Typically, we might say the algorithm is acceptable if the expected running time is a fraction of the time limit: $$$E(X)=T/k$$$. With Markov's inequality, we can say $$$P(X>T)\le 1/k$$$. That is, the probability of failing a test case due to timeout is at most $$$1/k$$$.

TLE Trick

Online judges usually run your program a few times before giving the Time Limit Exceeded verdict. This fact is known to help randomized algorithms, as mentioned here: https://codeforces.me/blog/entry/78817

Let's apply this in the context of random running times. If your program is run $$$n$$$ times, the actual probability of failing a test case is at most $$$1/k^n$$$. If there are $$$m$$$ test cases total, the probability of failing at least one of the test cases is $$$1-(1-1/k^n)^m$$$

Solving for $$$k$$$

Let's suppose we call a program acceptable if its probability of failure is at most $$$p$$$. Then we are interested in what the constant $$$k$$$ should be in order to be confident of success.

$$$k\ge \left[ 1-(1-p)^{1/m}\right]^{-1/n}$$$

Let's plug in some values. Suppose there are $$$m=200$$$ test cases, your program gets $$$n=3$$$ chances per test case, and we can tolerate a failure probability of $$$p=10^{-3}$$$. Then we should require $$$k\approx 58.47$$$, which is quite a considerable constant factor that we should be careful about when making randomized solutions.

Future Directions

If we make extra assumptions on the distribution of $$$X$$$ that apply widely to randomized algorithms, can we be confident with a much smaller value of $$$k$$$? What will the analysis look like on some specific cases like treaps? I'm interested to hear your thoughts.

Комментарии (1)

Показать архивные | Написать комментарий?

Errichto

4 года назад, # |

+42

Is there actually any platform that runs your program three times? It seems unnecessary. I always thought that CF runs your program second time if TLE, and most of other platforms don't even do that. And then your computations are way off because indeed running many times helps a lot if you independently have probability ~0.01 to fail.

If the distribution of $$$X$$$ is quite irregular, it will just be difficult to estimate anything. In practice, it's just better to run your program multiple times on your machine and check if it's good enough on hundreds of tests.

Slightly related: Petr once asked this question about the correctness of the intended solution but sadly there was no further discussion on it:

[Sometimes] you don't have a way to be 100% certain your output is correct even given infinite computational resources. You can, of course, be 99,9999999999999999% certain. How many nines is acceptable here?

source: https://petr-mitrichev.blogspot.com/2016/07/random-problems.html

→ Ответить

Блог пользователя Monogon

TLE Trick

Solving for $$$k$$$

Future Directions