How do authors ensure that their solutions are correct? - Codeforces

→ Обратите внимание

До соревнования
Codeforces Round 973 (Div. 2)
11:44:12
Зарегистрироваться »

*есть доп. регистрация

→ Лидеры (рейтинг)

№	Пользователь	Рейтинг
1	tourist	4009
2	jiangly	3773
3	Radewoosh	3646
4	ecnerwala	3624
5	jqdai0815	3620
5	Benq	3620
7	orzdevinwang	3612
8	Geothermal	3569
8	cnnfls_csy	3569
10	Um_nik	3396

Страны | Города | Организации

→ Лидеры (вклад)

№	Пользователь	Вклад
1	Um_nik	163
2	cry	161
3	maomao90	160
4	-is-this-fft-	159
5	awoo	158
6	atcoder_official	157
7	adamant	155
7	nor	155
9	maroonrk	152
10	Dominater069	148

Всё →

→ Найти пользователя

→ Прямой эфир

Детальнее →

Блог пользователя mazihang2022

How do authors ensure that their solutions are correct?

Автор mazihang2022, история, 3 года назад, По-английски

По-английски

Is it possible to hack authors' solutions?

Теги

question, hack, contest

+38

mazihang2022
3 года назад
6

Комментарии

Комментарии (6)

Написать комментарий?

»

3 года назад, # |

Проголосовать: нравится

+54

Проголосовать: не нравится

Sometimes it is possible to hack the authors' solution, as seen here. However, it is generally rare, as there are many people (the coordinator and the testers) who read the problem and the solution and verify that they are correct.

Another way that authors ensure that the solutions are correct is by writing an exponential brute force, and checking their solutions against thousands of small cases. This is usually enough to catch most errors.

→ Ответить

»

»

3 года назад, # ^ |

Проголосовать: нравится

+4

Проголосовать: не нравится

So will the contest be unrated if this happens?

→ Ответить

»

»

»

3 года назад, # ^ |

Проголосовать: нравится

+17

Проголосовать: не нравится

It depends on how much it affects the contest. For example. when issue is fixed quickly enough (by changing solution or problem statement) or when error in solution appears only on systests and correct solution exists, contest will be rated.

→ Ответить

»

3 года назад, # |

← Rev. 2 →

Проголосовать: нравится

+4

Проголосовать: не нравится

Multiple solutions: different authors, various approaches (including slow ones).
Stress-testing.
Setter should write down a proof (if any) and somebody else should check it.

→ Ответить

»

»

3 года назад, # ^ |

Проголосовать: нравится

+13

Проголосовать: не нравится

Also, reference solutions with added bugs to ensure tests actually catch them.

→ Ответить

»

3 года назад, # |

Проголосовать: нравится

+49

Проголосовать: не нравится

For Codeforces, the problem selection procedure is pretty high quality, so the chance of this happening is very low. What follows below is only from my experience with problemsetting/coordinating for my uni contests (which spans about a dozen contests or so, and about a hundred problems).

Whenever I coordinate contests for my university CP club contests, I ask problemsetters to submit a formal mathematical proof of correctness of the intended solution as well as implementation (line-by-line, if it isn't standard) before setting the problem on Polygon (for instance, proof of claims needed for greedy, dp and binary search problems; structural properties for graph problems and so on). Any problem that I have ever approved using this strategy has had a correct solution.

From my experience, the more formal (and low-level) the proofs are, the less probable it is to make mistakes and miss out corner cases. For instance, while a high-level proof might have a chance of missing the fact that there are no cycles with length $$$\le 2$$$ in an undirected graph but there are such cycles in directed graphs, a low-level proof will definitely not miss out that detail. Maybe this arises out of my mathematical inclination, but I've seen that the more "machine-verifiable" a plausible proof seems to be, the more the chances are that it is correct.

An interesting thing that we've been thinking of implementing is to somewhat formalize the test-generation pipeline (since the judge has the responsibility to minimize false positives and eliminate false negatives), which is overlooked by quite a few new problemsetters. Testing with a brute-force solution also helps detect any incorrect author's solution, so it is imperative that the tests comprise a diverse enough subsets of all permissible inputs which can eliminate false positives and negatives on most sane approaches, even for the author's solution. Some of the things we currently think of while making tests are as follows, but they are just heuristics and not totally formal:

TLE tests for both heuristic solutions as well as plain dumb solutions (using stress tests)
WA tests for stuff like greedy/dp, for heuristic solutions with the wrong greedy or dp
Exhaustive tests for small cases
Random tests (with certain structural properties, so not completely random) to account for cases we can't manually find
RE tests (out of bounds, overflow)
Ensure that if the problem has subtasks, then if a subtask A generalizes subtask B, then tests of subtask A should contain those of subtask B.
Look through the proof of the author's solution (as well as the tester's solution and others who have solved the problem), and consult a list of generic mistakes that people make while implementing various algorithms (for instance, for binary search, people sometimes write an infinite loop, make off by one errors, miss out the boundaries and so on), and modify some pre-built tests to fit that problem.

If there are some more test-generation ideas that I missed out on, or some strategies that can "formalize" this procedure to a good extent, please let me know!

→ Ответить

Codeforces (c) Copyright 2010-2024 Михаил Мирзаянов

Соревнования по программированию 2.0

Время на сервере: 20.09.2024 05:50:49 (l1).

Десктопная версия, переключиться на мобильную.

При поддержке