How do authors ensure that their solutions are correct? - Codeforces

→ Pay attention

Before contest
Codeforces Round 1006 (Div. 3)
3 days
Register now »

→ Top rated

#	User	Rating
1	tourist	3856
2	jiangly	3747
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3591
6	gamegame	3477
7	Benq	3468
8	Radewoosh	3462
9	ecnerwala	3451
10	heuristica	3431

Countries | Cities | Organizations

→ Top contributors

#	User	Contrib.
1	cry	167
2	-is-this-fft-	162
3	Dominater069	160
4	Um_nik	158
5	atcoder_official	157
6	Qingyu	156
7	djm03178	151
7	adamant	151
9	luogu_official	150
10	awoo	147

View all →

→ Find user

→ Recent actions

Detailed →

mazihang2022's blog

How do authors ensure that their solutions are correct?

By mazihang2022, history, 3 years ago, In English

In English

Is it possible to hack authors' solutions?

Tags

question, hack, contest

+38

mazihang2022
3 years ago
6

Comments

Comments (6)

Write comment?

»

3 years ago, # |

Vote: I like it

+54

Vote: I do not like it

Sometimes it is possible to hack the authors' solution, as seen here. However, it is generally rare, as there are many people (the coordinator and the testers) who read the problem and the solution and verify that they are correct.

Another way that authors ensure that the solutions are correct is by writing an exponential brute force, and checking their solutions against thousands of small cases. This is usually enough to catch most errors.

→ Reply

»

»

3 years ago, # ^ |

Vote: I like it

+4

Vote: I do not like it

So will the contest be unrated if this happens?

→ Reply

»

»

»

3 years ago, # ^ |

Vote: I like it

+17

Vote: I do not like it

It depends on how much it affects the contest. For example. when issue is fixed quickly enough (by changing solution or problem statement) or when error in solution appears only on systests and correct solution exists, contest will be rated.

→ Reply

»

3 years ago, # |

← Rev. 2 →

Vote: I like it

+4

Vote: I do not like it

Multiple solutions: different authors, various approaches (including slow ones).
Stress-testing.
Setter should write down a proof (if any) and somebody else should check it.

→ Reply

»

»

3 years ago, # ^ |

Vote: I like it

+13

Vote: I do not like it

Also, reference solutions with added bugs to ensure tests actually catch them.

→ Reply

»

3 years ago, # |

Vote: I like it

+49

Vote: I do not like it

For Codeforces, the problem selection procedure is pretty high quality, so the chance of this happening is very low. What follows below is only from my experience with problemsetting/coordinating for my uni contests (which spans about a dozen contests or so, and about a hundred problems).

Whenever I coordinate contests for my university CP club contests, I ask problemsetters to submit a formal mathematical proof of correctness of the intended solution as well as implementation (line-by-line, if it isn't standard) before setting the problem on Polygon (for instance, proof of claims needed for greedy, dp and binary search problems; structural properties for graph problems and so on). Any problem that I have ever approved using this strategy has had a correct solution.

From my experience, the more formal (and low-level) the proofs are, the less probable it is to make mistakes and miss out corner cases. For instance, while a high-level proof might have a chance of missing the fact that there are no cycles with length $$$\le 2$$$ in an undirected graph but there are such cycles in directed graphs, a low-level proof will definitely not miss out that detail. Maybe this arises out of my mathematical inclination, but I've seen that the more "machine-verifiable" a plausible proof seems to be, the more the chances are that it is correct.

An interesting thing that we've been thinking of implementing is to somewhat formalize the test-generation pipeline (since the judge has the responsibility to minimize false positives and eliminate false negatives), which is overlooked by quite a few new problemsetters. Testing with a brute-force solution also helps detect any incorrect author's solution, so it is imperative that the tests comprise a diverse enough subsets of all permissible inputs which can eliminate false positives and negatives on most sane approaches, even for the author's solution. Some of the things we currently think of while making tests are as follows, but they are just heuristics and not totally formal:

TLE tests for both heuristic solutions as well as plain dumb solutions (using stress tests)
WA tests for stuff like greedy/dp, for heuristic solutions with the wrong greedy or dp
Exhaustive tests for small cases
Random tests (with certain structural properties, so not completely random) to account for cases we can't manually find
RE tests (out of bounds, overflow)
Ensure that if the problem has subtasks, then if a subtask A generalizes subtask B, then tests of subtask A should contain those of subtask B.
Look through the proof of the author's solution (as well as the tester's solution and others who have solved the problem), and consult a list of generic mistakes that people make while implementing various algorithms (for instance, for binary search, people sometimes write an infinite loop, make off by one errors, miss out the boundaries and so on), and modify some pre-built tests to fit that problem.

If there are some more test-generation ideas that I missed out on, or some strategies that can "formalize" this procedure to a good extent, please let me know!

→ Reply