Hello dear users and MikeMirzayanov!
Lately I've often come across discussions about the use of AI in competitions. I decided to share my thoughts and personal experience on this issue, as well as make some suggestions that could help combat cheating more effectively.
As an experiment, I tested several paid ChatGPT models to see how well they solve contest problems across different algorithmic topics, both in terms of correctness and efficiency.
- o3-mini copes well with div2 A-C problems, but from div2 D onward, without an explicitly stated solution idea, the model often generates suboptimal code and misjudges the optimal asymptotic complexity of the algorithm. The problems are especially noticeable in dynamic programming tasks (deriving the recurrence) and in interactive problems.
- o3-mini-high shows a better understanding of the problems and deeper reasoning, but it still makes mistakes on hard div2 E-F problems, especially those involving dynamic programming. To get a correct and optimal solution, you often have to spell out the intended algorithm by hand, including the exact recurrence and the target complexity.
In my experience (gathered, of course, purely for experimental purposes), I noticed one interesting thing: without a detailed explanation of the algorithm, ChatGPT usually generates the same code pattern, with roughly the same function structure and names, a similar variable style, repeated comments, and frequent reuse of the same boilerplate and templates, especially for standard algorithms. An ordinary user can, of course, write a similar algorithm, but the layout and structure of their code will clearly differ from AI output.
In my opinion, this can serve as one layer of protection for automatic moderation of solutions on the platform. Codeforces moderation already sometimes detects suspiciously similar solutions on exactly this basis. However, right now it is easy to circumvent by obfuscating the code and adding "garbage" and unnecessary functions that make it hard to read. Several more layers of protection could be built to work in parallel with this main one.
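To illustrate the idea, similarity checking that survives simple renaming could start with identifier normalization: map every identifier to a canonical token and compare the resulting token streams. Below is a minimal Python sketch; the regexes and the `SequenceMatcher` scoring are my own simplifications for illustration, not how the actual Codeforces plagiarism checker works:

```python
import re
from difflib import SequenceMatcher

def normalize(source: str) -> str:
    """Strip comments and rename identifiers to canonical tokens,
    so that simple renaming does not hide similarity."""
    # remove // and /* */ comments (C++-style submissions)
    source = re.sub(r"//.*?$|/\*.*?\*/", "", source, flags=re.S | re.M)
    tokens = re.findall(r"[A-Za-z_]\w*|\S", source)
    mapping: dict[str, str] = {}
    out = []
    for tok in tokens:
        if re.fullmatch(r"[A-Za-z_]\w*", tok):
            # every identifier-like token becomes ID0, ID1, ... in order of appearance
            mapping.setdefault(tok, f"ID{len(mapping)}")
            out.append(mapping[tok])
        else:
            out.append(tok)
    return " ".join(out)

def similarity(a: str, b: str) -> float:
    """Similarity of two normalized submissions, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

# two submissions that differ only in naming
sub1 = "int solve(int n){int ans=0; for(int i=0;i<n;i++) ans+=i; return ans;}"
sub2 = "int calc(int m){int res=0; for(int j=0;j<m;j++) res+=j; return res;}"
print(similarity(sub1, sub2))  # high score despite renamed identifiers
```

This naive version is exactly what the obfuscation tricks described above defeat: injected garbage functions would dilute the match, so a serious implementation would need AST-level normalization and dead-code elimination on top.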
The basic idea is behavioral analysis of user actions in real time. The following functionality could be implemented:
- Measuring, in real time, the interval between opening the problem statement and the first submission of a solution
- Tracking sudden changes in solving speed, for example a hard problem solved suspiciously quickly after a long period of inactivity
- Logging all user actions on the contest page and sending them to the server in encrypted batches for an anti-fraud module, without which further activity on the site would be blocked; copy events on the problem statement would deserve especially close attention
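To make the timing checks above concrete, here is a minimal server-side sketch in Python. All thresholds, event names, and class names are hypothetical placeholders; real values would have to be calibrated on actual contest data:

```python
from dataclasses import dataclass, field

# Hypothetical thresholds; real values would need calibration on contest data.
MIN_SOLVE_SECONDS = 120          # first submission faster than this is suspicious
IDLE_THEN_SOLVE_SECONDS = 1800   # long page inactivity right before a submission

@dataclass
class ProblemSession:
    opened_at: float                  # when the statement was opened
    last_activity_at: float           # timestamp of the latest page event
    events: list[tuple[str, float]] = field(default_factory=list)

class AntiFraudMonitor:
    def __init__(self) -> None:
        self.sessions: dict[str, ProblemSession] = {}

    def open_statement(self, problem: str, now: float) -> None:
        self.sessions[problem] = ProblemSession(opened_at=now, last_activity_at=now)

    def record_event(self, problem: str, kind: str, now: float) -> None:
        s = self.sessions[problem]
        s.events.append((kind, now))
        s.last_activity_at = now

    def first_submission(self, problem: str, now: float) -> list[str]:
        """Return flags for this submission (empty list means nothing odd)."""
        s = self.sessions[problem]
        flags = []
        if now - s.opened_at < MIN_SOLVE_SECONDS:
            flags.append("too_fast_first_submission")
        if now - s.last_activity_at > IDLE_THEN_SOLVE_SECONDS:
            flags.append("solve_after_long_idle")
        if any(kind == "copy_statement" for kind, _ in s.events):
            flags.append("statement_copied")
        return flags
```

In practice such flags would only feed a scoring model for human review, since any single heuristic (a fast solve, a copied statement) has perfectly innocent explanations.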
A separate, key measure would be a dedicated client application (Client <-> Server). Its main task would be monitoring suspicious activity; in some ways its logic would resemble that of a proctor supervising, say, an online screening test:
- Analyzing processes running during the contest
- Analyzing incoming/outgoing traffic to detect requests to AI services
- Taking random screenshots of the user's screen and sending them to the server, but only when suspicious activity is detected
Of course, from a privacy standpoint this solution is far from ideal, so it is worth thinking carefully about how to implement it properly.
Share your thoughts and ideas in the comments; it will be interesting to hear them!