[Repost] "Justice may be delayed, but it cannot be absent": New Evidence on NXIST's Cheating Scandal

Revision en6, by cjj490168650, 2023-04-10 09:34:24

Translated by GPT-4 with some adjustment. Original post: 「正义可以迟来但不能缺席」:关于 NXIST 的一些新证据

This article presents a logically complete chain of evidence, drawn only from public internet resources, regarding the suspected cheating at the ICPC Yinchuan and ICPC Shenyang regional contests held in 2021. We located the GitHub account suspected to have belonged two years ago to Lan Hao, a member of the Ningxia Institute of Science and Technology (NXIST) TS 1 team (NaokiLH, since renamed to https://github.com/brokenTarget), and analyzed the commit records of his algorithm competition repo. These records are direct evidence that at least 4 problems from the Yinchuan regional problem set and at least 6 problems (including scrapped problems) from the Shenyang regional problem set were leaked to him days to weeks before the contests. This substantial body of new public evidence indicates that the TS 1 team did cheat and that the students concerned were heavily involved.

This is the first direct evidence related to the incident after several years of discussion. This article is based on the mining and analysis of publicly available online information by @lucas110550 and @曾耀辉, and none of the evidence provided involves any infringement or violation of relevant regulations. @陈靖邦 handled overall coordination and review. We welcome everyone to scrutinize and report on this matter.

Since the vast majority of the evidence comes from the commit history of the GitHub repo of NaokiLH (the suspected account of Lan Hao), we strongly suggest that everyone fork the repo to preserve this record permanently, in case the person involved deletes it after this article is published.

https://github.com/NaokiLH/algorithm_trans

UPD: The original repo has been deleted, those interested can move to the personal backup repo:

https://github.com/NXIST-backup/algorithm_trans
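For readers who prefer a local copy to a GitHub fork, a mirror clone keeps every branch and the full commit history. The following is only a minimal sketch, assuming `git` is installed; the helper names and the destination directory are hypothetical and not part of the original post.

```python
import subprocess

# Backup repo URL taken from the post above.
BACKUP_URL = "https://github.com/NXIST-backup/algorithm_trans"


def mirror_clone_command(url: str, dest: str) -> list[str]:
    """Build the git command for a mirror clone (all refs, full history)."""
    return ["git", "clone", "--mirror", url, dest]


def backup(url: str = BACKUP_URL, dest: str = "algorithm_trans.git") -> None:
    """Run the mirror clone; raises CalledProcessError if git fails."""
    subprocess.run(mirror_clone_command(url, dest), check=True)
```

Calling `backup()` would create a bare repository in `algorithm_trans.git`, which can later be inspected with ordinary `git log` and `git show` commands.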

Background Information

How to evaluate the ICPC Yinchuan Competition in 2021?

How to view the cheating controversy in the 2021 ICPC Yinchuan Station and the publication of outstanding team members winning gold medals by Ningxia Institute of Science and Technology on its official WeChat account?

How to view Ningxia Institute of Science and Technology winning one gold and one silver in the 2021 ICPC Yinchuan competition?

PDS Plagiarism Detection System Example: A demonstration of plagiarism detection in a certain competition

https://weibo.com/u/7535856183

Main Content

Recently, NXIST announced the hosting of the 2023 Silk Road China Invitational.

How to evaluate the 2023 International Collegiate Programming Contest (ICPC) Silk Road China Invitational?

Upon learning of this, I was not only shocked but also deeply saddened: What is the purpose of doing such things?

So, on a leisurely afternoon, I began to search the internet for information about the award-winning team members, namely the TS 1 team members: Lan Hao, Ni Binqi, and Zhou Jianing. In an inconspicuous corner of GitHub, I found a homework submission repository for "Geek University's Python Advanced Training Camp — 1st Term" with the same name as one of the parties involved:

Week08 Homework Link Collection · Issue #52 · Python001-class01/Python001-class01

The information submitted by user upupqi

The information submitted by user upupqi contains the name of one of the parties involved, Ni Binqi. This led me to the user upupqi's profile directly:

User upupqi's GitHub profile

Of course, we can't directly conclude that this is the person in question (after all, there are many people with the same name). After some investigation (such as his algorithm competition repository, confirming that he is also an algorithm competition participant), we obtained a very strong piece of evidence (and the source of this article): his mutual follower NaokiLH, an ID that is suspected to point to another party involved, Lan Hao (LH).

upupqi and NaokiLH's follower interface

In an early issue raised by NaokiLH's account, there is a screenshot of his computer interface, where we can find a "Lan Hao 45418016" compressed file, which preliminarily confirms that the owner of this account is also named Lan Hao.

NaokiLH's screenshot of his computer desktop in a GitHub issue

Now that we had the GitHub accounts of two of the parties involved, curiosity drove me to dig through their GitHub repos for anything interesting. The first conclusions were: 1. Neither of them is very proficient with GitHub (upupqi does not know how to inherit repositories, and NaokiLH's commits are messy and do not follow conventions, which deserves criticism). 2. Neither of them is at a high level algorithmically (before May 2021 both of their algorithm repos covered only basic material, upupqi was still in the AcWing training camp in August 2021, and most of their Codeforces VPs only reach the Div. 2 A and B problems). It is hard to imagine how such a team could win a gold medal at a regional contest.

What really gave birth to this article was NaokiLH's algorithm competition repo:

https://github.com/NaokiLH/algorithm_trans

It seems quite normal, nothing strange.

Going directly to the commit records during May 2021, I found some interesting things:

NaokiLH's git commits in May 2021
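The same commit list can be reproduced from a local clone. Below is a minimal sketch, assuming a clone of the backup repo and an installed `git`; the function names are hypothetical. The sample text is handwritten from the hashes, dates, and commit messages quoted in this article, not captured from real `git log` output.

```python
import subprocess

LOG_FORMAT = "%h|%cd|%s"  # short hash | committer date | subject


def git_log_command(since: str, until: str) -> list[str]:
    """Build a git log command restricted to a date window."""
    return ["git", "log", f"--since={since}", f"--until={until}",
            "--date=short", f"--pretty=format:{LOG_FORMAT}"]


def parse_log(output: str) -> list[tuple[str, str, str]]:
    """Turn 'hash|date|subject' lines into (hash, date, subject) tuples."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        short_hash, day, subject = line.split("|", 2)
        rows.append((short_hash, day, subject))
    return rows


def list_commits(repo_dir: str, since: str, until: str) -> list[tuple[str, str, str]]:
    """Run git log inside repo_dir and parse the result."""
    out = subprocess.run(git_log_command(since, until), cwd=repo_dir,
                         check=True, capture_output=True, text=True).stdout
    return parse_log(out)


# Handwritten sample built from the commits cited in this article.
SAMPLE = "80c1103|2021-05-22|321312\n76bd49e|2021-05-13|423423\n03efcf1|2021-05-10|3123131"
for row in parse_log(SAMPLE):
    print(row)
```

With a real clone, `list_commits("algorithm_trans.git", "2021-05-01", "2021-06-01")` would list whatever commits git records in that window.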

The commit messages are very careless, which again deserves criticism. Let's first look at the commit "3123131" at the bottom, made on May 10, 2021: 3123131 · NaokiLH/algorithm_trans@03efcf1

I found that NaokiLH created a new folder called "yinchuan" and uploaded code for Problems B, G, and I:

NaokiLH's commit records on May 10, 2021

Then in the commit "423423" on May 13, 2021: 423423 · NaokiLH/algorithm_trans@76bd49e

NaokiLH uploaded the code for problem K:

NaokiLH's commit records on May 13, 2021

Let's carefully compare these four pieces of code with the official competition problems of the 2020 Yinchuan:

Ref:

We can see that, setting aside whether these four pieces of code are correct, their input and output formats, as well as some variable names, match the problem statements exactly. When submitted (interested readers can verify this themselves), two of the four pieces of code pass only the sample cases, while the other two cannot even pass the sample cases.
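Readers who want to repeat the sample-case check locally can do so with a small harness. This is a sketch under stated assumptions: the recovered code has been compiled to an executable, and the sample input and output have been copied from the problem statement. The function names and the stand-in demo command are hypothetical.

```python
import subprocess
import sys


def normalize(text: str) -> list[str]:
    """Compare token by token so trailing spaces and blank lines are ignored."""
    return text.split()


def passes_sample(cmd: list[str], sample_in: str, expected_out: str,
                  timeout: float = 2.0) -> bool:
    """Run cmd with sample_in on stdin and compare its stdout to expected_out."""
    try:
        result = subprocess.run(cmd, input=sample_in, capture_output=True,
                                text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0 and normalize(result.stdout) == normalize(expected_out)


# Stand-in program used only to demonstrate the harness: it doubles one integer.
DEMO_CMD = [sys.executable, "-c", "print(int(input()) * 2)"]
print(passes_sample(DEMO_CMD, "21\n", "42\n"))
```

For a real check, `DEMO_CMD` would be replaced by the path to the compiled solution, for example `["./B"]`, with the sample text read from files.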

So when did the official competition of the 2020 Yinchuan take place? May 16, 2021.

Title of the 2020 Yinchuan problem statement (held on May 16, 2021)

In other words, by May 10 and May 13 (six and three days before the competition), NaokiLH (the suspected account of Lan Hao) had already obtained enough information to write initial code for Problems B, G, I, and K, which were not supposed to be seen until the official competition on May 16. The input and output formats matched the problem statements, and some of the code could already pass the sample cases. We can reasonably suspect that the problem statements were leaked about a week before the competition, and that Lan Hao, as a party involved, knew of the leak and was heavily involved.
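The lead time between each commit and the contest is simple date arithmetic. A short sketch, using only the dates and commit hashes quoted above (the function and variable names are hypothetical):

```python
from datetime import date

CONTEST_DAY = date(2021, 5, 16)  # official Yinchuan contest date

# Commit hash -> (commit date, problems uploaded), as cited in this article.
COMMITS = {
    "03efcf1": (date(2021, 5, 10), "B, G, I"),
    "76bd49e": (date(2021, 5, 13), "K"),
}


def days_before(commit_day: date, contest_day: date) -> int:
    """Number of days by which a commit precedes the contest."""
    return (contest_day - commit_day).days


for short_hash, (day, problems) in COMMITS.items():
    print(f"{short_hash}: problems {problems} committed "
          f"{days_before(day, CONTEST_DAY)} days before the contest")
```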

In the subsequent official competition, his team passed Problems A, B, E, G, J, and K, among which B, G, and K are the highly suspicious problems identified in the investigation above. Code overlap on Problems B and J has already been noted in Dai@NeverLand's "PDS Plagiarism Detection System Example: A demonstration of plagiarism detection in a certain competition".

Final leaderboard of the 2020 Yinchuan (held on May 16, 2021)

On May 22, 2021 (one week after the competition), the uploaded code was deleted by NaokiLH with a recorded commit: 321312 · NaokiLH/algorithm_trans@80c1103

NaokiLH's commit records on May 22, 2021

After May 22, everything returned to normal. NaokiLH began learning KMP and participating in AcWing training.

In review, let's reconstruct the likely sequence of events along the timeline:

Early May, NaokiLH obtains the leaked problem statements, which include at least Problems B, G, I, and K. However, the leak contains only problem statements and samples, not solutions or the official test data.

May 10, NaokiLH, through his own research or with help from others, writes code for Problems B, G, and I. Given his skill level, he cannot guarantee that these three pieces of code are correct. NaokiLH thinks for a while and decides to upload the code to GitHub as a backup.

May 13, NaokiLH completes the code for Problem K and uploads it to GitHub as a backup.

May 16, the Yinchuan Regional Competition officially begins. The TS 1 team tries (perhaps?) to submit the pre-written code without success. They then obtain passing code from other teams through some means provided by the organizer and submit it to get AC (Accepted), ultimately winning the gold medal. Public controversy begins.

May 22, NaokiLH deletes the code for Problems B, G, I, and K from the GitHub repo.

Bonus Content

2020 Shenyang Regional Contest: How Does the TS 1 Team Prove Itself?

Background Information

Translation: We have already told quailty that we will participate in Shenyang.

Translation: Now, the competitions have all come to an end, and they have returned to their college life, doing the same things they have always done, over and over again. "There's nothing to be proud of" is the phrase that appears most often in their conversations. While others are still immersed in their last victory, they have already started preparing for the next competition. (From the NXIST public account)

Explanation of the Leak of the 2020-2021 Shenyang Contest Abandoned Questions

Video at 1 minute 29 seconds: The competition time for the Shenyang station was postponed from the original May 23 to July 18.

Content

May 21, less than a week after the end of the Yinchuan competition, NaokiLH makes a new round of commits: 88888 · NaokiLH/algorithm_trans@7e35b60. He creates a new directory in the same repo called ICPC/shenyang and uploads A.cpp. The next day, May 22, he creates another directory called blue_book/sh and moves A.cpp from ICPC/shenyang into it: 321312 · NaokiLH/algorithm_trans@80c1103.

NaokiLH's commit records on May 21, 2021

NaokiLH's commit records on May 22, 2021

May 23, NaokiLH uploads B.cpp to the blue_book/sh directory: 4324324 · NaokiLH/algorithm_trans@1f8b5a2

NaokiLH's commit records on May 23, 2021

May 24, NaokiLH uploads F.cpp and H.cpp to the blue_book/sh directory: 3123123 · NaokiLH/algorithm_trans@d765bf7

NaokiLH's commit records on May 24, 2021
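File additions and moves like these can be read directly from a commit with `git show --name-status`. The parser below is a minimal sketch with hypothetical function names. The sample lines are handwritten from the description above, and whether git actually reports the move of A.cpp as an exact rename (`R100`) is an assumption.

```python
def name_status_command(commit: str) -> list[str]:
    """Build a git command that lists the files touched by one commit."""
    return ["git", "show", "-M", "--name-status", "--pretty=format:", commit]


def parse_name_status(output: str) -> list[dict]:
    """Parse tab-separated name-status lines into small records."""
    changes = []
    for line in output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        status = fields[0]
        if status[0] in ("R", "C"):
            # Renames and copies carry a similarity score plus old and new paths.
            changes.append({"status": status[0], "old_path": fields[1], "path": fields[2]})
        else:
            changes.append({"status": status, "old_path": None, "path": fields[1]})
    return changes


# Handwritten illustration of the May 21 upload and the May 22 move.
SAMPLE_MAY_21 = "A\tICPC/shenyang/A.cpp"
SAMPLE_MAY_22 = "R100\tICPC/shenyang/A.cpp\tblue_book/sh/A.cpp"
print(parse_name_status(SAMPLE_MAY_21))
print(parse_name_status(SAMPLE_MAY_22))
```

Running `name_status_command("7e35b60")` inside a clone would give the real list for the May 21 commit.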

By June 11, NaokiLH had made modifications and ultimately completed the changes to the code in the blue_book/sh directory. Here is the final version of the directory at that time (including code for Problems A, B, F, and H).

It is easy to see that the code for Problems A, B, F, and H does not match the problem statements of the Shenyang Regional Contest, and the trail seems to go cold. What went wrong? As it turns out, this is closely related to the Shenyang Regional Contest's scrapped-problem incident (see the earlier references):

  • A.cpp actually corresponds to a problem called "jailbreak" in the scrapped Shenyang Regional Contest. As of the time of writing, this problem has not been publicly used. However, due to the passage of time, the scrapped problem PDF has been lost. Here, we provide only the relevant information and preview of the problem statement:

information

The problem was prepared on the Polygon platform, and its last edit was made on 2021-05-16 11:30:53 (UTC).

Overview of the problem statement for the scrapped Shenyang Regional Contest problem 'jailbreak'


We can see that the input format of this code is completely consistent with the original problem, but the output does not match: the problem requires "yes" or "no", while the code prints "YES" or "NO" (with an additional line of information). This discrepancy corresponds to subsequent modifications to the problem; even so, this code still cannot pass the problem:

Jailbreak problem version modification records


  • B.cpp corresponds to Problem H of the Shenyang Regional Contest, 103202H - The Boomsday Project. The input and output formats are completely consistent, and it passes the sample cases. However, because the data range was adjusted in later versions, it cannot pass all the data. Interested readers can compare it themselves.
  • F.cpp corresponds to Problem J of the Shenyang Regional Contest, 103202J - Descent of Dragons. The input and output formats are completely consistent. Besides the sample cases, this code passes only some of the data.
  • H.cpp corresponds to the 2021 NewCoder Summer Multi-School Training Camp 8: F. Robots. This problem was once one of the scrapped problems of the Shenyang Regional Contest and was later used in the 2021 NewCoder Summer Multi-School Training Camp 8 held on August 9, 2021. Before that, it had not been publicly used. As it is a paid competition that requires registration to view the problems, we also provide a preview of the problem statement here:

Overview of the scrapped Shenyang Regional Contest problem 'robots'

We can see that the input method of this code is completely consistent with the original problem, but the output does not match: the problem requires the output to be "yes" or "no", while the output in the code is "Y" or "N". Interestingly, this coincides with subsequent modifications to the problem (shown below). By aligning the output methods, the code can also pass some of the official data besides the example cases.

Robots problem version modification records
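"Aligning the output methods" simply means mapping the abbreviated answers onto the wording the final problem expects before comparing. A minimal sketch of such a comparison follows; the alias table and function names are assumptions for illustration, not the checker used by any judge.

```python
# Hypothetical alias table: abbreviated or differently cased answers -> canonical form.
ALIASES = {"y": "yes", "n": "no", "yes": "yes", "no": "no"}


def canonical_answers(output: str) -> list[str]:
    """Split output into tokens and map yes/no variants to a canonical spelling."""
    return [ALIASES.get(token.lower(), token) for token in output.split()]


def same_verdicts(program_out: str, judge_out: str) -> bool:
    """True when both outputs give the same answers after alignment."""
    return canonical_answers(program_out) == canonical_answers(judge_out)


print(same_verdicts("Y\nN\n", "yes\nno\n"))
```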


A brief recap:

From May 21, 2021, to June 11, 2021, NaokiLH made a series of commits related to the Shenyang contest that could not have been written without access to leaked problems. The committed code corresponds to problems from the Shenyang Regional Contest (held in July 2021) or to problems scrapped from it. Some of the scrapped problems first appeared publicly in the August 2021 NewCoder multi-school contest, and some, like Jailbreak, have not appeared to this day.

After June 11, NaokiLH continued to study LeetCode and regularly checked in with AcWing. Until the Shenyang Regional Contest offline competition on July 18, no other suspicious commits appeared.

On July 18, the 2020 ICPC Shenyang Regional Contest offline competition officially began. Team TS 1 ultimately won the silver medal. The second wave of public opinion began.

https://board.xcpcio.com/icpc/2020/shenyang

The final leaderboard of the 2020 Shenyang contest (held on July 18, 2021)

Is this the end of the story? Not quite.

On July 23 (five days after the Shenyang contest), NaokiLH made a mysterious commit: 423423 · NaokiLH/algorithm_trans@1fb8f50. In this commit, the original A, B, F, and H code under the blue_book/sh directory was deleted and replaced with files whose names are prefixed with "tempo".

NaokiLH's commit record on July 23, 2021

These codes have disorganized names. After analysis, we found some corresponding relationships:

tempo1.cpp corresponds to the scrapped Shenyang contest problem Jailbreak, which has not been used to date.

tempo2.cpp and tempo3.cpp correspond to the 2020 Shenyang contest problem K. Scholomance Academy; their code styles and implementations are completely different.

tempo4.cpp, tempo5.cpp, and tempo6.cpp all correspond to the 2020 Shenyang contest problem H. The Boomsday Project. These three codes are completely different. tempo4 corresponds to the version before the data range adjustment, and tempo5 and tempo6 correspond to the version after it. Both tempo5 and tempo6 pass the sample cases. Notably, tempo5.cpp outputs wrong answers on some data sets, while tempo6.cpp passes all the data.

tempo7.cpp corresponds to the 2021 NewCoder Summer Multi-School Training Camp 8: H. Scholomance Academy. This problem was also one of the scrapped problems of the Shenyang contest and was later used in the 2021 NewCoder Summer Multi-School Training Camp 8 on August 9, 2021. Since it is a paid competition that requires registration to view the problems, we also provide a preview of the problem statement here:

Shenyang contest scrapped problem Scholomance Academy overview

NaokiLH's blue_book/sh/tempo7.cpp

We can see that the input and output methods are consistent, and it can only pass the sample cases. The code seems to be created to match the samples.

tempo8.cpp corresponds to the 2021 NewCoder Summer Multi-School Training Camp 8: B. Dohna Dohna. This problem was also one of the scrapped problems of the Shenyang contest and was later used in the 2021 NewCoder Summer Multi-School Training Camp 8 on August 9, 2021. Similarly, since it is a paid competition that requires registration to view the problems, we also provide a preview of the problem statement here: Shenyang contest scrapped problem Dohna Dohna overview

NaokiLH's blue_book/sh/tempo8.cpp

We can see that the code and problem input and output methods are consistent, and it can pass some data besides the sample cases.

tempo9.cpp corresponds to the previously mentioned 2021 NewCoder Summer Multi-School Training Camp 8: F. Robots. This problem was also one of the scrapped problems of the Shenyang contest and was later used in the 2021 NewCoder Summer Multi-School Training Camp 8 on August 9, 2021.

tempo10.cpp is template code for outputting the Catalan numbers modulo 1000000007.

tempo11.cpp is the fast exponentiation template code.

tempo12.cpp corresponds to the 2020 Shenyang contest problem F. Kobolds and Catacombs.

Although the commit 423423 · NaokiLH/algorithm_trans@1fb8f50 on July 23 is later than the 2020 Shenyang contest date of July 18,
