cfsearch's blog

By cfsearch, 3 years ago, in English

I recently scraped almost all of the submissions from Codeforces. Here I share all the source code and metadata (problem ID, submitter, language, verdict, etc.): https://mega.nz/folder/Sypi0BrS#iNbQXf3EwcjZbpwXRKHOnQ. The dataset contains at least 99.8% of the public submissions with ID <= 128M. In total, there are ~98M submissions.

In addition, I created a source code reverse search engine based on this dataset, which you can access at https://cfsearch.top/.

Disclaimer: The scraping process violates Codeforces' robots.txt. Use of this dataset may even violate Codeforces' terms. Use it at your own risk.

Btw, MikeMirzayanov, is it possible to share the official dataset?
