UPDATE: the new rating system paper will appear at the Web Conference 2021!
Last year, I published ratings using a contest rating system that I had developed at the end of 2015. Back then, I promised to eventually write in detail about the system's inner workings.
Over the past week, I've cleaned up and optimized the code: it now takes only 24 minutes to process the entire history of Codeforces on my small laptop!
More importantly, I cleaned up the paper. Please ignore the last sections for now, as they're incomplete, but the main sections, which explain how the rating system was derived, are now ready! I claim that my Elo-MMR system is a more principled extension of Elo/Glicko to the programming contest setting, with nicer properties than the systems that contest sites currently use.
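For those who want the gist before opening the paper: at a very high level (and simplifying the paper's notation on my part), Elo-MMR tracks each player's skill $s_i$ as a belief distribution and performs a Bayesian update after every round:

$$\Pr(s_i \mid \text{round}) \;\propto\; \Pr(s_i) \cdot \Pr(\text{observed ranking} \mid s_i)$$

The derivations in the paper spell out how to make this update efficient, and robust to outlier performances.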
The main work that remains is a set of quantitative empirical studies comparing the properties of the different rating systems. Since this is just my hobby project, I might not have the time to do it all alone. If anyone wants to help run experiments, let's chat about it!
Insert your rating system into Codeforces Simulator
Wow, this is an awesome paper, even though I don't really understand the math.
This just showed up in my arXiv feed. Will you write a blog post here about the details? Did you submit this somewhere?
https://arxiv.org/pdf/2101.00400.pdf
There will be more in the coming weeks and months! We're working on getting it into a conference, and I'll be sure to blog about it too. In the meantime, I'm available to help if anyone wants to try something with the code.
Update: you'll find it at www2021.thewebconf.org soon!
Hey EbTech, I really enjoyed the paper until the first actual math (lol), where the joint distribution is introduced. Nevertheless, I shakily understand it, and hope to read more of the work.
Do you think these techniques could also be applied to problem ratings? I've heard it mentioned a couple of times that a problem has rating x if a user of rating x solves it with probability 1/2, but I think there is some manual adjustment of problem ratings (right?), so this work could speed that up, make problem ratings more accurate for training, or even provide problem ratings for other platforms (OI, AtCoder, ICPC, ...).
That's an interesting question! To rate problems, I suggest using the algorithm from the Performance Estimation section. In other words, consider the problem to "win" against a contestant if that contestant doesn't solve it.
This works best if the problem was used in a rated contest; it's harder to apply to ICPC. Furthermore, whether a problem gets solved seems to be affected by which other problems come before it, so ideally we'd find some way to adjust for those effects.
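To make this concrete, here's a minimal sketch, assuming a simple logistic (Elo-style) win model: the problem's rating is the value at which the expected number of solvers matches the observed number, so a contestant rated exactly at the problem's rating solves it with probability 1/2, matching the definition you quoted. The function names and the 400-point scale are my own illustrative choices; the paper's estimator works on full rating posteriors, so treat this as an approximation rather than the actual algorithm.

```python
def solve_probability(contestant_rating, problem_rating, scale=400.0):
    """Logistic (Elo-style) model: chance that a contestant with the
    given rating solves a problem of the given rating. Equal ratings
    give probability exactly 1/2."""
    return 1.0 / (1.0 + 10.0 ** ((problem_rating - contestant_rating) / scale))

def estimate_problem_rating(ratings, solved, lo=-2000.0, hi=6000.0, iters=60):
    """Binary-search for the problem rating at which the expected number
    of solvers equals the observed number. `ratings` lists contestant
    ratings; `solved[i]` says whether contestant i solved the problem.
    (If nobody or everybody solved it, the answer clamps to a bound.)"""
    observed = sum(solved)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        expected = sum(solve_probability(r, mid) for r in ratings)
        if expected > observed:
            lo = mid  # too many expected solvers: the problem must be harder
        else:
            hi = mid
    return (lo + hi) / 2.0

# Toy usage: 2 of 5 contestants solved the problem.
ratings = [2400, 2100, 1900, 1700, 1500]
solved = [True, True, False, False, False]
print(round(estimate_problem_rating(ratings, solved)))  # about 2028
```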
Python bindings:
https://pypi.org/project/Elo-MMR-Py/
https://github.com/aropan/elo-mmr-py/