Yet Another Rating System for CodeForces

#	User	Rating
1	jiangly	3898
2	tourist	3840
3	orzdevinwang	3706
4	ksun48	3691
5	jqdai0815	3682
6	ecnerwala	3525
7	gamegame	3477
8	Benq	3468
9	Ormlis	3381
10	maroonrk	3379

#	User	Contrib.
1	cry	168
2	-is-this-fft-	165
3	Dominater069	161
4	Um_nik	160
5	atcoder_official	159
6	djm03178	157
7	adamant	153
8	luogu_official	150
9	awoo	149
10	TheScrasse	146

Finally, I finished the implementation of an improved version of TrueSkill rating system that [user:EbTech,2020-07-04] named "TrueSkill from St.Petersburg".↵
↵
TL;DR↵
=====↵
↵
Results are [here](https://raw.githubusercontent.com/nikgaevoy/SPbTrueSkill/master/data/CFratings_actual.txt). Full repository is [here](https://github.com/nikgaevoy/SPbTrueSkill). ↵
↵
Important notes on the results↵
------------------------------↵
↵
Note that those results were obtained by running on the [user:EbTech,2020-07-04]'s testset (rounds only before Codeforces Round #645 and team contests excluded) and thus do not represent the current standings on Codeforces. Also, the file contains only users with at least 10 contests and at least one contest in 2020 (otherwise, the file would be very large). However, you may obtain full rating by running the algorithm on your computer, it should take about a minute. Also note that the results may seem not very impressive since I chose the first parameters out of the blue and they should be changed of course.↵
↵
What is this rating system?↵
===========================↵
↵
It is like original TrueSkill, but without its problems, like multiway ties. What is TrueSkill? It is a generalization of Elo/Glicko to the case when many teams compete simultaneously. However, the original TrueSkill was made to deal with many players, large teams and not so many ties, and therefore should not work correctly in our case. The improved version of TrueSkill was designed for the sport version of "What? Where? When?", thus it solves the original TrueSkill problems and perfectly match Codeforces-like competitions.↵
↵
![ ](/predownloaded/5e/f5/5ef56d978c3ed29336b8321eba6c74d5f1575ce5.jpg)↵
↵
Pros↵
----↵
↵
- It is mathematically perfect. It does exactly Bayesian inference on the model with many simultaneous players (just like Codeforces), no approximate inference or any other tweaking of the results. Therefore, no other rating system could be more precise. However, there is a room for customisation to fit your personal preferences (the stability of the rating, etc.).↵
↵
- It is very fast. As I have already mentioned, it processes almost all Codeforces history in less than a minute. The complexity of processing contest with $n$ players is $\mathcal{O}\left(n \log \frac{1}{\varepsilon} \right)$ where $\varepsilon$ is the designated precision of users' ratings.↵
↵
- Teams out-of-the-box. Despite the fact that results above were obtained on tests with team contests excluded, it is not the rating system issue. The only reason why I excluded team contests from the testset is that CF-loader I took from [user:EbTech,2020-07-04]'s repository doesn't process team contests and I don't want to spend time for making a loader by myself. However, now team performance is computed as sum of player performances, but it could be easily changed to any other formula.↵
↵
- Reduced rating inflation, explicit performances, visible rating feature and more.↵
↵
- [user:tourist,2020-07-04] on the top of the rating!↵
↵
Cons↵
----↵
↵
- Some formulas lack of numerical stability. That happened because formulas from the [original article](https://logic.pdmi.ras.ru/~sergey/papers/NS11_Ratings.pdf) contain some mistakes (at least they fail on some obvious tests like $(-m) \cdot \chi_{[-\varepsilon, \varepsilon]} = -\left(m \cdot \chi_{[-\varepsilon, \varepsilon]}\right)$) and I failed to retrace the author's steps in the computations. However, they seem to be somewhere near the truth and they are numerically stable, so I believe that it could be fixed.↵
↵
Further discussion↵
==================↵
↵
Rating system parameters↵
------------------------↵
↵
As I already mentioned, I haven't spent lots of time choosing the best parameters for Codeforces case, so there is still some work with data ahead.↵
↵
Normal vs logistic distribution↵
-------------------------------↵
↵
Which distribution expresses player rating and performance better? Any choice is very debatable. Currently my implementation uses normal distribution, but it could be easily replaced with logistic or any other distribution, the only thing you need to do is to implement operators like multiplication, division, addition and two multiplication on indicator operators (last two are more tricky than the others, but still not very hard, see article appendix for explanation what to do and [this file](https://github.com/nikgaevoy/SPbTrueSkill/blob/master/math/indicators.pdf) as an example for normal distributions).↵
↵
Large multiway tie on the last place↵
------------------------------------↵
↵
Codeforces standings have an unusual artifact of a large multiway tie of a players with zero score which does not perfectly match the theory and possibly affect those players' ratings too much. There are many ways to deal with this problem, e.g. exclude them from rating at all, make separate tie-$\varepsilon$ for them, etc.↵
↵
Acknowledgements↵
================↵
↵
I'd like to thank [user:EbTech,2020-07-04] for starting [this discussion](https://codeforces.me/blog/entry/77890?#comment-630911) that led us here, S. Nikolenko for pointing the solution out to me and [user:MikeMirzayanov,2020-07-04] for great Codeforces system that crashed during my experiments only twice.↵

Rev.	By	When	Δ	Comment
en10	nikgaevoy	2020-07-04 21:59:43	0	(published)
en9	nikgaevoy	2020-07-04 21:59:24	351	(saved to drafts)
en8	nikgaevoy	2020-07-04 21:37:27	0	(published)
en7	nikgaevoy	2020-07-04 21:36:57	101
en6	nikgaevoy	2020-07-04 21:32:20	5	Tiny change: 'case when when many' -> 'case when many'
en5	nikgaevoy	2020-07-04 21:31:51	3	Tiny change: '. Also, this file cont' -> '. Also, the file cont'
en4	nikgaevoy	2020-07-04 21:30:49	488	Tiny change: '5.jpg)\n\nPros\n' -> '5.jpg)\n\n[cut]\n\nPros\n'
en3	nikgaevoy	2020-07-04 17:28:43	264	Tiny change: 'n[cut]\n\n### Pr' -> 'n[cut]\n\n\n### Pr'
en2	nikgaevoy	2020-07-04 17:23:38	5
en1	nikgaevoy	2020-07-04 17:21:47	4673	Initial revision (saved to drafts)

History