Friday 21 December 2007

Round Robins and Ratings

(Disclaimer: The following post may contain assertions that are very weak mathematically)

One reason I'm in favour of "banded" Round Robins (and RRs in general) is that I believe they result in more accurate ratings and rating changes. Unlike Swiss tournaments, the field you play is fixed in advance, so rating performances depend more on results and less on the field you happen to meet.
But let's have an example.
Imagine you are an "improving" player. By "improving" I mean you've reached the stage where you can beat everyone lower rated than you (even 1 point lower), score about 50/50 with players up to 100 points above you, but still can't beat players above that.
Now you play a typical 7-round Swiss, where you happen to be seeded about 10th out of 35 players. This means you win the first round, lose the second, win the third, lose the fourth, and so on (a typical Swiss "bounce" tournament). You finish on 4/7, but it turns out that the average difference between you and your higher-rated opponents is 150 points. On the other hand, the tournament has a long tail, and the average difference between you and your lower-rated opponents is 300 points. So the average rating of your field (four lower-rated and three higher-rated opponents) is 107 points below your rating. Under the Elo system your expected score against that average is 65%, but your actual score is only 57%, resulting in a drop in your rating. Sure, it can be argued that to maintain your rating you should have scored 30% (almost 1 point) against your higher-rated opponents, but this is far more difficult than, say, dropping a point against a weaker opponent.
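For anyone who wants to check the arithmetic, here is a rough sketch using the standard Elo expected-score formula. The list of rating differences is my reading of the example (four opponents 300 below, three opponents 150 above), and the function and variable names are just illustrative. It works out the expectation both from the average field rating, as above, and game by game, which is how many rating systems actually do the sum.

```python
def expected_score(rating_diff):
    """Elo expected score for a player rated `rating_diff` points above the opponent."""
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400.0))

# Assumed from the example: player rating minus opponent rating, per round.
diffs = [300, 300, 300, 300, -150, -150, -150]

avg_diff = sum(diffs) / len(diffs)                 # about +107 points
expected_from_average = expected_score(avg_diff)   # about 0.65
expected_per_game = sum(expected_score(d) for d in diffs) / len(diffs)  # about 0.61
actual = 4 / 7                                     # about 0.57

print(f"average rating edge:      {avg_diff:.0f}")
print(f"expected (from average):  {expected_from_average:.2f}")
print(f"expected (game by game):  {expected_per_game:.2f}")
print(f"actual score:             {actual:.2f}")
```

Either way you do the sum, 4/7 falls short of the expected score, so the rating goes down.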
Now imagine the same tournament as an RR where you are in the second group, with only 1 opponent rated above you (and probably close to your rating), and the rest below you. Under these circumstances you score 6/7 and up goes your rating.
So why the difference? Basically, in a Swiss you will always play someone N/2 places away from you in a score group of size N. While N stays large you often play someone either too easy or too hard (from a ratings point of view), and only rarely will you play someone close to your rating. In a banded RR you get to play players close to your rating, and therefore your rating change depends mainly on your score, not your (unpredictable) field.
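The "N/2 places away" rule can be sketched in a few lines. This is a deliberately simplified version of Swiss pairing, assuming an even-sized score group and ignoring colours, floaters and repeat pairings; the ratings are made up for illustration.

```python
def pair_score_group(ratings):
    """Pair the top half of a score group against the bottom half, in rating order."""
    ordered = sorted(ratings, reverse=True)
    half = len(ordered) // 2
    # Seed i meets seed i + N/2 (assumes an even number of players).
    return list(zip(ordered[:half], ordered[half:2 * half]))

# Hypothetical score group of 8 players, spaced 100 rating points apart.
group = [2000, 1900, 1800, 1700, 1600, 1500, 1400, 1300]
pairs = pair_score_group(group)
gaps = [high - low for high, low in pairs]

for (high, low), gap in zip(pairs, gaps):
    print(f"{high} v {low}  (gap {gap})")
```

With these made-up numbers every pairing has a 400 point gap, even though each player has a neighbour only 100 points away in the same score group.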

2 comments:

DeNovoMeme said...

Shaun, I enjoyed reading your example, and it does appear to support your antipathy toward the Swiss system. However, mathematically you need to say what would happen as the number of games/tournaments your improving player plays increases. As the number increases, the anomaly you present becomes statistically averaged away.

Shaun Press said...

I don't dislike the Swiss system, although I feel it has unfairly driven out the Round Robin tournament.
Of course a player cannot improve forever, and the conditions I listed in my example would go away after a time.