Elo player ratings for two-player games?

peyrol · February 2, 2014, 4:59pm

The Elo rating system provides a way to estimate a player’s strength at a two-player game of skill. It is named after its creator, the physics professor Arpad Elo. If everyone using the Vassal server had a unique ID, then for each module that can be played by just two players, game results could be recorded and a database of player ratings built up.

There is already a website with a player ratings database for Avalon Hill games, including games with more than two players. I’m not sure whether results from Vassal or other virtual games are included. It uses a much simpler formula. Some players have commented on a “bouncing around” effect that a rating may go through before it settles into a narrower range of values.

This would be a lot of work, but the database would become more valuable as time goes by. To be clear, this is a rating, not a ranking: a higher value indicates a stronger player. Of course, the accuracy of outcome prediction would depend on how many results the database holds for that game. Elo has a sound statistical basis and is designed to provide accurate values relatively quickly, which is to say that, in general, fewer games need to be played to get reasonably accurate ratings. Once you have an established rating, whatever it is, all you should need to know is the rating difference between yourself and your opponent in order to predict your likelihood of winning.
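To illustrate the point about rating differences, here is a minimal sketch of the usual Elo expected-score formula. The 400-point scale is the convention used in chess and is an assumption here; a Vassal database could pick a different scale. The function name is only illustrative.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A against player B.

    Counts a win as 1, a draw as 0.5 and a loss as 0, so with no draws
    this is simply A's probability of winning. Only the difference
    between the two ratings matters.
    """
    # 400 is the chess convention (an assumption, not a Vassal setting).
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Equal ratings give an even game; a 200-point edge gives roughly 76%.
even = expected_score(1500, 1500)
favourite = expected_score(1700, 1500)
```

Note that 1700 vs 1500 gives the same expectation as 2200 vs 2000, which is why only the difference needs to be known.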

Another advantage of a Vassal database would be automation. No one would need to submit game results manually by email. Players would agree between themselves, before the game starts, whether it will be rated. The Vassal engine could provide buttons or a popup window for submitting the result, whether the game was played on the server or by email. Perhaps both players would be required to submit the result this way, to reduce the temptation to abuse the system. As long as the players are registered, the server could calculate the new ratings. A player’s rating could be displayed next to their ID on the server. Maybe there could be two sets of ratings for each game, one for email play and one for real-time play.
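As a rough sketch of what the server-side calculation might look like, assuming both players must report the same result before anything changes. None of this is existing Vassal code: the function names, the K-factor of 32 and the 400-point scale are all assumptions borrowed from common chess practice.

```python
from typing import Optional, Tuple

K_FACTOR = 32.0  # assumed step size; larger values make ratings move faster


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation for player A (400-point scale assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_ratings(
    rating_a: float,
    rating_b: float,
    report_from_a: float,
    report_from_b: float,
    k: float = K_FACTOR,
) -> Optional[Tuple[float, float]]:
    """Return the new (A, B) ratings, or None if the reports disagree.

    Each report is player A's score as stated by one of the players:
    1.0 if A won, 0.5 for a draw, 0.0 if B won.
    """
    if report_from_a != report_from_b:
        return None  # hypothetical policy: leave ratings alone until resolved
    change = k * (report_from_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + change, rating_b - change


# Two 1500-rated players, both report that A won.
new_ratings = update_ratings(1500, 1500, 1.0, 1.0)
```

Separate email and real-time ratings would just mean keeping two rating values per player per game and passing the right pair in.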

As far as I know, games with more than two players are not well suited to the Elo system. But as long as a game can be played by just two players, ratings could be built up for it. I should add that two-player games with more than three possible outcomes (player A won, player B won, or draw), that is to say, games with degrees of victory or loss, can also be incorporated.
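One possible way to handle degrees of victory, sketched below, is to map each outcome to a fractional score between 0 and 1 and feed that into the ordinary Elo update. The outcome names and their score values are purely hypothetical and would need to be chosen per game; the K-factor of 32 and the 400-point scale are likewise assumptions.

```python
# Hypothetical mapping from a game's victory levels to a score for player A.
OUTCOME_SCORES = {
    "decisive_victory": 1.0,
    "marginal_victory": 0.75,
    "draw": 0.5,
    "marginal_defeat": 0.25,
    "decisive_defeat": 0.0,
}


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation for player A (400-point scale assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_change(
    rating_a: float, rating_b: float, outcome_for_a: str, k: float = 32.0
) -> float:
    """Rating points player A gains (or loses, if negative) for this outcome."""
    score = OUTCOME_SCORES[outcome_for_a]
    return k * (score - expected_score(rating_a, rating_b))


# Between equal players, a marginal win earns half of what a decisive win does.
marginal = rating_change(1500, 1500, "marginal_victory")
decisive = rating_change(1500, 1500, "decisive_victory")
```

With this approach a heavy favourite who only scrapes a marginal victory could actually lose a few points, which may or may not be what people want.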

What do you think?