Explanation of Wobus/Margin Rankings and Ratings


This is about "MOV" (i.e., margin-of-victory) rankings I produce for DIA Football and Men's DI Basketball.

I give each team a numerical rating, then rank them accordingly. The ratings are done by calculation; my own judgment entered the process only when I devised the calculation.

These ratings are based on the scores of games played in a season, taking into account the winner, the loser, and the margin-of-victory. They do not take into account when the game was played, which team was at home, how many points each team scored to produce that margin, or any other data. Other fine ratings/rankings easily found online take those into account, so fortunately, you are not deprived. I would incorporate those as well, but haven't devised a way to do that at this point. One has to start somewhere.

In a nutshell, for each game a team plays, I take whether they won and what the margin-of-victory was, and I apply a function that purports to estimate the probability that the team would win the game, e.g., if the game were played a second time. I then use this probability in the Zermelo/Bradley/Terry algorithm to produce a rating (see Bradley-Terry model in Wikipedia, i.e., what KRACH uses), and use that rating to produce a ranking. The function I use is a simple one: for the winning team, I begin with a constant fraction (a touch above .5), and to it I add a second constant fraction (smaller) once for each point in the winning margin. The sum is my estimate of the probability that the team would win again against the same opponent. For the loser, I estimate a probability of one minus my estimate for the winner.

The basic way to use the Zermelo/Bradley/Terry algorithm is to feed it won-lost data, in effect, giving a team 1 if it won and 0 if it lost. Instead of 1 or 0, I give the fractional probability that this game's margin-of-victory indicates the team would win a game like this one. To make it easy for a literal-minded person like myself, I actually multiply the probability by 1000 and run the algorithm as if 1000 games were played for each actual game, e.g., 12000 games for a team's 12-game football season.
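As an illustration of feeding fractional results to the algorithm, here is a minimal sketch in Python. It is not the code used for these rankings: the function name bradley_terry, the input layout, the iteration count, and the three sample games are all assumptions made for the example. It uses the standard iterative update for the Bradley-Terry model, in which a team's new rating is its total win credit divided by the sum, over its games, of 1 / (its rating + the opponent's rating).

```python
from collections import defaultdict

def bradley_terry(games, iterations=200):
    """Rate teams from fractional results.

    games: list of (team_a, team_b, p_a), where p_a is the win credit
    given to team_a for that game (team_b gets 1 - p_a).
    Hypothetical helper for illustration only.
    """
    wins = defaultdict(float)        # total fractional win credit per team
    pair_games = defaultdict(float)  # (team, opponent) -> games played
    for a, b, p_a in games:
        wins[a] += p_a
        wins[b] += 1.0 - p_a
        pair_games[(a, b)] += 1.0
        pair_games[(b, a)] += 1.0

    teams = sorted(wins)
    ratings = {t: 1.0 for t in teams}
    for _ in range(iterations):
        new = {}
        for i in teams:
            denom = sum(n / (ratings[i] + ratings[j])
                        for (x, j), n in pair_games.items() if x == i)
            new[i] = wins[i] / denom
        # Ratings are only defined up to a common factor; fix the mean at 1.
        mean = sum(new.values()) / len(new)
        ratings = {t: r / mean for t, r in new.items()}
    return ratings

# Made-up sample: win credits for the first-named team in each game.
games = [("A", "B", 0.515), ("B", "C", 0.5115), ("A", "C", 0.52)]
ratings = bradley_terry(games)
ranking = sorted(ratings, key=ratings.get, reverse=True)
```

In this particular update, multiplying every game by 1000 scales the win credits and the game counts by the same factor, so the sketch skips that step and works with the fractions directly.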

I devised the function by doing a small study covering a few basketball seasons, specifically looking at cases where a pair of teams played more than once in the season. Then, for each margin-of-victory (1, 2, 3, etc.), I compiled the won/lost results of the other game(s) played by the same two teams in the same season. So for each instance of a game won by three, assuming the two teams played another game, I counted the win or loss of that other game. With this, I produced a graph of percentage-of-other-games-won for each margin-of-victory.
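The tabulation described above could be sketched in Python roughly as follows. This is a reconstruction from the description, not the original study: the function name rematch_win_rates, the tuple layout, and the sample games are invented for illustration.

```python
from collections import defaultdict
from itertools import permutations

def rematch_win_rates(games):
    """For each margin, the fraction of *other* meetings won by that game's winner.

    games: iterable of (season, winner, loser, margin).
    Hypothetical helper for illustration only.
    """
    # Group the games by season and by the (unordered) pair of teams.
    by_pair = defaultdict(list)
    for season, winner, loser, margin in games:
        by_pair[(season, frozenset((winner, loser)))].append((winner, margin))

    tally = defaultdict(lambda: [0, 0])  # margin -> [other games won, other games counted]
    for meetings in by_pair.values():
        # Every ordered pair of distinct meetings: (this game, other game).
        # Pairs of teams that met only once contribute nothing.
        for (winner, margin), (other_winner, _) in permutations(meetings, 2):
            tally[margin][1] += 1
            if other_winner == winner:
                tally[margin][0] += 1
    return {m: won / total for m, (won, total) in sorted(tally.items())}

# Made-up sample data, not real results.
sample = [
    (2017, "A", "B", 3), (2017, "B", "A", 5),   # split their two games
    (2017, "C", "D", 3), (2017, "C", "D", 10),  # C won both
    (2017, "E", "F", 7),                        # met once, so not counted
]
rates = rematch_win_rates(sample)
# rates == {3: 0.5, 5: 0.0, 10: 1.0}
```

Plotting the resulting fractions against margin would give the kind of graph described.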

My analysis at that point was by eyeball, and as far as I could see, the relationship was linear. This doesn't make sense because with a sufficiently large margin of victory, such a function would indicate a greater-than-one probability of winning a game. However, I could see no significant deviation from linear, given the data I used, the results of several seasons. Not even the hint of any other trend. One has to start somewhere, and I haven't improved upon it, so that's what I'm using. It is apparently not too bad for the regime of scores-seen-in-actual-games.

From the graph, I estimated a "slope and intercept". I followed this up with tweaking: applying it to a season, I compared measures with those in Kenneth Massey's Ranking Comparisons/Composites, i.e., correlation to consensus and ranking violations. I tweaked the slope and intercept by experiment, aiming to better those measures. I believe I also tweaked based on its predicting ability, i.e., I compared week-old and two-week-old rankings with current won-lost data and a consensus of others' current rankings. I also applied the slope and intercept to football season data and experimented in a similar manner to find values for "football use".
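One of the measures mentioned, ranking violations, can be sketched in Python as below. The sketch assumes a violation means a game whose winner is ranked below the loser; it does not claim to reproduce the exact calculation on Massey's pages. The function name ranking_violations and the sample data are made up for the example.

```python
def ranking_violations(ranking, results):
    """Count games won by the lower-ranked team.

    ranking: list of teams, best first.
    results: non-empty list of (winner, loser) pairs.
    Returns (number of violations, fraction of games that are violations).
    Hypothetical helper for illustration only.
    """
    position = {team: i for i, team in enumerate(ranking)}
    violations = sum(1 for winner, loser in results
                     if position[winner] > position[loser])
    return violations, violations / len(results)

# Made-up sample: C is ranked last but beat both A and B.
count, fraction = ranking_violations(
    ["A", "B", "C"],
    [("A", "B"), ("C", "A"), ("B", "C"), ("C", "B")],
)
# count == 2, fraction == 0.5
```

Trying different slope and intercept values and keeping those that lower this count is one way to read the tweaking described above.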

Here are the numbers I use at this point (1/6/2018). I list them here as percentages of probability, i.e., 50 is 50% chance or .5 probability of winning:

Sport                  Slope    Intercept
DI Men's Basketball    0.05     51
DIA Football           0.0280   51.025

There is effectively a "bonus" for winning: the team that won is credited with the intercept plus the slope times the number of points in the margin, e.g., winning by three in DI Men's Basketball yields a percentage of 51.15, which is 51 plus .05 × 3. The loser gets 100 percent minus that.
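The calculation just described is small enough to write out. The following Python sketch is an illustration, not the production code: the names PARAMS and win_percentages are invented, and the football values assume the listing reads as slope 0.0280 and intercept 51.025.

```python
# Slope and intercept in percent, taken from the listing above.
# The football pair assumes the listing reads as 0.0280 and 51.025.
PARAMS = {
    "basketball": (0.05, 51.0),
    "football": (0.0280, 51.025),
}

def win_percentages(sport, margin):
    """Return (winner's percentage, loser's percentage) for one game.

    Hypothetical helper for illustration only.
    """
    slope, intercept = PARAMS[sport]
    winner = intercept + slope * margin
    return winner, 100.0 - winner

# The example from the text: a three-point win in basketball.
winner_pct, loser_pct = win_percentages("basketball", 3)
# winner_pct is 51.15 and loser_pct is 48.85, up to floating-point rounding
```

With the basketball numbers, the line would not reach 100 percent until a margin of (100 - 51) / .05 = 980 points, far beyond any score seen in an actual game.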

I was surprised by the numbers that my own study produced: the slope seems quite small to me and the intercept quite close to 50%. Before going to the trouble of a study, I Googled for any information on the subject, and the best I could find were numbers that odds-makers are purported to use to produce odds from point-spreads. I don't recall that they used anything remotely as small as .05 percent probability per point. I was expecting my own study to find something I could perhaps tie into Pythagorean Expectation, but I don't see that.

Well, what I have right now is simple-to-the-point-of-simplistic. But it's my start, to get into the fancy stuff: MOV-based ratings.

John Wobus, 1/6/2018

 


Wobus Sports: www.vaporia.com/sports