These are listings of how many of the season's wins are explained by various rankings of sports teams (college football and men's basketball), where a ranking is considered to have explained a win if the game was won by the higher-ranked team.
The aim is to compare how well individual rankings managed to explain the wins we've seen so far, including wins both before and since the rankings were produced.
Data source is Kenneth Massey's comparison pages.
These "explained wins" are counted as follows: a specific ranking is considered to have "explained" a win if the team that won the game was ranked higher by that ranking. The number I show for each ranking is the fraction of the games between rated teams that the ranking explained, multiplied by a thousand and truncated.
For example, if West State wins a game against East Academy, and if ranking XYZ ranked West State at 5 and East Academy at 6, this counts as one game between rated teams (the denominator) and as a "win explained" by ranking XYZ (the numerator). Further, if there have been 250 games between rated teams so far, and ranking XYZ explained 200 of them (i.e., in the other 50 cases, the winning team was not ranked higher than the losing team by XYZ), then XYZ's fraction of explained games is 200/250, or .8. I list this as 800 for convenience (much as baseball batting averages are written).
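To make the arithmetic concrete, here is a minimal Python sketch of the calculation; the function name, the data layout, and the teams are my own illustration, not any ranking system's actual code:

    # A ranking maps team -> rank (1 = best); games are (winner, loser) pairs.
    def explained_average(ranking, games):
        rated = explained = 0
        for winner, loser in games:
            if winner in ranking and loser in ranking:
                rated += 1                    # game between rated teams (denominator)
                if ranking[winner] < ranking[loser]:
                    explained += 1            # higher-ranked team won (numerator)
        return int(1000 * explained / rated)  # fraction times 1000, truncated

    # The example above: ranks 5 and 6, one game, one explained win.
    xyz = {"West State": 5, "East Academy": 6}
    print(explained_average(xyz, [("West State", "East Academy")]))  # prints 1000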
I use rankings produced through various weeks of the year, but use all the wins that have occurred so far. Thus, what this measures includes a predictive element.
This evaluation covers both retrodictive and predictive ability, i.e. it compares the ranking against wins that had already happened when the ranking came out as well as subsequent wins. In effect, it tests how well the ranking system predicts what we now see as a retrodictive ranking. Over the long term, given teams that aren't streaky or changing in strength, but merely randomly showing their strength in the games they play, one would expect any predictive ranking would, in effect, be predicting a far-future retrodictive ranking.
I use the phrase "what we now see as a retrodictive ranking" as if it means a ranking that maximizes the number of games in which the higher-ranked team beat the lower-ranked team. However, that is certainly not the only possible criterion for a retrodictive ranking.
Massey's comparison lists ranking violations, specifically a percentage, described as follows: "The percentage of all games played such that in retrospect the lower ranked team defeated the higher ranked team." This is the inverse of "wins explained" (expressed as fractions, the ranking-violations average and the wins-explained average would sum to 1.000, or 1000 as we express them), except that it is not clear how teams that are tied in the rankings are handled. For the purposes of these "wins explained", a ranking which declares both teams of a game equal (the same rank) has not explained that win. I've thought about allowing such a ranking to half-explain the win (adding .5 to the numerator) but have not done that. Still, ranking violations and wins explained are generally very close for systems that produce few or no ties.
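Where ties enter the sketch above, and what the half-credit option (considered but not adopted) would look like, is a small change; again the names are illustrative:

    # Variant: a tie in rank earns nothing by default, or half a win
    # if half_credit is set (the option I considered but don't use).
    def explained_average_ties(ranking, games, half_credit=False):
        rated, explained = 0, 0.0
        for winner, loser in games:
            if winner in ranking and loser in ranking:
                rated += 1
                if ranking[winner] < ranking[loser]:
                    explained += 1
                elif half_credit and ranking[winner] == ranking[loser]:
                    explained += 0.5
        return int(1000 * explained / rated)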
There are two systems listed on the Massey comparison that deliberately try to minimize ranking violations, and thus maximize "wins explained": "GMP" (GridMap) and "CMV" (Coleman's MinV). (I tried to produce such a ranking once, and I can tell you that it requires a systematic approach: my ad hoc attempts didn't come close.) If either achieves the maximum wins explained, then in the current column it should be number 1 with no possibility of better, and by looking at what it did in earlier weeks, you can see how a system geared to this criterion fares in handling more of the season.
It would seem that for rankings from the recent past, these violation minimizers would score high on explaining all the wins so far, but for rankings from further back in the season, other measures of how the teams compare would probably do better.
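For intuition only (this is not how GridMap or MinV actually work; I don't know their internals), a brute-force search over all orderings shows what minimizing violations means. It is feasible only for a handful of teams, which is why a systematic approach is needed in practice:

    from itertools import permutations

    # Find an ordering of teams explaining the most wins, i.e. with the
    # fewest ranking violations.  Brute force; only workable for small n.
    def best_ordering(teams, games):
        def explained(order):
            rank = {t: i for i, t in enumerate(order)}
            return sum(rank[w] < rank[l] for w, l in games)
        return max(permutations(teams), key=explained)

    games = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "C")]
    print(best_ordering(["A", "B", "C"], games))  # ('A', 'B', 'C'): 3 of 4 explained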
Note: For the top-25 version of the page, the number counts all the games explained by the ranking's 25 teams. If East State is ranked 10 and beats West Academy, which is not in the top 25, that win is explained. If neither team is in the top 25, the win is not explained, but the game still counts in the denominator. Thus, all systems come in with far fewer wins explained.
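In terms of the earlier sketch, the top-25 scoring changes the denominator and the treatment of unranked teams; something like this (illustrative, assuming unranked teams simply don't appear in the top25 dict):

    # Top-25 variant: every game counts in the denominator; a win is
    # explained if the winner is ranked and the loser is either unranked
    # or ranked lower (a larger rank number) than the winner.
    def explained_average_top25(top25, games):
        explained = sum(1 for winner, loser in games
                        if winner in top25
                        and (loser not in top25 or top25[winner] < top25[loser]))
        return int(1000 * explained / len(games))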
My observation is that there is a lot of variation week-to-week in how the various systems score on these criteria.
(sample)
Overall  System     -  CurWeek  Week9  Week8   Week7  ...
-------  ------     -  -------  -----  -----   -----  ...
1/959    MAS        -  6/993    2/983  1/965   2/953  ...
2/959    Consensus  -           1/984  4/963   3/952  ...
3/957    RUD        -  8/991    4/980  2/965   1/954  ...
4/954    BCM        -  1/996    3/981  5/962   5/948  ...
5/954    CLA        -  2/995    5/978  17/956  4/948  ...
...
In addition to the rankings on Kenneth's comparison pages, two more are listed. Consensus is the mean ranking from Kenneth's page, and Con2004 (or whatever year) is the final mean ranking from the previous year. The latter is compared with each week's rankings.
At the bottom of each data column are the count of rankings that week, the mean of their wins-explained averages, the standard deviation, the minimum, the median, the maximum, and the range (maximum minus minimum).
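These summary figures are ordinary descriptive statistics over a week's column of wins-explained averages; a sketch using Python's statistics module (the sample values are made up, and I use the sample standard deviation here, since the page doesn't say which convention applies):

    import statistics

    # Summary line for one week's column of wins-explained averages.
    def column_summary(values):
        return {"count":  len(values),
                "mean":   statistics.mean(values),
                "stdev":  statistics.stdev(values),
                "min":    min(values),
                "median": statistics.median(values),
                "max":    max(values),
                "range":  max(values) - min(values)}

    print(column_summary([993, 984, 983, 981, 980, 978]))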
Wobus Sports: www.vaporia.com/sports