These are correlations of various sports team rankings (college football and men's basketball), comparing earlier weeks' rankings with the most recent mean ranking on Kenneth Massey's comparison pages. The aim is to see how well each individual ranking managed to match what is now considered a good ranking, back when it had less data on which to base its assessment.
Data source is Kenneth Massey's comparison pages.
There are numerous systems for rating and ranking college sports teams, and the web, with comparison pages such as Kenneth Massey's, makes it easy to find and compare quite a number of them.
How can we know which system is best? There are various ways to test them, and these correlations represent one approach, which I've applied to the rankings listed on Kenneth's ranking comparison pages. The approach is motivated by an assumption: that as the season progresses, the latest average (mean) ranking on Kenneth's page, which he derives from the mean of all the individual rankings listed, ends up pretty close to the "correct" ranking, i.e., close to the best possible ranking based upon the games played so far. This assumption matches my intuition, but I don't know any way to prove it. I do observe, however, that as the season progresses, the rankings produced by the various systems grow more consistent with each other and closer to this mean ranking. The question I've addressed is this: among all the individual ranking systems, which one produced rankings closest to this "nearly best" ranking in earlier weeks?
It would be great, during the middle of the season, to know how close some particular ranking is to rankings that will be produced later in the season. For example, it would be nice if during "week 8" of the season, you knew which system was producing rankings closest to what will generally be seen in rankings during, say, "week 10". But naturally we cannot see into the future.
Yet we can do the same thing after the fact. Once we've reached "week 10", we can take a generally accepted ranking and look back to see which system was getting close to it back during "week 8". That is what these correlations do: they correlate the rankings produced by systems in earlier weeks with the most recent mean ranking, which we assume to be pretty close to the best ranking available.
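As a concrete sketch (in Python), one such correlation might be computed as below. Spearman rank correlation is assumed here purely for illustration, and the team names and ranks are made up; this shows the shape of the computation, not the exact procedure behind these tables.

    # Sketch: correlate one system's "week 8" ranking with the current
    # mean ranking. Spearman rank correlation is an assumption for
    # illustration; teams and ranks are invented.
    from scipy.stats import spearmanr

    week8_rank   = {"Texas": 1, "USC": 2, "Penn State": 3, "LSU": 4, "Ohio State": 5}
    current_mean = {"USC": 1, "Texas": 2, "LSU": 3, "Penn State": 4, "Ohio State": 5}

    teams = sorted(set(week8_rank) & set(current_mean))   # teams ranked in both lists
    rho, _ = spearmanr([week8_rank[t] for t in teams],
                       [current_mean[t] for t in teams])
    print(round(rho, 3))   # higher means week 8 was already close to the current mean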
Seeing that a ranking system produced results closest to the "week 10" rankings back in "week 8" suggests that it has done the best job so far, so we might conclude that that system's final season rankings are the best of the lot. However, there remains the possibility that some ranking systems do not work so well with just a few weeks' data yet produce excellent results near and at the end of the season. Indeed, these correlations turn up pairs of rankings such that one of the two correlates more closely with the final ranking during the early weeks while the other correlates more closely during the later weeks. So this is not a foolproof test of ranking quality.
(sample)
Overall  System     -  CurWeek  Week9   Week8   Week7   ...
-------  ------     -  -------  -----   -----   -----   ...
1/959    MAS        -  6/993    2/983   1/965   2/953   ...
2/959    Consensus  -  1/984    4/963   3/952           ...
3/957    RUD        -  8/991    4/980   2/965   1/954   ...
4/954    BCM        -  1/996    3/981   5/962   5/948   ...
5/954    CLA        -  2/995    5/978   17/956  4/948   ...
...
In addition to the rankings on Kenneth's comparison pages, two more are listed. Consensus is the mean ranking from Kenneth's page, and Con2004 (or whatever year) is the final mean ranking from the previous year. The latter is compared with each week's rankings.
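As a rough sketch of what such a consensus column involves, a mean ranking can be formed by averaging each team's rank across the individual systems and then ordering teams by that average. The systems, teams, and ranks below are invented, and details such as teams missing from some lists are not addressed.

    # Sketch: consensus ranking as the mean of each team's rank across
    # systems. Systems, teams, and ranks are invented for illustration.
    rankings = {
        "MAS": {"Texas": 1, "USC": 2, "LSU": 3},
        "RUD": {"USC": 1, "Texas": 2, "LSU": 3},
        "BCM": {"Texas": 1, "LSU": 2, "USC": 3},
    }
    teams = sorted(next(iter(rankings.values())))
    mean_rank = {t: sum(r[t] for r in rankings.values()) / len(rankings) for t in teams}
    consensus = sorted(teams, key=lambda t: mean_rank[t])   # lowest mean rank first
    print(consensus)   # ['Texas', 'USC', 'LSU'] for these made-up numbers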
At the bottom of each data column are summary statistics for that week: the count of rankings, and the mean, standard deviation, minimum, median, maximum, and range (maximum minus minimum) of their correlations.
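In Python terms, those column summaries amount to a handful of standard statistics, roughly as below; the correlation values are invented, and sample (rather than population) standard deviation is assumed.

    # Sketch of the per-column summary statistics on invented values.
    import statistics

    col = [0.993, 0.984, 0.991, 0.996, 0.995]   # one week's correlations (made up)

    print("count :", len(col))
    print("mean  :", round(statistics.mean(col), 3))
    print("stdev :", round(statistics.stdev(col), 4))   # sample std dev assumed
    print("min   :", min(col))
    print("median:", statistics.median(col))
    print("max   :", max(col))
    print("range :", round(max(col) - min(col), 3))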
Wobus Sports: www.vaporia.com/sports