General Algorithm Overview:
The development and implementation of the algorithm that rates college football teams on this blog occurred over a very short period of time (roughly a half hour, so make any assessments of quality you wish to), mostly due to some lingering boredom and a semi-interest in the challenge of ranking college football teams.
Primarily, this rating algorithm was developed to rate college football teams, but it could be adapted to other sports as well. Because 1-A football is the only important team sport without a postseason tournament or playoff to determine an annual champion, ranking its teams is given a special importance amongst all the rankings of the world (sounds dramatic, does it not?). No one really cares about college basketball rankings because ultimately there is a tournament that decides things on the court, and until there is a college football tournament, amateurs like me will continue to develop stupid computerized rankings like the one highlighted on this blog. It’s like revenge of the nerds, but with sports.
(1) Purpose of this ranking.
This ranking system does not attempt or pretend to in any way be a useful predictor of future performances, but is rather meant to provide a ranking of what a team has accomplished on the field so far, using a formula based on my own personal biases and philosophies regarding how awesome a team truly is. Most computer rankings claim to remove the bias of rating and ranking sports teams, which is quite noble. I, on the other hand, have no problem admitting that I set up the algorithm based on what I consider to be important and impressive about winning football games.
I mostly developed this ranking system for my own personal enjoyment, which is somewhat of a greedy reason to do anything. If it would make you feel better if I were doing this for the betterment of mankind or for the noble pursuit of honor and valor, by all means, live in whatever delusional realm you reside in.
(2) General mathy explanation.
This ranking system is by no means the most “mathematical” college football ranking system in existence, but being the owner of a computer, a couple of degrees in engineering, and a general obsession with college football makes me more than qualified enough to attempt to rank college football teams (at least more qualified than a lot of the people who are allowed to cast votes for the traditional human polls). Generally, this ranking system assigns a weight to each game a team plays based on the location of the game and the opponent’s relative strength. The weights of each individual game are then used to compute a pseudo-central mean for each team, and each team is assigned a strength scale based on the relative mean game weight. The scales are determined in such a way that mediocre teams will tend to have a scale factor of about 0.75-1.25, with better teams having scales greater than and worse teams having scales less than this centralized mediocrity.
Weighting a Team’s Games:
Not all games were created equal. In fact, any particular game could be anything from a pointless afterthought to a season-defining victory, a crushing defeat, an incredible upset, or (most likely) something in between. When attempting to assign a weight to each particular game’s outcome for each team, several variables could potentially be factored in, including the basic who, what, where, when, why, and how of the game, as well as more intricate factors like the surface the game was played on, a team’s weekly injuries, or the hotness of the team’s cheerleading squad. Ultimately, this rating system opted to limit the considered factors to the following simple criteria:
(1) Where the game is played matters.
Winning on the road should be rewarded more than winning at home, losing at home should be punished more than losing on the road, and winning or losing on a neutral field should lie somewhere in between. It is an undeniable fact of life that it is more difficult for a college football team to win on the road than it is to win at home. In a perfect computer ranking scheme, the relative effect of this home field advantage would vary based on the particular home field, since winning certain road games at certain locations is quite a bit easier than it would be at other locations.
Specifically for this ranking system, a simple home field advantage scaling factor will be determined weekly based on the outcomes of home teams versus road teams. The home field advantage factor will be defined as the square root of the ratio of home victories to road victories (which should end up in the 1.2-1.5 range and will be updated as the season unfolds). This factor will scale road victories and home defeats to weigh more, home victories and road defeats to weigh less, and will not alter neutral field victories or defeats. I probably weight road victories and home losses way too much (and also home victories and road losses too little), but that is how I decided I wanted to arbitrarily weigh games, so deal with it.
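For the code-minded, the home field advantage factor described above might be computed roughly like this. It is only a sketch: the function name, the win tallies, and the fallback for a week with zero road victories are all made up for illustration and are not taken from the actual ranking program.

```python
import math


def home_field_factor(home_wins: int, road_wins: int) -> float:
    """Square root of the ratio of home victories to road victories."""
    if road_wins == 0:
        # Assumption (not from the post): with no road wins the ratio is
        # undefined, so fall back to "no home field adjustment".
        return 1.0
    return math.sqrt(home_wins / road_wins)


# Hypothetical tallies: 300 home wins and 180 road wins so far this season.
factor = home_field_factor(300, 180)
print(round(factor, 3))
```

With those invented tallies the factor comes out near 1.29, inside the 1.2-1.5 range mentioned above.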
Ultimately, road victory > neutral victory > home victory > road loss > neutral loss > home loss.
(2) Who the opponent is matters.
Wins against better teams are more impressive than wins against worse teams. This may seem obvious, but the traditional human polls tend to neglect this factor as long as the teams on top of the polls keep winning. Also, human polls tend to lower a team’s ranking after a loss regardless of whether or not the loss came against a better team. It has gotten to the point where the human polls rank teams based on who lost least recently relative to an arbitrary starting point called the preseason polls.
I wanted my ranking scheme to be more than just a sorting of the teams based on overall winning percentage, and I definitely did not want anything resembling the arbitrariness of preseason polls. I think an accurate ranking system should consider every game individually every week, since as more information is gathered, certain game outcomes may become more or less impressive than when the game was actually played (especially with regard to teams that are underrated or overrated early in the season). Practically, a win against a crappy team is just a win against a crappy team, even if that crappy team is historically a big name program.
Specifically for this ranking system, game results will be scaled based on the opponent team’s numerical scale. Victories will be multiplied by the opponent’s scale, and defeats will be divided by the opponent’s scale, another arbitrary decision on my part that at least appears to weigh games how I think they should be. I probably put too much emphasis on strength of schedule, but as with home field advantage, this is my ranking scheme and this is my personal bias.
Ultimately, victory over a good team > victory over a bad team > loss to a good team > loss to a bad team.
(3) The final outcome of the game matters, but the score does not.
This is probably the most controversial of the considerations included. For the purposes of this ranking scheme, wins and losses are all that matter. This decision was made because the final score of the game is the result of too many factors that would need to be considered and filtered out, including offensive and defensive strategy, coaching mentality, when the losing team gives up, weather conditions, in-game circumstances, emotional circumstances, and other factors too numerous to list.
Ultimately, a 17-10 win could represent as dominant or as impressive a victory as a 49-10 win, even if the same teams are involved in both games. A system attempting to predict the future would no doubt need to include scoring data in some way, but that is not the purpose of this system. I have no problem with a team running up the score (in most circumstances, good sportsmanship is drastically overrated), but when it comes to ranking teams, all I care about is wins, losses, and location. Just win, baby.
Specifically for this ranking system, the final score of each game will only be considered as far as determining which team won the game. The point differential, point ratio, or total points scored are not considered in any way. Many other ranking systems do include point data, and that is good for them. I do not, and that is the way it is.
Determining a Team’s Scaling Factor:
The basic challenge of this ranking scheme is to assign a scaling factor to each team based on the results of that team’s games played.
(1) Initializing each team.
To start the calculation, every team is assigned a scale factor of 1.0, meaning all teams start each week as equals before the final rating is determined. Some might think that initializing each team equally to start is too politically correct or not based on reality, but I do not really care. Starting every team with an equal strength factor makes sense to me, and that is all that really matters. Week to week, a team’s scale factors are related only through the games the calculations have in common.
(2) Raw game weight mean.
Using the results of the games a team has played and the location where each game was played, a pseudo-central mean is calculated for the game results of each team. Game results are scaled based on where the game is played as well as the strength of the opponent. A game outcome is given an initial weight of 1.0, but is then scaled as follows:
a. Home victory: Multiplied by the opponent’s scale and divided by the home field advantage factor.
b. Neutral victory: Multiplied by the opponent’s scale.
c. Road victory: Multiplied by the opponent’s scale and multiplied by the home field advantage factor.
d. Home defeat: Divided by the negated opponent’s scale (which makes every defeat weight negative) and multiplied by the home field advantage factor.
e. Neutral defeat: Divided by the negated opponent’s scale.
f. Road defeat: Divided by the negated opponent’s scale and divided by the home field advantage factor.
Then, the pseudo-central mean of the game weights for the games played by each team is computed, and this value represents the raw game weight mean of the team. The pseudo-central mean is something between a true central mean and a straight average, to smooth the effect of outliers.
As can be seen, it is recommended not to lose to a bad team at home, and it is strongly recommended to beat good teams on the road.
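The six cases a. through f. can be sketched as a single function. The names `game_weight`, `won`, `site`, `opp_scale`, and `hfa` are invented here for illustration, and the sketch stops at the individual game weight because the pseudo-central mean is not spelled out precisely enough to code up.

```python
def game_weight(won: bool, site: str, opp_scale: float, hfa: float) -> float:
    """Weight of one game from one team's point of view.

    site is "home", "neutral", or "road" for the team being rated;
    hfa is the home field advantage factor.
    """
    if site not in ("home", "neutral", "road"):
        raise ValueError(f"unknown site: {site}")
    weight = 1.0  # every outcome starts at 1.0
    if won:
        weight *= opp_scale       # beating a strong team is worth more
        if site == "home":
            weight /= hfa         # home wins count for less
        elif site == "road":
            weight *= hfa         # road wins count for more
    else:
        weight /= -opp_scale      # defeats are negative; bad losses hurt more
        if site == "home":
            weight *= hfa         # home losses are punished more
        elif site == "road":
            weight /= hfa         # road losses are punished less
    return weight


# Hypothetical numbers: opponent scale 2.0, home field factor 1.25.
for won in (True, False):
    for site in ("road", "neutral", "home"):
        print(won, site, game_weight(won, site, 2.0, 1.25))
```

Against the same opponent, the six printed weights fall in the order road victory > neutral victory > home victory > road loss > neutral loss > home loss.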
(3) The actual team scale factor.
The team scale factor is determined in a pretty simple way. The average and standard deviation are calculated for the raw game weight means for the teams. Then, teams with raw game weight means greater than the average are assigned a scale factor that is one plus the number of standard deviations the team’s raw game weight mean is above the average value. Teams with raw game weight means less than the average are assigned a scale factor that is the reciprocal of one plus the number of standard deviations the team’s raw game weight mean is below the average value. Symmetry is a beautiful thing.
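A rough sketch of that mapping from raw game weight means to scale factors follows. The function name and the sample numbers are invented, and the use of the population standard deviation is an assumption, since the description above does not say whether the population or sample version is used.

```python
import statistics


def scale_factors(raw_means: dict) -> dict:
    """Map each team's raw game weight mean to a scale factor."""
    values = list(raw_means.values())
    avg = statistics.mean(values)
    # Assumption: population standard deviation.
    sd = statistics.pstdev(values)
    scales = {}
    for team, raw in raw_means.items():
        if sd == 0:
            # Assumption: if every team is identical, everyone stays at 1.0.
            scales[team] = 1.0
            continue
        z = (raw - avg) / sd  # standard deviations above (+) or below (-) average
        # Above average: 1 + z. Below average: reciprocal of 1 + |z|.
        scales[team] = 1 + z if z >= 0 else 1 / (1 - z)
    return scales


# Hypothetical raw means for three made-up teams.
print(scale_factors({"A": 2.0, "B": 0.0, "C": -2.0}))
```

In this made-up example, teams A and C sit equally far above and below the average, so their scale factors are reciprocals of each other, and B lands exactly on 1.0.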
(4) Convergence and iteration.
The raw game weight mean and scale factor are calculated for each team repeatedly until the scale factors for the teams converge to their final values. This may take several iterations, but the computer is doing all the work, so that does not really bother me.
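Putting the pieces together, the iteration might look something like the sketch below. Several things here are assumptions rather than the actual program: the `(winner, loser, winner_site)` tuple layout, the names `rate`, `tol`, and `max_iter`, the population standard deviation, and above all the plain arithmetic mean, which stands in for the pseudo-central mean because the latter is not fully specified. Convergence is not guaranteed in general, so the loop is capped.

```python
import statistics

SITE_EXP = {"home": -1, "neutral": 0, "road": 1}
FLIP = {"home": "road", "road": "home", "neutral": "neutral"}


def rate(games, hfa, tol=1e-9, max_iter=1000):
    """games: list of (winner, loser, winner_site) tuples."""
    teams = sorted({t for w, l, _ in games for t in (w, l)})
    scale = {t: 1.0 for t in teams}  # every team starts as an equal
    for _ in range(max_iter):
        weights = {t: [] for t in teams}
        for winner, loser, site in games:
            # Victory: times opponent scale, adjusted for location.
            weights[winner].append(scale[loser] * hfa ** SITE_EXP[site])
            # Defeat: negative, divided by opponent scale, adjusted for location.
            loser_exp = SITE_EXP[FLIP[site]]
            weights[loser].append(-(hfa ** -loser_exp) / scale[winner])
        # Stand-in: plain mean instead of the pseudo-central mean.
        raw = {t: statistics.mean(w) for t, w in weights.items()}
        avg = statistics.mean(list(raw.values()))
        sd = statistics.pstdev(list(raw.values()))
        new = {}
        for t in teams:
            z = (raw[t] - avg) / sd if sd else 0.0
            new[t] = 1 + z if z >= 0 else 1 / (1 - z)
        if max(abs(new[t] - scale[t]) for t in teams) < tol:
            return new  # scale factors stopped moving
        scale = new
    return scale  # cap reached; values may not have converged


# Hypothetical three-team season with a home field factor of 1.25.
games = [("A", "B", "home"), ("B", "C", "home"), ("A", "C", "road")]
ratings = rate(games, 1.25)
for team in sorted(ratings, key=ratings.get, reverse=True):
    print(team, round(ratings[team], 3))
```

In this toy season the undefeated team A ends up on top, B sits at about 1.0, and winless C lands at the bottom.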