Ahead of the release of the first BCS poll (is it “Broken Cash Scheme” or “Bowl Conspiracy Series”–I digress), I decided to tackle the issue of polling, the single most controversial system for determining a champion in all of sports. You know, the one where a college football champion is given a big glass Fritos bowl based on some complicated (or is that convoluted) formula that no one really understands (and that’s because you are not meant to. But if you enjoy the room spinning around until you pass out from confusion, here is one explanation of a formula that makes the theory of relativity look rather pedestrian).
Ahh, the BCS, the merging of computer and human polls that tells us that you just can’t believe what you see with your own eyes anymore.
What motivated me to write this is the following two polls: the first is the AP poll released early this week after the University of South Carolina’s stunning defeat (and yes South Carolina fans, it was stunning as you’ve never really won a conference game of any consequence since the BCS system was put in place, so try and be a little gracious) of the Alabama Crimson Tide, who had won 19 straight games and had been sitting on top of just about every poll ever created by anyone. The loss by the Tide not only disappointed their fans, but set in motion a firestorm that has fans in Boise, Columbus, Eugene, and Lincoln, amongst other places, all claiming the rights to the glass throne. The second poll printed below is a simulated prediction of how the BCS standings could look through week six. In that order, the two polls:
The Associated Press Poll (AP) for week 6: (1) Ohio State; (2) Oregon; (3) Boise State; (4) TCU; (5) Nebraska; (6) Oklahoma; (7) Auburn; (8) Alabama; (9) LSU; (10) South Carolina; (11) Utah; (12) Arkansas; (13) Michigan State; (14) Stanford; (15) Iowa; (16) Florida State; (17) Arizona; (18) Wisconsin; (19) Nevada; (20) Oklahoma State; (21) Missouri; (22) Florida; (23) Air Force; (24) Oregon State; (25) West Virginia.
Now compare these results to a poll generated at a site called the BCS guru, who claims to use the exact formula utilized in the BCS equation, to generate a projected BCS poll:
(1) Boise State; (2) Oregon; (3) TCU; (4) Oklahoma; (5) Ohio State; (6) Nebraska; (7) LSU; (8) Auburn; (9) Alabama; (10) Michigan State; (11) South Carolina; (12) Utah; (13) Stanford; (14) Missouri; (15) Arkansas; (16) Oklahoma State; (17) Nevada; (18) Iowa; (19) Arizona; (20) Florida State; (21) Wisconsin; (22) Air Force; (23) Florida; (24) Michigan; (25) Oregon State.
Now, as my readers know, I participate in a fan poll at the Best Damn Poll.com, a site created by the fans for the fans, fed up with the stuff that appears in the AP poll (the AP poll is a conglomerate of media people who apparently draw names out of a hat to rank the teams in between churning out their pithy columns) and the Coaches Poll (a real novel concept of having the coaches of the actual teams, who are busy doing, of all things, coaching, cast ballots even though they have a vested interest in the outcome. If that weren’t bad enough, these ballots are top secret, so you can’t see whether the coach of All American U, who stands to make millions of dollars based on these rankings, gave his team an unwarranted bump and/or sabotaged one of his competitors). The long and short of it is that I spend hours each week analyzing polls, and I would never do silly things like rank a one-loss South Carolina team two spots below a one-loss Alabama team one week after SC beat Alabama (or rank Iowa two places ahead of Arizona despite the fact that the teams have identical records and Arizona beat Iowa this season). Perhaps these people in the media are just smarter than I am, but I decided to use the old-fashioned “results of head-to-head contests on the field” when such information is available to me. Yep, I’m silly like that.
Now, looking at the two polls, and knowing that I am admittedly a fan of The Ohio State University, you can probably see what got my attention. When South Carolina pulled the upset and knocked off Alabama last weekend, with Ohio State sitting #2 in most polls since the preseason, many assumed the AP would simply bump the Buckeyes up to their rightful place as the nation’s number one team (and in fact, that’s exactly what the AP voters did). Surely the BCS guru is just messing with us (and lest you think it is just one crackpot, check out Brad Edwards’ projections completed for the Evil Empire here). No, according to both of these projected polls, The Ohio State University, undefeated and never really challenged but for a half on the road against Illinois in gusty 20 m.p.h. winds, sits fifth, while Boise State, the greatest show on that cute Smurf Turf, sits at the top of the heap (yep, I know what a heap is and I choose my words carefully), ready to play the Oregon Ducks (those same Ducks that Ohio State soundly beat in the Rose Bowl just six games ago, with mostly the same players). How can this be, you ask?
Luckily for me, I had the perfect tool to test the polls. Prior to the season starting, I identified the variables I thought would be the best predictors of the relative strengths of the contenders this season. Unfortunately for me, the project was more daunting than I imagined, and I didn’t get the spreadsheet completed until about a month ago. Thinking that it was too late to utilize my data, I left it to wither in obscurity on my hard drive, figuring that it had the same utility at this point as an eight-track tape player. Looking at it a few days ago to see if it might shed some light on the BCS forecasts, I was a bit surprised at my findings.
Think things are crazy in college football now? Just wait until Boise State is ranked #1 in the first BCS poll to be released this weekend, if some of the so-called experts are correct.
Now, before I get into my methodology, a caveat for my readers: I was not that far down the winding path of my college career when I determined I was not meant to be the world’s next great mathematician (call me Bad Mike Counting, the antithesis of the fictional janitorial character that solved the most challenging of math proofs, if that’s even a thing). So, while someone with exceptional skills may scoff at my methodology, it’s important for my readers to know that the point of my attempt was to forecast the relative strengths of each football program heading into the 2010 season, rather than predict their order of finish. That is to say, while I generated a numerical score for each team used to rank them, the actual value of the score is less important to me than the weight assigned to each of the variables utilized in the spreadsheet. This is because the numbers for each of the variables are independent of one another and thus were never scaled to be compared against one another in the first place. But as I suggest, this doesn’t really matter so long as the mad scientist decides ahead of time the weight he intends to give each variable, to ensure that his results are not generated with a specific goal in mind (e.g. getting those Buckeyes at or near the top).
The variables that I considered most important are as follows: 1) how many total wins the team had in 2009, weighted by the strength of that team’s schedule; 2) how many starters the team returned on each side of the ball, with extra weight given for those teams that returned their starting quarterback (further broken down by the number of years of experience the quarterback had prior to that season); and 3) the combined value of the four recruiting classes, weighted so that juniors and seniors (those players with the most experience and the likely starters) are given extra weight in the formula.
In crafting my formula, I very much wanted to account for the differences in the coaches, theorizing that the strongest teams most likely have the best leaders. This, however, proved to be very difficult to do, as I could not seem to come up with a set of metrics that I felt could accurately assess the strengths and weaknesses of each team’s head coach. For example, I considered using the obvious, winning percentage, but felt it was problematic weighing someone like, say, Joe Paterno, who has almost four hundred wins (but whose teams have struggled a bit, at least this season) against someone like Jim Tressel, who just this week won his 100th game (and what does this say to anyone? Could it be that the former’s best days are behind him while the latter is in his prime?). While I feel like the head coach is a huge factor in a team’s success, not having a separate variable is not problematic for me because I believe the influence of the coach is factored into how well the program did the year before, and into the recruiting numbers, as there most certainly is a correlation between the good coaches and the prospects they are able to recruit.
This limitation aside, I constructed the formula in three parts. The first part looks at the number of wins the program had last year. I started here because it’s my premise that contenders rarely come from the ranks of teams that won 3 or 4 games the previous season. As a whole, when looking at the numbers, more often than not your BCS champion won somewhere between 8 and 13 games the previous season, which allowed me to narrow my contenders to the top 61 teams from last season, the 7-8 win breaking point. So, in looking over my spreadsheet, some of the most diehard college football fans will notice that the Notre Dames and Michigans of the world need not apply. I then took the number of wins last year and multiplied it by Sagarin’s strength of schedule rating, theorizing that a team’s wins against better competition were a better indicator of that team’s overall strength.
My next set of numbers had to do with the number of starters returning. My theory is that this provides a huge advantage for a team, both in terms of the experience of the players and in terms of familiarity with the system the coach intends to run. While doing this, though, it struck me that more than any one player, the quarterback is absolutely essential to a power team’s chances of making a championship run. For each of the 61 teams, I looked to see: 1) was the quarterback returning; and 2) if so, how many years of experience did the QB have (since I had to draw a cutoff somewhere, the QB was only given credit for a season if he had at least 100 pass attempts in it). I then assigned a value in equal increments of 10, 20, and 30 for 1, 2, and 3 years of experience running a team’s offense, respectively. To further underscore the importance of a starting QB, I assigned a value of -20 to any team that lost its starting QB, on the premise that a transition in the offense was in the offing. For the starters, I multiplied the number of returning starters by 7.5, a fairly arbitrary number that was chosen more because it helped balance out the weight of the three categories than because this number was somehow significant in and of itself. For my returning player metric, I simply added the returning starters figure to the QB returning figure to come up with a second set of numbers.
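For readers who think better in code than in prose, here is a minimal Python sketch of the returning player metric as described above. The function name and the two example teams are hypothetical, and the sketch assumes that a returning quarterback with no qualifying season earns no bonus, a case the description leaves open.

```python
def returning_player_metric(returning_starters, qb_returns, qb_years=0):
    """Returning starters score plus the quarterback adjustment."""
    # 7.5 points for each returning starter
    starters_score = returning_starters * 7.5
    if not qb_returns:
        # Penalty for losing the starting quarterback
        qb_score = -20
    else:
        # 10, 20, or 30 points for 1, 2, or 3 qualifying seasons.
        # Assumption: zero qualifying seasons earns nothing; capped at 3.
        qb_score = 10 * min(max(qb_years, 0), 3)
    return starters_score + qb_score


# Hypothetical teams, not taken from the spreadsheet
veteran = returning_player_metric(15, qb_returns=True, qb_years=2)
rebuilding = returning_player_metric(10, qb_returns=False)
print(veteran, rebuilding)  # 132.5 55.0
```

The quarterback swing is large on purpose: the gap between a three-year starter returning (+30) and a starter lost (-20) is worth as much as six or seven returning starters at any other position.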
Finally, the third category was meant to measure each team’s recruiting success covering the four years of players that would comprise each team’s roster. A few notes are worth mentioning here though. It was beyond the scope of my spreadsheet to delve into details such as how many freshmen a team chose to redshirt (have them sit out, preserving a year of eligibility), meaning that the value of the first year players might be slightly inflated. The flip side is that I did not include the numbers for the fifth year players, which would undoubtedly include some players that redshirted at some point in their career, meaning that the value of some seniors is not included in these figures. Simply put, it would have been too difficult to ascribe the correct value for this category, as undoubtedly many of the fifth year players have already graduated, may not actually be starters, or for various other reasons are no longer on the team (quit, transferred, etc.). The important thing to keep in mind is that this is meant to be an approximation of recruiting strength, not an exact figure. To balance things out, I counted the freshmen and sophomores at a value of 1, and the juniors and seniors at a value of 2, reasoning that the latter players likely make up the bulk of this year’s starters for each team. I then added these figures together for my third metric.
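The class weighting above amounts to a simple weighted sum, sketched here in Python. The per-class ratings in the example are made-up numbers for illustration only, since the source and scale of the recruiting ratings are not spelled out here, and the names are hypothetical.

```python
# Weights from the text: freshmen and sophomores count once,
# juniors and seniors count twice.
CLASS_WEIGHTS = {"freshman": 1, "sophomore": 1, "junior": 2, "senior": 2}


def recruiting_metric(class_ratings):
    """Weighted sum of the four recruiting class ratings."""
    return sum(CLASS_WEIGHTS[year] * rating
               for year, rating in class_ratings.items())


# Made-up ratings for a hypothetical team
example = {"freshman": 90.0, "sophomore": 85.0, "junior": 80.0, "senior": 95.0}
print(recruiting_metric(example))  # 90 + 85 + 2*80 + 2*95 = 525.0
```

Because the upperclassmen count double, a program whose best classes are now juniors and seniors scores higher than one with the same four ratings in the reverse order.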
The final part of the formula was for me to decide the relative weights to give each of these three metrics. As I discuss this, I will admit that it was simply my decision how much value to place on each of these categories, and I do realize that changing the weights on them affects the outcome of the power rankings. That said, it’s important for the reader to know that I made this decision prior to running the numbers, in order to test a theory rather than to produce results with a certain objective in mind. Since the numbers have not been normed for comparison, it became necessary to add multipliers to numbers that otherwise would have been overvalued or undervalued. For example, because I was using recruiting numbers for four seasons, left unadjusted, recruiting numbers for the best teams reached the 600 mark (Alabama had the highest four year recruiting mark, at 599.07 actually), whereas the highest number for the returning starters’ metric was 177.5 (Boise State, who returned 20 of 22 starters this year). Simply put, I did not want my formula to be completely dominated by recruiting figures, numbers which some consider suspect in themselves (last year’s record helps to offset the fact that some highly recruited prospects just don’t pan out as expected). In the final formula, I added the previous year’s win total metric to the returning starters metric and multiplied the sum by 2. I then took the four year recruiting figure and multiplied that number by .67. Again, there is nothing magic about my adjustments to the formula; they merely represent the amount of weight I chose to give each of the various categories. For a sense of scale, before the multipliers are applied the recruiting figure is roughly two times the value of the previous season’s win metric and the returning players metric combined (e.g. Alabama had figures of 599.07 and 301.38, respectively).
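Put together, the final formula can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes the 301.38 quoted for Alabama is the sum of the win metric and the returning player metric; under that assumption the result matches Alabama's total in the preseason rankings.

```python
def pppi(wins_plus_returning, recruiting):
    """Weighted total: 2 x (win metric + returning metric) + 0.67 x recruiting."""
    return 2 * wins_plus_returning + 0.67 * recruiting


# Alabama's figures as quoted in the text
alabama_combined = 301.38    # win metric plus returning player metric
alabama_recruiting = 599.07  # four year recruiting figure

alabama_total = pppi(alabama_combined, alabama_recruiting)
print(round(2 * alabama_combined, 2))       # 602.76
print(round(0.67 * alabama_recruiting, 2))  # 401.38
print(round(alabama_total, 2))              # 1004.14
```

Seen this way, the multipliers flip the raw proportions: after weighting, Alabama's on-field and experience component (602.76) outweighs its recruiting component (401.38) by about three to two.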
Looking through these glasses, you might be tempted to pencil the Ducks into the Championship game if played after week six. As a famous college analyst is fond of saying, “Not so fast, my friend.”
With this explanation out of the way, here are my preseason power rankings with the totals in parentheses (the raw numbers obviously mean little, but they allow the reader to compare the difference between the teams in a numerical fashion), hereafter referred to as the Pole Position Power Index (or the soon to be famous PPPI rating, because not just anyone can throw numbers together):
1. Alabama (1004.14); 2. Boise State (917.63); 3. Florida (915.60); 4. Virginia Tech (87.06); 5. Texas (869.34); 6. Ohio State (866.82); 7. Miami (859.17); 8. LSU (856.81); 9. TCU (846.47); 10. Arkansas (842.86); 11. North Carolina (836.72); 12. Oregon (836.27); 13. South Carolina (827.93); 14. Auburn (814.18); 15. Missouri (812.26); 16. Iowa (811.92); 17. Nebraska (808.00); 18. Wisconsin (794.06); 19. Houston (780.60); 20. Oklahoma (777.85); 21. Georgia Tech (773.64); 22. Clemson (766.07); 23. Florida State (762.75); 24. Georgia (761.60); 25. U.S.C. (761.48).
Now, it’s important to understand just what these numbers represent and what they do not. These numbers are simply power rankings based on the factors described above, and thus probably would function poorly as a predictor of actual finish. This is because, as any college football fan knows, a team’s ranking in the BCS is a weighted average of two human polls (Harris and Coaches, and obviously very subjective), plus computer polls that factor in such things as Sagarin’s strength of schedule (SOS hereafter) formula, a subject too complicated to delve into here. Since this poll was only attempting to ascertain the relative strength of the teams going into the season, factoring in SOS for the 2010 season made little sense, as a team’s overall strength has nothing really to do with the schedule that it will face that upcoming year (this is another way of saying that it is possible that a team could still be the best overall team though they played a weak schedule, a fact that many don’t understand). And let me state the obvious before I get the expected “how in the world can you have Oregon #12, you moron” type comments. Look, college football is still a game played by people, and, as is usually the case when people are involved, it is nearly impossible to predict behavior with any degree of certainty. No metric in the world will ever be able to capture desire, or predict that a quarterback will have a meltdown and turn the ball over five times against a five win team (I’m not thinking of Ohio State versus Purdue here, I swear), or that a program would sit 13 kids ahead of an NCAA investigation. It’s because we don’t know what will happen that we tune in to watch the games in the first place (just as an aside, I think Oregon’s low spot in the power rankings can be explained by the fact that a new QB stepped in to run the offense, and rather than decline, the offense actually improved. Just ask the Florida Gators how difficult this is to do).
These power rankings could only function to some degree as an accurate predictor of where each team would finish (assuming my weights are appropriate of course) if each team went undefeated this season, an impossibility since teams belong to conferences and will ultimately play each other head to head. But what if each of the top six teams went undefeated, not considering SOS for a moment? That’s right, Buckeyes fans: at least according to my metrics using variables to assess overall team strength, Ohio State would indeed be the sixth ranked team in the country. Is Ohio State really only the sixth best team in the country right now? I have no idea (though I do chuckle a bit at the meathead who has them #22 in his computer poll; if this Buckeye team in fact is only the 22nd best team in the country, then we truly are watching an unprecedented season of college superpowers). The point is that the Buckeyes just might be only the sixth best team in the country. The outrage that will ensue with the LeBron-esque BCS Reveal Show, a boring and anticlimactic one hour “special” that will release this miniature nuclear bomb on the college football world, will be wholly a function of fans expecting Ohio State to be the number one team simply because they were ranked #2 preseason and the number one team has already lost. Those of us conditioned to follow the polls like Pavlovian dogs simply expect the pollsters to bump the Buckeyes up before returning to the business of churning out useless drivel (now, I should say that I do my own poll and have the Buckeyes ranked number one based on my subjective, albeit possibly biased, assessment of their performance to date).
As I have intimated throughout this piece, if you change the values, the weights, or the categories considered, teams will move up or down in these power rankings accordingly. As such, I realize that my spreadsheet does not stand for the proposition that Ohio State should be ranked sixth (or anywhere for that matter). The purpose of this exercise is to demonstrate that variables can be put together in a way that challenges the conventional wisdom. It also wholly supports my contention that preseason polls are absolutely worthless, based wholly on the speculative analysis of pundits and so-called experts who are biased and flying by the seat of their pants, writing columns solely to stoke the coals and generate readership interest, not that this guy would be drawn in by such an article, you understand.