For me, one of the fascinating aspects of the NFL is its mind-boggling complexity. In an earlier post I pointed out that 'experts' regularly fail to predict the first round of the NFL draft better than random selection. Following up on that post, I will consider some of the difficulties associated with modeling NFL games. I've steered clear of the math in this one, so it may be a bit wordy.
Several articles related to Chip Kelly's arrival as an NFL head coach - such as this from Yahoo Sports Moneyball - suggested that Kelly and his use of a computer simulation program called Zeus would revolutionize the NFL. Here is a representative quote:
"Whenever Kelly does enter the league, he'll play the game aggressively, with "aggressively" meaning in a mathematically logical fashion. By the end of the season every coach will be going for it on fourth down, attempting fake punts, fake field goals, two-point conversions, and they'll likely do all of this oblivious to the fact that there's astounding mathematical evidence supporting the decisions they're making.
They'll just see Chip Kelly's team lighting up the scoreboard and follow suit because … well, 90 percent of NFL coaches are followers."
the Zeus computer program, which takes fourth-down situations such as the ones Oregon had against Arizona, and runs them to conclusion as many as a million times to determine the optimum play-call. The system incorporates the teams' characteristics, ball position, yards to first down, clock, and timeouts. Needless to say, these guys are smart.
Complex algorithms aside, the outcomes are incredibly simple. Zeus tells us that teams should almost always go for it on fourth and short, attempt more onsides kicks, go for two-point conversions, in other words, do all the things that Chip Kelly does on a routine basis. But the most important thing to take away from Zeus' findings is that the math isn't even close. The numbers are so overwhelming that teams that kick field goals on fourth and short at the 20-yard line aren't just wrong, they're so wrong it's ludicrous.
A few observations. First, Chip Kelly's first year was successful by any measure - 10-6 with a playoff loss. However, in that loss, and several times throughout the year, his propensity for going for it on fourth down got his team into trouble. He was also inconsistent with his fourth down calls, using what he called instinct, not just his magic program, to make his decisions.
Second, can you model something like going for it on 4th down? Answer, likely not very well. Notice that the prediction that every NFL coach would be going for it on every fourth and short because the numbers were so clearly in favor of going for it did not materialize. There are several related problems here.
One, NFL coaches are not idiots. If there were a dead obvious right decision to be made, they would have been making it. Also, NFL organizations are aware of the existence of mathematics and many teams employ folks to analyze all kinds of questions related to likely outcomes. This does not mean Chip Kelly's idea that 4th and short should be attempted more often than is the current norm is wrong. It only means that the likely deviation from some perfect answer is not going to be huge (i.e. some gross mistake ALL NFL coaches have been making for years) but rather small: MOST NFL coaches have been too conservative in SOME circumstances.
But, you say, the magic ZEUS says GO FOR IT. Well, Zeus is only as good as the model, and some things, like the NFL, are really hard to model. The article states that each situation is run "as many as a million times". Ha, ha, ha. I'm sure this is the reporter's error. Deep Blue, the first computer to beat a reigning world chess champion in a match, could evaluate roughly 200 million positions a SECOND and had all kinds of special techniques to search the likeliest lines of play. The NFL is many, many times more complex than chess. A million simulations in a game with 22 moving people is so absurdly small as to be funny. Using chess as an example, after one move each from white and black there are 400 possible positions. Given the combinations of ways teams align their players, there are at least several hundred positions BEFORE the play starts, and then ALL 22 players move simultaneously.
Next, consider the 4th and short scenario. On one hand, it should be simple to pick up a few yards. Also, short yardage plays are more likely to succeed than long plays. So if I only need to run the ball two yards, I am in business because the average NFL running play gains around 4 yards.
Here comes the complexity. If the defensive coordinator knows I am going to run the ball to gain a limited number of yards, he will switch his defense to stop the run. Now, if you run the ball against 8 or 9 man boxes your average per carry goes way, way down. The entire field of game theory was developed to deal with these kinds of scenarios and it turns out, not surprisingly, that real life situations are hard to evaluate. Now if I am Kelly and I know they suspect that I am going to run the ball, then maybe I should use play action and throw the ball. However, passes are riskier than runs, hence my chance of success goes down. Also, the closer I am to the goal line, the easier it is to defend the pass because the field is smaller. Further, if the defense decides that they think I am going to use play action . . .
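For the curious, here is a toy version of that cat-and-mouse game. It is a minimal sketch, not anything Zeus is known to do: the four conversion probabilities are numbers I made up purely for illustration, and the function name solve_2x2 is hypothetical. The offense picks run or pass, the defense picks a run or pass defense, and the code finds the mix at which neither side can gain by changing its call.

```python
def solve_2x2(a, b, c, d):
    """Solve a 2x2 zero-sum game by the indifference method.

    Rows are the offense's calls (run, pass); columns are the defense's
    calls (run defense, pass defense). Each entry is the offense's chance
    of converting:
        a = run  vs run defense     b = run  vs pass defense
        c = pass vs run defense     d = pass vs pass defense
    Returns (p_run, q_run_defense, conversion_probability).
    """
    # If the offense's best worst-case equals the defense's best
    # worst-case, one pure call is optimal and no mixing is needed.
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        raise ValueError("saddle point: a pure strategy is optimal")

    denom = a - b - c + d
    p_run = (d - c) / denom          # how often the offense should run
    q_run_defense = (d - b) / denom  # how often the defense should sell out vs the run
    value = (a * d - b * c) / denom  # conversion chance when both mix optimally
    return p_run, q_run_defense, value


# HYPOTHETICAL conversion probabilities, invented for illustration only.
RUN_VS_RUN_D, RUN_VS_PASS_D = 0.45, 0.80
PASS_VS_RUN_D, PASS_VS_PASS_D = 0.65, 0.40

p_run, q_run_defense, value = solve_2x2(
    RUN_VS_RUN_D, RUN_VS_PASS_D, PASS_VS_RUN_D, PASS_VS_PASS_D
)
print(f"offense runs {p_run:.0%} of the time")
print(f"defense plays the run {q_run_defense:.0%} of the time")
print(f"conversion chance at equilibrium: {value:.0%}")
```

With these invented inputs, a run converts 80% of the time against a defense that is not expecting it, but once both sides adjust the conversion chance settles at roughly 57%. The numbers themselves mean nothing; the gap between them is the point.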
So one of the problems is that there is no such thing as a 'simple' fourth and short because the dynamics of play-calling have such a huge impact on the outcome of each play. Further, there is a HUGE difference between 4th and short against the Seahawks or 49ers and the Jaguars.
Yet another complication is that trick plays are more likely to succeed when they are tricky. If I know you are going to go for it on fourth down, it removes one of the most effective ways of getting a 4th down conversion - say a fake punt. It is not that these kinds of dynamics can't be modeled, it is that the models generated tend to be very sensitive to initial conditions. So, if I assume the other team will play 7 men in the box when I need to run for 3 yards, then I should go for it. But that is clearly a false assumption. And whatever the odds say, they are predictive of an average outcome. Which is another problem.
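To make the sensitivity point concrete, here is a small sketch of a go-or-kick comparison. Everything in it is an assumption on my part: the expected-point values and the two conversion probabilities are invented, the function names are hypothetical, and a real model would have far more inputs. It only shows how the recommended call flips when one assumed input changes.

```python
def break_even_probability(ep_convert, ep_fail, ep_kick):
    """Conversion probability at which going for it and kicking are
    worth the same number of expected points."""
    return (ep_kick - ep_fail) / (ep_convert - ep_fail)


def best_call(p_convert, ep_convert, ep_fail, ep_kick):
    """Pick the call with the higher expected points."""
    ep_go = p_convert * ep_convert + (1 - p_convert) * ep_fail
    return "go for it" if ep_go > ep_kick else "kick"


# HYPOTHETICAL expected-point values, invented for illustration only.
EP_CONVERT = 4.0   # value of a fresh set of downs in scoring range
EP_FAIL = -0.5     # value after turning the ball over on downs
EP_KICK = 2.6      # value of attempting the field goal

threshold = break_even_probability(EP_CONVERT, EP_FAIL, EP_KICK)
print(f"break-even conversion probability: {threshold:.0%}")

# HYPOTHETICAL conversion chances under two assumed defensive looks.
for look, p_convert in [("7 men in the box", 0.75), ("8 or 9 men in the box", 0.55)]:
    call = best_call(p_convert, EP_CONVERT, EP_FAIL, EP_KICK)
    print(f"{look}: convert {p_convert:.0%} -> {call}")
```

Under these made-up inputs the break-even point is about 69%, so the model says go for it if you assume 7 men in the box and kick if you assume 8 or 9. Same situation, opposite answers, and the only thing that changed was an assumption about the defense.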
Most of these models argue that, over an extended number of repetitions, the average outcome of going for it on 4th down will convert to a winning strategy. In 2012, however, there were only 468 fourth down conversion attempts - 203 successful. That means there are fewer than two 4th down attempts per game. This is a very, very low repetition environment. Consider the possible outcomes of a fourth and short attempt: 1) score a touchdown, 2) get a first down, 3) fail to get a first down and turn the ball over on downs, 4) receive a penalty that necessitates a punt, 5) fumble for a turnover, 6) throw an interception for a turnover, 7) commit a turnover on which the other team scores. Obviously 2 and 3 are the more likely outcomes, but I've seen all of the above. When you have at least 7 possible outcomes from a very low repetition event with lots of dynamic feedback, you cannot generate a model that predicts likely outcomes with great accuracy. Then add in all the elements of field position, time on the clock, score and weather, and it turns out that 4th and shorts are in many instances singularities. The precise combination of time, score, yardage, weather, opponent, and so on happens so rarely - perhaps once or twice a year - that statistical models based on past outcomes do not provide a reliable guide. And hypothetical projections based on game conditions must oversimplify to a disabling degree.
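Here is the arithmetic behind "low repetition", as a rough sketch. The 468 and 203 are the 2012 figures quoted above; the 256 games and 32 teams are my assumption that those figures cover the regular season only, and the margin of error uses the simple normal approximation, which is crude at small sample sizes.

```python
import math

ATTEMPTS_2012 = 468     # fourth down attempts, from the text above
CONVERSIONS_2012 = 203  # successful attempts, from the text above
GAMES = 256             # ASSUMPTION: 32 teams x 16 regular-season games / 2
TEAMS = 32              # ASSUMPTION: figures cover all 32 teams


def margin_95(p, n):
    """Rough 95% margin of error for a proportion (normal approximation)."""
    return 1.96 * math.sqrt(p * (1 - p) / n)


rate = CONVERSIONS_2012 / ATTEMPTS_2012
attempts_per_game = ATTEMPTS_2012 / GAMES
attempts_per_team = ATTEMPTS_2012 / TEAMS

print(f"league conversion rate: {rate:.1%}")
print(f"attempts per game (both teams): {attempts_per_game:.2f}")
print(f"attempts per team per season: {attempts_per_team:.1f}")
print(f"margin of error, whole league: +/- {margin_95(rate, ATTEMPTS_2012):.1%}")
print(f"margin of error, one team: +/- {margin_95(rate, round(attempts_per_team)):.1%}")
```

Across the whole league the conversion rate is pinned down to within about 4 or 5 percentage points. For a single team with roughly 15 attempts in a season, the margin balloons to about 25 points either way, and that is before splitting those attempts by distance, field position, score, or opponent.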
Consider when the Rams successfully faked a punt from their own goal line against the 49ers. I think it was 4th and 6. I have never seen any model that suggests you should go for it on 4th and 6 from your goal line. Indeed, this is a suicidal idea. Unless you have a former QB for a punter and give him a clear option: if the receiver is uncovered, throw it; if not, punt. Because it was so clearly stupid to try a fake punt in this scenario, the 49ers did not cover the receiver and hence stupidity becomes genius. On the other hand, how many punters would you trust to make that throw? I'm not sure how many times this play has been run under similar circumstances in the past 10 years, but I'm guessing it is once. You can't build a model from one repetition.
Or consider a scenario from about 4 years ago involving the Vikings (I don't remember who they were playing, can someone help me out?). Adrian Peterson was visibly upset with the play calling and general offensive performance - understandable as he plays for the Vikings. It was a crucial 4th and 2 and he was begging, absolutely begging, to get the ball. Now whatever the game situation, Peterson is really all the Vikings have on offense. He is the franchise. They gave him the ball, and he came up short. Was it the wrong call? I don't know, but if you alienate Peterson in Minnesota what the hell do you have left? There is no way to model this scenario because game considerations in this case are only part of the equation.
The best evidence against the modeling idea is that Chip Kelly didn't behave all that differently and the rest of the NFL does not seem to be falling over themselves to change their approach. If Kelly wins a couple of Super Bowls making seemingly crazy decisions then you can bet things will change quickly - think of the spread of the West Coast Offense. I wouldn't, however, hold my breath.
In essence, 4th and shorts are rare, attempts to go for them are very rare, and the number of variables surrounding each attempt is large. In such instances, computer modeling breaks down.
In the next, more mathematical installment, I'll try to give an idea of the kind of complexity the NFL generates. The short version is: pretty damn complex.