Defense Wins Championships: A Lesson in Data Analytics from the Wide World of Sports

Super Bowl XLVIII (48 for the Non-Roman) was so highly anticipated that some ESPN analysts predicted it would be one of the best matchups ever, not only in Super Bowl history, but in the history of all major sports championships. This year’s Super Bowl pitted the NFL’s best offense in history (the Denver Broncos) in terms of points scored (37.9/game) and total yardage (457.3/game), among many other metrics, against the NFL’s best defense this year (the Seattle Seahawks) in terms of points allowed (14.4/game) and total yardage allowed (273.6/game), among many other metrics.

This was the first Super Bowl in NFL history in which the league’s best offense faced the league’s best defense with both teams ranked #1 in as many categories as they were. This game was supposed to be an epic chess match between two highly capable head coaches and their incredibly gifted athletes. What it turned into, as everybody knows now, was one of the most phenomenal defensive displays the Super Bowl has ever seen (I only shy away from saying the “best ever” because I am a die-hard Bears fan), and also one of the most lopsided victories in Super Bowl history, with the Seahawks pummeling the Broncos 43-8.

Who could have possibly predicted that outcome? According to most major online sports news sources, including ESPN commentators (most of whom are former NFL players and coaches), Sports Illustrated, USA Today Sports, and SB Nation, the Broncos were the clear favorite, with an average pick ratio of 3:1 for the Broncos to prevail over the Seahawks. A poll run by SB Nation even revealed that online viewers had picked the Broncos to win by a decent margin (60% of the 1,159 polled picked the Broncos, while 40% picked the Seahawks), meaning that 50% more respondents picked the Broncos than picked the Seahawks, a gap of 20 percentage points.
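The difference between a relative gap and a percentage-point gap is easy to blur, so here is a minimal sketch of the arithmetic using only the poll figures quoted above. The variable names are my own, and the vote counts are approximations, since the poll shares are rounded.

```python
# Poll figures quoted above: 1,159 respondents, 60% Broncos, 40% Seahawks.
respondents = 1159
broncos_share = 0.60
seahawks_share = 0.40

# Absolute gap, in percentage points of all respondents.
gap_points = (broncos_share - seahawks_share) * 100

# Relative gap: how many more picks the Broncos got compared to the Seahawks.
relative_gap = (broncos_share / seahawks_share - 1) * 100

# Approximate head counts implied by the rounded shares.
broncos_votes = round(respondents * broncos_share)
seahawks_votes = round(respondents * seahawks_share)

print(f"Gap: {gap_points:.0f} percentage points")
print(f"Relative: {relative_gap:.0f}% more picks for the Broncos")
print(f"Approximate votes: {broncos_votes} vs {seahawks_votes}")
```

Both numbers describe the same poll; which one you quote depends on whether you are comparing the two groups to each other or to the whole sample.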

The Broncos were also the preseason co-favorites to win the Super Bowl, with one of the greatest quarterbacks of all time, Peyton Manning, at the helm. Manning was coming off a regular season in which he set multiple records (most touchdowns with 55, most passing yards with 5,477, most 4-touchdown games with 9, most 2-touchdown games with 15, most 400-yard passing games with 4, and most 90+ passer rating games with 23). Heading into the Super Bowl, the Broncos were favored (according to Las Vegas odds) to triumph over the Seahawks by a margin of almost a field goal (2.5 points). Also, according to Vegas, the over/under for total points in the game was set at 48.5, meaning a relatively high-scoring game was predicted, which favored the high-powered offense of the Broncos.

What could possibly have happened to swing the game so far that the Broncos not only lost, but lost by such a large margin? Was it predictable at all? Whether you look at the historical metrics of both teams leading up to the Super Bowl, analyze the predictions from the major sports news outlets, or wade through the mountain of statistics that exist about Super Bowl 48, there are very valuable lessons to learn from analyzing the data associated with such a momentous event.

The first lesson to learn is to avoid Paralysis by Analysis. We live in a time where so much data is available, especially in the professional sports communities, that it is far too easy to get overwhelmed by almost infinite statistics about variables that may not influence the outcome of an event as much as others do. You can easily go to http://www.nfl.com/stats/, http://mlb.mlb.com/stats/, http://www.nhl.com/ice/statshome, or http://stats.nba.com/ to see complete stats of the major U.S. professional sports leagues, both by team and by player. The NBA stats page even has a brand new dashboard to graphically show the highlights of the best stats of the day per player.

What is important in avoiding Paralysis by Analysis is to figure out the best metrics to use to achieve the goal(s) of your given strategy, and focus on them. A great example of this is shown in the movie Moneyball, where the Oakland Athletics General Manager Billy Beane, frustrated by a playoff loss and faced with a budget too small to recruit top-level talent, employs a sabermetric approach that allows him to analyze a baseball player’s value by using on-base % and slugging % as key metrics, instead of utilizing the traditional metrics of batting average, home runs, and runs batted in. Implementing this approach, Billy Beane is able to recruit and manage his team on an approximate budget of $40 million (third lowest in the league at the time), yet find incredible success, including setting the American League record for consecutive wins at 20 games in a row in 2002.
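For readers who have not worked with these two metrics, here is a minimal sketch of how on-base % and slugging % are computed from a season stat line, using the standard definitions. The function names and the player’s stat line are made up for illustration; they are not taken from the movie or from any real player.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = at_bats + walks + hit_by_pitch + sacrifice_flies
    return (hits + walks + hit_by_pitch) / denominator if denominator else 0.0


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """SLG = total bases / at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats if at_bats else 0.0


# Hypothetical stat line, invented for this example.
singles, doubles, triples, home_runs = 90, 30, 2, 18
hits = singles + doubles + triples + home_runs
at_bats, walks, hit_by_pitch, sacrifice_flies = 500, 80, 5, 5

batting_average = hits / at_bats
obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies)
slg = slugging_percentage(singles, doubles, triples, home_runs, at_bats)

print(f"AVG {batting_average:.3f}  OBP {obp:.3f}  SLG {slg:.3f}")
```

Notice that the walks in this made-up line lift on-base % well above batting average. That is the kind of value the traditional metrics tend to hide.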

The second lesson to learn is to realize that correlation does not imply causation. This means that even when the change in variable “x” tracks the vast majority of the change in variable “y,” that alone does not prove that “x” is what drives “y,” and it will rarely explain 100% of that change. For Super Bowl 48, if we only focused on how much better the Denver Broncos’ offense was than the rest of the NFL (the Broncos averaged over 10 more points per game than the next highest-scoring team), and used that metric to determine the outcome of the Super Bowl (implying that whichever NFL team is the best offensively will win the Super Bowl), we would be sorely mistaken. Instead of focusing on just one variable and how it affects a desired outcome (which is inherently risky), it is prudent to focus on the group of variables that are most closely correlated with that outcome, and analyze those together.

The third lesson to learn is to constantly track the current trends of the KPIs (Key Performance Indicators) that are closest to your business. Using Super Bowl 48 as an example again, it is very true that the regular season was a clear domination by the Broncos on the offensive front (averaging an NFL record 37.9 points/game); however, when you look at the same offensive metric for the post-season games leading up to the Super Bowl, it dropped from 37.9 points/game to only 25.0 points/game, an incredible 12.9 point (34%) drop.

Tracking this trend is significant, because it shows that the Broncos’ offense may have been becoming less explosive when facing better, playoff-caliber teams. More importantly, this trend also shows that the positive gap between the Broncos offense and the Seahawks offense was shrinking, possibly leveling the playing field for the Seahawks. Also potentially leveling this playing field was the continued dominance of the Seahawks defense (it only slipped from a league-leading 14.4 points allowed per game in the regular season to 16.0 points allowed per game in the post-season, while interceptions and sacks per game actually increased). Being aware of these trends would have been imperative for anyone trying to predict the winner of the Super Bowl.
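The trend figures above are simple enough to check by hand, but a small sketch shows how such a check could be automated for any KPI. It uses only the four per-game averages quoted in this article; the `percent_change` helper and the dictionary layout are my own illustrative choices.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


# Per-game averages quoted in the article.
broncos_points_scored = {"regular": 37.9, "post": 25.0}
seahawks_points_allowed = {"regular": 14.4, "post": 16.0}

broncos_change = percent_change(
    broncos_points_scored["regular"], broncos_points_scored["post"]
)
seahawks_change = percent_change(
    seahawks_points_allowed["regular"], seahawks_points_allowed["post"]
)

# Gap between what Denver scored and what Seattle allowed, in each phase.
gap_regular = broncos_points_scored["regular"] - seahawks_points_allowed["regular"]
gap_post = broncos_points_scored["post"] - seahawks_points_allowed["post"]

print(f"Broncos points scored: {broncos_change:+.1f}%")
print(f"Seahawks points allowed: {seahawks_change:+.1f}%")
print(f"Scoring gap: {gap_regular:.1f} (regular season) vs {gap_post:.1f} (post-season)")
```

Seen this way, Denver’s scoring fell by roughly a third while Seattle’s points allowed rose only modestly, and the gap between the two figures narrowed considerably heading into the game.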

Although there are many other lessons you could draw from analyzing professional sports data, including that from Super Bowl 48, the last one I am going to focus on is the ability to be flexible in your analysis: the ability to shift gears and bring in more data to provide a larger perspective on a situation. Leading up to Super Bowl 48, the vast majority of the positive media attention was on Peyton Manning and the incredible Denver Broncos offense, where Manning would have a chance to not only cap his amazing season with his second Super Bowl ring, but also, in some ways, solidify himself as one of the greatest NFL players of all time. On the opposite end of this media coverage was the outspoken Richard Sherman, leader of the Legion of Boom, who was still defending his smack talk about Michael Crabtree after the NFC Championship.

The situations the Denver Broncos and Seattle Seahawks were in were polar opposites. The immense amount of social media data that became available during the Super Bowl’s press week illustrated that America had a preference for the Broncos to win (derived from the massive number of Facebook and Twitter statuses regarding the Super Bowl); however, that same data also illustrated that the vast majority of America thought the Seahawks “were very worthy opponents” and “incredible contenders”. Adding the social media data when analyzing Super Bowl 48’s potential outcome would have been an incredible asset, because it would have provided a view of the general public’s current opinion on a highly debated subject.

Overall, as seen in professional sports, there is no single, ultimate way to predict an event or achieve a desired outcome. There are only principles and lessons you learn along the way, which make you better at analyzing and presenting the insights from the data you have access to. Your analysis keeps improving along the way, and eventually you are able to put yourself and those around you in a better position to make more informed decisions that will affect your business in a positive way.

For the record, I thought the Broncos were going to win. In fact, part of me even kind of wanted them to win. When the Seahawks won in such a dominant fashion, I was just as stunned as many of you. Even the best data analysis and interpretation of trends can’t save you from potential predictive error.

Hey, at least the Seahawks didn’t have a higher point margin in their victory than the ’85 Bears did in their victory. I can always hold on to that.

Cheers.
