Predicting the Outcome of NFL Games — Success!

Ray Manning
May 25, 2018


Part 1. Background

After less-than-satisfactory results from my previous efforts, described in three earlier articles, I stepped back and tried to work out what was going wrong.

My neural network methodology seemed fairly sound, since I had been able to use neural networks to “classify” other problems and to determine other outcomes from input attributes.

I wondered whether some of my data was corrupt. Recall that I had been using team statistics from 2015 National Football League (NFL) games to train neural networks to find patterns in the data and predict the outcome of games that were not used in training. The results had yielded about a 25% success rate, where success means that an out-of-training game was predicted to within 2 points of the actual outcome. That is not good enough to bet on in Las Vegas or to make team decisions about game plans or personnel moves.

Part 2. Restart

I woke up in the middle of the night this past Monday and could not sleep. I stayed in bed for 45 minutes before getting up and re-attacking this problem.

I started by scraping the statistics for every regular season football game played in the NFL from the years 2003 to 2017. (Caveat: The site that I scraped from was missing week six data from 2003. So until that is fixed, I will not be using 2003 season data.)

For each season between 2003 and 2017, I scraped the team statistics and the game outcomes. I then ran the basic statistics through a neural network, which exposed some parsing errors that I went back and fixed (though none of those parsing errors had been in the 2015 season that I was previously using). I went back to sleep at 5am and got up a bit after 7am to go cycling and energize the metabolism for health and intellectual stimulus. I sometimes find my best ideas as I pedal away on either the road bicycle or the mountain bicycle.

After the bicycle ride I reviewed what I had done in the middle of the night and decided to run the neural networks on the newly scraped and cleaned data.

To my surprise (and expectation), I was able to start generating good results with the neural networks. I ran through each season using 35% of the games as training data and trying to predict the outcomes of the remaining 65% of games.
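As a rough illustration of the 35%/65% division, here is a minimal sketch of splitting one season's games into a training set and an out-of-training set. The article does not say how the games were chosen, so the random shuffle, the fixed seed, and the function name `split_season` are all assumptions made for this example.

```python
import random

def split_season(games, train_fraction=0.35, seed=0):
    """Shuffle one season's games and split them into training and held-out sets.

    Assumption: games are picked at random; the original selection rule is not stated.
    """
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(games)         # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# A 2004-2017 regular season has 256 games (32 teams x 16 games / 2).
# Integer game IDs stand in for the real game records here.
season = list(range(256))
train, held_out = split_season(season)
print(len(train), len(held_out))   # 90 166
```

With this split, 90 games train the network and the remaining 166 are never seen during training, which is what makes the out-of-training accuracy meaningful.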

Figure 1 shows the results of the in-training and out-of-training data. The red bars show the fraction of total season games that you can predict within 2 points of the actual outcome. These red bars include both in-training and out-of-training data. More importantly, the blue bars show the fraction of out-of-training games that the trained neural network can predict within 2 points of the actual outcome. The chart indicates that after training, one is able to predict the outcome of about 75% of the games from the out-of-training set to within 2 points.

Figure 1. Neural Network Results, Years 2004 through 2017
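The quantity plotted in Figure 1 can be sketched as a simple hit rate. This is only an illustration under assumptions: it treats the "outcome" as a point margin (for example, home score minus away score), and the function name `fraction_within` and the sample numbers are invented for the example rather than taken from the actual results.

```python
def fraction_within(predicted, actual, tolerance=2.0):
    """Return the fraction of games whose prediction is within `tolerance` points."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("need at least one game")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)

# Made-up margins for four held-out games (not real NFL data).
predicted_margin = [3.0, -7.5, 10.0, 1.0]
actual_margin = [4.0, -3.0, 9.0, -6.0]
print(fraction_within(predicted_margin, actual_margin))  # 0.5
```

Applying the function to the out-of-training games alone would give a blue bar, and applying it to all of a season's games would give a red bar.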

Now you’re starting to get to the level where you would be willing to place large bets on games in Las Vegas or make changes to upcoming game plans or personnel.

One should note that there are always going to be “outlier” games where a team has the wrong game plan, a team comes out flat and underperforms, and/or critical personnel are absent or not at full strength. Any of these can throw the predictions off.

But following the re-scraping and cleaning of data, the neural network prediction algorithm is starting to behave as originally expected and desired.

Part 3. Next Steps

Now that we have good data that we believe is clean and free of significant errors, our next steps will be:

1) Compare a pure linear regression analysis with the Principal Component Analysis (PCA) effort that we alluded to in previous articles.

2) Utilize the PCA basis functions in the neural network to see if it provides superior results to the current pure statistics method of analysis.

3) Continue with the determination of confidence levels in predictions.

4) Continue studying how sensitive the predictions are to slight-to-moderate perturbations or changes in a team’s expected performance when determining their win chances.

5) Possibly go back and determine the root cause of the original 2015 analysis producing such miserable results.
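To give a feel for step 4, here is a minimal sketch of one way a sensitivity check could look: bump each input statistic by a fixed percentage and record how far the predicted margin moves. Everything here is hypothetical. `toy_predictor` is a stand-in for a trained network, its coefficients are made up, and the statistic names are chosen only for illustration.

```python
def sensitivity(predict, stats, relative_change=0.10):
    """Return the change in prediction when each statistic is increased in turn."""
    baseline = predict(stats)
    shifts = {}
    for name, value in stats.items():
        bumped = dict(stats)                       # perturb one statistic at a time
        bumped[name] = value * (1.0 + relative_change)
        shifts[name] = predict(bumped) - baseline
    return shifts

def toy_predictor(stats):
    # Hypothetical stand-in for a trained network; the coefficients are invented.
    return (0.02 * stats["rushing_yards"]
            + 0.01 * stats["passing_yards"]
            - 3.0 * stats["turnovers"])

expected_stats = {"rushing_yards": 100.0, "passing_yards": 250.0, "turnovers": 2.0}
shifts = sensitivity(toy_predictor, expected_stats)
for name, shift in shifts.items():
    print(f"{name}: {shift:+.2f} points")
```

Statistics with the largest shifts are the ones where a team's departure from its expected performance would move the prediction the most.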

What could be more fun than combining sports outcomes and algorithmic geek work?