Winning Powerball with Big Data

If you are looking for the magic formula to win the lottery, you have come to the wrong place. However, today we will have some fun with Powerball’s previous winning numbers so we can better decide how to make our picks for the upcoming drawing.

Before we get started, let’s take a step back and get a basic overview of how Powerball works. Drawings are held Wednesday and Saturday evenings at 10:59 p.m. Eastern Time. The game uses a 5/69 (white balls) + 1/26 (Powerballs) matrix from which winning numbers are chosen. Each play costs $2, or $3 with the Power Play option.

I also want to note that there was a format change on October 4, 2015, which we will want to keep in mind when analyzing historical Powerball winning numbers. The white-ball pool increased from 59 to 69 while the Powerball pool decreased from 35 to 26. The new setup was designed to help create more winners and also more rollovers, which is why we are now looking at a record estimated $1.3 billion prize.
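To put numbers on that format change, here is a minimal sketch that counts the possible tickets under each matrix. The function name `jackpot_odds` is my own for illustration; the pool sizes come straight from the paragraphs above. Under the old 5/59 + 1/35 matrix the jackpot odds were about 1 in 175 million, and under the current 5/69 + 1/26 matrix they are about 1 in 292 million, which is where the extra rollovers come from.

```python
from math import comb


def jackpot_odds(white_pool: int, white_picks: int, powerball_pool: int) -> int:
    """Count the equally likely tickets: choose the white balls, then one Powerball."""
    return comb(white_pool, white_picks) * powerball_pool


old_format = jackpot_odds(59, 5, 35)  # matrix before October 4, 2015
new_format = jackpot_odds(69, 5, 26)  # current 5/69 + 1/26 matrix

print(f"Old format: 1 in {old_format:,}")
print(f"New format: 1 in {new_format:,}")
```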

The more I researched different ways to predict Powerball numbers using big data, the more it came down to two important facts that will save you some time and heartache:

  1. The Powerball numbers are drawn randomly
  2. As humans we pick numbers non-randomly

About 70% to 80% of Powerball purchases are computer picks, and about 70% to 80% of winners are from computer picks. [source] So if I were going to play, I would buy mostly computer-generated random tickets, but possibly play one based on historical data. I built this Excel spreadsheet [download] with the data from powerball.com on all historical winning numbers back to 1997. So download and enjoy (if you win, I would expect a small donation 😉).

In the Excel spreadsheet you will notice I have set up three tabs. The first tab is a random number generator for each of the white balls and the Powerball, in the event you don’t trust the Powerball computer picks.

[Screenshot: generate random Powerball numbers]
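If you would rather not open Excel, the same idea can be sketched in a few lines of Python. This is not how the spreadsheet tab works internally, just an equivalent under the current 5/69 + 1/26 matrix; `quick_pick` is a name I made up. Note that Python's `random` module is fine for fun but is not a cryptographic generator.

```python
import random


def quick_pick(rng=None):
    """Return five distinct white balls (1-69) and one Powerball (1-26)."""
    if rng is None:
        rng = random.Random()
    # sample() never repeats a value, matching balls drawn without replacement
    white = sorted(rng.sample(range(1, 70), 5))
    powerball = rng.randint(1, 26)  # randint includes both endpoints
    return white, powerball


white, powerball = quick_pick()
print("White balls:", white, "Powerball:", powerball)
```

Passing your own `random.Random(seed)` makes a pick repeatable, which is handy for testing but obviously not for playing.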

The second tab is a table of all of the past winning numbers in the order they were picked. This would be a good data source if you wanted to utilize some data processing to calculate odds and trends.

The third tab shows how often each winning number has been drawn, broken down by white balls and Powerballs. Here are some screenshots:

[Screenshots: powerball-frequency-2, powerball-frequency]
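For anyone who wants to reproduce the frequency tab outside of Excel, here is a rough sketch of the counting. The three rows in `draws` are made-up sample data, not real drawing results, and the tuple layout and the `frequencies` helper are my own assumptions; in practice you would load the rows from the second tab. The optional `since` filter lets you count only drawings on or after the October 4, 2015 format change, so numbers that did not exist in the old pools are compared fairly.

```python
from collections import Counter
from datetime import date

FORMAT_CHANGE = date(2015, 10, 4)

# Made-up sample rows for illustration only, NOT real drawing results.
# Each row: (drawing date, five white balls, Powerball)
draws = [
    (date(2015, 9, 30), [1, 12, 23, 34, 45], 7),
    (date(2015, 10, 7), [3, 12, 27, 40, 69], 7),
    (date(2015, 10, 10), [12, 18, 27, 55, 61], 19),
]


def frequencies(draws, since=None):
    """Count how often each white ball and each Powerball appears."""
    white_counts, powerball_counts = Counter(), Counter()
    for drawn_on, white, powerball in draws:
        if since is not None and drawn_on < since:
            continue  # skip drawings from before the cutoff date
        white_counts.update(white)
        powerball_counts[powerball] += 1
    return white_counts, powerball_counts


white_counts, powerball_counts = frequencies(draws, since=FORMAT_CHANGE)
print("Most common white balls:", white_counts.most_common(3))
print("Most common Powerballs:", powerball_counts.most_common(3))
```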

As I started crunching the numbers, I realized that the odds against any particular combination are already so long that if I actually want to play Powerball, my best bet is to just let the computer pick the numbers. The moral of this unfulfilling blog entry is that we pick numbers non-randomly and the Powerball is picked at random. So if you want to win, make sure you do it randomly 😉

Let me know your thoughts on the Excel data and how else we could make the numbers more useful in the comments section below.

Category: Big Data, Uncategorized


  • Let me know what you think of the Powerball Excel file, and how we can improve it

    • Jeremy

      Your excel file rocks Rob.
      I added the Powerball into your total white ball count (as just another ball minus its difference in probability with the white balls). Your frequency winner then changed ever so slightly. Still irrelevant. Fun poking around in the spreadsheet you created tho. Next time we should plot the winner geo data and see if the lat/long’s have any correlation. 😉 good stuff.

Article by: Rob Steele

Rob Steele is a Principal Systems Engineer with Roundtower Technologies specializing in enterprise data center architecture. For over 15 years he has architected, implemented, and supported data center solutions for Fortune 50-5000 customers. Rob holds multiple vendor certifications and is an active voice in the IT community. Current areas of focus include: Software Defined Data Center, Cloud Automation, Flash Storage, Big Data, Scale Out NAS, Third Platform, IoT - Internet of Things. Follow Rob on Twitter: @RobSteele #EMCElect #CiscoChampion #vExpert