Fun with probability

This graph shows the probability that there is...

Photo credit: Wikipedia

At Union Street Media, we have a support ticketing system that uses ticket IDs of the form ABC-123456. That gives a total range of roughly 17,576,000,000 possible ticket values: 26 × 26 × 26 letter combinations times 1,000,000 numbers (I’m sure they removed some dirty words from the three-letter combos). Today, we had a ticket ID collision (where a new ticket has the same ID as an old one) and I thought, “Wow, what are the odds of that happening?” Well, it struck a chord with me, so I decided I’d figure out exactly what the odds were!

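Essentially, this is the birthday problem on a much bigger scale. We have about 26,000 tickets in our system, which makes 26000 × (26000 − 1)/2 distinct pairs of tickets. Any one pair fails to match with probability (17576000000 − 1)/17576000000, so, treating the pairs as independent (a very good approximation here), the chance of at least one collision is one minus the chance that no pair matches. Here’s the math:

1 - \left ( \frac{17576000000 - 1}{17576000000} \right )^{26000\times (26000-1)/2} \approx 0.019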

The result? At 26,000 tickets, there’s about a 2% chance of a collision. Those odds may be low, but they are considerably higher than I was expecting!
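If you want to check that number yourself, here is a minimal Python sketch. The function names are my own, and it assumes IDs are drawn uniformly at random from the full ID space, which may not be exactly what the ticketing software does. It computes the collision probability two ways: with the pairwise approximation (one minus the chance that no pair matches) and with the exact birthday-problem product.

```python
import math

# Size of the ID space from the post: three letters followed by six digits.
# Assumption: every combination is allowed and equally likely.
ID_SPACE = 26**3 * 10**6  # 17,576,000,000


def collision_probability(tickets: int, id_space: int = ID_SPACE) -> float:
    """Pairwise approximation: 1 - ((N - 1) / N) ** (n * (n - 1) / 2)."""
    pairs = tickets * (tickets - 1) // 2
    # log1p/expm1 keep precision when 1/N is tiny.
    return -math.expm1(pairs * math.log1p(-1.0 / id_space))


def collision_probability_exact(tickets: int, id_space: int = ID_SPACE) -> float:
    """Exact birthday problem: 1 - prod_{i=0}^{n-1} (1 - i / N)."""
    log_no_collision = sum(math.log1p(-i / id_space) for i in range(tickets))
    return -math.expm1(log_no_collision)


if __name__ == "__main__":
    print(f"approximate: {collision_probability(26_000):.4f}")
    print(f"exact:       {collision_probability_exact(26_000):.4f}")
```

Both versions come out at roughly 0.019 for 26,000 tickets; with an ID space this large, the approximation and the exact product are practically indistinguishable.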

The take-away here is that as a software developer, you need to know that you can’t depend on the sheer size of a random ID space to prevent collisions. Kayako should have been checking to see if a ticket ID exists in the database before assigning it to a new ticket*. I’m sure there are people using their software who have many more tickets in their system than we do. At 157,000 tickets, there’s a greater chance of having a collision than not having a collision.
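To see where the break-even point sits, the same pairwise formula can be solved for the number of tickets. This is only a sketch under the same assumption of uniformly random IDs, and the function name is made up for illustration.

```python
import math

# Assumption: three letters plus six digits, all combinations equally likely.
ID_SPACE = 26**3 * 10**6  # 17,576,000,000


def tickets_for_collision_probability(target: float, id_space: int = ID_SPACE) -> int:
    """Approximate ticket count at which the collision chance reaches `target`.

    Inverts 1 - ((N - 1) / N) ** (n * (n - 1) / 2) = target for n.
    """
    # Number of ticket pairs needed to reach the target probability.
    pairs_needed = math.log1p(-target) / math.log1p(-1.0 / id_space)
    # Solve n * (n - 1) / 2 = pairs_needed for n and round up.
    return math.ceil((1 + math.sqrt(1 + 8 * pairs_needed)) / 2)


if __name__ == "__main__":
    print(tickets_for_collision_probability(0.5))
```

The 50% mark lands at roughly 156,000 tickets, so by 157,000 a collision is indeed more likely than not.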

*To be fair, we’re using an old version of Kayako’s software. I would expect that they’ve since fixed this bug.
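For what it’s worth, the check I have in mind is not complicated. Here is a hedged sketch of the idea, not Kayako’s actual code: the function names are hypothetical, and a plain in-memory set stands in for the ticket database. Generate a candidate ID, and only assign it if nothing else already uses it.

```python
import random
import string


def new_ticket_id(rng=random) -> str:
    """Random ID of the form ABC-123456 (hypothetical generator)."""
    letters = "".join(rng.choice(string.ascii_uppercase) for _ in range(3))
    digits = f"{rng.randrange(10**6):06d}"
    return f"{letters}-{digits}"


def assign_ticket_id(existing_ids: set, rng=random, max_attempts: int = 10) -> str:
    """Return an unused ID, retrying whenever the candidate already exists.

    `existing_ids` is a stand-in for a lookup against the ticket table.
    """
    for _ in range(max_attempts):
        candidate = new_ticket_id(rng)
        if candidate not in existing_ids:
            existing_ids.add(candidate)
            return candidate
    raise RuntimeError("could not find an unused ticket ID")


if __name__ == "__main__":
    ids = set()
    print(assign_ticket_id(ids))
```

In a real system with concurrent requests, a lookup followed by an insert can still race, so a unique constraint on the ID column (with a retry when the insert fails) is the sturdier version of the same idea.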

Update: Here’s a great article about the birthday paradox in the NY Times