Anatomy of a Bug
Nov. 1st, 2014 04:42 pm
This morning I got up bright and early to get my Calgary Comic and Entertainment Expo tickets. I'd have slept through it had I not fortuitously spent some time with friends who work at CCE the night before - they warned me that the sale was this morning.
So about an hour before the sale I got all my ducks in a row. I had the CCE Eventbrite page up, and I was logged into my account with them. I had my credit card handy - all I had to do was wait for the appropriate time and refresh the page.
Of course, several hundred other people were doing precisely the same thing.
10 AM rolls around and I'm refreshing the page as fast as the browser will let me. One second after, it lets me order two VIP tickets. Yay me.
I now have two tickets reserved for the next 8 minutes while I fill out the form. The bottom of the form includes a drop-down box for the size of my tee-shirt. Apparently VIPs get tee-shirts. I select XL because I'm Shrek, and click OK to finish the transaction. The form complains because I've done something wrong. Not the best error message - basically, I've selected something for which there are no more, but it doesn't tell me which field it was. Well, the only things that could potentially "run out" are the shirts, so I go to make another selection. Now all of the shirt sizes are disabled; there are no possible selections to be made. And it's a required field - it will not go on without me making an impossible selection.
So now I'm in a panic - I don't want to refresh the page because I might lose the reservation. I go to the web page to complain only to discover that this is affecting lots of people.
Logically, a bunch of people were going to time out at precisely 10:08, so I started refreshing the first page right before then. It took about half a minute, but I got a new reservation. This time, the tee-shirts were all enabled. I proceeded and got my two tickets. A little panic, but everything turned out OK.
So what happened? Without looking at the code I can only guess, but this is what I do professionally.
The tee-shirts probably had some value associated with them, a "number of shirts available", and that number was low. As the first batch of people checked out, the shirts ran out. The next bunch of people grabbed shirts that weren't their size on the theory that they could simply correct it later. This cascaded until all the shirts were gone, leaving a bunch of people in the lurch.
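To make that guess concrete, here is a minimal Python sketch of the cascade. It is not the ticketing site's code, which I haven't seen; the class, the function, and the stock numbers are all made up for illustration. It assumes each size has a counter, a size is disabled once its counter hits zero, and a buyer whose size is gone grabs whatever is still enabled.

```python
from dataclasses import dataclass, field


@dataclass
class ShirtDropdown:
    """Hypothetical model of a required drop-down backed by stock counters."""
    stock: dict = field(default_factory=dict)

    def enabled_sizes(self):
        # A size is selectable only while its counter is above zero.
        return [size for size, left in self.stock.items() if left > 0]

    def choose(self, size):
        # Reject the selection when the counter for that size has hit zero.
        if self.stock.get(size, 0) <= 0:
            raise ValueError("selection is sold out")
        self.stock[size] -= 1


def simulate(buyers, stock):
    """Each buyer takes their preferred size, or any size still enabled.

    Returns the number of buyers left unable to fill in the required field.
    """
    dropdown = ShirtDropdown(dict(stock))
    stuck = 0
    for preferred in buyers:
        options = dropdown.enabled_sizes()
        if not options:
            stuck += 1  # required field, nothing selectable
        elif preferred in options:
            dropdown.choose(preferred)
        else:
            dropdown.choose(options[0])  # grab a wrong size, fix it later
    return stuck


# Made-up numbers: 2 shirts per size, 12 buyers who all want XL.
stuck_buyers = simulate(["XL"] * 12, {"S": 2, "M": 2, "L": 2, "XL": 2})
print(stuck_buyers)
```

With those made-up numbers, the first two buyers get XL, the next six drain the other sizes, and the last four are stuck on a required field with nothing to select.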
The hack fix would be to go in and crank all those tee-shirt values past 550 (the number of VIP tickets available). 550 smalls, 550 mediums, etc. That way none of them could run out.
A better fix would be to not have a value associated with it at all - after all, these don't represent real shirts sitting in a warehouse somewhere that can run out. This is just to let the organizers know how many shirts of each size to make when the time comes. This is the sort of thing that basic debugging has a hard time finding. Sometimes when coding, you fail to see the big picture (e.g. don't use a counter when a string will do). As a professional computer programmer, I'm pretty forgiving of this sort of thing - it's a bug I could easily see myself writing. Hell, QA specialists might have a hard time finding that one. A good code editor might make the connection, but that's not a specialty that many companies hire.
CCE or Eventbrite did manage to identify the problem and fix it in less than eight minutes. That impresses me. Good work under trying circumstances there. I've long maintained that it's impossible to be perfect, and the real test of an organization is how they deal with imperfection. CCE came through pretty good in my books.
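As a rough sketch of that "better fix", assuming the form only needs to record what people asked for: store the chosen size as a plain value and tally the choices afterwards. The size list and function name here are hypothetical, not anything from the real system.

```python
from collections import Counter

SIZES = ("S", "M", "L", "XL")  # hypothetical size list


def record_shirt_sizes(choices):
    """Tally requested sizes; there is no stock counter, so nothing can run out."""
    tally = Counter()
    for size in choices:
        if size not in SIZES:
            raise ValueError(f"unknown size: {size}")
        tally[size] += 1
    return tally


# The organizers read the totals later to decide how many shirts to make.
orders = record_shirt_sizes(["XL", "XL", "M", "S", "XL"])
print(dict(orders))
```

The only validation left is "is this a real size", so the field can never end up with every option disabled.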
Of course, they're going to get a new one ripped out of them on social media, because that's what fans do. It does suck for the people who had the promise of a ticket yanked away from them. Not everyone is going to figure out that more tickets would be available after 8 minutes.