Fog Creek Software
Discussion Board




Best Disasters

OK guys, the previous thread makes me think that our disasters could be very entertaining.  What's the worst/funniest thing that has happened to you in your computer career...

Actively Disengaged
Tuesday, August 31, 2004

Classic 'disasters':

1.  The Denver airport baggage handling facility.  They estimated two years, they were 'given' one year.  They finished in two years.

2.  The IRS 'Modernization' effort.

3.  The FAA 'Modernization' effort.  They tried to make a 'paperless' process for tracking planes.  The controllers needed their flight strips.

I've been involved in an interesting situation or two, but it would do my career no good at all to mention them.  Once I retire, you can read my memoirs, I suppose.

AllanL5
Tuesday, August 31, 2004

Here is a (not so) funny one that is hardware related.  There was a very important legacy server running around the time of the Y2K horror fest.  The company was contacted by the motherboard vendor to indicate that the bios was not Y2K compatible about 2 weeks before the end of the year.

New bios was ordered and shipped priority mail as the Y2K flip over loomed.

With much trepidation I shut down the box hoping more than 2 disks in the raid array would not bind after being active for so many years (disks tend to accumulate materials and work fine when continuously spinning but when power cycled they like to die).

The bios chip was really bound in the socket and, lacking a chip puller, my fat fingers were used.  Needless to say the chip finally released and impaled into my finger.  Now bleeding, I also noticed that a number of the pins were totally separated from the chip!

Muttering under my now labored breath, I prayed that the new chip would work because there was no going back. 

I inserted the new chip without incident and booted the box.... it did not post.  Now the blood pressure started to rise. I got on the phone to the vendor, and they had no idea what the problem could be.  In an act of desperation the old bios was loaded onto a floppy and the new chip was flashed back to the old version (the original chip having since been destroyed, with some of its pins still embedded in my bleeding finger).  Of course this flash operation is a one time affair and if anything went wrong the "new" bios would also be destroyed.  The flash completed and the system posted and booted without error.  A huge sigh of relief, but now we were back to square one with a system that was not Y2K compliant.

This is enough to convince management that a new box is required and one is rush ordered.  The system migrated to new hardware without incident.  Y2K comes around and IT (no joy on new years) is huddled around waiting for the sky to fall and the box, now with no production load, WORKS PERFECTLY FINE.  Even the date flipped over correctly.

Ahh…. the joys of 1999-2000!

Y2K was so much fun!
Tuesday, August 31, 2004

Today's episode of Eric Sink Sw Factory with SP1. Sorry Eric :)


Tuesday, August 31, 2004

Once, I ran a SQL script on the wrong server. I realized this just as the script finished. Needless to say, it dropped all the tables on the production server.
Needless to say, there was no recent backup.
Needless to say, we could not recover the logs.
This was not truly a disaster: fortunately, there wasn't much new data (and what there was wasn't vital). It seems that nobody really noticed. You might call that a free lesson.

Pakter
Tuesday, August 31, 2004

@pakter: Free lesson?!? I was the poor chap whose data you sacked! I was fired a short time after that because your screwing around threw me off my deadline!!!

anon-y-mous cow-ard
Tuesday, August 31, 2004

Sorry if that recalls sad memories, but apart from the white colour on my face, there was no such bad side-effect in my case ;-)

Pakter
Tuesday, August 31, 2004

This was a combined hardware/software disaster.

A rendering device was being created. I looked at the spec for the underlying chips and realized (even though I'm "only" a programmer) that the device would be at least 10 times too slow. That is, it wouldn't just run slow, it would not run at all.

I went to the engineering director and told him this. He said to ask about it at the next joint meeting between our department and the people in charge of the device.

I did. The designer of the boards just shrugged and agreed, like there was no problem with this being true.

For unknown reasons work continued for a few months. Then, mercifully, I was transferred to a better project and I got to watch the old one shrivel up and die.

I was new at programming then and was amazed such waste could be tolerated.

And that's besides the "opportunity cost" of being an extra year late to market.

anonymous for this one.
Tuesday, August 31, 2004

I was testing the charging code for a handheld device, going through all the permutations: device is on when the charger is plugged in, device is off when the charger is plugged in, device is off and powered on via the power key, etc.  Thought I'd nailed it, then found a new problem: the device always powered off when I removed the charger.  Always.  Even in the middle of running a program.

Crap.  Took about 3 hours of debugging to realize, wait for it....

The damn battery had died.

Snotnose
Tuesday, August 31, 2004

Some of the biggest disasters I have witnessed tended to be related to massive projects which involved building a replacement for legacy systems using leading edge technologies. I should also add that more often that not that clueless consultants from Andersens had stuck their finger in the pie and produced reams of documentation which was nothing more than unintelligible, tautological wank to keep the client happy.

Invariably, those projects were either abandoned or drastically pared-down.

TheGeezer
Tuesday, August 31, 2004

>> I should also add that more often that not that clueless consultants from Andersens

D'oh - it pays to review before posting. What I meant to say was:

"I should also add that more often than not, clueless consultants from Andersens..."

TheGeezer
Tuesday, August 31, 2004


I work at a bank.

I wrote the bug that changed customers' buy orders into sells!!

NOT BAD!!!

It happened one night............

THE CLIENT MESSAGE:
<XXXXXXXXXXXSIDE=4XXXXXXXXXXPRICE=4.34XXXXXXXXXXCOSTMER=2>

THE CODE:
  TRDMSG *msg = getMsg();
  int side = getSide(msg);
  if (4 == side ) { doBuy(); }
  if (5 == side ) { doSell(); }

THE TESTING:
  WELL.  A BUNCH OF BUYS CAME AND A BUNCH OF SELLS CAME & TEST EXECUTIONS WERE SENT BACK. I DIDN'T CARE WHICH WAS WHICH. WHO WOULD? SORRY BOSS.

THE SURPRISE:
  4 is a sell. 

Voodoo
Tuesday, August 31, 2004
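
A note on the side-code mix-up above: one way to catch this kind of bug is to name the codes in one place and have a test assert the direction, not just that executions come back. What follows is a minimal Python sketch, not the bank's code. The only fact taken from the post is that 4 means sell; treating 5 as the buy code, and the names Side, parse_side and route, are assumptions made for illustration.

```python
import re
from enum import Enum


class Side(Enum):
    # The post only states that 4 is a sell; 5 as the buy code is an assumption.
    SELL = 4
    BUY = 5


def parse_side(message: str) -> Side:
    """Pull the SIDE=<n> field out of a raw message and map it to a named side."""
    match = re.search(r"SIDE=(\d+)", message)
    if match is None:
        raise ValueError("message has no SIDE field")
    # Side(n) raises ValueError for unknown codes instead of silently doing nothing.
    return Side(int(match.group(1)))


def route(message: str) -> str:
    """Return the action taken; a real system would call its buy/sell handlers here."""
    side = parse_side(message)
    return "sell" if side is Side.SELL else "buy"


msg = "<XXXXXXXXXXXSIDE=4XXXXXXXXXXPRICE=4.34XXXXXXXXXXCOSTMER=2>"
print(route(msg))
```

With the mapping named like this, a test that checks a SIDE=4 message ends up as a sell would have failed against the original if-statements.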

I once bungled a mailer script that was supposed to send out an email containing their username and password to several thousand users. The mail component I was using did not clear its address and body buffers after sending. Neither did I.

The first lucky person on the list got the email as expected. The second person got the body for two emails, containing both his/her details and those of Person #1. Oh, and person #1 got a copy of that second email, too. You can see where this is going.

By the time I stopped it, around 35,000 emails had been sent out. Quite a few were stuck in the mailer queue; some did not send because you can't have 35,000 addresses in the To field. Some were rejected by the receiving servers for all kinds of reasons (this was before the age of spam, so no spam-filters, not that this mail was spam anyway). Still, a lot got through.

One poor guy got 633 emails, with ever-increasing numbers of usernames and passwords in. And one guy phoned us in a blind panic because the username and password he'd used for our Web discussion board was the admin password for his entire company network.

All this for two lines of code. Surprisingly, I didn't get fired.

Needless to say, I have never, ever been cavalier again about anything that sends email out. Ever.

Neil Hewitt
Tuesday, August 31, 2004
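
For anyone wondering what the missing "two lines" amount to: the usual fix is to make sure no address or body state survives from one send to the next. The mail component in the story is not named, so this is only a sketch using Python's standard email.message.EmailMessage as a stand-in; build_messages, the sender address and the sample users are all made up for illustration.

```python
from email.message import EmailMessage


def build_messages(users, sender="noreply@example.com"):
    """Build one independent message per user so nothing carries over between sends."""
    messages = []
    for user in users:
        msg = EmailMessage()  # fresh object: empty To header and empty body
        msg["From"] = sender
        msg["To"] = user["email"]
        msg["Subject"] = "Your account details"
        msg.set_content(
            f"Username: {user['username']}\nPassword: {user['password']}\n"
        )
        messages.append(msg)
    return messages


# Hypothetical sample data, not from the post.
users = [
    {"email": "a@example.com", "username": "alice", "password": "pw-a"},
    {"email": "b@example.com", "username": "bob", "password": "pw-b"},
]
messages = build_messages(users)
for m in messages:
    print(m["To"], "->", m.get_content().splitlines()[0])
```

Creating the message inside the loop is the point: with a reused object you have to remember to clear it, and with a fresh one there is nothing to forget.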


This is a well known story in Aus.

A software company had been contracted to build a battlefield simulator for the army.

Looking a bit like a Real Time Strategy game, it allowed the military guys to move around the scene, watching tanks roll across the landscape, the soldiers running around, etc.

The day came to present it to the minister of defence and the top brass.

Someone at the company thought that they should add a bit of polish and make it a little more visually exciting for the big day.

Late one night (surely it must have been late one night) a programmer added some images for animals, and to save time he used the same objects for the animals that he had been using for the soldiers.

Well, he must have forgotten to disable something, because right in the middle of the presentation, with the minister and the generals watching, a helicopter flew across the battlefield spraying its machine gun.

A group of passing kangaroos immediately hit the ground and returned fire.

braid_ged
Tuesday, August 31, 2004

One company I worked at had a statistical analysis program. The lead developer was a funny Russian character with a very sadistic streak in him, especially with badly formed data sets from users.

One day I got an email from a user stating that the following error appeared on his screen:

"Your calculations are doo-doo!"

It amazes me to this day that he didn't get fired. There were other nasty messages buried in the code as well, that were hastily removed as a result.

Erik
Tuesday, August 31, 2004

'Roo Power!

Yeah, baby!

Kangaroo Jack
Tuesday, August 31, 2004

One year ago, a Japanese team asked  a Chinese company to help them solve a tricky bug.

Their system (12~15 lines of C code) would crash when debugged in GDB, and they could not find any clue after two days of hard work.

It was my first day of work at that Chinese company, and the boss gave me the task.

The key point was that the Japanese team REFUSED to show me the source code, and our company could not provide any information for simulating the environment.

In the end I found that the crash was caused by a bug in GDB itself. It only happened in the DEBUG version of the system.

redguardtoo
http://www.d2ksoft.com

redguardtoo
Tuesday, August 31, 2004

Last year we outsourced a module of our core product to an Indian "CMM Level 5" company. Not only did our engineers spend a huge amount of time trying to get them up to speed, but we got back a huge pile of poo that didn't work, and their code actually contained a function with 80,000 lines of code (i.e. instead of a simple for-loop that reads in an element from a data file and calls a function with that as a parameter, they wrote 80K function calls with each element hardcoded as a parameter).


Tuesday, August 31, 2004
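
For contrast, the "simple for-loop" described in the post above might look roughly like this. It is a Python sketch under assumptions: the post does not say what the per-element function did or how the data file was laid out, so process() is a placeholder and the one-element-per-line format is invented.

```python
import io


def process(element: str) -> str:
    """Stand-in for the real per-element function, which the post does not describe."""
    return element.upper()


def process_all(data_file) -> list:
    """Read one element per line and call process() on each."""
    results = []
    for line in data_file:
        element = line.strip()
        if element:  # skip blank lines
            results.append(process(element))
    return results


# io.StringIO stands in for the data file so the sketch runs on its own.
sample = io.StringIO("alpha\nbeta\n\ngamma\n")
print(process_all(sample))
```

The loop stays the same size whether the file holds three elements or 80,000.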

"The FAA 'Modernization' effort.  They tried to make a 'paperless' process for tracking planes.  The controllers needed their flight strips."

I was part of that project (AAS), which was started in the late 80's before the whole fad of actually talking to the end user. The air traffic controllers had this system where they'd make notations on a strip of paper to remind the assistant controller to make a computer entry to update the database, so we Information Science experts simply computerized that whole task.

Five billion U.S. dollars later, we presented them with a robust, thoroughly-documented system where the controllers had to make computer entries to remind themselves to make computer entries to update the database. And for some reason they didn't like it, so just scrapped it.

Oh well, at least we got paid!

Rick
Tuesday, August 31, 2004

I didn't do it myself, but I worked at an investment bank in the 1990's where another programmer had put some batch shell script that got into production (yes programmers updated the production systems), and included a line something like:

rm -rf /some/random/temp/directory /*

Notice the accidental space after "directory".

Oops.

I got the support call at 2 AM.

At least it didn't run as root.
Tuesday, August 31, 2004
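
To see why that one space matters, it helps to look at how the command line is split into arguments before rm ever runs. The short Python sketch below uses the standard shlex module, which applies POSIX-style word splitting but does not expand globs; split_args is just an illustrative wrapper, and the "intended" command is inferred from the post's remark about the accidental space.

```python
import shlex


def split_args(command: str) -> list:
    """Split a command line into arguments the way a POSIX shell would (no glob expansion)."""
    return shlex.split(command)


intended = "rm -rf /some/random/temp/directory/*"
as_typed = "rm -rf /some/random/temp/directory /*"

print(split_args(intended))  # one path argument
print(split_args(as_typed))  # two path arguments: the directory, and /*
```

In the second case the shell would go on to expand /* into every top-level entry, and rm receives all of them as separate targets.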

> A group of passing kangaroos immediately hit the ground and returned fire.

MEMO

One of the journalists saw Project Roo-Surprise in action. Please issue a press release explaining it was a "programmer error."

Minister for Defence
Tuesday, August 31, 2004

At one job I had, system admin was split among several of us engineers, since the place was too small for an IT department. We managed to get an Exchange server running for email OK. But one day it crashed, and for some reason my boss performed some kind of restore operation after rebooting. The net effect was that all of the email we had received for 6 months was resent (it kept arriving for 2 days) AND all the email we sent for 6 months was resent. Which of course triggered many "What the hell is this?" replies.

Totally bogged us down for 3 days......

sgf
Tuesday, August 31, 2004


Don't remember the details but a lecturer when I was at university used to write code for a heart 'defibrilator' (sp?). 

A pace maker type thingo.

Well, they had to prove all the code correct and do so much QA it wasn't funny, as you would expect.

Well... there was a bug somewhere and very occasionally the thing (it was now out-there... inside people) would malfunction, something to do with temperature sensing or something.

The company's reaction was that people shouldn't panic, but alas doctors started removing the thing.

Twice as many people died in the operations to remove the device than from the malfunctionings of the device.

The company was sued, crashed and burned.

I guess that's the sort of project problem you can't be too cheerfully philosophical about.

braid_ged
Tuesday, August 31, 2004

TheGeezer nailed it.  Any implementation of PeopleSoft will fit the topic of this thread nicely -- like this one:

http://www.idsnews.com/subsite/story.php?id=24282

Bobby Knight is laughing
Tuesday, August 31, 2004

Here as well:
http://www.zdnet.com.au/news/business/0,39023166,20279146,00.htm

trollop
Tuesday, August 31, 2004

Back in the old DOS 3.x days, as a joke, I patched COMMAND.COM changing the string "Bad Command or Filename" to "Try Again You Stupid Twat". . . .

Several weeks later I got a call from the lead partner of a large law firm that I had installed and configured machines for, she was not amused . . .

John Murray
Wednesday, September 01, 2004

http://digilander.libero.it/chiediloapippo/Engineering/iarchitect/stupid.htm

Search for Autodesk.

They had some nice context-sensitive help that reads "Click this to display an overview of this dialog box, idiot."

Shipped to many thousands of people. :)

indeed
Wednesday, September 01, 2004

We had a web script that sent out an email Christmas card a few years ago for a client. It was a really basic thing that looked in a database for the email addresses and sent them all a common message. Took about 5 minutes to write and worked perfectly. Was intended to be used once and deleted.

But after running it we forgot to delete the script off the server. Somehow Google found it a few days later and the bot kept hitting it, and ended up sending a lot of people a lot of Christmas cards! It was a week or so before anyone thought to tell us that they were getting lots of cards, and people ended up with about 15 cards each!

In hindsight we should have put it in a password protected area, or just deleted it quickly afterwards like we had planned to all along.

James U-S
Wednesday, September 01, 2004
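
On the hindsight point in the post above: besides password protection or deleting the script, a cheap extra guard is to make the script refuse anything that looks like a crawler visit and to refuse a second run. The Python sketch below only shows that decision logic; should_send, the token value and the already_sent flag are hypothetical and not taken from the original script.

```python
import hmac


def should_send(method: str, supplied_token: str, expected_token: str,
                already_sent: bool) -> bool:
    """Decide whether a request is allowed to trigger the mail-out."""
    if method != "POST":
        return False  # crawlers follow links with GET, so GET never sends
    if not hmac.compare_digest(supplied_token.encode(), expected_token.encode()):
        return False  # wrong or missing secret
    return not already_sent  # one-shot: refuse once the run is recorded as done


print(should_send("GET", "", "s3cret", False))
print(should_send("POST", "s3cret", "s3cret", False))
print(should_send("POST", "s3cret", "s3cret", True))
```

A real deployment would still need to store the already-sent flag somewhere durable, which this sketch leaves out.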

A long time ago, in a life far far away, I was responsible for daily backups of the company's production db. The process was the usual n revolving backups, in our case using huge disk packs. In order to start the backup one had to remove what was already on the backup disk. One time I did this and wiped the live db instead. Oops.

Luckily we had a good DRP and restored from previous backups and the day's transaction set. I did sweat for several hours while we did it though!


Wednesday, September 01, 2004

The Kangaroo firing military sim story is true but it isn't as embarrassing as the popular legend that braid_ged mentioned.

http://www.snopes.com/humor/nonsense/kangaroo.htm

It was an intentional piece of fun rather than an inadvertent programming mistake. Makes a good story though.

SC
Wednesday, September 01, 2004

Around 1995 I installed Linux and created a swap drive. Unfortunately I specified the wrong partition. The bastard didn't notify me before making the main partition a swap.

Since that I do not touch Linux.


Wednesday, September 01, 2004

>It was an intentional piece of fun rather than an inadvertent programming mistake. Makes a good story though.

Now that sounds more Aussie!

Aussie Chick
Wednesday, September 01, 2004

Beware of marsupials carrying beachballs.


Wednesday, September 01, 2004

"Since that I do not touch Linux."

If you can't stand the heat, get out the kitchen.

And starve.

Vladimir Gritsenko
Wednesday, September 01, 2004

My favourite is the Mars Orbiter disaster:
http://www.tysknews.com/Depts/Metrication/mystery_of_orbiter_crash_solved.htm 

Reminds me of last week's thread on resistance to SI units....

Freddie boy
Wednesday, September 01, 2004

Wrote a website inhouse. It was our first one as a team. We were very proud of it so we gave it to our director to present to the board. We developed on IE, he had Netscape on his laptop. Someone forgot to schedule the no-frames work, and our stub read <NOFRAMES>your browser was written by poofs</NOFRAMES>

I do swear, flames were coming out of his eyes when he found me.

We did go live in the end.

ohLardy
Wednesday, September 01, 2004

Military mistakes are always fun.  Here's one hardware and one software:

The SRT (Standard Remote Terminal) my workcenter used to maintain had a disk pack where the heads were driven by a 35 pound (16kg) linear motor.  A maintenance technician at a UK airbase commanded one to do a random seek test during a preventative maintenance inspection (PMI).  All was fine, the heads moving in and out, seeking randomly, until the feedback control circuit either died or hiccupped.  The head assembly was driven out the back of the drive through the drive casing and a layer of sheet metal, and stuck, quivering, in the wall behind it, with all its wires dangling from it.

This was bad enough, but what was worse was the drive was at crotch height, and it missed the airman's family jewels by only a few inches.

The other story I heard occurred during development of the F-16 flight control software (the F-16 can't fly without its computers -- it's inherently unstable).  The pilot was in the flight simulator, and decided to fly to South America.  As soon as he crossed the equator, the plane flipped upside down.  A little odd...  He crosses back into the northern hemisphere (flying inverted), and the plane flips right-side up again.  A bug was found in the software, of course.

example
Wednesday, September 01, 2004


> Surprisingly, I didn't get fired.

The story is told with varying amounts and different characters:

An executive makes an error that costs his company twenty million dollars. He goes to his boss and says, "I'm very sorry. I expect you wish my resignation."

"Resign, hell no," the boss says. "I just spent twenty million dollars educating you."

frustrated
Wednesday, September 01, 2004

Excellent stuff guys, very funny.  Now, where was muppet's story?

Actively Disengaged
Wednesday, September 01, 2004

The head of our journalism department came to me and wanted me to look at the NT box they used for video editing.

It had become slower and slower and it was thrashing like crazy now whenever they edited anything.

I immediately suspected it was due to the disk having become severely fragmented. I estimated they generated and edited a few GB of video every day and the lack of any sort of maintenance had finally added up.

I never played with NT before. I had only heard of it and knew it bore some resemblance to other windows. NTFS then, was completely foreign to me.

When I couldn't find a defrag utility anywhere in the menus, I cleverly remembered that the machine had a dual boot option. I rebooted to DOS and ran defrag from there. I was so smart! It ran perfect.

Nothing on that machine probably EVER worked again, but I wouldn't know because that teacher NEVER asked for my help again either.

I was Jack's formerly FAT intelligence
Wednesday, September 01, 2004

This story comes from opera, not the browser, but the real thing.  A famous singer made his entrance to deliver the line, "hark, I hear the cannons roar".  In rehearsals, they had used timpani drums for the cannons, but this was a major production, and to the delight of the packed auditorium, they had REAL cannons.  So the guy walks out there, and KABOOM! go the cannons... he bellows "HOLY S***!" and hits the deck! whoops!

devinmoore.com
Wednesday, September 01, 2004

I'm sure muppet's story will probably start off something like this:

"One day, while I was doing my Christmas shopping online at work instead of, you know, actually 'working'..."

 
Wednesday, September 01, 2004

Well you've already got it wrong because I don't shop for Christmas.

muppet
Wednesday, September 01, 2004

I was working a job for a contractor to Anheuser Busch (Bud Lite beer company).  We were moving pallets of beer on an automatic guided vehicle in a demo system.

Me and the guys I worked with were all Mormons and knew little about beer or its advertisements.  We thought it would be clever to have the load IDs alternate between "Tastes great" and "Less filling" (from a recent popular beer commercial).

It was a great idea and we were proud.  Until the vice president of Busch pointed out that Miller Beer (the competitor) ran the "Tastes great" ad.

XYZZY
Wednesday, September 01, 2004

And he's NOT jewish either.

δ
Wednesday, September 01, 2004

I have another one for you... this comes from my 4th grade, where a super-hacker friend of mine rigged the o/s on the schools' computers to say "suck mine" (sp) instead of any of the regular command-line error messages.  Well, things got pretty interesting for him when the programming class rolled in and he forgot to take it off...

devinmoore.com
Wednesday, September 01, 2004

About five weeks ago I ran an UPDATE statement on a production database.  I got distracted while I was writing the query, and ran it before I added the WHERE clause.  Suddenly the web-based application that hits the database was displaying several thousand records for each user.  We only lost about two hours work, but I had to call about fifteen users to tell them I lost their work and they were going to have to re-enter it.

It's hard to describe the feeling you get when you see "11,237 records affected" in the results pane of Query Analyzer, but I guess everyone who has posted autobiographically in this thread knows what I'm talking about.

OffMyMeds
Wednesday, September 01, 2004
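
One defensive habit for the forgotten-WHERE case above is to run the statement inside a transaction, check the affected-row count, and only then commit. The story involves SQL Server and Query Analyzer; the sketch below swaps in Python's built-in sqlite3 purely so it can run on its own, and guarded_update, the records table and its rows are invented for illustration.

```python
import sqlite3


def guarded_update(conn, sql, params=(), max_rows=1):
    """Run an UPDATE, but roll back if it touches more rows than expected."""
    cur = conn.execute(sql, params)
    if cur.rowcount > max_rows:
        conn.rollback()
        raise RuntimeError(
            f"{cur.rowcount} rows affected, expected at most {max_rows}"
        )
    conn.commit()
    return cur.rowcount


# Hypothetical in-memory table standing in for the production data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, owner TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO records (owner, status) VALUES (?, 'open')",
    [("ann",), ("bob",), ("cy",)],
)
conn.commit()

try:
    guarded_update(conn, "UPDATE records SET owner = 'ann'")  # forgot the WHERE clause
except RuntimeError as err:
    print("rolled back:", err)

rows = conn.execute("SELECT owner FROM records ORDER BY id").fetchall()
print(rows)
```

The same idea works interactively: begin a transaction, run the UPDATE, read the row count, and only commit if the number is the one you expected.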

We had developed a proof-of-concept of a web-based intranet content management application to let people upload collateral for some 100 products made by the company. It was deployed on our team's little Sun box running an unstable application server that needed to be bounced every 30 minutes by a cron job to ensure continued operation. Good enough for prototyping...

Anyway, the application was pretty cool, so our boss gave a demo to the company COO, who apparently liked it. He liked it so much that he sent out a company-wide email to the effect of "here is the latest greatest thing since sliced bread, use it NOW!!!".

Our "proof-of-concept" got 25,000 hits in one day.


Funniest thing was, it handled it just fine :)

genius
Wednesday, September 01, 2004

Q: Frogsdabble, how do you change ownership of a directory to bob?

A: chown -R  bob *

Q: from root?

A: Noooooooooooooooooooo ...

trollop
Wednesday, September 01, 2004

We had a system running a production line, storing the locations and statuses of thousands of products and components. The system was a bit rubbish and we were in the process of replacing it, but it worked ok if you knew what you were doing. I was the system manager and I used to get lots of calls.

One night, I had a call from one of the production engineers who said that the system had crashed, but it was OK, he'd restored the database and he was so proud of himself he had to ring and tell me (at 1am). The system basically ran on three servers, one live, one live backup and a standby. The 'live' backup software actually ran up to 40 minutes behind the live database (I did say it was a bit rubbish). The backlog in that process often caused the system to slow up or, in user parlance, 'crash'. Alarm bells ringing yet ?

At 2.00am I had another call to say that the system was corrupt.

I arrived on site at 2.15. The support guy had 'restored' the 'live' backup over the live database and the status/location information for thousands of pieces of work in progress was subtly wrong in 80% of cases. Just wrong enough, in fact, to create absolute perfect chaos which took nearly a week to clean up.

When I asked how he'd found the password for an account with enough access, he pointed to a piece of paper taped on the wall with all of the passwords for the system written on it.

Question 1)

In this scenario, who gets fired and who gets training ?

WoodenTongue
Thursday, September 02, 2004

Our company's website keeps track of how many online reports each of our customers views (for billing purposes) in a database table which is used by several stored procedures.

Well, I got careless with SQL Server's export utility to upload the latest version of a stored procedure from development to production and left the "include dependencies" option checked, therefore replacing the production table with the older development table. 

Luckily, the previous night's backup was successful so we restored okay, but everyone got free reports for a day.

Later that day I learned that another employee of the company was being fired for "screwing up too much".

aBitNervousEverSince
Thursday, September 02, 2004

@WoodenTongue: I believe the answer to your question is, "fire them all and sort it out later".

anon-y-mous cow-ard
Thursday, September 02, 2004

aBitNervous, I've done that.

It wasn't important because I was moving procedures from production to a test db. It still made a few beads of sweat pop out and now I always think about paying closer attention to it in the future.

In our defense, that is a poorly thought out default.

I am Jack's reluctant admittance
Thursday, September 02, 2004

While I was working as a contractor, the development server had a hard drive failure. We discovered that the back-up people had not been backing up that server for about a year. The code on the dev server was between 1 week and 8 weeks ahead of the code on the qa server.

Since we used Visual Studio, and that keeps a local copy of the code you were working on, I told everyone they were going to take a long lunch break. I got them to write down their passwords, and leave for about 3 hours (the head of the dept agreed and approved). I ran to the local computer superstore and bought a USB cd burner (which were kind of rare at the time, and I wanted an excuse to buy one for myself anyway), came back, copied what everyone had, and managed to recover *most* of the lost files. We figured it saved about 18 person-weeks of effort.

So... guess who gets laid off the following week because they know other people's passwords?

Peter
Thursday, September 02, 2004
