Fog Creek Software
Discussion Board




Worst engineering mistake you've ever made?

The thread about looking back on old code and being embarrassed got me thinking...does anyone have any memories of their worst engineering mistakes?

The worst mistake I made was many years ago while writing an application at my first company.  Part of my program's function was to generate a unique ID that was used to name files in a globally available directory.  However, my "unique ID" algorithm wasn't so unique, and it turned out that different workstations (and even the same workstation!) could generate identical IDs.  Since my code further assumed that the ID would always be unique, no checks were in place when using the ID as a filename, so I ended up bashing over previously existing output files.
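One way to guard against this kind of collision is to let the file system reject duplicate names instead of trusting the ID generator. The sketch below is only an illustration in Python, not the original application's code: the function name `write_new_file`, the `.out` suffix, and the retry count are all assumptions, and it presumes the shared directory honours exclusive creation.

```python
import os
import uuid


def write_new_file(directory, data, attempts=5):
    """Write data to a freshly named file, never overwriting an existing one."""
    for _ in range(attempts):
        # uuid4 is random; a clash is unlikely, but it is still checked below.
        path = os.path.join(directory, uuid.uuid4().hex + ".out")
        try:
            # Mode "x" raises FileExistsError if the name is already taken.
            with open(path, "x") as handle:
                handle.write(data)
            return path
        except FileExistsError:
            continue  # name clash: pick another ID and try again
    raise RuntimeError("could not find an unused file name")
```

The point of the design is that uniqueness is checked at the moment of use, so a weak ID generator causes a retry rather than a silently overwritten file.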

The worst part?  This problem wasn't discovered until my code was in production at a major bank.  Ouch!!!

Anyone else have any scary memories like this?

Oops
Friday, May 16, 2003

If only it was just memories :)

Imagine an app with the following flow:
1. User inserts records in system A (sysA)
2. Hours later, process runs and transfers said records to system B (sysB)

During the time between these two steps, the user can safely alter the data, because it's still in its original system, and there is even an app option for this.

Enter the shot in the foot :)

In the new version, the steps 1 and 2 are immediate, unless sysB is unavailable, in which case the record is stored in sysA and sent later by a batch process.

When I wrote the specs for this new version, I completely forgot that the app option to alter the data would become virtually obsolete, since the normal state of affairs would be for records to go to sysB as soon as they're entered.

So, the first version of my spec went on and on about the importance of this option, because it allowed the users to alter the data before they were published to sysB, etc.

And then, one of the client depts sent me a mail asking the obvious - how are we going to alter the record before it goes to sysB? Isn't it supposed to go to sysB right after we insert it into sysA?

duh! :)

--
"Suravye ninto manshima taishite (Peace favor your sword)" (Shienaran salute)
"Life is a dream from which we all must wake before we can dream again" (Amys, Aiel Wise One)

Paulo Caetano
Friday, May 16, 2003

Slightly off-tangent but as I've just done this - literally - in the last 10 minutes I thought I'd share it...

USB pens - arrgghhh.  No, I really mean Arrrggggggggghhhh.  Trying to push my pen into the USB slot with my thumb.  Slipped.  Thumb forced against the metal keyring prongy bit.  Large piece of metal inserted in thumb.  Extreme pain.  Trip to engineers room to get pliers to remove the 4 mm piece of metal (still attached to the pen) from me thumb.  Ouch.  Lots of blood.  White-faced engineers.

All patched up now and feeling kinda stoopid!

John Fletcher
Friday, May 16, 2003

Okay, what is a USB pen and what part of your USB port features a sharp piece of metal?

DrAwesome
Friday, May 16, 2003

The USB pen is a strangely named device that is basically a memory card which can be plugged into a USB interface - you can use it just like a removable hard drive.  Very handy for transporting data around instead of on loads of floppy disks.

The pen (daft name, but I never invented and named the thing), has a small metal keyring-type clip.  That's the piece I managed to embed into my thumb (gives a whole new meaning to the embedded device development I'm doing!).

Check out www.usbee.co.uk to see what one looks like (the pen, not my damaged thumb).

John Fletcher
Friday, May 16, 2003

Usually I build my personal machine but my latest one will likely be the last one built by me. While building it I took the heatsink off the processor because the machine was booting up but no screen no beep no nothing. I thought the processor wasn't seated properly so I reseated it and hit the on button. Immediately smoke rose from the processor area and I had a toasted chip. I still keep it around just to remind me and chuckle a little.

Ian Stallings
Friday, May 16, 2003

I built something for one of those Linux companies.  Tested, worked fine, then as the ceo and his buddies came over, insanity struck and for some very very very stupid reason I decided, "Hey this piece of code annoys me," and I changed it.

Inserting a div by zero error.

My fingers refuse to go on.

Tj
Friday, May 16, 2003

I wrote a document system in Word that interfaces to a legacy billing system. A bug in my design meant that when saving the document, the document could get filed under the wrong person.

No big deal, right?  Wrong!!!!

The problem was that those documents were diagnosis info at a medical center, and the billing system was a patient billing system. The end result of this was diagnosis information being filed under the wrong patient.

We did catch the error after the first week of the system running live. Needless to say, every single diagnosis had to be re-checked. Fortunately, due to the work flow process, this did not result in any patient receiving incorrect diagnosis information.

You can see however this is about the worst possible horror story one could have occur in software.

The real story here is that anytime you deal with a data system, ask yourself: what are the consequences if that data is filed under the wrong person?

Sending the wrong payroll check to someone can be really embarrassing, but the wrong patient diagnosis is the stuff of lawsuits.

Albert D. Kallal
Edmonton, Alberta Canada
kallal@msn.com
http://www.attcanada.net/~kallal.msn

Albert D. Kallal
Friday, May 16, 2003

You ran a modern processor without any cooling? No, you shouldn't be building your own machines.

Fred
Friday, May 16, 2003

[You ran a modern processor without any cooling? No, you shouldn't be building your own machines. ]

It's called a "mistake". I know you are likely infallible but I, a human, make mistakes. I've been building my own computers now for 12+ years. So take that mr. party pooper!

Ian Stallings
Friday, May 16, 2003

While I've made plenty of design mistakes, my biggest embarrassment came when I left some debugging code in my application while it was being demonstrated to a potential client.

I had a long night debugging, wasn't in the best of moods and I put in a rather nasty debugging statement.

So during the demo, up pops this message box with a less than pleasant debugging message.

Ooops.

Mark Hoffman
Friday, May 16, 2003

The very first PC application I shipped (this was in 1985) consisted of three main programs and ten support utilities. I finished, made ten copies on floppy disks and shipped to ten customers. Three months later I discovered that the entire functionality of all the support utilities, as shipped,  was to print hexadecimal error messages. However we still found out before our customers.

David Clayworth
Friday, May 16, 2003

When I was at university... in the summer I used to work at a software company

I (and everybody else) hated our boss [who hadn't written any code for about the last 5 years, but had in the company's early days, (**really** horrid code still in the app), but was now mostly focused on fussing us about stuff, often weird stuff]

In our source file comment, I wrote "I HATE ....name...." , copied about 100 times.

When I came back next year, he had got a once every 5 year urge to edit code, decided to work on my app, and discovered the comment.

S. Tanna
Friday, May 16, 2003

I was working on a system as a contractor to automatically send email to users subscribed to a particular ticket (for a helpdesk system).

An incorrect loop in the code coupled with a system capable of sending live messages meant that we not only bombarded the recipient with hundreds of copies of confidential information, but as luck would have it, said recipient happened to be an ex-client of the software owner on glacial terms.

Needless to say, the client was not pleased.  I don't know how the ex-client took it, because although I called them and contritely apologized to their automated answering system (stressing that it was my fault, as per client's request), I never heard back.

Guess who now treats auto-emailing with kid gloves?
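For what "kid gloves" might look like in code, here is a minimal sketch of a per-recipient send cap. It is an illustration only: `CappedMailer`, the `send(recipient, body)` callable, and the default limit are all hypothetical and are not taken from the helpdesk system described above.

```python
from collections import Counter


class CappedMailer:
    """Wrap a send function and refuse to flood any single recipient."""

    def __init__(self, send, max_per_recipient=3):
        self._send = send  # hypothetical callable: send(recipient, body)
        self._max = max_per_recipient
        self._sent = Counter()

    def send(self, recipient, body):
        # A runaway loop hits this limit instead of the recipient's inbox.
        if self._sent[recipient] >= self._max:
            raise RuntimeError("send cap reached for " + recipient)
        self._sent[recipient] += 1
        self._send(recipient, body)
```

With a cap like this, a looping bug fails loudly after a handful of messages rather than after hundreds.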

Anon to protect the guilty
Friday, May 16, 2003


I had an app in the early days of Windows 95 that liked to exchange its icon with the Recycle Bin. My friends still like to bug me about it, heh.

Leonardo Herrera
Friday, May 16, 2003

I used to work for a chip design company. One project I did involved tuning a critical path through the cpu microcode ROM. The chip used 4 non-overlapping clock signals (phi1 - phi4), each of which was active for 25% of the clock cycle.

Back in the day, we offered this chip at speeds up to 30 MHz, though yields at the highest speed could have been better.  The ROM timing was divided into two halves: a precharge half and an evaluation half. The redesign involved retiming the ROM so that it began to evaluate on the falling edge of phi2 instead of the rising edge of phi3; this was worth a few nanoseconds and the increased yields more than made up for the engineering and manufacturing effort.

Well, it did after rev. 2. The first rev exhibited much lower yields than expected. In fact, the first set of reticles had to be trashed because it turned out that I'd wired phi3 to the "evaluate" clock input instead of phi2, giving the whole rom only 25% of the clock cycle in which to evaluate (instead of 50%). At the time, a full set of reticles cost around £20k but the real costs came because there was something like a 30-day lead time between ordering the new reticles and getting samples back from the fab.

And why wasn't this caught in simulation? Because we didn't have static path analysers and all timing simulations had to be done with SPICE. Now, even today, there probably isn't a machine in existence with the capacity to perform an analogue simulation of an entire chip in a reasonable timeframe. We created a detailed timing model of a single slice through the rom and used that to thrash out the circuitry, but at the chip level we were left with logic simulation only. I.e., we could prove that the rom evaluated correctly but couldn't infer a damn thing about the timings.

You see? Even today, I'm still trying to rationalise my way out of it. Oh, the shame...

Paul Sharples
Friday, May 16, 2003

Not strictly pertinent but 10 minutes into a demo I was criminally ill-prepared for I experienced an unprecedented case of flop-sweat.  I'm talkin' Singing in the Rain, to steal a line.  I don't suffer in general but now I worry more about that than the software.

Anyhow, at one point I leaned back on my chair and rested my head against the wall.  When I righted my chair I had left a 6' diameter wet mark on the forest green wall in front of 100 giggling on-lookers including the CEO, Canadian President, American President...

Oh, and the demo dumped at least 10 times.

I tend to be anal about preparation since, go figger.

I'm coming off as a boob, aren't I? <g>

B#
Friday, May 16, 2003

Recently I wrote an application which I was determined would contain the least number of bugs possible before going to QA. This was because I was also using this application to demonstrate the benefits of Defensive Programming to my co-workers. I was indeed mostly successful in that QA has managed to locate about 10 bugs in this entire application. Unfortunately, one of these bugs ...

I had also given this application to two of my co-workers to play with. I jokingly told them: "don't worry, it won't format your HD". It didn't. Unfortunately it did try to recursively delete every folder under their C: drive. They were not amused.

All the other 9 bugs were minor, and all of them were fixed,  but one of the two still won't run this app ...

Dan Shappir
Friday, May 16, 2003

A long time ago (back in the 80's), I worked on a pneumatically-controlled bench press machine. The "weight" could be applied in either the up or down direction, so that you could do lat pull-downs as well as military and bench presses.

The force output was controlled by an 8-bit DAC. I found out the hard way that I hadn't put checks for wraparound in the right place when I decreased the weight past zero and the carriage flew up the track. One of the physical stops at the end of the track broke off and embedded itself in the ceiling.
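The wraparound described here is easy to reproduce in a few lines. The sketch below is Python rather than whatever the controller actually ran, and the names `adjust_dac` and `adjust_dac_wrapping` are made up for illustration. It contrasts unchecked 8-bit arithmetic with a saturating version.

```python
DAC_MIN = 0
DAC_MAX = 255  # largest code an 8-bit DAC register can hold


def adjust_dac_wrapping(current, delta):
    """What unchecked 8-bit arithmetic does: 0 - 1 becomes 255 (full force)."""
    return (current + delta) & 0xFF


def adjust_dac(current, delta):
    """Return the new DAC code, saturating at the ends instead of wrapping."""
    return max(DAC_MIN, min(DAC_MAX, current + delta))
```

Stepping down from zero with the wrapping version jumps straight to the maximum code, which matches the "carriage flew up the track" behaviour; the saturating version just stays at zero.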

Luckily, that was one of the first "does the hardware work" versions of the code, and not anything being shown to the customer, let alone a shipping version.

Steve Wheeler
Friday, May 16, 2003

Fun thread.  Thanks for starting it.

I wrote a driver utility for Windows 3.1 which once made its way into PC Magazine, and not in a good way.  It showed up in a section called "User Interface Bloopers" or something.  Anyway, what I had done was code up a little dialog box with an error message and "OK" button.  Problem was that pressing OK didn't disappear the window... instead, it caused another one to appear.

The shame of my bug showing up in a magazine with a circulation of... well, a lot... was tremendous. 

That is, by far, my worst ever engineering mistake.

-Thomas

Thomas
Friday, May 16, 2003

My current one, that I have to fix this weekend - in a management interface on a web app I have a treeview that allows access to all the files received by the application. Being a DHTML treeview, the whole thing loads before it's sent over the wire.

All of it.

200 documents/day for the three months it's been running...

[well it seemed like a good idea at the time]

Philo

Philo
Friday, May 16, 2003

I wouldn't say that forgetting to override Object.equals() is my worst engineering mistake, but the result was that objects weren't being pulled out of an email queue.  So, there was a Thread on an otherwise lightly loaded dual processor server that was dedicated to spamming a test email account on our production mail server.  I forgot how long it took before anyone noticed that it was on its knees...
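The original bug was in Java, but the same trap exists in Python, where a class without `__eq__` compares by identity. The sketch below is an analogue under that assumption, not the poster's code; `Message`, `MessageWithEq`, and `dequeue` are hypothetical names.

```python
class Message:
    """Hypothetical queue item; compares by identity because __eq__ is not defined."""

    def __init__(self, message_id):
        self.message_id = message_id


class MessageWithEq(Message):
    """Same item with value equality, the counterpart of overriding equals()."""

    def __eq__(self, other):
        return isinstance(other, Message) and self.message_id == other.message_id

    def __hash__(self):
        # Keep hashing consistent with equality.
        return hash(self.message_id)


def dequeue(queue, item):
    """Remove an equal item from the list; report whether anything was removed."""
    try:
        queue.remove(item)
        return True
    except ValueError:
        return False
```

With plain `Message`, `dequeue` on a freshly built but "identical" object returns `False` and the item stays queued, so a worker that retries forever would keep resending it.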

Brian
Friday, May 16, 2003

Philo,

I once worked on a web application that also used the tree paradigm. In the design it was assumed that the customer would never have more than a hundred or so branches. Turns out some of our customers intended to use the app in ways we hadn't foreseen (think thousands of branches). When management asked if we could speed up the tree processing to accommodate my answer was: "why do you think Yahoo doesn't use a tree control for its directory?"

BTW we were able to speed up tree generation considerably in the following way: the original impl. built a huge string that contained all the tree nodes. Turns out the string concatenation operations were killing us. We changed the app so that instead data for each node was written out distinctly.
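A rough sketch of the "write each node out distinctly" idea follows. The original platform and data model are not stated, so this is a Python illustration with an assumed node shape (a dict with `name` and optional `children`), and `render_tree` is a hypothetical name.

```python
import io


def render_tree(node, out, depth=0):
    """Write one line per node straight to `out` instead of growing one big string."""
    out.write("  " * depth + node["name"] + "\n")
    for child in node.get("children", []):
        render_tree(child, out, depth + 1)


# Usage: any object with a write() method works, e.g. a response stream.
tree = {"name": "root", "children": [{"name": "a", "children": [{"name": "a1"}]},
                                     {"name": "b"}]}
buffer = io.StringIO()
render_tree(tree, buffer)
```

Because each node is emitted as soon as it is visited, no intermediate string is repeatedly copied as the tree grows.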

Dan Shappir
Friday, May 16, 2003

Back in the days of "super-mini" computers, I was the jack-of-all-trades: sysadmin, sysop, programmer, and tape monkey.

One of my jobs as tape monkey was to do the daily and weekly backup to tape, which took about 15 minutes per day, and three hours on the weekend.  These backups were very important to the company, as all of our Accounting data was stored on them.

After about six months of religiously backing up, we installed the latest-and-greatest version of the OS.  This involved - I know this may be hard to believe - doing a full backup to tape, formatting the hard drives, installing the new OS, and restoring from tape.

Well, this whole process took me all weekend, and it wasn't until Sunday evening that I discovered something horrible: Not one of my backup tapes was any good.

It turns out that the old version of the OS had a tiny bug: it wouldn't correctly save ISAM files that spanned more than two tapes.  Guess what kind of files our Accounting application used?

By the grace of Codd I was able to find a set of tapes I had made a month earlier using a Beta version of a totally different tape-backup utility, and those tapes were good.  (Otherwise, I think I'd have been out of a job.)

The only problem was, the Accounting department had to re-key in an entire month's worth of data.  Some of them wouldn't talk to me for years...

Spaghetti Rustler
Friday, May 16, 2003

When management asked if we could speed up the tree processing to accommodate my answer was: "why do you think Yahoo doesn't use a tree control for its directory?"

... probably for the same reason Microsoft uses it on the MSDN library -- because it's obnoxious to navigate and slow.  =-)

Alyosha`
Friday, May 16, 2003

format c:

This was for a computer that ran the embroidery machine in my mom's protective clothing factory. No computer no embroidery. No embroidery no deliveries.

That entire line was down for a few days while we waited for new disks to arrive.

The machine had been supplied ready installed, and they did not provide the software disks. I don't think it ever quite crossed their minds that someone would really format the hard drive :(

Nothing like that "Insert system disk in drive and press any key to continue"  screen to make you sweat blood.

tapiwa
Saturday, May 17, 2003

Not really a mistake that I made, more a question of utter stupidity. I once had a complete mental blockage and posted to a Delphi newsgroup asking if there was a simple way to determine if a number was negative or not!

Name withheld to preserve career
Sunday, May 18, 2003

Spot the deliberate mistake :)

// move files
for( file in lostafiles)
{
    CopyFile(file, here, there);
    Delete(file);
}

I don't really need to check the return code from CopyFile, do I? What do you mean the destination hard disk is full? Oh.
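For contrast, here is a minimal sketch of a move loop that only deletes the source after the copy has succeeded. It is written in Python rather than the pseudocode above, and `move_files` and its return values are assumptions made for illustration.

```python
import os
import shutil


def move_files(paths, destination_dir):
    """Copy each file, and delete the original only after the copy checks out."""
    moved, failed = [], []
    for path in paths:
        target = os.path.join(destination_dir, os.path.basename(path))
        try:
            shutil.copy2(path, target)  # raises OSError on failure, e.g. disk full
            if os.path.getsize(target) != os.path.getsize(path):
                raise OSError("size mismatch after copy: " + target)
        except OSError:
            failed.append(path)  # leave the source untouched
            continue
        os.remove(path)
        moved.append(path)
    return moved, failed
```

The size comparison is a cheap sanity check, not proof of an intact copy; a checksum would be stricter.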

Too ashamed to own up...
Sunday, May 18, 2003

performance test, one of the system's transactions sends out 1 email notification / transaction.

notification email address configured to somebody's live email address.

Cranked up the loading machines and let the test rip. Playback fast as possible, zero delay.

Only ran for maybe an hour or so at the most.

Several **million** transactions later, we finished the test, system was fine, but...

well, will just say the guy's inbox was kinda full...
seems like the email server was a bit sluggish, too. go figure.

fortunately, system still in dev, target email one of our own, so nothing got outside.

nope, not this time.
Monday, May 19, 2003

Three years ago (in the modem years) I didn't want to dial-up to check my email, so I wrote a little nice thing (in LotusScript for Notes) that split up the text in incoming email messages (into e.g. 10 parts) and forwarded those as text messages to my cell phone. Nice little thing.

Problem was:
In Lotus Notes, the emails are actually "documents", that is with fields containing "from", "to" - and "cc" and "bcc". My agent (as this code was called) simply made a copy of the original email-document, replaced the "body" text and forwarded the mail to my cell phone.

For some reason, in version 0.1 of my agent I didn't manage to clear the "cc" and "bcc" fields of the incoming mails, so the result was this:
I was "cc"'ed on one email, and my code forwarded this mail to
1: my cell phone
2: my inbox.
(see the problem ?)
DOH!

Net result: I realised it fairly quickly, and stopped my agent from running, but I still received text messages on my cell phone for 3/4 of a whole day.  Talk about being spammed !!!
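One way to avoid inheriting stray recipients is to build the forwarded message from scratch and copy over only the fields that are wanted. The sketch below uses Python's standard `email` package as a stand-in for Notes documents and LotusScript, which it does not model; `build_forward` and the example addresses are hypothetical.

```python
from email.message import EmailMessage


def build_forward(original, new_body, forward_to):
    """Build a fresh message for forwarding; Cc and Bcc are never copied across."""
    forward = EmailMessage()
    forward["To"] = forward_to
    forward["Subject"] = str(original.get("Subject", ""))
    forward.set_content(new_body)
    return forward


# Usage with an incoming message that has the sender's own address on Cc.
incoming = EmailMessage()
incoming["From"] = "someone@example.com"
incoming["To"] = "colleague@example.com"
incoming["Cc"] = "me@example.com"
incoming["Subject"] = "Status"
incoming.set_content("A long message body...")
part = build_forward(incoming, "part 1 of 10", "phone-gateway@example.com")
```

Copying an allow-list of headers means a field nobody thought about cannot quietly route the message back to the inbox that triggered the agent.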

masken
Monday, May 19, 2003

memcpy(source, destination, size)

That was in a piece of code which assembled HTTP messages from TCP packets. The code included a couple of memcpy's, only one with source and destination twisted.

It took days to find out why the HTTP messages were corrupted sometimes.

Yes
Tuesday, May 20, 2003

At work today we were still getting the iWorm/Palyh (the one that masquerades as a message from support@microsoft.com)

I forwarded the offending message to our sysadmin with the question "Why isn't Norton on the email server catching this virus?"

Five minutes later he phoned. "Steve, I can't open the attachment you sent me about the virus"

"Oops!".

Stephen Jones
Tuesday, May 20, 2003
